WorldWideScience

Sample records for automated genome mining

  1. Automated genome mining for natural products

    Directory of Open Access Journals (Sweden)

    Zajkowski James

    2009-06-01

    Full Text Available Abstract Background Discovery of new medicinal agents from natural sources has largely been an adventitious process based on screening of plant and microbial extracts combined with bioassay-guided identification and natural product structure elucidation. Increasingly rapid and more cost-effective genome sequencing technologies coupled with advanced computational power have converged to transform this trend toward a more rational and predictive pursuit. Results We have developed a rapid method of scanning genome sequences for multiple polyketide, nonribosomal peptide, and mixed combination natural products, with output in a text format that can be readily converted to two- and three-dimensional structures using conventional software. Our open-source and web-based program can assemble various small molecules composed of the twenty standard amino acids and twenty-two other chain-elongation intermediates used in nonribosomal peptide systems, and four acyl-CoA extender units incorporated into polyketides, by reading a hidden Markov model of DNA. This process evaluates and selects the substrate specificities along the assembly line of nonribosomal peptide synthetases and modular polyketide synthases. Conclusion Using this approach we have predicted the structures of natural products from a diverse range of bacteria based on a limited number of signature sequences. In accelerating direct DNA-to-metabolomic analysis, this method bridges the interface between chemists and biologists and enables rapid scanning for compounds with potential therapeutic value.
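
    To illustrate the assembly-line logic summarised above, a minimal Python sketch (not the authors' program; the module scores and substrate names are invented) shows how per-module substrate-specificity calls could be concatenated into a text-format product prediction:

      # Hypothetical sketch: turn per-module substrate-specificity scores into a
      # linear product string, mimicking NRPS/PKS assembly-line logic.
      def call_specificity(module_scores):
          """Pick the highest-scoring substrate for one NRPS/PKS module."""
          return max(module_scores, key=module_scores.get)

      def predict_product(modules):
          """modules: list of dicts mapping candidate substrate -> profile/HMM score."""
          return "-".join(call_specificity(m) for m in modules)

      demo_modules = [
          {"Val": 12.3, "Leu": 8.1},    # module 1 adenylation domain (made-up scores)
          {"Ser": 10.7, "Thr": 9.9},    # module 2 adenylation domain
          {"mal": 15.2, "mmal": 11.4},  # module 3 acyltransferase (malonyl vs methylmalonyl)
      ]
      print(predict_product(demo_modules))  # -> Val-Ser-mal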

  2. Automated genome mining of ribosomal peptide natural products

    Energy Technology Data Exchange (ETDEWEB)

    Mohimani, Hosein; Kersten, Roland; Liu, Wei; Wang, Mingxun; Purvine, Samuel O.; Wu, Si; Brewer, Heather M.; Pasa-Tolic, Ljiljana; Bandeira, Nuno; Moore, Bradley S.; Pevzner, Pavel A.; Dorrestein, Pieter C.

    2014-07-31

    Ribosomally synthesized and posttranslationally modified peptides (RiPPs), especially from microbial sources, are a large group of bioactive natural products that are a promising source of new (bio)chemistry and bioactivity (1). In light of exponentially increasing microbial genome databases and improved mass spectrometry (MS)-based metabolomic platforms, there is a need for computational tools that connect natural product genotypes predicted from microbial genome sequences with their corresponding chemotypes from metabolomic datasets. Here, we introduce RiPPquest, a tandem mass spectrometry database search tool for identification of microbial RiPPs and apply it for lanthipeptide discovery. RiPPquest uses genomics to limit search space to the vicinity of RiPP biosynthetic genes and proteomics to analyze extensive peptide modifications and compute p-values of peptide-spectrum matches (PSMs). We highlight RiPPquest by connection of multiple RiPPs from extracts of Streptomyces to their gene clusters and by the discovery of a new class III lanthipeptide, informatipeptin, from Streptomyces viridochromogenes DSM 40736 as the first natural product to be identified in an automated fashion by genome mining. The presented tool is available at cyclo.ucsd.edu.
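
    A minimal sketch of the genome-restricted matching idea described above (not RiPPquest itself; the ORF, masses and modification model are simplified, hypothetical examples):

      # Only short ORFs near a RiPP biosynthetic gene cluster are considered as
      # precursor peptides, and their (possibly dehydrated) masses are compared
      # with an observed MS1 mass.
      WATER = 18.0106
      MONO = {"G": 57.0215, "A": 71.0371, "S": 87.0320, "T": 101.0477,
              "C": 103.0092, "V": 99.0684, "L": 113.0841, "K": 128.0949}

      def peptide_mass(seq, dehydrations=0):
          """Monoisotopic mass; each Ser/Thr dehydration loses one water."""
          return sum(MONO[a] for a in seq) + WATER - dehydrations * WATER

      def candidates_near_cluster(orfs, cluster_start, cluster_end, window=10000):
          return [o for o in orfs if cluster_start - window <= o["start"] <= cluster_end + window]

      def match(orfs, observed_mass, tol=0.02):
          hits = []
          for orf in orfs:
              for d in range(orf["seq"].count("S") + orf["seq"].count("T") + 1):
                  if abs(peptide_mass(orf["seq"], d) - observed_mass) < tol:
                      hits.append((orf["name"], d))
          return hits

      # Hypothetical data: one short ORF inside the search window of a lanM-like gene.
      orfs = [{"name": "orfA", "start": 1200, "seq": "GASTCVLK"}]
      print(match(candidates_near_cluster(orfs, 1000, 5000), peptide_mass("GASTCVLK", 1)))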

  3. Pep2Path: automated mass spectrometry-guided genome mining of peptidic natural products.

    Science.gov (United States)

    Medema, Marnix H; Paalvast, Yared; Nguyen, Don D; Melnik, Alexey; Dorrestein, Pieter C; Takano, Eriko; Breitling, Rainer

    2014-09-01

    Nonribosomally and ribosomally synthesized bioactive peptides constitute a source of molecules of great biomedical importance, including antibiotics such as penicillin, immunosuppressants such as cyclosporine, and cytostatics such as bleomycin. Recently, an innovative mass-spectrometry-based strategy, peptidogenomics, has been pioneered to effectively mine microbial strains for novel peptidic metabolites. Even though mass-spectrometric peptide detection can be performed quite fast, true high-throughput natural product discovery approaches have still been limited by the inability to rapidly match the identified tandem mass spectra to the gene clusters responsible for the biosynthesis of the corresponding compounds. With Pep2Path, we introduce a software package to fully automate the peptidogenomics approach through the rapid Bayesian probabilistic matching of mass spectra to their corresponding biosynthetic gene clusters. Detailed benchmarking of the method shows that the approach is powerful enough to correctly identify gene clusters even in data sets that consist of hundreds of genomes, which also makes it possible to match compounds from unsequenced organisms to closely related biosynthetic gene clusters in other genomes. Applying Pep2Path to a data set of compounds without known biosynthesis routes, we were able to identify candidate gene clusters for the biosynthesis of five important compounds. Notably, one of these clusters was detected in a genome from a different subphylum of Proteobacteria than that in which the molecule had first been identified. All in all, our approach paves the way towards high-throughput discovery of novel peptidic natural products. Pep2Path is freely available from http://pep2path.sourceforge.net/, implemented in Python, licensed under the GNU General Public License v3 and supported on MS Windows, Linux and Mac OS X. PMID:25188327
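
    The Bayesian matching step can be illustrated with a toy naive-Bayes-style scorer that compares a mass-spectrometry-derived amino-acid tag against per-module substrate predictions of candidate gene clusters; this is a sketch under assumed probabilities, not the Pep2Path algorithm:

      # Score each candidate NRPS cluster by the log-probability that it produced
      # the observed sequence tag. Clusters, predictions and probabilities are
      # hypothetical.
      import math

      P_MATCH, P_MISMATCH = 0.8, 0.2 / 19   # assumed emission probabilities

      def tag_score(tag, cluster_prediction):
          """Log-probability that this cluster produced the observed tag."""
          if len(tag) != len(cluster_prediction):
              return float("-inf")
          return sum(math.log(P_MATCH if t == p else P_MISMATCH)
                     for t, p in zip(tag, cluster_prediction))

      clusters = {"clusterA": ["Val", "Ser", "Leu"], "clusterB": ["Val", "Thr", "Leu"]}
      tag = ["Val", "Ser", "Leu"]            # amino-acid tag read from an MS/MS spectrum
      best = max(clusters, key=lambda c: tag_score(tag, clusters[c]))
      print(best, tag_score(tag, clusters[best]))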

  4. Pep2Path: automated mass spectrometry-guided genome mining of peptidic natural products.

    Directory of Open Access Journals (Sweden)

    Marnix H Medema

    2014-09-01

    Full Text Available Nonribosomally and ribosomally synthesized bioactive peptides constitute a source of molecules of great biomedical importance, including antibiotics such as penicillin, immunosuppressants such as cyclosporine, and cytostatics such as bleomycin. Recently, an innovative mass-spectrometry-based strategy, peptidogenomics, has been pioneered to effectively mine microbial strains for novel peptidic metabolites. Even though mass-spectrometric peptide detection can be performed quite fast, true high-throughput natural product discovery approaches have still been limited by the inability to rapidly match the identified tandem mass spectra to the gene clusters responsible for the biosynthesis of the corresponding compounds. With Pep2Path, we introduce a software package to fully automate the peptidogenomics approach through the rapid Bayesian probabilistic matching of mass spectra to their corresponding biosynthetic gene clusters. Detailed benchmarking of the method shows that the approach is powerful enough to correctly identify gene clusters even in data sets that consist of hundreds of genomes, which also makes it possible to match compounds from unsequenced organisms to closely related biosynthetic gene clusters in other genomes. Applying Pep2Path to a data set of compounds without known biosynthesis routes, we were able to identify candidate gene clusters for the biosynthesis of five important compounds. Notably, one of these clusters was detected in a genome from a different subphylum of Proteobacteria than that in which the molecule had first been identified. All in all, our approach paves the way towards high-throughput discovery of novel peptidic natural products. Pep2Path is freely available from http://pep2path.sourceforge.net/, implemented in Python, licensed under the GNU General Public License v3 and supported on MS Windows, Linux and Mac OS X.

  5. Mine hoist automation and control systems

    Energy Technology Data Exchange (ETDEWEB)

    Cock, M.J.L. [CEGELEC Projects Ltd., Rugby (United Kingdom). Mining Marine and Industrial Drives Division

    1995-06-01

    In the past, control systems for mine hoists have used many technologies including analogue control, relays and static logic. The dramatic advances in technology in recent years mean that all control functions can now be performed by a distributed microprocessor system which minimises training, gives superior diagnostic information and provides very high reliability. The modern distributed microprocessor system covers all the needs of a mine hoist, from advanced control through automation sequencing to safety systems and electronic speed-distance protection. The safety core remains a proven dual-line relay system, but is enhanced by comprehensive first-up and status monitoring. The advantages of a distributed microprocessor control system are outlined. Details are presented of the proven MWS2000 system as applied to a cycloconvertor winder, and of the range of options available, which includes the elimination of all drum-driven auxiliary shafts, cam gear units and mechanical speed-distance protection. Special control techniques for deep-level hoisting are incorporated in the system, including 'S'-shaped speed control of emergency mechanical brakes to minimise rope stress. Finally, a review is given of the latest developments in control technology, and the implications for future developments in mine hoisting. 10 figs.
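
    The 'S'-shaped speed control mentioned above limits jerk as well as deceleration. A minimal illustration using a raised-cosine ramp (chosen here for simplicity; not necessarily the profile used in the MWS2000 system) is:

      # Illustrative S-curve deceleration profile: speed falls from v0 to 0 over
      # time T with zero slope at both ends, limiting jerk and hence dynamic rope
      # stress. Parameter values are made up for the example.
      import math

      def s_curve_speed(t, v0=12.0, T=20.0):
          """Hoist speed set-point (m/s) at time t (s) during an S-shaped stop."""
          if t <= 0:
              return v0
          if t >= T:
              return 0.0
          return v0 * 0.5 * (1.0 + math.cos(math.pi * t / T))

      for t in range(0, 21, 5):
          print(f"t={t:2d} s  v={s_curve_speed(t):5.2f} m/s")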

  6. Automated control of mine dewatering pumps / Tinus Smith

    OpenAIRE

    Smith, Tinus

    2014-01-01

    Deep gold mines use a vast amount of water for various purposes. After use, the water is pumped back to the surface. This process is energy intensive. The control is traditionally done with manual interventions. The purpose of this study is to investigate the effects of automated control on mine dewatering pumps. Automating mine dewatering pumps may hold a great number of benefits for the client. The benefits include electricity cost savings through load shifting, as well as preventative m...
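
    A minimal rule-based sketch of load shifting for a dewatering pump, with invented tariff hours and dam-level thresholds (a real mine controller would enforce many more interlocks and safety limits), is:

      # Run the pump outside peak-tariff hours unless dam level forces action.
      PEAK_HOURS = {7, 8, 9, 18, 19, 20}   # assumed morning/evening peak tariff

      def pump_command(hour, dam_level_pct, high=85.0, low=40.0):
          """Return True to run the pump, False to hold."""
          if dam_level_pct >= high:        # never risk flooding: pump regardless of tariff
              return True
          if dam_level_pct <= low:         # dam nearly empty: stop pumping
              return False
          return hour not in PEAK_HOURS    # otherwise shift load out of peak periods

      for hour, level in [(8, 60.0), (8, 90.0), (14, 60.0), (19, 35.0)]:
          print(hour, level, pump_command(hour, level))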

  7. Sensing for advancing mining automation capability: A review of underground automation technology development

    Institute of Scientific and Technical Information of China (English)

    Ralston Jonathon; Reid David; Hargrave Chad; Hainsworth David

    2014-01-01

    This paper highlights the role of automation technologies for improving the safety, productivity, and environmental sustainability of underground coal mining processes. This is accomplished by reviewing the impact that the introduction of automation technology has made through the longwall shearer automation research program of the Longwall Automation Steering Committee (LASC). This result has been achieved through close integration of sensing, processing, and control technologies into the longwall mining process. Key to the success of the automation solution has been the development of new sensing methods to accurately measure the location of longwall equipment and the spatial configuration of coal seam geology. The relevance of system interoperability and open communications standards for facilitating effective automation is also discussed. Importantly, the insights gained through the longwall automation development process are now leading to new technology transfer activity to benefit other underground mining processes.

  8. Text mining from ontology learning to automated text processing applications

    CERN Document Server

    Biemann, Chris

    2014-01-01

    This book comprises a set of articles that specify the methodology of text mining, describe the creation of lexical resources in the framework of text mining and use text mining for various tasks in natural language processing (NLP). The analysis of large amounts of textual data is a prerequisite to build lexical resources such as dictionaries and ontologies and also has direct applications in automated text processing in fields such as history, healthcare and mobile applications, just to name a few. This volume gives an update in terms of the recent gains in text mining methods and reflects

  9. Present situation and developing trend of coal mine automation and communication technology

    Institute of Scientific and Technical Information of China (English)

    HU Sui-yan

    2008-01-01

    Introduces the development process of coal mine automation and communication technology, analyzes the present features and characteristics of coal mine automation and communication technology, and puts forward a few key technical problems that need to be solved.

  10. Automated design of genomic Southern blot probes

    Directory of Open Access Journals (Sweden)

    Komiyama Noboru H

    2010-01-01

    experimentally validate a number of these automated designs by Southern blotting. The majority of probes we tested performed well confirming our in silico prediction methodology and the general usefulness of the software for automated genomic Southern probe design. Conclusions Software and supplementary information are freely available at: http://www.genes2cognition.org/software/southern_blot

  11. Pneumatic automation systems in coal mines

    Energy Technology Data Exchange (ETDEWEB)

    Shmatkov, N.A.; Kiklevich, Yu.N.

    1981-04-01

    Giprougleavtomatizatsiya, Avtomatgormash, Dongiprouglemash, VNIIGD and other plants develop 30 new pneumatic systems for mine machines and equipment control each year. The plants produce about 200 types of pneumatic systems. Major pneumatic systems for face systems, machines and equipment are reviewed: the Sirena system for remote control of ANShch and AShchM face systems for steep coal seams, UPS control systems for pump stations, the PAUZA control system for stowing machines, the remote control system of B100-200 drilling machines, the PUSK control system for coal cutter loaders with pneumatic drive (A-70, Temp), the PUVSh control system for ventilation barriers activated from moving electric locomotives, and the PAZ control system for skip hoist loading. Specifications of the systems are given. The economic benefits produced by the pneumatic control systems are evaluated (from 1,500 to 40,000 rubles/year). Using the systems increases productivity of face machines and other machines used in black coal mines by 5 to 30%.

  12. Automation and remote control at Mining 85 in Birmingham

    Energy Technology Data Exchange (ETDEWEB)

    Czauderna, N.

    1985-09-12

    A look round the exhibition showed that more and more manufacturers and users are taking advantage of the new measuring, control and regulatory techniques with economic profit. Process computers and micro-computers make a general use of these techniques possible. The backbone of all automation systems is data transmission between widely distributed stations on individual machines. Here, too, it was clear that in mining multiple data transmission on the basis of highly integrated components and their own microprocessors is occupying an ever more prominent place in communication techniques. BUS systems are more and more in evidence and are already installed in some plants. For the purposes of this report, and for ease of reference to the large field of automation and remote control at Mining 85, the products have been grouped under sensor, transmission and automation technologies. (orig./MOS).

  13. Comparative genomics using data mining tools

    Indian Academy of Sciences (India)

    Tannistha Nandi; Chandrika B-Rao; Srinivasan Ramachandran

    2002-02-01

    We have analysed the genomes of representatives of three kingdoms of life, namely, archaea, eubacteria and eukaryota using data mining tools based on compositional analyses of the protein sequences. The representatives chosen in this analysis were Methanococcus jannaschii, Haemophilus influenzae and Saccharomyces cerevisiae. We have identified the common and different features between the three genomes in the protein evolution patterns. M. jannaschii has been seen to have a greater number of proteins with more charged amino acids whereas S. cerevisiae has been observed to have a greater number of hydrophilic proteins. Despite the differences in intrinsic compositional characteristics between the proteins from the different genomes we have also identified certain common characteristics. We have carried out exploratory Principal Component Analysis of the multivariate data on the proteins of each organism in an effort to classify the proteins into clusters. Interestingly, we found that most of the proteins in each organism cluster closely together, but there are a few ‘outliers’. We focus on the outliers for the functional investigations, which may aid in revealing any unique features of the biology of the respective organisms.
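
    The compositional analysis described above can be sketched as follows: amino-acid frequency vectors are computed per protein and projected with Principal Component Analysis so that outliers stand out. The sequences are made up and scikit-learn is assumed to be available; this is not the authors' pipeline:

      # Amino-acid composition vectors per protein, projected with PCA so that
      # unusual ("outlier") proteins can be spotted in the first components.
      import numpy as np
      from sklearn.decomposition import PCA

      AA = "ACDEFGHIKLMNPQRSTVWY"

      def composition(seq):
          counts = np.array([seq.count(a) for a in AA], dtype=float)
          return counts / counts.sum()

      proteins = {"p1": "MKKLLVVAAA", "p2": "MDEEDDEEDE", "p3": "MKRKRKRKRK", "p4": "MAVLIVLAVL"}
      X = np.vstack([composition(s) for s in proteins.values()])
      pcs = PCA(n_components=2).fit_transform(X)
      for name, (pc1, pc2) in zip(proteins, pcs):
          print(f"{name}: PC1={pc1:+.3f} PC2={pc2:+.3f}")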

  14. Optimizing wireless LAN for longwall coal mine automation

    Energy Technology Data Exchange (ETDEWEB)

    Hargrave, C.O.; Ralston, J.C.; Hainsworth, D.W. [Exploration & Mining Commonwealth Science & Industrial Research Organisation, Pullenvale, Qld. (Australia)

    2007-01-15

    A significant development in underground longwall coal mining automation has been achieved with the successful implementation of wireless LAN (WLAN) technology for communication on a longwall shearer. WIreless-FIdelity (Wi-Fi) was selected to meet the bandwidth requirements of the underground data network, and several configurations were installed on operating longwalls to evaluate their performance. Although these efforts demonstrated the feasibility of using WLAN technology in longwall operation, it was clear that new research and development was required in order to establish optimal full-face coverage. By undertaking an accurate characterization of the target environment, it has been possible to achieve great improvements in WLAN performance over a nominal Wi-Fi installation. This paper discusses the impact of Fresnel zone obstructions and multipath effects on radio frequency propagation and reports an optimal antenna and system configuration. Many of the lessons learned in the longwall case are immediately applicable to other underground mining operations, particularly wherever there is a high degree of obstruction from mining equipment.
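
    The Fresnel-zone consideration mentioned above can be quantified with the standard first-Fresnel-zone radius formula; the following sketch uses illustrative link geometry, not the paper's measurements:

      # First Fresnel zone radius r1 = sqrt(lambda * d1 * d2 / (d1 + d2)), useful
      # when judging how much longwall equipment intrudes into the radio path.
      import math

      def fresnel_radius(freq_hz, d1_m, d2_m, n=1):
          wavelength = 3.0e8 / freq_hz
          return math.sqrt(n * wavelength * d1_m * d2_m / (d1_m + d2_m))

      # 2.4 GHz Wi-Fi link across a 200 m face, evaluated at mid-path:
      print(f"{fresnel_radius(2.4e9, 100.0, 100.0):.2f} m")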

  15. BEACON: automated tool for Bacterial GEnome Annotation ComparisON

    KAUST Repository

    Kalkatawi, Manal Matoq Saeed

    2015-08-18

    Background Genome annotation is one way of summarizing the existing knowledge about genomic characteristics of an organism. There has been an increased interest during the last several decades in computer-based structural and functional genome annotation. Many methods for this purpose have been developed for eukaryotes and prokaryotes. Our study focuses on comparison of functional annotations of prokaryotic genomes. To the best of our knowledge there is no fully automated system for detailed comparison of functional genome annotations generated by different annotation methods (AMs). Results The presence of many AMs and development of new ones introduce needs to: a/ compare different annotations for a single genome, and b/ generate annotation by combining individual ones. To address these issues we developed an Automated Tool for Bacterial GEnome Annotation ComparisON (BEACON) that benefits both AM developers and annotation analysers. BEACON provides detailed comparison of gene function annotations of prokaryotic genomes obtained by different AMs and generates extended annotations through combination of individual ones. For the illustration of BEACON’s utility, we provide a comparison analysis of multiple different annotations generated for four genomes and show on these examples that the extended annotation can increase the number of genes annotated by putative functions up to 27 %, while the number of genes without any function assignment is reduced. Conclusions We developed BEACON, a fast tool for an automated and a systematic comparison of different annotations of single genomes. The extended annotation assigns putative functions to many genes with unknown functions. BEACON is available under GNU General Public License version 3.0 and is accessible at: http://www.cbrc.kaust.edu.sa/BEACON/
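
    A toy sketch of the comparison-and-combination idea (not BEACON's actual matching rules; gene identifiers and functions are hypothetical):

      # Compare two gene-function annotations of the same genome and build a naive
      # "extended" annotation that fills in putative functions for genes that one
      # annotation method left unannotated or labelled "hypothetical protein".
      am1 = {"gene1": "DNA polymerase III", "gene2": "hypothetical protein"}
      am2 = {"gene1": "DNA polymerase III subunit alpha", "gene2": "ABC transporter", "gene3": "tRNA ligase"}

      def compare_and_extend(a, b):
          agree, differ, extended = {}, {}, dict(a)
          for gene in set(a) | set(b):
              fa, fb = a.get(gene), b.get(gene)
              if fa and fb and fa == fb:
                  agree[gene] = fa
              elif fa and fb:
                  differ[gene] = (fa, fb)
              if fa in (None, "hypothetical protein") and fb:
                  extended[gene] = fb          # fill in a putative function
          return agree, differ, extended

      agree, differ, extended = compare_and_extend(am1, am2)
      print("differ:", differ)
      print("extended:", extended)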

  16. The evolution of genome mining in microbes - a review.

    Science.gov (United States)

    Ziemert, Nadine; Alanjary, Mohammad; Weber, Tilmann

    2016-08-27

    Covering: 2006 to 2016. The computational mining of genomes has become an important part in the discovery of novel natural products as drug leads. Thousands of bacterial genome sequences are publicly available these days, containing an even larger number and diversity of secondary metabolite gene clusters that await linkage to their encoded natural products. With the development of high-throughput sequencing methods and the wealth of DNA data available, a variety of genome mining methods and tools have been developed to guide discovery and characterisation of these compounds. This article reviews the development of these computational approaches during the last decade and shows how the revolution of next generation sequencing methods has led to an evolution of various genome mining approaches, techniques and tools. After a short introduction and brief overview of important milestones, this article will focus on the different approaches of mining genomes for secondary metabolites, from detecting biosynthetic genes to resistance based methods and "evo-mining" strategies including a short evaluation of the impact of the development of genome mining methods and tools on the field of natural products and microbial ecology. PMID:27272205

  17. BAGEL2 : mining for bacteriocins in genomic data

    NARCIS (Netherlands)

    de Jong, Anne; van Heel, Auke J.; Kok, Jan; Kuipers, Oscar P.

    2010-01-01

    Mining bacterial genomes for bacteriocins is a challenging task due to the substantial structure and sequence diversity, and generally small sizes, of these antimicrobial peptides. Major progress in the research of antimicrobial peptides and the ever-increasing quantities of genomic data, varying fr

  18. The evolution of genome mining in microbes – a review

    DEFF Research Database (Denmark)

    Ziemert, Nadine; Alanjary, Mohammad; Weber, Tilmann

    2016-01-01

    Covering: 2006 to 2016. The computational mining of genomes has become an important part in the discovery of novel natural products as drug leads. Thousands of bacterial genome sequences are publicly available these days, containing an even larger number and diversity of secondary metabolite gene clusters that await linkage to their encoded natural products. With the development of high-throughput sequencing methods and the wealth of DNA data available, a variety of genome mining methods and tools have been developed to guide discovery and characterisation of these compounds. This article reviews ...

  19. Semi-automated literature mining to identify putative biomarkers of disease from multiple biofluids

    OpenAIRE

    Jordan, Rick; Visweswaran, Shyam; Gopalakrishnan, Vanathi

    2014-01-01

    Background Computational methods for mining of biomedical literature can be useful in augmenting manual searches of the literature using keywords for disease-specific biomarker discovery from biofluids. In this work, we develop and apply a semi-automated literature mining method to mine abstracts obtained from PubMed to discover putative biomarkers of breast and lung cancers in specific biofluids. Methodology A positive set of abstracts was defined by the terms ‘breast cancer’ and ‘lung cance...

  20. Longwall automation: Delivering enabling technology to achieve safer and more productive underground mining

    Institute of Scientific and Technical Information of China (English)

    Ralston Jonathon C.; Reid David C.; Dunn Mark T.; Hainsworth David W

    2015-01-01

    The ongoing need to deliver improved safety, productivity and environmental benefit in coal mining presents an open challenge as well as a powerful incentive to develop new and improved solutions. This paper assesses the critical role that enabling technologies have played in the delivery of remote and automated capability for longwall mining. A brief historical account is given to highlight key technical contributions which have influenced the direction and development of present-day longwall technology. The current state of longwall automation is discussed with particular attention drawn to the technologies that enable automated capability. Outcomes are presented from an independently conducted case study that assessed the impact that CSIRO’s LASC longwall automation research has made to the longwall mining industry in Australia. Importantly, this study reveals how uptake of this innovative technology has significantly benefitted coal mine productivity, improved working conditions for personnel and enhanced environmental outcomes. These benefits have been widely adopted with CSIRO automation technology being used in 60 per cent of all Australian underground operations. International deployment of the technology is also emerging. The paper concludes with future challenges and opportunities to highlight the ongoing scope for longwall automation research and development.

  1. Highlights of recent articles on data mining in genomics & proteomics

    Science.gov (United States)

    This editorial elaborates on investigations consisting of different “OMICS” technologies and their application to biological sciences. In addition, advantages and recent development of the proteomic, genomic and data mining technologies are discussed. This information will be useful to scientists ...

  2. WormBase: methods for data mining and comparative genomics.

    Science.gov (United States)

    Harris, Todd W; Stein, Lincoln D

    2006-01-01

    WormBase is a comprehensive repository for information on Caenorhabditis elegans and related nematodes. Although the primary web-based interface of WormBase (http://www.wormbase.org/) is familiar to most C. elegans researchers, WormBase also offers powerful data-mining features for addressing questions of comparative genomics, genome structure, and evolution. In this chapter, we focus on data mining at WormBase through the use of flexible web interfaces, custom queries, and scripts. The intended audience includes users wishing to query the database beyond the confines of the web interface or fetch data en masse. No knowledge of programming is necessary or assumed, although users with intermediate skills in the Perl scripting language will be able to utilize additional data-mining approaches. PMID:16988424

  3. Mining nematode genome data for novel drug targets.

    Science.gov (United States)

    Foster, Jeremy M; Zhang, Yinhua; Kumar, Sanjay; Carlow, Clotilde K S

    2005-03-01

    Expressed sequence tag projects have currently produced over 400 000 partial gene sequences from more than 30 nematode species and the full genomic sequences of selected nematodes are being determined. In addition, functional analyses in the model nematode Caenorhabditis elegans have addressed the role of almost all genes predicted by the genome sequence. This recent explosion in the amount of available nematode DNA sequences, coupled with new gene function data, provides an unprecedented opportunity to identify pre-validated drug targets through efficient mining of nematode genomic databases. This article describes the various information sources available and strategies that can expedite this process.

  4. Digital Coal Mine Integrated Automation System Based on ControlNet

    Institute of Scientific and Technical Information of China (English)

    CHEN Jin-yun; ZHANG Shen; ZUO Wei-ran

    2007-01-01

    A three-layer model for digital communication in a mine is proposed. Two basic platforms are discussed: A uniform transmission network and a uniform data warehouse. An actual, ControlNet based, transmission network platform suitable for the Jining No.3 coal mine is presented. This network is an information superhighway intended to integrate all existing and new automation subsystems. Its standard interface can be used with future subsystems. The network, data structure and management decision-making all employ this uniform hardware and software. This effectively avoids the problems of system and information islands seen in traditional mine-automation systems. The construction of the network provides a stable foundation for digital communication in the Jining No.3 coal mine.

  5. AGAPE (Automated Genome Analysis PipelinE) for pan-genome analysis of Saccharomyces cerevisiae.

    Directory of Open Access Journals (Sweden)

    Giltae Song

    Full Text Available The characterization and public release of genome sequences from thousands of organisms is expanding the scope for genetic variation studies. However, understanding the phenotypic consequences of genetic variation remains a challenge in eukaryotes due to the complexity of the genotype-phenotype map. One approach to this is the intensive study of model systems for which diverse sources of information can be accumulated and integrated. Saccharomyces cerevisiae is an extensively studied model organism, with well-known protein functions and thoroughly curated phenotype data. To develop and expand the available resources linking genomic variation with function in yeast, we aim to model the pan-genome of S. cerevisiae. To initiate the yeast pan-genome, we newly sequenced or re-sequenced the genomes of 25 strains that are commonly used in the yeast research community using advanced sequencing technology at high quality. We also developed a pipeline for automated pan-genome analysis, which integrates the steps of assembly, annotation, and variation calling. To assign strain-specific functional annotations, we identified genes that were not present in the reference genome. We classified these according to their presence or absence across strains and characterized each group of genes with known functional and phenotypic features. The functional roles of novel genes not found in the reference genome and associated with strains or groups of strains appear to be consistent with anticipated adaptations in specific lineages. As more S. cerevisiae strain genomes are released, our analysis can be used to collate genome data and relate it to lineage-specific patterns of genome evolution. Our new tool set will enhance our understanding of genomic and functional evolution in S. cerevisiae, and will be available to the yeast genetics and molecular biology community.

  6. AGAPE (Automated Genome Analysis PipelinE) for pan-genome analysis of Saccharomyces cerevisiae.

    Science.gov (United States)

    Song, Giltae; Dickins, Benjamin J A; Demeter, Janos; Engel, Stacia; Gallagher, Jennifer; Choe, Kisurb; Dunn, Barbara; Snyder, Michael; Cherry, J Michael

    2015-01-01

    The characterization and public release of genome sequences from thousands of organisms is expanding the scope for genetic variation studies. However, understanding the phenotypic consequences of genetic variation remains a challenge in eukaryotes due to the complexity of the genotype-phenotype map. One approach to this is the intensive study of model systems for which diverse sources of information can be accumulated and integrated. Saccharomyces cerevisiae is an extensively studied model organism, with well-known protein functions and thoroughly curated phenotype data. To develop and expand the available resources linking genomic variation with function in yeast, we aim to model the pan-genome of S. cerevisiae. To initiate the yeast pan-genome, we newly sequenced or re-sequenced the genomes of 25 strains that are commonly used in the yeast research community using advanced sequencing technology at high quality. We also developed a pipeline for automated pan-genome analysis, which integrates the steps of assembly, annotation, and variation calling. To assign strain-specific functional annotations, we identified genes that were not present in the reference genome. We classified these according to their presence or absence across strains and characterized each group of genes with known functional and phenotypic features. The functional roles of novel genes not found in the reference genome and associated with strains or groups of strains appear to be consistent with anticipated adaptations in specific lineages. As more S. cerevisiae strain genomes are released, our analysis can be used to collate genome data and relate it to lineage-specific patterns of genome evolution. Our new tool set will enhance our understanding of genomic and functional evolution in S. cerevisiae, and will be available to the yeast genetics and molecular biology community.
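
    The presence/absence classification step described above can be sketched as follows, with invented strain and gene names standing in for real assemblies:

      # Classify genes by presence/absence across strains: genes found in every
      # strain are "core", the rest "accessory".
      strains = {
          "S288C": {"GENE1", "GENE2", "GENE3"},
          "SK1":   {"GENE1", "GENE2", "GENE4"},
          "W303":  {"GENE1", "GENE3", "GENE4"},
      }

      pan_genome = set().union(*strains.values())
      core = set.intersection(*strains.values())
      accessory = pan_genome - core
      presence = {g: [g in genes for genes in strains.values()] for g in sorted(pan_genome)}

      print("core:", sorted(core))
      print("accessory:", sorted(accessory))
      for gene, row in presence.items():
          print(gene, row)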

  7. Large-scale data mining pilot project in human genome

    Energy Technology Data Exchange (ETDEWEB)

    Musick, R.; Fidelis, R.; Slezak, T.

    1997-05-01

    This whitepaper briefly describes a new, aggressive effort in large-scale data mining at Livermore National Labs. The implications of 'large-scale' will be clarified in a later section. In the short term, this effort will focus on several mission-critical questions of the Genome project. We will adapt current data mining techniques to the Genome domain, quantify the accuracy of inference results, and lay the groundwork for a more extensive effort in large-scale data mining. A major aspect of the approach is that it will build on a fully-staffed data warehousing effort in the human Genome area. The long-term goal is a strong applications-oriented research program in large-scale data mining. The tools and skill set gained will be directly applicable to a wide spectrum of tasks involving large spatial and multidimensional data. This includes applications in ensuring non-proliferation, stockpile stewardship, enabling Global Ecology (Materials Database, Industrial Ecology), advancing the Biosciences (Human Genome Project), and supporting data for others (Battlefield Management, Health Care).

  8. Chapter 13: Mining electronic health records in the genomics era.

    Directory of Open Access Journals (Sweden)

    Joshua C Denny

    Full Text Available The combination of improved genomic analysis methods, decreasing genotyping costs, and increasing computing resources has led to an explosion of clinical genomic knowledge in the last decade. Similarly, healthcare systems are increasingly adopting robust electronic health record (EHR) systems that not only can improve health care, but also contain a vast repository of disease and treatment data that could be mined for genomic research. Indeed, institutions are creating EHR-linked DNA biobanks to enable genomic and pharmacogenomic research, using EHR data for phenotypic information. However, EHRs are designed primarily for clinical care, not research, so reuse of clinical EHR data for research purposes can be challenging. Difficulties in use of EHR data include: data availability, missing data, incorrect data, and vast quantities of unstructured narrative text data. Structured information includes billing codes, most laboratory reports, and other variables such as physiologic measurements and demographic information. Significant information, however, remains locked within EHR narrative text documents, including clinical notes and certain categories of test results, such as pathology and radiology reports. For relatively rare observations, combinations of simple free-text searches and billing codes may prove adequate when followed by manual chart review. However, to extract the large cohorts necessary for genome-wide association studies, natural language processing methods to process narrative text data may be needed. Combinations of structured and unstructured textual data can be mined to generate high-validity collections of cases and controls for a given condition. Once high-quality cases and controls are identified, EHR-derived cases can be used for genomic discovery and validation. Since EHR data includes a broad sampling of clinically-relevant phenotypic information, it may enable multiple genomic investigations upon a single set of genotyped

  9. BBP: Brucella genome annotation with literature mining and curation

    Directory of Open Access Journals (Sweden)

    He Yongqun

    2006-07-01

    Full Text Available Abstract Background Brucella species are Gram-negative, facultative intracellular bacteria that cause brucellosis in humans and animals. Sequences of four Brucella genomes have been published, and various Brucella gene and genome data and analysis resources exist. A web gateway to integrate these resources will greatly facilitate Brucella research. Brucella genome data in current databases is largely derived from computational analysis without experimental validation typically found in peer-reviewed publications. It is partially due to the lack of a literature mining and curation system able to efficiently incorporate the large amount of literature data into genome annotation. It is further hypothesized that literature-based Brucella gene annotation would increase understanding of complicated Brucella pathogenesis mechanisms. Results The Brucella Bioinformatics Portal (BBP) is developed to integrate existing Brucella genome data and analysis tools with literature mining and curation. The BBP InterBru database and Brucella Genome Browser allow users to search and analyze genes of 4 currently available Brucella genomes and link to more than 20 existing databases and analysis programs. Brucella literature publications in PubMed are extracted and can be searched by a TextPresso-powered natural language processing method, a MeSH browser, a keywords search, and an automatic literature update service. To efficiently annotate Brucella genes using the large amount of literature publications, a literature mining and curation system coined Limix is developed to integrate computational literature mining methods with a PubSearch-powered manual curation and management system. The Limix system is used to quickly find and confirm 107 Brucella gene mutations including 75 genes shown to be essential for Brucella virulence. The 75 genes are further clustered using COG. In addition, 62 Brucella genetic interactions are extracted from literature publications. These

  10. Automated training for algorithms that learn from genomic data.

    Science.gov (United States)

    Cilingir, Gokcen; Broschat, Shira L

    2015-01-01

    Supervised machine learning algorithms are used by life scientists for a variety of objectives. Expert-curated public gene and protein databases are major resources for gathering data to train these algorithms. While these data resources are continuously updated, generally, these updates are not incorporated into published machine learning algorithms which thereby can become outdated soon after their introduction. In this paper, we propose a new model of operation for supervised machine learning algorithms that learn from genomic data. By defining these algorithms in a pipeline in which the training data gathering procedure and the learning process are automated, one can create a system that generates a classifier or predictor using information available from public resources. The proposed model is explained using three case studies on SignalP, MemLoci, and ApicoAP in which existing machine learning models are utilized in pipelines. Given that the vast majority of the procedures described for gathering training data can easily be automated, it is possible to transform valuable machine learning algorithms into self-evolving learners that benefit from the ever-changing data available for gene products and to develop new machine learning algorithms that are similarly capable. PMID:25695053
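
    A minimal sketch of the proposed model of operation, with a stubbed data-fetch step standing in for a download from a curated public database and scikit-learn assumed available:

      # Self-updating training pipeline: fetch the latest labelled examples
      # (stubbed here), featurize by amino-acid composition, retrain, and return
      # a fresh classifier so the model tracks database updates.
      from sklearn.linear_model import LogisticRegression

      AA = "ACDEFGHIKLMNPQRSTVWY"

      def fetch_training_data():
          """Placeholder for a download from an expert-curated public resource."""
          return [("MKKLLVVAAASSA", 1), ("MDEEDDEEDEDED", 0), ("MKSLLLAVVATTA", 1), ("MDDDEEEDDEEEE", 0)]

      def featurize(seq):
          return [seq.count(a) / len(seq) for a in AA]

      def build_classifier():
          data = fetch_training_data()
          X = [featurize(s) for s, _ in data]
          y = [label for _, label in data]
          return LogisticRegression().fit(X, y)

      clf = build_classifier()                       # re-run whenever the source updates
      print(clf.predict([featurize("MKALLVVLAASSA")]))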

  11. Data mining and the human genome

    Energy Technology Data Exchange (ETDEWEB)

    Abarbanel, Henry [The MITRE Corporation, McLean, VA (US). JASON Program Office; Callan, Curtis [The MITRE Corporation, McLean, VA (US). JASON Program Office; Dally, William [The MITRE Corporation, McLean, VA (US). JASON Program Office; Dyson, Freeman [The MITRE Corporation, McLean, VA (US). JASON Program Office; Hwa, Terence [The MITRE Corporation, McLean, VA (US). JASON Program Office; Koonin, Steven [The MITRE Corporation, McLean, VA (US). JASON Program Office; Levine, Herbert [The MITRE Corporation, McLean, VA (US). JASON Program Office; Rothaus, Oscar [The MITRE Corporation, McLean, VA (US). JASON Program Office; Schwitters, Roy [The MITRE Corporation, McLean, VA (US). JASON Program Office; Stubbs, Christopher [The MITRE Corporation, McLean, VA (US). JASON Program Office; Weinberger, Peter [The MITRE Corporation, McLean, VA (US). JASON Program Office

    2000-01-07

    As genomics research moves from an era of data acquisition to one of both acquisition and interpretation, new methods are required for organizing and prioritizing the data. These methods would allow an initial level of data analysis to be carried out before committing resources to a particular genetic locus. This JASON study sought to delineate the main problems that must be faced in bioinformatics and to identify information technologies that can help to overcome those problems. While the current influx of data greatly exceeds what biologists have experienced in the past, other scientific disciplines and the commercial sector have been handling much larger datasets for many years. Powerful datamining techniques have been developed in other fields that, with appropriate modification, could be applied to the biological sciences.

  12. A web server for mining Comparative Genomic Hybridization (CGH) data

    Science.gov (United States)

    Liu, Jun; Ranka, Sanjay; Kahveci, Tamer

    2007-11-01

    Advances in cytogenetics and molecular biology have established that chromosomal alterations are critical in the pathogenesis of human cancer. Recurrent chromosomal alterations provide cytological and molecular markers for the diagnosis and prognosis of disease. They also facilitate the identification of genes that are important in carcinogenesis, which in the future may help in the development of targeted therapy. A large amount of publicly available cancer genetic data is now available and it is growing. There is a need for public domain tools that allow users to analyze their data and visualize the results. This chapter describes a web based software tool that will allow researchers to analyze and visualize Comparative Genomic Hybridization (CGH) datasets. It employs novel data mining methodologies for clustering and classification of CGH datasets as well as algorithms for identifying important markers (small set of genomic intervals with aberrations) that are potentially cancer signatures. The developed software will help in understanding the relationships between genomic aberrations and cancer types.

  13. An automated annotation tool for genomic DNA sequences using GeneScan and BLAST

    Indian Academy of Sciences (India)

    Andrew M. Lynn; Chakresh Kumar Jain; K. Kosalai; Pranjan Barman; Nupur Thakur; Harish Batra; Alok Bhattacharya

    2001-04-01

    Genomic sequence data are often available well before the annotated sequence is published. We present a method for analysis of genomic DNA to identify coding sequences using the GeneScan algorithm and characterize these resultant sequences by BLAST. The routines are used to develop a system for automated annotation of genome DNA sequences.
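
    The overall shape of such a pipeline can be sketched as below; gene prediction is stubbed and BLAST is invoked through the standard blastp command line, which is assumed to be installed along with a local protein database (here named refprot for illustration). This is not the authors' actual scripts:

      # Pipeline sketch: a stubbed gene finder emits candidate protein sequences,
      # each of which is then characterized with a local BLAST+ search.
      import subprocess, tempfile, os

      def predict_genes(genomic_dna):
          """Placeholder for GeneScan-style gene prediction; returns protein sequences."""
          return {"cds_1": "MKTAYIAKQRQISFVKSHFSRQLEERLGLIEVQ"}

      def blast_protein(name, protein, db="refprot"):
          with tempfile.NamedTemporaryFile("w", suffix=".fasta", delete=False) as fh:
              fh.write(f">{name}\n{protein}\n")
              query = fh.name
          try:
              out = subprocess.run(
                  ["blastp", "-query", query, "-db", db, "-outfmt", "6"],
                  capture_output=True, text=True, check=True)
              return out.stdout
          finally:
              os.unlink(query)

      for name, prot in predict_genes("ACGT...").items():
          print(name, blast_protein(name, prot)[:200])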

  14. Data mining approaches for information retrieval from genomic databases

    Science.gov (United States)

    Liu, Donglin; Singh, Gautam B.

    2000-04-01

    Sequence retrieval in genomic databases is used for finding sequences related to a query sequence specified by a user. Comparison is the main part of the retrieval system in genomic databases. An efficient sequence comparison algorithm is critical in bioinformatics. There are several different algorithms to perform sequence comparison, such as the suffix array based database search, divergence measurement, methods that rely upon the existence of a local similarity between the query sequence and sequences in the database, or common mutual information between query and sequences in DB. In this paper we have described a new method for DNA sequence retrieval based on data mining techniques. Data mining tools generally find patterns among data and have been successfully applied in industries to improve marketing, sales, and customer support operations. We have applied the descriptive data mining techniques to find relevant patterns that are significant for comparing genetic sequences. Relevance feedback score based on common patterns is developed and employed to compute distance between sequences. The contigs of human chromosomes are used to test the retrieval accuracy and the experimental results are presented.
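
    A simplified stand-in for the pattern-based relevance score described above compares sequences by the overlap of their k-mer sets and ranks database entries accordingly; the sequences and parameters are invented:

      # Jaccard similarity of k-mer sets as a crude pattern-based relevance score.
      def kmers(seq, k=8):
          return {seq[i:i + k] for i in range(len(seq) - k + 1)}

      def similarity(a, b, k=8):
          ka, kb = kmers(a, k), kmers(b, k)
          return len(ka & kb) / len(ka | kb) if ka | kb else 0.0

      database = {
          "seq1": "ACGTACGTACGTTTGACCAGGTACGT",
          "seq2": "TTTTTTCCCCCCGGGGGGAAAAAACG",
      }
      query = "ACGTACGTACGTTTGACCA"
      for name in sorted(database, key=lambda n: similarity(query, database[n]), reverse=True):
          print(name, round(similarity(query, database[name]), 3))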

  15. Proceedings of the third International Federation of Automatic Control symposium on automation in mining, mineral and metal processing

    Energy Technology Data Exchange (ETDEWEB)

    O' Shea, J.; Polis, M. (eds.)

    1980-01-01

    Sixty-four papers covering many aspects of automation in mining, mineral and metal processing are presented. Opening and concluding remarks are also given. Fourteen papers are individually abstracted.

  16. Automated Comparative Auditing of NCIT Genomic Roles Using NCBI

    Science.gov (United States)

    Cohen, Barry; Oren, Marc; Min, Hua; Perl, Yehoshua; Halper, Michael

    2008-01-01

    Biomedical research has identified many human genes and various knowledge about them. The National Cancer Institute Thesaurus (NCIT) represents such knowledge as concepts and roles (relationships). Due to the rapid advances in this field, it is to be expected that the NCIT’s Gene hierarchy will contain role errors. A comparative methodology to audit the Gene hierarchy with the use of the National Center for Biotechnology Information’s (NCBI’s) Entrez Gene database is presented. The two knowledge sources are accessed via a pair of Web crawlers to ensure up-to-date data. Our algorithms then compare the knowledge gathered from each, identify discrepancies that represent probable errors, and suggest corrective actions. The primary focus is on two kinds of gene-roles: (1) the chromosomal locations of genes, and (2) the biological processes in which genes play a role. Regarding chromosomal locations, the discrepancies revealed are striking and systematic, suggesting a structurally common origin. In regard to the biological processes, difficulties arise because genes frequently play roles in multiple processes, and processes may have many designations (such as synonymous terms). Our algorithms make use of the roles defined in the NCIT Biological Process hierarchy to uncover many probable gene-role errors in the NCIT. These results show that automated comparative auditing is a promising technique that can identify a large number of probable errors and corrections for them in a terminological genomic knowledge repository, thus facilitating its overall maintenance. PMID:18486558
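
    The location-comparison step can be sketched as below; the gene records are hypothetical stand-ins for what the two Web crawlers would return, not actual NCIT or Entrez Gene content:

      # Chromosomal locations gathered from two knowledge sources are checked gene
      # by gene; any disagreement is flagged as a probable error for review.
      source_a = {"TP53": "17p13.1", "BRCA2": "13q12.3", "EGFR": "7p11.2"}
      source_b = {"TP53": "17p13.1", "BRCA2": "13q13.1", "EGFR": "7p11.2"}

      def audit_locations(a, b):
          for gene in sorted(set(a) & set(b)):
              if a[gene] != b[gene]:
                  yield gene, a[gene], b[gene]

      for gene, loc_a, loc_b in audit_locations(source_a, source_b):
          print(f"probable error: {gene} is {loc_a} in source A but {loc_b} in source B")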

  17. Amalgamation of Automated Testing and Data Mining: A Novel Approach in Software Testing

    Directory of Open Access Journals (Sweden)

    Sarita Sharma

    2011-09-01

    Full Text Available Software engineering comprehends several disciplines devoted to prevent and remedy malfunctions and to warrant adequate behavior. Testing is a widespread validation approach in industry, but it is still largely ad hoc, expensive, and unpredictably effective. In today's industry, the design of software tests is mostly based on the testers' expertise, while test automation tools are limited to execution of pre-planned tests only. Evaluation of test outputs is also associated with a considerable effort by human testers who often have improper knowledge of the requirements specification. This manual approach to software testing results in heavy losses to the world's economy. This paper proposes the potential use of data mining algorithms for automated induction of functional requirements from execution data. The induced data mining models of tested software can be utilized for recovering missing and incomplete specifications, designing a minimal set of regression tests, and evaluating the correctness of software outputs when testing new, potentially inconsistent releases of the system.

  18. Data mining for regulatory elements in yeast genome.

    Science.gov (United States)

    Brazma, A; Vilo, J; Ukkonen, E; Valtonen, K

    1997-01-01

    We have examined methods and developed a general software tool for finding and analyzing combinations of transcription factor binding sites that occur relatively often in gene upstream regions (putative promoter regions) in the yeast genome. Such frequently occurring combinations may be essential parts of possible promoter classes. The regions upstream to all genes were first isolated from the yeast genome database MIPS using the information in the annotation files of the database. The ones that do not overlap with coding regions were chosen for further studies. Next, all occurrences of the yeast transcription factor binding sites, as given in the IMD database, were located in the genome and in the selected regions in particular. Finally, by using a general purpose data mining software in combination with our own software, which parametrizes the search, we can find the combinations of binding sites that occur in the upstream regions more frequently than would be expected on the basis of the frequency of individual sites. The procedure also finds so-called association rules present in such combinations. The developed tool is available for use through the WWW.
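
    The over-representation test for a pair of binding sites can be sketched as follows, with invented upstream-region data: the observed number of regions containing both sites is compared with the count expected if the sites occurred independently:

      # Count co-occurrence of two binding sites across upstream regions and
      # compare with the expectation under independence.
      upstream_sites = {
          "YAL001C": {"MCB", "SCB"}, "YAL002W": {"MCB", "SCB"}, "YAL003W": {"MCB"},
          "YAL004W": {"SCB"}, "YAL005C": set(), "YAL007C": {"MCB", "SCB"},
      }

      def cooccurrence_ratio(regions, site_a, site_b):
          n = len(regions)
          has_a = sum(site_a in s for s in regions.values())
          has_b = sum(site_b in s for s in regions.values())
          both = sum(site_a in s and site_b in s for s in regions.values())
          expected = has_a * has_b / n
          return both, expected, both / expected if expected else float("inf")

      print(cooccurrence_ratio(upstream_sites, "MCB", "SCB"))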

  19. Event-based text mining for biology and functional genomics.

    Science.gov (United States)

    Ananiadou, Sophia; Thompson, Paul; Nawaz, Raheel; McNaught, John; Kell, Douglas B

    2015-05-01

    The assessment of genome function requires a mapping between genome-derived entities and biochemical reactions, and the biomedical literature represents a rich source of information about reactions between biological components. However, the increasingly rapid growth in the volume of literature provides both a challenge and an opportunity for researchers to isolate information about reactions of interest in a timely and efficient manner. In response, recent text mining research in the biology domain has been largely focused on the identification and extraction of 'events', i.e. categorised, structured representations of relationships between biochemical entities, from the literature. Functional genomics analyses necessarily encompass events as so defined. Automatic event extraction systems facilitate the development of sophisticated semantic search applications, allowing researchers to formulate structured queries over extracted events, so as to specify the exact types of reactions to be retrieved. This article provides an overview of recent research into event extraction. We cover annotated corpora on which systems are trained, systems that achieve state-of-the-art performance and details of the community shared tasks that have been instrumental in increasing the quality, coverage and scalability of recent systems. Finally, several concrete applications of event extraction are covered, together with emerging directions of research.

  20. Clinic-Genomic Association Mining for Colorectal Cancer Using Publicly Available Datasets

    OpenAIRE

    Fang Liu; Yaning Feng; Zhenye Li; Chao Pan; Yuncong Su; Rui Yang; Liying Song; Huilong Duan; Ning Deng

    2014-01-01

    In recent years, a growing number of researchers began to focus on how to establish associations between clinical and genomic data. However, up to now, there is lack of research mining clinic-genomic associations by comprehensively analysing available gene expression data for a single disease. Colorectal cancer is one of the malignant tumours. A number of genetic syndromes have been proven to be associated with colorectal cancer. This paper presents our research on mining clinic-genomic assoc...

  1. Joint Genome Institute's Automation Approach and History

    Energy Technology Data Exchange (ETDEWEB)

    Roberts, Simon

    2006-07-05

    Department of Energy/Joint Genome Institute (DOE/JGI) collaborates with DOE national laboratories and community users, to advance genome science in support of the DOE missions of clean bio-energy, carbon cycling, and bioremediation.

  2. Automated Data Mining of A Proprietary Database System for Physician Quality Improvement

    International Nuclear Information System (INIS)

    Purpose: Physician practice quality improvement is a subject of intense national debate. This report describes using a software data acquisition program to mine an existing, commonly used proprietary radiation oncology database to assess physician performance. Methods and Materials: Between 2003 and 2004, a manual analysis was performed of electronic portal image (EPI) review records. Custom software was recently developed to mine the record-and-verify database and the review process of EPI at our institution. In late 2006, a report was developed that allowed for immediate review of physician completeness and speed of EPI review for any prescribed period. Results: The software extracted >46,000 EPIs between 2003 and 2007, providing EPI review status and time to review by each physician. Between 2003 and 2007, the department EPI review improved from 77% to 97% (range, 85.4-100%), with a decrease in the mean time to review from 4.2 days to 2.4 days. The initial intervention in 2003 to 2004 was moderately successful in changing the EPI review patterns; it was not repeated because of the time required to perform it. However, the implementation in 2006 of the automated review tool yielded a profound change in practice. Using the software, the automated chart review required ∼1.5 h for mining and extracting the data for the 4-year period. Conclusion: This study quantified the EPI review process as it evolved during a 4-year period at our institution and found that automation of data retrieval and review simplified and facilitated physician quality improvement

  3. Lunar surface mining for automated acquisition of helium-3: Methods, processes, and equipment

    Science.gov (United States)

    Li, Y. T.; Wittenberg, L. J.

    1992-01-01

    In this paper, several techniques considered for mining and processing the regolith on the lunar surface are presented. These techniques have been proposed and evaluated based primarily on the following criteria: (1) mining operations should be relatively simple; (2) procedures of mineral processing should be few and relatively easy; (3) transferring tonnages of regolith on the Moon should be minimized; (4) operations outside the lunar base should be readily automated; (5) all equipment should be maintainable; and (6) economic benefit should be sufficient for commercial exploitation. The economic benefits are not addressed in this paper; however, the energy benefits have been estimated to be between 250 and 350 times the mining energy. A mobile mining scheme is proposed that meets most of the mining objectives. This concept uses a bucket-wheel excavator for excavating the regolith, several mechanical electrostatic separators for beneficiation of the regolith, a fast-moving fluidized bed reactor to heat the particles, and a palladium diffuser to separate H2 from the other solar wind gases. At the final stage of the miner, the regolith 'tailings' are deposited directly into the ditch behind the miner and cylinders of the valuable solar wind gases are transported to a central gas processing facility. During the production of He-3, large quantities of valuable H2, H2O, CO, CO2, and N2 are produced for utilization at the lunar base. For larger production of He-3 the utilization of multiple-miners is recommended rather than increasing their size. Multiple miners permit operations at more sites and provide redundancy in case of equipment failure.

  4. Automated Data Preparation and Physics Mining Tools for Space Weather Studies (Invited)

    Science.gov (United States)

    Karimabadi, H.; Sipes, T.

    2009-12-01

    Heliophysics is a data-centric field which relies heavily on the use of spacecraft data for further advances. The prevalent approach to analysis of spacecraft data is based on visual inspection of the data. As a result, the vast majority of the data collected by various missions has gone unexplored. The computer-aided, algorithmic approach to data analysis facilitated by data mining techniques is essential for analysis of large data sets and enables discovery of hidden information and patterns in the data. Many data analysis problems in space weather stand to benefit from the application of data mining techniques. Examples include identifying spacecraft charging signatures in plasma detectors, identifying plasma frequency lines in wave spectrograms (and hence density), and detecting and classifying substorm injection features, among others (R. Friedel, private communication). Thus, while the need for an advanced algorithmic approach to data exploration and knowledge discovery is generally recognized by experimentalists, the adoption of such techniques (“data mining”) has been slow. This has been partly due to the steep learning curve of some of the techniques and/or the requirement to have a working knowledge of statistics. Another factor is the existence of a plethora of data mining approaches, and it is often a daunting task for a scientist to determine the appropriate technique. Our goal has been to make such tools accessible to non-experts and to move them out of the gee-whiz domain into practical tools that become part of the standard arsenal of data analysis. To this end, we have developed an automated data mining technique called MineTool. Its first deployment, to the analysis of Cluster data, has been very successful (Karimabadi et al., JGR, 114, A06216, 2009) and this tool is gaining adoption among experimentalists. In this talk, we will provide an overview of this tool, illustrate its use through examples, and discuss future directions of research.

  5. Automating the Analysis of Spatial Grids A Practical Guide to Data Mining Geospatial Images for Human & Environmental Applications

    CERN Document Server

    Lakshmanan, Valliappa

    2012-01-01

    The ability to create automated algorithms to process gridded spatial data is increasingly important as remotely sensed datasets grow in volume and frequency, whether the application lies in business, social science, ecology, meteorology or urban planning. This book provides students with a foundation in topics of digital image processing and data mining as applied to geospatial datasets. The aim is for readers to be able to devise and implement automated techniques to extract information from spatial grids such as radar, satellite or high-resolution survey imagery.

  6. New development of longwall mining equipment based on automation and intelligent technology for thin seam coal

    Institute of Scientific and Technical Information of China (English)

    Guo-fa WANG

    2013-01-01

    The paper introduces the complete set of automated equipment and technology used at thin-seam coal faces, and proposes a comprehensively mechanized and automated model for safe, high-efficiency mining based on the thin-seam drum shearer. The key technologies of the short-length, high-power thin-seam drum shearer and of a new type of roof support with a large extension ratio and plate canopy are introduced. The paper also describes new research results on the automatic control system for the complete equipment set for thin-seam coal faces, which is composed of an electro-hydraulic system, compact thin-seam roof supports and a high-efficiency shearer with an intelligent control system, and is characterized by automatic follow-up and remote control technology.

  7. PLAN: a web platform for automating high-throughput BLAST searches and for managing and mining results

    Directory of Open Access Journals (Sweden)

    Zhao Xuechun

    2007-02-01

    Full Text Available Abstract Background BLAST searches are widely used for sequence alignment. The search results are commonly adopted for various functional and comparative genomics tasks such as annotating unknown sequences, investigating gene models and comparing two sequence sets. Advances in sequencing technologies pose challenges for high-throughput analysis of large-scale sequence data. A number of programs and hardware solutions exist for efficient BLAST searching, but there is a lack of generic software solutions for mining and personalized management of the results. Systematically reviewing the results and identifying information of interest remains tedious and time-consuming. Results Personal BLAST Navigator (PLAN) is a versatile web platform that helps users to carry out various personalized pre- and post-BLAST tasks, including: (1) query and target sequence database management, (2) automated high-throughput BLAST searching, (3) indexing and searching of results, (4) filtering results online, (5) managing results of personal interest in favorite categories, (6) automated sequence annotation (such as NCBI NR and ontology-based annotation). PLAN integrates, by default, the Decypher hardware-based BLAST solution provided by Active Motif Inc. with a greatly improved efficiency over conventional BLAST software. BLAST results are visualized by spreadsheets and graphs and are full-text searchable. BLAST results and sequence annotations can be exported, in part or in full, in various formats including Microsoft Excel and FASTA. Sequences and BLAST results are organized in projects, the data publication levels of which are controlled by the registered project owners. In addition, all analytical functions are provided to public users without registration. Conclusion PLAN has proved a valuable addition to the community for automated high-throughput BLAST searches, and, more importantly, for knowledge discovery, management and sharing based on sequence alignment results.
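
    The batch pattern that PLAN automates, namely running many BLAST jobs and filtering the tabular results afterwards, can be sketched locally. The snippet below is a minimal illustration rather than PLAN's own code: it assumes NCBI BLAST+ is installed, that a nucleotide database has already been built with makeblastdb, and the directory and database names are hypothetical.

```python
import csv
import subprocess
from pathlib import Path

QUERY_DIR = Path("queries")        # hypothetical directory of FASTA query files
BLAST_DB = "my_nt_db"              # built beforehand, e.g. makeblastdb -in refs.fasta -dbtype nucl -out my_nt_db
OUT_DIR = Path("blast_results")
OUT_DIR.mkdir(exist_ok=True)

def run_blast(query: Path) -> Path:
    """Run blastn for one query file and return the path of its tabular result."""
    out_file = OUT_DIR / (query.stem + ".tsv")
    subprocess.run(
        ["blastn", "-query", str(query), "-db", BLAST_DB,
         "-outfmt", "6",                      # tab-separated: qseqid sseqid pident length ... evalue bitscore
         "-evalue", "1e-5", "-num_threads", "4",
         "-out", str(out_file)],
        check=True,
    )
    return out_file

def filtered_hits(result: Path, min_identity: float = 90.0):
    """Yield hit rows whose percent identity passes a simple post-BLAST filter."""
    with open(result) as handle:
        for row in csv.reader(handle, delimiter="\t"):
            if float(row[2]) >= min_identity:
                yield row

for query in sorted(QUERY_DIR.glob("*.fasta")):
    hits = list(filtered_hits(run_blast(query)))
    print(f"{query.name}: {len(hits)} hits passing the identity filter")
```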

  8. Evaluation of Three Automated Genome Annotations for Halorhabdus utahensis

    DEFF Research Database (Denmark)

    Bakke, Peter; Carney, Nick; DeLoache, Will;

    2009-01-01

    Genome annotations are accumulating rapidly and depend heavily on automated annotation systems. Many genome centers offer annotation systems but no one has compared their output in a systematic way to determine accuracy and inherent errors. Errors in the annotations are routinely deposited in databases such as NCBI and used to validate subsequent annotation errors. We submitted the genome sequence of the halophilic archaeon Halorhabdus utahensis to be analyzed by three genome annotation services. We have examined the output from each service in a variety of ways in order to compare the methodology and effectiveness of the annotations, as well as to explore the genes, pathways, and physiology of the previously unannotated genome. The annotation services differ considerably in gene calls, features, and ease of use. We had to manually identify the origin of replication and the species...

  9. Investigating the Control of Chlorophyll Degradation by Genomic Correlation Mining.

    Science.gov (United States)

    Ghandchi, Frederick P; Caetano-Anolles, Gustavo; Clough, Steven J; Ort, Donald R

    2016-01-01

    Chlorophyll degradation is an intricate process that is critical in a variety of plant tissues at different times during the plant life cycle. Many of the photoactive chlorophyll degradation intermediates are exceptionally cytotoxic, necessitating that the pathway be carefully coordinated and regulated. The primary regulatory step in the chlorophyll degradation pathway involves the enzyme pheophorbide a oxygenase (PAO), which oxidizes the chlorophyll intermediate pheophorbide a, which is eventually converted to non-fluorescent chlorophyll catabolites. There is evidence that PAO is differentially regulated across different environmental and developmental conditions with both transcriptional and post-transcriptional components, but the regulatory elements involved are uncertain or unknown. We hypothesized that transcription factors modulate PAO expression across different environmental conditions, such as cold and drought, as well as during developmental transitions to leaf senescence and maturation of green seeds. To test these hypotheses, several sets of Arabidopsis genomic and bioinformatic experiments were investigated and re-analyzed using computational approaches. PAO expression was compared across varied environmental conditions in three separate datasets using regression modeling and correlation mining to identify gene elements co-expressed with PAO. Their functions were investigated as candidate upstream transcription factors or other regulatory elements that may regulate PAO expression. PAO transcript expression was found to be significantly up-regulated in warm conditions, during leaf senescence, and in drought conditions, and in all three conditions it was significantly positively correlated with expression of the transcription factor Arabidopsis thaliana activating factor 1 (ATAF1), suggesting that ATAF1 is triggered in the plant response to these processes or abiotic stresses and in turn up-regulates PAO expression. The proposed regulatory network includes the
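
    The correlation-mining step described above can be sketched with a few lines of pandas, assuming an expression matrix with genes as rows and samples as columns is available; the file name, the row label used for the PAO transcript and the correlation threshold are all hypothetical choices for illustration, not details taken from the study.

```python
import pandas as pd

# Hypothetical input: rows are genes, columns are samples/conditions, values are expression levels.
expr = pd.read_csv("expression_matrix.csv", index_col=0)

target = "PAO"  # hypothetical row label for the pheophorbide a oxygenase transcript

# Pearson correlation of every gene's expression profile with the target gene across all samples.
correlations = expr.T.corrwith(expr.loc[target])

# Keep strongly co-expressed candidates; the 0.8 cutoff is arbitrary and for illustration only.
candidates = (
    correlations.drop(target)
    .loc[lambda s: s.abs() >= 0.8]
    .sort_values(ascending=False)
)
print(candidates.head(20))
```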

  10. Mining association rule bases from integrated genomic data and annotations

    OpenAIRE

    Martinez, Ricardo; Pasquier, Nicolas; Pasquier, Claude

    2008-01-01

    International audience During the last decade, several clustering and association rule mining techniques have been applied to identify groups of co-regulated genes in gene expression data. Nowadays, integrating biological knowledge and gene expression data into a single framework has become a major challenge to improve the relevance of mined patterns and simplify their interpretation by the biologists. The GenMiner approach was developed for mining association rules showing gene groups tha...
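
    As a rough sketch of this kind of association rule mining (not the GenMiner algorithm itself), the example below builds a small boolean gene-by-item matrix mixing discretized expression labels and annotation terms, then extracts rules with the apriori implementation in the mlxtend package; all item names and thresholds are made up.

```python
import pandas as pd
from mlxtend.frequent_patterns import apriori, association_rules

# Hypothetical boolean matrix: rows are genes, columns are items such as discretized
# expression labels and annotation terms; True means the gene carries that item.
items = pd.DataFrame(
    {
        "overexpressed_cond1": [True, True, False, True],
        "underexpressed_cond2": [True, False, False, True],
        "GO:0006950": [True, True, False, True],
    },
    index=["geneA", "geneB", "geneC", "geneD"],
)

# Frequent itemsets first, then rules such as "overexpressed_cond1 -> GO:0006950".
frequent = apriori(items, min_support=0.5, use_colnames=True)
rules = association_rules(frequent, metric="confidence", min_threshold=0.8)
print(rules[["antecedents", "consequents", "support", "confidence"]])
```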

  11. Automated quality control for genome wide association studies

    Science.gov (United States)

    Ellingson, Sally R.; Fardo, David W.

    2016-01-01

    This paper provides details on the necessary steps to assess and control data in genome wide association studies (GWAS) using genotype information on a large number of genetic markers for a large number of individuals. Due to varied study designs and genotyping platforms between multiple sites/projects, as well as potential genotyping errors, it is important to ensure high quality data. Scripts and directions are provided to facilitate others in this process.
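
    A typical automated QC pass can be driven from a short script; the sketch below is not the authors' code and simply wraps PLINK 1.9 with commonly used (illustrative) thresholds, assuming a binary fileset named gwas_data.{bed,bim,fam} exists and plink is on the PATH.

```python
import subprocess

# Standard marker- and sample-level filters; thresholds are common illustrative defaults.
subprocess.run(
    [
        "plink",
        "--bfile", "gwas_data",
        "--maf", "0.01",     # drop rare variants (minor allele frequency < 1%)
        "--geno", "0.05",    # drop variants missing in more than 5% of samples
        "--mind", "0.05",    # drop samples with more than 5% missing genotypes
        "--hwe", "1e-6",     # drop variants failing Hardy-Weinberg equilibrium
        "--make-bed",
        "--out", "gwas_data_qc",
    ],
    check=True,
)
```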

  12. Application of the Deformation Information System for automated analysis and mapping of mining terrain deformations - case study from SW Poland

    Science.gov (United States)

    Blachowski, Jan; Grzempowski, Piotr; Milczarek, Wojciech; Nowacka, Anna

    2015-04-01

    Monitoring, mapping and modelling of mining-induced terrain deformations are important tasks for quantifying and minimising the threats that arise from underground extraction of useful minerals and affect surface infrastructure, human safety, the environment and the security of the mining operation itself. The range of methods and techniques used for monitoring and analysis of mining terrain deformations is wide and expanding with the progress in geographical information technologies. These include, for example: terrestrial geodetic measurements, Global Navigation Satellite Systems, remote sensing, GIS-based modelling and spatial statistics, finite element method modelling, geological modelling, empirical modelling using e.g. the Knothe theory, artificial neural networks, fuzzy logic calculations and others. The presentation shows the results of numerical modelling and mapping of mining terrain deformations for two underground mining sites in SW Poland, one in hard coal (abandoned) and one in copper ore (active), using the functionalities of the Deformation Information System (DIS) (Blachowski et al, 2014 @ http://meetingorganizer.copernicus.org/EGU2014/EGU2014-7949.pdf). The functionalities of the spatial data modelling module of DIS are presented, and its applications in modelling, mapping and visualising mining terrain deformations based on processing of measurement data (geodetic and GNSS) for these two cases are characterised and compared. These include automation procedures, developed and implemented in DIS, for calculating mining terrain subsidence with different interpolation techniques, calculation of other mining deformation parameters (i.e. tilt, horizontal displacement, horizontal strain and curvature), as well as mapping mining terrain categories based on classification of the values of these parameters as used in Poland. Acknowledgments. This work has been financed from the National Science Centre Project "Development of a numerical method of
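
    The gridding step behind such subsidence mapping can be illustrated with SciPy; the sketch below is not DIS code and uses synthetic survey points, but shows the pattern of interpolating scattered measurements onto a regular grid and deriving tilt from the resulting surface.

```python
import numpy as np
from scipy.interpolate import griddata

# Synthetic scattered survey points: x, y coordinates (m) and measured subsidence (mm).
rng = np.random.default_rng(0)
points = rng.uniform(0, 1000, size=(200, 2))
subsidence = -50 * np.exp(-((points[:, 0] - 500) ** 2 + (points[:, 1] - 500) ** 2) / 1e5)

# Interpolate the scattered measurements onto a regular 10 m grid.
gx, gy = np.mgrid[0:1000:101j, 0:1000:101j]
surface = griddata(points, subsidence, (gx, gy), method="cubic")

# Tilt (first spatial derivative of subsidence) is one of the deformation parameters mentioned above.
tilt_x, tilt_y = np.gradient(surface, 10.0)
print("max subsidence [mm]:", np.nanmin(surface))
print("max |tilt| [mm/m]:", np.nanmax(np.hypot(tilt_x, tilt_y)))
```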

  13. Automated quality control for genome wide association studies.

    Science.gov (United States)

    Ellingson, Sally R; Fardo, David W

    2016-01-01

    This paper provides details on the necessary steps to assess and control data in genome wide association studies (GWAS) using genotype information on a large number of genetic markers for a large number of individuals. Due to varied study designs and genotyping platforms between multiple sites/projects, as well as potential genotyping errors, it is important to ensure high quality data. Scripts and directions are provided to facilitate others in this process. PMID:27635224

  14. Mining and characterization of two amidase signature family amidases from Brevibacterium epidermidis ZJB-07021 by an efficient genome mining approach.

    Science.gov (United States)

    Ruan, Li-Tao; Zheng, Ren-Chao; Zheng, Yu-Guo

    2016-10-01

    Amidases have received increasing attention for their significant potential in the production of valuable carboxylic acids. In this study, two amidases belonging to amidase signature family (BeAmi2 and BeAmi4) were identified and mined from genomic DNA of Brevibacterium epidermidis ZJB-07021 by an efficient strategy combining comparative analysis of genomes and identification of unknown region by high-efficiency thermal asymmetric interlaced PCR (HiTAIL-PCR). The deduced amino acid sequences of BeAmi2 and BeAmi4 showed low identity (derivatives. PMID:27180252

  15. Correcting Inconsistencies and Errors in Bacterial Genome Metadata Using an Automated Curation Tool in Excel (AutoCurE).

    Science.gov (United States)

    Schmedes, Sarah E; King, Jonathan L; Budowle, Bruce

    2015-01-01

    Whole-genome data are invaluable for large-scale comparative genomic studies. Current sequencing technologies have made it feasible to sequence entire bacterial genomes with relative ease and speed and at a substantially reduced cost per nucleotide, hence cost per genome. More than 3,000 bacterial genomes have been sequenced and are available at the finished status. Publicly available genomes can be readily downloaded; however, there are challenges to verify the specific supporting data contained within the download and to identify errors and inconsistencies that may be present within the organizational data content and metadata. AutoCurE, an automated tool for bacterial genome database curation in Excel, was developed to facilitate local database curation of supporting data that accompany downloaded genomes from the National Center for Biotechnology Information. AutoCurE provides an automated approach to curate local genomic databases by flagging inconsistencies or errors found by comparing the downloaded supporting data to the genome reports, verifying genome name, RefSeq accession numbers, the presence of archaea, BioProject/UIDs, and sequence file descriptions. Flags are generated for nine metadata fields if there are inconsistencies between the downloaded genomes and the genome reports or if erroneous or missing data are evident. AutoCurE is an easy-to-use tool for local database curation of large-scale genome data prior to downstream analyses.
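
    The flagging idea can be sketched with pandas, assuming the downloaded metadata and the NCBI genome report have been parsed into two tables sharing an accession column; the column and file names are hypothetical, and this is not AutoCurE itself.

```python
import pandas as pd

# Hypothetical tables: metadata parsed from downloaded genomes vs. the genome report.
local = pd.read_csv("downloaded_metadata.csv")   # columns: accession, organism, bioproject
report = pd.read_csv("genome_report.csv")        # columns: accession, organism, bioproject

merged = local.merge(report, on="accession", how="outer",
                     suffixes=("_local", "_report"), indicator=True)

flags = []
for field in ("organism", "bioproject"):
    mismatch = (merged["_merge"] == "both") & merged[f"{field}_local"].ne(merged[f"{field}_report"])
    for acc in merged.loc[mismatch, "accession"]:
        flags.append((acc, field, "mismatch between download and report"))

# Records present in only one of the two sources are flagged as well.
for acc in merged.loc[merged["_merge"] != "both", "accession"]:
    flags.append((acc, "record", "missing in one source"))

pd.DataFrame(flags, columns=["accession", "field", "flag"]).to_csv("curation_flags.csv", index=False)
print(f"{len(flags)} flags written to curation_flags.csv")
```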

  16. Discovery of Defense- and Neuropeptides in Social Ants by Genome-Mining

    OpenAIRE

    Gruber, Christian W.; Markus Muttenthaler

    2012-01-01

    Natural peptides of great number and diversity occur in all organisms, but analyzing their peptidome is often difficult. With natural product drug discovery in mind, we devised a genome-mining approach to identify defense- and neuropeptides in the genomes of social ants from Atta cephalotes (leaf-cutter ant), Camponotus floridanus (carpenter ant) and Harpegnathos saltator (basal genus). Numerous peptide-encoding genes of defense peptides, in particular defensins, and neuropeptides or regulato...

  17. Mining for Single Nucleotide Polymorphisms in Pig genome sequence data

    NARCIS (Netherlands)

    Kerstens, H.H.D.; Kollers, S.; Kommandath, A.; Rosario, del M.; Dibbits, B.W.; Kinders, S.M.; Crooijmans, R.P.M.A.; Groenen, M.A.M.

    2009-01-01

    Background - Single nucleotide polymorphisms (SNPs) are ideal genetic markers due to their high abundance and the highly automated way in which SNPs are detected and SNP assays are performed. The number of SNPs identified in the pig thus far is still limited. Results - A total of 4.8 million whole g

  18. Ask and Ye Shall Receive? Automated Text Mining of Michigan Capital Facility Finance Bond Election Proposals to Identify Which Topics Are Associated with Bond Passage and Voter Turnout

    Science.gov (United States)

    Bowers, Alex J.; Chen, Jingjing

    2015-01-01

    The purpose of this study is to bring together recent innovations in the research literature around school district capital facility finance, municipal bond elections, statistical models of conditional time-varying outcomes, and data mining algorithms for automated text mining of election ballot proposals to examine the factors that influence the…

  19. Draft Genome Sequences of Two Novel Acidimicrobiaceae Members from an Acid Mine Drainage Biofilm Metagenome

    Science.gov (United States)

    Pinto, Ameet J.; Sharp, Jonathan O.; Yoder, Michael J.

    2016-01-01

    Bacteria belonging to the family Acidimicrobiaceae are frequently encountered in heavy metal-contaminated acidic environments. However, their phylogenetic and metabolic diversity is poorly resolved. We present draft genome sequences of two novel and phylogenetically distinct Acidimicrobiaceae members assembled from an acid mine drainage biofilm metagenome. PMID:26769942

  20. Toward the automated generation of genome-scale metabolic networks in the SEED

    Directory of Open Access Journals (Sweden)

    Gould John

    2007-04-01

    Full Text Available Abstract Background Current methods for the automated generation of genome-scale metabolic networks focus on genome annotation and preliminary biochemical reaction network assembly, but do not adequately address the process of identifying and filling gaps in the reaction network, and verifying that the network is suitable for systems level analysis. Thus, current methods are only sufficient for generating draft-quality networks, and refinement of the reaction network is still largely a manual, labor-intensive process. Results We have developed a method for generating genome-scale metabolic networks that produces substantially complete reaction networks, suitable for systems level analysis. Our method partitions the reaction space of central and intermediary metabolism into discrete, interconnected components that can be assembled and verified in isolation from each other, and then integrated and verified at the level of their interconnectivity. We have developed a database of components that are common across organisms, and have created tools for automatically assembling appropriate components for a particular organism based on the metabolic pathways encoded in the organism's genome. This focuses manual efforts on that portion of an organism's metabolism that is not yet represented in the database. We have demonstrated the efficacy of our method by reverse-engineering and automatically regenerating the reaction network from a published genome-scale metabolic model for Staphylococcus aureus. Additionally, we have verified that our method capitalizes on the database of common reaction network components created for S. aureus, by using these components to generate substantially complete reconstructions of the reaction networks from three other published metabolic models (Escherichia coli, Helicobacter pylori, and Lactococcus lactis). We have implemented our tools and database within the SEED, an open-source software environment for comparative
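
    A toy version of the verification step, checking whether a draft network can produce required compounds from a given medium, is sketched below; the reactions, metabolites and medium are invented for illustration and this is not the SEED implementation.

```python
# Each reaction maps a set of substrates to a set of products.
reactions = {
    "rxn1": ({"glucose"}, {"g6p"}),
    "rxn2": ({"g6p"}, {"f6p"}),
    "rxn3": ({"f6p", "atp"}, {"fdp"}),
}
medium = {"glucose", "atp"}          # compounds supplied by the growth medium
targets = {"fdp", "pyruvate"}        # compounds the network is expected to produce

# Expand the set of producible compounds until a fixed point is reached.
producible = set(medium)
changed = True
while changed:
    changed = False
    for substrates, products in reactions.values():
        if substrates <= producible and not products <= producible:
            producible |= products
            changed = True

# Unreachable targets point to gaps in the draft network that need filling.
print("gaps:", targets - producible)   # -> {'pyruvate'}
```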

  1. KAIKObase: An integrated silkworm genome database and data mining tool

    Directory of Open Access Journals (Sweden)

    Nagaraju Javaregowda

    2009-10-01

    Full Text Available Abstract Background The silkworm, Bombyx mori, is one of the most economically important insects in many developing countries owing to its large-scale cultivation for silk production. With the development of genomic and biotechnological tools, B. mori has also become an important bioreactor for production of various recombinant proteins of biomedical interest. In 2004, two genome sequencing projects for B. mori were reported independently by Chinese and Japanese teams; however, the datasets were insufficient for building long genomic scaffolds which are essential for unambiguous annotation of the genome. Now, both the datasets have been merged and assembled through a joint collaboration between the two groups. Description Integration of the two data sets of silkworm whole-genome-shotgun sequencing by the Japanese and Chinese groups together with newly obtained fosmid- and BAC-end sequences produced the best continuity (~3.7 Mb in N50 scaffold size) among the sequenced insect genomes and provided a high degree of nucleotide coverage (88%) of all 28 chromosomes. In addition, a physical map of BAC contigs constructed by fingerprinting BAC clones and a SNP linkage map constructed using BAC-end sequences were available. In parallel, proteomic data from two-dimensional polyacrylamide gel electrophoresis in various tissues and developmental stages were compiled into a silkworm proteome database. Finally, a Bombyx trap database was constructed for documenting insertion positions and expression data of transposon insertion lines. Conclusion For efficient usage of genome information for functional studies, genomic sequences, physical and genetic map information and EST data were compiled into KAIKObase, an integrated silkworm genome database which consists of 4 map viewers, a gene viewer, and sequence, keyword and position search systems to display results and data at the level of nucleotide sequence, gene, scaffold and chromosome. Integration of the

  2. VirSorter: mining viral signal from microbial genomic data

    Directory of Open Access Journals (Sweden)

    Simon Roux

    2015-05-01

    Full Text Available Viruses of microbes impact all ecosystems where microbes drive key energy and substrate transformations, including the oceans, humans and industrial fermenters. However, despite this recognized importance, our understanding of viral diversity and impacts remains limited by too few model systems and reference genomes. One way to fill these gaps in our knowledge of viral diversity is through the detection of viral signal in microbial genomic data. While multiple approaches have been developed and applied for the detection of prophages (viral genomes integrated in a microbial genome), new types of microbial genomic data are emerging that are more fragmented and larger scale, such as Single-cell Amplified Genomes (SAGs) of uncultivated organisms or genomic fragments assembled from metagenomic sequencing. Here, we present VirSorter, a tool designed to detect viral signal in these different types of microbial sequence data in both a reference-dependent and reference-independent manner, leveraging probabilistic models and extensive virome data to maximize detection of novel viruses. Performance testing shows that VirSorter’s prophage prediction capability compares to that of available prophage predictors for complete genomes, but is superior in predicting viral sequences outside of a host genome (i.e., from extrachromosomal prophages, lytic infections, or partially assembled prophages). Furthermore, VirSorter outperforms existing tools for fragmented genomic and metagenomic datasets, and can identify viral signal in assembled sequence (contigs) as short as 3 kb, while providing near-perfect identification (>95% Recall and 100% Precision) on contigs of at least 10 kb. Because VirSorter scales to large datasets, it can also be used in “reverse” to more confidently identify viral sequence in viral metagenomes by sorting away cellular DNA whether derived from gene transfer agents, generalized transduction or contamination. Finally, VirSorter is made

  3. Automated whole-genome multiple alignment of rat, mouse, and human

    Energy Technology Data Exchange (ETDEWEB)

    Brudno, Michael; Poliakov, Alexander; Salamov, Asaf; Cooper, Gregory M.; Sidow, Arend; Rubin, Edward M.; Solovyev, Victor; Batzoglou, Serafim; Dubchak, Inna

    2004-07-04

    We have built a whole genome multiple alignment of the three currently available mammalian genomes using a fully automated pipeline which combines the local/global approach of the Berkeley Genome Pipeline and the LAGAN program. The strategy is based on progressive alignment, and consists of two main steps: (1) alignment of the mouse and rat genomes; and (2) alignment of human to either the mouse-rat alignments from step 1, or the remaining unaligned mouse and rat sequences. The resulting alignments demonstrate high sensitivity, with 87% of all human gene-coding areas aligned in both mouse and rat. The specificity is also high: <7% of the rat contigs are aligned to multiple places in human and 97% of all alignments with human sequence > 100kb agree with a three-way synteny map built independently using predicted exons in the three genomes. At the nucleotide level <1% of the rat nucleotides are mapped to multiple places in the human sequence in the alignment; and 96.5% of human nucleotides within all alignments agree with the synteny map. The alignments are publicly available online, with visualization through the novel Multi-VISTA browser that we also present.

  4. GroopM: an automated tool for the recovery of population genomes from related metagenomes

    Directory of Open Access Journals (Sweden)

    Michael Imelfort

    2014-09-01

    Full Text Available Metagenomic binning methods that leverage differential population abundances in microbial communities (differential coverage) are emerging as a complementary approach to conventional composition-based binning. Here we introduce GroopM, an automated binning tool that primarily uses differential coverage to obtain high-fidelity population genomes from related metagenomes. We demonstrate the effectiveness of GroopM using synthetic and real-world metagenomes, and show that GroopM produces results comparable with more time-consuming, labor-intensive methods.
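
    The core differential-coverage idea can be illustrated with a simple clustering of per-contig coverage profiles; the numbers below are made up, and real binners like GroopM use far richer signals and more careful statistics.

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.preprocessing import StandardScaler

# Hypothetical per-contig mean read coverage in three related metagenomes, plus GC content.
coverage = np.array([
    [35.1, 2.2, 40.3],   # contig 1
    [33.8, 1.9, 41.0],   # contig 2 (same population as contig 1)
    [5.0, 60.2, 4.8],    # contig 3 (different population)
    [4.7, 58.9, 5.1],    # contig 4
])
gc = np.array([[0.42], [0.43], [0.61], [0.60]])

# Log-transform coverage, scale the features, then group contigs into putative population bins.
features = StandardScaler().fit_transform(np.hstack([np.log1p(coverage), gc]))
bins = KMeans(n_clusters=2, n_init=10, random_state=0).fit_predict(features)
print("bin assignment per contig:", bins)
```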

  5. SeqMule: automated pipeline for analysis of human exome/genome sequencing data.

    Science.gov (United States)

    Guo, Yunfei; Ding, Xiaolei; Shen, Yufeng; Lyon, Gholson J; Wang, Kai

    2015-09-18

    Next-generation sequencing (NGS) technology has greatly helped us identify disease-contributory variants for Mendelian diseases. However, users are often faced with issues such as software compatibility, complicated configuration, and lack of access to high-performance computing facilities, and discrepancies exist among aligners and variant callers. We developed a computational pipeline, SeqMule, to perform automated variant calling from NGS data on human genomes and exomes. SeqMule integrates computational-cluster-free parallelization capability built on top of the variant callers, and facilitates normalization/intersection of variant calls to generate a consensus set with high confidence. SeqMule integrates 5 alignment tools and 5 variant calling algorithms and accepts various combinations, all by a one-line command, therefore allowing highly flexible yet fully automated variant calling. In a modern machine (2 Intel Xeon X5650 CPUs, 48 GB memory), when fast turn-around is needed, SeqMule generates annotated VCF files in a day from a 30X whole-genome sequencing data set; when more accurate calling is needed, SeqMule generates a consensus call set that improves over single callers, as measured by both Mendelian error rate and consistency. SeqMule supports Sun Grid Engine for parallel processing, offers a turn-key solution for deployment on Amazon Web Services, and allows quality checks, Mendelian error checks, consistency evaluation and HTML-based reports. SeqMule is available at http://seqmule.openbioinformatics.org.

  6. GenMiner: mining informative association rules from genomic data

    OpenAIRE

    Martinez, Ricardo; Pasquier, Nicolas; Pasquier, Claude

    2007-01-01

    International audience GENMINER is a smart adaptation of closed itemsets based association rules extraction to genomic data. It takes advantage of the novel NORDI discretization method and of the JCLOSE algorithm to efficiently generate minimal non-redundant association rules. GENMINER facilitates the integration of numerous sources of biological information such as gene expressions and annotations, and can tacitly integrate qualitative information on biological conditions (age, sex, etc.)....

  7. Recent advances in genome mining of secondary metabolites in Aspergillus terreus

    Directory of Open Access Journals (Sweden)

    Clay Chia Chun Wang

    2014-12-01

    Full Text Available Filamentous fungi are rich resources of secondary metabolites (SMs) with a variety of interesting biological activities. Recent advances in genome sequencing and techniques in genetic manipulation have enabled researchers to study the biosynthetic genes of these SMs. Aspergillus terreus is the well-known producer of lovastatin, a cholesterol-lowering drug. This fungus also produces other SMs, including acetylaranotin, butyrolactones and territrem, with interesting bioactivities. This review will cover recent progress in genome mining of SMs identified in this fungus. The identification and characterization of the gene clusters for these SMs, as well as the proposed biosynthetic pathways, will be discussed in depth.

  8. SNP-RFLPing: restriction enzyme mining for SNPs in genomes

    OpenAIRE

    Cheng Yu-Huei; Chang Phei-Lang; Yang Cheng-Hong; Chang Hsueh-Wei; Chuang Li-Yeh

    2006-01-01

    Abstract Background The restriction fragment length polymorphism (RFLP) is a common laboratory method for the genotyping of single nucleotide polymorphisms (SNPs). Here, we describe a web-based software, named SNP-RFLPing, which provides the restriction enzyme for RFLP assays on a batch of SNPs and genes from the human, rat, and mouse genomes. Results Three user-friendly inputs are included: 1) NCBI dbSNP "rs" or "ss" IDs; 2) NCBI Entrez gene ID and HUGO gene name; 3) any formats of SNP-in-se...
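
    The underlying test that SNP-RFLPing automates, whether a restriction enzyme cuts one allele of a SNP but not the other, can be sketched with Biopython's Restriction module; the flanking sequence and the small enzyme panel below are hypothetical.

```python
from Bio.Seq import Seq
from Bio.Restriction import EcoRI, BamHI, HindIII, TaqI

# Hypothetical SNP with its flanking sequence; the two alleles differ only at the central base.
flank5, flank3 = "GCTAGCTAGCATCCTTGAATT", "TCGAGCTCAGGCCTTAGCA"
alleles = {"C": Seq(flank5 + "C" + flank3), "T": Seq(flank5 + "T" + flank3)}

# An enzyme is informative for RFLP genotyping if it cuts the two alleles a different number of times.
for enzyme in (EcoRI, BamHI, HindIII, TaqI):
    cuts = {name: len(enzyme.search(seq)) for name, seq in alleles.items()}
    if cuts["C"] != cuts["T"]:
        print(f"{enzyme} discriminates the alleles: {cuts}")   # here EcoRI cuts only the C allele
```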

  9. A Novel CalB-Type Lipase Discovered by Fungal Genomes Mining

    OpenAIRE

    Vaquero, Maria E.; de Eugenio, Laura I.; Martínez, Maria J.; Jorge Barriuso

    2015-01-01

    The fungus Pseudozyma antarctica produces a lipase (CalB) with broad substrate specificity, stability, high regio- and enantio-selectivity. It is active in non-aqueous organic solvents and at elevated temperatures. Hence, CalB is a robust biocatalyst for chemical conversions on an industrial scale. Here we report the in silico mining of public metagenomes and fungal genomes to discover novel lipases with high homology to CalB. The candidates were selected taking into account homology and cons...

  10. Mining the pig genome to investigate the domestication process.

    Science.gov (United States)

    Ramos-Onsins, S E; Burgos-Paz, W; Manunza, A; Amills, M

    2014-12-01

    Pig domestication began around 9000 YBP in the Fertile Crescent and Far East, involving marked morphological and genetic changes that occurred in a relatively short window of time. Identifying the alleles that drove the behavioural and physiological transformation of wild boars into pigs through artificial selection constitutes a formidable challenge that can only be faced from an interdisciplinary perspective. Indeed, although basic facts regarding the demography of pig domestication and dispersal have been uncovered, the biological substrate of these processes remains enigmatic. Considerable hope has been placed on new approaches, based on next-generation sequencing, which allow whole-genome variation to be analyzed at the population level. In this review, we provide an outline of the current knowledge on pig domestication by considering both archaeological and genetic data. Moreover, we discuss several potential scenarios of genome evolution under the complex mixture of demography and selection forces at play during domestication. Finally, we highlight several technical and methodological approaches that may represent significant advances in resolving the conundrum of livestock domestication.

  11. Genome mining offers a new starting point for parasitology research.

    Science.gov (United States)

    Lv, Zhiyue; Wu, Zhongdao; Zhang, Limei; Ji, Pengyu; Cai, Yifeng; Luo, Shiqi; Wang, Hongxi; Li, Hao

    2015-02-01

    Parasites, including helminths, protozoa, and medical arthropod vectors, are a major cause of global infectious diseases, affecting one-sixth of the world's population; they are responsible for enormous levels of morbidity and mortality and remain important impediments to economic development, especially in tropical countries. Prevalent drug resistance and the lack of highly effective and practical vaccines, as well as of specific and sensitive diagnostic markers, are proving to be challenging problems in parasitic disease control in most parts of the world. The impressive progress recently made in genome-wide analysis of parasites of medical importance, including the trematodes Clonorchis sinensis, Opisthorchis viverrini, Schistosoma haematobium, S. japonicum, and S. mansoni; the nematodes Brugia malayi, Loa loa, Necator americanus, Trichinella spiralis, and Trichuris suis; the cestodes Echinococcus granulosus, E. multilocularis, and Taenia solium; the protozoa Babesia bovis, B. microti, Cryptosporidium hominis, Eimeria falciformis, E. histolytica, Giardia intestinalis, Leishmania braziliensis, L. donovani, L. major, Plasmodium falciparum, P. vivax, Trichomonas vaginalis, Trypanosoma brucei and T. cruzi; and the medical arthropod vectors Aedes aegypti, Anopheles darlingi, A. sinensis, and Culex quinquefasciatus, is systematically covered in this review to provide a comprehensive understanding of the genetic information contained in the nuclear, mitochondrial, kinetoplast, plastid, or endosymbiotic bacterial genomes of parasites, together with further valuable insight into parasite-host interactions and the development of promising novel drug and vaccine candidates and preferable diagnostic tools, thereby underpinning the prevention and control of parasitic diseases. PMID:25563615

  12. A Tool for Multiple Targeted Genome Deletions that Is Precise, Scar-Free, and Suitable for Automation.

    Directory of Open Access Journals (Sweden)

    Wayne Aubrey

    Full Text Available Many advances in synthetic biology require the removal of a large number of genomic elements from a genome. Most existing deletion methods leave behind markers, and as there are a limited number of markers, such methods can only be applied a fixed number of times. Deletion methods that recycle markers generally are either imprecise (remove untargeted sequences), or leave scar sequences which can cause genome instability and rearrangements. No existing marker recycling method is automation-friendly. We have developed a novel, openly available deletion tool that consists of: (1) a method for deleting genomic elements that can be repeatedly used without limit, is precise, scar-free, and suitable for automation; and (2) software to design the method's primers. Our tool is sequence agnostic and could be used to delete large numbers of coding sequences, promoter regions, transcription factor binding sites, terminators, etc. in a single genome. We have validated our tool on the deletion of non-essential open reading frames (ORFs) from S. cerevisiae. The tool is applicable to arbitrary genomes, and we provide primer sequences for the deletion of: 90% of the ORFs from the S. cerevisiae genome, 88% of the ORFs from the S. pombe genome, and 85% of the ORFs from the L. lactis genome.

  13. Chapter 10: Mining genome-wide genetic markers.

    Directory of Open Access Journals (Sweden)

    Xiang Zhang

    Full Text Available Genome-wide association study (GWAS) aims to discover genetic factors underlying phenotypic traits. The large number of genetic factors poses both computational and statistical challenges. Various computational approaches have been developed for large scale GWAS. In this chapter, we will discuss several widely used computational approaches in GWAS. The following topics will be covered: (1) an introduction to the background of GWAS; (2) the existing computational approaches that are widely used in GWAS, covering single-locus, epistasis detection, and machine learning methods that have been recently developed in the biology, statistics, and computer science communities (this part will be the main focus of this chapter); and (3) the limitations of current approaches and future directions.
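
    As a concrete example of the single-locus approach mentioned above, the sketch below runs a chi-square test on made-up allele counts at one SNP; in a real GWAS this is repeated for every marker and the p-values are corrected for multiple testing.

```python
import numpy as np
from scipy.stats import chi2_contingency

# Illustrative 2x2 table of allele counts at a single SNP.
#                  allele A  allele a
cases    = np.array([180,      120])
controls = np.array([140,      160])

chi2, p_value, dof, expected = chi2_contingency(np.vstack([cases, controls]))
print(f"chi2 = {chi2:.2f}, p = {p_value:.3g}")
```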

  14. Automated realtime detection of mining induced seismicity in the Ruhr coal mining district, Germany, using master waveforms

    Science.gov (United States)

    Fischer, Kasper D.; Wlecklik, Dennis; Friederich, Wolfgang; Wehling-Benatelli, Sebastian

    2016-04-01

    The exploitation of the subsurface by mining, geothermal or petroleum production causes seismic events in the surrounding areas. Shallow focal depths can lead to perceptible ground motion in densely populated areas and, in rare cases, to damage even for small events (magnitude smaller than 3.5). Thus, the monitoring of this kind of activity is necessary and increasingly requested by governmental agencies. A reliable detection and localisation of small events generally requires a dense and therefore expensive local seismic station network. At the end of 2014 and beginning of 2015, a dense seismic network of 12 stations was set up as a test case in the area of the black coal mine Prosper-Haniel in the Ruhr district, Germany. This network was capable of detecting almost 400 events within 4 weeks. A cluster analysis identified 135 events of magnitude -0.7 or higher, which could be located. This cluster analysis was also used to construct master events for running a real-time single-station cross-correlation detector in the SeisComP3 software. The results of the real-time cross-correlation detector are compared to the results of the cluster analysis with respect to the number, magnitudes and locations of the events. This two-step monitoring of the source area provides a cost-efficient way of long-term monitoring of mining activity.
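
    The single-station template-matching idea can be sketched in a few lines of NumPy: slide a master waveform along a continuous trace and declare a detection where the normalized cross-correlation exceeds a threshold. The synthetic data and the 0.7 threshold below are illustrative only, not the settings used in the study.

```python
import numpy as np

def xcorr_detect(trace: np.ndarray, master: np.ndarray, threshold: float = 0.7):
    """Return (offset, correlation) pairs where the normalized cross-correlation
    between the trace window and the master waveform exceeds the threshold."""
    n = len(master)
    master_norm = (master - master.mean()) / (master.std() * n)
    detections = []
    for i in range(len(trace) - n):
        window = trace[i:i + n]
        cc = np.sum(master_norm * (window - window.mean())) / (window.std() + 1e-12)
        if cc > threshold:
            detections.append((i, cc))
    return detections

# Synthetic demo: hide a scaled copy of the master waveform in noise and recover its position.
rng = np.random.default_rng(1)
master = np.sin(np.linspace(0, 6 * np.pi, 200)) * np.hanning(200)
trace = rng.normal(0, 0.2, 5000)
trace[3000:3200] += 0.8 * master
print(xcorr_detect(trace, master)[:3])   # offsets near sample 3000 with high correlation
```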

  15. metabolicMine: an integrated genomics, genetics and proteomics data warehouse for common metabolic disease research.

    Science.gov (United States)

    Lyne, Mike; Smith, Richard N; Lyne, Rachel; Aleksic, Jelena; Hu, Fengyuan; Kalderimis, Alex; Stepan, Radek; Micklem, Gos

    2013-01-01

    Common metabolic and endocrine diseases such as diabetes affect millions of people worldwide and have a major health impact, frequently leading to complications and mortality. In a search for better prevention and treatment, there is ongoing research into the underlying molecular and genetic bases of these complex human diseases, as well as into the links with risk factors such as obesity. Although an increasing number of relevant genomic and proteomic data sets have become available, the quantity and diversity of the data make their efficient exploitation challenging. Here, we present metabolicMine, a data warehouse with a specific focus on the genomics, genetics and proteomics of common metabolic diseases. Developed in collaboration with leading UK metabolic disease groups, metabolicMine integrates data sets from a range of experiments and model organisms alongside tools for exploring them. The current version brings together information covering genes, proteins, orthologues, interactions, gene expression, pathways, ontologies, diseases, genome-wide association studies and single nucleotide polymorphisms. Although the emphasis is on human data, key data sets from mouse and rat are included. These are complemented by interoperation with the RatMine rat genomics database, with a corresponding mouse version under development by the Mouse Genome Informatics (MGI) group. The web interface contains a number of features including keyword search, a library of Search Forms, the QueryBuilder and list analysis tools. This provides researchers with many different ways to analyse, view and flexibly export data. Programming interfaces and automatic code generation in several languages are supported, and many of the features of the web interface are available through web services. The combination of diverse data sets integrated with analysis tools and a powerful query system makes metabolicMine a valuable research resource. The web interface makes it accessible to first

  16. Metagenomic technology and genome mining: emerging areas for exploring novel nitrilases.

    Science.gov (United States)

    Gong, Jin-Song; Lu, Zhen-Ming; Li, Heng; Zhou, Zhe-Min; Shi, Jin-Song; Xu, Zheng-Hong

    2013-08-01

    Nitrilase is one of the most important members of the nitrilase superfamily and is widely used for the bioproduction of commodity chemicals and pharmaceutical intermediates as well as the bioremediation of nitrile-contaminated wastes. However, its application has been hindered by several limitations, and searching for new nitrilases and improving their application performance are the driving forces for researchers. Genetic data resources in various databases are quite rich in the post-genomic era. Besides, more than 99 % of microbes in the environment are unculturable. Metagenomic technology and genome mining are thus becoming burgeoning areas and provide unprecedented opportunities for finding more useful novel nitrilases, owing to the abundance of already existing but unexplored gene resources, namely uncharacterized genome information in databases and unculturable microbes in the natural environment. These techniques are innovative and highly efficient. This study reviews the current status and future directions of metagenomics and genome mining in nitrilase exploration, and discusses their utilization in coping with the challenges of nitrilase application. In the next several years, with the rapid development of nitrile biocatalysis, these two techniques are bound to attract increasing attention and may even become a dominant trend for finding more novel nitrilases. This review would also provide guidance for the exploitation of other commercially important enzymes. PMID:23801047

  17. Genome Mining in Sorangium cellulosum So ce56

    Science.gov (United States)

    Ewen, Kerstin Maria; Hannemann, Frank; Khatri, Yogan; Perlova, Olena; Kappl, Reinhard; Krug, Daniel; Hüttermann, Jürgen; Müller, Rolf; Bernhardt, Rita

    2009-01-01

    Myxobacteria, especially members of the genus Sorangium, are known for their biotechnological potential as producers of pharmaceutically valuable secondary metabolites. The biosynthesis of several of those myxobacterial compounds includes cytochrome P450 activity. Although class I cytochrome P450 enzymes are widespread in bacteria and rely on ferredoxins and ferredoxin reductases as essential electron mediators, the study of these proteins is often neglected. Therefore, we decided to search the Sorangium cellulosum So ce56 genome for putative interaction partners of cytochromes P450. In this work we report the investigation of eight myxobacterial ferredoxins and two ferredoxin reductases with respect to their activity in cytochrome P450 systems. Intriguingly, we found not only one, but two ferredoxins whose ability to sustain an endogenous So ce56 cytochrome P450 was demonstrated by CYP260A1-dependent conversion of nootkatone. Moreover, we could demonstrate that the two ferredoxins were able to receive electrons from both ferredoxin reductases. These findings indicate that S. cellulosum can alternate between different electron transport pathways to sustain cytochrome P450 activity. PMID:19696019

  18. GMATA: An Integrated Software Package for Genome-Scale SSR Mining, Marker Development and Viewing

    Science.gov (United States)

    Wang, Xuewen; Wang, Le

    2016-01-01

    Simple sequence repeats (SSRs), also referred to as microsatellites, are highly variable tandem DNAs that are widely used as genetic markers. The increasing availability of whole-genome and transcript sequences provides information resources for SSR marker development. However, efficient software is required to identify and display SSR information along with other gene features at a genome scale. We developed the novel software package Genome-wide Microsatellite Analyzing Tool Package (GMATA), which integrates SSR mining, statistical analysis and plotting, marker design, polymorphism screening and marker transferability assessment, and enables the simultaneous display of SSR markers with other genome features. GMATA applies novel strategies for SSR analysis and primer design in large genomes, which allows GMATA to perform faster calculations and provide more accurate results than existing tools. Our package is also capable of processing DNA sequences of any size on a standard computer. GMATA is user friendly, requires only mouse clicks or typed inputs on the command line, and is executable on multiple computing platforms. We demonstrated the application of GMATA in plant genomes and revealed a novel distribution pattern of SSRs in 15 grass genomes. The most abundant motifs are the dimer GA/TC, the A/T monomer and the GCG/CGC trimer, rather than motifs reflecting the rich G/C content of the DNA sequence. We also revealed that SSR count scales linearly with chromosome length in fully assembled grass genomes. GMATA represents a powerful application tool that facilitates genomic sequence analyses. GMATA is freely available at http://sourceforge.net/projects/gmata/?source=navbar. PMID:27679641
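
    The core mining step, locating tandem repeats of short motifs, can be approximated with a regular expression; the sketch below is not GMATA's algorithm, and the minimum repeat count of five is an arbitrary illustrative setting.

```python
import re

# Find tandem repeats of 1-6 bp motifs repeated at least 5 times in a DNA string.
SSR_PATTERN = re.compile(r"([ACGT]{1,6}?)\1{4,}")

def find_ssrs(sequence: str):
    """Yield (start, motif, repeat_count) for simple sequence repeats."""
    for match in SSR_PATTERN.finditer(sequence.upper()):
        motif = match.group(1)
        count = len(match.group(0)) // len(motif)
        yield match.start(), motif, count

demo = "TT" + "ACG" * 5 + "GG" + "CT" * 6 + "A" * 8 + "GT"
for start, motif, count in find_ssrs(demo):
    print(f"position {start}: ({motif}) x {count}")
```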

  19. Ensembl Plants: Integrating Tools for Visualizing, Mining, and Analyzing Plant Genomics Data.

    Science.gov (United States)

    Bolser, Dan; Staines, Daniel M; Pritchard, Emily; Kersey, Paul

    2016-01-01

    Ensembl Plants ( http://plants.ensembl.org ) is an integrative resource presenting genome-scale information for a growing number of sequenced plant species (currently 33). Data provided include genome sequence, gene models, functional annotation, and polymorphic loci. Additional information is provided for variation data, including population structure, individual genotypes, linkage, and phenotype data. In each release, comparative analyses are performed on whole genome and protein sequences, and genome alignments and gene trees are made available that show the implied evolutionary history of each gene family. Access to the data is provided through a genome browser incorporating many specialist interfaces for different data types, and through a variety of additional methods for programmatic access and data mining. These access routes are consistent with those offered through the Ensembl interface for the genomes of non-plant species, including those of plant pathogens, pests, and pollinators. Ensembl Plants is updated 4-5 times a year and is developed in collaboration with our international partners in the Gramene ( http://www.gramene.org ) and transPLANT ( http://www.transplantdb.org ) projects.

  20. A FRAMEWORK: CLUSTER DETECTION AND MULTIDIMENSIONAL VISUALIZATION OF AUTOMATED DATA MINING USING INTELLIGENT AGENTS

    Directory of Open Access Journals (Sweden)

    R. Jayabrabu

    2012-02-01

    Full Text Available Data mining techniques play a vital role in extracting required knowledge and finding unsuspected information to support strategic decisions in a novel way that is understandable by domain experts. A generalized framework is proposed that takes non-domain experts into account during the mining process, for better understanding, better decision making and better discovery of new patterns, by selecting suitable data mining techniques based on the user profile by means of intelligent agents.

  1. Perspective: NanoMine: A material genome approach for polymer nanocomposites analysis and design

    Science.gov (United States)

    Zhao, He; Li, Xiaolin; Zhang, Yichi; Schadler, Linda S.; Chen, Wei; Brinson, L. Catherine

    2016-05-01

    Polymer nanocomposites are a designer class of materials where nanoscale particles, functional chemistry, and polymer resin combine to provide materials with unprecedented combinations of physical properties. In this paper, we introduce NanoMine, a data-driven web-based platform for analysis and design of polymer nanocomposite systems under the material genome concept. This open data resource strives to curate experimental and computational data on nanocomposite processing, structure, and properties, as well as to provide analysis and modeling tools that leverage curated data for material property prediction and design. With a continuously expanding dataset and toolkit, NanoMine encourages community feedback and input to construct a sustainable infrastructure that benefits nanocomposite material research and development.

  2. Bibliomining for Automated Collection Development in a Digital Library Setting: Using Data Mining To Discover Web-Based Scholarly Research Works.

    Science.gov (United States)

    Nicholson, Scott

    2003-01-01

    Discusses quality issues regarding Web sites and describes research that created an intelligent agent for automated collection development in a digital academic library setting, which uses a predictive model based on facets of each Web page to select scholarly works. Describes the use of bibliomining, or data mining for libraries. (Author/LRW)

  3. Automation Technology of Coal Mine Fully Mechanized Working Face (煤矿综采工作面自动化技术探究)

    Institute of Scientific and Technical Information of China (English)

    田成立

    2015-01-01

    China is rich in coal resources, and coal mining enterprises are numerous. Because the working environment of coal mining is harsh, the use of automation technology can both increase production and reduce labor intensity. Developing automated mining technology helps coal mining enterprises achieve safe and highly efficient production. Research on and application of automated coal mining technology is a key task for the coal industry. This paper discusses the composition, functions and control of automation technology for the fully mechanized working face.

  4. Automated analysis for large amount gaseous fission product gamma-scanning spectra from nuclear power plant and its data mining

    International Nuclear Information System (INIS)

    Based on the Linssi database and the UniSampo/Shaman software, an automated analysis platform has been set up for the analysis of large numbers of gamma spectra from the primary coolant monitoring systems of a CANDU reactor. Thus, a database inventory of gaseous and volatile fission products in the primary coolant of a CANDU reactor has been established. This database comprises 15,000 spectra of radioisotope analysis records. Records from the database inventory were retrieved by a specifically designed data-mining module and subjected to further analysis. Results from the analysis were subsequently used to determine the in-coolant half-lives of 135Xe and 133Xe, as well as the correlation between 135Xe and 88Kr activities. (author)

  5. Genome mining reveals unlocked bioactive potential of marine Gram-negative bacteria

    DEFF Research Database (Denmark)

    Machado, Henrique; Sonnenschein, Eva; Melchiorsen, Jette;

    2015-01-01

    Background: Antibiotic resistance in bacteria spreads quickly, overtaking the pace at which new compounds are discovered, and this emphasizes the immediate need to discover new compounds for control of infectious diseases. Terrestrial bacteria have for decades been investigated as a source of bioactive compounds, leading to successful applications in the pharmaceutical and biotech industries. Marine bacteria have so far not been exploited to the same extent; however, they are believed to harbor a multitude of novel bioactive chemistry. To explore this potential, genomes of 21 marine Alpha- and Gammaproteobacteria collected during the Galathea 3 expedition were sequenced and mined for natural product encoding gene clusters. Results: Independently of genome size, bacteria of all tested genera carried a large number of clusters encoding different potential bioactivities, especially within the Vibrionaceae...

  6. Genome mining expands the chemical diversity of the cyanobactin family to include highly modified linear peptides.

    Science.gov (United States)

    Leikoski, Niina; Liu, Liwei; Jokela, Jouni; Wahlsten, Matti; Gugger, Muriel; Calteau, Alexandra; Permi, Perttu; Kerfeld, Cheryl A; Sivonen, Kaarina; Fewer, David P

    2013-08-22

    Ribosomal peptides are produced through the posttranslational modification of short precursor peptides. Cyanobactins are a growing family of cyclic ribosomal peptides produced by cyanobacteria. However, a broad systematic survey of the genetic capacity to produce cyanobactins is lacking. Here we report the identification of 31 cyanobactin gene clusters from 126 genomes of cyanobacteria. Genome mining suggested a complex evolutionary history defined by horizontal gene transfer and rapid diversification of precursor genes. Extensive chemical analyses demonstrated that some cyanobacteria produce short linear cyanobactins with a chain length ranging from three to five amino acids. The linear peptides were N-prenylated and O-methylated on the N and C termini, respectively, and named aeruginosamide and viridisamide. These findings broaden the structural diversity of the cyanobactin family to include highly modified linear peptides with rare posttranslational modifications. PMID:23911585

  7. Mining human Behaviors: automated behavioral Analysis from small to big Data

    OpenAIRE

    Staiano, Jacopo

    2014-01-01

    This research thesis aims to address complex problems in Human Behavior Understanding from a computational standpoint: to develop novel methods for enabling machines to capture not only what their sensors are perceiving but also how and why the situation they are presented with is evolving in a certain manner. Touching several fields, from Computer Vision to Social Psychology through Natural Language Processing and Data Mining, we will move from more to less constrained scenarios, descr...

  8. Mining clinical attributes of genomic variants through assisted literature curation in Egas.

    Science.gov (United States)

    Matos, Sérgio; Campos, David; Pinho, Renato; Silva, Raquel M; Mort, Matthew; Cooper, David N; Oliveira, José Luís

    2016-01-01

    The veritable deluge of biological data over recent years has led to the establishment of a considerable number of knowledge resources that compile curated information extracted from the literature and store it in structured form, facilitating its use and exploitation. In this article, we focus on the curation of inherited genetic variants and associated clinical attributes, such as zygosity, penetrance or inheritance mode, and describe the use of Egas for this task. Egas is a web-based platform for text-mining assisted literature curation that focuses on usability through modern design solutions and simple user interactions. Egas offers a flexible and customizable tool that allows defining the concept types and relations of interest for a given annotation task, as well as the ontologies used for normalizing each concept type. Further, annotations may be performed on raw documents or on the results of automated concept identification and relation extraction tools. Users can inspect, correct or remove automatic text-mining results, manually add new annotations, and export the results to standard formats. Egas is compatible with the most recent versions of Google Chrome, Mozilla Firefox, Internet Explorer and Safari and is available for use at https://demo.bmd-software.com/egas/. Database URL: https://demo.bmd-software.com/egas/

  9. Mining clinical attributes of genomic variants through assisted literature curation in Egas.

    Science.gov (United States)

    Matos, Sérgio; Campos, David; Pinho, Renato; Silva, Raquel M; Mort, Matthew; Cooper, David N; Oliveira, José Luís

    2016-01-01

    The veritable deluge of biological data over recent years has led to the establishment of a considerable number of knowledge resources that compile curated information extracted from the literature and store it in structured form, facilitating its use and exploitation. In this article, we focus on the curation of inherited genetic variants and associated clinical attributes, such as zygosity, penetrance or inheritance mode, and describe the use of Egas for this task. Egas is a web-based platform for text-mining assisted literature curation that focuses on usability through modern design solutions and simple user interactions. Egas offers a flexible and customizable tool that allows defining the concept types and relations of interest for a given annotation task, as well as the ontologies used for normalizing each concept type. Further, annotations may be performed on raw documents or on the results of automated concept identification and relation extraction tools. Users can inspect, correct or remove automatic text-mining results, manually add new annotations, and export the results to standard formats. Egas is compatible with the most recent versions of Google Chrome, Mozilla Firefox, Internet Explorer and Safari and is available for use at https://demo.bmd-software.com/egas/. Database URL: https://demo.bmd-software.com/egas/. PMID:27278817

  10. Improved processing-string fusion-approach investigation for automated sea-mine classification in shallow water

    Science.gov (United States)

    Aridgides, Tom; Fernandez, Manuel F.; Dobeck, Gerald J.

    2004-09-01

    An improved sea mine computer-aided-detection/computer-aided-classification (CAD/CAC) processing string has been developed. This robust automated processing string involves the fusion of the outputs of unique mine classification algorithms. The overall CAD/CAC processing string consists of pre-processing, adaptive clutter filtering (ACF), normalization, detection, feature extraction, optimal subset feature selection, feature orthogonalization, classification and fusion processing blocks. The range-dimension ACF is matched both to average highlight and shadow information, while also adaptively suppressing background clutter. For each detected object, features are extracted and processed through an orthogonalization transformation, enabling an efficient application of the optimal log-likelihood-ratio-test (LLRT) classification rule, in the orthogonal feature space domain. The classified objects of 4 distinct processing strings are fused using the classification confidence values as features and "M-out-of-N", or LLRT-based fusion rules. The utility of the overall processing strings and their fusion was demonstrated with new shallow water high-resolution sonar imagery data. The processing string detection and classification parameters were tuned and the string classification performance was optimized, by appropriately selecting a subset of the original feature set. Two significant improvements were made to the CAD/CAC processing string by employing sub-image adaptive clutter filtering (SACF) and utilizing a repeated application of the subset feature selection/feature orthogonalization/LLRT classification blocks. It was shown that LLRT-based fusion of the CAD/CAC processing strings outperforms the "M-out-of-N" algorithms and results in up to a seven-fold false alarm rate reduction, compared to the best single CAD/CAC processing string results, while maintaining a high correct mine classification probability. Alternately, the fusion of the processing strings enabled
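
    A minimal sketch of the "M-out-of-N" fusion rule mentioned above is given below: a contact is called a mine only if at least M of the N processing strings individually call it one. The per-string threshold, the value of M and the toy confidence scores are assumptions made for illustration, not parameters reported in the paper.

      import numpy as np

      def m_out_of_n_fusion(confidences, per_string_threshold=0.5, m=3):
          """Fuse N classifier confidence values for one detected object.

          confidences: per-string mine-confidence scores in [0, 1].
          The object is declared a mine if at least `m` of the N strings
          individually exceed the decision threshold.
          """
          votes = np.asarray(confidences) >= per_string_threshold
          return int(votes.sum() >= m)

      # Toy example: 4 processing strings scoring the same sonar contact.
      scores = [0.81, 0.66, 0.42, 0.73]
      print("3-of-4 fusion decision:", m_out_of_n_fusion(scores, m=3))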

  11. Draft Genome Sequence of Plant Growth-Promoting Rhizobium Mesorhizobium amorphae, Isolated from Zinc-Lead Mine Tailings

    OpenAIRE

    Hao, Xiuli; Lin, Yanbing; Johnstone, Laurel; Baltrus, David A; Miller, Susan J.; Wei, Gehong; Rensing, Christopher

    2012-01-01

    Here, we describe the draft genome sequence of Mesorhizobium amorphae strain CCNWGS0123, isolated from nodules of Robinia pseudoacacia growing on zinc-lead mine tailings. A large number of metal(loid) resistance genes, as well as genes reported to promote plant growth, were identified, presenting a great future potential for aiding phytoremediation in metal(loid)-contaminated soil.

  12. Draft Genome Sequence of Plant Growth-Promoting Rhizobium Mesorhizobium amorphae, Isolated from Zinc-Lead Mine Tailings

    Science.gov (United States)

    Hao, Xiuli; Lin, Yanbing; Johnstone, Laurel; Baltrus, David A.; Miller, Susan J.

    2012-01-01

    Here, we describe the draft genome sequence of Mesorhizobium amorphae strain CCNWGS0123, isolated from nodules of Robinia pseudoacacia growing on zinc-lead mine tailings. A large number of metal(loid) resistance genes, as well as genes reported to promote plant growth, were identified, presenting a great future potential for aiding phytoremediation in metal(loid)-contaminated soil. PMID:22247533

  13. O-miner: an integrative platform for automated analysis and mining of -omics data

    OpenAIRE

    Cutts, Rosalind J.; Dayem Ullah, Abu Z; Sangaralingam, Ajanthah; Gadaleta, Emanuela; Lemoine, Nicholas R.; Chelala, Claude

    2012-01-01

    High-throughput profiling has generated massive amounts of data across basic, clinical and translational research fields. However, open source comprehensive web tools for analysing data obtained from different platforms and technologies are still lacking. To fill this gap and the unmet computational needs of ongoing research projects, we developed O-miner, a rapid, comprehensive, efficient web tool that covers all the steps required for the analysis of both transcriptomic and genomic data sta...

  14. Automated integration of genomic physical mapping data via parallel simulated annealing

    Energy Technology Data Exchange (ETDEWEB)

    Slezak, T.

    1994-06-01

    The Human Genome Center at the Lawrence Livermore National Laboratory (LLNL) is nearing closure on a high-resolution physical map of human chromosome 19. We have built automated tools to assemble 15,000 fingerprinted cosmid clones into 800 contigs with minimal spanning paths identified. These islands are being ordered, oriented, and spanned by a variety of other techniques including: Fluorescence In Situ Hybridization (FISH) at 3 levels of resolution, ECO restriction fragment mapping across all contigs, and a multitude of different hybridization and PCR techniques to link cosmid, YAC, BAC, PAC, and P1 clones. The FISH data provide us with partial order and distance data as well as orientation. We made the observation that map builders need a much rougher presentation of data than do map readers; the former wish to see raw data since these can expose errors or interesting biology. We further noted that by ignoring our length and distance data we could simplify our problem into one that could be readily attacked with optimization techniques. The data integration problem could then be seen as an M x N ordering of our N cosmid clones which "intersect" M larger objects by defining "intersection" to mean either contig/map membership or hybridization results. Clearly, the goal of making an integrated map is now to rearrange the N cosmid clone "columns" such that the number of gaps on the object "rows" is minimized. Our FISH partially-ordered cosmid clones provide us with a set of constraints that cannot be violated by the rearrangement process. We solved the optimization problem via simulated annealing performed on a network of 40+ Unix machines in parallel, using a server/client model built on explicit socket calls. For current maps we can create a map in about 4 hours on the parallel net versus 4+ days on a single workstation. Our biologists are now using this software on a daily basis to guide their efforts toward final closure.
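
    The column-ordering optimization described above lends itself to a compact illustration. The sketch below encodes clone/object membership as a binary M x N matrix, counts gaps along the object rows as the cost, and anneals over column swaps. The matrix, cooling schedule and move set are illustrative assumptions; the FISH partial-order constraints and the parallel server/client machinery of the LLNL system are omitted.

      import math
      import random

      import numpy as np

      def count_gaps(matrix, order):
          """Cost: for each row, count maximal zero runs lying between ones
          once the columns (cosmid clones) are arranged in `order`."""
          gaps = 0
          for row in matrix[:, order]:
              hits = np.flatnonzero(row)
              if hits.size > 1:
                  gaps += int(np.sum(np.diff(hits) > 1))  # each break is one gap
          return gaps

      def anneal_order(matrix, steps=5000, t_start=5.0, t_end=0.01, seed=0):
          """Simulated annealing over column permutations using swap moves."""
          rng = random.Random(seed)
          n = matrix.shape[1]
          order = list(range(n))
          cost = count_gaps(matrix, order)
          for step in range(steps):
              t = t_start * (t_end / t_start) ** (step / steps)  # geometric cooling
              i, j = rng.sample(range(n), 2)
              order[i], order[j] = order[j], order[i]
              new_cost = count_gaps(matrix, order)
              if new_cost <= cost or rng.random() < math.exp((cost - new_cost) / t):
                  cost = new_cost                          # accept the move
              else:
                  order[i], order[j] = order[j], order[i]  # undo the swap
          return order, cost

      # Toy membership matrix: rows are contigs/hybridization probes, columns are
      # clones; a 1 means "clone intersects object". Columns are then scrambled.
      rng = np.random.default_rng(1)
      matrix = np.zeros((8, 30), dtype=int)
      for r in range(8):
          start = int(rng.integers(0, 22))
          matrix[r, start:start + 8] = 1
      matrix = matrix[:, rng.permutation(30)]

      order, cost = anneal_order(matrix)
      print("remaining gaps after annealing:", cost)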

  15. From genome mining to phenotypic microarrays: Planctomycetes as source for novel bioactive molecules.

    Science.gov (United States)

    Jeske, Olga; Jogler, Mareike; Petersen, Jörn; Sikorski, Johannes; Jogler, Christian

    2013-10-01

    Most members of the phylum Planctomycetes share many unusual traits that are unique among bacteria: they divide independently of FtsZ through asymmetric budding, possess a complex life cycle and have a compartmentalized cell plan. Besides their complex cell biological features, Planctomycetes are environmentally important and play major roles in global matter fluxes. Such features have been successfully employed in biotechnological applications such as the anaerobic oxidation of ammonium in wastewater treatment plants or the utilization of enzymes for biotechnological processes. However, little is known about planctomycetal secondary metabolites. This is surprising, as Planctomycetes have several key features in common with known producers of small bioactive molecules such as Streptomycetes or Myxobacteria: a complex lifestyle and large genome sizes. Planctomycetal genomes, with an average size of 6.9 Mb, appear as tempting targets for drug discovery approaches. To enable the hunt for bioactive molecules from Planctomycetes, we performed a comprehensive genome mining approach employing the antiSMASH secondary metabolite identification pipeline and found 102 candidate genes or clusters within the analyzed 13 genomes. However, as most genes and operons related to secondary metabolite production are exclusively expressed under certain environmental conditions, we optimized Phenotype MicroArray protocols for Rhodopirellula baltica and Planctomyces limnophilus to allow high-throughput screening of putative stimulating carbon sources. Our results point towards a previously postulated relationship of Planctomycetes with algae or plants, which secrete compounds that might serve as triggers to stimulate secondary metabolite production in Planctomycetes. Thus, this study provides the necessary starting point to explore planctomycetal small molecules for drug development. PMID:23982431

  16. An Integrated Metabolomic and Genomic Mining Workflow to Uncover the Biosynthetic Potential of Bacteria

    DEFF Research Database (Denmark)

    Månsson, Maria; Vynne, Nikolaj Grønnegaard; Klitgaard, Andreas;

    2016-01-01

    in bacteria and mine the associated chemical diversity. Thirteen strains closely related to Pseudoalteromonas luteoviolacea isolated from all over the Earth were analyzed using an untargeted metabolomics strategy, and metabolomic profiles were correlated with whole-genome sequences of the strains. We found...... considerable diversity: only 2% of the chemical features and 7% of the biosynthetic genes were common to all strains, while 30% of all features and 24% of the genes were unique to single strains. The list of chemical features was reduced to 50 discriminating features using a genetic algorithm and support......, integrative strategy for elucidating the chemical richness of a given set of bacteria and link the chemistry to biosynthetic genes....

  17. Recent processing string and fusion algorithm improvements for automated sea mine classification in shallow water

    Science.gov (United States)

    Aridgides, Tom; Fernandez, Manuel F.; Dobeck, Gerald J.

    2003-09-01

    A novel sea mine computer-aided-detection / computer-aided-classification (CAD/CAC) processing string has been developed. The overall CAD/CAC processing string consists of pre-processing, adaptive clutter filtering (ACF), normalization, detection, feature extraction, feature orthogonalization, optimal subset feature selection, classification and fusion processing blocks. The range-dimension ACF is matched both to average highlight and shadow information, while also adaptively suppressing background clutter. For each detected object, features are extracted and processed through an orthogonalization transformation, enabling an efficient application of the optimal log-likelihood-ratio-test (LLRT) classification rule, in the orthogonal feature space domain. The classified objects of 4 distinct processing strings are fused using the classification confidence values as features and logic-based, "M-out-of-N", or LLRT-based fusion rules. The utility of the overall processing strings and their fusion was demonstrated with new shallow water high-resolution sonar imagery data. The processing string detection and classification parameters were tuned and the string classification performance was optimized, by appropriately selecting a subset of the original feature set. A significant improvement was made to the CAD/CAC processing string by utilizing a repeated application of the subset feature selection / LLRT classification blocks. It was shown that LLRT-based fusion algorithms outperform the logic based and the "M-out-of-N" ones. The LLRT-based fusion of the CAD/CAC processing strings resulted in up to a nine-fold false alarm rate reduction, compared to the best single CAD/CAC processing string results, while maintaining a constant correct mine classification probability.
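
    The LLRT-based fusion step referred to above can be sketched as follows: treat each processing string's confidence value as a feature, model it under the "mine" and "clutter" hypotheses, and threshold the summed log-likelihood ratio. The Gaussian class-conditional densities, the synthetic training scores and the zero threshold are assumptions made for this sketch; the paper's actual density models and operating points are not reproduced here.

      import numpy as np
      from scipy.stats import norm

      def fit_llrt(train_scores, train_labels):
          """Fit per-string Gaussian score models for the two hypotheses.

          train_scores: (n_objects, n_strings) confidence matrix.
          train_labels: 1 = mine, 0 = clutter.
          Returns per-class (mean, std) vectors, one entry per string.
          """
          scores = np.asarray(train_scores, dtype=float)
          labels = np.asarray(train_labels)
          params = {}
          for cls in (0, 1):
              s = scores[labels == cls]
              params[cls] = (s.mean(axis=0), s.std(axis=0) + 1e-6)
          return params

      def llrt_fuse(scores, params, threshold=0.0):
          """Sum per-string log-likelihood ratios and compare to a threshold."""
          scores = np.asarray(scores, dtype=float)
          mu1, sd1 = params[1]
          mu0, sd0 = params[0]
          llr = norm.logpdf(scores, mu1, sd1) - norm.logpdf(scores, mu0, sd0)
          return llr.sum(), int(llr.sum() > threshold)

      # Toy training set: 4 strings whose scores run higher for mines than clutter.
      rng = np.random.default_rng(0)
      mines = rng.normal(0.75, 0.10, size=(50, 4))
      clutter = rng.normal(0.45, 0.15, size=(200, 4))
      X = np.vstack([mines, clutter])
      y = np.array([1] * 50 + [0] * 200)

      params = fit_llrt(X, y)
      print(llrt_fuse([0.80, 0.70, 0.65, 0.90], params))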

  18. Volterra fusion of processing strings for automated sea mine classification in shallow water

    Science.gov (United States)

    Aridgides, Tom; Fernandez, Manuel; Dobeck, Gerald J.

    2005-06-01

    An improved sea mine computer-aided-detection / computer-aided-classification (CAD/CAC) processing string has been developed. The overall CAD/CAC processing string consists of pre-processing, adaptive clutter filtering (ACF), normalization, detection, feature extraction, optimal subset feature selection, feature orthogonalization, classification and fusion processing blocks. The range-dimension ACF is matched both to average highlight and shadow information, while also adaptively suppressing background clutter. For each detected object, features are extracted and processed through an orthogonalization transformation, enabling an efficient application of the optimal log-likelihood-ratio-test (LLRT) classification rule, in the orthogonal feature space domain. The classified objects of 4 distinct processing strings are fused using the classification confidence values as features and either "M-out-of-N" or LLRT-based fusion rules. The utility of the overall processing strings and their fusion was demonstrated with new shallow water high-resolution sonar imagery data. The processing string detection and classification parameters were tuned and the string classification performance was optimized, by appropriately selecting a subset of the original feature set. Two significant improvements were made to the CAD/CAC processing string by employing sub-image adaptive clutter filtering (SACF) and utilizing a repeated application of the subset feature selection / feature orthogonalization / LLRT classification blocks. A new nonlinear (Volterra) feature LLRT fusion algorithm was developed. It was shown that this Volterra feature LLRT fusion of the CAD/CAC processing strings outperforms the "M-out-of-N" and baseline LLRT algorithms, yielding significant improvements over the best single CAD/CAC processing string results, and providing the capability to correctly call all mine targets while maintaining a very low false alarm rate.

  19. Mining

    Directory of Open Access Journals (Sweden)

    Khairullah Khan

    2014-09-01

    Full Text Available Opinion mining is an interesting area of research because of its applications in various fields. Collecting opinions of people about products and about social and political events and problems through the Web is becoming increasingly popular every day. The opinions of users are helpful for the public and for stakeholders when making certain decisions. Opinion mining is a way to retrieve information through search engines, Web blogs and social networks. Because of the huge number of reviews in the form of unstructured text, it is impossible to summarize the information manually. Accordingly, efficient computational methods are needed for mining and summarizing the reviews from corpuses and Web documents. This study presents a systematic literature survey regarding the computational techniques, models and algorithms for mining opinion components from unstructured reviews.

  20. EST Pipeline System: Detailed and Automated EST Data Processing and Mining

    Institute of Scientific and Technical Information of China (English)

    Hao Xu; Liang Zhang; Hong Yu; Yan Zhou; Ling He; Yuanzhong Zhu; Wei Huang; Lijun Fang; Lin Tao; Yuedong Zhu; Lin Cai; Huayong Xu

    2003-01-01

    Expressed sequence tags (ESTs) have been widely used in gene survey research in recent years. The EST Pipeline System, software developed by Hangzhou Genomics Institute (HGI), can automatically analyze EST sequences of different scales with suitable methods. All the analysis reports, including those of vector masking, sequence assembly, gene annotation, Gene Ontology classification, and some other analyses, can be browsed and searched as well as downloaded in Excel format from the web interface, sparing researchers routine data processing and letting them focus on the biological rules embedded in the data.

  1. Automated gamma knife radiosurgery treatment planning with image registration, data-mining, and Nelder-Mead simplex optimization

    International Nuclear Information System (INIS)

    Gamma knife treatments are usually planned manually, requiring much expertise and time. We describe a new, fully automatic method of treatment planning. The treatment volume to be planned is first compared with a database of past treatments to find volumes closely matching in size and shape. The treatment parameters of the closest matches are used as starting points for the new treatment plan. Further optimization is performed with the Nelder-Mead simplex method: the coordinates and weight of the isocenters are allowed to vary until a maximally conformal plan specific to the new treatment volume is found. The method was tested on a randomly selected set of 10 acoustic neuromas and 10 meningiomas. Typically, matching a new volume took under 30 seconds. The time for simplex optimization, on a 3 GHz Xeon processor, ranged from under a minute for small volumes (30 000 cubic mm, >20 isocenters). In 8/10 acoustic neuromas and 8/10 meningiomas, the automatic method found plans with a conformation number equal to or better than that of the manual plan. In 4/10 acoustic neuromas and 5/10 meningiomas, both overtreatment and undertreatment ratios were equal to or better in the automated plans. In conclusion, data-mining of past treatments can be used to derive starting parameters for treatment planning. These parameters can then be computer optimized to give good plans automatically.
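
    The optimization stage described above can be illustrated with SciPy's Nelder-Mead implementation: isocenter coordinates and weights are the free parameters, and the objective penalizes underdose inside the target and dose spilled outside it. The Gaussian "shot" dose model, the voxel grid, the penalty weights and the random starting points are all invented for this sketch; they stand in for the clinical dose model, the conformation number and the database-derived starting parameters used in the paper.

      import numpy as np
      from scipy.optimize import minimize

      # Toy target: a sphere of radius 8 mm inside a 32 mm cube of 2 mm voxels.
      grid = np.stack(np.meshgrid(*[np.arange(-16, 17, 2.0)] * 3, indexing="ij"), -1)
      voxels = grid.reshape(-1, 3)
      target = np.linalg.norm(voxels, axis=1) <= 8.0

      N_ISO = 4          # number of isocenters ("shots") to place
      SHOT_SIGMA = 4.0   # mm, assumed Gaussian fall-off of a single shot

      def dose(params):
          """Weighted sum of Gaussian shots; params = [x, y, z, w] per isocenter."""
          p = params.reshape(N_ISO, 4)
          centers, weights = p[:, :3], np.abs(p[:, 3])
          d2 = ((voxels[:, None, :] - centers[None, :, :]) ** 2).sum(-1)
          return (weights * np.exp(-d2 / (2 * SHOT_SIGMA ** 2))).sum(axis=1)

      def objective(params):
          """Penalize underdose inside the target and dose spill outside it."""
          d = dose(params)
          under = np.clip(0.5 - d[target], 0, None).sum()   # want >= 0.5 inside
          spill = 0.01 * d[~target].sum()                   # keep outside dose low
          return under + spill

      rng = np.random.default_rng(0)
      x0 = np.concatenate([rng.uniform(-6, 6, (N_ISO, 3)),
                           np.full((N_ISO, 1), 1.0)], axis=1).ravel()

      result = minimize(objective, x0, method="Nelder-Mead",
                        options={"maxiter": 5000, "xatol": 1e-2, "fatol": 1e-3})
      print("final objective:", round(result.fun, 3))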

  2. Pep2Path: automated mass spectrometry-guided genome mining of peptidic natural products

    NARCIS (Netherlands)

    Medema, Marnix; Paalvast, Yared; Nguyen, D.D.; Melnik, A.; Dorrestein, P.C.; Takano, Eriko; Breitling, Rainer

    2014-01-01

    Nonribosomally and ribosomally synthesized bioactive peptides constitute a source of molecules of great biomedical importance, including antibiotics such as penicillin, immunosuppressants such as cyclosporine, and cytostatics such as bleomycin. Recently, an innovative mass-spectrometry-based strateg

  3. Discovery of defense- and neuropeptides in social ants by genome-mining.

    Science.gov (United States)

    Gruber, Christian W; Muttenthaler, Markus

    2012-01-01

    Natural peptides of great number and diversity occur in all organisms, but analyzing their peptidome is often difficult. With natural product drug discovery in mind, we devised a genome-mining approach to identify defense- and neuropeptides in the genomes of social ants from Atta cephalotes (leaf-cutter ant), Camponotus floridanus (carpenter ant) and Harpegnathos saltator (basal genus). Numerous peptide-encoding genes of defense peptides, in particular defensins, and neuropeptides or regulatory peptide hormones, such as allatostatins and tachykinins, were identified and analyzed. Most interestingly we annotated genes that encode oxytocin/vasopressin-related peptides (inotocins) and their putative receptors. This is the first piece of evidence for the existence of this nonapeptide hormone system in ants (Formicidae) and supports recent findings in Tribolium castaneum (red flour beetle) and Nasonia vitripennis (parasitoid wasp), and therefore its confinement to some basal holometabolous insects. By contrast, the absence of the inotocin hormone system in Apis mellifera (honeybee), another closely-related member of the eusocial Hymenoptera clade, establishes the basis for future studies on the molecular evolution and physiological function of oxytocin/vasopressin-related peptides (vasotocin nonapeptide family) and their receptors in social insects. Particularly the identification of ant inotocin and defensin peptide sequences will provide a basis for future pharmacological characterization in the quest for potent and selective lead compounds of therapeutic value. PMID:22448224

  4. Discovery of defense- and neuropeptides in social ants by genome-mining.

    Directory of Open Access Journals (Sweden)

    Christian W Gruber

    Full Text Available Natural peptides of great number and diversity occur in all organisms, but analyzing their peptidome is often difficult. With natural product drug discovery in mind, we devised a genome-mining approach to identify defense- and neuropeptides in the genomes of social ants from Atta cephalotes (leaf-cutter ant), Camponotus floridanus (carpenter ant) and Harpegnathos saltator (basal genus). Numerous peptide-encoding genes of defense peptides, in particular defensins, and neuropeptides or regulatory peptide hormones, such as allatostatins and tachykinins, were identified and analyzed. Most interestingly we annotated genes that encode oxytocin/vasopressin-related peptides (inotocins) and their putative receptors. This is the first piece of evidence for the existence of this nonapeptide hormone system in ants (Formicidae) and supports recent findings in Tribolium castaneum (red flour beetle) and Nasonia vitripennis (parasitoid wasp), and therefore its confinement to some basal holometabolous insects. By contrast, the absence of the inotocin hormone system in Apis mellifera (honeybee), another closely-related member of the eusocial Hymenoptera clade, establishes the basis for future studies on the molecular evolution and physiological function of oxytocin/vasopressin-related peptides (vasotocin nonapeptide family) and their receptors in social insects. Particularly the identification of ant inotocin and defensin peptide sequences will provide a basis for future pharmacological characterization in the quest for potent and selective lead compounds of therapeutic value.

  6. Genome mining demonstrates the widespread occurrence of gene clusters encoding bacteriocins in cyanobacteria.

    Directory of Open Access Journals (Sweden)

    Hao Wang

    Full Text Available Cyanobacteria are a rich source of natural products with interesting biological activities. Many of these are peptides and the end products of a non-ribosomal pathway. However, several cyanobacterial peptide classes were recently shown to be produced through the proteolytic cleavage and post-translational modification of short precursor peptides. A new class of bacteriocins produced through the proteolytic cleavage and heterocyclization of precursor proteins was recently identified from marine cyanobacteria. Here we show the widespread occurrence of bacteriocin gene clusters in cyanobacteria through comparative analysis of 58 cyanobacterial genomes. A total of 145 bacteriocin gene clusters were discovered through genome mining. These clusters encoded 290 putative bacteriocin precursors. They ranged in length from 28 to 164 amino acids with very little sequence conservation of the core peptide. The gene clusters could be classified into seven groups according to their gene organization and domain composition. This classification is supported by phylogenetic analysis, which further indicated independent evolutionary trajectories of gene clusters in different groups. Our data suggests that cyanobacteria are a prolific source of low-molecular weight post-translationally modified peptides.

  7. Integration of Automated Decision Support Systems with Data Mining Abstract: A Client Perspective

    Directory of Open Access Journals (Sweden)

    Abdullah Saad AL-Malaise

    2013-03-01

    Full Text Available Customers' behavior and satisfaction always play an important role in increasing an organization's growth and market value. Customers are a top priority for a growing organization building up its business. This paper presents the architecture of a Decision Support System (DSS) for dealing with customer enquiries and requests. The main purpose behind the proposed model is to enhance customer satisfaction and behavior using DSS. We propose a model that extends traditional DSS concepts with the integration of a Data Mining (DM) abstraction. The model presented in this paper shows a comprehensive architecture for handling customer requests using DSS and knowledge management (KM) to improve customer behavior and satisfaction. Furthermore, the DM abstraction provides additional methods and techniques: to understand the contacted customers' data, to classify the replied answers into a number of classes, to generate associations between queries of the same type, and finally to maintain the KM for future correspondence.

  8. An Automated Real-Time System for Opinion Mining using a Hybrid Approach

    Directory of Open Access Journals (Sweden)

    Indrajit Mukherjee

    2016-07-01

    Full Text Available In this paper, a novel idea is presented for performing Opinion Mining in a simple and efficient manner with the help of a One-Level-Tree (OLT) based approach, which recognizes opinions specific to features in customer reviews where a variety of features are commingled with diverse emotions. Unlike some previous efforts that rely entirely on one-time structured or filtered data, this work is based solely on unstructured data obtained in real time from Twitter. The hybrid approach utilizes the associations defined in dependency parsing grammar and fully employs Double Propagation to extract new features and related new opinions within the review. A dictionary-based approach is used to expand the opinion lexicon. Within the dependency parsing relations, a new relation is proposed to more effectively capture the associations between opinions and features. Three new methods, termed Double Positive Double Negative (DPDN), Catch-Phrase Method (CPM) and Negation Check (NC), are proposed for performing criteria-specific evaluations. The OLT approach conveniently displays the relationship between features and their opinions in an elementary fashion in the form of a graph. The proposed system achieves high accuracy across all domains and performs better than state-of-the-art systems.

  9. RiceGeneThresher: a web-based application for mining genes underlying QTL in rice genome.

    Science.gov (United States)

    Thongjuea, Supat; Ruanjaichon, Vinitchan; Bruskiewich, Richard; Vanavichit, Apichart

    2009-01-01

    RiceGeneThresher is a public online resource for mining genes underlying genome regions of interest or quantitative trait loci (QTL) in rice genome. It is a compendium of rice genomic resources consisting of genetic markers, genome annotation, expressed sequence tags (ESTs), protein domains, gene ontology, plant stress-responsive genes, metabolic pathways and prediction of protein-protein interactions. RiceGeneThresher system integrates these diverse data sources and provides powerful web-based applications, and flexible tools for delivering customized set of biological data on rice. Its system supports whole-genome gene mining for QTL by querying using DNA marker intervals or genomic loci. RiceGeneThresher provides biologically supported evidences that are essential for targeting groups or networks of genes involved in controlling traits underlying QTL. Users can use it to discover and to assign the most promising candidate genes in preparation for the further gene function validation analysis. The web-based application is freely available at http://rice.kps.ku.ac.th. PMID:18820292
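
    The core query behind the tool, returning annotated genes that fall inside a QTL marker interval and match a functional category of interest, can be mimicked in a few lines. The gene records, coordinates and the "stress-responsive" keyword below are made-up stand-ins for the databases integrated by RiceGeneThresher.

      # Hypothetical annotation records: (gene_id, chromosome, start, end, keywords).
      genes = [
          ("Os01g0100100", "chr01", 12000, 15800, {"kinase"}),
          ("Os01g0102300", "chr01", 85000, 90100, {"stress-responsive", "LEA"}),
          ("Os01g0105000", "chr01", 240000, 244500, {"transporter"}),
      ]

      def genes_in_qtl(genes, chrom, left_marker_bp, right_marker_bp, keyword=None):
          """Return genes lying inside the marker interval, optionally restricted
          to a functional keyword (e.g. stress-responsive genes)."""
          hits = []
          for gene_id, c, start, end, keywords in genes:
              if c != chrom or start < left_marker_bp or end > right_marker_bp:
                  continue
              if keyword is None or keyword in keywords:
                  hits.append(gene_id)
          return hits

      print(genes_in_qtl(genes, "chr01", 50000, 300000, keyword="stress-responsive"))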

  10. Future planning and evaluation for automated adaptive minehunting: a roadmap for mine countermeasures theory modernization

    Science.gov (United States)

    Garcia, Gregory A.; Wettergren, Thomas A.

    2012-06-01

    This paper presents a discussion of U.S. naval mine countermeasures (MCM) theory modernization in light of advances in the areas of autonomy, tactics, and sensor processing. The unifying theme spanning these research areas concerns the capability for in situ adaptation of processing algorithms, plans, and vehicle behaviors enabled through run-time situation assessment and performance estimation. Independently, each of these technology developments impact the MCM Measures of Effectiveness [MOE(s)] of time and risk by improving one or more associated Measures of Performance [MOP(s)]; the contribution of this paper is to outline an integrated strategy for realizing the cumulative benefits of these technology enablers to the United States Navy's minehunting capability. An introduction to the MCM problem is provided to frame the importance of the foundational research and the ramifications of the proposed strategy on the MIW community. We then include an overview of current and future adaptive capability research in the aforementioned areas, highlighting a departure from the existing rigid assumption-based approaches while identifying anticipated technology acceptance issues. Consequently, the paper describes an incremental strategy for transitioning from the current minehunting paradigm where tactical decision aids rely on a priori intelligence and there is little to no in situ adaptation or feedback to a future vision where unmanned systems, equipped with a representation of the commander's intent, are afforded the authority and ability to adapt to environmental perturbations with minimal human-in-the-loop supervision. The discussion concludes with an articulation of the science and technology issues which the MCM research community must continue to address.

  11. Mining metagenomic whole genome sequences revealed subdominant but constant Lactobacillus population in the human gut microbiota.

    Science.gov (United States)

    Rossi, Maddalena; Martínez-Martínez, Daniel; Amaretti, Alberto; Ulrici, Alessandro; Raimondi, Stefano; Moya, Andrés

    2016-06-01

    The genus Lactobacillus includes over 215 species that colonize plants, foods, sewage and the gastrointestinal tract (GIT) of humans and animals. In the GIT, the Lactobacillus population can be made up of true inhabitants or of bacteria occasionally ingested with fermented or spoiled foods, or with probiotics. This study longitudinally surveyed Lactobacillus species and strains in the feces of a healthy subject through whole genome sequencing (WGS) data-mining, in order to identify members of the permanent or transient populations. At three time points (0, 670 and 700 d), 58 different species were identified, 16 of them being retrieved for the first time in human feces. L. rhamnosus, L. ruminis, L. delbrueckii, L. plantarum, L. casei and L. acidophilus were the most represented, with estimated amounts ranging between 6 and 8 log cells g(-1), while the others were detected at 4 or 5 log cells g(-1). In total, 86 Lactobacillus strains belonging to 52 species were identified; 43 seemingly occupied the GIT as true residents, since they were detected over a time span of almost 2 years, in all three samples or in 2 samples separated by 670 or 700 d. As a whole, a stable community of lactobacilli was disclosed, with wide and understudied biodiversity. PMID:27043715

  12. Mining the genome of Rhodococcus fascians, a plant growth-promoting bacterium gone astray.

    Science.gov (United States)

    Francis, Isolde M; Stes, Elisabeth; Zhang, Yucheng; Rangel, Diana; Audenaert, Kris; Vereecke, Danny

    2016-09-25

    Rhodococcus fascians is a phytopathogenic Gram-positive Actinomycete with a very broad host range encompassing especially dicotyledonous herbaceous perennials, but also some monocots, such as the Liliaceae and, recently, the woody crop pistachio. The pathogenicity of R. fascians strain D188 is known to be encoded by the linear plasmid pFiD188 and to be dictated by its capacity to produce a mixture of cytokinins. Here, we show that D188-5, the nonpathogenic plasmid-free derivative of the wild-type strain D188 actually has a plant growth-promoting effect. With the availability of the genome sequence of R. fascians, the chromosome of strain D188 was mined for putative plant growth-promoting functions and the functionality of some of these activities was tested. This analysis together with previous results suggests that the plant growth-promoting activity of R. fascians is due to production of plant growth modulators, such as auxin and cytokinin, combined with degradation of ethylene through 1-amino-cyclopropane-1-carboxylic acid deaminase. Moreover, R. fascians has several functions that could contribute to efficient colonization and competitiveness, but there is little evidence for a strong impact on plant nutrition. Possibly, the plant growth promotion encoded by the D188 chromosome is imperative for the epiphytic phase of the life cycle of R. fascians and prepares the plant to host the bacteria, thus ensuring proper continuation into the pathogenic phase. PMID:26877150

  13. Draft Genome Sequence of Sinorhizobium meliloti CCNWSX0020, a Nitrogen-Fixing Symbiont with Copper Tolerance Capability Isolated from Lead-Zinc Mine Tailings

    Science.gov (United States)

    Li, Zhefei; Ma, Zhanqiang; Hao, Xiuli

    2012-01-01

    Sinorhizobium meliloti CCNWSX0020 was isolated from Medicago lupulina plants growing in lead-zinc mine tailings, which can establish a symbiotic relationship with Medicago species. Also, the genome of this bacterium contains a number of protein-coding sequences related to metal tolerance. We anticipate that the genomic sequence provides valuable information to explore environmental bioremediation. PMID:22328762

  14. Self-organizing Approach for Automated Gene Identification in Whole Genomes

    OpenAIRE

    Gorban, Alexander N; Zinovyev, Andrey Yu.; Popova, Tatyana G.

    2001-01-01

    An approach based on using the idea of distinguished coding phase in explicit form for identification of protein-coding regions (exons) in whole genome has been proposed. For several genomes an optimal window length for averaging GC-content function and calculating codon frequencies has been found. Self-training procedure based on clustering in multidimensional space of triplet frequencies is proposed. For visualization of data in the space of triplet requiencies method of elastic maps was ap...

  15. Data Mining Approaches for Genomic Biomarker Development: Applications Using Drug Screening Data from the Cancer Genome Project and the Cancer Cell Line Encyclopedia.

    Directory of Open Access Journals (Sweden)

    David G Covell

    Full Text Available Developing reliable biomarkers of tumor cell drug sensitivity and resistance can guide hypothesis-driven basic science research and influence pre-therapy clinical decisions. A popular strategy for developing biomarkers uses characterizations of human tumor samples against a range of cancer drug responses that correlate with genomic change; developed largely from the efforts of the Cancer Cell Line Encyclopedia (CCLE) and Sanger Cancer Genome Project (CGP). The purpose of this study is to provide an independent analysis of this data that aims to vet existing and add novel perspectives to biomarker discoveries and applications. Existing and alternative data mining and statistical methods will be used to a) evaluate drug responses of compounds with similar mechanism of action (MOA), b) examine measures of gene expression (GE), copy number (CN) and mutation status (MUT) biomarkers, combined with gene set enrichment analysis (GSEA), for hypothesizing biological processes important for drug response, c) conduct global comparisons of GE, CN and MUT as biomarkers across all drugs screened in the CGP dataset, and d) assess the positive predictive power of CGP-derived GE biomarkers as predictors of drug response in CCLE tumor cells. The perspectives derived from individual and global examinations of GEs, MUTs and CNs confirm existing and reveal unique and shared roles for these biomarkers in tumor cell drug sensitivity and resistance. Applications of CGP-derived genomic biomarkers to predict the drug response of CCLE tumor cells finds a highly significant ROC, with a positive predictive power of 0.78. The results of this study expand the available data mining and analysis methods for genomic biomarker development and provide additional support for using biomarkers to guide hypothesis-driven basic science research and pre-therapy clinical decisions.

  16. Data Mining Approaches for Genomic Biomarker Development: Applications Using Drug Screening Data from the Cancer Genome Project and the Cancer Cell Line Encyclopedia.

    Science.gov (United States)

    Covell, David G

    2015-01-01

    Developing reliable biomarkers of tumor cell drug sensitivity and resistance can guide hypothesis-driven basic science research and influence pre-therapy clinical decisions. A popular strategy for developing biomarkers uses characterizations of human tumor samples against a range of cancer drug responses that correlate with genomic change; developed largely from the efforts of the Cancer Cell Line Encyclopedia (CCLE) and Sanger Cancer Genome Project (CGP). The purpose of this study is to provide an independent analysis of this data that aims to vet existing and add novel perspectives to biomarker discoveries and applications. Existing and alternative data mining and statistical methods will be used to a) evaluate drug responses of compounds with similar mechanism of action (MOA), b) examine measures of gene expression (GE), copy number (CN) and mutation status (MUT) biomarkers, combined with gene set enrichment analysis (GSEA), for hypothesizing biological processes important for drug response, c) conduct global comparisons of GE, CN and MUT as biomarkers across all drugs screened in the CGP dataset, and d) assess the positive predictive power of CGP-derived GE biomarkers as predictors of drug response in CCLE tumor cells. The perspectives derived from individual and global examinations of GEs, MUTs and CNs confirm existing and reveal unique and shared roles for these biomarkers in tumor cell drug sensitivity and resistance. Applications of CGP-derived genomic biomarkers to predict the drug response of CCLE tumor cells finds a highly significant ROC, with a positive predictive power of 0.78. The results of this study expand the available data mining and analysis methods for genomic biomarker development and provide additional support for using biomarkers to guide hypothesis-driven basic science research and pre-therapy clinical decisions.
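
    The figures of merit quoted above, a significant ROC and a positive predictive power of 0.78, are standard metrics; the sketch below shows how they are typically computed. The random response labels, biomarker scores and decision threshold are placeholders, not data from the study.

      import numpy as np
      from sklearn.metrics import precision_score, roc_auc_score

      rng = np.random.default_rng(0)
      # Placeholder data: 1 = cell line responds to the drug, 0 = resistant,
      # with a biomarker-derived score that is mildly informative.
      y_true = rng.integers(0, 2, size=300)
      score = 0.6 * y_true + rng.normal(0.4, 0.25, size=300)

      auc = roc_auc_score(y_true, score)            # area under the ROC curve
      y_pred = (score >= 0.7).astype(int)           # arbitrary decision threshold
      ppv = precision_score(y_true, y_pred)         # positive predictive value

      print(f"ROC AUC = {auc:.2f}, PPV = {ppv:.2f}")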

  17. Genome-wide mining, characterization, and development of microsatellite markers in Marsupenaeus japonicus by genome survey sequencing

    Science.gov (United States)

    Lu, Xia; Luan, Sheng; Kong, Jie; Hu, Longyang; Mao, Yong; Zhong, Shengping

    2015-12-01

    The kuruma prawn, Marsupenaeus japonicus, is one of the most cultivated and consumed species of shrimp. However, very few molecular genetic/genomic resources are publically available for it. Thus, the characterization and distribution of simple sequence repeats (SSRs) remains ambiguous and the use of SSR markers in genomic studies and marker-assisted selection is limited. The goal of this study is to characterize and develop genome-wide SSR markers in M. japonicus by genome survey sequencing for application in comparative genomics and breeding. A total of 326 945 perfect SSRs were identified, among which dinucleotide repeats were the most frequent class (44.08%), followed by mononucleotides (29.67%), trinucleotides (18.96%), tetranucleotides (5.66%), hexanucleotides (1.07%), and pentanucleotides (0.56%). In total, 151 541 SSR loci primers were successfully designed. A subset of 30 SSR primer pairs were synthesized and tested in 42 individuals from a wild population, of which 27 loci (90.0%) were successfully amplified with specific products and 24 (80.0%) were polymorphic. For the amplified polymorphic loci, the alleles ranged from 5 to 17 (with an average of 9.63), and the average PIC value was 0.796. A total of 58 256 SSR-containing sequences had significant Gene Ontology annotation; these are good functional molecular marker candidates for association studies and comparative genomic analysis. The newly identified SSRs significantly contribute to the M. japonicus genomic resources and will facilitate a number of genetic and genomic studies, including high density linkage mapping, genome-wide association analysis, marker-aided selection, comparative genomics analysis, population genetics, and evolution.
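
    The perfect-SSR detection step can be sketched with a back-referencing regular expression per motif length, as below. The minimum copy numbers are illustrative guesses, and compound or imperfect SSRs, which dedicated pipelines also report, are ignored here.

      import re

      # Minimum number of tandem copies per motif length (illustrative values).
      MIN_REPEATS = {1: 10, 2: 6, 3: 5, 4: 5, 5: 4, 6: 4}

      def find_perfect_ssrs(seq):
          """Yield (start, motif, copies) for perfect microsatellites in `seq`."""
          seq = seq.upper()
          for k, min_rep in MIN_REPEATS.items():
              pattern = re.compile(r"(([ACGT]{%d})\2{%d,})" % (k, min_rep - 1))
              for m in pattern.finditer(seq):
                  motif = m.group(2)
                  if k > 1 and len(set(motif)) == 1:
                      continue  # skip homopolymers reported under a longer motif
                  yield m.start(), motif, len(m.group(1)) // k

      demo = "GGTACACACACACACACATTGGCAGCAGCAGCAGCAGTT"
      for start, motif, copies in find_perfect_ssrs(demo):
          print(start, motif, copies)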

  18. antiSMASH 3.0—a comprehensive resource for the genome mining of biosynthetic gene clusters

    DEFF Research Database (Denmark)

    Weber, Tilmann; Blin, Kai; Duddela, Srikanth;

    2015-01-01

    Microbial secondary metabolism constitutes a rich source of antibiotics, chemotherapeutics, insecticides and other high-value chemicals. Genome mining of gene clusters that encode the biosynthetic pathways for these metabolites has become a key methodology for novel compound discovery. In 2011, we...... of the recently published ClusterFinder algorithm now allows using this probabilistic algorithm to detect putative gene clusters of unknown types. Also, a new dereplication variant of the ClusterBlast module now identifies similarities of identified clusters to any of 1172 clusters with known end products...

  19. REFGEN and TREENAMER: Automated Sequence Data Handling for Phylogenetic Analysis in the Genomic Era

    Directory of Open Access Journals (Sweden)

    Guy Leonard

    2009-01-01

    Full Text Available The phylogenetic analysis of nucleotide sequences and increasingly that of amino acid sequences is used to address a number of biological questions. Access to extensive datasets, including numerous genome projects, means that standard phylogenetic analyses can include many hundreds of sequences. Unfortunately, most phylogenetic analysis programs do not tolerate the sequence naming conventions of genome databases. Managing large numbers of sequences and standardizing sequence labels for use in phylogenetic analysis programs can be a time consuming and laborious task. Here we report the availability of an online resource for the management of gene sequences recovered from public access genome databases such as GenBank. These web utilities include the facility for renaming every sequence in a FASTA alignment file, with each sequence label derived from a user-defined combination of the species name and/or database accession number. This facility enables the user to keep track of the branching order of the sequences/taxa during multiple tree calculations and re-optimisations. Post phylogenetic analysis, these webpages can then be used to rename every label in the subsequent tree files (with a user-defined combination of species name and/or database accession number). Together these programs drastically reduce the time required for managing sequence alignments and labelling phylogenetic figures. Additional features of our platform include the automatic removal of identical accession numbers (recorded in the report file) and generation of species and accession number lists for use in supplementary materials or figure legends.
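
    The renaming facility described above can be approximated with a short script that rewrites each FASTA header into a Species_Accession label and keeps a mapping back to the original headers for relabelling tree files later. The ">ACCESSION Genus species ..." header layout assumed below is a simplification made for the sketch; real GenBank headers vary.

      import re

      def relabel_fasta(fasta_text):
          """Rewrite FASTA headers to 'Genus_species_ACCESSION' labels and return
          the rewritten FASTA together with a label-to-original-header mapping."""
          out_lines, mapping = [], {}
          for line in fasta_text.splitlines():
              if line.startswith(">"):
                  header = line[1:].strip()
                  parts = header.split()
                  if len(parts) < 3:          # header shorter than the assumed layout
                      out_lines.append(line)
                      continue
                  accession, genus, species = parts[0], parts[1], parts[2]
                  label = re.sub(r"[^A-Za-z0-9_.]", "_", f"{genus}_{species}_{accession}")
                  mapping[label] = header
                  out_lines.append(">" + label)
              else:
                  out_lines.append(line)
          return "\n".join(out_lines), mapping

      demo = ">AB123456.1 Arabidopsis thaliana hypothetical protein\nMKLVII\n"
      renamed, mapping = relabel_fasta(demo)
      print(renamed)
      print(mapping)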

  20. Data mining of high density genomic variant data for prediction of Alzheimer's disease risk

    Directory of Open Access Journals (Sweden)

    Briones Natalia

    2012-01-01

    Full Text Available Abstract Background The discovery of genetic associations is an important factor in the understanding of human illness to derive disease pathways. Identifying multiple interacting genetic mutations associated with disease remains challenging in studying the etiology of complex diseases. And although recently new single nucleotide polymorphisms (SNPs) at genes implicated in immune response, cholesterol/lipid metabolism, and cell membrane processes have been confirmed by genome-wide association studies (GWAS) to be associated with late-onset Alzheimer's disease (LOAD), a percentage of AD heritability continues to be unexplained. We try to find other genetic variants that may influence LOAD risk utilizing data mining methods. Methods Two different approaches were devised to select SNPs associated with LOAD in a publicly available GWAS data set consisting of three cohorts. In both approaches, single-locus analysis (logistic regression) was conducted to filter the data with a less conservative p-value than the Bonferroni threshold; this resulted in a subset of SNPs used next in multi-locus analysis (random forest (RF)). In the second approach, we took into account prior biological knowledge, and performed sample stratification and linkage disequilibrium (LD) in addition to logistic regression analysis to preselect loci to input into the RF classifier construction step. Results The first approach gave 199 SNPs mostly associated with genes in calcium signaling, cell adhesion, endocytosis, immune response, and synaptic function. These SNPs together with APOE and GAB2 SNPs formed a predictive subset for LOAD status with an average error of 9.8% using 10-fold cross validation (CV) in RF modeling. Nineteen variants in LD with ST5, TRPC1, ATG10, ANO3, NDUFA12, and NISCH respectively, genes linked directly or indirectly with neurobiology, were identified with the second approach. These variants were part of a model that included APOE and GAB2 SNPs to predict LOAD
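
    A compact sketch of the first approach described above, a lenient single-locus logistic-regression filter followed by multi-locus random forest modelling with 10-fold cross-validation, is given below. The simulated genotypes and phenotype, the p-value cut-off and the forest settings are illustrative assumptions rather than the study's parameters, and the biology-guided stratification and LD steps of the second approach are omitted.

      import numpy as np
      import statsmodels.api as sm
      from sklearn.ensemble import RandomForestClassifier
      from sklearn.model_selection import StratifiedKFold, cross_val_score

      rng = np.random.default_rng(0)
      n_subjects, n_snps = 400, 2000
      X = rng.integers(0, 3, size=(n_subjects, n_snps)).astype(float)  # 0/1/2 genotypes
      # Simulated case/control status driven by a handful of causal SNPs.
      causal = rng.choice(n_snps, 10, replace=False)
      logit = X[:, causal].sum(axis=1) - 10
      y = (rng.random(n_subjects) < 1 / (1 + np.exp(-logit))).astype(int)

      # Stage 1: single-locus logistic regression with a lenient p-value filter.
      keep = []
      for j in range(n_snps):
          fit = sm.Logit(y, sm.add_constant(X[:, j])).fit(disp=0)
          if fit.pvalues[1] < 1e-3:        # far less strict than Bonferroni
              keep.append(j)

      # Stage 2: multi-locus random forest on the surviving SNPs, 10-fold CV.
      rf = RandomForestClassifier(n_estimators=500, random_state=0)
      cv = StratifiedKFold(n_splits=10, shuffle=True, random_state=0)
      acc = cross_val_score(rf, X[:, keep], y, cv=cv).mean()
      print(f"{len(keep)} SNPs kept; 10-fold CV accuracy = {acc:.2f}")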

  1. In silico mining of putative microsatellite markers from whole genome sequence of water buffalo (Bubalus bubalis) and development of first BuffSatDB

    OpenAIRE

    Sarika; Arora Vasu; Iquebal Mir; Rai Anil; Kumar Dinesh

    2013-01-01

    Abstract Background Though India has sequenced the water buffalo genome, its draft assembly is based on the cattle genome BTau 4.0; thus a de novo chromosome-wise assembly is a major pending issue for the global community. The existing buffalo radiation hybrid panel and the STRs reported here can be used further in the final gap plugging and "finishing" expected in a de novo genome assembly. QTL and gene mapping need mining of putative STRs from the buffalo genome at equal intervals on each and every chromosome. Such mar...

  2. A framework for automated enrichment of functionally significant inverted repeats in whole genomes

    Directory of Open Access Journals (Sweden)

    Frank Ronald L

    2010-10-01

    Full Text Available Abstract Background RNA transcripts from genomic sequences showing dyad symmetry typically adopt hairpin-like, cloverleaf, or similar structures that act as recognition sites for proteins. Such structures often are the precursors of non-coding RNA (ncRNA) sequences like microRNA (miRNA) and small-interfering RNA (siRNA) that have recently garnered more functional significance than in the past. Genomic DNA contains hundreds of thousands of such inverted repeats (IRs) with varying degrees of symmetry. But by collecting statistically significant information from a known set of ncRNA, we can sort these IRs into those that are likely to be functional. Results A novel method was developed to scan genomic DNA for partially symmetric inverted repeats and the resulting set was further refined to match miRNA precursors (pre-miRNA) with respect to their density of symmetry, statistical probability of the symmetry, length of stems in the predicted hairpin secondary structure, and the GC content of the stems. This method was applied on the Arabidopsis thaliana genome and validated against the set of 190 known Arabidopsis pre-miRNA in the miRBase database. A preliminary scan for IRs identified 186 of the known pre-miRNA but with 714700 pre-miRNA candidates. This large number of IRs was further refined to 483908 candidates with 183 pre-miRNA identified and further still to 165371 candidates with 171 pre-miRNA identified (i.e. with 90% of the known pre-miRNA retained). Conclusions 165371 candidates for potentially functional miRNA is still too large a set to warrant wet lab analyses, such as northern blotting, on all of them. Hence additional filters are needed to further refine the number of candidates while still retaining most of the known miRNA. These include detection of promoters and terminators, homology analyses, location of candidate relative to coding regions, and better secondary structure prediction algorithms. The software developed is designed to easily
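
    The preliminary IR scan can be illustrated with a brute-force sketch that, for each position, looks for a downstream reverse complement of a fixed-length arm within a bounded loop. The arm length and loop window are arbitrary assumptions; the symmetry-density scoring, statistical filtering and secondary-structure checks of the published framework are not reproduced here.

      COMPLEMENT = str.maketrans("ACGT", "TGCA")

      def revcomp(s):
          """Reverse complement of a DNA string (ACGT alphabet assumed)."""
          return s.translate(COMPLEMENT)[::-1]

      def find_inverted_repeats(seq, arm_len=12, min_loop=3, max_loop=60):
          """Yield (arm1_start, arm2_start, arm) for perfect inverted repeats:
          an arm followed, after a short loop, by its reverse complement."""
          seq = seq.upper()
          n = len(seq)
          for i in range(n - 2 * arm_len - min_loop + 1):
              arm = seq[i:i + arm_len]
              lo = i + arm_len + min_loop                     # earliest start of second arm
              hi = min(i + arm_len + max_loop, n - arm_len)   # latest start of second arm
              j = seq.find(revcomp(arm), lo, hi + arm_len)
              if j != -1:
                  yield i, j, arm

      # Tiny constructed example: an arm, a loop, then the arm's reverse complement.
      arm = "ACGGATCCGAGT"
      demo = "TT" + arm + "AAATTTCC" + revcomp(arm) + "GGA"
      for hit in find_inverted_repeats(demo):
          print(hit)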

  3. Genome-wide data-mining of candidate human splice translational efficiency polymorphisms (STEPs) and an online database.

    Directory of Open Access Journals (Sweden)

    Christopher A Raistrick

    Full Text Available BACKGROUND: Variation in pre-mRNA splicing is common and in some cases caused by genetic variants in intronic splicing motifs. Recent studies into the insulin gene (INS) discovered a polymorphism in a 5' non-coding intron that influences the likelihood of intron retention in the final mRNA, extending the 5' untranslated region and maintaining protein quality. Retention was also associated with increased insulin levels, suggesting that such variants--splice translational efficiency polymorphisms (STEPs)--may relate to disease phenotypes through differential protein expression. We set out to explore the prevalence of STEPs in the human genome and validate this new category of protein quantitative trait loci (pQTL) using publicly available data. METHODOLOGY/PRINCIPAL FINDINGS: Gene transcript and variant data were collected and mined for candidate STEPs in motif regions. Sequences from transcripts containing potential STEPs were analysed for evidence of splice site recognition and an effect in expressed sequence tags (ESTs). 16 publicly released genome-wide association data sets of common diseases were searched for association to candidate polymorphisms with HapMap frequency data. Our study found 3324 candidate STEPs lying in motif sequences of 5' non-coding introns and further mining revealed 170 with transcript evidence of intron retention. 21 potential STEPs had EST evidence of intron retention or exon extension, as well as population frequency data for comparison. CONCLUSIONS/SIGNIFICANCE: Results suggest that the insulin STEP was not a unique example and that many STEPs may occur genome-wide with potentially causal effects in complex disease. An online database of STEPs is freely accessible at http://dbstep.genes.org.uk/.

  4. Evaluating the Strengths and Weaknesses of Mining Audit Data for Automated Models for Intrusion Detection in Tcpdump and Basic Security Module Data

    Directory of Open Access Journals (Sweden)

    A. Arul Lawrence Selvakumar

    2012-01-01

    Full Text Available Problem statement: Intrusion Detection Systems (IDS) have become an important component of the infrastructure protection mechanism that secures current and emerging networks, their services and applications by detecting, alerting and taking the necessary actions against malicious activities. The network size, technology diversity and security policies make networks more challenging, and hence there is a requirement for an IDS that is very accurate, adaptive, extensible and reliable. Although there exists a framework for this requirement, namely Mining Audit Data for Automated Models for Intrusion Detection (MADAM ID), it has some performance shortfalls in processing the audit data. Approach: Experiments were conducted on tcpdump data of DARPA and on BSM audit files by applying the algorithms and tools of MADAM ID to process audit data, mine patterns, construct features and build RIPPER classifiers. Putting it all together, four main categories of attacks, namely DOS, R2L, U2R and PROBING attacks, were simulated. Results: This study outlines the experimental results of MADAM ID in testing the DARPA and BSM data in a simulated network environment. Conclusion: The strengths and weaknesses of MADAM ID have been identified through the experiments conducted on tcpdump data and also on Pascal-based audit files of the Basic Security Module (BSM). This study also gives some additional directions for future applications of MADAM ID.

  5. Genome data mining of lactic acid bacteria: the impact of bioinformatics

    NARCIS (Netherlands)

    Siezen, R.J.; Enckevort, F.H.J. van; Kleerebezem, M.; Teusink, B.

    2004-01-01

    Lactic acid bacteria (LAB) have been widely used in food fermentations and, more recently, as probiotics in health-promoting food products. Genome sequencing and functional genomics studies of a variety of LAB are now rapidly providing insights into their diversity and evolution and revealing the mo

  6. Chicken genome mapping - Constructing part of a road map for mining this bird's DNA

    NARCIS (Netherlands)

    Aerts, J.

    2005-01-01

    The aim of the research presented in this thesis was to aid in the international chicken genome mapping effort. To this purpose, a significant contribution was made to the construction of the chicken whole-genome BAC-based physical map (presented in Chapter A). An important aspect of this constructi

  7. CisMiner: genome-wide in-silico cis-regulatory module prediction by fuzzy itemset mining.

    Directory of Open Access Journals (Sweden)

    Carmen Navarro

    Full Text Available Eukaryotic gene control regions are known to be spread throughout non-coding DNA sequences which may appear distant from the gene promoter. Transcription factors are proteins that coordinately bind to these regions at transcription factor binding sites to regulate gene expression. Several tools allow detecting significant co-occurrences of closely located binding sites (cis-regulatory modules, CRMs). However, these tools present at least one of the following limitations: 1) scope limited to promoter or conserved regions of the genome; 2) inability to identify combinations involving more than two motifs; 3) need for prior information about target motifs. In this work we present CisMiner, a novel methodology to detect putative CRMs by means of a fuzzy itemset mining approach able to operate at genome-wide scale. CisMiner allows performing a blind search of CRMs without any prior information about target CRMs nor any limitation in the number of motifs. CisMiner tackles the combinatorial complexity of genome-wide cis-regulatory module extraction using a natural representation of motif combinations as itemsets and applying the Top-Down Fuzzy Frequent-Pattern Tree algorithm to identify significant itemsets. Fuzzy technology allows CisMiner to better handle the imprecision and noise inherent to regulatory processes. Results obtained for a set of well-known binding sites in the S. cerevisiae genome show that our method yields highly reliable predictions. Furthermore, CisMiner was also applied to putative in-silico predicted transcription factor binding sites to identify significant combinations in S. cerevisiae and D. melanogaster, proving that our approach can be further applied genome-wide to more complex genomes. CisMiner is freely accessible at: http://genome2.ugr.es/cisminer. CisMiner can be queried for the results presented in this work and can also perform a customized cis-regulatory module prediction on a query set of transcription factor binding
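
    The itemset representation at the heart of this approach can be illustrated with a crisp (non-fuzzy) co-occurrence count: each genomic window contributes the set of motifs detected in it, and motif combinations with sufficient support are candidate CRMs. The per-window motif sets, the support threshold and the use of plain counting instead of the Top-Down Fuzzy Frequent-Pattern Tree algorithm are simplifications made for this sketch.

      from collections import Counter
      from itertools import combinations

      # Hypothetical per-window motif hits (one set of TF motifs per genomic window).
      windows = [
          {"MCB", "SCB", "STE12"},
          {"MCB", "SCB"},
          {"GCN4", "STE12"},
          {"MCB", "SCB", "GCN4"},
          {"MCB", "SCB", "STE12"},
      ]

      def frequent_motif_itemsets(windows, max_size=3, min_support=0.4):
          """Count motif combinations across windows and keep those whose support
          (fraction of windows containing the whole combination) is high enough."""
          counts = Counter()
          for w in windows:
              for size in range(2, max_size + 1):
                  for combo in combinations(sorted(w), size):
                      counts[combo] += 1
          n = len(windows)
          return {c: k / n for c, k in counts.items() if k / n >= min_support}

      for combo, support in sorted(frequent_motif_itemsets(windows).items()):
          print(combo, round(support, 2))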

  8. Draft Genome Sequence of Halomonas sp. Strain HAL1, a Moderately Halophilic Arsenite-Oxidizing Bacterium Isolated from Gold-Mine Soil

    OpenAIRE

    Lin, Yanbing; Fan, Haoxin; Hao, Xiuli; Johnstone, Laurel; Hu, Yao; Wei, Gehong; Alwathnani, Hend A.; Wang, Gejiao; Rensing, Christopher

    2012-01-01

    We report the draft genome sequence of arsenite-oxidizing Halomonas sp. strain HAL1, isolated from the soil of a gold mine. Genes encoding proteins involved in arsenic resistance and transformation, phosphate utilization and uptake, and betaine biosynthesis were identified. Their identification might help in understanding how arsenic and phosphate metabolism are intertwined.

  9. Development of direct methanol fuel cells for the applications in mining and tunnelling. Automation and power conditioning of a fuel cell-battery hybrid system

    Energy Technology Data Exchange (ETDEWEB)

    Kulakarni, Sreekantha Rao

    2012-07-01

    appropriate option for applications in underground mining and tunneling. The specific advantages of DMFCs are simple structure, higher energy density of the fuel (i.e. methanol), low operating temperature, lower weight, clean and quiet operation. Methanol is in liquid form so it is easy to transport and store. Moreover, methanol is a renewable fuel that can be produced from biomass. This doctoral research work focused on the construction of a DMFC stack of 30 W electrical power and the testing of the fuel cell stack in underground mining for the applications discussed above. Not only the stack itself, but also the automated system for the fuel cell and battery hybrid system was developed. For automation of the system, a micro-controller monitoring system was developed, which uses sensors for voltage, current, temperature, methanol concentration and liquid level. Development and testing of the methanol concentration sensor was considered as the heart of the research work. Last but not least, the power conditioning of the fuel cell stack as well as the battery charging techniques developed were also part of the research work.
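
    The monitoring layer described above (a micro-controller polling voltage, current, temperature, methanol concentration and liquid level sensors) can be pictured with a minimal threshold-checking loop. The Python sketch below is purely illustrative: the read_sensor stub, the limits and the sampling interval are assumptions, not the firmware developed in the thesis.

        import random
        import time

        # Hypothetical sensor readout; a real system would query the
        # micro-controller's ADC channels instead of returning random values.
        def read_sensor(name):
            nominal = {"voltage_V": 24.0, "current_A": 1.2, "temperature_C": 45.0,
                       "methanol_conc_M": 1.0, "liquid_level_pct": 80.0}
            return nominal[name] * random.uniform(0.9, 1.1)

        # Illustrative acceptable ranges for each monitored quantity.
        LIMITS = {"voltage_V": (20.0, 28.0), "current_A": (0.0, 2.0),
                  "temperature_C": (20.0, 70.0), "methanol_conc_M": (0.5, 1.5),
                  "liquid_level_pct": (30.0, 100.0)}

        def check_once():
            for name, (low, high) in LIMITS.items():
                value = read_sensor(name)
                status = "OK" if low <= value <= high else "ALARM"
                print(f"{name}: {value:.2f} [{status}]")

        if __name__ == "__main__":
            for _ in range(3):          # a real controller would loop continuously
                check_once()
                time.sleep(1)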

  10. Genome mining of the hitachimycin biosynthetic gene cluster: involvement of a phenylalanine-2,3-aminomutase in biosynthesis.

    Science.gov (United States)

    Kudo, Fumitaka; Kawamura, Koichi; Uchino, Asuka; Miyanaga, Akimasa; Numakura, Mario; Takayanagi, Ryuichi; Eguchi, Tadashi

    2015-04-13

    Hitachimycin is a macrolactam antibiotic with (S)-β-phenylalanine (β-Phe) at the starter position of its polyketide skeleton. To understand the incorporation mechanism of β-Phe and the modification mechanism of the unique polyketide skeleton, the biosynthetic gene cluster for hitachimycin in Streptomyces scabrisporus was identified by genome mining. The identified gene cluster contains a putative phenylalanine-2,3-aminomutase (PAM), five polyketide synthases, four β-amino-acid-carrying enzymes, and a characteristic amidohydrolase. A hitA knockout mutant showed no hitachimycin production, but antibiotic production was restored by feeding with (S)-β-Phe. We also confirmed the enzymatic activity of the HitA PAM. The results suggest that the identified gene cluster is responsible for the biosynthesis of hitachimycin. A plausible biosynthetic pathway for hitachimycin, including a unique polyketide skeletal transformation mechanism, is proposed.

  11. EST2uni: an open, parallel tool for automated EST analysis and database creation, with a data mining web interface and microarray expression data integration

    Directory of Open Access Journals (Sweden)

    Nuez Fernando

    2008-01-01

    Full Text Available Abstract Background Expressed sequence tag (EST) collections are composed of a high number of single-pass, redundant, partial sequences, which need to be processed, clustered, and annotated to remove low-quality and vector regions, eliminate redundancy and sequencing errors, and provide biologically relevant information. In order to provide a suitable way of performing the different steps in the analysis of the ESTs, flexible computation pipelines adapted to the local needs of specific EST projects have to be developed. Furthermore, EST collections must be stored in highly structured relational databases available to researchers through user-friendly interfaces which allow efficient and complex data mining, thus offering maximum capabilities for their full exploitation. Results We have created EST2uni, an integrated, highly-configurable EST analysis pipeline and data mining software package that automates the pre-processing, clustering, annotation, database creation, and data mining of EST collections. The pipeline uses standard EST analysis tools and the software has a modular design to facilitate the addition of new analytical methods and their configuration. Currently implemented analyses include functional and structural annotation, SNP and microsatellite discovery, integration of previously known genetic marker data and gene expression results, and assistance in cDNA microarray design. It can be run in parallel in a PC cluster in order to reduce the time necessary for the analysis. It also creates a web site linked to the database, showing collection statistics, with complex query capabilities and tools for data mining and retrieval. Conclusion The software package presented here provides an efficient and complete bioinformatics tool for the management of EST collections which is very easy to adapt to the local needs of different EST projects. The code is freely available under the GPL license and can be obtained at http
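
    A schematic of the kind of configurable, modular pipeline described above might look like the Python sketch below; the step functions are placeholders standing in for real pre-processing, clustering and annotation tools, and the sketch is not EST2uni's actual implementation.

        # Each step takes a list of sequence records and returns a transformed list.
        def trim_low_quality(seqs):      # placeholder for quality/vector trimming
            return [s for s in seqs if len(s) > 100]

        def cluster(seqs):               # placeholder for redundancy removal/clustering
            return sorted(set(seqs))

        def annotate(seqs):              # placeholder for similarity-based annotation
            return [(s, "hypothetical protein") for s in seqs]

        PIPELINE = [trim_low_quality, cluster, annotate]   # configurable step list

        def run_pipeline(seqs):
            data = seqs
            for step in PIPELINE:        # modular design: steps can be added or swapped
                data = step(data)
            return data

        if __name__ == "__main__":
            ests = ["ACGT" * 30, "ACGT" * 30, "AC" * 10]
            print(run_pipeline(ests))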

  12. Process mining

    DEFF Research Database (Denmark)

    van der Aalst, W.M.P.; Rubin, V.; Verbeek, H.M.W.;

    2010-01-01

    Process mining includes the automated discovery of processes from event logs. Based on observed events (e.g., activities being executed or messages being exchanged) a process model is constructed. One of the essential problems in process mining is that one cannot assume to have seen all possible behavior. At best, one has seen a representative subset. Therefore, classical synthesis techniques are not suitable as they aim at finding a model that is able to exactly reproduce the log. Existing process mining techniques try to avoid such “overfitting” by generalizing the model to allow for more...
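
    A minimal illustration of process discovery from an event log is a directly-follows count, shown in the Python sketch below with an invented log. It is far simpler than the synthesis techniques discussed in the paper, but it makes the overfitting concern concrete: a model built only from the observed pairs cannot generalize beyond them.

        from collections import Counter

        # Hypothetical event log: each trace is the ordered list of executed activities.
        log = [
            ["register", "check", "decide", "notify"],
            ["register", "check", "check", "decide", "notify"],
            ["register", "decide", "notify"],
        ]

        # Count how often activity a is directly followed by activity b.
        directly_follows = Counter()
        for trace in log:
            for a, b in zip(trace, trace[1:]):
                directly_follows[(a, b)] += 1

        # A model derived only from these observed pairs reproduces the log exactly
        # but does not generalize -- the "overfitting" issue the abstract refers to.
        for (a, b), n in directly_follows.most_common():
            print(f"{a} -> {b}: {n}")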

  13. InCoB2014: mining biological data from genomics for transforming industry and health.

    Science.gov (United States)

    Schönbach, Christian; Tan, Tin; Ranganathan, Shoba

    2014-01-01

    The 13th International Conference on Bioinformatics (InCoB2014) was held for the first time in Australia, in Sydney, from July 31 to August 2, 2014. InCoB is the annual scientific gathering of the Asia-Pacific Bioinformatics Network (APBioNet), hosted since 2002 in the Asia-Pacific region. Of 106 full papers submitted to the BMC track of InCoB2014, 50 (47.2%) were accepted in BMC Bioinformatics, BMC Genomics and BMC Systems Biology supplements, with three papers in a new BMC Medical Genomics supplement. While the majority of presenters and authors were from Asia and Australia, the increasing number of US and European conference attendees augurs well for the international flavour of InCoB. Next year's InCoB will be held jointly with the Genome Informatics Workshop (GIW), September 9-11, 2015 in Tokyo, Japan, with a view to integrating the bioinformatics communities in the region.

  14. Identification of candidate genes in Populus cell wall biosynthesis using text-mining, co-expression network and comparative genomics

    Energy Technology Data Exchange (ETDEWEB)

    Yang, Xiaohan [ORNL]; Ye, Chuyu [ORNL]; Bisaria, Anjali [ORNL]; Tuskan, Gerald A [ORNL]; Kalluri, Udaya C [ORNL]

    2011-01-01

    Populus is an important bioenergy crop for bioethanol production. A greater understanding of cell wall biosynthesis processes is critical in reducing biomass recalcitrance, a major hindrance in efficient generation of ethanol from lignocellulosic biomass. Here, we report the identification of candidate cell wall biosynthesis genes through the development and application of a novel bioinformatics pipeline. As a first step, via text-mining of PubMed publications, we obtained 121 Arabidopsis genes that had experimental evidence supporting their involvement in cell wall biosynthesis or remodeling. The 121 genes were then used as bait genes to query an Arabidopsis co-expression database, and additional genes were identified as neighbors of the bait genes in the network, increasing the number of genes to 548. The 548 Arabidopsis genes were then used to re-query the Arabidopsis co-expression database and re-construct a network that captured additional network neighbors, expanding to a total of 694 genes. The 694 Arabidopsis genes were computationally divided into 22 clusters. Queries of the Populus genome using the Arabidopsis genes revealed 817 Populus orthologs. Functional analysis of gene ontology and tissue-specific gene expression indicated that these Arabidopsis and Populus genes are high-likelihood candidates for functional genomics in relation to cell wall biosynthesis.
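
    The bait-and-expand step can be sketched as repeated neighbor expansion over a co-expression network. In the Python sketch below the network is an in-memory dictionary with invented gene identifiers, standing in for queries against the Arabidopsis co-expression database used in the study.

        # Hypothetical co-expression network: gene -> set of co-expressed neighbours.
        network = {
            "CESA4": {"CESA7", "CESA8", "IRX9"},
            "CESA7": {"CESA4", "CESA8"},
            "CESA8": {"CESA4", "CESA7", "IRX10"},
            "IRX9":  {"CESA4", "IRX10"},
            "IRX10": {"CESA8", "IRX9", "GUX1"},
        }

        def expand(bait_genes, rounds=2):
            """Add network neighbours of the current gene set, `rounds` times."""
            genes = set(bait_genes)
            for _ in range(rounds):
                neighbours = set()
                for g in genes:
                    neighbours |= network.get(g, set())
                genes |= neighbours
            return genes

        # Two rounds of expansion, mirroring the 121 -> 548 -> 694 gene growth.
        print(sorted(expand({"CESA4"})))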

  15. antiSMASH 3.0-a comprehensive resource for the genome mining of biosynthetic gene clusters.

    Science.gov (United States)

    Weber, Tilmann; Blin, Kai; Duddela, Srikanth; Krug, Daniel; Kim, Hyun Uk; Bruccoleri, Robert; Lee, Sang Yup; Fischbach, Michael A; Müller, Rolf; Wohlleben, Wolfgang; Breitling, Rainer; Takano, Eriko; Medema, Marnix H

    2015-07-01

    Microbial secondary metabolism constitutes a rich source of antibiotics, chemotherapeutics, insecticides and other high-value chemicals. Genome mining of gene clusters that encode the biosynthetic pathways for these metabolites has become a key methodology for novel compound discovery. In 2011, we introduced antiSMASH, a web server and stand-alone tool for the automatic genomic identification and analysis of biosynthetic gene clusters, available at http://antismash.secondarymetabolites.org. Here, we present version 3.0 of antiSMASH, which has undergone major improvements. A full integration of the recently published ClusterFinder algorithm now allows using this probabilistic algorithm to detect putative gene clusters of unknown types. Also, a new dereplication variant of the ClusterBlast module now identifies similarities of identified clusters to any of 1172 clusters with known end products. At the enzyme level, active sites of key biosynthetic enzymes are now pinpointed through a curated pattern-matching procedure and Enzyme Commission numbers are assigned to functionally classify all enzyme-coding genes. Additionally, chemical structure prediction has been improved by incorporating polyketide reduction states. Finally, in order for users to be able to organize and analyze multiple antiSMASH outputs in a private setting, a new XML output module allows offline editing of antiSMASH annotations within the Geneious software.

  16. Genomic analyses of metal resistance genes in three plant growth promoting bacteria of legume plants in Northwest mine tailings, China

    Institute of Scientific and Technical Information of China (English)

    Pin Xie; Xiuli Hao; Martin Herzberg; Yantao Luo; Dietrich H.Nies; Gehong Wei

    2015-01-01

    To better understand the diversity of metal resistance genetic determinants in microbes that survive in the metal tailings of northwest China, a region with highly elevated levels of heavy metals, genomic analyses were conducted using the genome sequences of three native metal-resistant plant growth promoting bacteria (PGPB). The analyses show that Mesorhizobium amorphae CCNWGS0123 contains metal transporters of the P-type ATPase, CDF (Cation Diffusion Facilitator), HupE/UreJ and CHR (chromate ion transporter) families involved in copper, zinc, nickel and chromate resistance and homeostasis. Meanwhile, the putative CopA/CueO system is expected to mediate copper resistance in Sinorhizobium meliloti CCNWSX0020, while the ZntA transporter, assisted by the putative CzcD, determines zinc tolerance in Agrobacterium tumefaciens CCNWGS0286. A greenhouse experiment provides consistent evidence of the plant growth promoting effects of these microbes on their hosts through nitrogen fixation and/or indoleacetic acid (IAA) secretion, indicating a potential use for in situ phytoremediation in the mine tailing regions of China.

  17. PLAN: a web platform for automating high-throughput BLAST searches and for managing and mining results

    OpenAIRE

    Zhao Xuechun; Dai Xinbin; He Ji

    2007-01-01

    Abstract Background BLAST searches are widely used for sequence alignment. The search results are commonly adopted for various functional and comparative genomics tasks such as annotating unknown sequences, investigating gene models and comparing two sequence sets. Advances in sequencing technologies pose challenges for high-throughput analysis of large-scale sequence data. A number of programs and hardware solutions exist for efficient BLAST searching, but there is a lack of generic software...
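
    Batch BLAST searching of the kind PLAN manages can be approximated with a short driver script. The sketch below assumes a local NCBI BLAST+ installation (blastn on the PATH) and a pre-built database; the folder and database names are placeholders, and this is not PLAN's own code.

        import subprocess
        from pathlib import Path

        QUERY_DIR = Path("queries")      # hypothetical folder of FASTA files
        DB = "my_reference_db"           # hypothetical pre-built BLAST database

        def run_blast(query_fasta, out_tsv):
            # Tabular output (-outfmt 6) is convenient for downstream parsing.
            cmd = ["blastn", "-query", str(query_fasta), "-db", DB,
                   "-outfmt", "6", "-evalue", "1e-5", "-out", str(out_tsv)]
            subprocess.run(cmd, check=True)

        if __name__ == "__main__":
            for fasta in sorted(QUERY_DIR.glob("*.fasta")):
                run_blast(fasta, fasta.with_suffix(".blast.tsv"))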

  18. Machine learning and data mining in complex genomic data--a review on the lessons learned in Genetic Analysis Workshop 19.

    Science.gov (United States)

    König, Inke R; Auerbach, Jonathan; Gola, Damian; Held, Elizabeth; Holzinger, Emily R; Legault, Marc-André; Sun, Rui; Tintle, Nathan; Yang, Hsin-Chou

    2016-02-03

    In the analysis of current genomic data, application of machine learning and data mining techniques has become more attractive given the rising complexity of the projects. As part of the Genetic Analysis Workshop 19, approaches from this domain were explored, mostly motivated by two starting points. First, assuming an underlying structure in the genomic data, data mining might identify this and thus improve downstream association analyses. Second, computational methods for machine learning need to be developed further to efficiently deal with the current wealth of data. In the course of discussing results and experiences from the machine learning and data mining approaches, six common messages were extracted. These depict the current state of these approaches in the application to complex genomic data. Although some challenges remain for future studies, important forward steps were taken in the integration of different data types and the evaluation of the evidence. Mining the data for underlying genetic or phenotypic structure and using this information in subsequent analyses proved to be extremely helpful and is likely to become of even greater use with more complex data sets.

  19. Genomic insights into a new acidophilic, copper-resistant Desulfosporosinus isolate from the oxidized tailings area of an abandoned gold mine.

    Science.gov (United States)

    Mardanov, Andrey V; Panova, Inna A; Beletsky, Alexey V; Avakyan, Marat R; Kadnikov, Vitaly V; Antsiferov, Dmitry V; Banks, David; Frank, Yulia A; Pimenov, Nikolay V; Ravin, Nikolai V; Karnachuk, Olga V

    2016-08-01

    Microbial sulfate reduction in acid mine drainage is still considered to be confined to anoxic conditions, although several reports have shown that sulfate-reducing bacteria occur under microaerophilic or aerobic conditions. We have measured sulfate reduction rates of up to 60 nmol S cm(-3) day(-1) in oxidized layers of gold mine tailings in Kuzbass (SW Siberia). A novel, acidophilic, copper-tolerant Desulfosporosinus sp. I2 was isolated from the same sample and its genome was sequenced. The genomic analysis and physiological data indicate the involvement of transporters and additional mechanisms to tolerate metals, such as sequestration by polyphosphates. Desulfosporosinus sp. I2 encodes systems for a metabolically versatile lifestyle. The genome possessed a complete Embden-Meyerhof pathway for glycolysis and gluconeogenesis. Complete oxidation of organic substrates could be enabled by the complete TCA cycle. Genomic analysis found all major components of the electron transfer chain necessary for energy generation via oxidative phosphorylation. Autotrophic CO2 fixation could be performed through the Wood-Ljungdahl pathway. Multiple oxygen detoxification systems were identified in the genome. Taking into account the metabolic activity and genomic analysis, the traits of the novel isolate broaden our understanding of active sulfate reduction and associated metabolism beyond strictly anaerobic niches. PMID:27222219

  20. CGMIM: Automated text-mining of Online Mendelian Inheritance in Man (OMIM) to identify genetically-associated cancers and candidate genes

    Directory of Open Access Journals (Sweden)

    Jones Steven

    2005-03-01

    Full Text Available Abstract Background Online Mendelian Inheritance in Man (OMIM) is a computerized database of information about genes and heritable traits in human populations, based on information reported in the scientific literature. Our objective was to establish an automated text-mining system for OMIM that will identify genetically-related cancers and cancer-related genes. We developed the computer program CGMIM to search for entries in OMIM that are related to one or more cancer types. We performed manual searches of OMIM to verify the program results. Results In the OMIM database on September 30, 2004, CGMIM identified 1943 genes related to cancer. BRCA2 (OMIM *600185), BRAF (OMIM *164757) and CDKN2A (OMIM *600160) were each related to 14 types of cancer. There were 45 genes related to cancer of the esophagus, 121 genes related to cancer of the stomach, and 21 genes related to both. Analysis of the CGMIM results indicates that, by chance, fewer than three gene entries in OMIM would be expected to mention both; the more than seven-fold discrepancy suggests cancers of the esophagus and stomach are more genetically related than the current literature suggests. Conclusion CGMIM identifies genetically-related cancers and cancer-related genes. Cancers found in this way to share genetic etiology are anticipated to lead to further etiologic hypotheses and advances regarding environmental agents. CGMIM results are posted monthly and the source code can be obtained free of charge from the BC Cancer Research Centre website http://www.bccrc.ca/ccr/CGMIM.
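
    The "fewer than three" expectation quoted above follows from a simple independence calculation over the 1943 cancer-related genes; the arithmetic, using only figures taken from the abstract, is reproduced in the short Python check below.

        total_cancer_genes = 1943    # genes related to any cancer in OMIM
        esophagus = 45               # genes related to cancer of the esophagus
        stomach = 121                # genes related to cancer of the stomach
        observed_both = 21           # genes related to both

        # Under independence, the expected number of genes mentioning both cancers:
        expected_both = esophagus * stomach / total_cancer_genes
        print(f"expected by chance: {expected_both:.1f}")                   # ~2.8 (< 3)
        print(f"fold enrichment:    {observed_both / expected_both:.1f}")   # ~7.5-fold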

  1. Data Mining Approaches for Genome-Wide Association of Mood Disorders

    OpenAIRE

    Pirooznia, Mehdi; Seifuddin, Fayaz; Judy, Jennifer; Mahon, Pamela B; James B Potash; Zandi, Peter P.

    2012-01-01

    Mood disorders are highly heritable forms of major mental illness. A major breakthrough in elucidating the genetic architecture of mood disorders was anticipated with the advent of genome-wide association studies (GWAS). However, to date few susceptibility loci have been conclusively identified. The genetic etiology of mood disorders appears to be quite complex, and as a result, alternative approaches for analyzing GWAS data are needed. Recently, a polygenic scoring approach that captures the...

  2. Nuclear species-diagnostic SNP markers mined from 454 amplicon sequencing reveal admixture genomic structure of modern citrus varieties.

    Directory of Open Access Journals (Sweden)

    Franck Curk

    Full Text Available Most cultivated Citrus species originated from interspecific hybridisation between four ancestral taxa (C. reticulata, C. maxima, C. medica, and C. micrantha) with limited further interspecific recombination due to vegetative propagation. This evolution resulted in admixture genomes with frequent interspecific heterozygosity. Moreover, a major part of the phenotypic diversity of edible citrus results from the initial differentiation between these taxa. Deciphering the phylogenomic structure of citrus germplasm is therefore essential for an efficient utilization of citrus biodiversity in breeding schemes. The objective of this work was to develop a set of species-diagnostic single nucleotide polymorphism (SNP) markers for the four Citrus ancestral taxa covering the nine chromosomes, and to use these markers to infer the phylogenomic structure of secondary species and modern cultivars. Species-diagnostic SNPs were mined from 454 amplicon sequencing of 57 gene fragments from 26 genotypes of the four basic taxa. Of the 1,053 SNPs mined from 28,507 kb sequence, 273 were found to be highly diagnostic for a single basic taxon. Species-diagnostic SNP markers (105) were used to analyse the admixture structure of varieties and rootstocks. This revealed C. maxima introgressions in most of the old and in all recent selections of mandarins, and suggested that C. reticulata × C. maxima reticulation and introgression processes were important in edible mandarin domestication. The large range of phylogenomic constitutions between C. reticulata and C. maxima revealed in mandarins, tangelos, tangors, sweet oranges, sour oranges, grapefruits, and orangelos is favourable for genetic association studies based on phylogenomic structures of the germplasm. Inferred admixture structures were in agreement with previous hypotheses regarding the origin of several secondary species and also revealed the probable origin of several acid citrus varieties. The developed species

  3. Nuclear species-diagnostic SNP markers mined from 454 amplicon sequencing reveal admixture genomic structure of modern citrus varieties.

    Science.gov (United States)

    Curk, Franck; Ancillo, Gema; Ollitrault, Frédérique; Perrier, Xavier; Jacquemoud-Collet, Jean-Pierre; Garcia-Lor, Andres; Navarro, Luis; Ollitrault, Patrick

    2015-01-01

    Most cultivated Citrus species originated from interspecific hybridisation between four ancestral taxa (C. reticulata, C. maxima, C. medica, and C. micrantha) with limited further interspecific recombination due to vegetative propagation. This evolution resulted in admixture genomes with frequent interspecific heterozygosity. Moreover, a major part of the phenotypic diversity of edible citrus results from the initial differentiation between these taxa. Deciphering the phylogenomic structure of citrus germplasm is therefore essential for an efficient utilization of citrus biodiversity in breeding schemes. The objective of this work was to develop a set of species-diagnostic single nucleotide polymorphism (SNP) markers for the four Citrus ancestral taxa covering the nine chromosomes, and to use these markers to infer the phylogenomic structure of secondary species and modern cultivars. Species-diagnostic SNPs were mined from 454 amplicon sequencing of 57 gene fragments from 26 genotypes of the four basic taxa. Of the 1,053 SNPs mined from 28,507 kb sequence, 273 were found to be highly diagnostic for a single basic taxon. Species-diagnostic SNP markers (105) were used to analyse the admixture structure of varieties and rootstocks. This revealed C. maxima introgressions in most of the old and in all recent selections of mandarins, and suggested that C. reticulata × C. maxima reticulation and introgression processes were important in edible mandarin domestication. The large range of phylogenomic constitutions between C. reticulata and C. maxima revealed in mandarins, tangelos, tangors, sweet oranges, sour oranges, grapefruits, and orangelos is favourable for genetic association studies based on phylogenomic structures of the germplasm. Inferred admixture structures were in agreement with previous hypotheses regarding the origin of several secondary species and also revealed the probable origin of several acid citrus varieties. The developed species-diagnostic SNP

  5. The potential for automated question answering in the context of genomic medicine: an assessment of existing resources and properties of answers.

    Science.gov (United States)

    Overby, Casey Lynnette; Tarczy-Hornoch, Peter; Demner-Fushman, Dina

    2009-01-01

    Knowledge gained in studies of genetic disorders is reported in a growing body of biomedical literature containing reports of genetic variation in individuals that map to medical conditions and/or response to therapy. These scientific discoveries need to be translated into practical applications to optimize patient care. Translating research into practice can be facilitated by supplying clinicians with research evidence. We assessed the role of existing tools in extracting answers to translational research questions in the area of genomic medicine. We (1) evaluate the coverage of translational research terms in the Unified Medical Language System (UMLS) Metathesaurus; (2) determine where answers are most often found in full-text articles; and (3) determine common answer patterns. Findings suggest that we will be able to leverage the UMLS in development of natural language processing algorithms for automated extraction of answers to translational research questions from biomedical text in the area of genomic medicine. PMID:19761578

  6. Novel LanT associated lantibiotic clusters identified by genome database mining.

    Directory of Open Access Journals (Sweden)

    Mangal Singh

    Full Text Available BACKGROUND: Frequent use of antibiotics has led to the emergence of antibiotic resistance in bacteria. Lantibiotic compounds are ribosomally synthesized antimicrobial peptides against which bacteria are not able to produce resistance, making them a good alternative to antibiotics. Nisin, the oldest and most widely used lantibiotic in food preservation, has not encountered any significant resistance against it. Given their antimicrobial potential and the limited number known, there is a need to identify novel lantibiotics. METHODOLOGY/FINDINGS: Identification of novel lantibiotic biosynthetic clusters from an ever increasing database of bacterial genomes can provide a major lead in this direction. In order to achieve this, a strategy was adopted to identify novel lantibiotic biosynthetic clusters by screening the sequenced genomes for LanT homologs, a conserved lantibiotic transporter specific to type IB clusters. This strategy resulted in the identification of 54 bacterial strains containing LanT homologs that are not known lantibiotic producers. Of these, 24 strains were subjected to a detailed bioinformatic analysis to identify genes encoding precursor peptides, modification enzymes, immunity and quorum sensing proteins. Eight clusters having two LanM determinants, similar to haloduracin and lichenicidin, were identified, along with 13 clusters having a single LanM determinant as in the mersacidin biosynthetic cluster. Besides these, orphan LanT homologs were also identified which might be associated with novel bacteriocins encoded somewhere else in the genome. Three identified gene clusters had a C39 domain containing LanT transporter, associated with the LanBC proteins and double glycine type precursor peptides, the only known example of such a cluster being that of salivaricin. CONCLUSION: This study led to the identification of 8 novel putative two-component lantibiotic clusters along with 13 having a single LanM and

  7. Detailed investigation of cascaded Volterra fusion of processing strings for automated sea mine classification in very shallow water

    Science.gov (United States)

    Aridgides, Tom; Fernández, Manuel

    2006-05-01

    An improved sea mine computer-aided-detection/computer-aided-classification (CAD/CAC) processing string has been developed. The overall CAD/CAC processing string consists of pre-processing, subimage adaptive clutter filtering (SACF), normalization, detection, feature extraction, repeated application of optimal subset feature selection, feature orthogonalization and log-likelihood-ratio-test (LLRT) classification processing, and fusion processing blocks. The classified objects of 3 distinct processing strings are fused using the classification confidence values as features and either "M-out-of-N" or LLRT-based fusion rules. The utility of the overall processing strings and their fusion was demonstrated with new very shallow water high-resolution sonar imagery data. The processing string detection and classification parameters were tuned and the string classification performance was optimized, by appropriately selecting a subset of the original feature set. Two significant fusion algorithm improvements were made. First, a new nonlinear (Volterra) feature LLRT fusion algorithm was developed. Second, a repeated application of the subset Volterra feature selection/feature orthogonalization/LLRT fusion block was utilized. It was shown that this cascaded Volterra feature LLRT fusion of the CAD/CAC processing strings outperforms the "M-out-of-N," the baseline LLRT and single-stage Volterra feature LLRT fusion algorithms, and also yields an improvement over the best single CAD/CAC processing string, providing a significant reduction in the false alarm rate. Additionally, the robustness of cascaded Volterra feature fusion was demonstrated, by showing that the algorithm yields similar performance with the training and test sets.
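
    To make the LLRT fusion step concrete, the sketch below models each class's vector of confidence values with an independent Gaussian per feature and thresholds the resulting log-likelihood ratio. It is a simplified stand-in for the cascaded Volterra feature fusion developed in the paper, and all data are synthetic.

        import numpy as np

        rng = np.random.default_rng(0)

        # Hypothetical confidence values from 3 processing strings, per object.
        mines   = rng.normal(loc=0.8, scale=0.10, size=(50, 3))    # true mines
        clutter = rng.normal(loc=0.4, scale=0.15, size=(200, 3))   # false alarms

        def fit_gaussian(x):
            return x.mean(axis=0), x.std(axis=0) + 1e-9

        def log_likelihood(x, mean, std):
            # Sum of per-feature Gaussian log-densities (features treated as independent).
            return np.sum(-0.5 * ((x - mean) / std) ** 2 - np.log(std), axis=1)

        mu_m, sd_m = fit_gaussian(mines)
        mu_c, sd_c = fit_gaussian(clutter)

        def llrt(x, threshold=0.0):
            score = log_likelihood(x, mu_m, sd_m) - log_likelihood(x, mu_c, sd_c)
            return score > threshold    # True -> classify as mine

        test = np.vstack([mines[:5], clutter[:5]])
        print(llrt(test))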

  8. The discovery of putative urine markers for the specific detection of prostate tumor by integrative mining of public genomic profiles.

    Directory of Open Access Journals (Sweden)

    Min Chen

    Full Text Available Urine has emerged as an attractive biofluid for the noninvasive detection of prostate cancer (PCa). There is a strong imperative to discover candidate urinary markers for the clinical diagnosis and prognosis of PCa. The rising flood of various omics profiles presents immense opportunities for the identification of prospective biomarkers. Here we present a simple and efficient strategy to derive candidate urine markers for prostate tumor by mining cancer genomic profiles from public databases. Prostate, bladder and kidney are three major tissues from which cellular matters could be released into urine. To identify urinary markers specific for PCa, upregulated entities that might be shed in exosomes of bladder cancer and kidney cancer are first excluded. Through the ontology-based filtering and further assessment, a reduced list of 19 entities encoding urinary proteins was derived as putative PCa markers. Among them, we have found 10 entities closely associated with the process of tumor cell growth and development by pathway enrichment analysis. Further, using the 10 entities as seeds, we have constructed a protein-protein interaction (PPI) subnetwork and suggested a few urine markers as preferred prognostic markers to monitor the invasion and progression of PCa. Our approach is amenable to discover and prioritize potential markers present in a variety of body fluids for a spectrum of human diseases.
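
    The core filtering step, keeping entities upregulated in prostate tumours but not in bladder or kidney cancer and then restricting to proteins detectable in urine, reduces to set operations. The gene symbols in the Python sketch below are placeholders rather than the study's actual lists.

        # Hypothetical upregulated gene sets mined from public expression profiles.
        prostate_up = {"PCA3", "AMACR", "ERG", "KLK3", "SPON2", "HPN"}
        bladder_up  = {"ERG", "KRT20"}
        kidney_up   = {"CA9", "HPN"}

        # Proteins annotated (e.g. via ontology terms) as secreted / present in urine.
        urinary_proteins = {"PCA3", "AMACR", "KLK3", "SPON2", "KRT20"}

        # Exclude entities shared with bladder or kidney tumours, keep urinary ones.
        candidates = (prostate_up - bladder_up - kidney_up) & urinary_proteins
        print(sorted(candidates))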

  9. The GeneCards Suite: From Gene Data Mining to Disease Genome Sequence Analyses.

    Science.gov (United States)

    Stelzer, Gil; Rosen, Naomi; Plaschkes, Inbar; Zimmerman, Shahar; Twik, Michal; Fishilevich, Simon; Stein, Tsippi Iny; Nudel, Ron; Lieder, Iris; Mazor, Yaron; Kaplan, Sergey; Dahary, Dvir; Warshawsky, David; Guan-Golan, Yaron; Kohn, Asher; Rappaport, Noa; Safran, Marilyn; Lancet, Doron

    2016-01-01

    GeneCards, the human gene compendium, enables researchers to effectively navigate and inter-relate the wide universe of human genes, diseases, variants, proteins, cells, and biological pathways. Our recently launched Version 4 has a revamped infrastructure facilitating faster data updates, better-targeted data queries, and friendlier user experience. It also provides a stronger foundation for the GeneCards suite of companion databases and analysis tools. Improved data unification includes gene-disease links via MalaCards and merged biological pathways via PathCards, as well as drug information and proteome expression. VarElect, another suite member, is a phenotype prioritizer for next-generation sequencing, leveraging the GeneCards and MalaCards knowledgebase. It automatically infers direct and indirect scored associations between hundreds or even thousands of variant-containing genes and disease phenotype terms. VarElect's capabilities, either independently or within TGex, our comprehensive variant analysis pipeline, help prepare for the challenge of clinical projects that involve thousands of exome/genome NGS analyses. © 2016 by John Wiley & Sons, Inc. PMID:27322403

  10. A novel data mining method to identify assay-specific signatures in functional genomic studies

    Directory of Open Access Journals (Sweden)

    Guidarelli Jack W

    2006-08-01

    Full Text Available Abstract Background: The highly dimensional data produced by functional genomic (FG) studies makes it difficult to visualize relationships between gene products and experimental conditions (i.e., assays). Although dimensionality reduction methods such as principal component analysis (PCA) have been very useful, their application to identify assay-specific signatures has been limited by the lack of appropriate methodologies. This article proposes a new and powerful PCA-based method for the identification of assay-specific gene signatures in FG studies. Results: The proposed method (PM) is unique for several reasons. First, it is the only one, to our knowledge, that uses gene contribution, a product of the loading and expression level, to obtain assay signatures. The PM develops and exploits two types of assay-specific contribution plots, which are new to the application of PCA in the FG area. The first type plots the assay-specific gene contribution against the given order of the genes and reveals variations in distribution between assay-specific gene signatures as well as outliers within assay groups indicating the degree of importance of the most dominant genes. The second type plots the contribution of each gene in ascending or descending order against a constantly increasing index. This type of plot reveals assay-specific gene signatures defined by the inflection points in the curve. In addition, sharp regions within the signature define the genes that contribute the most to the signature. We proposed and used the curvature as an appropriate metric to characterize these sharp regions, thus identifying the subset of genes contributing the most to the signature. Finally, the PM uses the full dataset to determine the final gene signature, thus eliminating the chance of gene exclusion by poor screening in earlier steps. The strengths of the PM are demonstrated using a simulation study, and two studies of real DNA microarray data – a study of
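
    A compact numerical sketch of the gene-contribution quantity (loading multiplied by expression level) is given below using scikit-learn's PCA on a random matrix. It only shows what is being plotted in the contribution plots; the curvature-based signature extraction itself is not reproduced, and the data are synthetic.

        import numpy as np
        from sklearn.decomposition import PCA

        rng = np.random.default_rng(1)
        expression = rng.normal(size=(6, 100))    # 6 assays x 100 genes (hypothetical)

        pca = PCA(n_components=2)
        scores = pca.fit_transform(expression)    # assay coordinates
        loadings = pca.components_                # shape (2, 100): per-gene loadings

        # Contribution of each gene to assay 0 on PC1: loading * expression level.
        assay = 0
        contribution = loadings[0] * expression[assay]

        # Sorting contributions mimics the second type of contribution plot; the
        # steep (high-curvature) end marks genes dominating the assay signature.
        order = np.argsort(contribution)[::-1]
        print("top contributing genes for assay 0:", order[:10])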

  11. Quantification of Operational Risk Using A Data Mining

    Science.gov (United States)

    Perera, J. Sebastian

    1999-01-01

    What is Data Mining? - Data Mining is the process of finding actionable information hidden in raw data. - Data Mining helps find hidden patterns, trends, and important relationships often buried in a sea of data - Typically, automated software tools based on advanced statistical analysis and data modeling technology can be utilized to automate the data mining process

  12. Design and development of Java-based office automation system for coal mine enterprises

    Institute of Scientific and Technical Information of China (English)

    刘红霞; 张慧

    2015-01-01

    To meet the office-informatization needs of coal mine enterprises and gradually shift the traditional office management mode towards an automated one, an office automation system for coal mine enterprises was designed and developed on a B/S architecture using Java, JSP, SQL Server 2005 and other technologies. Application results show that the system, built around the current office practices of coal mining enterprises, provides a scientific, open and advanced informatized office platform, effectively reduces office costs, improves office efficiency, and promotes the informatization of the enterprise.

  13. Burkholderia genome mining for nonribosomal peptide synthetases reveals a great potential for novel siderophores and lipopeptides synthesis.

    Science.gov (United States)

    Esmaeel, Qassim; Pupin, Maude; Kieu, Nam Phuong; Chataigné, Gabrielle; Béchet, Max; Deravel, Jovana; Krier, François; Höfte, Monica; Jacques, Philippe; Leclère, Valérie

    2016-06-01

    Burkholderia is an important genus encompassing a variety of species, including pathogenic strains as well as strains that promote plant growth. We have carried out a global strategy, which combined two complementary approaches. The first one is genome guided with deep analysis of genome sequences and the second one is assay guided with experiments to support the predictions obtained in silico. This efficient screening for new secondary metabolites, performed on 48 gapless genomes of Burkholderia species, revealed a total of 161 clusters containing nonribosomal peptide synthetases (NRPSs), with the potential to synthesize at least 11 novel products. Most of them are siderophores or lipopeptides, two classes of products with potential application in biocontrol. The strategy led to the identification, for the first time, of the cluster for cepaciachelin biosynthesis in the genome of Burkholderia ambifaria AMMD and a cluster corresponding to a new malleobactin-like siderophore, called phymabactin, was identified in Burkholderia phymatum STM815 genome. In both cases, the siderophore was produced when the strain was grown in iron-limited conditions. Elsewhere, the cluster for the antifungal burkholdin was detected in the genome of B. ambifaria AMMD and also Burkholderia sp. KJ006. Burkholderia pseudomallei strains harbor the genetic potential to produce a novel lipopeptide called burkhomycin, containing a peptidyl moiety of 12 monomers. A mixture of lipopeptides produced by Burkholderia rhizoxinica lowered the surface tension of the supernatant from 70 to 27 mN·m(-1) . The production of nonribosomal secondary metabolites seems related to the three phylogenetic groups obtained from 16S rRNA sequences. Moreover, the genome-mining approach gave new insights into the nonribosomal synthesis exemplified by the identification of dual C/E domains in lipopeptide NRPSs, up to now essentially found in Pseudomonas strains. PMID:27060604

  14. Coal Mine Integrated Automation System Based on Internet of Things Technology

    Institute of Scientific and Technical Information of China (English)

    黄成玉; 李学哲; 张全柱

    2012-01-01

    An integrated automation system for coal mines was designed based on the Internet of Things, and a Web-based integrated automation software platform was built on top of it. The system provides effective real-time monitoring and management of the whole mine's personnel, materials, equipment and infrastructure, with functions including intelligent alarms, multimedia alarms, querying, confirmation, data acquisition and hard-disk video recording. It monitors SCADA system equipment, remotely monitors mine environment parameters, and supports personnel positioning and management, enabling the ground monitoring centre to remotely measure, control and adjust underground electrical equipment.

  15. Characterization of the alkaline laccase Ssl1 from Streptomyces sviceus with unusual properties discovered by genome mining.

    Directory of Open Access Journals (Sweden)

    Matthias Gunne

    Full Text Available Fungal laccases are well investigated enzymes with high potential in diverse applications like bleaching of waste waters and textiles, cellulose delignification, and organic synthesis. However, they are limited to acidic reaction conditions and require eukaryotic expression systems. This raises a demand for novel laccases without these constraints. We have taken advantage of the laccase engineering database LccED derived from genome mining to identify and clone the laccase Ssl1 from Streptomyces sviceus, which can circumvent the limitations of fungal laccases. Ssl1 belongs to the family of small laccases that contains only few characterized enzymes. After removal of the twin-arginine signal peptide, Ssl1 was readily expressed in E. coli. Ssl1 is a small laccase of 32.5 kDa, consists of only two cupredoxin-like domains, and forms trimers in solution. Ssl1 oxidizes 2,2'-azino-bis(3-ethylbenzthiazoline-6-sulfonic acid) (ABTS) and phenolic substrates like 2,6-dimethoxy phenol, guaiacol, and syringaldazine. The k(cat) value for ABTS oxidation was at least 20 times higher than for other substrates. The optimal pH for oxidation reactions is substrate dependent: for phenolic substrates the highest activities were detected at alkaline conditions (pH 9.0 for 2,6-dimethoxy phenol and guaiacol, and pH 8.0 for syringaldazine), while the highest reaction rates with ABTS were observed at pH 4.0. Though originating from a mesophilic organism, Ssl1 demonstrates remarkable stability at elevated temperatures (T(1/2, 60°C) = 88 min) and in a wide pH range (pH 5.0 to 11.0). Notably, the enzyme retained 80% residual activity after 5 days of incubation at pH 11. Detergents and organic co-solvents do not affect Ssl1 stability. The described robustness makes Ssl1 a potential candidate for industrial applications, preferably in processes that require alkaline reaction conditions.

  16. Fish the ChIPs: a pipeline for automated genomic annotation of ChIP-Seq data

    Directory of Open Access Journals (Sweden)

    Minucci Saverio

    2011-10-01

    Full Text Available Abstract Background High-throughput sequencing is generating massive amounts of data at a pace that largely exceeds the throughput of data analysis routines. Here we introduce Fish the ChIPs (FC), a computational pipeline aimed at a broad public of users and designed to perform complete ChIP-Seq data analysis of an unlimited number of samples, thus increasing throughput, reproducibility and saving time. Results Starting from short read sequences, FC performs the following steps: (1) quality controls, (2) alignment to a reference genome, (3) peak calling, (4) genomic annotation, (5) generation of raw signal tracks for visualization on the UCSC and IGV genome browsers. FC exploits some of the fastest and most effective tools today available. Installation on a Mac platform requires very basic computational skills while configuration and usage are supported by a user-friendly graphic user interface. Alternatively, FC can be compiled from the source code on any Unix machine and then run with the possibility of customizing each single parameter through a simple configuration text file that can be generated using a dedicated user-friendly web-form. Considering the execution time, FC can be run on a desktop machine, even though the use of a computer cluster is recommended for analyses of large batches of data. FC is perfectly suited to work with data coming from Illumina Solexa Genome Analyzers or ABI SOLiD and its usage can potentially be extended to any sequencing platform. Conclusions Compared to existing tools, FC has two main advantages that make it suitable for a broad range of users. First of all, it can be installed and run by wet biologists on a Mac machine. Besides, it can handle an unlimited number of samples, being convenient for large analyses. In this context, computational biologists can increase reproducibility of their ChIP-Seq data analyses while saving time for downstream analyses. Reviewers This article was reviewed by Gavin Huttley, George
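
    The five steps listed above map naturally onto a thin orchestration layer that chains external tools. The Python sketch below is illustrative only: the file names are invented, the last two commands are hypothetical placeholder scripts, and the tool choices and parameters are not necessarily those configured inside FC.

        import subprocess

        SAMPLE = "chipseq_sample1"   # hypothetical sample name

        # Illustrative commands for each pipeline stage; a real setup would
        # substitute the tools and parameters configured for the experiment.
        steps = [
            ["fastqc", f"{SAMPLE}.fastq"],                                   # 1) quality control
            ["bowtie2", "-x", "genome_index", "-U", f"{SAMPLE}.fastq",
             "-S", f"{SAMPLE}.sam"],                                         # 2) alignment
            ["macs2", "callpeak", "-t", f"{SAMPLE}.sam", "-n", SAMPLE],      # 3) peak calling
            ["./annotate_peaks_placeholder.py", f"{SAMPLE}_peaks.narrowPeak"],  # 4) annotation (placeholder script)
            ["./make_tracks_placeholder.py", f"{SAMPLE}.sam"],                  # 5) browser tracks (placeholder script)
        ]

        for cmd in steps:
            print("running:", " ".join(cmd))
            subprocess.run(cmd, check=True)   # stop the pipeline if any step fails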

  17. An extended data mining method for identifying differentially expressed assay-specific signatures in functional genomic studies

    OpenAIRE

    Rollins Derrick K; Teh AiLing

    2010-01-01

    Abstract Background Microarray data sets provide relative expression levels for thousands of genes for a small number, in comparison, of different experimental conditions called assays. Data mining techniques are used to extract specific information of genes as they relate to the assays. The multivariate statistical technique of principal component analysis (PCA) has proven useful in providing effective data mining methods. This article extends the PCA approach of Rollins et al. to the develo...

  18. Automated cell analysis tool for a genome-wide RNAi screen with support vector machine based supervised learning

    Science.gov (United States)

    Remmele, Steffen; Ritzerfeld, Julia; Nickel, Walter; Hesser, Jürgen

    2011-03-01

    RNAi-based high-throughput microscopy screens have become an important tool in biological sciences in order to decrypt mostly unknown biological functions of human genes. However, manual analysis is impossible for such screens since the amount of image data sets can often be in the hundreds of thousands. Reliable automated tools are thus required to analyse the fluorescence microscopy image data sets, which usually contain two or more reaction channels. The image analysis tool presented here is designed to analyse an RNAi screen investigating the intracellular trafficking and targeting of acylated Src kinases. In this specific screen, a data set consists of three reaction channels and the investigated cells can appear in different phenotypes. The main issue of the image processing task is an automatic cell segmentation which has to be robust and accurate for all different phenotypes, and a successive phenotype classification. The cell segmentation is done in two steps by segmenting the cell nuclei first and then using a classifier-enhanced region growing on basis of the cell nuclei to segment the cells. The classification of the cells is realized by a support vector machine which has to be trained manually using supervised learning. Furthermore, the tool is brightness invariant, allowing different staining quality, and it provides a quality control that copes with typical defects during preparation and acquisition. A first version of the tool has already been successfully applied for an RNAi screen containing three hundred thousand image data sets and the SVM-extended version is designed for additional screens.
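
    For the supervised-learning part, a minimal scikit-learn example of training an RBF-kernel SVM on labelled per-cell feature vectors is sketched below; the features and phenotype labels are random stand-ins for the real manually curated training data.

        import numpy as np
        from sklearn.svm import SVC
        from sklearn.model_selection import train_test_split

        rng = np.random.default_rng(42)

        # Hypothetical per-cell feature vectors (e.g. intensity/shape descriptors)
        # and manually assigned phenotype labels used for supervised training.
        features = rng.normal(size=(300, 12))
        labels = rng.integers(0, 3, size=300)          # three phenotype classes

        X_train, X_test, y_train, y_test = train_test_split(
            features, labels, test_size=0.25, random_state=0)

        clf = SVC(kernel="rbf", C=1.0, gamma="scale")  # standard RBF-kernel SVM
        clf.fit(X_train, y_train)
        print("held-out accuracy:", clf.score(X_test, y_test))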

  19. Draft Genome of Streptomyces zinciresistens K42, a Novel Metal-Resistant Species Isolated from Copper-Zinc Mine Tailings

    Science.gov (United States)

    Lin, Yanbing; Hao, Xiuli; Johnstone, Laurel; Miller, Susan J.; Baltrus, David A.; Rensing, Christopher; Wei, Gehong

    2011-01-01

    A draft genome sequence of Streptomyces zinciresistens K42, a novel Streptomyces species displaying a high level of resistance to zinc and cadmium, is presented here. The genome contains a large number of genes encoding proteins predicted to be involved in conferring metal resistance. Many of these genes appear to have been acquired through horizontal gene transfer. PMID:22038968

  20. Probe on Service Level Agreement of Integrated Mine Wide Automation System

    Institute of Scientific and Technical Information of China (English)

    陈建伟; 竺金光

    2011-01-01

    To address the problems of confused service levels and long service cycles in current integrated mine-wide automation systems, this paper proposes managing in-sale and after-sale services by classification and grade through service level agreements, and details the classification and content of such agreements in five areas: installation and commissioning services, system roll-out services, customization and upgrade services, technical support services, and daily operation services. Service level agreements carry the ideas of IT service management into the contract signing and implementation of integrated mine-wide automation systems, and can have a marked effect on the contract implementation cycle, project acceptance and payment collection.

  1. Mining plant genome browsers as a means for efficient connection of physical, genetic and cytogenetic mapping: an example using soybean

    Directory of Open Access Journals (Sweden)

    Luis C. Belarmino

    2012-01-01

    Full Text Available Physical maps are important tools to uncover general chromosome structure as well as to compare different plant lineages and species, helping to elucidate genome structure, evolution and possibilities regarding synteny and colinearity. The increasing production of sequence data has opened an opportunity to link information from mapping studies to the underlying sequences. Genome browsers are invaluable platforms that provide access to these sequences, including tools for genome analysis, allowing the integration of multivariate information, and thus aiding to explain the emergence of complex genomes. The present work presents a tutorial regarding the use of genome browsers to develop targeted physical mapping, providing also a general overview and examples about the possibilities regarding the use of Fluorescent In Situ Hybridization (FISH) using bacterial artificial chromosomes (BAC), simple sequence repeats (SSR) and rDNA probes, highlighting the potential of such studies for map integration and comparative genetics. As a case study, the available genome of soybean was accessed to show how the physical and in silico distribution of such sequences may be compared at different levels. Such evaluations may also be complemented by the identification of sequences beyond the detection level of cytological methods, here using members of the aquaporin gene family as an example. The proposed approach highlights the complementation power of the combination of molecular cytogenetics and computational approaches for the anchoring of coding or repetitive sequences in plant genomes using available genome browsers, helping in the determination of sequence location, arrangement and number of repeats, and also filling gaps found in computational pseudochromosome assemblies.

  2. Draft genome sequence of extremely acidophilic bacterium Acidithiobacillus ferrooxidans DLC-5 isolated from acid mine drainage in Northeast China

    Directory of Open Access Journals (Sweden)

    Peng Chen

    2015-12-01

    Full Text Available Acidithiobacillus ferrooxidans type strain DLC-5 was isolated from Wudalianchi in Heihe, Heilongjiang Province, China. Here, we present the draft genome of strain DLC-5, which contains 4,232,149 bp in 2745 contigs with 57.628% GC content and includes 32,719 protein-coding genes and 64 tRNA-encoding genes. The genome sequence can be accessed at DDBJ/EMBL/GenBank under the accession no. JNNH00000000.1.

  3. Identification of novel target genes for safer and more specific control of root-knot nematodes from a pan-genome mining.

    Directory of Open Access Journals (Sweden)

    Etienne G J Danchin

    2013-10-01

    Full Text Available Root-knot nematodes are globally the most aggressive and damaging plant-parasitic nematodes. Chemical nematicides have so far constituted the most efficient control measures against these agricultural pests. Because of their toxicity for the environment and danger for human health, these nematicides have now been banned from use. Consequently, new and more specific control means, safe for the environment and human health, are urgently needed to avoid worldwide proliferation of these devastating plant-parasites. Mining the genomes of root-knot nematodes through an evolutionary and comparative genomics approach, we identified and analyzed 15,952 nematode genes conserved in genomes of plant-damaging species but absent from non-target genomes of chordates, plants, annelids, insect pollinators and mollusks. Functional annotation of the corresponding proteins revealed a relative abundance of putative transcription factors in this parasite-specific set compared to whole proteomes of root-knot nematodes. This may point to important and specific regulators of genes involved in parasitism. Because these nematodes are known to secrete effector proteins in planta, essential for parasitism, we searched and identified 993 such effector-like proteins absent from non-target species. Aiming at identifying novel targets for the development of future control methods, we biologically tested the effect of inactivation of the corresponding genes through RNA interference. A total of 15 novel effector-like proteins and one putative transcription factor compatible with the design of siRNAs were present as non-redundant genes and had transcriptional support in the model root-knot nematode Meloidogyne incognita. Infestation assays with siRNA-treated M. incognita on tomato plants showed significant and reproducible reduction of the infestation for 12 of the 16 tested genes compared to control nematodes. These 12 novel genes, showing efficient reduction of parasitism when
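
    The comparative filter at the heart of this screen, keeping genes conserved across the plant-damaging species and absent from all non-target genomes, is essentially set algebra over orthologous groups, as in the Python sketch below with invented group identifiers.

        # Hypothetical orthologous-group presence lists per genome.
        meloidogyne_incognita = {"OG1", "OG2", "OG3", "OG4", "OG7"}
        meloidogyne_hapla     = {"OG1", "OG2", "OG3", "OG5", "OG7"}
        plant_parasites = [meloidogyne_incognita, meloidogyne_hapla]

        # Non-target genomes: chordates, plants, annelids, pollinators, mollusks.
        non_targets = [{"OG2", "OG6"}, {"OG2", "OG5"}, {"OG6"}]

        conserved_in_parasites = set.intersection(*plant_parasites)
        present_in_non_targets = set.union(*non_targets)

        # Candidate control targets: conserved in the pests, absent elsewhere.
        candidates = conserved_in_parasites - present_in_non_targets
        print(sorted(candidates))     # -> ['OG1', 'OG3', 'OG7']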

  4. Semantic text mining support for lignocellulose research

    Directory of Open Access Journals (Sweden)

    Meurs Marie-Jean

    2012-04-01

    Full Text Available Abstract Background Biofuels produced from biomass are considered to be promising sustainable alternatives to fossil fuels. The conversion of lignocellulose into fermentable sugars for biofuels production requires the use of enzyme cocktails that can efficiently and economically hydrolyze lignocellulosic biomass. As many fungi naturally break down lignocellulose, the identification and characterization of the enzymes involved is a key challenge in the research and development of biomass-derived products and fuels. One approach to meeting this challenge is to mine the rapidly-expanding repertoire of microbial genomes for enzymes with the appropriate catalytic properties. Results Semantic technologies, including natural language processing, ontologies, semantic Web services and Web-based collaboration tools, promise to support users in handling complex data, thereby facilitating knowledge-intensive tasks. An ongoing challenge is to select the appropriate technologies and combine them in a coherent system that brings measurable improvements to the users. We present our ongoing development of a semantic infrastructure in support of genomics-based lignocellulose research. Part of this effort is the automated curation of knowledge from information on fungal enzymes that is available in the literature and genome resources. Conclusions Working closely with fungal biology researchers who manually curate the existing literature, we developed ontological natural language processing pipelines integrated in a Web-based interface to assist them in two main tasks: mining the literature for relevant knowledge, and at the same time providing rich and semantically linked information.

  5. Automation of plasma-process fulltext bibliography databases. An on-line data-collection, data-mining and data-input system

    International Nuclear Information System (INIS)

    Searching for relevant data, information retrieval, data extraction and data input are time- and resource-consuming activities in most data centers. Here we develop a Linux system automating the process in the case of bibliography, abstract and fulltext databases. The present system is an open-source, free-software, low-cost solution that connects the target and provider databases in cyberspace through various web publishing formats. The abstract/fulltext relevance assessment is interfaced to external software modules. (author)

  6. Genome mining of the sordarin biosynthetic gene cluster from Sordaria araneosa Cain ATCC 36386: characterization of cycloaraneosene synthase and GDP-6-deoxyaltrose transferase.

    Science.gov (United States)

    Kudo, Fumitaka; Matsuura, Yasunori; Hayashi, Takaaki; Fukushima, Masayuki; Eguchi, Tadashi

    2016-07-01

    Sordarin is a glycoside antibiotic with a unique tetracyclic diterpene aglycone structure called sordaricin. To understand its intriguing biosynthetic pathway that may include a Diels-Alder-type [4+2]cycloaddition, genome mining of the gene cluster from the draft genome sequence of the producer strain, Sordaria araneosa Cain ATCC 36386, was carried out. A contiguous 67 kb gene cluster consisting of 20 open reading frames encoding a putative diterpene cyclase, a glycosyltransferase, a type I polyketide synthase, and six cytochrome P450 monooxygenases were identified. In vitro enzymatic analysis of the putative diterpene cyclase SdnA showed that it catalyzes the transformation of geranylgeranyl diphosphate to cycloaraneosene, a known biosynthetic intermediate of sordarin. Furthermore, a putative glycosyltransferase SdnJ was found to catalyze the glycosylation of sordaricin in the presence of GDP-6-deoxy-d-altrose to give 4'-O-demethylsordarin. These results suggest that the identified sdn gene cluster is responsible for the biosynthesis of sordarin. Based on the isolated potential biosynthetic intermediates and bioinformatics analysis, a plausible biosynthetic pathway for sordarin is proposed. PMID:27072286

  8. Text Mining Applications and Theory

    CERN Document Server

    Berry, Michael W

    2010-01-01

    Text Mining: Applications and Theory presents the state-of-the-art algorithms for text mining from both the academic and industrial perspectives.  The contributors span several countries and scientific domains: universities, industrial corporations, and government laboratories, and demonstrate the use of techniques from machine learning, knowledge discovery, natural language processing and information retrieval to design computational models for automated text analysis and mining. This volume demonstrates how advancements in the fields of applied mathematics, computer science, machine learning

  9. In Silico Mining of Microsatellites in Coding Sequences of the Date Palm (Arecaceae) Genome, Characterization, and Transferability

    Directory of Open Access Journals (Sweden)

    Frédérique Aberlenc-Bertossi

    2014-01-01

    Full Text Available Premise of the study: To complement existing sets of primarily dinucleotide microsatellite loci from noncoding sequences of date palm, we developed primers for tri- and hexanucleotide microsatellite loci identified within genes. Due to their conserved genomic locations, the primers should be useful in other palm taxa, and their utility was tested in seven other Phoenix species and in Chamaerops, Livistona, and Hyphaene. Methods and Results: Tandem repeat motifs of 3–6 bp were searched using a simple sequence repeat (SSR) pipeline package in coding portions of the date palm draft genome sequence. Fifteen loci produced highly consistent amplification, intraspecific polymorphisms, and stepwise mutation patterns. Conclusions: These microsatellite loci showed sufficient levels of variability and transferability to make them useful for population genetic, selection signature, and interspecific gene flow studies in Phoenix and other Coryphoideae genera.
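
    The motif search itself is straightforward to illustrate. The sketch below is not the SSR pipeline package used by the authors; the repeat-count threshold and example sequence are arbitrary choices.

      # Minimal microsatellite (SSR) scan: report motifs of 3-6 bp repeated at least
      # 4 times in tandem. Thresholds and the example sequence are illustrative; a
      # real pipeline would also collapse redundant hits (e.g. CAGCAG vs. CAG).
      import re

      def find_ssrs(seq, min_unit=3, max_unit=6, min_repeats=4):
          seq = seq.upper()
          hits = []
          for unit in range(min_unit, max_unit + 1):
              pattern = re.compile(r"([ACGT]{%d})\1{%d,}" % (unit, min_repeats - 1))
              for m in pattern.finditer(seq):
                  hits.append((m.start(), m.group(1), len(m.group(0)) // unit))
          return hits  # (start position, motif, tandem repeat count)

      example = "ATGGCT" + "CAG" * 8 + "TTGACTGA"
      print(find_ssrs(example))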

  10. Home Automation

    OpenAIRE

    Ahmed, Zeeshan

    2010-01-01

    In this paper I briefly discuss the importance of home automation systems. Going into the details, I briefly present a real-time, software- and hardware-oriented house automation research project that was designed and implemented, capable of automating a house's electricity and providing a security system to detect the presence of unexpected behavior.

  11. Data, text and web mining for business intelligence: a survey

    OpenAIRE

    Abdul-Aziz Rashid Al-Azmi

    2013-01-01

    The Information and Communication Technologies revolution brought a digital world with huge amounts of data available. Enterprises use mining technologies to search vast amounts of data for vital insight and knowledge. Mining tools such as data mining, text mining, and web mining are used to find hidden knowledge in large databases or the Internet. Mining tools are automated software tools used to achieve business intelligence by finding hidden relations, and predicting future events...

  12. Distributed Framework for Data Mining As a Service on Private Cloud

    OpenAIRE

    Shraddha Masih; Sanjay Tanwani

    2014-01-01

    Data mining research faces two great challenges: i. automated mining; ii. mining of distributed data. Conventional mining techniques are centralized and the data need to be accumulated at a central location. A mining tool needs to be installed on the computer before performing data mining. Thus, extra time is incurred in collecting the data. Mining is done by specialized analysts who have access to mining tools. This technique is not optimal when the data is distributed over the net...

  13. An extended data mining method for identifying differentially expressed assay-specific signatures in functional genomic studies

    Directory of Open Access Journals (Sweden)

    Rollins Derrick K

    2010-12-01

    Full Text Available Abstract Background Microarray data sets provide relative expression levels for thousands of genes for a small number, in comparison, of different experimental conditions called assays. Data mining techniques are used to extract specific information about genes as they relate to the assays. The multivariate statistical technique of principal component analysis (PCA) has proven useful in providing effective data mining methods. This article extends the PCA approach of Rollins et al. to the development of ranking genes of microarray data sets that express most differently between two biologically different groupings of assays. This method is evaluated on real and simulated data and compared to a current approach on the basis of false discovery rate (FDR) and statistical power (SP), which is the ability to correctly identify important genes. Results This work developed and evaluated two new test statistics based on PCA and compared them to a popular method that is not PCA based. Both test statistics were found to be effective as evaluated in three case studies: (i) exposing E. coli cells to two different ethanol levels; (ii) application of myostatin to two groups of mice; and (iii) a simulated data study derived from the properties of (ii). The proposed method (PM) effectively identified critical genes in these studies based on comparison with the current method (CM). The simulation study supports higher identification accuracy for PM over CM for both proposed test statistics when the gene variance is constant and for one of the test statistics when the gene variance is non-constant. Conclusions PM compares quite favorably to CM in terms of lower FDR and much higher SP. Thus, PM can be quite effective in producing accurate signatures from large microarray data sets for differential expression between assay groups identified in a preliminary step of the PCA procedure and is, therefore, recommended for use in these applications.
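
    The general idea of PCA-based gene ranking can be sketched as follows. The score used here is a generic loading-based statistic on random placeholder data, not the proposed test statistics of the article.

      # Illustrative PCA-based gene ranking; the data are random placeholders and the
      # ranking criterion is a generic loading magnitude, not the paper's PM statistic.
      import numpy as np
      from sklearn.decomposition import PCA

      rng = np.random.default_rng(0)
      n_genes, n_assays = 500, 10
      X = rng.normal(size=(n_assays, n_genes))          # assays x genes expression matrix
      groups = np.array([0] * 5 + [1] * 5)              # two biological groupings of assays

      pca = PCA(n_components=3)
      scores = pca.fit_transform(X)                     # assay coordinates on the PCs

      # pick the component whose scores separate the two assay groups the most
      sep = [abs(scores[groups == 0, k].mean() - scores[groups == 1, k].mean())
             for k in range(scores.shape[1])]
      best = int(np.argmax(sep))

      gene_rank = np.argsort(-np.abs(pca.components_[best]))  # genes ordered by |loading|
      print("top genes on separating component:", gene_rank[:10])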

  14. CSIR: Mining Technology annual review 1996/97

    Energy Technology Data Exchange (ETDEWEB)

    NONE

    1997-12-31

    CSIR: Mining Technology works in close collaboration and strategic partnership with the mining industry, government institutions and employee organizations by acquiring, developing and transferring technologies to improve the safety and health of their employees, and to improve the profitability of the mining industry. The annual report describes achievements over the year in the areas of: rock engineering (including rockburst control, mine layout, stope and gully support, coal mining); environmental safety and health on topics such as occupational hygiene services, methane explosions, blasting techniques; and mining systems (orebody information, hydraulic transport mine mechanization, engineering design and automation, mine services). A list of Mining Technology's 1996/97 publications is given.

  15. MacSyFinder: a program to mine genomes for molecular systems with an application to CRISPR-Cas systems.

    Directory of Open Access Journals (Sweden)

    Sophie S Abby

    Full Text Available Biologists often wish to use their knowledge of a few experimental models of a given molecular system to identify homologs in genomic data. We developed a generic tool for this purpose. Macromolecular System Finder (MacSyFinder) provides a flexible framework to model the properties of molecular systems (cellular machinery or pathway), including their components, evolutionary associations with other systems and genetic architecture. Modelled features also include functional analogs, and the multiple uses of the same component by different systems. Models are used to search for molecular systems in complete genomes or in unstructured data like metagenomes. The components of the systems are searched for by sequence similarity using Hidden Markov model (HMM) protein profiles. The assignment of hits to a given system is decided based on compliance with the content and organization of the system model. A graphical interface, MacSyView, facilitates the analysis of the results by showing overviews of component content and genomic context. To exemplify the use of MacSyFinder we built models to detect and classify CRISPR-Cas systems following a previously established classification. We show that MacSyFinder makes it easy to define an accurate "Cas-finder" using publicly available protein profiles. MacSyFinder is a standalone application implemented in Python. It requires Python 2.7, Hmmer and makeblastdb (version 2.2.28 or higher). It is freely available with its source code under a GPLv3 license at https://github.com/gem-pasteur/macsyfinder. It is compatible with all platforms supporting Python and Hmmer/makeblastdb. The "Cas-finder" (models and HMM profiles) is distributed as a compressed tarball archive as Supporting Information.
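
    The decision step described above (assigning profile hits to a system when the hits satisfy the model's content and genomic organization) can be caricatured in a few lines. This is a toy illustration, not MacSyFinder code; the hit list and model are invented.

      # Toy check of whether HMM-profile hits satisfy a simple system model:
      # mandatory components present and hits co-localized within a gene-distance
      # window. Hit data and the model are invented examples.
      from collections import namedtuple

      Hit = namedtuple("Hit", "gene_position component")   # position = gene index on the replicon

      MODEL = {
          "mandatory": {"cas1", "cas2"},
          "accessory": {"cas4"},
          "max_gene_gap": 5,            # maximum distance (in genes) between consecutive hits
      }

      def system_detected(hits, model):
          hits = sorted(hits, key=lambda h: h.gene_position)
          found = {h.component for h in hits}
          if not model["mandatory"] <= found:
              return False
          gaps_ok = all(b.gene_position - a.gene_position <= model["max_gene_gap"]
                        for a, b in zip(hits, hits[1:]))
          return gaps_ok

      hits = [Hit(102, "cas1"), Hit(104, "cas2"), Hit(107, "cas4")]
      print(system_detected(hits, MODEL))   # True for this example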

  16. Transcriptome Analysis of Two Vicia sativa Subspecies: Mining Molecular Markers to Enhance Genomic Resources for Vetch Improvement

    Directory of Open Access Journals (Sweden)

    Tae-Sung Kim

    2015-11-01

    Full Text Available The vetch (Vicia sativa) is one of the most important annual forage legumes globally due to its multiple uses and high nutritional content. Despite these agronomical benefits, many drawbacks, including cyano-alanine toxin, have reduced the agronomic value of vetch varieties. Here, we used 454 technology to sequence the two V. sativa subspecies (ssp. sativa and ssp. nigra) to enrich functional information and genetic marker resources for the vetch research community. A total of 86,532 and 47,103 reads produced 35,202 and 18,808 unigenes with average lengths of 735 and 601 bp for V. sativa sativa and V. sativa nigra, respectively. Gene Ontology annotations and the cluster of orthologous gene classes were used to annotate the function of the Vicia transcriptomes. The Vicia transcriptome sequences were then mined for simple sequence repeat (SSR) and single nucleotide polymorphism (SNP) markers. About 13% and 3% of the Vicia unigenes contained the putative SSR and SNP sequences, respectively. Among those SSRs, 100 were chosen for validation and polymorphism testing using the Vicia germplasm set. Thus, our approach takes advantage of the utility of transcriptomic data to expedite a vetch breeding program.

  17. Research on Text Data Mining on Human Genome Sequence Analysis

    Institute of Scientific and Technical Information of China (English)

    于跃; 潘玮; 王丽伟; 王伟

    2012-01-01

    Literature related to human genome sequencing published between 1 January 2001 and 11 May 2011 was retrieved from the PubMed database. Bibliographic information was extracted and subjected to co-word cluster analysis: high-frequency subject headings were extracted, and a word-document matrix, a co-occurrence matrix and co-word clusters were generated. The results indicate that text data mining can effectively reflect the development status and research hotspots of a discipline, thereby providing valuable information to researchers.

  18. Biosynthesis of Antibiotic Leucinostatins in Bio-control Fungus Purpureocillium lilacinum and Their Inhibition on Phytophthora Revealed by Genome Mining

    Science.gov (United States)

    Li, Erfeng; Mao, Zhenchuan; Ling, Jian; Yang, Yuhong; Yin, Wen-Bing; Xie, Bingyan

    2016-01-01

    Purpureocillium lilacinum of Ophiocordycipitaceae is one of the most promising and commercialized agents for controlling plant-parasitic nematodes, as well as other insects and plant pathogens. However, how the fungus functions at the molecular level remains unknown. Here, we sequenced two isolates (PLBJ-1 and PLFJ-1) of P. lilacinum from two different places, Beijing and Fujian. Genomic analysis showed high synteny of the two isolates, and the phylogenetic analysis indicated they were most closely related to the insect pathogen Tolypocladium inflatum. A comparison with other species revealed that this fungus was enriched in carbohydrate-active enzymes (CAZymes), proteases and pathogenesis-related genes. A whole-genome search revealed a rich repertoire of genes encoding secondary metabolites (SMs). The non-ribosomal peptide synthetase LcsA, which is comprised of ten C-A-PCP modules, was identified as the core biosynthetic gene of the lipopeptide leucinostatins, which was specific to P. lilacinum and T. ophioglossoides, as confirmed by phylogenetic analysis. Furthermore, gene expression levels were analyzed when PLBJ-1 was grown in leucinostatin-inducing and non-inducing medium, and 20 genes involved in the biosynthesis of leucinostatins were identified. Disruption mutants allowed us to propose a putative biosynthetic pathway of leucinostatin A. Moreover, overexpression of the transcription factor lcsF increased the production (1.5-fold) of leucinostatins A and B compared to wild type. Bioassays revealed a new bioactivity of leucinostatins and P. lilacinum: inhibiting the growth of Phytophthora infestans and P. capsici. These results contribute to our understanding of the biosynthetic mechanism of leucinostatins and may allow us to better utilize P. lilacinum as a bio-control agent. PMID:27416025

  19. Genome mining and metabolic profiling of the rhizosphere bacterium Pseudomonas sp. SH-C52 for antimicrobial compounds

    Directory of Open Access Journals (Sweden)

    Menno van der Voort

    2015-07-01

    Full Text Available The plant microbiome represents an enormous untapped resource for discovering novel genes and bioactive compounds. Previously, we isolated Pseudomonas sp. SH-C52 from the rhizosphere of sugar beet plants grown in a soil suppressive to the fungal pathogen Rhizoctonia solani and showed that its antifungal activity is, in part, attributed to the production of the chlorinated 9-amino-acid lipopeptide thanamycin (Mendes et al. 2011, Science). To get more insight into its biosynthetic repertoire, the genome of Pseudomonas sp. SH-C52 was sequenced and subjected to in silico, mutational and functional analyses. The sequencing revealed a genome size of 6.3 Mb and 5,579 predicted ORFs. Phylogenetic analysis placed strain SH-C52 within the Pseudomonas corrugata clade. In silico analysis for secondary metabolites revealed a total of six nonribosomal peptide synthetase (NRPS) gene clusters, including the two previously described NRPS clusters for thanamycin and the 2-amino-acid antibacterial lipopeptide brabantamide. Here we show that thanamycin also has activity against an array of other fungi and that brabantamide A exhibits anti-oomycete activity and affects phospholipases of the late blight pathogen Phytophthora infestans. Most notably, mass spectrometry led to the discovery of a third lipopeptide, designated thanapeptin, with a 22-amino-acid peptide moiety. Seven structural variants of thanapeptin were found with varying degrees of activity against P. infestans. Of the remaining four NRPS clusters, one was predicted to encode yet another and unknown lipopeptide with a predicted peptide moiety of 8 amino acids. Collectively, these results show an enormous metabolic potential for Pseudomonas sp. SH-C52, with at least three structurally diverse lipopeptides, each with a different antimicrobial activity spectrum.

  20. Biosynthesis of Antibiotic Leucinostatins in Bio-control Fungus Purpureocillium lilacinum and Their Inhibition on Phytophthora Revealed by Genome Mining.

    Directory of Open Access Journals (Sweden)

    Gang Wang

    2016-07-01

    Full Text Available Purpureocillium lilacinum of Ophiocordycipitaceae is one of the most promising and commercialized agents for controlling plant-parasitic nematodes, as well as other insects and plant pathogens. However, how the fungus functions at the molecular level remains unknown. Here, we sequenced two isolates (PLBJ-1 and PLFJ-1) of P. lilacinum from two different places, Beijing and Fujian. Genomic analysis showed high synteny of the two isolates, and the phylogenetic analysis indicated they were most closely related to the insect pathogen Tolypocladium inflatum. A comparison with other species revealed that this fungus was enriched in carbohydrate-active enzymes (CAZymes), proteases and pathogenesis-related genes. A whole-genome search revealed a rich repertoire of genes encoding secondary metabolites (SMs). The non-ribosomal peptide synthetase LcsA, which is comprised of ten C-A-PCP modules, was identified as the core biosynthetic gene of the lipopeptide leucinostatins, which was specific to P. lilacinum and T. ophioglossoides, as confirmed by phylogenetic analysis. Furthermore, gene expression levels were analyzed when PLBJ-1 was grown in leucinostatin-inducing and non-inducing medium, and 20 genes involved in the biosynthesis of leucinostatins were identified. Disruption mutants allowed us to propose a putative biosynthetic pathway of leucinostatin A. Moreover, overexpression of the transcription factor lcsF increased the production (1.5-fold) of leucinostatins A and B compared to wild type. Bioassays revealed a new bioactivity of leucinostatins and P. lilacinum: inhibiting the growth of Phytophthora infestans and P. capsici. These results contribute to our understanding of the biosynthetic mechanism of leucinostatins and may allow us to better utilize P. lilacinum as a bio-control agent.

  1. Carotenoid metabolic profiling and transcriptome-genome mining reveal functional equivalence among blue-pigmented copepods and appendicularia

    KAUST Repository

    Mojib, Nazia

    2014-06-01

    The tropical oligotrophic oceanic areas are characterized by high water transparency and annual solar radiation. Under these conditions, a large number of phylogenetically diverse mesozooplankton species living in the surface waters (neuston) are found to be blue pigmented. In the present study, we focused on understanding the metabolic and genetic basis of the observed blue phenotype functional equivalence between the blue-pigmented organisms from the phylum Arthropoda, subclass Copepoda (Acartia fossae) and the phylum Chordata, class Appendicularia (Oikopleura dioica) in the Red Sea. Previous studies have shown that carotenoid–protein complexes are responsible for blue coloration in crustaceans. Therefore, we performed carotenoid metabolic profiling using both targeted and nontargeted (high-resolution mass spectrometry) approaches in four different blue-pigmented genera of copepods and one blue-pigmented species of appendicularia. Astaxanthin was found to be the principal carotenoid in all the species. The pathway analysis showed that all the species can synthesize astaxanthin from β-carotene, ingested from dietary sources, via 3-hydroxyechinenone, canthaxanthin, zeaxanthin, adonirubin or adonixanthin. Further, using the de novo assembled transcriptome of blue A. fossae (subclass Copepoda), we identified highly expressed homologous β-carotene hydroxylase enzymes and putative carotenoid-binding proteins responsible for astaxanthin formation and the blue phenotype. In blue O. dioica (class Appendicularia), corresponding putative genes were identified from the reference genome. Collectively, our data provide molecular evidence for the bioconversion and accumulation of blue astaxanthin–protein complexes underpinning the observed ecological functional equivalence and adaptive convergence among neustonic mesozooplankton.

  2. Whole genome sequencing of Streptococcus pneumoniae: development, evaluation and verification of targets for serogroup and serotype prediction using an automated pipeline.

    Science.gov (United States)

    Kapatai, Georgia; Sheppard, Carmen L; Al-Shahib, Ali; Litt, David J; Underwood, Anthony P; Harrison, Timothy G; Fry, Norman K

    2016-01-01

    Streptococcus pneumoniae typically express one of 92 serologically distinct capsule polysaccharide (cps) types (serotypes). Some of these serotypes are closely related to each other; using the commercially available typing antisera, these are assigned to common serogroups containing types that show cross-reactivity. In this serotyping scheme, factor antisera are used to allocate serotypes within a serogroup, based on patterns of reactions. This serotyping method is technically demanding, requires considerable experience and the reading of the results can be subjective. This study describes the analysis of the S. pneumoniae capsular operon genetic sequence to determine serotype distinguishing features and the development, evaluation and verification of an automated whole genome sequence (WGS)-based serotyping bioinformatics tool, PneumoCaT (Pneumococcal Capsule Typing). Initially, WGS data from 871 S. pneumoniae isolates were mapped to reference cps locus sequences for the 92 serotypes. Thirty-two of 92 serotypes could be unambiguously identified based on sequence similarities within the cps operon. The remaining 60 were allocated to one of 20 'genogroups' that broadly correspond to the immunologically defined serogroups. By comparing the cps reference sequences for each genogroup, unique molecular differences were determined for serotypes within 18 of the 20 genogroups and verified using the set of 871 isolates. This information was used to design a decision-tree style algorithm within the PneumoCaT bioinformatics tool to predict to serotype level for 89/94 (92 + 2 molecular types/subtypes) from WGS data and to serogroup level for serogroups 24 and 32, which currently comprise 2.1% of UK referred, invasive isolates submitted to the National Reference Laboratory (NRL), Public Health England (June 2014-July 2015). PneumoCaT was evaluated with an internal validation set of 2065 UK isolates covering 72/92 serotypes, including 19 non-typeable isolates and an external
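
    The two-stage logic described above (map to cps references, then resolve within a genogroup using distinguishing variants) can be sketched schematically. Coverage values, genogroup membership, variant names and thresholds below are placeholders, not PneumoCaT internals.

      # Schematic two-stage serotype call: (1) pick the cps reference with the best
      # mapping coverage, (2) if that reference belongs to a genogroup, resolve the
      # serotype with genogroup-specific distinguishing variants. All data invented.
      COVERAGE = {"06A": 0.97, "06B": 0.96, "19F": 0.31}       # fraction of cps reference covered
      GENOGROUP = {"06A": "genogroup6", "06B": "genogroup6"}
      DISTINGUISHING = {                                        # variant -> serotype within group
          "genogroup6": {"variant_584G": "06A", "variant_584A": "06B"},
      }

      def call_serotype(coverage, variants_seen, min_cov=0.9):
          best, cov = max(coverage.items(), key=lambda kv: kv[1])
          if cov < min_cov:
              return "untypable"
          group = GENOGROUP.get(best)
          if group is None:
              return best                                       # unambiguous serotype
          for variant, serotype in DISTINGUISHING[group].items():
              if variant in variants_seen:
                  return serotype
          return group                                          # report to genogroup level only

      print(call_serotype(COVERAGE, {"variant_584G"}))          # -> "06A"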

  3. Library Automation

    OpenAIRE

    Dhakne, B. N.; Giri, V. V; Waghmode, S. S.

    2010-01-01

    New technologies provide libraries with several new materials, media and modes of storing and communicating information. Library automation reduces the drudgery of repeated manual efforts in library routines. Library automation is used for collection, storage, administration, processing, preservation, communication, etc.

  4. Two non-synonymous markers in PTPN21, identified by genome-wide association study data-mining and replication, are associated with schizophrenia.

    LENUS (Irish Health Repository)

    Chen, Jingchun

    2011-09-01

    We conducted data-mining analyses of genome wide association (GWA) studies of the CATIE and MGS-GAIN datasets, and found 13 markers in the two physically linked genes, PTPN21 and EML5, showing nominally significant association with schizophrenia. Linkage disequilibrium (LD) analysis indicated that all 7 markers from PTPN21 shared high LD (r² > 0.8), including rs2274736 and rs2401751, the two non-synonymous markers with the most significant association signals (rs2401751, P=1.10 × 10⁻³ and rs2274736, P=1.21 × 10⁻³). In a meta-analysis of all 13 replication datasets with a total of 13,940 subjects, we found that the two non-synonymous markers are significantly associated with schizophrenia (rs2274736, OR=0.92, 95% CI: 0.86-0.97, P=5.45 × 10⁻³ and rs2401751, OR=0.92, 95% CI: 0.86-0.97, P=5.29 × 10⁻³). One SNP (rs7147796) in EML5 is also significantly associated with the disease (OR=1.08, 95% CI: 1.02-1.14, P=6.43 × 10⁻³). These 3 markers remain significant after Bonferroni correction. Furthermore, haplotype conditioned analyses indicated that the association signals observed between rs2274736/rs2401751 and rs7147796 are statistically independent. Given the results that 2 non-synonymous markers in PTPN21 are associated with schizophrenia, further investigation of this locus is warranted.
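
    The pooled estimates quoted above come from a standard meta-analysis of per-dataset odds ratios; a generic inverse-variance fixed-effect calculation is sketched below on invented per-study values (the record reports only the pooled results, e.g. OR=0.92, 95% CI 0.86-0.97 for rs2274736).

      # Generic inverse-variance fixed-effect meta-analysis of odds ratios.
      # The per-study values are invented for illustration only.
      import math

      studies = [            # (odds ratio, 95% CI lower, 95% CI upper) per replication dataset
          (0.90, 0.80, 1.01),
          (0.94, 0.85, 1.04),
          (0.91, 0.83, 1.00),
      ]

      def pooled_or(studies):
          num = den = 0.0
          for or_, lo, hi in studies:
              log_or = math.log(or_)
              se = (math.log(hi) - math.log(lo)) / (2 * 1.96)   # SE recovered from the 95% CI
              w = 1.0 / se ** 2
              num += w * log_or
              den += w
          pooled_log = num / den
          se_pooled = math.sqrt(1.0 / den)
          ci = (math.exp(pooled_log - 1.96 * se_pooled), math.exp(pooled_log + 1.96 * se_pooled))
          return math.exp(pooled_log), ci

      print(pooled_or(studies))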

  5. Process automation

    International Nuclear Information System (INIS)

    Process automation technology has been pursued in the chemical processing industries and to a very limited extent in nuclear fuel reprocessing. Its effective use has been restricted in the past by the lack of diverse and reliable process instrumentation and the unavailability of sophisticated software designed for process control. The Integrated Equipment Test (IET) facility was developed by the Consolidated Fuel Reprocessing Program (CFRP) in part to demonstrate new concepts for control of advanced nuclear fuel reprocessing plants. A demonstration of fuel reprocessing equipment automation using advanced instrumentation and a modern, microprocessor-based control system is nearing completion in the facility. This facility provides for the synergistic testing of all chemical process features of a prototypical fuel reprocessing plant that can be attained with unirradiated uranium-bearing feed materials. The unique equipment and mission of the IET facility make it an ideal test bed for automation studies. This effort will provide for the demonstration of the plant automation concept and for the development of techniques for similar applications in a full-scale plant. A set of preliminary recommendations for implementing process automation has been compiled. Some of these concepts are not generally recognized or accepted. The automation work now under way in the IET facility should be useful to others in helping avoid costly mistakes because of the underutilization or misapplication of process automation. 6 figs

  6. Genome-wide analysis of the rice and arabidopsis non-specific lipid transfer protein (nsLtp) gene families and identification of wheat nsLtp genes by EST data mining

    OpenAIRE

    Chantret Nathalie; Boutrot Freddy; Gautier Marie-Françoise

    2008-01-01

    Abstract Background Plant non-specific lipid transfer proteins (nsLTPs) are encoded by multigene families and possess physiological functions that remain unclear. Our objective was to characterize the complete nsLtp gene family in rice and arabidopsis and to perform wheat EST database mining for nsLtp gene discovery. Results In this study, we carried out a genome-wide analysis of nsLtp gene families in Oryza sativa and Arabidopsis thaliana and identified 52 rice nsLtp genes and 49 arabidopsis...

  7. Genome-wide analysis of the rice and arabidopsis non-specific lipid transfer protein (nsLtp) gene families and identification of wheat nsLtp genes by EST data mining

    OpenAIRE

    Boutrot, Freddy; Chantret, Nathalie; Gautier, Marie Francoise

    2008-01-01

    Plant non-specific lipid transfer proteins (nsLTPs) are encoded by multigene families and possess physiological functions that remain unclear. Our objective was to characterize the complete nsLtp gene family in rice and arabidopsis and to perform wheat EST database mining for nsLtp gene discovery. Results In this study, we carried out a genome-wide analysis of nsLtp gene families in Oryza sativa and Arabidopsis thaliana and identified 52 rice nsLtp genes and 49 arabidopsis nsLtp genes. Here w...

  8. Genome mining for new α-amylase and glucoamylase encoding sequences and high level expression of a glucoamylase from Talaromyces stipitatus for potential raw starch hydrolysis.

    Science.gov (United States)

    Xiao, Zhizhuang; Wu, Meiqun; Grosse, Stephan; Beauchemin, Manon; Lévesque, Michelle; Lau, Peter C K

    2014-01-01

    Mining fungal genomes for glucoamylase- and α-amylase-encoding sequences led to the selection of 23 candidates, two of which (designated TSgam-2 and NFamy-2) were advanced to testing for cooked or raw starch hydrolysis. TSgam-2 is a 66-kDa glucoamylase recombinantly produced in Pichia pastoris and originally derived from Talaromyces stipitatus. When harvested in a 20-L bioreactor at high cell density (OD600 > 200), the secreted TSgam-2 enzyme activity from P. pastoris strain GS115 reached 800 U/mL. In a 6-L working volume of a 10-L fermentation, the TSgam-2 protein yield was estimated to be ∼8 g with a specific activity of 360 U/mg. In contrast, the highest activity of NFamy-2, a 70-kDa α-amylase originally derived from Neosartorya fischeri and expressed in P. pastoris KM71, only reached 8 U/mL. Both proteins were purified and characterized in terms of pH and temperature optima, kinetic parameters, and thermostability. TSgam-2 was more thermostable than NFamy-2 with a respective half-life (t1/2) of >300 min at 55 °C and >200 min at 40 °C. The kinetic parameters for raw starch adsorption of TSgam-2 and NFamy-2 were also determined. A combination of NFamy-2 and TSgam-2 hydrolyzed cooked potato and triticale starch into glucose with yields (71-87%) that are competitive with commercially available α-amylases. In the hydrolysis of raw starch, the best hydrolysis condition was seen with a sequential addition of 40 U of a thermostable Bacillus globigii amylase (BgAmy)/g starch at 80 °C for 16 h, and 40 U TSgam-2/g starch at 45 °C for 24 h. The glucose released was 8.7 g/10 g of triticale starch and 7.9 g/10 g of potato starch, representing starch degradation rates of 95% and 86%, respectively.

  9. Data mining process automatization of air pollution data by the LISp-Miner system

    OpenAIRE

    Ochodnická, Zuzana

    2014-01-01

    This thesis is focused on the area of automated data mining. The aim of this thesis is a description of the area of automated data mining, the design of an automated process for creating data mining tasks for verification of given domain knowledge and for searching for new knowledge, and also an implementation of verification of given domain knowledge of the attribute-dependency influence type with search space adjustments. The implementation language is the LMCL language, which enables usage of the LISp-Miner...

  10. Introduction to Space Resource Mining

    Science.gov (United States)

    Mueller, Robert P.

    2013-01-01

    There are vast amounts of resources in the solar system that will be useful to humans in space and possibly on Earth. None of these resources can be exploited without the first necessary step of extra-terrestrial mining. The necessary technologies for tele-robotic and autonomous mining have not matured sufficiently yet. The current state of technology was assessed for terrestrial and extraterrestrial mining and a taxonomy of robotic space mining mechanisms was presented which was based on current existing prototypes. Terrestrial and extra-terrestrial mining methods and technologies are on the cusp of massive changes towards automation and autonomy for economic and safety reasons. It is highly likely that these industries will benefit from mutual cooperation and technology transfer.

  11. The hydrogen mine introduction initiative

    Energy Technology Data Exchange (ETDEWEB)

    Betournay, M.C.; Howell, B. [Natural Resources Canada, Ottawa, ON (Canada). CANMET Mining and Mineral Sciences Laboratories

    2009-07-01

    In an effort to address air quality concerns in underground mines, the mining industry is considering the use of fuel cells instead of diesel to power mine production vehicles. The immediate issues and opportunities associated with fuel cell use include a reduction in harmful greenhouse gas emissions; reduction in ventilation operating costs; reduction in energy consumption; improved health benefits; automation; and high productivity. The objective of the hydrogen mine introduction initiative (HMII) is to develop and test the range of fundamental and needed operational technology, specifications and best practices for underground hydrogen power applications. Although proof of concept studies have shown high potential for fuel cell use, safety considerations must be addressed, including hydrogen behaviour in confined conditions. This presentation highlighted the issues to meet operational requirements, notably hydrogen production; delivery and storage; mine regulations; and hydrogen behaviour underground. tabs., figs.

  12. Longwall mining

    Energy Technology Data Exchange (ETDEWEB)

    NONE

    1995-03-14

    As part of EIA's program to provide information on coal, this report, Longwall Mining, describes longwall mining and compares it with other underground mining methods. Using data from EIA and private sector surveys, the report describes major changes in the geologic, technological, and operating characteristics of longwall mining over the past decade. Most important, the report shows how these changes led to dramatic improvements in longwall mining productivity. For readers interested in the history of longwall mining and greater detail on recent developments affecting longwall mining, the report includes a bibliography.

  13. Automating Finance

    Science.gov (United States)

    Moore, John

    2007-01-01

    In past years, higher education's financial management side has been riddled with manual processes and aging mainframe applications. This article discusses schools which had taken advantage of an array of technologies that automate billing, payment processing, and refund processing in the case of overpayment. The investments are well worth it:…

  14. Automation Security

    OpenAIRE

    Mirzoev, Dr. Timur

    2014-01-01

    Web-based Automated Process Control systems are a new type of application that uses the Internet to control industrial processes with access to real-time data. Supervisory control and data acquisition (SCADA) networks contain computers and applications that perform key functions in providing essential services and commodities (e.g., electricity, natural gas, gasoline, water, waste treatment, transportation) to all Americans. As such, they are part of the nation's critical infrastructure...

  15. Text Mining.

    Science.gov (United States)

    Trybula, Walter J.

    1999-01-01

    Reviews the state of research in text mining, focusing on newer developments. The intent is to describe the disparate investigations currently included under the term text mining and provide a cohesive structure for these efforts. A summary of research identifies key organizations responsible for pushing the development of text mining. A section…

  16. Data, Text and Web Mining for Business Intelligence : A Survey

    Directory of Open Access Journals (Sweden)

    Abdul-Aziz Rashid Al-Azmi

    2013-04-01

    Full Text Available The Information and Communication Technologies revolution brought a digital world with huge amounts of data available. Enterprises use mining technologies to search vast amounts of data for vital insight and knowledge. Mining tools such as data mining, text mining, and web mining are used to find hidden knowledge in large databases or the Internet. Mining tools are automated software tools used to achieve business intelligence by finding hidden relations, and predicting future events from vast amounts of data. This uncovered knowledge helps in gaining competitive advantages, better customer relationships, and even fraud detection. In this survey, we’ll describe how these techniques work and how they are implemented. Furthermore, we shall discuss how business intelligence is achieved using these mining tools. We then look into some case studies of success stories using mining tools. Finally, we shall demonstrate some of the main challenges to the mining technologies that limit their potential.

  17. DATA, TEXT, AND WEB MINING FOR BUSINESS INTELLIGENCE: A SURVEY

    Directory of Open Access Journals (Sweden)

    Abdul-Aziz Rashid

    2013-03-01

    Full Text Available The Information and Communication Technologies revolution brought a digital world with huge amounts of data available. Enterprises use mining technologies to search vast amounts of data for vital insight and knowledge. Mining tools such as data mining, text mining, and web mining are used to find hidden knowledge in large databases or the Internet. Mining tools are automated software tools used to achieve business intelligence by finding hidden relations, and predicting future events from vast amounts of data. This uncovered knowledge helps in gaining competitive advantages, better customer relationships, and even fraud detection. In this survey, we'll describe how these techniques work and how they are implemented. Furthermore, we shall discuss how business intelligence is achieved using these mining tools. We then look into some case studies of success stories using mining tools. Finally, we shall demonstrate some of the main challenges to the mining technologies that limit their potential.

  18. Maintainability Analysis of Underground Mining Equipment Using Genetic Algorithms: Case Studies with an LHD Vehicle

    OpenAIRE

    Sihong Peng; Nick Vayenas

    2014-01-01

    While increased mine mechanization and automation make considerable contributions to mine productivity, unexpected equipment failures and planned or routine maintenance prohibit the maximum possible utilization of sophisticated mining equipment and require a significant amount of extra capital investment. This paper deals with aspects of maintainability prediction for mining machinery. A PC software called GenRel was developed for this purpose. In GenRel, it is assumed that failures of mining...

  19. Locating previously unknown patterns in data-mining results: a dual data- and knowledge-mining method

    OpenAIRE

    Knaus William A; Siadaty Mir S

    2006-01-01

    Abstract Background Data mining can be utilized to automate analysis of substantial amounts of data produced in many organizations. However, data mining produces large numbers of rules and patterns, many of which are not useful. Existing methods for pruning uninteresting patterns have only begun to automate the knowledge acquisition step (which is required for subjective measures of interestingness), hence leaving a serious bottleneck. In this paper we propose a method for automatically acqui...

  20. Text mining for the biocuration workflow

    OpenAIRE

    Hirschman, L.; Burns, G. A. P. C.; Krallinger, M.; Arighi, C.; Cohen, K. B.; Valencia, A.; Wu, C H; Chatr-aryamontri, A; Dowell, K. G.; Huala, E; Lourenco, A.; Nash, R; Veuthey, A.-L.; Wiegers, T.; Winter, A. G.

    2012-01-01

    Molecular biology has become heavily dependent on biological knowledge encoded in expert curated biological databases. As the volume of biological literature increases, biocurators need help in keeping up with the literature; (semi-) automated aids for biocuration would seem to be an ideal application for natural language processing and text mining. However, to date, there have been few documented successes for improving biocuration throughput using text mining. Our initial investigations too...

  1. Graph mining

    OpenAIRE

    Ramon, Jan

    2013-01-01

    Graph mining is the study of how to perform data mining and machine learning on data represented with graphs. One can distinguish between, on the one hand, transactional graph mining, where a database of separate, independent graphs is considered (such as databases of molecules and databases of images), and, on the other hand, large network analysis, where a single large network is considered (such as chemical interaction networks and concept networks).

  2. Automated Budget System

    Data.gov (United States)

    Department of Transportation — The Automated Budget System (ABS) automates management and planning of the Mike Monroney Aeronautical Center (MMAC) budget by providing enhanced capability to plan,...

  3. Use of genome-wide expression data to mine the "gray zone" of GWA studies leads to novel candidate obesity genes

    NARCIS (Netherlands)

    J. Naukkarinen (Jussi); I. Surakka (Ida); K.H. Pietilainen (Kirsi Hannele); A. Rissanen (Aila); V. Salomaa (Veikko); S. Ripatti (Samuli); H. Yki-Jarvinen (Hannele); C.M. van Duijn (Cock); H.E. Wichmann (Heinz Erich); J. Kaprio (Jaakko); M. Taskinen (Marja Riitta); L. Peltonen (Leena Johanna)

    2010-01-01

    To get beyond the "low-hanging fruits" so far identified by genome-wide association (GWA) studies, new methods must be developed in order to discover the numerous remaining genes that estimates of heritability indicate should be contributing to complex human phenotypes, such as obesity.

  4. Genome mining of the genetic diversity in the Aspergillus genus - from a collection of more than 30 Aspergillus species

    DEFF Research Database (Denmark)

    Rasmussen, Jane Lind Nybo; Vesth, Tammi Camilla; Theobald, Sebastian;

    , this project uses BLAST on the amino acid level to discover orthologs. With a potential of 300 Aspergillus species each having ~12,000 annotated genes, traditional clustering will demand supercomputing. Instead, our approach reduces the search space by identifying isoenzymes within each genome creating...

  5. Mining's global tomorrow

    Energy Technology Data Exchange (ETDEWEB)

    Hobbs, B.; Grimstone, L.; MacBeth, A.

    1995-05-01

    Consists of an edited extract of the chapter 'Mining Today, Australia's Tomorrow - Exploration and Mining Globally to 2020' from the CSIRO book 'Challenge to change: Australia in 2020'. Forecasts the state of the Australian mineral industry in 2020 covering aspects such as: exploration concepts and area selection; excavation design and engineering; developments in mining technology and equipment; environmental management; and minerals processing. Presents an optimistic view of the long-term future of the Australian mining industry. Predicts that by 2020 Australia will emerge as the leader of an Austral/Asian trading bloc incorporating India, SE Asia, Pacific islands, Australia and New Zealand.

  6. Discussion of Minos Mine operating system

    Energy Technology Data Exchange (ETDEWEB)

    Pan, B.

    1991-10-01

    The MINOS (mine operating system), which is used in the majority of British collieries, provides central control at the surface for the machinery and environmental equipment distributed throughout the mine. Installed equipment, including face machinery, conveyors, pumps, fans and sensors, is connected to local outstations which all communicate with the control system via a single run of signal cable. The article discusses the system, particularly its use in the Automated Control System of Underground Mining Locomotives (ACSUML). The discussion includes the use of MINOS to improve wagon identification, the operating principle of ACSUML and the possibilities of a driverless locomotive. 2 figs.

  7. Data mining in healthcare: decision making and precision

    OpenAIRE

    Ionuţ ŢĂRANU

    2016-01-01

    The application of data mining in healthcare is increasing because the health sector is rich with information and data mining has become a necessity. Healthcare organizations generate and collect large volumes of information on a daily basis. The use of information technology enables automation of data mining and knowledge discovery, helping to reveal interesting patterns; this means eliminating manual tasks and extracting data directly from electronic records, electronic transfer syst...

  8. Mining a database of single amplified genomes from Red Sea brine pool extremophiles-improving reliability of gene function prediction using a profile and pattern matching algorithm (PPMA).

    KAUST Repository

    Grötzinger, Stefan W

    2014-04-07

    Reliable functional annotation of genomic data is the key step in the discovery of novel enzymes. Intrinsic sequencing data quality problems of single amplified genomes (SAGs) and poor homology of novel extremophile's genomes pose significant challenges for the attribution of functions to the coding sequences identified. The anoxic deep-sea brine pools of the Red Sea are a promising source of novel enzymes with unique evolutionary adaptation. Sequencing data from Red Sea brine pool cultures and SAGs are annotated and stored in the Integrated Data Warehouse of Microbial Genomes (INDIGO) data warehouse. Low sequence homology of annotated genes (no similarity for 35% of these genes) may translate into false positives when searching for specific functions. The Profile and Pattern Matching (PPM) strategy described here was developed to eliminate false positive annotations of enzyme function before progressing to labor-intensive hyper-saline gene expression and characterization. It utilizes InterPro-derived Gene Ontology (GO)-terms (which represent enzyme function profiles) and annotated relevant PROSITE IDs (which are linked to an amino acid consensus pattern). The PPM algorithm was tested on 15 protein families, which were selected based on scientific and commercial potential. An initial list of 2577 enzyme commission (E.C.) numbers was translated into 171 GO-terms and 49 consensus patterns. A subset of INDIGO-sequences consisting of 58 SAGs from six different taxons of bacteria and archaea were selected from six different brine pool environments. Those SAGs code for 74,516 genes, which were independently scanned for the GO-terms (profile filter) and PROSITE IDs (pattern filter). Following stringent reliability filtering, the non-redundant hits (106 profile hits and 147 pattern hits) are classified as reliable if at least two relevant descriptors (GO-terms and/or consensus patterns) are present. Scripts for annotation, as well as for the PPM algorithm, are available through the INDIGO website.
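
    The reliability rule described above (keep an annotation only when at least two relevant descriptors, GO terms and/or PROSITE consensus patterns, support it) reduces to a small cross-check. The identifiers below are illustrative placeholders, not the study's descriptor lists.

      # Minimal version of the profile-and-pattern cross-check: count how many
      # relevant descriptors (GO terms from the profile filter, PROSITE IDs from the
      # pattern filter) support a gene and keep it only if at least two agree.
      # GO/PROSITE identifiers below are illustrative placeholders.
      RELEVANT_GO = {"GO:0004553", "GO:0016787"}          # enzyme-function profile terms
      RELEVANT_PROSITE = {"PS00562"}                      # consensus-pattern IDs

      def reliable(gene_go_terms, gene_prosite_ids, min_descriptors=2):
          supporting = (set(gene_go_terms) & RELEVANT_GO) | (set(gene_prosite_ids) & RELEVANT_PROSITE)
          return len(supporting) >= min_descriptors

      print(reliable({"GO:0004553"}, {"PS00562"}))        # True: one profile + one pattern hit
      print(reliable({"GO:0004553"}, set()))              # False: a single descriptor is not enough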

  9. Educational Data Mining Model Using Rattle

    Directory of Open Access Journals (Sweden)

    Sadiq Hussain

    2014-07-01

    Full Text Available Data mining is the extraction of knowledge from large databases. Data mining has affected all fields, from combating terror attacks to human genome databases. R programming has a key role to play in many kinds of data analysis. Rattle, an effective GUI for R programming, is used extensively for generating reports based on several current models such as random forests and support vector machines. It is otherwise hard to compare models and choose one for the data that needs to be mined. This paper proposes a method using Rattle for the selection of an educational data mining model.

  10. Use of Genome-Wide Expression Data to Mine the “Gray Zone” of GWA Studies Leads to Novel Candidate Obesity Genes

    Science.gov (United States)

    Naukkarinen, Jussi; Surakka, Ida; Pietiläinen, Kirsi H.; Rissanen, Aila; Salomaa, Veikko; Ripatti, Samuli; Yki-Järvinen, Hannele; van Duijn, Cornelia M.; Wichmann, H.-Erich; Kaprio, Jaakko; Taskinen, Marja-Riitta; Peltonen, Leena

    2010-01-01

    To get beyond the “low-hanging fruits” so far identified by genome-wide association (GWA) studies, new methods must be developed in order to discover the numerous remaining genes that estimates of heritability indicate should be contributing to complex human phenotypes, such as obesity. Here we describe a novel integrative method for complex disease gene identification utilizing both genome-wide transcript profiling of adipose tissue samples and consequent analysis of genome-wide association data generated in large SNP scans. We infer causality of genes with obesity by employing a unique set of monozygotic twin pairs discordant for BMI (n = 13 pairs, age 24–28 years, 15.4 kg mean weight difference) and contrast the transcript profiles with those from a larger sample of non-related adult individuals (N = 77). Using this approach, we were able to identify 27 genes with possibly causal roles in determining the degree of human adiposity. Testing for association of SNP variants in these 27 genes in the population samples of the large ENGAGE consortium (N = 21,000) revealed a significant deviation of P-values from the expected (P = 4×10⁻⁴). A total of 13 genes contained SNPs nominally associated with BMI. The top finding was blood coagulation factor F13A1 identified as a novel obesity gene also replicated in a second GWA set of ∼2,000 individuals. This study presents a new approach to utilizing gene expression studies for informing choice of candidate genes for complex human phenotypes, such as obesity. PMID:20532202

  11. Use of genome-wide expression data to mine the "Gray Zone" of GWA studies leads to novel candidate obesity genes.

    Directory of Open Access Journals (Sweden)

    Jussi Naukkarinen

    2010-06-01

    Full Text Available To get beyond the "low-hanging fruits" so far identified by genome-wide association (GWA) studies, new methods must be developed in order to discover the numerous remaining genes that estimates of heritability indicate should be contributing to complex human phenotypes, such as obesity. Here we describe a novel integrative method for complex disease gene identification utilizing both genome-wide transcript profiling of adipose tissue samples and consequent analysis of genome-wide association data generated in large SNP scans. We infer causality of genes with obesity by employing a unique set of monozygotic twin pairs discordant for BMI (n = 13 pairs, age 24-28 years, 15.4 kg mean weight difference) and contrast the transcript profiles with those from a larger sample of non-related adult individuals (N = 77). Using this approach, we were able to identify 27 genes with possibly causal roles in determining the degree of human adiposity. Testing for association of SNP variants in these 27 genes in the population samples of the large ENGAGE consortium (N = 21,000) revealed a significant deviation of P-values from the expected (P = 4×10⁻⁴). A total of 13 genes contained SNPs nominally associated with BMI. The top finding was blood coagulation factor F13A1 identified as a novel obesity gene also replicated in a second GWA set of approximately 2,000 individuals. This study presents a new approach to utilizing gene expression studies for informing choice of candidate genes for complex human phenotypes, such as obesity.

  12. Mining a database of single amplified genomes from Red Sea brine pool extremophiles – Improving reliability of gene function prediction using a profile and pattern matching algorithm (PPMA)

    Directory of Open Access Journals (Sweden)

    Stefan Wolfgang Grötzinger

    2014-04-01

    Full Text Available Reliable functional annotation of genomic data is the key-step in the discovery of novel enzymes. Intrinsic sequencing data quality problems of single amplified genomes (SAGs) and poor homology of novel extremophile's genomes pose significant challenges for the attribution of functions to the coding sequences identified. The anoxic deep-sea brine pools of the Red Sea are a promising source of novel enzymes with unique evolutionary adaptation. Sequencing data from Red Sea brine pool cultures and SAGs are annotated and stored in the INDIGO data warehouse. Low sequence homology of annotated genes (no similarity for 35% of these genes) may translate into false positives when searching for specific functions. The Profile & Pattern Matching (PPM) strategy described here was developed to eliminate false positive annotations of enzyme function before progressing to labor-intensive hyper-saline gene expression and characterization. It utilizes InterPro-derived Gene Ontology (GO) terms (which represent enzyme function profiles) and annotated relevant PROSITE IDs (which are linked to an amino acid consensus pattern). The PPM algorithm was tested on 15 protein families, which were selected based on scientific and commercial potential. An initial list of 2,577 E.C. numbers was translated into 171 GO-terms and 49 consensus patterns. A subset of INDIGO-sequences consisting of 58 SAGs from six different taxons of bacteria and archaea were selected from 6 different brine pool environments. Those SAGs code for 74,516 genes, which were independently scanned for the GO-terms (profile filter) and PROSITE IDs (pattern filter). Following stringent reliability filtering, the non-redundant hits (106 profile hits and 147 pattern hits) are classified as reliable, if at least two relevant descriptors (GO-terms and/or consensus patterns) are present. Scripts for annotation, as well as for the PPM algorithm, are available through the INDIGO website.

  13. Automated Event Service: Efficient and Flexible Searching for Earth Science Phenomena Project

    Data.gov (United States)

    National Aeronautics and Space Administration — Develop an Automated Event Service system that: Methodically mines custom-defined events in the reanalysis data sets of global atmospheric models. Enables...

  14. FROM DATA MINING TO BEHAVIOR MINING

    OpenAIRE

    ZHENGXIN CHEN

    2006-01-01

    The knowledge economy requires data mining to be more goal-oriented so that more tangible results can be produced. This requirement implies that the semantics of the data should be incorporated into the mining process. Data mining is ready to deal with this challenge because recent developments in data mining have shown an increasing interest in mining complex data (as exemplified by graph mining, text mining, etc.). By incorporating the relationships of the data along with the data itself (rathe...

  15. Social big data mining

    CERN Document Server

    Ishikawa, Hiroshi

    2015-01-01

    Social Media. Big Data and Social Data. Hypotheses in the Era of Big Data. Social Big Data Applications. Basic Concepts in Data Mining. Association Rule Mining. Clustering. Classification. Prediction. Web Structure Mining. Web Content Mining. Web Access Log Mining, Information Extraction and Deep Web Mining. Media Mining. Scalability and Outlier Detection.

  16. Mining whole genomes and transcriptomes of Jatropha (Jatropha curcas) and Castor bean (Ricinus communis) for NBS-LRR genes and defense response associated transcription factors.

    Science.gov (United States)

    Sood, Archit; Jaiswal, Varun; Chanumolu, Sree Krishna; Malhotra, Nikhil; Pal, Tarun; Chauhan, Rajinder Singh

    2014-11-01

    Jatropha (Jatropha curcas L.) and Castor bean (Ricinus communis) are oilseed crops of the family Euphorbiaceae with the potential of producing high-quality biodiesel and having industrial value. Both bioenergy plants are becoming susceptible to various biotic stresses directly affecting the oil quality and content. No report exists to date on analysis of the Nucleotide Binding Site-Leucine Rich Repeat (NBS-LRR) gene repertoire and defense response transcription factors in these two plant species. In silico analysis of whole genomes and transcriptomes identified 47 new NBS-LRR genes in the two species and 122 and 318 defense response-related transcription factors in Jatropha and Castor bean, respectively. The identified NBS-LRR genes and defense response transcription factors were mapped onto the respective genomes. Common and unique NBS-LRR genes and defense-related transcription factors were identified in both plant species. All NBS-LRR genes in both species were classified into Toll/interleukin-1 receptor NBS-LRRs (TNLs) and coiled-coil NBS-LRRs (CNLs) and characterized by position on contigs, gene clusters, and motif and domain distribution. Transcript abundance or expression values were measured for all NBS-LRR genes and defense response transcription factors, suggesting their functional role. The current study provides a repertoire of NBS-LRR genes and transcription factors which can be used not only in dissecting the molecular basis of the disease resistance phenotype but also in developing disease-resistant genotypes in Jatropha and Castor bean through transgenic or molecular breeding approaches.
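
    The TNL/CNL split mentioned above follows from the N-terminal domain found alongside the NBS and LRR domains. A toy classifier over hypothetical per-gene domain annotations (not data from the study) looks like this:

      # Toy TNL/CNL classification of NBS-LRR candidates from per-gene domain lists.
      # Domain names and the example genes are placeholders for a real annotation
      # (e.g. Pfam/InterProScan output), not data from the study.
      def classify_nbs_lrr(domains):
          d = {x.upper() for x in domains}
          if not (({"NB-ARC", "NBS"} & d) and "LRR" in d):
              return "not NBS-LRR"
          if "TIR" in d:
              return "TNL"          # Toll/interleukin-1 receptor type
          if {"CC", "COILED-COIL"} & d:
              return "CNL"          # coiled-coil type
          return "NL (no recognizable N-terminal domain)"

      genes = {
          "gene_A": ["TIR", "NB-ARC", "LRR"],
          "gene_B": ["CC", "NB-ARC", "LRR"],
      }
      for gene, doms in genes.items():
          print(gene, classify_nbs_lrr(doms))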

  17. Beegle: from literature mining to disease-gene discovery.

    Science.gov (United States)

    ElShal, Sarah; Tranchevent, Léon-Charles; Sifrim, Alejandro; Ardeshirdavani, Amin; Davis, Jesse; Moreau, Yves

    2016-01-29

    Disease-gene identification is a challenging process that has multiple applications within functional genomics and personalized medicine. Typically, this process involves both finding genes known to be associated with the disease (through literature search) and carrying out preliminary experiments or screens (e.g. linkage or association studies, copy number analyses, expression profiling) to determine a set of promising candidates for experimental validation. This requires extensive time and monetary resources. We describe Beegle, an online search and discovery engine that attempts to simplify this process by automating the typical approaches. It starts by mining the literature to quickly extract a set of genes known to be linked with a given query, then it integrates the learning methodology of Endeavour (a gene prioritization tool) to train a genomic model and rank a set of candidate genes to generate novel hypotheses. In a realistic evaluation setup, Beegle has an average recall of 84% in the top 100 returned genes as a search engine, which improves the discovery engine by 12.6% in the top 5% prioritized genes. Beegle is publicly available at http://beegle.esat.kuleuven.be/.
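
    The headline figure of 84% recall in the top 100 returned genes corresponds to a standard recall-at-k computation. The sketch below shows that computation on toy data; the gene lists are invented, and the ranking step itself (Beegle's literature mining plus Endeavour prioritization) is not reproduced here.

    # Recall-at-k as used to report figures such as "84% recall in the top 100
    # returned genes". The ranked list and the known gene set are toy examples.
    def recall_at_k(ranked_genes, known_disease_genes, k=100):
        top_k = set(ranked_genes[:k])
        known = set(known_disease_genes)
        return len(top_k & known) / len(known) if known else 0.0

    ranked = ["BRCA2", "TP53", "GENE_X", "BRCA1", "GENE_Y"]
    known = ["BRCA1", "BRCA2", "GENE_Z"]
    print(recall_at_k(ranked, known, k=5))   # 2 of 3 known genes recovered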

  18. Bovine Genome Database: new tools for gleaning function from the Bos taurus genome.

    Science.gov (United States)

    Elsik, Christine G; Unni, Deepak R; Diesh, Colin M; Tayal, Aditi; Emery, Marianne L; Nguyen, Hung N; Hagen, Darren E

    2016-01-01

    We report an update of the Bovine Genome Database (BGD) (http://BovineGenome.org). The goal of BGD is to support bovine genomics research by providing genome annotation and data mining tools. We have developed new genome and annotation browsers using JBrowse and WebApollo for two Bos taurus genome assemblies, the reference genome assembly (UMD3.1.1) and the alternate genome assembly (Btau_4.6.1). Annotation tools have been customized to highlight priority genes for annotation, and to aid annotators in selecting gene evidence tracks from 91 tissue specific RNAseq datasets. We have also developed BovineMine, based on the InterMine data warehousing system, to integrate the bovine genome, annotation, QTL, SNP and expression data with external sources of orthology, gene ontology, gene interaction and pathway information. BovineMine provides powerful query building tools, as well as customized query templates, and allows users to analyze and download genome-wide datasets. With BovineMine, bovine researchers can use orthology to leverage the curated gene pathways of model organisms, such as human, mouse and rat. BovineMine will be especially useful for gene ontology and pathway analyses in conjunction with GWAS and QTL studies.

  19. Manufacturing and automation

    Directory of Open Access Journals (Sweden)

    Ernesto Córdoba Nieto

    2010-04-01

    Full Text Available The article presents concepts and definitions from different sources concerning automation. The work approaches automation by virtue of the author’s experience in manufacturing production; why and how automation projects are embarked upon is considered. Technological reflection regarding the progressive advances or stages of automation in the production area is stressed. Coriat and Freyssenet’s thoughts about and approaches to the problem of automation and its current state are taken and examined, especially those referring to the problem’s relationship with reconciling the level of automation with the flexibility and productivity demanded by competitive, worldwide manufacturing.

  20. A Comparative Study on Serial and Parallel Web Content Mining

    Directory of Open Access Journals (Sweden)

    Binayak Panda

    2016-03-01

    Full Text Available The World Wide Web (WWW) is a repository that serves every individual's needs, ranging from education to entertainment. From the user's point of view, however, getting relevant information for one particular context is time consuming and far from easy, because of the volume of data, which is unstructured, distributed and dynamic in nature. The extraction of relevant information for a particular context can be automated, a task known as Web Content Mining. The efficiency of this automation depends on the validity of the expected outcome as well as on the amount of processing time. The acceptability of the outcome depends on the user or the user's policy, while the processing time depends on the methodology of Web Content Mining. In this work a comparative study has been carried out between Serial Web Content Mining and Parallel Web Content Mining. The work also focuses on a framework for implementing parallelism in Web Content Mining.
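
    To make the serial/parallel contrast concrete, the sketch below scores a set of already fetched pages for relevance, first in a plain loop and then with a thread pool. The URLs, page texts and keyword-count scoring are toy placeholders, not the framework studied in the paper.

    # Toy contrast between serial and parallel web content mining over
    # pre-fetched page texts; the scoring function stands in for real
    # content extraction.
    from concurrent.futures import ThreadPoolExecutor

    def score_page(item):
        url, text = item
        keywords = {"mining", "automation"}
        words = text.lower().split()
        return url, sum(words.count(k) for k in keywords)

    pages = {
        "http://example.org/a": "Web content mining and automation of extraction",
        "http://example.org/b": "Unrelated entertainment content",
    }

    serial = [score_page(item) for item in pages.items()]      # one page at a time
    with ThreadPoolExecutor(max_workers=4) as pool:            # same work, fanned out
        parallel = list(pool.map(score_page, pages.items()))

    print(serial)
    print(parallel)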

  1. ONTOLOGY BASED DATA MINING METHODOLOGY FOR DISCRIMINATION PREVENTION

    Directory of Open Access Journals (Sweden)

    Nandana Nagabhushana

    2014-09-01

    Full Text Available Data Mining is being increasingly used in the automation of decision-making processes, which involve the extraction and discovery of information hidden in large volumes of collected data. Nonetheless, negative perceptions such as privacy invasion and potential discrimination hinder the use of data mining methodologies in software systems that employ automated decision making. Loan granting, employment, insurance premium calculation, admissions in educational institutions, etc., can make use of data mining to effectively prevent human biases pertaining to attributes such as gender, nationality and race in critical decision making. The proposed methodology prevents discriminatory rules that ensue from the presence of information about sensitive discriminatory attributes in the data itself. The proposal is novel in two respects: first, the rule mining technique is based on ontologies; second, mined rules that are classified as discriminatory are generalized and transformed into non-discriminatory ones.
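
    The sketch below shows only the simplest possible flavor of this idea: flagging rules whose antecedent uses a sensitive attribute and dropping that attribute. It is a hypothetical stand-in, not the ontology-based generalization the paper proposes, and the rule format is invented for illustration.

    # Hypothetical stand-in for discrimination-aware rule post-processing:
    # remove sensitive attributes from rule antecedents. The paper's actual
    # method generalizes rules using ontologies rather than dropping attributes.
    SENSITIVE = {"gender", "nationality", "race"}

    def sanitize_rules(rules):
        cleaned = []
        for antecedent, consequent in rules:
            kept = {a: v for a, v in antecedent.items() if a not in SENSITIVE}
            cleaned.append((kept, consequent))
        return cleaned

    rules = [({"gender": "F", "income": "low"}, "deny_loan"),
             ({"income": "high"}, "grant_loan")]
    print(sanitize_rules(rules))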

  2. Configuration Management Automation (CMA)

    Data.gov (United States)

    Department of Transportation — Configuration Management Automation (CMA) will provide an automated, integrated enterprise solution to support CM of FAA NAS and Non-NAS assets and investments. CMA...

  3. Coastal mining

    Science.gov (United States)

    Bell, Peter M.

    The Exclusive Economic Zone (EEZ) declared by President Reagan in March 1983 has met with a mixed response from those who would benefit from a guaranteed, 200-nautical-mile (370-km) protected underwater mining zone off the coasts of the United States and its possessions. On the one hand, the U.S. Department of the Interior is looking ahead and has been very successful in safeguarding important natural resources that will be needed in the coming decades. On the other hand, the mining industry is faced with a depressed metals and mining market. A report of the Exclusive Economic Zone Symposium held in November 1983 by the U.S. Geological Survey, the Minerals Management Service, and the Bureau of Mines described the mixed response as: “ … The Department of Interior … raring to go into promotion of deep-sea mining but industrial consortia being very pessimistic about the program, at least for the next 30 or so years.” (Chemical & Engineering News, February 5, 1983).

  4. Coal Mines, Active - Longwall Mining Panels

    Data.gov (United States)

    NSGIC GIS Inventory (aka Ramona) — Coal mining has occurred in Pennsylvania for over a century. A method of coal mining known as Longwall Mining has become more prevalent in recent decades. Longwall...

  5. Genome mining in Sorangium cellulosum So ce56: identification and characterization of the homologous electron transfer proteins of a myxobacterial cytochrome P450.

    Science.gov (United States)

    Ewen, Kerstin Maria; Hannemann, Frank; Khatri, Yogan; Perlova, Olena; Kappl, Reinhard; Krug, Daniel; Hüttermann, Jürgen; Müller, Rolf; Bernhardt, Rita

    2009-10-16

    Myxobacteria, especially members of the genus Sorangium, are known for their biotechnological potential as producers of pharmaceutically valuable secondary metabolites. The biosynthesis of several of those myxobacterial compounds includes cytochrome P450 activity. Although class I cytochrome P450 enzymes are widespread in bacteria and rely on ferredoxins and ferredoxin reductases as essential electron mediators, the study of these proteins is often neglected. Therefore, we decided to search in the Sorangium cellulosum So ce56 genome for putative interaction partners of cytochromes P450. In this work we report the investigation of eight myxobacterial ferredoxins and two ferredoxin reductases with respect to their activity in cytochrome P450 systems. Intriguingly, we found not only one, but two ferredoxins whose ability to sustain an endogenous So ce56 cytochrome P450 was demonstrated by CYP260A1-dependent conversion of nootkatone. Moreover, we could demonstrate that the two ferredoxins were able to receive electrons from both ferredoxin reductases. These findings indicate that S. cellulosum can alternate between different electron transport pathways to sustain cytochrome P450 activity. PMID:19696019

  6. A genome-wide survey of maize lipid-related genes: candidate genes mining,digital gene expression profiling and colocation with QTL for maize kernel oil

    Institute of Scientific and Technical Information of China (English)

    2010-01-01

    Lipids play an important role in plants due to their abundance and their extensive participation in many metabolic processes. Genes involved in lipid metabolism have been extensively studied in Arabidopsis and other plant species. In this study, a total of 1003 maize lipid-related genes were cloned and annotated, including 42 genes with experimental validation, 732 genes with full-length cDNA and protein sequences in public databases and 229 newly cloned genes. Ninety-seven maize lipid-related genes with tissue-preferential expression were discovered by in silico gene expression profiling based on 1,984,483 maize Expressed Sequence Tags collected from 182 cDNA libraries. Meanwhile, 70 QTL clusters for maize kernel oil were identified, covering 34.5% of the maize genome. Fifty-nine (84%) QTL clusters co-located with at least one lipid-related gene, and the total number of these genes amounted to 147. Interestingly, thirteen genes with kernel-preferential expression profiles fell within QTL clusters for maize kernel oil content. All the maize lipid-related genes identified here may provide good targets for maize kernel oil QTL cloning and thus help us to better understand the molecular mechanism of maize kernel oil accumulation.
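
    The co-location statement above reduces to a simple interval-overlap test between gene coordinates and QTL cluster intervals. The sketch below shows that test; the chromosome names and coordinates are invented for illustration and are not taken from the study.

    # Sketch of QTL co-location: a lipid-related gene co-locates with a kernel-oil
    # QTL cluster if its interval overlaps the cluster interval on the same chromosome.
    def colocated_genes(genes, qtl_clusters):
        hits = []
        for gene, (chrom, start, end) in genes.items():
            for qtl, (q_chrom, q_start, q_end) in qtl_clusters.items():
                if chrom == q_chrom and start <= q_end and end >= q_start:
                    hits.append((gene, qtl))
        return hits

    genes = {"lipid_gene_1": ("chr1", 1_200_000, 1_205_000)}
    qtl_clusters = {"oil_QTL_cluster_3": ("chr1", 1_000_000, 2_000_000)}
    print(colocated_genes(genes, qtl_clusters))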

  7. Workflow automation architecture standard

    Energy Technology Data Exchange (ETDEWEB)

    Moshofsky, R.P.; Rohen, W.T. [Boeing Computer Services Co., Richland, WA (United States)

    1994-11-14

    This document presents an architectural standard for application of workflow automation technology. The standard includes a functional architecture, process for developing an automated workflow system for a work group, functional and collateral specifications for workflow automation, and results of a proof of concept prototype.

  8. Data mining and education.

    Science.gov (United States)

    Koedinger, Kenneth R; D'Mello, Sidney; McLaughlin, Elizabeth A; Pardos, Zachary A; Rosé, Carolyn P

    2015-01-01

    An emerging field of educational data mining (EDM) is building on and contributing to a wide variety of disciplines through analysis of data coming from various educational technologies. EDM researchers are addressing questions of cognition, metacognition, motivation, affect, language, social discourse, etc. using data from intelligent tutoring systems, massive open online courses, educational games and simulations, and discussion forums. The data include detailed action and timing logs of student interactions in user interfaces such as graded responses to questions or essays, steps in rich problem solving environments, games or simulations, discussion forum posts, or chat dialogs. They might also include external sensors such as eye tracking, facial expression, body movement, etc. We review how EDM has addressed the research questions that surround the psychology of learning with an emphasis on assessment, transfer of learning and model discovery, the role of affect, motivation and metacognition on learning, and analysis of language data and collaborative learning. For example, we discuss (1) how different statistical assessment methods were used in a data mining competition to improve prediction of student responses to intelligent tutor tasks, (2) how better cognitive models can be discovered from data and used to improve instruction, (3) how data-driven models of student affect can be used to focus discussion in a dialog-based tutoring system, and (4) how machine learning techniques applied to discussion data can be used to produce automated agents that support student learning as they collaborate in a chat room or a discussion board.

  9. Data mining

    CERN Document Server

    Gorunescu, Florin

    2011-01-01

    The knowledge discovery process is as old as Homo sapiens. Until some time ago, this process was solely based on the 'natural personal' computer provided by Mother Nature. Fortunately, in recent decades the problem has begun to be solved based on the development of the Data mining technology, aided by the huge computational power of the 'artificial' computers. Digging intelligently in different large databases, data mining aims to extract implicit, previously unknown and potentially useful information from data, since 'knowledge is power'. The goal of this book is to provide, in a friendly way

  10. Mining Review

    Science.gov (United States)

    ,

    2013-01-01

    In 2012, the estimated value of mineral production increased in the United States for the third consecutive year. Production and prices increased for most industrial mineral commodities mined in the United States. While production for most metals remained relatively unchanged, with the notable exception of gold, the prices for most metals declined. Minerals remained fundamental to the U.S. economy, contributing to the real gross domestic product (GDP) at several levels, including mining, processing and manufacturing finished products. Minerals’ contribution to the GDP increased for the second consecutive year.

  11. Frontiers of biomedical text mining: current progress

    OpenAIRE

    Zweigenbaum, Pierre; Demner-Fushman, Dina; Hong YU; Cohen, Kevin B.

    2007-01-01

    It is now almost 15 years since the publication of the first paper on text mining in the genomics domain, and decades since the first paper on text mining in the medical domain. Enormous progress has been made in the areas of information retrieval, evaluation methodologies and resource construction. Some problems, such as abbreviation-handling, can essentially be considered solved problems, and others, such as identification of gene mentions in text, seem likely to be solved soon. However, a ...

  12. In silico genome wide mining of conserved and novel miRNAs in the brain and pineal gland of Danio rerio using small RNA sequencing data.

    Science.gov (United States)

    Agarwal, Suyash; Nagpure, Naresh Sahebrao; Srivastava, Prachi; Kushwaha, Basdeo; Kumar, Ravindra; Pandey, Manmohan; Srivastava, Shreya

    2016-03-01

    MicroRNAs (miRNAs) are small, non-coding RNA molecules that bind to the mRNA of the target genes and regulate the expression of the gene at the post-transcriptional level. Zebrafish is an economically important freshwater fish species globally considered as a good predictive model for studying human diseases and development. The present study focused on uncovering known as well as novel miRNAs, target prediction of the novel miRNAs and the differential expression of the known miRNA using the small RNA sequencing data of the brain and pineal gland (dark and light treatments) obtained from NCBI SRA. A total of 165, 151 and 145 known zebrafish miRNAs were found in the brain, pineal gland (dark treatment) and pineal gland (light treatment), respectively. Chromosomes 4 and 5 of zebrafish reference assembly GRCz10 were found to contain maximum number of miR genes. The miR-181a and miR-182 were found to be highly expressed in terms of number of reads in the brain and pineal gland, respectively. Other ncRNAs, such as tRNA, rRNA and snoRNA, were curated against Rfam. Using GRCz10 as reference, the subsequent bioinformatic analyses identified 25, 19 and 9 novel miRNAs from the brain, pineal gland (dark treatment) and pineal gland (light treatment), respectively. Targets of the novel miRNAs were identified, based on sequence complementarity between miRNAs and mRNA, by searching for antisense hits in the 3'-UTR of reference RNA sequences of the zebrafish. The discovery of novel miRNAs and their targets in the zebrafish genome can be a valuable scientific resource for further functional studies not only in zebrafish but also in other economically important fishes. PMID:26981358
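
    Target prediction "based on sequence complementarity between miRNAs and mRNA" can be illustrated with a simple antisense seed search in a 3'-UTR, as sketched below. The sequences are toy examples (not curated miR-181a or miR-182 entries), and real pipelines add thermodynamic and conservation scoring on top of this kind of match.

    # Toy antisense seed search: find positions in a 3'-UTR that are perfectly
    # complementary to the miRNA seed (positions 2-8, a common convention).
    COMPLEMENT = str.maketrans("AUGC", "UACG")

    def reverse_complement_rna(seq):
        return seq.translate(COMPLEMENT)[::-1]

    def seed_matches(mirna, utr, seed_len=7):
        seed = mirna[1:1 + seed_len]
        site = reverse_complement_rna(seed)
        return [i for i in range(len(utr) - len(site) + 1) if utr[i:i + len(site)] == site]

    mirna = "UGGAAUGUAAAGAAGUAUGUAU"   # toy miRNA sequence
    utr = "AAAUACAUUCCAGGGUACAUUCCA"   # toy 3'-UTR containing two seed sites
    print(seed_matches(mirna, utr))    # [4, 16]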

  13. Literature classification for semi-automated updating of biological knowledgebases

    DEFF Research Database (Denmark)

    Olsen, Lars Rønn; Kudahl, Ulrich Johan; Winther, Ole;

    2013-01-01

    abstracts yielded classification accuracy of 0.95, thus showing significant value in support of data extraction from the literature. Conclusion: We here propose a conceptual framework for semi-automated extraction of epitope data embedded in scientific literature using principles from text mining and...

  14. Shoe-String Automation

    Energy Technology Data Exchange (ETDEWEB)

    Duncan, M.L.

    2001-07-30

    Faced with a downsizing organization, serious budget reductions and retirement of key metrology personnel, maintaining capabilities to provide necessary services to our customers was becoming increasingly difficult. It appeared that the only solution was to automate some of our more personnel-intensive processes; however, it was crucial that the most personnel-intensive candidate process be automated, at the lowest price possible and with the lowest risk of failure. This discussion relates factors in the selection of the Standard Leak Calibration System for automation, the methods of automation used to provide the lowest-cost solution and the benefits realized as a result of the automation.

  15. GAMOLA: a new local solution for sequence annotation and analyzing draft and finished prokaryotic genomes.

    Science.gov (United States)

    Altermann, Eric; Klaenhammer, Todd R

    2003-01-01

    Laboratories working with draft phase genomes have specific software needs, such as the unattended processing of hundreds of single scaffolds and subsequent sequence annotation. In addition, it is critical to follow the "movement" and the manual annotation of single open reading frames (ORFs) within the successive sequence updates. Even with finished genomes, regular database updates can lead to significant changes in the annotation of single ORFs. In functional genomics it is important to mine data and identify new genetic targets rapidly and easily. Often there is no need for sophisticated relational databases (RDB) that greatly reduce the system-independent access of the results. Another aspect is the internet dependency of most software packages. If users are working with confidential data, this dependency poses a security issue. GAMOLA was designed to handle the numerous scaffolds and changing contents of draft phase genomes in an automated process and stores the results for each predicted ORF in flatfile databases. In addition, annotation transfers, ORF designation tracking, Blast comparisons, and primer design for whole genome microarrays have been implemented. The software is available under the license of North Carolina State University. A website and a downloadable example are accessible under (http://fsweb2.schaub.ncsu.edu/TRKwebsite/index.htm). PMID:14506845

  16. Data mining concepts and techniques

    CERN Document Server

    Han, Jiawei

    2005-01-01

    Our ability to generate and collect data has been increasing rapidly. Not only are all of our business, scientific, and government transactions now computerized, but the widespread use of digital cameras, publication tools, and bar codes also generate data. On the collection side, scanned text and image platforms, satellite remote sensing systems, and the World Wide Web have flooded us with a tremendous amount of data. This explosive growth has generated an even more urgent need for new techniques and automated tools that can help us transform this data into useful information and knowledge.Like the first edition, voted the most popular data mining book by KD Nuggets readers, this book explores concepts and techniques for the discovery of patterns hidden in large data sets, focusing on issues relating to their feasibility, usefulness, effectiveness, and scalability. However, since the publication of the first edition, great progress has been made in the development of new data mining methods, systems, and app...

  17. Mining Method

    Energy Technology Data Exchange (ETDEWEB)

    Kim, Young Shik; Lee, Kyung Woon; Kim, Oak Hwan; Kim, Dae Kyung [Korea Institute of Geology Mining and Materials, Taejon (Korea, Republic of)

    1996-12-01

    The shrinking coal market has forced the coal industry to make exceptional rationalization and restructuring efforts since the end of the eighties. Competition from crude oil and natural gas has been compounded by growing pressure from rising wages and rising production costs as the workings get deeper. To improve the competitive position of the coal mines against oil and gas through cost reduction, studies to improve the mining system have been carried out. To find the fields most in need of improvement, the technologies used in Tae Bak Colliery, which was selected as one of the long-running mines, were investigated and analyzed. The mining method appeared to be the field where improvement would do most to reduce production cost. The present method, the so-called inseam roadway caving method, is currently used to extract the steep and thick seam; however, it has several drawbacks. To solve these problems, two mining methods are suggested, one as a long-term and one as a short-term solution, respectively. The inseam roadway caving method with long-hole blasting is a variant of the present inseam roadway caving method, modified by replacing timber sets with steel arch sets and the shovel loaders with chain conveyors; long-hole blasting is introduced to promote caving. The pillar caving method with chock supports uses chock supports set in the cross-cut from the hanging wall to the footwall. Two single chain conveyors are needed: one installed in front of the chock supports to clear coal from the cutting face, the other installed behind the supports to transport the caved coal. This method is superior to the previous one in terms of safety from water inrushes, production rate and productivity; its only drawback is that it requires more investment. (author). 14 tabs., 34 figs.

  18. EuCAP, a Eukaryotic Community Annotation Package, and its application to the rice genome

    OpenAIRE

    Hamilton John P; Campbell Matthew; Thibaud-Nissen Françoise; Zhu Wei; Buell C

    2007-01-01

    Abstract Background Despite the improvements of tools for automated annotation of genome sequences, manual curation at the structural and functional level can provide an increased level of refinement to genome annotation. The Institute for Genomic Research Rice Genome Annotation (hereafter named the Osa1 Genome Annotation) is the product of an automated pipeline and, for this reason, will benefit from the input of biologists with expertise in rice and/or particular gene families. Leveraging k...

  19. Automated Single Cell Data Decontamination Pipeline

    Energy Technology Data Exchange (ETDEWEB)

    Tennessen, Kristin [Lawrence Berkeley National Lab. (LBNL), Walnut Creek, CA (United States). Dept. of Energy Joint Genome Inst.; Pati, Amrita [Lawrence Berkeley National Lab. (LBNL), Walnut Creek, CA (United States). Dept. of Energy Joint Genome Inst.

    2014-03-21

    Recent technological advancements in single-cell genomics have encouraged the classification and functional assessment of microorganisms from a wide span of the biosphere's phylogeny.1,2 Environmental processes of interest to the DOE, such as bioremediation and carbon cycling, can be elucidated through the genomic lens of these unculturable microbes. However, contamination can occur at various stages of the single-cell sequencing process. Contaminated data can lead to wasted time and effort on meaningless analyses, inaccurate or erroneous conclusions, and pollution of public databases. A fully automated decontamination tool is necessary to prevent these instances and increase the throughput of the single-cell sequencing process.

  20. Application of Modern Tools and Techniques for Mine Safety & Disaster Management

    Science.gov (United States)

    Kumar, Dheeraj

    2016-04-01

    The implementation of novel systems and the adoption of improvised equipment in mines help mining companies in two important ways: enhanced mine productivity and improved worker safety. There is a substantial need for the adoption of state-of-the-art automation technologies in mines to ensure the safety and protect the health of mine workers. With the advent of new autonomous equipment used in the mine, inefficiencies are reduced by limiting human inconsistencies and errors. The desired increase in productivity at a mine can sometimes be achieved by changing only a few simple variables. Significant developments have been made in the areas of surface and underground communication, robotics, smart sensors, tracking systems, mine gas monitoring systems, ground movements, etc. Advancements in information technology in the form of the internet, GIS, remote sensing, satellite communication, etc. have proved to be important tools for hazard reduction and disaster management. This paper is mainly focused on issues pertaining to mine safety and disaster management and on some of the recent innovations in mine automation that could be deployed in mines for safe mining operations and for avoiding any unforeseen mine disaster.

  1. TOP-10 DATA MINING CASE STUDIES

    OpenAIRE

    GABOR MELLI; XINDONG WU; PAUL BEINAT; FRANCESCO BONCHI; LONGBING CAO; RONG DUAN; CHRISTOS FALOUTSOS; RAYID GHANI; BRENDAN KITTS; BART GOETHALS; GEOFF MCLACHLAN; JIAN PEI; ASHOK SRIVASTAVA; OSMAR ZAÏANE

    2012-01-01

    We report on the panel discussion held at the ICDM'10 conference on the top 10 data mining case studies in order to provide a snapshot of where and how data mining techniques have made significant real-world impact. The tasks covered by 10 case studies range from the detection of anomalies such as cancer, fraud, and system failures to the optimization of organizational operations, and include the automated extraction of information from unstructured sources. From the 10 cases we find that sup...

  2. Mining machinery

    Energy Technology Data Exchange (ETDEWEB)

    Outram, J.K.

    1978-11-23

    This invention concerns a supporting and transport device for heavy mining auxiliary equipment such as hydraulic units, switchgear, transformers, etc., which must be erected nearby to operate large mines and which has to be moved on as the workings progress. The device consists of long steel structures of girders, whose upper/outer part, made as a bridge unit, surrounds the conveyor part lying below it over its whole length. The bridge, with the auxiliary equipment fixed to it, can be lifted at the ends and in the centre by jacks. The bridge anchored to the floor of the seam engages with guide wheels under a flat rail, distributed on both sides along its length, with the rails running on the left and right along its length. Both rails form a track like that of a crane trolley, on which the conveyor structure under the bridge unit can be moved forward a certain distance. By lowering the bridge on its jacks, other pairs of wheels engage the same rails from above, so that the bridge with the auxiliary equipment can also be moved along on the support structure resting on the floor of the seam. By repeating this process all the auxiliary equipment is brought close to the mining face by moving the required distance.

  3. A genetic programming based business process mining approach

    OpenAIRE

    Turner, Christopher James

    2009-01-01

    As business processes become ever more complex there is a need for companies to understand the processes they already have in place. To undertake this manually would be time consuming. The practice of process mining attempts to automatically construct the correct representation of a process based on a set of process execution logs. The aim of this research is to develop a genetic programming based approach for business process mining. The focus of this research is on automated/semi automat...

  4. Data mining meets economic analysis: opportunities and challenges

    OpenAIRE

    Baicoianu, A.; Dumitrescu, S

    2010-01-01

    Along with the increase of economic globalization and the evolution of information technology, data mining has become an important approach for economic data analysis. As a result, there has been a critical need for automated approaches to effective and efficient usage of massive amount of economic data, in order to support both companies’ and individuals’ strategic planning and investment decision-making. The goal of this paper is to illustrate the impact of data mining techniques on sales, ...

  5. Advances in Computer, Communication, Control and Automation

    CERN Document Server

    2011 International Conference on Computer, Communication, Control and Automation

    2012-01-01

    The volume includes a set of selected papers extended and revised from the 2011 International Conference on Computer, Communication, Control and Automation (3CA 2011), which was held in Zhuhai, China, November 19-20, 2011. Topics covered in this volume include signal and image processing, speech and audio processing, video processing and analysis, artificial intelligence, computing and intelligent systems, machine learning, sensor and neural networks, knowledge discovery and data mining, fuzzy mathematics and applications, knowledge-based systems, hybrid systems modeling and design, risk analysis and management, and system modeling and simulation. We hope that researchers, graduate students and other interested readers benefit scientifically from the proceedings and also find it stimulating in the process.

  6. Mining Patient Journeys From Healthcare Narratives

    OpenAIRE

    Dehghan, Azad

    2015-01-01

    The aim of the thesis is to investigate the feasibility of using text mining methods to reconstruct patient journeys from unstructured clinical narratives. A novel method to extract and represent patient journeys is proposed and evaluated in this thesis. A composition of methods was designed, developed and evaluated to this end, including health-related concept extraction, temporal information extraction, and concept clustering and automated work-flow generation. A suite of methods to ext...

  7. Integrating Data Mining Into Business Intelligence

    OpenAIRE

    Maria Cristina ENACHE

    2006-01-01

    Data Mining is a broad term often used to describe the process of using database technology, modeling techniques, statistical analysis, and machine learning to analyze large amounts of data in an automated fashion to discover hidden patterns and predictive information in the data. By building highly complex and sophisticated statistical and mathematical models, organizations can gain new insight into their activities. The purpose of this document is to provide users with a background of a few...

  8. Genome bioinformatics of tomato and potato

    NARCIS (Netherlands)

    Datema, E.

    2011-01-01

    In the past two decades genome sequencing has developed from a laborious and costly technology employed by large international consortia to a widely used, automated and affordable tool used worldwide by many individual research groups. Genome sequences of many food animals and crop plants have been

  9. GDR surface mining technology - a programme for complicated geological and climatic conditions of surface mining

    Energy Technology Data Exchange (ETDEWEB)

    Rudolf, W.; Klose, W.

    1979-08-01

    This paper describes surface mining as an expanding technology with a work productivity 2.5 to 6.0 times higher than in underground mining. Increasing amounts of overburden can be removed, from 100,000 m3 to 300,000 m3 per day, by large excavation complexes. TAKRAF had exported 300 surface mining machines to various countries as of 1979. Surface mining technology is continually being improved with developments in equipment, such as better service life, unit construction and interchangeability of parts, higher capacity, automation, climatic resistance to 60 C, etc. The TAKRAF equipment series are introduced including information on their range of capacity. TAKRAF bucket wheel and bucket chain excavators, conveyor belt systems, overburden conveyor bridges and swing chutes are described. Equipment for briquetting plants, brown coal enrichment and power plants is also produced by TAKRAF.

  10. Coal Mine Permit Boundaries

    Data.gov (United States)

    Earth Data Analysis Center, University of New Mexico — ESRI ArcView shapefile depicting New Mexico coal mines permitted under the Surface Mining Control and Reclamation Act of 1977 (SMCRA), by either the NM Mining...

  11. Exploration and Mining Roadmap

    Energy Technology Data Exchange (ETDEWEB)

    none,

    2002-09-01

    This Exploration and Mining Technology Roadmap represents the third roadmap for the Mining Industry of the Future. It is based upon the results of the Exploration and Mining Roadmap Workshop held May 10-11, 2001.

  12. Genome-wide analysis of the rice and arabidopsis non-specific lipid transfer protein (nsLtp gene families and identification of wheat nsLtp genes by EST data mining

    Directory of Open Access Journals (Sweden)

    Chantret Nathalie

    2008-02-01

    Full Text Available Abstract Background Plant non-specific lipid transfer proteins (nsLTPs) are encoded by multigene families and possess physiological functions that remain unclear. Our objective was to characterize the complete nsLtp gene family in rice and arabidopsis and to perform wheat EST database mining for nsLtp gene discovery. Results In this study, we carried out a genome-wide analysis of nsLtp gene families in Oryza sativa and Arabidopsis thaliana and identified 52 rice nsLtp genes and 49 arabidopsis nsLtp genes. Here we present a complete overview of the genes and deduced protein features. Tandem duplication repeats, which represent 26 out of the 52 rice nsLtp genes and 18 out of the 49 arabidopsis nsLtp genes identified, support the complexity of the nsLtp gene families in these species. Phylogenetic analysis revealed that rice and arabidopsis nsLTPs are clustered in nine different clades. In addition, we performed comparative analysis of rice nsLtp genes and wheat (Triticum aestivum) EST sequences indexed in the UniGene database. We identified 156 putative wheat nsLtp genes, among which 91 were found in the 'Chinese Spring' cultivar. The 122 wheat non-redundant nsLTPs were organized in eight types and 33 subfamilies. Based on the observation that seven of these clades were present in arabidopsis, rice and wheat, we conclude that the major functional diversification within the nsLTP family predated the monocot/dicot divergence. In contrast, there are no type VII nsLTPs in arabidopsis and type IX nsLTPs were only identified in arabidopsis. The reason for the larger number of nsLtp genes in wheat may simply be due to the hexaploid state of wheat but may also reflect extensive duplication of gene clusters as observed on rice chromosomes 11 and 12 and arabidopsis chromosome 5. Conclusion Our current study provides fundamental information on the organization of the rice, arabidopsis and wheat nsLtp gene families. The multiplicity of nsLTP types provide new

  13. Automate functional testing

    Directory of Open Access Journals (Sweden)

    Ramesh Kalindri

    2014-06-01

    Full Text Available Currently, software engineers are increasingly turning to the option of automating functional tests, but they are not always successful in this endeavor. The reasons range from poor planning to cost overruns in the process. Some principles that can guide teams in automating these tests are described in this article.

  14. Automation in Warehouse Development

    NARCIS (Netherlands)

    Hamberg, R.; Verriet, J.

    2012-01-01

    The warehouses of the future will come in a variety of forms, but with a few common ingredients. Firstly, human operational handling of items in warehouses is increasingly being replaced by automated item handling. Extended warehouse automation counteracts the scarcity of human operators and support

  15. Work and Programmable Automation.

    Science.gov (United States)

    DeVore, Paul W.

    A new industrial era based on electronics and the microprocessor has arrived, an era that is being called intelligent automation. Intelligent automation, in the form of robots, replaces workers, and the new products, using microelectronic devices, require significantly less labor to produce than the goods they replace. The microprocessor thus…

  16. Library Automation Style Guide.

    Science.gov (United States)

    Gaylord Bros., Liverpool, NY.

    This library automation style guide lists specific terms and names often used in the library automation industry. The terms and/or acronyms are listed alphabetically and each is followed by a brief definition. The guide refers to the "Chicago Manual of Style" for general rules, and a notes section is included for the convenience of individual…

  17. Mining review

    Science.gov (United States)

    McCartan, L.; Morse, D.E.; Plunkert, P.A.; Sibley, S.F.

    2004-01-01

    The average annual growth rate of real gross domestic product (GDP) from the third quarter of 2001 through the second quarter of 2003 in the United States was about 2.6 percent. GDP growth rates in the third and fourth quarters of 2003 were about 8 percent and 4 percent, respectively. The upward trends in many sectors of the U.S. economy in 2003, however, were shared by few of the mineral materials industries. Annual output declined in most nonfuel mining and mineral processing industries, although there was an upward turn toward yearend as prices began to increase.

  18. Corrosion of friction rock stabilizers in selected uranium and copper mine waters. Report of Investigations/1984

    International Nuclear Information System (INIS)

    The Bureau of Mines evaluated corrosion resistance of Split Set friction rock stabilizer mine roof bolts to aid in better prediction of useful service life. Electrochemical corrosion testing was conducted utilizing an automated corrosion measurement system. Natural and/or synthetic mine waters from four uranium and two copper mines were the test media for the two types of high-strength, low-alloy (HSLA) steels from which Split Set stabilizers are manufactured, and for galvanized steel. Tests were conducted with waters of minimum and maximum dissolved oxygen content at in-mine water temperatures. Retrieved Split Set stabilizers were also evaluated for property changes

  19. Development of On-Board Fluid Analysis for the Mining Industry - Final report

    Energy Technology Data Exchange (ETDEWEB)

    Pardini, Allan F.

    2005-08-16

    Pacific Northwest National Laboratory (PNNL: Operated by Battelle Memorial Institute for the Department of Energy) is working with the Department of Energy (DOE) to develop technology for the US mining industry. PNNL was awarded a three-year program to develop automated on-board/in-line or on-site oil analysis for the mining industry.

  20. Automation in immunohematology.

    Science.gov (United States)

    Bajpai, Meenu; Kaur, Ravneet; Gupta, Ekta

    2012-07-01

    There have been rapid technological advances in blood banking in South Asian region over the past decade with an increasing emphasis on quality and safety of blood products. The conventional test tube technique has given way to newer techniques such as column agglutination technique, solid phase red cell adherence assay, and erythrocyte-magnetized technique. These new technologies are adaptable to automation and major manufacturers in this field have come up with semi and fully automated equipments for immunohematology tests in the blood bank. Automation improves the objectivity and reproducibility of tests. It reduces human errors in patient identification and transcription errors. Documentation and traceability of tests, reagents and processes and archiving of results is another major advantage of automation. Shifting from manual methods to automation is a major undertaking for any transfusion service to provide quality patient care with lesser turnaround time for their ever increasing workload. This article discusses the various issues involved in the process.

  1. Automation in Immunohematology

    Directory of Open Access Journals (Sweden)

    Meenu Bajpai

    2012-01-01

    Full Text Available There have been rapid technological advances in blood banking in South Asian region over the past decade with an increasing emphasis on quality and safety of blood products. The conventional test tube technique has given way to newer techniques such as column agglutination technique, solid phase red cell adherence assay, and erythrocyte-magnetized technique. These new technologies are adaptable to automation and major manufacturers in this field have come up with semi and fully automated equipments for immunohematology tests in the blood bank. Automation improves the objectivity and reproducibility of tests. It reduces human errors in patient identification and transcription errors. Documentation and traceability of tests, reagents and processes and archiving of results is another major advantage of automation. Shifting from manual methods to automation is a major undertaking for any transfusion service to provide quality patient care with lesser turnaround time for their ever increasing workload. This article discusses the various issues involved in the process.

  3. Automation in Warehouse Development

    CERN Document Server

    Verriet, Jacques

    2012-01-01

    The warehouses of the future will come in a variety of forms, but with a few common ingredients. Firstly, human operational handling of items in warehouses is increasingly being replaced by automated item handling. Extended warehouse automation counteracts the scarcity of human operators and supports the quality of picking processes. Secondly, the development of models to simulate and analyse warehouse designs and their components facilitates the challenging task of developing warehouses that take into account each customer’s individual requirements and logistic processes. Automation in Warehouse Development addresses both types of automation from the innovative perspective of applied science. In particular, it describes the outcomes of the Falcon project, a joint endeavour by a consortium of industrial and academic partners. The results include a model-based approach to automate warehouse control design, analysis models for warehouse design, concepts for robotic item handling and computer vision, and auton...

  4. Advances in inspection automation

    Science.gov (United States)

    Weber, Walter H.; Mair, H. Douglas; Jansen, Dion; Lombardi, Luciano

    2013-01-01

    This new session at QNDE reflects the growing interest in inspection automation. Our paper describes a newly developed platform that makes the complex NDE automation possible without the need for software programmers. Inspection tasks that are tedious, error-prone or impossible for humans to perform can now be automated using a form of drag and drop visual scripting. Our work attempts to rectify the problem that NDE is not keeping pace with the rest of factory automation. Outside of NDE, robots routinely and autonomously machine parts, assemble components, weld structures and report progress to corporate databases. By contrast, components arriving in the NDT department typically require manual part handling, calibrations and analysis. The automation examples in this paper cover the development of robotic thickness gauging and the use of adaptive contour following on the NRU reactor inspection at Chalk River.

  5. Automated model building

    CERN Document Server

    Caferra, Ricardo; Peltier, Nicholas

    2004-01-01

    This is the first book on automated model building, a discipline of automated deduction that is of growing importance. Although models and their construction are important per se, automated model building has appeared as a natural enrichment of automated deduction, especially in the attempt to capture the human way of reasoning. The book provides an historical overview of the field of automated deduction, and presents the foundations of different existing approaches to model construction, in particular those developed by the authors. Finite and infinite model building techniques are presented. The main emphasis is on calculi-based methods, and relevant practical results are provided. The book is of interest to researchers and graduate students in computer science, computational logic and artificial intelligence. It can also be used as a textbook in advanced undergraduate courses.

  6. Coal Mines, Abandoned - Digitized Mined Areas

    Data.gov (United States)

    NSGIC GIS Inventory (aka Ramona) — Coal mining has occurred in Pennsylvania for over a century. The maps to these coal mines are stored at many various public and private locations (if they still...

  7. Uses of antimicrobial genes from microbial genome

    Science.gov (United States)

    Sorek, Rotem; Rubin, Edward M.

    2013-08-20

    We describe a method for mining microbial genomes to discover antimicrobial genes and proteins having broad spectrum of activity. Also described are antimicrobial genes and their expression products from various microbial genomes that were found using this method. The products of such genes can be used as antimicrobial agents or as tools for molecular biology.

  8. Wikipedia Mining

    Science.gov (United States)

    Nakayama, Kotaro; Ito, Masahiro; Erdmann, Maike; Shirakawa, Masumi; Michishita, Tomoyuki; Hara, Takahiro; Nishio, Shojiro

    Wikipedia, a collaborative Wiki-based encyclopedia, has become a huge phenomenon among Internet users. It covers a huge number of concepts of various fields such as arts, geography, history, science, sports and games. As a corpus for knowledge extraction, Wikipedia's impressive characteristics are not limited to its scale, but also include its dense link structure, URL-based word sense disambiguation, and brief anchor texts. Because of these characteristics, Wikipedia has become a promising corpus and a new frontier for research. In the past few years, a considerable amount of research has been conducted in areas such as semantic relatedness measurement, bilingual dictionary construction, and ontology construction. Extracting machine-understandable knowledge from Wikipedia to enhance the intelligence of computational systems is the main goal of "Wikipedia Mining," a project on CREP (Challenge for Realizing Early Profits) in JSAI. In this paper, we take a comprehensive, panoramic view of Wikipedia Mining research and the current status of our challenge. After that, we will discuss the future vision of this challenge.
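
    As a minimal illustration of link-based semantic relatedness, the sketch below computes a simple Jaccard overlap of inbound links on a toy link graph. This is only one common flavor of the idea, not the specific measures surveyed in the project, and the graph is invented for illustration.

    # Toy link-based relatedness: two articles that share many inbound links are
    # taken to be related. Jaccard overlap of inbound-link sets, on a toy graph.
    def inbound_links(link_graph):
        incoming = {}
        for source, targets in link_graph.items():
            for target in targets:
                incoming.setdefault(target, set()).add(source)
        return incoming

    def relatedness(a, b, incoming):
        in_a, in_b = incoming.get(a, set()), incoming.get(b, set())
        union = in_a | in_b
        return len(in_a & in_b) / len(union) if union else 0.0

    toy_graph = {
        "Data mining": ["Machine learning", "Statistics"],
        "Artificial intelligence": ["Machine learning", "Statistics", "Robotics"],
        "Geography": ["Statistics"],
    }
    incoming = inbound_links(toy_graph)
    print(relatedness("Machine learning", "Statistics", incoming))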

  9. Image Mining: Review and New Challenges

    Directory of Open Access Journals (Sweden)

    Barbora Zahradnikova

    2015-07-01

    Full Text Available Besides new technology, a huge volume of data in various forms has become available to people. Image data represents a keystone of many research areas including medicine, forensic criminology, robotics and industrial automation, meteorology and geography, as well as education. Therefore, obtaining specific information from image databases has become of great importance. Images as a special category of data differ from text data both in their nature and in how they are stored and retrieved. Image Mining as a research field is an interdisciplinary area combining methodologies and knowledge of many branches including data mining, computer vision, image processing, image retrieval, statistics, recognition, machine learning, artificial intelligence, etc. This review focuses on the current image mining approaches and techniques, aiming at widening the possibilities of facial image analysis. The paper reviews the current state of image mining, describes challenges, and identifies directions for future research in the field.

  10. Proceedings: Fourth Workshop on Mining Scientific Datasets

    Energy Technology Data Exchange (ETDEWEB)

    Kamath, C

    2001-07-24

    Commercial applications of data mining in areas such as e-commerce, market-basket analysis, text-mining, and web-mining have taken on a central focus in the KDD community. However, there is a significant amount of innovative data mining work taking place in the context of scientific and engineering applications that is not well represented in the mainstream KDD conferences. For example, scientific data mining techniques are being developed and applied to diverse fields such as remote sensing, physics, chemistry, biology, astronomy, structural mechanics, computational fluid dynamics etc. In these areas, data mining frequently complements and enhances existing analysis methods based on statistics, exploratory data analysis, and domain-specific approaches. On the surface, it may appear that data from one scientific field, say genomics, is very different from another field, such as physics. However, despite their diversity, there is much that is common across the mining of scientific and engineering data. For example, techniques used to identify objects in images are very similar, regardless of whether the images came from a remote sensing application, a physics experiment, an astronomy observation, or a medical study. Further, with data mining being applied to new types of data, such as mesh data from scientific simulations, there is the opportunity to apply and extend data mining to new scientific domains. This one-day workshop brings together data miners analyzing science data and scientists from diverse fields to share their experiences, learn how techniques developed in one field can be applied in another, and better understand some of the newer techniques being developed in the KDD community. This is the fourth workshop on the topic of Mining Scientific Datasets; for information on earlier workshops, see http://www.ahpcrc.org/conferences/. This workshop continues the tradition of addressing challenging problems in a field where the diversity of applications is

  11. Automated recognition of malignancy mentions in biomedical literature

    Directory of Open Access Journals (Sweden)

    Liberman Mark Y

    2006-11-01

    Full Text Available Abstract Background The rapid proliferation of biomedical text makes it increasingly difficult for researchers to identify, synthesize, and utilize developed knowledge in their fields of interest. Automated information extraction procedures can assist in the acquisition and management of this knowledge. Previous efforts in biomedical text mining have focused primarily upon named entity recognition of well-defined molecular objects such as genes, but less work has been performed to identify disease-related objects and concepts. Furthermore, promise has been tempered by an inability to efficiently scale approaches in ways that minimize manual efforts and still perform with high accuracy. Here, we have applied a machine-learning approach previously successful for identifying molecular entities to a disease concept to determine if the underlying probabilistic model effectively generalizes to unrelated concepts with minimal manual intervention for model retraining. Results We developed a named entity recognizer (MTag), an entity tagger for recognizing clinical descriptions of malignancy presented in text. The application uses the machine-learning technique Conditional Random Fields with additional domain-specific features. MTag was tested with 1,010 training and 432 evaluation documents pertaining to cancer genomics. Overall, our experiments resulted in 0.85 precision, 0.83 recall, and 0.84 F-measure on the evaluation set. Compared with a baseline system using string matching of text with a neoplasm term list, MTag performed with a much higher recall rate (92.1% vs. 42.1% recall) and demonstrated the ability to learn new patterns. Application of MTag to all MEDLINE abstracts yielded the identification of 580,002 unique and 9,153,340 overall mentions of malignancy. Significantly, addition of an extensive lexicon of malignancy mentions as a feature set for extraction had minimal impact in performance. Conclusion Together, these results suggest that the
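
    The reported precision, recall and F-measure are related by the standard balanced F-measure formula; the short check below confirms that 0.85 precision and 0.83 recall give the reported 0.84.

    # Balanced F-measure F1 = 2PR / (P + R); with the values reported above
    # this reproduces the stated 0.84.
    def f_measure(precision, recall):
        return 2 * precision * recall / (precision + recall)

    print(round(f_measure(0.85, 0.83), 2))   # 0.84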

  12. Chef infrastructure automation cookbook

    CERN Document Server

    Marschall, Matthias

    2013-01-01

    Chef Infrastructure Automation Cookbook contains practical recipes on everything you will need to automate your infrastructure using Chef. The book is packed with illustrated code examples to automate your server and cloud infrastructure.The book first shows you the simplest way to achieve a certain task. Then it explains every step in detail, so that you can build your knowledge about how things work. Eventually, the book shows you additional things to consider for each approach. That way, you can learn step-by-step and build profound knowledge on how to go about your configuration management

  13. Genome Sequence of Mushroom Soft-Rot Pathogen Janthinobacterium agaricidamnosum

    OpenAIRE

    Graupner, Katharina; Lackner, Gerald; Hertweck, Christian

    2015-01-01

    Janthinobacterium agaricidamnosum causes soft-rot disease of the cultured button mushroom Agaricus bisporus and is thus responsible for agricultural losses. Here, we present the genome sequence of J. agaricidamnosum DSM 9628. The 5.9-Mb genome harbors several secondary metabolite biosynthesis gene clusters, which renders this neglected bacterium a promising source for genome mining approaches.

  14. Mission-Critical Mobile Broadband Communications in Open Pit Mines

    DEFF Research Database (Denmark)

    Uzeda Garcia, Luis Guilherme; Portela Lopes de Almeida, Erika; Barbosa, Viviane S. B.;

    2016-01-01

    The need for continuous safety improvements and increased operational efficiency is driving the mining industry through a transition towards automated operations. From a communications perspective, this transition introduces a new set of high-bandwidth business- and mission-critical applications....... By means of an illustrative example the potential benefits of this framework are evaluated....

  15. Mining machine safari

    Energy Technology Data Exchange (ETDEWEB)

    Woof, M.

    1998-11-01

    New South African and other mining equipment on display at the Electra 98 exhibition is described. Products include: cutting machines; shovels; crushing machines; drilling equipment; control systems; moon buggy inspection vehicles; remote control underground mining machines; longwall shearers; mining software; scrapedozers; continuous miners; sprays; mine haulage equipment; milling machines; flotation plant; mud removal systems; chains; vehicle exhaust filters and continuous miner monitoring systems.

  16. Mining lore : Bankhead, mining for coal

    Energy Technology Data Exchange (ETDEWEB)

    Nichiporuk, A.

    2007-09-15

    Bankhead, Alberta was one of the first communities to be established because of mining. It was founded in 1903 by the Canadian Pacific Railway (CPR) on Cascade Mountain in the Bow River Valley of Banff National Park. In 1904, Mine No. 80 was opened by the Pacific Coal Company to fuel CPR's steam engines. In order to avoid flooding the mine, the decision was made to mine up the steep seams instead of down. The mine entered full production in 1905. This article described the working conditions and pay scale for the mine workers, noting that there was not much in terms of safety equipment. There were many accidents and 15 men lost their lives at the mine. During the mine's 20-year operation, miners went on strike 6 times. The last strike marked the closure of the mine in June 1922 and the end of industry in national parks. CPR was ordered to clear out and move the mining equipment as well as the houses, buildings and essentially the entire town. During its peak production, Mine No. 80 produced about a half million tons of coal. 1 ref., 1 fig.

  17. I-94 Automation FAQs

    Data.gov (United States)

    Department of Homeland Security — In order to increase efficiency, reduce operating costs and streamline the admissions process, U.S. Customs and Border Protection has automated Form I-94 at air and...

  18. Hydrometeorological Automated Data System

    Data.gov (United States)

    National Oceanic and Atmospheric Administration, Department of Commerce — The Office of Hydrologic Development of the National Weather Service operates HADS, the Hydrometeorological Automated Data System. This data set contains the last...

  19. Automated Vehicles Symposium 2015

    CERN Document Server

    Beiker, Sven

    2016-01-01

    This edited book comprises papers about the impacts, benefits and challenges of connected and automated cars. It is the third volume of the LNMOB series dealing with Road Vehicle Automation. The book comprises contributions from researchers, industry practitioners and policy makers, covering perspectives from the U.S., Europe and Japan. It is based on the Automated Vehicles Symposium 2015 which was jointly organized by the Association of Unmanned Vehicle Systems International (AUVSI) and the Transportation Research Board (TRB) in Ann Arbor, Michigan, in July 2015. The topical spectrum includes, but is not limited to, public sector activities, human factors, ethical and business aspects, energy and technological perspectives, vehicle systems and transportation infrastructure. This book is an indispensable source of information for academic researchers, industrial engineers and policy makers interested in the topic of road vehicle automation.

  20. Automated Vehicles Symposium 2014

    CERN Document Server

    Beiker, Sven; Road Vehicle Automation 2

    2015-01-01

    This paper collection is the second volume of the LNMOB series on Road Vehicle Automation. The book contains a comprehensive review of current technical, socio-economic, and legal perspectives written by experts coming from public authorities, companies and universities in the U.S., Europe and Japan. It originates from the Automated Vehicle Symposium 2014, which was jointly organized by the Association for Unmanned Vehicle Systems International (AUVSI) and the Transportation Research Board (TRB) in Burlingame, CA, in July 2014. The contributions discuss the challenges arising from the integration of highly automated and self-driving vehicles into the transportation system, with a focus on human factors and different deployment scenarios. This book is an indispensable source of information for academic researchers, industrial engineers, and policy makers interested in the topic of road vehicle automation.

  1. Disassembly automation automated systems with cognitive abilities

    CERN Document Server

    Vongbunyong, Supachai

    2015-01-01

    This book presents a number of aspects to be considered in the development of disassembly automation, including the mechanical system, vision system and intelligent planner. The implementation of cognitive robotics increases the flexibility and degree of autonomy of the disassembly system. Disassembly, as a step in the treatment of end-of-life products, can allow the recovery of embodied value left within disposed products, as well as the appropriate separation of potentially-hazardous components. In the end-of-life treatment industry, disassembly has largely been limited to manual labor, which is expensive in developed countries. Automation is one possible solution for economic feasibility. The target audience primarily comprises researchers and experts in the field, but the book may also be beneficial for graduate students.

  2. Evaluation of genomic island predictors using a comparative genomics approach

    Directory of Open Access Journals (Sweden)

    Brinkman Fiona SL

    2008-08-01

    Full Text Available Abstract Background Genomic islands (GIs) are clusters of genes in prokaryotic genomes of probable horizontal origin. GIs are disproportionately associated with microbial adaptations of medical or environmental interest. Recently, multiple programs for automated detection of GIs have been developed that utilize sequence composition characteristics, such as G+C ratio and dinucleotide bias. To robustly evaluate the accuracy of such methods, we propose that a dataset of GIs be constructed using criteria that are independent of sequence composition-based analysis approaches. Results We developed a comparative genomics approach (IslandPick) that identifies both very probable islands and non-island regions. The approach involves (1) flexible, automated selection of comparative genomes for each query genome, using a distance function that picks appropriate genomes for identification of GIs, (2) identification of regions unique to the query genome, compared with the chosen genomes (positive dataset), and (3) identification of regions conserved across all genomes (negative dataset). Using our constructed datasets, we investigated the accuracy of several sequence composition-based GI prediction tools. Conclusion Our results indicate that AlienHunter has the highest recall, but the lowest measured precision, while SIGI-HMM is the most precise method. SIGI-HMM and IslandPath/DIMOB have comparable overall highest accuracy. Our comparative genomics approach, IslandPick, was the most accurate, compared with a curated list of GIs, indicating that we have constructed suitable datasets. This represents the first evaluation, using diverse and independent datasets that were not artificially constructed, of the accuracy of several sequence composition-based GI predictors. The caveats associated with this analysis and proposals for optimal island prediction are discussed.
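
    One way to score a composition-based GI predictor against IslandPick-style positive (island) and negative (conserved) datasets is to measure how much of each reference region is covered by predicted intervals. The sketch below is an illustration under that assumption, with hypothetical genomic coordinates; it is not the IslandPick code itself.

```python
def covered_fraction(region, predictions):
    """Fraction of a (start, end) reference region overlapped by predicted intervals."""
    start, end = region
    covered = 0
    for p_start, p_end in predictions:
        overlap = min(end, p_end) - max(start, p_start)
        if overlap > 0:
            covered += overlap
    return min(covered / (end - start), 1.0)


def evaluate(predictions, positives, negatives, threshold=0.5):
    """Recall on island regions and false-positive rate on conserved regions."""
    recall = sum(covered_fraction(r, predictions) >= threshold for r in positives) / len(positives)
    fp_rate = sum(covered_fraction(r, predictions) >= threshold for r in negatives) / len(negatives)
    return recall, fp_rate


# Hypothetical genomic coordinates (bp).
preds = [(1000, 9000), (20000, 26000)]        # predicted islands
islands = [(1500, 8000), (40000, 48000)]      # positive dataset
conserved = [(21000, 25000), (60000, 70000)]  # negative dataset
print(evaluate(preds, islands, conserved))    # (0.5, 0.5)
```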

  3. Data mining for ontology development.

    Energy Technology Data Exchange (ETDEWEB)

    Davidson, George S.; Strasburg, Jana (Pacific Northwest National Laboratory, Richland, WA); Stampf, David (Brookhaven National Laboratory, Upton, NY); Neymotin,Lev (Brookhaven National Laboratory, Upton, NY); Czajkowski, Carl (Brookhaven National Laboratory, Upton, NY); Shine, Eugene (Savannah River National Laboratory, Aiken, SC); Bollinger, James (Savannah River National Laboratory, Aiken, SC); Ghosh, Vinita (Brookhaven National Laboratory, Upton, NY); Sorokine, Alexandre (Oak Ridge National Laboratory, Oak Ridge, TN); Ferrell, Regina (Oak Ridge National Laboratory, Oak Ridge, TN); Ward, Richard (Oak Ridge National Laboratory, Oak Ridge, TN); Schoenwald, David Alan

    2010-06-01

    A multi-laboratory ontology construction effort during the summer and fall of 2009 prototyped an ontology for counterfeit semiconductor manufacturing. This effort included an ontology development team and an ontology validation methods team. Here the third team of the Ontology Project, the Data Analysis (DA) team reports on their approaches, the tools they used, and results for mining literature for terminology pertinent to counterfeit semiconductor manufacturing. A discussion of the value of ontology-based analysis is presented, with insights drawn from other ontology-based methods regularly used in the analysis of genomic experiments. Finally, suggestions for future work are offered.

  4. Predictive Data mining and discovering hidden values of Data warehouse

    Directory of Open Access Journals (Sweden)

    Mehta Neel B

    2011-04-01

    Full Text Available Data Mining is an analytic process to explore data (usually large amounts of data, typically business or market related) in search of consistent patterns and/or systematic relationships between variables, and then to validate the findings by applying the detected patterns to new sets of data. The main target of data mining application is prediction. Predictive data mining is important and has the most direct business applications in the world. The paper briefly explains the process of data mining, which consists of three stages: (1) the initial exploration, (2) pattern identification with validation, and (3) deployment (application of the model to new data in order to generate predictions). Data mining is used for pattern and relationship recognition in data analysis, with an emphasis on large observational databases. From a statistical perspective, data mining is viewed as a computer-automated exploratory data analysis system for large sets of data, and it poses substantial research challenges in India and abroad. Machine learning methods, including decision tree learning, form the core of data mining. Data mining work is integrated within an existing user environment, including works that already make use of data warehousing and Online Analytical Processing (OLAP). The paper describes how data mining tools predict future trends and behaviour, which allows proactive, knowledge-driven decisions to be made.
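
    The three stages described above can be sketched end to end with scikit-learn on synthetic data. The library, the model choice (a decision tree) and the parameters below are illustrative assumptions, not taken from the paper:

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier
from sklearn.metrics import accuracy_score

# Synthetic stand-in for a business/market dataset; last 5 rows held back as "new" records.
X, y = make_classification(n_samples=505, n_features=10, random_state=0)
X_known, y_known, X_new = X[:500], y[:500], X[500:]

# Stage 1: initial exploration of the data.
print("shape:", X_known.shape, "class balance:", np.bincount(y_known))

# Stage 2: pattern identification with validation (here, decision tree learning).
X_train, X_test, y_train, y_test = train_test_split(X_known, y_known, test_size=0.3, random_state=0)
model = DecisionTreeClassifier(max_depth=4, random_state=0).fit(X_train, y_train)
print("validation accuracy:", accuracy_score(y_test, model.predict(X_test)))

# Stage 3: deployment -- apply the validated model to new data to generate predictions.
print("predictions for new records:", model.predict(X_new))
```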

  5. Instant Sikuli test automation

    CERN Document Server

    Lau, Ben

    2013-01-01

    Get to grips with a new technology, understand what it is and what it can do for you, and then get to work with the most important features and tasks. A concise guide written in an easy-to-follow style using the Starter guide approach. This book is aimed at automation and testing professionals who want to use Sikuli to automate GUIs. Some Python programming experience is assumed.

  6. Automated security management

    CERN Document Server

    Al-Shaer, Ehab; Xie, Geoffrey

    2013-01-01

    In this contributed volume, leading international researchers explore configuration modeling and checking, vulnerability and risk assessment, configuration analysis, and diagnostics and discovery. The authors equip readers to understand automated security management systems and techniques that increase overall network assurability and usability. These constantly changing networks defend against cyber attacks by integrating hundreds of security devices such as firewalls, IPSec gateways, IDS/IPS, authentication servers, authorization/RBAC servers, and crypto systems. Automated Security Management

  7. Automated Lattice Perturbation Theory

    Energy Technology Data Exchange (ETDEWEB)

    Monahan, Christopher

    2014-11-01

    I review recent developments in automated lattice perturbation theory. Starting with an overview of lattice perturbation theory, I focus on the three automation packages currently "on the market": HiPPy/HPsrc, Pastor and PhySyCAl. I highlight some recent applications of these methods, particularly in B physics. In the final section I briefly discuss the related, but distinct, approach of numerical stochastic perturbation theory.

  8. Automation of a single-DNA molecule stretching device

    DEFF Research Database (Denmark)

    Sørensen, Kristian Tølbøl; Lopacinska, Joanna M.; Tommerup, Niels;

    2015-01-01

    We automate the manipulation of genomic-length DNA in a nanofluidic device based on real-time analysis of fluorescence images. In our protocol, individual molecules are picked from a microchannel and stretched with pN forces using pressure driven flows. The millimeter-long DNA fragments free...

  9. Automated systems to identify relevant documents in product risk management

    Directory of Open Access Journals (Sweden)

    Wee Xue

    2012-03-01

    Full Text Available Abstract Background Product risk management involves critical assessment of the risks and benefits of health products circulating in the market. One of the important sources of safety information is the primary literature, especially for newer products with which regulatory authorities have relatively little experience. Although the primary literature provides vast and diverse information, only a small proportion of it is useful for product risk assessment work. Hence, the aim of this study is to explore the possibility of using text mining to automate the identification of useful articles, which would reduce the time taken for literature searches and thereby improve work efficiency. In this study, term-frequency inverse document-frequency values were computed for predictors extracted from the titles and abstracts of articles related to three tumour necrosis factor-alpha blockers. A general automated system was developed using only general predictors and was tested for its generalizability using articles related to four other drug classes. Several specific automated systems were developed using both general and specific predictors and training sets of different sizes in order to determine the minimum number of articles required for developing such systems. Results The general automated system had an area under the curve value of 0.731 and was able to rank 34.6% and 46.2% of the total number of 'useful' articles among the first 10% and 20% of the articles presented to the evaluators when tested on the generalizability set. However, its use may be limited by the subjective definition of useful articles. For the specific automated system, it was found that only 20 articles were required to develop a specific automated system with a prediction performance (AUC 0.748) that was better than that of the general automated system. Conclusions Specific automated systems can be developed rapidly and avoid problems caused by the subjective definition of useful articles.
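
    The study's predictors are term-frequency inverse document-frequency (tf-idf) values computed from titles and abstracts. A minimal sketch of that representation plus a simple ranking model, using scikit-learn on a few hypothetical article snippets (not the study's corpus or its exact classifier):

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression

# Hypothetical titles/abstracts; 1 = useful for risk assessment, 0 = not useful.
articles = [
    "serious infection risk with tnf alpha blocker therapy",
    "case report of tuberculosis reactivation during treatment",
    "in vitro binding kinetics of a monoclonal antibody",
    "survey of prescribing patterns in outpatient clinics",
]
labels = [1, 1, 0, 0]

vectorizer = TfidfVectorizer(lowercase=True, stop_words="english")
X = vectorizer.fit_transform(articles)          # tf-idf predictor matrix
model = LogisticRegression().fit(X, labels)

# Rank new articles by predicted probability of being useful.
new = ["post marketing report of opportunistic infection",
       "crystal structure of the drug target"]
scores = model.predict_proba(vectorizer.transform(new))[:, 1]
print(sorted(zip(scores, new), reverse=True))   # highest-scoring articles first
```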

  10. Locating previously unknown patterns in data-mining results: a dual data- and knowledge-mining method

    Directory of Open Access Journals (Sweden)

    Knaus William A

    2006-03-01

    Full Text Available Abstract Background Data mining can be utilized to automate analysis of substantial amounts of data produced in many organizations. However, data mining produces large numbers of rules and patterns, many of which are not useful. Existing methods for pruning uninteresting patterns have only begun to automate the knowledge acquisition step (which is required for subjective measures of interestingness), hence leaving a serious bottleneck. In this paper we propose a method for automatically acquiring knowledge to shorten the pattern list by locating the novel and interesting ones. Methods The dual-mining method is based on automatically comparing the strength of patterns mined from a database with the strength of equivalent patterns mined from a relevant knowledgebase. When these two estimates of pattern strength do not match, a high "surprise score" is assigned to the pattern, identifying the pattern as potentially interesting. The surprise score captures the degree of novelty or interestingness of the mined pattern. In addition, we show how to compute p values for each surprise score, thus filtering out noise and attaching statistical significance. Results We have implemented the dual-mining method using scripts written in Perl and R. We applied the method to a large patient database and a biomedical literature citation knowledgebase. The system estimated association scores for 50,000 patterns, composed of disease entities and lab results, by querying the database and the knowledgebase. It then computed the surprise scores by comparing the pairs of association scores. Finally, the system estimated statistical significance of the scores. Conclusion The dual-mining method eliminates more than 90% of patterns with strong associations, thus identifying them as uninteresting. We found that the pruning of patterns using the surprise score matched the biomedical evidence in the 100 cases that were examined by hand. The method automates the acquisition of the knowledge needed to prune uninteresting patterns.
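
    The surprise score compares an association estimate from the patient database with the equivalent estimate from the knowledgebase and attaches a p-value. One simplified way to express that comparison, assuming both estimates are proportions with known sample sizes (an illustration, not the paper's actual scoring):

```python
import math
from scipy.stats import norm

def surprise_score(p_db, n_db, p_kb, n_kb):
    """Two-proportion z-statistic comparing the strength of a pattern
    in the database (p_db, n_db) versus the knowledgebase (p_kb, n_kb)."""
    pooled = (p_db * n_db + p_kb * n_kb) / (n_db + n_kb)
    se = math.sqrt(pooled * (1 - pooled) * (1 / n_db + 1 / n_kb))
    z = (p_db - p_kb) / se
    p_value = 2 * norm.sf(abs(z))      # two-sided p-value
    return z, p_value

# Hypothetical pattern: a disease/lab-result pair that is strong in the
# database but weak in the literature-derived knowledgebase.
z, p = surprise_score(p_db=0.30, n_db=2000, p_kb=0.05, n_kb=5000)
print(f"surprise z = {z:.1f}, p = {p:.2e}")   # large z, tiny p -> potentially interesting
```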

  11. Linked Data approach for selection process automation in Systematic Reviews

    OpenAIRE

    Torchiano, Marco; Morisio, Maurizio; Tomassetti, Federico Cesare Argentino; Ardito, Luca; Vetro, Antonio; Rizzo, Giuseppe

    2011-01-01

    Background: a systematic review identifies, evaluates and synthesizes the available literature on a given topic using scientific and repeatable methodologies. The significant workload required and the subjectivity bias could affect results. Aim: semi-automate the selection process to reduce the amount of manual work needed and the consequent subjectivity bias. Method: extend and enrich the selection of primary studies using the existing technologies in the field of Linked Data and text mining...

  12. Large-scale automated synthesis of human functional neuroimaging data

    OpenAIRE

    Yarkoni, Tal; Poldrack, Russell A.; Nichols, Thomas E.; Van Essen, David C; Wager, Tor D.

    2011-01-01

    The explosive growth of the human neuroimaging literature has led to major advances in understanding of human brain function, but has also made aggregation and synthesis of neuroimaging findings increasingly difficult. Here we describe and validate an automated brain mapping framework that uses text mining, meta-analysis and machine learning techniques to generate a large database of mappings between neural and cognitive states. We demonstrate the capacity of our approach to automatically con...

  13. A Framework to Support Automated Classification and Labeling of Brain Electromagnetic Patterns

    OpenAIRE

    Gwen A. Frishkoff; Robert M. Frank; Jiawei Rong; Dejing Dou; Joseph Dien; Laura K. Halderman

    2007-01-01

    This paper describes a framework for automated classification and labeling of patterns in electroencephalographic (EEG) and magnetoencephalographic (MEG) data. We describe recent progress on four goals: 1) specification of rules and concepts that capture expert knowledge of event-related potentials (ERP) patterns in visual word recognition; 2) implementation of rules in an automated data processing and labeling stream; 3) data mining techniques that lead to r...

  14. Text Classification using Data Mining

    CERN Document Server

    Kamruzzaman, S M; Hasan, Ahmed Ryadh

    2010-01-01

    Text classification is the process of classifying documents into predefined categories based on their content. It is the automated assignment of natural language texts to predefined categories. Text classification is the primary requirement of text retrieval systems, which retrieve texts in response to a user query, and text understanding systems, which transform text in some way such as producing summaries, answering questions or extracting data. Existing supervised learning algorithms to automatically classify text need sufficient documents to learn accurately. This paper presents a new algorithm for text classification using data mining that requires fewer documents for training. Instead of using individual words, word relations, i.e. association rules derived from these words, are used to build the feature set from pre-classified text documents. A Naive Bayes classifier is then applied to the derived features, and finally a single Genetic Algorithm step is added for the final classification. A system based on the...
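
    The Naive Bayes component of such a pipeline can be sketched with scikit-learn on a few hypothetical pre-classified documents; the association-rule feature derivation and the Genetic Algorithm step described above are not reproduced here.

```python
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.naive_bayes import MultinomialNB
from sklearn.pipeline import make_pipeline

# Small hypothetical training set of pre-classified documents.
docs = [
    "stock prices fell sharply on the exchange",
    "the central bank raised interest rates",
    "the team won the championship final",
    "the striker scored twice in the match",
]
categories = ["finance", "finance", "sports", "sports"]

classifier = make_pipeline(CountVectorizer(), MultinomialNB())
classifier.fit(docs, categories)

print(classifier.predict(["rates and bond markets moved higher",
                          "a late goal decided the match"]))
# expected: ['finance' 'sports']
```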

  15. Materials Testing and Automation

    Science.gov (United States)

    Cooper, Wayne D.; Zweigoron, Ronald B.

    1980-07-01

    The advent of automation in materials testing has been in large part responsible for recent radical changes in the materials testing field: Tests virtually impossible to perform without a computer have become more straightforward to conduct. In addition, standardized tests may be performed with enhanced efficiency and repeatability. A typical automated system is described in terms of its primary subsystems — an analog station, a digital computer, and a processor interface. The processor interface links the analog functions with the digital computer; it includes data acquisition, command function generation, and test control functions. Features of automated testing are described with emphasis on calculated variable control, control of a variable that is computed by the processor and cannot be read directly from a transducer. Three calculated variable tests are described: a yield surface probe test, a thermomechanical fatigue test, and a constant-stress-intensity range crack-growth test. Future developments are discussed.

  16. Recent advances in remote coal mining machine sensing, guidance, and teleoperation

    Energy Technology Data Exchange (ETDEWEB)

    Ralston, J.C.; Hainsworth, D.W.; Reid, D.C.; Anderson, D.L.; McPhee, R.J. [CSIRO Exploration & Minerals, Kenmore, Qld. (Australia)

    2001-10-01

    Some recent applications of sensing, guidance and telerobotic technology in the coal mining industry are presented. Of special interest is the development of semi or fully autonomous systems to provide remote guidance and communications for coal mining equipment. The use of radar and inertial based sensors are considered in an attempt to solve the horizontal and lateral guidance problems associated with mining equipment automation. Also described is a novel teleoperated robot vehicle with unique communications capabilities, called the Numbat, which is used in underground mine safety and reconnaissance missions.

  17. First Mexican coal mine recovery after mine fire, Esmeralda Mine

    Energy Technology Data Exchange (ETDEWEB)

    Santillan, M.A. [Minerales Monclova, SA de CV, Palau Coahuila (Mexico)

    2005-07-01

    The fire started on 8 May 1998 in the development section from methane released into the mine through a roof-bolt hole. The flames spread quickly as the coal was ignited. After eight hours the Safety Department decided to seal the vertical ventilation shafts and the slopes. The coal in the Esmeralda Mine is of very high quality, and Minerales Monclova (MIMOSA) decided to recover the facilities. However, the Esmeralda Mine coals have a very high gas content of 12 m³/t. During the next 2.5 months, MIMOSA staff and specialists observed and analysed the gas behaviour supported by a chromatograph. With the results of the observations and analyses, MIMOSA in consultation with the specialists developed a recovery plan based on flooding the area in which fire might have propagated and in which rekindling was highly probable. At the same time MIMOSA trained rescue teams. By 20 August 1998, the mine command centre had re-opened the slopes seal. Using a 'Step-by-Step' system, the rescue team began the recovery process by employing cross-cuts and using an auxiliary fan to establish the ventilation circuit. The MIMOSA team advanced into the mine as far as allowed by the water level and was able to recover the main fan. The official mine recovery date was 30 November 1998. Esmeralda Mine was back in operation in December 1998. 1 ref., 3 figs.

  18. Automating the CMS DAQ

    Energy Technology Data Exchange (ETDEWEB)

    Bauer, G.; et al.

    2014-01-01

    We present the automation mechanisms that have been added to the Data Acquisition and Run Control systems of the Compact Muon Solenoid (CMS) experiment during Run 1 of the LHC, ranging from the automation of routine tasks to automatic error recovery and context-sensitive guidance to the operator. These mechanisms helped CMS to maintain a data taking efficiency above 90% and to even improve it to 95% towards the end of Run 1, despite an increase in the occurrence of single-event upsets in sub-detector electronics at high LHC luminosity.

  19. Automated phantom assay system

    International Nuclear Information System (INIS)

    This paper describes an automated phantom assay system developed for assaying phantoms spiked with minute quantities of radionuclides. The system includes a computer-controlled linear-translation table that positions the phantom at exact distances from a spectrometer. A multichannel analyzer (MCA) interfaces with a computer to collect gamma spectral data. Signals transmitted between the controller and MCA synchronize data collection and phantom positioning. Measured data are then stored on disk for subsequent analysis. The automated system allows continuous unattended operation and ensures reproducible results

  20. Automated gas chromatography

    Science.gov (United States)

    Mowry, Curtis D.; Blair, Dianna S.; Rodacy, Philip J.; Reber, Stephen D.

    1999-01-01

    An apparatus and process for the continuous, near real-time monitoring of low-level concentrations of organic compounds in a liquid, and, more particularly, a water stream. A small liquid volume of flow from a liquid process stream containing organic compounds is diverted by an automated process to a heated vaporization capillary where the liquid volume is vaporized to a gas that flows to an automated gas chromatograph separation column to chromatographically separate the organic compounds. Organic compounds are detected and the information transmitted to a control system for use in process control. Concentrations of organic compounds less than one part per million are detected in less than one minute.

  1. Genome Modeling System: A Knowledge Management Platform for Genomics.

    Directory of Open Access Journals (Sweden)

    Malachi Griffith

    2015-07-01

    Full Text Available In this work, we present the Genome Modeling System (GMS, an analysis information management system capable of executing automated genome analysis pipelines at a massive scale. The GMS framework provides detailed tracking of samples and data coupled with reliable and repeatable analysis pipelines. The GMS also serves as a platform for bioinformatics development, allowing a large team to collaborate on data analysis, or an individual researcher to leverage the work of others effectively within its data management system. Rather than separating ad-hoc analysis from rigorous, reproducible pipelines, the GMS promotes systematic integration between the two. As a demonstration of the GMS, we performed an integrated analysis of whole genome, exome and transcriptome sequencing data from a breast cancer cell line (HCC1395 and matched lymphoblastoid line (HCC1395BL. These data are available for users to test the software, complete tutorials and develop novel GMS pipeline configurations. The GMS is available at https://github.com/genome/gms.

  2. De Novo Assembly of Human Herpes Virus Type 1 (HHV-1) Genome, Mining of Non-Canonical Structures and Detection of Novel Drug-Resistance Mutations Using Short- and Long-Read Next Generation Sequencing Technologies

    Science.gov (United States)

    Karamitros, Timokratis; Piorkowska, Renata; Katzourakis, Aris; Magiorkinis, Gkikas; Mbisa, Jean Lutamyo

    2016-01-01

    Human herpesvirus type 1 (HHV-1) has a large double-stranded DNA genome of approximately 152 kbp that is structurally complex and GC-rich. This makes the assembly of HHV-1 whole genomes from short-read sequencing data technically challenging. To improve the assembly of HHV-1 genomes we have employed a hybrid genome assembly protocol using data from two sequencing technologies: the short-read Roche 454 and the long-read Oxford Nanopore MinION sequencers. We sequenced 18 HHV-1 cell culture-isolated clinical specimens collected from immunocompromised patients undergoing antiviral therapy. The susceptibility of the samples to several antivirals was determined by plaque reduction assay. Hybrid genome assembly resulted in a decrease in the number of contigs in 6 out of 7 samples and an increase in N(G)50 and N(G)75 of all 7 samples sequenced by both technologies. The approach also enhanced the detection of non-canonical contigs including a rearrangement between the unique (UL) and repeat (T/IRL) sequence regions of one sample that was not detectable by assembly of 454 reads alone. We detected several known and novel resistance-associated mutations in UL23 and UL30 genes. Genome-wide genetic variability ranged from <1% to 53% of amino acids in each gene exhibiting at least one substitution within the pool of samples. The UL23 gene had one of the highest genetic variabilities at 35.2% in keeping with its role in development of drug resistance. The assembly of accurate, full-length HHV-1 genomes will be useful in determining genetic determinants of drug resistance, virulence, pathogenesis and viral evolution. The numerous, complex repeat regions of the HHV-1 genome currently remain a barrier towards this goal. PMID:27309375
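
    The N(G)50 and N(G)75 statistics mentioned above summarize assembly contiguity: NX is the length of the shortest contig in the minimal set of longest contigs that together cover X% of the assembly (NG50 uses the known genome length instead of the assembly length). A minimal sketch of the generic calculation, not the tool used in the study:

```python
def nx(contig_lengths, fraction=0.50):
    """Return the N50 (or N75, etc.) of a list of contig lengths in bp."""
    lengths = sorted(contig_lengths, reverse=True)
    target = fraction * sum(lengths)
    running = 0
    for length in lengths:
        running += length
        if running >= target:
            return length
    return 0

contigs = [40000, 25000, 20000, 10000, 5000]   # hypothetical assembly
print("N50:", nx(contigs, 0.50))               # 25000
print("N75:", nx(contigs, 0.75))               # 20000
```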

  3. Mining Ostrava '93

    International Nuclear Information System (INIS)

    Part I of the Proceedings contains 55 contributions, out of which 2 deal with environmental impacts of undermining during coal mining, and of shocks and vibrations during underground coal mining. (Z.S.)

  4. Mines and Mineral Resources

    Data.gov (United States)

    Department of Homeland Security — Mines in the United States According to the Homeland Security Infrastructure Program Tiger Team Report Table E-2.V.1 Sub-Layer Geographic Names, a mine is defined...

  5. Mine drainage treatment

    OpenAIRE

    Golomeova, Mirjana; Zendelska, Afrodita; Krstev, Boris; Golomeov, Blagoj; Krstev, Aleksandar

    2012-01-01

    Water flowing from underground and surface mines that contains high concentrations of dissolved metals is called mine drainage. Mine drainage can be categorized into several basic types by its alkalinity or acidity. Sulfide-rich and carbonate-poor materials are expected to produce acidic drainage, while alkaline-rich materials, even with significant sulfide concentrations, often produce net alkaline water. Mine drainages are dangerous because pollutants may decompose in the environment. In...

  6. Mining in El Salvador

    DEFF Research Database (Denmark)

    Pacheco Cueva, Vladimir

    2014-01-01

    In this guest article, Vladimir Pacheco, a social scientist who has worked on mining and human rights, shares his perspectives on a current campaign against mining in El Salvador – Central America’s smallest but most densely populated country.

  7. Distributed Framework for Data Mining As a Service on Private Cloud

    Directory of Open Access Journals (Sweden)

    Shraddha Masih

    2014-11-01

    Full Text Available Data mining research faces two great challenges: (i) automated mining and (ii) mining of distributed data. Conventional mining techniques are centralized and the data needs to be accumulated at a central location. A mining tool needs to be installed on the computer before performing data mining, so extra time is incurred in collecting the data. Mining is done by specialized analysts who have access to mining tools. This technique is not optimal when the data is distributed over the network. To perform data mining in a distributed scenario, we need to design a different framework to improve efficiency. Also, the size of accumulated data grows exponentially with time and is difficult to mine using a single computer. Personal computers have limitations in terms of computation capability and storage capacity. Cloud computing can be exploited for compute-intensive and data-intensive applications. Data mining algorithms are both compute and data intensive, therefore cloud based tools can provide an infrastructure for distributed data mining. This paper is intended to use cloud computing to support distributed data mining. We propose a cloud based data mining model which provides the facility of mass data storage along with distributed data mining facility. This paper provides a solution for distributed data mining on the Hadoop framework using an interface to run the algorithm on a specified number of nodes without any user-level configuration. Hadoop is configured over private servers and clients can process their data through the common framework from anywhere in the private network. Data to be mined can either be chosen from the cloud data server or uploaded from private computers on the network. It is observed that the framework is helpful in processing large data sets in less time as compared to a single system.
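
    The core idea, splitting the data into partitions, mining partial results in parallel and merging them, can be illustrated with a tiny map/reduce-style item-count example using Python multiprocessing. This is a local stand-in for the Hadoop-based framework described above, not its actual API:

```python
from collections import Counter
from functools import reduce
from multiprocessing import Pool

def mine_partition(records):
    """Map step: count item occurrences in one data partition."""
    counts = Counter()
    for record in records:
        counts.update(record)
    return counts

def combine(a, b):
    """Reduce step: merge partial counts from two partitions."""
    return a + b

if __name__ == "__main__":
    # Hypothetical transaction data split into partitions, one per node.
    partitions = [
        [["milk", "bread"], ["milk", "eggs"]],
        [["bread", "eggs"], ["milk", "bread", "eggs"]],
        [["milk"], ["bread"]],
    ]
    with Pool(processes=3) as pool:
        partial = pool.map(mine_partition, partitions)
    print(reduce(combine, partial))   # Counter({'milk': 4, 'bread': 4, 'eggs': 3})
```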

  8. Web Usage Mining

    OpenAIRE

    Benkovská, Petra

    2007-01-01

    General characteristic of web mining including methodology and procedures incorporated into this term. Relation to other areas (data mining, artificial intelligence, statistics, databases, internet technologies, management etc.) Web usage mining - data sources, data pre-processing, characterization of analytical methods and tools, interpretation of outputs (results), and possible areas of usage including examples. Suggestion of solution method, realization and a concrete example's outputs int...

  9. A MINE alternative to D-optimal designs for the linear model.

    Directory of Open Access Journals (Sweden)

    Amanda M Bouffier

    Full Text Available Doing large-scale genomics experiments can be expensive, and so experimenters want to get the most information out of each experiment. To this end the Maximally Informative Next Experiment (MINE) criterion for experimental design was developed. Here we explore this idea in a simplified context, the linear model. Four variations of the MINE method for the linear model were created: MINE-like, MINE, MINE with random orthonormal basis, and MINE with random rotation. Each method varies in how it maximizes the MINE criterion. Theorem 1 establishes sufficient conditions for the maximization of the MINE criterion under the linear model. Theorem 2 establishes when the MINE criterion is equivalent to the classic design criterion of D-optimality. By simulation under the linear model, we establish that MINE with random orthonormal basis and MINE with random rotation are faster at discovering the true linear relation with p regression coefficients and n observations when p>>n. We also establish, in simulations with n<100, p=1000, σ=0.01 and 1000 replicates, that these two variations of MINE display a lower false positive rate than the MINE-like method and, for a majority of the experiments, than the MINE method.
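
    Under the linear model, D-optimality maximizes det(X^T X), and a MINE-style procedure picks the next experiment that adds the most information to the current design. A minimal greedy sketch of that criterion with NumPy, illustrating the idea rather than reproducing the authors' four MINE variants:

```python
import numpy as np

def next_experiment(X_current, candidates, ridge=1e-6):
    """Pick the candidate row that maximizes det(X^T X) of the augmented design.

    A small ridge term keeps the determinant well defined when p >> n and
    X^T X would otherwise be singular.
    """
    p = candidates.shape[1]
    best_idx, best_det = None, -np.inf
    for i, x in enumerate(candidates):
        X_aug = np.vstack([X_current, x])
        det = np.linalg.det(X_aug.T @ X_aug + ridge * np.eye(p))
        if det > best_det:
            best_idx, best_det = i, det
    return best_idx, best_det

rng = np.random.default_rng(0)
X = rng.normal(size=(3, 10))        # n = 3 observations, p = 10 coefficients (p >> n)
cands = rng.normal(size=(20, 10))   # pool of candidate next experiments
idx, det = next_experiment(X, cands)
print("next experiment index:", idx, "augmented det:", det)
```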

  10. The potential of text mining in data integration and network biology for plant research: a case study on Arabidopsis

    OpenAIRE

    Van Landeghem, Sofie; De Bodt, Stefanie; Drebert, Zuzanna; Inzé, Dirk; Van de Peer, Yves

    2013-01-01

    Despite the availability of various data repositories for plant research, a wealth of information currently remains hidden within the biomolecular literature. Text mining provides the necessary means to retrieve these data through automated processing of texts. However, only recently has advanced text mining methodology been implemented with sufficient computational power to process texts at a large scale. In this study, we assess the potential of large-scale text mining for plant biology res...

  11. Data mining meets economic analysis: opportunities and challenges

    Directory of Open Access Journals (Sweden)

    Baicoianu, A.

    2010-12-01

    Full Text Available Along with the increase of economic globalization and the evolution of information technology, data mining has become an important approach for economic data analysis. As a result, there has been a critical need for automated approaches to effective and efficient usage of massive amounts of economic data, in order to support both companies’ and individuals’ strategic planning and investment decision-making. The goal of this paper is to illustrate the impact of data mining techniques on sales, customer satisfaction and corporate profits. To this end, we present different data mining techniques and we discuss important data mining issues involved in specific economic applications. In addition, we discuss a new method based on Boolean functions, LAD, which has been successfully applied to data analysis. Finally, we highlight a number of challenges and opportunities for future research.

  12. Application of fuzzy logic for determining of coal mine mechanization

    Institute of Scientific and Technical Information of China (English)

    HOSSEINI SAA; ATAEI M; HOSSEINI S M; AKHYANI M

    2012-01-01

    The fundamental task of mining engineers is to produce more coal at a given level of labour input and material costs, for optimum quality and maximum efficiency. To achieve these goals, it is necessary to automate and mechanize mining operations. Mechanization is an objective that can result in significant cost reduction and higher levels of profitability for underground mines. To analyze the potential of mechanization, some important factors such as seam inclination and thickness, geological disturbances, seam floor conditions and roof conditions should be considered. In this study we have used fuzzy logic and membership functions, created fuzzy rule-based methods, and considered the ultimate objective: mechanization of mining. As a case study, the mechanization of the Tazare coal seams in the Shahroud area of Iran was investigated. The results show a low potential for mechanization in most of the Tazare coal seams.
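
    The fuzzy approach grades each factor (seam thickness, inclination, roof conditions, and so on) with membership functions and combines the grades through rules. A minimal sketch of a triangular membership function and one min-based rule, with illustrative values that are not the Tazare study's actual functions:

```python
def triangular(x, a, b, c):
    """Triangular membership function with feet at a and c and peak at b."""
    if x <= a or x >= c:
        return 0.0
    if x <= b:
        return (x - a) / (b - a)
    return (c - x) / (c - b)

# Hypothetical memberships for one coal seam.
thickness_ok   = triangular(1.8, 1.0, 2.5, 4.0)     # seam thickness in metres
inclination_ok = triangular(12.0, 0.0, 10.0, 25.0)  # inclination in degrees
roof_ok        = 0.4                                # expert-assigned grade

# Rule: mechanization potential IF thickness OK AND inclination OK AND roof OK
# (AND = min); several such rules would normally be combined with max.
mechanization_potential = min(thickness_ok, inclination_ok, roof_ok)
print(round(mechanization_potential, 2))   # 0.4 -> low-to-moderate potential
```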

  13. Automated solvent concentrator

    Science.gov (United States)

    Griffith, J. S.; Stuart, J. L.

    1976-01-01

    Designed for the automated drug identification system (AUDRI), the device increases concentration by 100. The sample is first filtered, removing particulate contaminants and reducing the water content of the sample. The sample is then extracted from the filtered residue by a specific solvent. The concentrator provides input material to the analysis subsystem.

  14. Protokoller til Home Automation

    DEFF Research Database (Denmark)

    Kjær, Kristian Ellebæk

    2008-01-01

    ... a computer that can switch between predefined settings. Sometimes the computer can be controlled remotely over the internet, so that the home's status can be viewed from a computer or perhaps even from a mobile phone. While the applications mentioned are classics within home automation, additional functionality has emerged...

  15. ELECTROPNEUMATIC AUTOMATION EDUCATIONAL LABORATORY

    OpenAIRE

    Dolgorukov, S. O.; National Aviation University; Roman, B. V.; National Aviation University

    2013-01-01

    The article reflects the current situation in education regarding the difficulties of learning mechatronics. A complex of laboratory test benches on electropneumatic automation is considered as a tool for advancing through technical science. A course of laboratory work, developed to meet the requirement for an efficient and reliable way of acquiring practical skills, is regarded as the simplest way for students to learn the basics of mechatronics.

  16. Building Automation Systems.

    Science.gov (United States)

    Honeywell, Inc., Minneapolis, Minn.

    A number of different automation systems for use in monitoring and controlling building equipment are described in this brochure. The system functions include--(1) collection of information, (2) processing and display of data at a central panel, and (3) taking corrective action by sounding alarms, making adjustments, or automatically starting and…

  17. Test Construction: Automated

    NARCIS (Netherlands)

    Veldkamp, Bernard P.

    2014-01-01

    Optimal test construction deals with automated assembly of tests for educational and psychological measurement. Items are selected from an item bank to meet a predefined set of test specifications. Several models for optimal test construction are presented, and two algorithms for optimal test assembly

  18. Test Construction: Automated

    NARCIS (Netherlands)

    Veldkamp, Bernard P.

    2016-01-01

    Optimal test construction deals with automated assembly of tests for educational and psychological measurement. Items are selected from an item bank to meet a predefined set of test specifications. Several models for optimal test construction are presented, and two algorithms for optimal test assembly

  19. Automated Web Applications Testing

    Directory of Open Access Journals (Sweden)

    Alexandru Dan CĂPRIŢĂ

    2009-01-01

    Full Text Available Unit tests are a vital part of several software development practices and processes such as Test-First Programming, Extreme Programming and Test-Driven Development. This article briefly presents software quality and testing concepts as well as an introduction to an automated unit testing framework for PHP web based applications.
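
    The same automated unit-testing idea carries over directly to other languages. As a hedged analogy to the PHP framework discussed in the article, here is a minimal automated unit test written with Python's built-in unittest module (the function under test is hypothetical):

```python
import unittest

def slugify(title):
    """Tiny function under test: turn an article title into a URL slug."""
    return "-".join(title.lower().split())

class SlugifyTests(unittest.TestCase):
    def test_basic_title(self):
        self.assertEqual(slugify("Automated Web Applications Testing"),
                         "automated-web-applications-testing")

    def test_extra_whitespace_is_collapsed(self):
        self.assertEqual(slugify("  Unit   Tests  "), "unit-tests")

if __name__ == "__main__":
    unittest.main()   # running the file executes all tests automatically
```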

  20. Automated Student Model Improvement

    Science.gov (United States)

    Koedinger, Kenneth R.; McLaughlin, Elizabeth A.; Stamper, John C.

    2012-01-01

    Student modeling plays a critical role in developing and improving instruction and instructional technologies. We present a technique for automated improvement of student models that leverages the DataShop repository, crowd sourcing, and a version of the Learning Factors Analysis algorithm. We demonstrate this method on eleven educational…

  1. Myths in test automation

    Directory of Open Access Journals (Sweden)

    Jazmine Francis

    2015-01-01

    Full Text Available Myths in the automation of software testing are an issue of discussion that echoes around the validation services area of the software industry. Probably the first thought that appears to a knowledgeable reader would be: Why this old topic again? What's new to discuss about the matter? But for the first time everyone agrees that automation testing today is undoubtedly not what it used to be ten or fifteen years ago, because it has evolved in scope and magnitude. What began as simple linear scripts for web applications today has a complex architecture and a hybrid framework to facilitate the implementation of testing for applications developed with various platforms and technologies. Undoubtedly automation has advanced, but so did the myths associated with it. The change in perspective and knowledge of people on automation has altered the terrain. This article reflects the points of view and experience of the author with regard to the transformation of the original myths into new versions, and how they are derived; it also provides his thoughts on the new generation of myths.

  2. Geochemistry and mineralogy of arsenic in mine wastes and stream sediments in a historic metal mining area in the UK

    Energy Technology Data Exchange (ETDEWEB)

    Rieuwerts, J.S., E-mail: jrieuwerts@plymouth.ac.uk [School of Geography, Earth and Environmental Sciences, Plymouth University, Plymouth PL4 8AA (United Kingdom); Mighanetara, K.; Braungardt, C.B. [School of Geography, Earth and Environmental Sciences, Plymouth University, Plymouth PL4 8AA (United Kingdom); Rollinson, G.K. [Camborne School of Mines, CEMPS, University of Exeter, Tremough Campus, Penryn, Cornwall TR10 9EZ (United Kingdom); Pirrie, D. [Helford Geoscience LLP, Menallack Farm, Treverva, Penryn, Cornwall TR10 9BP (United Kingdom); Azizi, F. [School of Geography, Earth and Environmental Sciences, Plymouth University, Plymouth PL4 8AA (United Kingdom)

    2014-02-01

    Mining generates large amounts of waste which may contain potentially toxic elements (PTE), which, if released into the wider environment, can cause air, water and soil pollution long after mining operations have ceased. The fate and toxicological impact of PTEs are determined by their partitioning and speciation and in this study, the concentrations and mineralogy of arsenic in mine wastes and stream sediments in a former metal mining area of the UK are investigated. Pseudo-total (aqua-regia extractable) arsenic concentrations in all samples from the mining area exceeded background and guideline values by 1–5 orders of magnitude, with a maximum concentration in mine wastes of 1.8 × 10⁵ mg kg⁻¹ As and concentrations in stream sediments of up to 2.5 × 10⁴ mg kg⁻¹ As, raising concerns over potential environmental impacts. Mineralogical analysis of the wastes and sediments was undertaken by scanning electron microscopy (SEM) and automated SEM-EDS based quantitative evaluation (QEMSCAN®). The main arsenic mineral in the mine waste was scorodite and this was significantly correlated with pseudo-total As concentrations and significantly inversely correlated with potentially mobile arsenic, as estimated from the sum of exchangeable, reducible and oxidisable arsenic fractions obtained from a sequential extraction procedure; these findings correspond with the low solubility of scorodite in acidic mine wastes. The work presented shows that the study area remains grossly polluted by historical mining and processing and illustrates the value of combining mineralogical data with acid and sequential extractions to increase our understanding of potential environmental threats. - Highlights: • Stream sediments in a former mining area remain polluted with up to 25 g As per kg. • The main arsenic mineral in adjacent mine wastes appears to be scorodite. • Low solubility scorodite was inversely correlated with potentially mobile As. • Combining

  3. Automating spectral measurements

    Science.gov (United States)

    Goldstein, Fred T.

    2008-09-01

    This paper discusses the architecture of software utilized in spectroscopic measurements. As optical coatings become more sophisticated, there is mounting need to automate data acquisition (DAQ) from spectrophotometers. Such need is exacerbated when 100% inspection is required, ancillary devices are utilized, cost reduction is crucial, or security is vital. While instrument manufacturers normally provide point-and-click DAQ software, an application programming interface (API) may be missing. In such cases automation is impossible or expensive. An API is typically provided in libraries (*.dll, *.ocx) which may be embedded in user-developed applications. Users can thereby implement DAQ automation in several Windows languages. Another possibility, developed by FTG as an alternative to instrument manufacturers' software, is the ActiveX application (*.exe). ActiveX, a component of many Windows applications, provides means for programming and interoperability. This architecture permits a point-and-click program to act as automation client and server. Excel, for example, can control and be controlled by DAQ applications. Most importantly, ActiveX permits ancillary devices such as barcode readers and XY-stages to be easily and economically integrated into scanning procedures. Since an ActiveX application has its own user-interface, it can be independently tested. The ActiveX application then runs (visibly or invisibly) under DAQ software control. Automation capabilities are accessed via a built-in spectro-BASIC language with industry-standard (VBA-compatible) syntax. Supplementing ActiveX, spectro-BASIC also includes auxiliary serial port commands for interfacing programmable logic controllers (PLC). A typical application is automatic filter handling.

  4. Informationization of coal enterprises and digital mine

    Institute of Scientific and Technical Information of China (English)

    LU Jian-jun; WANG Xiao-lu; MA Li; ZHAO An-xin

    2008-01-01

    The main problems found in the current conditions of informationization in coal enterprises are analyzed. The paper clarifies how to achieve informationization in coal mines and puts forward a general configuration for informationization construction, in which informationization in coal enterprises is divided into two parts: informationization of safety production and informationization of management. A platform for the integrated management of informationization in coal enterprises is planned. Ultimately, the paper argues that an overall integrated digital mine is the way to achieve the goal of informationization in coal enterprises, which can promote the progression from automation, digitalization, networking and informationization to intellectualization. At the same time, the competitiveness of enterprises can be improved as a whole, and a new type of coal industry can be supported by information technology.

  5. Opinion mining and summarization for customer reviews

    Directory of Open Access Journals (Sweden)

    Sanjeev kumar Chauhan

    2012-08-01

    Full Text Available Opinion mining is related to detecting the opinion of the author expressed in a document. The primary task in the field of opinion mining is subjectivity analysis, which finds whether a document is subjective or objective. Subjectivity shows that the document contains some opinionated part, while objectivity shows that the document is free of opinionated content, i.e. it contains no sentiments. The next task is sentiment polarity analysis, which differentiates documents according to positivity and negativity. At present, however, there is no automated system that can perform this task. We are developing a system that can find the degree of polarity of each document and, according to it, assign a human-like rating to that document. Finally, it generates a summary of the review that contains only the highly subjective and feature-related parts of the document.
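
    The degree-of-polarity idea can be illustrated with a simple lexicon-based scorer that maps the balance of positive and negative terms onto a human-like rating. The lexicon and review text below are hypothetical, and this is not the system described in the paper:

```python
POSITIVE = {"good", "great", "excellent", "love", "amazing", "sharp"}
NEGATIVE = {"bad", "poor", "terrible", "hate", "slow", "broken"}

def polarity_rating(review, scale=5):
    """Map the balance of positive vs. negative words onto a 1..scale rating."""
    words = review.lower().split()
    pos = sum(w in POSITIVE for w in words)
    neg = sum(w in NEGATIVE for w in words)
    if pos + neg == 0:
        return (scale + 1) / 2          # neutral when no opinion words are found
    degree = (pos - neg) / (pos + neg)  # -1 (fully negative) .. +1 (fully positive)
    return round(1 + (degree + 1) / 2 * (scale - 1), 1)

print(polarity_rating("great screen and amazing battery but slow charger"))  # 3.7
print(polarity_rating("terrible build quality and broken buttons"))          # 1.0
```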

  6. Mining text data

    CERN Document Server

    Aggarwal, Charu C

    2012-01-01

    Text mining applications have experienced tremendous advances because of web 2.0 and social networking applications. Recent advances in hardware and software technology have led to a number of unique scenarios where text mining algorithms are learned. "Mining Text Data" introduces an important niche in the text analytics field, and is an edited volume contributed by leading international researchers and practitioners focused on social networks & data mining. This book contains a wide swath of topics across social networks & data mining. Each chapter contains a comprehensive survey including

  7. Mining planning introduction

    International Nuclear Information System (INIS)

    Basic concepts concerning mining parameters, plan establishment and typical procedure methods applied throughout the physical execution of mining operations are here determined, analyzed and discussed. Technological and economic aspects of the exploration phase are presented as well as general mathematical and statistical methods for estimating, analyzing and representing mineral deposits which are virtually essential for good mining project execution. The characterization of important mineral substances and the basic parameters of mining works are emphasized in conjunction with long, medium and short term mining planning. Finally, geological modelling, ore reserves calculations and final economic evaluations are considered using a hypothetical example in order to consolidate the main elaborated ideas. (D.J.M.)

  8. Data mining in radiology

    Directory of Open Access Journals (Sweden)

    Amit T Kharat

    2014-01-01

    Full Text Available Data mining facilitates the study of radiology data in various dimensions. It converts large patient image and text datasets into useful information that helps in improving patient care and provides informative reports. Data mining technology analyzes data within the Radiology Information System and Hospital Information System using specialized software which assesses relationships and agreement in available information. By using similar data analysis tools, radiologists can make informed decisions and predict the future outcome of a particular imaging finding. Data, information and knowledge are the components of data mining. Classes, Clusters, Associations, Sequential patterns, Classification, Prediction and Decision tree are the various types of data mining. Data mining has the potential to make delivery of health care affordable and ensure that the best imaging practices are followed. It is a tool for academic research. Data mining is considered to be ethically neutral; however, concerns regarding privacy and legality exist and need to be addressed to ensure the success of data mining.

  9. Data mining in radiology.

    Science.gov (United States)

    Kharat, Amit T; Singh, Amarjit; Kulkarni, Vilas M; Shah, Digish

    2014-04-01

    Data mining facilitates the study of radiology data in various dimensions. It converts large patient image and text datasets into useful information that helps in improving patient care and provides informative reports. Data mining technology analyzes data within the Radiology Information System and Hospital Information System using specialized software which assesses relationships and agreement in available information. By using similar data analysis tools, radiologists can make informed decisions and predict the future outcome of a particular imaging finding. Data, information and knowledge are the components of data mining. Classes, Clusters, Associations, Sequential patterns, Classification, Prediction and Decision tree are the various types of data mining. Data mining has the potential to make delivery of health care affordable and ensure that the best imaging practices are followed. It is a tool for academic research. Data mining is considered to be ethically neutral; however, concerns regarding privacy and legality exist and need to be addressed to ensure the success of data mining. PMID:25024513

  10. Genome bioinformatics of tomato and potato

    OpenAIRE

    Datema, E.

    2011-01-01

    In the past two decades genome sequencing has developed from a laborious and costly technology employed by large international consortia to a widely used, automated and affordable tool used worldwide by many individual research groups. Genome sequences of many food animals and crop plants have been deciphered and are being exploited for fundamental research and applied to improve their breeding programs. The developments in sequencing technologies have also impacted the associated bioinformat...

  11. Text mining and its potential applications in systems biology.

    Science.gov (United States)

    Ananiadou, Sophia; Kell, Douglas B; Tsujii, Jun-ichi

    2006-12-01

    With biomedical literature increasing at a rate of several thousand papers per week, it is impossible to keep abreast of all developments; therefore, automated means to manage the information overload are required. Text mining techniques, which involve the processes of information retrieval, information extraction and data mining, provide a means of solving this. By adding meaning to text, these techniques produce a more structured analysis of textual knowledge than simple word searches, and can provide powerful tools for the production and analysis of systems biology models. PMID:17045684

  12. Data mining approach to model the diagnostic service management.

    Science.gov (United States)

    Lee, Sun-Mi; Lee, Ae-Kyung; Park, Il-Su

    2006-01-01

    Korea has a National Health Insurance Program operated by the government-owned National Health Insurance Corporation, and diagnostic services are provided every two years for the insured and their family members. Developing a customer relationship management (CRM) system using data mining technology would be useful to improve the performance of diagnostic service programs. Under these circumstances, this study developed a model for diagnostic service management taking into account the characteristics of subjects using a data mining approach. This study could be further used to develop an automated CRM system contributing to an increase in the rate of receiving diagnostic services. PMID:17102454

  13. Rapid automated nuclear chemistry

    Energy Technology Data Exchange (ETDEWEB)

    Meyer, R.A.

    1979-05-31

    Rapid Automated Nuclear Chemistry (RANC) can be thought of as the Z-separation of Neutron-rich Isotopes by Automated Methods. The range of RANC studies of fission and its products is large. In a sense, the studies can be categorized into various energy ranges from the highest where the fission process and particle emission are considered, to low energies where nuclear dynamics are being explored. This paper presents a table which gives examples of current research using RANC on fission and fission products. The remainder of this text is divided into three parts. The first contains a discussion of the chemical methods available for the fission product elements, the second describes the major techniques, and in the last section, examples of recent results are discussed as illustrations of the use of RANC.

  14. Automated theorem proving.

    Science.gov (United States)

    Plaisted, David A

    2014-03-01

    Automated theorem proving is the use of computers to prove or disprove mathematical or logical statements. Such statements can express properties of hardware or software systems, or facts about the world that are relevant for applications such as natural language processing and planning. A brief introduction to propositional and first-order logic is given, along with some of the main methods of automated theorem proving in these logics. These methods include resolution, Davis and Putnam-style approaches, and others. Methods for handling the equality axioms are also presented. Methods of theorem proving in propositional logic are presented first, followed by methods for first-order logic. WIREs Cogn Sci 2014, 5:115-128. doi: 10.1002/wcs.1269 CONFLICT OF INTEREST: The author has declared no conflicts of interest for this article. For further resources related to this article, please visit the WIREs website. PMID:26304304
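
    As a concrete illustration of one of the propositional methods named above, the sketch below implements a compact Davis-Putnam-Logemann-Loveland (DPLL) procedure. It is a teaching-scale sketch, not an industrial solver: clauses are lists of non-zero integers, with -x denoting the negation of variable x.

```python
# Compact DPLL satisfiability sketch (unit propagation + branching).
def dpll(clauses, assignment=None):
    assignment = dict(assignment or {})

    def simplify(cls, lit):
        out = []
        for c in cls:
            if lit in c:
                continue            # clause satisfied by lit
            reduced = [l for l in c if l != -lit]
            if not reduced:
                return None         # empty clause -> conflict
            out.append(reduced)
        return out

    changed = True                  # unit propagation
    while changed:
        changed = False
        for c in clauses:
            if len(c) == 1:
                lit = c[0]
                assignment[abs(lit)] = lit > 0
                clauses = simplify(clauses, lit)
                if clauses is None:
                    return None
                changed = True
                break

    if not clauses:
        return assignment           # all clauses satisfied

    lit = clauses[0][0]             # branch on an unassigned literal
    for choice in (lit, -lit):
        reduced = simplify(clauses, choice)
        if reduced is not None:
            result = dpll(reduced, {**assignment, abs(choice): choice > 0})
            if result is not None:
                return result
    return None

# (p or q) and (not p or q) and (not q or r)
print(dpll([[1, 2], [-1, 2], [-2, 3]]))
```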

  15. ATLAS Distributed Computing Automation

    CERN Document Server

    Schovancova, J; The ATLAS collaboration; Borrego, C; Campana, S; Di Girolamo, A; Elmsheuser, J; Hejbal, J; Kouba, T; Legger, F; Magradze, E; Medrano Llamas, R; Negri, G; Rinaldi, L; Sciacca, G; Serfon, C; Van Der Ster, D C

    2012-01-01

    The ATLAS Experiment benefits from computing resources distributed worldwide at more than 100 WLCG sites. The ATLAS Grid sites provide over 100k CPU job slots and over 100 PB of storage space on disk or tape. Monitoring the status of such a complex infrastructure is essential. The ATLAS Grid infrastructure is monitored 24/7 by two teams of shifters distributed worldwide, by the ATLAS Distributed Computing experts, and by site administrators. In this paper we summarize automation efforts performed within the ATLAS Distributed Computing team in order to reduce manpower costs and improve the reliability of the system. Different aspects of the automation process are described: from the ATLAS Grid site topology provided by the ATLAS Grid Information System, via automatic site testing by HammerCloud, to automatic exclusion from production or analysis activities.

  16. Rapid automated nuclear chemistry

    International Nuclear Information System (INIS)

    Rapid Automated Nuclear Chemistry (RANC) can be thought of as the Z-separation of Neutron-rich Isotopes by Automated Methods. The range of RANC studies of fission and its products is large. In a sense, the studies can be categorized into various energy ranges from the highest where the fission process and particle emission are considered, to low energies where nuclear dynamics are being explored. This paper presents a table which gives examples of current research using RANC on fission and fission products. The remainder of this text is divided into three parts. The first contains a discussion of the chemical methods available for the fission product elements, the second describes the major techniques, and in the last section, examples of recent results are discussed as illustrations of the use of RANC

  17. Assessment of reliability and efficiency of mining coal seams located above or below extracted coal seams with support coal pillars. [USSR

    Energy Technology Data Exchange (ETDEWEB)

    Batmanov, Yu.K.; Bakhtin, A.F.; Bulavka, E.I.

    1981-04-01

    Mining thin (under 1.1 m) coal seams located above or below extracted thicker coal seams in which coal support pillars were left is one of the ways of increasing coal output without major investment in Donbass coal mines. It is planned that by 1985, 25 thin coal seams will be mined in the Donbass. Investigations show that mining thin coal seams with gradients up to 12 degrees by a system of raise faces without leaving coal pillars is economical using mining systems available at present. This mining scheme is also economical in the case of coal seams located in zones of geologic dislocations. Using integrated mining systems (coal cutter, powered supports and face conveyor) in these coal seams would reduce mining costs by 0.2 to 0.3 rubles/t. Using automated integrated mining systems is economical in working faces with coal output exceeding 900 t/d. (3 refs.) (In Russian)

  18. The Automated Medical Office

    OpenAIRE

    Petreman, Mel

    1990-01-01

    With shock and surprise many physicians learned in the 1980s that they must change the way they do business. Competition for patients, increasing government regulation, and the rapidly escalating risk of litigation forces physicians to seek modern remedies in office management. The author describes a medical clinic that strives to be paperless using electronic innovation to solve the problems of medical practice management. A computer software program to automate information management in a c...

  19. Cancer genomics

    DEFF Research Database (Denmark)

    Norrild, Bodil; Guldberg, Per; Ralfkiær, Elisabeth Methner

    2007-01-01

    Almost all cells in the human body contain a complete copy of the genome with an estimated number of 25,000 genes. The sequences of these genes make up about three percent of the genome and comprise the inherited set of genetic information. The genome also contains information that determines whe...

  20. Text Mining Perspectives in Microarray Data Mining

    OpenAIRE

    Natarajan, Jeyakumar

    2013-01-01

    Current microarray data mining methods such as clustering, classification, and association analysis heavily rely on statistical and machine learning algorithms for analysis of large sets of gene expression data. In recent years, there has been a growing interest in methods that attempt to discover patterns based on multiple but related data sources. Gene expression data and the corresponding literature data are one such example. This paper suggests a new approach to microarray data mining as ...

  1. Automation in biological crystallization.

    Science.gov (United States)

    Stewart, Patrick Shaw; Mueller-Dieckmann, Jochen

    2014-06-01

    Crystallization remains the bottleneck in the crystallographic process leading from a gene to a three-dimensional model of the encoded protein or RNA. Automation of the individual steps of a crystallization experiment, from the preparation of crystallization cocktails for initial or optimization screens to the imaging of the experiments, has been the response to address this issue. Today, large high-throughput crystallization facilities, many of them open to the general user community, are capable of setting up thousands of crystallization trials per day. It is thus possible to test multiple constructs of each target for their ability to form crystals on a production-line basis. This has improved success rates and made crystallization much more convenient. High-throughput crystallization, however, cannot relieve users of the task of producing samples of high quality. Moreover, the time gained from eliminating manual preparations must now be invested in the careful evaluation of the increased number of experiments. The latter requires a sophisticated data and laboratory information-management system. A review of the current state of automation at the individual steps of crystallization with specific attention to the automation of optimization is given.

  2. Automated expert modeling for automated student evaluation.

    Energy Technology Data Exchange (ETDEWEB)

    Abbott, Robert G.

    2006-01-01

    The 8th International Conference on Intelligent Tutoring Systems provides a leading international forum for the dissemination of original results in the design, implementation, and evaluation of intelligent tutoring systems and related areas. The conference draws researchers from a broad spectrum of disciplines ranging from artificial intelligence and cognitive science to pedagogy and educational psychology. The conference explores the increasing real-world impact of intelligent tutoring systems on an increasingly global scale. Improved authoring tools and learning object standards enable fielding systems and curricula in real-world settings on an unprecedented scale. Researchers deploy ITSs in ever larger studies and increasingly use data from real students, tasks, and settings to guide new research. With high volumes of student interaction data, data mining, and machine learning, tutoring systems can learn from experience and improve their teaching performance. The increasing number of realistic evaluation studies also broadens researchers' knowledge about the educational contexts for which ITSs are best suited. At the same time, researchers explore how to expand and improve ITS/student communications, for example, how to achieve more flexible and responsive discourse with students, help students integrate Web resources into learning, use mobile technologies and games to enhance student motivation and learning, and address multicultural perspectives.

  3. Collaborative Data Mining

    Science.gov (United States)

    Moyle, Steve

    Collaborative Data Mining is a setting where the Data Mining effort is distributed to multiple collaborating agents - human or software. The objective of the collaborative Data Mining effort is to produce solutions to the tackled Data Mining problem which are considered better by some metric, with respect to those solutions that would have been achieved by individual, non-collaborating agents. The solutions require evaluation, comparison, and approaches for combination. Collaboration requires communication, and implies some form of community. The human form of collaboration is a social task. Organizing communities in an effective manner is non-trivial and often requires well defined roles and processes. Data Mining, too, benefits from a standard process. This chapter explores the standard Data Mining process CRISP-DM utilized in a collaborative setting.

  4. Coal mine site reclamation

    Energy Technology Data Exchange (ETDEWEB)

    NONE

    2013-02-15

    Coal mine sites can have significant effects on local environments. In addition to the physical disruption of land forms and ecosystems, mining can also leave behind a legacy of secondary detrimental effects due to leaching of acid and trace elements from discarded materials. This report looks at the remediation of both deep mine and opencast mine sites, covering reclamation methods, back-filling issues, drainage and restoration. Examples of national variations in the applicable legislation and in the definition of rehabilitation are compared. Ultimately, mine site rehabilitation should return sites to conditions where land forms, soils, hydrology, and flora and fauna are self-sustaining and compatible with surrounding land uses. Case studies are given to show what can be achieved and how some landscapes can actually be improved as a result of mining activity.

  5. Gold-Mining

    DEFF Research Database (Denmark)

    Raaballe, J.; Grundy, B.D.

    2002-01-01

      Based on standard option pricing arguments and assumptions (including no convenience yield and sustainable property rights), we will not observe operating gold mines. We find that asymmetric information on the reserves in the gold mine is a necessary and sufficient condition for the existence of operating gold mines. Asymmetric information on the reserves in the mine implies that, at a high enough price of gold, the manager of high type finds the extraction value of the company to be higher than the current market value of the non-operating gold mine. Due to this under valuation the maxim of market value maximization forces the manager of high type to extract the gold. The implications are three-fold. First, all managers (except the lowest type) extract the gold too soon compared to the first-best policy of leaving the gold in the mine forever. Second, a manager of high type extracts the gold...

  6. The UCSC Genome Browser Database: update 2006

    DEFF Research Database (Denmark)

    Hinrichs, A S; Karolchik, D; Baertsch, R;

    2006-01-01

    The University of California Santa Cruz Genome Browser Database (GBD) contains sequence and annotation data for the genomes of about a dozen vertebrate species and several major model organisms. Genome annotations typically include assembly data, sequence composition, genes and gene predictions, mRNA and expressed sequence tag evidence, comparative genomics, regulation, expression and variation data. The database is optimized to support fast interactive performance with web tools that provide powerful visualization and querying capabilities for mining the data. The Genome Browser displays a wide variety of annotations at all scales from single nucleotide level up to a full chromosome. The Table Browser provides direct access to the database tables and sequence data, enabling complex queries on genome-wide datasets. The Proteome Browser graphically displays protein properties. The Gene Sorter allows filtering...

  7. Physics Mining of Multi-Source Data Sets

    Science.gov (United States)

    Helly, John; Karimabadi, Homa; Sipes, Tamara

    2012-01-01

    Powerful new parallel data mining algorithms can produce diagnostic and prognostic numerical models and analyses from observational data. These techniques yield higher-resolution measures of environmental parameters than ever before by fusing synoptic imagery and time-series measurements. These techniques are general and relevant to observational data, including raster, vector, and scalar, and can be applied in all Earth- and environmental science domains. Because they can be highly automated and are parallel, they scale to large spatial domains and are well suited to change and gap detection. This makes it possible to analyze spatial and temporal gaps in information, and facilitates within-mission replanning to optimize the allocation of observational resources. The basis of the innovation is the extension of a recently developed set of algorithms packaged into MineTool to multi-variate time-series data. MineTool is unique in that it automates the various steps of the data mining process, thus making it amenable to autonomous analysis of large data sets. Unlike techniques such as artificial neural nets, which yield a black-box solution, MineTool's outcome is always an analytical model in parametric form that expresses the output in terms of the input variables. This has the advantage that the derived equation can then be used to gain insight into the physical relevance and relative importance of the parameters and coefficients in the model. This is referred to as physics-mining of data. The capabilities of MineTool are extended to include both supervised and unsupervised algorithms, to handle multi-type data sets, and to run in parallel.
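
    MineTool itself is not sketched here; the short example below only illustrates the general idea of physics-mining, i.e. recovering an analytical model in parametric form from observational data so that the fitted coefficients can be inspected for physical meaning. The quadratic form and the synthetic data are assumptions made for the example.

```python
# Illustration of deriving a parametric analytical model from noisy data
# (not MineTool itself): fit y = a*x**2 + b*x + c by least squares.
import numpy as np

rng = np.random.default_rng(0)
x = np.linspace(0.0, 10.0, 50)
y_true = 2.0 * x**2 - 3.0 * x + 1.0
y_obs = y_true + rng.normal(scale=2.0, size=x.size)

a, b, c = np.polyfit(x, y_obs, deg=2)     # least-squares fit
print(f"fitted model: y = {a:.2f}*x^2 + {b:.2f}*x + {c:.2f}")
```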

  8. Advanced Text Mining Methods for the Financial Markets and Forecasting of Intraday Volatility

    OpenAIRE

    Pieper, Michael J.

    2011-01-01

    The flow of information in financial markets is covered in two parts. A high-order estimator of intraday volatility is introduced in order to boost risk forecasts. Over the last decade, text mining of news and its application to finance has been a vibrant topic of research, in academia as well as in the finance industry. This thesis develops a coherent approach to financial text mining that can be utilized for automated trading.

  9. Developing Image Processing Meta-Algorithms with Data Mining of Multiple Metrics

    OpenAIRE

    Kelvin Leung; Alexandre Cunha; TOGA, A. W.; D. Stott Parker

    2014-01-01

    People often use multiple metrics in image processing, but here we take a novel approach of mining the values of batteries of metrics on image processing results. We present a case for extending image processing methods to incorporate automated mining of multiple image metric values. Here by a metric we mean any image similarity or distance measure, and in this paper we consider intensity-based and statistical image measures and focus on registration as an image processing problem. We show ho...

  10. Statistical data analytics foundations for data mining, informatics, and knowledge discovery

    CERN Document Server

    Piegorsch, Walter W

    2015-01-01

    A comprehensive introduction to statistical methods for data mining and knowledge discovery. Applications of data mining and 'big data' increasingly take center stage in our modern, knowledge-driven society, supported by advances in computing power, automated data acquisition, social media development and interactive, linkable internet software. This book presents a coherent, technical introduction to modern statistical learning and analytics, starting from the core foundations of statistics and probability. It includes an overview of probability and statistical distributions, basic

  11. Implementation of Paste Backfill Mining Technology in Chinese Coal Mines

    Directory of Open Access Journals (Sweden)

    Qingliang Chang

    2014-01-01

    Full Text Available Implementation of clean mining technology at coal mines is crucial to protect the environment and maintain balance among energy resources, consumption, and ecology. After reviewing present coal clean mining technology, we introduce the technology principles and technological process of paste backfill mining in coal mines and discuss the components and features of backfill materials, the constitution of the backfill system, and the backfill process. Specific implementation of this technology and its application are analyzed for paste backfill mining in Daizhuang Coal Mine; a practical implementation shows that paste backfill mining can improve the safety and excavation rate of coal mining, which can effectively resolve surface subsidence problems caused by underground mining activities, by utilizing solid waste such as coal gangues as a resource. Therefore, paste backfill mining is an effective clean coal mining technology, which has widespread application.

  12. Mining in South Africa

    Energy Technology Data Exchange (ETDEWEB)

    Brewis, T.

    1992-09-01

    Poor metals prices, high inflation, rapid political change and (in the case of gold) hot, deep mines all pose challenging problems. Nevertheless, mining plays an essential role in the South African economy and the long-term outlook is positive. The article discusses recent developments in technology and gives production figures for mining of gold, coal, platinum, diamonds, ferrous metals, non-ferrous metals and industrial minerals. 4 refs., 1 tab., 5 photos.

  13. Allele mining and enhanced genetic recombination for rice breeding.

    Science.gov (United States)

    Leung, Hei; Raghavan, Chitra; Zhou, Bo; Oliva, Ricardo; Choi, Il Ryong; Lacorte, Vanica; Jubay, Mona Liza; Cruz, Casiana Vera; Gregorio, Glenn; Singh, Rakesh Kumar; Ulat, Victor Jun; Borja, Frances Nikki; Mauleon, Ramil; Alexandrov, Nickolai N; McNally, Kenneth L; Sackville Hamilton, Ruaraidh

    2015-12-01

    Traditional rice varieties harbour a large store of genetic diversity with potential to accelerate rice improvement. For a long time, this diversity maintained in the International Rice Genebank has not been fully used because of a lack of genome information. The publication of the first reference genome of Nipponbare by the International Rice Genome Sequencing Project (IRGSP) marked the beginning of a systematic exploration and use of rice diversity for genetic research and breeding. Since then, the Nipponbare genome has served as the reference for the assembly of many additional genomes. The recently completed 3000 Rice Genomes Project together with the public database (SNP-Seek) provides a new genomic and data resource that enables the identification of useful accessions for breeding. Using disease resistance traits as case studies, we demonstrated the power of allele mining in the 3,000 genomes for extracting accessions from the GeneBank for targeted phenotyping. Although potentially useful landraces can now be identified, their use in breeding is often hindered by unfavourable linkages. Efficient breeding designs are much needed to transfer the useful diversity to breeding. Multi-parent Advanced Generation InterCross (MAGIC) is a breeding design to produce highly recombined populations. The MAGIC approach can be used to generate pre-breeding populations with increased genotypic diversity and reduced linkage drag. Allele mining combined with a multi-parent breeding design can help convert useful diversity into breeding-ready genetic resources. PMID:26606925
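
    A minimal sketch of the allele-mining step described above is shown below. The marker name, alleles and accession identifiers are invented, and the small genotype table stands in for a real resource such as SNP-Seek; the point is only that accessions carrying a resistance-associated allele can be pulled out programmatically for targeted phenotyping.

```python
# Hypothetical allele-mining sketch: select accessions carrying a
# resistance-associated allele at one marker (all identifiers invented).
import pandas as pd

genotypes = pd.DataFrame(
    {
        "accession": ["IRGC-101", "IRGC-102", "IRGC-103", "IRGC-104"],
        "snp_chr11_123456": ["A", "G", "A", "G"],  # hypothetical marker
    }
)

resistant_allele = "A"
candidates = genotypes[genotypes["snp_chr11_123456"] == resistant_allele]
print(candidates["accession"].tolist())  # accessions to send for phenotyping
```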

  14. The intelligent deep mine

    Energy Technology Data Exchange (ETDEWEB)

    Hejny, Horst [Mineral Industry Research Organisation (MIRO), Birmingham (United Kingdom)

    2010-12-15

    The intended ''Intelligent Deep Mine'' (IDM) initiative will cope with the challenges the mining industry is currently facing. The main focus will be on technological issues. An increasing share of total mining production will come from underground mining. Mineral deposits will be found at gradually greater depths, with all the problems associated with this: an increase in overburden pressure with subsequent rock stability problems and risks of structural collapse. There is a need for new and safe technologies for deep underground mining. This need targets nearly all parts of a modern mine, from the infrastructure, including the communication network, logistics and transport, up to mine preparation work and the winning operation itself, including maintenance. Additionally, the concept of near-to-face preparation will be considered in order to move towards an ''Invisible Mine'' approach. The IDM project marks the start of a series of development activities aiming to realise the concept of an invisible, zero-impact mine. The extractive sector, still seen as old-fashioned and highly polluting, will join forces to revise this image, showing that minerals extraction and processing can be done in a highly innovative manner with low impact underground and zero impact above ground. (orig.)

  15. Uranium mining and milling

    International Nuclear Information System (INIS)

    In this report uranium mining and milling are reviewed. The fuel cycle, different types of uranium geological deposits, blending of ores, open cast and underground mining, the mining cost and radiation protection in mines are treated in the first part of this report. In the second part, the milling of uranium ores is treated, including process technology, acid and alkaline leaching, process design for physical and chemical treatment of the ores, and the cost. Each chapter is clarified by added figures, diagrams, tables, and flowsheets. (HK)

  16. Responsible Mining: A Human Resources Strategy for Mine Development Project

    OpenAIRE

    Sampathkumar, Sriram (Ram)

    2012-01-01

    Mining is a global industry. Most mining companies operate internationally, often in remote, challenging environments, and consequently must frequently respond to unusual and demanding Human Resource (HR) requirements. It is my opinion that the strategic imperative behind success in the mining industry is responsible mining. The purpose of this paper is to examine how an effective HR strategy can be a competitive advantage that contributes to the success of a mining project in the global mining in...

  17. Data Mining Cultural Aspects of Social Media Marketing

    OpenAIRE

    Hochreiter, Ronald; Waldhauser, Christoph

    2014-01-01

    For marketing to function in a globalized world it must respect a diverse set of local cultures. With marketing efforts extending to social media platforms, the crossing of cultural boundaries can happen in an instant. In this paper we examine how culture influences the popularity of marketing messages in social media platforms. Text mining, automated translation and sentiment analysis contribute largely to our research. From our analysis of 400 posts on the localized Google+ pages of German ...

  18. Automation in organizations: Eternal conflict

    Science.gov (United States)

    Dieterly, D. L.

    1981-01-01

    Some ideas on and insights into the problems associated with automation in organizations are presented with emphasis on the concept of automation, its relationship to the individual, and its impact on system performance. An analogy is drawn, based on an American folk hero, to emphasize the extent of the problems encountered when dealing with automation within an organization. A model is proposed to focus attention on a set of appropriate dimensions. The function allocation process becomes a prominent aspect of the model. The current state of automation research is mentioned in relation to the ideas introduced. Proposed directions for an improved understanding of automation's effect on the individual's efficiency are discussed. The importance of understanding the individual's perception of the system in terms of the degree of automation is highlighted.

  19. Automated Eukaryotic Gene Structure Annotation Using EVidenceModeler and the Program to Assemble Spliced Alignments

    Energy Technology Data Exchange (ETDEWEB)

    Haas, B J; Salzberg, S L; Zhu, W; Pertea, M; Allen, J E; Orvis, J; White, O; Buell, C R; Wortman, J R

    2007-12-10

    EVidenceModeler (EVM) is presented as an automated eukaryotic gene structure annotation tool that reports eukaryotic gene structures as a weighted consensus of all available evidence. EVM, when combined with the Program to Assemble Spliced Alignments (PASA), yields a comprehensive, configurable annotation system that predicts protein-coding genes and alternatively spliced isoforms. Our experiments on both rice and human genome sequences demonstrate that EVM produces automated gene structure annotation approaching the quality of manual curation.
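
    The idea of a weighted consensus of evidence can be illustrated with a toy score, shown below. This is only a sketch of the general principle; it is not the algorithm implemented in EVidenceModeler, and the evidence types, weights and candidate models are invented.

```python
# Toy weighted-evidence scoring for candidate gene structures
# (illustrative only; not the EVidenceModeler algorithm).
evidence_weights = {"ab_initio": 1.0, "protein_alignment": 5.0, "pasa_transcript": 10.0}

# Which evidence types support each candidate structure at a locus.
candidates = {
    "model_A": ["ab_initio"],
    "model_B": ["ab_initio", "protein_alignment"],
    "model_C": ["protein_alignment", "pasa_transcript"],
}

scores = {
    name: sum(evidence_weights[e] for e in supported)
    for name, supported in candidates.items()
}
best = max(scores, key=scores.get)
print(scores, "->", best)   # the most strongly supported structure wins
```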

  20. Automated Assessment, Face to Face

    OpenAIRE

    Rizik M. H. Al-Sayyed; Amjad Hudaib; Muhannad AL-Shboul; Yousef Majdalawi; Mohammed Bataineh

    2010-01-01

    This research paper evaluates the usability of automated exams and compares them with traditional paper-and-pencil ones. It presents the results of a detailed study conducted at The University of Jordan (UoJ) that comprised students from 15 faculties. A set of 613 students were asked for their opinions concerning automated exams, and their responses were analyzed in depth. The results indicate that most students reported being satisfied with automated exams, but they have sugg...

  1. Automation System Products and Research

    OpenAIRE

    Rintala, Mikko; Sormunen, Jussi; Kuisma, Petri; Rahkala, Matti

    2014-01-01

    Automation systems are used in most buildings nowadays. In the past they were mainly used in industry to control and monitor critical systems. During the past few decades automation systems have become more common and are used today in everything from large industrial installations to the homes of private customers. With the growing need for ecological and cost-efficient management systems, home and building automation systems are becoming a standard way of controlling lighting, ventilation, heating etc. Auto...

  2. Test Automation of Online Games

    OpenAIRE

    Schoenfeldt, Alexander

    2015-01-01

    State-of-the-art browser games are increasingly complex pieces of software with extensive code bases. With increasing complexity, software becomes harder to maintain. Automated regression testing can simplify these maintenance processes and thereby enable developers as well as testers to use their working time more efficiently. This thesis addresses the use of automated tests in web applications. As a use case, test automation is applied to an online-based strategy game for the bro...

  3. Text Mining for Protein Docking.

    Directory of Open Access Journals (Sweden)

    Varsha D Badal

    2015-12-01

    Full Text Available The rapidly growing amount of publicly available information from biomedical research is readily accessible on the Internet, providing a powerful resource for predictive biomolecular modeling. The accumulated data on experimentally determined structures transformed structure prediction of proteins and protein complexes. Instead of exploring the enormous search space, predictive tools can simply proceed to the solution based on similarity to the existing, previously determined structures. A similar major paradigm shift is emerging due to the rapidly expanding amount of information, other than experimentally determined structures, which still can be used as constraints in biomolecular structure prediction. Automated text mining has been widely used in recreating protein interaction networks, as well as in detecting small ligand binding sites on protein structures. Combining and expanding these two well-developed areas of research, we applied text mining to structural modeling of protein-protein complexes (protein docking). Protein docking can be significantly improved when constraints on the docking mode are available. We developed a procedure that retrieves published abstracts on a specific protein-protein interaction and extracts information relevant to docking. The procedure was assessed on protein complexes from Dockground (http://dockground.compbio.ku.edu). The results show that correct information on binding residues can be extracted for about half of the complexes. The amount of irrelevant information was reduced by conceptual analysis of a subset of the retrieved abstracts, based on the bag-of-words (features) approach. Support Vector Machine models were trained and validated on the subset. The remaining abstracts were filtered by the best-performing models, which decreased the irrelevant information for ~25% of the complexes in the dataset. The extracted constraints were incorporated in the docking protocol and tested on the Dockground unbound
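
    The bag-of-words plus Support Vector Machine filtering step described above can be sketched with scikit-learn, as below. The four labelled sentences are invented stand-ins for manually labelled abstracts; a real model would be trained on a much larger curated set.

```python
# Minimal bag-of-words + SVM relevance filter (toy training data).
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.svm import LinearSVC
from sklearn.pipeline import make_pipeline

texts = [
    "Mutation of residue Arg45 abolished binding to the partner protein.",
    "Alanine scanning identified interface residues required for complex formation.",
    "The gene is broadly expressed across developmental stages.",
    "Expression levels varied between tissues in the knockout mouse.",
]
labels = ["relevant", "relevant", "irrelevant", "irrelevant"]

model = make_pipeline(CountVectorizer(), LinearSVC())
model.fit(texts, labels)

new_abstract = ["Substitution of the interface residue reduced binding affinity."]
print(model.predict(new_abstract))
```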

  4. Mechatronic Design Automation

    DEFF Research Database (Denmark)

    Fan, Zhun

    This book proposes a novel design method that combines both genetic programming (GP) to automatically explore the open-ended design space and bond graphs (BG) to unify design representations of multi-domain Mechatronic systems. Results show that the method, formally called GPBG method, can successfully design analogue filters, vibration absorbers, micro-electro-mechanical systems, and vehicle suspension systems, all in an automatic or semi-automatic way. It also investigates the very important issue of co-designing plant-structures and dynamic controllers in automated design of Mechatronic...

  5. The automated medical office.

    Science.gov (United States)

    Petreman, M

    1990-08-01

    With shock and surprise many physicians learned in the 1980s that they must change the way they do business. Competition for patients, increasing government regulation, and the rapidly escalating risk of litigation forces physicians to seek modern remedies in office management. The author describes a medical clinic that strives to be paperless using electronic innovation to solve the problems of medical practice management. A computer software program to automate information management in a clinic shows that practical thinking linked to advanced technology can greatly improve office efficiency.

  6. AUTOMATED API TESTING APPROACH

    Directory of Open Access Journals (Sweden)

    SUNIL L. BANGARE

    2012-02-01

    Full Text Available Software testing is an investigation conducted to provide stakeholders with information about the quality of the product or service under test. With the help of software testing we can verify or validate the software product. Normally testing is done after the development of the software, but it can also be performed during the development process. This paper gives a brief introduction to an automated API testing tool. Such a testing tool reduces much of the effort required after the software has been developed, and it saves time as well as money. This type of testing is also helpful in industry and in colleges.
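
    A generic example of what an automated API test looks like is given below. It is not the tool described in the paper: it simply calls a public echo service (httpbin.org, used here only for illustration) with the requests library and asserts on the response.

```python
# Generic automated API test sketch (not the paper's tool).
import requests

def test_get_echoes_query_params():
    resp = requests.get("https://httpbin.org/get", params={"q": "demo"}, timeout=10)
    assert resp.status_code == 200
    assert resp.json()["args"]["q"] == "demo"

if __name__ == "__main__":
    test_get_echoes_query_params()
    print("API test passed")
```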

  7. World-wide distribution automation systems

    Energy Technology Data Exchange (ETDEWEB)

    Devaney, T.M.

    1994-12-31

    A worldwide power distribution automation system is outlined. Distribution automation is defined and the status of utility automation is discussed. Other topics discussed include a distribution management system; substation, feeder, and customer functions; potential benefits; automation costs; planning and engineering considerations; automation trends; databases; system operation; computer modeling of the system; and distribution management systems.

  8. Identification and evolutionary genomics of novel LTR retrotransposons in Brassica

    OpenAIRE

    NOUROZ, FAISAL; NOREEN, SHUMAILA; HESLOP-HARRISON, JOHN SEYMOUR

    2015-01-01

    Abstract: Retrotransposons (REs) are the most abundant and diverse elements identified from eukaryotic genomes. Using computational and molecular methods, 262 intact LTR retrotransposons were identified from Brassica genomes by dot plot analysis and data mining. The Copia superfamily was dominant (206 elements) over Gypsy (56), with estimated intact copies of ~1596 Copia and 540 Gypsy and ~7540 Copia and 780 Gypsy from Brassica rapa and Brassica oleracea whole genomes, respectively. Canonical...

  9. BRAD, the genetics and genomics database for Brassica plants

    OpenAIRE

    Li Pingxia; Liu Bo; Sun Silong; Fang Lu; Wu Jian; Liu Shengyi; Cheng Feng; Hua Wei; Wang Xiaowu

    2011-01-01

    Abstract Background Brassica species include both vegetable and oilseed crops, which are very important in the daily lives of many people. At the same time, Brassica species represent an excellent system for studying numerous aspects of plant biology, specifically the analysis of genome evolution following polyploidy, so they are also very important for scientific research. Now that the genome of Brassica rapa has been assembled, it is time to mine the genome data in depth. D...

  10. Bioinformatics for Whole-Genome Shotgun Sequencing of Microbial Communities

    OpenAIRE

    Chen, Kevin; Pachter, Lior

    2005-01-01

    The application of whole-genome shotgun sequencing to microbial communities represents a major development in metagenomics, the study of uncultured microbes via the tools of modern genomic analysis. In the past year, whole-genome shotgun sequencing projects of prokaryotic communities from an acid mine biofilm, the Sargasso Sea, Minnesota farm soil, three deep-sea whale falls, and deep-sea sediments have been reported, adding to previously published work on viral communities from marine and fe...

  11. The Penarroya mining railway

    International Nuclear Information System (INIS)

    The mining railway of the French Societe Miniere et Metallurgique de Penarroya, 241 km long, was the second largest private narrow-gauge railway in Spain. Located inland, it linked the coal and galena mines with the foundries and with the national railroad grid, transporting minerals to national and foreign markets. (Author)

  12. Ghana - Mining and Development

    OpenAIRE

    P. C. Mohan

    2004-01-01

    The objectives of the project ($9.37 million, 1996-2001) were to (a) enhance the capacity of the mining sector institutions to carry out their functions of encouraging and regulating investments in the mining sector in an environmentally sound manner and (b) support the use of techniques and mechanisms that will improve productivity, financial viability and reduce the environmental impact of ...

  13. Automation from pictures

    International Nuclear Information System (INIS)

    The state transition diagram (STD) model has been helpful in the design of real time software, especially with the emergence of graphical computer aided software engineering (CASE) tools. Nevertheless, the translation of the STD to real time code has in the past been primarily a manual task. At Los Alamos we have automated this process. The designer constructs the STD using a CASE tool (Cadre Teamwork) using a special notation for events and actions. A translator converts the STD into an intermediate state notation language (SNL), and this SNL is compiled directly into C code (a state program). Execution of the state program is driven by external events, allowing multiple state programs to effectively share the resources of the host processor. Since the design and the code are tightly integrated through the CASE tool, the design and code never diverge, and we avoid design obsolescence. Furthermore, the CASE tool automates the production of formal technical documents from the graphic description encapsulated by the CASE tool. (author)
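
    The kind of event-driven state program that an STD compiles down to can be sketched in a few lines. The example below is not the Los Alamos SNL translator or its C output; it is a hypothetical machine whose states, events and actions are invented to show how execution is driven entirely by external events.

```python
# Illustrative event-driven state machine (states/events/actions invented).
TRANSITIONS = {
    # (state, event): (next_state, action)
    ("idle", "start"): ("running", "power on"),
    ("running", "fault"): ("error", "sound alarm"),
    ("running", "stop"): ("idle", "power off"),
    ("error", "reset"): ("idle", "clear alarm"),
}

def run(events, state="idle"):
    for event in events:
        next_state, action = TRANSITIONS.get((state, event), (state, "ignore"))
        print(f"{state} --{event}/{action}--> {next_state}")
        state = next_state
    return state

run(["start", "fault", "reset", "start", "stop"])
```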

  14. Automated Test Case Generation

    CERN Document Server

    CERN. Geneva

    2015-01-01

    I would like to present the concept of automated test case generation. I work on it as part of my PhD and I think it would be interesting also for other people. It is also the topic of a workshop paper that I am introducing in Paris. (abstract below) Please note that the talk itself would be more general and not about the specifics of my PhD, but about the broad field of Automated Test Case Generation. I would introduce the main approaches (combinatorial testing, symbolic execution, adaptive random testing) and their advantages and problems. (oracle problem, combinatorial explosion, ...) Abstract of the paper: Over the last decade code-based test case generation techniques such as combinatorial testing or dynamic symbolic execution have seen growing research popularity. Most algorithms and tool implementations are based on finding assignments for input parameter values in order to maximise the execution branch coverage. Only few of them consider dependencies from outside the Code Under Test’s scope such...
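
    One of the approaches named in this abstract, combinatorial (pairwise) testing, can be sketched with a simple greedy generator: keep adding full parameter combinations until every pair of parameter values is covered. The parameter model below is invented for illustration, and the greedy strategy is only one of many ways to build a covering array.

```python
# Greedy pairwise (2-way combinatorial) test-case generation sketch.
from itertools import combinations, product

parameters = {
    "browser": ["firefox", "chrome"],
    "os": ["linux", "windows", "macos"],
    "locale": ["en", "de"],
}
names = list(parameters)

def pairs_of(case):
    return {((names[i], case[i]), (names[j], case[j]))
            for i, j in combinations(range(len(names)), 2)}

all_cases = list(product(*parameters.values()))
uncovered = set().union(*(pairs_of(c) for c in all_cases))

suite = []
while uncovered:
    best = max(all_cases, key=lambda c: len(pairs_of(c) & uncovered))
    suite.append(best)
    uncovered -= pairs_of(best)

print(f"{len(suite)} tests instead of {len(all_cases)} exhaustive combinations")
for case in suite:
    print(dict(zip(names, case)))
```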

  15. Automated Postediting of Documents

    CERN Document Server

    Knight, K; Knight, Kevin; Chander, Ishwar

    1994-01-01

    Large amounts of low- to medium-quality English texts are now being produced by machine translation (MT) systems, optical character readers (OCR), and non-native speakers of English. Most of this text must be postedited by hand before it sees the light of day. Improving text quality is tedious work, but its automation has not received much research attention. Anyone who has postedited a technical report or thesis written by a non-native speaker of English knows the potential of an automated postediting system. For the case of MT-generated text, we argue for the construction of postediting modules that are portable across MT systems, as an alternative to hardcoding improvements inside any one system. As an example, we have built a complete self-contained postediting module for the task of article selection (a, an, the) for English noun phrases. This is a notoriously difficult problem for Japanese-English MT. Our system contains over 200,000 rules derived automatically from online text resources. We report on l...

  16. Maneuver Automation Software

    Science.gov (United States)

    Uffelman, Hal; Goodson, Troy; Pellegrin, Michael; Stavert, Lynn; Burk, Thomas; Beach, David; Signorelli, Joel; Jones, Jeremy; Hahn, Yungsun; Attiyah, Ahlam; Illsley, Jeannette

    2009-01-01

    The Maneuver Automation Software (MAS) automates the process of generating commands for maneuvers to keep the spacecraft of the Cassini-Huygens mission on a predetermined prime mission trajectory. Before MAS became available, a team of approximately 10 members had to work about two weeks to design, test, and implement each maneuver in a process that involved running many maneuver-related application programs and then serially handing off data products to other parts of the team. MAS enables a three-member team to design, test, and implement a maneuver in about one-half hour after Navigation has process-tracking data. MAS accepts more than 60 parameters and 22 files as input directly from users. MAS consists of Practical Extraction and Reporting Language (PERL) scripts that link, sequence, and execute the maneuver- related application programs: "Pushing a single button" on a graphical user interface causes MAS to run navigation programs that design a maneuver; programs that create sequences of commands to execute the maneuver on the spacecraft; and a program that generates predictions about maneuver performance and generates reports and other files that enable users to quickly review and verify the maneuver design. MAS can also generate presentation materials, initiate electronic command request forms, and archive all data products for future reference.

  17. Data mining for service

    CERN Document Server

    2014-01-01

    Virtually all nontrivial and modern service related problems and systems involve data volumes and types that clearly fall into what is presently meant as "big data", that is, are huge, heterogeneous, complex, distributed, etc. Data mining is a series of processes which include collecting and accumulating data, modeling phenomena, and discovering new information, and it is one of the most important steps to scientific analysis of the processes of services.  Data mining application in services requires a thorough understanding of the characteristics of each service and knowledge of the compatibility of data mining technology within each particular service, rather than knowledge only in calculation speed and prediction accuracy. Varied examples of services provided in this book will help readers understand the relation between services and data mining technology. This book is intended to stimulate interest among researchers and practitioners in the relation between data mining technology and its application to ...

  18. Tellurium Mobility Through Mine Environments

    Science.gov (United States)

    Dorsk, M.

    2015-12-01

    Tellurium is a rare metalloid that has received minimal research attention regarding its environmental mobility. Inferences about tellurium mobility are mainly drawn from observations of related metalloids such as selenium and beryllium, and little research has addressed the specific behavior of tellurium. This laboratory work established the environmental controls that influence tellurium mobility and chemical speciation in aqueous systems. Theoretical simulations show possible mobility of Te as Te(OH)3+ under highly oxidizing and acidic conditions. Movement as TeO3(2-) under more basic conditions may also be possible at elevated Eh. Mobility in reducing environments is theoretically less likely. For a practical investigation of the conditions of Te mobility, a site with known tellurium content was chosen in Colorado. Composite samples were selected from the top, center and bottom of a tailings pile for elution experiments. These samples were disintegrated using a rock crusher and pulverized with an automated mortar and pestle. The material was then classified to 70 microns. A 10 g sample split was digested in concentrated HNO3 and HF and analyzed by atomic absorption spectroscopy to determine initial Te concentrations. Additional 10 g splits from each location were subjected to elution in 100 mL of each of the following solutions: nitric acid to a pH of 1.0, sulfuric acid to a pH of 2.0, sodium hydroxide to a pH of 12, ammonium hydroxide to a pH of 10, a pine needle/soil tea from material within the vicinity of the collection site to a pH of 3.5, and lastly distilled water with a pH of 7 to serve as a control. Sulfuric acid was purposefully chosen to simulate acid mine drainage from the decomposition of pyrite within the mine tailings. Sample subsets were also inundated with 10 mL of a 3% hydrogen peroxide solution to induce oxidizing conditions. All collected eluates were then analyzed by atomic absorption spectroscopy (AAS) to measure tellurium concentrations in

  19. An ISU study of asteroid mining

    Science.gov (United States)

    Burke, J. D.

    During the 1990 summer session of the International Space University, 59 graduate students from 16 countries carried out a design project on using the resources of near-earth asteroids. The results of the project, whose full report is now available from ISU, are summarized. The student team included people in these fields: architecture, business and management, engineering, life sciences, physical sciences, policy and law, resources and manufacturing, and satellite applications. They designed a project for transporting equipment and personnel to a near-earth asteroid, setting up a mining base there, and hauling products back for use in cislunar space. In addition, they outlined the needed precursor steps, beginning with expansion of present ground-based programs for finding and characterizing near-earth asteroids and continuing with automated flight missions to candidate bodies. (To limit the summer project's scope the actual design of these flight-mission precursors was excluded.) The main conclusions were that asteroid mining may provide an important complement to the future use of lunar resources, with the potential to provide large amounts of water and carbonaceous materials for use off earth. However, the recovery of such materials from presently known asteroids did not show an economic gain under the study assumptions; therefore, asteroid mining cannot yet be considered a prospective business.

  20. Spatiotemporal Data Mining: A Computational Perspective

    Directory of Open Access Journals (Sweden)

    Shashi Shekhar

    2015-10-01

    Full Text Available Explosive growth in geospatial and temporal data as well as the emergence of new technologies emphasize the need for automated discovery of spatiotemporal knowledge. Spatiotemporal data mining studies the process of discovering interesting and previously unknown, but potentially useful patterns from large spatiotemporal databases. It has broad application domains including ecology and environmental management, public safety, transportation, earth science, epidemiology, and climatology. The complexity of spatiotemporal data and intrinsic relationships limits the usefulness of conventional data science techniques for extracting spatiotemporal patterns. In this survey, we review recent computational techniques and tools in spatiotemporal data mining, focusing on several major pattern families: spatiotemporal outlier, spatiotemporal coupling and tele-coupling, spatiotemporal prediction, spatiotemporal partitioning and summarization, spatiotemporal hotspots, and change detection. Compared with other surveys in the literature, this paper emphasizes the statistical foundations of spatiotemporal data mining and provides comprehensive coverage of computational approaches for various pattern families. We also list popular software tools for spatiotemporal data analysis. The survey concludes with a look at future research needs.

  1. The Challenge of Wireless Connectivity to Support Intelligent Mines

    DEFF Research Database (Denmark)

    Barbosa, Viviane S. B.; Garcia, Luis G. U.; Portela Lopes de Almeida, Erika;

    2016-01-01

    ... for unmanned mine operations. Although voice and narrowband data radios have been used for years to support several types of mining activities, such as fleet management (dispatch) and telemetry, the use of automated equipment introduces a new set of connectivity requirements and poses a set of challenges in terms of network planning, management and optimization. For example, the data rates required to support unmanned equipment, e.g. a teleoperated bulldozer, shift from a few kilobits/second to megabits/second due to live video feeds. This traffic volume is well beyond the capabilities of Professional Mobile Radio narrowband systems and mandates the deployment of broadband systems. Furthermore, the (data) traffic requirements of a mine also vary in time as the fleet expands. Additionally, wireless networks are planned according to the characteristics of the scenario in which they will be deployed...

  2. MTGD: The Medicago truncatula genome database.

    Science.gov (United States)

    Krishnakumar, Vivek; Kim, Maria; Rosen, Benjamin D; Karamycheva, Svetlana; Bidwell, Shelby L; Tang, Haibao; Town, Christopher D

    2015-01-01

    Medicago truncatula, a close relative of alfalfa (Medicago sativa), is a model legume used for studying symbiotic nitrogen fixation, mycorrhizal interactions and legume genomics. J. Craig Venter Institute (JCVI; formerly TIGR) has been involved in M. truncatula genome sequencing and annotation since 2002 and has maintained a web-based resource providing data to the community for this entire period. The website (http://www.MedicagoGenome.org) has seen major updates in the past year, where it currently hosts the latest version of the genome (Mt4.0), associated data and legacy project information, presented to users via a rich set of open-source tools. A JBrowse-based genome browser interface exposes tracks for visualization. Mutant gene symbols originally assembled and curated by the Frugoli lab are now hosted at JCVI and tie into our community annotation interface, Medicago EuCAP (to be integrated soon with our implementation of WebApollo). Literature pertinent to M. truncatula is indexed and made searchable via the Textpresso search engine. The site also implements MedicMine, an instance of InterMine that offers interconnectivity with other plant 'mines' such as ThaleMine and PhytoMine, and other model organism databases (MODs). In addition to these new features, we continue to provide keyword- and locus identifier-based searches served via a Chado-backed Tripal Instance, a BLAST search interface and bulk downloads of data sets from the iPlant Data Store (iDS). Finally, we maintain an E-mail helpdesk, facilitated by a JIRA issue tracking system, where we receive and respond to questions about the website and requests for specific data sets from the community.

  3. Innovative management techniques to deal with mine water issues in the Sydney coal field, Nova Scotia, Canada

    Energy Technology Data Exchange (ETDEWEB)

    Shea, J. [Enterprise Cape Breton Corp., Sydney, NS (Canada)

    2010-07-01

    There are currently 20 mine pools at the Sydney Coalfield in Nova Scotia (NS) that have flooded to an equilibrium point and are discharging water. This paper discussed a new mine water management technique that is being used at 3 of these mine pools. An emergency active treatment plant was constructed at one of the mine shafts to prevent uncontrolled discharges. A drilling program was also conducted in the flooded zones of the mine to test the quality of the rising mine water. Pump tests were conducted to allow for the discharge of better quality mine water into a receiving stream without treatment. An automated and remote-controlled pumping system was installed. A passive treatment system consisting of aeration cascades, a 1.2 hectare settling pond and a 1.1 hectare reed bed wetland was constructed. The mine water flow through the pond was designed using simple piston flow theory, providing a 50 hour retention time for the mine water. Floating pond curtains were also installed. Boreholes were drilled to combine mine waters from other pools into the passive treatment plant. It is expected that mine water issues at the site will be resolved within the next 5 years. 3 refs., 4 figs.

  4. Tomato Functional Genomics Database: a comprehensive resource and analysis package for tomato functional genomics

    OpenAIRE

    Fei, Zhangjun; Joung, Je-Gun; Tang, Xuemei; Zheng, Yi; Huang, Mingyun; Lee, Je Min; McQuinn, Ryan; Tieman, Denise M.; Alba, Rob; Klee, Harry J.; Giovannoni, James J

    2010-01-01

    Tomato Functional Genomics Database (TFGD) provides a comprehensive resource to store, query, mine, analyze, visualize and integrate large-scale tomato functional genomics data sets. The database is functionally expanded from the previously described Tomato Expression Database by including metabolite profiles as well as large-scale tomato small RNA (sRNA) data sets. Computational pipelines have been developed to process microarray, metabolite and sRNA data sets archived in the database, respe...

  5. Microbial species delineation using whole genome sequences

    Energy Technology Data Exchange (ETDEWEB)

    Kyrpides, Nikos; Mukherjee, Supratim; Ivanova, Natalia; Mavrommatics, Kostas; Pati, Amrita; Konstantinidis, Konstantinos

    2014-10-20

    Species assignments in prokaryotes use a manual, poly-phasic approach utilizing both phenotypic traits and sequence information of phylogenetic marker genes. With thousands of genomes being sequenced every year, an automated, uniform and scalable approach exploiting the rich genomic information in whole genome sequences is desired, at least for the initial assignment of species to an organism. We have evaluated pairwise genome-wide Average Nucleotide Identity (gANI) values and alignment fractions (AFs) for nearly 13,000 genomes using our fast implementation of the computation, identifying robust and widely applicable hard cut-offs for species assignments based on AF and gANI. Using these cutoffs, we generated stable species-level clusters of organisms, which enabled the identification of several species mis-assignments and facilitated the assignment of species for organisms without species definitions.
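
    The clustering step described above can be sketched as graph clustering over pairwise comparisons: link two genomes when both alignment fraction (AF) and genome-wide ANI (gANI) clear hard cutoffs, then take connected components as species clusters. The cutoff values used below (AF >= 0.6, gANI >= 96.5) and the toy table are assumptions for illustration rather than figures quoted from this abstract, and networkx is assumed to be available.

```python
# Sketch: species clusters as connected components of an AF/gANI graph.
# Cutoffs and the pairwise table are illustrative assumptions.
import networkx as nx

pairwise = [
    # (genome_a, genome_b, AF, gANI)
    ("g1", "g2", 0.85, 98.7),
    ("g2", "g3", 0.72, 97.1),
    ("g1", "g4", 0.40, 91.0),
    ("g4", "g5", 0.90, 99.2),
]

G = nx.Graph()
G.add_nodes_from({g for a, b, _, _ in pairwise for g in (a, b)})
for a, b, af, gani in pairwise:
    if af >= 0.6 and gani >= 96.5:
        G.add_edge(a, b)

print([sorted(c) for c in nx.connected_components(G)])  # species clusters
```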

  6. Get smart! automate your house!

    NARCIS (Netherlands)

    Van Amstel, P.; Gorter, N.; De Rouw, J.

    2016-01-01

    This "designers' manual" is made during the TIDO-course AR0531 Innovation and Sustainability This manual will help you in reducing both energy usage and costs by automating your home. It gives an introduction to a number of home automation systems that every homeowner can install.

  7. Automated Methods Of Corrosion Measurements

    DEFF Research Database (Denmark)

    Bech-Nielsen, Gregers; Andersen, Jens Enevold Thaulov; Reeve, John Ch;

    1997-01-01

    The chapter describes the following automated measurements: Corrosion Measurements by Titration, Imaging Corrosion by Scanning Probe Microscopy, Critical Pitting Temperature and Application of the Electrochemical Hydrogen Permeation Cell.

  8. Automated separation for heterogeneous immunoassays

    OpenAIRE

    Truchaud, A.; Barclay, J; Yvert, J. P.; Capolaghi, B.

    1991-01-01

    Beside general requirements for modern automated systems, immunoassay automation involves specific requirements as a separation step for heterogeneous immunoassays. Systems are designed according to the solid phase selected: dedicated or open robots for coated tubes and wells, systems nearly similar to chemistry analysers in the case of magnetic particles, and a completely original design for those using porous and film materials.

  9. Automated Test-Form Generation

    Science.gov (United States)

    van der Linden, Wim J.; Diao, Qi

    2011-01-01

    In automated test assembly (ATA), the methodology of mixed-integer programming is used to select test items from an item bank to meet the specifications for a desired test form and optimize its measurement accuracy. The same methodology can be used to automate the formatting of the set of selected items into the actual test form. Three different…
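
    A toy version of the mixed-integer programming formulation is sketched below using the PuLP modelling library (assumed installed, with its bundled CBC solver): maximize total item information at a target ability level subject to a fixed test length and one content constraint. The item bank, information values and constraints are invented for the example and do not correspond to any real assembly model.

```python
# Toy automated test assembly as a 0/1 integer program (PuLP + CBC).
# Item information values and content areas are invented.
from pulp import LpProblem, LpMaximize, LpVariable, lpSum, LpBinary

# item id -> (information at theta = 0, content area)
bank = {
    1: (0.52, "algebra"), 2: (0.31, "algebra"), 3: (0.44, "geometry"),
    4: (0.60, "geometry"), 5: (0.25, "algebra"), 6: (0.48, "geometry"),
}

x = {i: LpVariable(f"x_{i}", cat=LpBinary) for i in bank}
prob = LpProblem("test_assembly", LpMaximize)
prob += lpSum(bank[i][0] * x[i] for i in bank)              # objective
prob += lpSum(x[i] for i in bank) == 4                      # test length
prob += lpSum(x[i] for i in bank if bank[i][1] == "algebra") >= 2

prob.solve()
print("selected items:", [i for i in bank if x[i].value() == 1])
```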

  10. Translation: Aids, Robots, and Automation.

    Science.gov (United States)

    Andreyewsky, Alexander

    1981-01-01

    Examines electronic aids to translation both as ways to automate it and as an approach to solve problems resulting from shortage of qualified translators. Describes the limitations of robotic MT (Machine Translation) systems, viewing MAT (Machine-Aided Translation) as the only practical solution and the best vehicle for further automation. (MES)

  11. Opening up Library Automation Software

    Science.gov (United States)

    Breeding, Marshall

    2009-01-01

    Throughout the history of library automation, the author has seen a steady advancement toward more open systems. In the early days of library automation, when proprietary systems dominated, the need for standards was paramount since other means of inter-operability and data exchange weren't possible. Today's focus on Application Programming…

  12. Radioecological challenges for mining

    Energy Technology Data Exchange (ETDEWEB)

    Vesterbacka, P.; Ikaeheimonen, T.K.; Solatie, D. [Radiation and Nuclear Safety Authority (Finland)

    2014-07-01

    In Finland, mining became popular in the mid-1990s when the mining amendments to the law made mining activities easier for foreign companies. The price of minerals also rose, and mining in Finland became economically profitable. The expanding mining industry brought new challenges to radiation safety, since radioactive substances occur in nearly all minerals. In Finnish soil and bedrock the average crustal abundances of uranium and thorium are 2.8 ppm and 10 ppm, respectively. It cannot be predicted beforehand how radionuclides behave in the mining processes, which is why they need to be taken into account in mining activities. The Radiation and Nuclear Safety Authority (STUK) has issued a national guide, ST 12.1, based on the Finnish Radiation Act. The guide sets the limits for radiation doses to the public, including those from mining activities. In general, no measures to limit the radiation exposure are needed if the dose from the operation liable to cause exposure to natural radiation is no greater than 0.1 mSv per year above the natural background radiation dose. If the exposure of the public may be higher than 0.1 mSv per year, the responsible party must provide STUK with a plan describing the measures by which the radiation exposure is to be kept as low as is reasonably achievable. In that case the responsible mining company has to carry out a radiological baseline study. The baseline study must focus on the environment that the mining activities may impact. The study describes the occurrence of natural radioactivity in the environment before any mining activities are started. The baseline study usually lasts for two to three years in natural circumstances. Based on the baseline study measurements, detailed information on the existing levels of radioactivity in the environment can be attained. Once the mining activities begin, it is important that limits are set for the wastewater discharges to the environment and environmental surveillance in the vicinity of

  13. Rumen microbial genomics

    International Nuclear Information System (INIS)

    cellulase systems are employed by at least some ruminal bacteria. But is that enough? 'Metagenomics' is a term coined with reference to the genetic potential resident within an entire microbial community, and is dependent upon high throughput DNA sequencing, advances in recombinant DNA technologies, and computational biology. It is anticipated that metagenomics will significantly augment the rumen genome studies that are already underway, and allow for the genetic characterization of microbes that cannot currently be cultured in the laboratory. The genetic potential of these species, which undoubtedly make a significant contribution to the ecology of the rumen environment, has, until now, escaped attention. The '-omics' technologies also offer exciting new opportunities to investigate microbial diversity and physiology in ruminants, other herbivorous animals, and humans. Hopefully, the current model that has been established by the North American Consortium will be just the beginning, but we are aware that many challenges lie ahead in terms of funding, data acquisition, data mining, and data interpretation. The benefits from these studies will, however, have global implications for animal productivity. (author)

  14. Advanced Data Mining of Leukemia Cells Micro-Arrays

    Directory of Open Access Journals (Sweden)

    Ryan M. Pierce

    2009-12-01

    Full Text Available This paper provides a continuation and extension of previous research by Segall and Pierce (2009a) that discussed data mining of micro-array databases of Leukemia cells, primarily with self-organized maps (SOM). As in Segall and Pierce (2009a) and Segall and Pierce (2009b), the results of applying data mining are shown and discussed for the data categories of micro-array databases of HL60, Jurkat, NB4 and U937 Leukemia cells, which are also described in this article. First, a background section is provided on the work of others pertaining to the applications of data mining to micro-array databases of Leukemia cells and micro-array databases in general. As noted in the predecessor article by Segall and Pierce (2009a), micro-array databases are one of the most popular functional genomics tools in use today. The research in this paper is intended to use advanced data mining technologies for better interpretations and knowledge discovery as generated by the patterns of gene expressions of HL60, Jurkat, NB4 and U937 Leukemia cells. The advanced data mining entailed using other data mining tools such as the cubic clustering criterion, variable importance rankings, decision trees, and more detailed examinations of data mining statistics, as well as a study of other self-organized map (SOM) clustering regions of the workspace as generated by SAS Enterprise Miner version 4. Conclusions and future directions of the research are also presented.
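
    For readers unfamiliar with self-organized maps, the following self-contained NumPy sketch trains a small SOM on random stand-in data of the same shape as a gene-expression matrix. It is not the SAS Enterprise Miner workflow used in the study, only an illustration of the basic algorithm.

```python
# Compact, self-contained self-organized map (SOM) sketch in NumPy. The random
# matrix stands in for real micro-array expression data.
import numpy as np

def train_som(data, grid=(4, 4), epochs=200, lr0=0.5, sigma0=1.5, seed=0):
    rng = np.random.default_rng(seed)
    n_features = data.shape[1]
    weights = rng.random((grid[0], grid[1], n_features))
    coords = np.stack(np.meshgrid(np.arange(grid[0]), np.arange(grid[1]),
                                  indexing="ij"), axis=-1)

    for epoch in range(epochs):
        lr = lr0 * np.exp(-epoch / epochs)          # decaying learning rate
        sigma = sigma0 * np.exp(-epoch / epochs)    # shrinking neighbourhood
        for x in data[rng.permutation(len(data))]:
            # best-matching unit (closest weight vector on the grid)
            dists = np.linalg.norm(weights - x, axis=2)
            bmu = np.unravel_index(np.argmin(dists), dists.shape)
            # Gaussian neighbourhood update around the BMU
            grid_dist = np.linalg.norm(coords - np.array(bmu), axis=2)
            h = np.exp(-(grid_dist ** 2) / (2 * sigma ** 2))[..., None]
            weights += lr * h * (x - weights)
    return weights

if __name__ == "__main__":
    expression = np.random.default_rng(1).random((100, 20))  # 100 genes x 20 samples
    som = train_som(expression)
    print(som.shape)  # (4, 4, 20)
```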

  15. Clinical pertinence metric enables hypothesis-independent genome-phenome analysis for neurologic diagnosis.

    Science.gov (United States)

    Segal, Michael M; Abdellateef, Mostafa; El-Hattab, Ayman W; Hilbush, Brian S; De La Vega, Francisco M; Tromp, Gerard; Williams, Marc S; Betensky, Rebecca A; Gleeson, Joseph

    2015-06-01

    We describe an "integrated genome-phenome analysis" that combines both genomic sequence data and clinical information for genomic diagnosis. It is novel in that it uses robust diagnostic decision support and combines the clinical differential diagnosis and the genomic variants using a "pertinence" metric. This allows the analysis to be hypothesis-independent, not requiring assumptions about mode of inheritance, number of genes involved, or which clinical findings are most relevant. Using 20 genomic trios with neurologic disease, we find that pertinence scores averaging 99.9% identify the causative variant under conditions in which a genomic trio is analyzed and family-aware variant calling is done. The analysis takes seconds, and pertinence scores can be improved by clinicians adding more findings. The core conclusion is that automated genome-phenome analysis can be accurate, rapid, and efficient. We also conclude that an automated process offers a methodology for quality improvement of many components of genomic analysis.

  16. Automated Motivic Analysis

    DEFF Research Database (Denmark)

    Lartillot, Olivier

    2016-01-01

    Motivic analysis provides very detailed understanding of musical compositions, but is also particularly difficult to formalize and systematize. A computational automation of the discovery of motivic patterns cannot be reduced to a mere extraction of all possible sequences of descriptions. The systematic approach inexorably leads to a proliferation of redundant structures that needs to be addressed properly. Global filtering techniques cause a drastic elimination of interesting structures that damages the quality of the analysis. On the other hand, a selection of closed patterns allows for lossless compression. The structural complexity resulting from successive repetitions of patterns can be controlled through a simple modelling of cycles. Generally, motivic patterns cannot always be defined solely as sequences of descriptions in a fixed set of dimensions: throughout the descriptions

  17. Robust automated knowledge capture.

    Energy Technology Data Exchange (ETDEWEB)

    Stevens-Adams, Susan Marie; Abbott, Robert G.; Forsythe, James Chris; Trumbo, Michael Christopher Stefan; Haass, Michael Joseph; Hendrickson, Stacey M. Langfitt

    2011-10-01

    This report summarizes research conducted through the Sandia National Laboratories Robust Automated Knowledge Capture Laboratory Directed Research and Development project. The objective of this project was to advance scientific understanding of the influence of individual cognitive attributes on decision making. The project has developed a quantitative model known as RumRunner that has proven effective in predicting the propensity of an individual to shift strategies on the basis of task and experience related parameters. Three separate studies are described which have validated the basic RumRunner model. This work provides a basis for better understanding human decision making in high-consequence national security applications and, in particular, the individual characteristics that underlie adaptive thinking.

  18. Automated Electrostatics Environmental Chamber

    Science.gov (United States)

    Calle, Carlos; Lewis, Dean C.; Buchanan, Randy K.; Buchanan, Aubri

    2005-01-01

    The Mars Electrostatics Chamber (MEC) is an environmental chamber designed primarily to create atmospheric conditions like those at the surface of Mars to support experiments on electrostatic effects in the Martian environment. The chamber is equipped with a vacuum system, a cryogenic cooling system, an atmospheric-gas replenishing and analysis system, and a computerized control system that can be programmed by the user and that provides both automation and options for manual control. The control system can be set to maintain steady Mars-like conditions or to impose temperature and pressure variations of a Mars diurnal cycle at any given season and latitude. In addition, the MEC can be used in other areas of research because it can create steady or varying atmospheric conditions anywhere within the wide temperature, pressure, and composition ranges between the extremes of Mars-like and Earth-like conditions.

  19. Automated Standard Hazard Tool

    Science.gov (United States)

    Stebler, Shane

    2014-01-01

    The current system used to generate standard hazard reports is considered cumbersome and iterative. This study defines a structure for this system's process in a clear, algorithmic way so that standard hazard reports and basic hazard analysis may be completed using a centralized, web-based computer application. To accomplish this task, a test server is used to host a prototype of the tool during development. The prototype is configured to easily integrate into NASA's current server systems with minimal alteration. Additionally, the tool is easily updated and provides NASA with a system that may grow to accommodate future requirements and, possibly, different applications. Results of this project's success are outlined in positive, subjective reviews completed by payload providers and NASA Safety and Mission Assurance personnel. Ideally, this prototype will increase interest in the concept of standard hazard automation and lead to the full-scale production of a user-ready application.

  20. Automated synthetic scene generation

    Science.gov (United States)

    Givens, Ryan N.

    Physics-based simulations generate synthetic imagery to help organizations anticipate system performance of proposed remote sensing systems. However, manually constructing synthetic scenes which are sophisticated enough to capture the complexity of real-world sites can take days to months depending on the size of the site and desired fidelity of the scene. This research, sponsored by the Air Force Research Laboratory's Sensors Directorate, successfully developed an automated approach to fuse high-resolution RGB imagery, lidar data, and hyperspectral imagery and then extract the necessary scene components. The method greatly reduces the time and money required to generate realistic synthetic scenes and developed new approaches to improve material identification using information from all three of the input datasets.

  1. Automated Essay Scoring

    Directory of Open Access Journals (Sweden)

    Semire DIKLI

    2006-01-01

    Full Text Available The impacts of computers on writing have been widely studied for three decades. Even basic computer functions, i.e. word processing, have been of great assistance to writers in modifying their essays. The research on Automated Essay Scoring (AES) has revealed that computers have the capacity to function as a more effective cognitive tool (Attali, 2004). AES is defined as the computer technology that evaluates and scores written prose (Shermis & Barrera, 2002; Shermis & Burstein, 2003; Shermis, Raymat, & Barrera, 2003). Revision and feedback are essential aspects of the writing process. Students need to receive feedback in order to increase their writing quality. However, responding to student papers can be a burden for teachers. Particularly if they have a large number of students and assign frequent writing assignments, providing individual feedback to student essays can be quite time consuming. AES systems can be very useful because they can provide the student with a score as well as feedback within seconds (Page, 2003). Four types of AES systems are widely used by testing companies, universities, and public schools: Project Essay Grader (PEG), Intelligent Essay Assessor (IEA), E-rater, and IntelliMetric. AES is a developing technology. Many AES systems are used to overcome time, cost, and generalizability issues in writing assessment. The accuracy and reliability of these systems have been proven to be high. The search for excellence in machine scoring of essays is continuing and numerous studies are being conducted to improve the effectiveness of the AES systems.

  2. Implementation of Paste Backfill Mining Technology in Chinese Coal Mines

    OpenAIRE

    Qingliang Chang; Jianhang Chen; Huaqiang Zhou; Jianbiao Bai

    2014-01-01

    Implementation of clean mining technology at coal mines is crucial to protect the environment and maintain balance among energy resources, consumption, and ecology. After reviewing present coal clean mining technology, we introduce the technology principles and technological process of paste backfill mining in coal mines and discuss the components and features of backfill materials, the constitution of the backfill system, and the backfill process. Specific implementation of this technology a...

  3. Mining a Web Citation Database for Author Co-Citation Analysis.

    Science.gov (United States)

    He, Yulan; Hui, Siu Cheung

    2002-01-01

    Proposes a mining process to automate author co-citation analysis based on the Web Citation Database, a data warehouse for storing citation indices of Web publications. Describes the use of agglomerative hierarchical clustering for author clustering and multidimensional scaling for displaying author cluster maps, and explains PubSearch, a…
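
    A minimal sketch of the two analysis steps named in the record, agglomerative hierarchical clustering of an author co-citation matrix followed by multidimensional scaling for an author map, is given below. The co-citation counts are invented, and the SciPy/scikit-learn calls are one possible realization, not the PubSearch implementation.

```python
# Hedged sketch: hierarchical clustering plus MDS on an invented author
# co-citation matrix.
import numpy as np
from scipy.cluster.hierarchy import linkage, fcluster
from scipy.spatial.distance import squareform
from sklearn.manifold import MDS

authors = ["A", "B", "C", "D"]
cocitation = np.array([
    [0, 8, 1, 0],
    [8, 0, 2, 1],
    [1, 2, 0, 7],
    [0, 1, 7, 0],
], dtype=float)

# Turn co-citation counts into dissimilarities (higher count -> more similar).
dissimilarity = cocitation.max() - cocitation
np.fill_diagonal(dissimilarity, 0.0)

# Agglomerative hierarchical clustering on the condensed distance matrix.
clusters = fcluster(linkage(squareform(dissimilarity), method="average"),
                    t=2, criterion="maxclust")

# 2-D author map via metric MDS on the precomputed dissimilarities.
xy = MDS(n_components=2, dissimilarity="precomputed",
         random_state=0).fit_transform(dissimilarity)

for name, c, (x, y) in zip(authors, clusters, xy):
    print(f"{name}: cluster {c}, position ({x:.2f}, {y:.2f})")
```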

  4. Automating the radiographic NDT process

    International Nuclear Information System (INIS)

    Automation, the removal of the human element in inspection, has not been generally applied to film radiographic NDT. The justification for automating is not only productivity but also reliability of results. Film remains in the automated system of the future because of its extremely high image content, approximately 8 x 10^9 bits per 14 x 17 (inch) film, the equivalent of 2200 computer floppy discs. Parts handling systems and robotics, applied in manufacturing and some NDT modalities, should now be applied to film radiographic NDT systems. Automatic film handling can be achieved with the daylight NDT film handling system. Automatic film processing is becoming the standard in industry and can be coupled to the daylight system. Robots offer the opportunity to fully automate the exposure step. Finally, computer aided interpretation appears on the horizon. A unit which laser scans a 14 x 17 (inch) film in 6-8 seconds can digitize film information for further manipulation and possible automatic interrogation (computer aided interpretation). The system, called FDRS (for Film Digital Radiography System), is moving toward 50 micron (approximately 16 lines/mm) resolution. This is believed to meet the majority of image content needs. We expect the automated system to appear first in parts (modules) as certain operations are automated. The future will see it all come together in an automated film radiographic NDT system. (author)

  5. VRLane: a desktop virtual safety management program for underground coal mine

    Science.gov (United States)

    Li, Mei; Chen, Jingzhu; Xiong, Wei; Zhang, Pengpeng; Wu, Daozheng

    2008-10-01

    VR technologies, which generate immersive, interactive, and three-dimensional (3D) environments, are seldom applied to coal mine safety management. In this paper, a new method that combines VR technologies with an underground mine safety management system was explored. A desktop virtual safety management program for underground coal mines, called VRLane, was developed. The paper is mainly concerned with current research advances in VR, system design, key techniques, and system application. Two important techniques are introduced in the paper. First, an algorithm was designed and implemented with which the 3D laneway models and equipment models can be built automatically on the basis of the latest 2D mine drawings, whereas common VR programs establish the 3D environment with 3DS Max or other 3D modeling software packages, in which laneway models are built manually and laboriously. Second, VRLane realized system integration with underground industrial automation. VRLane not only describes a realistic 3D laneway environment, but also describes the status of coal mining, with functions for displaying the running states and related parameters of equipment, pre-alarming abnormal mining events, and animating mine cars, mine workers, or long-wall shearers. The system, with the advantages of being cheap, dynamic, and easy to maintain, provides a useful tool for safety production management in coal mines.

  6. Data mining in Cloud Computing

    OpenAIRE

    Ruxandra-Ştefania PETRE

    2012-01-01

    This paper describes how data mining is used in cloud computing. Data mining is used for extracting potentially useful information from raw data. The integration of data mining techniques into normal day-to-day activities has become commonplace. Every day people are confronted with targeted advertising, and data mining techniques help businesses become more efficient by reducing costs. Data mining techniques and applications are very much needed in the cloud computing paradigm. The implem...

  7. Concept of Web Usage Mining

    OpenAIRE

    Istrate Mihai

    2011-01-01

    Web mining is the use of data mining techniques to automatically discover and extract information from World Wide Web documents and services. This article considers the question: is effective Web mining possible? Skeptics believe that the Web is too unstructured for Web mining to succeed. Indeed, data mining has been applied to databases traditionally, yet much of the information on the Web lies buried in documents designed for human consumption such as home pages or product catalogs. Further...

  8. Languages for Mining and Learning

    OpenAIRE

    De Raedt, Luc

    2015-01-01

    Applying machine learning and data mining to novel applications is cumbersome. This observation is the prime motivation for the interest in languages for learning and mining. In this talk, I shall provide a gentle introduction to three types of languages that support machine learning and data mining: inductive query languages, which extend database query languages with primitives for mining and learning; modelling languages, which allow users to declaratively specify and solve mining and learning p...

  9. Mine Maps as Grey Literature

    OpenAIRE

    Musser, Linda R. (Pennsylvania State University); GreyNet, Grey Literature Network Service

    2000-01-01

    Mine maps are extremely useful resources for determining regional and local hydrogeologic conditions as well as mineral resources and reserves. Users include mining companies, property owners concerned about risk factors related to mining activities, government inspectors, engineers and planners. Hundreds of thousands of mine maps exist, yet how many are collected or cataloged by libraries or archives? This paper examines the characteristics of mine maps, how they are published, their value, a...

  10. Data mining in agriculture

    CERN Document Server

    Mucherino, Antonio; Pardalos, Panos M

    2009-01-01

    Data Mining in Agriculture represents a comprehensive effort to provide graduate students and researchers with an analytical text on data mining techniques applied to agriculture and environment-related fields. The book presents both theoretical and practical insights, with a focus on presenting the context of each data mining technique rather intuitively, with ample concrete examples represented graphically and with algorithms written in MATLAB®. Examples and exercises with solutions are provided at the end of each chapter to facilitate the comprehension of the material. For each data mining technique described in the book, variants and improvements of the basic algorithm are also given. Also by P.J. Papajorgji and P.M. Pardalos: Advances in Modeling Agricultural Systems, 'Springer Optimization and its Applications' vol. 25, ©2009.

  11. Ensemble Data Mining Methods

    Data.gov (United States)

    National Aeronautics and Space Administration — Ensemble Data Mining Methods, also known as Committee Methods or Model Combiners, are machine learning methods that leverage the power of multiple models to achieve...

  12. Automated Quantitative Rare Earth Elements Mineralogy by Scanning Electron Microscopy

    Science.gov (United States)

    Sindern, Sven; Meyer, F. Michael

    2016-09-01

    Increasing industrial demand of rare earth elements (REEs) stems from the central role they play for advanced technologies and the accelerating move away from carbon-based fuels. However, REE production is often hampered by the chemical, mineralogical as well as textural complexity of the ores with a need for better understanding of their salient properties. This is not only essential for in-depth genetic interpretations but also for a robust assessment of ore quality and economic viability. The design of energy and cost-efficient processing of REE ores depends heavily on information about REE element deportment that can be made available employing automated quantitative process mineralogy. Quantitative mineralogy assigns numeric values to compositional and textural properties of mineral matter. Scanning electron microscopy (SEM) combined with a suitable software package for acquisition of backscatter electron and X-ray signals, phase assignment and image analysis is one of the most efficient tools for quantitative mineralogy. The four different SEM-based automated quantitative mineralogy systems, i.e. FEI QEMSCAN and MLA, Tescan TIMA and Zeiss Mineralogic Mining, which are commercially available, are briefly characterized. Using examples of quantitative REE mineralogy, this chapter illustrates capabilities and limitations of automated SEM-based systems. Chemical variability of REE minerals and analytical uncertainty can reduce performance of phase assignment. This is shown for the REE phases parisite and synchysite. In another example from a monazite REE deposit, the quantitative mineralogical parameters surface roughness and mineral association derived from image analysis are applied for automated discrimination of apatite formed in a breakdown reaction of monazite and apatite formed by metamorphism prior to monazite breakdown. SEM-based automated mineralogy fulfils all requirements for characterization of complex unconventional REE ores that will become

  13. Automated Fluid Interface System (AFIS)

    Science.gov (United States)

    1990-01-01

    Automated remote fluid servicing will be necessary for future space missions, as future satellites will be designed for on-orbit consumable replenishment. In order to develop an on-orbit remote servicing capability, a standard interface between a tanker and the receiving satellite is needed. The objective of the Automated Fluid Interface System (AFIS) program is to design, fabricate, and functionally demonstrate compliance with all design requirements for an automated fluid interface system. A description and documentation of the Fairchild AFIS design is provided.

  14. International mining forum 2004, new technologies in underground mining, safety in mines proceedings

    Energy Technology Data Exchange (ETDEWEB)

    Jerzy Kicki; Eugeniusz Sobczyk (eds.)

    2004-01-15

    The book comprises technical papers that were presented at the International Mining Forum 2004. This event aims to bring together scientists and engineers in mining, rock mechanics, and computer engineering, with a view to explore and discuss international developments in the field. Topics discussed in this book are: trends in the mining industry; new solutions and tendencies in underground mines; rock engineering problems in underground mines; utilization and exploitation of methane; prevention measures for the control of rock bursts in Polish mines; and current problems in Ukrainian coal mines.

  15. Python data mining environments

    OpenAIRE

    Mrak, Aleš

    2012-01-01

    In the thesis we compare data mining systems that have an interface in the Python programming language. Many open-source data mining systems and libraries have implemented software interfaces for the Python programming language. They chose Python because it is fast, provides object-oriented programming, allows for the integration of other software libraries, and is implemented on all major operating systems (Windows, Linux/Unix, OS/2, Mac, etc.). Our analysis s...

  16. Coal Mines Security System

    OpenAIRE

    Ankita Guhe; Shruti Deshmukh; Bhagyashree Borekar; Apoorva Kailaswar; Milind E. Rane

    2012-01-01

    The geological circumstances of mines are extremely complicated and there are many hidden troubles. Coal is illegally lifted by musclemen from coal stocks, coal washeries, and coal transfer and loading points, and also in the transport routes by tampering with the weighing of trucks. CIL (Coal India Ltd) is under the control of the mafia and a large number of irregularities can be attributed to the coal mafia. An Intelligent Coal Mine Security System using a data acquisition method utilizes sensor, auto...

  17. MINING INDUSTRY IN CROATIA

    OpenAIRE

    Slavko Vujec

    1996-01-01

    The trends of the world and European mining industry are presented with a short introductory review. The mining industry is very important in the economy of Croatia, because it covers most of the needed petroleum and natural gas, the total quantity of construction raw materials, and industrial non-metallic raw minerals. A detailed quantitative presentation of mineral raw material production is compared with the pre-war situation. The value of annual production is presented for each raw mineral (the paper is published in Croatian).

  18. MINING INDUSTRY IN CROATIA

    Directory of Open Access Journals (Sweden)

    Slavko Vujec

    1996-12-01

    Full Text Available The trends of the world and European mining industry are presented with a short introductory review. The mining industry is very important in the economy of Croatia, because it covers most of the needed petroleum and natural gas, the total quantity of construction raw materials, and industrial non-metallic raw minerals. A detailed quantitative presentation of mineral raw material production is compared with the pre-war situation. The value of annual production is presented for each raw mineral (the paper is published in Croatian).

  19. Applied data mining

    CERN Document Server

    Xu, Guandong

    2013-01-01

    Data mining has witnessed substantial advances in recent decades. New research questions and practical challenges have arisen from emerging areas and applications within the various fields closely related to human daily life, e.g. social media and social networking. This book aims to bridge the gap between traditional data mining and the latest advances in newly emerging information services. It explores the extension of well-studied algorithms and approaches into these new research arenas.

  20. Web Mining: An Overview

    Directory of Open Access Journals (Sweden)

    P. V. G. S. Mudiraj B. Jabber K. David raju

    2011-12-01

    Full Text Available Web usage mining is a main research area in Web mining, focused on learning about Web users and their interactions with Web sites. The motive of mining is to find users' access patterns automatically and quickly from the vast Web log data, such as frequent access paths, frequent access page groups and user clustering. Through Web usage mining, the server log, registration information and other related information left by users provide a foundation for decision making in organizations. This article provides a survey and analysis of current Web usage mining systems and technologies. There are generally three tasks in Web usage mining: preprocessing, pattern analysis and knowledge discovery. Preprocessing cleans the server log file by removing entries such as errors or failures and repeated requests for the same URL from the same host. The main task of pattern analysis is to filter out uninteresting information and to visualize and interpret the interesting patterns for users. The statistics collected from the log file can help to discover knowledge. This knowledge can be used to make decisions on various factors, for example classifying users and web pages as excellent, medium or weak based on the hit counts of the web pages in the web site. The design of the website is then restructured based on user behaviour or hit counts, which provides quicker responses to web users, saves server memory, and thus reduces HTTP requests and bandwidth utilization. This paper addresses challenges in the three phases of Web usage mining along with Web structure mining. The paper also discusses an application of WUM, an online recommender system that dynamically generates links to pages that have not yet been visited by a user and might be of potential interest to him. Differently from the recommender systems proposed so far, ONLINE MINER does not make use of any off-line component and is able to manage Web sites made up of dynamically generated pages.
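
    A minimal sketch of the preprocessing task described above, cleaning a server log by dropping failed requests and repeated requests for the same URL from the same host, follows. The log format and fields are assumed for illustration.

```python
# Minimal log-cleaning sketch for web usage mining preprocessing.
# The CSV-style log format is an assumption made for this example.
import csv
from io import StringIO

RAW_LOG = """host,url,status
10.0.0.1,/index.html,200
10.0.0.1,/index.html,200
10.0.0.1,/missing.html,404
10.0.0.2,/products.html,200
"""

def clean_log(raw: str):
    last_seen = {}
    cleaned = []
    for row in csv.DictReader(StringIO(raw)):
        if not row["status"].startswith("2"):         # drop error/failure entries
            continue
        if last_seen.get(row["host"]) == row["url"]:  # drop repeated request
            continue
        last_seen[row["host"]] = row["url"]
        cleaned.append(row)
    return cleaned

for entry in clean_log(RAW_LOG):
    print(entry["host"], entry["url"])
```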

  1. Data mining in radiology

    OpenAIRE

    Amit T Kharat; Amarjit Singh; Kulkarni, Vilas M; Digish Shah

    2014-01-01

    Data mining facilitates the study of radiology data in various dimensions. It converts large patient image and text datasets into useful information that helps in improving patient care and provides informative reports. Data mining technology analyzes data within the Radiology Information System and Hospital Information System using specialized software which assesses relationships and agreement in available information. By using similar data analysis tools, radiologists can make informed dec...

  2. Automated Training for Algorithms That Learn from Genomic Data

    OpenAIRE

    Gokcen Cilingir; Broschat, Shira L.

    2015-01-01

    Supervised machine learning algorithms are used by life scientists for a variety of objectives. Expert-curated public gene and protein databases are major resources for gathering data to train these algorithms. While these data resources are continuously updated, generally, these updates are not incorporated into published machine learning algorithms which thereby can become outdated soon after their introduction. In this paper, we propose a new model of operation for supervis...

  3. Investigation and characterization of mining subsidence in Kaiyang Phosphorus Mine

    Institute of Scientific and Technical Information of China (English)

    DENG Jian; BIAN Li

    2007-01-01

    In the Kaiyang Phosphorus Mine, serious environmental and safety problems have been caused by large-scale mining activities over the past 40 years. These problems include mining subsidence, a low recovery ratio, too much dead ore left in pillars, and pollution from phosphorus gypsum. Mining subsidence falls into four categories: curved ground and mesa, ground cracks and collapse holes, spalling and rockfall, and slope sliding and creeping. Measures to treat the mining subsidence were put forward: finding and managing abandoned stopes, optimizing the mining method (cut-and-fill mining), selecting proper backfilling materials (phosphogypsum mixtures), avoiding disordered mining operations, and treating highway slopes. These investigations and engineering treatment methods are believed to contribute to the safe extraction of ore and sustainable development in the Kaiyang Phosphorus Mine.

  4. The UCSC Genome Browser database: 2015 update.

    Science.gov (United States)

    Rosenbloom, Kate R; Armstrong, Joel; Barber, Galt P; Casper, Jonathan; Clawson, Hiram; Diekhans, Mark; Dreszer, Timothy R; Fujita, Pauline A; Guruvadoo, Luvina; Haeussler, Maximilian; Harte, Rachel A; Heitner, Steve; Hickey, Glenn; Hinrichs, Angie S; Hubley, Robert; Karolchik, Donna; Learned, Katrina; Lee, Brian T; Li, Chin H; Miga, Karen H; Nguyen, Ngan; Paten, Benedict; Raney, Brian J; Smit, Arian F A; Speir, Matthew L; Zweig, Ann S; Haussler, David; Kuhn, Robert M; Kent, W James

    2015-01-01

    Launched in 2001 to showcase the draft human genome assembly, the UCSC Genome Browser database (http://genome.ucsc.edu) and associated tools continue to grow, providing a comprehensive resource of genome assemblies and annotations to scientists and students worldwide. Highlights of the past year include the release of a browser for the first new human genome reference assembly in 4 years in December 2013 (GRCh38, UCSC hg38), a watershed comparative genomics annotation (100-species multiple alignment and conservation) and a novel distribution mechanism for the browser (GBiB: Genome Browser in a Box). We created browsers for new species (Chinese hamster, elephant shark, minke whale), 'mined the web' for DNA sequences and expanded the browser display with stacked color graphs and region highlighting. As our user community increasingly adopts the UCSC track hub and assembly hub representations for sharing large-scale genomic annotation data sets and genome sequencing projects, our menu of public data hubs has tripled. PMID:25428374

  5. Recent Developments of Genomic Research in Soybean

    Institute of Scientific and Technical Information of China (English)

    Ching Chan; Xinpeng Qi; Man-Wah Li; Fuk-Ling Wong; Hon-Ming Lam

    2012-01-01

    Soybean is an important cash crop with unique and important traits such as the high seed protein and oil contents, and the ability to perform symbiotic nitrogen fixation. A reference genome of cultivated soybeans was established in 2010, followed by whole-genome re-sequencing of wild and cultivated soybean accessions. These efforts revealed unique features of the soybean genome and helped to understand its evolution. Mapping of variations between wild and cultivated soybean genomes was performed. These genomic variations may be related to the process of domestication and human selection. Wild soybean germplasms exhibited high genomic diversity and hence may be an important source of novel genes/alleles. Accumulation of genomic data will help to refine genetic maps and expedite the identification of functional genes. In this review, we summarize the major findings from the whole-genome sequencing projects and discuss the possible impacts on soybean research and breeding programs. Some emerging areas such as transcriptomic and epigenomic studies will be introduced. In addition, we also tabulate some useful bioinformatics tools that will help the mining of the soybean genomic data.

  6. Coal mining in Ramagundam

    Energy Technology Data Exchange (ETDEWEB)

    Chakraberty, S.

    1979-07-01

    The Ramagundam area in the South Godavari Coalfield is one of the most promising coal-bearing belts in India. It contains total coal reserves of about 1,132,000,000 tons in an area of approximately 150 square kilometers, and holds high potential for development into a vast industrial center. During the past four years production has doubled to 3,500,000 tons in 1978 to 1979. By 1983 to 1984, the total output per year is planned to be doubled again. Increased mechanization and the introduction of more advanced mining techniques will help to achieve this goal. In addition to the present face machinery, i.e., gathering arm loaders/shuttle cars and side dump loaders/chain conveyor combinations, the latest Voest-Alpine AM50 tunneling and roadheading machines have been commissioned for development work. Load-haul-dump machines will be introduced in the near future to ensure higher loading/transport capacities. A double-drum shearer loader with self-advancing supports is due to be commissioned shortly for faster, more efficient longwall mining to supplement conventional bord and pillar mining. In addition, a mechanized open cast mine has come on stream, and a walking dragline will soon be delivered to the mine for removing overburden. The projected annual output from this mine will be about 2,000,000 tons. (LTN)

  7. SorghumFDB: sorghum functional genomics database with multidimensional network analysis

    OpenAIRE

    Tian, Tian; You, Qi; Zhang, Liwei; Yi, Xin; Yan, Hengyu; Xu, Wenying; Su, Zhen

    2016-01-01

    Sorghum (Sorghum bicolor [L.] Moench) has excellent agronomic traits and biological properties, such as heat and drought-tolerance. It is a C4 grass and potential bioenergy-producing plant, which makes it an important crop worldwide. With the sorghum genome sequence released, it is essential to establish a sorghum functional genomics data mining platform. We collected genomic data and some functional annotations to construct a sorghum functional genomics database (SorghumFDB). SorghumFDB inte...

  8. Data mining in healthcare: decision making and precision

    Directory of Open Access Journals (Sweden)

    Ionuţ ŢĂRANU

    2016-05-01

    Full Text Available The application of data mining in healthcare is increasing today because the health sector is rich in information and data mining has become a necessity. Healthcare organizations generate and collect large volumes of information on a daily basis. The use of information technology enables the automation of data mining and knowledge discovery that help bring out interesting patterns; this eliminates manual tasks, allows easy data extraction directly from electronic records, and supports electronic transfer systems that secure medical records, save lives and reduce the cost of medical services, as well as enabling early detection of infectious diseases on the basis of advanced data collection. Data mining can enable healthcare organizations to anticipate trends in patients' medical conditions and behaviour by analysing different prospects and by making connections between seemingly unrelated information. The raw data from healthcare organizations are voluminous and heterogeneous. They need to be collected and stored in an organized form, and their integration allows the formation of a unified medical information system. Data mining in health offers unlimited possibilities for analyzing data patterns that are less visible or hidden to common analysis techniques. These patterns can be used by healthcare practitioners to make forecasts, establish diagnoses, and set treatments for patients in healthcare organizations.

  9. National Automated Conformity Inspection Process

    Data.gov (United States)

    Department of Transportation — The National Automated Conformity Inspection Process (NACIP) Application is intended to expedite the workflow process as it pertains to the FAA Form 81 0-10 Request...

  10. Evolution of Home Automation Technology

    Directory of Open Access Journals (Sweden)

    Mohd. Rihan

    2009-01-01

    Full Text Available In modern society home and office automation has become increasingly important, providing ways to interconnect various home appliances. This interconnection results in faster transfer of information within homes/offices, leading to better home management and improved user experience. Home automation, in essence, is a technology that integrates various electrical systems of a home to provide enhanced comfort and security. Users are granted convenient and complete control over all the electrical home appliances and they are relieved from the tasks that previously required manual control. This paper tracks the development of home automation technology over the last two decades. Various home automation technologies have been explained briefly, giving a chronological account of the evolution of one of the most talked about technologies of recent times.

  11. Automation of antimicrobial activity screening.

    Science.gov (United States)

    Forry, Samuel P; Madonna, Megan C; López-Pérez, Daneli; Lin, Nancy J; Pasco, Madeleine D

    2016-03-01

    Manual and automated methods were compared for routine screening of compounds for antimicrobial activity. Automation generally accelerated assays and required less user intervention while producing comparable results. Automated protocols were validated for planktonic, biofilm, and agar cultures of the oral microbe Streptococcus mutans that is commonly associated with tooth decay. Toxicity assays for the known antimicrobial compound cetylpyridinium chloride (CPC) were validated against planktonic, biofilm forming, and 24 h biofilm culture conditions, and several commonly reported toxicity/antimicrobial activity measures were evaluated: the 50 % inhibitory concentration (IC50), the minimum inhibitory concentration (MIC), and the minimum bactericidal concentration (MBC). Using automated methods, three halide salts of cetylpyridinium (CPC, CPB, CPI) were rapidly screened with no detectable effect of the counter ion on antimicrobial activity. PMID:26970766
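
    One common way to obtain an IC50 value from automated dose-response measurements is to fit a four-parameter logistic (Hill) curve. The sketch below uses synthetic data and SciPy's curve_fit; it is an illustration of the general calculation, not the exact analysis pipeline used in the study.

```python
# Hedged sketch: estimate IC50 by fitting a four-parameter logistic model to
# synthetic dose-response data (not the CPC measurements from the study).
import numpy as np
from scipy.optimize import curve_fit

def four_param_logistic(c, bottom, top, ic50, hill):
    """Response as a function of concentration c."""
    return bottom + (top - bottom) / (1.0 + (c / ic50) ** hill)

concentration = np.array([0.01, 0.03, 0.1, 0.3, 1.0, 3.0, 10.0])   # arbitrary units
viability = np.array([0.98, 0.95, 0.85, 0.60, 0.30, 0.10, 0.05])   # fraction of control growth

params, _ = curve_fit(four_param_logistic, concentration, viability,
                      p0=[0.0, 1.0, 0.5, 1.0], maxfev=10000)
bottom, top, ic50, hill = params
print(f"estimated IC50: {ic50:.2f}")
```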

  12. Automating the Purple Crow Lidar

    Science.gov (United States)

    Hicks, Shannon; Sica, R. J.; Argall, P. S.

    2016-06-01

    The Purple Crow LiDAR (PCL) was built to measure short and long term coupling between the lower, middle, and upper atmosphere. The initial component of my MSc. project is to automate two key elements of the PCL: the rotating liquid mercury mirror and the Zaber alignment mirror. In addition to the automation of the Zaber alignment mirror, it is also necessary to describe the mirror's movement and positioning errors. Its properties will then be added into the alignment software. Once the alignment software has been completed, we will compare the new alignment method with the previous manual procedure. This is the first among several projects that will culminate in a fully-automated lidar. Eventually, we will be able to work remotely, thereby increasing the amount of data we collect. This paper will describe the motivation for automation, the methods we propose, preliminary results for the Zaber alignment error analysis, and future work.

  13. Home automation with Intel Galileo

    CERN Document Server

    Dundar, Onur

    2015-01-01

    This book is for anyone who wants to learn Intel Galileo for home automation and cross-platform software development. No knowledge of programming with Intel Galileo is assumed, but knowledge of the C programming language is essential.

  14. Towards automated traceability maintenance.

    Science.gov (United States)

    Mäder, Patrick; Gotel, Orlena

    2012-10-01

    Traceability relations support stakeholders in understanding the dependencies between artifacts created during the development of a software system and thus enable many development-related tasks. To ensure that the anticipated benefits of these tasks can be realized, it is necessary to have an up-to-date set of traceability relations between the established artifacts. This goal requires the creation of traceability relations during the initial development process. Furthermore, the goal also requires the maintenance of traceability relations over time as the software system evolves in order to prevent their decay. In this paper, an approach is discussed that supports the (semi-) automated update of traceability relations between requirements, analysis and design models of software systems expressed in the UML. This is made possible by analyzing change events that have been captured while working within a third-party UML modeling tool. Within the captured flow of events, development activities comprised of several events are recognized. These are matched with predefined rules that direct the update of impacted traceability relations. The overall approach is supported by a prototype tool and empirical results on the effectiveness of tool-supported traceability maintenance are provided. PMID:23471308

  15. Automated Gas Distribution System

    Science.gov (United States)

    Starke, Allen; Clark, Henry

    2012-10-01

    The cyclotron at Texas A&M University is one of the few and prized cyclotrons in the country. Behind the scenes of the cyclotron is a confusing and dangerous setup of the ion sources that supply the cyclotron with particles for acceleration. To use this machine there is a time-consuming and even wasteful step-by-step process of switching gases, purging, and other important operations that must be done manually to keep the system functioning properly, while also trying to maintain the safety of the working environment. Developing a new gas distribution system for the ion source prevents many of the problems generated by the older manual setup process. The developed system can be controlled manually more easily than before, but like most of the technology and machines in the cyclotron, it is mainly operated through software developed in the graphical coding environment LabVIEW. The automated gas distribution system provides multiple ports for a selection of different gases, to decrease the amount of gas wasted through switching gases, and a port for the vacuum, to decrease the amount of time spent purging the manifold. The LabVIEW software makes the operation of the cyclotron and ion sources easier and safer for anyone to use.

  16. Integration and mining of malaria molecular, functional and pharmacological data: how far are we from a chemogenomic knowledge space?

    CERN Document Server

    Birkholtz, L -M; Wells, G; Grando, D; Joubert, F; Kasam, V; Zimmermann, M; Ortet, P; Jacq, N; Roy, S; Hoffmann-Apitius, M; Breton, V; Louw, A I; Maréchal, E

    2006-01-01

    The organization and mining of malaria genomic and post-genomic data is highly motivated by the necessity to predict and characterize new biological targets and new drugs. Biological targets are sought in a biological space designed from the genomic data from Plasmodium falciparum, but using also the millions of genomic data from other species. Drug candidates are sought in a chemical space containing the millions of small molecules stored in public and private chemolibraries. Data management should therefore be as reliable and versatile as possible. In this context, we examined five aspects of the organization and mining of malaria genomic and post-genomic data: 1) the comparison of protein sequences including compositionally atypical malaria sequences, 2) the high throughput reconstruction of molecular phylogenies, 3) the representation of biological processes particularly metabolic pathways, 4) the versatile methods to integrate genomic data, biological representations and functional profiling obtained fro...

  17. Automated methods of predicting the function of biological sequences using GO and BLAST

    Directory of Open Access Journals (Sweden)

    Baumann Ute

    2005-11-01

    Full Text Available Abstract Background With the exponential increase in genomic sequence data there is a need to develop automated approaches to deducing the biological functions of novel sequences with high accuracy. Our aim is to demonstrate how accuracy benchmarking can be used in a decision-making process evaluating competing designs of biological function predictors. We utilise the Gene Ontology, GO, a directed acyclic graph of functional terms, to annotate sequences with functional information describing their biological context. Initially we examine the effect on accuracy scores of increasing the allowed distance between predicted and a test set of curator assigned terms. Next we evaluate several annotator methods using accuracy benchmarking. Given an unannotated sequence we use the Basic Local Alignment Search Tool, BLAST, to find similar sequences that have already been assigned GO terms by curators. A number of methods were developed that utilise terms associated with the best five matching sequences. These methods were compared against a benchmark method of simply using terms associated with the best BLAST-matched sequence (best BLAST approach. Results The precision and recall of estimates increases rapidly as the amount of distance permitted between a predicted term and a correct term assignment increases. Accuracy benchmarking allows a comparison of annotation methods. A covering graph approach performs poorly, except where the term assignment rate is high. A term distance concordance approach has a similar accuracy to the best BLAST approach, demonstrating lower precision but higher recall. However, a discriminant function method has higher precision and recall than the best BLAST approach and other methods shown here. Conclusion Allowing term predictions to be counted correct if closely related to a correct term decreases the reliability of the accuracy score. As such we recommend using accuracy measures that require exact matching of predicted
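
    The benchmark "best BLAST" strategy and a simple vote over the top five hits, as described above, could look roughly like the sketch below. The hit scores and GO assignments are invented; a real pipeline would parse BLAST tabular output and a curated GO annotation file.

```python
# Hedged sketch of the "best BLAST" baseline and a top-five voting variant for
# transferring GO terms. Hit data and annotations are invented for illustration.
from collections import Counter

# (subject_id, bit_score) pairs as they might come from a BLAST search, plus a
# lookup of curator-assigned GO terms for each subject sequence.
hits = [("P12345", 410.0), ("Q67890", 395.5), ("P11111", 310.2),
        ("Q22222", 250.0), ("P33333", 180.7)]
go_annotations = {
    "P12345": {"GO:0016301", "GO:0005524"},
    "Q67890": {"GO:0016301"},
    "P11111": {"GO:0005524"},
    "Q22222": {"GO:0004672"},
    "P33333": {"GO:0016301"},
}

def best_blast_terms(hits, annotations):
    """Terms of the single best-scoring hit."""
    best = max(hits, key=lambda h: h[1])[0]
    return annotations.get(best, set())

def top5_vote_terms(hits, annotations, min_votes=2):
    """Terms supported by at least `min_votes` of the top five hits."""
    counts = Counter(t for subject, _ in sorted(hits, key=lambda h: -h[1])[:5]
                     for t in annotations.get(subject, ()))
    return {t for t, n in counts.items() if n >= min_votes}

print(best_blast_terms(hits, go_annotations))   # {'GO:0016301', 'GO:0005524'}
print(top5_vote_terms(hits, go_annotations))    # {'GO:0016301', 'GO:0005524'}
```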

  18. Aprendizaje automático

    OpenAIRE

    Moreno, Antonio

    2006-01-01

    This book introduces the basic concepts of one of the most widely studied branches of artificial intelligence today: machine learning. Topics covered include inductive learning, analogical reasoning, explanation-based learning, neural networks, genetic algorithms, case-based reasoning, and theoretical approaches to machine learning.

  19. 2015 Chinese Intelligent Automation Conference

    CERN Document Server

    Li, Hongbo

    2015-01-01

    Proceedings of the 2015 Chinese Intelligent Automation Conference presents selected research papers from the CIAC’15, held in Fuzhou, China. The topics include adaptive control, fuzzy control, neural network based control, knowledge based control, hybrid intelligent control, learning control, evolutionary mechanism based control, multi-sensor integration, failure diagnosis, reconfigurable control, etc. Engineers and researchers from academia, industry and the government can gain valuable insights into interdisciplinary solutions in the field of intelligent automation.

  20. Technology modernization assessment flexible automation

    Energy Technology Data Exchange (ETDEWEB)

    Bennett, D.W.; Boyd, D.R.; Hansen, N.H.; Hansen, M.A.; Yount, J.A.

    1990-12-01

    The objectives of this report are: to present technology assessment guidelines to be considered in conjunction with defense regulations before an automation project is developed; to give examples showing how assessment guidelines may be applied to a current project; and to present several potential areas where automation might be applied successfully in the depot system. Depots perform primarily repair and remanufacturing operations, with limited small-batch manufacturing runs. While certain activities (such as Management Information Systems and warehousing) are directly applicable to either environment, the majority of applications will require combining existing and emerging technologies in different ways to meet the special needs of the depot remanufacturing environment. Industry generally enjoys the ability to make revisions to its product lines seasonally, followed by batch runs of thousands or more. Depot batch runs are in the tens, at best the hundreds, of parts, with a potential for large variation in product mix; reconfiguration may be required on a week-to-week basis. This need for a higher degree of flexibility suggests a higher level of operator interaction and, in turn, control systems that go beyond the state of the art for less flexible automation and industry in general. This report investigates the benefits and barriers to automation and concludes that, while significant benefits do exist for automation, depots must be prepared to carefully investigate the technical feasibility of each opportunity and the life-cycle costs associated with implementation. Implementation is suggested in two ways: (1) develop an implementation plan for automation technologies based on results of small demonstration automation projects; (2) use phased implementation for both these and later-stage automation projects to allow major technical and administrative risk issues to be addressed. 10 refs., 2 figs., 2 tabs. (JF)

  1. Application of fluorescence-based semi-automated AFLP analysis in barley and wheat

    DEFF Research Database (Denmark)

    Schwarz, G.; Herz, M.; Huang, X.Q.;

    2000-01-01

    of semi-automated codominant analysis for hemizygous AFLP markers in an F-2 population was too low, proposing the use of dominant allele-typing defaults. Nevertheless, the efficiency of genetic mapping, especially of complex plant genomes, will be accelerated by combining the presented genotyping...

  2. Mapping extent and change in surface mines within the United States for 2001 to 2006

    Science.gov (United States)

    Soulard, Christopher E.; Acevedo, William; Stehman, Stephen V.; Parker, Owen P.

    2016-01-01

    A complete, spatially explicit dataset illustrating the 21st century mining footprint for the conterminous United States does not exist. To address this need, we developed a semi-automated procedure to map the country's mining footprint (30-m pixel) and establish a baseline to monitor changes in mine extent over time. The process uses mine seed points derived from the U.S. Energy Information Administration (EIA), U.S. Geological Survey (USGS) Mineral Resources Data System (MRDS), and USGS National Land Cover Dataset (NLCD) and recodes patches of barren land that meet a “distance to seed” requirement and a patch area requirement before mapping a pixel as mining. Seed points derived from EIA coal points, an edited MRDS point file, and 1992 NLCD mine points were used in three separate efforts using different distance and patch area parameters for each. The three products were then merged to create a 2001 map of moderate-to-large mines in the United States, which was subsequently manually edited to reduce omission and commission errors. This process was replicated using NLCD 2006 barren pixels as a base layer to create a 2006 mine map and a 2001–2006 mine change map focusing on areas with surface mine expansion. In 2001, 8,324 km2 of surface mines were mapped. The footprint increased to 9,181 km2 in 2006, representing a 10.3% increase over 5 years. These methods exhibit merit as a timely approach to generate wall-to-wall, spatially explicit maps representing the recent extent of a wide range of surface mining activities across the country.
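
    The recoding rule described above, in which a barren-land patch is mapped as mining only if it lies close enough to a mine seed point and exceeds a minimum patch area, can be sketched with NumPy and SciPy as follows. The grid, seed locations, and thresholds are illustrative assumptions, not the parameters used in the study.

```python
# Hedged sketch of the barren-land recoding rule: keep a patch as "mining" if it
# is near a seed point and large enough. All values below are illustrative.
import numpy as np
from scipy import ndimage

barren = np.zeros((20, 20), dtype=bool)
barren[2:6, 2:8] = True        # large patch near a seed
barren[14:16, 14:16] = True    # small, distant patch

seeds = [(3, 3)]               # mine seed points (row, col), e.g. from EIA/MRDS
MAX_SEED_DIST = 5              # pixels (assumed)
MIN_PATCH_AREA = 10            # pixels (assumed)

# Distance of every cell to the nearest seed point.
seed_mask = np.zeros_like(barren)
for r, c in seeds:
    seed_mask[r, c] = True
dist_to_seed = ndimage.distance_transform_edt(~seed_mask)

labels, n = ndimage.label(barren)
mine_map = np.zeros_like(barren)
for patch_id in range(1, n + 1):
    patch = labels == patch_id
    if patch.sum() >= MIN_PATCH_AREA and dist_to_seed[patch].min() <= MAX_SEED_DIST:
        mine_map |= patch

print("pixels mapped as mining:", int(mine_map.sum()))
```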

  3. Unsupervised Tensor Mining for Big Data Practitioners.

    Science.gov (United States)

    Papalexakis, Evangelos E; Faloutsos, Christos

    2016-09-01

    Multiaspect data are ubiquitous in modern Big Data applications. For instance, different aspects of a social network are the different types of communication between people, the time stamp of each interaction, and the location associated to each individual. How can we jointly model all those aspects and leverage the additional information that they introduce to our analysis? Tensors, which are multidimensional extensions of matrices, are a principled and mathematically sound way of modeling such multiaspect data. In this article, our goal is to popularize tensors and tensor decompositions to Big Data practitioners by demonstrating their effectiveness, outlining challenges that pertain to their application in Big Data scenarios, and presenting our recent work that tackles those challenges. We view this work as a step toward a fully automated, unsupervised tensor mining tool that can be easily and broadly adopted by practitioners in academia and industry.
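
    As a concrete example of the tensor modelling discussed above, the sketch below runs a CP/PARAFAC decomposition of a small random three-way tensor using the tensorly library (assumed to be installed); it is a generic illustration, not the authors' tool.

```python
# Hedged sketch: CP/PARAFAC decomposition of a random "users x communication
# types x time bins" style tensor with tensorly (assumed available).
import numpy as np
import tensorly as tl
from tensorly.decomposition import parafac

rng = np.random.default_rng(0)
tensor = tl.tensor(rng.random((10, 8, 6)))   # stand-in multiaspect data

weights, factors = parafac(tensor, rank=3, n_iter_max=200, tol=1e-8)
for mode, factor in enumerate(factors):
    print(f"mode {mode} factor matrix shape: {factor.shape}")

# Reconstruction error as a rough quality check.
reconstruction = tl.cp_to_tensor((weights, factors))
print("relative error:", float(tl.norm(tensor - reconstruction) / tl.norm(tensor)))
```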

  5. Automated analysis and annotation of basketball video

    Science.gov (United States)

    Saur, Drew D.; Tan, Yap-Peng; Kulkarni, Sanjeev R.; Ramadge, Peter J.

    1997-01-01

    Automated analysis and annotation of video sequences are important for digital video libraries, content-based video browsing and data mining projects. A successful video annotation system should provide users with a useful video content summary in a reasonable processing time. Given the wide variety of video genres available today, automatically extracting meaningful video content for annotation still remains difficult using currently available techniques. However, a wide range of video has inherent structure such that some prior knowledge about the video content can be exploited to improve our understanding of the high-level video semantic content. In this paper, we develop tools and techniques for analyzing structured video by using the low-level information available directly from MPEG compressed video. Being able to work directly in the video compressed domain can greatly reduce the processing time and enhance storage efficiency. As a testbed, we have developed a basketball annotation system which combines the low-level information extracted from the MPEG stream with the prior knowledge of basketball video structure to provide high-level content analysis, annotation and browsing for events such as wide-angle and close-up views, fast breaks, steals, potential shots, number of possessions and possession times. We expect our approach can also be extended to structured video in other domains.

  6. Extended -Regular Sequence for Automated Analysis of Microarray Images

    Directory of Open Access Journals (Sweden)

    Jin Hee-Jeong

    2006-01-01

    Full Text Available Microarray study enables us to obtain hundreds of thousands of expressions of genes or genotypes at once, and it is an indispensable technology for genome research. The first step is the analysis of scanned microarray images. This is the most important procedure for obtaining biologically reliable data. Currently most microarray image processing systems require burdensome manual block/spot indexing work. Since the amount of experimental data is increasing very quickly, automated microarray image analysis software becomes important. In this paper, we propose two automated methods for analyzing microarray images. First, we propose the extended -regular sequence to index blocks and spots, which enables a novel automatic gridding procedure. Second, we provide a methodology, hierarchical metagrid alignment, to allow reliable and efficient batch processing for a set of microarray images. Experimental results show that the proposed methods are more reliable and convenient than the commercial tools.
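    Automatic gridding (the block/spot indexing step discussed above) is often introduced through projection profiles: summing image intensity along rows and columns and cutting the grid at the deepest minima between spots. The sketch below applies that generic idea to a synthetic spotted image; it is not the -regular-sequence method proposed in the paper, and all sizes are invented.

```python
# Hedged sketch: naive microarray gridding by projection profiles on a synthetic image
# (generic illustration; not the indexing method proposed in the paper).
import numpy as np

def make_synthetic_array(n_rows=4, n_cols=6, pitch=20, radius=6):
    img = np.zeros((n_rows * pitch, n_cols * pitch))
    yy, xx = np.mgrid[0:img.shape[0], 0:img.shape[1]]
    for r in range(n_rows):
        for c in range(n_cols):
            cy, cx = r * pitch + pitch // 2, c * pitch + pitch // 2
            img += np.exp(-((yy - cy) ** 2 + (xx - cx) ** 2) / (2 * radius**2))
    return img

def grid_cuts(profile, expected):
    """Place cut lines at the 'expected - 1' deepest local minima of a 1-D projection."""
    minima = [i for i in range(1, len(profile) - 1)
              if profile[i] < profile[i - 1] and profile[i] <= profile[i + 1]]
    minima.sort(key=lambda i: profile[i])
    return sorted(minima[: expected - 1])

img = make_synthetic_array()
row_cuts = grid_cuts(img.sum(axis=1), expected=4)   # horizontal cut positions
col_cuts = grid_cuts(img.sum(axis=0), expected=6)   # vertical cut positions
print("row cuts:", row_cuts)
print("col cuts:", col_cuts)
```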

  7. AUTOMATED ANALYSIS OF BREAKERS

    Directory of Open Access Journals (Sweden)

    E. M. Farhadzade

    2014-01-01

    Full Text Available Breakers are part of electric power system equipment whose reliability strongly influences the reliability of power plants. In particular, breakers determine the structural reliability of the switchgear circuits of power stations and network substations. Failure of a breaker to clear a short circuit, followed by failure of the backup unit or of the long-distance protection system, quite often leads to a system emergency. The problem of improving breaker reliability and reducing maintenance expenses is becoming ever more urgent as the maintenance and repair costs of oil and air-break circuit breakers systematically increase. The main direction for solving this problem is the improvement of diagnostic control methods and the organization of on-condition maintenance. This, however, demands a great amount of statistical information about the nameplate data of breakers and their operating conditions, their failures, testing and repairs, as well as advanced computer software and a specific automated information system (AIS). A new AIS, with the logo AISV, was developed at the "Reliability of power equipment" department of the AzRDSI of Energy. The main features of AISV are: to provide database security and accuracy; to carry out systematic control of breaker conformity with operating conditions; to estimate individual reliability values and the characteristics of their change for a given combination of characteristics; and to provide the personnel responsible for the technical maintenance of breakers not only with information but also with methodological support, including recommendations for solving the given problem and advanced methods for its realization.

  8. Imitating manual curation of text-mined facts in biomedicine.

    Directory of Open Access Journals (Sweden)

    Raul Rodriguez-Esteban

    2006-09-01

    Full Text Available Text-mining algorithms make mistakes in extracting facts from natural-language texts. In biomedical applications, which rely on use of text-mined data, it is critical to assess the quality (the probability that the message is correctly extracted) of individual facts--to resolve data conflicts and inconsistencies. Using a large set of almost 100,000 manually produced evaluations (most facts were independently reviewed more than once, producing independent evaluations), we implemented and tested a collection of algorithms that mimic human evaluation of facts provided by an automated information-extraction system. The performance of our best automated classifiers closely approached that of our human evaluators (ROC score close to 0.95). Our hypothesis is that, were we to use a larger number of human experts to evaluate any given sentence, we could implement an artificial-intelligence curator that would perform the classification job at least as accurately as an average individual human evaluator. We illustrated our analysis by visualizing the predicted accuracy of the text-mined relations involving the term cocaine.
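    A minimal sketch of the general idea, training a classifier on per-fact features against human accept/reject labels and checking the resulting ROC, is given below. The feature set and the synthetic labels are invented for illustration; the paper's actual features, models and evaluation data are not reproduced here.

```python
# Hedged sketch: learn to imitate human curation of text-mined facts (synthetic features/labels).
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.metrics import roc_auc_score

rng = np.random.default_rng(0)
n = 5000
# Hypothetical per-fact features: extractor confidence, sentence length, number of
# independent supporting sentences, and a negation flag.
X = np.column_stack([
    rng.random(n),                  # extractor confidence
    rng.integers(5, 60, n),         # sentence length (tokens)
    rng.poisson(2, n),              # redundancy: independent supporting sentences
    rng.integers(0, 2, n),          # negation flag
]).astype(float)

# Synthetic "human evaluator" labels: correct facts tend to have high confidence,
# high redundancy and no negation.
logit = 3 * X[:, 0] + 0.5 * X[:, 2] - 1.5 * X[:, 3] - 1.5
y = (rng.random(n) < 1 / (1 + np.exp(-logit))).astype(int)

X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.3, random_state=0)
clf = LogisticRegression(max_iter=1000).fit(X_tr, y_tr)
print("ROC AUC of the imitation classifier:",
      round(roc_auc_score(y_te, clf.predict_proba(X_te)[:, 1]), 3))
```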

  9. In vivo robotics: the automation of neuroscience and other intact-system biological fields.

    Science.gov (United States)

    Kodandaramaiah, Suhasa B; Boyden, Edward S; Forest, Craig R

    2013-12-01

    Robotic and automation technologies have played a huge role in in vitro biological science, having proved critical for scientific endeavors such as genome sequencing and high-throughput screening. Robotic and automation strategies are beginning to play a greater role in in vivo and in situ sciences, especially when it comes to the difficult in vivo experiments required for understanding the neural mechanisms of behavior and disease. In this perspective, we discuss the prospects for robotics and automation to influence neuroscientific and intact-system biology fields. We discuss how robotic innovations might be created to open up new frontiers in basic and applied neuroscience and present a concrete example with our recent automation of in vivo whole-cell patch clamp electrophysiology of neurons in the living mouse brain.

  10. Modern geodesy approach in underground mining

    OpenAIRE

    Mijalkovski, Stojance; Despodov, Zoran; Gorgievski, Cvetan; Bogdanovski, Goran; Mirakovski, Dejan; Hadzi-Nikolova, Marija; Doneva, Nikolinka

    2013-01-01

    This paper presents an overview of the development of the modern geodesy approach in underground mining. Correct surveying measurements are of great importance in mining, especially underground mining, and have a major impact on safety in the development of underground mining facilities.

  11. Colombia, mining country. Vision a year 2019

    International Nuclear Information System (INIS)

    The report covers the scope of state action for the mining sector, the performance of the mining sector, regional perceptions of mining development, the construction of a long-term vision for the mining sector, the action plan, and the follow-up of goals.

  12. Mining the social mediome.

    Science.gov (United States)

    Asch, David A; Rader, Daniel J; Merchant, Raina M

    2015-09-01

    The experiences and behaviors revealed in our everyday lives provide as much insight into health and disease as any analysis of our genome could ever produce. These characteristics are not found in the genome, but may be revealed in our online activities, which make up our social mediome.

  13. Journey from Data Mining to Web Mining to Big Data

    OpenAIRE

    Gupta, Richa

    2014-01-01

    This paper describes the journey of big data, starting from data mining to web mining to big data. It discusses each of these methods in brief and also provides their applications. It states the importance of mining big data today using fast and novel approaches.

  14. Mining and the environment

    International Nuclear Information System (INIS)

    The proceedings contain 30 contributions, out of which 9 have been inputted in INIS. They are concerned with uranium mines and mills in the Czech Republic. The impacts of the mining activities and of the mill tailings on the environment and the population are assessed, and it is concluded that the radiation hazard does not exceed that from natural background. Considerable attention is paid to the monitoring of the surroundings of mines and mills and to landscaping activities. Proposed technologies for the purification of waste waters from the chemical leaching process are described. Ways to eliminate environmental damage from abandoned tailings settling ponds are suggested. (M.D.). 18 tabs., 21 figs., 43 refs

  15. Overview of solution mining

    International Nuclear Information System (INIS)

    This paper deals with in situ solution mining. A significant fraction of known U.S. uranium reserves occur as low grade mineralization in sedimentary sandstone deposits located between 40 and 200 meters subsurface. For a variety of reasons, such deposits may not be economically developable by any method other than in situ solution mining. This coupled with the current market price of uranium has led to significant development and application of in situ solution mining for uranium production during the past several years. The process consists of the ore body; well field; lixiviant; uranium recovery process; waste treatment processes; and aquifer restoration. A tabulation of firms involved with a summary of the leach chemistry used is given. 3 refs

  16. Data mining methods

    CERN Document Server

    Chattamvelli, Rajan

    2015-01-01

    DATA MINING METHODS, Second Edition discusses both the theoretical foundations and practical applications of data mining in a wide field including banking, e-commerce, medicine, engineering and management. This book starts by introducing data and information, basic data types, data categories and applications of data mining. The second chapter briefly reviews data visualization technology and its importance in data mining. Fundamentals of probability and statistics are discussed in chapter 3, and a novel algorithm for sample covariance is derived. The next two chapters give an in-depth and useful discussion of data warehousing and OLAP. Decision trees are clearly explained and a new tabular method for decision tree building is discussed. The chapter on association rules discusses popular algorithms and compares various algorithms in summary table form. An interesting application of genetic algorithms is introduced in the next chapter. Foundations of neural networks are built from scratch and the back propagation algorithm is derived...

  17. Listeria Genomics

    Science.gov (United States)

    Cabanes, Didier; Sousa, Sandra; Cossart, Pascale

    The opportunistic intracellular foodborne pathogen Listeria monocytogenes has become a paradigm for the study of host-pathogen interactions and bacterial adaptation to mammalian hosts. Analysis of L. monocytogenes infection has provided considerable insight into how bacteria invade cells, move intracellularly, and disseminate in tissues, as well as tools to address fundamental processes in cell biology. Moreover, the vast amount of knowledge that has been gathered through in-depth comparative genomic analyses and in vivo studies makes L. monocytogenes one of the most well-studied bacterial pathogens. This chapter provides an overview of progress in the exploration of genomic, transcriptomic, and proteomic data in Listeria spp. to understand genome evolution and diversity, as well as physiological aspects of metabolism used by bacteria when growing in diverse environments, in particular in infected hosts.

  18. Personal continuous route pattern mining

    Institute of Scientific and Technical Information of China (English)

    Qian YE; Ling CHEN; Gen-cai CHEN

    2009-01-01

    In daily life, people often repeat regular routes in certain periods. In this paper, a mining system is developed to find the continuous route patterns of personal past trips. In order to account for the diversity of personal moving status, the mining system employs adaptive GPS data recording and five data filters to guarantee clean trip data. The mining system uses a client/server architecture to protect personal privacy and to reduce the computational load. The server conducts the main mining procedure but with insufficient information to recover real personal routes. In order to improve the scalability of sequential pattern mining, a novel pattern mining algorithm, continuous route pattern mining (CRPM), is proposed. This algorithm can tolerate the different disturbances in real routes and extract the frequent patterns. Experimental results based on nine persons' trips show that CRPM can extract route patterns more than two times longer than those found by traditional route pattern mining algorithms.
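    The published CRPM algorithm is not reproduced in this record, so the sketch below only illustrates the underlying task: counting contiguous sub-routes that recur across a person's trips and keeping those above a support threshold. The toy trips and the exact-match counting are simplifying assumptions; CRPM itself additionally tolerates disturbances in the recorded routes.

```python
# Hedged sketch: frequent contiguous sub-route mining over symbolic trips (toy data;
# the published CRPM algorithm additionally handles noisy, disturbed routes).
from collections import Counter

def frequent_subroutes(trips, min_support=3, min_len=2):
    counts = Counter()
    for trip in trips:
        seen = set()  # count each sub-route at most once per trip
        for i in range(len(trip)):
            for j in range(i + min_len, len(trip) + 1):
                seen.add(tuple(trip[i:j]))
        counts.update(seen)
    return {route: c for route, c in counts.items() if c >= min_support}

# Toy trips expressed as sequences of road-segment IDs.
trips = [
    ["home", "r1", "r2", "r3", "work"],
    ["home", "r1", "r2", "r3", "work"],
    ["home", "r1", "r2", "gym"],
    ["work", "r3", "r2", "r1", "home"],
    ["home", "r1", "r2", "r3", "work"],
]

for route, support in sorted(frequent_subroutes(trips).items(), key=lambda kv: -kv[1]):
    print(support, "->", " > ".join(route))
```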

  19. Lunabotics Mining Competition: Inspiration Through Accomplishment

    Science.gov (United States)

    Mueller, Robert P.

    2011-01-01

    NASA's Lunabotics Mining Competition is designed to promote the development of interest in space activities and STEM (Science, Technology, Engineering, and Mathematics) fields. The competition uses excavation, a necessary first step towards extracting resources from the regolith and building bases on the moon. The unique physical properties of lunar regolith and the reduced 1/6th gravity, vacuum environment make excavation a difficult technical challenge. Advances in lunar regolith mining have the potential to significantly contribute to our nation's space vision and NASA space exploration operations. The competition is conducted annually by NASA at the Kennedy Space Center Visitor Complex. The teams that can use telerobotic or autonomous operation to excavate a lunar regolith geotechnical simulant, hereinafter referred to as Black Point-1 (or BP-1), and score the most points (calculated as an average of two separate 10-minute timed competition attempts) will earn points towards the Joe Kosmo Award for Excellence, and the scores will reflect ranking in the on-site mining category of the competition. The minimum excavation requirement is 10.0 kg during each competition attempt and the robotic excavator, referred to as the "Lunabot", must meet all specifications. This paper will review the achievements of the Lunabotics Mining Competition in 2010 and 2011, and present the new rules for 2012. By providing a framework for robotic design and fabrication, which culminates in a live competition event, university students have been able to produce sophisticated lunabots which are tele-operated. Multi-disciplinary teams are encouraged and the extreme sense of accomplishment provides a unique source of inspiration to the participating students, which has been shown to translate into increased interest in STEM careers. Our industrial sponsors (Caterpillar, Newmont Mining, Harris, Honeybee Robotics) have all stated that there is a strong need for skills in the workforce related

  20. WEB MINING BASED FRAMEWORK FOR ONTOLOGY LEARNING

    Directory of Open Access Journals (Sweden)

    C.Ramesh

    2015-07-01

    Full Text Available Today, the notion of the Semantic Web has emerged as a prominent solution to the problem of organizing the immense information provided by the World Wide Web, and its focus on supporting better co-operation between humans and machines is noteworthy. Ontology forms the major component of the Semantic Web in its realization. However, the manual method of ontology construction is time-consuming, costly, error-prone and inflexible to change; in addition, it requires the complete participation of a knowledge engineer or domain expert. To address this issue, researchers hoped that a semi-automatic or automatic process would result in faster and better ontology construction and enrichment. Ontology learning has recently become a major area of research, whose goal is to facilitate the construction of ontologies and reduce the effort in developing an ontology for a new domain. However, there are few research studies that attempt to construct ontology from semi-structured Web pages. In this paper, we present a complete framework for ontology learning that facilitates the semi-automation of constructing and enriching web site ontology from semi-structured Web pages. The proposed framework employs Web Content Mining and Web Usage Mining in extracting conceptual relationships from the Web. The main idea behind this concept was to incorporate the web author's ideas as well as web users' intentions in the ontology development and its evolution.

  1. Savant Genome Browser 2: visualization and analysis for population-scale genomics.

    Science.gov (United States)

    Fiume, Marc; Smith, Eric J M; Brook, Andrew; Strbenac, Dario; Turner, Brian; Mezlini, Aziz M; Robinson, Mark D; Wodak, Shoshana J; Brudno, Michael

    2012-07-01

    High-throughput sequencing (HTS) technologies are providing an unprecedented capacity for data generation, and there is a corresponding need for efficient data exploration and analysis capabilities. Although most existing tools for HTS data analysis are developed for either automated (e.g. genotyping) or visualization (e.g. genome browsing) purposes, such tools are most powerful when combined. For example, integration of visualization and computation allows users to iteratively refine their analyses by updating computational parameters within the visual framework in real-time. Here we introduce the second version of the Savant Genome Browser, a standalone program for visual and computational analysis of HTS data. Savant substantially improves upon its predecessor and existing tools by introducing innovative visualization modes and navigation interfaces for several genomic datatypes, and synergizing visual and automated analyses in a way that is powerful yet easy even for non-expert users. We also present a number of plugins that were developed by the Savant Community, which demonstrate the power of integrating visual and automated analyses using Savant. The Savant Genome Browser is freely available (open source) at www.savantbrowser.com.

  2. Databases for Data Mining

    OpenAIRE

    LANGOF, LADO

    2015-01-01

    This work is about looking for synergies between data mining tools and database management systems (DBMS). Imagine a situation where we need to solve an analytical problem using data that are too large to be processed solely inside the main physical memory and at the same time too small to put a data warehouse or distributed analytical system in place. The target area is therefore a single personal computer that is used to solve data mining problems. We are looking for tools that allow us to...

  3. Mining protein structure data

    OpenAIRE

    Santos, José Carlos Almeida

    2006-01-01

    The principal topic of this work is the application of data mining techniques, in particular of machine learning, to the discovery of knowledge in a protein database. In the first chapter a general background is presented. Namely, in section 1.1 we overview the methodology of a Data Mining project and its main algorithms. In section 1.2 an introduction to proteins and their supporting file formats is outlined. This chapter is concluded with section 1.3, which defines the main problem we...

  4. Mining multidimensional distinct patterns

    OpenAIRE

    Kubendranathan, Thusjanthan

    2010-01-01

    How do we find the dominant groups of customers in age, sex and location that were responsible for at least 85% of the sales of iPad, Macbook and iPhone? To answer such types of questions we introduce a novel data mining task – mining multidimensional distinct patterns (DPs). Given a multidimensional data set where each tuple carries some attribute values and a transaction, multidimensional DPs are itemsets whose absolute support ratio in a group-by on the attributes against the rest of the d...
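    The iPad/Macbook/iPhone question above can be phrased as a group-by computation: for each itemset, find the attribute-value groups whose share of that itemset's transactions reaches a threshold. The pandas sketch below does this by brute force on invented sales tuples; the record describes a dedicated mining task and algorithms rather than such a naive scan.

```python
# Hedged sketch: which (age, sex, location) groups account for a large share of the
# transactions containing a given itemset? Brute-force pandas version on invented data.
import pandas as pd

rows = [
    ("18-25", "F", "NY", {"iPad", "iPhone"}),
    ("18-25", "F", "NY", {"iPad", "Macbook", "iPhone"}),
    ("18-25", "M", "NY", {"iPhone"}),
    ("26-35", "F", "CA", {"iPad", "Macbook", "iPhone"}),
    ("26-35", "M", "CA", {"Macbook"}),
    ("18-25", "F", "NY", {"iPad", "Macbook", "iPhone"}),
]
df = pd.DataFrame(rows, columns=["age", "sex", "location", "items"])

def dominant_groups(df, itemset, dims, threshold=0.85):
    hits = df[df["items"].apply(itemset.issubset)]
    if hits.empty:
        return pd.Series(dtype=float)
    share = hits.groupby(dims).size() / len(hits)   # each group's share of the itemset's support
    return share[share >= threshold]

print(dominant_groups(df, {"iPad", "iPhone"}, ["age", "sex", "location"], threshold=0.5))
```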

  5. Data mining mobile devices

    CERN Document Server

    Mena, Jesus

    2013-01-01

    With today's consumers spending more time on their mobiles than on their PCs, new methods of empirical stochastic modeling have emerged that can provide marketers with detailed information about the products, content, and services their customers desire.Data Mining Mobile Devices defines the collection of machine-sensed environmental data pertaining to human social behavior. It explains how the integration of data mining and machine learning can enable the modeling of conversation context, proximity sensing, and geospatial location throughout large communities of mobile users

  6. First aid at mines

    Energy Technology Data Exchange (ETDEWEB)

    1993-01-01

    This Code of Practice has been approved by the Health and Safety Commission with the consent of the Secretary of State under section 16 of the Health and Safety at Work etc. Act 1974. It gives practical guidance on the requirements placed on employers and self-employed persons by the Health and Safety (First-Aid) Regulations 1981 as they now apply to mines and comes into effect on 1 October 1993 which is the date on which the Management and Administration of Safety and Health at Mines Regulations 1993 come into force.

  7. Blockchain Mining Games

    OpenAIRE

    Kiayias, Aggelos; Koutsoupias, Elias; Kyropoulou, Maria; Tselekounis, Yiannis

    2016-01-01

    We study the strategic considerations of miners participating in the bitcoin's protocol. We formulate and study the stochastic game that underlies these strategic considerations. The miners collectively build a tree of blocks, and they are paid when they create a node (mine a block) which will end up in the path of the tree that is adopted by all. Since the miners can hide newly mined nodes, they play a game with incomplete information. Here we consider two simplified forms of this game in wh...

  8. Data mining for dummies

    CERN Document Server

    Brown, Meta S

    2014-01-01

    Delve into your data for the key to success. Data mining is quickly becoming integral to creating value and business momentum. The ability to detect unseen patterns hidden in the numbers exhaustively generated by day-to-day operations allows savvy decision-makers to exploit every tool at their disposal in the pursuit of better business. By creating models and testing whether patterns hold up, it is possible to discover new intelligence that could change your business's entire paradigm for a more successful outcome. Data Mining for Dummies shows you why it doesn't take a data scientist to gain

  9. Pervasive data mining engine

    OpenAIRE

    Peixoto, Rui Daniel Ferreira

    2015-01-01

    Master's dissertation in Engineering and Management of Information Systems. Current Data Mining tools require in-depth knowledge of the area to obtain optimized results. This dissertation presents as its research question the analysis of "The feasibility of building a semi-autonomous Data Mining system with pervasive characteristics, without losing its identity and functionality" and, as its objective, the development of a prototype of a system called Pervasive D...

  10. Cephalopod genomics

    DEFF Research Database (Denmark)

    Albertin, Caroline B.; Bonnaud, Laure; Brown, C. Titus;

    2012-01-01

    The Cephalopod Sequencing Consortium (CephSeq Consortium) was established at a NESCent Catalysis Group Meeting, ``Paths to Cephalopod Genomics-Strategies, Choices, Organization,'' held in Durham, North Carolina, USA on May 24-27, 2012. Twenty-eight participants representing nine countries (Austria...... active in sequencing, assembling and annotating genomes, agreed on a set of cephalopod species of particular importance for initial sequencing and developed strategies and an organization (CephSeq Consortium) to promote this sequencing. The conclusions and recommendations of this meeting are described...

  11. Genome Sequencing

    DEFF Research Database (Denmark)

    Sato, Shusei; Andersen, Stig Uggerhøj

    2014-01-01

    The current Lotus japonicus reference genome sequence is based on a hybrid assembly of Sanger TAC/BAC, Sanger shotgun and Illumina shotgun sequencing data generated from the Miyakojima-MG20 accession. It covers nearly all expressed L. japonicus genes and has been annotated mainly based on transcr...

  12. WIRELESS MINE WIDE TELECOMMUNICATIONS TECHNOLOGY

    International Nuclear Information System (INIS)

    Two industrial prototype units for through-the-earth wireless communication were constructed and tested. Preparations for a temporary installation of the through-the-earth and in-mine systems in NIOSH's Lake Lynn mine were completed. Progress was made in the programming of the in-mine system to provide data communication. Work has begun to implement a wireless interface between equipment controllers and our in-mine system

  13. Neural Networks in Data Mining

    OpenAIRE

    Priyanka Gaur

    2012-01-01

    The application of neural networks in data mining is very wide. Although neural networks may have a complex structure, long training times, and a representation of results that is not easily understandable, they have high tolerance for noisy data and high accuracy, and are therefore preferable in data mining. In this paper data mining based on neural networks is researched in detail, and the key technology and ways to achieve data mining based on neural networks are also researched.

  14. Programmable automation systems in PSA

    International Nuclear Information System (INIS)

    The Finnish safety authority (STUK) requires plant-specific PSAs, and quantitative safety goals are set at different levels. The reliability analysis is more problematic when critical safety functions are realized by programmable automation systems. Conventional modeling techniques do not necessarily apply to the analysis of these systems, and quantification seems to be impossible. However, it is important to analyze the contribution of programmable automation systems to plant safety, and PSA is the only method with a system-analytical view over safety. This report discusses the applicability of PSA methodology (fault tree analyses, failure modes and effects analyses) to the analysis of programmable automation systems. The problem of how to decompose programmable automation systems for reliability modeling purposes is discussed. In addition to the qualitative analysis and structural reliability modeling issues, the possibility of evaluating failure probabilities of programmable automation systems is considered. One solution to the quantification issue is the use of expert judgements, and the principles for applying expert judgements are discussed in the paper. A framework for applying expert judgements is outlined. Further, the impacts of subjective estimates on the interpretation of PSA results are discussed. (orig.) (13 refs.)
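    For readers unfamiliar with the quantification step, the sketch below shows the basic fault-tree arithmetic that such a PSA model ultimately reduces to: combining basic-event probabilities (which, for programmable automation, may come from expert judgement) through AND/OR gates under an independence assumption. The event names and numbers are invented.

```python
# Hedged sketch: minimal fault-tree quantification with independent basic events
# (illustrative numbers only; software failure probabilities might come from expert judgement).
from math import prod

def and_gate(*p):            # all inputs must fail
    return prod(p)

def or_gate(*p):             # at least one input fails (independence assumed)
    return 1 - prod(1 - x for x in p)

# Hypothetical basic events for a programmable trip function.
p_sensor   = 1e-3   # one sensor channel fails
p_cpu_hw   = 5e-4   # processor hardware failure
p_software = 1e-4   # software fault (expert-judged)
p_actuator = 2e-3   # actuator fails to operate

# Redundant 1-out-of-2 sensor channels feeding one processing unit and one actuator.
p_sensors    = and_gate(p_sensor, p_sensor)      # both channels must fail
p_processing = or_gate(p_cpu_hw, p_software)     # either hardware or software fault
p_top        = or_gate(p_sensors, p_processing, p_actuator)

print(f"Top event probability (failure of the safety function): {p_top:.3e}")
```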

  15. Automated bolting and meshing on a continuous miner for roadway development

    Institute of Scientific and Technical Information of China (English)

    van Duin Stephen; Meers Luke; Donnelly Peter; Oxley Ian

    2013-01-01

    Automated installation of primary roof support material can potentially increase productivity and operator safety in the roadway development process within underground coal mining. Although the broader manufacturing sector has benefited from automation, several challenges exist within the Australian underground coal industry which make it difficult to fully exploit these technologies. At the University of Wollongong a series of reprogrammable electromechanical manipulators have been designed to overcome these challenges and automatically handle the installation of roof and rib containment consumables on a continuous miner. The automated manipulation removes personnel from hazards in the immediate face area, particularly those associated with working in a confined and unstable working environment in close proximity to rotating and moving equipment. In a series of above-ground trials the automated system was successfully demonstrated without human intervention and proven to be capable of achieving cycle times at a rate of 10 m per operating hour, consistent with that required to support high-capacity longwall mines. The trials also identified a number of refinements which could further improve both cycle times and system reliability when considering the technology for underground use. The results have concluded that conventional manual handling practices on a continuous miner can be eliminated, and that the prototypes have significantly reduced the technical risk in proceeding to a full underground trial.

  16. International Conference Automation : Challenges in Automation, Robotics and Measurement Techniques

    CERN Document Server

    Zieliński, Cezary; Kaliczyńska, Małgorzata

    2016-01-01

    This book presents the set of papers accepted for presentation at the International Conference Automation, held in Warsaw, 2-4 March 2016. It presents research results from top experts in the fields of industrial automation, control, robotics and measurement techniques. Each chapter presents a thorough analysis of a specific technical problem, which is usually followed by numerical analysis, simulation, and a description of the results of implementing the solution to a real-world problem. The presented theoretical results, practical solutions and guidelines will be valuable both for researchers working in the area of engineering sciences and for practitioners solving industrial problems.

  17. Optimization based automated curation of metabolic reconstructions

    Directory of Open Access Journals (Sweden)

    Maranas Costas D

    2007-06-01

    Full Text Available Abstract Background Currently, there exist tens of different microbial and eukaryotic metabolic reconstructions (e.g., Escherichia coli, Saccharomyces cerevisiae, Bacillus subtilis) with many more under development. All of these reconstructions are inherently incomplete, with some functionalities missing due to the lack of experimental and/or homology information. A key challenge in the automated generation of genome-scale reconstructions is the elucidation of these gaps and the subsequent generation of hypotheses to bridge them. Results In this work, an optimization based procedure is proposed to identify and eliminate network gaps in these reconstructions. First we identify the metabolites in the metabolic network reconstruction which cannot be produced under any uptake conditions and subsequently we identify the reactions from a customized multi-organism database that restore the connectivity of these metabolites to the parent network. This connectivity restoration is hypothesized to take place through four mechanisms: (a) reversing the directionality of one or more reactions in the existing model, (b) adding reactions from another organism to provide functionality absent in the existing model, (c) adding external transport mechanisms to allow for importation of metabolites into the existing model, and (d) restoring flow by adding intracellular transport reactions in multi-compartment models. We demonstrate this procedure for the genome-scale reconstruction of Escherichia coli and also Saccharomyces cerevisiae, wherein compartmentalization of intra-cellular reactions results in a more complex topology of the metabolic network. We determine that about 10% of metabolites in E. coli and 30% of metabolites in S. cerevisiae cannot carry any flux. Interestingly, the dominant flow restoration mechanism is directionality reversal of existing reactions in the respective models. Conclusion We have proposed systematic methods to identify and
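    The first step described in the Results, flagging metabolites that cannot be produced under any uptake condition, can be approximated with a simple reachability ("scope") expansion over the reaction list, as sketched below. This is a deliberately simplified stand-in for the paper's optimization-based formulation, using a toy network and ignoring directionality reversal and database search.

```python
# Hedged sketch: find "blocked" metabolites that can never be produced from the allowed
# uptakes, using simple set expansion over reactions (toy network; the paper instead
# formulates this over the full genome-scale model with an optimization procedure).
def producible_metabolites(reactions, uptakes):
    available = set(uptakes)
    changed = True
    while changed:
        changed = False
        for substrates, products in reactions:
            if set(substrates) <= available and not set(products) <= available:
                available |= set(products)
                changed = True
    return available

# Toy network: (substrates, products) pairs; metabolite "E" has no producing route.
reactions = [
    (["glc"], ["g6p"]),
    (["g6p"], ["f6p"]),
    (["f6p", "atp"], ["fbp"]),
    (["E"], ["E_used"]),          # consumes E, but nothing produces E, so E and E_used are gaps
]
all_metabolites = {m for subs, prods in reactions for m in subs + prods}
uptakes = {"glc", "atp"}

blocked = all_metabolites - producible_metabolites(reactions, uptakes) - uptakes
print("metabolites that cannot be produced:", sorted(blocked))
```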

  18. Data mining för resurseffektivare tillverkning och produktion

    OpenAIRE

    Bergkvist, Anton; Snögren, Dawit

    2014-01-01

    Summary: How does an organization go from being information-rich to becoming insightful - and how does one make the best decisions based on that insight? Incredibly large amounts of data stream through production and manufacturing systems. Companies often lack knowledge about how this information should be used. Data mining provides the tools that can be used to find out that information. By discovering patterns and relationships in large amounts of data, data mining has made it possible to find the kn...

  19. SeqGene: a comprehensive software solution for mining exome- and transcriptome- sequencing data

    OpenAIRE

    Deng Xutao

    2011-01-01

    Abstract Background The popularity of massively parallel exome and transcriptome sequencing projects demands new data mining tools with a comprehensive set of features to support a wide range of analysis tasks. Results SeqGene, a new data mining tool, supports mutation detection and annotation, dbSNP and 1000 Genome data integration, RNA-Seq expression quantification, mutation and coverage visualization, allele specific expression (ASE), differentially expressed genes (DEGs) identification, c...

  20. DATA MINING TECHNIQUES AND APPLICATIONS

    OpenAIRE

    Ramageri, Bharati M.

    2010-01-01

    Data mining is a process which finds useful patterns from large amounts of data. The paper discusses a few of the data mining techniques and algorithms, and some of the organizations which have adapted data mining technology to improve their businesses and found excellent results.

  1. The Automatic Drilling System of 6R-2P Mining Drill Jumbos

    Directory of Open Access Journals (Sweden)

    Yujun Wang

    2015-02-01

    Full Text Available In order to improve the efficiency of underground mining and tunneling operations and to realize automatic drilling, it is necessary to develop the automation system for large drill jumbos. This work focuses on one such mining drill jumbo which is actually a redundant robotic manipulator with eight degrees of freedom, because it has six revolute joints and two prismatic joints. To realize the autonomous drilling operation, algorithms are proposed to calculate the desired pose of the end-effector and to solve the inverse kinematics of the drill jumbo, which is one of the key issues for developing the automation system. After that, a control strategy is proposed to independently control the eight joint variables using PID feedback control approaches. The simulation model is developed in Simulink. As the closed-loop controllers corresponding to all joints are local and independent of each other, the whole system is not a closed-loop feedback control. In order to estimate the possible maximal pose error, the analysis of the pose error caused by the errors of the joint variables is conducted. The results are satisfactory for mining applications and the developed automation system is being applied in the drill jumbos built by Mining Technologies International Inc.
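    The control strategy summarized above, independent PID loops on each of the eight joint variables, can be sketched with a simple discrete-time simulation. The per-joint plant model, the gains and the target configuration below are invented for illustration; the actual drill jumbo controller and its Simulink model are not reproduced here.

```python
# Hedged sketch: independent discrete PID loops driving 8 joint variables toward a target
# configuration (toy damped double-integrator joint model; gains and targets are illustrative only).
import numpy as np

class PID:
    def __init__(self, kp, ki, kd, dt):
        self.kp, self.ki, self.kd, self.dt = kp, ki, kd, dt
        self.integral = 0.0
        self.prev_err = 0.0

    def step(self, target, measured):
        err = target - measured
        self.integral += err * self.dt
        deriv = (err - self.prev_err) / self.dt
        self.prev_err = err
        return self.kp * err + self.ki * self.integral + self.kd * deriv

dt = 0.01
n_joints = 8                                  # 6 revolute + 2 prismatic joint variables
target = np.array([0.5, -0.3, 1.0, 0.2, -0.8, 0.4, 1.5, 0.7])  # desired joint values
q = np.zeros(n_joints)                        # joint positions
dq = np.zeros(n_joints)                       # joint velocities
controllers = [PID(kp=40.0, ki=5.0, kd=12.0, dt=dt) for _ in range(n_joints)]

for _ in range(int(5.0 / dt)):                # 5 s of simulated motion
    for j in range(n_joints):
        u = controllers[j].step(target[j], q[j])
        ddq = u - 2.0 * dq[j]                 # toy joint dynamics with viscous damping
        dq[j] += ddq * dt
        q[j] += dq[j] * dt

print("final joint error:", np.round(target - q, 4))
```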

  2. GPS based checking survey and precise DEM development in Open mine

    Institute of Scientific and Technical Information of China (English)

    XU Ai-gong

    2008-01-01

    The checking survey in an open mine is one of the most frequent and important tasks. It plays the role of forming a connecting link between open mine planning and production. The traditional checking method has such disadvantages as long time consumption, heavy workload, a complicated calculating process, and a low level of automation. GPS and GIS technologies were used to systematically study the core issues of checking surveys in open mines. A detailed GPS data acquisition coding scheme is presented. Based on this scheme, an algorithm for semi-automatic computer cartography was developed. Three methods for eliminating gross errors from the raw data needed for creating the DEM are discussed. Two algorithms were researched and realized which can be used to create a fine open mine DEM model with constrained conditions and to dynamically update the model. The precision analysis and evaluation of the created model were carried out.
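    The gross-error elimination step can be illustrated with one common robust approach, a median/MAD outlier test over the elevation observations, as sketched below. This is a generic example and not necessarily one of the three methods studied in the paper; the survey values are invented.

```python
# Hedged sketch: flag gross errors in raw elevation observations with a robust
# median/MAD rule before building the DEM (generic method, invented data).
import numpy as np

def remove_gross_errors(z, k=3.5):
    """Return a boolean mask of observations kept after a median/MAD outlier test."""
    z = np.asarray(z, dtype=float)
    med = np.median(z)
    mad = np.median(np.abs(z - med))
    if mad == 0:
        return np.ones_like(z, dtype=bool)
    robust_score = 0.6745 * np.abs(z - med) / mad   # approx. z-score under normality
    return robust_score < k

# Elevations (m) from a GPS checking survey; two values are obvious blunders.
elev = np.array([412.3, 412.8, 413.1, 412.5, 498.0, 413.0, 412.7, 311.9, 412.9])
keep = remove_gross_errors(elev)
print("kept:", elev[keep])
print("flagged as gross errors:", elev[~keep])
```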

  3. Integrative Bioinformatics for Genomics and Proteomics

    OpenAIRE

    Wu, C.H.

    2011-01-01

    Systems integration is becoming the driving force for 21st century biology. Researchers are systematically tackling gene functions and complex regulatory processes by studying organisms at different levels of organization, from genomes and transcriptomes to proteomes and interactomes. To fully realize the value of such high-throughput data requires advanced bioinformatics for integration, mining, comparative analysis, and functional interpretation. We are developing a bioinformatics research ...

  4. Linking Virus Genomes with Host Taxonomy.

    Science.gov (United States)

    Mihara, Tomoko; Nishimura, Yosuke; Shimizu, Yugo; Nishiyama, Hiroki; Yoshikawa, Genki; Uehara, Hideya; Hingamp, Pascal; Goto, Susumu; Ogata, Hiroyuki

    2016-03-01

    Environmental genomics can describe all forms of organisms--cellular and viral--present in a community. The analysis of such eco-systems biology data relies heavily on reference databases, e.g., taxonomy or gene function databases. Reference databases of symbiosis sensu lato, although essential for the analysis of organism interaction networks, are lacking. By mining existing databases and literature, we here provide a comprehensive and manually curated database of taxonomic links between viruses and their cellular hosts.

  5. 76 FR 63238 - Proximity Detection Systems for Continuous Mining Machines in Underground Coal Mines

    Science.gov (United States)

    2011-10-12

    ... Proximity Detection Systems for Continuous Mining Machines in Underground Coal Mines, published on August 31... Mining Machines in Underground Coal Mines AGENCY: Mine Safety and Health Administration, Labor. ACTION... Mining Machines in Underground Coal Mines. Due to requests from the public and to provide...

  6. Automated power management and control

    Science.gov (United States)

    Dolce, James L.

    1991-01-01

    A comprehensive automation design is being developed for Space Station Freedom's electric power system. A joint effort between NASA's Office of Aeronautics and Exploration Technology and NASA's Office of Space Station Freedom, it strives to increase station productivity by applying expert systems and conventional algorithms to automate power system operation. The initial station operation will use ground-based dispatchers to perform the necessary command and control tasks. These tasks constitute planning and decision-making activities that strive to eliminate unplanned outages. We perceive an opportunity to help these dispatchers make fast and consistent on-line decisions by automating three key tasks: failure detection and diagnosis, resource scheduling, and security analysis. Expert systems will be used for the diagnostics and for the security analysis; conventional algorithms will be used for the resource scheduling.

  7. Computer automation and artificial intelligence

    International Nuclear Information System (INIS)

    Rapid advances in computing resulting from the microchip revolution have increased its applications manifold, particularly for computer automation. Yet the level of automation available has limited its application to more complex and dynamic systems which require intelligent computer control. In this paper a review of artificial intelligence techniques used to augment automation is presented. The sequential processing approach usually adopted in artificial intelligence has succeeded in emulating the symbolic processing part of intelligence, but the processing power required to capture the more elusive aspects of intelligence leads towards parallel processing. An overview of parallel processing, with emphasis on the transputer, is also provided. A fuzzy knowledge based controller for amination drug delivery in muscle relaxant anesthesia on a transputer is described. 4 figs. (author)

  8. Unmet needs in automated cytogenetics

    International Nuclear Information System (INIS)

    Though some, at least, of the goals of automation systems for the analysis of clinical cytogenetic material seem either at hand, like automatic metaphase finding, or at least likely to be met in the near future, like operator-assisted semi-automatic analysis of banded metaphase spreads, important areas of cytogenetic analysis, most importantly the determination of chromosomal aberration frequencies in populations of cells or in samples of cells from people exposed to environmental mutagens, await practical methods of automation. Important as the clinical diagnostic applications are, it is apparent that increasing concern over the clastogenic effects of the multitude of potentially clastogenic chemical and physical agents to which human populations are being increasingly exposed, and the resulting emergence of extensive cytogenetic testing protocols, makes the development of automation not only economically feasible but almost mandatory. The nature of the problems involved, and actual or possible approaches to their solution, are discussed.

  9. Manual versus automated blood sampling

    DEFF Research Database (Denmark)

    Teilmann, A C; Kalliokoski, Otto; Sørensen, Dorte B;

    2014-01-01

    Facial vein (cheek blood) and caudal vein (tail blood) phlebotomy are two commonly used techniques for obtaining blood samples from laboratory mice, while automated blood sampling through a permanent catheter is a relatively new technique in mice. The present study compared physiological parameters, glucocorticoid dynamics as well as the behavior of mice sampled repeatedly for 24 h by cheek blood, tail blood or automated blood sampling from the carotid artery. Mice subjected to cheek blood sampling lost significantly more body weight, had elevated levels of plasma corticosterone, excreted more fecal corticosterone metabolites, and expressed more anxious behavior than did the mice of the other groups. Plasma corticosterone levels of mice subjected to tail blood sampling were also elevated, although less significantly. Mice subjected to automated blood sampling were less affected with regard to the parameters...

  10. Network based automation for SMEs

    DEFF Research Database (Denmark)

    Shahabeddini Parizi, Mohammad; Radziwon, Agnieszka

    2016-01-01

    The implementation of appropriate automation concepts which increase productivity in Small and Medium Sized Enterprises (SMEs) requires a lot of effort, due to their limited resources. Therefore, it is strongly recommended for small firms to open up to external sources of knowledge, which could be obtained through network interaction. Based on two extreme cases of SMEs representing low-tech industry and an in-depth analysis of their manufacturing facilities, this paper presents how collaboration between firms embedded in a regional ecosystem could result in the implementation of new... In addition, this paper develops and discusses a set of guidelines for systematic productivity improvement within an innovative collaboration in regards to automation processes in SMEs.

  11. Design automation for integrated circuits

    Science.gov (United States)

    Newell, S. B.; de Geus, A. J.; Rohrer, R. A.

    1983-04-01

    Consideration is given to the development status of the use of computers in automated integrated circuit design methods, which promise the minimization of both design time and design error incidence. Integrated circuit design encompasses two major tasks: error specification, in which the goal is a logic diagram that accurately represents the desired electronic function, and physical specification, in which the goal is an exact description of the physical locations of all circuit elements and their interconnections on the chip. Design automation not only saves money by reducing design and fabrication time, but also helps the community of systems and logic designers to work more innovatively. Attention is given to established design automation methodologies, programmable logic arrays, and design shortcuts.

  12. Ancient genomics

    DEFF Research Database (Denmark)

    Der Sarkissian, Clio; Allentoft, Morten Erik; Avila Arcos, Maria del Carmen;

    2015-01-01

    , archaic hominins, ancient pathogens and megafaunal species. Those have revealed important functional and phenotypic information, as well as unexpected adaptation, migration and admixture patterns. As such, the field of aDNA has entered the new era of genomics and has provided valuable information when...

  13. Herbarium genomics

    DEFF Research Database (Denmark)

    Bakker, Freek T.; Lei, Di; Yu, Jiaying;

    2016-01-01

    Herbarium genomics is proving promising as next-generation sequencing approaches are well suited to deal with the usually fragmented nature of archival DNA. We show that routine assembly of partial plastome sequences from herbarium specimens is feasible, from total DNA extracts and with specimens...

  14. Frequent pattern mining

    CERN Document Server

    Aggarwal, Charu C

    2014-01-01

    Proposes numerous methods to solve some of the most fundamental problems in data mining and machine learning; presents various simplified perspectives, providing a range of information to benefit both students and practitioners; and includes surveys on key research content, case studies and future research directions.
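    For readers new to the topic, the sketch below implements the classic Apriori candidate-generation loop on a toy transaction list; it is a minimal teaching example and is not taken from the book.

```python
# Hedged sketch: classic Apriori frequent-itemset mining on toy transactions
# (teaching example; not code from the book).
from itertools import combinations

def apriori(transactions, min_support=2):
    transactions = [frozenset(t) for t in transactions]
    items = {i for t in transactions for i in t}
    current = {frozenset([i]) for i in items}     # candidate 1-itemsets
    frequent = {}
    k = 1
    while current:
        counts = {c: sum(1 for t in transactions if c <= t) for c in current}
        level = {c: n for c, n in counts.items() if n >= min_support}
        frequent.update(level)
        # Candidate generation: join frequent k-itemsets into (k+1)-itemsets,
        # keeping only those whose k-subsets are all frequent (Apriori pruning).
        k += 1
        keys = list(level)
        current = {
            a | b for a, b in combinations(keys, 2) if len(a | b) == k
            and all(frozenset(s) in level for s in combinations(a | b, k - 1))
        }
    return frequent

transactions = [
    {"bread", "milk"},
    {"bread", "diapers", "beer", "eggs"},
    {"milk", "diapers", "beer", "cola"},
    {"bread", "milk", "diapers", "beer"},
    {"bread", "milk", "diapers", "cola"},
]

for itemset, support in sorted(apriori(transactions).items(),
                               key=lambda kv: (-kv[1], sorted(kv[0]))):
    print(support, sorted(itemset))
```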

  15. Contextual Text Mining

    Science.gov (United States)

    Mei, Qiaozhu

    2009-01-01

    With the dramatic growth of text information, there is an increasing need for powerful text mining systems that can automatically discover useful knowledge from text. Text is generally associated with all kinds of contextual information. Those contexts can be explicit, such as the time and the location where a blog article is written, and the…

  16. Lunabotics Mining Competition

    Science.gov (United States)

    Mueller, Rob; Murphy, Gloria

    2010-01-01

    This slide presentation describes a competition to design a lunar robot (lunabot) that can be controlled either remotely or autonomously, isolated from the operator, and is designed to mine a lunar aggregate simulant. The competition is part of a systems engineering curriculum. The 2010 competition winners in five areas of the competition were acknowledged, and the 2011 competition was announced.

  17. Coal Mines Security System

    Directory of Open Access Journals (Sweden)

    Ankita Guhe

    2012-05-01

    Full Text Available Geological circumstances of mines tend to be extremely complicated and there are many hidden troubles. Coal is wrongly lifted by musclemen from coal stocks, coal washeries, coal transfer and loading points and also on the transport routes by manipulating the weighing of trucks. CIL (Coal India Ltd) is under the control of the mafia and a large number of irregularities can be attributed to the coal mafia. An Intelligent Coal Mine Security System using a data acquisition method utilizes sensor, automatic detection, communication and microcontroller technologies to monitor the operational parameters of the mining area. The data acquisition terminal takes the PIC 16F877A microcontroller as its core for sensing the data and communicates with the main control machine through an RS232 interface, realizing intelligent monitoring. The data management system uses an EEPROM chip as a black box to store data permanently and also uses a CCTV camera for recording the internal situation. The system implements real-time monitoring and display of underground data, query, deletion and maintenance of historical data, graphical statistics, report printing, expert diagnosis and decision-making support. The research, development and promoted application will safeguard mine pit control with accuracy, real-time capability and high reliability.

  18. Genome cartography: charting the apicomplexan genome.

    Science.gov (United States)

    Kissinger, Jessica C; DeBarry, Jeremy

    2011-08-01

    Genes reside in particular genomic contexts that can be mapped at many levels. Historically, 'genetic maps' were used primarily to locate genes. Recent technological advances in the determination of genome sequences have made the analysis and comparison of whole genomes possible and increasingly tractable. What do we see if we shift our focus from gene content (the 'inventory' of genes contained within a genome) to the composition and organization of a genome? This review examines what has been learned about the evolution of the apicomplexan genome as well as the significance and impact of genomic location on our understanding of the eukaryotic genome and parasite biology.

  19. Automated Podcasting System for Universities

    Directory of Open Access Journals (Sweden)

    Ypatios Grigoriadis

    2013-03-01

    Full Text Available This paper presents the results achieved at Graz University of Technology (TU Graz in the field of automating the process of recording and publishing university lectures in a very new way. It outlines cornerstones of the development and integration of an automated recording system such as the lecture hall setup, the recording hardware and software architecture as well as the development of a text-based search for the final product by method of indexing video podcasts. Furthermore, the paper takes a look at didactical aspects, evaluations done in this context and future outlook.

  20. Agile Data: Automating database refactorings

    Directory of Open Access Journals (Sweden)

    Bruno Xavier

    2014-09-01

    Full Text Available This paper discusses an automated approach to database change management throughout the companies’ development workflow. By using automated tools, companies can avoid common issues related to manual database deployments. This work was motivated by analyzing usual problems within organizations, mostly originated from manual interventions that may result in systems disruptions and production incidents. In addition to practices of continuous integration and continuous delivery, the current paper describes a case study in which a suggested pipeline is implemented in order to reduce the deployment times and decrease incidents due to ineffective data controlling.
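    One minimal pattern behind such automated pipelines is a migration runner that applies versioned SQL scripts exactly once and records them in a schema-history table. The sketch below shows that idea with sqlite3 and in-memory migrations; it is an illustrative toy, not the tooling evaluated in the case study.

```python
# Hedged sketch: a tiny migration runner of the kind a database-deployment pipeline
# automates - versioned scripts applied once, tracked in a history table (toy example
# using sqlite3; not the tooling from the case study).
import sqlite3

MIGRATIONS = {  # version -> SQL (in practice these would be files under version control)
    1: "CREATE TABLE customer (id INTEGER PRIMARY KEY, name TEXT)",
    2: "ALTER TABLE customer ADD COLUMN email TEXT",
    3: "CREATE INDEX idx_customer_email ON customer(email)",
}

def migrate(conn):
    conn.execute("CREATE TABLE IF NOT EXISTS schema_history (version INTEGER PRIMARY KEY)")
    applied = {v for (v,) in conn.execute("SELECT version FROM schema_history")}
    for version in sorted(MIGRATIONS):
        if version in applied:
            continue
        with conn:  # each migration plus its history row commits together
            conn.execute(MIGRATIONS[version])
            conn.execute("INSERT INTO schema_history (version) VALUES (?)", (version,))
        print(f"applied migration {version}")

conn = sqlite3.connect(":memory:")
migrate(conn)
migrate(conn)  # second run is a no-op: the pipeline is idempotent
print([row for row in conn.execute("PRAGMA table_info(customer)")])
```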