WorldWideScience

Sample records for automated genome mining

  1. Automated genome mining for natural products

    Directory of Open Access Journals (Sweden)

    Zajkowski James

    2009-06-01

    Full Text Available Abstract Background Discovery of new medicinal agents from natural sources has largely been an adventitious process based on screening of plant and microbial extracts combined with bioassay-guided identification and natural product structure elucidation. Increasingly rapid and more cost-effective genome sequencing technologies coupled with advanced computational power have converged to transform this trend toward a more rational and predictive pursuit. Results We have developed a rapid method of scanning genome sequences for multiple polyketide, nonribosomal peptide, and mixed combination natural products with output in a text format that can be readily converted to two and three dimensional structures using conventional software. Our open-source and web-based program can assemble various small molecules composed of twenty standard amino acids and twenty two other chain-elongation intermediates used in nonribosomal peptide systems, and four acyl-CoA extender units incorporated into polyketides by reading a hidden Markov model of DNA. This process evaluates and selects the substrate specificities along the assembly line of nonribosomal synthetases and modular polyketide synthases. Conclusion Using this approach we have predicted the structures of natural products from a diverse range of bacteria based on a limited number of signature sequences. In accelerating direct DNA to metabolomic analysis, this method bridges the interface between chemists and biologists and enables rapid scanning for compounds with potential therapeutic value.
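    To make the assembly-line logic of the record above concrete, the following sketch shows only the final step, mapping per-module substrate predictions (as would be obtained from profile-HMM scans of adenylation and acyltransferase domains) to a linear monomer string; the signature-to-monomer table and module signatures are illustrative placeholders, not the published specificity codes of the tool.

```python
# Minimal sketch: turn per-module substrate predictions into the linear
# text form described in the abstract. The signature-to-monomer table is
# illustrative, not the tool's published specificity code.

SIGNATURE_TO_MONOMER = {
    "DAWTIAAIC": "Phe",              # hypothetical A-domain signature
    "DLTKVGHIG": "Ser",              # hypothetical A-domain signature
    "HAFH": "methylmalonyl-CoA",     # hypothetical AT-domain motif
    "HAFG": "malonyl-CoA",           # hypothetical AT-domain motif
}

def predict_monomers(module_signatures):
    """Map extracted domain signatures (e.g. from profile-HMM hits) to
    their most likely substrates; unknown signatures become 'X'."""
    return [SIGNATURE_TO_MONOMER.get(sig, "X") for sig in module_signatures]

def assembly_line_product(module_signatures):
    """Join per-module substrates into a linear text string that downstream
    chemistry software can convert to a 2D/3D structure."""
    return "-".join(predict_monomers(module_signatures))

if __name__ == "__main__":
    modules = ["DAWTIAAIC", "HAFG", "DLTKVGHIG"]     # order along the assembly line
    print(assembly_line_product(modules))            # Phe-malonyl-CoA-Ser
```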

  2. Analyzing and mining automated imaging experiments.

    Science.gov (United States)

    Berlage, Thomas

    2007-04-01

    Image mining is the application of computer-based techniques that extract and exploit information from large image sets to support human users in generating knowledge from these sources. This review focuses on biomedical applications of this technique, in particular automated imaging at the cellular level. Due to increasing automation and the availability of integrated instruments, biomedical users are becoming increasingly confronted with the problem of analyzing such data. Image database applications need to combine data management, image analysis and visual data mining. The main point of such a system is a software layer that represents objects within an image and the ability to use a large spectrum of quantitative and symbolic object features. Image analysis needs to be adapted to each particular experiment; therefore, 'end user programming' will be desired to make the technology more widely applicable.

  3. Sensing for advancing mining automation capability:A review of underground automation technology development

    Institute of Scientific and Technical Information of China (English)

    Ralston Jonathon; Reid David; Hargrave Chad; Hainsworth David

    2014-01-01

    This paper highlights the role of automation technologies for improving the safety, productivity, and environmental sustainability of underground coal mining processes. This is accomplished by reviewing the impact that the introduction of automation technology has made through the longwall shearer automation research program of the Longwall Automation Steering Committee (LASC). This result has been achieved through close integration of sensing, processing, and control technologies into the longwall mining process. Key to the success of the automation solution has been the development of new sensing methods to accurately measure the location of longwall equipment and the spatial configuration of coal seam geology. The relevance of system interoperability and open communications standards for facilitating effective automation is also discussed. Importantly, the insights gained through the longwall automation development process are now leading to new technology transfer activity to benefit other underground mining processes.

  4. Text mining from ontology learning to automated text processing applications

    CERN Document Server

    Biemann, Chris

    2014-01-01

    This book comprises a set of articles that specify the methodology of text mining, describe the creation of lexical resources in the framework of text mining and use text mining for various tasks in natural language processing (NLP). The analysis of large amounts of textual data is a prerequisite to build lexical resources such as dictionaries and ontologies and also has direct applications in automated text processing in fields such as history, healthcare and mobile applications, just to name a few. This volume gives an update in terms of the recent gains in text mining methods and reflects

  5. Present situation and developing trend of coal mine automation and communication technology

    Institute of Scientific and Technical Information of China (English)

    HU Sui-yan

    2008-01-01

    Introduces the development process of coal mine automation and communication technology, analyzes the present features and characteristics of coal mine automation and communication technology, and puts forward a few key technical problems that need to be solved.

  6. DGIdb - Mining the druggable genome

    Science.gov (United States)

    Coffman, Adam C.; Weible, James V.; McMichael, Josh F.; Spies, Nicholas C.; Koval, James; Das, Indraniel; Callaway, Matthew B.; Eldred, James M.; Miller, Christopher A.; Subramanian, Janakiraman; Govindan, Ramaswamy; Kumar, Runjun D.; Bose, Ron; Ding, Li; Walker, Jason R.; Larson, David E.; Dooling, David J.; Smith, Scott M.; Ley, Timothy J.; Mardis, Elaine R.; Wilson, Richard K.

    2013-01-01

    The Drug-Gene Interaction database (DGIdb) mines existing resources that generate hypotheses about how mutated genes might be targeted therapeutically or prioritized for drug development. It provides an interface for searching lists of genes against a compendium of drug-gene interactions and potentially druggable genes. DGIdb can be accessed at dgidb.org. PMID:24122041
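    DGIdb can also be queried programmatically. The sketch below shows the general pattern for searching a gene list against the interaction compendium over HTTP; the endpoint URL and response fields used here are assumptions for illustration and should be checked against the current API documentation at dgidb.org.

```python
# Hedged sketch: query a list of genes against DGIdb over HTTP.
# ENDPOINT and the response fields accessed below are assumptions;
# consult dgidb.org for the currently supported API.
import requests

ENDPOINT = "https://dgidb.org/api/v2/interactions.json"   # assumed endpoint

def drug_gene_interactions(genes):
    resp = requests.get(ENDPOINT, params={"genes": ",".join(genes)}, timeout=30)
    resp.raise_for_status()
    data = resp.json()
    # Assumed layout: one record per matched gene, each with its interactions.
    for record in data.get("matchedTerms", []):
        gene = record.get("geneName")
        for interaction in record.get("interactions", []):
            yield gene, interaction.get("drugName"), interaction.get("interactionTypes")

if __name__ == "__main__":
    for gene, drug, types in drug_gene_interactions(["BRAF", "ERBB2"]):
        print(gene, drug, types)
```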

  7. DGIdb - Mining the druggable genome

    OpenAIRE

    Griffith, Malachi; Griffith, Obi L; Coffman, Adam C.; Weible, James V.; McMichael, Josh F; Nicholas C Spies; Koval, James; Das, Indraniel; Callaway, Matthew B.; Eldred, James M.; Miller, Christopher A.; Subramanian, Janakiraman; Govindan, Ramaswamy; Runjun D Kumar; Bose, Ron

    2013-01-01

    The Drug-Gene Interaction database (DGIdb) mines existing resources that generate hypotheses about how mutated genes might be targeted therapeutically or prioritized for drug development. It provides an interface for searching lists of genes against a compendium of drug-gene interactions and potentially druggable genes. DGIdb can be accessed at dgidb.org.

  8. Restriction enzyme mining for SNPs in genomes.

    Science.gov (United States)

    Chuang, Li-Yeh; Yang, Cheng-Hong; Tsui, Ke-Hung; Cheng, Yu-Huei; Chang, Phei-Lang; Wen, Cheng-Hao; Chang, Hsueh-Wei

    2008-01-01

    Many different methods for genotyping single nucleotide polymorphisms (SNPs) have been developed recently; however, most of them are expensive. Using restriction enzymes for SNP genotyping is a cost-effective alternative, but mining a genome sequence for suitable restriction enzymes is still challenging for researchers who do not have a background in genomics and bioinformatics. In this review, the basic bioinformatics tools used for restriction enzyme mining for SNP genotyping are summarized and described. The objectives of this paper include: i) the introduction of SNPs, genotyping and PCR-restriction fragment length polymorphism (RFLP); ii) a review of components for genotyping software, including tools for primer design only or restriction enzyme mining only; iii) a review of software providing the flanking sequence for primer design; iv) recent advances in PCR-RFLP tools and natural and mutagenic PCR-RFLP; v) highlighting the strategy for restriction enzyme mining for SNP genotyping; vi) a discussion of potential problems for multiple PCR-RFLP. The different implications for restriction enzymes on sense and antisense strands are also discussed. Our PCR-RFLP freeware, SNP-RFLPing, is included in this review to illustrate many characteristics of PCR-RFLP software design. Future developments will include further sophistication of PCR-RFLP software in order to provide better visualization and a more interactive environment for SNP genotyping and to integrate the software with other tools used in association studies.
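    The central step of PCR-RFLP design, finding an enzyme whose recognition site spans the SNP in exactly one allele, can be sketched in a few lines; the enzyme table below is a tiny illustrative subset rather than the full database used by SNP-RFLPing.

```python
# Sketch of restriction-enzyme mining for a SNP given its flanking sequence.
# An enzyme is a candidate if its recognition site overlapping the SNP
# position is present in exactly one of the two alleles (allele-specific cut).
ENZYMES = {          # illustrative subset; real tools use full REBASE data
    "EcoRI": "GAATTC",
    "TaqI":  "TCGA",
    "HhaI":  "GCGC",
}

def sites(seq, motif):
    """Return start positions of every occurrence of motif in seq."""
    return {i for i in range(len(seq) - len(motif) + 1)
            if seq[i:i + len(motif)] == motif}

def candidate_enzymes(flank5, alleles, flank3):
    """Find enzymes that cut across the SNP in one allele only."""
    snp_pos = len(flank5)
    for name, motif in ENZYMES.items():
        cut = []
        for allele in alleles:
            seq = flank5 + allele + flank3
            # keep only recognition sites that actually span the SNP base
            spanning = {s for s in sites(seq, motif)
                        if s <= snp_pos < s + len(motif)}
            cut.append(bool(spanning))
        if cut[0] != cut[1]:
            yield name, motif

if __name__ == "__main__":
    # SNP written as [C/T]: ...CCGAATT[C/T]GGCATT...
    for name, motif in candidate_enzymes("CCGAATT", ("C", "T"), "GGCATT"):
        print(name, motif)   # EcoRI cuts the C allele only
```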

  9. Automated design of genomic Southern blot probes

    Directory of Open Access Journals (Sweden)

    Komiyama Noboru H

    2010-01-01

    experimentally validate a number of these automated designs by Southern blotting. The majority of probes we tested performed well confirming our in silico prediction methodology and the general usefulness of the software for automated genomic Southern probe design. Conclusions Software and supplementary information are freely available at: http://www.genes2cognition.org/software/southern_blot

  10. Pneumatic automation systems in coal mines

    Energy Technology Data Exchange (ETDEWEB)

    Shmatkov, N.A.; Kiklevich, Yu.N.

    1981-04-01

    Giprougleavtomatizatsiya, Avtomatgormash, Dongiprouglemash, VNIIGD and other plants develop 30 new pneumatic systems for mine machines and equipment control each year. The plants produce about 200 types of pneumatic systems. Major pneumatic systems for face systems, machines and equipment are reviewed: Sirena system for remote control of ANShch and AShchM face systems for steep coal seams, UPS control systems for pump stations, PAUZA control system for stowing machines, remote control system of B100-200 drilling machines, PUSK control system for coal cutter loaders with pneumatic drive (A-70, Temp), PUVSh control system for ventilation barriers activated from moving electric locomotives, PAZ control system for skip hoist loading. Specifications of the systems are given. Economic benefits produced by the pneumatic control systems are evaluated (from 1,500 to 40,000 rubles/year). Using the systems increases productivity of face machines and other machines used in black coal mines by 5 to 30%.

  11. An automated Genomes-to-Natural Products platform (GNP) for the discovery of modular natural products.

    Science.gov (United States)

    Johnston, Chad W; Skinnider, Michael A; Wyatt, Morgan A; Li, Xiang; Ranieri, Michael R M; Yang, Lian; Zechel, David L; Ma, Bin; Magarvey, Nathan A

    2015-09-28

    Bacterial natural products are a diverse and valuable group of small molecules, and genome sequencing indicates that the vast majority remain undiscovered. The prediction of natural product structures from biosynthetic assembly lines can facilitate their discovery, but highly automated, accurate, and integrated systems are required to mine the broad spectrum of sequenced bacterial genomes. Here we present a genome-guided natural products discovery tool to automatically predict, combinatorialize and identify polyketides and nonribosomal peptides from biosynthetic assembly lines using LC-MS/MS data of crude extracts in a high-throughput manner. We detail the directed identification and isolation of six genetically predicted polyketides and nonribosomal peptides using our Genome-to-Natural Products platform. This highly automated, user-friendly programme provides a means of realizing the potential of genetically encoded natural products.

  12. Automation and remote control at Mining 85 in Birmingham

    Energy Technology Data Exchange (ETDEWEB)

    Czauderna, N.

    1985-09-12

    Looking round the exhibition showed that more and more manufacturers and users take advantage of the new measuring, control and regulatory techniques with economic profit. Process computers and micro-computers make a general use of these techniques possible. The backbone of all automation systems is data transmission between widely distributed stations on individual machines. Here, too, it was clear that in mining multiple data transmission on the basis of highly integrated components and own microprocessors is occupying an ever more prominent place in communication techniques. BUS systems are more and more in evidence and already installed in some plants. For the purpose of this report and ease of reference to the large field of automation and remote control at Mining 85, the products have been grouped under sensor, transmission and automation technologies. (orig./MOS).

  13. Genome Mining for antibiotics biosynthesis pathways with antiSMASH 3

    DEFF Research Database (Denmark)

    Weber, Tilmann; Kim, Hyun Uk; Blin, Kai

    2014-01-01

    Microorganisms are the most important source of natural products with antimicrobial or antitumor activity. These natural products are the main source for anti-infectives; 80% of antibiotics currently in medical use are derived from this class of compounds. In the past, functional screenings ... the biological sources for novel drug candidates. For high-throughput genome mining, sophisticated software is required, which allows the prediction of putative biosynthetic products based on genomic data. Here, we present the new version 3 of the software antiSMASH (http://antismash.secondarymetabolites.org). antiSMASH 3 currently is the most comprehensive automated genome mining platform for natural product biosynthetic pathways. It automatically screens genomic data of bacteria and fungi for the presence of 24 different types of secondary metabolite biosynthetic pathways. For different classes of secondary...

  14. Comparative genomics using data mining tools

    Indian Academy of Sciences (India)

    Tannistha Nandi; Chandrika B-Rao; Srinivasan Ramachandran

    2002-02-01

    We have analysed the genomes of representatives of three kingdoms of life, namely, archaea, eubacteria and eukaryota using data mining tools based on compositional analyses of the protein sequences. The representatives chosen in this analysis were Methanococcus jannaschii, Haemophilus influenzae and Saccharomyces cerevisiae. We have identified the common and different features between the three genomes in the protein evolution patterns. M. jannaschii has been seen to have a greater number of proteins with more charged amino acids whereas S. cerevisiae has been observed to have a greater number of hydrophilic proteins. Despite the differences in intrinsic compositional characteristics between the proteins from the different genomes we have also identified certain common characteristics. We have carried out exploratory Principal Component Analysis of the multivariate data on the proteins of each organism in an effort to classify the proteins into clusters. Interestingly, we found that most of the proteins in each organism cluster closely together, but there are a few ‘outliers’. We focus on the outliers for the functional investigations, which may aid in revealing any unique features of the biology of the respective organisms.
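    A minimal sketch of the compositional approach described above, computing amino-acid composition vectors for a few toy sequences and projecting them with Principal Component Analysis; in the study this is applied proteome-wide and the outliers are then examined for functional clues.

```python
# Sketch: amino-acid composition vectors + Principal Component Analysis,
# the kind of exploratory analysis described in the abstract.
import numpy as np
from sklearn.decomposition import PCA

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"

def composition(seq):
    """Fraction of each of the 20 standard amino acids in a protein."""
    seq = seq.upper()
    counts = np.array([seq.count(aa) for aa in AMINO_ACIDS], dtype=float)
    return counts / max(len(seq), 1)

if __name__ == "__main__":
    proteins = {            # toy sequences; real input is a whole proteome
        "p1": "MKKLLLLAAAVVVAAKKRRDE",
        "p2": "MSTSTSTSSSSNNNQQQDDDEE",
        "p3": "MWWWCCCYYYFFFLLLIIIVVV",
    }
    X = np.vstack([composition(s) for s in proteins.values()])
    pcs = PCA(n_components=2).fit_transform(X)
    for name, (pc1, pc2) in zip(proteins, pcs):
        print(f"{name}: PC1={pc1:+.3f} PC2={pc2:+.3f}")
```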

  15. Automated Mineral Analysis to Characterize Metalliferous Mine Waste

    Science.gov (United States)

    Hensler, Ana-Sophie; Lottermoser, Bernd G.; Vossen, Peter; Langenberg, Lukas C.

    2016-10-01

    The objective of this study was to investigate the applicability of automated QEMSCAN® mineral analysis combined with bulk geochemical analysis to evaluate the environmental risk of non-acid producing mine waste present at the historic Albertsgrube Pb-Zn mine site, Hastenrath, North Rhine-Westphalia, Germany. Geochemical analyses revealed elevated average abundances of As, Cd, Cu, Mn, Pb, Sb and Zn and near neutral to slightly alkaline paste pH values. Mineralogical analyses using the QEMSCAN® revealed diverse mono- and polymineralic particles across all samples, with grain sizes ranging from a few μm up to 2000 μm. Calcite and dolomite (up to 78 %), smithsonite (up to 24 %) and Ca sulphate (up to 11.5 %) are present mainly as coarse-grained particles. By contrast, significant amounts of quartz, muscovite/illite, sphalerite (up to 10.8 %), galena (up to 1 %), pyrite (up to 3.4 %) and cerussite/anglesite (up to 4.3 %) are present as fine-grained (<500 μm) particles. QEMSCAN® analysis also identified disseminated sauconite, coronadite/chalcophanite, chalcopyrite, jarosite, apatite, rutile, K-feldspar, biotite, Fe (hydr)oxides/carbonates and unknown Zn-Pb (Fe) and Zn-Pb-Ca (Fe-Ti) phases. Many of the metal-bearing sulphide grains occur as separate particles with exposed surface areas and thus may be a matter of environmental concern, because such mineralogical hosts will continue to release metals and metalloids (As, Cd, Sb, Zn) at near neutral pH into ground and surface waters. QEMSCAN® mineral analysis allows acquisition of fully quantitative data on the mineralogical composition, textural characteristics and grain size estimation of mine waste material and permits the recognition of mine waste as “high-risk” material that would have otherwise been classified by traditional geochemical tests as benign.

  16. BAGEL2 : mining for bacteriocins in genomic data

    NARCIS (Netherlands)

    de Jong, Anne; van Heel, Auke J.; Kok, Jan; Kuipers, Oscar P.

    2010-01-01

    Mining bacterial genomes for bacteriocins is a challenging task due to the substantial structure and sequence diversity, and generally small sizes, of these antimicrobial peptides. Major progress in the research of antimicrobial peptides and the ever-increasing quantities of genomic data, varying fr

  17. BEACON: automated tool for Bacterial GEnome Annotation ComparisON

    KAUST Repository

    Kalkatawi, Manal Matoq Saeed

    2015-08-18

    Background Genome annotation is one way of summarizing the existing knowledge about genomic characteristics of an organism. There has been an increased interest during the last several decades in computer-based structural and functional genome annotation. Many methods for this purpose have been developed for eukaryotes and prokaryotes. Our study focuses on comparison of functional annotations of prokaryotic genomes. To the best of our knowledge there is no fully automated system for detailed comparison of functional genome annotations generated by different annotation methods (AMs). Results The presence of many AMs and development of new ones introduce needs to: a/ compare different annotations for a single genome, and b/ generate annotation by combining individual ones. To address these issues we developed an Automated Tool for Bacterial GEnome Annotation ComparisON (BEACON) that benefits both AM developers and annotation analysers. BEACON provides detailed comparison of gene function annotations of prokaryotic genomes obtained by different AMs and generates extended annotations through combination of individual ones. For the illustration of BEACON’s utility, we provide a comparison analysis of multiple different annotations generated for four genomes and show on these examples that the extended annotation can increase the number of genes annotated by putative functions up to 27 %, while the number of genes without any function assignment is reduced. Conclusions We developed BEACON, a fast tool for an automated and a systematic comparison of different annotations of single genomes. The extended annotation assigns putative functions to many genes with unknown functions. BEACON is available under GNU General Public License version 3.0 and is accessible at: http://www.cbrc.kaust.edu.sa/BEACON/
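    The comparison-and-combination idea can be illustrated with a small sketch that is not BEACON itself: given two gene-to-function annotation tables, it reports agreements and conflicts and builds an extended annotation in which a gene lacking a function in one method inherits the other method's putative assignment.

```python
# Sketch (not BEACON itself): compare two functional annotations of the
# same genome and build an extended annotation by combining them.

def compare_and_extend(ann_a, ann_b, unknown="hypothetical protein"):
    extended, agree, conflict = {}, [], []
    for gene in set(ann_a) | set(ann_b):
        fa, fb = ann_a.get(gene, unknown), ann_b.get(gene, unknown)
        if fa == fb:
            agree.append(gene)
            extended[gene] = fa
        elif unknown in (fa, fb):
            # one method offers a putative function the other lacks
            extended[gene] = fb if fa == unknown else fa
        else:
            conflict.append(gene)
            extended[gene] = f"{fa} | {fb}"     # keep both for manual review
    return extended, agree, conflict

if __name__ == "__main__":
    method1 = {"g1": "DNA gyrase subunit A", "g2": "hypothetical protein"}
    method2 = {"g1": "DNA gyrase subunit A", "g2": "ABC transporter", "g3": "recombinase"}
    ext, agree, conflict = compare_and_extend(method1, method2)
    print(ext)
```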

  18. The evolution of genome mining in microbes – a review

    DEFF Research Database (Denmark)

    Ziemert, Nadine; Alanjary, Mohammad; Weber, Tilmann

    2016-01-01

    Covering: 2006 to 2016. The computational mining of genomes has become an important part in the discovery of novel natural products as drug leads. Thousands of bacterial genome sequences are publicly available these days, containing an even larger number and diversity of secondary metabolite gene clusters that await linkage to their encoded natural products. With the development of high-throughput sequencing methods and the wealth of DNA data available, a variety of genome mining methods and tools have been developed to guide discovery and characterisation of these compounds. This article reviews...

  19. Highlights of recent articles on data mining in genomics & proteomics

    Science.gov (United States)

    This editorial elaborates on investigations consisting of different “OMICS” technologies and their application to biological sciences. In addition, advantages and recent development of the proteomic, genomic and data mining technologies are discussed. This information will be useful to scientists ...

  20. Mining nematode genome data for novel drug targets.

    Science.gov (United States)

    Foster, Jeremy M; Zhang, Yinhua; Kumar, Sanjay; Carlow, Clotilde K S

    2005-03-01

    Expressed sequence tag projects have currently produced over 400 000 partial gene sequences from more than 30 nematode species and the full genomic sequences of selected nematodes are being determined. In addition, functional analyses in the model nematode Caenorhabditis elegans have addressed the role of almost all genes predicted by the genome sequence. This recent explosion in the amount of available nematode DNA sequences, coupled with new gene function data, provides an unprecedented opportunity to identify pre-validated drug targets through efficient mining of nematode genomic databases. This article describes the various information sources available and strategies that can expedite this process.

  1. Large-scale data mining pilot project in human genome

    Energy Technology Data Exchange (ETDEWEB)

    Musick, R.; Fidelis, R.; Slezak, T.

    1997-05-01

    This whitepaper briefly describes a new, aggressive effort in large-scale data mining at Livermore National Labs. The implications of `large-scale` will be clarified in a later section. In the short term, this effort will focus on several mission-critical questions of the Genome project. We will adapt current data mining techniques to the Genome domain, quantify the accuracy of inference results, and lay the groundwork for a more extensive effort in large-scale data mining. A major aspect of the approach is that we will be supported by a fully-staffed data warehousing effort in the human Genome area. The long term goal is a strong applications-oriented research program in large-scale data mining. The tools and skill set gained will be directly applicable to a wide spectrum of tasks involving large spatial and multidimensional data. This includes applications in ensuring non-proliferation, stockpile stewardship, enabling Global Ecology (Materials Database Industrial Ecology), advancing the Biosciences (Human Genome Project), and supporting data for others (Battlefield Management, Health Care).

  2. Digital Coal Mine Integrated Automation System Based on ControlNet

    Institute of Scientific and Technical Information of China (English)

    CHEN Jin-yun; ZHANG Shen; ZUO Wei-ran

    2007-01-01

    A three-layer model for digital communication in a mine is proposed. Two basic platforms are discussed: A uniform transmission network and a uniform data warehouse. An actual, ControlNet based, transmission network platform suitable for the Jining No.3 coal mine is presented. This network is an information superhighway intended to integrate all existing and new automation subsystems. Its standard interface can be used with future subsystems. The network, data structure and management decision-making all employ this uniform hardware and software. This effectively avoids the problems of system and information islands seen in traditional mine-automation systems. The construction of the network provides a stable foundation for digital communication in the Jining No.3 coal mine.

  3. Chapter 13: Mining electronic health records in the genomics era.

    Directory of Open Access Journals (Sweden)

    Joshua C Denny

    Full Text Available The combination of improved genomic analysis methods, decreasing genotyping costs, and increasing computing resources has led to an explosion of clinical genomic knowledge in the last decade. Similarly, healthcare systems are increasingly adopting robust electronic health record (EHR systems that not only can improve health care, but also contain a vast repository of disease and treatment data that could be mined for genomic research. Indeed, institutions are creating EHR-linked DNA biobanks to enable genomic and pharmacogenomic research, using EHR data for phenotypic information. However, EHRs are designed primarily for clinical care, not research, so reuse of clinical EHR data for research purposes can be challenging. Difficulties in use of EHR data include: data availability, missing data, incorrect data, and vast quantities of unstructured narrative text data. Structured information includes billing codes, most laboratory reports, and other variables such as physiologic measurements and demographic information. Significant information, however, remains locked within EHR narrative text documents, including clinical notes and certain categories of test results, such as pathology and radiology reports. For relatively rare observations, combinations of simple free-text searches and billing codes may prove adequate when followed by manual chart review. However, to extract the large cohorts necessary for genome-wide association studies, natural language processing methods to process narrative text data may be needed. Combinations of structured and unstructured textual data can be mined to generate high-validity collections of cases and controls for a given condition. Once high-quality cases and controls are identified, EHR-derived cases can be used for genomic discovery and validation. Since EHR data includes a broad sampling of clinically-relevant phenotypic information, it may enable multiple genomic investigations upon a single set of genotyped
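    A toy sketch of the case/control idea discussed above: a patient is called a case only when a billing code and a supporting mention in the narrative text agree; the codes, keyword pattern and decision rule are illustrative, and real pipelines replace the keyword test with natural language processing.

```python
# Toy sketch of the case/control idea: require a billing code AND a
# supporting mention in narrative text before calling a patient a case.
# Real pipelines replace the keyword test with natural language processing.
import re

CASE_CODES = {"250.00", "E11.9"}          # illustrative diabetes billing codes
CASE_PATTERN = re.compile(r"\btype\s*2\s*diabetes\b", re.IGNORECASE)

def classify(patient):
    has_code = bool(CASE_CODES & set(patient["billing_codes"]))
    has_text = any(CASE_PATTERN.search(note) for note in patient["notes"])
    if has_code and has_text:
        return "case"
    if not has_code and not has_text:
        return "control"
    return "review"                        # conflicting evidence -> manual chart review

if __name__ == "__main__":
    pt = {"billing_codes": ["E11.9"], "notes": ["Pt with Type 2 Diabetes, on metformin."]}
    print(classify(pt))   # case
```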

  4. Data mining and the human genome

    Energy Technology Data Exchange (ETDEWEB)

    Abarbanel, Henry [The MITRE Corporation, McLean, VA (US). JASON Program Office; Callan, Curtis [The MITRE Corporation, McLean, VA (US). JASON Program Office; Dally, William [The MITRE Corporation, McLean, VA (US). JASON Program Office; Dyson, Freeman [The MITRE Corporation, McLean, VA (US). JASON Program Office; Hwa, Terence [The MITRE Corporation, McLean, VA (US). JASON Program Office; Koonin, Steven [The MITRE Corporation, McLean, VA (US). JASON Program Office; Levine, Herbert [The MITRE Corporation, McLean, VA (US). JASON Program Office; Rothaus, Oscar [The MITRE Corporation, McLean, VA (US). JASON Program Office; Schwitters, Roy [The MITRE Corporation, McLean, VA (US). JASON Program Office; Stubbs, Christopher [The MITRE Corporation, McLean, VA (US). JASON Program Office; Weinberger, Peter [The MITRE Corporation, McLean, VA (US). JASON Program Office

    2000-01-07

    As genomics research moves from an era of data acquisition to one of both acquisition and interpretation, new methods are required for organizing and prioritizing the data. These methods would allow an initial level of data analysis to be carried out before committing resources to a particular genetic locus. This JASON study sought to delineate the main problems that must be faced in bioinformatics and to identify information technologies that can help to overcome those problems. While the current influx of data greatly exceeds what biologists have experienced in the past, other scientific disciplines and the commercial sector have been handling much larger datasets for many years. Powerful datamining techniques have been developed in other fields that, with appropriate modification, could be applied to the biological sciences.

  5. Understanding social collaboration between actors and technology in an automated and digitised deep mining environment.

    Science.gov (United States)

    Sanda, M-A; Johansson, J; Johansson, B; Abrahamsson, L

    2011-10-01

    The purpose of this article is to develop knowledge and learning on the best way to automate organisational activities in deep mines that could lead to the creation of harmony between the human, technical and the social system, towards increased productivity. The findings showed that though the introduction of high-level technological tools in the work environment disrupted the social relations developed over time amongst the employees in most situations, the technological tools themselves became substitute social collaborative partners to the employees. It is concluded that, in developing a digitised mining production system, knowledge of the social collaboration between the humans (miners) and the technology they use for their work must be developed. By implication, knowledge of the human's subject-oriented and object-oriented activities should be considered as an important integral resource for developing a better technological, organisational and human interactive subsystem when designing the intelligent automation and digitisation systems for deep mines. STATEMENT OF RELEVANCE: This study focused on understanding the social collaboration between humans and the technologies they use to work in underground mines. The learning provides an added knowledge in designing technologies and work organisations that could better enhance the human-technology interactive and collaborative system in the automation and digitisation of underground mines.

  6. AGAPE (Automated Genome Analysis PipelinE) for pan-genome analysis of Saccharomyces cerevisiae.

    Directory of Open Access Journals (Sweden)

    Giltae Song

    Full Text Available The characterization and public release of genome sequences from thousands of organisms is expanding the scope for genetic variation studies. However, understanding the phenotypic consequences of genetic variation remains a challenge in eukaryotes due to the complexity of the genotype-phenotype map. One approach to this is the intensive study of model systems for which diverse sources of information can be accumulated and integrated. Saccharomyces cerevisiae is an extensively studied model organism, with well-known protein functions and thoroughly curated phenotype data. To develop and expand the available resources linking genomic variation with function in yeast, we aim to model the pan-genome of S. cerevisiae. To initiate the yeast pan-genome, we newly sequenced or re-sequenced the genomes of 25 strains that are commonly used in the yeast research community using advanced sequencing technology at high quality. We also developed a pipeline for automated pan-genome analysis, which integrates the steps of assembly, annotation, and variation calling. To assign strain-specific functional annotations, we identified genes that were not present in the reference genome. We classified these according to their presence or absence across strains and characterized each group of genes with known functional and phenotypic features. The functional roles of novel genes not found in the reference genome and associated with strains or groups of strains appear to be consistent with anticipated adaptations in specific lineages. As more S. cerevisiae strain genomes are released, our analysis can be used to collate genome data and relate it to lineage-specific patterns of genome evolution. Our new tool set will enhance our understanding of genomic and functional evolution in S. cerevisiae, and will be available to the yeast genetics and molecular biology community.

  7. AGAPE (Automated Genome Analysis PipelinE) for pan-genome analysis of Saccharomyces cerevisiae.

    Science.gov (United States)

    Song, Giltae; Dickins, Benjamin J A; Demeter, Janos; Engel, Stacia; Gallagher, Jennifer; Choe, Kisurb; Dunn, Barbara; Snyder, Michael; Cherry, J Michael

    2015-01-01

    The characterization and public release of genome sequences from thousands of organisms is expanding the scope for genetic variation studies. However, understanding the phenotypic consequences of genetic variation remains a challenge in eukaryotes due to the complexity of the genotype-phenotype map. One approach to this is the intensive study of model systems for which diverse sources of information can be accumulated and integrated. Saccharomyces cerevisiae is an extensively studied model organism, with well-known protein functions and thoroughly curated phenotype data. To develop and expand the available resources linking genomic variation with function in yeast, we aim to model the pan-genome of S. cerevisiae. To initiate the yeast pan-genome, we newly sequenced or re-sequenced the genomes of 25 strains that are commonly used in the yeast research community using advanced sequencing technology at high quality. We also developed a pipeline for automated pan-genome analysis, which integrates the steps of assembly, annotation, and variation calling. To assign strain-specific functional annotations, we identified genes that were not present in the reference genome. We classified these according to their presence or absence across strains and characterized each group of genes with known functional and phenotypic features. The functional roles of novel genes not found in the reference genome and associated with strains or groups of strains appear to be consistent with anticipated adaptations in specific lineages. As more S. cerevisiae strain genomes are released, our analysis can be used to collate genome data and relate it to lineage-specific patterns of genome evolution. Our new tool set will enhance our understanding of genomic and functional evolution in S. cerevisiae, and will be available to the yeast genetics and molecular biology community.
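    One step of such a pan-genome analysis, classifying genes by their presence or absence across strains into core, accessory and strain-specific sets, can be sketched as follows; the strain and gene names are illustrative only.

```python
# Sketch of one pan-genome analysis step: classify genes by their presence
# or absence across strains into core, accessory and strain-specific sets.

def classify_pan_genome(strain_genes):
    """strain_genes: dict mapping strain name -> set of gene identifiers."""
    n = len(strain_genes)
    all_genes = set().union(*strain_genes.values())
    counts = {g: sum(g in genes for genes in strain_genes.values()) for g in all_genes}
    core = {g for g, c in counts.items() if c == n}
    unique = {g for g, c in counts.items() if c == 1}
    accessory = all_genes - core - unique
    return core, accessory, unique

if __name__ == "__main__":
    strains = {                              # illustrative strains and genes
        "S288C": {"ADE2", "FLO1", "GAL4"},
        "SK1":   {"ADE2", "GAL4", "RTM1"},
        "Sigma": {"ADE2", "FLO1", "GAL4", "MAL63"},
    }
    core, accessory, unique = classify_pan_genome(strains)
    print("core:", sorted(core))
    print("accessory:", sorted(accessory))
    print("strain-specific:", sorted(unique))
```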

  8. A web server for mining Comparative Genomic Hybridization (CGH) data

    Science.gov (United States)

    Liu, Jun; Ranka, Sanjay; Kahveci, Tamer

    2007-11-01

    Advances in cytogenetics and molecular biology have established that chromosomal alterations are critical in the pathogenesis of human cancer. Recurrent chromosomal alterations provide cytological and molecular markers for the diagnosis and prognosis of disease. They also facilitate the identification of genes that are important in carcinogenesis, which in the future may help in the development of targeted therapy. A large and growing amount of cancer genetic data is now publicly available. There is a need for public domain tools that allow users to analyze their data and visualize the results. This chapter describes a web-based software tool that will allow researchers to analyze and visualize Comparative Genomic Hybridization (CGH) datasets. It employs novel data mining methodologies for clustering and classification of CGH datasets as well as algorithms for identifying important markers (small sets of genomic intervals with aberrations) that are potentially cancer signatures. The developed software will help in understanding the relationships between genomic aberrations and cancer types.

  9. Automated training for algorithms that learn from genomic data.

    Science.gov (United States)

    Cilingir, Gokcen; Broschat, Shira L

    2015-01-01

    Supervised machine learning algorithms are used by life scientists for a variety of objectives. Expert-curated public gene and protein databases are major resources for gathering data to train these algorithms. While these data resources are continuously updated, generally, these updates are not incorporated into published machine learning algorithms which thereby can become outdated soon after their introduction. In this paper, we propose a new model of operation for supervised machine learning algorithms that learn from genomic data. By defining these algorithms in a pipeline in which the training data gathering procedure and the learning process are automated, one can create a system that generates a classifier or predictor using information available from public resources. The proposed model is explained using three case studies on SignalP, MemLoci, and ApicoAP in which existing machine learning models are utilized in pipelines. Given that the vast majority of the procedures described for gathering training data can easily be automated, it is possible to transform valuable machine learning algorithms into self-evolving learners that benefit from the ever-changing data available for gene products and to develop new machine learning algorithms that are similarly capable.
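    A minimal sketch of the proposed model of operation: gather the latest labelled examples, retrain, and persist the classifier so the pipeline can be re-run whenever the source data are updated. The fetch_training_data function is a stand-in for an automated query against a curated public resource, and the 3-mer feature representation is only one simple choice.

```python
# Minimal sketch of a self-updating pipeline: fetch the latest labelled
# examples, then retrain and persist the classifier. fetch_training_data()
# is a stand-in for an automated query against an expert-curated database.
import joblib
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.naive_bayes import MultinomialNB
from sklearn.pipeline import make_pipeline

def fetch_training_data():
    # Stand-in for retrieving, e.g., signal-peptide-positive and -negative
    # sequences from a curated public resource; labels are illustrative.
    sequences = ["MKKTAIAIAVALAGFATVAQA", "MSTNPKPQRKTKRNTNRRPQD"]
    labels = [1, 0]          # 1 = has signal peptide (illustrative)
    return sequences, labels

def retrain(model_path="classifier.joblib"):
    sequences, labels = fetch_training_data()
    # 3-mer character features as a simple sequence representation
    model = make_pipeline(CountVectorizer(analyzer="char", ngram_range=(3, 3)),
                          MultinomialNB())
    model.fit(sequences, labels)
    joblib.dump(model, model_path)
    return model

if __name__ == "__main__":
    clf = retrain()
    print(clf.predict(["MKKTAIAIAVALAGFATVAQA"]))
```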

  10. An automated annotation tool for genomic DNA sequences using GeneScan and BLAST

    Indian Academy of Sciences (India)

    Andrew M. Lynn; Chakresh Kumar Jain; K. Kosalai; Pranjan Barman; Nupur Thakur; Harish Batra; Alok Bhattacharya

    2001-04-01

    Genomic sequence data are often available well before the annotated sequence is published. We present a method for analysis of genomic DNA to identify coding sequences using the GeneScan algorithm and characterize these resultant sequences by BLAST. The routines are used to develop a system for automated annotation of genome DNA sequences.

  11. Automated methods and control when mining seams prone to outburst

    Energy Technology Data Exchange (ETDEWEB)

    Kolesov, O.A.; Agaphonov, A.V.; Kolchin, G.I. [Makeyevka Safety in Mines Research Institute (Ukraine)

    1995-12-31

    Drawbacks in existing methods of predicting outburst zones in Donbas thin coal seams led specialists at MakNII to investigate methods based on artificially excited acoustic signals, with processing by personal computers. The paper describes investigations to correlate different acoustic signal parameters with the stress-strain state of the massif ahead of the face. The method proved reliable in determining the relief zone in 12 Donbas mines. The paper goes on to describe development of a control method for another widely used method of coal and gas outburst prevention in Donbas, that of water injection into the coal seam known as `hydroripping`. This method includes recording acoustic signals and determining the parameters of the zone ahead of the face during drilling of the infusion holes, and recording and processing, in real time, the acoustic signal created during water infusion. 8 refs.

  12. Evaluation of automated underground mapping solutions for mining and civil engineering applications

    Science.gov (United States)

    Eyre, Matthew; Wetherelt, Andrew; Coggan, John

    2016-10-01

    The extractive and construction industries rely heavily on accurate geospatial data to control position, location, alignment, and orientation of planned excavations. Recent advancements in the survey industry, through the use of terrestrial laser scanning, can now provide engineering teams with three-dimensional (3-D) data in unprecedented detail via georeferenced point clouds. Furthermore, equipment is now available that provides fully mobile automated mapping solutions, independent of satellite positioning, utilizing simultaneous localization and mapping. This paper evaluates the surveying capability of three fully mobile automated mapping solutions against a benchmark laser scanning survey undertaken at the underground Camborne School of Mines Test Mine facility. The study highlights that handheld automated mapping solutions, in which closed-loops can be formed, have the potential to provide quicker data collection and processing time, as well as the required accuracy for underground surveying applications. However, the automated solution was unable to produce the necessary point cloud density to identify low-angled discontinuities that may have a major safety implication, leading to potential rockfall.

  13. Comprehensive automation of work on delivering and laying rail tracks in mine roadways

    Energy Technology Data Exchange (ETDEWEB)

    Mazin, S.P.; Volkov, V.Yu.; Ignatova, E.M.; Puleev, S.F.

    1988-07-01

    Presents the KD-1 and NPSh-900 sets of equipment for automation of track laying work in underground mine roadways. Analysis of track laying in mines in the USSR shows that about 200 km of track are laid annually, with energy of 1600 kJ/m expended in the form of manual labor. The KD-1 equipment, developed by VNIIOMShS, is based on a track delivery container holding 10-12 rails 8-12 m long. The NPSh-900 set is a set of tools and accessories for track laying - a rail cutting unit, a rail drilling unit, a rail bender, ballast cars, grips, jacks, etc. All this equipment is delivered to the track laying site on a special rail/wheeled trolley. Both sets of equipment have been successfully tested in Donbass mines.

  14. Data mining for regulatory elements in yeast genome.

    Science.gov (United States)

    Brazma, A; Vilo, J; Ukkonen, E; Valtonen, K

    1997-01-01

    We have examined methods and developed a general software tool for finding and analyzing combinations of transcription factor binding sites that occur relatively often in gene upstream regions (putative promoter regions) in the yeast genome. Such frequently occurring combinations may be essential parts of possible promoter classes. The regions upstream to all genes were first isolated from the yeast genome database MIPS using the information in the annotation files of the database. The ones that do not overlap with coding regions were chosen for further studies. Next, all occurrences of the yeast transcription factor binding sites, as given in the IMD database, were located in the genome and in the selected regions in particular. Finally, by using a general purpose data mining software in combination with our own software, which parametrizes the search, we can find the combinations of binding sites that occur in the upstream regions more frequently than would be expected on the basis of the frequency of individual sites. The procedure also finds so-called association rules present in such combinations. The developed tool is available for use through the WWW.
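    The core counting step can be sketched as follows: for every pair of binding sites, compare how often the pair co-occurs in upstream regions with the frequency expected if the sites occurred independently; ratios well above 1 flag candidate combinations for association-rule analysis.

```python
# Sketch of the core counting step: how often does a pair of transcription
# factor binding sites co-occur in upstream regions, compared with the
# frequency expected if the sites occurred independently?
from itertools import combinations

def pair_enrichment(upstream_sites):
    """upstream_sites: list of sets, one set of site names per upstream region."""
    n = len(upstream_sites)
    single = {}
    for sites in upstream_sites:
        for s in sites:
            single[s] = single.get(s, 0) + 1
    results = {}
    for a, b in combinations(sorted(single), 2):
        observed = sum(1 for sites in upstream_sites if a in sites and b in sites) / n
        expected = (single[a] / n) * (single[b] / n)
        if expected > 0:
            results[(a, b)] = observed / expected    # >1 means over-represented
    return results

if __name__ == "__main__":
    regions = [{"MCB", "SCB"}, {"MCB", "SCB"}, {"MCB"}, {"STRE"}]   # toy upstream regions
    for pair, ratio in sorted(pair_enrichment(regions).items()):
        print(pair, round(ratio, 2))
```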

  15. Event-based text mining for biology and functional genomics.

    Science.gov (United States)

    Ananiadou, Sophia; Thompson, Paul; Nawaz, Raheel; McNaught, John; Kell, Douglas B

    2015-05-01

    The assessment of genome function requires a mapping between genome-derived entities and biochemical reactions, and the biomedical literature represents a rich source of information about reactions between biological components. However, the increasingly rapid growth in the volume of literature provides both a challenge and an opportunity for researchers to isolate information about reactions of interest in a timely and efficient manner. In response, recent text mining research in the biology domain has been largely focused on the identification and extraction of 'events', i.e. categorised, structured representations of relationships between biochemical entities, from the literature. Functional genomics analyses necessarily encompass events as so defined. Automatic event extraction systems facilitate the development of sophisticated semantic search applications, allowing researchers to formulate structured queries over extracted events, so as to specify the exact types of reactions to be retrieved. This article provides an overview of recent research into event extraction. We cover annotated corpora on which systems are trained, systems that achieve state-of-the-art performance and details of the community shared tasks that have been instrumental in increasing the quality, coverage and scalability of recent systems. Finally, several concrete applications of event extraction are covered, together with emerging directions of research.

  16. Automated comparative auditing of NCIT genomic roles using NCBI.

    Science.gov (United States)

    Cohen, Barry; Oren, Marc; Min, Hua; Perl, Yehoshua; Halper, Michael

    2008-12-01

    Biomedical research has identified many human genes and various knowledge about them. The National Cancer Institute Thesaurus (NCIT) represents such knowledge as concepts and roles (relationships). Due to the rapid advances in this field, it is to be expected that the NCIT's Gene hierarchy will contain role errors. A comparative methodology to audit the Gene hierarchy with the use of the National Center for Biotechnology Information's (NCBI's) Entrez Gene database is presented. The two knowledge sources are accessed via a pair of Web crawlers to ensure up-to-date data. Our algorithms then compare the knowledge gathered from each, identify discrepancies that represent probable errors, and suggest corrective actions. The primary focus is on two kinds of gene-roles: (1) the chromosomal locations of genes, and (2) the biological processes in which genes play a role. Regarding chromosomal locations, the discrepancies revealed are striking and systematic, suggesting a structurally common origin. In regard to the biological processes, difficulties arise because genes frequently play roles in multiple processes, and processes may have many designations (such as synonymous terms). Our algorithms make use of the roles defined in the NCIT Biological Process hierarchy to uncover many probable gene-role errors in the NCIT. These results show that automated comparative auditing is a promising technique that can identify a large number of probable errors and corrections for them in a terminological genomic knowledge repository, thus facilitating its overall maintenance.
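    A simplified sketch of the comparative audit for the chromosomal-location roles: compare gene-to-location maps taken from the two sources and flag disagreements as probable errors; the real system gathers these maps live from NCIT and NCBI Entrez Gene via web crawlers, and the entries below are illustrative.

```python
# Simplified sketch of the comparative audit: flag genes whose chromosomal
# location differs between two knowledge sources. The real system pulls
# these maps live from NCIT and NCBI Entrez Gene via web crawlers.

def audit_locations(ncit_loc, ncbi_loc):
    discrepancies = []
    for gene in set(ncit_loc) & set(ncbi_loc):
        if ncit_loc[gene] != ncbi_loc[gene]:
            discrepancies.append((gene, ncit_loc[gene], ncbi_loc[gene]))
    missing = sorted(set(ncbi_loc) - set(ncit_loc))
    return discrepancies, missing

if __name__ == "__main__":
    ncit = {"TP53": "17p13.1", "BRCA2": "13q12.3"}              # illustrative entries
    ncbi = {"TP53": "17p13.1", "BRCA2": "13q13.1", "KRAS": "12p12.1"}
    probs, missing = audit_locations(ncit, ncbi)
    print("probable errors:", probs)      # BRCA2 location mismatch
    print("missing from NCIT:", missing)  # KRAS
```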

  17. Automating slope monitoring in mines with terrestrial lidar scanners

    Science.gov (United States)

    Conforti, Dario

    2014-05-01

    Static terrestrial laser scanners (TLS) have been an important component of slope monitoring for some time, and many solutions for monitoring the progress of a slide have been devised over the years. However, all of these solutions have required users to operate the lidar equipment in the field, creating a high cost in time and resources, especially if the surveys must be performed very frequently. This paper presents a new solution for monitoring slides, developed using a TLS and an automated data acquisition, processing and analysis system. In this solution, a TLS is permanently mounted within sight of the target surface and connected to a control computer. The control software on the computer automatically triggers surveys according to a user-defined schedule, parses data into point clouds, and compares data against a baseline. The software can base the comparison against either the original survey of the site or the most recent survey, depending on whether the operator needs to measure the total or recent movement of the slide. If the displacement exceeds a user-defined safety threshold, the control computer transmits alerts via SMS text messaging and/or email, including graphs and tables describing the nature and size of the displacement. The solution can also be configured to trigger the external visual/audio alarm systems. If the survey areas contain high-traffic areas such as roads, the operator can mark them for exclusion in the comparison to prevent false alarms. To improve usability and safety, the control computer can connect to a local intranet and allow remote access through the software's web portal. This enables operators to perform most tasks with the TLS from their office, including reviewing displacement reports, downloading survey data, and adjusting the scan schedule. This solution has proved invaluable in automatically detecting and alerting users to potential danger within the monitored areas while lowering the cost and work required for
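    The decision step of such a monitoring loop can be sketched as follows, using nearest-neighbour distances between a new scan and the baseline cloud as a simple stand-in for a full cloud-to-cloud comparison, and a print statement in place of the SMS/email alerting described above.

```python
# Schematic sketch of the alerting decision: compare a new scan against a
# baseline point cloud and trigger an alert when displacement exceeds a
# user-defined threshold. Nearest-neighbour distance is a simple stand-in
# for a full cloud-to-cloud comparison.
import numpy as np
from scipy.spatial import cKDTree

def max_displacement(baseline, new_scan):
    """baseline, new_scan: (N, 3) arrays of x, y, z points."""
    tree = cKDTree(baseline)
    distances, _ = tree.query(new_scan)
    return float(distances.max())

def check_slope(baseline, new_scan, threshold_m=0.05):
    d = max_displacement(baseline, new_scan)
    if d > threshold_m:
        print(f"ALERT: displacement {d:.3f} m exceeds {threshold_m} m")  # real system: SMS/email
    return d

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    base = rng.uniform(0, 10, size=(1000, 3))
    moved = base + np.array([0.0, 0.0, 0.08])   # simulated 8 cm drop
    check_slope(base, moved)
```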

  18. Discovery of secondary metabolites from Bacillus spp. biocontrol strains using genome mining and mass spectroscopy

    Science.gov (United States)

    Genome sequencing, data mining and mass spectrometry were used to identify secondary metabolites produced by several Bacillus spp. biocontrol strains. These biocontrol strains have shown promise in managing Fusarium head blight in wheat. Draft genomes were produced and screened in silico using genom...

  19. Joint Genome Institute's Automation Approach and History

    Energy Technology Data Exchange (ETDEWEB)

    Roberts, Simon

    2006-07-05

    Department of Energy/Joint Genome Institute (DOE/JGI) collaborates with DOE national laboratories and community users, to advance genome science in support of the DOE missions of clean bio-energy, carbon cycling, and bioremediation.

  20. Lunar surface mining for automated acquisition of helium-3: Methods, processes, and equipment

    Science.gov (United States)

    Li, Y. T.; Wittenberg, L. J.

    1992-09-01

    In this paper, several techniques considered for mining and processing the regolith on the lunar surface are presented. These techniques have been proposed and evaluated based primarily on the following criteria: (1) mining operations should be relatively simple; (2) procedures of mineral processing should be few and relatively easy; (3) transferring tonnages of regolith on the Moon should be minimized; (4) operations outside the lunar base should be readily automated; (5) all equipment should be maintainable; and (6) economic benefit should be sufficient for commercial exploitation. The economic benefits are not addressed in this paper; however, the energy benefits have been estimated to be between 250 and 350 times the mining energy. A mobile mining scheme is proposed that meets most of the mining objectives. This concept uses a bucket-wheel excavator for excavating the regolith, several mechanical electrostatic separators for beneficiation of the regolith, a fast-moving fluidized bed reactor to heat the particles, and a palladium diffuser to separate H2 from the other solar wind gases. At the final stage of the miner, the regolith 'tailings' are deposited directly into the ditch behind the miner and cylinders of the valuable solar wind gases are transported to a central gas processing facility. During the production of He-3, large quantities of valuable H2, H2O, CO, CO2, and N2 are produced for utilization at the lunar base. For larger production of He-3 the utilization of multiple-miners is recommended rather than increasing their size. Multiple miners permit operations at more sites and provide redundancy in case of equipment failure.

  1. Automating the Analysis of Spatial Grids A Practical Guide to Data Mining Geospatial Images for Human & Environmental Applications

    CERN Document Server

    Lakshmanan, Valliappa

    2012-01-01

    The ability to create automated algorithms to process gridded spatial data is increasingly important as remotely sensed datasets increase in volume and frequency. Whether in business, social science, ecology, meteorology or urban planning, the ability to create automated applications to analyze and detect patterns in geospatial data is increasingly important. This book provides students with a foundation in topics of digital image processing and data mining as applied to geospatial datasets. The aim is for readers to be able to devise and implement automated techniques to extract information from spatial grids such as radar, satellite or high-resolution survey imagery.

  2. Comments on Sensory Mine Internet of Things and Mine Comprehensive Automation

    Institute of Scientific and Technical Information of China (English)

    张申; 赵小虎

    2012-01-01

    This paper systematically analyses the relationship and distinction between the sensory mine Internet of Things and mine comprehensive automation. Starting from the concept of mine comprehensive automation and implemented cases, it reviews the achievements of more than ten years of building and operating comprehensive automation systems and summarizes the remaining problems: traditional and limited sensing means, the lack of a ubiquitous sensing network, emphasis on hardware integration over software integration, insufficient cross-disciplinary work, and inadequate standards. The sensory mine Internet of Things is an effective way to solve these problems. The paper proposes that construction of the mine Internet of Things should cover four aspects: a network platform, an application platform, three core sensing targets (personnel, equipment and disasters) and application systems. It points out that comprehensive automation is the foundation of the mine Internet of Things, and that the mine Internet of Things is a sublimation of the comprehensive automation concept.

  3. Clinic-Genomic Association Mining for Colorectal Cancer Using Publicly Available Datasets

    Directory of Open Access Journals (Sweden)

    Fang Liu

    2014-01-01

    Full Text Available In recent years, a growing number of researchers have begun to focus on how to establish associations between clinical and genomic data. However, up to now, there has been a lack of research mining clinic-genomic associations by comprehensively analysing available gene expression data for a single disease. Colorectal cancer is one of the malignant tumours, and a number of genetic syndromes have been proven to be associated with it. This paper presents our research on mining clinic-genomic associations for colorectal cancer in a biomedical big data environment. The proposed method is engineered with multiple technologies, including extracting clinical concepts using the Unified Medical Language System (UMLS), extracting genes through literature mining, and mining clinic-genomic associations through statistical analysis. We applied this method to datasets extracted from both the Gene Expression Omnibus (GEO) and the Genetic Association Database (GAD). A total of 23517 clinic-genomic associations between 139 clinical concepts and 7914 genes were obtained, of which 3474 associations between 31 clinical concepts and 1689 genes were identified as highly reliable ones. Evaluation and interpretation were performed using UMLS, KEGG, and Gephi, and potential new discoveries were explored. The proposed method is effective in mining valuable knowledge from available biomedical big data and achieves a good performance in bridging clinical data with genomic data for colorectal cancer.

  4. New development of longwall mining equipment based on automation and intelligent technology for thin seam coal

    Institute of Scientific and Technical Information of China (English)

    Guo-fa WANG

    2013-01-01

    The paper introduces the complete sets of automatic equipment and technology used at thin seam coal faces, and proposes comprehensive mechanization and automation of safe and high-efficiency mining models based on the thin seam drum shearer. The key technologies of a short-length, high-power thin seam drum shearer and a new type of roof support with a big extension ratio and plate canopy are introduced. The new research achievements on the automatic control system of the complete sets of equipment for thin seam coal, which is composed of an electro-hydraulic system, compact thin seam roof supports and a high-efficiency shearer with an intelligent control system, and is characterized by automatic follow-up and remote control technology, are described in this paper.

  5. Automated interpretation of optic nerve images: a data mining framework for glaucoma diagnostic support.

    Science.gov (United States)

    Abidi, Syed S R; Artes, Paul H; Yun, Sanjan; Yu, Jin

    2007-01-01

    Confocal Scanning Laser Tomography (CSLT) techniques capture high-quality images of the optic disc (the retinal region where the optic nerve exits the eye) that are used in the diagnosis and monitoring of glaucoma. We present a hybrid framework, combining image processing and data mining methods, to support the interpretation of CSLT optic nerve images. Our framework features (a) Zernike moment methods to derive shape information from optic disc images; (b) classification of optic disc images, based on shape information, to distinguish between healthy and glaucomatous optic discs. We apply Multi Layer Perceptrons, Support Vector Machines and Bayesian Networks for feature sub-set selection and image classification; and (c) clustering of optic disc images, based on shape information, using Self-Organizing Maps to visualize sub-types of glaucomatous optic disc damage. Our framework offers an automated and objective analysis of optic nerve images that can potentially support both diagnosis and monitoring of glaucoma.
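    A schematic sketch of the classification step: a crude radial shape descriptor stands in for the Zernike moments used in the paper, and a support vector machine separates round ("healthy") from vertically elongated ("cupped") toy discs; all data and parameters are illustrative.

```python
# Schematic sketch of the classification step. A crude radial shape
# descriptor stands in for Zernike moments; a support vector machine then
# separates the two classes. All data are synthetic and illustrative.
import numpy as np
from sklearn.svm import SVC

def radial_profile(mask, n_bins=8):
    """Stand-in shape descriptor: mean pixel radius from the centroid in
    n_bins angular sectors, normalised by the overall mean radius."""
    ys, xs = np.nonzero(mask)
    cy, cx = ys.mean(), xs.mean()
    r = np.hypot(ys - cy, xs - cx)
    theta = np.arctan2(ys - cy, xs - cx)
    bins = np.digitize(theta, np.linspace(-np.pi, np.pi, n_bins + 1)) - 1
    profile = np.array([r[bins == b].mean() if np.any(bins == b) else 0.0
                        for b in range(n_bins)])
    return profile / (r.mean() + 1e-9)

def make_disc(shape=(64, 64), ry=20, rx=20):
    """Toy binary mask of an ellipse standing in for a segmented optic disc."""
    cy, cx = shape[0] / 2, shape[1] / 2
    yy, xx = np.mgrid[:shape[0], :shape[1]]
    return ((yy - cy) / ry) ** 2 + ((xx - cx) / rx) ** 2 <= 1.0

if __name__ == "__main__":
    # Toy data: round "healthy" discs vs. vertically elongated "cupped" discs.
    X = [radial_profile(make_disc(ry=20, rx=20)) for _ in range(5)] + \
        [radial_profile(make_disc(ry=26, rx=16)) for _ in range(5)]
    y = [0] * 5 + [1] * 5
    clf = SVC(kernel="rbf").fit(X, y)
    print(clf.predict([radial_profile(make_disc(ry=25, rx=17))]))   # expected: [1]
```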

  6. PLAN: a web platform for automating high-throughput BLAST searches and for managing and mining results

    Directory of Open Access Journals (Sweden)

    Zhao Xuechun

    2007-02-01

    Full Text Available Abstract Background BLAST searches are widely used for sequence alignment. The search results are commonly adopted for various functional and comparative genomics tasks such as annotating unknown sequences, investigating gene models and comparing two sequence sets. Advances in sequencing technologies pose challenges for high-throughput analysis of large-scale sequence data. A number of programs and hardware solutions exist for efficient BLAST searching, but there is a lack of generic software solutions for mining and personalized management of the results. Systematically reviewing the results and identifying information of interest remains tedious and time-consuming. Results Personal BLAST Navigator (PLAN) is a versatile web platform that helps users to carry out various personalized pre- and post-BLAST tasks, including: (1) query and target sequence database management, (2) automated high-throughput BLAST searching, (3) indexing and searching of results, (4) filtering results online, (5) managing results of personal interest in favorite categories, and (6) automated sequence annotation (such as NCBI NR and ontology-based annotation). PLAN integrates, by default, the Decypher hardware-based BLAST solution provided by Active Motif Inc. with a greatly improved efficiency over conventional BLAST software. BLAST results are visualized by spreadsheets and graphs and are full-text searchable. BLAST results and sequence annotations can be exported, in part or in full, in various formats including Microsoft Excel and FASTA. Sequences and BLAST results are organized in projects, the data publication levels of which are controlled by the registered project owners. In addition, all analytical functions are provided to public users without registration. Conclusion PLAN has proved a valuable addition to the community for automated high-throughput BLAST searches, and, more importantly, for knowledge discovery, management and sharing based on sequence alignment results.
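    For readers wanting a flavour of the post-BLAST filtering that platforms like PLAN automate, the sketch below parses standard BLAST+ tabular output (-outfmt 6) and keeps the best hit per query above an E-value cutoff. It is not PLAN's own code, and the file name is hypothetical.

```python
# Minimal post-BLAST filtering sketch (not PLAN itself): parse BLAST+
# tabular output (-outfmt 6) and keep the best hit per query by bit score,
# subject to an E-value cutoff. "results.tab" is a hypothetical file name.
import csv

FIELDS = ["qseqid", "sseqid", "pident", "length", "mismatch", "gapopen",
          "qstart", "qend", "sstart", "send", "evalue", "bitscore"]

def best_hits(path, max_evalue=1e-5):
    best = {}
    with open(path) as handle:
        for row in csv.DictReader(handle, fieldnames=FIELDS, delimiter="\t"):
            evalue, bitscore = float(row["evalue"]), float(row["bitscore"])
            if evalue > max_evalue:
                continue
            current = best.get(row["qseqid"])
            if current is None or bitscore > float(current["bitscore"]):
                best[row["qseqid"]] = row
    return best

if __name__ == "__main__":
    for query, hit in best_hits("results.tab").items():
        print(query, hit["sseqid"], hit["evalue"])
```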

  7. Application of the Deformation Information System for automated analysis and mapping of mining terrain deformations - case study from SW Poland

    Science.gov (United States)

    Blachowski, Jan; Grzempowski, Piotr; Milczarek, Wojciech; Nowacka, Anna

    2015-04-01

    Monitoring, mapping and modelling of mining-induced terrain deformations are important tasks for quantifying and minimising the threats that arise from the underground extraction of useful minerals and that affect surface infrastructure, human safety, the environment and the security of the mining operation itself. The range of methods and techniques used for monitoring and analysis of mining terrain deformations is wide and expanding with the progress in geographical information technologies. These include, for example: terrestrial geodetic measurements, Global Navigation Satellite Systems, remote sensing, GIS-based modelling and spatial statistics, finite element method modelling, geological modelling, empirical modelling using e.g. the Knothe theory, artificial neural networks, fuzzy logic calculations and others. The presentation shows the results of numerical modelling and mapping of mining terrain deformations for two underground mining sites in SW Poland, a hard coal mine (abandoned) and a copper ore mine (active), using the functionalities of the Deformation Information System (DIS) (Blachowski et al, 2014 @ http://meetingorganizer.copernicus.org/EGU2014/EGU2014-7949.pdf). The functionalities of the spatial data modelling module of DIS are presented, and its applications in modelling, mapping and visualising mining terrain deformations based on the processing of measurement data (geodetic and GNSS) for these two cases are characterised and compared. These include automation procedures, self-developed and implemented in DIS, for calculating mining terrain subsidence with different interpolation techniques, for calculating other mining deformation parameters (i.e. tilt, horizontal displacement, horizontal strain and curvature), and for mapping mining terrain categories based on classification of the values of these parameters as used in Poland. Acknowledgments. This work has been financed from the National Science Centre Project "Development of a numerical method of
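    The deformation parameters listed above, such as tilt and curvature, are in essence spatial derivatives of the subsidence surface. The sketch below shows how such parameters could be approximated on a regular subsidence grid with NumPy; it is an illustration of the general calculation, not the DIS implementation, and the grid spacing and synthetic trough are invented.

```python
# Illustrative calculation of mining deformation parameters from a gridded
# subsidence surface (not the DIS implementation). Subsidence w is in metres
# on a regular grid with spacing dx, dy in metres.
import numpy as np

def deformation_parameters(w, dx=10.0, dy=10.0):
    dw_dy, dw_dx = np.gradient(w, dy, dx)          # first derivatives
    tilt = np.hypot(dw_dx, dw_dy)                  # slope magnitude (m/m)
    d2w_dx2 = np.gradient(dw_dx, dx, axis=1)       # second derivatives
    d2w_dy2 = np.gradient(dw_dy, dy, axis=0)
    curvature = d2w_dx2 + d2w_dy2                  # Laplacian used as a simple proxy (1/m)
    return tilt, curvature

# Synthetic bowl-shaped subsidence trough for demonstration.
y, x = np.mgrid[0:100, 0:100]
w = -2.0 * np.exp(-(((x - 50) ** 2 + (y - 50) ** 2) / (2 * 15.0 ** 2)))
tilt, curvature = deformation_parameters(w)
print(f"max tilt = {tilt.max():.4f} m/m, max curvature = {curvature.max():.6f} 1/m")
```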

  8. A hybrid approach for the automated finishing of bacterial genomes.

    Science.gov (United States)

    Bashir, Ali; Klammer, Aaron A; Robins, William P; Chin, Chen-Shan; Webster, Dale; Paxinos, Ellen; Hsu, David; Ashby, Meredith; Wang, Susana; Peluso, Paul; Sebra, Robert; Sorenson, Jon; Bullard, James; Yen, Jackie; Valdovino, Marie; Mollova, Emilia; Luong, Khai; Lin, Steven; LaMay, Brianna; Joshi, Amruta; Rowe, Lori; Frace, Michael; Tarr, Cheryl L; Turnsek, Maryann; Davis, Brigid M; Kasarskis, Andrew; Mekalanos, John J; Waldor, Matthew K; Schadt, Eric E

    2012-07-01

    Advances in DNA sequencing technology have improved our ability to characterize most genomic diversity. However, accurate resolution of large structural events is challenging because of the short read lengths of second-generation technologies. Third-generation sequencing technologies, which can yield longer multikilobase reads, have the potential to address limitations associated with genome assembly. Here we combine sequencing data from second- and third-generation DNA sequencing technologies to assemble the two-chromosome genome of a recent Haitian cholera outbreak strain into two nearly finished contigs at >99.9% accuracy. Complex regions with clinically relevant structure were completely resolved. In separate control assemblies on experimental and simulated data for the canonical N16961 cholera reference strain, we obtained 14 scaffolds of greater than 1 kb for the experimental data and 8 scaffolds of greater than 1 kb for the simulated data, which allowed us to correct several errors in contigs assembled from the short-read data alone. This work provides a blueprint for the next generation of rapid microbial identification and full-genome assembly.

  9. Automated extraction of precise protein expression patterns in lymphoma by text mining abstracts of immunohistochemical studies

    Directory of Open Access Journals (Sweden)

    Jia-Fu Chang

    2013-01-01

    Full Text Available Background: In general, surgical pathology reviews report protein expression by tumors in a semi-quantitative manner, that is, -, -/+, +/-, +. At the same time, the experimental pathology literature provides multiple examples of precise expression levels determined by immunohistochemical (IHC) tissue examination of populations of tumors. Natural language processing (NLP) techniques enable the automated extraction of such information through text mining. We propose establishing a database linking quantitative protein expression levels with specific tumor classifications through NLP. Materials and Methods: Our method takes advantage of typical forms of representing experimental findings in terms of percentages of protein expression manifest by the tumor population under study. Characteristically, percentages are represented straightforwardly with the % symbol or as the number of positive findings of the total population. Such text is readily recognized using regular expressions and templates permitting extraction of sentences containing these forms for further analysis using grammatical structures and rule-based algorithms. Results: Our pilot study is limited to the extraction of such information related to lymphomas. We achieved a satisfactory level of retrieval as reflected in scores of 69.91% precision and 57.25% recall with an F-score of 62.95%. In addition, we demonstrate the utility of a web-based curation tool for confirming and correcting our findings. Conclusions: The experimental pathology literature represents a rich source of pathobiological information, which has been relatively underutilized. There has been a combinatorial explosion of knowledge within the pathology domain as represented by increasing numbers of immunophenotypes and disease subclassifications. NLP techniques support practical text mining techniques for extracting this knowledge and organizing it in forms appropriate for pathology decision support systems.
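    The regular-expression step described in the Methods can be illustrated as follows; the patterns and the example sentence are simplified stand-ins, not the authors' actual templates.

```python
# Simplified illustration of the regular-expression step: recover protein
# expression percentages from IHC-style sentences. Patterns and the example
# sentence are stand-ins, not the authors' actual templates.
import re

PERCENT = re.compile(r"(?P<marker>CD\d+|[A-Z]{2,}\d*)[^.%]*?(?P<pct>\d{1,3}(?:\.\d+)?)\s*%")
RATIO = re.compile(r"(?P<pos>\d+)\s+of\s+(?P<total>\d+)\s+(?:cases|patients|tumou?rs)")

sentence = ("CD20 was expressed in 85% of the lymphoma cases examined, "
            "and BCL2 positivity was observed in 27 of 45 cases.")

for m in PERCENT.finditer(sentence):
    print(f"{m.group('marker')}: {float(m.group('pct')):.1f}% positive")

for m in RATIO.finditer(sentence):
    pct = 100.0 * int(m.group("pos")) / int(m.group("total"))
    print(f"ratio mention: {m.group('pos')}/{m.group('total')} = {pct:.1f}%")
```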

  10. Mining for Single Nucleotide Polymorphisms in Pig genome sequence data

    NARCIS (Netherlands)

    Kerstens, H.H.D.; Kollers, S.; Kommandath, A.; Rosario, del M.; Dibbits, B.W.; Kinders, S.M.; Crooijmans, R.P.M.A.; Groenen, M.A.M.

    2009-01-01

    Background - Single nucleotide polymorphisms (SNPs) are ideal genetic markers due to their high abundance and the highly automated way in which SNPs are detected and SNP assays are performed. The number of SNPs identified in the pig thus far is still limited. Results - A total of 4.8 million whole g

  11. Draft Genome Sequences of Two Novel Acidimicrobiaceae Members from an Acid Mine Drainage Biofilm Metagenome

    Science.gov (United States)

    Pinto, Ameet J.; Sharp, Jonathan O.; Yoder, Michael J.

    2016-01-01

    Bacteria belonging to the family Acidimicrobiaceae are frequently encountered in heavy metal-contaminated acidic environments. However, their phylogenetic and metabolic diversity is poorly resolved. We present draft genome sequences of two novel and phylogenetically distinct Acidimicrobiaceae members assembled from an acid mine drainage biofilm metagenome. PMID:26769942

  12. Ask and Ye Shall Receive? Automated Text Mining of Michigan Capital Facility Finance Bond Election Proposals to Identify Which Topics Are Associated with Bond Passage and Voter Turnout

    Science.gov (United States)

    Bowers, Alex J.; Chen, Jingjing

    2015-01-01

    The purpose of this study is to bring together recent innovations in the research literature around school district capital facility finance, municipal bond elections, statistical models of conditional time-varying outcomes, and data mining algorithms for automated text mining of election ballot proposals to examine the factors that influence the…

  13. KAIKObase: An integrated silkworm genome database and data mining tool

    Directory of Open Access Journals (Sweden)

    Nagaraju Javaregowda

    2009-10-01

    Full Text Available Abstract Background The silkworm, Bombyx mori, is one of the most economically important insects in many developing countries owing to its large-scale cultivation for silk production. With the development of genomic and biotechnological tools, B. mori has also become an important bioreactor for production of various recombinant proteins of biomedical interest. In 2004, two genome sequencing projects for B. mori were reported independently by Chinese and Japanese teams; however, the datasets were insufficient for building long genomic scaffolds which are essential for unambiguous annotation of the genome. Now, both the datasets have been merged and assembled through a joint collaboration between the two groups. Description Integration of the two data sets of silkworm whole-genome-shotgun sequencing by the Japanese and Chinese groups together with newly obtained fosmid- and BAC-end sequences produced the best continuity (~3.7 Mb in N50 scaffold size) among the sequenced insect genomes and provided a high degree of nucleotide coverage (88%) of all 28 chromosomes. In addition, a physical map of BAC contigs constructed by fingerprinting BAC clones and a SNP linkage map constructed using BAC-end sequences were available. In parallel, proteomic data from two-dimensional polyacrylamide gel electrophoresis in various tissues and developmental stages were compiled into a silkworm proteome database. Finally, a Bombyx trap database was constructed for documenting insertion positions and expression data of transposon insertion lines. Conclusion For efficient usage of genome information for functional studies, genomic sequences, physical and genetic map information and EST data were compiled into KAIKObase, an integrated silkworm genome database which consists of 4 map viewers, a gene viewer, and sequence, keyword and position search systems to display results and data at the level of nucleotide sequence, gene, scaffold and chromosome. Integration of the

  14. VirSorter: mining viral signal from microbial genomic data

    Directory of Open Access Journals (Sweden)

    Simon Roux

    2015-05-01

    Full Text Available Viruses of microbes impact all ecosystems where microbes drive key energy and substrate transformations including the oceans, humans and industrial fermenters. However, despite this recognized importance, our understanding of viral diversity and impacts remains limited by too few model systems and reference genomes. One way to fill these gaps in our knowledge of viral diversity is through the detection of viral signal in microbial genomic data. While multiple approaches have been developed and applied for the detection of prophages (viral genomes integrated in a microbial genome), new types of microbial genomic data are emerging that are more fragmented and larger scale, such as Single-cell Amplified Genomes (SAGs) of uncultivated organisms or genomic fragments assembled from metagenomic sequencing. Here, we present VirSorter, a tool designed to detect viral signal in these different types of microbial sequence data in both a reference-dependent and reference-independent manner, leveraging probabilistic models and extensive virome data to maximize detection of novel viruses. Performance testing shows that VirSorter’s prophage prediction capability compares to that of available prophage predictors for complete genomes, but is superior in predicting viral sequences outside of a host genome (i.e., from extrachromosomal prophages, lytic infections, or partially assembled prophages). Furthermore, VirSorter outperforms existing tools for fragmented genomic and metagenomic datasets, and can identify viral signal in assembled sequence (contigs) as short as 3 kb, while providing near-perfect identification (>95% Recall and 100% Precision) on contigs of at least 10 kb. Because VirSorter scales to large datasets, it can also be used in “reverse” to more confidently identify viral sequence in viral metagenomes by sorting away cellular DNA whether derived from gene transfer agents, generalized transduction or contamination. Finally, VirSorter is made

  15. Correcting Inconsistencies and Errors in Bacterial Genome Metadata Using an Automated Curation Tool in Excel (AutoCurE).

    Science.gov (United States)

    Schmedes, Sarah E; King, Jonathan L; Budowle, Bruce

    2015-01-01

    Whole-genome data are invaluable for large-scale comparative genomic studies. Current sequencing technologies have made it feasible to sequence entire bacterial genomes with relative ease and in less time, with a substantially reduced cost per nucleotide and hence cost per genome. More than 3,000 bacterial genomes have been sequenced and are available at the finished status. Publicly available genomes can be readily downloaded; however, there are challenges in verifying the specific supporting data contained within the download and in identifying errors and inconsistencies that may be present within the organizational data content and metadata. AutoCurE, an automated tool for bacterial genome database curation in Excel, was developed to facilitate local database curation of the supporting data that accompany genomes downloaded from the National Center for Biotechnology Information. AutoCurE provides an automated approach to curate local genomic databases by flagging inconsistencies or errors, comparing the downloaded supporting data to the genome reports to verify genome names, RefSeq accession numbers, the presence of archaea, BioProject/UIDs, and sequence file descriptions. Flags are generated for nine metadata fields if there are inconsistencies between the downloaded genomes and the genome reports or if erroneous or missing data are evident. AutoCurE is an easy-to-use tool for local database curation of large-scale genome data prior to downstream analyses.
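    The flagging logic described above can be sketched generically: join a local table of downloaded records to the corresponding genome-report table and flag fields that disagree or are missing. The column names and values below are hypothetical, and AutoCurE itself is implemented in Excel rather than Python.

```python
# Generic sketch of metadata consistency checking in the spirit of AutoCurE
# (which itself is implemented in Excel). Column names and values are
# hypothetical. Requires pandas.
import pandas as pd

downloaded = pd.DataFrame({
    "genome_name": ["Vibrio cholerae N16961", "Bacillus subtilis 168"],
    "refseq_accession": ["NC_002505.1", "NC_000964.3"],
    "bioproject": ["PRJNA36", "PRJNA76"],
})
genome_report = pd.DataFrame({
    "genome_name": ["Vibrio cholerae N16961", "Bacillus subtilis 168"],
    "refseq_accession": ["NC_002505.1", "NC_000964.2"],   # version mismatch
    "bioproject": ["PRJNA36", None],                      # missing value
})

merged = downloaded.merge(genome_report, on="genome_name", suffixes=("_local", "_report"))
for field in ["refseq_accession", "bioproject"]:
    mismatch = merged[f"{field}_local"] != merged[f"{field}_report"]
    missing = merged[f"{field}_report"].isna()
    merged[f"flag_{field}"] = mismatch | missing   # True means the field needs review

print(merged.filter(like="flag_"))
```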

  16. Precursor-centric genome-mining approach for lasso peptide discovery.

    Science.gov (United States)

    Maksimov, Mikhail O; Pelczer, István; Link, A James

    2012-09-18

    Lasso peptides are a class of ribosomally synthesized posttranslationally modified natural products found in bacteria. Currently known lasso peptides have a diverse set of pharmacologically relevant activities, including inhibition of bacterial growth, receptor antagonism, and enzyme inhibition. The biosynthesis of lasso peptides is specified by a cluster of three genes encoding a precursor protein and two enzymes. Here we develop a unique genome-mining algorithm to identify lasso peptide gene clusters in prokaryotes. Our approach involves pattern matching to a small number of conserved amino acids in precursor proteins, and thus allows for a more global survey of lasso peptide gene clusters than does homology-based genome mining. Of more than 3,000 currently sequenced prokaryotic genomes, we found 76 organisms that are putative lasso peptide producers. These organisms span nine bacterial phyla and an archaeal phylum. To provide validation of the genome-mining method, we focused on a single lasso peptide predicted to be produced by the freshwater bacterium Asticcacaulis excentricus. Heterologous expression of an engineered, minimal gene cluster in Escherichia coli led to the production of a unique lasso peptide, astexin-1. At 23 aa, astexin-1 is the largest lasso peptide isolated to date. It is also highly polar, in contrast to many lasso peptides that are primarily hydrophobic. Astexin-1 has modest antimicrobial activity against its phylogenetic relative Caulobacter crescentus. The solution structure of astexin-1 was determined revealing a unique topology that is stabilized by hydrogen bonding between segments of the peptide.
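    A toy version of the precursor-centric idea is sketched below: scan short predicted protein sequences for a hand-written pattern of conserved residues. The regular expression and the toy proteome used here are purely illustrative and are not the actual lasso peptide pattern or data from the paper.

```python
# Toy sketch of precursor-centric genome mining: scan short predicted proteins
# for a pattern of conserved residues. The regex below is purely illustrative
# and is NOT the actual lasso peptide pattern from the paper.
import re

ILLUSTRATIVE_PATTERN = re.compile(r"T.{1,3}G.{5,8}[DE]")  # hypothetical motif
MAX_PRECURSOR_LENGTH = 100  # lasso precursors are short proteins

def candidate_precursors(proteins):
    """proteins: dict mapping protein id -> amino acid sequence."""
    for pid, seq in proteins.items():
        if len(seq) > MAX_PRECURSOR_LENGTH:
            continue  # ignore proteins too long to be precursors
        match = ILLUSTRATIVE_PATTERN.search(seq)
        if match:
            yield pid, match.start(), match.group()

toy_proteome = {
    "orf_0001": "MKKQYEAPTLVELGQASVETLGDGNVFAEDKIVRQLT",
    "orf_0002": "MSTRELLNKAQAQKGGFGGAGDDE",
    "orf_0003": "M" + "A" * 300,  # too long to be a precursor
}
for pid, pos, motif in candidate_precursors(toy_proteome):
    print(f"{pid}: motif '{motif}' at position {pos}")
```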

  17. Toward the automated generation of genome-scale metabolic networks in the SEED

    Directory of Open Access Journals (Sweden)

    Gould John

    2007-04-01

    Full Text Available Abstract Background Current methods for the automated generation of genome-scale metabolic networks focus on genome annotation and preliminary biochemical reaction network assembly, but do not adequately address the process of identifying and filling gaps in the reaction network, and verifying that the network is suitable for systems level analysis. Thus, current methods are only sufficient for generating draft-quality networks, and refinement of the reaction network is still largely a manual, labor-intensive process. Results We have developed a method for generating genome-scale metabolic networks that produces substantially complete reaction networks, suitable for systems level analysis. Our method partitions the reaction space of central and intermediary metabolism into discrete, interconnected components that can be assembled and verified in isolation from each other, and then integrated and verified at the level of their interconnectivity. We have developed a database of components that are common across organisms, and have created tools for automatically assembling appropriate components for a particular organism based on the metabolic pathways encoded in the organism's genome. This focuses manual efforts on that portion of an organism's metabolism that is not yet represented in the database. We have demonstrated the efficacy of our method by reverse-engineering and automatically regenerating the reaction network from a published genome-scale metabolic model for Staphylococcus aureus. Additionally, we have verified that our method capitalizes on the database of common reaction network components created for S. aureus, by using these components to generate substantially complete reconstructions of the reaction networks from three other published metabolic models (Escherichia coli, Helicobacter pylori, and Lactococcus lactis). We have implemented our tools and database within the SEED, an open-source software environment for comparative
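    Gap identification of the kind described above largely boils down to finding metabolites that a draft network can produce but never consume, or vice versa. The sketch below illustrates that check on a toy reaction list; it is not the SEED code, the reactions are invented, and in practice exchange/boundary metabolites would be exempted.

```python
# Toy illustration of one gap-finding check used when refining draft metabolic
# networks: find "dead-end" metabolites that are only produced or only
# consumed. Reactions are invented; this is not the SEED implementation.
toy_network = {
    "rxn_glycolysis_last": {"substrates": ["PEP", "ADP"], "products": ["PYR", "ATP"]},
    "rxn_pyr_transport":   {"substrates": ["PYR"],        "products": ["PYR_mito"]},
    "rxn_orphan":          {"substrates": ["PYR_mito"],   "products": ["ACCOA", "CO2"]},
}

def dead_end_metabolites(network):
    consumed, produced = set(), set()
    for rxn in network.values():
        consumed.update(rxn["substrates"])
        produced.update(rxn["products"])
    only_produced = produced - consumed   # made but never used
    only_consumed = consumed - produced   # used but never made
    return only_produced, only_consumed

# In a real reconstruction, exchange/boundary metabolites would be excluded
# before flagging; here everything is treated as internal for simplicity.
only_produced, only_consumed = dead_end_metabolites(toy_network)
print("produced but never consumed:", sorted(only_produced))
print("consumed but never produced:", sorted(only_consumed))
```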

  18. Mining of simple sequence repeats in the Genome of Gentianaceae

    Directory of Open Access Journals (Sweden)

    R Sathishkumar

    2011-01-01

    Full Text Available Simple sequence repeats (SSRs), or short tandem repeats, are short repeat motifs that show a high level of length polymorphism due to insertion or deletion mutations of one or more repeat units. Here, we present the detection and abundance of microsatellites or SSRs in nucleotide sequences of the Gentianaceae family. A total of 545 SSRs were mined from 4698 nucleotide sequences downloaded from the National Center for Biotechnology Information (NCBI). Among the SSR sequences, the repeat types comprised 429 mononucleotide, 99 dinucleotide, 15 trinucleotide, and 2 hexanucleotide repeats. Mononucleotide repeats were the most abundant repeat type, at about 78%, followed by dinucleotide repeats (18.16%). An attempt was made to design primer pairs for the 545 identified SSRs, but suitable primers were found for only 169 sequences.
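    Microsatellite detection of the kind reported here is commonly implemented with back-referencing regular expressions. The sketch below finds perfect mono- to hexanucleotide repeats in a toy sequence; the minimum repeat counts are arbitrary illustrative thresholds, not the ones used in this study.

```python
# Generic SSR (microsatellite) detection with back-referencing regular
# expressions. Minimum repeat counts are arbitrary illustrative thresholds,
# not the ones used in the study.
import re

MIN_REPEATS = {1: 10, 2: 6, 3: 5, 4: 5, 5: 5, 6: 5}  # per motif length

def find_ssrs(sequence):
    sequence = sequence.upper()
    hits = []
    for motif_len, min_rep in MIN_REPEATS.items():
        pattern = re.compile(r"(([ACGT]{%d})\2{%d,})" % (motif_len, min_rep - 1))
        for m in pattern.finditer(sequence):
            motif = m.group(2)
            if motif_len > 1 and len(set(motif)) == 1:
                continue  # skip homopolymer runs re-detected as longer motifs
            hits.append((m.start(), motif, len(m.group(1)) // motif_len))
    return hits

toy_seq = "ATGC" + "A" * 12 + "GGC" + "GA" * 7 + "TTACG" + "CTT" * 6 + "GATC"
for start, motif, repeats in find_ssrs(toy_seq):
    print(f"position {start}: ({motif})x{repeats}")
```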

  19. Automated whole-genome multiple alignment of rat, mouse, and human

    Energy Technology Data Exchange (ETDEWEB)

    Brudno, Michael; Poliakov, Alexander; Salamov, Asaf; Cooper, Gregory M.; Sidow, Arend; Rubin, Edward M.; Solovyev, Victor; Batzoglou, Serafim; Dubchak, Inna

    2004-07-04

    We have built a whole genome multiple alignment of the three currently available mammalian genomes using a fully automated pipeline which combines the local/global approach of the Berkeley Genome Pipeline and the LAGAN program. The strategy is based on progressive alignment, and consists of two main steps: (1) alignment of the mouse and rat genomes; and (2) alignment of human to either the mouse-rat alignments from step 1, or the remaining unaligned mouse and rat sequences. The resulting alignments demonstrate high sensitivity, with 87% of all human gene-coding areas aligned in both mouse and rat. The specificity is also high: <7% of the rat contigs are aligned to multiple places in human and 97% of all alignments with human sequence > 100kb agree with a three-way synteny map built independently using predicted exons in the three genomes. At the nucleotide level <1% of the rat nucleotides are mapped to multiple places in the human sequence in the alignment; and 96.5% of human nucleotides within all alignments agree with the synteny map. The alignments are publicly available online, with visualization through the novel Multi-VISTA browser that we also present.

  20. Precursor-centric genome-mining approach for lasso peptide discovery

    OpenAIRE

    Maksimov, Mikhail O.; Pelczer, István; Link, A. James

    2012-01-01

    Lasso peptides are a class of ribosomally synthesized posttranslationally modified natural products found in bacteria. Currently known lasso peptides have a diverse set of pharmacologically relevant activities, including inhibition of bacterial growth, receptor antagonism, and enzyme inhibition. The biosynthesis of lasso peptides is specified by a cluster of three genes encoding a precursor protein and two enzymes. Here we develop a unique genome-mining algorithm to identify lasso peptide gen...

  1. The design of aided software for osmotic stress responding genes mining in plant genome

    Institute of Scientific and Technical Information of China (English)

    2007-01-01

    A software tool and algorithm, based on a random sequence model, that uses osmotic stress responsive cis-elements from existing biological information sources was designed. It can infer the downstream function of Arabidopsis thaliana genes by analyzing their promoter regions, and it offers effective computer-aided analysis for mining osmotic stress responsive genes in the Arabidopsis thaliana genome. Practical application shows that this software can help analyze large amounts of gene data and provide important supporting evidence.
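    As a rough illustration of the kind of promoter-region analysis such a tool performs, the sketch below scans a promoter fragment for two commonly cited osmotic/drought-responsive core motifs (ABRE and DRE/CRT). The motif definitions and the promoter sequence are simplified examples, not the tool's actual cis-element library.

```python
# Rough illustration of promoter scanning for osmotic stress responsive
# cis-elements. The two core motifs below (ABRE, DRE/CRT) are commonly cited
# simplified examples; they are not the tool's actual element library.
import re

CIS_ELEMENTS = {
    "ABRE_core": re.compile(r"ACGTG[GT]C"),
    "DRE_CRT_core": re.compile(r"[AG]CCGAC"),
}

def reverse_complement(seq):
    return seq.translate(str.maketrans("ACGT", "TGCA"))[::-1]

def scan_promoter(promoter):
    promoter = promoter.upper()
    hits = []
    for strand, seq in (("+", promoter), ("-", reverse_complement(promoter))):
        for name, pattern in CIS_ELEMENTS.items():
            hits.extend((name, strand, m.start()) for m in pattern.finditer(seq))
    return hits

# Hypothetical upstream fragment containing one ABRE core and one DRE core.
promoter = "TTGATCACGTGGCATTAAGGCCGACCTTGTAAAC"
for name, strand, pos in scan_promoter(promoter):
    print(f"{name} on strand {strand} at offset {pos}")
```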

  2. Genome mining offers a new starting point for parasitology research.

    Science.gov (United States)

    Lv, Zhiyue; Wu, Zhongdao; Zhang, Limei; Ji, Pengyu; Cai, Yifeng; Luo, Shiqi; Wang, Hongxi; Li, Hao

    2015-02-01

    Parasites, including helminths, protozoa, and medically important arthropod vectors, are a major cause of global infectious diseases, affecting one-sixth of the world's population; they are responsible for enormous levels of morbidity and mortality and remain important impediments to economic development, especially in tropical countries. Prevalent drug resistance and the lack of highly effective and practical vaccines, as well as of specific and sensitive diagnostic markers, are proving to be challenging problems in parasitic disease control in most parts of the world. The impressive progress recently made in genome-wide analysis of parasites of medical importance, including the trematodes Clonorchis sinensis, Opisthorchis viverrini, Schistosoma haematobium, S. japonicum, and S. mansoni; the nematodes Brugia malayi, Loa loa, Necator americanus, Trichinella spiralis, and Trichuris suis; the cestodes Echinococcus granulosus, E. multilocularis, and Taenia solium; the protozoa Babesia bovis, B. microti, Cryptosporidium hominis, Eimeria falciformis, E. histolytica, Giardia intestinalis, Leishmania braziliensis, L. donovani, L. major, Plasmodium falciparum, P. vivax, Trichomonas vaginalis, Trypanosoma brucei and T. cruzi; and the medical arthropod vectors Aedes aegypti, Anopheles darlingi, A. sinensis, and Culex quinquefasciatus, is systematically covered in this review to provide a comprehensive understanding of the genetic information contained in the nuclear, mitochondrial, kinetoplast, plastid, or endosymbiotic bacterial genomes of parasites, to offer further valuable insight into parasite-host interactions, and to support the development of promising novel drug and vaccine candidates and better diagnostic tools, thereby underpinning the prevention and control of parasitic diseases.

  3. Mining the pig genome to investigate the domestication process.

    Science.gov (United States)

    Ramos-Onsins, S E; Burgos-Paz, W; Manunza, A; Amills, M

    2014-12-01

    Pig domestication began around 9000 YBP in the Fertile Crescent and Far East, involving marked morphological and genetic changes that occurred in a relatively short window of time. Identifying the alleles that drove the behavioural and physiological transformation of wild boars into pigs through artificial selection constitutes a formidable challenge that can only be faced from an interdisciplinary perspective. Indeed, although basic facts regarding the demography of pig domestication and dispersal have been uncovered, the biological substrate of these processes remains enigmatic. Considerable hope has been placed on new approaches, based on next-generation sequencing, which allow whole-genome variation to be analyzed at the population level. In this review, we provide an outline of the current knowledge on pig domestication by considering both archaeological and genetic data. Moreover, we discuss several potential scenarios of genome evolution under the complex mixture of demography and selection forces at play during domestication. Finally, we highlight several technical and methodological approaches that may represent significant advances in resolving the conundrum of livestock domestication.

  4. A pipeline for automated annotation of yeast genome sequences by a conserved-synteny approach

    Directory of Open Access Journals (Sweden)

    Proux-Wéra Estelle

    2012-09-01

    Full Text Available Abstract Background Yeasts are a model system for exploring eukaryotic genome evolution. Next-generation sequencing technologies are poised to vastly increase the number of yeast genome sequences, both from resequencing projects (population studies) and from de novo sequencing projects (new species). However, the annotation of genomes presents a major bottleneck for de novo projects, because it still relies on a process that is largely manual. Results Here we present the Yeast Genome Annotation Pipeline (YGAP), an automated system designed specifically for new yeast genome sequences lacking transcriptome data. YGAP does automatic de novo annotation, exploiting homology and synteny information from other yeast species stored in the Yeast Gene Order Browser (YGOB) database. The basic premises underlying YGAP's approach are that data from other species already tells us what genes we should expect to find in any particular genomic region and that we should also expect that orthologous genes are likely to have similar intron/exon structures. Additionally, it is able to detect probable frameshift sequencing errors and can propose corrections for them. YGAP searches intelligently for introns, and detects tRNA genes and Ty-like elements. Conclusions In tests on Saccharomyces cerevisiae and on the genomes of Naumovozyma castellii and Tetrapisispora blattae newly sequenced with Roche-454 technology, YGAP outperformed another popular annotation program (AUGUSTUS). For S. cerevisiae and N. castellii, 91-93% of YGAP's predicted gene structures were identical to those in previous manually curated gene sets. YGAP has been implemented as a webserver with a user-friendly interface at http://wolfe.gen.tcd.ie/annotation.

  5. Chapter 10: Mining genome-wide genetic markers.

    Directory of Open Access Journals (Sweden)

    Xiang Zhang

    Full Text Available Genome-wide association study (GWAS) aims to discover genetic factors underlying phenotypic traits. The large number of genetic factors poses both computational and statistical challenges. Various computational approaches have been developed for large-scale GWAS. In this chapter, we will discuss several widely used computational approaches in GWAS. The following topics will be covered: (1) an introduction to the background of GWAS; (2) the existing computational approaches that are widely used in GWAS, covering single-locus, epistasis detection, and machine learning methods that have been recently developed in the biology, statistics, and computer science communities (this part is the main focus of the chapter); and (3) the limitations of current approaches and future directions.
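    To make the single-locus idea concrete, the sketch below tests one SNP for association with case/control status using a chi-square test on genotype counts. The counts are fabricated for illustration, and multiple-testing correction across the genome is only noted in a comment.

```python
# Concrete single-locus association test for one SNP: chi-square test on a
# 2x3 table of genotype counts in cases vs controls. Counts are fabricated
# for illustration; multiple-testing correction across SNPs is omitted.
from scipy.stats import chi2_contingency

#                 AA   Aa   aa
case_counts    = [220, 510, 270]
control_counts = [300, 490, 210]

chi2, p_value, dof, expected = chi2_contingency([case_counts, control_counts])
print(f"chi2 = {chi2:.2f}, dof = {dof}, p = {p_value:.3g}")

# A genome-wide scan would repeat this test per SNP and then adjust the
# p-values, e.g. with a Bonferroni threshold of 0.05 / number_of_snps.
```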

  6. A framework: Cluster detection and multidimensional visualization of automated data mining using intelligent agents

    CERN Document Server

    Jayabrabu, R; Vivekanandan, K

    2012-01-01

    Data mining techniques play a vital role in tasks such as extracting required knowledge and finding unsuspected information to support strategic decision-making in a novel way that is understandable by domain experts. A generalized framework is proposed that takes non-domain experts into account during the mining process, for better understanding, better decision-making and better discovery of new patterns, by selecting suitable data mining techniques based on the user profile by means of intelligent agents. KEYWORDS: Data Mining Techniques, Intelligent Agents, User Profile, Multidimensional Visualization, Knowledge Discovery.

  7. SeqMule: automated pipeline for analysis of human exome/genome sequencing data.

    Science.gov (United States)

    Guo, Yunfei; Ding, Xiaolei; Shen, Yufeng; Lyon, Gholson J; Wang, Kai

    2015-09-18

    Next-generation sequencing (NGS) technology has greatly helped us identify disease-contributory variants for Mendelian diseases. However, users are often faced with issues such as software compatibility, complicated configuration, and no access to a high-performance computing facility. Discrepancies exist among aligners and variant callers. We developed a computational pipeline, SeqMule, to perform automated variant calling from NGS data on human genomes and exomes. SeqMule integrates computational-cluster-free parallelization capability built on top of the variant callers, and facilitates normalization/intersection of variant calls to generate a consensus set with high confidence. SeqMule integrates 5 alignment tools and 5 variant calling algorithms and accepts various combinations, all via a one-line command, thereby allowing highly flexible yet fully automated variant calling. On a modern machine (2 Intel Xeon X5650 CPUs, 48 GB memory), when fast turn-around is needed, SeqMule generates annotated VCF files in a day from a 30X whole-genome sequencing data set; when more accurate calling is needed, SeqMule generates a consensus call set that improves over single callers, as measured by both Mendelian error rate and consistency. SeqMule supports Sun Grid Engine for parallel processing, offers a turn-key solution for deployment on Amazon Web Services, and allows quality checks, Mendelian error checks, consistency evaluation, and HTML-based reports. SeqMule is available at http://seqmule.openbioinformatics.org.
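    The consensus step can be illustrated in miniature: normalize each caller's variants to (chrom, pos, ref, alt) keys and keep those supported by a majority of callers. The sketch below is a simplified stand-in, not SeqMule's code, and the variant records are invented.

```python
# Miniature illustration of consensus variant calling (not SeqMule's code):
# normalize each caller's variants to (chrom, pos, ref, alt) keys and keep
# those supported by a majority of callers. Records below are invented.
from collections import Counter

calls_by_caller = {
    "caller_A": {("chr1", 101, "A", "G"), ("chr1", 250, "T", "C"), ("chr2", 42, "G", "T")},
    "caller_B": {("chr1", 101, "A", "G"), ("chr2", 42, "G", "T")},
    "caller_C": {("chr1", 101, "A", "G"), ("chr1", 250, "T", "C")},
}

def consensus(calls_by_caller, min_support=2):
    # Count how many callers report each normalized variant key.
    support = Counter(v for calls in calls_by_caller.values() for v in calls)
    return {variant for variant, n in support.items() if n >= min_support}

for variant in sorted(consensus(calls_by_caller)):
    print(variant)   # variants seen by at least two of the three callers
```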

  8. Design of Coal Mine Integrated Automation System Based on NetLinx

    Institute of Scientific and Technical Information of China (English)

    DING En-jie; ZHANG Shen

    2003-01-01

    A network structure for a coal mine integrated automation system based on NetLinx was proposed. The features of the three-layer network structure were discussed in detail, and the mechanism of network timing determinism was analyzed. A design example of the integrated automation system for a real coal mine was presented.

  9. GMATA: An Integrated Software Package for Genome-Scale SSR Mining, Marker Development and Viewing

    Science.gov (United States)

    Wang, Xuewen; Wang, Le

    2016-01-01

    Simple sequence repeats (SSRs), also referred to as microsatellites, are highly variable tandem DNAs that are widely used as genetic markers. The increasing availability of whole-genome and transcript sequences provides information resources for SSR marker development. However, efficient software is required to identify and display SSR information along with other gene features at a genome scale. We developed the novel software package Genome-wide Microsatellite Analyzing Tool Package (GMATA), which integrates SSR mining, statistical analysis and plotting, marker design, polymorphism screening and marker transferability assessment, and enables the simultaneous display of SSR markers with other genome features. GMATA applies novel strategies for SSR analysis and primer design in large genomes, which allows it to perform faster calculations and provide more accurate results than existing tools. Our package is also capable of processing DNA sequences of any size on a standard computer. GMATA is user friendly, requires only mouse clicks or typed inputs on the command line, and is executable on multiple computing platforms. We demonstrated the application of GMATA to plant genomes and revealed a novel distribution pattern of SSRs in 15 grass genomes. The most abundant motifs are the GA/TC dimer, the A/T monomer and the GCG/CGC trimer, rather than G/C-rich motifs. We also revealed that SSR count is linearly related to chromosome length in fully assembled grass genomes. GMATA represents a powerful application tool that facilitates genomic sequence analyses. GMATA is freely available at http://sourceforge.net/projects/gmata/?source=navbar. PMID:27679641

  10. GMATA: an integrated software package for genome-scale SSR mining, marker development and viewing

    Directory of Open Access Journals (Sweden)

    Xuewen Wang

    2016-09-01

    Full Text Available Simple sequence repeats (SSRs), also referred to as microsatellites, are highly variable tandem DNAs that are widely used as genetic markers. The increasing availability of whole-genome and transcript sequences provides information resources for SSR marker development. However, efficient software is required to identify and display SSR information along with other gene features at a genome scale. We developed the novel software package Genome-wide Microsatellite Analyzing Tool Package (GMATA), which integrates SSR mining, statistical analysis and plotting, marker design, polymorphism screening and marker transferability assessment, and enables the simultaneous display of SSR markers with other genome features. GMATA applies novel strategies for SSR analysis and primer design in large genomes, which allows it to perform faster calculations and provide more accurate results than existing tools. Our package is also capable of processing DNA sequences of any size on a standard computer. GMATA is user friendly, requires only mouse clicks or typed inputs on the command line, and is executable on multiple computing platforms. We demonstrated the application of GMATA to plant genomes and revealed a novel distribution pattern of SSRs in 15 grass genomes. The most abundant motifs are the GA/TC dimer, the A/T monomer and the GCG/CGC trimer, rather than G/C-rich motifs. We also revealed that SSR count is linearly related to chromosome length in fully assembled grass genomes. GMATA represents a powerful application tool that facilitates genomic sequence analyses. GMATA is freely available at http://sourceforge.net/projects/gmata/?source=navbar.

  11. Ensembl Plants: Integrating Tools for Visualizing, Mining, and Analyzing Plant Genomics Data.

    Science.gov (United States)

    Bolser, Dan; Staines, Daniel M; Pritchard, Emily; Kersey, Paul

    2016-01-01

    Ensembl Plants ( http://plants.ensembl.org ) is an integrative resource presenting genome-scale information for a growing number of sequenced plant species (currently 33). Data provided includes genome sequence, gene models, functional annotation, and polymorphic loci. Various additional information is provided for variation data, including population structure, individual genotypes, linkage, and phenotype data. In each release, comparative analyses are performed on whole genome and protein sequences, and genome alignments and gene trees are made available that show the implied evolutionary history of each gene family. Access to the data is provided through a genome browser incorporating many specialist interfaces for different data types, and through a variety of additional methods for programmatic access and data mining. These access routes are consistent with those offered through the Ensembl interface for the genomes of non-plant species, including those of plant pathogens, pests, and pollinators. Ensembl Plants is updated 4-5 times a year and is developed in collaboration with our international partners in the Gramene ( http://www.gramene.org ) and transPLANT ( http://www.transplantdb.org ) projects.

  12. Phylogeny-guided (meta)genome mining approach for the targeted discovery of new microbial natural products.

    Science.gov (United States)

    Kang, Hahk-Soo

    2017-02-01

    Genomics-based methods are now commonplace in natural products research. A phylogeny-guided mining approach provides a means to quickly screen a large number of microbial genomes or metagenomes in search of new biosynthetic gene clusters of interest. In this approach, biosynthetic genes serve as molecular markers, and phylogenetic trees built with known and unknown marker gene sequences are used to quickly prioritize biosynthetic gene clusters for characterization of their metabolites. An increase in the use of this approach has been observed over the last couple of years, along with the emergence of low-cost sequencing technologies. The aim of this review is to discuss the basic concept of a phylogeny-guided mining approach, and also to provide examples in which this approach was successfully applied to discover new natural products from microbial genomes and metagenomes. I believe that the phylogeny-guided mining approach will continue to play an important role in genomics-based natural products research.
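    The core of the approach, building a tree from marker-gene sequences so that sequences falling outside known clades can be prioritized, can be sketched with Biopython's distance-based tree construction. The "marker gene" fragments below are made up and already aligned; real use would start from full-length sequences and a proper aligner.

```python
# Sketch of the phylogeny-guided prioritization step using Biopython's
# distance-based tree construction. The marker fragments are made up and
# pre-aligned; real use would align full-length sequences first.
from Bio.Align import MultipleSeqAlignment
from Bio.Seq import Seq
from Bio.SeqRecord import SeqRecord
from Bio.Phylo.TreeConstruction import DistanceCalculator, DistanceTreeConstructor
from Bio import Phylo

alignment = MultipleSeqAlignment([
    SeqRecord(Seq("MKTAYIAKQRQISFVKSHFSR"), id="known_compoundA"),
    SeqRecord(Seq("MKTAYLAKQRQISFVKAHFSR"), id="known_compoundB"),
    SeqRecord(Seq("MRTGYIPKQKQLSFIKSHWTR"), id="metagenome_bin_17"),
    SeqRecord(Seq("MKTAYIAKQRQLSFVKSHFSR"), id="isolate_G412"),
])

distances = DistanceCalculator("identity").get_distance(alignment)
tree = DistanceTreeConstructor().nj(distances)
Phylo.draw_ascii(tree)   # divergent leaves (e.g. metagenome_bin_17) get prioritized
```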

  13. A Tool for Multiple Targeted Genome Deletions that Is Precise, Scar-Free, and Suitable for Automation.

    Science.gov (United States)

    Aubrey, Wayne; Riley, Michael C; Young, Michael; King, Ross D; Oliver, Stephen G; Clare, Amanda

    2015-01-01

    Many advances in synthetic biology require the removal of a large number of genomic elements from a genome. Most existing deletion methods leave behind markers, and as there are a limited number of markers, such methods can only be applied a fixed number of times. Deletion methods that recycle markers generally are either imprecise (remove untargeted sequences), or leave scar sequences which can cause genome instability and rearrangements. No existing marker recycling method is automation-friendly. We have developed a novel openly available deletion tool that consists of: 1) a method for deleting genomic elements that can be repeatedly used without limit, is precise, scar-free, and suitable for automation; and 2) software to design the method's primers. Our tool is sequence agnostic and could be used to delete large numbers of coding sequences, promoter regions, transcription factor binding sites, terminators, etc in a single genome. We have validated our tool on the deletion of non-essential open reading frames (ORFs) from S. cerevisiae. The tool is applicable to arbitrary genomes, and we provide primer sequences for the deletion of: 90% of the ORFs from the S. cerevisiae genome, 88% of the ORFs from S. pombe genome, and 85% of the ORFs from the L. lactis genome.
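    Scar-free deletion designs of this general kind rely on primers built from sequence immediately flanking the target element. The sketch below extracts hypothetical upstream and downstream flanks around an ORF to serve as primer design templates; it is a generic illustration, not the authors' primer design software, and the coordinates and sequence are invented.

```python
# Generic illustration (not the authors' primer-design software): extract the
# sequence immediately flanking a target ORF as templates for scar-free
# deletion primers. Coordinates and sequence are hypothetical.
import random

def revcomp(seq):
    return seq.translate(str.maketrans("ACGT", "TGCA"))[::-1]

def deletion_flanks(genome, orf_start, orf_end, arm_length=45):
    """Return (upstream_arm, downstream_arm) around a 0-based, end-exclusive ORF."""
    upstream = genome[max(0, orf_start - arm_length):orf_start]
    downstream = genome[orf_end:orf_end + arm_length]
    return upstream, downstream

# Hypothetical 300-bp genome fragment with a target "ORF" at 120..240.
random.seed(0)
genome = "".join(random.choice("ACGT") for _ in range(300))
up, down = deletion_flanks(genome, 120, 240)
forward_primer_template = up            # anneals upstream of the ORF
reverse_primer_template = revcomp(down) # anneals downstream, on the reverse strand
print(len(forward_primer_template), len(reverse_primer_template))
```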

  14. A Tool for Multiple Targeted Genome Deletions that Is Precise, Scar-Free, and Suitable for Automation.

    Directory of Open Access Journals (Sweden)

    Wayne Aubrey

    Full Text Available Many advances in synthetic biology require the removal of a large number of genomic elements from a genome. Most existing deletion methods leave behind markers, and as there are a limited number of markers, such methods can only be applied a fixed number of times. Deletion methods that recycle markers generally are either imprecise (remove untargeted sequences), or leave scar sequences which can cause genome instability and rearrangements. No existing marker recycling method is automation-friendly. We have developed a novel openly available deletion tool that consists of: 1) a method for deleting genomic elements that can be repeatedly used without limit, is precise, scar-free, and suitable for automation; and 2) software to design the method's primers. Our tool is sequence agnostic and could be used to delete large numbers of coding sequences, promoter regions, transcription factor binding sites, terminators, etc in a single genome. We have validated our tool on the deletion of non-essential open reading frames (ORFs) from S. cerevisiae. The tool is applicable to arbitrary genomes, and we provide primer sequences for the deletion of: 90% of the ORFs from the S. cerevisiae genome, 88% of the ORFs from S. pombe genome, and 85% of the ORFs from the L. lactis genome.

  15. A Tool for Multiple Targeted Genome Deletions that Is Precise, Scar-Free, and Suitable for Automation

    Science.gov (United States)

    Aubrey, Wayne; Riley, Michael C.; Young, Michael; King, Ross D.; Oliver, Stephen G.; Clare, Amanda

    2015-01-01

    Many advances in synthetic biology require the removal of a large number of genomic elements from a genome. Most existing deletion methods leave behind markers, and as there are a limited number of markers, such methods can only be applied a fixed number of times. Deletion methods that recycle markers generally are either imprecise (remove untargeted sequences), or leave scar sequences which can cause genome instability and rearrangements. No existing marker recycling method is automation-friendly. We have developed a novel openly available deletion tool that consists of: 1) a method for deleting genomic elements that can be repeatedly used without limit, is precise, scar-free, and suitable for automation; and 2) software to design the method’s primers. Our tool is sequence agnostic and could be used to delete large numbers of coding sequences, promoter regions, transcription factor binding sites, terminators, etc in a single genome. We have validated our tool on the deletion of non-essential open reading frames (ORFs) from S. cerevisiae. The tool is applicable to arbitrary genomes, and we provide primer sequences for the deletion of: 90% of the ORFs from the S. cerevisiae genome, 88% of the ORFs from S. pombe genome, and 85% of the ORFs from the L. lactis genome. PMID:26630677

  16. BioCreative Workshops for DOE Genome Sciences: Text Mining for Metagenomics

    Energy Technology Data Exchange (ETDEWEB)

    Wu, Cathy H. [Univ. of Delaware, Newark, DE (United States). Center for Bioinformatics and Computational Biology; Hirschman, Lynette [The MITRE Corporation, Bedford, MA (United States)

    2016-10-29

    The objective of this project was to host BioCreative workshops to define and develop text mining tasks to meet the needs of the Genome Sciences community, focusing on metadata information extraction in metagenomics. Following the successful introduction of metagenomics at the BioCreative IV workshop, members of the metagenomics community and BioCreative communities continued discussion to identify candidate topics for a BioCreative metagenomics track for BioCreative V. Of particular interest was the capture of environmental and isolation source information from text. The outcome was to form a “community of interest” around work on the interactive EXTRACT system, which supported interactive tagging of environmental and species data. This experiment is included in the BioCreative V virtual issue of Database. In addition, there was broad participation by members of the metagenomics community in the panels held at BioCreative V, leading to valuable exchanges between the text mining developers and members of the metagenomics research community. These exchanges are reflected in a number of the overview and perspective pieces also being captured in the BioCreative V virtual issue. Overall, this conversation has exposed the metagenomics researchers to the possibilities of text mining, and educated the text mining developers to the specific needs of the metagenomics community.

  17. Multiplex Automated Genome Engineering

    Institute of Scientific and Technical Information of China (English)

    李丹; 高海军

    2015-01-01

    Genome editing technology is widely used in genome engineering research; site-specific nuclease technologies and the CRISPR/Cas system have made outstanding contributions to single-gene editing, but owing to the huge size of the genome these technologies still have certain limitations. Multiplex Automated Genome Engineering (MAGE) is a new genome editing technology that can act on multiple genes simultaneously; it is fast and efficient, and has been used for gene knockout and gene replacement in Escherichia coli. This review mainly introduces the principle, operating protocol and technical progress of MAGE, and discusses its development trend in the light of its applications.

  18. Strategies for the discovery of new natural products by genome mining.

    Science.gov (United States)

    Zerikly, Malek; Challis, Gregory L

    2009-03-02

    Natural products have a very broad spectrum of applications. Many natural products are used clinically as antibacterial, antifungal, antiparasitic, anticancer and immunosuppressive agents and are therefore of utmost importance for our society. When in the 1940s the golden age of antibiotics was ushered in, a "gold rush fever" of natural product discovery in the pharmaceutical industry ensued for many decades. However, the traditional process of discovering new bioactive natural products is generally long and laborious, and known natural products are frequently rediscovered. A mass-withdrawal of pharmaceutical companies from new natural product discovery and natural products research has thus occurred in recent years. In this article, the concept of genome mining for novel natural product discovery, which promises to provide a myriad of new bioactive natural compounds, is summarized and discussed. Genome mining for new natural product discovery exploits the huge and constantly increasing quantity of DNA sequence data from a wide variety of organisms that is accumulating in publicly accessible databases. Genes encoding enzymes likely to be involved in natural product biosynthesis can be readily located in sequenced genomes by use of computational sequence comparison tools. This information can be exploited in a variety of ways in the search for new bioactive natural products.

  19. Perspective: NanoMine: A material genome approach for polymer nanocomposites analysis and design

    Science.gov (United States)

    Zhao, He; Li, Xiaolin; Zhang, Yichi; Schadler, Linda S.; Chen, Wei; Brinson, L. Catherine

    2016-05-01

    Polymer nanocomposites are a designer class of materials where nanoscale particles, functional chemistry, and polymer resin combine to provide materials with unprecedented combinations of physical properties. In this paper, we introduce NanoMine, a data-driven web-based platform for analysis and design of polymer nanocomposite systems under the material genome concept. This open data resource strives to curate experimental and computational data on nanocomposite processing, structure, and properties, as well as to provide analysis and modeling tools that leverage curated data for material property prediction and design. With a continuously expanding dataset and toolkit, NanoMine encourages community feedback and input to construct a sustainable infrastructure that benefits nanocomposite material research and development.

  20. Discovery of phosphonic acid natural products by mining the genomes of 10,000 actinomycetes.

    Science.gov (United States)

    Ju, Kou-San; Gao, Jiangtao; Doroghazi, James R; Wang, Kwo-Kwang A; Thibodeaux, Christopher J; Li, Steven; Metzger, Emily; Fudala, John; Su, Joleen; Zhang, Jun Kai; Lee, Jaeheon; Cioni, Joel P; Evans, Bradley S; Hirota, Ryuichi; Labeda, David P; van der Donk, Wilfred A; Metcalf, William W

    2015-09-29

    Although natural products have been a particularly rich source of human medicines, activity-based screening results in a very high rate of rediscovery of known molecules. Based on the large number of natural product biosynthetic genes in microbial genomes, many have proposed "genome mining" as an alternative approach for discovery efforts; however, this idea has yet to be performed experimentally on a large scale. Here, we demonstrate the feasibility of large-scale, high-throughput genome mining by screening a collection of over 10,000 actinomycetes for the genetic potential to make phosphonic acids, a class of natural products with diverse and useful bioactivities. Genome sequencing identified a diverse collection of phosphonate biosynthetic gene clusters within 278 strains. These clusters were classified into 64 distinct groups, of which 55 are likely to direct the synthesis of unknown compounds. Characterization of strains within five of these groups resulted in the discovery of a new archetypical pathway for phosphonate biosynthesis, the first (to our knowledge) dedicated pathway for H-phosphinates, and 11 previously undescribed phosphonic acid natural products. Among these compounds are argolaphos, a broad-spectrum antibacterial phosphonopeptide composed of aminomethylphosphonate in peptide linkage to a rare amino acid N(5)-hydroxyarginine; valinophos, an N-acetyl l-Val ester of 2,3-dihydroxypropylphosphonate; and phosphonocystoximate, an unusual thiohydroximate-containing molecule representing a new chemotype of sulfur-containing phosphonate natural products. Analysis of the genome sequences from the remaining strains suggests that the majority of the phosphonate biosynthetic repertoire of Actinobacteria has been captured at the gene level. This dereplicated strain collection now provides a reservoir of numerous, as yet undiscovered, phosphonate natural products.

  1. Genome mining of the Streptomyces avermitilis genome and development of genome-minimized hosts for heterologous expression of biosynthetic gene clusters.

    Science.gov (United States)

    Ikeda, Haruo; Kazuo, Shin-ya; Omura, Satoshi

    2014-02-01

    To date, several actinomycete genomes have been completed and annotated. Among them, Streptomyces microorganisms are of major pharmaceutical interest because they are a rich source of numerous secondary metabolites. S. avermitilis is an industrial microorganism used for the production of an anthelmintic agent, avermectin, which is a commercially important antiparasitic agent in human and veterinary medicine and in agricultural pesticides. Genome analysis of S. avermitilis provides significant information not only for industrial applications but also for understanding the features of this genus. Genome mining of S. avermitilis has revealed that the microorganism harbors at least 38 secondary metabolite gene clusters and 46 insertion sequence (IS)-like sequences on the genome, which had not previously been surveyed. A significant use of the genome data of Streptomyces microorganisms is the construction of a versatile host for heterologous expression of exogenous biosynthetic gene clusters by genetic engineering. Since S. avermitilis is used as an industrial microorganism, it is already optimized for the efficient supply of primary metabolic precursors and biochemical energy to support multistep biosynthesis. The feasibility of large-deletion mutants of S. avermitilis has been confirmed by heterologous expression of more than 20 exogenous biosynthetic gene clusters.

  2. Towards fully automated structure-based function prediction in structural genomics: a case study.

    Science.gov (United States)

    Watson, James D; Sanderson, Steve; Ezersky, Alexandra; Savchenko, Alexei; Edwards, Aled; Orengo, Christine; Joachimiak, Andrzej; Laskowski, Roman A; Thornton, Janet M

    2007-04-13

    As the global Structural Genomics projects have picked up pace, the number of structures annotated in the Protein Data Bank as hypothetical protein or unknown function has grown significantly. A major challenge now involves the development of computational methods to assign functions to these proteins accurately and automatically. As part of the Midwest Center for Structural Genomics (MCSG) we have developed a fully automated functional analysis server, ProFunc, which performs a battery of analyses on a submitted structure. The analyses combine a number of sequence-based and structure-based methods to identify functional clues. After the first stage of the Protein Structure Initiative (PSI), we review the success of the pipeline and the importance of structure-based function prediction. As a dataset, we have chosen all structures solved by the MCSG during the 5 years of the first PSI. Our analysis suggests that two of the structure-based methods are particularly successful and provide examples of local similarity that is difficult to identify using current sequence-based methods. No one method is successful in all cases, so, through the use of a number of complementary sequence and structural approaches, the ProFunc server increases the chances that at least one method will find a significant hit that can help elucidate function. Manual assessment of the results is a time-consuming process and subject to individual interpretation and human error. We present a method based on the Gene Ontology (GO) schema using GO-slims that can allow the automated assessment of hits with a success rate approaching that of expert manual assessment.

  3. Genome mining expands the chemical diversity of the cyanobactin family to include highly modified linear peptides.

    Science.gov (United States)

    Leikoski, Niina; Liu, Liwei; Jokela, Jouni; Wahlsten, Matti; Gugger, Muriel; Calteau, Alexandra; Permi, Perttu; Kerfeld, Cheryl A; Sivonen, Kaarina; Fewer, David P

    2013-08-22

    Ribosomal peptides are produced through the posttranslational modification of short precursor peptides. Cyanobactins are a growing family of cyclic ribosomal peptides produced by cyanobacteria. However, a broad systematic survey of the genetic capacity to produce cyanobactins is lacking. Here we report the identification of 31 cyanobactin gene clusters from 126 genomes of cyanobacteria. Genome mining suggested a complex evolutionary history defined by horizontal gene transfer and rapid diversification of precursor genes. Extensive chemical analyses demonstrated that some cyanobacteria produce short linear cyanobactins with a chain length ranging from three to five amino acids. The linear peptides were N-prenylated and O-methylated on the N and C termini, respectively, and named aeruginosamide and viridisamide. These findings broaden the structural diversity of the cyanobactin family to include highly modified linear peptides with rare posttranslational modifications.

  4. Genome mining reveals unlocked bioactive potential of marine Gram-negative bacteria

    DEFF Research Database (Denmark)

    Machado, Henrique; Sonnenschein, Eva; Melchiorsen, Jette;

    2015-01-01

    of bioactive compounds leading to successful applications in pharmaceutical and biotech industries. Marine bacteria have so far not been exploited to the same extent; however, they are believed to harbor a multitude of novel bioactive chemistry. To explore this potential, genomes of 21 marine Alpha- and Gammaproteobacteria collected during the Galathea 3 expedition were sequenced and mined for natural product encoding gene clusters. Results: Independently of genome size, bacteria of all tested genera carried a large number of clusters encoding different potential bioactivities, especially within the Vibrionaceae and Pseudoalteromonadaceae families. A very high potential was identified in pigmented pseudoalteromonads with up to 20 clusters in a single strain, mostly NRPSs and NRPS-PKS hybrids. Furthermore, regulatory elements in bioactivity-related pathways including chitin metabolism, quorum sensing and iron scavenging systems were...

  5. Mining clinical attributes of genomic variants through assisted literature curation in Egas.

    Science.gov (United States)

    Matos, Sérgio; Campos, David; Pinho, Renato; Silva, Raquel M; Mort, Matthew; Cooper, David N; Oliveira, José Luís

    2016-01-01

    The veritable deluge of biological data over recent years has led to the establishment of a considerable number of knowledge resources that compile curated information extracted from the literature and store it in structured form, facilitating its use and exploitation. In this article, we focus on the curation of inherited genetic variants and associated clinical attributes, such as zygosity, penetrance or inheritance mode, and describe the use of Egas for this task. Egas is a web-based platform for text-mining assisted literature curation that focuses on usability through modern design solutions and simple user interactions. Egas offers a flexible and customizable tool that allows defining the concept types and relations of interest for a given annotation task, as well as the ontologies used for normalizing each concept type. Further, annotations may be performed on raw documents or on the results of automated concept identification and relation extraction tools. Users can inspect, correct or remove automatic text-mining results, manually add new annotations, and export the results to standard formats. Egas is compatible with the most recent versions of Google Chrome, Mozilla Firefox, Internet Explorer and Safari and is available for use at https://demo.bmd-software.com/egas/. Database URL: https://demo.bmd-software.com/egas/.

  6. Mining

    Directory of Open Access Journals (Sweden)

    Khairullah Khan

    2014-09-01

    Full Text Available Opinion mining is an interesting area of research because of its applications in various fields. Collecting opinions of people about products and about social and political events and problems through the Web is becoming increasingly popular every day. The opinions of users are helpful for the public and for stakeholders when making certain decisions. Opinion mining is a way to retrieve information through search engines, Web blogs and social networks. Because of the huge number of reviews in the form of unstructured text, it is impossible to summarize the information manually. Accordingly, efficient computational methods are needed for mining and summarizing the reviews from corpuses and Web documents. This study presents a systematic literature survey regarding the computational techniques, models and algorithms for mining opinion components from unstructured reviews.

  7. Genome mining and functional genomics for siderophore production in Aspergillus niger.

    Science.gov (United States)

    Franken, Angelique C W; Lechner, Beatrix E; Werner, Ernst R; Haas, Hubertus; Lokman, B Christien; Ram, Arthur F J; van den Hondel, Cees A M J J; de Weert, Sandra; Punt, Peter J

    2014-11-01

    Iron is an essential metal for many organisms, but the biologically relevant form of iron is scarce because of rapid oxidation resulting in low solubility. Simultaneously, excessive accumulation of iron is toxic. Consequently, iron uptake is a highly controlled process. In most fungal species, siderophores play a central role in iron handling. Siderophores are small iron-specific chelators that can be secreted to scavenge environmental iron or bind intracellular iron with high affinity. A second high-affinity iron uptake mechanism is reductive iron assimilation (RIA). As shown in Aspergillus fumigatus and Aspergillus nidulans, synthesis of siderophores in Aspergilli is predominantly under control of the transcription factors SreA and HapX, which are connected by a negative transcriptional feedback loop. Abolishing this fine-tuned regulation corroborates iron homeostasis, including heme biosynthesis, which could be biotechnologically of interest, e.g. the heterologous production of heme-dependent peroxidases. Aspergillus niger genome inspection identified orthologues of several genes relevant for RIA and siderophore metabolism, as well as sreA and hapX. Interestingly, genes related to synthesis of the common fungal extracellular siderophore triacetylfusarinine C were absent. Reverse-phase high-performance liquid chromatography (HPLC) confirmed the absence of triacetylfusarinine C, and demonstrated that the major secreted siderophores of A. niger are coprogen B and ferrichrome, which is also the dominant intracellular siderophore. In A. niger wild type grown under iron-replete conditions, the expression of genes involved in coprogen biosynthesis and RIA was low in the exponential growth phase but significantly induced during ascospore germination. Deletion of sreA in A. niger resulted in elevated iron uptake and increased cellular ferrichrome accumulation. Increased sensitivity toward phleomycin and high iron concentration reflected the toxic effects of excessive

  8. Epigenetic genome mining of an endophytic fungus leads to the pleiotropic biosynthesis of natural products.

    Science.gov (United States)

    Mao, Xu-Ming; Xu, Wei; Li, Dehai; Yin, Wen-Bing; Chooi, Yit-Heng; Li, Yong-Quan; Tang, Yi; Hu, Youcai

    2015-06-22

    The small-molecule biosynthetic potential of most filamentous fungi has remained largely unexplored and represents an attractive source for the discovery of new compounds. Genome sequencing of Calcarisporium arbuscula, a mushroom-endophytic fungus, revealed 68 core genes that are involved in natural product biosynthesis. This is in sharp contrast to the predominant production of the ATPase inhibitors aurovertin B and D in the wild-type fungus. Inactivation of a histone H3 deacetylase led to pleiotropic activation and overexpression of more than 75% of the biosynthetic genes. Sampling of the overproduced compounds led to the isolation of ten compounds, four of which contained new structures, including the cyclic peptides arbumycin and arbumelin, the diterpenoid arbuscullic acid A, and the meroterpenoid arbuscullic acid B. Such epigenetic modifications therefore provide a rapid and global approach to mine the chemical diversity of endophytic fungi.

  9. Genomic typing of Escherichia coli O157:H7 by semi-automated fluorescent AFLP analysis.

    Science.gov (United States)

    Zhao, S; Mitchell, S E; Meng, J; Kresovich, S; Doyle, M P; Dean, R E; Casa, A M; Weller, J W

    2000-02-01

    Escherichia coli serotype O157:H7 isolates were analyzed using a relatively new DNA fingerprinting method, amplified fragment length polymorphism (AFLP). Total genomic DNA was digested with two restriction endonucleases (EcoRI and MseI), and compatible oligonucleotide adapters were ligated to the ends of the resulting DNA fragments. Subsets of fragments from the total pool of cleaved DNA were then amplified by the polymerase chain reaction (PCR) using selective primers that extended beyond the adapter and restriction site sequences. One of the primers from each set was labeled with a fluorescent dye, which enabled amplified fragments to be detected and sized automatically on a DNA sequencer. Three AFLP primer sets generated a total of 37 unique genotypes among the 48 E. coli O157:H7 isolates tested. Prior fingerprinting analysis of large restriction fragments from these same isolates by pulsed-field gel electrophoresis (PFGE) resulted in only 21 unique DNA profiles. Also, AFLP fingerprinting was successful for one DNA sample that was not typable by PFGE, presumably because of template degradation. AFLP analysis, therefore, provided greater genetic resolution and was less sensitive to DNA quality than PFGE. Consequently, this DNA typing technology should be very useful for genetic subtyping of bacterial pathogens in epidemiologic studies.
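
    The record above describes the wet-lab AFLP workflow; the fragment-generation step can also be approximated in silico. The sketch below is a simplified illustration rather than the authors' pipeline: it cuts a sequence at EcoRI (G^AATTC) and MseI (T^TAA) recognition sites and reports fragment lengths, while adapter ligation and selective PCR are not modelled.

        # In silico double digest with EcoRI (G^AATTC) and MseI (T^TAA):
        # a simplified sketch of AFLP fragment generation.
        import re

        ENZYMES = {           # recognition site -> cut offset within the site
            "GAATTC": 1,      # EcoRI cuts G^AATTC
            "TTAA": 1,        # MseI cuts T^TAA
        }

        def digest(seq):
            """Return fragment lengths after a complete double digest."""
            seq = seq.upper()
            cuts = {0, len(seq)}
            for site, offset in ENZYMES.items():
                for m in re.finditer(site, seq):
                    cuts.add(m.start() + offset)
            positions = sorted(cuts)
            return [b - a for a, b in zip(positions, positions[1:])]

        if __name__ == "__main__":
            demo = "ACGTGAATTCGGTTAACCGAATTCTTAAGGC"   # toy sequence
            print(digest(demo))                        # -> [5, 8, 6, 6, 6]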

  10. Human genomic DNA analysis using a semi-automated sample preparation, amplification, and electrophoresis separation platform.

    Science.gov (United States)

    Raisi, Fariba; Blizard, Benjamin A; Raissi Shabari, Akbar; Ching, Jesus; Kintz, Gregory J; Mitchell, Jim; Lemoff, Asuncion; Taylor, Mike T; Weir, Fred; Western, Linda; Wong, Wendy; Joshi, Rekha; Howland, Pamela; Chauhan, Avinash; Nguyen, Peter; Petersen, Kurt E

    2004-03-01

    The growing importance of analyzing the human genome to detect hereditary and infectious diseases associated with specific DNA sequences has motivated us to develop automated devices to integrate sample preparation, real-time PCR, and microchannel electrophoresis (MCE). In this report, we present results from an optimized compact system capable of processing a raw sample of blood, extracting the DNA, and performing a multiplexed PCR reaction. Finally, an innovative electrophoretic separation was performed on the post-PCR products using a unique MCE system. The sample preparation system extracted and lysed white blood cells (WBC) from whole blood, producing DNA of sufficient quantity and quality for a polymerase chain reaction (PCR). Separation of multiple amplicons was achieved in a microfabricated channel 30 µm × 100 µm in cross section and 85 mm in length, filled with a replaceable methyl cellulose matrix operated under denaturing conditions at 50 °C. By incorporating fluorescent-labeled primers in the PCR, the amplicons were identified by a two-color (multiplexed) fluorescence detection system. Two base-pair resolution of single-stranded DNA (PCR products) was achieved. We believe that this integrated system provides a unique solution for DNA analysis.

  11. Automated Personal Email Organizer with Information Management and Text Mining Application

    Directory of Open Access Journals (Sweden)

    Dr. Sanjay Tanwani

    2012-04-01

    Full Text Available Email is one of the most ubiquitous applications, used regularly by millions of people worldwide. Professionals have to manage hundreds of emails on a daily basis, sometimes leading to overload and stress. Many emails go unanswered and sometimes remain unattended as time passes. Managing every single email takes considerable effort, especially when the email transaction log is very large. This work is focused on creating better ways of automatically organizing personal email messages. In this paper, a methodology for automated event information extraction from incoming email messages is proposed. The proposed methodology and the software based on it have helped to improve email management, reducing stress and enabling timely responses to emails.

  12. New strategies for medical data mining, part 3: automated workflow analysis and optimization.

    Science.gov (United States)

    Reiner, Bruce

    2011-02-01

    The practice of evidence-based medicine calls for the creation of "best practice" guidelines, leading to improved clinical outcomes. One of the primary factors limiting evidence-based medicine in radiology today is the relative paucity of standardized databases. The creation of standardized medical imaging databases offers the potential to enhance radiologist workflow and diagnostic accuracy through objective data-driven analytics, which can be categorized in accordance with specific variables relating to the individual examination, patient, provider, and technology being used. In addition to this "global" database analysis, "individual" radiologist workflow can be analyzed through the integration of electronic auditing tools into the PACS. The combination of these individual and global analyses can ultimately identify best practice patterns, which can be adapted to the individual attributes of end users and ultimately used in the creation of automated evidence-based medicine workflow templates.

  13. Genome mining demonstrates the widespread occurrence of gene clusters encoding bacteriocins in cyanobacteria.

    Science.gov (United States)

    Wang, Hao; Fewer, David P; Sivonen, Kaarina

    2011-01-01

    Cyanobacteria are a rich source of natural products with interesting biological activities. Many of these are peptides and the end products of a non-ribosomal pathway. However, several cyanobacterial peptide classes were recently shown to be produced through the proteolytic cleavage and post-translational modification of short precursor peptides. A new class of bacteriocins produced through the proteolytic cleavage and heterocyclization of precursor proteins was recently identified from marine cyanobacteria. Here we show the widespread occurrence of bacteriocin gene clusters in cyanobacteria through comparative analysis of 58 cyanobacterial genomes. A total of 145 bacteriocin gene clusters were discovered through genome mining. These clusters encoded 290 putative bacteriocin precursors. They ranged in length from 28 to 164 amino acids with very little sequence conservation of the core peptide. The gene clusters could be classified into seven groups according to their gene organization and domain composition. This classification is supported by phylogenetic analysis, which further indicated independent evolutionary trajectories of gene clusters in different groups. Our data suggests that cyanobacteria are a prolific source of low-molecular weight post-translationally modified peptides.
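
    Given the precursor lengths reported above (28 to 164 amino acids), a first-pass scan for candidate precursors can be illustrated with a few lines of code. The sketch below is not the authors' method: it only scans the forward strand of a DNA sequence for short open reading frames in that size range, while real precursor prediction would also use leader-peptide motifs and the context of the surrounding gene cluster.

        # Minimal sketch (not the study's pipeline): forward-strand ORFs whose
        # products fall in the 28-164 aa range reported for the precursors.
        STOPS = {"TAA", "TAG", "TGA"}

        def short_orfs(seq, min_aa=28, max_aa=164):
            seq = seq.upper()
            hits = []
            for frame in range(3):                      # forward frames only
                for i in range(frame, len(seq) - 2, 3):
                    if seq[i:i + 3] != "ATG":
                        continue
                    for j in range(i + 3, len(seq) - 2, 3):
                        if seq[j:j + 3] in STOPS:
                            aa_len = (j - i) // 3       # codons incl. initiator Met
                            if min_aa <= aa_len <= max_aa:
                                hits.append((i, j + 3, aa_len))
                            break
            return hits

        print(short_orfs("ATG" + "GCT" * 30 + "TAA"))   # -> [(0, 96, 31)]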

  14. Genome mining demonstrates the widespread occurrence of gene clusters encoding bacteriocins in cyanobacteria.

    Directory of Open Access Journals (Sweden)

    Hao Wang

    Full Text Available Cyanobacteria are a rich source of natural products with interesting biological activities. Many of these are peptides and the end products of a non-ribosomal pathway. However, several cyanobacterial peptide classes were recently shown to be produced through the proteolytic cleavage and post-translational modification of short precursor peptides. A new class of bacteriocins produced through the proteolytic cleavage and heterocyclization of precursor proteins was recently identified from marine cyanobacteria. Here we show the widespread occurrence of bacteriocin gene clusters in cyanobacteria through comparative analysis of 58 cyanobacterial genomes. A total of 145 bacteriocin gene clusters were discovered through genome mining. These clusters encoded 290 putative bacteriocin precursors. They ranged in length from 28 to 164 amino acids with very little sequence conservation of the core peptide. The gene clusters could be classified into seven groups according to their gene organization and domain composition. This classification is supported by phylogenetic analysis, which further indicated independent evolutionary trajectories of gene clusters in different groups. Our data suggests that cyanobacteria are a prolific source of low-molecular weight post-translationally modified peptides.

  15. Discovery of defense- and neuropeptides in social ants by genome-mining.

    Directory of Open Access Journals (Sweden)

    Christian W Gruber

    Full Text Available Natural peptides of great number and diversity occur in all organisms, but analyzing their peptidome is often difficult. With natural product drug discovery in mind, we devised a genome-mining approach to identify defense- and neuropeptides in the genomes of social ants from Atta cephalotes (leaf-cutter ant), Camponotus floridanus (carpenter ant) and Harpegnathos saltator (basal genus). Numerous peptide-encoding genes of defense peptides, in particular defensins, and neuropeptides or regulatory peptide hormones, such as allatostatins and tachykinins, were identified and analyzed. Most interestingly we annotated genes that encode oxytocin/vasopressin-related peptides (inotocins) and their putative receptors. This is the first piece of evidence for the existence of this nonapeptide hormone system in ants (Formicidae) and supports recent findings in Tribolium castaneum (red flour beetle) and Nasonia vitripennis (parasitoid wasp), and therefore its confinement to some basal holometabolous insects. By contrast, the absence of the inotocin hormone system in Apis mellifera (honeybee), another closely-related member of the eusocial Hymenoptera clade, establishes the basis for future studies on the molecular evolution and physiological function of oxytocin/vasopressin-related peptides (vasotocin nonapeptide family) and their receptors in social insects. Particularly the identification of ant inotocin and defensin peptide sequences will provide a basis for future pharmacological characterization in the quest for potent and selective lead compounds of therapeutic value.

  16. Discovery of defense- and neuropeptides in social ants by genome-mining.

    Science.gov (United States)

    Gruber, Christian W; Muttenthaler, Markus

    2012-01-01

    Natural peptides of great number and diversity occur in all organisms, but analyzing their peptidome is often difficult. With natural product drug discovery in mind, we devised a genome-mining approach to identify defense- and neuropeptides in the genomes of social ants from Atta cephalotes (leaf-cutter ant), Camponotus floridanus (carpenter ant) and Harpegnathos saltator (basal genus). Numerous peptide-encoding genes of defense peptides, in particular defensins, and neuropeptides or regulatory peptide hormones, such as allatostatins and tachykinins, were identified and analyzed. Most interestingly we annotated genes that encode oxytocin/vasopressin-related peptides (inotocins) and their putative receptors. This is the first piece of evidence for the existence of this nonapeptide hormone system in ants (Formicidae) and supports recent findings in Tribolium castaneum (red flour beetle) and Nasonia vitripennis (parasitoid wasp), and therefore its confinement to some basal holometabolous insects. By contrast, the absence of the inotocin hormone system in Apis mellifera (honeybee), another closely-related member of the eusocial Hymenoptera clade, establishes the basis for future studies on the molecular evolution and physiological function of oxytocin/vasopressin-related peptides (vasotocin nonapeptide family) and their receptors in social insects. Particularly the identification of ant inotocin and defensin peptide sequences will provide a basis for future pharmacological characterization in the quest for potent and selective lead compounds of therapeutic value.

  17. RiceGeneThresher: a web-based application for mining genes underlying QTL in rice genome.

    Science.gov (United States)

    Thongjuea, Supat; Ruanjaichon, Vinitchan; Bruskiewich, Richard; Vanavichit, Apichart

    2009-01-01

    RiceGeneThresher is a public online resource for mining genes underlying genome regions of interest or quantitative trait loci (QTL) in the rice genome. It is a compendium of rice genomic resources consisting of genetic markers, genome annotation, expressed sequence tags (ESTs), protein domains, gene ontology, plant stress-responsive genes, metabolic pathways and prediction of protein-protein interactions. RiceGeneThresher integrates these diverse data sources and provides powerful web-based applications and flexible tools for delivering customized sets of biological data on rice. The system supports whole-genome gene mining for QTL by querying with DNA marker intervals or genomic loci. RiceGeneThresher provides biologically supported evidence that is essential for targeting groups or networks of genes involved in controlling traits underlying QTL. Users can use it to discover and assign the most promising candidate genes in preparation for further gene function validation analysis. The web-based application is freely available at http://rice.kps.ku.ac.th.
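
    The core query described above, returning annotated genes that fall within a marker-delimited QTL interval, reduces to a simple interval-overlap filter. The sketch below illustrates only that idea; the gene identifiers and coordinates are hypothetical placeholders, not records from the RiceGeneThresher database.

        # Interval-overlap query: genes overlapping a marker-delimited QTL
        # interval.  Gene records are hypothetical placeholders.
        from collections import namedtuple

        Gene = namedtuple("Gene", "id chrom start end description")

        ANNOTATION = [
            Gene("gene_A", "chr1", 2903, 10817, "placeholder annotation"),
            Gene("gene_B", "chr1", 11218, 12435, "placeholder annotation"),
            Gene("gene_C", "chr2", 5000, 9000, "placeholder annotation"),
        ]

        def genes_in_interval(chrom, start, end, annotation=ANNOTATION):
            """IDs of genes whose coordinates overlap [start, end] on chrom."""
            return [g.id for g in annotation
                    if g.chrom == chrom and g.start <= end and g.end >= start]

        print(genes_in_interval("chr1", 10000, 12000))   # -> ['gene_A', 'gene_B']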

  18. EST Pipeline System: Detailed and Automated EST Data Processing and Mining

    Institute of Scientific and Technical Information of China (English)

    Hao Xu; Liang Zhang; Hong Yu; Yan Zhou; Ling He; Yuanzhong Zhu; Wei Huang; Lijun Fang; Lin Tao; Yuedong Zhu; Lin Cai; Huayong Xu

    2003-01-01

    Expressed sequence tags (ESTs) are widely used in gene survey research. The EST Pipeline System, software developed by Hangzhou Genomics Institute (HGI), can automatically analyze EST sequence sets of different scales with suitable methods. All the analysis reports, including those of vector masking, sequence assembly, gene annotation, Gene Ontology classification, and some other analyses, can be browsed and searched, as well as downloaded in Excel format, from the web interface, sparing researchers routine data processing and letting them focus on the biological rules embedded in the data.

  19. Automated integration of genomic physical mapping data via parallel simulated annealing

    Energy Technology Data Exchange (ETDEWEB)

    Slezak, T.

    1994-06-01

    The Human Genome Center at the Lawrence Livermore National Laboratory (LLNL) is nearing closure on a high-resolution physical map of human chromosome 19. We have built automated tools to assemble 15,000 fingerprinted cosmid clones into 800 contigs with minimal spanning paths identified. These islands are being ordered, oriented, and spanned by a variety of other techniques including: fluorescence in situ hybridization (FISH) at 3 levels of resolution, EcoRI restriction fragment mapping across all contigs, and a multitude of different hybridization and PCR techniques to link cosmid, YAC, BAC, PAC, and P1 clones. The FISH data provide us with partial order and distance data as well as orientation. We made the observation that map builders need a much rougher presentation of data than do map readers; the former wish to see raw data since these can expose errors or interesting biology. We further noted that by ignoring our length and distance data we could simplify our problem into one that could be readily attacked with optimization techniques. The data integration problem could then be seen as an M x N ordering of our N cosmid clones which "intersect" M larger objects, by defining "intersection" to mean either contig/map membership or hybridization results. Clearly, the goal of making an integrated map is now to rearrange the N cosmid clone "columns" such that the number of gaps on the object "rows" is minimized. Our FISH partially-ordered cosmid clones provide us with a set of constraints that cannot be violated by the rearrangement process. We solved the optimization problem via simulated annealing performed on a network of 40+ Unix machines in parallel, using a server/client model built on explicit socket calls. For current maps we can create a map in about 4 hours on the parallel net versus 4+ days on a single workstation. Our biologists are now using this software on a daily basis to guide their efforts toward final closure.
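
    The annealing formulation sketched in this abstract, reorder the clone "columns" so that each object "row" has as few gaps as possible, can be illustrated with a small serial implementation. The code below is a toy sketch of that framing only; the FISH ordering constraints and the parallel client/server machinery described above are not modelled.

        # Toy serial sketch: columns are clones, rows are larger objects
        # (sets of clone indices); the cost of an ordering is the number of
        # gaps interrupting each row's members.
        import math, random

        def row_gaps(row, order):
            runs, prev = 0, False
            for member in (c in row for c in order):
                if member and not prev:
                    runs += 1
                prev = member
            return max(runs - 1, 0)          # gaps = runs of members minus one

        def cost(rows, order):
            return sum(row_gaps(r, order) for r in rows)

        def anneal(rows, n_cols, steps=20000, t0=2.0, cooling=0.9995, seed=0):
            rng = random.Random(seed)
            order = list(range(n_cols))
            best, cur = list(order), cost(rows, order)
            best_cost, t = cur, t0
            for _ in range(steps):
                i, j = rng.sample(range(n_cols), 2)
                order[i], order[j] = order[j], order[i]
                new = cost(rows, order)
                if new <= cur or rng.random() < math.exp((cur - new) / t):
                    cur = new
                    if cur < best_cost:
                        best_cost, best = cur, list(order)
                else:
                    order[i], order[j] = order[j], order[i]   # reject: undo swap
                t *= cooling
            return best, best_cost

        # toy example: 6 clones, 2 objects whose members should end up contiguous
        rows = [{0, 2, 4}, {1, 3, 5}]
        print(anneal(rows, 6))   # best ordering groups each set contiguously (cost 0)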

  20. Automated image processing and analysis of cartilage MRI: enabling technology for data mining applied to osteoarthritis

    Science.gov (United States)

    Tameem, Hussain Z.; Sinha, Usha S.

    2011-01-01

    Osteoarthritis (OA) is a heterogeneous and multi-factorial disease characterized by the progressive loss of articular cartilage. Magnetic Resonance Imaging has been established as an accurate technique to assess cartilage damage through both cartilage morphology (volume and thickness) and cartilage water mobility (Spin-lattice relaxation, T2). The Osteoarthritis Initiative, OAI, is a large scale serial assessment of subjects at different stages of OA including those with pre-clinical symptoms. The electronic availability of the comprehensive data collected as part of the initiative provides an unprecedented opportunity to discover new relationships in complex diseases such as OA. However, imaging data, which provides the most accurate non-invasive assessment of OA, is not directly amenable for data mining. Changes in morphometry and relaxivity with OA disease are both complex and subtle, making manual methods extremely difficult. This chapter focuses on the image analysis techniques to automatically localize the differences in morphometry and relaxivity changes in different population sub-groups (normal and OA subjects segregated by age, gender, and race). The image analysis infrastructure will enable automatic extraction of cartilage features at the voxel level; the ultimate goal is to integrate this infrastructure to discover relationships between the image findings and other clinical features. PMID:21785520

  1. Automated image processing and analysis of cartilage MRI: enabling technology for data mining applied to osteoarthritis.

    Science.gov (United States)

    Tameem, Hussain Z; Sinha, Usha S

    2007-01-01

    Osteoarthritis (OA) is a heterogeneous and multi-factorial disease characterized by the progressive loss of articular cartilage. Magnetic Resonance Imaging has been established as an accurate technique to assess cartilage damage through both cartilage morphology (volume and thickness) and cartilage water mobility (Spin-lattice relaxation, T2). The Osteoarthritis Initiative, OAI, is a large scale serial assessment of subjects at different stages of OA including those with pre-clinical symptoms. The electronic availability of the comprehensive data collected as part of the initiative provides an unprecedented opportunity to discover new relationships in complex diseases such as OA. However, imaging data, which provides the most accurate non-invasive assessment of OA, is not directly amenable for data mining. Changes in morphometry and relaxivity with OA disease are both complex and subtle, making manual methods extremely difficult. This chapter focuses on the image analysis techniques to automatically localize the differences in morphometry and relaxivity changes in different population sub-groups (normal and OA subjects segregated by age, gender, and race). The image analysis infrastructure will enable automatic extraction of cartilage features at the voxel level; the ultimate goal is to integrate this infrastructure to discover relationships between the image findings and other clinical features.

  2. Automated web usage data mining and recommendation system using K-Nearest Neighbor (KNN) classification method

    Directory of Open Access Journals (Sweden)

    D.A. Adeniyi

    2016-01-01

    Full Text Available The major problem of many on-line web sites is the presentation of many choices to the client at a time; this usually results in a strenuous and time-consuming task when finding the right product or information on the site. In this work, we present a study of automatic web usage data mining and a recommendation system based on the current user's behavior through his/her click-stream data on the newly developed Really Simple Syndication (RSS) reader website, in order to provide relevant information to the individual without explicitly asking for it. The K-Nearest-Neighbor (KNN) classification method has been trained to be used on-line and in real time to identify clients'/visitors' click-stream data, matching it to a particular user group and recommending a tailored browsing option that meets the need of the specific user at a particular time. To achieve this, the web users' RSS address file was extracted, cleansed, formatted and grouped into meaningful sessions, and a data mart was developed. Our results show that the K-Nearest Neighbor classifier is transparent, consistent, straightforward, simple to understand, has a high tendency to possess desirable qualities, and is easier to implement than most other machine learning techniques, specifically when there is little or no prior knowledge about the data distribution.
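
    A minimal illustration of the classification step follows, using scikit-learn's KNeighborsClassifier purely for convenience; the study describes its own implementation and feature engineering, and the per-session feature vectors and group labels below are invented toy data, not the paper's RSS data mart.

        # Toy KNN classification of click-stream sessions into user groups.
        import numpy as np
        from sklearn.neighbors import KNeighborsClassifier

        # columns: [news, sport, tech] page counts per session (toy data)
        X_train = np.array([[8, 0, 1], [7, 1, 0], [0, 9, 2],
                            [1, 8, 1], [0, 1, 9], [2, 0, 7]])
        y_train = np.array(["news_reader", "news_reader", "sport_fan",
                            "sport_fan", "tech_fan", "tech_fan"])

        model = KNeighborsClassifier(n_neighbors=3)
        model.fit(X_train, y_train)

        new_session = np.array([[1, 0, 8]])
        print(model.predict(new_session))   # -> ['tech_fan']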

  3. An Automated Real-Time System for Opinion Mining using a Hybrid Approach

    Directory of Open Access Journals (Sweden)

    Indrajit Mukherjee

    2016-07-01

    Full Text Available In this paper, a novel idea is presented to perform opinion mining in a simple and efficient manner with the help of the One-Level-Tree (OLT) based approach, in order to recognize opinions specific to features in customer reviews in which a variety of features are commingled with diverse emotions. Unlike some previous efforts that relied entirely on one-time structured or filtered data, this work is based solely on unstructured data obtained in real time from Twitter. The hybrid approach utilizes the associations defined in dependency parsing grammar and fully employs Double Propagation to extract new features and related new opinions within the review. A dictionary-based approach is used to expand the opinion lexicon. Within the dependency parsing relations, a new relation is proposed to more effectively catch the associations between opinions and features. Three new methods are proposed, termed Double Positive Double Negative (DPDN), Catch-Phrase Method (CPM) and Negation Check (NC), for performing criteria-specific evaluations. The OLT approach conveniently displays the relationship between the features and their opinions in an elementary fashion in the form of a graph. The proposed system achieves high accuracy across all domains and also performs better than state-of-the-art systems.

  4. Semi-automated curation of protein subcellular localization: a text mining-based approach to Gene Ontology (GO) Cellular Component curation

    Directory of Open Access Journals (Sweden)

    Chan Juancarlos

    2009-07-01

    Full Text Available Abstract Background Manual curation of experimental data from the biomedical literature is an expensive and time-consuming endeavor. Nevertheless, most biological knowledge bases still rely heavily on manual curation for data extraction and entry. Text mining software that can semi- or fully automate information retrieval from the literature would thus provide a significant boost to manual curation efforts. Results We employ the Textpresso category-based information retrieval and extraction system http://www.textpresso.org, developed by WormBase, to explore how Textpresso might improve the efficiency with which we manually curate C. elegans proteins to the Gene Ontology's Cellular Component Ontology. Using a training set of sentences that describe results of localization experiments in the published literature, we generated three new curation task-specific categories (Cellular Components, Assay Terms, and Verbs) containing words and phrases associated with reports of experimentally determined subcellular localization. We compared the results of manual curation to that of Textpresso queries that searched the full text of articles for sentences containing terms from each of the three new categories plus the name of a previously uncurated C. elegans protein, and found that Textpresso searches identified curatable papers with recall and precision rates of 79.1% and 61.8%, respectively (F-score of 69.5%), when compared to manual curation. Within those documents, Textpresso identified relevant sentences with recall and precision rates of 30.3% and 80.1% (F-score of 44.0%). From returned sentences, curators were able to make 66.2% of all possible experimentally supported GO Cellular Component annotations with 97.3% precision (F-score of 78.8%). Measuring the relative efficiencies of Textpresso-based versus manual curation we find that Textpresso has the potential to increase curation efficiency by at least 8-fold, and perhaps as much as 15-fold, given...
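
    The retrieval idea, keeping a sentence only if it mentions the protein of interest plus at least one term from each task-specific category, and the precision/recall bookkeeping can be sketched in a few lines. The category word lists and the example sentences below are invented placeholders, not Textpresso's actual categories or corpus.

        # Category-based sentence filter plus precision/recall bookkeeping.
        # Category term lists and example sentences are placeholders only.
        CATEGORIES = {
            "cellular_component": {"nucleus", "mitochondria", "plasma membrane"},
            "assay_terms": {"gfp", "antibody", "immunostaining"},
            "verbs": {"localizes", "localized", "expressed"},
        }

        def sentence_matches(sentence, protein):
            s = sentence.lower()
            return protein.lower() in s and all(
                any(term in s for term in terms) for terms in CATEGORIES.values()
            )

        def precision_recall(retrieved, relevant):
            retrieved, relevant = set(retrieved), set(relevant)
            tp = len(retrieved & relevant)
            precision = tp / len(retrieved) if retrieved else 0.0
            recall = tp / len(relevant) if relevant else 0.0
            return precision, recall

        sentences = [
            "UNC-1::GFP localizes to the plasma membrane of neurons.",
            "unc-1 mutants move slowly on plates.",
        ]
        hits = [s for s in sentences if sentence_matches(s, "unc-1")]
        print(precision_recall(hits, relevant=[sentences[0]]))   # -> (1.0, 1.0)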

  5. Comparative analysis of a cryptic thienamycin-like gene cluster identified in Streptomyces flavogriseus by genome mining.

    Science.gov (United States)

    Blanco, Gloria

    2012-06-01

    In silico database searches allowed the identification in the S. flavogriseus ATCC 33331 genome of a carbapenem gene cluster highly related to the thienamycin cluster of S. cattleya. This is the second cluster found for a complex, highly substituted carbapenem. Comparative analysis revealed that both gene clusters display a high degree of synteny in gene organization and in protein conservation. Although the cluster appears to be silent under our laboratory conditions, the putative metabolic product was predicted from bioinformatics analyses using sequence comparison tools. These data, together with previous reports concerning epithienamycin production by S. flavogriseus strains, suggest that the cluster's metabolic product might be a thienamycin-like carbapenem, possibly the epimeric epithienamycin. This finding might help in understanding the biosynthetic pathway to thienamycin and other highly substituted carbapenems. It also provides another example of genome mining in sequenced Streptomyces genomes as a powerful approach for novel antibiotic discovery.

  6. Draft genome sequence of Sinorhizobium meliloti CCNWSX0020, a nitrogen-fixing symbiont with copper tolerance capability isolated from lead-zinc mine tailings.

    Science.gov (United States)

    Li, Zhefei; Ma, Zhanqiang; Hao, Xiuli; Wei, Gehong

    2012-03-01

    Sinorhizobium meliloti CCNWSX0020, which can establish a symbiotic relationship with Medicago species, was isolated from Medicago lupulina plants growing in lead-zinc mine tailings. The genome of this bacterium also contains a number of protein-coding sequences related to metal tolerance. We anticipate that the genome sequence will provide valuable information for exploring environmental bioremediation.

  7. Evaluation of Three Automated Genome Annotations for Halorhabdus utahensis

    DEFF Research Database (Denmark)

    Bakke, Peter; Carney, Nick; DeLoache, Will

    2009-01-01

    in databases such as NCBI and used to validate subsequent annotation errors. We submitted the genome sequence of halophilic archaeon Halorhabdus utahensis to be analyzed by three genome annotation services. We have examined the output from each service in a variety of ways in order to compare the methodology...

  8. Data Mining Approaches for Genomic Biomarker Development: Applications Using Drug Screening Data from the Cancer Genome Project and the Cancer Cell Line Encyclopedia.

    Science.gov (United States)

    Covell, David G

    2015-01-01

    Developing reliable biomarkers of tumor cell drug sensitivity and resistance can guide hypothesis-driven basic science research and influence pre-therapy clinical decisions. A popular strategy for developing biomarkers uses characterizations of human tumor samples against a range of cancer drug responses that correlate with genomic change; developed largely from the efforts of the Cancer Cell Line Encyclopedia (CCLE) and Sanger Cancer Genome Project (CGP). The purpose of this study is to provide an independent analysis of this data that aims to vet existing and add novel perspectives to biomarker discoveries and applications. Existing and alternative data mining and statistical methods will be used to a) evaluate drug responses of compounds with similar mechanism of action (MOA), b) examine measures of gene expression (GE), copy number (CN) and mutation status (MUT) biomarkers, combined with gene set enrichment analysis (GSEA), for hypothesizing biological processes important for drug response, c) conduct global comparisons of GE, CN and MUT as biomarkers across all drugs screened in the CGP dataset, and d) assess the positive predictive power of CGP-derived GE biomarkers as predictors of drug response in CCLE tumor cells. The perspectives derived from individual and global examinations of GEs, MUTs and CNs confirm existing and reveal unique and shared roles for these biomarkers in tumor cell drug sensitivity and resistance. Applications of CGP-derived genomic biomarkers to predict the drug response of CCLE tumor cells finds a highly significant ROC, with a positive predictive power of 0.78. The results of this study expand the available data mining and analysis methods for genomic biomarker development and provide additional support for using biomarkers to guide hypothesis-driven basic science research and pre-therapy clinical decisions.
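
    The two summary statistics mentioned above, the ROC and the positive predictive power of a biomarker-based sensitivity call, can be illustrated on toy data as follows; this is not the study's analysis, and the 0.5 threshold is an arbitrary choice for the example.

        # ROC AUC and positive predictive value for a toy biomarker score.
        import numpy as np
        from sklearn.metrics import roc_auc_score

        y_true = np.array([1, 1, 1, 0, 0, 1, 0, 0])       # 1 = drug-sensitive line
        y_score = np.array([0.9, 0.8, 0.4, 0.3, 0.2, 0.7, 0.6, 0.1])  # biomarker score

        print("ROC AUC:", roc_auc_score(y_true, y_score))

        threshold = 0.5                                   # arbitrary cut-off
        y_pred = (y_score >= threshold).astype(int)
        tp = int(np.sum((y_pred == 1) & (y_true == 1)))
        fp = int(np.sum((y_pred == 1) & (y_true == 0)))
        ppv = tp / (tp + fp) if (tp + fp) else float("nan")
        print("Positive predictive value at 0.5:", ppv)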

  9. Data Mining Approaches for Genomic Biomarker Development: Applications Using Drug Screening Data from the Cancer Genome Project and the Cancer Cell Line Encyclopedia.

    Directory of Open Access Journals (Sweden)

    David G Covell

    Full Text Available Developing reliable biomarkers of tumor cell drug sensitivity and resistance can guide hypothesis-driven basic science research and influence pre-therapy clinical decisions. A popular strategy for developing biomarkers uses characterizations of human tumor samples against a range of cancer drug responses that correlate with genomic change; developed largely from the efforts of the Cancer Cell Line Encyclopedia (CCLE) and Sanger Cancer Genome Project (CGP). The purpose of this study is to provide an independent analysis of this data that aims to vet existing and add novel perspectives to biomarker discoveries and applications. Existing and alternative data mining and statistical methods will be used to a) evaluate drug responses of compounds with similar mechanism of action (MOA), b) examine measures of gene expression (GE), copy number (CN) and mutation status (MUT) biomarkers, combined with gene set enrichment analysis (GSEA), for hypothesizing biological processes important for drug response, c) conduct global comparisons of GE, CN and MUT as biomarkers across all drugs screened in the CGP dataset, and d) assess the positive predictive power of CGP-derived GE biomarkers as predictors of drug response in CCLE tumor cells. The perspectives derived from individual and global examinations of GEs, MUTs and CNs confirm existing and reveal unique and shared roles for these biomarkers in tumor cell drug sensitivity and resistance. Applications of CGP-derived genomic biomarkers to predict the drug response of CCLE tumor cells finds a highly significant ROC, with a positive predictive power of 0.78. The results of this study expand the available data mining and analysis methods for genomic biomarker development and provide additional support for using biomarkers to guide hypothesis-driven basic science research and pre-therapy clinical decisions.

  10. Genome-wide mining, characterization, and development of microsatellite markers in Marsupenaeus japonicus by genome survey sequencing

    Science.gov (United States)

    Lu, Xia; Luan, Sheng; Kong, Jie; Hu, Longyang; Mao, Yong; Zhong, Shengping

    2017-01-01

    The kuruma prawn, Marsupenaeus japonicus, is one of the most cultivated and consumed species of shrimp. However, very few molecular genetic/genomic resources are publicly available for it. Thus, the characterization and distribution of simple sequence repeats (SSRs) remains ambiguous and the use of SSR markers in genomic studies and marker-assisted selection is limited. The goal of this study is to characterize and develop genome-wide SSR markers in M. japonicus by genome survey sequencing for application in comparative genomics and breeding. A total of 326 945 perfect SSRs were identified, among which dinucleotide repeats were the most frequent class (44.08%), followed by mononucleotides (29.67%), trinucleotides (18.96%), tetranucleotides (5.66%), hexanucleotides (1.07%), and pentanucleotides (0.56%). In total, 151 541 SSR loci primers were successfully designed. A subset of 30 SSR primer pairs were synthesized and tested in 42 individuals from a wild population, of which 27 loci (90.0%) were successfully amplified with specific products and 24 (80.0%) were polymorphic. For the amplified polymorphic loci, the alleles ranged from 5 to 17 (with an average of 9.63), and the average PIC value was 0.796. A total of 58 256 SSR-containing sequences had significant Gene Ontology annotation; these are good functional molecular marker candidates for association studies and comparative genomic analysis. The newly identified SSRs significantly contribute to the M. japonicus genomic resources and will facilitate a number of genetic and genomic studies, including high density linkage mapping, genome-wide association analysis, marker-aided selection, comparative genomics analysis, population genetics, and evolution.
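
    A minimal sketch of perfect-SSR detection of the kind described above follows: a backreference regex finds motifs of 1 to 6 bp repeated in tandem. The 12 bp minimum total length is an illustrative threshold, not the exact criteria used in the study.

        # Regex-based scan for perfect SSRs with motif lengths of 1-6 bp.
        # The 12 bp minimum total length is an illustrative threshold only.
        import re

        SSR_RE = re.compile(r"([ACGT]{1,6}?)\1+")

        def find_ssrs(seq, min_len=12):
            hits = []
            for m in SSR_RE.finditer(seq.upper()):
                motif, total = m.group(1), len(m.group(0))
                if total >= min_len:
                    hits.append((m.start(), motif, total // len(motif)))
            return hits

        demo = "GGCT" + "AG" * 10 + "TTTTTTTTTTTTT" + "ACGTA"
        print(find_ssrs(demo))   # -> [(4, 'AG', 10), (24, 'T', 13)]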

  11. antiSMASH 3.0—a comprehensive resource for the genome mining of biosynthetic gene clusters

    DEFF Research Database (Denmark)

    Weber, Tilmann; Blin, Kai; Duddela, Srikanth

    2015-01-01

    Microbial secondary metabolism constitutes a rich source of antibiotics, chemotherapeutics, insecticides and other high-value chemicals. Genome mining of gene clusters that encode the biosynthetic pathways for these metabolites has become a key methodology for novel compound discovery. In 2011, we introduced antiSMASH, a web server and stand-alone tool for the automatic genomic identification and analysis of biosynthetic gene clusters, available at http://antismash.secondarymetabolites.org. Here, we present version 3.0 of antiSMASH, which has undergone major improvements. A full integration of the recently published ClusterFinder algorithm now allows using this probabilistic algorithm to detect putative gene clusters of unknown types. Also, a new dereplication variant of the ClusterBlast module now identifies similarities of identified clusters to any of 1172 clusters with known end products...

  12. Identification and activation of novel biosynthetic gene clusters by genome mining in the kirromycin producer Streptomyces collinus Tü 365

    DEFF Research Database (Denmark)

    Iftime, Dumitrita; Kulik, Andreas; Härtner, Thomas;

    2016-01-01

    Streptomycetes are prolific sources of novel biologically active secondary metabolites with pharmaceutical potential. S. collinus Tü 365 is a Streptomyces strain, isolated 1972 from Kouroussa (Guinea). It is best known as producer of the antibiotic kirromycin, an inhibitor of the protein biosynthesis interacting with elongation factor EF-Tu. Genome mining revealed 32 gene clusters encoding the biosynthesis of diverse secondary metabolites in the genome of Streptomyces collinus Tü 365, indicating an enormous biosynthetic potential of this strain. The structural diversity of secondary metabolisms predicted for S. collinus Tü 365 includes PKS, NRPS, PKS-NRPS hybrids, a lanthipeptide, terpenes and siderophores. While some of these gene clusters were found to contain genes related to known secondary metabolites, which also could be detected in HPLC–MS analyses, most of the uncharacterized...

  13. Discovery of novel targets for multi-epitope vaccines: Screening of HIV-1 genomes using association rule mining

    Directory of Open Access Journals (Sweden)

    Piontkivska Helen

    2009-07-01

    Full Text Available Abstract Background Studies have shown that in the genome of human immunodeficiency virus (HIV-1), regions responsible for interactions with the host's immune system, namely cytotoxic T-lymphocyte (CTL) epitopes, tend to cluster together in relatively conserved regions. On the other hand, "epitope-less" regions or regions with relatively low density of epitopes tend to be more variable. However, very little is known about relationships among epitopes from different genes, in other words, whether particular epitopes from different genes would occur together in the same viral genome. To identify CTL epitopes in different genes that co-occur in HIV genomes, association rule mining was used. Results Using a set of 189 best-defined HIV-1 CTL/CD8+ epitopes from 9 different protein-coding genes, as described by Frahm, Linde & Brander (2007), we examined the complete genomic sequences of 62 reference HIV sequences (including 13 subtypes and sub-subtypes, with approximately 4 representative sequences for each subtype or sub-subtype, and 18 circulating recombinant forms). The results showed that despite inclusion of recombinant sequences that would be expected to break up associations of epitopes in different genes when two different genomes are recombined, there exist particular combinations of epitopes (epitope associations) that occur repeatedly across the world-wide population of HIV-1. For example, Pol epitope LFLDGIDKA is found to be significantly associated with epitopes GHQAAMQML and FLKEKGGL from Gag and Nef, respectively, and this association rule is observed even among circulating recombinant forms. Conclusion We have identified CTL epitope combinations co-occurring in HIV-1 genomes including different subtypes and recombinant forms. Such co-occurrence has important implications for the design of complex vaccines (multi-epitope vaccines) and/or drugs that would target multiple HIV-1 regions at once and, thus, may be expected to overcome challenges...
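
    The association-rule bookkeeping behind such results can be sketched directly: treat each genome as a "transaction" of the epitopes it carries and compute support and confidence for candidate rules. The epitope sequences below are taken from the record above, but the presence/absence table is toy data, and the study applied full rule mining with significance filtering rather than this brute-force pass.

        # Support and confidence for candidate epitope co-occurrence rules.
        # Genome names and the presence/absence table are toy placeholders.
        from itertools import combinations

        genomes = {
            "subtype_B_1": {"LFLDGIDKA", "GHQAAMQML", "FLKEKGGL"},
            "subtype_B_2": {"LFLDGIDKA", "GHQAAMQML"},
            "subtype_C_1": {"LFLDGIDKA", "FLKEKGGL"},
            "CRF01_AE_1":  {"GHQAAMQML"},
        }

        def rule_stats(a, b, transactions):
            n = len(transactions)
            with_a = [t for t in transactions.values() if a in t]
            with_ab = [t for t in with_a if b in t]
            support = len(with_ab) / n
            confidence = len(with_ab) / len(with_a) if with_a else 0.0
            return support, confidence

        epitopes = sorted(set().union(*genomes.values()))
        for a, b in combinations(epitopes, 2):
            sup, conf = rule_stats(a, b, genomes)
            print(f"{a} -> {b}: support={sup:.2f} confidence={conf:.2f}")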

  14. Chicken genome mapping - Constructing part of a road map for mining this bird's DNA

    NARCIS (Netherlands)

    Aerts, J.

    2005-01-01

    The aim of the research presented in this thesis was to aid in the international chicken genome mapping effort. To this purpose, a significant contribution was made to the construction of the chicken whole-genome BAC-based physical map (presented in Chapter A). An important aspect of this constructi

  15. Genome data mining of lactic acid bacteria: the impact of bioinformatics

    NARCIS (Netherlands)

    Siezen, R.J.; Enckevort, F.H.J. van; Kleerebezem, M.; Teusink, B.

    2004-01-01

    Lactic acid bacteria (LAB) have been widely used in food fermentations and, more recently, as probiotics in health-promoting food products. Genome sequencing and functional genomics studies of a variety of LAB are now rapidly providing insights into their diversity and evolution and revealing the mo

  16. Mining non-model genomic libraries for microsatellites: BAC versus EST libraries and the generation of allelic richness

    Directory of Open Access Journals (Sweden)

    Shaw Kerry L

    2010-07-01

    Full Text Available Abstract Background Simple sequence repeats (SSRs) are tandemly repeated sequence motifs common in genomic nucleotide sequence that often harbor significant variation in repeat number. Frequently used as molecular markers, SSRs are increasingly identified via in silico approaches. Two common classes of genomic resources that can be mined are bacterial artificial chromosome (BAC) libraries and expressed sequence tag (EST) libraries. Results 288 SSR loci were screened in the rapidly radiating Hawaiian swordtail cricket genus Laupala. SSRs were more densely distributed and contained longer repeat structures in BAC library-derived sequence than in EST library-derived sequence, although neither repeat density nor length was exceptionally elevated despite the relatively large genome size of Laupala. A non-random distribution favoring AT-rich SSRs was observed. Allelic diversity of SSRs was positively correlated with repeat length and was generally higher in AT-rich repeat motifs. Conclusion The first large-scale survey of Orthopteran SSR allelic diversity is presented. Selection contributes more strongly to the size and density distributions of SSR loci derived from EST library sequence than from BAC library sequence, although all SSRs likely are subject to similar physical and structural constraints, such as slippage of DNA replication machinery, that may generate increased allelic diversity in AT-rich sequence motifs. Although in silico approaches work well for SSR locus identification in both EST and BAC libraries, BAC library sequence and AT-rich repeat motifs are generally superior SSR development resources for most applications.

  17. Identification and activation of novel biosynthetic gene clusters by genome mining in the kirromycin producer Streptomyces collinus Tü 365.

    Science.gov (United States)

    Iftime, Dumitrita; Kulik, Andreas; Härtner, Thomas; Rohrer, Sabrina; Niedermeyer, Timo Horst Johannes; Stegmann, Evi; Weber, Tilmann; Wohlleben, Wolfgang

    2016-03-01

    Streptomycetes are prolific sources of novel biologically active secondary metabolites with pharmaceutical potential. S. collinus Tü 365 is a Streptomyces strain, isolated 1972 from Kouroussa (Guinea). It is best known as producer of the antibiotic kirromycin, an inhibitor of the protein biosynthesis interacting with elongation factor EF-Tu. Genome Mining revealed 32 gene clusters encoding the biosynthesis of diverse secondary metabolites in the genome of Streptomyces collinus Tü 365, indicating an enormous biosynthetic potential of this strain. The structural diversity of secondary metabolisms predicted for S. collinus Tü 365 includes PKS, NRPS, PKS-NRPS hybrids, a lanthipeptide, terpenes and siderophores. While some of these gene clusters were found to contain genes related to known secondary metabolites, which also could be detected in HPLC-MS analyses, most of the uncharacterized gene clusters are not expressed under standard laboratory conditions. With this study we aimed to characterize the genome information of S. collinus Tü 365 to make use of gene clusters, which previously have not been described for this strain. We were able to connect the gene clusters of a lanthipeptide, a carotenoid, five terpenoid compounds, an ectoine, a siderophore and a spore pigment-associated gene cluster to their respective biosynthesis products.

  18. Evaluating the Strengths and Weaknesses of Mining Audit Data for Automated Models for Intrusion Detection in Tcpdump and Basic Security Module Data

    Directory of Open Access Journals (Sweden)

    A. Arul Lawrence Selvakumar

    2012-01-01

    Full Text Available Problem statement: Intrusion Detection Systems (IDS) have become an important component of the infrastructure protection mechanisms that secure current and emerging networks, their services and applications by detecting, alerting and taking necessary actions against malicious activities. Network size, technology diversity and security policies make networks more challenging, and hence there is a requirement for an IDS that is very accurate, adaptive, extensible and more reliable. Although a novel framework for this requirement exists, namely Mining Audit Data for Automated Models for Intrusion Detection (MADAM ID), it has some performance shortfalls in processing the audit data. Approach: A few experiments were conducted on tcpdump data of DARPA and on BSM audit files by applying the algorithms and tools of MADAM ID to process audit data, mine patterns, construct features and build RIPPER classifiers. Putting it all together, four main categories of attacks, namely DOS, R2L, U2R and PROBING, were simulated. Results: This study outlines the experimental results of MADAM ID in testing the DARPA and BSM data in a simulated network environment. Conclusion: The strengths and weaknesses of MADAM ID have been identified through the experiments conducted on tcpdump data and also on Pascal-based audit files of the Basic Security Module (BSM). This study also gives some additional directions for future applications of MADAM ID.

  19. Draft Genome Sequence of "Acidibacillus ferrooxidans" ITV01, a Novel Acidophilic Firmicute Isolated from a Chalcopyrite Mine Drainage Site in Brazil.

    Science.gov (United States)

    Dall'Agnol, Hivana; Ñancucheo, Ivan; Johnson, D Barrie; Oliveira, Renato; Leite, Laura; Pylro, Victor S; Holanda, Roseanne; Grail, Barry; Carvalho, Nelson; Nunes, Gisele Lopes; Tzotzos, George; Fernandes, Gabriel Rocha; Dutra, Julliane; Orellana, Sara Cuadros; Oliveira, Guilherme

    2016-03-17

    Here, we report the draft genome sequence of "Acidibacillus ferrooxidans" strain ITV01, a ferrous iron- and sulfide-mineral-oxidizing, obligate heterotrophic, and acidophilic bacterium affiliated with the phylum Firmicutes. Strain ITV01 was isolated from neutral drainage from a low-grade chalcopyrite from a mine in northern Brazil.

  20. Draft Genome Sequence of “Acidibacillus ferrooxidans” ITV01, a Novel Acidophilic Firmicute Isolated from a Chalcopyrite Mine Drainage Site in Brazil

    Science.gov (United States)

    Dall’Agnol, Hivana; Ñancucheo, Ivan; Johnson, D. Barrie; Oliveira, Renato; Leite, Laura; Holanda, Roseanne; Grail, Barry; Carvalho, Nelson; Nunes, Gisele Lopes; Tzotzos, George; Fernandes, Gabriel Rocha; Dutra, Julliane; Orellana, Sara Cuadros

    2016-01-01

    Here, we report the draft genome sequence of “Acidibacillus ferrooxidans” strain ITV01, a ferrous iron- and sulfide-mineral-oxidizing, obligate heterotrophic, and acidophilic bacterium affiliated with the phylum Firmicutes. Strain ITV01 was isolated from neutral drainage from a low-grade chalcopyrite from a mine in northern Brazil. PMID:26988062

  1. A framework for automated enrichment of functionally significant inverted repeats in whole genomes

    Directory of Open Access Journals (Sweden)

    Frank Ronald L

    2010-10-01

    Full Text Available Abstract Background RNA transcripts from genomic sequences showing dyad symmetry typically adopt hairpin-like, cloverleaf, or similar structures that act as recognition sites for proteins. Such structures often are the precursors of non-coding RNA (ncRNA) sequences like microRNA (miRNA) and small-interfering RNA (siRNA) that have recently garnered more functional significance than in the past. Genomic DNA contains hundreds of thousands of such inverted repeats (IRs) with varying degrees of symmetry. But by collecting statistically significant information from a known set of ncRNA, we can sort these IRs into those that are likely to be functional. Results A novel method was developed to scan genomic DNA for partially symmetric inverted repeats and the resulting set was further refined to match miRNA precursors (pre-miRNA) with respect to their density of symmetry, statistical probability of the symmetry, length of stems in the predicted hairpin secondary structure, and the GC content of the stems. This method was applied on the Arabidopsis thaliana genome and validated against the set of 190 known Arabidopsis pre-miRNA in the miRBase database. A preliminary scan for IRs identified 186 of the known pre-miRNA but with 714700 pre-miRNA candidates. This large number of IRs was further refined to 483908 candidates with 183 pre-miRNA identified and further still to 165371 candidates with 171 pre-miRNA identified (i.e. with 90% of the known pre-miRNA retained). Conclusions 165371 candidates for potentially functional miRNA is still too large a set to warrant wet lab analyses, such as northern blotting, on all of them. Hence additional filters are needed to further refine the number of candidates while still retaining most of the known miRNA. These include detection of promoters and terminators, homology analyses, location of candidate relative to coding regions, and better secondary structure prediction algorithms. The software developed is designed to easily...
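
    The initial IR scan described above can be illustrated with a simple seed-and-match pass: for each window, look for its reverse complement a bounded distance downstream (a candidate hairpin stem plus loop). The stem and loop bounds below are illustrative only, and the symmetry-density, statistical and secondary-structure filters used in the paper are not implemented here.

        # Seed-and-match scan for inverted repeats: for every window, look for
        # its reverse complement a bounded distance downstream (candidate
        # hairpin stem + loop).  Stem/loop bounds are illustrative only.
        COMPLEMENT = str.maketrans("ACGT", "TGCA")

        def revcomp(s):
            return s.translate(COMPLEMENT)[::-1]

        def inverted_repeats(seq, stem=8, min_loop=3, max_loop=40):
            seq = seq.upper()
            hits = []
            for i in range(len(seq) - 2 * stem - min_loop + 1):
                left = seq[i:i + stem]
                target = revcomp(left)
                lo = i + stem + min_loop                         # earliest right-arm start
                hi = min(i + stem + max_loop, len(seq) - stem)   # latest right-arm start
                j = seq.find(target, lo, hi + stem)
                if j != -1:
                    # (left-arm start, right-arm start, stem sequence, loop sequence)
                    hits.append((i, j, left, seq[i + stem:j]))
            return hits

        demo = "TTTT" + "GGCATGCA" + "AACCA" + revcomp("GGCATGCA") + "TTTT"
        print(inverted_repeats(demo))   # -> [(4, 17, 'GGCATGCA', 'AACCA')]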

  2. Automation in mining. Sensor for measuring the free through cross section. Final report; Automatisierung im Bergbau. Sensor zur Erfassung des freien Durchgangsquerschnittes. Schlussbericht

    Energy Technology Data Exchange (ETDEWEB)

    Breitgraf, H.U.; Wenzel, C.; Plaschke, H.; Hambrock, A.

    1994-06-01

    In this research project, completely new sensors were developed, as there was nothing comparable on the market. The development was carried out by the partners Sick and Weidmueller, with the support of MB DATA for mining-specific details and mining-law authorisation. Among the developed sensors, the optical sensor from the firm of Sick is a very capable device. By contrast, the ultrasonic sensors have not yet been developed to the point of authorisation, but the management of the firm of Weidmueller states that development will continue in 1994. The sensor system described here was developed in the context of the research project "Automation in mining / sensor for measuring the free through cross-section". The aim of this research contract was to develop a device capable of detecting any obstacles within the path of a vehicle. In the case of the optical sensor, this project succeeded in doing so. (orig.)

  3. High-throughput automated microfluidic sample preparation for accurate microbial genomics

    Science.gov (United States)

    Kim, Soohong; De Jonghe, Joachim; Kulesa, Anthony B.; Feldman, David; Vatanen, Tommi; Bhattacharyya, Roby P.; Berdy, Brittany; Gomez, James; Nolan, Jill; Epstein, Slava; Blainey, Paul C.

    2017-01-01

    Low-cost shotgun DNA sequencing is transforming the microbial sciences. Sequencing instruments are so effective that sample preparation is now the key limiting factor. Here, we introduce a microfluidic sample preparation platform that integrates the key steps in cells-to-sequence-library sample preparation for up to 96 samples and reduces DNA input requirements 100-fold while maintaining or improving data quality. The general-purpose microarchitecture we demonstrate supports workflows with arbitrary numbers of reaction and clean-up or capture steps. By reducing the sample quantity requirements, we enabled low-input (∼10,000 cells) whole-genome shotgun (WGS) sequencing of Mycobacterium tuberculosis and soil micro-colonies with superior results. We also leveraged the enhanced throughput to sequence ∼400 clinical Pseudomonas aeruginosa libraries and demonstrate excellent single-nucleotide polymorphism detection performance that explained phenotypically observed antibiotic resistance. Fully-integrated lab-on-chip sample preparation overcomes technical barriers to enable broader deployment of genomics across many basic research and translational applications. PMID:28128213

  4. Genome mining of the hitachimycin biosynthetic gene cluster: involvement of a phenylalanine-2,3-aminomutase in biosynthesis.

    Science.gov (United States)

    Kudo, Fumitaka; Kawamura, Koichi; Uchino, Asuka; Miyanaga, Akimasa; Numakura, Mario; Takayanagi, Ryuichi; Eguchi, Tadashi

    2015-04-13

    Hitachimycin is a macrolactam antibiotic with (S)-β-phenylalanine (β-Phe) at the starter position of its polyketide skeleton. To understand the incorporation mechanism of β-Phe and the modification mechanism of the unique polyketide skeleton, the biosynthetic gene cluster for hitachimycin in Streptomyces scabrisporus was identified by genome mining. The identified gene cluster contains a putative phenylalanine-2,3-aminomutase (PAM), five polyketide synthases, four β-amino-acid-carrying enzymes, and a characteristic amidohydrolase. A hitA knockout mutant showed no hitachimycin production, but antibiotic production was restored by feeding with (S)-β-Phe. We also confirmed the enzymatic activity of the HitA PAM. The results suggest that the identified gene cluster is responsible for the biosynthesis of hitachimycin. A plausible biosynthetic pathway for hitachimycin, including a unique polyketide skeletal transformation mechanism, is proposed.

  5. Genome sequence of the acid-tolerant Desulfovibrio sp. DV isolated from the sediments of a Pb-Zn mine tailings dam in the Chita region, Russia

    Directory of Open Access Journals (Sweden)

    Anastasiia Kovaliova

    2017-03-01

    Full Text Available Here we report the draft genome sequence of the acid-tolerant Desulfovibrio sp. DV isolated from the sediments of a Pb-Zn mine tailings dam in the Chita region, Russia. The draft genome has a size of 4.9 Mb and encodes multiple K+-transporters and proton-consuming decarboxylases. The phylogenetic analysis based on concatenated ribosomal proteins revealed that strain DV clusters together with the acid-tolerant Desulfovibrio sp. TomC and Desulfovibrio magneticus. The draft genome sequence and annotation have been deposited at GenBank under the accession number MLBG00000000.

  6. Development of direct methanol fuel cells for the applications in mining and tunnelling. Automation and power conditioning of a fuel cell-battery hybrid system

    Energy Technology Data Exchange (ETDEWEB)

    Kulakarni, Sreekantha Rao

    2012-07-01

    appropriate option for applications in underground mining and tunnelling. The specific advantages of DMFCs are a simple structure, the higher energy density of the fuel (i.e. methanol), low operating temperature, lower weight, and clean and quiet operation. Methanol is in liquid form, so it is easy to transport and store. Moreover, methanol is a renewable fuel that can be produced from biomass. This doctoral research work focused on the construction of a DMFC stack of 30 W electrical power and the testing of the fuel cell stack in underground mining for the applications discussed above. Not only the stack itself but also the automation system for the fuel cell-battery hybrid system was developed. For automation of the system, a micro-controller monitoring system was developed, which uses sensors for voltage, current, temperature, methanol concentration and liquid level. The development and testing of the methanol concentration sensor was considered the heart of the research work. Last but not least, the power conditioning of the fuel cell stack as well as the battery charging techniques developed were also part of the research work.

  7. EST2uni: an open, parallel tool for automated EST analysis and database creation, with a data mining web interface and microarray expression data integration

    Directory of Open Access Journals (Sweden)

    Nuez Fernando

    2008-01-01

    Full Text Available Abstract Background Expressed sequence tag (EST) collections are composed of a high number of single-pass, redundant, partial sequences, which need to be processed, clustered, and annotated to remove low-quality and vector regions, eliminate redundancy and sequencing errors, and provide biologically relevant information. In order to provide a suitable way of performing the different steps in the analysis of the ESTs, flexible computation pipelines adapted to the local needs of specific EST projects have to be developed. Furthermore, EST collections must be stored in highly structured relational databases available to researchers through user-friendly interfaces which allow efficient and complex data mining, thus offering maximum capabilities for their full exploitation. Results We have created EST2uni, an integrated, highly-configurable EST analysis pipeline and data mining software package that automates the pre-processing, clustering, annotation, database creation, and data mining of EST collections. The pipeline uses standard EST analysis tools and the software has a modular design to facilitate the addition of new analytical methods and their configuration. Currently implemented analyses include functional and structural annotation, SNP and microsatellite discovery, integration of previously known genetic marker data and gene expression results, and assistance in cDNA microarray design. It can be run in parallel in a PC cluster in order to reduce the time necessary for the analysis. It also creates a web site linked to the database, showing collection statistics, with complex query capabilities and tools for data mining and retrieval. Conclusion The software package presented here provides an efficient and complete bioinformatics tool for the management of EST collections which is very easy to adapt to the local needs of different EST projects. The code is freely available under the GPL license and can be obtained at http

  8. InCoB2014: mining biological data from genomics for transforming industry and health.

    Science.gov (United States)

    Schönbach, Christian; Tan, Tin; Ranganathan, Shoba

    2014-01-01

    The 13th International Conference on Bioinformatics (InCoB2014) was held for the first time in Australia, in Sydney, from July 31 to August 2, 2014. InCoB is the annual scientific gathering of the Asia-Pacific Bioinformatics Network (APBioNet), hosted since 2002 in the Asia-Pacific region. Of 106 full papers submitted to the BMC track of InCoB2014, 50 (47.2%) were accepted in BMC Bioinformatics, BMC Genomics and BMC Systems Biology supplements, with three papers in a new BMC Medical Genomics supplement. While the majority of presenters and authors were from Asia and Australia, the increasing number of US and European conference attendees augurs well for the international flavour of InCoB. Next year's InCoB will be held jointly with the Genome Informatics Workshop (GIW), September 9-11, 2015 in Tokyo, Japan, with a view to integrating bioinformatics communities in the region.

  9. Genomic analyses of metal resistance genes in three plant growth promoting bacteria of legume plants in Northwest mine tailings, China

    Institute of Scientific and Technical Information of China (English)

    Pin Xie; Xiuli Hao; Martin Herzberg; Yantao Luo; Dietrich H.Nies; Gehong Wei

    2015-01-01

    To better understand the diversity of metal resistance genetic determinants from microbes that survive in metal tailings in northwest China, a region with highly elevated levels of heavy metals, genomic analysis was conducted using the genome sequences of three native metal-resistant plant growth promoting bacteria (PGPB). It shows that Mesorhizobium amorphae CCNWGS0123 contains metal transporters from the P-type ATPase, CDF (Cation Diffusion Facilitator), HupE/UreJ and CHR (chromate ion transporter) families involved in copper, zinc, nickel as well as chromate resistance and homeostasis. Meanwhile, the putative CopA/CueO system is expected to mediate copper resistance in Sinorhizobium meliloti CCNWSX0020, while the ZntA transporter, assisted by the putative CzcD, determines zinc tolerance in Agrobacterium tumefaciens CCNWGS0286. The greenhouse experiment provides consistent evidence of the plant growth promoting effects of these microbes on their hosts through nitrogen fixation and/or indoleacetic acid (IAA) secretion, indicating a potential for in-situ phytoremediation in the mine tailing regions of China.

  10. antiSMASH 3.0-a comprehensive resource for the genome mining of biosynthetic gene clusters.

    Science.gov (United States)

    Weber, Tilmann; Blin, Kai; Duddela, Srikanth; Krug, Daniel; Kim, Hyun Uk; Bruccoleri, Robert; Lee, Sang Yup; Fischbach, Michael A; Müller, Rolf; Wohlleben, Wolfgang; Breitling, Rainer; Takano, Eriko; Medema, Marnix H

    2015-07-01

    Microbial secondary metabolism constitutes a rich source of antibiotics, chemotherapeutics, insecticides and other high-value chemicals. Genome mining of gene clusters that encode the biosynthetic pathways for these metabolites has become a key methodology for novel compound discovery. In 2011, we introduced antiSMASH, a web server and stand-alone tool for the automatic genomic identification and analysis of biosynthetic gene clusters, available at http://antismash.secondarymetabolites.org. Here, we present version 3.0 of antiSMASH, which has undergone major improvements. A full integration of the recently published ClusterFinder algorithm now allows using this probabilistic algorithm to detect putative gene clusters of unknown types. Also, a new dereplication variant of the ClusterBlast module now identifies similarities of identified clusters to any of 1172 clusters with known end products. At the enzyme level, active sites of key biosynthetic enzymes are now pinpointed through a curated pattern-matching procedure and Enzyme Commission numbers are assigned to functionally classify all enzyme-coding genes. Additionally, chemical structure prediction has been improved by incorporating polyketide reduction states. Finally, in order for users to be able to organize and analyze multiple antiSMASH outputs in a private setting, a new XML output module allows offline editing of antiSMASH annotations within the Geneious software.
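    Since antiSMASH writes its cluster annotations into GenBank output, a result file can be summarised with a few lines of Biopython. The sketch below is illustrative and not part of antiSMASH itself; it assumes clusters appear as features of type "cluster" (as in version 3/4 outputs) or "region" (later versions) carrying a "product" qualifier, and the file name is a placeholder.

```python
# Hedged sketch: summarize biosynthetic gene clusters from an antiSMASH-annotated
# GenBank file. Assumes clusters are encoded as features of type "cluster"
# (antiSMASH 3/4) or "region" (later versions) with a "product" qualifier;
# the input file name is a placeholder.
from collections import Counter
from Bio import SeqIO  # Biopython

def summarize_clusters(genbank_path: str) -> Counter:
    counts = Counter()
    for record in SeqIO.parse(genbank_path, "genbank"):
        for feature in record.features:
            if feature.type in ("cluster", "region"):
                products = feature.qualifiers.get("product", ["unknown"])
                for product in products:
                    counts[product] += 1
                print(record.id, feature.location, products)
    return counts

if __name__ == "__main__":
    print(summarize_clusters("streptomyces_antismash_output.gbk"))
```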

  11. Machine learning and data mining in complex genomic data--a review on the lessons learned in Genetic Analysis Workshop 19.

    Science.gov (United States)

    König, Inke R; Auerbach, Jonathan; Gola, Damian; Held, Elizabeth; Holzinger, Emily R; Legault, Marc-André; Sun, Rui; Tintle, Nathan; Yang, Hsin-Chou

    2016-02-03

    In the analysis of current genomic data, application of machine learning and data mining techniques has become more attractive given the rising complexity of the projects. As part of the Genetic Analysis Workshop 19, approaches from this domain were explored, mostly motivated from two starting points. First, assuming an underlying structure in the genomic data, data mining might identify this and thus improve downstream association analyses. Second, computational methods for machine learning need to be developed further to efficiently deal with the current wealth of data. In the course of discussing results and experiences from the machine learning and data mining approaches, six common messages were extracted. These depict the current state of these approaches in the application to complex genomic data. Although some challenges remain for future studies, important forward steps were taken in the integration of different data types and the evaluation of the evidence. Mining the data for underlying genetic or phenotypic structure and using this information in subsequent analyses proved to be extremely helpful and is likely to become of even greater use with more complex data sets.

  12. Genome mining unveils widespread natural product biosynthetic capacity in human oral microbe Streptococcus mutans

    Science.gov (United States)

    Liu, Liwei; Hao, Tingting; Xie, Zhoujie; Horsman, Geoff P.; Chen, Yihua

    2016-01-01

    Streptococcus mutans is a major pathogen causing human dental caries. As a Gram-positive bacterium with a small genome (about 2 Mb) it is considered a poor source of natural products. Due to a recent explosion in genomic data available for S. mutans strains, we were motivated to explore the natural product production potential of this organism. Bioinformatic characterization of 169 publicly available genomes of S. mutans from human dental caries revealed a surprisingly rich source of natural product biosynthetic gene clusters. antiSMASH analysis identified one nonribosomal peptide synthetase (NRPS) gene cluster, seven polyketide synthase (PKS) gene clusters and 136 hybrid PKS/NRPS gene clusters. In addition, 211 ribosomally synthesized and post-translationally modified peptides (RiPPs) clusters and 615 bacteriocin precursors were identified by a combined analysis using BAGEL and antiSMASH. S. mutans harbors a rich and diverse natural product genetic capacity, which underscores the importance of probing the human microbiome and revisiting species that have traditionally been overlooked as “poor” sources of natural products. PMID:27869143

  13. Nuclear species-diagnostic SNP markers mined from 454 amplicon sequencing reveal admixture genomic structure of modern citrus varieties.

    Science.gov (United States)

    Curk, Franck; Ancillo, Gema; Ollitrault, Frédérique; Perrier, Xavier; Jacquemoud-Collet, Jean-Pierre; Garcia-Lor, Andres; Navarro, Luis; Ollitrault, Patrick

    2015-01-01

    Most cultivated Citrus species originated from interspecific hybridisation between four ancestral taxa (C. reticulata, C. maxima, C. medica, and C. micrantha) with limited further interspecific recombination due to vegetative propagation. This evolution resulted in admixture genomes with frequent interspecific heterozygosity. Moreover, a major part of the phenotypic diversity of edible citrus results from the initial differentiation between these taxa. Deciphering the phylogenomic structure of citrus germplasm is therefore essential for an efficient utilization of citrus biodiversity in breeding schemes. The objective of this work was to develop a set of species-diagnostic single nucleotide polymorphism (SNP) markers for the four Citrus ancestral taxa covering the nine chromosomes, and to use these markers to infer the phylogenomic structure of secondary species and modern cultivars. Species-diagnostic SNPs were mined from 454 amplicon sequencing of 57 gene fragments from 26 genotypes of the four basic taxa. Of the 1,053 SNPs mined from 28,507 kb sequence, 273 were found to be highly diagnostic for a single basic taxon. Species-diagnostic SNP markers (105) were used to analyse the admixture structure of varieties and rootstocks. This revealed C. maxima introgressions in most of the old and in all recent selections of mandarins, and suggested that C. reticulata × C. maxima reticulation and introgression processes were important in edible mandarin domestication. The large range of phylogenomic constitutions between C. reticulata and C. maxima revealed in mandarins, tangelos, tangors, sweet oranges, sour oranges, grapefruits, and orangelos is favourable for genetic association studies based on phylogenomic structures of the germplasm. Inferred admixture structures were in agreement with previous hypotheses regarding the origin of several secondary species and also revealed the probable origin of several acid citrus varieties. The developed species-diagnostic SNP

  14. Handling method for PLC power down in a mine integrated automation system

    Institute of Scientific and Technical Information of China (English)

    唐日宏; 陈程; 燕飞雄

    2013-01-01

    To address the problem in mine integrated automation systems where an unexpected PLC power loss in the belt transportation system leaves the belts in the counter-coal-flow direction (i.e. the belts feeding coal toward the failed belt) unable to stop in time, causing coal-piling accidents, a software-level method for handling PLC power loss is proposed. The method detects the status of the ENBT communication module of each remote I/O substation to judge whether the substation has lost power and then, according to the interlocking relationships, stops the belts in the counter-coal-flow direction, so that the relevant belts stop in time and coal-piling accidents are avoided.
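    The interlock logic described in this record can be illustrated with a few lines of Python (the actual implementation runs in the PLC and monitoring software, not in Python). The belt names, the station mapping and the polling callback below are hypothetical.

```python
# Illustrative sketch (not the paper's PLC code): detect loss of communication
# with a remote I/O substation's ENBT module and stop every belt upstream of
# the affected belt (the belts feeding coal toward it). Names are hypothetical.
from typing import Callable, Dict, List

# Belts listed in coal-flow order: coal moves from index 0 towards the end.
BELT_CHAIN: List[str] = ["belt_A", "belt_B", "belt_C", "belt_D"]
# Which ENBT-equipped I/O substation controls each belt (hypothetical mapping).
BELT_TO_STATION: Dict[str, str] = {b: f"enbt_{b}" for b in BELT_CHAIN}

def handle_power_down(enbt_alive: Callable[[str], bool],
                      stop_belt: Callable[[str], None]) -> List[str]:
    """Return the belts stopped because a substation lost power.

    enbt_alive(station) -> True if the ENBT communication module responds;
    stop_belt(belt) issues the stop command. Both are supplied by the caller.
    """
    stopped = []
    for i, belt in enumerate(BELT_CHAIN):
        if not enbt_alive(BELT_TO_STATION[belt]):
            # Interlock: stop the failed belt (best effort) and every belt
            # upstream of it, so coal is not piled onto a stalled belt.
            for upstream in BELT_CHAIN[:i + 1]:
                stop_belt(upstream)
                stopped.append(upstream)
            break
    return stopped

if __name__ == "__main__":
    # Simulate a power loss at the substation of belt_C.
    print(handle_power_down(lambda s: s != "enbt_belt_C", lambda b: None))
```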

  15. CGMIM: Automated text-mining of Online Mendelian Inheritance in Man (OMIM to identify genetically-associated cancers and candidate genes

    Directory of Open Access Journals (Sweden)

    Jones Steven

    2005-03-01

    Full Text Available Abstract Background Online Mendelian Inheritance in Man (OMIM) is a computerized database of information about genes and heritable traits in human populations, based on information reported in the scientific literature. Our objective was to establish an automated text-mining system for OMIM that will identify genetically-related cancers and cancer-related genes. We developed the computer program CGMIM to search for entries in OMIM that are related to one or more cancer types. We performed manual searches of OMIM to verify the program results. Results In the OMIM database on September 30, 2004, CGMIM identified 1943 genes related to cancer. BRCA2 (OMIM *600185), BRAF (OMIM *164757) and CDKN2A (OMIM *600160) were each related to 14 types of cancer. There were 45 genes related to cancer of the esophagus, 121 genes related to cancer of the stomach, and 21 genes related to both. If the two annotations were independent, fewer than three gene entries in OMIM would be expected to mention both, and the more than seven-fold discrepancy suggests cancers of the esophagus and stomach are more genetically related than the current literature suggests. Conclusion CGMIM identifies genetically-related cancers and cancer-related genes. Cancers found to share genetic etiology are anticipated to lead to further etiologic hypotheses and advances regarding environmental agents. CGMIM results are posted monthly and the source code can be obtained free of charge from the BC Cancer Research Centre website http://www.bccrc.ca/ccr/CGMIM.
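    The "fewer than three" figure follows from an independence assumption: with 1943 cancer-related genes, 45 linked to esophageal cancer and 121 to stomach cancer, the expected number of genes mentioning both is 45 x 121 / 1943, roughly 2.8, against 21 observed. A two-line check:

```python
# Worked check of the abstract's independence argument: if esophageal- and
# stomach-cancer annotations were independent across the 1943 cancer-related
# genes, how many genes would be expected to mention both?
genes_total = 1943
esophagus, stomach, observed_both = 45, 121, 21

expected_both = esophagus * stomach / genes_total   # ~2.8
fold = observed_both / expected_both                # ~7.5

print(f"expected under independence: {expected_both:.1f}")
print(f"observed: {observed_both} ({fold:.1f}-fold higher)")
```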

  16. Novel LanT associated lantibiotic clusters identified by genome database mining.

    Directory of Open Access Journals (Sweden)

    Mangal Singh

    Full Text Available BACKGROUND: Frequent use of antibiotics has led to the emergence of antibiotic resistance in bacteria. Lantibiotic compounds are ribosomally synthesized antimicrobial peptides against which bacteria are not able to produce resistance, hence making them a good alternative to antibiotics. Nisin, the oldest and most widely used lantibiotic, has been employed in food preservation without any significant resistance developing against it. Given their antimicrobial potential and limited number, there is a need to identify novel lantibiotics. METHODOLOGY/FINDINGS: Identification of novel lantibiotic biosynthetic clusters from an ever increasing database of bacterial genomes can provide a major lead in this direction. In order to achieve this, a strategy was adopted to identify novel lantibiotic biosynthetic clusters by screening the sequenced genomes for LanT homologs, LanT being a conserved lantibiotic transporter specific to type IB clusters. This strategy resulted in the identification of 54 bacterial strains containing LanT homologs that are not known lantibiotic producers. Of these, 24 strains were subjected to a detailed bioinformatic analysis to identify genes encoding precursor peptides, modification enzymes, and immunity and quorum sensing proteins. Eight clusters having two LanM determinants, similar to haloduracin and lichenicidin, were identified, along with 13 clusters having a single LanM determinant as in the mersacidin biosynthetic cluster. Besides these, orphan LanT homologs were also identified, which might be associated with novel bacteriocins encoded elsewhere in the genome. Three identified gene clusters had a C39 domain containing LanT transporter, associated with the LanBC proteins and double glycine type precursor peptides, the only known example of such a cluster being that of salivaricin. CONCLUSION: This study led to the identification of 8 novel putative two-component lantibiotic clusters along with 13 having a single LanM and
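    The screening step (finding LanT homologs across sequenced genomes) can be mimicked by filtering a standard blastp tabular report. The sketch below is an illustration under assumed file naming, identifier format, thresholds and producer list; it is not the authors' pipeline.

```python
# Hedged sketch of the screening step: given blastp results of a known LanT
# transporter against predicted proteomes (BLAST tabular output, -outfmt 6),
# keep strains with a sufficiently similar hit and drop known producers.
import csv

KNOWN_PRODUCERS = {"Lactococcus lactis", "Bacillus subtilis"}  # placeholder names

def candidate_strains(blast_tab: str, min_identity=30.0, min_length=300):
    """Yield subject identifiers whose LanT hit passes the thresholds."""
    with open(blast_tab) as handle:
        for row in csv.reader(handle, delimiter="\t"):
            # outfmt 6 columns: qseqid sseqid pident length mismatch gapopen
            #                   qstart qend sstart send evalue bitscore
            sseqid, pident, length = row[1], float(row[2]), int(row[3])
            strain = sseqid.split("|")[0]  # assumes "strain|protein" identifiers
            if (pident >= min_identity and length >= min_length
                    and strain not in KNOWN_PRODUCERS):
                yield strain

if __name__ == "__main__":
    print(sorted(set(candidate_strains("lanT_vs_genomes.blastp.tsv"))))
```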

  17. An Integrated Metabolomic and Genomic Mining Workflow to Uncover the Biosynthetic Potential of Bacteria

    DEFF Research Database (Denmark)

    Månsson, Maria; Vynne, Nikolaj Grønnegaard; Klitgaard, Andreas

    2016-01-01

    considerable diversity: only 2% of the chemical features and 7% of the biosynthetic genes were common to all strains, while 30% of all features and 24% of the genes were unique to single strains. The list of chemical features was reduced to 50 discriminating features using a genetic algorithm and support vector machines. Features were dereplicated by tandem mass spectrometry (MS/MS) networking to identify molecular families of the same biosynthetic origin, and the associated pathways were probed using comparative genomics. Most of the discriminating features were related to antibacterial compounds...

  18. Genome mining of mycosporine-like amino acid (MAA) synthesizing and non-synthesizing cyanobacteria: A bioinformatics study.

    Science.gov (United States)

    Singh, Shailendra P; Klisch, Manfred; Sinha, Rajeshwar P; Häder, Donat-P

    2010-02-01

    Mycosporine-like amino acids (MAAs) are a family of more than 20 compounds having absorption maxima between 310 and 362 nm. These compounds are well known for their UV-absorbing/screening role in various organisms and seem to have evolutionary significance. In the present investigation we tested four cyanobacteria, namely Anabaena variabilis PCC 7937, Anabaena sp. PCC 7120, Synechocystis sp. PCC 6803 and Synechococcus sp. PCC 6301, for their ability to synthesize MAAs and conducted genomic and phylogenetic analyses to identify the possible set of genes that might be involved in the biosynthesis of these compounds. Out of the four investigated species, only A. variabilis PCC 7937 was able to synthesize MAAs. Genome mining identified a combination of genes, YP_324358 (predicted DHQ synthase) and YP_324357 (O-methyltransferase), which were present only in A. variabilis PCC 7937 and missing in the other studied cyanobacteria. Phylogenetic analysis revealed that these two genes were transferred from a cyanobacterial donor to dinoflagellates and finally to metazoa by a lateral gene transfer event. All other cyanobacteria that have these two genes also had another copy of the DHQ synthase gene. The predicted protein structure for YP_324358 also suggested that this product is different from the chemically characterized DHQ synthase of Aspergillus nidulans, in contrast to YP_324879, which was predicted to be similar to the DHQ synthase. The present study provides a first insight into the genes of cyanobacteria involved in MAA biosynthesis and thus widens the field of research for molecular, bioinformatics and phylogenetic analysis of these evolutionarily and industrially important compounds. Based on the results we propose that the YP_324358 and YP_324357 gene products are involved in the biosynthesis of the common core (deoxygadusol) of all MAAs.
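    The comparative step (genes present in the MAA producer but absent from the non-producers) reduces to a set difference once presence/absence calls are available. A toy sketch, with placeholder gene labels standing in for the accessions named above:

```python
# Minimal sketch of comparative presence/absence mining: find genes present in
# the MAA-producing strain but absent from all non-producing genomes.
# The gene sets below are toy placeholders, not real annotation calls.
producer = "Anabaena variabilis PCC 7937"
gene_sets = {
    "Anabaena variabilis PCC 7937": {"pred_DHQ_synthase", "O_methyltransferase", "DHQ_synthase_2"},
    "Anabaena sp. PCC 7120":        {"DHQ_synthase_2"},
    "Synechocystis sp. PCC 6803":   {"DHQ_synthase_2"},
    "Synechococcus sp. PCC 6301":   set(),
}

non_producers = [genes for strain, genes in gene_sets.items() if strain != producer]
candidates = gene_sets[producer].difference(*non_producers)
print(sorted(candidates))  # -> ['O_methyltransferase', 'pred_DHQ_synthase']
```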

  19. Applied genomics: data mining reveals species-specific malaria diagnostic targets more sensitive than 18S rRNA.

    Science.gov (United States)

    Demas, Allison; Oberstaller, Jenna; DeBarry, Jeremy; Lucchi, Naomi W; Srinivasamoorthy, Ganesh; Sumari, Deborah; Kabanywanyi, Abdunoor M; Villegas, Leopoldo; Escalante, Ananias A; Kachur, S Patrick; Barnwell, John W; Peterson, David S; Udhayakumar, Venkatachalam; Kissinger, Jessica C

    2011-07-01

    Accurate and rapid diagnosis of malaria infections is crucial for implementing species-appropriate treatment and saving lives. Molecular diagnostic tools are the most accurate and sensitive method of detecting Plasmodium, differentiating between Plasmodium species, and detecting subclinical infections. Despite available whole-genome sequence data for Plasmodium falciparum and P. vivax, the majority of PCR-based methods still rely on the 18S rRNA gene targets. Historically, this gene has served as the best target for diagnostic assays. However, it is limited in its ability to detect mixed infections in multiplex assay platforms without the use of nested PCR. New diagnostic targets are needed. Ideal targets will be species specific, highly sensitive, and amenable to both single-step and multiplex PCRs. We have mined the genomes of P. falciparum and P. vivax to identify species-specific, repetitive sequences that serve as new PCR targets for the detection of malaria. We show that these targets (Pvr47 and Pfr364) exist in 14 to 41 copies and are more sensitive than 18S rRNA when utilized in a single-step PCR. Parasites are routinely detected at levels of 1 to 10 parasites/μl. The reaction can be multiplexed to detect both species in a single reaction. We have examined 7 P. falciparum strains and 91 P. falciparum clinical isolates from Tanzania and 10 P. vivax strains and 96 P. vivax clinical isolates from Venezuela, and we have verified a sensitivity and specificity of ∼100% for both targets compared with a nested 18S rRNA approach. We show that bioinformatics approaches can be successfully applied to identify novel diagnostic targets and improve molecular methods for pathogen detection. These novel targets provide a powerful alternative molecular diagnostic method for the detection of P. falciparum and P. vivax in conventional or multiplex PCR platforms.

  20. Transcriptome analysis in Concholepas concholepas (Gastropoda, Muricidae): mining and characterization of new genomic and molecular markers.

    Science.gov (United States)

    Cárdenas, Leyla; Sánchez, Roland; Gomez, Daniela; Fuenzalida, Gonzalo; Gallardo-Escárate, Cristián; Tanguy, Arnaud

    2011-09-01

    The marine gastropod Concholepas concholepas, locally known as the "loco", is the main target species of the benthonic Chilean fisheries. Genetic and genomic tools are necessary to study the genome of this species in order to understand the molecular basis of its development, growth, and other key traits to improve the management strategies and to identify local adaptation to prevent loss of biodiversity. Here, we use pyrosequencing technologies to generate the first transcriptomic database from adult specimens of the loco. After trimming, a total of 140,756 Expressed Sequence Tag sequences were achieved. Clustering and assembly analysis identified 19,219 contigs and 105,435 singleton sequences. BlastN analysis showed a significant identity with Expressed Sequence Tags of different gastropod species available in public databases. Similarly, BlastX results showed that only 895 out of the total 124,654 had significant hits and may represent novel genes for marine gastropods. From this database, simple sequence repeat motifs were also identified and a total of 38 primer pairs were designed and tested to assess their potential as informative markers and to investigate their cross-species amplification in different related gastropod species. This dataset represents the first publicly available 454 data for a marine gastropod endemic to the southeastern Pacific coast, providing a valuable transcriptomic resource for future efforts of gene discovery and development of functional markers in other marine gastropods.

  1. Retrofit design of a mine integrated automation system based on industrial Ethernet

    Institute of Scientific and Technical Information of China (English)

    卜玉明

    2016-01-01

    In the automation and informatization retrofitting of non-ferrous metallurgical mines, the lack of overall planning leads to independent systems that cannot communicate with each other, creating information islands in mine production, preventing centralized monitoring, and resulting in high construction investment and a heavy maintenance burden. Combining the requirements that the digital mine places on information construction, this paper presents a comprehensive automation retrofit design for non-ferrous metallurgical mines. Practice shows that the design solves the ERP/MES/PCS communication integration problem of metallurgical mines, realizes centralized monitoring of large metallurgical mines, reduces engineering investment and maintenance workload, and has great value for popularization and application.

  2. Identification of fluorinases from Streptomyces sp. MA37, Nocardia brasiliensis, and Actinoplanes sp. N902-109 by genome mining.

    Science.gov (United States)

    Deng, Hai; Ma, Long; Bandaranayaka, Nouchali; Qin, Zhiwei; Mann, Greg; Kyeremeh, Kwaku; Yu, Yi; Shepherd, Thomas; Naismith, James H; O'Hagan, David

    2014-02-10

    The fluorinase is an enzyme that catalyses the combination of S-adenosyl-L-methionine (SAM) and a fluoride ion to generate 5'-fluoro-5'-deoxyadenosine (FDA) and L-methionine through a nucleophilic substitution reaction with the fluoride ion as the nucleophile. It is the only native fluorination enzyme that has been characterised. The fluorinase was isolated in 2002 from Streptomyces cattleya, and, to date, this has been the only source of the fluorinase enzyme. Herein, we report three new fluorinase isolates that have been identified by genome mining. The novel fluorinases from Streptomyces sp. MA37, Nocardia brasiliensis, and an Actinoplanes sp. have high homology (80-87 % identity) to the original S. cattleya enzyme. They all possess a characteristic 21-residue loop. The three newly identified genes were overexpressed in E. coli and shown to be fluorination enzymes. An X-ray crystallographic study of the Streptomyces sp. MA37 enzyme demonstrated that it is almost identical in structure to the original fluorinase. Culturing of the Streptomyces sp. MA37 strain demonstrated that it not only elaborates the fluorometabolites fluoroacetate and 4-fluorothreonine, as S. cattleya does, but also produces a range of unidentified fluorometabolites. These are the first new fluorinases to be reported since the first isolate, over a decade ago, and their identification extends the range of fluorination genes available for fluorination biotechnology.

  3. Target recognition, resistance, immunity and genome mining of class II bacteriocins from Gram-positive bacteria.

    Science.gov (United States)

    Kjos, Morten; Borrero, Juan; Opsata, Mona; Birri, Dagim J; Holo, Helge; Cintas, Luis M; Snipen, Lars; Hernández, Pablo E; Nes, Ingolf F; Diep, Dzung B

    2011-12-01

    Due to their very potent antimicrobial activity against diverse food-spoiling bacteria and pathogens and their favourable biochemical properties, peptide bacteriocins from Gram-positive bacteria have long been considered promising for applications in food preservation or medical treatment. To take advantage of bacteriocins in different applications, it is crucial to have detailed knowledge on the molecular mechanisms by which these peptides recognize and kill target cells, how producer cells protect themselves from their own bacteriocin (self-immunity) and how target cells may develop resistance. In this review we discuss some important recent progress in these areas for the non-lantibiotic (class II) bacteriocins. We also discuss some examples of how the current wealth of genome sequences provides an invaluable source in the search for novel class II bacteriocins.

  4. Analysis of regulatory protease sequences identified through bioinformatic data mining of the Schistosoma mansoni genome

    Directory of Open Access Journals (Sweden)

    Minchella Dennis J

    2009-10-01

    Full Text Available Abstract Background New chemotherapeutic agents against Schistosoma mansoni, an etiological agent of human schistosomiasis, are a priority due to the emerging drug resistance and the inability of current drug treatments to prevent reinfection. Proteases have been under scrutiny as targets of immunological or chemotherapeutic anti-Schistosoma agents because of their vital role in many stages of the parasitic life cycle. Function has been established for only a handful of identified S. mansoni proteases, and the vast majority of these are the digestive proteases; very few of the conserved classes of regulatory proteases have been identified from Schistosoma species, despite their vital role in numerous cellular processes. To that end, we identified protease protein coding genes from the S. mansoni genome project and EST library. Results We identified 255 protease sequences from five catalytic classes using predicted proteins of the S. mansoni genome. The vast majority of these show significant similarity to proteins in KEGG and the Conserved Domain Database. Proteases include calpains, caspases, cytosolic and mitochondrial signal peptidases, proteases that interact with ubiquitin and ubiquitin-like molecules, and proteases that perform regulated intramembrane proteolysis. Comparative analysis of classes of important regulatory proteases find conserved active site domains, and where appropriate, signal peptides and transmembrane helices. Phylogenetic analysis provides support for inferring functional divergence among regulatory aspartic, cysteine, and serine proteases. Conclusion Numerous proteases are identified for the first time in S. mansoni. We characterized important regulatory proteases and focus analysis on these proteases to complement the growing knowledge base of digestive proteases. This work provides a foundation for expanding knowledge of proteases in Schistosoma species and examining their diverse function and potential as targets

  5. Genome mining of fungal lipid-degrading enzymes for industrial applications.

    Science.gov (United States)

    Vorapreeda, Tayvich; Thammarongtham, Chinae; Cheevadhanarak, Supapon; Laoteng, Kobkul

    2015-08-01

    Lipases are enzymes that play important roles in maintaining lipid homeostasis and cellular metabolism. Using available genome data, seven lipase families of oleaginous and non-oleaginous yeast and fungi were categorized based on the similarity of their amino acid sequences and conserved structural domains. Of them, the triacylglycerol lipase (patatin-domain-containing protein) and steryl ester hydrolase (abhydro_lipase-domain-containing protein) families were ubiquitous enzymes found in all species studied. The two essential lipase families showed signature characteristics of integral membrane proteins that might be targeted to lipid monolayer particles. At least one of the extracellular lipase families existed in each species of yeast and fungi. We found that the diversity of lipase families and the number of genes in individual families of oleaginous strains were greater than those identified in non-oleaginous species, which might play a role in nutrient acquisition from surrounding hydrophobic substrates and contribute to their obese phenotype. The gene/enzyme catalogue and related data on the lipases provided by this study are not only a valuable toolbox for investigating the biological roles of these lipases, but also hold potential for various industrial applications.

  6. Genomic mining for novel FADH₂-dependent halogenases in marine sponge-associated microbial consortia.

    Science.gov (United States)

    Bayer, Kristina; Scheuermayer, Matthias; Fieseler, Lars; Hentschel, Ute

    2013-02-01

    Many marine sponges (Porifera) are known to contain large amounts of phylogenetically diverse microorganisms. Sponges are also known for their large arsenal of natural products, many of which are halogenated. In this study, 36 different FADH₂-dependent halogenase gene fragments were amplified from various Caribbean and Mediterranean sponges using newly designed degenerate PCR primers. Four unique halogenase-positive fosmid clones, all containing the highly conserved amino acid motif "GxGxxG", were identified in the microbial metagenome of Aplysina aerophoba. Sequence analysis of one halogenase-bearing fosmid revealed notably two open reading frames with high homologies to efflux and multidrug resistance proteins. Single cell genomic analysis allowed for a taxonomic assignment of the halogenase genes to specific symbiotic lineages. Specifically, the halogenase cluster S1 is predicted to be produced by a deltaproteobacterial symbiont and halogenase cluster S2 by a poribacterial sponge symbiont. An additional halogenase gene is possibly produced by an actinobacterial symbiont of marine sponges. The identification of three novel, phylogenetically, and possibly also functionally distinct halogenase gene clusters indicates that the microbial consortia of sponges are a valuable resource for novel enzymes involved in halogenation reactions.

  7. In silico mining of putative microsatellite markers from whole genome sequence of water buffalo (Bubalus bubalis and development of first BuffSatDB

    Directory of Open Access Journals (Sweden)

    Sarika

    2013-01-01

    Full Text Available Abstract Background Although India has sequenced the water buffalo genome, the draft assembly is based on the cattle genome BTau 4.0, so a de novo chromosome-wise assembly remains a major pending task for the global community. The existing buffalo radiation hybrid panel and the STRs reported here can be used for final gap plugging and “finishing” of the expected de novo genome assembly. QTL and gene mapping require putative STRs mined from the buffalo genome at regular intervals on each chromosome. Such markers have a potential role in the improvement of desirable characteristics, such as high milk yield, disease resistance and high growth rate. Mining STRs from the whole genome and developing a user-friendly database were still needed to reap the benefits of the whole genome sequence. Description By in silico microsatellite mining of the whole genome, we have developed the first STR database of water buffalo, BuffSatDb (Buffalo MicroSatellite Database; http://cabindb.iasri.res.in/buffsatdb/), a web-based relational database of 910,529 microsatellite markers, developed using PHP and a MySQL database. The microsatellite markers were generated using the MIcroSAtellite tool. The database offers a simple and systematic web-based search for customised retrieval of chromosome-wise and genome-wide microsatellites. Searches can be based on chromosome, motif type (mono- to hexa-nucleotide), repeat motif and repeat kind (simple or composite), and may be further customised by limiting the location of the STR on the chromosome as well as the number of markers in that range. This is a novel approach that has not been implemented in any existing marker database. The database has been further appended with Primer3 for primer design of the selected markers, enabling researchers to select markers of choice at a desired interval over the chromosome. The unique add-on of degenerate bases further helps in resolving the presence of degenerate bases in the current buffalo assembly. Conclusion Being the first buffalo STR database in the world
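    The core mining step (locating perfect mono- to hexa-nucleotide repeats) can be sketched with regular expressions as below. This is not the MIcroSAtellite tool used for BuffSatDb, and the minimum repeat counts are illustrative assumptions.

```python
# Hedged sketch of microsatellite (STR) mining: find perfect mono- to
# hexa-nucleotide repeats in a DNA string with regular expressions.
import re

# minimum number of repeat units per motif length (illustrative thresholds)
MIN_REPEATS = {1: 10, 2: 6, 3: 5, 4: 5, 5: 5, 6: 5}

def find_ssrs(seq: str):
    seq = seq.upper()
    for motif_len, min_rep in MIN_REPEATS.items():
        # group 2 is the motif; \2{n,} requires at least min_rep total copies
        pattern = re.compile(r"(([ACGT]{%d})\2{%d,})" % (motif_len, min_rep - 1))
        for match in pattern.finditer(seq):
            repeat, motif = match.group(1), match.group(2)
            # skip homopolymers reported under a longer motif length, e.g. "AA"
            if motif_len > 1 and len(set(motif)) == 1:
                continue
            yield match.start(), motif, len(repeat) // motif_len

if __name__ == "__main__":
    demo = "GGCT" + "AT" * 8 + "CCGATG" + "CAG" * 6 + "TTT"
    for start, motif, copies in find_ssrs(demo):
        print(f"{motif} x {copies} at position {start}")
```

    Real pipelines also detect compound/composite repeats and normalise motifs to a canonical form, which this sketch omits.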

  8. Genome sequence of the copper resistant and acid-tolerant Desulfosporosinus sp. BG isolated from the tailings of a molybdenum-tungsten mine in the Transbaikal area

    Directory of Open Access Journals (Sweden)

    Olga V. Karnachuk

    2017-03-01

    Full Text Available Here, we report on the draft genome of a copper-resistant and acidophilic Desulfosporosinus sp. BG, isolated from the tailings of a molybdenum-tungsten mine in Transbaikal area. The draft genome has a size of 4.52 Mb and encodes transporters of heavy metals. The phylogenetic analysis based on concatenated ribosomal proteins revealed that strain BG clusters together with the other acidophilic copper-resistant strains Desulfosporosinus sp. OT and Desulfosporosinus sp. I2. The K+-ATPase, Na+/H+ antiporter and amino acid decarboxylases may participate in enabling growth at low pH. The draft genome sequence and annotation have been deposited at GenBank under the accession number NZ_MASS00000000.

  9. Process mining

    DEFF Research Database (Denmark)

    van der Aalst, W.M.P.; Rubin, V.; Verbeek, H.M.W.

    2010-01-01

    Process mining includes the automated discovery of processes from event logs. Based on observed events (e.g., activities being executed or messages being exchanged) a process model is constructed. One of the essential problems in process mining is that one cannot assume to have seen all possible behavior. At best, one has seen a representative subset. Therefore, classical synthesis techniques are not suitable as they aim at finding a model that is able to exactly reproduce the log. Existing process mining techniques try to avoid such “overfitting” by generalizing the model to allow for more behavior. This generalization is often driven by the representation language and very crude assumptions about completeness. As a result, parts of the model are “overfitting” (allow only for what has actually been observed) while other parts may be “underfitting” (allow for much more behavior without strong
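    Most discovery algorithms begin from the directly-follows relation extracted from the log, which is straightforward to compute; the sketch below uses a toy three-trace log and ignores case attributes and timestamps.

```python
# Minimal sketch of process discovery from an event log: build the
# directly-follows relation that many discovery algorithms start from.
# The log format (a list of traces, each a list of activity names) is a
# simplification of real logs, which carry case ids and timestamps.
from collections import Counter
from itertools import pairwise  # Python 3.10+

event_log = [
    ["register", "check", "decide", "notify"],
    ["register", "check", "recheck", "decide", "notify"],
    ["register", "decide", "notify"],
]

directly_follows = Counter(
    pair for trace in event_log for pair in pairwise(trace)
)

for (a, b), count in directly_follows.most_common():
    print(f"{a} -> {b}: {count}")
```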

  10. Asteroid mining

    Science.gov (United States)

    Gertsch, Richard E.

    1992-01-01

    The earliest studies of asteroid mining proposed retrieving a main belt asteroid. Because of the very long travel times to the main asteroid belt, attention has shifted to the asteroids whose orbits bring them fairly close to the Earth. In these schemes, the asteroids would be bagged and then processed during the return trip, with the asteroid itself providing the reaction mass to propel the mission homeward. A mission to one of these near-Earth asteroids would be shorter, involve less weight, and require a somewhat lower change in velocity. Since these asteroids apparently contain a wide range of potentially useful materials, our study group considered only them. The topics covered include asteroid materials and properties, asteroid mission selection, manned versus automated missions, mining in zero gravity, and a conceptual mining method.

  11. Quantification of Operational Risk Using A Data Mining

    Science.gov (United States)

    Perera, J. Sebastian

    1999-01-01

    What is Data Mining? - Data Mining is the process of finding actionable information hidden in raw data. - Data Mining helps find hidden patterns, trends, and important relationships often buried in a sea of data - Typically, automated software tools based on advanced statistical analysis and data modeling technology can be utilized to automate the data mining process

  12. A novel data mining method to identify assay-specific signatures in functional genomic studies

    Directory of Open Access Journals (Sweden)

    Guidarelli Jack W

    2006-08-01

    Full Text Available Abstract Background: The highly dimensional data produced by functional genomic (FG) studies makes it difficult to visualize relationships between gene products and experimental conditions (i.e., assays). Although dimensionality reduction methods such as principal component analysis (PCA) have been very useful, their application to identify assay-specific signatures has been limited by the lack of appropriate methodologies. This article proposes a new and powerful PCA-based method for the identification of assay-specific gene signatures in FG studies. Results: The proposed method (PM) is unique for several reasons. First, it is the only one, to our knowledge, that uses gene contribution, a product of the loading and expression level, to obtain assay signatures. The PM develops and exploits two types of assay-specific contribution plots, which are new to the application of PCA in the FG area. The first type plots the assay-specific gene contribution against the given order of the genes and reveals variations in distribution between assay-specific gene signatures as well as outliers within assay groups indicating the degree of importance of the most dominant genes. The second type plots the contribution of each gene in ascending or descending order against a constantly increasing index. This type of plot reveals assay-specific gene signatures defined by the inflection points in the curve. In addition, sharp regions within the signature define the genes that contribute the most to the signature. We proposed and used the curvature as an appropriate metric to characterize these sharp regions, thus identifying the subset of genes contributing the most to the signature. Finally, the PM uses the full dataset to determine the final gene signature, thus eliminating the chance of gene exclusion by poor screening in earlier steps. The strengths of the PM are demonstrated using a simulation study, and two studies of real DNA microarray data – a study of
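    The central quantity of the proposed method, gene contribution as the product of PCA loading and expression level, can be sketched as follows. The toy matrix, the choice of the first component and the ranking by absolute contribution are illustrative assumptions, not the paper's exact procedure.

```python
# Hedged sketch of the "gene contribution" idea: the contribution of a gene to
# an assay is taken as the product of its PCA loading and its expression level
# in that assay. Toy data only; not the published implementation.
import numpy as np

rng = np.random.default_rng(0)
n_genes, n_assays = 200, 6
X = rng.normal(size=(n_genes, n_assays))          # toy expression matrix

# PCA via SVD of the gene-centred matrix; columns of U are gene loadings.
Xc = X - X.mean(axis=1, keepdims=True)
U, S, Vt = np.linalg.svd(Xc, full_matrices=False)
loadings = U[:, 0]                                 # loadings on the first PC

# contribution[g, a] = loading[g] * expression[g, a]
contribution = loadings[:, None] * X

# assay-specific signature: genes ranked by |contribution| for one assay
assay = 0
top = np.argsort(-np.abs(contribution[:, assay]))[:10]
print("top contributing genes for assay", assay, ":", top)
```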

  13. Design of Integrated Automation System Platform of Coal Mine Based on Wonderware IAS and Its Application

    Institute of Scientific and Technical Information of China (English)

    顾对芳; 王为广; 李红霞; 丁继存

    2012-01-01

    Based on the actual situation of the Jining No. 3 Coal Mine, a design scheme for an integrated coal mine automation system platform based on Wonderware IAS is proposed. The platform structure, the development of the IAS, the communication methods with the interface protocols of the mine's other subsystems, and the HMI design method based on the InTouch configuration software are introduced. Practical application shows that the platform runs stably, effectively improves the degree of informatization of the coal mine, and overcomes the previous drawback of each system running and being managed separately, achieving the goal of increasing efficiency while reducing staff.

  14. Fish the ChIPs: a pipeline for automated genomic annotation of ChIP-Seq data

    Directory of Open Access Journals (Sweden)

    Minucci Saverio

    2011-10-01

    Full Text Available Abstract Background High-throughput sequencing is generating massive amounts of data at a pace that largely exceeds the throughput of data analysis routines. Here we introduce Fish the ChIPs (FC), a computational pipeline aimed at a broad public of users and designed to perform complete ChIP-Seq data analysis of an unlimited number of samples, thus increasing throughput, reproducibility and saving time. Results Starting from short read sequences, FC performs the following steps: 1) quality controls, 2) alignment to a reference genome, 3) peak calling, 4) genomic annotation, 5) generation of raw signal tracks for visualization on the UCSC and IGV genome browsers. FC exploits some of the fastest and most effective tools today available. Installation on a Mac platform requires very basic computational skills while configuration and usage are supported by a user-friendly graphic user interface. Alternatively, FC can be compiled from the source code on any Unix machine and then run with the possibility of customizing each single parameter through a simple configuration text file that can be generated using a dedicated user-friendly web-form. Considering the execution time, FC can be run on a desktop machine, even though the use of a computer cluster is recommended for analyses of large batches of data. FC is perfectly suited to work with data coming from Illumina Solexa Genome Analyzers or ABI SOLiD and its usage can potentially be extended to any sequencing platform. Conclusions Compared to existing tools, FC has two main advantages that make it suitable for a broad range of users. First of all, it can be installed and run by wet biologists on a Mac machine. Besides, it can handle an unlimited number of samples, being convenient for large analyses. In this context, computational biologists can increase reproducibility of their ChIP-Seq data analyses while saving time for downstream analyses. Reviewers This article was reviewed by Gavin Huttley, George
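    Step 4 (genomic annotation) typically amounts to assigning each called peak to the nearest gene feature. FC wraps dedicated tools for this, so the following is only a conceptual stand-in with toy coordinates and a hypothetical gene list.

```python
# Illustrative sketch of the genomic-annotation step: assign each peak to the
# gene with the nearest transcription start site (TSS) on the same chromosome.
# Peak and gene coordinates are toy data, not real annotation.
from bisect import bisect_left

genes = {  # chromosome -> sorted list of (TSS, gene name)
    "chr1": sorted([(11_873, "DDX11L1"), (65_419, "OR4F5"), (450_703, "geneX")]),
}
peaks = [("chr1", 64_000, 64_400), ("chr1", 300_000, 300_500)]

def nearest_gene(chrom, start, end):
    tss_list = genes.get(chrom, [])
    if not tss_list:
        return None
    mid = (start + end) // 2
    i = bisect_left(tss_list, (mid, ""))
    # the nearest TSS is either just before or just after the peak midpoint
    best = min(tss_list[max(0, i - 1):i + 1], key=lambda t: abs(t[0] - mid))
    return best[1], abs(best[0] - mid)

for chrom, start, end in peaks:
    print(chrom, start, end, "->", nearest_gene(chrom, start, end))
```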

  15. Commercial Data Mining Software

    Science.gov (United States)

    Zhang, Qingyu; Segall, Richard S.

    This chapter discusses selected commercial software for data mining, supercomputing data mining, text mining, and web mining. The selected software packages are compared on their features and also applied to available data sets. The software for data mining are SAS Enterprise Miner, Megaputer PolyAnalyst 5.0, PASW (formerly SPSS Clementine), IBM Intelligent Miner, and BioDiscovery GeneSight. The software for supercomputing are Avizo by Visualization Science Group and JMP Genomics from SAS Institute. The software for text mining are SAS Text Miner and Megaputer PolyAnalyst 5.0. The software for web mining are Megaputer PolyAnalyst and SPSS Clementine. Background on related literature and software is presented. Screen shots of each of the selected software are presented, as are conclusions and future directions.

  16. Draft Genome Sequence of Mesorhizobium sp. UFLA 01-765, a Multitolerant, Efficient Symbiont and Plant Growth-Promoting Strain Isolated from Zn-Mining Soil Using Leucaena leucocephala as a Trap Plant.

    Science.gov (United States)

    Rangel, Wesley Melo; Thijs, Sofie; Moreira, Fatima Maria de Souza; Weyens, Nele; Vangronsveld, Jaco; Van Hamme, Jonathan D; Bottos, Eric M; Rineau, Francois

    2016-03-10

    We report the 7.4-Mb draft genome sequence of Mesorhizobium sp. strain UFLA 01-765, a Gram-negative bacterium of the Phyllobacteriaceae isolated from Zn-mining soil in Minas Gerais, Brazil. This strain promotes plant growth, efficiently fixes N2 in symbiosis with Leucaena leucocephala on multicontaminated soil, and has potential for application in bioremediation of marginal lands.

  17. Genome mining of the genetic diversity in the Aspergillus genus - from a collection of more than 30 Aspergillus species

    DEFF Research Database (Denmark)

    Rasmussen, Jane Lind Nybo; Vesth, Tammi Camilla; Theobald, Sebastian;

    In the era of high-throughput sequencing, comparative genomics can be applied for evaluating species diversity. In this project we aim to compare the genomes of 300 species of filamentous fungi from the Aspergillus genus, a complex task. To be able to define species, clade, and core features, this project uses BLAST on the amino acid level to discover orthologs. With a potential of 300 Aspergillus species each having ~12,000 annotated genes, traditional clustering will demand supercomputing. Instead, our approach reduces the search space by identifying isoenzymes within each genome creating intragenomic protein families (iPFs), and then connecting iPFs across all genomes. The initial findings in a set of 31 species show that ~48% of the annotated genes are core genes (genes shared between all species) and 2-24% of the genes are defining the individual species. The methods presented here

  18. Identification of novel target genes for safer and more specific control of root-knot nematodes from a pan-genome mining.

    Directory of Open Access Journals (Sweden)

    Etienne G J Danchin

    2013-10-01

    Full Text Available Root-knot nematodes are globally the most aggressive and damaging plant-parasitic nematodes. Chemical nematicides have so far constituted the most efficient control measures against these agricultural pests. Because of their toxicity for the environment and danger for human health, these nematicides have now been banned from use. Consequently, new and more specific control means, safe for the environment and human health, are urgently needed to avoid worldwide proliferation of these devastating plant-parasites. Mining the genomes of root-knot nematodes through an evolutionary and comparative genomics approach, we identified and analyzed 15,952 nematode genes conserved in genomes of plant-damaging species but absent from non target genomes of chordates, plants, annelids, insect pollinators and mollusks. Functional annotation of the corresponding proteins revealed a relative abundance of putative transcription factors in this parasite-specific set compared to whole proteomes of root-knot nematodes. This may point to important and specific regulators of genes involved in parasitism. Because these nematodes are known to secrete effector proteins in planta, essential for parasitism, we searched and identified 993 such effector-like proteins absent from non-target species. Aiming at identifying novel targets for the development of future control methods, we biologically tested the effect of inactivation of the corresponding genes through RNA interference. A total of 15 novel effector-like proteins and one putative transcription factor compatible with the design of siRNAs were present as non-redundant genes and had transcriptional support in the model root-knot nematode Meloidogyne incognita. Infestation assays with siRNA-treated M. incognita on tomato plants showed significant and reproducible reduction of the infestation for 12 of the 16 tested genes compared to control nematodes. These 12 novel genes, showing efficient reduction of parasitism when

  19. Identification of novel target genes for safer and more specific control of root-knot nematodes from a pan-genome mining.

    Science.gov (United States)

    Danchin, Etienne G J; Arguel, Marie-Jeanne; Campan-Fournier, Amandine; Perfus-Barbeoch, Laetitia; Magliano, Marc; Rosso, Marie-Noëlle; Da Rocha, Martine; Da Silva, Corinne; Nottet, Nicolas; Labadie, Karine; Guy, Julie; Artiguenave, François; Abad, Pierre

    2013-10-01

    Root-knot nematodes are globally the most aggressive and damaging plant-parasitic nematodes. Chemical nematicides have so far constituted the most efficient control measures against these agricultural pests. Because of their toxicity for the environment and danger for human health, these nematicides have now been banned from use. Consequently, new and more specific control means, safe for the environment and human health, are urgently needed to avoid worldwide proliferation of these devastating plant-parasites. Mining the genomes of root-knot nematodes through an evolutionary and comparative genomics approach, we identified and analyzed 15,952 nematode genes conserved in genomes of plant-damaging species but absent from non target genomes of chordates, plants, annelids, insect pollinators and mollusks. Functional annotation of the corresponding proteins revealed a relative abundance of putative transcription factors in this parasite-specific set compared to whole proteomes of root-knot nematodes. This may point to important and specific regulators of genes involved in parasitism. Because these nematodes are known to secrete effector proteins in planta, essential for parasitism, we searched and identified 993 such effector-like proteins absent from non-target species. Aiming at identifying novel targets for the development of future control methods, we biologically tested the effect of inactivation of the corresponding genes through RNA interference. A total of 15 novel effector-like proteins and one putative transcription factor compatible with the design of siRNAs were present as non-redundant genes and had transcriptional support in the model root-knot nematode Meloidogyne incognita. Infestation assays with siRNA-treated M. incognita on tomato plants showed significant and reproducible reduction of the infestation for 12 of the 16 tested genes compared to control nematodes. These 12 novel genes, showing efficient reduction of parasitism when silenced, constitute

  20. Integrative Genomic Data Mining for Discovery of Potential Blood-Borne Biomarkers for Early Diagnosis of Cancer

    OpenAIRE

    Yongliang Yang; Pavel Pospisil; Iyer, Lakshmanan K.; S. James Adelstein; Amin I. Kassis

    2008-01-01

    BACKGROUND: With the arrival of the postgenomic era, there is increasing interest in the discovery of biomarkers for the accurate diagnosis, prognosis, and early detection of cancer. Blood-borne cancer markers are favored by clinicians, because blood samples can be obtained and analyzed with relative ease. We have used a combined mining strategy based on an integrated cancer microarray platform, Oncomine, and the biomarker module of the Ingenuity Pathways Analysis (IPA) program to identify po...

  1. A Study of the Service Level Agreement for an Integrated Mine-Wide Automation System

    Institute of Scientific and Technical Information of China (English)

    陈建伟; 竺金光

    2011-01-01

    To address the current problems of disordered service levels and long service cycles in integrated mine-wide automation systems, a scheme is proposed that uses service level agreements (SLAs) to classify and grade the management of during-sale and after-sale services. The classified management and content of the service level agreement are introduced in detail from five aspects: installation and commissioning services, system promotion services, customization and upgrade services, technical support services, and daily operation services. The service level agreement carries the ideas of IT service management into the contract signing and implementation process of an integrated mine-wide automation system, and will play a significant role in the contract implementation period, project acceptance, and contract payment collection.

  2. Automated cell analysis tool for a genome-wide RNAi screen with support vector machine based supervised learning

    Science.gov (United States)

    Remmele, Steffen; Ritzerfeld, Julia; Nickel, Walter; Hesser, Jürgen

    2011-03-01

    RNAi-based high-throughput microscopy screens have become an important tool in the biological sciences for deciphering the mostly unknown biological functions of human genes. However, manual analysis is impossible for such screens, since the number of image data sets can often be in the hundreds of thousands. Reliable automated tools are thus required to analyse fluorescence microscopy image data sets usually containing two or more reaction channels. The image analysis tool presented here is designed to analyse an RNAi screen investigating the intracellular trafficking and targeting of acylated Src kinases. In this specific screen, a data set consists of three reaction channels, and the investigated cells can appear in different phenotypes. The main issues of the image processing task are an automatic cell segmentation, which has to be robust and accurate for all the different phenotypes, and a successive phenotype classification. The cell segmentation is done in two steps, by segmenting the cell nuclei first and then using a classifier-enhanced region growing on the basis of the cell nuclei to segment the cells. The classification of the cells is realized by a support vector machine, which has to be trained manually using supervised learning. Furthermore, the tool is brightness invariant, allowing different staining qualities, and it provides a quality control that copes with typical defects during preparation and acquisition. A first version of the tool has already been successfully applied to an RNAi screen containing three hundred thousand image data sets, and the SVM-extended version is designed for additional screens.
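
    The two-step segmentation (nuclei first, then cell regions grown from the nuclei) followed by SVM phenotype classification can be illustrated with a minimal sketch. This is not the authors' tool: it assumes two single-channel floating-point images and manually labelled feature vectors, and it substitutes Otsu thresholding plus a nucleus-seeded watershed for the classifier-enhanced region growing described above.

        # Minimal sketch of nucleus-seeded cell segmentation followed by SVM
        # phenotype classification. Illustrative only; image arrays and
        # training labels are assumed inputs, not the original screen data.
        import numpy as np
        from skimage.filters import threshold_otsu
        from skimage.measure import label, regionprops
        from skimage.segmentation import watershed
        from sklearn.svm import SVC

        def segment_cells(nucleus_channel, cell_channel):
            """Step 1: label nuclei; step 2: grow cell regions from the nuclei."""
            nuclei = label(nucleus_channel > threshold_otsu(nucleus_channel))
            cell_mask = cell_channel > threshold_otsu(cell_channel)
            # Watershed from nucleus seeds stands in for the classifier-enhanced
            # region growing used in the original tool.
            return watershed(-cell_channel, markers=nuclei, mask=cell_mask)

        def cell_features(cells, intensity_image):
            """Simple per-cell features: area, mean intensity, eccentricity."""
            props = regionprops(cells, intensity_image=intensity_image)
            return np.array([[p.area, p.mean_intensity, p.eccentricity] for p in props])

        def train_phenotype_classifier(features, labels):
            """Supervised phenotype classification; labels come from manual annotation."""
            clf = SVC(kernel="rbf", C=1.0, gamma="scale")
            clf.fit(features, labels)
            return clf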

  3. Text Mining Applications and Theory

    CERN Document Server

    Berry, Michael W

    2010-01-01

    Text Mining: Applications and Theory presents the state-of-the-art algorithms for text mining from both the academic and industrial perspectives.  The contributors span several countries and scientific domains: universities, industrial corporations, and government laboratories, and demonstrate the use of techniques from machine learning, knowledge discovery, natural language processing and information retrieval to design computational models for automated text analysis and mining. This volume demonstrates how advancements in the fields of applied mathematics, computer science, machine learning

  4. Genome mining of the sordarin biosynthetic gene cluster from Sordaria araneosa Cain ATCC 36386: characterization of cycloaraneosene synthase and GDP-6-deoxyaltrose transferase.

    Science.gov (United States)

    Kudo, Fumitaka; Matsuura, Yasunori; Hayashi, Takaaki; Fukushima, Masayuki; Eguchi, Tadashi

    2016-07-01

    Sordarin is a glycoside antibiotic with a unique tetracyclic diterpene aglycone structure called sordaricin. To understand its intriguing biosynthetic pathway, which may include a Diels-Alder-type [4+2] cycloaddition, genome mining of the gene cluster from the draft genome sequence of the producer strain, Sordaria araneosa Cain ATCC 36386, was carried out. A contiguous 67 kb gene cluster consisting of 20 open reading frames, encoding a putative diterpene cyclase, a glycosyltransferase, a type I polyketide synthase, and six cytochrome P450 monooxygenases, was identified. In vitro enzymatic analysis of the putative diterpene cyclase SdnA showed that it catalyzes the transformation of geranylgeranyl diphosphate to cycloaraneosene, a known biosynthetic intermediate of sordarin. Furthermore, a putative glycosyltransferase SdnJ was found to catalyze the glycosylation of sordaricin in the presence of GDP-6-deoxy-d-altrose to give 4'-O-demethylsordarin. These results suggest that the identified sdn gene cluster is responsible for the biosynthesis of sordarin. Based on the isolated potential biosynthetic intermediates and bioinformatics analysis, a plausible biosynthetic pathway for sordarin is proposed.

  5. Genomic data mining of the marine actinobacteria Streptomyces sp. H-KF8 unveils insights into multi-stress related genes and metabolic pathways involved in antimicrobial synthesis.

    Science.gov (United States)

    Undabarrena, Agustina; Ugalde, Juan A; Seeger, Michael; Cámara, Beatriz

    2017-01-01

    Streptomyces sp. H-KF8 is an actinobacterial strain isolated from marine sediments of a Chilean Patagonian fjord. Morphological characterization together with antibacterial activity was assessed in various culture media, revealing a carbon-source-dependent activity mainly against Gram-positive bacteria (S. aureus and L. monocytogenes). Genome mining of this antibacterial-producing bacterium revealed the presence of 26 biosynthetic gene clusters (BGCs) for secondary metabolites, 81% of which have low similarity to known BGCs. In addition, a genomic search in Streptomyces sp. H-KF8 unveiled the presence of a wide variety of genetic determinants related to heavy metal resistance (49 genes), oxidative stress (69 genes) and antibiotic resistance (97 genes). This study revealed that the marine-derived Streptomyces sp. H-KF8 bacterium has the capability to tolerate a diverse set of heavy metals such as copper, cobalt, mercury, chromate and nickel, as well as the highly toxic tellurite, a feature described here for the first time for Streptomyces. In addition, Streptomyces sp. H-KF8 possesses greater resistance to oxidative stress than the soil reference strain Streptomyces violaceoruber A3(2). Moreover, Streptomyces sp. H-KF8 showed resistance to 88% of the antibiotics tested, indicating, overall, a strong response to several abiotic stressors. The combination of these biological traits confirms the metabolic versatility of Streptomyces sp. H-KF8, a genetically well-prepared microorganism with the ability to confront the dynamics of the fjord-unique marine environment.

  6. CSIR: Mining Technology annual review 1996/97

    Energy Technology Data Exchange (ETDEWEB)

    NONE

    1997-12-31

    CSIR: Mining Technology works in close collaboration and strategic partnership with the mining industry, government institutions and employee organizations by acquiring, developing and transferring technologies to improve the safety and health of their employees, and to improve the profitability of the mining industry. The annual report describes achievements over the year in the areas of: rock engineering (including rockburst control, mine layout, stope and gully support, and coal mining); environmental safety and health, on topics such as occupational hygiene services, methane explosions and blasting techniques; and mining systems (orebody information, hydraulic transport, mine mechanization, engineering design and automation, and mine services). A list of Mining Technology's 1996/97 publications is given.

  7. In Silico Mining of Microsatellites in Coding Sequences of the Date Palm (Arecaceae) Genome, Characterization, and Transferability

    Directory of Open Access Journals (Sweden)

    Frédérique Aberlenc-Bertossi

    2014-01-01

    Premise of the study: To complement existing sets of primarily dinucleotide microsatellite loci from noncoding sequences of date palm, we developed primers for tri- and hexanucleotide microsatellite loci identified within genes. Due to their conserved genomic locations, the primers should be useful in other palm taxa, and their utility was tested in seven other Phoenix species and in Chamaerops, Livistona, and Hyphaene. Methods and Results: Tandem repeat motifs of 3–6 bp were searched using a simple sequence repeat (SSR) pipeline package in coding portions of the date palm draft genome sequence. Fifteen loci produced highly consistent amplification, intraspecific polymorphisms, and stepwise mutation patterns. Conclusions: These microsatellite loci showed sufficient levels of variability and transferability to make them useful for population genetic, selection signature, and interspecific gene flow studies in Phoenix and other Coryphoideae genera.
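
    A tri- to hexanucleotide tandem-repeat search of the kind described here can be sketched with a short regular-expression scan. This is a generic illustration, not the SSR pipeline used in the study; the minimum of four tandem copies is an arbitrary assumption.

        # Generic microsatellite (SSR) scan for 3-6 bp motifs repeated in tandem.
        # The motif-length range follows the abstract; the minimum copy number
        # is an arbitrary assumption for the example.
        import re

        def find_ssrs(sequence, min_motif=3, max_motif=6, min_repeats=4):
            """Return (start, motif, copy_number) for perfect tandem repeats."""
            hits = []
            for k in range(min_motif, max_motif + 1):
                # ([ACGT]{k}) captures a motif; \1{n,} requires further tandem copies.
                pattern = re.compile(r"([ACGT]{%d})\1{%d,}" % (k, min_repeats - 1))
                for m in pattern.finditer(sequence.upper()):
                    motif = m.group(1)
                    copies = len(m.group(0)) // k
                    hits.append((m.start(), motif, copies))
            return hits

        print(find_ssrs("ATGCAGCAGCAGCAGCAGTTTGACGACGACGACTGA"))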

  8. High-quality draft genome sequence of Kocuria marina SO9-6, an actinobacterium isolated from a copper mine

    Science.gov (United States)

    Castro, Daniel B.A.; Pereira, Letícia Bianca; Silva, Marcus Vinícius M. e; Silva, Bárbara P. da; Palermo, Bruna Rafaella Z.; Carlos, Camila; Belgini, Daiane R.B.; Limache, Elmer Erasmo G.; Lacerda, Gileno V. Jr; Nery, Mariana B.P.; Gomes, Milene B.; Souza, Salatiel S. de; Silva, Thiago M. da; Rodrigues, Viviane D.; Paulino, Luciana C.; Vicentini, Renato; Ferraz, Lúcio F.C.; Ottoboni, Laura M.M.

    2015-01-01

    An actinobacterial strain, designated SO9-6, was isolated from a copper iron sulfide mineral. The organism is Gram-positive, facultatively anaerobic, and coccoid. Chemotaxonomic and phylogenetic properties were consistent with its classification in the genus Kocuria. Here, we report the first draft genome sequence of Kocuria marina SO9-6 under accession JROM00000000 (http://www.ncbi.nlm.nih.gov/nuccore/725823918), which provides insights for heavy metal bioremediation and production of compounds of biotechnological interest. PMID:26484219

  9. Integrative genomic data mining for discovery of potential blood-borne biomarkers for early diagnosis of cancer.

    Directory of Open Access Journals (Sweden)

    Yongliang Yang

    BACKGROUND: With the arrival of the postgenomic era, there is increasing interest in the discovery of biomarkers for the accurate diagnosis, prognosis, and early detection of cancer. Blood-borne cancer markers are favored by clinicians, because blood samples can be obtained and analyzed with relative ease. We have used a combined mining strategy based on an integrated cancer microarray platform, Oncomine, and the biomarker module of the Ingenuity Pathways Analysis (IPA) program to identify potential blood-based markers for six common human cancer types. METHODOLOGY/PRINCIPAL FINDINGS: In the Oncomine platform, the genes overexpressed in cancer tissues relative to their corresponding normal tissues were filtered by Gene Ontology keywords, with the extracellular environment stipulated and a corrected Q value (false discovery rate) cut-off implemented. The identified genes were imported into the IPA biomarker module to separate out those genes encoding putative secreted or cell-surface proteins as blood-borne (blood/serum/plasma) cancer markers. The filtered potential indicators were ranked and prioritized according to normalized absolute Student t values. The retrieval of numerous marker genes that are already clinically useful or under active investigation confirmed the effectiveness of our mining strategy. To identify the biomarkers that are unique for each cancer type, the upregulated marker genes in common between each two tumor types across the six human tumors were also analyzed by the IPA biomarker comparison function. CONCLUSION/SIGNIFICANCE: The upregulated marker genes shared among the six cancer types may serve as a molecular tool to complement histopathologic examination, and the combination of the commonly upregulated and unique biomarkers may serve as differentiating markers for a specific cancer. This approach will be increasingly useful to discover diagnostic signatures as the mass of microarray data continues to grow in the
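
    The filter-and-rank logic of this mining strategy (keep over-expressed genes that pass a false-discovery-rate cut-off and carry an "extracellular" annotation, then prioritize by absolute t statistic) can be sketched generically with pandas. The column names and the 0.05 cut-off below are assumptions for illustration, not the study's actual Oncomine/IPA parameters.

        # Generic sketch of the filter-and-rank step; not Oncomine or IPA code.
        # Assumed columns: gene, t_stat (tumor vs normal), q_value (FDR),
        # go_terms (Gene Ontology annotations as a free-text string).
        import pandas as pd

        def rank_candidate_markers(df, q_cutoff=0.05, keyword="extracellular"):
            overexpressed = df[(df["t_stat"] > 0) & (df["q_value"] < q_cutoff)]
            secreted = overexpressed[
                overexpressed["go_terms"].str.contains(keyword, case=False, na=False)
            ]
            # Prioritize by normalized absolute Student t value, as in the abstract.
            return secreted.assign(abs_t=secreted["t_stat"].abs()).sort_values(
                "abs_t", ascending=False
            )

        genes = pd.DataFrame({
            "gene": ["MUC1", "GAPDH", "SPP1"],
            "t_stat": [8.2, 1.1, 6.7],
            "q_value": [0.001, 0.2, 0.004],
            "go_terms": ["extracellular region", "cytoplasm", "extracellular matrix"],
        })
        print(rank_candidate_markers(genes)[["gene", "abs_t"]])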

  10. Warehouse automation

    OpenAIRE

    Pogačnik, Jure

    2017-01-01

    An automated high-bay warehouse is commonly used for storing a large number of materials with high throughput. In an automated warehouse, pallet movements are mainly performed by a number of automated devices such as conveyor systems, trolleys, and stacker cranes. From the introduction of the material into the automated warehouse system to its dispatch, the system requires no operator input or intervention, since all material movements are done automatically. This allows the automated warehouse to op...

  11. MacSyFinder: a program to mine genomes for molecular systems with an application to CRISPR-Cas systems.

    Directory of Open Access Journals (Sweden)

    Sophie S Abby

    Biologists often wish to use their knowledge of a few experimental models of a given molecular system to identify homologs in genomic data. We developed a generic tool for this purpose. Macromolecular System Finder (MacSyFinder) provides a flexible framework to model the properties of molecular systems (cellular machinery or pathways), including their components, evolutionary associations with other systems and genetic architecture. Modelled features also include functional analogs and the multiple uses of the same component by different systems. Models are used to search for molecular systems in complete genomes or in unstructured data like metagenomes. The components of the systems are searched for by sequence similarity using hidden Markov model (HMM) protein profiles. The assignment of hits to a given system is decided based on compliance with the content and organization of the system model. A graphical interface, MacSyView, facilitates the analysis of the results by showing overviews of component content and genomic context. To exemplify the use of MacSyFinder we built models to detect and classify CRISPR-Cas systems following a previously established classification. We show that MacSyFinder makes it easy to define an accurate "Cas-finder" using publicly available protein profiles. MacSyFinder is a standalone application implemented in Python. It requires Python 2.7, Hmmer and makeblastdb (version 2.2.28 or higher). It is freely available with its source code under a GPLv3 license at https://github.com/gem-pasteur/macsyfinder. It is compatible with all platforms supporting Python and Hmmer/makeblastdb. The "Cas-finder" (models and HMM profiles) is distributed as a compressed tarball archive as Supporting Information.
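
    The core decision step, accepting a candidate system only when the HMM hits satisfy the model's content rules, can be illustrated with a small generic sketch. This is not MacSyFinder code; the system definition, E-value threshold and quorum rule below are hypothetical.

        # Generic sketch of model-based assignment of HMM hits to a molecular
        # system, in the spirit of (but not taken from) MacSyFinder. The system
        # definition and thresholds below are hypothetical placeholders.
        SYSTEM_MODEL = {
            "name": "cas_system_example",
            "mandatory": {"cas1", "cas2"},
            "accessory": {"cas4", "cas6"},
            "min_components": 3,   # minimal quorum of distinct components
        }

        def assign_system(hmm_hits, model=SYSTEM_MODEL):
            """hmm_hits: dict gene_name -> best E-value from an HMM profile search."""
            present = {g for g, evalue in hmm_hits.items() if evalue < 1e-5}
            components = present & (model["mandatory"] | model["accessory"])
            has_mandatory = model["mandatory"].issubset(present)
            if has_mandatory and len(components) >= model["min_components"]:
                return model["name"], sorted(components)
            return None, sorted(components)

        print(assign_system({"cas1": 1e-40, "cas2": 3e-12, "cas6": 2e-9, "recA": 1e-80}))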

  12. Mining and robotized equipment

    Energy Technology Data Exchange (ETDEWEB)

    Krisztian, B.

    1984-01-01

    The general concepts concerning the expediency of using industrial robots in mining, and the most rational fields for their use, are presented. Achievements in creating industrial robots for the needs of the mining industry in the USSR, Sweden (the ASEA Company), the United States (the Westinghouse Electric and Cincinnati Milacron Companies) and Japan (the Fujitsu Fanuc Company) are noted. The necessity, in a whole number of cases, of fundamentally restructuring production processes for the planned introduction of industrial robots into mining enterprises is stressed. Questions associated with the changes that must be introduced into systems for automating industrial processes when industrial robots are introduced into them are also discussed. The prospects for the development, creation and introduction of industrial robots in the Hungarian mining industry are indicated in conclusion.

  13. Transcriptome Analysis of Two Vicia sativa Subspecies: Mining Molecular Markers to Enhance Genomic Resources for Vetch Improvement

    Directory of Open Access Journals (Sweden)

    Tae-Sung Kim

    2015-11-01

    The vetch (Vicia sativa) is one of the most important annual forage legumes globally due to its multiple uses and high nutritional content. Despite these agronomic benefits, many drawbacks, including a cyano-alanine toxin, have reduced the agronomic value of vetch varieties. Here, we used 454 technology to sequence the two V. sativa subspecies (ssp. sativa and ssp. nigra) to enrich functional information and genetic marker resources for the vetch research community. A total of 86,532 and 47,103 reads produced 35,202 and 18,808 unigenes with average lengths of 735 and 601 bp for V. sativa sativa and V. sativa nigra, respectively. Gene Ontology annotations and the cluster of orthologous gene classes were used to annotate the function of the Vicia transcriptomes. The Vicia transcriptome sequences were then mined for simple sequence repeat (SSR) and single nucleotide polymorphism (SNP) markers. About 13% and 3% of the Vicia unigenes contained putative SSR and SNP sequences, respectively. Among those SSRs, 100 were chosen for validation and a polymorphism test using the Vicia germplasm set. Thus, our approach takes advantage of the utility of transcriptomic data to expedite a vetch breeding program.

  14. Research on Text Data Mining in Human Genome Sequencing

    Institute of Scientific and Technical Information of China (English)

    于跃; 潘玮; 王丽伟; 王伟

    2012-01-01

    Literature related to human genome sequencing published between 1 January 2001 and 11 May 2011 was retrieved from the PubMed database. Bibliographic information was extracted and subjected to co-word cluster analysis: high-frequency subject headings were extracted, and a term-document matrix, a co-occurrence matrix and co-word clusters were generated. The results show that text data mining can effectively reflect the development status and research hotspots of a discipline, and can thus provide valuable information to researchers.

  15. Genome mining and metabolic profiling of the rhizosphere bacterium Pseudomonas sp. SH-C52 for antimicrobial compounds

    Directory of Open Access Journals (Sweden)

    Menno van der Voort

    2015-07-01

    The plant microbiome represents an enormous untapped resource for discovering novel genes and bioactive compounds. Previously, we isolated Pseudomonas sp. SH-C52 from the rhizosphere of sugar beet plants grown in a soil suppressive to the fungal pathogen Rhizoctonia solani and showed that its antifungal activity is, in part, attributed to the production of the chlorinated 9-amino-acid lipopeptide thanamycin (Mendes et al., 2011, Science). To get more insight into its biosynthetic repertoire, the genome of Pseudomonas sp. SH-C52 was sequenced and subjected to in silico, mutational and functional analyses. The sequencing revealed a genome size of 6.3 Mb and 5,579 predicted ORFs. Phylogenetic analysis placed strain SH-C52 within the Pseudomonas corrugata clade. In silico analysis for secondary metabolites revealed a total of six nonribosomal peptide synthetase (NRPS) gene clusters, including the two previously described NRPS clusters for thanamycin and the two-amino-acid antibacterial lipopeptide brabantamide. Here we show that thanamycin also has activity against an array of other fungi and that brabantamide A exhibits anti-oomycete activity and affects phospholipases of the late blight pathogen Phytophthora infestans. Most notably, mass spectrometry led to the discovery of a third lipopeptide, designated thanapeptin, with a 22-amino-acid peptide moiety. Seven structural variants of thanapeptin were found with varying degrees of activity against P. infestans. Of the remaining four NRPS clusters, one was predicted to encode yet another, as yet unknown, lipopeptide with a predicted peptide moiety of 8 amino acids. Collectively, these results show an enormous metabolic potential for Pseudomonas sp. SH-C52, with at least three structurally diverse lipopeptides, each with a different antimicrobial activity spectrum.

  16. iSubgraph: integrative genomics for subgroup discovery in hepatocellular carcinoma using graph mining and mixture models.

    Directory of Open Access Journals (Sweden)

    Bahadir Ozdemir

    High tumor heterogeneity makes it very challenging to identify key tumorigenic pathways as therapeutic targets. The integration of multiple omics data is a promising approach to identify driving regulatory networks in patient subgroups. Here, we propose a novel conceptual framework to discover patterns of miRNA-gene networks observed frequently up- or down-regulated in a group of patients and to use such networks for patient stratification in hepatocellular carcinoma (HCC). We developed an integrative subgraph mining approach, called iSubgraph, and identified altered regulatory networks frequently observed in HCC patients. The miRNA and gene expression profiles were jointly analyzed in a graph structure. We defined a method to transform microarray data into a graph representation that encodes miRNA and gene expression levels and the interactions between them as well. The iSubgraph algorithm was capable of detecting cooperative regulation of miRNAs and genes even if it occurred only in some patients. Next, the miRNA-mRNA modules were used in an unsupervised class prediction model to discover HCC subgroups via patient clustering by mixture models. The robustness analysis of the mixture model showed that the class predictions are highly stable. Moreover, Kaplan-Meier survival analysis revealed that the HCC subgroups identified by the algorithm have different survival characteristics. The pathway analyses of the miRNA-mRNA co-modules identified by the algorithm demonstrate key roles of Myc, E2F1, let-7, TGFB1, TNF and EGFR in HCC subgroups. Thus, our method can integrate various omics data derived from different platforms and with different dynamic scales to better define molecular tumor subtypes. iSubgraph is available as MATLAB code at http://www.cs.umd.edu/~ozdemir/isubgraph/.
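
    The transformation of paired miRNA and gene expression profiles into a graph, with discretized expression levels encoded on the nodes and regulatory interactions as edges, can be sketched as follows. This is an illustrative reconstruction in Python, not the iSubgraph MATLAB implementation; the discretization thresholds and interaction list are assumptions.

        # Sketch of encoding per-patient miRNA/gene expression and their
        # interactions as a labelled graph, in the spirit of the approach above.
        # Thresholds and the interaction list are placeholder assumptions.
        import networkx as nx

        def discretize(value, low=-1.0, high=1.0):
            """Map a log-ratio expression value to 'down', 'normal' or 'up'."""
            if value <= low:
                return "down"
            if value >= high:
                return "up"
            return "normal"

        def build_patient_graph(mirna_expr, gene_expr, interactions):
            """interactions: iterable of (miRNA, target gene) pairs."""
            g = nx.Graph()
            for name, value in mirna_expr.items():
                g.add_node(name, kind="miRNA", level=discretize(value))
            for name, value in gene_expr.items():
                g.add_node(name, kind="gene", level=discretize(value))
            g.add_edges_from((m, t) for m, t in interactions
                             if m in mirna_expr and t in gene_expr)
            return g

        g = build_patient_graph({"let-7": -1.4}, {"MYC": 1.8, "TP53": 0.1},
                                [("let-7", "MYC")])
        print(g.nodes(data=True), list(g.edges))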

  17. Accounting Automation

    OpenAIRE

    Laynebaril1

    2017-01-01

    Accounting Automation. Please respond to the following: Imagine you are a consultant hired to convert a manual accounting system to an automated system. Suggest the key advantages and disadvantages of automating a manual accounting system. Identify the most important step in the conversion process. Provide a rationale for your response. ...

  18. Home Automation

    OpenAIRE

    Ahmed, Zeeshan

    2010-01-01

    In this paper I briefly discuss the importance of home automation systems. Going into the details, I briefly present a designed and implemented real-time, software- and hardware-oriented house automation research project, capable of automating a house's electricity and providing a security system to detect the presence of unexpected behavior.

  19. Carotenoid metabolic profiling and transcriptome-genome mining reveal functional equivalence among blue-pigmented copepods and appendicularia

    KAUST Repository

    Mojib, Nazia

    2014-06-01

    The tropical oligotrophic oceanic areas are characterized by high water transparency and annual solar radiation. Under these conditions, a large number of phylogenetically diverse mesozooplankton species living in the surface waters (neuston) are found to be blue pigmented. In the present study, we focused on understanding the metabolic and genetic basis of the observed blue phenotype functional equivalence between the blue-pigmented organisms from the phylum Arthropoda, subclass Copepoda (Acartia fossae) and the phylum Chordata, class Appendicularia (Oikopleura dioica) in the Red Sea. Previous studies have shown that carotenoid–protein complexes are responsible for blue coloration in crustaceans. Therefore, we performed carotenoid metabolic profiling using both targeted and nontargeted (high-resolution mass spectrometry) approaches in four different blue-pigmented genera of copepods and one blue-pigmented species of appendicularia. Astaxanthin was found to be the principal carotenoid in all the species. The pathway analysis showed that all the species can synthesize astaxanthin from β-carotene, ingested from dietary sources, via 3-hydroxyechinenone, canthaxanthin, zeaxanthin, adonirubin or adonixanthin. Further, using de novo assembled transcriptome of blue A. fossae (subclass Copepoda), we identified highly expressed homologous β-carotene hydroxylase enzymes and putative carotenoid-binding proteins responsible for astaxanthin formation and the blue phenotype. In blue O. dioica (class Appendicularia), corresponding putative genes were identified from the reference genome. Collectively, our data provide molecular evidences for the bioconversion and accumulation of blue astaxanthin–protein complexes underpinning the observed ecological functional equivalence and adaptive convergence among neustonic mesozooplankton.

  20. Genome mining of astaxanthin biosynthetic genes from Sphingomonas sp. ATCC 55669 for heterologous overproduction in Escherichia coli.

    Science.gov (United States)

    Ma, Tian; Zhou, Yuanjie; Li, Xiaowei; Zhu, Fayin; Cheng, Yongbo; Liu, Yi; Deng, Zixin; Liu, Tiangang

    2016-02-01

    As a highly valued keto-carotenoid, astaxanthin is widely used in nutritional supplements and pharmaceuticals. Therefore, the demand for biosynthetic astaxanthin and improved efficiency of astaxanthin biosynthesis has driven the investigation of metabolic engineering of native astaxanthin producers and heterologous hosts. However, microbial resources for astaxanthin are limited. In this study, we found that the α-Proteobacterium Sphingomonas sp. ATCC 55669 could produce astaxanthin naturally. We used whole-genome sequencing to identify the astaxanthin biosynthetic pathway using a combined PacBio-Illumina approach. The putative astaxanthin biosynthetic pathway in Sphingomonas sp. ATCC 55669 was predicted. For further confirmation, a high-efficiency targeted engineering carotenoid synthesis platform was constructed in E. coli for identifying the functional roles of candidate genes. All genes involved in astaxanthin biosynthesis showed discrete distributions on the chromosome. Moreover, the overexpression of exogenous E. coli idi in Sphingomonas sp. ATCC 55669 increased astaxanthin production by 5.4-fold. This study described a new astaxanthin producer and provided more biosynthesis components for bioengineering of astaxanthin in the future.

  1. Genome mining in Streptomyces avermitilis: cloning and characterization of SAV_76, the synthase for a new sesquiterpene, avermitilol.

    Science.gov (United States)

    Chou, Wayne K W; Fanizza, Immacolata; Uchiyama, Takuma; Komatsu, Mamoru; Ikeda, Haruo; Cane, David E

    2010-07-07

    The terpene synthase encoded by the sav76 gene of Streptomyces avermitilis was expressed in Escherichia coli as an N-terminal His6-tag protein, using a codon-optimized synthetic gene. Incubation of the recombinant protein, SAV_76, with farnesyl diphosphate (1, FPP) in the presence of Mg²⁺ gave a new sesquiterpene alcohol, avermitilol (2), whose structure and stereochemistry were determined by a combination of ¹H, ¹³C, COSY, HMQC, HMBC, and NOESY NMR, along with minor amounts of germacrene A (3), germacrene B (4), and viridiflorol (5). The absolute configuration of 2 was assigned by ¹H NMR analysis of the corresponding (R)- and (S)-Mosher esters. The steady-state kinetic parameters were kcat = 0.040 ± 0.001 s⁻¹ and Km = 1.06 ± 0.11 μM. Individual incubations of recombinant avermitilol synthase with [1,1-²H₂]FPP (1a), (1S)-[1-²H]FPP (1b), and (1R)-[1-²H]FPP (1c), and NMR analysis of the resulting avermitilols, supported a cyclization mechanism involving the loss of H-1(re) to generate the intermediate bicyclogermacrene (7), which then undergoes proton-initiated anti-Markovnikov cyclization and capture of water to generate 2. A copy of the sav76 gene was reintroduced into S. avermitilis SUKA17, a large-deletion mutant from which the genes for the major endogenous secondary metabolites had been removed, and expressed under control of the native S. avermitilis promoter rpsJp (sav4925). The resultant transformants generated avermitilol (2) as well as the derived ketone, avermitilone (8), along with small amounts of 3, 4, and 5. The biochemical functions of all four terpene synthases found in the S. avermitilis genome have now been determined.
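
    From the reported steady-state parameters, the catalytic efficiency follows directly; a quick check gives kcat/Km ≈ 0.040 s⁻¹ / 1.06 µM ≈ 3.8 × 10⁴ M⁻¹ s⁻¹.

        # Catalytic efficiency of avermitilol synthase from the reported
        # steady-state parameters (kcat = 0.040 s^-1, Km = 1.06 microM).
        kcat = 0.040            # s^-1
        km = 1.06e-6            # M (1.06 microM)
        efficiency = kcat / km  # M^-1 s^-1
        print(f"kcat/Km = {efficiency:.2e} M^-1 s^-1")  # ~3.8e+04 M^-1 s^-1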

  2. Two non-synonymous markers in PTPN21, identified by genome-wide association study data-mining and replication, are associated with schizophrenia.

    LENUS (Irish Health Repository)

    Chen, Jingchun

    2011-09-01

    We conducted data-mining analyses of genome wide association (GWA) studies of the CATIE and MGS-GAIN datasets, and found 13 markers in the two physically linked genes, PTPN21 and EML5, showing nominally significant association with schizophrenia. Linkage disequilibrium (LD) analysis indicated that all 7 markers from PTPN21 shared high LD (r² > 0.8), including rs2274736 and rs2401751, the two non-synonymous markers with the most significant association signals (rs2401751, P = 1.10 × 10⁻³ and rs2274736, P = 1.21 × 10⁻³). In a meta-analysis of all 13 replication datasets with a total of 13,940 subjects, we found that the two non-synonymous markers are significantly associated with schizophrenia (rs2274736, OR = 0.92, 95% CI: 0.86-0.97, P = 5.45 × 10⁻³ and rs2401751, OR = 0.92, 95% CI: 0.86-0.97, P = 5.29 × 10⁻³). One SNP (rs7147796) in EML5 is also significantly associated with the disease (OR = 1.08, 95% CI: 1.02-1.14, P = 6.43 × 10⁻³). These 3 markers remain significant after Bonferroni correction. Furthermore, haplotype-conditioned analyses indicated that the association signals observed between rs2274736/rs2401751 and rs7147796 are statistically independent. Given the results that 2 non-synonymous markers in PTPN21 are associated with schizophrenia, further investigation of this locus is warranted.

  3. RGS2 expression predicts amyloid-β sensitivity, MCI and Alzheimer's disease: genome-wide transcriptomic profiling and bioinformatics data mining

    Science.gov (United States)

    Hadar, A; Milanesi, E; Squassina, A; Niola, P; Chillotti, C; Pasmanik-Chor, M; Yaron, O; Martásek, P; Rehavi, M; Weissglas-Volkov, D; Shomron, N; Gozes, I; Gurwitz, D

    2016-01-01

    Alzheimer's disease (AD) is the most frequent cause of dementia. Misfolded protein pathological hallmarks of AD are brain deposits of amyloid-β (Aβ) plaques and phosphorylated tau neurofibrillary tangles. However, doubts about the role of Aβ in AD pathology have been raised as Aβ is a common component of extracellular brain deposits found, also by in vivo imaging, in non-demented aged individuals. It has been suggested that some individuals are more prone to Aβ neurotoxicity and hence more likely to develop AD when aging brains start accumulating Aβ plaques. Here, we applied genome-wide transcriptomic profiling of lymphoblastoid cell lines (LCLs) from healthy individuals and AD patients for identifying genes that predict sensitivity to Aβ. Real-time PCR validation identified 3.78-fold lower expression of RGS2 (regulator of G-protein signaling 2; P=0.0085) in LCLs from healthy individuals exhibiting high vs low Aβ sensitivity. Furthermore, RGS2 showed 3.3-fold lower expression (P=0.0008) in AD LCLs compared with controls. Notably, RGS2 expression in AD LCLs correlated with the patients' cognitive function. Lower RGS2 expression levels were also discovered in published expression data sets from postmortem AD brain tissues as well as in mild cognitive impairment and AD blood samples compared with controls. In conclusion, Aβ sensitivity phenotyping followed by transcriptomic profiling and published patient data mining identified reduced peripheral and brain expression levels of RGS2, a key regulator of G-protein-coupled receptor signaling and neuronal plasticity. RGS2 is suggested as a novel AD biomarker (alongside other genes) toward early AD detection and future disease modifying therapeutics. PMID:27701409

  4. Ontology for Genome Comparison and Genomic Rearrangements

    Directory of Open Access Journals (Sweden)

    Anil Wipat

    2006-04-01

    We present an ontology for describing genomes, genome comparisons, their evolution and biological function. This ontology will support the development of novel genome comparison algorithms and aid the community in discussing genomic evolution. It provides a framework for communication about comparative genomics, and a basis upon which further automated analysis can be built. The nomenclature defined by the ontology will foster clearer communication between biologists, and also standardize terms used by data publishers in the results of analysis programs. The overriding aim of this ontology is the facilitation of consistent annotation of genomes through computational methods, rather than human annotators. To this end, the ontology includes definitions that support computer analysis and automated transfer of annotations between genomes, rather than relying upon human mediation.

  5. Automated High Throughput Drug Target Crystallography

    Energy Technology Data Exchange (ETDEWEB)

    Rupp, B

    2005-02-18

    The molecular structures of drug target proteins and receptors form the basis for 'rational' or structure-guided drug design. The majority of target structures are experimentally determined by protein X-ray crystallography, which has evolved into a highly automated, high throughput drug discovery and screening tool. Process automation has accelerated tasks from parallel protein expression, fully automated crystallization, and rapid data collection to highly efficient structure determination methods. A thoroughly designed automation technology platform supported by a powerful informatics infrastructure forms the basis for optimal workflow implementation and the data mining and analysis tools to generate new leads from experimental protein drug target structures.

  6. Whole genome sequencing of Streptococcus pneumoniae: development, evaluation and verification of targets for serogroup and serotype prediction using an automated pipeline

    Directory of Open Access Journals (Sweden)

    Georgia Kapatai

    2016-09-01

    Full Text Available Streptococcus pneumoniae typically express one of 92 serologically distinct capsule polysaccharide (cps types (serotypes. Some of these serotypes are closely related to each other; using the commercially available typing antisera, these are assigned to common serogroups containing types that show cross-reactivity. In this serotyping scheme, factor antisera are used to allocate serotypes within a serogroup, based on patterns of reactions. This serotyping method is technically demanding, requires considerable experience and the reading of the results can be subjective. This study describes the analysis of the S. pneumoniae capsular operon genetic sequence to determine serotype distinguishing features and the development, evaluation and verification of an automated whole genome sequence (WGS-based serotyping bioinformatics tool, PneumoCaT (Pneumococcal Capsule Typing. Initially, WGS data from 871 S. pneumoniae isolates were mapped to reference cps locus sequences for the 92 serotypes. Thirty-two of 92 serotypes could be unambiguously identified based on sequence similarities within the cps operon. The remaining 60 were allocated to one of 20 ‘genogroups’ that broadly correspond to the immunologically defined serogroups. By comparing the cps reference sequences for each genogroup, unique molecular differences were determined for serotypes within 18 of the 20 genogroups and verified using the set of 871 isolates. This information was used to design a decision-tree style algorithm within the PneumoCaT bioinformatics tool to predict to serotype level for 89/94 (92 + 2 molecular types/subtypes from WGS data and to serogroup level for serogroups 24 and 32, which currently comprise 2.1% of UK referred, invasive isolates submitted to the National Reference Laboratory (NRL, Public Health England (June 2014–July 2015. PneumoCaT was evaluated with an internal validation set of 2065 UK isolates covering 72/92 serotypes, including 19 non-typeable isolates

  7. The hydrogen mine introduction initiative

    Energy Technology Data Exchange (ETDEWEB)

    Betournay, M.C.; Howell, B. [Natural Resources Canada, Ottawa, ON (Canada). CANMET Mining and Mineral Sciences Laboratories

    2009-07-01

    In an effort to address air quality concerns in underground mines, the mining industry is considering the use of fuel cells instead of diesel to power mine production vehicles. The immediate issues and opportunities associated with fuel cell use include a reduction in harmful greenhouse gas emissions; reduction in ventilation operating costs; reduction in energy consumption; improved health benefits; automation; and high productivity. The objective of the hydrogen mine introduction initiative (HMII) is to develop and test the range of fundamental and needed operational technology, specifications and best practices for underground hydrogen power applications. Although proof-of-concept studies have shown high potential for fuel cell use, safety considerations must be addressed, including hydrogen behaviour in confined conditions. This presentation highlighted the issues in meeting operational requirements, notably hydrogen production; delivery and storage; mine regulations; and hydrogen behaviour underground. tabs., figs.

  8. Introduction to Space Resource Mining

    Science.gov (United States)

    Mueller, Robert P.

    2013-01-01

    There are vast amounts of resources in the solar system that will be useful to humans in space and possibly on Earth. None of these resources can be exploited without the first necessary step of extra-terrestrial mining. The necessary technologies for tele-robotic and autonomous mining have not matured sufficiently yet. The current state of technology was assessed for terrestrial and extraterrestrial mining and a taxonomy of robotic space mining mechanisms was presented which was based on current existing prototypes. Terrestrial and extra-terrestrial mining methods and technologies are on the cusp of massive changes towards automation and autonomy for economic and safety reasons. It is highly likely that these industries will benefit from mutual cooperation and technology transfer.

  9. Longwall mining

    Energy Technology Data Exchange (ETDEWEB)

    NONE

    1995-03-14

    As part of EIA's program to provide information on coal, this report, Longwall Mining, describes longwall mining and compares it with other underground mining methods. Using data from EIA and private sector surveys, the report describes major changes in the geologic, technological, and operating characteristics of longwall mining over the past decade. Most important, the report shows how these changes led to dramatic improvements in longwall mining productivity. For readers interested in the history of longwall mining and greater detail on recent developments affecting longwall mining, the report includes a bibliography.

  10. Antimicrobials of Bacillus species: mining and engineering

    OpenAIRE

    Zhao, Xin

    2016-01-01

    Bacillus spp. have been successfully used to suppress various bacterial and fungal pathogens. Due to the wide availability of whole genome sequence data and the development of genome mining tools, novel antimicrobials are being discovered and updated, not only bacteriocins but also NRPs and PKs. A new classification system of known and putative antimicrobial compounds of Bacillus by genome mining is presented in Chapter 2. Importantly, predicting, isolating and screening of Bacillus strains w...

  11. Library Automation

    OpenAIRE

    Dhakne, B. N.; Giri, V. V.; Waghmode, S. S.

    2010-01-01

    New technologies provide libraries with several new materials, media and modes of storing and communicating information. Library automation reduces the drudgery of repeated manual efforts in library routines. Library automation supports collection, storage, administration, processing, preservation and communication, etc.

  12. Genome mining for new α-amylase and glucoamylase encoding sequences and high level expression of a glucoamylase from Talaromyces stipitatus for potential raw starch hydrolysis.

    Science.gov (United States)

    Xiao, Zhizhuang; Wu, Meiqun; Grosse, Stephan; Beauchemin, Manon; Lévesque, Michelle; Lau, Peter C K

    2014-01-01

    Mining fungal genomes for glucoamylase and α-amylase encoding sequences led to the selection of 23 candidates, two of which (designated TSgam-2 and NFamy-2) were advanced to testing for cooked or raw starch hydrolysis. TSgam-2 is a 66-kDa glucoamylase recombinantly produced in Pichia pastoris and originally derived from Talaromyces stipitatus. When harvested in a 20-L bioreactor at high cell density (OD600 > 200), the secreted TSgam-2 enzyme activity from P. pastoris strain GS115 reached 800 U/mL. In a 6-L working volume of a 10-L fermentation, the TSgam-2 protein yield was estimated to be ∼8 g with a specific activity of 360 U/mg. In contrast, the highest activity of NFamy-2, a 70-kDa α-amylase originally derived from Neosartorya fischeri and expressed in P. pastoris KM71, only reached 8 U/mL. Both proteins were purified and characterized in terms of pH and temperature optima, kinetic parameters, and thermostability. TSgam-2 was more thermostable than NFamy-2, with respective half-lives (t1/2) of >300 min at 55 °C and >200 min at 40 °C. The kinetic parameters for raw starch adsorption of TSgam-2 and NFamy-2 were also determined. A combination of NFamy-2 and TSgam-2 hydrolyzed cooked potato and triticale starch into glucose with yields (71-87 %) that are competitive with commercially available α-amylases. In the hydrolysis of raw starch, the best hydrolysis condition was a sequential addition of 40 U of a thermostable Bacillus globigii amylase (BgAmy)/g starch at 80 °C for 16 h, and 40 U TSgam-2/g starch at 45 °C for 24 h. The glucose released was 8.7 g/10 g of triticale starch and 7.9 g/10 g of potato starch, representing starch degradation rates of 95 and 86 %, respectively.

  13. Automation or De-automation

    Science.gov (United States)

    Gorlach, Igor; Wessel, Oliver

    2008-09-01

    In the global automotive industry, for decades, vehicle manufacturers have continually increased the level of automation of production systems in order to be competitive. However, there is a new trend to decrease the level of automation, especially in final car assembly, for reasons of economy and flexibility. In this research, the final car assembly lines at three production sites of Volkswagen are analysed in order to determine the best level of automation for each, in terms of manufacturing costs, productivity, quality and flexibility. The case study is based on the methodology proposed by the Fraunhofer Institute. The results of the analysis indicate that fully automated assembly systems are not necessarily the best option in terms of cost, productivity and quality combined, which is attributed to high complexity of final car assembly systems; some de-automation is therefore recommended. On the other hand, the analysis shows that low automation can result in poor product quality due to reasons related to plant location, such as inadequate workers' skills, motivation, etc. Hence, the automation strategy should be formulated on the basis of analysis of all relevant aspects of the manufacturing process, such as costs, quality, productivity and flexibility in relation to the local context. A more balanced combination of automated and manual assembly operations provides better utilisation of equipment, reduces production costs and improves throughput.

  14. Evolutionary Data Mining Approach to Creating Digital Logic

    Science.gov (United States)

    2010-01-01

    A data mining based procedure for automated reverse engineering has been developed. The data mining algorithm for reverse engineering uses a genetic program (GP) as a data mining function. A genetic program is an algorithm based on the theory of evolution that automatically evolves populations of ... based data mining is then conducted. This procedure incorporates not only the experts' rules into the fitness function, but also the information in the
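
    The key idea here, a fitness function that scores an evolved candidate both against measured data and against expert rules, can be sketched in a few lines. This is a generic illustration, not the reported system; the 0.7/0.3 weighting and the example rule are arbitrary assumptions.

        # Sketch of a fitness function for GP-based reverse engineering that
        # combines data fit with expert-rule compliance. Generic illustration;
        # the weighting and the example rule are arbitrary assumptions.
        def fitness(candidate, test_vectors, expert_rules, w_data=0.7, w_rules=0.3):
            """candidate: callable mapping an input tuple to a Boolean output."""
            data_score = sum(candidate(x) == y for x, y in test_vectors) / len(test_vectors)
            rule_score = sum(rule(candidate) for rule in expert_rules) / len(expert_rules)
            return w_data * data_score + w_rules * rule_score

        def candidate(x):
            # An evolved individual for a hypothetical 2-input logic block.
            return x[0] and not x[1]

        truth_table = [((0, 0), 0), ((0, 1), 0), ((1, 0), 1), ((1, 1), 0)]
        rules = [lambda f: f((1, 1)) == 0]   # expert rule: output must be 0 when both inputs are 1
        print(fitness(candidate, truth_table, rules))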

  15. Technological applications of robotics in mining

    Energy Technology Data Exchange (ETDEWEB)

    Konyukh, V. [Institute of Coal, Kemerovo (Russian Federation)

    1996-12-31

    There are objective preconditions for the use of automated multifunctional machines in mining, and some methods for evaluating the robotizability of mining operations are offered. Functional modeling and Petri nets were used for the synthesis and dynamic simulation of mining robotics systems, which enables the construction of robotics-based technologies for underground mining. Trials of a rail robocar and a teleloader in Siberian ore mines showed that post-trial learning of the control system is necessary. A database of about 90 applications of robotics in coal and other mining has been created and is used for the choice and synthesis of robotized technologies. There are 6 direct and 8 indirect sources of robotics efficiency in mining, which are evaluated overall by comparing materialized and current labor expenses. 4 refs., 7 figs.

  16. Analysis of segmental duplications, mouse genome synteny and recurrent cancer-associated amplicons in human chromosome 6p21-p12.

    Science.gov (United States)

    Martin, J W; Yoshimoto, M; Ludkovski, O; Thorner, P S; Zielenska, M; Squire, J A; Nuin, P A S

    2010-06-01

    It has been proposed that regions of microhomology in the human genome could facilitate genomic rearrangements, copy number transitions, and rapid genomic change during tumor progression. To investigate this idea, this study examines the role of repetitive sequence elements, and corresponding syntenic mouse genomic features, in targeting cancer-associated genomic instability of specific regions of the human genome. Automated database-mining algorithms designed to search for frequent copy number transitions and genomic breakpoints were applied to 2 publicly-available online databases and revealed that 6p21-p12 is one of the regions of the human genome most frequently involved in tumor-specific alterations. In these analyses, 6p21-p12 exhibited the highest frequency of genomic amplification in osteosarcomas. Analysis of repetitive elements in regions of homology between human chromosome 6p and the syntenic regions of the mouse genome revealed a strong association between the location of segmental duplications greater than 5 kilobase-pairs and the position of discontinuities at the end of the syntenic region. The presence of clusters of segmental duplications flanking these syntenic regions also correlated with a high frequency of amplification and genomic alteration. Collectively, the experimental findings, in silico analyses, and comparative genomic studies presented here suggest that segmental duplications may facilitate cancer-associated copy number transitions and rearrangements at chromosome 6p21-p12. This process may involve homology-dependent DNA recombination and/or repair, which may also contribute towards the overall plasticity of the human genome.

  17. Text Mining.

    Science.gov (United States)

    Trybula, Walter J.

    1999-01-01

    Reviews the state of research in text mining, focusing on newer developments. The intent is to describe the disparate investigations currently included under the term text mining and provide a cohesive structure for these efforts. A summary of research identifies key organizations responsible for pushing the development of text mining. A section…

  18. The automation of science.

    Science.gov (United States)

    King, Ross D; Rowland, Jem; Oliver, Stephen G; Young, Michael; Aubrey, Wayne; Byrne, Emma; Liakata, Maria; Markham, Magdalena; Pir, Pinar; Soldatova, Larisa N; Sparkes, Andrew; Whelan, Kenneth E; Clare, Amanda

    2009-04-03

    The basis of science is the hypothetico-deductive method and the recording of experiments in sufficient detail to enable reproducibility. We report the development of Robot Scientist "Adam," which advances the automation of both. Adam has autonomously generated functional genomics hypotheses about the yeast Saccharomyces cerevisiae and experimentally tested these hypotheses by using laboratory automation. We have confirmed Adam's conclusions through manual experiments. To describe Adam's research, we have developed an ontology and logical language. The resulting formalization involves over 10,000 different research units in a nested treelike structure, 10 levels deep, that relates the 6.6 million biomass measurements to their logical description. This formalization describes how a machine contributed to scientific knowledge.

  19. Reference Based Genome Compression

    OpenAIRE

    Chern, Bobbie; Ochoa, Idoia; Manolakos, Alexandros; No, Albert; Venkat, Kartik; Weissman, Tsachy

    2012-01-01

    DNA sequencing technology has advanced to a point where storage is becoming the central bottleneck in the acquisition and mining of more data. Large amounts of data are vital for genomics research, and generic compression tools, while viable, cannot offer the same savings as approaches tuned to inherent biological properties. We propose an algorithm to compress a target genome given a known reference genome. The proposed algorithm first generates a mapping from the reference to the target gen...
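
    The core idea of reference-based compression, expressing the target as copy operations against the reference plus literal insertions, can be sketched with a naive greedy matcher. This toy encoder/decoder pair is only an illustration of the general principle, not the paper's algorithm.

        # Toy reference-based encoder: represent the target genome as COPY
        # (offset, length) operations against the reference plus literal bases.
        # Naive O(n*m) greedy matching -- illustrative only, not the paper's method.
        def encode(reference, target, min_match=4):
            ops, i = [], 0
            while i < len(target):
                best_off, best_len = -1, 0
                for off in range(len(reference)):
                    length = 0
                    while (off + length < len(reference) and i + length < len(target)
                           and reference[off + length] == target[i + length]):
                        length += 1
                    if length > best_len:
                        best_off, best_len = off, length
                if best_len >= min_match:
                    ops.append(("COPY", best_off, best_len))
                    i += best_len
                else:
                    ops.append(("LITERAL", target[i]))
                    i += 1
            return ops

        def decode(reference, ops):
            out = []
            for op in ops:
                if op[0] == "COPY":
                    _, off, length = op
                    out.append(reference[off:off + length])
                else:
                    out.append(op[1])
            return "".join(out)

        ref = "ACGTACGTTTGACCA"
        tgt = "ACGTACGTAAGACCA"
        ops = encode(ref, tgt)
        assert decode(ref, ops) == tgt
        print(ops)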

  20. Data, Text and Web Mining for Business Intelligence : A Survey

    Directory of Open Access Journals (Sweden)

    Abdul-Aziz Rashid Al-Azmi

    2013-04-01

    The Information and Communication Technologies revolution brought a digital world with huge amounts of data available. Enterprises use mining technologies to search vast amounts of data for vital insight and knowledge. Mining tools such as data mining, text mining, and web mining are used to find hidden knowledge in large databases or the Internet. Mining tools are automated software tools used to achieve business intelligence by finding hidden relations, and predicting future events from vast amounts of data. This uncovered knowledge helps in gaining competitive advantages, better customer relationships, and even fraud detection. In this survey, we'll describe how these techniques work and how they are implemented. Furthermore, we shall discuss how business intelligence is achieved using these mining tools. We then look into some case studies of success stories using mining tools. Finally, we shall demonstrate some of the main challenges to the mining technologies that limit their potential.

  1. DATA, TEXT, AND WEB MINING FOR BUSINESS INTELLIGENCE: A SURVEY

    Directory of Open Access Journals (Sweden)

    Abdul-Aziz Rashid

    2013-03-01

    The Information and Communication Technologies revolution brought a digital world with huge amounts of data available. Enterprises use mining technologies to search vast amounts of data for vital insight and knowledge. Mining tools such as data mining, text mining, and web mining are used to find hidden knowledge in large databases or the Internet. Mining tools are automated software tools used to achieve business intelligence by finding hidden relations, and predicting future events from vast amounts of data. This uncovered knowledge helps in gaining competitive advantages, better customer relationships, and even fraud detection. In this survey, we'll describe how these techniques work and how they are implemented. Furthermore, we shall discuss how business intelligence is achieved using these mining tools. We then look into some case studies of success stories using mining tools. Finally, we shall demonstrate some of the main challenges to the mining technologies that limit their potential.

  2. Enhancements for a Dynamic Data Warehousing and Mining System for Large-scale HSCB Data

    Science.gov (United States)

    2016-08-29

    Intelligent Automation Incorporated. Enhancements for a Dynamic Data Warehousing and Mining System for Large-scale HSCB Data. Monthly Report No. 5. Reporting Period: July 20, 2016 – Aug 19, 2016. Contract No. N00014-16-P-3014.

  3. Automating Finance

    Science.gov (United States)

    Moore, John

    2007-01-01

    In past years, higher education's financial management side has been riddled with manual processes and aging mainframe applications. This article discusses schools that have taken advantage of an array of technologies that automate billing, payment processing, and refund processing in the case of overpayment. The investments are well worth it:…

  4. Use of a Pan–Genomic DNA Microarray in Determination of the Phylogenetic Relatedness among Cronobacter spp. and Its Use as a Data Mining Tool to Understand Cronobacter Biology

    Directory of Open Access Journals (Sweden)

    Ben D. Tall

    2017-03-01

    Cronobacter (previously known as Enterobacter sakazakii) is a genus of Gram-negative, facultatively anaerobic, oxidase-negative, catalase-positive, rod-shaped bacteria of the family Enterobacteriaceae. These organisms cause a variety of illnesses such as meningitis, necrotizing enterocolitis, and septicemia in neonates and infants, and urinary tract, wound, abscesses or surgical site infections, septicemia, and pneumonia in adults. The total gene content of 379 strains of Cronobacter spp. and taxonomically-related isolates was determined using a recently reported DNA microarray. The Cronobacter microarray as a genotyping tool gives the global food safety community a rapid method to identify and capture the total genomic content of outbreak isolates for food safety, environmental, and clinical surveillance purposes. It was able to differentiate the seven Cronobacter species from one another and from non-Cronobacter species. The microarray was also able to cluster strains within each species into well-defined subgroups. These results also support previous studies on the phylogenic separation of species members of the genus and clearly highlight the evolutionary sequence divergence among each species of the genus compared to phylogenetically-related species. This review extends these studies and illustrates how the microarray can also be used as an investigational tool to mine genomic data sets from strains. Three case studies describing the use of the microarray are shown and include: (1) the determination of allelic differences among Cronobacter sakazakii strains possessing the virulence plasmid pESA3; (2) mining of malonate and myo-inositol alleles among subspecies of Cronobacter dublinensis strains to determine subspecies identity; and (3) lastly, using the microarray to demonstrate sequence divergence and phylogenetic relatedness trends for 13 outer-membrane protein alleles among 240 Cronobacter and phylogenetically-related strains. The goal of

  5. Heating automation

    OpenAIRE

    Tomažič, Tomaž

    2013-01-01

    This degree paper presents the usage and operation of peripheral devices with a microcontroller for heating automation. The main goal is to build a quality control system for heating three floors of a house and, in doing so, increase the efficiency of the heating devices and lower heating expenses. A heat pump, furnace, boiler pump, two floor-heating pumps and two radiator pumps need to be controlled by this system. For this work, we chose the STM32F4-Discovery development kit with five temperature sensors, an LCD disp...

  6. Automation Security

    OpenAIRE

    Mirzoev, Dr. Timur

    2014-01-01

    Web-based Automated Process Control systems are a new type of application that uses the Internet to control industrial processes with access to real-time data. Supervisory control and data acquisition (SCADA) networks contain computers and applications that perform key functions in providing essential services and commodities (e.g., electricity, natural gas, gasoline, water, waste treatment, transportation) to all Americans. As such, they are part of the nation's critical infrastructu...

  7. Marketing automation

    OpenAIRE

    Raluca Dania TODOR

    2017-01-01

    The automation of the marketing process seems to be, nowadays, the only solution to face the major changes brought by the fast evolution of technology and the continuous increase in supply and demand. In order to achieve the desired marketing results, businesses have to employ digital marketing and communication services. These services are efficient and measurable thanks to the marketing technology used to track, score and implement each campaign. Due to the...

  8. Use of genome-wide expression data to mine the "gray zone" of GWA studies leads to novel candidate obesity genes

    NARCIS (Netherlands)

    J. Naukkarinen (Jussi); I. Surakka (Ida); K.H. Pietilainen (Kirsi Hannele); A. Rissanen (Aila); V. Salomaa (Veikko); S. Ripatti (Samuli); H. Yki-Jarvinen (Hannele); C.M. van Duijn (Cock); H.E. Wichmann (Heinz Erich); J. Kaprio (Jaakko); M. Taskinen (Marja Riitta); L. Peltonen (Leena Johanna)

    2010-01-01

    To get beyond the "low-hanging fruits" so far identified by genome-wide association (GWA) studies, new methods must be developed in order to discover the numerous remaining genes that estimates of heritability indicate should be contributing to complex human phenotypes, such as obesity.

  9. Text mining for systems biology.

    Science.gov (United States)

    Fluck, Juliane; Hofmann-Apitius, Martin

    2014-02-01

    Scientific communication in biomedicine is, by and large, still text based. Text mining technologies for the automated extraction of useful biomedical information from unstructured text that can be directly used for systems biology modelling have been substantially improved over the past few years. In this review, we underline the importance of named entity recognition and relationship extraction as fundamental approaches that are relevant to systems biology. Furthermore, we emphasize the role of publicly organized scientific benchmarking challenges that reflect the current status of text-mining technology and are important in moving the entire field forward. Given further interdisciplinary development of systems biology-orientated ontologies and training corpora, we expect a steadily increasing impact of text-mining technology on systems biology in the future.
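
    The named entity recognition and relationship extraction steps highlighted above can be illustrated with a minimal dictionary-based sketch in Python; the gene names, trigger words and sentence below are invented placeholders, and production pipelines rely on curated lexicons and trained models rather than this toy matching.

    import re
    from itertools import combinations

    # Toy dictionaries standing in for curated biomedical lexicons.
    GENES = {"TP53", "MDM2", "AKT1"}
    RELATION_TRIGGERS = {"activates", "inhibits", "binds"}

    def extract_relations(sentence):
        """Dictionary-based NER plus simple co-occurrence relation extraction."""
        tokens = re.findall(r"[A-Za-z0-9]+", sentence)
        entities = [t for t in tokens if t.upper() in GENES]
        triggers = [t.lower() for t in tokens if t.lower() in RELATION_TRIGGERS]
        relations = []
        # Propose a candidate relation for every entity pair that co-occurs
        # with a trigger word in the same sentence.
        if triggers:
            for a, b in combinations(entities, 2):
                relations.append((a, triggers[0], b))
        return entities, relations

    print(extract_relations("MDM2 inhibits TP53 in many tumour types."))
    # (['MDM2', 'TP53'], [('MDM2', 'inhibits', 'TP53')])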

  10. Educational Data Mining Model Using Rattle

    Directory of Open Access Journals (Sweden)

    Sadiq Hussain

    2014-07-01

    Full Text Available Data mining is the extraction of knowledge from large databases. Data mining has affected fields ranging from combating terror attacks to human genome databases. R programming plays a key role in many kinds of data analysis. Rattle, an effective GUI for R, is used extensively for generating reports based on current models such as random forests and support vector machines. It is otherwise hard to decide which model to choose for the data that needs to be mined. This paper proposes a method using Rattle for the selection of an educational data mining model.
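
    Rattle itself is a GUI on top of R, but the model-comparison workflow it automates can be sketched outside R as well; the following Python/scikit-learn snippet is only an analogue for illustration, with a synthetic data set standing in for real educational data and the model choices assumed rather than taken from the paper.

    from sklearn.datasets import make_classification
    from sklearn.ensemble import RandomForestClassifier
    from sklearn.model_selection import cross_val_score
    from sklearn.svm import SVC

    # Synthetic stand-in for an educational data set.
    X, y = make_classification(n_samples=500, n_features=20, random_state=0)

    models = {
        "random forest": RandomForestClassifier(n_estimators=200, random_state=0),
        "support vector machine": SVC(kernel="rbf", gamma="scale"),
    }

    # Cross-validated accuracy makes the "which model should I choose?" question explicit.
    for name, model in models.items():
        scores = cross_val_score(model, X, y, cv=5)
        print(f"{name}: mean accuracy {scores.mean():.3f}")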

  11. Antimicrobials of Bacillus species: mining and engineering

    NARCIS (Netherlands)

    Zhao, Xin

    2016-01-01

    Bacillus sp. have been successfully used to suppress various bacterial and fungal pathogens. Due to the wide availability of whole genome sequence data and the development of genome mining tools, novel antimicrobials are being discovered and updated; not only bacteriocins, but also NRPs and PKs. A n

  12. The Aspergillus Mine - publishing bioinformatics

    DEFF Research Database (Denmark)

    Vesth, Tammi Camilla; Rasmussen, Jane Lind Nybo; Theobald, Sebastian

    so with no computational specialist. Here we present a setup for analysis and publication of genome data of 70 species of Aspergillus fungi. The platform is based on R and Python and uses the RShiny framework to create interactive web‐applications. It allows all participants to create interactive...... analyses which can be shared with the team and in connection with publications. We present analyses for the investigation of genetic diversity, secondary and primary metabolism and general data overview. The platform, the Aspergillus Mine, is a collection of analysis tools based on data from collaboration...... with the Joint Genome Institute. The Aspergillus Mine is not intended as a genomic data sharing service but instead focuses on creating an environment where the results of bioinformatic analysis are made available for inspection. The data and code are public upon request and figures can be obtained directly from

  13. Social big data mining

    CERN Document Server

    Ishikawa, Hiroshi

    2015-01-01

    Social Media. Big Data and Social Data. Hypotheses in the Era of Big Data. Social Big Data Applications. Basic Concepts in Data Mining. Association Rule Mining. Clustering. Classification. Prediction. Web Structure Mining. Web Content Mining. Web Access Log Mining, Information Extraction and Deep Web Mining. Media Mining. Scalability and Outlier Detection.

  14. Mining a database of single amplified genomes from Red Sea brine pool extremophiles-improving reliability of gene function prediction using a profile and pattern matching algorithm (PPMA).

    Science.gov (United States)

    Grötzinger, Stefan W; Alam, Intikhab; Ba Alawi, Wail; Bajic, Vladimir B; Stingl, Ulrich; Eppinger, Jörg

    2014-01-01

    Reliable functional annotation of genomic data is the key step in the discovery of novel enzymes. Intrinsic sequencing data quality problems of single amplified genomes (SAGs) and poor homology of novel extremophiles' genomes pose significant challenges for the attribution of functions to the coding sequences identified. The anoxic deep-sea brine pools of the Red Sea are a promising source of novel enzymes with unique evolutionary adaptation. Sequencing data from Red Sea brine pool cultures and SAGs are annotated and stored in the Integrated Data Warehouse of Microbial Genomes (INDIGO) data warehouse. Low sequence homology of annotated genes (no similarity for 35% of these genes) may translate into false positives when searching for specific functions. The Profile and Pattern Matching (PPM) strategy described here was developed to eliminate false positive annotations of enzyme function before progressing to labor-intensive hyper-saline gene expression and characterization. It utilizes InterPro-derived Gene Ontology (GO) terms (which represent enzyme function profiles) and annotated relevant PROSITE IDs (which are linked to an amino acid consensus pattern). The PPM algorithm was tested on 15 protein families, which were selected based on scientific and commercial potential. An initial list of 2577 enzyme commission (E.C.) numbers was translated into 171 GO terms and 49 consensus patterns. A subset of INDIGO sequences consisting of 58 SAGs from six different taxa of bacteria and archaea was selected from six different brine pool environments. Those SAGs code for 74,516 genes, which were independently scanned for the GO terms (profile filter) and PROSITE IDs (pattern filter). Following stringent reliability filtering, the non-redundant hits (106 profile hits and 147 pattern hits) are classified as reliable if at least two relevant descriptors (GO terms and/or consensus patterns) are present. Scripts for annotation, as well as for the PPM algorithm, are available through the INDIGO website.
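
    The reliability rule at the core of the PPM strategy (accept a predicted function only when at least two relevant descriptors support it) can be sketched as follows; the GO terms, PROSITE IDs and gene annotations are placeholders rather than INDIGO data, and the real pipeline works on full InterPro annotations.

    # Minimal sketch of the profile-and-pattern filter, assuming each gene's
    # annotation has already been reduced to sets of GO terms and PROSITE IDs.
    RELEVANT_GO = {"GO:0016787", "GO:0004806"}   # function profile (placeholder IDs)
    RELEVANT_PROSITE = {"PS00120"}               # consensus pattern (placeholder ID)

    genes = {
        "gene_A": {"go": {"GO:0016787"}, "prosite": {"PS00120"}},
        "gene_B": {"go": {"GO:0016787"}, "prosite": set()},
        "gene_C": {"go": set(), "prosite": set()},
    }

    def is_reliable(annotation):
        """A hit is reliable if at least two relevant descriptors are present."""
        profile_hits = annotation["go"] & RELEVANT_GO
        pattern_hits = annotation["prosite"] & RELEVANT_PROSITE
        return len(profile_hits) + len(pattern_hits) >= 2

    print([g for g, ann in genes.items() if is_reliable(ann)])   # ['gene_A']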

  15. Mining a database of single amplified genomes from Red Sea brine pool extremophiles-improving reliability of gene function prediction using a profile and pattern matching algorithm (PPMA).

    KAUST Repository

    Grötzinger, Stefan W.

    2014-04-07

    Reliable functional annotation of genomic data is the key step in the discovery of novel enzymes. Intrinsic sequencing data quality problems of single amplified genomes (SAGs) and poor homology of novel extremophiles' genomes pose significant challenges for the attribution of functions to the coding sequences identified. The anoxic deep-sea brine pools of the Red Sea are a promising source of novel enzymes with unique evolutionary adaptation. Sequencing data from Red Sea brine pool cultures and SAGs are annotated and stored in the Integrated Data Warehouse of Microbial Genomes (INDIGO) data warehouse. Low sequence homology of annotated genes (no similarity for 35% of these genes) may translate into false positives when searching for specific functions. The Profile and Pattern Matching (PPM) strategy described here was developed to eliminate false positive annotations of enzyme function before progressing to labor-intensive hyper-saline gene expression and characterization. It utilizes InterPro-derived Gene Ontology (GO) terms (which represent enzyme function profiles) and annotated relevant PROSITE IDs (which are linked to an amino acid consensus pattern). The PPM algorithm was tested on 15 protein families, which were selected based on scientific and commercial potential. An initial list of 2577 enzyme commission (E.C.) numbers was translated into 171 GO terms and 49 consensus patterns. A subset of INDIGO sequences consisting of 58 SAGs from six different taxa of bacteria and archaea was selected from six different brine pool environments. Those SAGs code for 74,516 genes, which were independently scanned for the GO terms (profile filter) and PROSITE IDs (pattern filter). Following stringent reliability filtering, the non-redundant hits (106 profile hits and 147 pattern hits) are classified as reliable if at least two relevant descriptors (GO terms and/or consensus patterns) are present. Scripts for annotation, as well as for the PPM algorithm, are available through the INDIGO website.

  16. Life in an arsenic-containing gold mine: genome and physiology of the autotrophic arsenite-oxidizing bacterium rhizobium sp. NT-26.

    Science.gov (United States)

    Andres, Jérémy; Arsène-Ploetze, Florence; Barbe, Valérie; Brochier-Armanet, Céline; Cleiss-Arnold, Jessica; Coppée, Jean-Yves; Dillies, Marie-Agnès; Geist, Lucie; Joublin, Aurélie; Koechler, Sandrine; Lassalle, Florent; Marchal, Marie; Médigue, Claudine; Muller, Daniel; Nesme, Xavier; Plewniak, Frédéric; Proux, Caroline; Ramírez-Bahena, Martha Helena; Schenowitz, Chantal; Sismeiro, Odile; Vallenet, David; Santini, Joanne M; Bertin, Philippe N

    2013-01-01

    Arsenic is widespread in the environment and its presence is a result of natural or anthropogenic activities. Microbes have developed different mechanisms to deal with toxic compounds such as arsenic, either by resisting or by metabolizing the compound. Here, we present the first reference set of genomic, transcriptomic and proteomic data for an Alphaproteobacterium isolated from an arsenic-containing gold mine: Rhizobium sp. NT-26. Although phylogenetically related to the plant-associated bacteria, this organism has lost the major colonizing capabilities needed for symbiosis with legumes. In contrast, the genome of Rhizobium sp. NT-26 comprises a megaplasmid containing the various genes which enable it to metabolize arsenite. Remarkably, although the genes required for arsenite oxidation and flagellar motility/biofilm formation are carried by the megaplasmid and the chromosome, respectively, a coordinate regulation of these two mechanisms was observed. Taken together, these processes illustrate the impact environmental pressure can have on the evolution of bacterial genomes, improving the fitness of bacterial strains by the acquisition of novel functions.

  17. Discussion of characteristic specialty construction for the mechanical engineering and automation specialty based on the characteristics of coal mining

    Institute of Scientific and Technical Information of China (English)

    赵四海

    2016-01-01

    Characteristic specialty construction in universities is an important measure by which higher education improves the quality of personnel training, professional level and distinctive character. Based on the characteristics of coal mining, this paper introduces the main content of the characteristic specialty construction of the mechanical engineering and automation specialty at China University of Mining and Technology (Beijing) in terms of four aspects: the professional training scheme, course and teaching material construction, faculty and team building, and practice teaching.

  18. Automated cognome construction and semi-automated hypothesis generation.

    Science.gov (United States)

    Voytek, Jessica B; Voytek, Bradley

    2012-06-30

    Modern neuroscientific research stands on the shoulders of countless giants. PubMed alone contains more than 21 million peer-reviewed articles with 40-50,000 more published every month. Understanding the human brain, cognition, and disease will require integrating facts from dozens of scientific fields spread amongst millions of studies locked away in static documents, making any such integration daunting, at best. The future of scientific progress will be aided by bridging the gap between the millions of published research articles and modern databases such as the Allen brain atlas (ABA). To that end, we have analyzed the text of over 3.5 million scientific abstracts to find associations between neuroscientific concepts. From the literature alone, we show that we can blindly and algorithmically extract a "cognome": relationships between brain structure, function, and disease. We demonstrate the potential of data-mining and cross-platform data-integration with the ABA by introducing two methods for semi-automated hypothesis generation. By analyzing statistical "holes" and discrepancies in the literature we can find understudied or overlooked research paths. That is, we have added a layer of semi-automation to a part of the scientific process itself. This is an important step toward fundamentally incorporating data-mining algorithms into the scientific method in a manner that is generalizable to any scientific or medical field.
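
    The core of the "cognome" extraction described above is counting how often pairs of neuroscientific terms co-occur across abstracts; a minimal co-occurrence counter is sketched below with an invented term list and abstracts, whereas the actual study adds statistical testing on top of counts from millions of abstracts.

    from collections import Counter
    from itertools import combinations

    # Placeholder vocabulary of brain structures, functions and diseases.
    TERMS = {"hippocampus", "memory", "dopamine", "parkinson"}

    abstracts = [
        "The hippocampus supports episodic memory consolidation.",
        "Dopamine loss in Parkinson disease alters basal ganglia output.",
        "Hippocampus volume predicts memory performance.",
    ]

    pair_counts = Counter()
    for abstract in abstracts:
        found = {t for t in TERMS if t in abstract.lower()}
        # Count every unordered pair of terms mentioned in the same abstract.
        for pair in combinations(sorted(found), 2):
            pair_counts[pair] += 1

    print(pair_counts.most_common())
    # [(('hippocampus', 'memory'), 2), (('dopamine', 'parkinson'), 1)]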

  19. Mining a database of single amplified genomes from Red Sea brine pool extremophiles – Improving reliability of gene function prediction using a profile and pattern matching algorithm (PPMA)

    Directory of Open Access Journals (Sweden)

    Stefan Wolfgang Grötzinger

    2014-04-01

    Full Text Available Reliable functional annotation of genomic data is the key step in the discovery of novel enzymes. Intrinsic sequencing data quality problems of single amplified genomes (SAGs) and poor homology of novel extremophiles' genomes pose significant challenges for the attribution of functions to the coding sequences identified. The anoxic deep-sea brine pools of the Red Sea are a promising source of novel enzymes with unique evolutionary adaptation. Sequencing data from Red Sea brine pool cultures and SAGs are annotated and stored in the INDIGO data warehouse. Low sequence homology of annotated genes (no similarity for 35% of these genes) may translate into false positives when searching for specific functions. The Profile & Pattern Matching (PPM) strategy described here was developed to eliminate false positive annotations of enzyme function before progressing to labor-intensive hyper-saline gene expression and characterization. It utilizes InterPro-derived Gene Ontology (GO) terms (which represent enzyme function profiles) and annotated relevant PROSITE IDs (which are linked to an amino acid consensus pattern). The PPM algorithm was tested on 15 protein families, which were selected based on scientific and commercial potential. An initial list of 2,577 E.C. numbers was translated into 171 GO terms and 49 consensus patterns. A subset of INDIGO sequences consisting of 58 SAGs from six different taxa of bacteria and archaea was selected from 6 different brine pool environments. Those SAGs code for 74,516 genes, which were independently scanned for the GO terms (profile filter) and PROSITE IDs (pattern filter). Following stringent reliability filtering, the non-redundant hits (106 profile hits and 147 pattern hits) are classified as reliable if at least two relevant descriptors (GO terms and/or consensus patterns) are present. Scripts for annotation, as well as for the PPM algorithm, are available through the INDIGO website.

  20. Use of Genome-Wide Expression Data to Mine the “Gray Zone” of GWA Studies Leads to Novel Candidate Obesity Genes

    Science.gov (United States)

    Naukkarinen, Jussi; Surakka, Ida; Pietiläinen, Kirsi H.; Rissanen, Aila; Salomaa, Veikko; Ripatti, Samuli; Yki-Järvinen, Hannele; van Duijn, Cornelia M.; Wichmann, H.-Erich; Kaprio, Jaakko; Taskinen, Marja-Riitta; Peltonen, Leena

    2010-01-01

    To get beyond the “low-hanging fruits” so far identified by genome-wide association (GWA) studies, new methods must be developed in order to discover the numerous remaining genes that estimates of heritability indicate should be contributing to complex human phenotypes, such as obesity. Here we describe a novel integrative method for complex disease gene identification utilizing both genome-wide transcript profiling of adipose tissue samples and consequent analysis of genome-wide association data generated in large SNP scans. We infer causality of genes with obesity by employing a unique set of monozygotic twin pairs discordant for BMI (n = 13 pairs, age 24–28 years, 15.4 kg mean weight difference) and contrast the transcript profiles with those from a larger sample of non-related adult individuals (N = 77). Using this approach, we were able to identify 27 genes with possibly causal roles in determining the degree of human adiposity. Testing for association of SNP variants in these 27 genes in the population samples of the large ENGAGE consortium (N = 21,000) revealed a significant deviation of P-values from the expected (P = 4×10⁻⁴). A total of 13 genes contained SNPs nominally associated with BMI. The top finding was blood coagulation factor F13A1 identified as a novel obesity gene also replicated in a second GWA set of ∼2,000 individuals. This study presents a new approach to utilizing gene expression studies for informing choice of candidate genes for complex human phenotypes, such as obesity. PMID:20532202

  1. Use of genome-wide expression data to mine the "Gray Zone" of GWA studies leads to novel candidate obesity genes.

    Directory of Open Access Journals (Sweden)

    Jussi Naukkarinen

    2010-06-01

    Full Text Available To get beyond the "low-hanging fruits" so far identified by genome-wide association (GWA) studies, new methods must be developed in order to discover the numerous remaining genes that estimates of heritability indicate should be contributing to complex human phenotypes, such as obesity. Here we describe a novel integrative method for complex disease gene identification utilizing both genome-wide transcript profiling of adipose tissue samples and consequent analysis of genome-wide association data generated in large SNP scans. We infer causality of genes with obesity by employing a unique set of monozygotic twin pairs discordant for BMI (n = 13 pairs, age 24-28 years, 15.4 kg mean weight difference) and contrast the transcript profiles with those from a larger sample of non-related adult individuals (N = 77). Using this approach, we were able to identify 27 genes with possibly causal roles in determining the degree of human adiposity. Testing for association of SNP variants in these 27 genes in the population samples of the large ENGAGE consortium (N = 21,000) revealed a significant deviation of P-values from the expected (P = 4×10⁻⁴). A total of 13 genes contained SNPs nominally associated with BMI. The top finding was blood coagulation factor F13A1 identified as a novel obesity gene also replicated in a second GWA set of approximately 2,000 individuals. This study presents a new approach to utilizing gene expression studies for informing choice of candidate genes for complex human phenotypes, such as obesity.

  2. Status and Outlook of Key Technologies for Automated Coal Mining Faces in Thin Seams

    Institute of Scientific and Technical Information of China (English)

    田成金

    2011-01-01

    For the shearer-based and plough-based automated coal mining faces used in thin seams in China, the application status of the related key automation technologies and the factors restricting them are briefly analyzed and summarized. The key automated mining technologies that urgently need breakthroughs in current Chinese automated working faces are analyzed, including coal-rock interface identification, automatic alignment of working face equipment, posture detection and control, plough head detection, and plough head control. Technical problems that need to be solved in the next stage of development of automated working faces are identified, their current research status is summarized, and further research directions are proposed.

  3. Automated Budget System

    Data.gov (United States)

    Department of Transportation — The Automated Budget System (ABS) automates management and planning of the Mike Monroney Aeronautical Center (MMAC) budget by providing enhanced capability to plan,...

  4. Genomic data mining of the marine actinobacteria Streptomyces sp. H-KF8 unveils insights into multi-stress related genes and metabolic pathways involved in antimicrobial synthesis

    Directory of Open Access Journals (Sweden)

    Agustina Undabarrena

    2017-02-01

    Full Text Available Streptomyces sp. H-KF8 is an actinobacterial strain isolated from marine sediments of a Chilean Patagonian fjord. Morphological characterization together with antibacterial activity was assessed in various culture media, revealing a carbon-source dependent activity mainly against Gram-positive bacteria (S. aureus and L. monocytogenes). Genome mining of this antibacterial-producing bacterium revealed the presence of 26 biosynthetic gene clusters (BGCs) for secondary metabolites, of which 81% have low similarities with known BGCs. In addition, a genomic search in Streptomyces sp. H-KF8 unveiled the presence of a wide variety of genetic determinants related to heavy metal resistance (49 genes), oxidative stress (69 genes) and antibiotic resistance (97 genes). This study revealed that the marine-derived Streptomyces sp. H-KF8 bacterium has the capability to tolerate a diverse set of heavy metals such as copper, cobalt, mercury, chromate and nickel, as well as the highly toxic tellurite, a feature described for the first time in Streptomyces. In addition, Streptomyces sp. H-KF8 possesses a major resistance towards oxidative stress, in comparison to the soil reference strain Streptomyces violaceoruber A3(2). Moreover, Streptomyces sp. H-KF8 showed resistance to 88% of the antibiotics tested, indicating, overall, a strong response to several abiotic stressors. The combination of these biological traits confirms the metabolic versatility of Streptomyces sp. H-KF8, a genetically well-prepared microorganism with the ability to confront the dynamics of the fjord-unique marine environment.

  5. Genomic data mining of the marine actinobacteria Streptomyces sp. H-KF8 unveils insights into multi-stress related genes and metabolic pathways involved in antimicrobial synthesis

    Science.gov (United States)

    Undabarrena, Agustina; Ugalde, Juan A.; Seeger, Michael

    2017-01-01

    Streptomyces sp. H-KF8 is an actinobacterial strain isolated from marine sediments of a Chilean Patagonian fjord. Morphological characterization together with antibacterial activity was assessed in various culture media, revealing a carbon-source dependent activity mainly against Gram-positive bacteria (S. aureus and L. monocytogenes). Genome mining of this antibacterial-producing bacterium revealed the presence of 26 biosynthetic gene clusters (BGCs) for secondary metabolites, of which 81% have low similarities with known BGCs. In addition, a genomic search in Streptomyces sp. H-KF8 unveiled the presence of a wide variety of genetic determinants related to heavy metal resistance (49 genes), oxidative stress (69 genes) and antibiotic resistance (97 genes). This study revealed that the marine-derived Streptomyces sp. H-KF8 bacterium has the capability to tolerate a diverse set of heavy metals such as copper, cobalt, mercury, chromate and nickel, as well as the highly toxic tellurite, a feature described for the first time in Streptomyces. In addition, Streptomyces sp. H-KF8 possesses a major resistance towards oxidative stress, in comparison to the soil reference strain Streptomyces violaceoruber A3(2). Moreover, Streptomyces sp. H-KF8 showed resistance to 88% of the antibiotics tested, indicating, overall, a strong response to several abiotic stressors. The combination of these biological traits confirms the metabolic versatility of Streptomyces sp. H-KF8, a genetically well-prepared microorganism with the ability to confront the dynamics of the fjord-unique marine environment. PMID:28229018

  6. Mining whole genomes and transcriptomes of Jatropha (Jatropha curcas) and Castor bean (Ricinus communis) for NBS-LRR genes and defense response associated transcription factors.

    Science.gov (United States)

    Sood, Archit; Jaiswal, Varun; Chanumolu, Sree Krishna; Malhotra, Nikhil; Pal, Tarun; Chauhan, Rajinder Singh

    2014-11-01

    Jatropha (Jatropha curcas L.) and Castor bean (Ricinus communis) are oilseed crops of the family Euphorbiaceae with the potential of producing high quality biodiesel and having industrial value. Both bioenergy plants are becoming susceptible to various biotic stresses directly affecting oil quality and content. No report exists as of today on analysis of the Nucleotide Binding Site-Leucine Rich Repeat (NBS-LRR) gene repertoire and defense response transcription factors in these plant species. In silico analysis of whole genomes and transcriptomes identified 47 new NBS-LRR genes in the two species and 122 and 318 defense response related transcription factors in Jatropha and Castor bean, respectively. The identified NBS-LRR genes and defense response transcription factors were mapped onto the respective genomes. Common and unique NBS-LRR genes and defense related transcription factors were identified in both plant species. All NBS-LRR genes in both species were classified into Toll/interleukin-1 receptor NBS-LRRs (TNLs) and coiled-coil NBS-LRRs (CNLs) and characterized in terms of position on contigs, gene clusters, and motif and domain distribution. Transcript abundance or expression values were measured for all NBS-LRR genes and defense response transcription factors, suggesting their functional role. The current study provides a repertoire of NBS-LRR genes and transcription factors which can be used not only in dissecting the molecular basis of the disease resistance phenotype but also in developing disease resistant genotypes in Jatropha and Castor bean through transgenic or molecular breeding approaches.

  7. Genome mining of lipolytic exoenzymes from Bacillus safensis S9 and Pseudomonas alcaliphila ED1 isolated from a dairy wastewater lagoon.

    Science.gov (United States)

    Ficarra, Florencia A; Santecchia, Ignacio; Lagorio, Sebastián H; Alarcón, Sergio; Magni, Christian; Espariz, Martín

    2016-11-01

    Dairy production plants produce highly polluted wastewaters rich in organic molecules such as lactose, proteins and fats. Fats generally lead to low overall performance of the treatment system. In this study, a wastewater dairy lagoon was used as a microbial source and different screening strategies were conducted to select 58 lipolytic microorganisms. Exoenzyme and RAPD analyses revealed genetic and phenotypic diversity among the isolates. Bacillus safensis, Pseudomonas alcaliphila and the potential pathogens B. cereus, Aeromonas and Acinetobacter were identified by 16S-rRNA, gyrA, oprI and/or oprL sequence analyses. Five out of 10 selected isolates produced lipolytic enzymes and grew in dairy wastewater. Based on these abilities and their safety, B. safensis S9 and P. alcaliphila ED1 were selected and their genome sequences determined. The genomes of strains S9 and ED1 consisted of 3,794,315 and 5,239,535 bp and encoded 3990 and 4844 genes, respectively. Putative extracellular enzymes with lipolytic (12 and 16), proteolytic (20) or hydrolytic (10 and 15) activity were identified for the S9 and ED1 strains, respectively. These bacteria also encoded other technologically relevant proteins such as amylases, proteases, glucanases, xylanases and pectate lyases.

  8. Automation 2017

    CERN Document Server

    Zieliński, Cezary; Kaliczyńska, Małgorzata

    2017-01-01

    This book consists of papers presented at Automation 2017, an international conference held in Warsaw from March 15 to 17, 2017. It discusses research findings associated with the concepts behind INDUSTRY 4.0, with a focus on offering a better understanding of and promoting participation in the Fourth Industrial Revolution. Each chapter presents a detailed analysis of a specific technical problem, in most cases followed by a numerical analysis, simulation and description of the results of implementing the solution in a real-world context. The theoretical results, practical solutions and guidelines presented are valuable for both researchers working in the area of engineering sciences and practitioners looking for solutions to industrial problems.

  9. Marketing automation

    Directory of Open Access Journals (Sweden)

    TODOR Raluca Dania

    2017-01-01

    Full Text Available The automation of the marketing process seems to be, nowadays, the only solution to face the major changes brought by the fast evolution of technology and the continuous increase in supply and demand. In order to achieve the desired marketing results, businesses have to employ digital marketing and communication services. These services are efficient and measurable thanks to the marketing technology used to track, score and implement each campaign. Due to technical progress, marketing fragmentation and the demand for customized products and services on one side, and the need to achieve constructive dialogue with customers, immediate and flexible response, and the necessity to measure investments and results on the other side, the classical marketing approach has changed and continues to improve substantially.

  10. Beegle: from literature mining to disease-gene discovery.

    Science.gov (United States)

    ElShal, Sarah; Tranchevent, Léon-Charles; Sifrim, Alejandro; Ardeshirdavani, Amin; Davis, Jesse; Moreau, Yves

    2016-01-29

    Disease-gene identification is a challenging process that has multiple applications within functional genomics and personalized medicine. Typically, this process involves both finding genes known to be associated with the disease (through literature search) and carrying out preliminary experiments or screens (e.g. linkage or association studies, copy number analyses, expression profiling) to determine a set of promising candidates for experimental validation. This requires extensive time and monetary resources. We describe Beegle, an online search and discovery engine that attempts to simplify this process by automating the typical approaches. It starts by mining the literature to quickly extract a set of genes known to be linked with a given query, then it integrates the learning methodology of Endeavour (a gene prioritization tool) to train a genomic model and rank a set of candidate genes to generate novel hypotheses. In a realistic evaluation setup, Beegle has an average recall of 84% in the top 100 returned genes as a search engine, which improves the discovery engine by 12.6% in the top 5% prioritized genes. Beegle is publicly available at http://beegle.esat.kuleuven.be/.

  11. ONTOLOGY BASED DATA MINING METHODOLOGY FOR DISCRIMINATION PREVENTION

    Directory of Open Access Journals (Sweden)

    Nandana Nagabhushana

    2014-09-01

    Full Text Available Data mining is being increasingly used in the field of automation of decision making processes, which involve extraction and discovery of information hidden in large volumes of collected data. Nonetheless, negative perceptions such as privacy invasion and potential discrimination hinder the use of data mining methodologies in software systems employing automated decision making. Loan granting, employment, insurance premium calculation, admissions in educational institutions, etc., can make use of data mining to effectively prevent human biases pertaining to certain attributes like gender, nationality and race in critical decision making. The proposed methodology prevents discriminatory rules ensuing from the presence of certain information regarding sensitive discriminatory attributes in the data itself. The proposal is novel in two respects: first, the rule mining technique is based on ontologies; second, mined rules that are quantified as discriminatory are generalized and transformed into non-discriminatory ones.

  12. Reference Based Genome Compression

    CERN Document Server

    Chern, Bobbie; Manolakos, Alexandros; No, Albert; Venkat, Kartik; Weissman, Tsachy

    2012-01-01

    DNA sequencing technology has advanced to a point where storage is becoming the central bottleneck in the acquisition and mining of more data. Large amounts of data are vital for genomics research, and generic compression tools, while viable, cannot offer the same savings as approaches tuned to inherent biological properties. We propose an algorithm to compress a target genome given a known reference genome. The proposed algorithm first generates a mapping from the reference to the target genome, and then compresses this mapping with an entropy coder. As an illustration of the performance: applying our algorithm to James Watson's genome with hg18 as a reference, we are able to reduce the 2991 megabyte (MB) genome down to 6.99 MB, while Gzip compresses it to 834.8 MB.
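
    The two-stage idea (map the target onto the reference, then entropy-code the mapping) can be sketched with standard-library tools; the snippet below is a toy stand-in rather than the authors' algorithm, using difflib to build the mapping and zlib as a generic entropy coder, on sequences far shorter than real genomes.

    import difflib
    import json
    import zlib

    def compress_against_reference(reference: str, target: str) -> bytes:
        """Encode only the edit operations needed to rebuild the target from the reference."""
        ops = difflib.SequenceMatcher(None, reference, target).get_opcodes()
        # Keep replacement text only where the target differs from the reference.
        mapping = [(tag, i1, i2, target[j1:j2] if tag != "equal" else "")
                   for tag, i1, i2, j1, j2 in ops]
        return zlib.compress(json.dumps(mapping).encode())

    def decompress(reference: str, blob: bytes) -> str:
        pieces = []
        for tag, i1, i2, repl in json.loads(zlib.decompress(blob)):
            pieces.append(reference[i1:i2] if tag == "equal" else repl)
        return "".join(pieces)

    ref = "ACGTACGTACGTTTGACCA"
    tgt = "ACGTACGAACGTTTGACCA"   # one substitution relative to the reference
    blob = compress_against_reference(ref, tgt)
    assert decompress(ref, blob) == tgt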

  13. Coal Mines, Active - Longwall Mining Panels

    Data.gov (United States)

    NSGIC GIS Inventory (aka Ramona) — Coal mining has occurred in Pennsylvania for over a century. A method of coal mining known as Longwall Mining has become more prevalent in recent decades. Longwall...

  14. Automated Event Service: Efficient and Flexible Searching for Earth Science Phenomena Project

    Data.gov (United States)

    National Aeronautics and Space Administration — Develop an Automated Event Service system that: Methodically mines custom-defined events in the reanalysis data sets of global atmospheric models. Enables...

  15. Data mining strategies to improve multiplex microbead immunoassay tolerance in a mouse model of infectious diseases.

    Science.gov (United States)

    Mani, Akshay; Ravindran, Resmi; Mannepalli, Soujanya; Vang, Daniel; Luciw, Paul A; Hogarth, Michael; Khan, Imran H; Krishnan, Viswanathan V

    2015-01-01

    Multiplex methodologies, especially those with high-throughput capabilities, generate large volumes of data. Accumulation of such data (e.g., genomics, proteomics, metabolomics etc.) is fast becoming more common and thus requires the development and implementation of effective data mining strategies designed for biological and clinical applications. Multiplex microbead immunoassay (MMIA), on the xMAP or MagPix platform (Luminex), which is amenable to automation, offers a major advantage over conventional methods such as Western blot or ELISA for increasing the efficiency of serodiagnosis of infectious diseases. MMIA allows detection of antibodies and/or antigens efficiently for a wide range of infectious agents simultaneously in host blood samples, in one reaction vessel. In the process, MMIA generates large volumes of data. In this report we demonstrate how the application of data mining tools to these inherently large data volumes can improve assay tolerance (measured in terms of sensitivity and specificity), through analysis of experimental data accumulated over a span of two years. The combination of prior knowledge with machine learning tools provides an efficient approach to improve the diagnostic power of the assay on a continuous basis. Furthermore, this study provides an in-depth knowledge base for studying pathological trends of infectious agents in mouse colonies on a multivariate scale. Data mining techniques using serodetection of infections in mice, developed in this study, can be used as a general model for more complex applications in epidemiology and clinical translational research.
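
    Assay tolerance is measured here as sensitivity and specificity; a minimal helper for computing both from labelled assay calls is sketched below, with invented truth labels and calls, while the study itself layers machine-learning classifiers on top of such measures.

    def sensitivity_specificity(truth, calls):
        """truth/calls are equal-length sequences of booleans: True = infected / positive call."""
        tp = sum(t and c for t, c in zip(truth, calls))
        tn = sum((not t) and (not c) for t, c in zip(truth, calls))
        fp = sum((not t) and c for t, c in zip(truth, calls))
        fn = sum(t and (not c) for t, c in zip(truth, calls))
        sensitivity = tp / (tp + fn) if (tp + fn) else float("nan")
        specificity = tn / (tn + fp) if (tn + fp) else float("nan")
        return sensitivity, specificity

    # Toy example: six mouse samples with one false negative and one false positive.
    truth = [True, True, True, False, False, False]
    calls = [True, True, False, True, False, False]
    print(sensitivity_specificity(truth, calls))   # approximately (0.667, 0.667)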

  16. Bovine Genome Database: new tools for gleaning function from the Bos taurus genome.

    Science.gov (United States)

    Elsik, Christine G; Unni, Deepak R; Diesh, Colin M; Tayal, Aditi; Emery, Marianne L; Nguyen, Hung N; Hagen, Darren E

    2016-01-01

    We report an update of the Bovine Genome Database (BGD) (http://BovineGenome.org). The goal of BGD is to support bovine genomics research by providing genome annotation and data mining tools. We have developed new genome and annotation browsers using JBrowse and WebApollo for two Bos taurus genome assemblies, the reference genome assembly (UMD3.1.1) and the alternate genome assembly (Btau_4.6.1). Annotation tools have been customized to highlight priority genes for annotation, and to aid annotators in selecting gene evidence tracks from 91 tissue specific RNAseq datasets. We have also developed BovineMine, based on the InterMine data warehousing system, to integrate the bovine genome, annotation, QTL, SNP and expression data with external sources of orthology, gene ontology, gene interaction and pathway information. BovineMine provides powerful query building tools, as well as customized query templates, and allows users to analyze and download genome-wide datasets. With BovineMine, bovine researchers can use orthology to leverage the curated gene pathways of model organisms, such as human, mouse and rat. BovineMine will be especially useful for gene ontology and pathway analyses in conjunction with GWAS and QTL studies.
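
    BovineMine is built on InterMine, which also ships a Python client; a hedged sketch of a programmatic query is shown below. The service URL, class name and constraint value are illustrative assumptions, so the BovineMine documentation should be checked for the exact endpoint and data model before use.

    from itertools import islice
    from intermine.webservice import Service

    # Assumed endpoint; verify against the BovineMine site before relying on it.
    service = Service("http://bovinegenome.org/bovinemine/service")

    query = service.new_query("Gene")
    query.add_view("primaryIdentifier", "symbol", "chromosome.primaryIdentifier")
    query.add_constraint("symbol", "=", "LEP", code="A")   # hypothetical gene of interest

    # Print the first few matching rows.
    for row in islice(query.rows(), 5):
        print(row)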

  17. Data mining

    CERN Document Server

    Gorunescu, Florin

    2011-01-01

    The knowledge discovery process is as old as Homo sapiens. Until some time ago, this process was solely based on the 'natural personal' computer provided by Mother Nature. Fortunately, in recent decades the problem has begun to be solved based on the development of the Data mining technology, aided by the huge computational power of the 'artificial' computers. Digging intelligently in different large databases, data mining aims to extract implicit, previously unknown and potentially useful information from data, since 'knowledge is power'. The goal of this book is to provide, in a friendly way

  18. Mining Review

    Science.gov (United States)

    ,

    2013-01-01

    In 2012, the estimated value of mineral production increased in the United States for the third consecutive year. Production and prices increased for most industrial mineral commodities mined in the United States. While production for most metals remained relatively unchanged, with the notable exception of gold, the prices for most metals declined. Minerals remained fundamental to the U.S. economy, contributing to the real gross domestic product (GDP) at several levels, including mining, processing and manufacturing finished products. Minerals’ contribution to the GDP increased for the second consecutive year.

  19. Data mining and education.

    Science.gov (United States)

    Koedinger, Kenneth R; D'Mello, Sidney; McLaughlin, Elizabeth A; Pardos, Zachary A; Rosé, Carolyn P

    2015-01-01

    An emerging field of educational data mining (EDM) is building on and contributing to a wide variety of disciplines through analysis of data coming from various educational technologies. EDM researchers are addressing questions of cognition, metacognition, motivation, affect, language, social discourse, etc. using data from intelligent tutoring systems, massive open online courses, educational games and simulations, and discussion forums. The data include detailed action and timing logs of student interactions in user interfaces such as graded responses to questions or essays, steps in rich problem solving environments, games or simulations, discussion forum posts, or chat dialogs. They might also include external sensors such as eye tracking, facial expression, body movement, etc. We review how EDM has addressed the research questions that surround the psychology of learning with an emphasis on assessment, transfer of learning and model discovery, the role of affect, motivation and metacognition on learning, and analysis of language data and collaborative learning. For example, we discuss (1) how different statistical assessment methods were used in a data mining competition to improve prediction of student responses to intelligent tutor tasks, (2) how better cognitive models can be discovered from data and used to improve instruction, (3) how data-driven models of student affect can be used to focus discussion in a dialog-based tutoring system, and (4) how machine learning techniques applied to discussion data can be used to produce automated agents that support student learning as they collaborate in a chat room or a discussion board.

  20. A genome-wide survey of maize lipid-related genes: candidate gene mining, digital gene expression profiling and co-location with QTL for maize kernel oil

    Institute of Scientific and Technical Information of China (English)

    2010-01-01

    Lipids play an important role in plants due to their abundance and their extensive participation in many metabolic processes. Genes involved in lipid metabolism have been extensively studied in Arabidopsis and other plant species. In this study, a total of 1003 maize lipid-related genes were cloned and annotated, including 42 genes with experimental validation, 732 genes with full-length cDNA and protein sequences in public databases and 229 newly cloned genes. Ninety-seven maize lipid-related genes with tissue-preferential expression were discovered by in silico gene expression profiling based on 1,984,483 maize Expressed Sequence Tags collected from 182 cDNA libraries. Meanwhile, 70 QTL clusters for maize kernel oil were identified, covering 34.5% of the maize genome. Fifty-nine (84%) QTL clusters co-located with at least one lipid-related gene, and the total number of these genes amounted to 147. Interestingly, thirteen genes with kernel-preferential expression profiles fell within QTL clusters for maize kernel oil content. All the maize lipid-related genes identified here may provide good targets for maize kernel oil QTL cloning and thus help us to better understand the molecular mechanism of maize kernel oil accumulation.

  1. Mining Method

    Energy Technology Data Exchange (ETDEWEB)

    Kim, Young Shik; Lee, Kyung Woon; Kim, Oak Hwan; Kim, Dae Kyung [Korea Institute of Geology Mining and Materials, Taejon (Korea, Republic of)

    1996-12-01

    The shrinking coal market has been forcing the coal industry to make exceptional rationalization and restructuring efforts since the end of the eighties. To the competition from crude oil and natural gas has been added the growing pressure from rising wages and rising production costs as the workings get deeper. To improve the competitive position of the coal mines against oil and gas through cost reduction, studies to improve the mining system have been carried out. To find the fields requiring improvement most, the technologies used in Tae Bak Colliery, which was selected as one of the long-running mines, were investigated and analyzed. The mining method appeared to be the field most in need of improvement to reduce production cost. The present method, the so-called inseam roadway caving method, is currently used to extract the steep and thick seam. However, this method has several drawbacks. To solve the problems, two mining methods are suggested, one for the long term and one for the short term. The inseam roadway caving method with long-hole blasting is a variant of the present inseam roadway caving method, modified by replacing timber sets with steel arch sets and the shovel loaders with chain conveyors; long-hole blasting is introduced to promote caving. The pillar caving method with chock supports uses chock supports set in the cross-cut from the hanging wall to the footwall. Two single chain conveyors are needed: one is installed in front of the chock supports to clear coal from the cutting face, and the other is installed behind the supports to transport caved coal from behind. This method is superior to the previous one in terms of safety from water inrushes, production rate and productivity. The only drawback is that it needs more investment. (author). 14 tabs., 34 figs.

  2. Data mining concepts and techniques

    CERN Document Server

    Han, Jiawei

    2005-01-01

    Our ability to generate and collect data has been increasing rapidly. Not only are all of our business, scientific, and government transactions now computerized, but the widespread use of digital cameras, publication tools, and bar codes also generates data. On the collection side, scanned text and image platforms, satellite remote sensing systems, and the World Wide Web have flooded us with a tremendous amount of data. This explosive growth has generated an even more urgent need for new techniques and automated tools that can help us transform this data into useful information and knowledge. Like the first edition, voted the most popular data mining book by KD Nuggets readers, this book explores concepts and techniques for the discovery of patterns hidden in large data sets, focusing on issues relating to their feasibility, usefulness, effectiveness, and scalability. However, since the publication of the first edition, great progress has been made in the development of new data mining methods, systems, and app...

  3. Analyzing and mining image databases.

    Science.gov (United States)

    Berlage, Thomas

    2005-06-01

    Image mining is the application of computer-based techniques that extract and exploit information from large image sets to support human users in generating knowledge from these sources. This review focuses on biomedical applications, in particular automated imaging at the cellular level. An image database is an interactive software application that combines data management, image analysis and visual data mining. The main characteristic of such a system is a layer that represents objects within an image, and that represents a large spectrum of quantitative and semantic object features. The image analysis needs to be adapted to each particular experiment, so 'end-user programming' will be desirable to make the technology more widely applicable.

  4. Planning the Mine and Mining the Plan

    Science.gov (United States)

    Boucher, D. S.; Chen, N.

    2016-11-01

    Overview of best practices used in the terrestrial mining industry when developing a mine site towards production. The intent is to guide planners towards an effective and well constructed roadmap for the development of ISRU mining activities. A strawman scenario is presented as an illustration for lunar mining of water ice.

  5. In silico genome wide mining of conserved and novel miRNAs in the brain and pineal gland of Danio rerio using small RNA sequencing data.

    Science.gov (United States)

    Agarwal, Suyash; Nagpure, Naresh Sahebrao; Srivastava, Prachi; Kushwaha, Basdeo; Kumar, Ravindra; Pandey, Manmohan; Srivastava, Shreya

    2016-03-01

    MicroRNAs (miRNAs) are small, non-coding RNA molecules that bind to the mRNA of the target genes and regulate the expression of the gene at the post-transcriptional level. Zebrafish is an economically important freshwater fish species globally considered as a good predictive model for studying human diseases and development. The present study focused on uncovering known as well as novel miRNAs, target prediction of the novel miRNAs and the differential expression of the known miRNA using the small RNA sequencing data of the brain and pineal gland (dark and light treatments) obtained from NCBI SRA. A total of 165, 151 and 145 known zebrafish miRNAs were found in the brain, pineal gland (dark treatment) and pineal gland (light treatment), respectively. Chromosomes 4 and 5 of zebrafish reference assembly GRCz10 were found to contain maximum number of miR genes. The miR-181a and miR-182 were found to be highly expressed in terms of number of reads in the brain and pineal gland, respectively. Other ncRNAs, such as tRNA, rRNA and snoRNA, were curated against Rfam. Using GRCz10 as reference, the subsequent bioinformatic analyses identified 25, 19 and 9 novel miRNAs from the brain, pineal gland (dark treatment) and pineal gland (light treatment), respectively. Targets of the novel miRNAs were identified, based on sequence complementarity between miRNAs and mRNA, by searching for antisense hits in the 3'-UTR of reference RNA sequences of the zebrafish. The discovery of novel miRNAs and their targets in the zebrafish genome can be a valuable scientific resource for further functional studies not only in zebrafish but also in other economically important fishes.
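
    Target prediction by sequence complementarity, as used above, amounts to searching 3'-UTRs for the antisense of the miRNA seed region; the simplified sketch below uses invented sequences and an exact-match rule, whereas real tools add mismatch tolerance and thermodynamic scoring.

    COMPLEMENT = str.maketrans("ACGU", "UGCA")

    def reverse_complement(rna: str) -> str:
        return rna.translate(COMPLEMENT)[::-1]

    def find_seed_targets(mirna: str, utr: str, seed_len: int = 7):
        """Return 0-based UTR positions that are exactly antisense to the miRNA seed."""
        # Seed region: positions 2-8 of the miRNA (1-based numbering).
        seed = mirna[1:1 + seed_len]
        site = reverse_complement(seed)
        return [i for i in range(len(utr) - len(site) + 1) if utr[i:i + len(site)] == site]

    mirna = "UGGAAUGUAAAGAAGUAUGUAU"          # miR-1-like sequence, for illustration only
    utr = "AAAACAUUCCAAGGGACAUUCCUUU"         # invented 3'-UTR fragment
    print(find_seed_targets(mirna, utr))      # [3, 15]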

  6. Text mining for the biocuration workflow.

    Science.gov (United States)

    Hirschman, Lynette; Burns, Gully A P C; Krallinger, Martin; Arighi, Cecilia; Cohen, K Bretonnel; Valencia, Alfonso; Wu, Cathy H; Chatr-Aryamontri, Andrew; Dowell, Karen G; Huala, Eva; Lourenço, Anália; Nash, Robert; Veuthey, Anne-Lise; Wiegers, Thomas; Winter, Andrew G

    2012-01-01

    Molecular biology has become heavily dependent on biological knowledge encoded in expert curated biological databases. As the volume of biological literature increases, biocurators need help in keeping up with the literature; (semi-) automated aids for biocuration would seem to be an ideal application for natural language processing and text mining. However, to date, there have been few documented successes for improving biocuration throughput using text mining. Our initial investigations took place for the workshop on 'Text Mining for the BioCuration Workflow' at the third International Biocuration Conference (Berlin, 2009). We interviewed biocurators to obtain workflows from eight biological databases. This initial study revealed high-level commonalities, including (i) selection of documents for curation; (ii) indexing of documents with biologically relevant entities (e.g. genes); and (iii) detailed curation of specific relations (e.g. interactions); however, the detailed workflows also showed many variabilities. Following the workshop, we conducted a survey of biocurators. The survey identified biocurator priorities, including the handling of full text indexed with biological entities and support for the identification and prioritization of documents for curation. It also indicated that two-thirds of the biocuration teams had experimented with text mining and almost half were using text mining at that time. Analysis of our interviews and survey provide a set of requirements for the integration of text mining into the biocuration workflow. These can guide the identification of common needs across curated databases and encourage joint experimentation involving biocurators, text mining developers and the larger biomedical research community.

  7. Application of Modern Tools and Techniques for Mine Safety & Disaster Management

    Science.gov (United States)

    Kumar, Dheeraj

    2016-04-01

    The implementation of novel systems and the adoption of improved equipment in mines help mining companies in two important ways: enhanced mine productivity and improved worker safety. There is a substantial need for the adoption of state-of-the-art automation technologies in mines to ensure the safety and protect the health of mine workers. With the advent of new autonomous equipment used in the mine, inefficiencies are reduced by limiting human inconsistencies and error. The desired increase in productivity at a mine can sometimes be achieved by changing only a few simple variables. Significant developments have been made in the areas of surface and underground communication, robotics, smart sensors, tracking systems, mine gas monitoring systems, ground movements, etc. Advancements in information technology in the form of the internet, GIS, remote sensing, satellite communication, etc., have proved to be important tools for hazard reduction and disaster management. This paper is mainly focused on issues pertaining to mine safety and disaster management and some of the recent innovations in mine automation that could be deployed in mines for safe mining operations and for avoiding any unforeseen mine disaster.

  8. Manufacturing and automation

    Directory of Open Access Journals (Sweden)

    Ernesto Córdoba Nieto

    2010-04-01

    Full Text Available The article presents concepts and definitions from different sources concerning automation. The work approaches automation by virtue of the author's experience in manufacturing production; why and how automation projects are embarked upon is considered. Technological reflection regarding the progressive advances or stages of automation in the production area is stressed. Coriat and Freyssenet's thoughts about and approaches to the problem of automation and its current state are taken and examined, especially those referring to the problem's relationship with reconciling the level of automation with the flexibility and productivity demanded by competitive, worldwide manufacturing.

  9. Development of opencast mines

    Energy Technology Data Exchange (ETDEWEB)

    Szebenyi, F.

    1987-01-01

    The role and work of the Central Institute for Mining Development and its legal predecessors, the Mining Research Institute and the Mines Design Institute, in relation to opencast lignite mining in Hungary are summarized. Investigations aimed at determining the heating properties of lignites are reviewed. Different lignite mines, their geological features, production possibilities and development conditions are outlined.

  10. Solutions for data integration in functional genomics: a critical assessment and case study.

    Science.gov (United States)

    Smedley, Damian; Swertz, Morris A; Wolstencroft, Katy; Proctor, Glenn; Zouberakis, Michael; Bard, Jonathan; Hancock, John M; Schofield, Paul

    2008-11-01

    The torrent of data emerging from the application of new technologies to functional genomics and systems biology can no longer be contained within the traditional modes of data sharing and publication, with the consequence that data are being deposited in, distributed across and disseminated through an increasing number of databases. The resulting fragmentation poses serious problems for the model organism community, which increasingly relies on data mining and computational approaches that require gathering of data from a range of sources. In the light of these problems, the European Commission has funded a coordination action, CASIMIR (coordination and sustainability of international mouse informatics resources), with a remit to assess the technical and social aspects of database interoperability that currently prevent the full realization of the potential of data integration in mouse functional genomics. In this article, we assess the current problems with interoperability, with particular reference to mouse functional genomics, and critically review the technologies that can be deployed to overcome them. We describe a typical use-case where an investigator wishes to gather data on variation, genomic context and metabolic pathway involvement for genes discovered in a genome-wide screen. We go on to develop an automated approach involving an in silico experimental workflow tool, Taverna, using web services, BioMart and MOLGENIS technologies for data retrieval. Finally, we focus on the current impediments to adopting such an approach in a wider context, and strategies to overcome them.
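
    The data-gathering step of such a workflow can also be scripted directly against a BioMart web service; the sketch below is a rough illustration in which the Ensembl martservice URL follows the public pattern, but the dataset, filter and attribute names are assumptions that would need to be checked against the mart's own metadata before use.

    import requests

    # XML query in the standard BioMart format; dataset, filter and attribute
    # names here are illustrative and must be verified for the mart being used.
    BIOMART_URL = "https://www.ensembl.org/biomart/martservice"
    QUERY_XML = """<?xml version="1.0" encoding="UTF-8"?>
    <!DOCTYPE Query>
    <Query virtualSchemaName="default" formatter="TSV" header="0" uniqueRows="1">
      <Dataset name="mmusculus_gene_ensembl" interface="default">
        <Filter name="chromosome_name" value="19"/>
        <Attribute name="ensembl_gene_id"/>
        <Attribute name="external_gene_name"/>
      </Dataset>
    </Query>"""

    response = requests.get(BIOMART_URL, params={"query": QUERY_XML}, timeout=60)
    response.raise_for_status()
    for line in response.text.splitlines()[:5]:   # show only the first few records
        print(line)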

  11. Clone Detection Using DIFF Algorithm For Aspect Mining

    Directory of Open Access Journals (Sweden)

    Rowyda Mohammed Abd El-Aziz

    2012-08-01

    Full Text Available Aspect mining is a reverse engineering process that aims at mining legacy systems to discover crosscutting concerns to be refactored into aspects. This process improves system reusability and maintainability. However, locating crosscutting concerns in legacy systems manually is very difficult and error-prone, so there is a need for automated techniques that can discover crosscutting concerns in source code. Aspect mining approaches are automated techniques that vary according to the type of crosscutting-concern symptoms they search for. Code duplication is one such symptom, and it puts software maintenance and evolution at risk. Many code clone detection techniques have therefore been proposed to find this duplicated code in legacy systems. In this paper, we present a clone detection technique to extract exact clones from object-oriented source code using the Differential File Comparison Algorithm (DIFF) to improve system reusability and maintainability, which is a major objective of aspect mining.
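    The core of diff-based exact clone detection is finding maximal runs of identical lines shared between two files. The sketch below approximates that idea with Python's standard difflib rather than the paper's DIFF implementation; the file names and the six-line threshold are illustrative assumptions.

```python
# Sketch: report runs of identical lines shared by two source files, in the
# spirit of diff-based exact clone detection (threshold is illustrative).
import difflib

def find_exact_clones(path_a: str, path_b: str, min_lines: int = 6):
    """Yield (start_a, start_b, length) for identical line runs >= min_lines."""
    with open(path_a, encoding="utf-8") as fa, open(path_b, encoding="utf-8") as fb:
        lines_a = [line.rstrip() for line in fa]
        lines_b = [line.rstrip() for line in fb]

    matcher = difflib.SequenceMatcher(a=lines_a, b=lines_b, autojunk=False)
    for block in matcher.get_matching_blocks():
        if block.size >= min_lines:
            yield block.a + 1, block.b + 1, block.size  # 1-based line numbers

if __name__ == "__main__":
    # Hypothetical file names; any two source files would do.
    for start_a, start_b, size in find_exact_clones("Foo.java", "Bar.java"):
        print(f"clone of {size} lines: Foo.java:{start_a} <-> Bar.java:{start_b}")
```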

  12. An automated swimming respirometer

    DEFF Research Database (Denmark)

    STEFFENSEN, JF; JOHANSEN, K; BUSHNELL, PG

    1984-01-01

    An automated respirometer is described that can be used for computerized respirometry of trout and sharks.

  13. Configuration Management Automation (CMA) -

    Data.gov (United States)

    Department of Transportation — Configuration Management Automation (CMA) will provide an automated, integrated enterprise solution to support CM of FAA NAS and Non-NAS assets and investments. CMA...

  14. Autonomy and Automation

    Science.gov (United States)

    Shively, Jay

    2017-01-01

    A significant level of debate and confusion has surrounded the meaning of the terms autonomy and automation. Automation is a multi-dimensional concept, and we propose that Remotely Piloted Aircraft Systems (RPAS) automation should be described with reference to the specific system and task that has been automated, the context in which the automation functions, and other relevant dimensions. In this paper, we present definitions of automation, pilot in the loop, pilot on the loop and pilot out of the loop. We further propose that in future, the International Civil Aviation Organization (ICAO) RPAS Panel avoids the use of the terms autonomy and autonomous when referring to automated systems on board RPA. Work Group 7 proposes to develop, in consultation with other workgroups, a taxonomy of Levels of Automation for RPAS.

  15. Automated alignment-based curation of gene models in filamentous fungi

    OpenAIRE

    2014-01-01

    Background Automated gene-calling is still an error-prone process, particularly for the highly plastic genomes of fungal species. Improvement through quality control and manual curation of gene models is a time-consuming process that requires skilled biologists and is only marginally performed. The wealth of available fungal genomes has not yet been exploited by an automated method that applies quality control of gene models in order to obtain more accurate genome annotations. Results We prov...

  16. Lessons learnt from mining meter data of residential consumers

    OpenAIRE

    Blazakis, K; Davarzani, S; G. Stavrakakis; Pisica, I

    2016-01-01

    Tracking end-users' usage patterns can enable more accurate demand forecasting and the automation of demand response execution. Accordingly, more advanced applications, such as electricity market design, integration of distributed generation and theft detection can be developed. By employing data mining techniques on smart meter recordings, the suppliers can efficiently investigate the load patterns of consumers. This paper presents applications where data mining of energy usage can derive us...

  17. Exploration and Mining Roadmap

    Energy Technology Data Exchange (ETDEWEB)

    none,

    2002-09-01

    This Exploration and Mining Technology Roadmap represents the third roadmap for the Mining Industry of the Future. It is based upon the results of the Exploration and Mining Roadmap Workshop held May 10–11, 2001.

  18. Coal Mine Permit Boundaries

    Data.gov (United States)

    Earth Data Analysis Center, University of New Mexico — ESRI ArcView shapefile depicting New Mexico coal mines permitted under the Surface Mining Control and Reclamation Act of 1977 (SMCRA), by either the NM Mining these...

  19. Workflow automation architecture standard

    Energy Technology Data Exchange (ETDEWEB)

    Moshofsky, R.P.; Rohen, W.T. [Boeing Computer Services Co., Richland, WA (United States)

    1994-11-14

    This document presents an architectural standard for application of workflow automation technology. The standard includes a functional architecture, process for developing an automated workflow system for a work group, functional and collateral specifications for workflow automation, and results of a proof of concept prototype.

  20. Mining review

    Science.gov (United States)

    McCartan, L.; Morse, D.E.; Plunkert, P.A.; Sibley, S.F.

    2004-01-01

    The average annual growth rate of real gross domestic product (GDP) from the third quarter of 2001 through the second quarter of 2003 in the United States was about 2.6 percent. GDP growth rates in the third and fourth quarters of 2003 were about 8 percent and 4 percent, respectively. The upward trends in many sectors of the U.S. economy in 2003, however, were shared by few of the mineral materials industries. Annual output declined in most nonfuel mining and mineral processing industries, although there was an upward turn toward yearend as prices began to increase.

  1. Coal Mines, Abandoned - Digitized Mined Areas

    Data.gov (United States)

    NSGIC GIS Inventory (aka Ramona) — Coal mining has occurred in Pennsylvania for over a century. The maps to these coal mines are stored at many various public and private locations (if they still...

  2. Automation in Clinical Microbiology

    Science.gov (United States)

    Ledeboer, Nathan A.

    2013-01-01

    Historically, the trend toward automation in clinical pathology laboratories has largely bypassed the clinical microbiology laboratory. In this article, we review the historical impediments to automation in the microbiology laboratory and offer insight into the reasons why we believe that we are on the cusp of a dramatic change that will sweep a wave of automation into clinical microbiology laboratories. We review the currently available specimen-processing instruments as well as the total laboratory automation solutions. Lastly, we outline the types of studies that will need to be performed to fully assess the benefits of automation in microbiology laboratories. PMID:23515547

  3. Process Mining: A Two-Step Approach to Balance Between Underfitting and Overfitting

    DEFF Research Database (Denmark)

    van der Aalst, W.M.P.; Rubin, V.; Verbeek, H.M.W.

    Process mining includes the automated discovery of processes from event logs. Based on observed events (e.g., activities being executed or messages being exchanged) a process model is constructed. One of the essential problems in process mining is that one cannot assume to have seen all possible...

  4. Wikipedia Mining

    Science.gov (United States)

    Nakayama, Kotaro; Ito, Masahiro; Erdmann, Maike; Shirakawa, Masumi; Michishita, Tomoyuki; Hara, Takahiro; Nishio, Shojiro

    Wikipedia, a collaborative Wiki-based encyclopedia, has become a huge phenomenon among Internet users. It covers a huge number of concepts from various fields such as arts, geography, history, science, sports and games. As a corpus for knowledge extraction, Wikipedia's impressive characteristics are not limited to its scale, but also include the dense link structure, URL-based word sense disambiguation, and brief anchor texts. Because of these characteristics, Wikipedia has become a promising corpus and a new frontier for research. In the past few years, a considerable amount of research has been conducted in various areas such as semantic relatedness measurement, bilingual dictionary construction, and ontology construction. Extracting machine-understandable knowledge from Wikipedia to enhance the intelligence of computational systems is the main goal of "Wikipedia Mining," a project on CREP (Challenge for Realizing Early Profits) in JSAI. In this paper, we take a comprehensive, panoramic view of Wikipedia Mining research and the current status of our challenge. After that, we discuss the future vision of this challenge.
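    One concrete way the dense link structure is exploited for semantic relatedness is an inlink-overlap measure in the style of Milne and Witten. The sketch below is a hedged illustration of that general idea, not code from the Wikipedia Mining project; the inlink sets are toy data and would normally be extracted from a Wikipedia link dump.

```python
# Sketch of a link-overlap relatedness measure over Wikipedia inlink sets
# (in the style of Milne and Witten); the inlink sets here are toy data.
import math

def link_relatedness(inlinks_a: set, inlinks_b: set, total_articles: int) -> float:
    """Return a 0..1 relatedness score from shared inlinks (higher = more related)."""
    overlap = len(inlinks_a & inlinks_b)
    if overlap == 0:
        return 0.0
    distance = (math.log(max(len(inlinks_a), len(inlinks_b))) - math.log(overlap)) / (
        math.log(total_articles) - math.log(min(len(inlinks_a), len(inlinks_b)))
    )
    return max(0.0, 1.0 - distance)

if __name__ == "__main__":
    # Toy inlink sets keyed by article ID; a real system would read a link dump.
    cat = {1, 2, 3, 4, 5}
    dog = {2, 3, 4, 6, 7}
    print(round(link_relatedness(cat, dog, total_articles=1_000_000), 3))
```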

  5. Image Mining: Review and New Challenges

    Directory of Open Access Journals (Sweden)

    Barbora Zahradnikova

    2015-07-01

    Full Text Available Besides new technology, a huge volume of data in various forms has become available to people. Image data represents a keystone of many research areas including medicine, forensic criminology, robotics and industrial automation, meteorology and geography, as well as education. Therefore, obtaining specific information from image databases has become of great importance. Images, as a special category of data, differ from text data both in their nature and in how they are stored and retrieved. Image Mining as a research field is an interdisciplinary area combining methodologies and knowledge of many branches including data mining, computer vision, image processing, image retrieval, statistics, recognition, machine learning, artificial intelligence, etc. This review focuses on current image mining approaches and techniques, aiming at widening the possibilities of facial image analysis. This paper aims at reviewing the current state of IM as well as at describing challenges and identifying directions of future research in the field.

  6. Automated update, revision, and quality control of the maize genome annotations using MAKER-P improves the B73 RefGen_v3 gene models and identifies new genes

    Science.gov (United States)

    The large size and relative complexity of many plant genomes make creation, quality control, and dissemination of high-quality gene structure annotations challenging. In response, we have developed MAKER-P, a fast and easy-to-use genome annotation engine for plants. Here, we report the use of MAKER-...

  7. Proceedings: Fourth Workshop on Mining Scientific Datasets

    Energy Technology Data Exchange (ETDEWEB)

    Kamath, C

    2001-07-24

    Commercial applications of data mining in areas such as e-commerce, market-basket analysis, text-mining, and web-mining have taken on a central focus in the JCDD community. However, there is a significant amount of innovative data mining work taking place in the context of scientific and engineering applications that is not well represented in the mainstream KDD conferences. For example, scientific data mining techniques are being developed and applied to diverse fields such as remote sensing, physics, chemistry, biology, astronomy, structural mechanics, computational fluid dynamics etc. In these areas, data mining frequently complements and enhances existing analysis methods based on statistics, exploratory data analysis, and domain-specific approaches. On the surface, it may appear that data from one scientific field, say genomics, is very different from another field, such as physics. However, despite their diversity, there is much that is common across the mining of scientific and engineering data. For example, techniques used to identify objects in images are very similar, regardless of whether the images came from a remote sensing application, a physics experiment, an astronomy observation, or a medical study. Further, with data mining being applied to new types of data, such as mesh data from scientific simulations, there is the opportunity to apply and extend data mining to new scientific domains. This one-day workshop brings together data miners analyzing science data and scientists from diverse fields to share their experiences, learn how techniques developed in one field can be applied in another, and better understand some of the newer techniques being developed in the KDD community. This is the fourth workshop on the topic of Mining Scientific Data sets; for information on earlier workshops, see http://www.ahpcrc.org/conferences/. This workshop continues the tradition of addressing challenging problems in a field where the diversity of applications is

  8. Adaptive Multi-Modal Data Mining and Fusion for Autonomous Intelligence Discovery

    Science.gov (United States)

    2009-03-01

    Final report, dates covered 15-12-2006 to 15-12-2007. Title: Adaptive Multi-Modal Data Mining and Fusion for Autonomous Intelligence Discovery. The recoverable fragments of the report describe work including ... as well as geospatial mapping of documents and images. Subject terms: automated data mining, streaming data, geospatial Internet localization, Arabic ... streaming text data mining. A particularly useful component under development was a mixed language text database search.

  9. Automated Single Cell Data Decontamination Pipeline

    Energy Technology Data Exchange (ETDEWEB)

    Tennessen, Kristin [Lawrence Berkeley National Lab. (LBNL), Walnut Creek, CA (United States). Dept. of Energy Joint Genome Inst.; Pati, Amrita [Lawrence Berkeley National Lab. (LBNL), Walnut Creek, CA (United States). Dept. of Energy Joint Genome Inst.

    2014-03-21

    Recent technological advancements in single-cell genomics have encouraged the classification and functional assessment of microorganisms from a wide span of the biosphere's phylogeny [1,2]. Environmental processes of interest to the DOE, such as bioremediation and carbon cycling, can be elucidated through the genomic lens of these unculturable microbes. However, contamination can occur at various stages of the single-cell sequencing process. Contaminated data can lead to wasted time and effort on meaningless analyses, inaccurate or erroneous conclusions, and pollution of public databases. A fully automated decontamination tool is necessary to prevent these instances and increase the throughput of the single-cell sequencing process.
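    Decontamination screens of this kind often include a composition check that flags contigs whose sequence signature diverges from the rest of the assembly. The sketch below shows only that generic idea using GC content; it is an illustrative heuristic with an arbitrary threshold, not the JGI pipeline described here.

```python
# Illustrative heuristic only (not the JGI pipeline): flag contigs whose GC
# content deviates strongly from the assembly-wide median.
from statistics import median

def gc_content(seq: str) -> float:
    seq = seq.upper()
    return (seq.count("G") + seq.count("C")) / max(1, len(seq))

def flag_outlier_contigs(contigs: dict, max_deviation: float = 0.10) -> list:
    """Return names of contigs whose GC content differs from the median by more
    than max_deviation (an arbitrary illustrative threshold)."""
    gc = {name: gc_content(seq) for name, seq in contigs.items()}
    centre = median(gc.values())
    return [name for name, value in gc.items() if abs(value - centre) > max_deviation]

if __name__ == "__main__":
    toy = {
        "contig_1": "ATGCATGCGTGCATGC" * 50,
        "contig_2": "ATGCATGCATGCATGC" * 50,
        "contig_3": "ATATATATATTTATAT" * 50,  # compositionally divergent
    }
    print(flag_outlier_contigs(toy))
```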

  10. National Underground Mines Inventory

    Science.gov (United States)

    1983-10-01

    Excerpt of tabulated inventory records (mine identification numbers, county codes and mine names), including entries for Long Park, Geo Mine, Bessie Mine, Paystreak, Bueno Mill, Questa Mine, Rudy No. 1 & 2, Mt. Taylor, Marquez Shaft and Mariano Lake Mine.

  11. Automated DNA Sequencing System

    Energy Technology Data Exchange (ETDEWEB)

    Armstrong, G.A.; Ekkebus, C.P.; Hauser, L.J.; Kress, R.L.; Mural, R.J.

    1999-04-25

    Oak Ridge National Laboratory (ORNL) is developing a core DNA sequencing facility to support biological research endeavors at ORNL and to conduct basic sequencing automation research. This facility is novel because its development is based on existing standard biology laboratory equipment; thus, the development process is of interest to the many small laboratories trying to use automation to control costs and increase throughput. Before automation, biology laboratory personnel purified DNA, completed cycle sequencing, and prepared 96-well sample plates with commercially available hardware designed specifically for each step in the process. Following purification and thermal cycling, an automated sequencing machine was used for the sequencing. A technician handled all movement of the 96-well sample plates between machines. To automate the process, ORNL is adding a CRS Robotics A-465 arm, an ABI 377 sequencing machine, an automated centrifuge, an automated refrigerator, and possibly an automated SpeedVac. The entire system will be integrated with one central controller that will direct each machine and the robot. The goal of this system is to completely automate the sequencing procedure from bacterial cell samples through ready-to-be-sequenced DNA and ultimately to completed sequence. The system will be flexible and will accommodate different chemistries than existing automated sequencing lines. The system will be expanded in the future to include colony picking and/or actual sequencing. This discrete-event DNA sequencing system will demonstrate that smaller sequencing labs can automate cost-effectively as the laboratory grows.

  12. Mining lore : Bankhead, mining for coal

    Energy Technology Data Exchange (ETDEWEB)

    Nichiporuk, A.

    2007-09-15

    Bankhead, Alberta was one of the first communities to be established because of mining. It was founded in 1903 by the Canadian Pacific Railway (CPR) on Cascade Mountain in the Bow River Valley of Banff National Park. In 1904, Mine No. 80 was opened by the Pacific Coal Company to fuel CPR's steam engines. In order to avoid flooding the mine, the decision was made to mine up the steep seams instead of down. The mine entered full production in 1905. This article described the working conditions and pay scale for the mine workers, noting that there was not much in terms of safety equipment. There were many accidents and 15 men lost their lives at the mine. During the mine's 20-year operation, miners went on strike 6 times. The last strike marked the closure of the mine in June 1922 and the end of industry in national parks. CPR was ordered to clear out and move the mining equipment as well as the houses, buildings and essentially the entire town. During its peak production, Mine No. 80 produced about a half million tons of coal. 1 ref., 1 fig.

  13. Genome bioinformatics of tomato and potato

    NARCIS (Netherlands)

    Datema, E.

    2011-01-01

    In the past two decades genome sequencing has developed from a laborious and costly technology employed by large international consortia to a widely used, automated and affordable tool used worldwide by many individual research groups. Genome sequences of many food animals and crop plants have been

  14. Mining and environment

    Energy Technology Data Exchange (ETDEWEB)

    Kisgyorgy, S.

    1986-01-01

    The realization of new mining projects should be preceded by detailed studies on the impact of mining activities on the environment. For defining the conditions of environmental protection and for making proper financial plans the preparation of an information system is needed. The possible social effects of the mining investments have to be estimated, first of all from the points of view of waste disposal, mining hydrology, subsidence due to underground mining etc.

  15. Advances in Computer, Communication, Control and Automation

    CERN Document Server

    2011 International Conference on Computer, Communication, Control and Automation

    2012-01-01

    The volume includes a set of selected papers extended and revised from the 2011 International Conference on Computer, Communication, Control and Automation (3CA 2011), which was held in Zhuhai, China, November 19-20, 2011. Topics covered in this volume include signal and image processing, speech and audio processing, video processing and analysis, artificial intelligence, computing and intelligent systems, machine learning, sensor and neural networks, knowledge discovery and data mining, fuzzy mathematics and applications, knowledge-based systems, hybrid systems modeling and design, risk analysis and management, and system modeling and simulation. We hope that researchers, graduate students and other interested readers benefit scientifically from the proceedings and also find it stimulating in the process.

  16. Uses of antimicrobial genes from microbial genome

    Science.gov (United States)

    Sorek, Rotem; Rubin, Edward M.

    2013-08-20

    We describe a method for mining microbial genomes to discover antimicrobial genes and proteins having broad spectrum of activity. Also described are antimicrobial genes and their expression products from various microbial genomes that were found using this method. The products of such genes can be used as antimicrobial agents or as tools for molecular biology.

  17. Mission-Critical Mobile Broadband Communications in Open Pit Mines

    DEFF Research Database (Denmark)

    Uzeda Garcia, Luis Guilherme; Portela Lopes de Almeida, Erika; Barbosa, Viviane S. B.

    2016-01-01

    The need for continuous safety improvements and increased operational efficiency is driving the mining industry through a transition towards automated operations. From a communications perspective, this transition introduces a new set of high-bandwidth business- and mission-critical applications...

  18. Robotics and automation for oil sands bitumen production and maintenance

    Energy Technology Data Exchange (ETDEWEB)

    Lipsett, M.G. [Alberta Univ., Edmonton, AB (Canada). Dept. of Mechanical Engineering

    2008-07-01

    This presentation examined technical and commercial challenges related to robotics and automation processes in the mining and oil sands industries. The oil sands industry faces ongoing cost pressures. Challenges include the depths to which miners must travel, as well as problems related to equipment reliability and safety. Surface mines must operate in all weather conditions with a variety of complex systems. Barriers for new technologies include high capital and operating expenses. It has also proven difficult to integrate new technologies within established mining practices. However, automation has the potential to improve mineral processing, production, and maintenance processes. Step changes can be placed in locations that are hazardous or inaccessible. Automated sizing, material handling, and ventilation systems, as well as tele-operated equipment, can also be implemented. Prototypes currently being developed include advanced systems for cutting; rock bolting; loose rock detection; lump size estimation; unstructured environment sensing; environment modelling; and automatic task execution. Enabling technologies are now being developed for excavation, haulage, material handling systems, mining and reclamation methods, and integrated control and reliability. tabs., figs.

  19. Data mining for ontology development.

    Energy Technology Data Exchange (ETDEWEB)

    Davidson, George S.; Strasburg, Jana (Pacific Northwest National Laboratory, Richland, WA); Stampf, David (Brookhaven National Laboratory, Upton, NY); Neymotin,Lev (Brookhaven National Laboratory, Upton, NY); Czajkowski, Carl (Brookhaven National Laboratory, Upton, NY); Shine, Eugene (Savannah River National Laboratory, Aiken, SC); Bollinger, James (Savannah River National Laboratory, Aiken, SC); Ghosh, Vinita (Brookhaven National Laboratory, Upton, NY); Sorokine, Alexandre (Oak Ridge National Laboratory, Oak Ridge, TN); Ferrell, Regina (Oak Ridge National Laboratory, Oak Ridge, TN); Ward, Richard (Oak Ridge National Laboratory, Oak Ridge, TN); Schoenwald, David Alan

    2010-06-01

    A multi-laboratory ontology construction effort during the summer and fall of 2009 prototyped an ontology for counterfeit semiconductor manufacturing. This effort included an ontology development team and an ontology validation methods team. Here the third team of the Ontology Project, the Data Analysis (DA) team reports on their approaches, the tools they used, and results for mining literature for terminology pertinent to counterfeit semiconductor manufacturing. A discussion of the value of ontology-based analysis is presented, with insights drawn from other ontology-based methods regularly used in the analysis of genomic experiments. Finally, suggestions for future work are offered.

  20. Laboratory Automation and Middleware.

    Science.gov (United States)

    Riben, Michael

    2015-06-01

    The practice of surgical pathology is under constant pressure to deliver the highest quality of service, reduce errors, increase throughput, and decrease turnaround time while at the same time dealing with an aging workforce, increasing financial constraints, and economic uncertainty. Although not able to implement total laboratory automation, great progress continues to be made in workstation automation in all areas of the pathology laboratory. This report highlights the benefits and challenges of pathology automation, reviews middleware and its use to facilitate automation, and reviews the progress so far in the anatomic pathology laboratory.

  1. DIYA: A Bacterial Annotation Pipeline for any Genomics Lab

    Science.gov (United States)

    2009-02-12

    ... microbial genomes overnight (Mardis, 2008). These technologies have created many new small 'genome centers' (Zwick, 2005). DIYA (Do-It-Yourself ... References: The development of PIPA: an integrated and automated pipeline for genome-wide protein function annotation, BMC Bioinformatics 9:52 (2008); Zwick, M.E. ...

  2. Automating checks of plan check automation.

    Science.gov (United States)

    Halabi, Tarek; Lu, Hsiao-Ming

    2014-07-08

    While a few physicists have designed new plan check automation solutions for their clinics, fewer, if any, managed to adapt existing solutions. As complex and varied as the systems they check, these programs must gain the full confidence of those who would run them on countless patient plans. The present automation effort, planCheck, therefore focuses on versatility and ease of implementation and verification. To demonstrate this, we apply planCheck to proton gantry, stereotactic proton gantry, stereotactic proton fixed beam (STAR), and IMRT treatments.

  3. Automation in Warehouse Development

    NARCIS (Netherlands)

    Hamberg, R.; Verriet, J.

    2012-01-01

    The warehouses of the future will come in a variety of forms, but with a few common ingredients. Firstly, human operational handling of items in warehouses is increasingly being replaced by automated item handling. Extended warehouse automation counteracts the scarcity of human operators and support

  4. Automate functional testing

    Directory of Open Access Journals (Sweden)

    Ramesh Kalindri

    2014-06-01

    Full Text Available Currently, software engineers are increasingly turning to the option of automating functional tests, but they are not always successful in this endeavor. Reasons range from poor planning to cost overruns in the process. Some principles that can guide teams in automating these tests are described in this article.

  5. More Benefits of Automation.

    Science.gov (United States)

    Getz, Malcolm

    1988-01-01

    Describes a study that measured the benefits of an automated catalog and automated circulation system from the library user's point of view in terms of the value of time saved. Topics discussed include patterns of use, access time, availability of information, search behaviors, and the effectiveness of the measures used. (seven references)…

  6. Text Classification using Data Mining

    CERN Document Server

    Kamruzzaman, S M; Hasan, Ahmed Ryadh

    2010-01-01

    Text classification is the process of classifying documents into predefined categories based on their content. It is the automated assignment of natural language texts to predefined categories. Text classification is the primary requirement of text retrieval systems, which retrieve texts in response to a user query, and of text understanding systems, which transform text in some way such as producing summaries, answering questions or extracting data. Existing supervised learning algorithms for automatic text classification need sufficient documents to learn accurately. This paper presents a new algorithm for text classification using data mining that requires fewer documents for training. Instead of single words, word relations, i.e. association rules derived from these words, are used to build the feature set from pre-classified text documents. The concept of a Naive Bayes classifier is then applied to the derived features, and finally a single concept of a Genetic Algorithm has been added for final classification. A system based on the...
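    The distinguishing idea is to use co-occurring word pairs, rather than single words, as features for the Naive Bayes step. The sketch below illustrates that idea on toy data; the add-one smoothing and the tiny training set are assumptions, and the association-rule mining and genetic-algorithm stages of the paper are omitted.

```python
# Sketch: naive Bayes over word-pair features (illustrative; the paper's full
# pipeline also mines association rules and applies a genetic algorithm).
from collections import Counter, defaultdict
from itertools import combinations
import math

def pair_features(text: str) -> set:
    words = sorted(set(text.lower().split()))
    return set(combinations(words, 2))

def train(documents):
    """documents: iterable of (text, label). Returns class priors and pair counts."""
    pair_counts = defaultdict(Counter)
    label_counts = Counter()
    for text, label in documents:
        label_counts[label] += 1
        pair_counts[label].update(pair_features(text))
    return label_counts, pair_counts

def classify(text, label_counts, pair_counts):
    features = pair_features(text)
    vocab = {p for counts in pair_counts.values() for p in counts}
    total_docs = sum(label_counts.values())
    best_label, best_score = None, float("-inf")
    for label, n_docs in label_counts.items():
        score = math.log(n_docs / total_docs)
        denom = sum(pair_counts[label].values()) + len(vocab) + 1
        for pair in features:
            score += math.log((pair_counts[label][pair] + 1) / denom)
        if score > best_score:
            best_label, best_score = label, score
    return best_label

if __name__ == "__main__":
    training = [
        ("open pit mining trucks haul ore", "mining"),
        ("longwall mining shearer cuts coal", "mining"),
        ("gene sequence annotation pipeline", "genomics"),
        ("genome assembly produces contigs", "genomics"),
    ]
    labels, pairs = train(training)
    print(classify("shearer cuts coal at the mining face", labels, pairs))
```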

  7. Advances in inspection automation

    Science.gov (United States)

    Weber, Walter H.; Mair, H. Douglas; Jansen, Dion; Lombardi, Luciano

    2013-01-01

    This new session at QNDE reflects the growing interest in inspection automation. Our paper describes a newly developed platform that makes the complex NDE automation possible without the need for software programmers. Inspection tasks that are tedious, error-prone or impossible for humans to perform can now be automated using a form of drag and drop visual scripting. Our work attempts to rectify the problem that NDE is not keeping pace with the rest of factory automation. Outside of NDE, robots routinely and autonomously machine parts, assemble components, weld structures and report progress to corporate databases. By contrast, components arriving in the NDT department typically require manual part handling, calibrations and analysis. The automation examples in this paper cover the development of robotic thickness gauging and the use of adaptive contour following on the NRU reactor inspection at Chalk River.

  8. Automation in immunohematology.

    Science.gov (United States)

    Bajpai, Meenu; Kaur, Ravneet; Gupta, Ekta

    2012-07-01

    There have been rapid technological advances in blood banking in South Asian region over the past decade with an increasing emphasis on quality and safety of blood products. The conventional test tube technique has given way to newer techniques such as column agglutination technique, solid phase red cell adherence assay, and erythrocyte-magnetized technique. These new technologies are adaptable to automation and major manufacturers in this field have come up with semi and fully automated equipments for immunohematology tests in the blood bank. Automation improves the objectivity and reproducibility of tests. It reduces human errors in patient identification and transcription errors. Documentation and traceability of tests, reagents and processes and archiving of results is another major advantage of automation. Shifting from manual methods to automation is a major undertaking for any transfusion service to provide quality patient care with lesser turnaround time for their ever increasing workload. This article discusses the various issues involved in the process.

  9. Automated model building

    CERN Document Server

    Caferra, Ricardo; Peltier, Nicholas

    2004-01-01

    This is the first book on automated model building, a discipline of automated deduction that is of growing importance. Although models and their construction are important per se, automated model building has appeared as a natural enrichment of automated deduction, especially in the attempt to capture the human way of reasoning. The book provides an historical overview of the field of automated deduction, and presents the foundations of different existing approaches to model construction, in particular those developed by the authors. Finite and infinite model building techniques are presented. The main emphasis is on calculi-based methods, and relevant practical results are provided. The book is of interest to researchers and graduate students in computer science, computational logic and artificial intelligence. It can also be used as a textbook in advanced undergraduate courses.

  10. Automation in Warehouse Development

    CERN Document Server

    Verriet, Jacques

    2012-01-01

    The warehouses of the future will come in a variety of forms, but with a few common ingredients. Firstly, human operational handling of items in warehouses is increasingly being replaced by automated item handling. Extended warehouse automation counteracts the scarcity of human operators and supports the quality of picking processes. Secondly, the development of models to simulate and analyse warehouse designs and their components facilitates the challenging task of developing warehouses that take into account each customer’s individual requirements and logistic processes. Automation in Warehouse Development addresses both types of automation from the innovative perspective of applied science. In particular, it describes the outcomes of the Falcon project, a joint endeavour by a consortium of industrial and academic partners. The results include a model-based approach to automate warehouse control design, analysis models for warehouse design, concepts for robotic item handling and computer vision, and auton...

  11. Automation in Immunohematology

    Directory of Open Access Journals (Sweden)

    Meenu Bajpai

    2012-01-01

    Full Text Available There have been rapid technological advances in blood banking in South Asian region over the past decade with an increasing emphasis on quality and safety of blood products. The conventional test tube technique has given way to newer techniques such as column agglutination technique, solid phase red cell adherence assay, and erythrocyte-magnetized technique. These new technologies are adaptable to automation and major manufacturers in this field have come up with semi and fully automated equipments for immunohematology tests in the blood bank. Automation improves the objectivity and reproducibility of tests. It reduces human errors in patient identification and transcription errors. Documentation and traceability of tests, reagents and processes and archiving of results is another major advantage of automation. Shifting from manual methods to automation is a major undertaking for any transfusion service to provide quality patient care with lesser turnaround time for their ever increasing workload. This article discusses the various issues involved in the process.

  12. Canadian suppliers of mining goods and services: Links between Canadian mining companies and selected sectors of the Canadian economy

    Energy Technology Data Exchange (ETDEWEB)

    Lemieux, A. [Natural Resources Canada, Ottawa, ON (Canada)

    2000-07-01

    Economic links between Canada's minerals and metals industry and Canadian suppliers of mining goods and services are examined to provide an insight into the interdependencies of these two key resource-related components of Canada's economy. The impact of globalization of the mining industry, estimates of its economic potential and the potential for exporting goods and services in conjunction with Canadian mining projects abroad are also assessed. The study concludes that the links between Canadian mining companies and the rest of the economy are difficult to quantify, due to the absence of statistical data that would differentiate supplier transactions with mining companies from those with other areas of the economy. At best, the approaches used in this study give but an imperfect understanding of the complex relationships between mining companies and their suppliers. It is clear, however, that as much of the demand for mining products is global, so is the supply. Therefore, while globalization of the mining industry creates unprecedented opportunities for Canadian suppliers to provide expertise, goods and services to Canadian and other customers offshore, the fact remains that mining multinationals buy a lot of their supplies locally. As a result, only some of the opportunities created by mining companies based in Canada and elsewhere will translate into sales for Canadian suppliers. Nevertheless, Canadian suppliers appear to have considerable depth in products related to underground mining, environment protection, exploration, feasibility studies, mineral processing, and mine automation. There appear to be considerable opportunities to derive further benefits from these areas of expertise. Appendices contain information about methodological aspects of the survey. 8 tabs., 32 figs., 6 appendices.

  13. De Novo Assembly of Human Herpes Virus Type 1 (HHV-1) Genome, Mining of Non-Canonical Structures and Detection of Novel Drug-Resistance Mutations Using Short- and Long-Read Next Generation Sequencing Technologies.

    Science.gov (United States)

    Karamitros, Timokratis; Harrison, Ian; Piorkowska, Renata; Katzourakis, Aris; Magiorkinis, Gkikas; Mbisa, Jean Lutamyo

    2016-01-01

    Human herpesvirus type 1 (HHV-1) has a large double-stranded DNA genome of approximately 152 kbp that is structurally complex and GC-rich. This makes the assembly of HHV-1 whole genomes from short-read sequencing data technically challenging. To improve the assembly of HHV-1 genomes we have employed a hybrid genome assembly protocol using data from two sequencing technologies: the short-read Roche 454 and the long-read Oxford Nanopore MinION sequencers. We sequenced 18 HHV-1 cell culture-isolated clinical specimens collected from immunocompromised patients undergoing antiviral therapy. The susceptibility of the samples to several antivirals was determined by plaque reduction assay. Hybrid genome assembly resulted in a decrease in the number of contigs in 6 out of 7 samples and an increase in N(G)50 and N(G)75 of all 7 samples sequenced by both technologies. The approach also enhanced the detection of non-canonical contigs including a rearrangement between the unique (UL) and repeat (T/IRL) sequence regions of one sample that was not detectable by assembly of 454 reads alone. We detected several known and novel resistance-associated mutations in UL23 and UL30 genes. Genome-wide genetic variability ranged from <1% to 53% of amino acids in each gene exhibiting at least one substitution within the pool of samples. The UL23 gene had one of the highest genetic variabilities at 35.2% in keeping with its role in development of drug resistance. The assembly of accurate, full-length HHV-1 genomes will be useful in determining genetic determinants of drug resistance, virulence, pathogenesis and viral evolution. The numerous, complex repeat regions of the HHV-1 genome currently remain a barrier towards this goal.
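    Because the improvement above is reported in terms of N(G)50 and N(G)75, a short sketch of how such statistics are derived from contig lengths may be helpful; the contig lengths below are invented, and only the approximate 152 kbp HHV-1 genome size is taken from the record.

```python
# Sketch: Nx/NGx statistics from contig lengths. For Nx the reference total is
# the assembly size; for NGx it is the (estimated) genome size.
def nx(contig_lengths, reference_total, fraction=0.5):
    """Smallest contig length such that contigs at least that long cover
    `fraction` of reference_total (N50/NG50 when fraction=0.5)."""
    running = 0
    for length in sorted(contig_lengths, reverse=True):
        running += length
        if running >= fraction * reference_total:
            return length
    return 0  # assembly does not reach the requested fraction

if __name__ == "__main__":
    contigs = [90_000, 60_000, 30_000, 15_000, 5_000]  # illustrative lengths
    assembly_size = sum(contigs)
    genome_size = 152_000                              # approx. HHV-1 genome
    print("N50 :", nx(contigs, assembly_size))
    print("NG50:", nx(contigs, genome_size))
    print("NG75:", nx(contigs, genome_size, fraction=0.75))
```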

  14. First Mexican coal mine recovery after mine fire, Esmeralda Mine

    Energy Technology Data Exchange (ETDEWEB)

    Santillan, M.A. [Minerales Monclova, SA de CV, Palau Coahuila (Mexico)

    2005-07-01

    The fire started on 8 May 1998 in the development section from methane released into the mine through a roof-bolt hole. The flames spread quickly as the coal was ignited. After eight hours the Safety Department decided to seal the vertical ventilation shafts and the slopes. The coal in the Esmeralda Mine is of very high quality, and Minerales Monclova (MIMOSA) decided to recover the facilities. However, the Esmeralda Mine coals have a very high gas content of 12 m³/t. During the next 2.5 months, MIMOSA staff and specialists observed and analysed the gas behaviour supported by a chromatograph. With the results of the observations and analyses, MIMOSA in consultation with the specialists developed a recovery plan based on flooding the area in which fire might have propagated and in which rekindling was highly probable. At the same time MIMOSA trained rescue teams. By 20 August 1998, the mine command centre had re-opened the slopes seal. Using a 'Step-by-Step' system, the rescue team began the recovery process by employing cross-cuts and using an auxiliary fan to establish the ventilation circuit. The MIMOSA team advanced into the mine as far as allowed by the water level and was able to recover the main fan. The official mine recovery date was 30 November 1998. Esmeralda Mine was back in operation in December 1998. 1 ref., 3 figs.

  15. Mines and Mineral Resources

    Data.gov (United States)

    Department of Homeland Security — Mines in the United States According to the Homeland Security Infrastructure Program Tiger Team Report Table E-2.V.1 Sub-Layer Geographic Names, a mine is defined as...

  16. Gold-Mining

    DEFF Research Database (Denmark)

    Raaballe, J.; Grundy, B.D.

    2002-01-01

    Based on standard option pricing arguments and assumptions (including no convenience yield and sustainable property rights), we will not observe operating gold mines. We find that asymmetric information on the reserves in the gold mine is a necessary and sufficient condition for the existence of operating gold mines. Asymmetric information on the reserves in the mine implies that, at a high enough price of gold, a manager of high type finds the extraction value of the company to be higher than the current market value of the non-operating gold mine. Due to this undervaluation the maxim of market ... sooner than a manager of lower type. Third, a non-operating gold mine is valued as being of the lowest type in the pool and, all else equal, high-asymmetry mines are valued lower than low-asymmetry mines. In a qualitative sense these results are robust with respect to different assumptions (re cost ...

  17. Recent advances in remote coal mining machine sensing, guidance, and teleoperation

    Energy Technology Data Exchange (ETDEWEB)

    Ralston, J.C.; Hainsworth, D.W.; Reid, D.C.; Anderson, D.L.; McPhee, R.J. [CSIRO Exploration & Minerals, Kenmore, Qld. (Australia)

    2001-10-01

    Some recent applications of sensing, guidance and telerobotic technology in the coal mining industry are presented. Of special interest is the development of semi or fully autonomous systems to provide remote guidance and communications for coal mining equipment. The use of radar and inertial-based sensors is considered in an attempt to solve the horizontal and lateral guidance problems associated with mining equipment automation. Also described is a novel teleoperated robot vehicle with unique communications capabilities, called the Numbat, which is used in underground mine safety and reconnaissance missions.

  18. Mining in El Salvador

    DEFF Research Database (Denmark)

    Pacheco Cueva, Vladimir

    2014-01-01

    In this guest article, Vladimir Pacheco, a social scientist who has worked on mining and human rights shares his perspectives on a current campaign against mining in El Salvador – Central America’s smallest but most densely populated country.

  19. Systematic review automation technologies

    Science.gov (United States)

    2014-01-01

    Systematic reviews, a cornerstone of evidence-based medicine, are not produced quickly enough to support clinical practice. The cost of production, availability of the requisite expertise and timeliness are often quoted as major contributors to the delay. This detailed survey of the state of the art of information systems designed to support or automate individual tasks in the systematic review, and in particular systematic reviews of randomized controlled clinical trials, reveals trends that see the convergence of several parallel research projects. We surveyed literature describing informatics systems that support or automate the processes of systematic review or each of the tasks of the systematic review. Several projects focus on automating, simplifying and/or streamlining specific tasks of the systematic review. Some tasks are already fully automated while others are still largely manual. In this review, we describe each task and the effect that its automation would have on the entire systematic review process, summarize the existing information system support for each task, and highlight where further research is needed for realizing automation for the task. Integration of the systems that automate systematic review tasks may lead to a revised systematic review workflow. We envisage that the optimized workflow will lead to a system in which each systematic review is described as a computer program that automatically retrieves relevant trials, appraises them, extracts and synthesizes data, evaluates the risk of bias, performs meta-analysis calculations, and produces a report in real time. PMID:25005128

  20. Distributed Framework for Data Mining As a Service on Private Cloud

    Directory of Open Access Journals (Sweden)

    Shraddha Masih

    2014-11-01

    Full Text Available Data mining research faces two great challenges: i. automated mining, ii. mining of distributed data. Conventional mining techniques are centralized and the data needs to be accumulated at a central location. A mining tool needs to be installed on the computer before performing data mining, so extra time is incurred in collecting the data. Mining is done by specialized analysts who have access to mining tools. This approach is not optimal when the data is distributed over the network. To perform data mining in a distributed scenario, we need to design a different framework to improve efficiency. Also, the size of accumulated data grows exponentially with time and is difficult to mine using a single computer. Personal computers have limitations in terms of computation capability and storage capacity. Cloud computing can be exploited for compute-intensive and data-intensive applications. Data mining algorithms are both compute and data intensive, therefore cloud-based tools can provide an infrastructure for distributed data mining. This paper uses cloud computing to support distributed data mining. We propose a cloud-based data mining model which provides the facility of mass data storage along with a distributed data mining facility. This paper provides a solution for distributed data mining on the Hadoop framework using an interface to run the algorithm on a specified number of nodes without any user-level configuration. Hadoop is configured over private servers and clients can process their data through a common framework from anywhere in the private network. Data to be mined can either be chosen from the cloud data server or can be uploaded from private computers on the network. It is observed that the framework is helpful in processing large data sets in less time as compared to a single system.
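    A minimal way to picture the Hadoop side of such a framework is a Hadoop Streaming mapper/reducer pair; counting item frequencies is the simplest building block of frequent-itemset mining. The comma-separated record layout below is an assumption, and in practice the two scripts would be submitted through the Hadoop Streaming jar with the cluster's input and output paths rather than run locally.

```python
# mapper.py -- Hadoop Streaming mapper: emit "item\t1" for every item in each
# record. The comma-separated record layout is an illustrative assumption.
import sys

for line in sys.stdin:
    for item in line.strip().split(","):
        item = item.strip()
        if item:
            print(f"{item}\t1")
```

```python
# reducer.py -- Hadoop Streaming reducer: input arrives sorted by key, so
# counts for one item can be summed in a single pass.
import sys

current_item, current_count = None, 0
for line in sys.stdin:
    item, count = line.rstrip("\n").split("\t")
    if item != current_item:
        if current_item is not None:
            print(f"{current_item}\t{current_count}")
        current_item, current_count = item, 0
    current_count += int(count)
if current_item is not None:
    print(f"{current_item}\t{current_count}")
```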

  1. American mines, methods and men

    Energy Technology Data Exchange (ETDEWEB)

    Walker, S.C.A. (Thames Water Utilities (UK))

    1992-04-01

    The paper is based on the author's visits to a number of American mines, to see their mining machinery and to discuss with mine management their industrial relations problems. The paper gives a brief review of American mines, methods and men and is in the form of a diary. Mines visited are: Ohio Valley Coal Company; Big John Mine; Pittsburgh Research Center of the US Bureau of Mines; Martinka Mine; Robin Hood Complex No 9 Mine (Boone County, West Virginia), Green Briar Mine (Virginia); Martin County Coal (Kentucky); Wabash Mine (Keensburgh, Illinois); Galatia Mine (Harrisburgh, Illinois); and William Station Mine (Sturgis, Kentucky). Details given include mining methods productivity and staffing levels. The mining machinery is described in detail in a separate article. 5 figs.

  2. Data Mining for CRM

    Science.gov (United States)

    Thearling, Kurt

    Data Mining technology allows marketing organizations to better understand their customers and respond to their needs. This chapter describes how Data Mining can be combined with customer relationship management to help drive improved interactions with customers. An example showing how to use Data Mining to drive customer acquisition activities is presented.

  3. A MINE alternative to D-optimal designs for the linear model.

    Directory of Open Access Journals (Sweden)

    Amanda M Bouffier

    Full Text Available Doing large-scale genomics experiments can be expensive, and so experimenters want to get the most information out of each experiment. To this end the Maximally Informative Next Experiment (MINE) criterion for experimental design was developed. Here we explore this idea in a simplified context, the linear model. Four variations of the MINE method for the linear model were created: MINE-like, MINE, MINE with random orthonormal basis, and MINE with random rotation. Each method varies in how it maximizes the MINE criterion. Theorem 1 establishes sufficient conditions for the maximization of the MINE criterion under the linear model. Theorem 2 establishes when the MINE criterion is equivalent to the classic design criterion of D-optimality. By simulation under the linear model, we establish that MINE with random orthonormal basis and MINE with random rotation are faster at discovering the true linear relation with p regression coefficients and n observations when p>>n. We also establish in simulations with n<100, p=1000, σ=0.01 and 1000 replicates that these two variations of MINE display a lower false positive rate than the MINE-like method and, for a majority of the experiments, than the MINE method.
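    The connection to D-optimality can be made concrete with a greedy sequential design step: among candidate experiments, pick the one that maximizes det(X^T X) of the augmented design matrix. The sketch below illustrates that classical criterion only, not the MINE variants themselves; the candidate pool and dimensions are invented.

```python
# Sketch: greedy sequential selection of the next experiment that maximizes
# det(X^T X) of the augmented design matrix (the D-optimality criterion).
import numpy as np

def next_experiment(X: np.ndarray, candidates: np.ndarray) -> int:
    """Return the index of the candidate row that maximizes the determinant of
    the augmented information matrix."""
    best_index, best_det = -1, -np.inf
    for i, x in enumerate(candidates):
        X_aug = np.vstack([X, x])
        det = np.linalg.det(X_aug.T @ X_aug)
        if det > best_det:
            best_index, best_det = i, det
    return best_index

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    p = 2                                       # number of regression coefficients
    X = rng.normal(size=(4, p))                 # experiments run so far
    candidates = rng.uniform(-1, 1, size=(50, p))  # pool of possible next experiments
    chosen = next_experiment(X, candidates)
    print("next experiment:", chosen, candidates[chosen])
```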

  4. Chef infrastructure automation cookbook

    CERN Document Server

    Marschall, Matthias

    2013-01-01

    Chef Infrastructure Automation Cookbook contains practical recipes on everything you will need to automate your infrastructure using Chef. The book is packed with illustrated code examples to automate your server and cloud infrastructure.The book first shows you the simplest way to achieve a certain task. Then it explains every step in detail, so that you can build your knowledge about how things work. Eventually, the book shows you additional things to consider for each approach. That way, you can learn step-by-step and build profound knowledge on how to go about your configuration management

  5. Automated systems to identify relevant documents in product risk management

    Directory of Open Access Journals (Sweden)

    Wee Xue

    2012-03-01

    Full Text Available Abstract Background Product risk management involves critical assessment of the risks and benefits of health products circulating in the market. One of the important sources of safety information is the primary literature, especially for newer products with which regulatory authorities have relatively little experience. Although the primary literature provides vast and diverse information, only a small proportion of it is useful for product risk assessment work. Hence, the aim of this study is to explore the possibility of using text mining to automate the identification of useful articles, which will reduce the time taken for literature search and hence improve work efficiency. In this study, term-frequency inverse document-frequency values were computed for predictors extracted from the titles and abstracts of articles related to three tumour necrosis factor-alpha blockers. A general automated system was developed using only general predictors and was tested for its generalizability using articles related to four other drug classes. Several specific automated systems were developed using both general and specific predictors and training sets of different sizes in order to determine the minimum number of articles required for developing such systems. Results The general automated system had an area under the curve value of 0.731 and was able to rank 34.6% and 46.2% of the total number of 'useful' articles among the first 10% and 20% of the articles presented to the evaluators when tested on the generalizability set. However, its use may be limited by the subjective definition of useful articles. For the specific automated system, it was found that only 20 articles were required to develop a specific automated system with a prediction performance (AUC 0.748) that was better than that of the general automated system. Conclusions Specific automated systems can be developed rapidly and avoid problems caused by subjective definition of useful
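    A minimal sketch of the TF-IDF-plus-classifier idea is shown below using scikit-learn; the toy titles, labels and the logistic regression choice are illustrative assumptions and do not reproduce the predictors or evaluation of the study.

```python
# Sketch: rank candidate articles by predicted usefulness using TF-IDF features
# and a logistic regression classifier (toy data; illustrative only).
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression

labelled = [
    ("serious adverse events reported during anti-TNF therapy", 1),
    ("case report of infection in a patient on TNF-alpha blocker", 1),
    ("crystal structure of an unrelated bacterial enzyme", 0),
    ("cost analysis of hospital staffing models", 0),
]
texts, labels = zip(*labelled)

vectorizer = TfidfVectorizer(ngram_range=(1, 2), min_df=1)
X_train = vectorizer.fit_transform(texts)
model = LogisticRegression(max_iter=1000).fit(X_train, labels)

unscreened = [
    "randomized trial of adverse events with TNF-alpha blocker therapy",
    "soil nutrient mapping in agricultural fields",
]
scores = model.predict_proba(vectorizer.transform(unscreened))[:, 1]
for score, title in sorted(zip(scores, unscreened), reverse=True):
    print(f"{score:.2f}  {title}")
```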

  6. The Challenge of Wireless Connectivity to Support Intelligent Mines

    DEFF Research Database (Denmark)

    Barbosa, Viviane S. B.; Garcia, Luis G. U.; Caldwell, George

    2016-01-01

    and increase productivity, from extraction all the way to the delivery of a processed product to the customer. In this context, one of the key enablers is wireless connectivity since it allows mining equipment to be remotely monitored and controlled. Simply put, dependable wireless connectivity is essential......, but mines change by definition on a daily-basis. Therefore, a careful and continuous effort must be made to ensure the wireless network keeps up with the topographic and operational changes in order to provide the necessary network availability, reliability, capacity and coverage needed to support a new...... by mining automation and discuss the consequences of not providing connectivity for all applications. The work also discusses how the careful positioning of the heavy communications infrastructure (tall towers) from the early stages of the mine site project can make the provision of incremental capacity...

  7. Application of fuzzy logic for determining of coal mine mechanization

    Institute of Scientific and Technical Information of China (English)

    HOSSEINI SAA; ATAEI M; HOSSEINI S M; AKHYANI M

    2012-01-01

    The fundamental task of mining engineers is to produce more coal at a given level of labour input and material costs, for optimum quality and maximum efficiency. To achieve these goals, it is necessary to automate and mechanize mining operations. Mechanization is an objective that can result in significant cost reduction and higher levels of profitability for underground mines. To analyze the potential for mechanization, some important factors such as seam inclination and thickness, geological disturbances, seam floor conditions and roof conditions should be considered. In this study we used fuzzy logic and membership functions, created fuzzy rule-based methods, and considered the ultimate objective: mechanization of mining. As a case study, the mechanization of the Tazare coal seams in the Shahroud area of Iran was investigated. The results show a low potential for mechanization in most of the Tazare coal seams.
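    As a hedged illustration of fuzzy rule-based scoring of this kind, the sketch below combines two of the listed factors (seam thickness and inclination) with triangular membership functions and min/max operators; the breakpoints, rules and weights are invented for illustration and are not taken from the Tazare study.

```python
# Illustrative fuzzy scoring of mechanization potential from seam thickness and
# inclination; membership breakpoints and rules are invented for illustration.
def triangular(x: float, a: float, b: float, c: float) -> float:
    """Triangular membership function with feet at a and c and peak at b."""
    if x <= a or x >= c:
        return 0.0
    return (x - a) / (b - a) if x <= b else (c - x) / (c - b)

def mechanization_potential(thickness_m: float, inclination_deg: float) -> float:
    favourable_thickness = triangular(thickness_m, 1.0, 2.5, 4.0)
    low_inclination = triangular(inclination_deg, -1.0, 0.0, 25.0)
    steep_inclination = triangular(inclination_deg, 20.0, 60.0, 90.0)

    # Rule 1: favourable thickness AND low inclination -> high potential
    high = min(favourable_thickness, low_inclination)
    # Rule 2: steep inclination -> low potential
    low = steep_inclination

    # Simple weighted defuzzification onto a 0..1 scale.
    if high + low == 0:
        return 0.0
    return (high * 1.0 + low * 0.1) / (high + low)

if __name__ == "__main__":
    print(round(mechanization_potential(thickness_m=2.0, inclination_deg=10.0), 2))
    print(round(mechanization_potential(thickness_m=2.0, inclination_deg=55.0), 2))
```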

  8. Application of text mining in the biomedical domain.

    Science.gov (United States)

    Fleuren, Wilco W M; Alkema, Wynand

    2015-03-01

    In recent years the amount of experimental data that is produced in biomedical research and the number of papers that are being published in this field have grown rapidly. In order to keep up to date with developments in their field of interest and to interpret the outcome of experiments in light of all available literature, researchers turn more and more to the use of automated literature mining. As a consequence, text mining tools have evolved considerably in number and quality and nowadays can be used to address a variety of research questions ranging from de novo drug target discovery to enhanced biological interpretation of the results from high-throughput experiments. In this paper we introduce the most important techniques that are used for text mining and give an overview of the text mining tools that are currently being used and the types of problems they are typically applied to.

  9. Mining text data

    CERN Document Server

    Aggarwal, Charu C

    2012-01-01

    Text mining applications have experienced tremendous advances because of web 2.0 and social networking applications. Recent advances in hardware and software technology have led to a number of unique scenarios where text mining algorithms are learned. "Mining Text Data" introduces an important niche in the text analytics field, and is an edited volume contributed by leading international researchers and practitioners focused on social networks & data mining. This book contains a wide swath of topics across social networks & data mining. Each chapter contains a comprehensive survey including

  10. Automation of a single-DNA molecule stretching device

    DEFF Research Database (Denmark)

    Sørensen, Kristian Tølbøl; Lopacinska, Joanna M.; Tommerup, Niels

    2015-01-01

    We automate the manipulation of genomic-length DNA in a nanofluidic device based on real-time analysis of fluorescence images. In our protocol, individual molecules are picked from a microchannel and stretched with pN forces using pressure driven flows. The millimeter-long DNA fragments free...

  11. Technology development for remote, computer-assisted operation of a continuous mining machine

    Energy Technology Data Exchange (ETDEWEB)

    Schnakenberg, G.H. [Pittsburgh Research Center, PA (United States)

    1993-12-31

    The U.S. Bureau of Mines was created to conduct research to improve the health, safety, and efficiency of the coal and metal mining industries. In 1986, the Bureau embarked on a new, major research effort to develop the technology that would enable the relocation of workers from hazardous areas to areas of relative safety. This effort is in contrast to historical efforts by the Bureau to control or reduce the hazardous agent or to provide protection to the worker. The technologies associated with automation, robotics, and computer software and hardware systems had progressed to the point that their use to develop computer-assisted operation of mobile mining equipment appeared to be a cost-effective and accomplishable task. At the first International Symposium on Mine Mechanization and Automation, an overview of the Bureau's computer-assisted mining program for underground coal mining was presented. The elements included providing computer-assisted tele-remote operation of continuous mining machines, haulage systems and roof bolting machines. Areas of research included sensors for machine guidance and for coal interface detection. Additionally, the research included computer hardware and software architectures, which are extremely important in developing technology that is transferable to industry and is flexible enough to accommodate the variety of machines used in coal mining today. This paper provides an update of the research under the computer-assisted mining program.

  12. Automated Vehicles Symposium 2015

    CERN Document Server

    Beiker, Sven

    2016-01-01

    This edited book comprises papers about the impacts, benefits and challenges of connected and automated cars. It is the third volume of the LNMOB series dealing with Road Vehicle Automation. The book comprises contributions from researchers, industry practitioners and policy makers, covering perspectives from the U.S., Europe and Japan. It is based on the Automated Vehicles Symposium 2015, which was jointly organized by the Association for Unmanned Vehicle Systems International (AUVSI) and the Transportation Research Board (TRB) in Ann Arbor, Michigan, in July 2015. The topical spectrum includes, but is not limited to, public sector activities, human factors, ethical and business aspects, energy and technological perspectives, vehicle systems and transportation infrastructure. This book is an indispensable source of information for academic researchers, industrial engineers and policy makers interested in the topic of road vehicle automation.

  13. I-94 Automation FAQs

    Data.gov (United States)

    Department of Homeland Security — In order to increase efficiency, reduce operating costs and streamline the admissions process, U.S. Customs and Border Protection has automated Form I-94 at air and...

  14. Automated Vehicles Symposium 2014

    CERN Document Server

    Beiker, Sven; Road Vehicle Automation 2

    2015-01-01

    This paper collection is the second volume of the LNMOB series on Road Vehicle Automation. The book contains a comprehensive review of current technical, socio-economic, and legal perspectives written by experts coming from public authorities, companies and universities in the U.S., Europe and Japan. It originates from the Automated Vehicle Symposium 2014, which was jointly organized by the Association for Unmanned Vehicle Systems International (AUVSI) and the Transportation Research Board (TRB) in Burlingame, CA, in July 2014. The contributions discuss the challenges arising from the integration of highly automated and self-driving vehicles into the transportation system, with a focus on human factors and different deployment scenarios. This book is an indispensable source of information for academic researchers, industrial engineers, and policy makers interested in the topic of road vehicle automation.

  15. Hydrometeorological Automated Data System

    Data.gov (United States)

    National Oceanic and Atmospheric Administration, Department of Commerce — The Office of Hydrologic Development of the National Weather Service operates HADS, the Hydrometeorological Automated Data System. This data set contains the last 48...

  16. Automating the Media Center.

    Science.gov (United States)

    Holloway, Mary A.

    1988-01-01

    Discusses the need to develop more efficient information retrieval skills by the use of new technology. Lists four stages used in automating the media center. Describes North Carolina's pilot programs. Proposes benefits and looks at the media center's future. (MVL)

  17. De Novo Assembly of Human Herpes Virus Type 1 (HHV-1) Genome, Mining of Non-Canonical Structures and Detection of Novel Drug-Resistance Mutations Using Short- and Long-Read Next Generation Sequencing Technologies

    Science.gov (United States)

    Karamitros, Timokratis; Piorkowska, Renata; Katzourakis, Aris; Magiorkinis, Gkikas; Mbisa, Jean Lutamyo

    2016-01-01

    Human herpesvirus type 1 (HHV-1) has a large double-stranded DNA genome of approximately 152 kbp that is structurally complex and GC-rich. This makes the assembly of HHV-1 whole genomes from short-read sequencing data technically challenging. To improve the assembly of HHV-1 genomes we have employed a hybrid genome assembly protocol using data from two sequencing technologies: the short-read Roche 454 and the long-read Oxford Nanopore MinION sequencers. We sequenced 18 HHV-1 cell culture-isolated clinical specimens collected from immunocompromised patients undergoing antiviral therapy. The susceptibility of the samples to several antivirals was determined by plaque reduction assay. Hybrid genome assembly resulted in a decrease in the number of contigs in 6 out of 7 samples and an increase in N(G)50 and N(G)75 of all 7 samples sequenced by both technologies. The approach also enhanced the detection of non-canonical contigs including a rearrangement between the unique (UL) and repeat (T/IRL) sequence regions of one sample that was not detectable by assembly of 454 reads alone. We detected several known and novel resistance-associated mutations in UL23 and UL30 genes. Genome-wide genetic variability ranged from <1% to 53% of amino acids in each gene exhibiting at least one substitution within the pool of samples. The UL23 gene had one of the highest genetic variabilities at 35.2% in keeping with its role in development of drug resistance. The assembly of accurate, full-length HHV-1 genomes will be useful in determining genetic determinants of drug resistance, virulence, pathogenesis and viral evolution. The numerous, complex repeat regions of the HHV-1 genome currently remain a barrier towards this goal. PMID:27309375
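
    As an aside, the contiguity statistics quoted in this abstract (N50, NG50, NG75) can be computed from a list of contig lengths as in the following Python sketch; the contig lengths are hypothetical, and only the ~152 kbp genome size comes from the abstract.

    # Minimal sketch of the contiguity statistics cited above (N50 / NG50 / NG75).
    # Contig lengths are illustrative inputs; the genome size is approximate.

    def nx(contig_lengths, fraction, total=None):
        """Return the length L such that contigs >= L cover `fraction` of `total`.
        With total = assembly size this is Nx; with total = genome size it is NGx."""
        if total is None:
            total = sum(contig_lengths)
        threshold = fraction * total
        covered = 0
        for length in sorted(contig_lengths, reverse=True):
            covered += length
            if covered >= threshold:
                return length
        return 0  # assembly does not cover the requested fraction of the genome

    contigs = [62000, 41000, 23000, 11000, 7000, 4000]   # hypothetical contig lengths (bp)
    genome_size = 152000                                  # approximate HHV-1 genome size (bp)
    print("N50 :", nx(contigs, 0.50))
    print("NG50:", nx(contigs, 0.50, total=genome_size))
    print("NG75:", nx(contigs, 0.75, total=genome_size))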

  18. Data mining in radiology.

    Science.gov (United States)

    Kharat, Amit T; Singh, Amarjit; Kulkarni, Vilas M; Shah, Digish

    2014-04-01

    Data mining facilitates the study of radiology data in various dimensions. It converts large patient image and text datasets into useful information that helps in improving patient care and provides informative reports. Data mining technology analyzes data within the Radiology Information System and Hospital Information System using specialized software which assesses relationships and agreement in available information. By using similar data analysis tools, radiologists can make informed decisions and predict the future outcome of a particular imaging finding. Data, information and knowledge are the components of data mining. Classes, Clusters, Associations, Sequential patterns, Classification, Prediction and Decision tree are the various types of data mining. Data mining has the potential to make delivery of health care affordable and ensure that the best imaging practices are followed. It is a tool for academic research. Data mining is considered to be ethically neutral; however, concerns regarding privacy and legality exist and need to be addressed to ensure the success of data mining.

  19. Data mining in radiology

    Directory of Open Access Journals (Sweden)

    Amit T Kharat

    2014-01-01

    Full Text Available Data mining facilitates the study of radiology data in various dimensions. It converts large patient image and text datasets into useful information that helps in improving patient care and provides informative reports. Data mining technology analyzes data within the Radiology Information System and Hospital Information System using specialized software which assesses relationships and agreement in available information. By using similar data analysis tools, radiologists can make informed decisions and predict the future outcome of a particular imaging finding. Data, information and knowledge are the components of data mining. Classes, Clusters, Associations, Sequential patterns, Classification, Prediction and Decision tree are the various types of data mining. Data mining has the potential to make delivery of health care affordable and ensure that the best imaging practices are followed. It is a tool for academic research. Data mining is considered to be ethically neutral; however, concerns regarding privacy and legality exist and need to be addressed to ensure the success of data mining.

  20. Disassembly automation automated systems with cognitive abilities

    CERN Document Server

    Vongbunyong, Supachai

    2015-01-01

    This book presents a number of aspects to be considered in the development of disassembly automation, including the mechanical system, vision system and intelligent planner. The implementation of cognitive robotics increases the flexibility and degree of autonomy of the disassembly system. Disassembly, as a step in the treatment of end-of-life products, can allow the recovery of embodied value left within disposed products, as well as the appropriate separation of potentially-hazardous components. In the end-of-life treatment industry, disassembly has largely been limited to manual labor, which is expensive in developed countries. Automation is one possible solution for economic feasibility. The target audience primarily comprises researchers and experts in the field, but the book may also be beneficial for graduate students.

  1. ACCOUNTING AUTOMATIONS RISKS

    OpenAIRE

    Муравський, В. В.; Хома, Н. Г.

    2015-01-01

    The accountant plays an active role in organizing automated accounting when information systems are introduced into enterprise activity. Effective accounting automation requires the identification of and protection against organizational risks. The authors researched, classified and generalized the risks of introducing information accounting systems. Ways of eliminating the sources of organizational risks and minimizing their consequences are given. The method of the effective con...

  2. Instant Sikuli test automation

    CERN Document Server

    Lau, Ben

    2013-01-01

    Get to grips with a new technology, understand what it is and what it can do for you, and then get to work with the most important features and tasks. A concise guide written in an easy-to-follow style using the Starter guide approach. This book is aimed at automation and testing professionals who want to use Sikuli to automate GUIs. Some Python programming experience is assumed.

  3. Automated security management

    CERN Document Server

    Al-Shaer, Ehab; Xie, Geoffrey

    2013-01-01

    In this contributed volume, leading international researchers explore configuration modeling and checking, vulnerability and risk assessment, configuration analysis, and diagnostics and discovery. The authors equip readers to understand automated security management systems and techniques that increase overall network assurability and usability. These constantly changing networks defend against cyber attacks by integrating hundreds of security devices such as firewalls, IPSec gateways, IDS/IPS, authentication servers, authorization/RBAC servers, and crypto systems. Automated Security Managemen

  4. Automation of Diagrammatic Reasoning

    OpenAIRE

    Jamnik, Mateja; Bundy, Alan; Green, Ian

    1997-01-01

    Theorems in automated theorem proving are usually proved by logical formal proofs. However, there is a subset of problems which humans can prove in a different way by the use of geometric operations on diagrams, so called diagrammatic proofs. Insight is more clearly perceived in these than in the corresponding algebraic proofs: they capture an intuitive notion of truthfulness that humans find easy to see and understand. We are identifying and automating this diagrammatic reasoning on mathemat...

  5. Automated Lattice Perturbation Theory

    Energy Technology Data Exchange (ETDEWEB)

    Monahan, Christopher

    2014-11-01

    I review recent developments in automated lattice perturbation theory. Starting with an overview of lattice perturbation theory, I focus on the three automation packages currently "on the market": HiPPy/HPsrc, Pastor and PhySyCAl. I highlight some recent applications of these methods, particularly in B physics. In the final section I briefly discuss the related, but distinct, approach of numerical stochastic perturbation theory.

  6. Marketing automation supporting sales

    OpenAIRE

    Sandell, Niko

    2016-01-01

    The past couple of decades have been a time of major changes in marketing. Digitalization has become a permanent part of marketing and at the same time enabled efficient collection of data. Personalization and customization of content are playing a crucial role in marketing when new customers are acquired. This has also created a need for automation to facilitate the distribution of targeted content. As a result of successful marketing automation more information about customers is gathered ...

  7. Informationization of coal enterprises and digital mine

    Energy Technology Data Exchange (ETDEWEB)

    Lu, Jian-jun; Wang, Xiao-lu; Ma, Li; Zhao, An-xin [Xi'an University of Science and Technology, Xi'an (China). School of Communication and Information Engineering

    2008-09-15

    The main problems found in the current conditions of informationization in coal enterprises in China were analysed. The paper clarified how to achieve informationization in coal mining and put forward a general configuration of informationization construction in which informationization in coal enterprises was divided into two parts: informationization of safety production and informationization of management. A platform for the integrated management of informationization in coal enterprises was planned. Ultimately, it was considered that an overall integrated digital mine is the way to achieve the goal of informationization in coal enterprises, which can promote the application of automation, digitalization, networking and informationization towards intellectualization. At the same time, the competitiveness of enterprises can be improved entirely and a new type of coal industry can be supported by information technology. 8 refs., 4 figs.

  8. Informationization of coal enterprises and digital mine

    Institute of Scientific and Technical Information of China (English)

    LU Jian-jun; WANG Xiao-lu; MA Li; ZHAO An-xin

    2008-01-01

    This paper analyzed the main problems found in the current conditions of informationization in coal enterprises. It clarified how to achieve informationization in coal mines and put forward a general configuration of informationization construction in which informationization in coal enterprises was divided into two parts: informationization of safety production and informationization of management. It planned a platform for the integrated management of informationization in coal enterprises. Ultimately, it brought forward that an overall integrated digital mine is the way to achieve the goal of informationization in coal enterprises, which can promote the application of automation, digitalization, networking and informationization towards intellectualization. At the same time, the competitiveness of enterprises can be improved entirely, and a new type of coal industry can be supported by information technology.

  9. Opinion mining and summarization for customer reviews

    Directory of Open Access Journals (Sweden)

    Sanjeev kumar Chauhan

    2012-08-01

    Full Text Available Opinion mining aims to detect the opinion of the author expressed in a document. The primary task in the field of opinion mining is subjectivity analysis, which determines whether a document is subjective or objective. Subjectivity indicates that the document contains some opinionated content, while objectivity indicates that it contains no opinionated content, i.e. it expresses no sentiment. The next task is sentiment polarity analysis, which differentiates documents according to their positivity or negativity. But presently there is no automated system which can perform this task. We are developing a system which can find the degree of polarity of each document and, according to it, assign a human-like rating to that document. Finally, it generates a summary of the review which contains only the highly subjective and feature-related parts of the document.
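
    For orientation, the sketch below shows a generic lexicon-based polarity score mapped to a star rating. It is a common baseline rather than the system described in this abstract, and the word lists are illustrative.

    # Minimal lexicon-based sketch of sentiment polarity scoring and a derived
    # star rating. Word lists and the rating mapping are illustrative choices.
    POSITIVE = {"good", "great", "excellent", "love", "amazing"}
    NEGATIVE = {"bad", "poor", "terrible", "hate", "disappointing"}

    def polarity(review):
        """Return a polarity score in [-1, 1] from counts of opinion words."""
        words = review.lower().split()
        pos = sum(w.strip(".,!?") in POSITIVE for w in words)
        neg = sum(w.strip(".,!?") in NEGATIVE for w in words)
        total = pos + neg
        return 0.0 if total == 0 else (pos - neg) / total

    def star_rating(review):
        """Map the polarity score onto a human-like 1-5 rating."""
        return round(1 + 2 * (polarity(review) + 1))  # -1 -> 1 star, +1 -> 5 stars

    print(star_rating("Great battery life, excellent screen, but poor speakers."))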

  10. Reality mining of animal social systems.

    Science.gov (United States)

    Krause, Jens; Krause, Stefan; Arlinghaus, Robert; Psorakis, Ioannis; Roberts, Stephen; Rutz, Christian

    2013-09-01

    The increasing miniaturisation of animal-tracking technology has made it possible to gather exceptionally detailed machine-sensed data on the social dynamics of almost entire populations of individuals, in both terrestrial and aquatic study systems. Here, we review important issues concerning the collection of such data, and their processing and analysis, to identify the most promising approaches in the emerging field of 'reality mining'. Automated technologies can provide data sensing at time intervals small enough to close the gap between social patterns and their underlying processes, providing insights into how social structures arise and change dynamically over different timescales. Especially in conjunction with experimental manipulations, reality mining promises significant advances in basic and applied research on animal social systems.

  11. Elements of EAF automation processes

    Science.gov (United States)

    Ioana, A.; Constantin, N.; Dragna, E. C.

    2017-01-01

    Our article presents elements of Electric Arc Furnace (EAF) automation. We present and analyze in detail two automation schemes: the scheme of the electrical EAF automation system and the scheme of the thermal EAF automation system. The results of applying these automation schemes consist in: a significant reduction in the specific consumption of electrical energy by the Electric Arc Furnace, increased productivity of the Electric Arc Furnace, improved quality of the produced steel, and increased durability of the structural elements of the Electric Arc Furnace.

  12. Data mining, mining data : energy consumption modelling

    Energy Technology Data Exchange (ETDEWEB)

    Dessureault, S. [Arizona Univ., Tucson, AZ (United States)

    2007-09-15

    Most modern mining operations are accumulating large amounts of data on production and business processes. Data, however, provides value only if it can be translated into information that appropriate users can utilize. This paper emphasized that a new technological focus should emerge, notably how to concentrate data into information; analyze information sufficiently to become knowledge; and, act on that knowledge. Researchers at the Mining Information Systems and Operations Management (MISOM) laboratory at the University of Arizona have created a method to transform data into action. The data-to-action approach was exercised in the development of an energy consumption model (ECM), in partnership with a major US-based copper mining company, 2 software companies, and the MISOM laboratory. The approach begins by integrating several key data sources using data warehousing techniques, and increasing the existing level of integration and data cleaning. An online analytical processing (OLAP) cube was also created to investigate the data and identify a subset of several million records. Data mining algorithms were applied using the information that was isolated by the OLAP cube. The data mining results showed that traditional cost drivers of energy consumption are poor predictors. A comparison was made between traditional methods of predicting energy consumption and the prediction formed using data mining. Traditionally, in the mines for which data were available, monthly averages of tons and distance are used to predict diesel fuel consumption. However, this article showed that new information technology can be used to incorporate many more variables into the budgeting process, resulting in more accurate predictions. The ECM helped mine planners improve the prediction of energy use through more data integration, measure development, and workflow analysis. 5 refs., 11 figs.
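
    The contrast drawn above between averaging-based budgeting and multi-variable prediction can be illustrated with the following Python sketch. The operational data are synthetic, and the least-squares fit merely stands in for the data mining algorithms used in the ECM.

    # Minimal sketch contrasting a traditional average-based fuel estimate with a
    # multi-variable regression. All numbers below are synthetic.
    import numpy as np

    # columns: tons hauled, haul distance (km), average grade (%), idle hours
    X = np.array([
        [1200, 4.1, 8.0, 3.0],
        [1500, 3.8, 9.5, 2.0],
        [ 900, 5.0, 7.5, 5.0],
        [1700, 4.4, 8.8, 1.5],
        [1100, 4.6, 8.2, 4.0],
        [1400, 4.0, 9.0, 2.5],
    ])
    fuel = np.array([5100, 5600, 4300, 6200, 4900, 5400])   # litres of diesel consumed

    # Traditional approach: litres per ton-kilometre applied to tons and distance only.
    l_per_tonkm = fuel.sum() / (X[:, 0] * X[:, 1]).sum()
    naive = l_per_tonkm * X[:, 0] * X[:, 1]

    # Data-mining style: least-squares fit over all available variables.
    A = np.column_stack([X, np.ones(len(X))])               # add an intercept term
    coef, *_ = np.linalg.lstsq(A, fuel, rcond=None)
    fitted = A @ coef

    print("naive error :", np.abs(naive - fuel).mean())
    print("model error :", np.abs(fitted - fuel).mean())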

  13. Genome Modeling System: A Knowledge Management Platform for Genomics.

    Directory of Open Access Journals (Sweden)

    Malachi Griffith

    2015-07-01

    Full Text Available In this work, we present the Genome Modeling System (GMS), an analysis information management system capable of executing automated genome analysis pipelines at a massive scale. The GMS framework provides detailed tracking of samples and data coupled with reliable and repeatable analysis pipelines. The GMS also serves as a platform for bioinformatics development, allowing a large team to collaborate on data analysis, or an individual researcher to leverage the work of others effectively within its data management system. Rather than separating ad-hoc analysis from rigorous, reproducible pipelines, the GMS promotes systematic integration between the two. As a demonstration of the GMS, we performed an integrated analysis of whole genome, exome and transcriptome sequencing data from a breast cancer cell line (HCC1395) and matched lymphoblastoid line (HCC1395BL). These data are available for users to test the software, complete tutorials and develop novel GMS pipeline configurations. The GMS is available at https://github.com/genome/gms.

  14. Automated discovery of single nucleotide polymorphism and simple sequence repeat molecular genetic markers.

    Science.gov (United States)

    Batley, Jacqueline; Jewell, Erica; Edwards, David

    2007-01-01

    Molecular genetic markers represent one of the most powerful tools for the analysis of genomes. Molecular marker technology has developed rapidly over the last decade, and two forms of sequence-based markers, simple sequence repeats (SSRs), also known as microsatellites, and single nucleotide polymorphisms (SNPs), now predominate applications in modern genetic analysis. The availability of large sequence data sets permits mining for SSRs and SNPs, which may then be applied to genetic trait mapping and marker-assisted selection. Here, we describe Web-based automated methods for the discovery of these SSRs and SNPs from sequence data. SSRPrimer enables the real-time discovery of SSRs within submitted DNA sequences, with the concomitant design of PCR primers for SSR amplification. Alternatively, users may browse the SSR Taxonomy Tree to identify predetermined SSR amplification primers for any species represented within the GenBank database. SNPServer uses a redundancy-based approach to identify SNPs within DNA sequence data. Following submission of a sequence of interest, SNPServer uses BLAST to identify similar sequences, CAP3 to cluster and assemble these sequences, and then the SNP discovery software autoSNP to detect SNPs and insertion/deletion (indel) polymorphisms.
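
    A minimal Python sketch of the repeat-detection idea behind SSR discovery is given below; it is not the SSRPrimer implementation, and the motif lengths and repeat threshold are arbitrary.

    # Minimal sketch of SSR (microsatellite) detection with a regular expression.
    # Overlapping and nested repeats are reported; production tools post-filter these.
    import re

    def find_ssrs(sequence, motif_len=(2, 3, 4), min_repeats=4):
        """Yield (start, motif, repeat_count) for perfect tandem repeats."""
        for k in motif_len:
            pattern = re.compile(r"(?=(([ACGT]{%d})\2{%d,}))" % (k, min_repeats - 1))
            for m in pattern.finditer(sequence):
                repeat, motif = m.group(1), m.group(2)
                yield m.start(), motif, len(repeat) // k

    seq = "TTGACACACACACAGGTCTAGCTAGCTAGCTAGCAA"
    for start, motif, count in find_ssrs(seq):
        print(f"pos {start}: ({motif})x{count}")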

  15. Data mining approach to model the diagnostic service management.

    Science.gov (United States)

    Lee, Sun-Mi; Lee, Ae-Kyung; Park, Il-Su

    2006-01-01

    Korea has a National Health Insurance Program operated by the government-owned National Health Insurance Corporation, and diagnostic services are provided every two years for the insured and their family members. Developing a customer relationship management (CRM) system using data mining technology would be useful to improve the performance of diagnostic service programs. Under these circumstances, this study developed a model for diagnostic service management taking into account the characteristics of subjects using a data mining approach. This study could be further used to develop an automated CRM system contributing to the increase in the rate of receiving diagnostic services.

  16. Coal mine site reclamation

    Energy Technology Data Exchange (ETDEWEB)

    NONE

    2013-02-15

    Coal mine sites can have significant effects on local environments. In addition to the physical disruption of land forms and ecosystems, mining can also leave behind a legacy of secondary detrimental effects due to leaching of acid and trace elements from discarded materials. This report looks at the remediation of both deep mine and opencast mine sites, covering reclamation methods, back-filling issues, drainage and restoration. Examples of national variations in the applicable legislation and in the definition of rehabilitation are compared. Ultimately, mine site rehabilitation should return sites to conditions where land forms, soils, hydrology, and flora and fauna are self-sustaining and compatible with surrounding land uses. Case studies are given to show what can be achieved and how some landscapes can actually be improved as a result of mining activity.

  17. Implementation of Paste Backfill Mining Technology in Chinese Coal Mines

    Directory of Open Access Journals (Sweden)

    Qingliang Chang

    2014-01-01

    Full Text Available Implementation of clean mining technology at coal mines is crucial to protect the environment and maintain balance among energy resources, consumption, and ecology. After reviewing present coal clean mining technology, we introduce the technology principles and technological process of paste backfill mining in coal mines and discuss the components and features of backfill materials, the constitution of the backfill system, and the backfill process. Specific implementation of this technology and its application are analyzed for paste backfill mining in Daizhuang Coal Mine; a practical implementation shows that paste backfill mining can improve the safety and excavation rate of coal mining, which can effectively resolve surface subsidence problems caused by underground mining activities, by utilizing solid waste such as coal gangues as a resource. Therefore, paste backfill mining is an effective clean coal mining technology, which has widespread application.

  18. Implementation of paste backfill mining technology in Chinese coal mines.

    Science.gov (United States)

    Chang, Qingliang; Chen, Jianhang; Zhou, Huaqiang; Bai, Jianbiao

    2014-01-01

    Implementation of clean mining technology at coal mines is crucial to protect the environment and maintain balance among energy resources, consumption, and ecology. After reviewing present coal clean mining technology, we introduce the technology principles and technological process of paste backfill mining in coal mines and discuss the components and features of backfill materials, the constitution of the backfill system, and the backfill process. Specific implementation of this technology and its application are analyzed for paste backfill mining in Daizhuang Coal Mine; a practical implementation shows that paste backfill mining can improve the safety and excavation rate of coal mining, which can effectively resolve surface subsidence problems caused by underground mining activities, by utilizing solid waste such as coal gangues as a resource. Therefore, paste backfill mining is an effective clean coal mining technology, which has widespread application.

  19. Developing image processing meta-algorithms with data mining of multiple metrics.

    Science.gov (United States)

    Leung, Kelvin; Cunha, Alexandre; Toga, A W; Parker, D Stott

    2014-01-01

    People often use multiple metrics in image processing, but here we take a novel approach of mining the values of batteries of metrics on image processing results. We present a case for extending image processing methods to incorporate automated mining of multiple image metric values. Here by a metric we mean any image similarity or distance measure, and in this paper we consider intensity-based and statistical image measures and focus on registration as an image processing problem. We show how it is possible to develop meta-algorithms that evaluate different image processing results with a number of different metrics and mine the results in an automated fashion so as to select the best results. We show that the mining of multiple metrics offers a variety of potential benefits for many image processing problems, including improved robustness and validation.
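
    The meta-algorithm idea described above can be sketched as follows: score candidate results with a battery of metrics and keep the consensus best. The metrics, images and ranking rule below are illustrative choices, not those of the paper.

    # Minimal sketch: evaluate candidate image processing results with several
    # metrics and select the candidate with the best average rank.
    import numpy as np

    def mse(a, b):
        return float(np.mean((a - b) ** 2))                      # lower is better

    def neg_correlation(a, b):
        return -float(np.corrcoef(a.ravel(), b.ravel())[0, 1])   # lower is better

    def best_candidate(reference, candidates, metrics):
        """Rank every candidate under every metric and return the best average rank."""
        scores = np.array([[m(reference, c) for c in candidates] for m in metrics])
        ranks = scores.argsort(axis=1).argsort(axis=1)           # per-metric ranks, 0 = best
        return int(ranks.mean(axis=0).argmin())

    rng = np.random.default_rng(0)
    reference = rng.random((16, 16))
    candidates = [reference + rng.normal(0, s, reference.shape) for s in (0.05, 0.2, 0.5)]
    print("selected candidate:", best_candidate(reference, candidates, [mse, neg_correlation]))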

  20. Land Mines (Landminen)

    Science.gov (United States)

    1978-02-02

    making contact with the safety pin of the pull fuze 42. Two locking bolts held the upper and lower case in position during transport, so that there... safety pin out of the extended striker, thus releasing it. These mines were filled with 200 g of explosives. This type of mine was the model for the...by inserting the detonator slide. However, the mine is not fully armed until the safety pin is removed and reinserted until it makes contact with the

  1. A REVIEW ON TEXT MINING IN DATA MINING

    OpenAIRE

    2016-01-01

    Data mining is knowledge discovery in databases, and its goal is to extract patterns and knowledge from large amounts of data. An important area within data mining is text mining. Text mining extracts high-quality information from text. Statistical pattern learning is used to obtain this high-quality information. High quality in text mining refers to the combination of relevance, novelty and interestingness. Tasks in text mining are text categorization, text clustering, entity extraction and sentim...

  2. Materials Testing and Automation

    Science.gov (United States)

    Cooper, Wayne D.; Zweigoron, Ronald B.

    1980-07-01

    The advent of automation in materials testing has been in large part responsible for recent radical changes in the materials testing field: Tests virtually impossible to perform without a computer have become more straightforward to conduct. In addition, standardized tests may be performed with enhanced efficiency and repeatability. A typical automated system is described in terms of its primary subsystems — an analog station, a digital computer, and a processor interface. The processor interface links the analog functions with the digital computer; it includes data acquisition, command function generation, and test control functions. Features of automated testing are described with emphasis on calculated variable control, control of a variable that is computed by the processor and cannot be read directly from a transducer. Three calculated variable tests are described: a yield surface probe test, a thermomechanical fatigue test, and a constant-stress-intensity range crack-growth test. Future developments are discussed.
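
    The constant-stress-intensity-range test mentioned above is a good example of calculated variable control. The sketch below assumes the textbook centre-crack relation delta K = delta sigma * sqrt(pi * a), not any specific rig or specimen geometry, and the numbers are illustrative.

    # Minimal sketch of "calculated variable control" for a constant delta K
    # crack-growth test: the controlled quantity (delta K) is computed from
    # transducer readings (crack length) rather than read directly.
    import math

    def stress_range_for_constant_dK(dK_target, crack_length):
        """Stress range (MPa) that keeps delta K at dK_target (MPa*sqrt(m))."""
        return dK_target / math.sqrt(math.pi * crack_length)

    dK_target = 20.0          # MPa*sqrt(m), test setpoint (illustrative)
    crack_length = 0.005      # m, measured at the start of the test (illustrative)
    for cycle_block in range(5):
        stress_range = stress_range_for_constant_dK(dK_target, crack_length)
        print(f"a = {crack_length*1000:5.2f} mm -> apply delta sigma = {stress_range:6.1f} MPa")
        crack_length *= 1.15  # placeholder for the crack growth measured each block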

  3. Automation of Taxiing

    Directory of Open Access Journals (Sweden)

    Jaroslav Bursík

    2017-01-01

    Full Text Available The article focuses on the possibility of automation of taxiing, which is the part of a flight which, under adverse weather conditions, greatly reduces the operational usability of an airport, and is the only part of a flight that has not yet been affected by automation. Taxiing is currently handled manually by the pilot, who controls the airplane based on information from visual perception. The article primarily deals with possible ways of obtaining navigational information and its automatic transfer to the controls. Analyzed and assessed were currently available technologies such as computer vision, Light Detection and Ranging and Global Navigation Satellite System, which are useful for navigation, and their general implementation into an airplane was designed. Obstacles to the implementation were identified, too. The result is a proposed combination of systems along with their installation into the airplane’s systems so that it is possible to use automated taxiing.

  4. Physics Mining of Multi-Source Data Sets

    Science.gov (United States)

    Helly, John; Karimabadi, Homa; Sipes, Tamara

    2012-01-01

    Powerful new parallel data mining algorithms can produce diagnostic and prognostic numerical models and analyses from observational data. These techniques yield higher-resolution measures than ever before of environmental parameters by fusing synoptic imagery and time-series measurements. These techniques are general and relevant to observational data, including raster, vector, and scalar, and can be applied in all Earth- and environmental science domains. Because they can be highly automated and are parallel, they scale to large spatial domains and are well suited to change and gap detection. This makes it possible to analyze spatial and temporal gaps in information, and facilitates within-mission replanning to optimize the allocation of observational resources. The basis of the innovation is the extension of a recently developed set of algorithms packaged into MineTool to multi-variate time-series data. MineTool is unique in that it automates the various steps of the data mining process, thus making it amenable to autonomous analysis of large data sets. Unlike techniques such as Artificial Neural Nets, which yield a blackbox solution, MineTool's outcome is always an analytical model in parametric form that expresses the output in terms of the input variables. This has the advantage that the derived equation can then be used to gain insight into the physical relevance and relative importance of the parameters and coefficients in the model. This is referred to as physics-mining of data. The capabilities of MineTool are extended to include both supervised and unsupervised algorithms, handle multi-type data sets, and parallelize it.
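
    The "analytical model in parametric form" idea can be illustrated with a simple least-squares fit that returns an explicit equation rather than a black box. The data below are synthetic and MineTool itself is not reproduced here.

    # Minimal sketch of the "physics-mining" idea: fit an explicit parametric
    # equation whose coefficients can be inspected and interpreted.
    import numpy as np

    rng = np.random.default_rng(1)
    x = rng.uniform(0, 10, 50)                                     # observed driver variable
    y = 3.0 * x**2 - 2.0 * x + 5.0 + rng.normal(0, 2.0, x.size)    # noisy measured response

    coeffs = np.polyfit(x, y, deg=2)       # least-squares fit of y = a*x^2 + b*x + c
    a, b, c = coeffs
    print(f"recovered model: y = {a:.2f}*x^2 + {b:+.2f}*x + {c:+.2f}")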

  5. Statistical data analytics foundations for data mining, informatics, and knowledge discovery

    CERN Document Server

    Piegorsch, Walter W

    2015-01-01

    A comprehensive introduction to statistical methods for data mining and knowledge discovery. Applications of data mining and 'big data' increasingly take center stage in our modern, knowledge-driven society, supported by advances in computing power, automated data acquisition, social media development and interactive, linkable internet software. This book presents a coherent, technical introduction to modern statistical learning and analytics, starting from the core foundations of statistics and probability. It includes an overview of probability and statistical distributions, basic

  6. Automating the CMS DAQ

    CERN Document Server

    Bauer, Gerry; Behrens, Ulf; Branson, James; Chaze, Olivier; Cittolin, Sergio; Coarasa Perez, Jose Antonio; Darlea, Georgiana Lavinia; Deldicque, Christian; Dobson, Marc; Dupont, Aymeric; Erhan, Samim; Gigi, Dominique; Glege, Frank; Gomez Ceballos, Guillelmo; Gomez-Reino Garrido, Robert; Hartl, Christian; Hegeman, Jeroen Guido; Holzner, Andre Georg; Masetti, Lorenzo; Meijers, Franciscus; Meschi, Emilio; Mommsen, Remigius; Morovic, Srecko; Nunez Barranco Fernandez, Carlos; O'Dell, Vivian; Orsini, Luciano; Ozga, Wojciech Andrzej; Paus, Christoph Maria Ernst; Petrucci, Andrea; Pieri, Marco; Racz, Attila; Raginel, Olivier; Sakulin, Hannes; Sani, Matteo; Schwick, Christoph; Spataru, Andrei Cristian; Stieger, Benjamin Bastian; Sumorok, Konstanty; Veverka, Jan; Wakefield, Christopher Colin; Zejdl, Petr

    2014-01-01

    We present the automation mechanisms that have been added to the Data Acquisition and Run Control systems of the Compact Muon Solenoid (CMS) experiment during Run 1 of the LHC, ranging from the automation of routine tasks to automatic error recovery and context-sensitive guidance to the operator. These mechanisms helped CMS to maintain a data taking efficiency above 90% and to even improve it to 95% towards the end of Run 1, despite an increase in the occurrence of single-event upsets in sub-detector electronics at high LHC luminosity.

  7. Automating the CMS DAQ

    Energy Technology Data Exchange (ETDEWEB)

    Bauer, G.; et al.

    2014-01-01

    We present the automation mechanisms that have been added to the Data Acquisition and Run Control systems of the Compact Muon Solenoid (CMS) experiment during Run 1 of the LHC, ranging from the automation of routine tasks to automatic error recovery and context-sensitive guidance to the operator. These mechanisms helped CMS to maintain a data taking efficiency above 90% and to even improve it to 95% towards the end of Run 1, despite an increase in the occurrence of single-event upsets in sub-detector electronics at high LHC luminosity.

  8. Altering users' acceptance of automation through prior automation exposure.

    Science.gov (United States)

    Bekier, Marek; Molesworth, Brett R C

    2016-08-22

    Air navigation service providers worldwide see increased use of automation as one solution to overcome the capacity constraints imbedded in the present air traffic management (ATM) system. However, increased use of automation within any system is dependent on user acceptance. The present research sought to determine if the point at which an individual is no longer willing to accept or cooperate with automation can be manipulated. Forty participants underwent training on a computer-based air traffic control programme, followed by two ATM exercises (order counterbalanced), one with and one without the aid of automation. Results revealed after exposure to a task with automation assistance, user acceptance of high(er) levels of automation ('tipping point') decreased; suggesting it is indeed possible to alter automation acceptance. Practitioner Summary: This paper investigates whether the point at which a user of automation rejects automation (i.e. 'tipping point') is constant or can be manipulated. The results revealed after exposure to a task with automation assistance, user acceptance of high(er) levels of automation decreased; suggesting it is possible to alter automation acceptance.

  9. Genome bioinformatics of tomato and potato

    OpenAIRE

    E Datema

    2011-01-01

    In the past two decades genome sequencing has developed from a laborious and costly technology employed by large international consortia to a widely used, automated and affordable tool used worldwide by many individual research groups. Genome sequences of many food animals and crop plants have been deciphered and are being exploited for fundamental research and applied to improve their breeding programs. The developments in sequencing technologies have also impacted the associated bioinformat...

  10. Integrated efficient solution : mines adopting energy management system

    Energy Technology Data Exchange (ETDEWEB)

    Lopez-Pacheco, A.

    2010-12-15

    This article discussed a multi-faceted approach to energy consumption optimization (ECO) in underground mines. BESTECH, a provider of engineering, software and environmental monitoring services, developed NRG1-ECO to manage the many pieces of automated equipment in a mine. This complete energy management system can be applied to processes such as compressors, pumps and other systems in a mine that could benefit from lower energy use. A mine's ventilation system usually operates continuously at peak capacity. The ventilation-on-demand (VOD) module of NRG1-ECO can reduce ventilation costs by up to 30 percent by enabling the mine to instantly control the air flow to where it is needed. BESTECH developed the technology in collaboration with a consortium of mine industry experts to establish best practices and standards. There are 5 control strategies in NRG1-ECO, including time-of-day scheduling; real-time control; environmental monitoring control; real-time air flow monitoring and adjustment; and the Intelligent Zone Controller (IZC), which increases system responsiveness, as data can be analyzed and processed internally and does not have to be transmitted to the surface for decision making. Each control parameter is based on the needs of individual mines and designed according to their specifications. NRG1-ECO also stores all the data it processes for real-time monitoring. The system will be tested in a two-stage approach in the Hoyle Pond Mine in Timmins, Ontario, and will expand to the rest of the mine if that proves to be satisfactory. 1 fig.
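
    Purely as an illustration of how such control strategies might combine, the sketch below mixes a time-of-day schedule with an environmental override for a fan setpoint. The thresholds and speed levels are invented and are not NRG1-ECO parameters.

    # Minimal sketch of ventilation-on-demand style control combining two of the
    # strategies listed above (time-of-day scheduling and environmental monitoring).
    def fan_setpoint(hour, co_ppm, equipment_active):
        """Return fan speed as a fraction of full capacity."""
        if co_ppm > 25:                  # environmental override always wins
            return 1.0
        if not equipment_active:         # no diesel equipment in the zone
            return 0.3                   # background air flow only
        if 6 <= hour < 22:               # production shift: time-of-day schedule
            return 0.9
        return 0.5                       # maintenance / off-peak hours

    print(fan_setpoint(hour=14, co_ppm=8, equipment_active=True))    # 0.9
    print(fan_setpoint(hour=2,  co_ppm=30, equipment_active=False))  # 1.0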

  11. Text Mining for Protein Docking.

    Directory of Open Access Journals (Sweden)

    Varsha D Badal

    2015-12-01

    Full Text Available The rapidly growing amount of publicly available information from biomedical research is readily accessible on the Internet, providing a powerful resource for predictive biomolecular modeling. The accumulated data on experimentally determined structures transformed structure prediction of proteins and protein complexes. Instead of exploring the enormous search space, predictive tools can simply proceed to the solution based on similarity to the existing, previously determined structures. A similar major paradigm shift is emerging due to the rapidly expanding amount of information, other than experimentally determined structures, which still can be used as constraints in biomolecular structure prediction. Automated text mining has been widely used in recreating protein interaction networks, as well as in detecting small ligand binding sites on protein structures. Combining and expanding these two well-developed areas of research, we applied text mining to structural modeling of protein-protein complexes (protein docking). Protein docking can be significantly improved when constraints on the docking mode are available. We developed a procedure that retrieves published abstracts on a specific protein-protein interaction and extracts information relevant to docking. The procedure was assessed on protein complexes from Dockground (http://dockground.compbio.ku.edu). The results show that correct information on binding residues can be extracted for about half of the complexes. The amount of irrelevant information was reduced by conceptual analysis of a subset of the retrieved abstracts, based on the bag-of-words (features) approach. Support Vector Machine models were trained and validated on the subset. The remaining abstracts were filtered by the best-performing models, which decreased the irrelevant information for ~25% of the complexes in the dataset. The extracted constraints were incorporated in the docking protocol and tested on the Dockground unbound
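
    The abstract-filtering step described here (bag-of-words features plus a Support Vector Machine) can be sketched with scikit-learn as below; the training sentences and labels are invented for illustration and do not reproduce the authors' models.

    # Minimal sketch of filtering sentences for docking-relevant constraint
    # information with a bag-of-words representation and a linear SVM.
    from sklearn.feature_extraction.text import TfidfVectorizer
    from sklearn.svm import LinearSVC
    from sklearn.pipeline import make_pipeline

    texts = [
        "Mutation of residue Arg45 abolished binding to the partner protein.",
        "The interface is stabilized by a salt bridge between Lys12 and Asp78.",
        "The protein was expressed in E. coli and purified by affinity chromatography.",
        "Crystals were grown at 18 degrees Celsius in PEG 4000.",
    ]
    labels = [1, 1, 0, 0]   # 1 = contains docking-relevant constraint information

    classifier = make_pipeline(TfidfVectorizer(ngram_range=(1, 2)), LinearSVC())
    classifier.fit(texts, labels)

    query = "Alanine substitution at Glu33 disrupted complex formation."
    print("relevant" if classifier.predict([query])[0] == 1 else "irrelevant")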

  12. Mining anaerobic digester consortia metagenomes for secreted carbohydrate active enzymes

    DEFF Research Database (Denmark)

    Wilkens, Casper; Busk, Peter Kamp; Pilgaard, Bo

    was done with the Peptide Pattern Recognition (PPR) program (Busk and Lange, 2013), which is a novel non-alignment-based approach that can predict the function of, e.g., CAZymes. PPR identifies a set of short conserved sequences, which can be used as a fingerprint when mining genomes for novel enzymes. In both...
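
    A minimal sketch of the fingerprint idea attributed to PPR is shown below: count how many short conserved peptides of a family occur in a candidate sequence. The peptide set and sequence are invented, and this is not the PPR program.

    # Minimal sketch of fingerprint-style classification with short conserved peptides.
    FINGERPRINT = {"GWHQG", "DVVLN", "NEPH", "WGGQ"}   # hypothetical conserved peptides

    def fingerprint_hits(sequence, peptides=FINGERPRINT):
        """Return the conserved peptides found in the sequence."""
        return {p for p in peptides if p in sequence}

    candidate = "MKLAVGWHQGTTSDVVLNAARPLLWGGQKEEL"      # invented protein sequence
    hits = fingerprint_hits(candidate)
    print(f"{len(hits)}/{len(FINGERPRINT)} fingerprint peptides found:", sorted(hits))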

  13. Microcontroller for automation application

    Science.gov (United States)

    Cooper, H. W.

    1975-01-01

    The description of a microcontroller currently being developed for automation application was given. It is basically an 8-bit microcomputer with a 40K byte random access memory/read only memory, and can control a maximum of 12 devices through standard 15-line interface ports.

  14. Automated Composite Column Wrapping

    OpenAIRE

    ECT Team, Purdue

    2007-01-01

    The Automated Composite Column Wrapping is performed by a patented machine known as Robo-Wrapper. Currently there are three versions of the machine available for bridge retrofit work depending on the size of the columns being wrapped. Composite column retrofit jacket systems can be structurally just as effective as conventional steel jacketing in improving the seismic response characteristics of substandard reinforced concrete columns.

  15. Automated Web Applications Testing

    Directory of Open Access Journals (Sweden)

    Alexandru Dan CĂPRIŢĂ

    2009-01-01

    Full Text Available Unit tests are a vital part of several software development practices and processes such as Test-First Programming, Extreme Programming and Test-Driven Development. This article briefly presents software quality and testing concepts as well as an introduction to an automated unit testing framework for PHP web-based applications.

  16. Automated Student Model Improvement

    Science.gov (United States)

    Koedinger, Kenneth R.; McLaughlin, Elizabeth A.; Stamper, John C.

    2012-01-01

    Student modeling plays a critical role in developing and improving instruction and instructional technologies. We present a technique for automated improvement of student models that leverages the DataShop repository, crowd sourcing, and a version of the Learning Factors Analysis algorithm. We demonstrate this method on eleven educational…

  17. Automated Accounting. Instructor Guide.

    Science.gov (United States)

    Moses, Duane R.

    This curriculum guide was developed to assist business instructors using Dac Easy Accounting College Edition Version 2.0 software in their accounting programs. The module consists of four units containing assignment sheets and job sheets designed to enable students to master competencies identified in the area of automated accounting. The first…

  18. ERGONOMICS AND PROCESS AUTOMATION

    OpenAIRE

    Carrión Muñoz, Rolando; Docente de la FII - UNMSM

    2014-01-01

    The article shows the role that ergonomics plays in the automation of processes and its importance for Industrial Engineering.

  19. Mechatronic Design Automation

    DEFF Research Database (Denmark)

    Fan, Zhun

    successfully design analogue filters, vibration absorbers, micro-electro-mechanical systems, and vehicle suspension systems, all in an automatic or semi-automatic way. It also investigates the very important issue of co-designing plant-structures and dynamic controllers in automated design of Mechatronic...

  20. Protokoller til Home Automation

    DEFF Research Database (Denmark)

    Kjær, Kristian Ellebæk

    2008-01-01

    computer that can switch between predefined settings. Sometimes the computer can be controlled remotely over the internet, so that the status of the home can be viewed from a computer or perhaps even from a mobile phone. While the applications mentioned are classic examples of home automation, additional functionality has emerged...

  1. Myths in test automation

    Directory of Open Access Journals (Sweden)

    Jazmine Francis

    2015-01-01

    Full Text Available Myths in the automation of software testing are an issue of discussion that echoes throughout the software validation service industry. Probably the first thought that appears in a knowledgeable reader's mind would be: Why this old topic again? What is new to discuss on the matter? But, for the first time, everyone agrees that automation testing today is not what it used to be ten or fifteen years ago, because it has evolved in scope and magnitude. What began as simple linear scripts for web applications today has a complex architecture and a hybrid framework to facilitate the implementation of testing applications developed with various platforms and technologies. Undoubtedly automation has advanced, but so have the myths associated with it. The change in people's perspective and knowledge of automation has altered the terrain. This article reflects the points of view and experience of the author regarding the transformation of the original myths into new versions, and how they are derived; it also provides his thoughts on the new generation of myths.

  2. Automating Shallow Seismic Imaging

    Energy Technology Data Exchange (ETDEWEB)

    Steeples, Don W.

    2004-12-09

    This seven-year, shallow-seismic reflection research project had the aim of improving geophysical imaging of possible contaminant flow paths. Thousands of chemically contaminated sites exist in the United States, including at least 3,700 at Department of Energy (DOE) facilities. Imaging technologies such as shallow seismic reflection (SSR) and ground-penetrating radar (GPR) sometimes are capable of identifying geologic conditions that might indicate preferential contaminant-flow paths. Historically, SSR has been used very little at depths shallower than 30 m, and even more rarely at depths of 10 m or less. Conversely, GPR is rarely useful at depths greater than 10 m, especially in areas where clay or other electrically conductive materials are present near the surface. Efforts to image the cone of depression around a pumping well using seismic methods were only partially successful (for complete references of all research results, see the full Final Technical Report, DOE/ER/14826-F), but peripheral results included development of SSR methods for depths shallower than one meter, a depth range that had not been achieved before. Imaging at such shallow depths, however, requires geophone intervals of the order of 10 cm or less, which makes such surveys very expensive in terms of human time and effort. We also showed that SSR and GPR could be used in a complementary fashion to image the same volume of earth at very shallow depths. The primary research focus of the second three-year period of funding was to develop and demonstrate an automated method of conducting two-dimensional (2D) shallow-seismic surveys with the goal of saving time, effort, and money. Tests involving the second generation of the hydraulic geophone-planting device dubbed the "Autojuggie" showed that large numbers of geophones can be placed quickly and automatically and can acquire high-quality data, although not under rough topographic conditions. In some easy

  3. Data mining for service

    CERN Document Server

    2014-01-01

    Virtually all nontrivial and modern service related problems and systems involve data volumes and types that clearly fall into what is presently meant as "big data", that is, are huge, heterogeneous, complex, distributed, etc. Data mining is a series of processes which include collecting and accumulating data, modeling phenomena, and discovering new information, and it is one of the most important steps to scientific analysis of the processes of services.  Data mining application in services requires a thorough understanding of the characteristics of each service and knowledge of the compatibility of data mining technology within each particular service, rather than knowledge only in calculation speed and prediction accuracy. Varied examples of services provided in this book will help readers understand the relation between services and data mining technology. This book is intended to stimulate interest among researchers and practitioners in the relation between data mining technology and its application to ...

  4. Mining Deployment Optimization

    Science.gov (United States)

    Čech, Jozef

    2016-09-01

    The deployment problem, researched primarily in the military sector, is emerging in some other industries, mining included. The principal decision is how to deploy some activities in space and time to achieve a desired outcome while complying with certain requirements or limits. Requirements and limits are on the side of constraints, while minimizing costs or maximizing benefits is on the side of objectives. A model with application to the mining of a polymetallic deposit is presented. The main intention of the computer-based tool is to give a mining engineer quick and immediate decision solutions together with possibilities for experimentation. The task is to determine the strategic deployment of mining activities on a deposit, meeting the planned output from the mine while at the same time complying with limited reserves and haulage capacities. Priorities and benefits can be formulated by the planner.
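
    One way to sketch such a deployment decision is as a small linear program: mine enough tonnes to meet planned output without exceeding block reserves or haulage capacity, at minimum cost. All figures below are illustrative and do not come from the paper.

    # Minimal sketch of the deployment decision as a linear program.
    from scipy.optimize import linprog

    cost = [12.0, 9.5, 14.0]            # cost per tonne mined in blocks A, B, C
    reserves = [8000, 5000, 10000]      # tonnes available in each block
    planned_output = 12000              # tonnes required for the period
    haulage_capacity = 15000            # tonnes the haulage fleet can move

    # linprog minimises cost @ x subject to A_ub @ x <= b_ub.
    A_ub = [[-1, -1, -1],               # -(total mined) <= -planned_output
            [ 1,  1,  1]]               # total mined <= haulage capacity
    b_ub = [-planned_output, haulage_capacity]
    bounds = [(0, r) for r in reserves]

    result = linprog(cost, A_ub=A_ub, b_ub=b_ub, bounds=bounds, method="highs")
    print("tonnes per block:", result.x, "total cost:", result.fun)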

  5. Cancer genomics

    DEFF Research Database (Denmark)

    Norrild, Bodil; Guldberg, Per; Ralfkiær, Elisabeth Methner

    2007-01-01

    Almost all cells in the human body contain a complete copy of the genome with an estimated number of 25,000 genes. The sequences of these genes make up about three percent of the genome and comprise the inherited set of genetic information. The genome also contains information that determines whe...

  6. Automating spectral measurements

    Science.gov (United States)

    Goldstein, Fred T.

    2008-09-01

    This paper discusses the architecture of software utilized in spectroscopic measurements. As optical coatings become more sophisticated, there is mounting need to automate data acquisition (DAQ) from spectrophotometers. Such need is exacerbated when 100% inspection is required, ancillary devices are utilized, cost reduction is crucial, or security is vital. While instrument manufacturers normally provide point-and-click DAQ software, an application programming interface (API) may be missing. In such cases automation is impossible or expensive. An API is typically provided in libraries (*.dll, *.ocx) which may be embedded in user-developed applications. Users can thereby implement DAQ automation in several Windows languages. Another possibility, developed by FTG as an alternative to instrument manufacturers' software, is the ActiveX application (*.exe). ActiveX, a component of many Windows applications, provides means for programming and interoperability. This architecture permits a point-and-click program to act as automation client and server. Excel, for example, can control and be controlled by DAQ applications. Most importantly, ActiveX permits ancillary devices such as barcode readers and XY-stages to be easily and economically integrated into scanning procedures. Since an ActiveX application has its own user-interface, it can be independently tested. The ActiveX application then runs (visibly or invisibly) under DAQ software control. Automation capabilities are accessed via a built-in spectro-BASIC language with industry-standard (VBA-compatible) syntax. Supplementing ActiveX, spectro-BASIC also includes auxiliary serial port commands for interfacing programmable logic controllers (PLC). A typical application is automatic filter handling.

  7. Innovative management techniques to deal with mine water issues in the Sydney coal field, Nova Scotia, Canada

    Energy Technology Data Exchange (ETDEWEB)

    Shea, J. [Enterprise Cape Breton Corp., Sydney, NS (Canada)

    2010-07-01

    There are currently 20 mine pools that have flooded to an equilibrium point and are discharging water at the Sydney Coalfield in Nova Scotia (NS). This paper discussed a new mine water management technique that is being used at 3 of these mine pools. An emergency active treatment plant was constructed at one of the mine shafts to prevent uncontrolled discharges. A drilling program was also conducted in the flooded zones of the mine to test the quality of the rising mine water. Pump tests were conducted to allow for the discharge of better quality mine water into a receiving stream without treatment. An automated and remote-controlled pumping system was installed. A passive treatment system consisting of aeration cascades, a 1.2 hectare settling pond and a 1.1 hectare reed bed wetland was constructed. The mine water flow through the pond was designed using a simple piston flow theory that provided a 50 hour retention time for the mine water. Floating pond curtains were also installed. Boreholes were drilled to combine mine waters from other pools into the passive treatment plant. It is expected that mine water issues at the site will be resolved within the next 5 years. 3 refs., 4 figs.
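
    The piston-flow design mentioned above reduces to retention time = pond volume / flow. In the sketch below the 1.2 ha area comes from the abstract, while the depth and flow rate are assumed values chosen only to reproduce the quoted 50 hour retention time.

    # Minimal sketch of the piston-flow sizing check: retention time = volume / flow.
    pond_area_m2 = 12000        # 1.2 ha settling pond (from the abstract)
    average_depth_m = 1.5       # assumed depth
    flow_m3_per_hour = 360      # assumed mine water flow to the passive system

    volume_m3 = pond_area_m2 * average_depth_m
    retention_hours = volume_m3 / flow_m3_per_hour
    print(f"retention time: {retention_hours:.0f} hours")   # -> 50 hours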

  8. Homology and phylogeny and their automated inference

    Science.gov (United States)

    Fuellen, Georg

    2008-06-01

    The analysis of the ever-increasing amount of biological and biomedical data can be pushed forward by comparing the data within and among species. For example, an integrative analysis of data from the genome sequencing projects for various species traces the evolution of the genomes and identifies conserved and innovative parts. Here, I review the foundations and advantages of this “historical” approach and evaluate recent attempts at automating such analyses. Biological data is comparable if a common origin exists (homology), as is the case for members of a gene family originating via duplication of an ancestral gene. If the family has relatives in other species, we can assume that the ancestral gene was present in the ancestral species from which all the other species evolved. In particular, describing the relationships among the duplicated biological sequences found in the various species is often possible by a phylogeny, which is more informative than homology statements. Detecting and elaborating on common origins may answer how certain biological sequences developed, and predict what sequences are in a particular species and what their function is. Such knowledge transfer from sequences in one species to the homologous sequences of the other is based on the principle of ‘my closest relative looks and behaves like I do’, often referred to as ‘guilt by association’. To enable knowledge transfer on a large scale, several automated ‘phylogenomics pipelines’ have been developed in recent years, and seven of these will be described and compared. Overall, the examples in this review demonstrate that homology and phylogeny analyses, done on a large (and automated) scale, can give insights into function in biology and biomedicine.
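
    One widely used building block of such automated pipelines is the reciprocal-best-hit test for orthology. The sketch below applies it to invented hit tables that stand in for parsed BLAST output; it is a generic illustration, not one of the seven pipelines reviewed.

    # Minimal sketch of reciprocal best hits (RBH) between two species as a
    # simple orthology signal for knowledge transfer by 'guilt by association'.
    def best_hits(hits):
        """Map each query to its highest-scoring subject. hits: (query, subject, score)."""
        best = {}
        for query, subject, score in hits:
            if query not in best or score > best[query][1]:
                best[query] = (subject, score)
        return {q: s for q, (s, _) in best.items()}

    def reciprocal_best_hits(a_vs_b, b_vs_a):
        """Return pairs (gene_a, gene_b) that are each other's best hit."""
        ab, ba = best_hits(a_vs_b), best_hits(b_vs_a)
        return [(a, b) for a, b in ab.items() if ba.get(b) == a]

    a_vs_b = [("geneA1", "geneB7", 310.0), ("geneA1", "geneB2", 90.0), ("geneA3", "geneB2", 250.0)]
    b_vs_a = [("geneB7", "geneA1", 305.0), ("geneB2", "geneA9", 120.0)]
    print(reciprocal_best_hits(a_vs_b, b_vs_a))   # -> [('geneA1', 'geneB7')]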

  9. Tellurium Mobility Through Mine Environments

    Science.gov (United States)

    Dorsk, M.

    2015-12-01

    Tellurium is a rare metalloid that has received minimal research regarding environmental mobility. Observations of tellurium mobility are mainly based on observations of related metalloids such as selenium and beryllium, yet little research has been done on the specific behavior of tellurium. This laboratory work established the environmental controls that influence tellurium mobility and chemical speciation in aqueous-driven systems. Theoretical simulations show possible mobility of Te as Te(OH)3[+] under highly oxidizing and acidic conditions. Movement as TeO3[2-] under more basic conditions may also be possible at elevated Eh. Mobility in reducing environments is theoretically less likely. For a practical approach to investigating mobility conditions for Te, a site with known tellurium content was chosen in Colorado. Composite samples were selected from the top, center, and bottom of a tailings pile for elution experiments. These samples were disintegrated using a rock crusher and pulverized with an automated mortar and pestle. The material was then classified to 70 microns. A 10 g sample split was digested in concentrated HNO3 and HF and analyzed by atomic absorption spectroscopy to determine initial Te concentrations. Additional 10 g splits from each location were subjected to elution in 100 mL of each of the following solutions: nitric acid to a pH of 1.0, sulfuric acid to a pH of 2.0, sodium hydroxide to a pH of 12, ammonium hydroxide to a pH of 10, a pine needle/soil tea from material within the vicinity of the collection site to a pH of 3.5, and lastly distilled water, with a pH of 7, to serve as a control. Sulfuric acid was purposefully chosen to simulate acid mine drainage from the decomposition of pyrite within the mine tailings. Sample subsets were also inundated with 10 mL of a 3% hydrogen peroxide solution to induce oxidizing conditions. All collected eluates were then analyzed by atomic absorption spectroscopy (AAS) to measure tellurium concentrations in

  10. Radioecological challenges for mining

    Energy Technology Data Exchange (ETDEWEB)

    Vesterbacka, P.; Ikaeheimonen, T.K.; Solatie, D. [Radiation and Nuclear Safety Authority (Finland)

    2014-07-01

    In Finland, mining became popular in the mid-1990s when amendments to the mining law made mining activities easier for foreign companies. The price of minerals also rose, and mining in Finland became economically profitable. The expanding mining industry brought new challenges for radiation safety, since radioactive substances occur in nearly all minerals. In Finnish soil and bedrock the average crustal abundances of uranium and thorium are 2.8 ppm and 10 ppm, respectively. It cannot be predicted beforehand how radionuclides will behave in mining processes, which is why they need to be taken into account in mining activities. The Radiation and Nuclear Safety Authority (STUK) has issued a national guide, ST 12.1, based on the Finnish Radiation Act. The guide sets the limits for radiation doses to the public, including those from mining activities. In general, no measures to limit radiation exposure are needed if the dose from an operation liable to cause exposure to natural radiation is no greater than 0.1 mSv per year above the natural background radiation dose. If the exposure of the public may be higher than 0.1 mSv per year, the responsible party must provide STUK with a plan describing the measures by which the radiation exposure is to be kept as low as is reasonably achievable. In that case the responsible mining company has to carry out a radiological baseline study. The baseline study must focus on the environment that the mining activities may impact. The study describes the occurrence of natural radioactivity in the environment before any mining activities are started. The baseline study usually lasts two to three years under natural circumstances. Based on the baseline study measurements, detailed information on the existing levels of radioactivity in the environment can be obtained. Once the mining activities begin, it is important that limits are set for wastewater discharges to the environment and environmental surveillance in the vicinity of

  11. An ISU study of asteroid mining

    Science.gov (United States)

    Burke, J. D.

    1991-01-01

    During the 1990 summer session of the International Space University, 59 graduate students from 16 countries carried out a design project on using the resources of near-earth asteroids. The results of the project, whose full report is now available from ISU, are summarized. The student team included people in these fields: architecture, business and management, engineering, life sciences, physical sciences, policy and law, resources and manufacturing, and satellite applications. They designed a project for transporting equipment and personnel to a near-earth asteroid, setting up a mining base there, and hauling products back for use in cislunar space. In addition, they outlined the needed precursor steps, beginning with expansion of present ground-based programs for finding and characterizing near-earth asteroids and continuing with automated flight missions to candidate bodies. (To limit the summer project's scope the actual design of these flight-mission precursors was excluded.) The main conclusions were that asteroid mining may provide an important complement to the future use of lunar resources, with the potential to provide large amounts of water and carbonaceous materials for use off earth. However, the recovery of such materials from presently known asteroids did not show an economic gain under the study assumptions; therefore, asteroid mining cannot yet be considered a prospective business.

  12. Spatiotemporal Data Mining: A Computational Perspective

    Directory of Open Access Journals (Sweden)

    Shashi Shekhar

    2015-10-01

    Full Text Available Explosive growth in geospatial and temporal data as well as the emergence of new technologies emphasize the need for automated discovery of spatiotemporal knowledge. Spatiotemporal data mining studies the process of discovering interesting and previously unknown, but potentially useful, patterns from large spatiotemporal databases. It has broad application domains including ecology and environmental management, public safety, transportation, earth science, epidemiology, and climatology. The complexity of spatiotemporal data and their intrinsic relationships limits the usefulness of conventional data science techniques for extracting spatiotemporal patterns. In this survey, we review recent computational techniques and tools in spatiotemporal data mining, focusing on several major pattern families: spatiotemporal outliers, spatiotemporal coupling and tele-coupling, spatiotemporal prediction, spatiotemporal partitioning and summarization, spatiotemporal hotspots, and change detection. Compared with other surveys in the literature, this paper emphasizes the statistical foundations of spatiotemporal data mining and provides comprehensive coverage of computational approaches for various pattern families. We also list popular software tools for spatiotemporal data analysis. The survey concludes with a look at future research needs.

  13. Preprocessing Techniques for Image Mining on Biopsy Images

    Directory of Open Access Journals (Sweden)

    Ms. Nikita Ramrakhiani

    2015-08-01

    Full Text Available Biomedical imaging has been undergoing rapid technological advancements over the last several decades and has seen the development of many new applications. A single image can give all the details about an organ, from the cellular level to the whole-organ level. Biomedical imaging is becoming increasingly important as an approach to synthesize, extract, and translate useful information from the large multidimensional databases accumulated in research frontiers such as functional genomics, proteomics, and functional imaging. Image mining can be used to support this approach: it bridges the gap by extracting and translating semantically meaningful information from biomedical images and applying it to test for and detect anomalies in the target organ. The essential components of image mining are identifying similar objects in different images and finding correlations between them. Integration of image mining and the biomedical field can result in many real-world applications.

  14. The Challenge of Wireless Connectivity to Support Intelligent Mines

    DEFF Research Database (Denmark)

    Barbosa, Viviane S. B.; Garcia, Luis G. U.; Portela Lopes de Almeida, Erika;

    2016-01-01

    ... for unmanned mine operations. Although voice and narrowband data radios have been used for years to support several types of mining activities, such as fleet management (dispatch) and telemetry, the use of automated equipment introduces a new set of connectivity requirements and poses a set of challenges in terms of network planning, management and optimization. For example, the data rates required to support unmanned equipment, e.g. a teleoperated bulldozer, shift from a few kilobits/second to megabits/second due to live video feeds. This traffic volume is well beyond the capabilities of Professional Mobile Radio narrowband systems and mandates the deployment of broadband systems. Furthermore, the (data) traffic requirements of a mine also vary in time as the fleet expands. Additionally, wireless networks are planned according to the characteristics of the scenario in which they will be deployed...

  15. Automated Wildfire Detection Through Artificial Neural Networks

    Science.gov (United States)

    Miller, Jerry; Borne, Kirk; Thomas, Brian; Huang, Zhenping; Chi, Yuechen

    2005-01-01

    We have tested and deployed Artificial Neural Network (ANN) data mining techniques to analyze remotely sensed multi-channel imaging data from MODIS, GOES, and AVHRR. The goal is to train the ANN to learn the signatures of wildfires in remotely sensed data in order to automate the detection process. We train the ANN using the set of human-detected wildfires in the U.S., which are provided by the Hazard Mapping System (HMS) wildfire detection group at NOAA/NESDIS. The ANN is trained to mimic the behavior of fire detection algorithms and the subjective decision-making by NOAA HMS Fire Analysts. We use a local extremum search in order to isolate fire pixels, and then we extract a 7x7 pixel array around that location in 3 spectral channels. The corresponding 147 pixel values are used to populate a 147-dimensional input vector that is fed into the ANN. The ANN accuracy is tested and overfitting is avoided by using a subset of the training data that is set aside as a test data set. We have achieved an automated fire detection accuracy of 80-92%, depending on a variety of ANN parameters and for different instrument channels among the 3 satellites. We believe that this system can be deployed worldwide or for any region to detect wildfires automatically in satellite imagery of those regions. These detections can ultimately be used to provide thermal inputs to climate models.
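    The feature construction described above (a 7x7 window in 3 channels flattened into a 147-dimensional vector) can be sketched as follows. The sketch trains a small scikit-learn network on synthetic data purely to show the data flow; it does not reproduce the ANN architecture, channels, or labels used in the study.

```python
# Sketch of building 147-dimensional window features and training a small
# neural network on synthetic data. Labels and imagery here are random stand-ins.
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.neural_network import MLPClassifier

rng = np.random.default_rng(0)


def extract_window(image, row, col, size=7):
    """image: (channels, H, W) array; return the flattened size x size window."""
    half = size // 2
    window = image[:, row - half:row + half + 1, col - half:col + half + 1]
    return window.reshape(-1)  # 3 channels * 7 * 7 = 147 values


# Synthetic 3-channel scene and synthetic candidate-pixel locations.
image = rng.normal(size=(3, 200, 200))
rows = rng.integers(3, 197, size=400)
cols = rng.integers(3, 197, size=400)
X = np.stack([extract_window(image, r, c) for r, c in zip(rows, cols)])
y = rng.integers(0, 2, size=400)  # placeholder labels; real labels come from HMS analysts

X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.25, random_state=0)
clf = MLPClassifier(hidden_layer_sizes=(32,), max_iter=300, random_state=0)
clf.fit(X_train, y_train)
print(f"Held-out accuracy on synthetic data: {clf.score(X_test, y_test):.2f}")
```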

  16. Implementation of Paste Backfill Mining Technology in Chinese Coal Mines

    OpenAIRE

    Qingliang Chang; Jianhang Chen; Huaqiang Zhou; Jianbiao Bai

    2014-01-01

    Implementation of clean mining technology at coal mines is crucial to protect the environment and maintain balance among energy resources, consumption, and ecology. After reviewing present coal clean mining technology, we introduce the technology principles and technological process of paste backfill mining in coal mines and discuss the components and features of backfill materials, the constitution of the backfill system, and the backfill process. Specific implementation of this technology a...

  17. A jewel in the desert: BHP Billiton's San Juan underground mine

    Energy Technology Data Exchange (ETDEWEB)

    Buchsbaum, L.

    2007-12-15

    The Navajo Nation is America's largest Native American tribe by population and acreage, and is blessed with large tracts of good coal deposits. BHP Billiton's New Mexico Coal Co. is the largest operation in the Navajo regeneration area. The holdings comprise the San Juan underground mine, the La Plata surface mine, now in reclamation, and the expanding Navajo surface mine. The article recounts the recent history of the mines. It stresses the emphasis on sensitivity to and helping to sustain tribal culture, and also on safety. San Juan's longwall system is unique in the nation, having started up as an automated system from the outset. Problems caused by hydrogen sulfide are being tackled. San Juan has a bleederless ventilation system to minimise the risk of spontaneous combustion of methane, and the atmospheric conditions in the mine are heavily monitored, especially within the gob areas. 3 photos.

  18. ATLAS Distributed Computing Automation

    CERN Document Server

    Schovancova, J; The ATLAS collaboration; Borrego, C; Campana, S; Di Girolamo, A; Elmsheuser, J; Hejbal, J; Kouba, T; Legger, F; Magradze, E; Medrano Llamas, R; Negri, G; Rinaldi, L; Sciacca, G; Serfon, C; Van Der Ster, D C

    2012-01-01

    The ATLAS Experiment benefits from computing resources distributed worldwide at more than 100 WLCG sites. The ATLAS Grid sites provide over 100k CPU job slots and over 100 PB of storage space on disk or tape. Monitoring the status of such a complex infrastructure is essential. The ATLAS Grid infrastructure is monitored 24/7 by two teams of shifters distributed world-wide, by the ATLAS Distributed Computing experts, and by site administrators. In this paper we summarize automation efforts performed within the ATLAS Distributed Computing team in order to reduce manpower costs and improve the reliability of the system. Different aspects of the automation process are described: from the ATLAS Grid site topology provided by the ATLAS Grid Information System, via automatic site testing by HammerCloud, to automatic exclusion from production or analysis activities.

  19. Rapid automated nuclear chemistry

    Energy Technology Data Exchange (ETDEWEB)

    Meyer, R.A.

    1979-05-31

    Rapid Automated Nuclear Chemistry (RANC) can be thought of as the Z-separation of Neutron-rich Isotopes by Automated Methods. The range of RANC studies of fission and its products is large. In a sense, the studies can be categorized into various energy ranges from the highest where the fission process and particle emission are considered, to low energies where nuclear dynamics are being explored. This paper presents a table which gives examples of current research using RANC on fission and fission products. The remainder of this text is divided into three parts. The first contains a discussion of the chemical methods available for the fission product elements, the second describes the major techniques, and in the last section, examples of recent results are discussed as illustrations of the use of RANC.

  20. Automatic annotation of organellar genomes with DOGMA

    Energy Technology Data Exchange (ETDEWEB)

    Wyman, Stacia; Jansen, Robert K.; Boore, Jeffrey L.

    2004-06-01

    Dual Organellar GenoMe Annotator (DOGMA) automates the annotation of extra-nuclear organellar (chloroplast and animal mitochondrial) genomes. It is a web-based package that uses comparative BLAST searches to identify and annotate genes in a genome. DOGMA presents a list of putative genes to the user in a graphical format for viewing and editing. Annotations are stored on our password-protected server. Complete annotations can be extracted for direct submission to GenBank. Furthermore, intergenic regions of specified length can be extracted, as well as the nucleotide sequences and amino acid sequences of the genes.
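    One of the export features mentioned above, extraction of intergenic regions of at least a specified length, can be sketched from a list of gene coordinates. The coordinates and sequence below are placeholders; DOGMA's own implementation is not reproduced here.

```python
# Minimal sketch: extract intergenic regions of at least `min_len` bases from a
# genome sequence given sorted gene coordinates (0-based, half-open).
def intergenic_regions(genome_seq, genes, min_len=100):
    """genes: list of (start, end) coordinates; returns (start, end, sequence) tuples."""
    regions = []
    prev_end = 0
    for start, end in sorted(genes):
        if start - prev_end >= min_len:
            regions.append((prev_end, start, genome_seq[prev_end:start]))
        prev_end = max(prev_end, end)
    # Trailing region after the last gene.
    if len(genome_seq) - prev_end >= min_len:
        regions.append((prev_end, len(genome_seq), genome_seq[prev_end:]))
    return regions


# Placeholder genome and gene coordinates purely for demonstration.
genome = "A" * 1000
genes = [(50, 200), (450, 600)]
for start, end, seq in intergenic_regions(genome, genes, min_len=150):
    print(f"intergenic {start}-{end} ({len(seq)} bp)")
```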

  1. The Automated Medical Office

    OpenAIRE

    1990-01-01

    With shock and surprise many physicians learned in the 1980s that they must change the way they do business. Competition for patients, increasing government regulation, and the rapidly escalating risk of litigation forces physicians to seek modern remedies in office management. The author describes a medical clinic that strives to be paperless using electronic innovation to solve the problems of medical practice management. A computer software program to automate information management in a c...

  2. Automation of printing machine

    OpenAIRE

    Sušil, David

    2016-01-01

    This bachelor thesis is focused on the automation of a printing machine and a comparison of two types of printing machines. The first chapter deals with the history of printing, typesetting, printing techniques and various kinds of bookbinding. The second chapter describes the difference between sheet-fed printing machines and offset printing machines, the difference between two representatives of rotary machines, the technological process of the products on these machines, the description of the mac...

  3. Automated Cooperative Trajectories

    Science.gov (United States)

    Hanson, Curt; Pahle, Joseph; Brown, Nelson

    2015-01-01

    This presentation is an overview of the Automated Cooperative Trajectories project. An introduction to the phenomena of wake vortices is given, along with a summary of past research into the possibility of extracting energy from the wake by flying close parallel trajectories. Challenges and barriers to adoption of civilian automatic wake surfing technology are identified. A hardware-in-the-loop simulation is described that will support future research. Finally, a roadmap for future research and technology transition is proposed.

  4. Automation in biological crystallization.

    Science.gov (United States)

    Stewart, Patrick Shaw; Mueller-Dieckmann, Jochen

    2014-06-01

    Crystallization remains the bottleneck in the crystallographic process leading from a gene to a three-dimensional model of the encoded protein or RNA. Automation of the individual steps of a crystallization experiment, from the preparation of crystallization cocktails for initial or optimization screens to the imaging of the experiments, has been the response to address this issue. Today, large high-throughput crystallization facilities, many of them open to the general user community, are capable of setting up thousands of crystallization trials per day. It is thus possible to test multiple constructs of each target for their ability to form crystals on a production-line basis. This has improved success rates and made crystallization much more convenient. High-throughput crystallization, however, cannot relieve users of the task of producing samples of high quality. Moreover, the time gained from eliminating manual preparations must now be invested in the careful evaluation of the increased number of experiments. The latter requires a sophisticated data and laboratory information-management system. A review of the current state of automation at the individual steps of crystallization with specific attention to the automation of optimization is given.

  5. Automation in biological crystallization

    Science.gov (United States)

    Shaw Stewart, Patrick; Mueller-Dieckmann, Jochen

    2014-01-01

    Crystallization remains the bottleneck in the crystallographic process leading from a gene to a three-dimensional model of the encoded protein or RNA. Automation of the individual steps of a crystallization experiment, from the preparation of crystallization cocktails for initial or optimization screens to the imaging of the experiments, has been the response to address this issue. Today, large high-throughput crystallization facilities, many of them open to the general user community, are capable of setting up thousands of crystallization trials per day. It is thus possible to test multiple constructs of each target for their ability to form crystals on a production-line basis. This has improved success rates and made crystallization much more convenient. High-throughput crystallization, however, cannot relieve users of the task of producing samples of high quality. Moreover, the time gained from eliminating manual preparations must now be invested in the careful evaluation of the increased number of experiments. The latter requires a sophisticated data and laboratory information-management system. A review of the current state of automation at the individual steps of crystallization with specific attention to the automation of optimization is given. PMID:24915074

  6. MTGD: The Medicago truncatula genome database.

    Science.gov (United States)

    Krishnakumar, Vivek; Kim, Maria; Rosen, Benjamin D; Karamycheva, Svetlana; Bidwell, Shelby L; Tang, Haibao; Town, Christopher D

    2015-01-01

    Medicago truncatula, a close relative of alfalfa (Medicago sativa), is a model legume used for studying symbiotic nitrogen fixation, mycorrhizal interactions and legume genomics. J. Craig Venter Institute (JCVI; formerly TIGR) has been involved in M. truncatula genome sequencing and annotation since 2002 and has maintained a web-based resource providing data to the community for this entire period. The website (http://www.MedicagoGenome.org) has seen major updates in the past year, where it currently hosts the latest version of the genome (Mt4.0), associated data and legacy project information, presented to users via a rich set of open-source tools. A JBrowse-based genome browser interface exposes tracks for visualization. Mutant gene symbols originally assembled and curated by the Frugoli lab are now hosted at JCVI and tie into our community annotation interface, Medicago EuCAP (to be integrated soon with our implementation of WebApollo). Literature pertinent to M. truncatula is indexed and made searchable via the Textpresso search engine. The site also implements MedicMine, an instance of InterMine that offers interconnectivity with other plant 'mines' such as ThaleMine and PhytoMine, and other model organism databases (MODs). In addition to these new features, we continue to provide keyword- and locus identifier-based searches served via a Chado-backed Tripal Instance, a BLAST search interface and bulk downloads of data sets from the iPlant Data Store (iDS). Finally, we maintain an E-mail helpdesk, facilitated by a JIRA issue tracking system, where we receive and respond to questions about the website and requests for specific data sets from the community.

  7. Automated expert modeling for automated student evaluation.

    Energy Technology Data Exchange (ETDEWEB)

    Abbott, Robert G.

    2006-01-01

    The 8th International Conference on Intelligent Tutoring Systems provides a leading international forum for the dissemination of original results in the design, implementation, and evaluation of intelligent tutoring systems and related areas. The conference draws researchers from a broad spectrum of disciplines ranging from artificial intelligence and cognitive science to pedagogy and educational psychology. The conference explores intelligent tutoring systems' increasing real-world impact on an increasingly global scale. Improved authoring tools and learning-object standards enable fielding systems and curricula in real-world settings on an unprecedented scale. Researchers deploy ITSs in ever larger studies and increasingly use data from real students, tasks, and settings to guide new research. With high volumes of student interaction data, data mining, and machine learning, tutoring systems can learn from experience and improve their teaching performance. The increasing number of realistic evaluation studies also broadens researchers' knowledge about the educational contexts for which ITSs are best suited. At the same time, researchers explore how to expand and improve ITS/student communications, for example, how to achieve more flexible and responsive discourse with students, help students integrate Web resources into learning, use mobile technologies and games to enhance student motivation and learning, and address multicultural perspectives.

  8. Genotype-Specific Genomic Markers Associated with Primary Hepatomas, Based on Complete Genomic Sequencing of Hepatitis B Virus▿

    OpenAIRE

    Sung, Joseph J. Y.; Tsui, Stephen K. W.; Tse, Chi-Hang; Ng, Eddie Y. T.; Leung, Kwong-Sak; Lee, Kin-Hong; Mok, Tony S. K.; Bartholomeusz, Angeline; Au, Thomas C. C.; Tsoi, Kelvin K. F.; Locarnini, Stephen; Chan, Henry L. Y.

    2008-01-01

    We aimed to identify genomic markers in hepatitis B virus (HBV) that are associated with hepatocellular carcinoma (HCC) development by comparing the complete genomic sequences of HBVs among patients with HCC and those without. One hundred patients with HBV-related HCC and 100 age-matched HBV-infected non-HCC patients (controls) were studied. HBV DNA from serum was directly sequenced to study the whole viral genome. Data mining and rule learning were employed to develop diagnostic algorithms. ...

  9. Information and diagnostic tools of objective control as means to improve performance of mining machines

    Science.gov (United States)

    Zvonarev, I. E.; Shishlyannikov, D. I.

    2017-02-01

    The paper justifies the relevance of developing and implementing automated onboard systems for operation data and maintenance recording in heading-and-winning machines. The advantages and disadvantages of existing automated onboard systems for operation data and maintenance recording in heading-and-winning machines for potassium mines are analyzed. The basic technical requirements for the design, operating algorithms and functions of recording systems of mining machines for potassium mines are formulated. A method of controlling operating parameters is presented, and the concept of the onboard automated recording system for the Ural heading-and-winning machine is outlined. The results of experimental studies of variations in the loading of the Ural-20R miner's operating member drives, using the VATUR portable measuring complex, are given. It is shown that existing means of objective control of the operating parameters of the Ural-20R heading-and-winning machine do not ensure its optimal operation. The authors present a technique for analyzing the data provided by parameter recorders that allows the efficiency of mechanical complexes to be increased by determining numerical values characterizing the technical and technological level of potassium ore production organization. Efficiency assessment criteria for the engineering and maintenance departments of mining enterprises are advanced. A technology for continuous automated monitoring of a potassium mine's outburst hazard is described.

  10. Contaminant analysis automation demonstration proposal

    Energy Technology Data Exchange (ETDEWEB)

    Dodson, M.G.; Schur, A.; Heubach, J.G.

    1993-10-01

    The nation-wide and global need for environmental restoration and waste remediation (ER&WR) presents significant challenges to the analytical chemistry laboratory. The expansion of ER&WR programs forces an increase in the volume of samples processed and the demand for analysis data. To handle this expanding volume, productivity must be increased. However, the need for significantly increased productivity runs up against a contaminant analysis process that is costly in time, labor, equipment, and safety protection. Laboratory automation offers a cost-effective approach to meeting current and future contaminant analytical laboratory needs. The proposed demonstration will present a proof-of-concept automated laboratory conducting varied sample preparations. This automated process also highlights a graphical user interface that provides supervisory control and monitoring of the automated process. The demonstration provides affirming answers to the following questions about laboratory automation: Can preparation of contaminants be successfully automated? Can a full-scale working proof-of-concept automated laboratory be developed that is capable of preparing contaminant and hazardous chemical samples? Can the automated processes be seamlessly integrated and controlled? Can the automated laboratory be customized through readily convertible design? And can automated sample preparation concepts be extended to the other phases of the sample analysis process? To fully reap the benefits of automation, four human factors areas should be studied and the outputs used to increase the efficiency of laboratory automation. These areas include: (1) laboratory configuration, (2) procedures, (3) receptacles and fixtures, and (4) the human-computer interface for the fully automated system and complex laboratory information management systems.

  11. VRLane: a desktop virtual safety management program for underground coal mine

    Science.gov (United States)

    Li, Mei; Chen, Jingzhu; Xiong, Wei; Zhang, Pengpeng; Wu, Daozheng

    2008-10-01

    VR technologies, which generate immersive, interactive, three-dimensional (3D) environments, are seldom applied to coal mine safety management. In this paper, a new method that combines VR technologies with an underground mine safety management system was explored, and a desktop virtual safety management program for underground coal mines, called VRLane, was developed. The paper mainly concerns current research advances in VR, the system design, key techniques, and system application. Two important techniques are introduced in the paper. Firstly, an algorithm was designed and implemented with which the 3D laneway models and equipment models can be built automatically on the basis of the latest 2D mine drawings, whereas common VR programs establish the 3D environment using 3DS Max or other 3D modeling software packages, with which laneway models are built manually and laboriously. Secondly, VRLane realized system integration with underground industrial automation. VRLane not only describes a realistic 3D laneway environment, but also describes the status of coal mining, with functions for displaying the running states and related parameters of equipment, pre-alarming abnormal mining events, and animating mine cars, mine workers, or longwall shearers. The system, which is cheap, dynamic, and easy to maintain, provides a useful tool for safety production management in coal mines.

  12. Ensemble Data Mining Methods

    Data.gov (United States)

    National Aeronautics and Space Administration — Ensemble Data Mining Methods, also known as Committee Methods or Model Combiners, are machine learning methods that leverage the power of multiple models to achieve...

  13. Data mining in agriculture

    CERN Document Server

    Mucherino, Antonio; Pardalos, Panos M

    2009-01-01

    Data Mining in Agriculture represents a comprehensive effort to provide graduate students and researchers with an analytical text on data mining techniques applied to agriculture and environmental related fields. This book presents both theoretical and practical insights with a focus on presenting the context of each data mining technique rather intuitively with ample concrete examples represented graphically and with algorithms written in MATLAB®. Examples and exercises with solutions are provided at the end of each chapter to facilitate the comprehension of the material. For each data mining technique described in the book variants and improvements of the basic algorithm are also given. Also by P.J. Papajorgji and P.M. Pardalos: Advances in Modeling Agricultural Systems, 'Springer Optimization and its Applications' vol. 25, ©2009.

  14. Acid mine drainage

    Science.gov (United States)

    Bigham, Jerry M.; Cravotta, Charles A.

    2016-01-01

    Acid mine drainage (AMD) consists of metal-laden solutions produced by the oxidative dissolution of iron sulfide minerals exposed to air, moisture, and acidophilic microbes during the mining of coal and metal deposits. The pH of AMD is usually in the range of 2–6, but mine-impacted waters at circumneutral pH (5–8) are also common. Mine drainage usually contains elevated concentrations of sulfate, iron, aluminum, and other potentially toxic metals leached from rock that hydrolyze and coprecipitate to form rust-colored encrustations or sediments. When AMD is discharged into surface waters or groundwaters, degradation of water quality, injury to aquatic life, and corrosion or encrustation of engineered structures can occur for substantial distances. Prevention and remediation strategies should consider the biogeochemical complexity of the system, the longevity of AMD pollution, the predictive power of geochemical modeling, and the full range of available field technologies for problem mitigation.

  15. International mining forum 2004, new technologies in underground mining, safety in mines proceedings

    Energy Technology Data Exchange (ETDEWEB)

    Jerzy Kicki; Eugeniusz Sobczyk (eds.)

    2004-01-15

    The book comprises technical papers that were presented at the International Mining Forum 2004. This event aims to bring together scientists and engineers in mining, rock mechanics, and computer engineering, with a view to explore and discuss international developments in the field. Topics discussed in this book are: trends in the mining industry; new solutions and tendencies in underground mines; rock engineering problems in underground mines; utilization and exploitation of methane; prevention measures for the control of rock bursts in Polish mines; and current problems in Ukrainian coal mines.

  16. Applied data mining

    CERN Document Server

    Xu, Guandong

    2013-01-01

    Data mining has witnessed substantial advances in recent decades. New research questions and practical challenges have arisen from emerging areas and applications within the various fields closely related to human daily life, e.g. social media and social networking. This book aims to bridge the gap between traditional data mining and the latest advances in newly emerging information services. It explores the extension of well-studied algorithms and approaches into these new research arenas.

  17. Web Mining: An Overview

    Directory of Open Access Journals (Sweden)

    P. V. G. S. Mudiraj, B. Jabber, K. David Raju

    2011-12-01

    Full Text Available Web usage mining is a main research area in Web mining focused on learning about Web users and their interactions with Web sites. The motive of mining is to find users' access models automatically and quickly from the vast Web log data, such as frequent access paths, frequent access page groups and user clustering. Through web usage mining, the server log, registration information and other relative information left by users provide a foundation for decision making in organizations. This article provides a survey and analysis of current Web usage mining systems and technologies. There are generally three tasks in Web usage mining: preprocessing, pattern analysis and knowledge discovery. Preprocessing cleans the server log file by removing entries such as errors or failures and repeated requests for the same URL from the same host. The main task of pattern analysis is to filter uninteresting information and to visualize and interpret the interesting patterns for users. The statistics collected from the log file can help to discover knowledge. This knowledge can be used to make decisions, for example classifying users and web pages as excellent, medium, or weak based on the hit counts of the web pages in the web site. The design of the website is then restructured based on user behavior or hit counts, which provides quick responses to web users, saves memory space on servers, and thus reduces HTTP requests and bandwidth utilization. This paper addresses challenges in the three phases of Web usage mining along with Web structure mining. This paper also discusses an application of WUM, an online recommender system that dynamically generates links to pages that have not yet been visited by a user and might be of potential interest. Differently from the recommender systems proposed so far, ONLINE MINER does not make use of any off-line component and is able to manage Web sites made up of dynamically generated pages.
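    The preprocessing and hit-count classification steps described above can be sketched as follows; the log fields, thresholds, and labels are illustrative assumptions rather than the authors' actual rules.

```python
# Sketch of log preprocessing (drop failed requests and immediate repeats from
# the same host) followed by a simple hit-count ranking of pages.
from collections import Counter


def preprocess(entries):
    """entries: list of dicts with 'host', 'url', 'status'."""
    cleaned, last_seen = [], {}
    for e in entries:
        if e["status"] >= 400:                    # remove error/failure entries
            continue
        if last_seen.get(e["host"]) == e["url"]:  # repeated request from the same host
            continue
        last_seen[e["host"]] = e["url"]
        cleaned.append(e)
    return cleaned


def page_ranking(entries):
    hits = Counter(e["url"] for e in entries)
    # Three-way split of pages by hit count; the thresholds are arbitrary examples.
    return {url: ("excellent" if n >= 100 else "medium" if n >= 10 else "weak")
            for url, n in hits.items()}


log = [{"host": "10.0.0.1", "url": "/index", "status": 200},
       {"host": "10.0.0.1", "url": "/index", "status": 200},
       {"host": "10.0.0.2", "url": "/missing", "status": 404}]
print(page_ranking(preprocess(log)))
```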

  18. Asteroid Mining and Prospecting

    OpenAIRE

    Esty, Thomas

    2013-01-01

    There has been a recent increase in interest in the idea of mining asteroids, as seen from the founding of multiple companies who seek to make this science fiction idea science fact. We analyzed a number of prior papers on asteroids to make an estimate as to whether mining asteroids is within the realm of possibility. Existing information on asteroid number, composition, and orbit from past research was synthesized with a new analysis using binomial statistics of the number of probes that wou...

  19. MINING INDUSTRY IN CROATIA

    Directory of Open Access Journals (Sweden)

    Slavko Vujec

    1996-12-01

    Full Text Available The trends of the world and European mining industry are presented in a short introductory review. The mining industry is very important to the economy of Croatia, because it covers most of the needed petroleum and natural gas, all construction raw materials, and industrial non-metallic raw minerals. A detailed quantitative presentation of mineral raw material production is compared with the pre-war situation. The value of annual production is given for each raw mineral (the paper is published in Croatian).

  20. Investigation and characterization of mining subsidence in Kaiyang Phosphorus Mine

    Institute of Scientific and Technical Information of China (English)

    DENG Jian; BIAN Li

    2007-01-01

    In Kaiyang Phosphorus Mine, serious environmental and safety problems have been caused by large-scale mining activities over the past 40 years. These problems include mining subsidence, low recovery ratio, too much dead ore left in pillars, and pollution from phosphorus gypsum. Mining subsidence falls into four categories: curved ground and mesa, ground cracks and collapse holes, spalling and eboulement, and slope slide and creep. Measures to treat the mining subsidence were put forward: finding out and managing abandoned stopes, optimizing the mining method (cut-and-fill mining), selecting proper backfilling materials (phosphogypsum mixtures), avoiding disorderly mining operations, and treating highway slopes. These investigations and engineering treatment methods are believed to contribute to the safe extraction of ore and sustainable development in Kaiyang Phosphorus Mine.

  1. Automated Eukaryotic Gene Structure Annotation Using EVidenceModeler and the Program to Assemble Spliced Alignments

    Energy Technology Data Exchange (ETDEWEB)

    Haas, B J; Salzberg, S L; Zhu, W; Pertea, M; Allen, J E; Orvis, J; White, O; Buell, C R; Wortman, J R

    2007-12-10

    EVidenceModeler (EVM) is presented as an automated eukaryotic gene structure annotation tool that reports eukaryotic gene structures as a weighted consensus of all available evidence. EVM, when combined with the Program to Assemble Spliced Alignments (PASA), yields a comprehensive, configurable annotation system that predicts protein-coding genes and alternatively spliced isoforms. Our experiments on both rice and human genome sequences demonstrate that EVM produces automated gene structure annotation approaching the quality of manual curation.
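    The weighted-consensus idea can be sketched by scoring each candidate gene structure by the summed weights of the evidence tracks that support its exons. The weights, tracks, and scoring below are invented for illustration and do not reproduce the actual EVM algorithm.

```python
# Toy weighted-consensus scoring: each evidence track has a weight, and a
# candidate gene structure is scored by how many of its exons each track supports.
EVIDENCE_WEIGHTS = {"ab_initio": 1.0, "protein_alignment": 5.0, "transcript_alignment": 10.0}


def consensus_score(candidate_exons, evidence):
    """candidate_exons: set of (start, end); evidence: {track: set of supported exons}."""
    score = 0.0
    for track, exons in evidence.items():
        supported = len(candidate_exons & exons)
        score += EVIDENCE_WEIGHTS.get(track, 0.0) * supported
    return score


candidate = {(100, 250), (400, 520)}
evidence = {"ab_initio": {(100, 250)},
            "transcript_alignment": {(100, 250), (400, 520)}}
print(f"Consensus score: {consensus_score(candidate, evidence):.1f}")
```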

  2. Introduction of fibre-optic technology in an opencast lignite mine; Einfuehrung von LWL-Technik in einem Braunkohlentagebau

    Energy Technology Data Exchange (ETDEWEB)

    Paus, K.H. [RWE Power AG, Ressort Braunkohlenbergbau/Veredlung, Technikzentrum Tagebaue/HW, Elektrotechnik und technische Vergabe PDV, Kommunikationsanlagen (PBZ-KK), Frechen (Germany); Hehlert, H.A. [RWE Power AG, Ressort Braunkohlenbergbau/Veredlung, Technikzentrum Tagebaue/HW, Elektrotechnik und technische Vergabe Foerderanlagen (PBZ-KP), Frechen (Germany); Andres, M. [RWE Power AG, Ressort Braunkohlenbergbau/Veredlung, Tagebau Hambach, Infrastruktur Prozessdatenverarbeitung (PBH-IP), Niederzier (Germany)

    2006-06-15

    The introduction of fibre-optic technology for the communications infrastructure in our opencast mines entailed the envisaged improvements in automation and operations management. A motivated project team prepared to face new technologies, adopt and use them in day-to-day operations and optimize them has helped sensitive fibre-optic technology to stand the 'acid test' of opencast mine operations. The concepts and operating equipment developed within the scope of the pilot project in the Hambach mine have meanwhile become a standard applied in all RWE Power mines. The whole Garzweiler II mine, for example, has been erected on the basis of these standards. And the upcoming new installation of the belt conveyors in the Inden II mine will be executed in line with these standards as well. (orig.)

  3. AUTOMATED DETECTION OF STRUCTURAL ALERTS (CHEMICAL FRAGMENTS) IN (ECO)TOXICOLOGY

    Directory of Open Access Journals (Sweden)

    Alban Lepailleur

    2013-02-01

    Full Text Available This mini-review describes the evolution of different algorithms dedicated to the automated discovery of chemical fragments associated with (eco)toxicological endpoints. These structural alerts correspond to one of the most interesting approaches of in silico toxicology, due to their direct link with specific toxicological mechanisms. A number of expert systems are already available but, since the first work in this field, which considered a binomial distribution of chemical fragments between two datasets, new data miners have been developed and applied with success in chemoinformatics. The frequency of a chemical fragment in a dataset is often at the core of the process for defining its toxicological relevance. However, recent progress in data mining provides new insights into the automated discovery of new rules. In particular, this review highlights the notion of Emerging Patterns, which can capture contrasts between classes of data.
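    The frequency-based reasoning mentioned above, testing whether a fragment is over-represented in the toxic set relative to its overall frequency, can be sketched with a one-sided binomial test (SciPy >= 1.7). The counts below are made up for illustration.

```python
# Illustrative enrichment test for a chemical fragment between two datasets.
from scipy.stats import binomtest


def fragment_enrichment(k_toxic, n_toxic, k_total, n_total):
    """k_toxic/n_toxic: fragment hits and size of the toxic set;
    k_total/n_total: fragment hits and size of the pooled dataset."""
    background_rate = k_total / n_total
    result = binomtest(k_toxic, n_toxic, background_rate, alternative="greater")
    return result.pvalue


# Fragment seen in 40 of 200 toxic compounds vs 60 of 1000 compounds overall (made up).
p = fragment_enrichment(40, 200, 60, 1000)
print(f"Enrichment p-value: {p:.3g}")
```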

  4. AUTOMATION OF REMEDY TICKETS CATEGORIZATION USING BUSINESS INTELLIGENCE TOOLS

    Directory of Open Access Journals (Sweden)

    DR. M RAJASEKHARA BABU

    2012-06-01

    Full Text Available The work log of an issue is often the primary source of information for predicting its cause. Mining patterns from work logs is an important issue-management task. This paper aims at developing an application which categorizes issues into problem areas using a clustering algorithm. The algorithm clusters the issues by mining patterns from the work log files. Standard reports can be generated for root cause analysis. The whole process is automated using Business Intelligence tools. This approach can help minimize the recurrence of issues by informing technical decision makers about the impact of the issues on the system and thus providing a permanent fix.
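    A minimal sketch of the clustering step, grouping work-log texts into problem areas with TF-IDF features and k-means, is shown below; the sample texts, vectorizer settings, and number of clusters are placeholders rather than the paper's configuration.

```python
# Cluster issue work logs into problem areas using TF-IDF features and k-means.
from sklearn.cluster import KMeans
from sklearn.feature_extraction.text import TfidfVectorizer

work_logs = [
    "database connection timeout during nightly batch",
    "user cannot reset password, account locked",
    "batch job failed, database deadlock detected",
    "password expiry email not delivered",
]

vectorizer = TfidfVectorizer(stop_words="english")
X = vectorizer.fit_transform(work_logs)

kmeans = KMeans(n_clusters=2, n_init=10, random_state=0).fit(X)
for text, label in zip(work_logs, kmeans.labels_):
    print(label, "-", text)
```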

  5. Automated Menu Recommendation System Based on Past Preferences

    Directory of Open Access Journals (Sweden)

    Daniel Simon Sanz

    2014-08-01

    Full Text Available Data mining plays an important role in e-commerce in today's world. Time is critical when it comes to shopping, as options are unlimited and making a choice can be tedious. This study presents an application of data mining in the form of an Android application that can provide the user with automated suggestions based on past preferences. The application helps a person choose what food they might want to order in a specific restaurant. The application learns user behavior with each order - what they order for each kind of meal and which products they select together. After gathering enough information, the application can suggest to the user the most frequently selected dish, both in the recent past and since the application started learning. Applications such as these can play a major role in helping make a decision based on past preferences, thereby reducing the user's involvement in decision making.
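    The past-preference logic described above can be sketched as per-meal frequency counting; the order history below is invented for illustration, and the real application presumably tracks richer context.

```python
# Count past orders per meal type and suggest the most frequently chosen dish.
from collections import Counter, defaultdict


def build_profile(order_history):
    """order_history: list of (meal_type, dish) tuples."""
    profile = defaultdict(Counter)
    for meal_type, dish in order_history:
        profile[meal_type][dish] += 1
    return profile


def suggest(profile, meal_type):
    counts = profile.get(meal_type)
    return counts.most_common(1)[0][0] if counts else None


history = [("lunch", "pad thai"), ("lunch", "pad thai"), ("lunch", "green curry"),
           ("dinner", "ramen")]
profile = build_profile(history)
print("Lunch suggestion:", suggest(profile, "lunch"))
```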

  6. New solutions of mining tools for hard rock mining

    Energy Technology Data Exchange (ETDEWEB)

    Kotwica, K.; Dasgupta, S. [University of Mining and Metallurgy, Cracow (Poland). Dept. of Mining, Dressing and Transportation Machines

    2002-12-01

    This article presents new solutions of mining tools for hard rock mining and the results of tests carried out on the laboratory stand constructed at the University of Mining and Metallurgy in Cracow for cutting artificial rock samples with the new mining tools. New designs of rotary picks and non-symmetric disc cutters have been used. During the studies, pick edge wear, forces and mining effect were measured for several selected mining parameters. Results obtained with the new bell-type pick and disc cutters have proved very encouraging. 2 refs., 12 figs., 1 tab.

  7. Literature classification for semi-automated updating of biological knowledgebases

    DEFF Research Database (Denmark)

    Olsen, Lars Rønn; Kudahl, Ulrich Johan; Winther, Ole;

    2013-01-01

    ... types of biological data, such as sequence data, are extensively stored in biological databases, functional annotations, such as immunological epitopes, are found primarily in semi-structured formats or free text embedded in primary scientific literature. Results: We defined and applied a machine ... abstracts yielded classification accuracy of 0.95, thus showing significant value in support of data extraction from the literature. Conclusion: We here propose a conceptual framework for semi-automated extraction of epitope data embedded in scientific literature using principles from text mining...

  8. Greater Buyer Effectiveness through Automation

    Science.gov (United States)

    1989-01-01

    FOB = free on board; FPAC = Federal Procurement Automation Council; FPDS = Federal Procurement Data System; 4GL = fourth generation language; GAO = General ... Procurement Automation Council (FPAC), entitled Compendium of Automated Procurement Systems in Federal Agencies. The FPAC inventory attempted to identify ... In some cases we have updated descriptions of systems identified by the FPAC study, but many of the newer systems are identified here for the first ...

  9. 78 FR 66039 - Modification of National Customs Automation Program Test Concerning Automated Commercial...

    Science.gov (United States)

    2013-11-04

    ... SECURITY U.S. Customs and Border Protection Modification of National Customs Automation Program Test... National Customs Automation Program (NCAP) test concerning the Simplified Entry functionality in the...'s (CBP's) National Customs Automation Program (NCAP) test concerning Automated...

  10. 77 FR 48527 - National Customs Automation Program (NCAP) Test Concerning Automated Commercial Environment (ACE...

    Science.gov (United States)

    2012-08-14

    ... SECURITY U.S. Customs and Border Protection National Customs Automation Program (NCAP) Test Concerning...: General notice. SUMMARY: This notice announces modifications to the National Customs Automation Program...) National Customs Automation Program (NCAP) test concerning Automated Commercial Environment...

  11. Circos: an information aesthetic for comparative genomics.

    Science.gov (United States)

    Krzywinski, Martin; Schein, Jacqueline; Birol, Inanç; Connors, Joseph; Gascoyne, Randy; Horsman, Doug; Jones, Steven J; Marra, Marco A

    2009-09-01

    We created a visualization tool called Circos to facilitate the identification and analysis of similarities and differences arising from comparisons of genomes. Our tool is effective in displaying variation in genome structure and, generally, any other kind of positional relationships between genomic intervals. Such data are routinely produced by sequence alignments, hybridization arrays, genome mapping, and genotyping studies. Circos uses a circular ideogram layout to facilitate the display of relationships between pairs of positions by the use of ribbons, which encode the position, size, and orientation of related genomic elements. Circos is capable of displaying data as scatter, line, and histogram plots, heat maps, tiles, connectors, and text. Bitmap or vector images can be created from GFF-style data inputs and hierarchical configuration files, which can be easily generated by automated tools, making Circos suitable for rapid deployment in data analysis and reporting pipelines.
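    As a small illustration of generating Circos inputs from automated tools, the sketch below writes a karyotype file and a link file in the general Circos text layout. The coordinates and colors are placeholders, and the exact file formats should be checked against the Circos documentation before use.

```python
# Write a toy karyotype file and a toy link file that a Circos configuration
# could reference. Values below are placeholders for illustration only.
karyotype = [
    ("chr1", 0, 249_250_621, "red"),
    ("chr2", 0, 243_199_373, "blue"),
]
links = [
    ("chr1", 1_000_000, 1_050_000, "chr2", 5_000_000, 5_050_000),
]

with open("karyotype.txt", "w") as fh:
    for name, start, end, color in karyotype:
        # General karyotype layout: "chr - ID LABEL START END COLOR"
        fh.write(f"chr - {name} {name} {start} {end} {color}\n")

with open("links.txt", "w") as fh:
    for c1, s1, e1, c2, s2, e2 in links:
        # One link per line: two genomic intervals to be joined by a ribbon.
        fh.write(f"{c1} {s1} {e1} {c2} {s2} {e2}\n")

print("Wrote karyotype.txt and links.txt for a Circos configuration to reference.")
```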

  12. Microbial species delineation using whole genome sequences

    Energy Technology Data Exchange (ETDEWEB)

    Kyrpides, Nikos; Mukherjee, Supratim; Ivanova, Natalia; Mavrommatics, Kostas; Pati, Amrita; Konstantinidis, Konstantinos

    2014-10-20

    Species assignments in prokaryotes use a manual, poly-phasic approach utilizing both phenotypic traits and sequence information of phylogenetic marker genes. With thousands of genomes being sequenced every year, an automated, uniform and scalable approach exploiting the rich genomic information in whole genome sequences is desired, at least for the initial assignment of species to an organism. We have evaluated pairwise genome-wide Average Nucleotide Identity (gANI) values and alignment fractions (AFs) for nearly 13,000 genomes using our fast implementation of the computation, identifying robust and widely applicable hard cut-offs for species assignments based on AF and gANI. Using these cutoffs, we generated stable species-level clusters of organisms, which enabled the identification of several species mis-assignments and facilitated the assignment of species for organisms without species definitions.
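    The cut-off-based species grouping described above can be sketched as single-linkage clustering over pairwise gANI/AF values. The thresholds and pairwise numbers below are illustrative placeholders, not the cut-offs derived in the study.

```python
# Group genomes into species-level clusters: link two genomes whenever their
# pairwise gANI and alignment fraction (AF) both exceed hard cut-offs.
GANI_CUTOFF = 96.5   # assumed illustrative threshold (%)
AF_CUTOFF = 0.6      # assumed illustrative threshold


def species_clusters(genomes, pairwise):
    """pairwise: {(g1, g2): (gANI, AF)}; single-linkage grouping via union-find."""
    parent = {g: g for g in genomes}

    def find(g):
        while parent[g] != g:
            parent[g] = parent[parent[g]]  # path compression
            g = parent[g]
        return g

    for (g1, g2), (gani, af) in pairwise.items():
        if gani >= GANI_CUTOFF and af >= AF_CUTOFF:
            parent[find(g1)] = find(g2)

    clusters = {}
    for g in genomes:
        clusters.setdefault(find(g), []).append(g)
    return list(clusters.values())


genomes = ["A", "B", "C"]
pairwise = {("A", "B"): (98.2, 0.8), ("A", "C"): (80.1, 0.3), ("B", "C"): (79.5, 0.2)}
print(species_clusters(genomes, pairwise))
```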

  13. Data mining in healthcare: decision making and precision

    Directory of Open Access Journals (Sweden)

    Ionuţ ŢĂRANU

    2016-05-01

    Full Text Available The application of data mining in healthcare is increasing because the health sector is rich with information, and data mining has become a necessity. Healthcare organizations generate and collect large volumes of information on a daily basis. The use of information technology enables automation of data mining and knowledge discovery, revealing interesting patterns: it eliminates manual tasks, allows data to be extracted directly from electronic records, and supports a secure electronic transfer system for medical records that can save lives and reduce the cost of medical services, as well as enabling early detection of infectious diseases on the basis of advanced data collection. Data mining can enable healthcare organizations to anticipate trends in the patient's medical condition and behaviour by analyzing different perspectives and making connections between seemingly unrelated information. The raw data from healthcare organizations are voluminous and heterogeneous; they need to be collected and stored in an organized form, and their integration allows the formation of a unified medical information system. Data mining in health offers unlimited possibilities for analyzing data patterns that are less visible or hidden to common analysis techniques. These patterns can be used by healthcare practitioners to make forecasts, establish diagnoses, and set treatments for patients in healthcare organizations.

  14. World-wide distribution automation systems

    Energy Technology Data Exchange (ETDEWEB)

    Devaney, T.M.

    1994-12-31

    A worldwide power distribution automation system is outlined. Distribution automation is defined and the status of utility automation is discussed. Other topics discussed include a distribution management system, substation feeder, and customer functions, potential benefits, automation costs, planning and engineering considerations, automation trends, databases, system operation, computer modeling of system, and distribution management systems.

  15. Microfluidic system with integrated microinjector for automated Drosophila embryo injection.

    Science.gov (United States)

    Delubac, Daniel; Highley, Christopher B; Witzberger-Krajcovic, Melissa; Ayoob, Joseph C; Furbee, Emily C; Minden, Jonathan S; Zappe, Stefan

    2012-11-21

    Drosophila is one of the most important model organisms in biology. Knowledge derived from the recently sequenced 12 genomes of various Drosophila species can today be combined with the results of more than 100 years of research to systematically investigate Drosophila biology at the molecular level. In order to enable automated, high-throughput manipulation of Drosophila embryos, we have developed a microfluidic system based on a Pyrex-silicon-Pyrex sandwich structure with integrated, surface-micromachined silicon nitride injector for automated injection of reagents. Our system automatically retrieves embryos from an external reservoir, separates potentially clustered embryos through a sheath flow mechanisms, passively aligns an embryo with the integrated injector through geometric constraints, and pushes the embryo onto the injector through flow drag forces. Automated detection of an embryo at injection position through an external camera triggers injection of reagents and subsequent ejection of the embryo to an external reservoir. Our technology can support automated screens based on Drosophila embryos as well as creation of transgenic Drosophila lines. Apart from Drosophila embryos, the layout of our system can be easily modified to accommodate injection of oocytes, embryos, larvae, or adults of other species and fills an important technological gap with regard to automated manipulation of multicellular organisms.

  16. Automating CPM-GOMS

    Science.gov (United States)

    John, Bonnie; Vera, Alonso; Matessa, Michael; Freed, Michael; Remington, Roger

    2002-01-01

    CPM-GOMS is a modeling method that combines the task decomposition of a GOMS analysis with a model of human resource usage at the level of cognitive, perceptual, and motor operations. CPM-GOMS models have made accurate predictions about skilled user behavior in routine tasks, but developing such models is tedious and error-prone. We describe a process for automatically generating CPM-GOMS models from a hierarchical task decomposition expressed in a cognitive modeling tool called Apex. Resource scheduling in Apex automates the difficult task of interleaving the cognitive, perceptual, and motor resources underlying common task operators (e.g. mouse move-and-click). Apex's UI automatically generates PERT charts, which allow modelers to visualize a model's complex parallel behavior. Because interleaving and visualization is now automated, it is feasible to construct arbitrarily long sequences of behavior. To demonstrate the process, we present a model of automated teller interactions in Apex and discuss implications for user modeling. ... available to model human users, the Goals, Operators, Methods, and Selection (GOMS) method [6, 21] has been the most widely used, providing accurate, often zero-parameter, predictions of the routine performance of skilled users in a wide range of procedural tasks [6, 13, 15, 27, 28]. GOMS is meant to model routine behavior. The user is assumed to have methods that apply sequences of operators to achieve a goal. Selection rules are applied when there is more than one method to achieve a goal. Many routine tasks lend themselves well to such decomposition. Decomposition produces a representation of the task as a set of nested goal states that include an initial state and a final state. The iterative decomposition into goals and nested subgoals can terminate in primitives of any desired granularity, the choice of level of detail dependent on the predictions required. Although GOMS has proven useful in HCI, tools to support the ...

  17. AUTOMATED API TESTING APPROACH

    Directory of Open Access Journals (Sweden)

    SUNIL L. BANGARE

    2012-02-01

    Full Text Available Software testing is an investigation conducted to provide stakeholders with information about the quality of the product or service under test. With the help of software testing we can verify or validate the software product. Normally testing is done after the software has been developed, but testing can also be performed during the development process. This paper gives a brief introduction to an automated API testing tool. Such a testing tool removes a great deal of effort after the software has been developed, and it saves both time and money. This type of testing is also helpful in industry and in colleges.

  18. The automated medical office.

    Science.gov (United States)

    Petreman, M

    1990-08-01

    With shock and surprise many physicians learned in the 1980s that they must change the way they do business. Competition for patients, increasing government regulation, and the rapidly escalating risk of litigation forces physicians to seek modern remedies in office management. The author describes a medical clinic that strives to be paperless using electronic innovation to solve the problems of medical practice management. A computer software program to automate information management in a clinic shows that practical thinking linked to advanced technology can greatly improve office efficiency.

  19. [Automated anesthesia record system].

    Science.gov (United States)

    Zhu, Tao; Liu, Jin

    2005-12-01

    Based on a client/server architecture, an automated anesthesia record system running under the Windows operating system and over a network has been developed and programmed with Microsoft Visual C++ 6.0, Visual Basic 6.0 and SQL Server. The system manages the patient's information throughout anesthesia. It automatically collects and integrates data in real time from several kinds of medical equipment, such as monitors, infusion pumps and anesthesia machines, and then generates the anesthesia sheets automatically. The record system makes the anesthesia record more accurate and complete and can raise the anesthesiologist's working efficiency.

  20. Mapping extent and change in surface mines within the United States for 2001 to 2006

    Science.gov (United States)

    Soulard, Christopher E.; Acevedo, William; Stehman, Stephen V.; Parker, Owen P.

    2016-01-01

    A complete, spatially explicit dataset illustrating the 21st century mining footprint for the conterminous United States does not exist. To address this need, we developed a semi-automated procedure to map the country's mining footprint (30-m pixel) and establish a baseline to monitor changes in mine extent over time. The process uses mine seed points derived from the U.S. Energy Information Administration (EIA), U.S. Geological Survey (USGS) Mineral Resources Data System (MRDS), and USGS National Land Cover Dataset (NLCD) and recodes patches of barren land that meet a “distance to seed” requirement and a patch area requirement before mapping a pixel as mining. Seed points derived from EIA coal points, an edited MRDS point file, and 1992 NLCD mine points were used in three separate efforts using different distance and patch area parameters for each. The three products were then merged to create a 2001 map of moderate-to-large mines in the United States, which was subsequently manually edited to reduce omission and commission errors. This process was replicated using NLCD 2006 barren pixels as a base layer to create a 2006 mine map and a 2001–2006 mine change map focusing on areas with surface mine expansion. In 2001, 8,324 km2 of surface mines were mapped. The footprint increased to 9,181 km2 in 2006, representing a 10.3% increase over 5 years. These methods exhibit merit as a timely approach to generate wall-to-wall, spatially explicit maps representing the recent extent of a wide range of surface mining activities across the country.
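
    The patch-recoding rule described above (retain a barren patch only if it is large enough and lies close enough to a mine seed point) can be sketched with standard raster tools. A minimal illustration using NumPy and SciPy follows; the array names, thresholds, and pixel size are assumptions for the example, not the parameters used in the study.

```python
# Illustrative recoding of barren-land patches into a mine map.
# Inputs are placeholders: `barren` is a boolean raster of barren pixels,
# `seeds` a boolean raster of mine seed points (both on the same 30-m grid).
import numpy as np
from scipy import ndimage

def map_mines(barren, seeds, max_seed_dist_m=1000.0, min_patch_area_m2=9e4, pixel_m=30.0):
    # Distance (in metres) from every pixel to the nearest seed point.
    dist_to_seed = ndimage.distance_transform_edt(~seeds) * pixel_m

    # Label connected patches of barren land and measure their areas.
    labels, n = ndimage.label(barren)
    patch_area = ndimage.sum(barren, labels, index=np.arange(1, n + 1)) * pixel_m ** 2

    mine = np.zeros_like(barren, dtype=bool)
    for patch_id in range(1, n + 1):
        patch = labels == patch_id
        # Keep the patch if it is big enough and touches the seed buffer.
        if patch_area[patch_id - 1] >= min_patch_area_m2 and \
           dist_to_seed[patch].min() <= max_seed_dist_m:
            mine |= patch
    return mine

if __name__ == "__main__":
    # Tiny synthetic grid: a 100-pixel barren patch containing one seed point.
    barren = np.zeros((50, 50), dtype=bool); barren[10:20, 10:20] = True
    seeds = np.zeros((50, 50), dtype=bool); seeds[15, 15] = True
    print("mapped mine pixels:", map_mines(barren, seeds).sum())
```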

  1. Data mining in Cloud Computing

    Directory of Open Access Journals (Sweden)

    Ruxandra-Ştefania PETRE

    2012-10-01

    This paper describes how data mining is used in cloud computing. Data mining is used to extract potentially useful information from raw data, and the integration of data mining techniques into normal day-to-day activities has become commonplace. Every day people are confronted with targeted advertising, and data mining techniques help businesses become more efficient by reducing costs. Data mining techniques and applications are much needed in the cloud computing paradigm: implementing data mining through cloud computing allows users to retrieve meaningful information from a virtually integrated data warehouse, which reduces the costs of infrastructure and storage.

  2. Recent advances in genome-based polyketide discovery.

    Science.gov (United States)

    Helfrich, Eric J N; Reiter, Silke; Piel, Jörn

    2014-10-01

    Polyketides are extraordinarily diverse secondary metabolites of great pharmacological value and interesting ecological functions. The post-genomics era has led to fundamental changes in natural product research by inverting the workflow of secondary metabolite discovery. As opposed to traditional bioactivity-guided screenings, genome mining is an in silico method to screen and analyze sequenced genomes for natural product biosynthetic gene clusters. Since genes for known compounds can be recognized at the early computational stage, genome mining presents an opportunity for dereplication. This review highlights recent progress in bioinformatics, pathway engineering and chemical analytics to extract the biosynthetic secrets hidden in the genomes of both well-known natural product sources and previously neglected bacteria.

  3. Unsupervised Tensor Mining for Big Data Practitioners.

    Science.gov (United States)

    Papalexakis, Evangelos E; Faloutsos, Christos

    2016-09-01

    Multiaspect data are ubiquitous in modern Big Data applications. For instance, different aspects of a social network are the different types of communication between people, the time stamp of each interaction, and the location associated with each individual. How can we jointly model all those aspects and leverage the additional information that they introduce to our analysis? Tensors, which are multidimensional extensions of matrices, are a principled and mathematically sound way of modeling such multiaspect data. In this article, our goal is to popularize tensors and tensor decompositions to Big Data practitioners by demonstrating their effectiveness, outlining challenges that pertain to their application in Big Data scenarios, and presenting our recent work that tackles those challenges. We view this work as a step toward a fully automated, unsupervised tensor mining tool that can be easily and broadly adopted by practitioners in academia and industry.
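
    For readers who want a concrete feel for what a basic tensor decomposition looks like, here is a compact, illustrative CP (CANDECOMP/PARAFAC) decomposition by alternating least squares on a small synthetic 3-way tensor. It is a teaching sketch under simplifying assumptions, not the authors' tensor mining tool.

```python
# Minimal CP (CANDECOMP/PARAFAC) decomposition via alternating least squares.
# Synthetic data only; this is an illustrative sketch, not a production tensor-mining tool.
import numpy as np

def unfold(T, mode):
    """Matricize tensor T along `mode` (rows indexed by that mode)."""
    return np.moveaxis(T, mode, 0).reshape(T.shape[mode], -1)

def khatri_rao(A, B):
    """Column-wise Khatri-Rao product of A (I x R) and B (J x R) -> (I*J x R)."""
    R = A.shape[1]
    return np.einsum('ir,jr->ijr', A, B).reshape(-1, R)

def cp_als(T, rank, n_iter=200, seed=0):
    rng = np.random.default_rng(seed)
    factors = [rng.standard_normal((dim, rank)) for dim in T.shape]
    for _ in range(n_iter):
        for mode in range(3):
            # Khatri-Rao of the other two factors, ordered to match unfold().
            others = [factors[m] for m in range(3) if m != mode]
            kr = khatri_rao(others[0], others[1])
            factors[mode] = unfold(T, mode) @ np.linalg.pinv(kr.T)
    return factors

if __name__ == "__main__":
    rng = np.random.default_rng(1)
    A, B, C = (rng.standard_normal((d, 2)) for d in (4, 5, 6))
    T = np.einsum('ir,jr,kr->ijk', A, B, C)          # exact rank-2 ground truth
    Ah, Bh, Ch = cp_als(T, rank=2)
    T_hat = np.einsum('ir,jr,kr->ijk', Ah, Bh, Ch)
    print("relative reconstruction error:", np.linalg.norm(T - T_hat) / np.linalg.norm(T))
```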

  4. Clinical pertinence metric enables hypothesis-independent genome-phenome analysis for neurologic diagnosis.

    Science.gov (United States)

    Segal, Michael M; Abdellateef, Mostafa; El-Hattab, Ayman W; Hilbush, Brian S; De La Vega, Francisco M; Tromp, Gerard; Williams, Marc S; Betensky, Rebecca A; Gleeson, Joseph

    2015-06-01

    We describe an "integrated genome-phenome analysis" that combines both genomic sequence data and clinical information for genomic diagnosis. It is novel in that it uses robust diagnostic decision support and combines the clinical differential diagnosis and the genomic variants using a "pertinence" metric. This allows the analysis to be hypothesis-independent, not requiring assumptions about mode of inheritance, number of genes involved, or which clinical findings are most relevant. Using 20 genomic trios with neurologic disease, we find that pertinence scores averaging 99.9% identify the causative variant under conditions in which a genomic trio is analyzed and family-aware variant calling is done. The analysis takes seconds, and pertinence scores can be improved by clinicians adding more findings. The core conclusion is that automated genome-phenome analysis can be accurate, rapid, and efficient. We also conclude that an automated process offers a methodology for quality improvement of many components of genomic analysis.

  5. Sinbase: an integrated database to study genomics, genetics and comparative genomics in Sesamum indicum.

    Science.gov (United States)

    Wang, Linhai; Yu, Jingyin; Li, Donghua; Zhang, Xiurong

    2015-01-01

    Sesame (Sesamum indicum L.) is an ancient and important oilseed crop grown widely in tropical and subtropical areas. It belongs to the gigantic order Lamiales, which includes many well-known or economically important species, such as olive (Olea europaea), leonurus (Leonurus japonicus) and lavender (Lavandula spica), many of which have important pharmacological properties. Despite their importance, genetic and genomic analyses on these species have been insufficient due to a lack of reference genome information. The now available S. indicum genome will provide an unprecedented opportunity for studying both S. indicum genetic traits and comparative genomics. To deliver S. indicum genomic information to the worldwide research community, we designed Sinbase, a web-based database with comprehensive sesame genomic, genetic and comparative genomic information. Sinbase includes sequences of assembled sesame pseudomolecular chromosomes, protein-coding genes (27,148), transposable elements (372,167) and non-coding RNAs (1,748). In particular, Sinbase provides unique and valuable information on colinear regions with various plant genomes, including Arabidopsis thaliana, Glycine max, Vitis vinifera and Solanum lycopersicum. Sinbase also provides a useful search function and data mining tools, including a keyword search and local BLAST service. Sinbase will be updated regularly with new features, improvements to genome annotation and new genomic sequences, and is freely accessible at http://ocri-genomics.org/Sinbase/.

  6. String Mining in Bioinformatics

    Science.gov (United States)

    Abouelhoda, Mohamed; Ghanem, Moustafa

    Sequence analysis is a major area in bioinformatics encompassing the methods and techniques for studying the biological sequences, DNA, RNA, and proteins, on the linear structure level. The focus of this area is generally on the identification of intra- and inter-molecular similarities. Identifying intra-molecular similarities boils down to detecting repeated segments within a given sequence, while identifying inter-molecular similarities amounts to spotting common segments among two or multiple sequences. From a data mining point of view, sequence analysis is nothing but string or pattern mining specific to biological strings. For a long time, however, this point of view has not been explicitly embraced in either the data mining or the sequence analysis textbooks, which may be attributed to the co-evolution of the two apparently independent fields. In other words, although the word "data-mining" is almost missing in the sequence analysis literature, its basic concepts have been implicitly applied. Interestingly, recent research in biological sequence analysis introduced efficient solutions to many problems in data mining, such as querying and analyzing time series [49,53], extracting information from web pages [20], fighting spam mails [50], detecting plagiarism [22], and spotting duplications in software systems [14].
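
    As a small illustration of the string-mining view of sequence analysis, the sketch below finds segments shared between two DNA strings by indexing k-mers; the sequences and the value of k are made up for the example.

```python
# Illustrative k-mer based detection of segments shared by two sequences.
# Sequences and k are arbitrary example values.
from collections import defaultdict

def shared_kmers(seq_a, seq_b, k=5):
    """Return {kmer: (positions in seq_a, positions in seq_b)} for shared k-mers."""
    index = defaultdict(list)
    for i in range(len(seq_a) - k + 1):
        index[seq_a[i:i + k]].append(i)
    hits = {}
    for j in range(len(seq_b) - k + 1):
        kmer = seq_b[j:j + k]
        if kmer in index:
            hits.setdefault(kmer, (index[kmer], []))[1].append(j)
    return hits

if __name__ == "__main__":
    a = "ACGTACGTGGTTACGTA"
    b = "TTACGTACGTCC"
    for kmer, (pos_a, pos_b) in shared_kmers(a, b).items():
        print(kmer, "in A at", pos_a, "in B at", pos_b)
```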

  7. Automating quantum experiment control

    Science.gov (United States)

    Stevens, Kelly E.; Amini, Jason M.; Doret, S. Charles; Mohler, Greg; Volin, Curtis; Harter, Alexa W.

    2017-03-01

    The field of quantum information processing is rapidly advancing. As the control of quantum systems approaches the level needed for useful computation, the physical hardware underlying the quantum systems is becoming increasingly complex. It is already becoming impractical to manually code control for the larger hardware implementations. In this chapter, we will employ an approach to the problem of system control that parallels compiler design for a classical computer. We will start with a candidate quantum computing technology, the surface electrode ion trap, and build a system instruction language which can be generated from a simple machine-independent programming language via compilation. We incorporate compile time generation of ion routing that separates the algorithm description from the physical geometry of the hardware. Extending this approach to automatic routing at run time allows for automated initialization of qubit number and placement and additionally allows for automated recovery after catastrophic events such as qubit loss. To show that these systems can handle real hardware, we present a simple demonstration system that routes two ions around a multi-zone ion trap and handles ion loss and ion placement. While we will mainly use examples from transport-based ion trap quantum computing, many of the issues and solutions are applicable to other architectures.

  8. Automated Postediting of Documents

    CERN Document Server

    Knight, K; Knight, Kevin; Chander, Ishwar

    1994-01-01

    Large amounts of low- to medium-quality English texts are now being produced by machine translation (MT) systems, optical character readers (OCR), and non-native speakers of English. Most of this text must be postedited by hand before it sees the light of day. Improving text quality is tedious work, but its automation has not received much research attention. Anyone who has postedited a technical report or thesis written by a non-native speaker of English knows the potential of an automated postediting system. For the case of MT-generated text, we argue for the construction of postediting modules that are portable across MT systems, as an alternative to hardcoding improvements inside any one system. As an example, we have built a complete self-contained postediting module for the task of article selection (a, an, the) for English noun phrases. This is a notoriously difficult problem for Japanese-English MT. Our system contains over 200,000 rules derived automatically from online text resources. We report on l...

  9. Automated Test Case Generation

    CERN Document Server

    CERN. Geneva

    2015-01-01

    I would like to present the concept of automated test case generation. I work on it as part of my PhD and I think it would also be interesting for other people. It is also the topic of a workshop paper that I am presenting in Paris (abstract below). Please note that the talk itself will be more general and not about the specifics of my PhD, but about the broad field of automated test case generation. I will introduce the main approaches (combinatorial testing, symbolic execution, adaptive random testing) and their advantages and problems (oracle problem, combinatorial explosion, ...). Abstract of the paper: Over the last decade code-based test case generation techniques such as combinatorial testing or dynamic symbolic execution have seen growing research popularity. Most algorithms and tool implementations are based on finding assignments for input parameter values in order to maximise the execution branch coverage. Only a few of them consider dependencies from outside the Code Under Test's scope such...
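
    Of the approaches mentioned in the abstract, adaptive random testing is the easiest to illustrate: each new test input is chosen from a pool of random candidates so that it lies as far as possible from the tests already selected. The sketch below is a generic illustration over a numeric input domain, not tied to any particular tool; the domain bounds and candidate-pool size are arbitrary.

```python
# Illustrative adaptive random testing: pick each new test case as the random
# candidate farthest from all previously selected test cases.
import math
import random

def adaptive_random_tests(n_tests, domain=((0.0, 100.0), (0.0, 100.0)),
                          candidates_per_step=10, seed=42):
    rng = random.Random(seed)
    sample = lambda: tuple(rng.uniform(lo, hi) for lo, hi in domain)
    selected = [sample()]                         # first test is purely random
    while len(selected) < n_tests:
        pool = [sample() for _ in range(candidates_per_step)]
        # Choose the candidate whose nearest selected test is farthest away.
        best = max(pool, key=lambda c: min(math.dist(c, s) for s in selected))
        selected.append(best)
    return selected

if __name__ == "__main__":
    for test in adaptive_random_tests(5):
        print(tuple(round(x, 1) for x in test))
```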

  10. Maneuver Automation Software

    Science.gov (United States)

    Uffelman, Hal; Goodson, Troy; Pellegrin, Michael; Stavert, Lynn; Burk, Thomas; Beach, David; Signorelli, Joel; Jones, Jeremy; Hahn, Yungsun; Attiyah, Ahlam; Illsley, Jeannette

    2009-01-01

    The Maneuver Automation Software (MAS) automates the process of generating commands for maneuvers to keep the spacecraft of the Cassini-Huygens mission on a predetermined prime mission trajectory. Before MAS became available, a team of approximately 10 members had to work about two weeks to design, test, and implement each maneuver in a process that involved running many maneuver-related application programs and then serially handing off data products to other parts of the team. MAS enables a three-member team to design, test, and implement a maneuver in about one-half hour after Navigation has processed tracking data. MAS accepts more than 60 parameters and 22 files as input directly from users. MAS consists of Practical Extraction and Reporting Language (PERL) scripts that link, sequence, and execute the maneuver-related application programs: "Pushing a single button" on a graphical user interface causes MAS to run navigation programs that design a maneuver; programs that create sequences of commands to execute the maneuver on the spacecraft; and a program that generates predictions about maneuver performance and generates reports and other files that enable users to quickly review and verify the maneuver design. MAS can also generate presentation materials, initiate electronic command request forms, and archive all data products for future reference.

  11. Automated digital magnetofluidics

    Energy Technology Data Exchange (ETDEWEB)

    Schneider, J; Garcia, A A; Marquez, M [Harrington Department of Bioengineering Arizona State University, Tempe AZ 85287-9709 (United States)], E-mail: tony.garcia@asu.edu

    2008-08-15

    Drops can be moved in complex patterns on superhydrophobic surfaces using a reconfigured computer-controlled x-y metrology stage with a high degree of accuracy, flexibility, and reconfigurability. The stage employs a DMC-4030 controller which has a RISC-based, clock multiplying processor with DSP functions, accepting encoder inputs up to 22 MHz, provides servo update rates as high as 32 kHz, and processes commands at rates as fast as 40 milliseconds. A 6.35 mm diameter cylindrical NdFeB magnet is translated by the stage causing water drops to move by the action of induced magnetization of coated iron microspheres that remain in the drop and are attracted to the rare earth magnet through digital magnetofluidics. Water drops are easily moved in complex patterns in automated digital magnetofluidics at an average speed of 2.8 cm/s over a superhydrophobic polyethylene surface created by solvent casting. With additional components, some potential uses for this automated microfluidic system include characterization of superhydrophobic surfaces, water quality analysis, and medical diagnostics.

  12. Integration and mining of malaria molecular, functional and pharmacological data: how far are we from a chemogenomic knowledge space?

    CERN Document Server

    Birkholtz, L -M; Wells, G; Grando, D; Joubert, F; Kasam, V; Zimmermann, M; Ortet, P; Jacq, N; Roy, S; Hoffmann-Apitius, M; Breton, V; Louw, A I; Maréchal, E

    2006-01-01

    The organization and mining of malaria genomic and post-genomic data is highly motivated by the necessity to predict and characterize new biological targets and new drugs. Biological targets are sought in a biological space designed from the genomic data from Plasmodium falciparum, but using also the millions of genomic data from other species. Drug candidates are sought in a chemical space containing the millions of small molecules stored in public and private chemolibraries. Data management should therefore be as reliable and versatile as possible. In this context, we examined five aspects of the organization and mining of malaria genomic and post-genomic data: 1) the comparison of protein sequences including compositionally atypical malaria sequences, 2) the high throughput reconstruction of molecular phylogenies, 3) the representation of biological processes particularly metabolic pathways, 4) the versatile methods to integrate genomic data, biological representations and functional profiling obtained fro...

  13. The genome BLASTatlas - a GeneWiz extension for visualization of whole-genome homology

    DEFF Research Database (Denmark)

    Hallin, Peter Fischer; Binnewies, Tim Terence; Ussery, David

    2008-01-01

    The development of fast and inexpensive methods for sequencing bacterial genomes has led to a wealth of data, often with many genomes being sequenced of the same species or closely related organisms. Thus, there is a need for visualization methods that will allow easy comparison of many sequenced...... enabling automation of repeated tasks. This tool can be relevant in many pangenomic as well as in metagenomic studies, by giving a quick overview of clusters of insertion sites, genomic islands and overall homology between a reference sequence and a data set....

  14. Journey from Data Mining to Web Mining to Big Data

    OpenAIRE

    Gupta, Richa

    2014-01-01

    This paper describes the journey of big data, starting from data mining, to web mining, to big data. It discusses each of these methods in brief and also provides their applications. It states the importance of mining big data today using fast and novel approaches.

  15. Get smart! automate your house!

    NARCIS (Netherlands)

    Van Amstel, P.; Gorter, N.; De Rouw, J.

    2016-01-01

    This "designers' manual" is made during the TIDO-course AR0531 Innovation and Sustainability This manual will help you in reducing both energy usage and costs by automating your home. It gives an introduction to a number of home automation systems that every homeowner can install.

  16. Opening up Library Automation Software

    Science.gov (United States)

    Breeding, Marshall

    2009-01-01

    Throughout the history of library automation, the author has seen a steady advancement toward more open systems. In the early days of library automation, when proprietary systems dominated, the need for standards was paramount since other means of inter-operability and data exchange weren't possible. Today's focus on Application Programming…

  17. Classification of Automated Search Traffic

    Science.gov (United States)

    Buehrer, Greg; Stokes, Jack W.; Chellapilla, Kumar; Platt, John C.

    As web search providers seek to improve both relevance and response times, they are challenged by the ever-increasing tax of automated search query traffic. Third-party systems interact with search engines for a variety of reasons, such as monitoring a web site's rank, augmenting online games, or possibly maliciously altering click-through rates. In this paper, we investigate automated traffic (sometimes referred to as bot traffic) in the query stream of a large search engine provider. We define automated traffic as any search query not generated by a human in real time. We first provide examples of different categories of query logs generated by automated means. We then develop many different features that distinguish between queries generated by people searching for information and those generated by automated processes. We categorize these features into two classes: interpretations of a physical model of human interaction, and behavioral patterns of automated interaction. Using these detection features, we next classify the query stream using multiple binary classifiers. In addition, a multiclass classifier is then developed to identify subclasses of both normal and automated traffic. An active learning algorithm is used to suggest which user sessions to label to improve the accuracy of the multiclass classifier, while also seeking to discover new classes of automated traffic. A performance analysis is then provided. Finally, the multiclass classifier is used to predict the subclass distribution for the search query stream.
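
    A minimal version of the binary classification step might look like the sketch below, which trains a scikit-learn classifier on hypothetical per-session features (query rate, click-through rate, distinct-query ratio). The feature distributions and labels are synthetic placeholders, not data from the study.

```python
# Illustrative binary classifier for human vs. automated search sessions.
# Features and labels are synthetic placeholders.
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import classification_report
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)
n = 1000

# Hypothetical session features: [queries per minute, click-through rate, distinct-query ratio]
human = np.column_stack([rng.normal(2, 1, n), rng.uniform(0.2, 0.8, n), rng.uniform(0.5, 1.0, n)])
bots = np.column_stack([rng.normal(30, 10, n), rng.uniform(0.0, 0.1, n), rng.uniform(0.0, 0.3, n)])

X = np.vstack([human, bots])
y = np.array([0] * n + [1] * n)          # 0 = human, 1 = automated

X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.3, random_state=0)
clf = RandomForestClassifier(n_estimators=100, random_state=0).fit(X_train, y_train)
print(classification_report(y_test, clf.predict(X_test), target_names=["human", "automated"]))
```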

  18. Translation: Aids, Robots, and Automation.

    Science.gov (United States)

    Andreyewsky, Alexander

    1981-01-01

    Examines electronic aids to translation both as ways to automate it and as an approach to solve problems resulting from shortage of qualified translators. Describes the limitations of robotic MT (Machine Translation) systems, viewing MAT (Machine-Aided Translation) as the only practical solution and the best vehicle for further automation. (MES)

  19. Automated Methods Of Corrosion Measurements

    DEFF Research Database (Denmark)

    Bech-Nielsen, Gregers; Andersen, Jens Enevold Thaulov; Reeve, John Ch

    1997-01-01

    The chapter describes the following automated measurements: Corrosion Measurements by Titration, Imaging Corrosion by Scanning Probe Microscopy, Critical Pitting Temperature and Application of the Electrochemical Hydrogen Permeation Cell.

  20. Mining Available Data from the United States Environmental ...

    Science.gov (United States)

    Demands for quick and accurate life cycle assessments create a need for methods to rapidly generate reliable life cycle inventories (LCI). Data mining is a suitable tool for this purpose, especially given the large amount of available governmental data. These data are typically applied to LCIs on a case-by-case basis. As linked open data becomes more prevalent, it may be possible to automate LCI using data mining by establishing a reproducible approach for identifying, extracting, and processing the data. This work proposes a method for standardizing and eventually automating the discovery and use of publicly available data at the United States Environmental Protection Agency for chemical-manufacturing LCI. The method is developed using a case study of acetic acid. The data quality and gap analyses for the generated inventory found that the selected data sources can provide information with equal or better reliability and representativeness on air, water, hazardous waste, on-site energy usage, and production volumes but with key data gaps including material inputs, water usage, purchased electricity, and transportation requirements. A comparison of the generated LCI with existing data revealed that the data mining inventory is in reasonable agreement with existing data and may provide a more-comprehensive inventory of air emissions and water discharges. The case study highlighted challenges for current data management practices that must be overcome to successfu

  1. Imitating manual curation of text-mined facts in biomedicine.

    Directory of Open Access Journals (Sweden)

    Raul Rodriguez-Esteban

    2006-09-01

    Text-mining algorithms make mistakes in extracting facts from natural-language texts. In biomedical applications, which rely on the use of text-mined data, it is critical to assess the quality (the probability that the message is correctly extracted) of individual facts in order to resolve data conflicts and inconsistencies. Using a large set of almost 100,000 manually produced evaluations (most facts were independently reviewed more than once, producing independent evaluations), we implemented and tested a collection of algorithms that mimic human evaluation of facts provided by an automated information-extraction system. The performance of our best automated classifiers closely approached that of our human evaluators (ROC score close to 0.95). Our hypothesis is that, were we to use a larger number of human experts to evaluate any given sentence, we could implement an artificial-intelligence curator that would perform the classification job at least as accurately as an average individual human evaluator. We illustrated our analysis by visualizing the predicted accuracy of the text-mined relations involving the term cocaine.

  2. Data mining methods

    CERN Document Server

    Chattamvelli, Rajan

    2015-01-01

    DATA MINING METHODS, Second Edition discusses both the theoretical foundations and practical applications of data mining in a wide range of fields including banking, e-commerce, medicine, engineering and management. The book starts by introducing data and information, basic data types, data categories and applications of data mining. The second chapter briefly reviews data visualization technology and its importance in data mining. Fundamentals of probability and statistics are discussed in chapter 3, and a novel algorithm for sample covariances is derived. The next two chapters give an in-depth and useful discussion of data warehousing and OLAP. Decision trees are clearly explained and a new tabular method for decision tree building is discussed. The chapter on association rules discusses popular algorithms and compares various algorithms in summary table form. An interesting application of genetic algorithms is introduced in the next chapter. Foundations of neural networks are built from scratch and the back propagation algorithm is derived...

  3. Recent Developments of Genomic Research in Soybean

    Institute of Scientific and Technical Information of China (English)

    Ching Chan; Xinpeng Qi; Man-Wah Li; Fuk-Ling Wong; Hon-Ming Lam

    2012-01-01

    Soybean is an important cash crop with unique and important traits such as high seed protein and oil contents and the ability to perform symbiotic nitrogen fixation. A reference genome of cultivated soybean was established in 2010, followed by whole-genome re-sequencing of wild and cultivated soybean accessions. These efforts revealed unique features of the soybean genome and helped to understand its evolution. Mapping of variations between wild and cultivated soybean genomes was performed; these genomic variations may be related to the process of domestication and human selection. Wild soybean germplasms exhibit high genomic diversity and hence may be an important source of novel genes/alleles. Accumulation of genomic data will help to refine genetic maps and expedite the identification of functional genes. In this review, we summarize the major findings from the whole-genome sequencing projects and discuss their possible impacts on soybean research and breeding programs. Some emerging areas such as transcriptomic and epigenomic studies are introduced. In addition, we also tabulate some useful bioinformatics tools that will help the mining of soybean genomic data.

  4. Personal continuous route pattern mining

    Institute of Scientific and Technical Information of China (English)

    Qian YE; Ling CHEN; Gen-cai CHEN

    2009-01-01

    In daily life, people often repeat regular routes in certain periods. In this paper, a mining system is developed to find continuous route patterns in a person's past trips. To account for the diversity of personal movement, the mining system employs adaptive GPS data recording and five data filters to ensure clean trip data. The system uses a client/server architecture to protect personal privacy and to reduce the computational load: the server conducts the main mining procedure but with insufficient information to recover real personal routes. To improve the scalability of sequential pattern mining, a novel pattern mining algorithm, continuous route pattern mining (CRPM), is proposed. This algorithm can tolerate the different disturbances in real routes and extract the frequent patterns. Experimental results based on nine persons' trips show that CRPM can extract route patterns more than twice as long as those found by traditional route pattern mining algorithms.
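
    To give a concrete feel for what mining continuous (contiguous) route patterns involves, the sketch below counts contiguous subsequences of road segments that recur across several trips and keeps those above a support threshold. The trip data and threshold are invented, and this is a simplified illustration, not the CRPM algorithm itself.

```python
# Illustrative mining of frequent contiguous route patterns from trip sequences.
# Trips are lists of road-segment IDs; data and support threshold are made up.
from collections import Counter

def contiguous_patterns(trips, min_support=2, min_len=2):
    counts = Counter()
    for trip in trips:
        seen_in_trip = set()
        for i in range(len(trip)):
            for j in range(i + min_len, len(trip) + 1):
                seen_in_trip.add(tuple(trip[i:j]))
        counts.update(seen_in_trip)        # count each pattern at most once per trip
    return {p: c for p, c in counts.items() if c >= min_support}

if __name__ == "__main__":
    trips = [
        ["A", "B", "C", "D", "E"],
        ["X", "B", "C", "D", "Y"],
        ["A", "B", "C", "Z"],
    ]
    for pattern, support in sorted(contiguous_patterns(trips).items(), key=lambda kv: -kv[1]):
        print(pattern, "support =", support)
```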

  5. Scientific Data Mining in Astronomy

    OpenAIRE

    Borne, Kirk

    2009-01-01

    We describe the application of data mining algorithms to research problems in astronomy. We posit that data mining has always been fundamental to astronomical research, since data mining is the basis of evidence-based discovery, including classification, clustering, and novelty discovery. These algorithms represent a major set of computational tools for discovery in large databases, which will be increasingly essential in the era of data-intensive astronomy. Historical examples of data mining...

  6. Land reclamation beautifies coal mines

    Energy Technology Data Exchange (ETDEWEB)

    Coblentz, B. [MSU Ag Communications (United States)

    2009-07-15

    The article explains how the Mississippi Agricultural and Forestry Experiment Station (MAFES) has helped prepare land exploited by strip mining at North American Coal Corporation's Red Hills Mine. The 5,800 acre lignite mine is over 200 ft deep and uncovers six layers of coal. About 100 acres of land a year is mined and reclaimed, mostly as pine plantations. 5 photos.

  7. Road construction in underground mines

    Energy Technology Data Exchange (ETDEWEB)

    Benke, L.; Benkovics, I.

    1985-01-01

    The need for and reasons behind road construction for rubber-tyred vehicles in various mine sections are examined. A detailed analysis is given of the direct and indirect influences of underground haulage ways and transport roads on the parameters of mine performance. The various mine road construction technologies are overviewed. Experiences are presented with road construction in the Mecsek Ore Mines Company, Plant 3, Hungary. The cost factors of four construction technologies are compared.

  8. Data mining mobile devices

    CERN Document Server

    Mena, Jesus

    2013-01-01

    With today's consumers spending more time on their mobiles than on their PCs, new methods of empirical stochastic modeling have emerged that can provide marketers with detailed information about the products, content, and services their customers desire.Data Mining Mobile Devices defines the collection of machine-sensed environmental data pertaining to human social behavior. It explains how the integration of data mining and machine learning can enable the modeling of conversation context, proximity sensing, and geospatial location throughout large communities of mobile users

  9. Mining the Blazar Sky

    CERN Document Server

    Padovani, P; Padovani, Paolo; Giommi, Paolo

    2000-01-01

    We present the results of our methods to "mine" the blazar sky, i.e., select blazar candidates with very high efficiency. These are based on the cross-correlation between public radio and X-ray catalogs and have resulted in two surveys, the Deep X-ray Radio Blazar Survey (DXRBS) and the "Sedentary" BL Lac survey. We show that data mining is vital to select sizeable, deep samples of these rare active galactic nuclei and we touch upon the identification problems which deeper surveys will face.

  10. Data mining for dummies

    CERN Document Server

    Brown, Meta S

    2014-01-01

    Delve into your data for the key to success. Data mining is quickly becoming integral to creating value and business momentum. The ability to detect unseen patterns hidden in the numbers exhaustively generated by day-to-day operations allows savvy decision-makers to exploit every tool at their disposal in the pursuit of better business. By creating models and testing whether patterns hold up, it is possible to discover new intelligence that could change your business's entire paradigm for a more successful outcome. Data Mining for Dummies shows you why it doesn't take a data scientist to gain

  11. Mining the social mediome.

    Science.gov (United States)

    Asch, David A; Rader, Daniel J; Merchant, Raina M

    2015-09-01

    The experiences and behaviors revealed in our everyday lives provide as much insight into health and disease as any analysis of our genome could ever produce. These characteristics are not found in the genome, but may be revealed in our online activities, which make up our social mediome.

  12. WEB MINING BASED FRAMEWORK FOR ONTOLOGY LEARNING

    Directory of Open Access Journals (Sweden)

    C.Ramesh

    2015-07-01

    Today, the notion of the Semantic Web has emerged as a prominent solution to the problem of organizing the immense information provided by the World Wide Web, and its focus on supporting better co-operation between humans and machines is noteworthy. Ontology forms the major component in realizing the Semantic Web. However, manual ontology construction is time-consuming, costly, error-prone and inflexible to change; in addition, it requires the full participation of a knowledge engineer or domain expert. To address this issue, researchers have hoped that a semi-automatic or automatic process would result in faster and better ontology construction and enrichment. Ontology learning has recently become a major area of research, whose goal is to facilitate the construction of ontologies and reduce the effort of developing an ontology for a new domain. However, few research studies attempt to construct ontologies from semi-structured Web pages. In this paper, we present a complete framework for ontology learning that facilitates the semi-automatic construction and enrichment of web site ontologies from semi-structured Web pages. The proposed framework employs Web content mining and Web usage mining to extract conceptual relationships from the Web. The main idea is to incorporate the web author's ideas as well as web users' intentions in the development and evolution of the ontology.

  13. Automated Standard Hazard Tool

    Science.gov (United States)

    Stebler, Shane

    2014-01-01

    The current system used to generate standard hazard reports is considered cumbersome and iterative. This study defines a structure for this system's process in a clear, algorithmic way so that standard hazard reports and basic hazard analysis may be completed using a centralized, web-based computer application. To accomplish this task, a test server is used to host a prototype of the tool during development. The prototype is configured to integrate easily into NASA's current server systems with minimal alteration. Additionally, the tool is easily updated and provides NASA with a system that may grow to accommodate future requirements and, possibly, different applications. The success of this project is reflected in positive, subjective reviews completed by payload providers and NASA Safety and Mission Assurance personnel. Ideally, this prototype will increase interest in the concept of standard hazard automation and lead to the full-scale production of a user-ready application.

  14. Robust automated knowledge capture.

    Energy Technology Data Exchange (ETDEWEB)

    Stevens-Adams, Susan Marie; Abbott, Robert G.; Forsythe, James Chris; Trumbo, Michael Christopher Stefan; Haass, Michael Joseph; Hendrickson, Stacey M. Langfitt

    2011-10-01

    This report summarizes research conducted through the Sandia National Laboratories Robust Automated Knowledge Capture Laboratory Directed Research and Development project. The objective of this project was to advance scientific understanding of the influence of individual cognitive attributes on decision making. The project developed a quantitative model known as RumRunner that has proven effective in predicting the propensity of an individual to shift strategies on the basis of task- and experience-related parameters. Three separate studies are described which have validated the basic RumRunner model. This work provides a basis for better understanding human decision making in high-consequence national security applications and, in particular, the individual characteristics that underlie adaptive thinking.

  15. [From automation to robotics].

    Science.gov (United States)

    1985-01-01

    The introduction of automation into the biology laboratory seems to be unavoidable. But at what cost, if it is necessary to purchase a new machine for every new application? Fortunately the same image processing techniques, belonging to a theoretical framework called Mathematical Morphology, may be used in visual inspection tasks, both in the car industry and in the biology lab. Since the market for industrial robotics applications is much larger than the market for biomedical applications, the price of image processing devices drops, and sometimes becomes less than the price of a complete microscope setup. The power of the image processing methods of Mathematical Morphology will be illustrated by various examples, such as automatic silver grain counting in autoradiography, determination of HLA genotype, electrophoretic gel analysis, automatic screening of cervical smears... Thus several heterogeneous applications may share the same image processing device, provided there is a separate and dedicated workstation for each of them.

  16. Automated electronic filter design

    CERN Document Server

    Banerjee, Amal

    2017-01-01

    This book describes a novel, efficient and powerful scheme for designing and evaluating the performance characteristics of any electronic filter designed to predefined specifications. The author explains techniques that enable readers to eliminate the complicated manual, and thus error-prone and time-consuming, steps of traditional design techniques. The presentation includes a demonstration of efficient automation using an ANSI C language program, which accepts any filter design specification (e.g. Chebyshev low-pass filter, cut-off frequency, pass-band ripple, etc.) as input and generates as output a SPICE (Simulation Program with Integrated Circuit Emphasis) format netlist. Readers can then use this netlist to run simulations with any version of the popular SPICE simulator, increasing the accuracy of the final results without violating any of the key principles of the traditional design scheme.
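
    As a loose analogue of turning a filter specification into something simulatable, the sketch below uses SciPy to design an analog Chebyshev type-I low-pass filter from a cut-off frequency and pass-band ripple and prints its transfer-function coefficients. It does not emit a SPICE netlist as the book's tool does, and all parameter values are illustrative.

```python
# Illustrative analog filter design from a specification (not the book's ANSI C tool).
# Spec values below are arbitrary examples.
import numpy as np
from scipy import signal

order = 4                  # filter order
ripple_db = 1.0            # pass-band ripple in dB
cutoff_hz = 1_000.0        # cut-off frequency in Hz

# Analog Chebyshev type-I low-pass design; Wn is the angular cut-off (rad/s).
b, a = signal.cheby1(order, ripple_db, 2 * np.pi * cutoff_hz, btype="low", analog=True)
print("numerator coefficients:  ", b)
print("denominator coefficients:", a)

# Check the gain at a few frequencies.
freqs_hz = np.array([100.0, 1_000.0, 10_000.0])
w, h = signal.freqs(b, a, worN=2 * np.pi * freqs_hz)
for f, mag in zip(freqs_hz, 20 * np.log10(np.abs(h))):
    print(f"gain at {f:7.1f} Hz: {mag:7.2f} dB")
```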

  17. 76 FR 70075 - Proximity Detection Systems for Continuous Mining Machines in Underground Coal Mines

    Science.gov (United States)

    2011-11-10

    ... Mining Machines in Underground Coal Mines AGENCY: Mine Safety and Health Administration, Labor. ACTION... addressing Proximity Detection Systems for Continuous Mining Machines in Underground Coal Mines. This... Continuous Mining Machines in Underground Coal Mines. MSHA conducted hearings on October 18, October...

  18. Automated Motivic Analysis

    DEFF Research Database (Denmark)

    Lartillot, Olivier

    2016-01-01

    of the successive notes and intervals, various sets of musical parameters may be invoked. In this chapter, a method is presented that allows for these heterogeneous patterns to be discovered. Motivic repetition with local ornamentation is detected by reconstructing, on top of “surface-level” monodic voices, longer...... for lossless compression. The structural complexity resulting from successive repetitions of patterns can be controlled through a simple modelling of cycles. Generally, motivic patterns cannot always be defined solely as sequences of descriptions in a fixed set of dimensions: throughout the descriptions......-term relations between non-adjacent notes related to deeper structures, and by tracking motives on the resulting syntagmatic network. These principles are integrated into a computational framework, the MiningSuite, developed in Matlab....

  19. Automated Quantitative Rare Earth Elements Mineralogy by Scanning Electron Microscopy

    Science.gov (United States)

    Sindern, Sven; Meyer, F. Michael

    2016-09-01

    Increasing industrial demand of rare earth elements (REEs) stems from the central role they play for advanced technologies and the accelerating move away from carbon-based fuels. However, REE production is often hampered by the chemical, mineralogical as well as textural complexity of the ores with a need for better understanding of their salient properties. This is not only essential for in-depth genetic interpretations but also for a robust assessment of ore quality and economic viability. The design of energy and cost-efficient processing of REE ores depends heavily on information about REE element deportment that can be made available employing automated quantitative process mineralogy. Quantitative mineralogy assigns numeric values to compositional and textural properties of mineral matter. Scanning electron microscopy (SEM) combined with a suitable software package for acquisition of backscatter electron and X-ray signals, phase assignment and image analysis is one of the most efficient tools for quantitative mineralogy. The four different SEM-based automated quantitative mineralogy systems, i.e. FEI QEMSCAN and MLA, Tescan TIMA and Zeiss Mineralogic Mining, which are commercially available, are briefly characterized. Using examples of quantitative REE mineralogy, this chapter illustrates capabilities and limitations of automated SEM-based systems. Chemical variability of REE minerals and analytical uncertainty can reduce performance of phase assignment. This is shown for the REE phases parisite and synchysite. In another example from a monazite REE deposit, the quantitative mineralogical parameters surface roughness and mineral association derived from image analysis are applied for automated discrimination of apatite formed in a breakdown reaction of monazite and apatite formed by metamorphism prior to monazite breakdown. SEM-based automated mineralogy fulfils all requirements for characterization of complex unconventional REE ores that will become

  20. Automated Essay Scoring

    Directory of Open Access Journals (Sweden)

    Semire DIKLI

    2006-01-01

    The impacts of computers on writing have been widely studied for three decades. Even basic computer functions, i.e. word processing, have been of great assistance to writers in modifying their essays. Research on Automated Essay Scoring (AES) has revealed that computers have the capacity to function as a more effective cognitive tool (Attali, 2004). AES is defined as the computer technology that evaluates and scores written prose (Shermis & Barrera, 2002; Shermis & Burstein, 2003; Shermis, Raymat, & Barrera, 2003). Revision and feedback are essential aspects of the writing process. Students need to receive feedback in order to increase their writing quality. However, responding to student papers can be a burden for teachers; particularly if they have large numbers of students and assign frequent writing assignments, providing individual feedback on student essays can be quite time consuming. AES systems can be very useful because they can provide the student with a score as well as feedback within seconds (Page, 2003). Four types of AES systems are widely used by testing companies, universities, and public schools: Project Essay Grader (PEG), Intelligent Essay Assessor (IEA), E-rater, and IntelliMetric. AES is a developing technology. Many AES systems are used to overcome time, cost, and generalizability issues in writing assessment. The accuracy and reliability of these systems have been proven to be high. The search for excellence in machine scoring of essays is continuing, and numerous studies are being conducted to improve the effectiveness of AES systems.

  1. Genomic Sequence Comparisons, 1987-2003 Final Report

    Energy Technology Data Exchange (ETDEWEB)

    George M. Church

    2004-07-29

    This project was to develop new DNA sequencing and RNA and protein quantitation methods and related genome annotation tools. The project began in 1987 with the development of multiplex sequencing (published in Science in 1988) and one of the first automated sequencing methods. This led to the first commercial genome sequence in 1994 and to the establishment of the main commercial participants (GTC, then Agencourt) in the public DOE/NIH genome project. In collaboration with GTC we contributed to one of the first complete DOE genome sequences, in 1997, that of Methanobacterium thermoautotrophicum, a species of great relevance to energy-rich gas production.

  2. Genome-scale genetic engineering in Escherichia coli.

    Science.gov (United States)

    Jeong, Jaehwan; Cho, Namjin; Jung, Daehee; Bang, Duhee

    2013-11-01

    Genome engineering has been developed to create useful strains for biological studies and industrial uses. However, a continuous challenge remained in the field: technical limitations in high-throughput screening and precise manipulation of strains. Today, technical improvements have made genome engineering more rapid and efficient. This review introduces recent advances in genome engineering technologies applied to Escherichia coli as well as multiplex automated genome engineering (MAGE), a recent technique proposed as a powerful toolkit due to its straightforward process, rapid experimental procedures, and highly efficient properties.

  3. Internet technologies in the mining industry. Towards unattended mining systems

    Energy Technology Data Exchange (ETDEWEB)

    Krzykawski, Michal [FAMUR Group, Katowice (Poland)

    2009-08-27

    Global suppliers of longwall systems focus mainly on maximising the efficiency of the equipment they manufacture. Given the fact that, since 2004, coal demand on world markets has been constantly on the increase, even during an economic downturn, this endeavour seems fully justified. However, it should be remembered that maximum efficiency must be accompanied by maximum safety of all underground operations. This statement is based on the belief that the mining industry, which exploits increasingly deep and dangerous coal beds, faces the necessity to implement comprehensive IT systems for managing all mining processes and, in the near future, to use unmanned mining systems, fully controllable from the mine surface. The computerisation of mines is an indispensable element of the development of the world mining industry, a belief which has been put into practice with e-mine, developed by the FAMUR Group. (orig.)

  4. Comparative genomic paleontology across plant kingdom reveals the dynamics of TE-driven genome evolution.

    Science.gov (United States)

    El Baidouri, Moaine; Panaud, Olivier

    2013-01-01

    Long terminal repeat-retrotransposons (LTR-RTs) are the most abundant class of transposable elements (TEs) in plants. They strongly impact the structure, function, and evolution of their host genome, and, in particular, their role in genome size variation has been clearly established. However, the dynamics of the process through which LTR-RTs have differentially shaped plant genomes is still poorly understood because of a lack of comparative studies. Using a new robust and automated family classification procedure, we exhaustively characterized the LTR-RTs in eight plant genomes for which a high-quality sequence is available (i.e., Arabidopsis thaliana, A. lyrata, grapevine, soybean, rice, Brachypodium distachyon, sorghum, and maize). This allowed us to perform a comparative genome-wide study of the retrotranspositional landscape in these eight plant lineages from both monocots and dicots. We show that retrotransposition has recurrently occurred in all plant genomes investigated, regardless of their size, and through bursts rather than a continuous process. Moreover, in each genome, only one or a few LTR-RT families have been active in the recent past, and the difference in genome size among the species studied could thus mostly be accounted for by the extent of the latest transpositional burst(s). Following these bursts, LTR-RTs are efficiently eliminated from their host genomes through recombination and deletion, but we show that the removal rate is not lineage specific. These new findings lead us to propose a new model of TE-driven genome evolution in plants.

  5. Simplified process model discovery based on role-oriented genetic mining.

    Science.gov (United States)

    Zhao, Weidong; Liu, Xi; Dai, Weihui

    2014-01-01

    Process mining is the automated acquisition of process models from event logs. Although many process mining techniques have been developed, most of them are based on control flow. Meanwhile, the existing role-oriented process mining methods focus on the correctness and integrity of roles while ignoring the role complexity of the process model, which directly impacts the understandability and quality of the model. To address these problems, we propose a genetic programming approach to mine simplified process models. Using a new metric of process complexity in terms of roles as the fitness function, we can find simpler process models. The new role complexity metric of process models is designed from role cohesion and coupling, and applied to discover roles in process models. Moreover, the higher fitness derived from the role complexity metric also provides a guideline for redesigning process models. Finally, we conduct a case study and experiments, comparing with related studies, to show that the proposed method is more effective at streamlining the process.
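
    A fitness function built from role cohesion and coupling could, in spirit, look like the toy sketch below, where coupling counts hand-overs between roles observed in the log and cohesion counts consecutive work that stays inside one role. The scoring formula and example data are hypothetical illustrations, not the metric defined in the paper.

```python
# Toy role-complexity fitness: reward cohesive roles, penalize hand-overs between roles.
# The formula and example data are hypothetical illustrations.

def role_fitness(role_of, traces):
    """role_of: activity -> role; traces: list of activity sequences from an event log."""
    same_role = cross_role = 0
    for trace in traces:
        for a, b in zip(trace, trace[1:]):
            if role_of[a] == role_of[b]:
                same_role += 1          # consecutive work stays inside one role (cohesion)
            else:
                cross_role += 1         # hand-over between roles (coupling)
    total = same_role + cross_role
    if total == 0:
        return 0.0
    return same_role / total - cross_role / total   # higher = simpler in this toy metric

if __name__ == "__main__":
    role_of = {"register": "clerk", "check": "clerk", "approve": "manager", "pay": "finance"}
    traces = [["register", "check", "approve", "pay"],
              ["register", "check", "check", "approve", "pay"]]
    print("fitness:", round(role_fitness(role_of, traces), 3))
```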

  6. GPS based checking survey and precise DEM development in Open mine

    Institute of Scientific and Technical Information of China (English)

    XU Ai-gong

    2008-01-01

    The checking survey in an open mine is one of the most frequent and important tasks; it forms the connecting link between open mine planning and production. The traditional checking method has disadvantages such as long time consumption, heavy workload, a complicated calculation process, and a low level of automation. We used GPS and GIS technologies to systematically study the core issues of checking surveys in open mines. A detailed GPS data acquisition coding scheme is presented, and based on this scheme an algorithm for semi-automatic computer cartography was developed. Three methods for eliminating gross errors from the raw data needed to create the DEM are discussed. Two algorithms were developed and implemented to create a fine DEM model of the open mine under constrained conditions and to update the model dynamically. A precision analysis and evaluation of the created model were carried out.

  7. Automated methods of predicting the function of biological sequences using GO and BLAST

    Directory of Open Access Journals (Sweden)

    Baumann Ute

    2005-11-01

    Background: With the exponential increase in genomic sequence data there is a need to develop automated approaches to deducing the biological functions of novel sequences with high accuracy. Our aim is to demonstrate how accuracy benchmarking can be used in a decision-making process evaluating competing designs of biological function predictors. We utilise the Gene Ontology (GO), a directed acyclic graph of functional terms, to annotate sequences with functional information describing their biological context. Initially we examine the effect on accuracy scores of increasing the allowed distance between predicted terms and a test set of curator-assigned terms. Next we evaluate several annotator methods using accuracy benchmarking. Given an unannotated sequence we use the Basic Local Alignment Search Tool (BLAST) to find similar sequences that have already been assigned GO terms by curators. A number of methods were developed that utilise terms associated with the best five matching sequences. These methods were compared against a benchmark method of simply using terms associated with the best BLAST-matched sequence (the best BLAST approach). Results: The precision and recall of estimates increase rapidly as the amount of distance permitted between a predicted term and a correct term assignment increases. Accuracy benchmarking allows a comparison of annotation methods. A covering-graph approach performs poorly, except where the term assignment rate is high. A term-distance concordance approach has similar accuracy to the best BLAST approach, demonstrating lower precision but higher recall. However, a discriminant function method has higher precision and recall than the best BLAST approach and the other methods shown here. Conclusion: Allowing term predictions to be counted correct if closely related to a correct term decreases the reliability of the accuracy score. As such we recommend using accuracy measures that require exact matching of predicted
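
    The idea of counting a prediction as correct when it lies within a given distance of a curator-assigned term can be illustrated on a small directed acyclic graph of GO-like terms. The sketch below uses networkx and made-up term identifiers; it is an illustration of the scoring idea, not the evaluation code from the paper.

```python
# Illustrative accuracy scoring where a predicted term counts as correct if it lies
# within `max_dist` edges of a curated term in a toy ontology DAG. Term IDs are made up.
import networkx as nx

# Toy ontology: edges point from child term to parent term.
ontology = nx.DiGraph([
    ("GO:0001", "GO:0000"),   # hypothetical terms
    ("GO:0002", "GO:0000"),
    ("GO:0003", "GO:0001"),
    ("GO:0004", "GO:0001"),
])
undirected = ontology.to_undirected()

def within_distance(predicted, curated, max_dist):
    try:
        return nx.shortest_path_length(undirected, predicted, curated) <= max_dist
    except nx.NetworkXNoPath:
        return False

def precision_at_distance(predictions, curated_terms, max_dist):
    correct = sum(
        any(within_distance(p, c, max_dist) for c in curated_terms)
        for p in predictions
    )
    return correct / len(predictions) if predictions else 0.0

if __name__ == "__main__":
    predicted = ["GO:0003", "GO:0002"]
    curated = ["GO:0004"]
    for d in (0, 1, 2, 3):
        print(f"allowed distance {d}: precision = {precision_at_distance(predicted, curated, d):.2f}")
```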

  8. High-content screening of functional genomic libraries.

    Science.gov (United States)

    Rines, Daniel R; Tu, Buu; Miraglia, Loren; Welch, Genevieve L; Zhang, Jia; Hull, Mitchell V; Orth, Anthony P; Chanda, Sumit K

    2006-01-01

    Recent advances in functional genomics have enabled genome-wide genetic studies in mammalian cells. These include the establishment of high-throughput transfection and viral propagation methodologies, the production of large-scale cDNA and siRNA libraries, and the development of sensitive assay detection processes and instrumentation. The latter has been significantly facilitated by the implementation of automated microscopy and quantitative image analysis, collectively referred to as high-content screening (HCS), toward cell-based functional genomics applications. This technology can be applied to whole-genome analysis of discrete molecular and phenotypic events at the level of individual cells and promises to significantly expand the scope of functional genomic analyses in mammalian cells. This chapter provides a comprehensive guide for curating and preparing functional genomics libraries and performing HCS at the level of the genome.

  9. Frequent pattern mining

    CERN Document Server

    Aggarwal, Charu C

    2014-01-01

    Proposes numerous methods to solve some of the most fundamental problems in data mining and machine learning Presents various simplified perspectives, providing a range of information to benefit both students and practitioners Includes surveys on key research content, case studies and future research directions

  10. Mining Your Own Data

    Science.gov (United States)

    Clark, Maurice

    2014-05-01

    Conducting asteroid photometry frequently requires imaging one area of the sky for many hours. Apart from the asteroid being studied, there may be many other objects of interest buried in the data. The value of mining your own asteroid data is discussed, using examples from observations made by the author, primarily at the Preston Gott Observatory at Texas Tech University.

  11. Contextual Text Mining

    Science.gov (United States)

    Mei, Qiaozhu

    2009-01-01

    With the dramatic growth of text information, there is an increasing need for powerful text mining systems that can automatically discover useful knowledge from text. Text is generally associated with all kinds of contextual information. Those contexts can be explicit, such as the time and the location where a blog article is written, and the…

  12. Computer monitors mine conditions

    Energy Technology Data Exchange (ETDEWEB)

    Brezovec, D.

    1981-08-01

    At Cape Breton Development Corp's No. 26 Colliery in Canada, a Transmitton microprocessor-based system monitors methane concentrations, air velocities and pressures, fan vibration, machine temperatures and pump pressures continuously. Longwall mining at the colliery operating under the ocean is briefly described.

  13. Coal Mines Security System

    Directory of Open Access Journals (Sweden)

    Ankita Guhe

    2012-05-01

    Full Text Available Geological conditions in mines are extremely complicated and present many hidden hazards. Coal is illicitly lifted by musclemen from coal stocks, coal washeries, and coal transfer and loading points, and also along transport routes by tampering with truck weighing. CIL (Coal India Ltd) is under the control of the mafia, and a large number of irregularities can be attributed to the coal mafia. The Intelligent Coal Mine Security System described here uses a data acquisition approach that combines sensor, automatic detection, communication and microcontroller technologies to capture the operational parameters of the mining area. The data acquisition terminal is built around a PIC 16F877A microcontroller for sensing and communicates with the main control machine over an RS232 interface, realising intelligent monitoring. The data management system uses an EEPROM chip as a black box to store data permanently and a CCTV camera to record conditions underground. The system implements real-time monitoring and display of underground data; query, deletion and maintenance of historical data; graphical statistics; report printing; expert diagnosis; and decision-making support. Its research, development and wider application will safeguard mine pit control with accuracy, real-time capability and high reliability.
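
    A minimal sketch of the kind of host-side acquisition loop such a system implies: the microcontroller terminal streams readings over RS232 and the monitoring PC logs them and raises alarms on threshold violations. The frame format, thresholds, port name and use of the pyserial library are assumptions made for illustration; the paper's own protocol is not specified here.

```python
# Illustrative host-side monitor for a serial-connected data-acquisition terminal.
# Assumes newline-terminated ASCII frames like "CH4=0.8,TEMP=31.2,LOAD=12.5"
# (frame format, thresholds and port are hypothetical).
import serial  # pyserial

THRESHOLDS = {"CH4": 1.25, "TEMP": 60.0}   # example alarm limits

def parse_frame(frame: str) -> dict:
    readings = {}
    for field in frame.strip().split(","):
        if "=" in field:
            key, value = field.split("=", 1)
            try:
                readings[key] = float(value)
            except ValueError:
                pass  # ignore malformed fields
    return readings

def monitor(port="COM3", baud=9600):
    with serial.Serial(port, baud, timeout=2) as link, open("mine_log.csv", "a") as log:
        while True:
            frame = link.readline().decode("ascii", errors="replace")
            if not frame:
                continue  # read timed out, no data this cycle
            readings = parse_frame(frame)
            log.write(frame.strip() + "\n")
            for key, limit in THRESHOLDS.items():
                if readings.get(key, 0.0) > limit:
                    print(f"ALARM: {key}={readings[key]} exceeds {limit}")

if __name__ == "__main__":
    monitor()
```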

  14. Design and implementation of data mining tools

    CERN Document Server

    Thuraisingham, Bhavani; Awad, Mamoun

    2009-01-01

    DATA MINING TECHNIQUES AND APPLICATIONS: Introduction; Trends; Data Mining Techniques and Applications; Data Mining for Cyber Security: Intrusion Detection; Data Mining for Web: Web Page Surfing Prediction; Data Mining for Multimedia: Image Classification; Organization of This Book; Next Steps. DATA MINING TECHNIQUES: Introduction; Overview of Data Mining Tasks and Techniques; Artificial Neural Networks; Support Vector Machines; Markov Model; Association Rule Mining (ARM); Multiclass Problem; Image Mining; Summary. DATA MINING APPLICATIONS: Introduction; Intrusion Detection; Web Page Surfing Prediction; Image Classification; Summary. DATA MI...

  15. The Automatic Drilling System of 6R-2P Mining Drill Jumbos

    Directory of Open Access Journals (Sweden)

    Yujun Wang

    2015-02-01

    Full Text Available In order to improve the efficiency of underground mining and tunneling operations and to realize automatic drilling, it is necessary to develop an automation system for large drill jumbos. This work focuses on one such mining drill jumbo, which is effectively a redundant robotic manipulator with eight degrees of freedom: six revolute joints and two prismatic joints. To realize autonomous drilling, algorithms are proposed to calculate the desired pose of the end-effector and to solve the inverse kinematics of the drill jumbo, which is one of the key issues in developing the automation system. A control strategy is then proposed to independently control the eight joint variables using PID feedback control. The simulation model is developed in Simulink. Because the closed-loop controllers for the individual joints are local and independent of each other, the system as a whole is not under closed-loop feedback control. In order to estimate the possible maximum pose error, an analysis of the pose error caused by errors in the joint variables is conducted. The results are satisfactory for mining applications, and the developed automation system is being applied in the drill jumbos built by Mining Technologies International Inc.
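
    A minimal sketch of the joint-level control structure described above: eight independent PID loops, one per joint variable, each closed locally around its own measurement. The gains, setpoints and the toy first-order plant are placeholder assumptions, not the jumbo's actual dynamics.

```python
# Independent PID loops for eight joint variables (6 revolute + 2 prismatic),
# each controller local to its own joint, as in the scheme described above.
# Gains, setpoints and the toy first-order plant are illustrative assumptions.

class PID:
    def __init__(self, kp, ki, kd, dt):
        self.kp, self.ki, self.kd, self.dt = kp, ki, kd, dt
        self.integral = 0.0
        self.prev_error = 0.0

    def step(self, setpoint, measurement):
        error = setpoint - measurement
        self.integral += error * self.dt
        derivative = (error - self.prev_error) / self.dt
        self.prev_error = error
        return self.kp * error + self.ki * self.integral + self.kd * derivative

def simulate(setpoints, steps=500, dt=0.01):
    """Drive each joint toward its setpoint with a crude first-order response."""
    controllers = [PID(kp=8.0, ki=2.0, kd=0.5, dt=dt) for _ in setpoints]
    positions = [0.0] * len(setpoints)
    for _ in range(steps):
        for i, (pid, target) in enumerate(zip(controllers, setpoints)):
            command = pid.step(target, positions[i])
            positions[i] += command * dt   # toy plant: joint velocity = command
    return positions

if __name__ == "__main__":
    # Desired values for the eight joint variables (radians / metres, hypothetical).
    targets = [0.5, -0.3, 1.2, 0.0, 0.8, -1.0, 1.5, 0.4]
    print(simulate(targets))
```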

  16. Strategies for complete plastid genome sequencing.

    Science.gov (United States)

    Twyford, Alex D; Ness, Rob W

    2016-10-28

    Plastid sequencing is an essential tool in the study of plant evolution. This high-copy organelle is one of the most technically accessible regions of the genome, and its sequence conservation makes it a valuable region for comparative genome evolution, phylogenetic analysis and population studies. Here, we discuss recent innovations and approaches for de novo plastid assembly that harness genomic tools. We focus on technical developments including low-cost sequence library preparation approaches for genome skimming, enrichment via hybrid baits and methylation-sensitive capture, sequence platforms with higher read outputs and longer read lengths, and automated tools for assembly. These developments allow for a much more streamlined assembly than via conventional short-range PCR. Although newer methods make complete plastid sequencing possible for any land plant or green alga, there are still challenges for producing finished plastomes particularly from herbarium material or from structurally divergent plastids such as those of parasitic plants.

  17. LTR retrotransposons contribute to genomic gigantism in plethodontid salamanders.

    Science.gov (United States)

    Sun, Cheng; Shepard, Donald B; Chong, Rebecca A; López Arriaza, José; Hall, Kathryn; Castoe, Todd A; Feschotte, Cédric; Pollock, David D; Mueller, Rachel Lockridge

    2012-01-01

    Among vertebrates, most of the largest genomes are found within the salamanders, a clade of amphibians that includes 613 species. Salamander genome sizes range from ~14 to ~120 Gb. Because genome size is correlated with nucleus and cell sizes, as well as other traits, morphological evolution in salamanders has been profoundly affected by genomic gigantism. However, the molecular mechanisms driving genomic expansion in this clade remain largely unknown. Here, we present the first comparative analysis of transposable element (TE) content in salamanders. Using high-throughput sequencing, we generated genomic shotgun data for six species from the Plethodontidae, the largest family of salamanders. We then developed a pipeline to mine TE sequences from shotgun data in taxa with limited genomic resources, such as salamanders. Our summaries of overall TE abundance and diversity for each species demonstrate that TEs make up a substantial portion of salamander genomes, and that all of the major known types of TEs are represented in salamanders. The most abundant TE superfamilies found in the genomes of our six focal species are similar, despite substantial variation in genome size. However, our results demonstrate a major difference between salamanders and other vertebrates: salamander genomes contain much larger amounts of long terminal repeat (LTR) retrotransposons, primarily Ty3/gypsy elements. Thus, the extreme increase in genome size that occurred in salamanders was likely accompanied by a shift in TE landscape. These results suggest that increased proliferation of LTR retrotransposons was a major molecular mechanism contributing to genomic expansion in salamanders.
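
    For orientation, a small sketch of the kind of per-superfamily abundance summary such a pipeline produces, assuming a hypothetical table of classified shotgun reads; it is not the authors' pipeline.

```python
# Summarise transposable-element abundance from a table of classified shotgun reads.
# Assumes a tab-separated file with columns read_id <TAB> TE_superfamily
# (e.g. "Ty3/gypsy", "L1", or "none"); the file and its format are hypothetical.
from collections import Counter

def te_abundance(path, total_reads=None):
    counts = Counter()
    n = 0
    with open(path) as fh:
        for line in fh:
            n += 1
            read_id, superfamily = line.rstrip("\n").split("\t")[:2]
            if superfamily != "none":
                counts[superfamily] += 1
    total = total_reads or n
    return {fam: c / total for fam, c in counts.most_common()}

if __name__ == "__main__":
    for family, fraction in te_abundance("read_classifications.tsv").items():
        print(f"{family}\t{fraction:.3%}")
```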

  18. Mining Class-Correlated Patterns for Sequence Labeling

    Science.gov (United States)

    Hopf, Thomas; Kramer, Stefan

    Sequence labeling is the task of assigning a label sequence to an observation sequence. Since many methods to solve this problem depend on the specification of predictive features, automated methods for their derivation are desirable. Unlike in other areas of pattern-based classification, however, no algorithm to directly mine class-correlated patterns for sequence labeling has been proposed so far. We introduce the novel task of mining class-correlated sequence patterns for sequence labeling and present a supervised pattern growth algorithm to find all patterns in a set of observation sequences, which correlate with the assignment of a fixed sequence label no less than a user-specified minimum correlation constraint. From the resulting set of patterns, features for a variety of classifiers can be obtained in a straightforward manner. The efficiency of the approach and the influence of important parameters are shown in experiments on several biological datasets.
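
    A brute-force stand-in for the mining step described above: enumerate contiguous k-grams of the observation sequences and keep those whose phi correlation with a fixed class label meets a user-specified minimum. The paper's method is a pattern-growth algorithm over labeled positions; the sequence-level labels, k-gram candidates and the choice of the phi coefficient here are simplifying assumptions.

```python
# Keep k-gram patterns whose phi correlation with a binary class label meets a
# minimum threshold. This brute-force, sequence-level version only illustrates
# the correlation constraint; it is not the paper's pattern-growth algorithm.
from math import sqrt

def phi(n11, n10, n01, n00):
    """Phi coefficient of a 2x2 table (pattern present x label positive)."""
    denom = sqrt((n11 + n10) * (n01 + n00) * (n11 + n01) * (n10 + n00))
    return (n11 * n00 - n10 * n01) / denom if denom else 0.0

def candidate_kgrams(sequences, k):
    grams = set()
    for seq in sequences:
        grams.update(tuple(seq[i:i + k]) for i in range(len(seq) - k + 1))
    return grams

def mine_correlated_patterns(sequences, labels, k=2, min_corr=0.3):
    """Return {pattern: phi} for k-grams correlated with the positive label."""
    results = {}
    for gram in candidate_kgrams(sequences, k):
        n11 = n10 = n01 = n00 = 0
        for seq, label in zip(sequences, labels):
            present = any(tuple(seq[i:i + k]) == gram for i in range(len(seq) - k + 1))
            if present and label:
                n11 += 1
            elif present:
                n10 += 1
            elif label:
                n01 += 1
            else:
                n00 += 1
        score = phi(n11, n10, n01, n00)
        if score >= min_corr:
            results[gram] = score
    return results

if __name__ == "__main__":
    seqs = ["ACGTAC", "ACGG", "TTGCA", "GGTTA"]   # toy observation sequences
    labels = [1, 1, 0, 0]                          # hypothetical binary class labels
    print(mine_correlated_patterns(seqs, labels, k=2, min_corr=0.5))
```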

  19. Mining for Strategic Competitive Intelligence Foundations and Applications

    CERN Document Server

    Ziegler, Cai-Nicolas

    2012-01-01

    The textbook at hand aims to provide an introduction to the use of automated methods for gathering strategic competitive intelligence. The text does not describe a single research discipline in its own right, such as machine learning or Web mining; rather, it contemplates an application scenario, namely the gathering of knowledge that is of paramount importance to organizations, e.g., companies and corporations. To this end, the book first summarizes the range of research disciplines that contribute to addressing the issue, extracting from each those elements that are of utmost relevance to the depicted application scope. Moreover, the book presents systems that put these techniques to practical use (e.g., reputation monitoring platforms) and takes an inductive approach to defining the gestalt of mining for strategic competitive intelligence by selecting major use cases that are laid out and explained in detail. These pieces form the first part of the book. Each of those use cases is backed by a nu...

  20. Mechatronics in the mining industrie. With (development) method towards success; Mechatronik im Bergbau. Mit (Entwicklungs-) Methode zum Erfolg

    Energy Technology Data Exchange (ETDEWEB)

    Brandt, Thorsten; Bruckmann, Tobias [Mercatronics GmbH, Duisburg (Germany)

    2009-10-01

    Germany is a high-wage country. Internationally competitive extraction of raw materials in Germany can therefore only be ensured by highly efficient working processes. Tackling the associated extreme requirements on roadway drivage, coal winning and transport equipment has earned the German mining industry and its suppliers the role of an international technology leader. To safeguard this position in the future, the successful mechanisation of the industry will now be followed by its mechatronisation. Efficiency will be increased by (partial) automation and assistance systems. This contribution is the first in a series of articles that explain the principles of mechatronic development methods in the mining industry and aim to make development engineers in the mines aware of the high potential of mechatronics. (orig.)

  1. Visualizing data mining results with the Brede tools

    DEFF Research Database (Denmark)

    Nielsen, Finn Årup

    2009-01-01

    A few neuroinformatics databases now exist that record results from neuroimaging studies in the form of brain coordinates in stereotaxic space. The Brede Toolbox was originally developed to extract, analyze and visualize data from one of them, the BrainMap database. Since then the Brede Toolbox has expanded and now includes its own database with coordinates along with ontologies for brain regions and functions: the Brede Database. With the Brede Toolbox and Database combined we set up automated workflows for extraction of data, mass meta-analytic data mining and visualizations. Most of the Web...

  2. FSRM: A Fast Algorithm for Sequential Rule Mining

    Directory of Open Access Journals (Sweden)

    Anjali Paliwal

    2014-10-01

    Full Text Available Recent developments in computing and automation technologies have resulted in the computerization of business and scientific applications in various areas. Turning the massive amounts of accumulated information into knowledge is attracting researchers in numerous domains, such as databases, machine learning and statistics. From the viewpoint of data researchers, the emphasis is on discovering meaningful patterns hidden in massive data sets. Hence, a central issue for knowledge discovery in databases, and the main focus of this paper, is to develop efficient and scalable mining algorithms as integrated tools for database management systems.

  3. Survey of the Euro Currency Fluctuation by Using Data Mining

    Directory of Open Access Journals (Sweden)

    M. Baan

    2013-08-01

    Full Text Available Data mining, or Knowledge Discovery in Databases (KDD), is a new field in information technology that emerged from progress in the creation and maintenance of large databases, combining statistical and artificial-intelligence methods with database management. Data mining is used to recognize hidden patterns and provide relevant information for decision making on complex problems where conventional methods are inefficient or too slow. Data mining can be used as a powerful tool to predict future trends and behaviors, and this prediction allows proactive, knowledge-driven decisions in businesses. Since the automated prospective analyses offered by data mining move beyond the analyses of past events provided by retrospective tools, it can answer business questions that are traditionally time-consuming to resolve. Because of this advantage, it is of growing interest to government, industry and commerce. In this paper we use this tool to investigate Euro currency fluctuation. For this investigation we use three different algorithms: K*, IBk and MLP, and we extract Euro currency volatility using the same criteria for all algorithms. The dataset used has 21,084 records and was collected from daily price fluctuations of the Euro in the period from 10/2006 to 04/2010.
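
    A rough sketch of the lagged-feature setup with scikit-learn stand-ins for two of the algorithms named above (IBk corresponds to k-nearest neighbours and MLP to a multilayer perceptron; K* has no direct scikit-learn equivalent). The CSV file name, column name and five-day lag window are assumptions for illustration.

```python
# Sketch: predict the next day's EUR rate from a window of previous days, using
# scikit-learn stand-ins for two of the paper's Weka algorithms (IBk ~ k-NN,
# MLP ~ MLPRegressor). File name, column and 5-day lag window are hypothetical.
import numpy as np
import pandas as pd
from sklearn.neighbors import KNeighborsRegressor
from sklearn.neural_network import MLPRegressor
from sklearn.model_selection import train_test_split

def lagged_dataset(prices, window=5):
    X = np.array([prices[i:i + window] for i in range(len(prices) - window)])
    y = np.array(prices[window:])
    return X, y

def evaluate(model, X_train, X_test, y_train, y_test):
    model.fit(X_train, y_train)
    return np.mean(np.abs(model.predict(X_test) - y_test))   # mean absolute error

if __name__ == "__main__":
    prices = pd.read_csv("euro_daily.csv")["rate"].to_numpy()   # hypothetical file/column
    X, y = lagged_dataset(prices)
    # keep chronological order: do not shuffle the time series
    X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, shuffle=False)
    knn = KNeighborsRegressor(n_neighbors=5)
    mlp = MLPRegressor(hidden_layer_sizes=(20,), max_iter=2000, random_state=0)
    for name, model in [("k-NN (IBk analogue)", knn), ("MLP", mlp)]:
        print(name, "MAE:", evaluate(model, X_train, X_test, y_train, y_test))
```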

  4. Mining-Induced Coal Permeability Change Under Different Mining Layouts

    Science.gov (United States)

    Zhang, Zetian; Zhang, Ru; Xie, Heping; Gao, Mingzhong; Xie, Jing

    2016-09-01

    To comprehensively understand the mining-induced coal permeability change, a series of laboratory unloading experiments are conducted based on a simplifying assumption of the actual mining-induced stress evolution processes of three typical longwall mining layouts in China, i.e., non-pillar mining (NM), top-coal caving mining (TCM) and protective coal-seam mining (PCM). A theoretical expression of the mining-induced permeability change ratio (MPCR) is derived and validated by laboratory experiments and in situ observations. The mining-induced coal permeability variation under the three typical mining layouts is quantitatively analyzed using the MPCR based on the test results. The experimental results show that the mining-induced stress evolution processes of different mining layouts do have an influence on the mechanical behavior and evolution of MPCR of coal. The coal mass in the PCM simulation has the lowest stress concentration but the highest peak MPCR (approximately 4000 %), whereas the opposite trends are observed for the coal mass under NM. The results of the coal mass under TCM fall between those for PCM and NM. The evolution of the MPCR of coal under different layouts can be divided into three sections, i.e., stable increasing section, accelerated increasing section and reducing section, but the evolution processes are slightly different for the different mining layouts. A coal bed gas intensive extraction region is recommended based on the MPCR distribution of coal seams obtained by simplifying assumptions and the laboratory testing results. The presented results are also compared with existing conventional triaxial compression test results to fully comprehend the effect of actual mining-induced stress evolution on coal property tests.
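
    The abstract does not reproduce the derived MPCR expression. For orientation only, peak values of roughly 4000 % are consistent with a simple relative-change definition of the form below, which is an assumed generic formulation rather than the authors' theoretical result (k is the current permeability, k_0 the initial pre-mining permeability).

```latex
% Assumed generic form of a mining-induced permeability change ratio (MPCR):
% relative change of current permeability k with respect to the initial
% (pre-mining) permeability k_0, expressed as a percentage.
\[
\mathrm{MPCR} = \frac{k - k_0}{k_0} \times 100\,\%
\]
```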

  5. Automated alignment-based curation of gene models in filamentous fungi

    NARCIS (Netherlands)

    Burgt, van der A.; Severing, E.I.; Collemare, J.A.R.; Wit, de P.J.G.M.

    2014-01-01

    Background Automated gene-calling is still an error-prone process, particularly for the highly plastic genomes of fungal species. Improvement through quality control and manual curation of gene models is a time-consuming process that requires skilled biologists and is only marginally performed. The

  6. Automated analysis and annotation of basketball video

    Science.gov (United States)

    Saur, Drew D.; Tan, Yap-Peng; Kulkarni, Sanjeev R.; Ramadge, Peter J.

    1997-01-01

    Automated analysis and annotation of video sequences are important for digital video libraries, content-based video browsing and data mining projects. A successful video annotation system should provide users with a useful video content summary in a reasonable processing time. Given the wide variety of video genres available today, automatically extracting meaningful video content for annotation still remains hard using currently available techniques. However, a wide range of video has inherent structure, so some prior knowledge about the video content can be exploited to improve our understanding of the high-level video semantic content. In this paper, we develop tools and techniques for analyzing structured video by using the low-level information available directly from MPEG compressed video. Being able to work directly in the video compressed domain can greatly reduce the processing time and enhance storage efficiency. As a testbed, we have developed a basketball annotation system which combines the low-level information extracted from the MPEG stream with prior knowledge of basketball video structure to provide high-level content analysis, annotation and browsing for events such as wide-angle and close-up views, fast breaks, steals, potential shots, number of possessions and possession times. We expect our approach can also be extended to structured video in other domains.
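
    As a simplified stand-in for the compressed-domain analysis described above, the sketch below flags candidate scene cuts by thresholding frame-to-frame colour-histogram differences with OpenCV on decoded frames; the paper itself works directly on MPEG compressed-domain features, and the video file name and threshold here are assumptions.

```python
# Simplified stand-in for compressed-domain video structuring: flag candidate
# scene cuts by thresholding frame-to-frame colour-histogram similarity.
# Video path and threshold are illustrative assumptions.
import cv2

def detect_cuts(video_path, threshold=0.5):
    cap = cv2.VideoCapture(video_path)
    cuts, prev_hist, frame_idx = [], None, 0
    while True:
        ok, frame = cap.read()
        if not ok:
            break
        hist = cv2.calcHist([frame], [0, 1, 2], None, [8, 8, 8], [0, 256] * 3)
        hist = cv2.normalize(hist, hist).flatten()
        if prev_hist is not None:
            # correlation near 1 means similar frames; a sharp drop suggests a cut
            similarity = cv2.compareHist(prev_hist, hist, cv2.HISTCMP_CORREL)
            if similarity < threshold:
                cuts.append(frame_idx)
        prev_hist = hist
        frame_idx += 1
    cap.release()
    return cuts

if __name__ == "__main__":
    print(detect_cuts("basketball.mpg"))
```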

  7. Automated CD-SEM metrology for efficient TD and HVM

    Science.gov (United States)

    Starikov, Alexander; Mulapudi, Satya P.

    2008-03-01

    CD-SEM is the metrology tool of choice for patterning process development and production process control. We can make these applications more efficient by extracting more information from each CD-SEM image. This enables direct monitoring of key process parameters, such as lithography dose and focus, or prediction of processing outcomes, such as etched dimensions or electrical parameters. Automating CD-SEM recipes at the early stages of process development can accelerate technology characterization, segmentation of variance and process improvements. This leverages the engineering effort, reduces development costs and helps to manage the risks inherent in new technology. Automating CD-SEM for manufacturing enables efficient operations. A novel SEM Alarm Time Indicator (SATI) makes this task manageable. SATI pulls together data mining, trend charting of the key recipe and Operations (OPS) indicators, Pareto analysis of OPS losses and inputs for root cause analysis. This approach proved natural to our FAB personnel. After minimal initial training, we applied the new methods in 65 nm FLASH manufacturing. This resulted in significant lasting improvements in CD-SEM recipe robustness, portability and automation, increased CD-SEM capacity and MT productivity.

  8. Building Automation Using Wired Communication.

    Directory of Open Access Journals (Sweden)

    Ms. Supriya Gund*,

    2014-04-01

    Full Text Available In this paper, we present the design and implementation of a building automation system based on wired LAN communication technology. The paper mainly focuses on controlling home appliances remotely and providing security when the user is away. The system provides an ideal solution to problems faced by home owners in daily life: it provides security against intrusion and automates various home appliances over the LAN. To demonstrate the feasibility and effectiveness of the proposed system, devices such as a fire sensor, gas sensor, panic switch and intruder switch, along with a smart card, were developed and evaluated with the building automation system. These techniques are merged into a single building automation system that offers a complete, low-cost, powerful and user-friendly way of real-time monitoring and remote control of a building.

  9. Evolution of Home Automation Technology

    Directory of Open Access Journals (Sweden)

    Mohd. Rihan

    2009-01-01

    Full Text Available In modern society home and office automation has become increasingly important, providing ways to interconnect various home appliances. This interconnection results in faster transfer of information within homes/offices, leading to better home management and an improved user experience. Home automation, in essence, is a technology that integrates the various electrical systems of a home to provide enhanced comfort and security. Users are granted convenient and complete control over all the electrical home appliances and are relieved from tasks that previously required manual control. This paper tracks the development of home automation technology over the last two decades. Various home automation technologies are explained briefly, giving a chronological account of the evolution of one of the most talked-about technologies of recent times.

  10. Home automation with Intel Galileo

    CERN Document Server

    Dundar, Onur

    2015-01-01

    This book is for anyone who wants to learn Intel Galileo for home automation and cross-platform software development. No knowledge of programming with Intel Galileo is assumed, but knowledge of the C programming language is essential.

  11. Automating the Purple Crow Lidar

    Directory of Open Access Journals (Sweden)

    Hicks Shannon

    2016-01-01

    Full Text Available The Purple Crow LiDAR (PCL) was built to measure short- and long-term coupling between the lower, middle, and upper atmosphere. The initial component of my MSc. project is to automate two key elements of the PCL: the rotating liquid mercury mirror and the Zaber alignment mirror. In addition to the automation of the Zaber alignment mirror, it is also necessary to describe the mirror’s movement and positioning errors. Its properties will then be added into the alignment software. Once the alignment software has been completed, we will compare the new alignment method with the previous manual procedure. This is the first among several projects that will culminate in a fully-automated lidar. Eventually, we will be able to work remotely, thereby increasing the amount of data we collect. This paper will describe the motivation for automation, the methods we propose, preliminary results for the Zaber alignment error analysis, and future work.

  12. Network based automation for SMEs

    DEFF Research Database (Denmark)

    Shahabeddini Parizi, Mohammad; Radziwon, Agnieszka

    2017-01-01

    The implementation of appropriate automation concepts that increase productivity in Small and Medium Sized Enterprises (SMEs) requires a lot of effort, due to their limited resources. Therefore, it is strongly recommended for small firms to open up to external sources of knowledge, which...... automation solutions. The empirical data collection involved applying a combination of the comparative case study method with action research elements. This article provides an outlook on the challenges in implementing technological improvements and the way they could be resolved in collaboration...... with other members of the same regional ecosystem. The findings highlight two main automation-related areas where manufacturing SMEs could leverage external sources of knowledge: assistance in defining the automation problem as well as appropriate solution and provider selection. Consequently...

  13. National Automated Conformity Inspection Process -

    Data.gov (United States)

    Department of Transportation — The National Automated Conformity Inspection Process (NACIP) Application is intended to expedite the workflow process as it pertains to the FAA Form 81 0-10 Request...

  14. Listeria Genomics

    Science.gov (United States)

    Cabanes, Didier; Sousa, Sandra; Cossart, Pascale

    The opportunistic intracellular foodborne pathogen Listeria monocytogenes has become a paradigm for the study of host-pathogen interactions and bacterial adaptation to mammalian hosts. Analysis of L. monocytogenes infection has provided considerable insight into how bacteria invade cells, move intracellularly, and disseminate in tissues, as well as tools to address fundamental processes in cell biology. Moreover, the vast amount of knowledge that has been gathered through in-depth comparative genomic analyses and in vivo studies makes L. monocytogenes one of the most well-studied bacterial pathogens. This chapter provides an overview of progress in the exploration of genomic, transcriptomic, and proteomic data in Listeria spp. to understand genome evolution and diversity, as well as physiological aspects of metabolism used by bacteria when growing in diverse environments, in particular in infected hosts.

  15. Marine genomics

    DEFF Research Database (Denmark)

    Oliveira Ribeiro, Ângela Maria; Foote, Andrew D.; Kupczok, Anne

    2017-01-01

    Marine ecosystems occupy 71% of the surface of our planet, yet we know little about their diversity. Although the inventory of species is continually increasing, as registered by the Census of Marine Life program, only about 10% of the estimated two million marine species are known. This lag... High-throughput sequencing approaches have been helping to improve our knowledge of marine biodiversity, from the rich microbial biota that forms the base of the tree of life to a wealth of plant and animal species. In this review, we present an overview of the applications of genomics to the study of marine life, from the evolutionary biology of non-model organisms to species of commercial relevance for fishing, aquaculture and biomedicine. Instead of providing an exhaustive list of available genomic data, we rather set out to present contextualized examples that best represent the current status of the field of marine genomics.

  16. Evolution of Home Automation Technology

    OpenAIRE

    Mohd. Rihan; M. Salim Beg

    2009-01-01

    In modern society home and office automation has become increasingly important, providing ways to interconnect various home appliances. This interconnection results in faster transfer of information within homes/offices, leading to better home management and an improved user experience. Home automation, in essence, is a technology that integrates the various electrical systems of a home to provide enhanced comfort and security. Users are granted convenient and complete control over all the electrical home appl...

  17. Technology modernization assessment flexible automation

    Energy Technology Data Exchange (ETDEWEB)

    Bennett, D.W.; Boyd, D.R.; Hansen, N.H.; Hansen, M.A.; Yount, J.A.

    1990-12-01

    The objectives of this report are: to present technology assessment guidelines to be considered, in conjunction with defense regulations, before an automation project is developed; to give examples showing how the assessment guidelines may be applied to a current project; and to present several potential areas where automation might be applied successfully in the depot system. Depots perform primarily repair and remanufacturing operations, with limited small-batch manufacturing runs. While certain activities (such as Management Information Systems and warehousing) are directly applicable to either environment, the majority of applications will require combining existing and emerging technologies in different ways to meet the special needs of the depot remanufacturing environment. Industry generally enjoys the ability to revise its product lines seasonally, followed by batch runs of thousands or more. Depot batch runs are in the tens, at best the hundreds, of parts, with a potential for large variation in product mix; reconfiguration may be required on a week-to-week basis. This need for a higher degree of flexibility suggests a higher level of operator interaction and, in turn, control systems that go beyond the state of the art for less flexible automation and industry in general. This report investigates the benefits and barriers to automation and concludes that, while significant benefits do exist, depots must be prepared to carefully investigate the technical feasibility of each opportunity and the life-cycle costs associated with implementation. Implementation is suggested in two ways: (1) develop an implementation plan for automation technologies based on the results of small demonstration automation projects; (2) use phased implementation for both these and later-stage automation projects to allow major technical and administrative risk issues to be addressed. 10 refs., 2 figs., 2 tabs. (JF)

  18. Aprendizaje automático

    OpenAIRE

    Moreno, Antonio

    1994-01-01

    This book introduces the basic concepts of one of the most actively studied branches of artificial intelligence: machine learning. It covers topics such as inductive learning, analogical reasoning, explanation-based learning, neural networks, genetic algorithms, case-based reasoning and theoretical approaches to machine learning.

  19. 2015 Chinese Intelligent Automation Conference

    CERN Document Server

    Li, Hongbo

    2015-01-01

    Proceedings of the 2015 Chinese Intelligent Automation Conference presents selected research papers from the CIAC’15, held in Fuzhou, China. The topics include adaptive control, fuzzy control, neural network based control, knowledge based control, hybrid intelligent control, learning control, evolutionary mechanism based control, multi-sensor integration, failure diagnosis, reconfigurable control, etc. Engineers and researchers from academia, industry and the government can gain valuable insights into interdisciplinary solutions in the field of intelligent automation.

  20. Automated Supernova Discovery (Abstract)

    Science.gov (United States)

    Post, R. S.

    2015-12-01

    (Abstract only) We are developing a system of robotic telescopes for automatic recognition of supernovae as well as other transient events, in collaboration with the Puckett Supernova Search Team. At the SAS2014 meeting, the discovery program, SNARE, was first described. Since then, it has been continuously improved to handle searches under a wide variety of atmospheric conditions. Currently, two telescopes are used to build a reference library while searching for PSNs with a partial library. Since data are taken on every cloud-free night, we must deal with varying atmospheric conditions and high background illumination from the moon. The software is configured to identify a PSN and re-image it for verification, with options to change the run plan to acquire photometric or spectroscopic data. The telescopes are 24-inch CDK24s with Alta U230 cameras, one in CA and one in NM. Images and run plans are sent between sites so the CA telescope can search while photometry is done in NM. Our goal is to find bright PSNs of magnitude 17.5 or brighter, which is the limit of our planned spectroscopy. We present results from our first automated PSN discoveries and plans for PSN data acquisition.
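
    The abstract does not detail SNARE's detection algorithm. As a generic illustration of the reference-library idea, the sketch below subtracts an aligned reference image from a new exposure and flags pixels whose residual exceeds a sigma threshold as transient candidates; the FITS file names and the threshold are assumptions.

```python
# Generic difference-imaging sketch: subtract an aligned reference image from a
# new exposure and flag pixels whose residual exceeds a sigma threshold as
# transient candidates. Not the SNARE pipeline; file names are hypothetical.
import numpy as np
from astropy.io import fits

def transient_candidates(new_path, ref_path, nsigma=5.0):
    new = fits.getdata(new_path).astype(float)
    ref = fits.getdata(ref_path).astype(float)
    diff = new - ref                      # assumes the images are already registered
    sigma = np.std(diff)
    ys, xs = np.where(diff > nsigma * sigma)
    return list(zip(xs.tolist(), ys.tolist()))

if __name__ == "__main__":
    candidates = transient_candidates("new_exposure.fits", "reference.fits")
    print(f"{len(candidates)} candidate pixels above threshold")
```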