WorldWideScience

Sample records for mining information extraction

  1. Mining knowledge from text repositories using information extraction ...

    Indian Academy of Sciences (India)

    Information extraction (IE); text mining; text repositories; knowledge discovery from .... general purpose English words. However ... of precision and recall, as extensive experimentation is required due to lack of public tagged corpora. 4. Mining ...

  2. Information Extraction for Clinical Data Mining: A Mammography Case Study.

    Science.gov (United States)

    Nassif, Houssam; Woods, Ryan; Burnside, Elizabeth; Ayvaci, Mehmet; Shavlik, Jude; Page, David

    2009-01-01

    Breast cancer is the leading cause of cancer mortality in women between the ages of 15 and 54. During mammography screening, radiologists use a strict lexicon (BI-RADS) to describe and report their findings. Mammography records are then stored in a well-defined database format (NMD). Lately, researchers have applied data mining and machine learning techniques to these databases. They successfully built breast cancer classifiers that can help in early detection of malignancy. However, the validity of these models depends on the quality of the underlying databases. Unfortunately, most databases suffer from inconsistencies, missing data, inter-observer variability and inappropriate term usage. In addition, many databases are not compliant with the NMD format and/or solely consist of text reports. BI-RADS feature extraction from free text and consistency checks between recorded predictive variables and text reports are crucial to addressing this problem. We describe a general scheme for concept information retrieval from free text given a lexicon, and present a BI-RADS feature extraction algorithm for clinical data mining. It consists of a syntax analyzer, a concept finder and a negation detector. The syntax analyzer preprocesses the input into individual sentences. The concept finder uses a semantic grammar based on the BI-RADS lexicon and the experts' input. It parses sentences detecting BI-RADS concepts. Once a concept is located, a lexical scanner checks for negation. Our method can handle multiple latent concepts within the text, filtering out ultrasound concepts. On our dataset, our algorithm achieves 97.7% precision, 95.5% recall and an F1-score of 0.97. It outperforms manual feature extraction at the 5% statistical significance level.
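
    The three-stage scheme in this abstract (sentence splitting, lexicon-based concept finding, negation check) can be illustrated with a minimal sketch. It is not the authors' algorithm: the tiny lexicon, the negation cue list, and all function names below are assumptions made for illustration, and a plain regular-expression match stands in for the semantic grammar. The last helper only checks that the reported precision and recall are consistent with the reported F1-score.

```python
import re

# Hypothetical mini-lexicon: concept name -> surface patterns. The real BI-RADS
# lexicon and the authors' semantic grammar are far richer than this.
LEXICON = {
    "mass": [r"\bmass(?:es)?\b"],
    "calcification": [r"\bcalcifications?\b", r"\bmicrocalcifications?\b"],
    "architectural_distortion": [r"\barchitectural distortion\b"],
}

# Assumed negation cues; only the text preceding a concept mention is scanned.
NEGATION_CUES = re.compile(r"\b(?:no|not|without|absent|negative for)\b")


def split_sentences(report: str) -> list[str]:
    """Stand-in for the syntax analyzer: split a report into sentences."""
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", report) if s.strip()]


def extract_concepts(report: str) -> dict[str, bool]:
    """Return {concept: present}; present means at least one un-negated mention."""
    found: dict[str, bool] = {}
    for sentence in split_sentences(report.lower()):
        for concept, patterns in LEXICON.items():
            for pattern in patterns:
                for match in re.finditer(pattern, sentence):
                    # Lexical scan of the sentence prefix for a negation cue.
                    negated = bool(NEGATION_CUES.search(sentence[: match.start()]))
                    found[concept] = found.get(concept, False) or not negated
    return found


def f1_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall."""
    return 2 * precision * recall / (precision + recall)
```

    Calling `extract_concepts` on a two-sentence report marks a concept as absent when its only mention follows a cue such as "no". A prefix scan like this is deliberately crude: it would also negate a concept that merely shares a sentence with an earlier, unrelated negation.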

  3. EnvMine: A text-mining system for the automatic extraction of contextual information

    Directory of Open Access Journals (Sweden)

    de Lorenzo Victor

    2010-06-01

    Background: For ecological studies, it is crucial to have adequate descriptions of the environments and samples being studied. Such a description must be done in terms of their physicochemical characteristics, allowing a direct comparison between different environments that would be difficult to do otherwise. The characterization must also include the precise geographical location, to make possible the study of geographical distributions and biogeographical patterns. Currently, there is no schema for annotating these environmental features, and these data have to be extracted from textual sources (published articles). So far, this had to be performed by manual inspection of the corresponding documents. To facilitate this task, we have developed EnvMine, a set of text-mining tools devoted to retrieving contextual information (physicochemical variables and geographical locations) from textual sources of any kind. Results: EnvMine is capable of retrieving the physicochemical variables cited in the text, by means of the accurate identification of their associated units of measurement. In this task, the system achieves a recall (percentage of items retrieved) of 92% with less than 1% error. A Bayesian classifier was also tested for distinguishing parts of the text describing environmental characteristics from others dealing with, for instance, experimental settings. Regarding the identification of geographical locations, the system takes advantage of existing databases such as GeoNames to achieve 86% recall with 92% precision. The identification of a location also includes the determination of its exact coordinates (latitude and longitude), thus allowing the calculation of distance between the individual locations. Conclusion: EnvMine is a very efficient method for extracting contextual information from different text sources, like published articles or web pages. This tool can help in determining the precise location and physicochemical
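
    Two ideas from this abstract lend themselves to a short sketch: spotting physicochemical values through their units of measurement, and computing the distance between two locations once their coordinates are known. The code below is an illustration under stated assumptions, not EnvMine's implementation: the unit list, the function names, and the choice of the haversine formula with a mean Earth radius of 6371 km are all assumptions.

```python
import math
import re

# Hypothetical unit inventory; the abstract does not list EnvMine's units.
# The negative lookahead stops "m" from matching inside longer tokens like "mm".
UNIT_PATTERN = re.compile(
    r"(?P<value>-?\d+(?:\.\d+)?)\s*(?P<unit>mM|mg/L|psu|m)(?![A-Za-z/])"
)


def find_measurements(text: str) -> list[tuple[float, str]]:
    """Return (value, unit) pairs for numbers followed by a known unit."""
    return [
        (float(m.group("value")), m.group("unit"))
        for m in UNIT_PATTERN.finditer(text)
    ]


def haversine_km(lat1: float, lon1: float, lat2: float, lon2: float) -> float:
    """Great-circle distance in km between two (latitude, longitude) points."""
    radius_km = 6371.0  # assumed mean Earth radius
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlam = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlam / 2) ** 2
    return 2 * radius_km * math.asin(math.sqrt(a))
```

    Anchoring on the unit rather than on the variable name is what lets a pattern like this pick up a value even when the text never names the variable explicitly.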

  4. Addressing Information Proliferation: Applications of Information Extraction and Text Mining

    Science.gov (United States)

    Li, Jingjing

    2013-01-01

    The advent of the Internet and the ever-increasing capacity of storage media have made it easy to store, deliver, and share enormous volumes of data, leading to a proliferation of information on the Web, in online libraries, on news wires, and almost everywhere in our daily lives. Since our ability to process and absorb this information remains…

  5. A construction scheme of web page comment information extraction system based on frequent subtree mining

    Science.gov (United States)

    Zhang, Xiaowen; Chen, Bingfeng

    2017-08-01

    This paper proposes a construction scheme for a web page comment information extraction system based on the frequent subtree mining algorithm, referred to as the FSM system. It briefly introduces the overall system architecture and its modules, then describes the core of the system in detail, and finally presents the system prototype.

  6. Using text mining techniques to extract phenotypic information from the PhenoCHF corpus.

    Science.gov (United States)

    Alnazzawi, Noha; Thompson, Paul; Batista-Navarro, Riza; Ananiadou, Sophia

    2015-01-01

    Phenotypic information locked away in unstructured narrative text presents significant barriers to information accessibility, both for clinical practitioners and for computerised applications used for clinical research purposes. Text mining (TM) techniques have previously been applied successfully to extract different types of information from text in the biomedical domain. They have the potential to be extended to allow the extraction of information relating to phenotypes from free text. To stimulate the development of TM systems that are able to extract phenotypic information from text, we have created a new corpus (PhenoCHF) that is annotated by domain experts with several types of phenotypic information relating to congestive heart failure. To ensure that systems developed using the corpus are robust to multiple text types, it integrates text from heterogeneous sources, i.e., electronic health records (EHRs) and scientific articles from the literature. We have developed several different phenotype extraction methods to demonstrate the utility of the corpus, and tested these methods on a further corpus, i.e., ShARe/CLEF 2013. Evaluation of our automated methods showed that PhenoCHF can facilitate the training of reliable phenotype extraction systems, which are robust to variations in text type. These results have been reinforced by evaluating our trained systems on the ShARe/CLEF corpus, which contains clinical records of various types. Like other studies within the biomedical domain, we found that solutions based on conditional random fields produced the best results, when coupled with a rich feature set. PhenoCHF is the first annotated corpus aimed at encoding detailed phenotypic information. The unique heterogeneous composition of the corpus has been shown to be advantageous in the training of systems that can accurately extract phenotypic information from a range of different text types. 
Although the scope of our annotation is currently limited to a single

  7. Metaproteomics: extracting and mining proteome information to characterize metabolic activities in microbial communities.

    Science.gov (United States)

    Abraham, Paul E; Giannone, Richard J; Xiong, Weili; Hettich, Robert L

    2014-06-17

    Contemporary microbial ecology studies usually employ one or more "omics" approaches to investigate the structure and function of microbial communities. Among these, metaproteomics aims to characterize the metabolic activities of the microbial membership, providing a direct link between the genetic potential and functional metabolism. The successful deployment of metaproteomics research depends on the integration of high-quality experimental and bioinformatic techniques for uncovering the metabolic activities of a microbial community in a way that is complementary to other "meta-omic" approaches. The essential, quality-defining informatics steps in metaproteomics investigations are: (1) construction of the metagenome, (2) functional annotation of predicted protein-coding genes, (3) protein database searching, (4) protein inference, and (5) extraction of metabolic information. In this article, we provide an overview of current bioinformatic approaches and software implementations in metaproteome studies in order to highlight the key considerations needed for successful implementation of this powerful community-biology tool. Copyright © 2014 John Wiley & Sons, Inc.

  8. Mining of the social network extraction

    Science.gov (United States)

    Nasution, M. K. M.; Hardi, M.; Syah, R.

    2017-01-01

    The use of the Web as social media is steadily gaining ground in the study of social actor behaviour. However, how information on the Web can be interpreted depends on the capability of the method used, such as superficial methods for extracting social networks. Each method has features and drawbacks: it cannot reveal the behaviour of social actors, but it holds hidden information about them. This paper therefore aims to reveal such information through social network mining. Social behaviour can be expressed through a set of words extracted from the list of snippets.

  9. Nuclear expert web mining system: monitoring and analysis of nuclear acceptance by information retrieval and opinion extraction on the Internet

    Energy Technology Data Exchange (ETDEWEB)

    Reis, Thiago; Barroso, Antonio C.O.; Imakuma, Kengo, E-mail: thiagoreis@usp.b, E-mail: barroso@ipen.b, E-mail: kimakuma@ipen.b [Instituto de Pesquisas Energeticas e Nucleares (IPEN/CNEN-SP), Sao Paulo, SP (Brazil)

    2011-07-01

    This paper presents a research initiative that aims to collect nuclear-related information and to analyze opinionated texts by mining the hypertextual data environment and social network web sites on the Internet. Unlike previous approaches that employed traditional statistical techniques, a novel Web Mining approach is proposed, built using the concept of Expert Systems, for massive and autonomous data collection and analysis. The initial step has been accomplished, resulting in a framework design that is able to gradually encompass a set of evolving techniques, methods, and theories, so that this work will build a platform upon which new research can be performed more easily by just substituting modules or plugging in new ones. Upon completion, this research is expected to contribute to the understanding of the population's views on nuclear technology and its acceptance. (author)

  10. Nuclear expert web mining system: monitoring and analysis of nuclear acceptance by information retrieval and opinion extraction on the Internet

    International Nuclear Information System (INIS)

    Reis, Thiago; Barroso, Antonio C.O.; Imakuma, Kengo

    2011-01-01

    This paper presents a research initiative that aims to collect nuclear-related information and to analyze opinionated texts by mining the hypertextual data environment and social network web sites on the Internet. Unlike previous approaches that employed traditional statistical techniques, a novel Web Mining approach is proposed, built using the concept of Expert Systems, for massive and autonomous data collection and analysis. The initial step has been accomplished, resulting in a framework design that is able to gradually encompass a set of evolving techniques, methods, and theories, so that this work will build a platform upon which new research can be performed more easily by just substituting modules or plugging in new ones. Upon completion, this research is expected to contribute to the understanding of the population's views on nuclear technology and its acceptance. (author)

  11. artery disease guidelines with extracted knowledge from data mining

    Directory of Open Access Journals (Sweden)

    Peyman Rezaei-Hachesu

    2017-06-01

    Conclusion: Guidelines confirm the achieved results from data mining (DM) techniques and help to rank important risk factors based on national and local information. Evaluation of extracted rules determined new patterns for CAD patients.

  12. EXTRACTING KNOWLEDGE FROM DATA - DATA MINING

    Directory of Open Access Journals (Sweden)

    DIANA ELENA CODREANU

    2011-04-01

    Managers of economic organizations have a large volume of information at their disposal and in practice face an avalanche of it, but they cannot work by studying reports that contain large volumes of detailed, uncorrelated data, because the good of an organization may be decided in a fraction of the time. Thus, to make the best and most effective decisions in real time, managers need correct information presented quickly, in a synthetic but relevant form that allows for predictions and analysis. This paper highlights solutions for extracting knowledge from data, namely data mining. This technology aims not only to verify hypotheses but also to discover new knowledge, so that an economic organization can cope with fierce competition in the market.

  13. Information mining in remote sensing imagery

    Science.gov (United States)

    Li, Jiang

    The volume of remotely sensed imagery continues to grow at an enormous rate due to the advances in sensor technology, and our capability for collecting and storing images has greatly outpaced our ability to analyze and retrieve information from the images. This motivates us to develop image information mining techniques, which is very much an interdisciplinary endeavor drawing upon expertise in image processing, databases, information retrieval, machine learning, and software design. This dissertation proposes and implements an extensive remote sensing image information mining (ReSIM) system prototype for mining useful information implicitly stored in remote sensing imagery. The system consists of three modules: image processing subsystem, database subsystem, and visualization and graphical user interface (GUI) subsystem. Land cover and land use (LCLU) information corresponding to spectral characteristics is identified by supervised classification based on support vector machines (SVM) with automatic model selection, while textural features that characterize spatial information are extracted using Gabor wavelet coefficients. Within LCLU categories, textural features are clustered using an optimized k-means clustering approach to acquire search efficient space. The clusters are stored in an object-oriented database (OODB) with associated images indexed in an image database (IDB). A k-nearest neighbor search is performed using a query-by-example (QBE) approach. Furthermore, an automatic parametric contour tracing algorithm and an O(n) time piecewise linear polygonal approximation (PLPA) algorithm are developed for shape information mining of interesting objects within the image. A fuzzy object-oriented database based on the fuzzy object-oriented data (FOOD) model is developed to handle the fuzziness and uncertainty. 
Three specific applications are presented: integrated land cover and texture pattern mining, shape information mining for change detection of lakes, and
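
    The query-by-example retrieval step described in this abstract can be sketched as a k-nearest neighbor search over stored feature vectors. This is a simplified illustration, not the ReSIM implementation: it scans every stored vector linearly, whereas the dissertation clusters features with k-means to narrow the search space, and the image identifiers, two-dimensional feature vectors, and function names are hypothetical stand-ins for Gabor texture features.

```python
import math


def euclidean(a: list[float], b: list[float]) -> float:
    """Euclidean distance between two equal-length feature vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def knn_query(index: dict[str, list[float]], example: list[float], k: int) -> list[str]:
    """Return the ids of the k stored images whose features are closest to the example."""
    ranked = sorted(index, key=lambda image_id: euclidean(index[image_id], example))
    return ranked[:k]


# Hypothetical feature index: image id -> texture feature vector.
feature_index = {
    "lake_a": [0.1, 0.9],
    "lake_b": [0.2, 0.8],
    "urban_c": [0.9, 0.1],
}

# Query by example: find the two images most similar to the given features.
nearest = knn_query(feature_index, [0.1, 0.85], k=2)
```

    With a cluster-based index, the same ranking would be run only over the members of the cluster nearest to the example, trading a small risk of missed neighbors for a much smaller candidate set.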

  14. Information extraction system

    Science.gov (United States)

    Lemmond, Tracy D; Hanley, William G; Guensche, Joseph Wendell; Perry, Nathan C; Nitao, John J; Kidwell, Paul Brandon; Boakye, Kofi Agyeman; Glaser, Ron E; Prenger, Ryan James

    2014-05-13

    An information extraction system and methods of operating the system are provided. In particular, an information extraction system for performing meta-extraction of named entities of people, organizations, and locations as well as relationships and events from text documents are described herein.

  15. TSC mobile mining and extraction technology

    Energy Technology Data Exchange (ETDEWEB)

    Lavender, W.J. [TSC Company Ltd., Calgary, AB (Canada)

    2001-11-01

    This PowerPoint presentation described an innovative mining and extraction technology developed by Calgary-based TSC Company Ltd. that has provided a major breakthrough in bitumen production from mineable oil sands. The presentation described the process and key mechanical components as demonstrated on oil sands leases. It also described the step change in cost structure and profitability. Oil sands mining provides a huge resource base with no exploration costs and no decline in production. Despite these advantages, oil sands mining faces the challenge of high capital and operating costs and materials handling. Other challenges include the variability of the ore and environmental impacts. This paper described the fundamentals of the new technology, called the Tar Sand Combine (TSC): a continuous mining machine, crusher, cyclone, tailings filter and stacker all in one mobile module. Several viewgraphs were included with the presentation to depict the recovery process as successfully demonstrated at a pilot project. A patent is pending on the process and components. The advantages of the TSC are reduced materials handling and the absence of tailings ponds, since tailings remain where they are mined. The final product is clean bitumen. The specifications of a commercial TSC are: 2000 tons per stream hour of mining, producing 25,000 bpsd of bitumen at 12 per cent ore grade; mined ore bitumen recovery greater than 95 per cent; and an availability factor of 85 per cent. It was concluded that the TSC can maximize oil sands reserves, while providing significant cost savings and environmental benefits. 2 tabs., 24 figs.

  16. A Mine of Information.

    Science.gov (United States)

    Williams, Lisa B.

    1986-01-01

    Business researchers and marketers find certain databases useful for finding information on investments, competitors, products, and markets. Colleges can use these same databases to get background on corporate prospects. The largest data source available, DIALOG Information Services and some other databases are described. (MLW)

  17. Information is mine

    International Nuclear Information System (INIS)

    Ju, Sang Uk; Seo, Seung Gyun

    1994-02-01

    This is a guidebook to Cheonrian, Pos-serve and Hitel, which introduces how to use each service with specific explanations. It is divided into seven parts, which cover data communication and its applications, modems and access, Pos-serve with its communication service, information service, Pos World service, e-mail, fax service, shopping, studying and meeting others, Cheonrian with joining, basic commands, public information, guidelines for basic services, posting, chatting, entertainment programs, pursuing hobbies through Cheonrian and shopping, and Hitel with news, advertisements, exchanging news by e-mail, etiquette for chat rooms, and uploading and downloading data. It also deals with banking, KT-MAIL in Korea communication, basic networking, anonymous FTP, Usenet News, Telnet and the effort to make a great communication world.

  18. Proceedings of the meeting on uranium exploration, mining and extraction

    International Nuclear Information System (INIS)

    1996-01-01

    The meeting on uranium exploration, mining, and extraction aimed to expedite information exchange among researchers from the National Atomic Energy Agency (BATAN), their international colleagues, higher education institutions, and other interested scientific communities on the latest developments in Kalan uranium minerals exploration, mining, and extraction. The theme of the meeting, the role of the Nuclear Minerals Development Centre (PPBGN) in nuclear energy provision, reflects the Centre's current advancements in fulfilling its major tasks and responsibilities. To help PPBGN better assume its roles and responsibilities, the meeting was expected to bring forth essential solutions to problems and difficulties relevant to PPBGN's activities. Hence, the scope of the meeting was limited to discussion of the status of nuclear minerals exploration, mining, and extraction technologies in Indonesia, as well as the related environmental and workplace safety in uranium mining and milling. Ten technical papers were presented at the meeting: four on exploration status and technology, three on mining, two on milling, and one on environmental and workplace safety.

  19. Multimedia Information Extraction

    CERN Document Server

    Maybury, Mark T

    2012-01-01

    The advent of increasingly large consumer collections of audio (e.g., iTunes), imagery (e.g., Flickr), and video (e.g., YouTube) is driving a need not only for multimedia retrieval but also information extraction from and across media. Furthermore, industrial and government collections fuel requirements for stock media access, media preservation, broadcast news retrieval, identity management, and video surveillance.  While significant advances have been made in language processing for information extraction from unstructured multilingual text and extraction of objects from imagery and vid

  20. Mining and information: defining the need

    Energy Technology Data Exchange (ETDEWEB)

    Gray, J.; Peck, J. [AQUILA Mining Systems Ltd., Calgary, AB (Canada)

    1996-07-01

    Some of the current technologies at surface mining operations are discussed. The information system and communication system requirements needed to integrate these components are considered. A plan of a new mine that uses operating information, optimization through planning, monitoring, and locating systems, data processing and analysis, and integration of monitored data and information via the Total Mining System (TMS) is described. The TMS will allow integration of a network of stand-alone modules. There is an immediate requirement for setting standards in surface mining operations to prevent duplication of effort. 12 refs., 2 figs.

  1. Possibility of new mining project extracting in conditions of crisis

    Directory of Open Access Journals (Sweden)

    Stanislav Szabo

    2009-03-01

    This paper gives some information about investment in a mining company and specifies the Strieborná Vein project. The project uses instruments of financial management and gives a lot of information about costs, taxes, return on investment, income and loans. That information is very important for the application of the project in Strieborná Vein and supports the decisions of investors. Strieborná Vein is an example of investment in a period of crisis. Extracting gold, silver, copper and iron needs great investment, but the commodities are too interesting not to invest in.

  2. Mars Target Encyclopedia: Information Extraction for Planetary Science

    Science.gov (United States)

    Wagstaff, K. L.; Francis, R.; Gowda, T.; Lu, Y.; Riloff, E.; Singh, K.

    2017-06-01

    Mars surface targets / and published compositions / Seek and ye will find. We used text mining methods to extract information from LPSC abstracts about the composition of Mars surface targets. Users can search by element, mineral, or target.

  3. Challenges in Managing Information Extraction

    Science.gov (United States)

    Shen, Warren H.

    2009-01-01

    This dissertation studies information extraction (IE), the problem of extracting structured information from unstructured data. Example IE tasks include extracting person names from news articles, product information from e-commerce Web pages, street addresses from emails, and names of emerging music bands from blogs. IE is all increasingly…

  4. Sustainable rehabilitation of mining waste and acid mine drainage using geochemistry, mine type, mineralogy, texture, ore extraction and climate knowledge.

    Science.gov (United States)

    Anawar, Hossain Md

    2015-08-01

    The oxidative dissolution of sulfidic minerals releases extremely acidic leachate, sulfate and potentially toxic elements (e.g., As, Ag, Cd, Cr, Cu, Hg, Ni, Pb, Sb, Th, U, Zn) from different mine tailings and waste dumps. For the sustainable rehabilitation and disposal of mining waste, the sources and mechanisms of contaminant generation and the fate and transport of contaminants should be clearly understood. Therefore, this study has provided a critical review of (1) recent insights into mechanisms of oxidation of sulfidic minerals, (2) environmental contamination by mining waste, and (3) remediation and rehabilitation techniques, and (4) then developed the GEMTEC conceptual model/guide [(bio)geochemistry, mine type, mineralogy, geological texture, ore extraction process, climatic knowledge] to provide a new scientific approach and knowledge for remediation of mining wastes and acid mine drainage. This study has suggested the pre-mining geological, geochemical, mineralogical and microtextural characterization of different mineral deposits, and post-mining studies of ore extraction processes, physical, geochemical, mineralogical and microbial reactions, natural attenuation and the effect of climate change for sustainable rehabilitation of mining waste. All components of this model should be considered for effective and integrated management of mining waste and acid mine drainage. Copyright © 2015 Elsevier Ltd. All rights reserved.

  5. Scenario Customization for Information Extraction

    National Research Council Canada - National Science Library

    Yangarber, Roman

    2001-01-01

    Information Extraction (IE) is an emerging NLP technology, whose function is to process unstructured, natural language text, to locate specific pieces of information, or facts, in the text, and to use these facts to fill a database...

  6. Feature extraction for classification in the data mining process

    NARCIS (Netherlands)

    Pechenizkiy, M.; Puuronen, S.; Tsymbal, A.

    2003-01-01

    Dimensionality reduction is a very important step in the data mining process. In this paper, we consider feature extraction for classification tasks as a technique to overcome problems occurring because of "the curse of dimensionality". Three different eigenvector-based feature extraction approaches

  7. A New Challenge for Information Mining

    Directory of Open Access Journals (Sweden)

    Roberto Paiano

    2017-07-01

    In the field of "Data Exploration", many approaches have been developed to manage big data that are also semantically rich. Nowadays, there is a strong need to support discovery-oriented applications, where data discovery is a highly ad hoc interactive process, by assisting users as they navigate the data to find interesting objects. In this work, starting from a theoretical data exploration system for which we identified the main features needed for an efficient exploratory experience, we propose a combination of two data exploration techniques, faceted navigation and data mining, with the aim of improving information discovery during exploration. This approach is better contextualized within Information Mining, which aims at discovering knowledge, i.e. more general patterns within objects or collections of objects.

  8. Mining biomarker information in biomedical literature

    Directory of Open Access Journals (Sweden)

    Younesi Erfan

    2012-12-01

    Background: For selection and evaluation of potential biomarkers, inclusion of already published information is of utmost importance. In spite of significant advancements in text- and data-mining techniques, the vast knowledge space of biomarkers in biomedical text has remained unexplored. Existing named entity recognition approaches are not sufficiently selective for the retrieval of biomarker information from the literature. The purpose of this study was to identify textual features that enhance the effectiveness of biomarker information retrieval for different indication areas and diverse end user perspectives. Methods: A biomarker terminology was created and further organized into six concept classes. Performance of this terminology was optimized towards balanced selectivity and specificity. The information retrieval performance using the biomarker terminology was evaluated based on various combinations of the terminology's six classes. Further validation of these results was performed on two independent corpora representing two different neurodegenerative diseases. Results: The current state of the biomarker terminology contains 119 entity classes supported by 1890 different synonyms. The result of information retrieval shows improved retrieval rate of informative abstracts, which is achieved by including clinical management terms and evidence of gene/protein alterations (e.g. gene/protein expression status or certain polymorphisms) in combination with disease and gene name recognition. When additional filtering through other classes (e.g. diagnostic or prognostic methods) is applied, the typical high number of unspecific search results is significantly reduced. The evaluation results suggest that this approach enables the automated identification of biomarker information in the literature. A demo version of the search engine SCAIView, including the biomarker retrieval, is made available to the public through http

  9. Social big data mining

    CERN Document Server

    Ishikawa, Hiroshi

    2015-01-01

    Social Media. Big Data and Social Data. Hypotheses in the Era of Big Data. Social Big Data Applications. Basic Concepts in Data Mining. Association Rule Mining. Clustering. Classification. Prediction. Web Structure Mining. Web Content Mining. Web Access Log Mining, Information Extraction and Deep Web Mining. Media Mining. Scalability and Outlier Detection.

  10. Remote Sensing Extraction of Stopes and Tailings Ponds in an Ultra-Low Iron Mining Area

    Science.gov (United States)

    Ma, B.; Chen, Y.; Li, X.; Wu, L.

    2018-04-01

    With the development of economy, global demand for steel has accelerated since 2000, and thus mining activities of iron ore have become intensive accordingly. An ultra-low-grade iron has been extracted by open-pit mining and processed massively since 2001 in Kuancheng County, Hebei Province. There are large-scale stopes and tailings ponds in this area. It is important to extract their spatial distribution information for environmental protection and disaster prevention. A remote sensing method of extracting stopes and tailings ponds is studied based on spectral characteristics by use of Landsat 8 OLI imagery and ground spectral data. The overall accuracy of extraction is 95.06 %. In addition, tailings ponds are distinguished from stopes based on thermal characteristics by use of temperature image. The results could provide decision support for environmental protection, disaster prevention, and ecological restoration in the ultra-low-grade iron ore mining area.

  11. The mine where extracting coal is a bonus

    Energy Technology Data Exchange (ETDEWEB)

    NONE

    1993-07-01

    Bowmans Harbour opencast mine is probably unique. Here Clay Colliery is mining an area that was derelict and contaminated land, which became a landfill site. When standards for landfill were raised the Black Country Development Corporation decided to redeposit the waste in a new repository on the same site, using higher standards. New cells for waste are being constructed. In creating these new cells coal is being extracted and sold. Four excavators are involved in this project.

  12. A Mining Algorithm for Extracting Decision Process Data Models

    Directory of Open Access Journals (Sweden)

    Cristina-Claudia DOLEAN

    2011-01-01

    The paper introduces an algorithm that mines logs of user interaction with simulation software. It outputs a model that explicitly shows the data perspective of the decision process, namely the Decision Data Model (DDM). In the first part of the paper we focus on how the DDM is extracted by our mining algorithm. We introduce it as pseudo-code and then provide explanations and examples of how it actually works. In the second part of the paper, we use a series of small case studies to prove the robustness of the mining algorithm and how it deals with the most common patterns we found in real logs.

  13. Mining the Temporal Dimension of the Information Propagation

    Science.gov (United States)

    Berlingerio, Michele; Coscia, Michele; Giannotti, Fosca

    In the last decade, the effort that Data Mining researchers have devoted to Social Network Analysis has increased very quickly. Among the related topics, the study of information propagation in a network has attracted the interest of many researchers, including those from industry. However, only a few answers to the questions “How does information propagate over a network, why and how fast?” have been discovered so far. These answers are of great interest, since they help in finding experts in a network, assessing viral marketing strategies, and identifying fast or slow paths of information inside a collaborative network. In this paper we study the problem of finding frequent patterns in a network with the help of two different techniques: TAS (Temporally Annotated Sequences) mining, aimed at extracting sequential patterns where each transition between two events is annotated with a typical transition time that emerges from input data, and Graph Mining, which is helpful for locally analyzing the nodes of the networks with their properties. Finally, we show preliminary results on mining information propagation over a network, obtained on two well-known email datasets, that show the power of combining these two approaches.
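
    The idea of annotating each transition between two events with a typical transition time can be illustrated with a small sketch. This is not the TAS mining algorithm from the paper; it only assumes that each sequence is a list of hypothetical (event, timestamp) pairs and uses the median gap as the "typical" time.

```python
from collections import defaultdict
from statistics import median

def transition_times(sequences):
    """Return the median time gap for every consecutive event pair.

    `sequences` is a list of event sequences; each sequence is a list of
    (event_name, timestamp) tuples sorted by timestamp. The input layout
    and the use of the median are assumptions made for illustration.
    """
    gaps = defaultdict(list)
    for seq in sequences:
        # Pair each event with its successor and record the elapsed time.
        for (a, t1), (b, t2) in zip(seq, seq[1:]):
            gaps[(a, b)].append(t2 - t1)
    return {pair: median(values) for pair, values in gaps.items()}

# Hypothetical email-like event logs (timestamps in arbitrary units).
logs = [
    [("sent", 0), ("read", 5), ("forwarded", 12)],
    [("sent", 0), ("read", 9), ("forwarded", 10)],
    [("sent", 3), ("read", 10)],
]
typical = transition_times(logs)
for pair, gap in sorted(typical.items()):
    print(pair, gap)
```

    A real miner would additionally keep only transitions that occur frequently enough, which this sketch leaves out.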

  14. Extracting useful information from images

    DEFF Research Database (Denmark)

    Kucheryavskiy, Sergey

    2011-01-01

    The paper presents an overview of methods for extracting useful information from digital images. It covers various approaches that utilized different properties of images, like intensity distribution, spatial frequencies content and several others. A few case studies including isotropic and heter...

  15. Mining Hesitation Information by Vague Association Rules

    Science.gov (United States)

    Lu, An; Ng, Wilfred

    In many online shopping applications, such as Amazon and eBay, traditional Association Rule (AR) mining has limitations as it only deals with the items that are sold but ignores the items that are almost sold (for example, those items that are put into the basket but not checked out). We say that those almost sold items carry hesitation information, since customers are hesitating to buy them. The hesitation information of items is valuable knowledge for the design of good selling strategies. However, there is no conceptual model that is able to capture different statuses of hesitation information. Herein, we apply and extend vague set theory in the context of AR mining. We define the concepts of attractiveness and hesitation of an item, which represent the overall information of a customer's intent on an item. Based on the two concepts, we propose the notion of Vague Association Rules (VARs). We devise an efficient algorithm to mine the VARs. Our experiments show that our algorithm is efficient and the VARs capture more specific and richer information than do the traditional ARs.

  16. Mine railway equipments management information system

    Energy Technology Data Exchange (ETDEWEB)

    Zhang, X.; Han, K.; Duan, T.; Liu, Z.; Lu, H. [China University of Mining and Technology, Xuzhou (China)]

    2007-06-15

    Based on client/server and browser/server models, the management information system described realized the entire life-cycle management of mine railway equipment which included universal equipment and special equipment in the locomotive depot, track maintenance division, electrical depot and car depot. The system has other online functions such as transmitting reports, graphics management, statistics, searches, graphics wizard and web propaganda. It was applied in Pingdingshan Coal Co. Ltd.'s Railway Transport Department. 5 refs., 4 figs.

  17. Mine robotics for the extraction of minerals at great depths

    Energy Technology Data Exchange (ETDEWEB)

    Chaikovskii, Eh G; Poller, B V; Konyukh, V L

    1983-09-01

    An article is discussed which was written by A.A. Bovin, N.V. Kurleni and E.I. Shemyakin on Problems in mining mineral deposits at great depth, printed in issue No. 2 of this journal in 1983. First the authors define the problems, then discuss the construction of automatic systems for the control of underground extraction and haulage and end with the basic problems and organizational measures connected with the development and construction of mining robots. They also deal with systems of control and radio communications for underground winning and hauling operations. The article represents a complex study of the need for full automation of mining and the gradual introduction of robots to replace men in hazardous work places. The authors suggest equipment for the automatic extraction and hauling of minerals based on the use of microcomputers underground and computers located on the surface, videosensors and pressure transducers. The authors state that in order to solve the problems of automation and remote control of mining operations it is necessary to involve more specialists in robotics and remote control at the mining scientific research institutes and to increase the number of graduates in this field. 28 references.

  18. A Financial Data Mining Model for Extracting Customer Behavior

    Directory of Open Access Journals (Sweden)

    Mark K.Y. Mak

    2011-08-01

    Facing variable and chaotic customer behavior, many business organizations are challenged by a lack of sufficient information. Human analysts who lack an understanding of the hidden patterns in business data can thus miss corporate business opportunities. To embrace all business opportunities and enhance competitiveness, the discovery of hidden knowledge, unexpected patterns and useful rules from large databases has provided a feasible solution for several decades. While a wide range of financial analysis products exists in the financial market, how to customize the investment portfolio for the customer is still a challenge to many financial institutions. This paper aims at developing an intelligent Financial Data Mining Model (FDMM) for extracting customer behavior in the financial industry, so as to increase the availability of decision support data and hence increase customer satisfaction. The proposed financial model first clusters the customers into several sectors, and then finds the correlation among these sectors. It is noted that better customer segmentation can increase the ability to identify targeted customers; therefore, extracting useful rules for specific clusters can provide an insight into customers' buying behavior and marketing implications. To validate the feasibility of the proposed model, a simple dataset is collected from a financial company in Hong Kong. The simulation experiments show that the proposed method can not only improve the workflow of a financial company, but also deepen understanding of investment behavior. Thus, a corporation is able to customize the most suitable products and services for customers on the basis of the rules extracted.

  19. Information Extraction From Chemical Patents

    Directory of Open Access Journals (Sweden)

    Sandra Bergmann

    2012-01-01

    The development of new chemicals or pharmaceuticals is preceded by an in-depth analysis of published patents in this field. This information retrieval is a costly and time-inefficient step when done by a human reader, yet it is mandatory for the potential success of an investment. The goal of the research project UIMA-HPC is to automate and hence speed up the process of knowledge mining about patents. Multi-threaded analysis engines, developed according to UIMA (Unstructured Information Management Architecture) standards, process texts and images in thousands of documents in parallel. UNICORE (UNiform Interface to COmputing Resources) workflow control structures make it possible to dynamically allocate resources for every given task to gain the best CPU-time/real-time ratios in an HPC environment.

  20. 76 FR 589 - Proposed Extension of Existing Information Collection; Mine Accident, Injury, Illness, Mine...

    Science.gov (United States)

    2011-01-05

    ... requires mine operators and independent contractors to immediately notify MSHA in the event of an accident... provides for uniform information gathering across the mining industry. Section 50.30 requires mine... types. These rates are used to analyze trends and to assess the degree of success of the health and...

  1. A STUDY OF TEXT MINING METHODS, APPLICATIONS, AND TECHNIQUES

    OpenAIRE

    R. Rajamani & S. Saranya

    2017-01-01

    Data mining is used to extract useful information from large amounts of data. It is used to implement and solve different types of research problems. Research areas related to data mining include text mining, web mining, image mining, sequential pattern mining, spatial mining, medical mining, multimedia mining, structure mining and graph mining. Text mining, also referred to as text data mining, is also called knowledge discovery in text (KDT) or intelligent text analysis. T...

  2. MBA: a literature mining system for extracting biomedical abbreviations.

    Science.gov (United States)

    Xu, Yun; Wang, ZhiHao; Lei, YiMing; Zhao, YuZhong; Xue, Yu

    2009-01-09

    The exploding growth of the biomedical literature presents many challenges for biological researchers. One such challenge arises from the use of a great many abbreviations. Extracting abbreviations and their definitions accurately is very helpful to biologists and also facilitates biomedical text analysis. Existing approaches fall into four broad categories: rule based, machine learning based, text alignment based and statistically based. State of the art methods either focus exclusively on acronym-type abbreviations, or cannot recognize rare abbreviations. We propose a systematic method to extract abbreviations effectively. First, a scoring method is used to classify the abbreviations into acronym-type and non-acronym-type abbreviations, and then their corresponding definitions are identified by two different methods: a text alignment algorithm for the former, a statistical method for the latter. A literature mining system, MBA, was constructed to extract both acronym-type and non-acronym-type abbreviations. An abbreviation-tagged literature corpus, called the Medstract gold standard corpus, was used to evaluate the system. MBA achieved a recall of 88% at a precision of 91% on the Medstract gold-standard evaluation corpus. We present a new literature mining system, MBA, for extracting biomedical abbreviations. Our evaluation demonstrates that the MBA system performs better than the others. It can identify the definitions of not only acronym-type abbreviations, including slightly irregular acronym-type abbreviations (e.g., ), but also non-acronym-type abbreviations (e.g., ).

  3. Multi-Filter String Matching and Human-Centric Entity Matching for Information Extraction

    Science.gov (United States)

    Sun, Chong

    2012-01-01

    More and more information is being generated in text documents, such as Web pages, emails and blogs. To effectively manage this unstructured information, one broadly used approach includes locating relevant content in documents, extracting structured information and integrating the extracted information for querying, mining or further analysis. In…

  4. Text mining facilitates database curation - extraction of mutation-disease associations from Bio-medical literature.

    Science.gov (United States)

    Ravikumar, Komandur Elayavilli; Wagholikar, Kavishwar B; Li, Dingcheng; Kocher, Jean-Pierre; Liu, Hongfang

    2015-06-06

    Advances in next-generation sequencing technology have accelerated the pace of individualized medicine (IM), which aims to incorporate genetic/genomic information into medicine. One immediate need in interpreting sequencing data is the assembly of information about genetic variants and their corresponding associations with other entities (e.g., diseases or medications). Even with dedicated effort to capture such information in biological databases, much of this information remains 'locked' in the unstructured text of biomedical publications. There is a substantial lag between the publication and the subsequent abstraction of such information into databases. Multiple text mining systems have been developed, but most of them focus on the sentence level association extraction with performance evaluation based on gold standard text annotations specifically prepared for text mining systems. We developed and evaluated a text mining system, MutD, which extracts protein mutation-disease associations from MEDLINE abstracts by incorporating discourse level analysis, using a benchmark data set extracted from curated database records. MutD achieves an F-measure of 64.3% for reconstructing protein mutation disease associations in curated database records. The discourse level analysis component of MutD contributed to a gain of more than 10% in F-measure when compared against the sentence level association extraction. Our error analysis indicates that 23 of the 64 precision errors are true associations that were not captured by database curators and 68 of the 113 recall errors are caused by the absence of associated disease entities in the abstract. After adjusting for the defects in the curated database, the revised F-measure of MutD in association detection reaches 81.5%. Our quantitative analysis reveals that MutD can effectively extract protein mutation disease associations when benchmarking based on curated database records. The analysis also demonstrates that incorporating...

  5. Multiple-Feature Extracting Modules Based Leak Mining System Design

    Directory of Open Access Journals (Sweden)

    Ying-Chiang Cho

    2013-01-01

    ... In this paper, we implement a crawler mining system that is equipped with SQL injection vulnerability detection, by means of an algorithm developed for the web crawler. In addition, we analyze portal sites of the governments of various countries or regions in order to investigate the information leaking status of each site. Subsequently, we analyze the database structure and content of each site, using the data collected. Thus, we make use of practical verification in order to focus on information security and privacy through black-box testing.

  6. Extracting software static defect models using data mining

    Directory of Open Access Journals (Sweden)

    Ahmed H. Yousef

    2015-03-01

    Large software projects are subject to quality risks of having defective modules that will cause failures during the software execution. Several software repositories contain source code of large projects that are composed of many modules. These software repositories include data for the software metrics of these modules and the defective state of each module. In this paper, a data mining approach is used to show the attributes that predict the defective state of software modules. A software solution architecture is proposed to convert the extracted knowledge into data mining models that can be integrated with the current software project metrics and bugs data in order to enhance the prediction. The results show better prediction capabilities when all the algorithms are combined using weighted votes. When only one individual algorithm is used, the Naïve Bayes algorithm has the best results, followed by the Neural Network and the Decision Trees algorithms.
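
    Combining classifiers by weighted votes can be sketched as follows. The abstract does not say how the weights were chosen, so the classifier names, labels and weights below are hypothetical and only illustrate the general technique.

```python
from collections import defaultdict

def weighted_vote(predictions, weights):
    """Return the label with the largest total weight.

    `predictions` maps a classifier name to its predicted label and
    `weights` maps the same names to non-negative vote weights. On a tie,
    the label that was seen first wins.
    """
    totals = defaultdict(float)
    for name, label in predictions.items():
        totals[label] += weights[name]
    return max(totals, key=totals.get)

# Hypothetical per-module predictions and weights (for example, each
# classifier's validation accuracy could be used as its weight).
predictions = {
    "naive_bayes": "defective",
    "neural_network": "clean",
    "decision_tree": "clean",
}
weights = {"naive_bayes": 0.5, "neural_network": 0.3, "decision_tree": 0.15}
winner = weighted_vote(predictions, weights)
print(winner)
```

    With these weights a single strong classifier outvotes two weaker ones, which a plain majority vote would not allow.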

  7. Information Retrieval and Text Mining Technologies for Chemistry.

    Science.gov (United States)

    Krallinger, Martin; Rabal, Obdulia; Lourenço, Anália; Oyarzabal, Julen; Valencia, Alfonso

    2017-06-28

    Efficient access to chemical information contained in scientific literature, patents, technical reports, or the web is a pressing need shared by researchers and patent attorneys from different chemical disciplines. Retrieval of important chemical information in most cases starts with finding relevant documents for a particular chemical compound or family. Targeted retrieval of chemical documents is closely connected to the automatic recognition of chemical entities in the text, which commonly involves the extraction of the entire list of chemicals mentioned in a document, including any associated information. In this Review, we provide a comprehensive and in-depth description of fundamental concepts, technical implementations, and current technologies for meeting these information demands. A strong focus is placed on community challenges addressing systems performance, more particularly CHEMDNER and CHEMDNER patents tasks of BioCreative IV and V, respectively. Considering the growing interest in the construction of automatically annotated chemical knowledge bases that integrate chemical information and biological data, cheminformatics approaches for mapping the extracted chemical names into chemical structures and their subsequent annotation together with text mining applications for linking chemistry with biological information are also presented. Finally, future trends and current challenges are highlighted as a roadmap proposal for research in this emerging field.

  8. Vaccine adverse event text mining system for extracting features from vaccine safety reports.

    Science.gov (United States)

    Botsis, Taxiarchis; Buttolph, Thomas; Nguyen, Michael D; Winiecki, Scott; Woo, Emily Jane; Ball, Robert

    2012-01-01

    To develop and evaluate a text mining system for extracting key clinical features from vaccine adverse event reporting system (VAERS) narratives to aid in the automated review of adverse event reports. Based upon clinical significance to VAERS reviewing physicians, we defined the primary (diagnosis and cause of death) and secondary features (eg, symptoms) for extraction. We built a novel vaccine adverse event text mining (VaeTM) system based on a semantic text mining strategy. The performance of VaeTM was evaluated using a total of 300 VAERS reports in three sequential evaluations of 100 reports each. Moreover, we evaluated the VaeTM contribution to case classification; an information retrieval-based approach was used for the identification of anaphylaxis cases in a set of reports and was compared with two other methods: a dedicated text classifier and an online tool. The performance metrics of VaeTM were text mining metrics: recall, precision and F-measure. We also conducted a qualitative difference analysis and calculated sensitivity and specificity for classification of anaphylaxis cases based on the above three approaches. VaeTM performed best in extracting diagnosis, second level diagnosis, drug, vaccine, and lot number features (lenient F-measure in the third evaluation: 0.897, 0.817, 0.858, 0.874, and 0.914, respectively). In terms of case classification, high sensitivity was achieved (83.1%); this was equal and better compared to the text classifier (83.1%) and the online tool (40.7%), respectively. Our VaeTM implementation of a semantic text mining strategy shows promise in providing accurate and efficient extraction of key features from VAERS narratives.
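
    The metrics named in this abstract (recall, precision, F-measure, sensitivity and specificity) follow standard definitions. A minimal sketch, assuming hypothetical confusion-matrix counts that are not taken from the VaeTM evaluation:

```python
def precision(tp: int, fp: int) -> float:
    """Fraction of extracted items that are correct."""
    return tp / (tp + fp) if (tp + fp) else 0.0

def recall(tp: int, fn: int) -> float:
    """Fraction of true items that were extracted (same as sensitivity)."""
    return tp / (tp + fn) if (tp + fn) else 0.0

def specificity(tn: int, fp: int) -> float:
    """Fraction of true negatives that were correctly rejected."""
    return tn / (tn + fp) if (tn + fp) else 0.0

def f_measure(p: float, r: float, beta: float = 1.0) -> float:
    """F-beta score; beta = 1 gives the harmonic mean of p and r."""
    if p <= 0.0 or r <= 0.0:
        return 0.0
    b2 = beta * beta
    return (1 + b2) * p * r / (b2 * p + r)

# Hypothetical counts, for illustration only.
tp, fp, fn, tn = 40, 10, 20, 30
p = precision(tp, fp)
r = recall(tp, fn)
f1 = f_measure(p, r)
spec = specificity(tn, fp)
print(p, r, f1, spec)
```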

  9. Towards A Model Of Knowledge Extraction Of Text Mining For Palliative Care Patients In Panama.

    Directory of Open Access Journals (Sweden)

    Denis Cedeno Moreno

    2015-08-01

    Information technology solutions offer an innovative way to manage information on hospice patients in hospitals in Panama. The application of text mining techniques to the medical domain, especially to information from the electronic health records of patients in palliative care, is one of the most recent and promising research areas for the analysis of textual data. Text mining is based on extracting new knowledge from unstructured natural language data. We may also create ontologies to describe the terminology and knowledge in a given domain. An ontology formalizes the conceptualization of a domain, which may be general or specific. The knowledge can be used for decision making by health specialists or can help in research on improving the health system.

  10. Extracting information from multiplex networks

    Science.gov (United States)

    Iacovacci, Jacopo; Bianconi, Ginestra

    2016-06-01

    Multiplex networks are generalized network structures that are able to describe networks in which the same set of nodes are connected by links that have different connotations. Multiplex networks are ubiquitous since they describe social, financial, engineering, and biological networks as well. Extending our ability to analyze complex networks to multiplex network structures increases greatly the level of information that is possible to extract from big data. For these reasons, characterizing the centrality of nodes in multiplex networks and finding new ways to solve challenging inference problems defined on multiplex networks are fundamental questions of network science. In this paper, we discuss the relevance of the Multiplex PageRank algorithm for measuring the centrality of nodes in multilayer networks and we characterize the utility of the recently introduced indicator function $\tilde{\Theta}^{S}$ for describing their mesoscale organization and community structure. As working examples for studying these measures, we consider three multiplex network datasets coming from social science.

  11. The study on privacy preserving data mining for information security

    Science.gov (United States)

    Li, Xiaohui

    2012-04-01

    Privacy-preserving data mining has developed rapidly in a short time, but it still faces many challenges. First, the level of privacy is defined differently in different fields, so the measures by which privacy-preserving data mining technology protects private information are not the same. Presenting a unified privacy definition and measure is therefore an urgent issue. Second, most research in privacy-preserving data mining is presently confined to theoretical study.

  12. 77 FR 58170 - Proposed Renewal of Existing Information Collection; Fire Protection (Underground Coal Mines)

    Science.gov (United States)

    2012-09-19

    ... Renewal of Existing Information Collection; Fire Protection (Underground Coal Mines) AGENCY: Mine Safety... INFORMATION: I. Background Fire protection standards for underground coal mines are based on section 311(a) of the Federal Mine Safety and Health Act of 1977 (Mine Act). 30 CFR 75.1100 requires that each coal mine...

  13. A text-mining system for extracting metabolic reactions from full-text articles.

    Science.gov (United States)

    Czarnecki, Jan; Nobeli, Irene; Smith, Adrian M; Shepherd, Adrian J

    2012-07-23

    Increasingly, biological text mining research is focusing on the extraction of complex relationships relevant to the construction and curation of biological networks and pathways. However, one important category of pathway - metabolic pathways - has been largely neglected. Here we present a relatively simple method for extracting metabolic reaction information from free text that scores different permutations of assigned entities (enzymes and metabolites) within a given sentence based on the presence and location of stemmed keywords. This method extends an approach that has proved effective in the context of the extraction of protein-protein interactions. When evaluated on a set of manually-curated metabolic pathways using standard performance criteria, our method performs surprisingly well. Precision and recall rates are comparable to those previously achieved for the well-known protein-protein interaction extraction task. We conclude that automated metabolic pathway construction is more tractable than has often been assumed, and that (as in the case of protein-protein interaction extraction) relatively simple text-mining approaches can prove surprisingly effective. It is hoped that these results will provide an impetus to further research and act as a useful benchmark for judging the performance of more sophisticated methods that are yet to be developed.

  14. Tagline: Information Extraction for Semi-Structured Text Elements in Medical Progress Notes

    Science.gov (United States)

    Finch, Dezon Kile

    2012-01-01

    Text analysis has become an important research activity in the Department of Veterans Affairs (VA). Statistical text mining and natural language processing have been shown to be very effective for extracting useful information from medical documents. However, neither of these techniques is effective at extracting the information stored in…

  15. Model architecture of intelligent data mining oriented urban transportation information

    Science.gov (United States)

    Yang, Bogang; Tao, Yingchun; Sui, Jianbo; Zhang, Feizhou

    2007-06-01

    Aiming to solve practical problems in urban traffic, the paper presents a model architecture for intelligent data mining from a hierarchical view. By using artificial intelligence technologies in the framework, the intelligent data mining technology is improved and becomes better suited to changes in real-time road conditions. It also provides efficient technical support for the distribution, transmission and display of urban transport information.

  16. NAMED ENTITY RECOGNITION FROM BIOMEDICAL TEXT - AN INFORMATION EXTRACTION TASK

    Directory of Open Access Journals (Sweden)

    N. Kanya

    2016-07-01

    Biomedical Text Mining (Bio TM) targets the extraction of significant information from biomedical archives. Bio TM encompasses Information Retrieval (IR) and Information Extraction (IE). Information Retrieval retrieves the relevant biomedical literature documents from repositories such as PubMed and MedLine, based on a search query. The IR process ends with the generation of a corpus of relevant documents retrieved from the publication databases based on the query. The IE task includes preprocessing of the documents, Named Entity Recognition (NER) and Relationship Extraction. This process draws on Natural Language Processing, data mining techniques and machine learning algorithms. The preprocessing task includes tokenization, stop word removal, shallow parsing, and part-of-speech tagging. The NER phase involves recognition of well-defined objects such as genes, proteins or cell lines. This process leads to the next phase, the extraction of relationships. The work was based on the machine learning algorithm Conditional Random Field (CRF).
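
    The first two preprocessing steps listed in this abstract, tokenization and stop word removal, can be sketched with the standard library alone. The stop word list and the example sentence are assumptions made for illustration; shallow parsing, part-of-speech tagging and the CRF model itself are outside the scope of this sketch.

```python
import re

# Tiny illustrative stop word list; real systems use much longer ones.
STOP_WORDS = {"the", "of", "in", "and", "a", "is", "to"}

def tokenize(text):
    """Split text into alphanumeric tokens, keeping hyphenated terms whole."""
    return re.findall(r"[A-Za-z0-9]+(?:-[A-Za-z0-9]+)*", text)

def remove_stop_words(tokens):
    """Drop tokens whose lowercase form is in the stop word list."""
    return [t for t in tokens if t.lower() not in STOP_WORDS]

# Hypothetical input sentence.
sentence = "Mutation of the BRCA1 gene is linked to breast cancer."
tokens = remove_stop_words(tokenize(sentence))
print(tokens)
```

    Case is preserved on purpose, because capitalization patterns such as "BRCA1" are commonly useful features for a downstream entity recognizer.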

  17. Transductive Pattern Learning for Information Extraction

    National Research Council Canada - National Science Library

    McLernon, Brian; Kushmerick, Nicholas

    2006-01-01

    .... We present TPLEX, a semi-supervised learning algorithm for information extraction that can acquire extraction patterns from a small amount of labelled text in conjunction with a large amount of unlabelled text...

  18. A malware detection scheme based on mining format information.

    Science.gov (United States)

    Bai, Jinrong; Wang, Junfeng; Zou, Guozhong

    2014-01-01

    Malware has become one of the most serious threats to computer information systems, and current malware detection technology still has very significant limitations. In this paper, we propose a malware detection approach based on mining the format information of PE (portable executable) files. Based on in-depth analysis of the static format information of the PE files, we extracted 197 features from format information of PE files and applied feature selection methods to reduce the dimensionality of the features and achieve acceptably high performance. When the selected features were trained using classification algorithms, the results of our experiments indicate that the accuracy of the top classification algorithm is 99.1% and the value of the AUC is 0.998. We designed three experiments to evaluate the performance of our detection scheme and its ability to detect unknown and new malware. Although the experimental results of identifying new malware are not perfect, our method is still able to identify 97.6% of new malware with a 1.3% false positive rate.
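
    As an illustration of what "format information" of a PE file can look like, the sketch below reads a few fields of the COFF file header with Python's `struct` module. The abstract does not list the 197 features used, so the chosen fields are an assumption, and the input here is a synthetic byte buffer rather than a real executable.

```python
import struct

def coff_header_features(data: bytes) -> dict:
    """Extract a few COFF header fields from the raw bytes of a PE file."""
    if data[:2] != b"MZ":
        raise ValueError("missing DOS 'MZ' signature")
    # Offset 0x3C of the DOS header stores the position of the PE signature.
    (pe_offset,) = struct.unpack_from("<I", data, 0x3C)
    if data[pe_offset:pe_offset + 4] != b"PE\x00\x00":
        raise ValueError("missing PE signature")
    # The 20-byte COFF file header follows the 4-byte signature.
    (machine, n_sections, timestamp, _symtab, _n_symbols,
     opt_header_size, characteristics) = struct.unpack_from(
        "<HHIIIHH", data, pe_offset + 4)
    return {
        "machine": machine,
        "number_of_sections": n_sections,
        "time_date_stamp": timestamp,
        "size_of_optional_header": opt_header_size,
        "characteristics": characteristics,
    }

# Build a minimal synthetic header instead of reading a real file.
buf = bytearray(0x80 + 24)
buf[0:2] = b"MZ"
struct.pack_into("<I", buf, 0x3C, 0x80)
buf[0x80:0x84] = b"PE\x00\x00"
struct.pack_into("<HHIIIHH", buf, 0x84, 0x14C, 3, 0, 0, 0, 0xE0, 0x0102)
features = coff_header_features(bytes(buf))
print(features)
```

    Feature vectors of this kind would then be passed to feature selection and a classifier, steps that this sketch does not cover.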

  19. Natural radioactivity in mining and hydrocarbon extraction industry. Vol. 1

    Energy Technology Data Exchange (ETDEWEB)

    Testa, C; Desideri, D; Meli, M A; Roselli, C [General Chemistry Institute, Urbino University, 61029 Urbino, (Italy)]

    1996-03-01

    Water and soil natural radioactivity is a well known phenomenon which can be produced by variable concentrations of uranium and thorium series radionuclides. Generally, the relevant radiological hazard is not important; however, some radiation protection problems can occur in particular industrial processes involving the treatment of large quantities of materials. In this case a high concentration of radioactive substance (NORM: naturally occurring radioactive materials) can be found at special points of the plant, in the manufacture by-products and in the waters. Sometimes the natural radioactivity concentration can be so high as to raise radiation protection problems which can be assimilated in a sense to the ones faced in the presence, handling, and disposal of non-sealed radioactive sources. In this paper the following mining and hydrocarbon extraction plants were particularly taken into account: (a) industries using zircon sands to produce refractory and ceramic materials; (b) phosphorites manufacture to prepare phosphoric acids, plasters and fertilizers; (c) hydrocarbon extraction and treatment processes where formations of low specific activity (L.S.A.) scales and sludges are produced. The relevant results and the possible radiation protection risks for the professionally exposed staff will be reported. A special emphasis will be given to some African phosphorites (Boucraa, Togo, Morocco), and L.S.A. scales (Tunisia, Congo, Egypt). 4 figs., 5 tabs.

  20. Natural radioactivity in mining and hydrocarbon extraction industry. Vol. 1

    International Nuclear Information System (INIS)

    Testa, C.; Desideri, D.; Meli, M.A.; Roselli, C.

    1996-01-01

    Natural radioactivity in water and soil is a well-known phenomenon which can be produced by variable concentrations of uranium and thorium series radionuclides. Generally, the associated radiological hazard is not significant; however, some radiation protection problems can occur in particular industrial processes involving the treatment of large quantities of materials. In this case a high concentration of radioactive substances (NORM: naturally occurring radioactive materials) can be found at particular points of the plant, in the manufacturing by-products and in the waters. Sometimes the natural radioactivity concentration can be so high as to raise radiation protection problems comparable to those faced in the presence, handling, and disposal of non-sealed radioactive sources. In this paper the following mining and hydrocarbon extraction plants were specifically considered: a) industries using zircon sands to produce refractory and ceramic materials; b) phosphorite processing to prepare phosphoric acids, plasters and fertilizers; c) hydrocarbon extraction and treatment processes in which low specific activity (L.S.A.) scales and sludges are formed. The relevant results and the possible radiation protection risks for occupationally exposed staff will be reported. Special emphasis will be given to some African phosphorites (Boucraa, Togo, Morocco) and L.S.A. scales (Tunisia, Congo, Egypt). 4 figs., 5 tabs.

  1. Multiple-feature extracting modules based leak mining system design.

    Science.gov (United States)

    Cho, Ying-Chiang; Pan, Jen-Yi

    2013-01-01

    Over the years, human dependence on the Internet has increased dramatically. A large amount of information is placed on the Internet and retrieved from it daily, which makes web security in terms of online information a major concern. In recent years, the most problematic issues in web security have been e-mail address leakage and SQL injection attacks. There are many possible causes of information leakage, such as inadequate precautions during the programming process, which lead to the leakage of e-mail addresses entered online or insufficient protection of database information, a loophole that enables malicious users to steal online content. In this paper, we implement a crawler mining system that is equipped with SQL injection vulnerability detection, by means of an algorithm developed for the web crawler. In addition, we analyze portal sites of the governments of various countries or regions in order to investigate the information leaking status of each site. Subsequently, we analyze the database structure and content of each site, using the data collected. Thus, we make use of practical verification in order to focus on information security and privacy through black-box testing.

  2. An unsupervised text mining method for relation extraction from biomedical literature.

    Directory of Open Access Journals (Sweden)

    Changqin Quan

    The wealth of interaction information provided in biomedical articles has motivated the implementation of text mining approaches to automatically extract biomedical relations. This paper presents an unsupervised method based on pattern clustering and sentence parsing for biomedical relation extraction. The pattern clustering algorithm is based on a polynomial kernel method, which identifies interaction words from unlabeled data; these interaction words are then used in relation extraction between entity pairs. Dependency parsing and phrase structure parsing are combined for relation extraction. Based on the semi-supervised KNN algorithm, we extend the proposed unsupervised approach to a semi-supervised approach by combining pattern clustering, dependency parsing and phrase structure parsing rules. We evaluated the approaches on two different tasks: (1) protein-protein interaction extraction, and (2) gene-suicide association extraction. The evaluation of task (1) on the benchmark dataset (AImed corpus) showed that our proposed unsupervised approach outperformed three supervised methods: a rule-based, an SVM-based, and a kernel-based method. The proposed semi-supervised approach is superior to the existing semi-supervised methods. The evaluation of gene-suicide association extraction on a smaller dataset from the Genetic Association Database and a larger dataset from publicly available PubMed showed that the proposed unsupervised and semi-supervised methods achieved much higher F-scores than the co-occurrence based method.
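
    The co-occurrence based method that serves as the comparison baseline in this abstract can be pictured with a short sketch: two entities are treated as related whenever they appear in the same sentence. This is not the authors' pattern-clustering method; the entity names, the sentences, and the function name `cooccurrence_pairs` are made up for illustration.

```python
from collections import Counter
from itertools import combinations

def cooccurrence_pairs(sentences, entities):
    """Count how often each pair of known entities shares a sentence."""
    counts = Counter()
    for sentence in sentences:
        # Crude tokenisation: lowercase, strip commas and full stops, split on spaces.
        tokens = set(sentence.lower().replace(",", " ").replace(".", " ").split())
        present = sorted(e for e in entities if e.lower() in tokens)
        counts.update(combinations(present, 2))
    return counts

# Hypothetical entity lexicon and sentences.
entities = {"ProtA", "ProtB", "ProtC"}
sentences = [
    "ProtA binds ProtB in vitro.",
    "ProtA and ProtC were both measured.",
    "ProtB phosphorylates ProtA.",
]
pairs = cooccurrence_pairs(sentences, entities)
print(pairs.most_common())
```

    The baseline cannot tell the second sentence, which asserts no interaction, from the other two. Parsing-based methods such as the one described above aim to close that gap.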

  3. An Application for Data Preprocessing and Models Extractions in Web Usage Mining

    Directory of Open Access Journals (Sweden)

    Claudia Elena DINUCA

    2011-11-01

    Web servers worldwide generate a vast amount of information on web users’ browsing activities. Several researchers have studied these so-called clickstream or web access log data to better understand and characterize web users. The goal of this application is to analyze user behaviour by mining enriched web access log data. With the continued growth and proliferation of e-commerce, Web services, and Web-based information systems, the volumes of clickstream and user data collected by Web-based organizations in their daily operations have reached astronomical proportions. This information can be exploited in various ways, such as enhancing the effectiveness of websites or developing directed web marketing campaigns. The discovered patterns are usually represented as collections of pages, objects, or resources that are frequently accessed by groups of users with common needs or interests. In this paper we describe how the application for data preprocessing and for extracting different data models from web log data was implemented, using association finding as a data mining technique to extract potentially useful knowledge from web usage data. We find different navigation patterns by analysing the log files of the website. I implemented the application in Java using the NetBeans IDE. As an example, I used the log file data from a commercial website, www.nice-layouts.com.
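
    A common preprocessing step in web usage mining is to parse access-log lines and group each visitor's requests into sessions. The sketch below shows one way this could be done. It is written in Python rather than the Java of the original application, and the Common Log Format layout, the 30-minute inactivity gap, the sample lines, and the function name `sessionize` are assumptions not taken from the paper.

```python
import re
from collections import defaultdict
from datetime import datetime, timedelta

LOG_PATTERN = re.compile(
    r'(?P<host>\S+) \S+ \S+ \[(?P<time>[^\]]+)\] '
    r'"(?:GET|POST) (?P<path>\S+)[^"]*" (?P<status>\d{3}) \S+'
)

def sessionize(lines, gap=timedelta(minutes=30)):
    """Group successful requests per host into sessions split on inactivity."""
    visits = defaultdict(list)
    for line in lines:
        m = LOG_PATTERN.match(line)
        if m is None or not m["status"].startswith("2"):
            continue  # drop malformed lines and non-2xx responses
        when = datetime.strptime(m["time"], "%d/%b/%Y:%H:%M:%S %z")
        visits[m["host"]].append((when, m["path"]))
    sessions = []
    for host, hits in visits.items():
        hits.sort()
        current = [hits[0][1]]
        for (prev, _), (when, path) in zip(hits, hits[1:]):
            if when - prev > gap:  # long pause: close the session
                sessions.append((host, current))
                current = []
            current.append(path)
        sessions.append((host, current))
    return sessions

# Hypothetical log lines in Common Log Format.
log_lines = [
    '10.0.0.1 - - [10/Oct/2011:13:55:36 +0000] "GET /index.html HTTP/1.1" 200 2326',
    '10.0.0.1 - - [10/Oct/2011:13:57:02 +0000] "GET /layouts.html HTTP/1.1" 200 1840',
    '10.0.0.1 - - [10/Oct/2011:15:10:00 +0000] "GET /index.html HTTP/1.1" 200 2326',
    '10.0.0.2 - - [10/Oct/2011:14:00:00 +0000] "GET /missing.html HTTP/1.1" 404 512',
]
sessions = sessionize(log_lines)
print(sessions)
```

    The resulting page sequences are the kind of input that association or navigation-pattern mining would then consume.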

  4. Mining

    Directory of Open Access Journals (Sweden)

    Khairullah Khan

    2014-09-01

    Opinion mining is an interesting area of research because of its applications in various fields. Collecting opinions of people about products and about social and political events and problems through the Web is becoming increasingly popular every day. The opinions of users are helpful for the public and for stakeholders when making certain decisions. Opinion mining is a way to retrieve information through search engines, Web blogs and social networks. Because of the huge number of reviews in the form of unstructured text, it is impossible to summarize the information manually. Accordingly, efficient computational methods are needed for mining and summarizing the reviews from corpuses and Web documents. This study presents a systematic literature survey regarding the computational techniques, models and algorithms for mining opinion components from unstructured reviews.

  5. Information Extraction for Social Media

    NARCIS (Netherlands)

    Habib, M. B.; Keulen, M. van

    2014-01-01

    The rapid growth in IT in the last two decades has led to a growth in the amount of information available online. A new style for sharing information is social media. Social media is a continuously instantly updated source of information. In this position paper, we propose a framework for

  6. Mining for solutions, extracting discord: corporate social responsibility and Canadian mining companies in Latin America

    OpenAIRE

    Stevens, Julie Ann

    2009-01-01

    While the mining industry generates many benefits to society, the industry has in some cases had a detrimental impact on affected communities. This paradox, manifested in the unequal distribution of costs and benefits amongst stakeholders, has prompted widespread scrutiny of the mining industry. Critique of the industry has questioned whether mining provides an economically, environmentally and socially sustainable model of development. Mining companies are increasingly adopting Corporate Soc...

  7. Comparison of three-stage sequential extraction and toxicity characteristic leaching tests to evaluate metal mobility in mining wastes

    International Nuclear Information System (INIS)

    Margui, E.; Salvado, V.; Queralt, I.; Hidalgo, M.

    2004-01-01

    Abandoned mining sites contain residues from ore processing operations that are characterised by high concentrations of heavy metals. The form in which a metal exists strongly influences its mobility and, thus, the effects on the environment. Operational methods of speciation analysis, such as the use of sequential extraction procedures, are commonly applied. In this work, the modified three-stage sequential extraction procedure proposed by the BCR (now the Standards, Measurements and Testing Programme) was applied for the fractionation of Ni, Zn, Pb and Cd in mining wastes from old Pb-Zn mining areas located in the Val d'Aran (NE Spain) and Cartagena (SE Spain). Analyses of the extracts were performed by inductively coupled plasma atomic emission spectrometry and electrothermal atomic absorption spectrometry. The procedure was evaluated by using a certified reference material, BCR-701. The results of the partitioning study indicate that more easily mobilised forms (acid exchangeable) were predominant for Cd and Zn, particularly in the sample from Cartagena. In contrast, the largest amount of lead was associated with the iron and manganese oxide fractions. On the other hand, the applicability of lixiviation tests commonly used to evaluate the leaching of toxic species from landfill disposal (US-EPA Toxicity Characteristic Leaching Procedure and DIN 38414-S4) to mining wastes was also investigated and the obtained results compared with the information on metal mobility derivable from the application of the three-stage sequential extraction procedure

  8. Mining Matters : Natural Resource Extraction and Local Business Constraints

    NARCIS (Netherlands)

    de Haas, Ralph; Poelhekke, Steven

    2016-01-01

    We estimate the impact of local mining activity on the business constraints experienced by 22,150 firms across eight resource-rich countries. We find that with the presence of active mines, the business environment in the immediate vicinity (<20 km) of a firm deteriorates but business constraints of

  9. Environmental Impacts and Health Aspects in the Mining Industry. A Comparative Study of the Mining and Extraction of Uranium, Copper and Gold

    Energy Technology Data Exchange (ETDEWEB)

    Nilsson, Jenny-Ann; Randhem, Johan

    2008-07-01

    This thesis work has analysed environmental impacts and health aspects in the mining industry of copper, uranium and gold with the aim of determining the relative performance, in a given set of parameters, of the uranium mining industry. A selection of fifteen active mining operations in Australia, Canada, Namibia, South Africa, and the United States of America constitute the subject of this study. The project includes detailed background information about mineral extraction methods, the investigated minerals and the mining operations together with descriptions of the general main health hazards and environmental impacts connected to mining. The mineral operations are investigated in a cradle to gate analysis for the year of activity of 2007 using the economic value of the product at the gate as functional unit. Primary data has been collected from environmental reports, company web pages, national databases and through personal contact with company representatives. The subsequent analysis examines the collected data from a resource consumption, human health and ecological consequences point of view. Using the Life Cycle Impact Assessment methodology of characterisation, primary data of environmental loads have been converted to a synoptic set of environmental impacts. For radiation and tailings issues, a more general approach is used to address the problem. Based on the collected data and the investigated parameters, the results indicate a presumptive relative disadvantageous result for the uranium mining industry in terms of health aspects but an apparent favourable relative result in terms of environmental impacts. Given the prerequisites of this study, it is not feasible to draw any unambiguous conclusions. 
    This is mainly due to inadequate data availability from mine sites (especially in areas concerning tailings management) and to difficulties with the relative valuation of specific performance parameters, in particular radiation

  10. Environmental Impacts and Health Aspects in the Mining Industry. A Comparative Study of the Mining and Extraction of Uranium, Copper and Gold

    International Nuclear Information System (INIS)

    Nilsson, Jenny-Ann; Randhem, Johan

    2008-01-01

    This thesis work has analysed environmental impacts and health aspects in the mining industry of copper, uranium and gold with the aim of determining the relative performance, in a given set of parameters, of the uranium mining industry. A selection of fifteen active mining operations in Australia, Canada, Namibia, South Africa, and the United States of America constitute the subject of this study. The project includes detailed background information about mineral extraction methods, the investigated minerals and the mining operations together with descriptions of the general main health hazards and environmental impacts connected to mining. The mineral operations are investigated in a cradle to gate analysis for the year of activity of 2007 using the economic value of the product at the gate as functional unit. Primary data has been collected from environmental reports, company web pages, national databases and through personal contact with company representatives. The subsequent analysis examines the collected data from a resource consumption, human health and ecological consequences point of view. Using the Life Cycle Impact Assessment methodology of characterisation, primary data of environmental loads have been converted to a synoptic set of environmental impacts. For radiation and tailings issues, a more general approach is used to address the problem. Based on the collected data and the investigated parameters, the results indicate a presumptive relative disadvantageous result for the uranium mining industry in terms of health aspects but an apparent favourable relative result in terms of environmental impacts. Given the prerequisites of this study, it is not feasible to draw any unambiguous conclusions. 
    This is mainly due to inadequate data availability from mine sites (especially in areas concerning tailings management) and to difficulties with the relative valuation of specific performance parameters, in particular radiation

  11. Integrated system of production information processing for surface mines

    Energy Technology Data Exchange (ETDEWEB)

    Li, K.; Wang, S.; Zeng, Z.; Wei, J.; Ren, Z. [China University of Mining and Technology, Xuzhou (China). Dept of Mining Engineering

    2000-09-01

    Based on the concepts of geological statistics, mathematical programming, conditional simulation and systems engineering, and on the features and duties of each main department in surface mine production, an integrated system for surface mine production information was systematically studied and developed using data warehousing, CAD, object-oriented and system integration technologies. The system systematizes and automates information management, data processing, optimization computing and plotting. This paper describes its overall objectives, system design, structure and functions, and some key techniques. 2 refs., 3 figs.

  12. Information extraction from multi-institutional radiology reports.

    Science.gov (United States)

    Hassanpour, Saeed; Langlotz, Curtis P

    2016-01-01

    The radiology report is the most important source of clinical imaging information. It documents critical information about the patient's health and the radiologist's interpretation of medical findings. It also communicates information to the referring physicians and records that information for future clinical and research use. Although efforts to structure some radiology report information through predefined templates are beginning to bear fruit, a large portion of radiology report information is entered in free text. The free text format is a major obstacle for rapid extraction and subsequent use of information by clinicians, researchers, and healthcare information systems. This difficulty is due to the ambiguity and subtlety of natural language, complexity of described images, and variations among different radiologists and healthcare organizations. As a result, radiology reports are used only once by the clinician who ordered the study and rarely are used again for research and data mining. In this work, machine learning techniques and a large multi-institutional radiology report repository are used to extract the semantics of the radiology report and overcome the barriers to the re-use of radiology report information in clinical research and other healthcare applications. We describe a machine learning system to annotate radiology reports and extract report contents according to an information model. This information model covers the majority of clinically significant contents in radiology reports and is applicable to a wide variety of radiology study types. Our automated approach uses discriminative sequence classifiers for named-entity recognition to extract and organize clinically significant terms and phrases consistent with the information model. 
    We evaluated our information extraction system on 150 radiology reports from three major healthcare organizations and compared its results to a commonly used non-machine learning information extraction method. We

  13. DiMeX: A Text Mining System for Mutation-Disease Association Extraction.

    Science.gov (United States)

    Mahmood, A S M Ashique; Wu, Tsung-Jung; Mazumder, Raja; Vijay-Shanker, K

    2016-01-01

    The number of published articles describing associations between mutations and diseases is increasing at a fast pace. There is a pressing need to gather such mutation-disease associations into public knowledge bases, but manual curation slows down the growth of such databases. We have addressed this problem by developing a text-mining system (DiMeX) to extract mutation to disease associations from publication abstracts. DiMeX consists of a series of natural language processing modules that preprocess input text and apply syntactic and semantic patterns to extract mutation-disease associations. DiMeX achieves high precision and recall with F-scores of 0.88, 0.91 and 0.89 when evaluated on three different datasets for mutation-disease associations. DiMeX includes a separate component that extracts mutation mentions in text and associates them with genes. This component has been also evaluated on different datasets and shown to achieve state-of-the-art performance. The results indicate that our system outperforms the existing mutation-disease association tools, addressing the low precision problems suffered by most approaches. DiMeX was applied on a large set of abstracts from Medline to extract mutation-disease associations, as well as other relevant information including patient/cohort size and population data. The results are stored in a database that can be queried and downloaded at http://biotm.cis.udel.edu/dimex/. We conclude that this high-throughput text-mining approach has the potential to significantly assist researchers and curators to enrich mutation databases.
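
    The abstract mentions a component that extracts mutation mentions from text. A very small sketch of that idea, restricted to protein-level point substitutions written in one-letter form (for example V600E), is shown below. The regular expression, the function name `find_point_mutations`, and the sample sentence are illustrative assumptions and do not reproduce DiMeX's actual patterns.

```python
import re

AMINO = "ACDEFGHIKLMNPQRSTVWY"  # one-letter codes of the 20 standard amino acids
MUTATION = re.compile(rf"\b([{AMINO}])(\d+)([{AMINO}])\b")

def find_point_mutations(text):
    """Return (wild_type, position, mutant) tuples for mentions such as 'V600E'."""
    return [(m.group(1), int(m.group(2)), m.group(3))
            for m in MUTATION.finditer(text)]

# Made-up sentence used only to exercise the pattern.
sentence = "The BRAF V600E substitution was detected in 12 patients; R175H was found in 3."
mentions = find_point_mutations(sentence)
print(mentions)
```

    A production system would also need three-letter forms, nucleotide-level notation, and linking of each mention to a gene and a disease, which is where the syntactic and semantic patterns described above come in.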

  14. A Two-Step Resume Information Extraction Algorithm

    Directory of Open Access Journals (Sweden)

    Jie Chen

    2018-01-01

    With the rapid growth of Internet-based recruiting, recruiting systems hold a great number of personal resumes. To gain more attention from recruiters, most resumes are written in diverse formats, including varying font sizes, font colours, and table cells. However, this diversity of format hampers data mining tasks such as resume information extraction, automatic job matching, and candidate ranking. Supervised methods and rule-based methods have been proposed to extract facts from resumes, but they rely strongly on hierarchical structure information and large amounts of labelled data, which are hard to collect in practice. In this paper, we propose a two-step resume information extraction approach. In the first step, the raw text of a resume is segmented into different resume blocks. To achieve this, we design a novel feature, Writing Style, to model sentence syntax information. Besides word index and punctuation index, the word lexical attribute and the prediction results of classifiers are included in Writing Style. In the second step, multiple classifiers are employed to identify different attributes of fact information in resumes. Experimental results on a real-world dataset show that the algorithm is feasible and effective.

  15. Symposium 'geology, mining and extractive processing of uranium, with special reference to Europe'

    International Nuclear Information System (INIS)

    Pietsch, H.B.

    1977-01-01

    This review of the symposium 'Geology, mining and extractive processing of uranium' gives a survey from the point of view of ore processing rather than exploration. A reason for the uranium consumption assumed is given, and uranium deposits and availability, methods of exploration, and interesting facts on uranium extraction from ores are gone into. (HK) [de

  16. Mining of hospital laboratory information systems

    DEFF Research Database (Denmark)

    Søeby, Karen; Jensen, Peter Bjødstrup; Werge, Thomas

    2015-01-01

    of hospital laboratory data as a source of information, we analyzed enzymatic plasma creatinine as a model analyte in two large pediatric hospital samples. Methods: Plasma creatinine measurements from 9700 children aged 0-18 years were obtained from hospital laboratory databases and partitioned into high...... in creatinine levels at different time points after birth and around the early teens, which challenges the establishment and usefulness of reference intervals in those age groups. Conclusions: The study documents that hospital laboratory data may inform on the developmental aspects of creatinine, on periods...... with pronounced heterogeneity and valid reference intervals. Furthermore, part of the heterogeneity in creatinine distribution is likely due to differences in biological and chronological age of children and should be considered when using age-specific reference intervals....

  17. Recommender system based on scarce information mining.

    Science.gov (United States)

    Lu, Wei; Chung, Fu-Lai; Lai, Kunfeng; Zhang, Liang

    2017-09-01

    Guessing what users may like is now a typical interface for video recommendation. Nowadays, highly popular user-generated content sites provide various sources of information, such as tags, for recommendation tasks. Motivated by a real-world online video recommendation problem, this work targets the long-tail phenomenon of user behavior and the sparsity of item features. A personalized compound recommendation framework for online video recommendation, called the Dirichlet mixture probit model for information scarcity (DPIS), is hence proposed. Assuming that each clicking sample is generated from a representation of user preferences, DPIS models the sample-level topic proportions as a multinomial item vector and utilizes topical clustering on the user part for recommendation through a probit classifier. As demonstrated by the real-world application, the proposed DPIS achieves better accuracy, perplexity and diversity in coverage than traditional methods. Copyright © 2017 Elsevier Ltd. All rights reserved.

  18. 77 FR 26046 - Proposed Extension of Existing Information Collection; Ground Control for Surface Coal Mines and...

    Science.gov (United States)

    2012-05-02

    ... Extension of Existing Information Collection; Ground Control for Surface Coal Mines and Surface Work Areas of Underground Coal Mines AGENCY: Mine Safety and Health Administration, Labor. ACTION: Request for... inspections and investigations in coal or other mines shall be made each year for the purposes of, among other...

  19. Utilization of Integrated Geophysical Techniques to Delineate the Extraction of Mining Bench of Ornamental Rocks (Marble)

    Directory of Open Access Journals (Sweden)

    Julián Martínez

    2017-12-01

    Low yields in ornamental rock mining remain one of the most important problems in this industry. This fact is usually associated with the presence of anisotropies in the rock, which make it difficult to extract the blocks. Optimised planning of the exploitation, together with an improved geological understanding of the deposit, could increase these yields. In this work, marble mining in Macael (Spain) was studied to test the capacity of non-destructive geophysical prospecting methods (GPR and ERI) as tools to characterize the geology of the deposit. It is well known that the ERI method provides a greater penetration depth. With this technique, it is possible to distinguish the boundaries between the marble and the underlying micaschists and the morphology of the unit to be exploited, and even to identify fracture zones. Therefore, this technique could be used in the early stages of research to estimate the reserves of the deposit. The GPR methodology, with a lower penetration depth, is able to offer more detailed information. Specifically, it detects lateral and vertical changes of the facies inside the marble unit, as well as the anisotropies of the rock (fractures or holes). This technique would be suitable for use in a second stage of research. On the one hand, it is very useful for characterizing the texture and fabric of the rock, which allows us to determine in advance its properties and, therefore, its quality for ornamental use. On the other hand, locating anisotropies with the GPR technique makes it possible to improve the planning of the rock exploitation in order to increase yields. Both integrated geophysical techniques are effective for assessing the quality of ornamental rock and thus can serve as useful tools in mine planning to improve yields and costs.

  20. Extracting Information from Multimedia Meeting Collections

    OpenAIRE

    Gatica-Perez, Daniel; Zhang, Dong; Bengio, Samy

    2005-01-01

    Multimedia meeting collections, composed of unedited audio and video streams, handwritten notes, slides, and electronic documents that jointly constitute a raw record of complex human interaction processes in the workplace, have attracted interest due to the increasing feasibility of recording them in large quantities, by the opportunities for information access and retrieval applications derived from the automatic extraction of relevant meeting information, and by the challenges that the ext...

  1. DKIE: Open Source Information Extraction for Danish

    DEFF Research Database (Denmark)

    Derczynski, Leon; Field, Camilla Vilhelmsen; Bøgh, Kenneth Sejdenfaden

    2014-01-01

    Danish is a major Scandinavian language spoken daily by around six million people. However, it lacks a unified, open set of NLP tools. This demonstration will introduce DKIE, an extensible open-source toolkit for processing Danish text. We implement an information extraction architecture for Danish...

  2. Selectivity assessment of an arsenic sequential extraction procedure for evaluating mobility in mine wastes

    International Nuclear Information System (INIS)

    Drahota, Petr; Grösslová, Zuzana; Kindlová, Helena

    2014-01-01

    Highlights: • Extraction efficiency and selectivity of phosphate and oxalate were tested. • Pure As-bearing mineral phases and mine wastes were used. • The reagents were found to be specific and selective for most major forms of As. • An optimized sequential extraction scheme for mine wastes has been developed. • It has been tested on model mineral mixtures and natural mine waste materials. - Abstract: An optimized sequential extraction (SE) scheme for mine waste materials has been developed and tested for As partitioning over a range of pure As-bearing mineral phases, their model mixtures, and natural mine waste materials. This optimized SE procedure employs five extraction steps: (1) nitrogen-purged deionized water, 10 h; (2) 0.01 M NH4H2PO4, 16 h; (3) 0.2 M NH4-oxalate in the dark, pH 3, 2 h; (4) 0.2 M NH4-oxalate, pH 3/80 °C, 4 h; (5) KClO3/HCl/HNO3 digestion. Selectivity and specificity tests on natural mine wastes and major pure As-bearing mineral phases showed that these As fractions appear to be primarily associated with: (1) readily soluble phases; (2) adsorbed As; (3) amorphous and poorly crystalline arsenates, oxides and hydroxosulfates of Fe; (4) well-crystalline arsenates, oxides, and hydroxosulfates of Fe; and (5) sulfides and arsenides. The specificity and selectivity of the extractants and the reproducibility of the optimized SE procedure were further verified with artificial model mineral mixtures and different natural mine waste materials. Partitioning data for extraction steps 3, 4, and 5 showed good agreement with those calculated for the model mineral mixtures (<15% difference), as well as with those expected in different natural mine waste materials. The sum of the As recovered in the different extractant pools was not significantly different (89–112%) from the results for acid digestion. This suggests that the optimized SE scheme can reliably be employed for As partitioning in mine waste materials
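
    The recovery check quoted above (the sum of the five extracted fractions compared with a total acid digestion) is simple arithmetic, sketched below. The concentrations are invented for illustration, and the function names `recovery_percent` and `fraction_shares` are hypothetical; only the 89–112% range comes from the abstract.

```python
def recovery_percent(fractions_mg_kg, total_digestion_mg_kg):
    """Sum of sequentially extracted As as a percentage of the total-digestion value."""
    return 100.0 * sum(fractions_mg_kg) / total_digestion_mg_kg

def fraction_shares(fractions_mg_kg):
    """Share of the extracted As found in each step, in percent."""
    extracted = sum(fractions_mg_kg)
    return [100.0 * f / extracted for f in fractions_mg_kg]

# Invented As concentrations (mg/kg) for extraction steps 1-5 and for total digestion.
steps = [12.0, 85.0, 430.0, 260.0, 153.0]
total = 1000.0

recovery = recovery_percent(steps, total)
shares = fraction_shares(steps)
print(f"recovery: {recovery:.1f}%")
print("within reported range:", 89.0 <= recovery <= 112.0)
```

    A recovery far outside the reported range would suggest losses between steps or contamination, so this sum is a routine sanity check before interpreting the individual fractions.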

  3. Ergonomic, psychosocial factors and risks at work in informal mining

    Directory of Open Access Journals (Sweden)

    Milena Nunes Alves de Sousa

    2015-09-01

    The goal of this study was to identify ergonomic and psychosocial factors, and risks at informal work in the mining sector of the State of Paraíba, Brazil, from the miners' perspective. A cross-sectional and descriptive study was conducted with 371 informal mining workers. They responded to two questionnaires assessing the work performed in three dimensions: ergonomic factors; psychosocial factors; and occupational risks. The scores of the items of each dimension were added so that the higher the score, the lower the workers' satisfaction with the area investigated. The results indicated that noise was common in the working environment (66%). Most workers (54.7%) pointed out that the work was too hard and that it required attention and reasoning (85.7%). The workers emphasized the lack of training for working in mining (59.3%), and few of them regarded the maintenance of the workplace as a component to prevent lumbago (32.3%). Risk of accidents was pointed out as the factor that needed increased attention in daily work (56.6%). All occupational risks were mentioned, including physical and chemical risks. There was significant correlation between age and occupational risks, indicating that the greater the age, the greater the perception of harmful agents (ρ = -0.23; p < 0.01). In the end, it was observed that, to a greater or lesser degree, all workers perceived ergonomic and psychosocial factors, and risks in informal mining. Length of service and age were the features that interfered significantly with the understanding of those factors and occupational risks.

  4. Planning maximum extraction of a safety pillar in the Most surface mine

    Energy Technology Data Exchange (ETDEWEB)

    Helis, P; Hess, L; Kubiznak, K [SHR - Banske Projekty, Teplice (Czechoslovakia)

    1990-11-01

    Discusses planned coal surface mining in the Most mine in the area of the Hnevin safety pillar with coal reserves amounting to about 7.5 Mt. The following aspects are evaluated: coal reserves and their distribution in the pillar, coal seam thickness and dip angles, water conditions, water influx rates, mechanical properties of the overburden and strata situated in the seam floor, slope stability and hazards of landslides, effects of water influx on landslide hazards, types of bucket wheel excavators used for overburden removal and mining, types of belt conveyors used for mine haulage, stackers, position of mining equipment in the mine. A scheme developed by Banske Projekty Teplice for partial extraction of the safety pillar would result in extraction of 4.5 Mt coal. About 1.7 Mt coal would be left in a safety coal layer about 10.0 m thick situated in the floor in zones with landslide hazards. KU 300 bucket wheel excavators, belt conveyors 1,200 mm wide and ZP 2,500 stackers would be used. 4 refs.

  5. Application of the method of optimum increase of Carboniferous gas exploitation for the determination of its extractable amount from the space of attenuated plant of the Paskov Mine

    Directory of Open Access Journals (Sweden)

    Dragon Vladimír

    2003-09-01

    Full Text Available A method for optimally increasing the extraction of Carboniferous gas is presented; it can be applied in any mine of the Ostrava-Karviná Mining District (OKMD) during the current period of restructuring and mining attenuation.

  6. Mining Heterogeneous Information Networks by Exploring the Power of Links

    Science.gov (United States)

    Han, Jiawei

    Knowledge is power but for interrelated data, knowledge is often hidden in massive links in heterogeneous information networks. We explore the power of links at mining heterogeneous information networks with several interesting tasks, including link-based object distinction, veracity analysis, multidimensional online analytical processing of heterogeneous information networks, and rank-based clustering. Some recent results of our research that explore the crucial information hidden in links will be introduced, including (1) Distinct for object distinction analysis, (2) TruthFinder for veracity analysis, (3) Infonet-OLAP for online analytical processing of information networks, and (4) RankClus for integrated ranking-based clustering. We also discuss some of our on-going studies in this direction.

  7. Modeling stress–strain state of rock mass under mining of complex-shape extraction pillar

    Science.gov (United States)

    Fryanov, VN; Pavlova, LD

    2018-03-01

    Based on the results of numerical modeling of stresses and strains in rock mass, geomechanical parameters of development workings adjacent to coal face operation area are provided for multi-entry preparation and extraction of flat seams with production faces of variable length. The negative effects on the geomechanical situation during the transition from the longwall to shortwall mining in a fully mechanized extraction face are found.

  8. Unsupervised information extraction by text segmentation

    CERN Document Server

    Cortez, Eli

    2013-01-01

    A new unsupervised approach to the problem of Information Extraction by Text Segmentation (IETS) is proposed, implemented and evaluated herein. The authors' approach relies on information available on pre-existing data to learn how to associate segments in the input string with attributes of a given domain relying on a very effective set of content-based features. The effectiveness of the content-based features is also exploited to directly learn from test data structure-based features, with no previous human-driven training, a feature unique to the presented approach. Based on the approach, a

  9. Optimizing Transport in Surface Mines, Taking into Account the Quality of Extracted Raw Ore

    Directory of Open Access Journals (Sweden)

    Marian Šofranko

    2012-12-01

    Full Text Available This article concerns the problem of appropriately allocating transport mechanisms for mining minerals from individual territories. In the following sections of the article, a model solution is presented using a newly created program for optimization of transport, taking into account the required quality of extracted raw ore. This process is carried out through computational analysis and the programming language Borland C++ Builder.

  10. Extracting the information backbone in online system.

    Science.gov (United States)

    Zhang, Qian-Ming; Zeng, An; Shang, Ming-Sheng

    2013-01-01

    Information overload is a serious problem in modern society and many solutions such as recommender system have been proposed to filter out irrelevant information. In the literature, researchers have been mainly dedicated to improving the recommendation performance (accuracy and diversity) of the algorithms while they have overlooked the influence of topology of the online user-object bipartite networks. In this paper, we find that some information provided by the bipartite networks is not only redundant but also misleading. With such "less can be more" feature, we design some algorithms to improve the recommendation performance by eliminating some links from the original networks. Moreover, we propose a hybrid method combining the time-aware and topology-aware link removal algorithms to extract the backbone which contains the essential information for the recommender systems. From the practical point of view, our method can improve the performance and reduce the computational time of the recommendation system, thus improving both of their effectiveness and efficiency.

  11. The role of conflict minerals, artisanal mining, and informal trading networks in African intrastate and regional conflicts

    Science.gov (United States)

    Chirico, Peter G.; Malpeli, Katherine C.

    2014-01-01

    The relationship between natural resources and armed conflict gained public and political attention in the 1990s, when it became evident that the mining and trading of diamonds were connected with brutal rebellions in several African nations. Easily extracted resources such as alluvial diamonds and gold have been and continue to be exploited by rebel groups to fund their activities. Artisanal and small-scale miners operating under a quasi-legal status often mine these mineral deposits. While many African countries have legalized artisanal mining and established flow chains through which production is intended to travel, informal trading networks frequently emerge in which miners seek to evade taxes and fees by selling to unauthorized buyers. These networks have the potential to become international in scope, with actors operating in multiple countries. The lack of government control over the artisanal mining sector and the prominence of informal trade networks can have severe social, political, and economic consequences. In the past, mineral extraction fuelled violent civil wars in Sierra Leone, Liberia, and Angola, and it continues to do so today in several other countries. The significant influence of the informal network that surrounds artisanal mining is therefore an important security concern that can extend across borders and have far-reaching impacts.

  12. Presentations from the 1992 Coal Mining Impoundment Informational Meeting

    Energy Technology Data Exchange (ETDEWEB)

    1993-12-31

    On May 20 and 21, 1992, the MSHA Coal Mining Impoundment Informational Meeting was held at the National Mine Health and Safety Academy in Beckley, West Virginia. Fifteen presentations were given on key issues involved in the design and construction of dams associated with coal mining. The attendees were told that to improve the consistency among the plan reviewers, engineers from the Denver and Pittsburgh Technical Support Centers meet twice annually to discuss specific technical issues. It was soon discovered that the topics being discussed needed to be shared with anyone involved with coal waste dam design, construction, or inspection. The only way to accomplish that goal was through the issuance of Procedure Instruction Letters. The Letters present a consensus of engineering philosophy that could change over time. They do not present policy or carry the force of law. Currently, thirteen position papers have been disseminated and more will follow as the need arises. The individual papers were not even entered into the database.

  13. Knowledge discovery: Extracting usable information from large amounts of data

    International Nuclear Information System (INIS)

    Whiteson, R.

    1998-01-01

    The threat of nuclear weapons proliferation is a problem of worldwide concern. Safeguards are the key to nuclear nonproliferation and data is the key to safeguards. The safeguards community has access to a huge and steadily growing volume of data. The advantages of this data-rich environment are obvious: there is a great deal of information which can be utilized. The challenge is to effectively apply proven and developing technologies to find and extract usable information from that data. That information must then be assessed and evaluated to produce the knowledge needed for crucial decision making. Efficient and effective analysis of safeguards data will depend on utilizing technologies to interpret the large, heterogeneous data sets that are available from diverse sources. With an order-of-magnitude increase in the amount of data from a wide variety of technical, textual, and historical sources there is a vital need to apply advanced computer technologies to support all-source analysis. There are techniques of data warehousing, data mining, and data analysis that can provide analysts with tools that will expedite their extracting usable information from the huge amounts of data to which they have access. Computerized tools can aid analysts by integrating heterogeneous data, evaluating diverse data streams, automating retrieval of database information, prioritizing inputs, reconciling conflicting data, doing preliminary interpretations, discovering patterns or trends in data, and automating some of the simpler prescreening tasks that are time consuming and tedious. Thus knowledge discovery technologies can provide a foundation of support for the analyst. Rather than spending time sifting through often irrelevant information, analysts could use their specialized skills in a focused, productive fashion. This would allow them to make their analytical judgments with more confidence and spend more of their time doing what they do best.

  14. Effect of coal mine dust and clay extracts on the biological activity of the quartz surface

    Energy Technology Data Exchange (ETDEWEB)

    Stone, V.; Jones, R.; Rollo, K.; Duffin, R.; Donaldson, K.; Brown, D.M. [Napier University, Edinburgh (United Kingdom). School of Life Science

    2004-04-01

    Modification of the quartz surface by aluminum salts and metallic iron has been shown to reduce the biological activity of quartz. This study aimed to investigate the ability of water soluble extracts of coal mine dust (CMD), low aluminum clays (hectorite and montmorillonite) and high aluminum clays (attapulgite and kaolin) to inhibit the reactivity of the quartz surface. DQ12 induced significant haemolysis of sheep erythrocytes in vitro and inflammation in vivo as indicated by increases in the total cell numbers, neutrophil cell numbers, MIP-2 protein and albumin content of bronchoalveolar lavage (BAL) fluid. Treatment of DQ12 with CMD extract prevented both haemolysis and inflammation. Extracts of the high aluminum clays (kaolin and attapulgite) prevented DQ12 induced haemolysis, and the kaolin extract inhibited quartz driven inflammation. Inhibition of DQ12 induced haemolysis by coal mine dust and kaolin extract could be prevented by pre-treatment of the extracts with a cation chelator. Extracts of the low aluminum clays (montmorillonite and hectorite) did not prevent DQ12 induced haemolysis, although the hectorite extract did prevent inflammation. These results suggest that CMD, and clays both low and rich in aluminum, all contain soluble components (possibly cations) capable of masking the reactivity of the quartz surface.

  15. Information extraction from muon radiography data

    International Nuclear Information System (INIS)

    Borozdin, K.N.; Asaki, T.J.; Chartrand, R.; Hengartner, N.W.; Hogan, G.E.; Morris, C.L.; Priedhorsky, W.C.; Schirato, R.C.; Schultz, L.J.; Sottile, M.J.; Vixie, K.R.; Wohlberg, B.E.; Blanpied, G.

    2004-01-01

    Scattering muon radiography was proposed recently as a technique of detection and 3-d imaging for dense high-Z objects. High-energy cosmic ray muons are deflected in matter in the process of multiple Coulomb scattering. By measuring the deflection angles we are able to reconstruct the configuration of high-Z material in the object. We discuss the methods for information extraction from muon radiography data. Tomographic methods widely used in medical imaging have been applied to a specific muon radiography information source. An alternative simple technique based on the counting of high-scattered muons in the voxels seems to be efficient in many simulated scenes. SVM-based classifiers and clustering algorithms may allow detection of compact high-Z objects without full image reconstruction. The efficiency of muon radiography can be increased using additional informational sources, such as momentum estimation, stopping power measurement, and detection of muonic atom emission.

  16. Pharmspresso: a text mining tool for extraction of pharmacogenomic concepts and relationships from full text.

    Science.gov (United States)

    Garten, Yael; Altman, Russ B

    2009-02-05

    Pharmacogenomics studies the relationship between genetic variation and the variation in drug response phenotypes. The field is rapidly gaining importance: it promises drugs targeted to particular subpopulations based on genetic background. The pharmacogenomics literature has expanded rapidly, but is dispersed in many journals. It is challenging, therefore, to identify important associations between drugs and molecular entities--particularly genes and gene variants, and thus these critical connections are often lost. Text mining techniques can allow us to convert the free-style text to a computable, searchable format in which pharmacogenomic concepts (such as genes, drugs, polymorphisms, and diseases) are identified, and important links between these concepts are recorded. Availability of full text articles as input into text mining engines is key, as literature abstracts often do not contain sufficient information to identify these pharmacogenomic associations. Thus, building on a tool called Textpresso, we have created the Pharmspresso tool to assist in identifying important pharmacogenomic facts in full text articles. Pharmspresso parses text to find references to human genes, polymorphisms, drugs and diseases and their relationships. It presents these as a series of marked-up text fragments, in which key concepts are visually highlighted. To evaluate Pharmspresso, we used a gold standard of 45 human-curated articles. Pharmspresso identified 78%, 61%, and 74% of target gene, polymorphism, and drug concepts, respectively. Pharmspresso is a text analysis tool that extracts pharmacogenomic concepts from the literature automatically and thus captures our current understanding of gene-drug interactions in a computable form. We have made Pharmspresso available at http://pharmspresso.stanford.edu.

  17. Modeling of information on the impact of mining exploitation on bridge objects in BIM

    Science.gov (United States)

    Bętkowski, Piotr

    2018-04-01

    The article discusses the advantages of BIM (Building Information Modeling) technology in the management of bridge infrastructure on mining areas. The article shows the problems with information flow in the case of bridge objects located on mining areas and the advantages of proper information management, e.g. the possibility of automatic monitoring of structures, improvement of safety, optimization of maintenance activities, cost reduction of damage removal and preventive actions, improvement of atmosphere for mining exploitation, improvement of the relationship between the manager of the bridge and the mine. The traditional model of managing bridge objects on mining areas has many disadvantages, which are discussed in this article. These disadvantages include, among others: duplication of information about the object, lack of correlation in investments due to lack of information flow between bridge manager and mine, limited assessment possibilities of damage propagation on technical condition and construction resistance to mining influences.

  18. 78 FR 45566 - Agency Information Collection Activities; Submission for OMB Review; Comment Request; Coal Mine...

    Science.gov (United States)

    2013-07-29

    ... for OMB Review; Comment Request; Coal Mine Dust Sampling Devices ACTION: Notice. SUMMARY: The... information collection request (ICR) titled, ``Coal Mine Dust Sampling Devices,'' to the Office of Management...) determine the concentration of respirable dust in coal mines. CPDMs must be designed and constructed for...

  19. Extracting the information backbone in online system.

    Directory of Open Access Journals (Sweden)

    Qian-Ming Zhang

    Full Text Available Information overload is a serious problem in modern society and many solutions such as recommender system have been proposed to filter out irrelevant information. In the literature, researchers have been mainly dedicated to improving the recommendation performance (accuracy and diversity) of the algorithms while they have overlooked the influence of topology of the online user-object bipartite networks. In this paper, we find that some information provided by the bipartite networks is not only redundant but also misleading. With such "less can be more" feature, we design some algorithms to improve the recommendation performance by eliminating some links from the original networks. Moreover, we propose a hybrid method combining the time-aware and topology-aware link removal algorithms to extract the backbone which contains the essential information for the recommender systems. From the practical point of view, our method can improve the performance and reduce the computational time of the recommendation system, thus improving both of their effectiveness and efficiency.

  20. Extracting the Information Backbone in Online System

    Science.gov (United States)

    Zhang, Qian-Ming; Zeng, An; Shang, Ming-Sheng

    2013-01-01

    Information overload is a serious problem in modern society and many solutions such as recommender system have been proposed to filter out irrelevant information. In the literature, researchers have been mainly dedicated to improving the recommendation performance (accuracy and diversity) of the algorithms while they have overlooked the influence of topology of the online user-object bipartite networks. In this paper, we find that some information provided by the bipartite networks is not only redundant but also misleading. With such “less can be more” feature, we design some algorithms to improve the recommendation performance by eliminating some links from the original networks. Moreover, we propose a hybrid method combining the time-aware and topology-aware link removal algorithms to extract the backbone which contains the essential information for the recommender systems. From the practical point of view, our method can improve the performance and reduce the computational time of the recommendation system, thus improving both of their effectiveness and efficiency. PMID:23690946

  1. Semantics-based information extraction for detecting economic events

    NARCIS (Netherlands)

    A.C. Hogenboom (Alexander); F. Frasincar (Flavius); K. Schouten (Kim); O. van der Meer

    2013-01-01

    As today's financial markets are sensitive to breaking news on economic events, accurate and timely automatic identification of events in news items is crucial. Unstructured news items originating from many heterogeneous sources have to be mined in order to extract knowledge useful for

  2. Gold-Mining

    DEFF Research Database (Denmark)

    Raaballe, J.; Grundy, B.D.

    2002-01-01

    Based on standard option pricing arguments and assumptions (including no convenience yield and sustainable property rights), we will not observe operating gold mines. We find that asymmetric information on the reserves in the gold mine is a necessary and sufficient condition for the existence...... of operating gold mines. Asymmetric information on the reserves in the mine implies that, at a high enough price of gold, the manager of high type finds the extraction value of the company to be higher than the current market value of the non-operating gold mine. Due to this undervaluation the maxim of market...

  3. Using Fuzzy SOM Strategy for Satellite Image Retrieval and Information Mining

    Directory of Open Access Journals (Sweden)

    Yo-Ping Huang

    2008-02-01

    Full Text Available This paper proposes an efficient satellite image retrieval and knowledge discovery model. The strategy comprises two major parts. First, a computational algorithm is used for off-line satellite image feature extraction, image data representation and image retrieval. Low level features are automatically extracted from the segmented regions of satellite images. A self-organization feature map is used to construct a two-layer satellite image concept hierarchy. The events are stored in one layer and the corresponding feature vectors are categorized in the other layer. Second, a user friendly interface is provided that retrieves images of interest and mines useful information based on the events in the concept hierarchy. The proposed system is evaluated with prominent features such as typhoons or high-pressure masses.

  4. Mining residential water and electricity demand data in Southern California to inform demand management strategies

    Science.gov (United States)

    Cominola, A.; Spang, E. S.; Giuliani, M.; Castelletti, A.; Loge, F. J.; Lund, J. R.

    2016-12-01

    Demand side management strategies are key to meet future water and energy demands in urban contexts, promote water and energy efficiency in the residential sector, provide customized services and communications to consumers, and reduce utilities' costs. Smart metering technologies allow gathering high temporal and spatial resolution water and energy consumption data and support the development of data-driven models of consumers' behavior. Modelling and predicting resource consumption behavior is essential to inform demand management. Yet, analyzing big, smart metered databases requires proper data mining and modelling techniques, in order to extract useful information supporting decision makers to spot end uses towards which water and energy efficiency or conservation efforts should be prioritized. In this study, we consider the following research questions: (i) how is it possible to extract representative consumers' personalities out of big smart metered water and energy data? (ii) are residential water and energy consumption profiles interconnected? (iii) can we design customized water and energy demand management strategies based on the knowledge of water-energy demand profiles and other user-specific psychographic information? To address the above research questions, we contribute a data-driven approach to identify and model routines in water and energy consumers' behavior. We propose a novel customer segmentation procedure based on data-mining techniques. Our procedure consists of three steps: (i) extraction of typical water-energy consumption profiles for each household, (ii) profiles clustering based on their similarity, and (iii) evaluation of the influence of candidate explanatory variables on the identified clusters. The approach is tested on a dataset of smart metered water and energy consumption data from over 1000 households in Southern California. Our methodology allows identifying heterogeneous groups of consumers from the studied sample, as well as
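
    Steps (i) and (ii) of the segmentation procedure described above can be sketched as follows. This is a minimal illustration only: the abstract does not name the clustering algorithm, so plain k-means with deterministic farthest-point seeding is swapped in here as an assumption, and the household names, the four-slot "day", and all readings are hypothetical. Step (iii), relating clusters to explanatory variables, is not shown.

```python
def mean_daily_profile(days):
    """Step (i): average the daily load curves and normalize them to sum to 1."""
    n = len(days)
    mean = [sum(day[h] for day in days) / n for h in range(len(days[0]))]
    total = sum(mean)
    return [v / total for v in mean]


def sq_dist(a, b):
    """Squared Euclidean distance between two profiles."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def kmeans(points, k, iters=50):
    """Step (ii): plain k-means with deterministic farthest-point seeding."""
    centers = [points[0]]
    while len(centers) < k:
        # Next seed: the point farthest from its nearest existing seed.
        centers.append(max(points, key=lambda p: min(sq_dist(p, c) for c in centers)))
    labels = []
    for _ in range(iters):
        labels = [min(range(k), key=lambda j: sq_dist(p, centers[j])) for p in points]
        new_centers = []
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:
                new_centers.append([sum(col) / len(members) for col in zip(*members)])
            else:
                new_centers.append(centers[j])  # keep an empty cluster's center
        if new_centers == centers:
            break
        centers = new_centers
    return labels, centers


# Hypothetical readings: each day has four slots (night, morning, afternoon, evening).
households = {
    "A": [[1, 5, 2, 2], [1, 6, 2, 1]],   # morning-peaked
    "B": [[1, 4, 2, 1], [2, 5, 1, 2]],   # morning-peaked
    "C": [[1, 1, 2, 6], [1, 2, 2, 5]],   # evening-peaked
    "D": [[2, 1, 1, 5], [1, 1, 2, 6]],   # evening-peaked
}
profiles = [mean_daily_profile(days) for days in households.values()]
labels, centers = kmeans(profiles, k=2)
print(dict(zip(households, labels)))
```

    Normalizing each profile to unit sum makes the clustering respond to the shape of the daily routine rather than to the overall consumption volume; whether the study normalizes this way is not stated in the abstract.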

  5. Accumulation of some metals by legumes and their extractability from acid mine spoils

    International Nuclear Information System (INIS)

    Taylor, R.W.; Ibeabuchi, I.O.; Sistani, K.R.; Shuford, J.W.

    1992-01-01

    A greenhouse study was conducted to investigate the growth (dry matter yield) of selected legume cover crops; phytoaccumulation of metals such as Zn, Mn, Pb, Cu, Ni, and Al; and the extractability of heavy metals from three different Alabama acid mine spoils. The spoils were amended based on soil test recommended levels of N, P, K, Ca and Mg prior to plant growth. Metals were extracted by three extractants (Mehlich 1, DTPA, and 0.1 M HCl) and values correlated with their accumulation by the selected legumes. Among the cover crops, kobe lespedeza Lespedeza striata (Thunb.) Hook and Arn, sericea lespedeza Lespedeza cuneata (Dum.) G. Don, and red clover (Trifolium pratense L.) did not survive the stressful conditions of the spoils. However, cowpea (Vigna unguiculata L.) followed by 'Bragg' soybean Glycine max (L.) Merr. generally produced the highest dry matter yield while accumulating the largest quantity of metals, except Al, from spoils. The extractability of most metals from the spoils was generally in the order of: 0.1 M HCl > DTPA. Mehlich 1 did not extract Pb and 0.1 M HCl did not extract Ni, whereas DTPA extracted all the metals in a small amount relative to HCl and Mehlich 1. All the extractants were quite effective in removing plant-available Zn from the spoils. In general, the extractants' ability to predict plant-available metals depended on the crop species, spoil type, and extractant used. 28 refs., 4 tabs

  6. Genetic process mining

    NARCIS (Netherlands)

    Aalst, van der W.M.P.; Alves De Medeiros, A.K.; Weijters, A.J.M.M.; Ciardo, G.; Darondeau, P.

    2005-01-01

    The topic of process mining has attracted the attention of both researchers and tool vendors in the Business Process Management (BPM) space. The goal of process mining is to discover process models from event logs, i.e., events logged by some information system are used to extract information about

  7. Proactive mining system in Potosi silver mines: new information from re-evaluation of historical materials regarding the fifth viceroy Toledo’s various policies on environment

    OpenAIRE

    Miyoshi, Emako; Anezaki, Shoji

    2016-01-01

    In this paper, the proactive mining system introduced by the fifth viceroy, Francisco de Toledo (1569–1581), to the Potosi Silver Mine is clarified on the basis of facts found in historical documents. The main policies in Toledo’s mining business were as follows: the application of mercury amalgamation to extract silver from ores, the construction of a hydraulic-powered system for silver-ore crushing with cascading uses, a recycling system that included the extraction of silver from waste ores and collecti...

  8. Chaotic spectra: How to extract dynamic information

    International Nuclear Information System (INIS)

    Taylor, H.S.; Gomez Llorente, J.M.; Zakrzewski, J.; Kulander, K.C.

    1988-10-01

    Nonlinear dynamics is applied to chaotic unassignable atomic and molecular spectra with the aim of extracting detailed information about regular dynamic motions that exist over short intervals of time. It is shown how this motion can be extracted from high resolution spectra by doing low resolution studies or by Fourier transforming limited regions of the spectrum. These motions mimic those of periodic orbits (PO) and are inserts into the dominant chaotic motion. Considering these inserts and the PO as a dynamically decoupled region of space, resonant scattering theory and stabilization methods enable us to compute ladders of resonant states which interact with the chaotic quasi-continuum computed in principle from basis sets placed off the PO. The interaction of the resonances with the quasi-continuum explains the low resolution spectra seen in such experiments. It also allows one to associate low resolution features with a particular PO. The motion on the PO thereby supplies the molecular movements whose quantization causes the low resolution spectra. Characteristic properties of the periodic orbit based resonances are discussed. The method is illustrated on the photoabsorption spectrum of the hydrogen atom in a strong magnetic field and on the photodissociation spectrum of H3+. Other molecular systems which are currently under investigation using this formalism are also mentioned. 53 refs., 10 figs., 2 tabs

  9. A Process Mining Based Service Composition Approach for Mobile Information Systems

    Directory of Open Access Journals (Sweden)

    Chengxi Huang

    2017-01-01

    Full Text Available Due to the growing trend in applying big data and cloud computing technologies in information systems, it is becoming an important issue to handle the connection between large scale of data and the associated business processes in the Internet of Everything (IoE) environment. Service composition as a widely used phase in system development has some limits when the complexity of relationship among data increases. Considering the expanding scale and the variety of devices in mobile information systems, a process mining based service composition approach is proposed in this paper in order to improve the adaptiveness and efficiency of compositions. Firstly, preprocessing is conducted to extract existing service execution information from server-side logs. Then process mining algorithms are applied to discover the overall event sequence with preprocessed data. After that, a scene-based service composition is applied to aggregate scene information and relocate services of the system. Finally, a case study that applied the work in a mobile medical application proves that the approach is practical and valuable in improving service composition adaptiveness and efficiency.
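
    The first two stages described above (extracting service executions from server-side logs, then discovering event sequences) can be illustrated with a directly-follows count, a common building block of process discovery. The abstract does not say which process mining algorithms were used, so this is a sketch under that assumption; the log layout `(case_id, timestamp, activity)` and the service names are hypothetical.

```python
from collections import Counter, defaultdict


def directly_follows(log):
    """Count how often activity b directly follows activity a within a case.

    `log` is an iterable of (case_id, timestamp, activity) records; the
    records may arrive in any order and are sorted per case by timestamp.
    """
    traces = defaultdict(list)
    for case_id, _timestamp, activity in sorted(log, key=lambda r: (r[0], r[1])):
        traces[case_id].append(activity)

    counts = Counter()
    for activities in traces.values():
        # Pair each activity with its immediate successor in the same case.
        counts.update(zip(activities, activities[1:]))
    return counts


# Hypothetical preprocessed server-side log of service invocations.
log = [
    ("c1", 3, "book"),
    ("c1", 1, "login"),
    ("c1", 2, "search"),
    ("c2", 1, "login"),
    ("c2", 2, "book"),
]
df = directly_follows(log)
for (a, b), n in sorted(df.items()):
    print(f"{a} -> {b}: {n}")
```

    The resulting counts define a weighted directed graph over services, which is the kind of structure a composition step could consult when deciding which services tend to be invoked together.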

  10. Extraction of quantifiable information from complex systems

    CERN Document Server

    Dahmen, Wolfgang; Griebel, Michael; Hackbusch, Wolfgang; Ritter, Klaus; Schneider, Reinhold; Schwab, Christoph; Yserentant, Harry

    2014-01-01

    In April 2007, the Deutsche Forschungsgemeinschaft (DFG) approved the Priority Program 1324 “Mathematical Methods for Extracting Quantifiable Information from Complex Systems.” This volume presents a comprehensive overview of the most important results obtained over the course of the program. Mathematical models of complex systems provide the foundation for further technological developments in science, engineering and computational finance. Motivated by the trend toward steadily increasing computer power, ever more realistic models have been developed in recent years. These models have also become increasingly complex, and their numerical treatment poses serious challenges. Recent developments in mathematics suggest that, in the long run, much more powerful numerical solution strategies could be derived if the interconnections between the different fields of research were systematically exploited at a conceptual level. Accordingly, a deeper understanding of the mathematical foundations as w...

  11. Extraction of temporal information in functional MRI

    Science.gov (United States)

    Singh, M.; Sungkarat, W.; Jeong, Jeong-Won; Zhou, Yongxia

    2002-10-01

    The temporal resolution of functional MRI (fMRI) is limited by the shape of the haemodynamic response function (hrf) and the vascular architecture underlying the activated regions. Typically, the temporal resolution of fMRI is on the order of 1 s. We have developed a new data processing approach to extract temporal information on a pixel-by-pixel basis at the level of 100 ms from fMRI data. Instead of correlating or fitting the time-course of each pixel to a single reference function, which is the common practice in fMRI, we correlate each pixel's time-course to a series of reference functions that are shifted with respect to each other by 100 ms. The reference function yielding the highest correlation coefficient for a pixel is then used as a time marker for that pixel. A Monte Carlo simulation and experimental study of this approach were performed to estimate the temporal resolution as a function of signal-to-noise ratio (SNR) in the time-course of a pixel. Assuming a known and stationary hrf, the simulation and experimental studies suggest a lower limit in the temporal resolution of approximately 100 ms at an SNR of 3. The multireference function approach was also applied to extract timing information from an event-related motor movement study where the subjects flexed a finger on cue. The event was repeated 19 times with the event's presentation staggered to yield an approximately 100-ms temporal sampling of the haemodynamic response over the entire presentation cycle. The timing differences among different regions of the brain activated by the motor task were clearly visualized and quantified by this method. The results suggest that it is possible to achieve a temporal resolution of ~200 ms in practice with this approach.
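
    The multireference idea can be sketched in a few lines: correlate one pixel's time-course with a family of reference functions shifted in 100 ms steps and keep the shift with the highest correlation coefficient. The gamma-shaped `hrf`, the noise level and the simulated pixel below are assumptions chosen for illustration, not the response function or data used in the study.

```python
import math
import random

def hrf(t):
    """Toy gamma-shaped response; an assumed stand-in for the true hrf."""
    return (t ** 5) * math.exp(-t) / 120.0 if t > 0 else 0.0

def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def estimate_delay(time_course, sample_times, shifts):
    """Return (best_shift, best_r): the reference shift with the highest correlation."""
    scored = [
        (pearson(time_course, [hrf(t - s) for t in sample_times]), s)
        for s in shifts
    ]
    best_r, best_shift = max(scored)
    return best_shift, best_r

# Response sampled every 100 ms (as the staggered presentation allows),
# with reference functions shifted in 100 ms steps from 0 to 1.9 s.
sample_times = [0.1 * i for i in range(200)]
shifts = [0.1 * k for k in range(20)]

random.seed(0)
true_delay = 0.7  # seconds; hypothetical onset delay of the simulated pixel
pixel = [hrf(t - true_delay) + random.gauss(0.0, 0.005) for t in sample_times]

delay, r = estimate_delay(pixel, sample_times, shifts)
print(f"estimated delay: {delay:.1f} s (r = {r:.3f})")
```

    Running the same estimate for every pixel gives a map of time markers, from which timing differences between regions can be read.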

  12. Impact of historical mining assessed in soils by kinetic extraction and lead isotopic ratios

    International Nuclear Information System (INIS)

    Camizuli, E.; Monna, F.; Bermond, A.; Manouchehri, N.; Besançon, S.; Losno, R.; Oort, F. van; Labanowski, J.; Perreira, A.; Chateau, C.; Alibert, P.

    2014-01-01

    The aim of this study is to estimate the long-term behaviour of trace metals, in two soils differently impacted by past mining. Topsoils from two 1 km² zones in the forested Morvan massif (France) were sampled to assess the spatial distribution of Cd, Cu, Pb and Zn. The first zone had been contaminated by historical mining. As expected, it exhibits higher trace-metal levels and greater spatial heterogeneity than the second non-contaminated zone, supposed to represent the local background. One soil profile from each zone was investigated in detail to estimate metal behaviour, and hence, bioavailability. Kinetic extractions were performed using EDTA on three samples: the A horizon from both soil profiles and the B horizon from the contaminated soil. For all three samples, kinetic extractions can be modelled by two first-order reactions. Similar kinetic behaviour was observed for all metals, but more metal was extracted from the contaminated A horizon than from the B horizon. More surprising is the general predominance of the residual fraction over the “labile” and “less labile” pools. Past anthropogenic inputs may have percolated over time through the soil profiles because of acidic pH conditions. Stable organo-metallic complexes may also have been formed over time, reducing metal availability. These processes are not mutually exclusive. After kinetic extraction, the lead isotopic compositions of the samples exhibited different signatures, related to contamination history and intrinsic soil parameters. However, no variation in lead signature was observed during the extraction experiment, demonstrating that the “labile” and “less labile” lead pools do not differ in terms of origin. Even if trace metals resulting from past mining and metallurgy persist in soils long after these activities have ceased, kinetic extractions suggest that metals, at least for these particular forest soils, do not represent a threat for biota. - Highlights: • Trace
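
    A model with two first-order reactions is commonly written as Q(t) = Q1(1 - exp(-k1 t)) + Q2(1 - exp(-k2 t)), where Q1 and Q2 are the “labile” and “less labile” pools and whatever is not extracted forms the residual fraction. The sketch below evaluates this form. The abstract does not give the equation, so the form is assumed, and every numerical value is hypothetical rather than taken from the study.

```python
import math

def extracted(t, q_labile, k_labile, q_less_labile, k_less_labile):
    """Cumulative metal extracted at time t from two first-order pools."""
    return (q_labile * (1.0 - math.exp(-k_labile * t))
            + q_less_labile * (1.0 - math.exp(-k_less_labile * t)))

# Hypothetical values (mg/kg and 1/min), chosen only to illustrate the shape.
total_metal = 100.0
params = dict(q_labile=12.0, k_labile=0.5, q_less_labile=18.0, k_less_labile=0.01)

# Whatever neither pool releases is the residual fraction.
residual = total_metal - params["q_labile"] - params["q_less_labile"]

for minutes in (0, 5, 60, 1440):
    print(f"t = {minutes:5d} min: {extracted(minutes, **params):6.2f} mg/kg extracted")
print(f"residual fraction: {residual / total_metal:.0%}")
```

    With these illustrative numbers the residual pool is larger than the two extractable pools combined, which is the kind of predominance the abstract describes.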

  13. Impact of historical mining assessed in soils by kinetic extraction and lead isotopic ratios

    Energy Technology Data Exchange (ETDEWEB)

    Camizuli, E., E-mail: estelle.camizuli@u-bourgogne.fr [UMR 6298, ArTeHiS, Université de Bourgogne — CNRS — Culture, 6 bd Gabriel, Bat. Gabriel, 21000 Dijon (France); Monna, F. [UMR 6298, ArTeHiS, Université de Bourgogne — CNRS — Culture, 6 bd Gabriel, Bat. Gabriel, 21000 Dijon (France); Bermond, A.; Manouchehri, N.; Besançon, S. [Institut des sciences et industries du vivant et de l' environnement (AgroParisTech), Laboratoire de Chimie Analytique, 16, rue Claude Bernard, 75231 Paris Cedex 05 (France); Losno, R. [UMR 7583, LISA, Universités Paris 7-Paris 12 — CNRS, 61 av. du Gal de Gaulle, 94010 Créteil Cedex (France); Oort, F. van [UR 251, Pessac, Institut National de la Recherche Agronomique, Centre de Versailles-Grignon, RD 10, 78026 Versailles Cedex (France); Labanowski, J. [UMR 7285, IC2MP, Université de Poitiers — CNRS, 4, rue Michel Brunet, 86022 Poitiers (France); Perreira, A. [UMR 6298, ArTeHiS, Université de Bourgogne — CNRS — Culture, 6 bd Gabriel, Bat. Gabriel, 21000 Dijon (France); Chateau, C. [UFR SVTE, Université de Bourgogne, 6 bd Gabriel, Bat. Gabriel, 21000 Dijon (France); Alibert, P. [UMR 6282, Biogeosciences, Université de Bourgogne — CNRS, 6 bd Gabriel, Bat. Gabriel, 21000 Dijon (France)

    2014-02-01

    The aim of this study is to estimate the long-term behaviour of trace metals, in two soils differently impacted by past mining. Topsoils from two 1 km² zones in the forested Morvan massif (France) were sampled to assess the spatial distribution of Cd, Cu, Pb and Zn. The first zone had been contaminated by historical mining. As expected, it exhibits higher trace-metal levels and greater spatial heterogeneity than the second non-contaminated zone, supposed to represent the local background. One soil profile from each zone was investigated in detail to estimate metal behaviour, and hence, bioavailability. Kinetic extractions were performed using EDTA on three samples: the A horizon from both soil profiles and the B horizon from the contaminated soil. For all three samples, kinetic extractions can be modelled by two first-order reactions. Similar kinetic behaviour was observed for all metals, but more metal was extracted from the contaminated A horizon than from the B horizon. More surprising is the general predominance of the residual fraction over the “labile” and “less labile” pools. Past anthropogenic inputs may have percolated over time through the soil profiles because of acidic pH conditions. Stable organo-metallic complexes may also have been formed over time, reducing metal availability. These processes are not mutually exclusive. After kinetic extraction, the lead isotopic compositions of the samples exhibited different signatures, related to contamination history and intrinsic soil parameters. However, no variation in lead signature was observed during the extraction experiment, demonstrating that the “labile” and “less labile” lead pools do not differ in terms of origin. Even if trace metals resulting from past mining and metallurgy persist in soils long after these activities have ceased, kinetic extractions suggest that metals, at least for these particular forest soils, do not represent a threat for biota. - Highlights: • Trace

  14. GROUND DEFORMATION EXTRACTION USING VISIBLE IMAGES AND LIDAR DATA IN MINING AREA

    Directory of Open Access Journals (Sweden)

    W. Hu

    2016-06-01

    Recognition and extraction of mining ground deformation can help us understand the deformation process and its spatial distribution, and estimate deformation laws and trends. This study focuses on ground deformation detection and extraction that combines high-resolution visible stereo imagery, LiDAR point cloud observations and historical data. The DEM of a large mining area is generated using high-resolution satellite stereo images, and ground deformation is obtained through time series analysis combined with historical DEM data. Ground deformation caused by mining activities is detected and analyzed to explain the link between regional and local ground deformation. A district covering 200 km² around the West Open Pit Mine in Fushun, a city in Liaoning province in Northeast China, is chosen as the test area. Regional and local ground deformation over the 2010 to 2015 time series is detected and extracted with DEMs derived from ZY-3 images and LiDAR point DEMs in the case study. Results show that the mean regional deformation is 7.1 m of rising elevation with RMS 9.6 m. Rising and declining elevation deformation are coupled in local areas. The area of higher elevation variation is 16.3 km² and the mean rising value is 35.8 m with RMS 15.7 m, while the deformation area of lower elevation variation is 6.8 km² and the mean declining value is 17.6 m with RMS 9.3 m. Moreover, local large deformation and regional slow deformation are coupled, the deformation from local mining activities has expanded to the surrounding area, and a large ground fracture with declining elevation has been detected and extracted in the south of the West Open Pit Mine, with a mean declining elevation of 23.1 m and covering about 2.3 km² as of 2015. The results in this paper are currently preliminary; we are making efforts to improve more precision results with
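
    The kind of summary reported above (mean change, RMS, and the areas of rising and declining elevation) can be obtained by differencing two co-registered DEMs cell by cell. The sketch below shows the arithmetic on a toy grid. The grid values, cell size and threshold are hypothetical, and computing the RMS about zero is an assumption, since the abstract does not define its RMS.

```python
import math

def deformation_stats(dem_old, dem_new, cell_area_km2, threshold_m):
    """Summarise elevation change between two equally sized DEM grids (metres)."""
    diffs = [
        new - old
        for row_old, row_new in zip(dem_old, dem_new)
        for old, new in zip(row_old, row_new)
    ]
    rising = [d for d in diffs if d > threshold_m]
    declining = [-d for d in diffs if d < -threshold_m]

    def mean(values):
        return sum(values) / len(values) if values else 0.0

    return {
        "mean_change_m": mean(diffs),
        # RMS taken about zero; an assumption, not the paper's stated definition.
        "rms_m": math.sqrt(sum(d * d for d in diffs) / len(diffs)),
        "rising_area_km2": len(rising) * cell_area_km2,
        "mean_rise_m": mean(rising),
        "declining_area_km2": len(declining) * cell_area_km2,
        "mean_decline_m": mean(declining),
    }

# Toy 2 x 2 grids standing in for the 2010 and 2015 DEMs (hypothetical values).
dem_2010 = [[100.0, 100.0], [100.0, 100.0]]
dem_2015 = [[110.0, 100.0], [80.0, 100.0]]
stats = deformation_stats(dem_2010, dem_2015, cell_area_km2=0.25, threshold_m=5.0)
for name, value in stats.items():
    print(f"{name}: {value:.2f}")
```

    The threshold separates cells with real change from those within the expected DEM error, so it would in practice be tied to the vertical accuracy of the two surfaces.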

  15. Selective extraction of metals from products of mine acidic water treatment

    International Nuclear Information System (INIS)

    Andreeva, N.N.; Romanchuk, S.A.; Voronin, N.N.; Demidov, V.D.; Pasynkova, T.A.; Manuilova, O.A.; Ivanova, N.V.

    1989-01-01

    A study was made of the possibility of processing foam products obtained during flotation purification of acidic mine waters, for the purpose of selective extraction of non-ferrous metals (Co, Ni) and rare earth elements (REE) and their separation from iron, the basic macrocomponent of the waters. The optimal conditions for selective metal extraction from foam flotation products are: T = 333 K, pH = 3.0-3.5, solid-to-liquid phase ratio of 1:4 to 1:7, and a sulfuric acid leaching duration of 30 min. Rare earth extraction under these conditions is 87.6-93.0%. The degree of concentration of the valuable components is ∼10. Rare earths are separated from iron by extraction methods.

  16. Optical Aperture Synthesis Object's Information Extracting Based on Wavelet Denoising

    International Nuclear Information System (INIS)

    Fan, W J; Lu, Y

    2006-01-01

    Wavelet denoising is studied to improve the extraction of an object's Fourier information in optical aperture synthesis (OAS). Translation-invariant wavelet denoising, based on Donoho's wavelet soft-threshold denoising, is investigated to remove the pseudo-Gibbs artifacts in soft-threshold denoised images. OAS object information extraction based on translation-invariant wavelet denoising is then studied. The study shows that wavelet threshold denoising can improve the precision and repeatability of extracting an object's information from an interferogram, and that translation-invariant wavelet denoising performs better than soft-threshold wavelet denoising.
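
    The two ingredients named above can be sketched on a 1-D signal: soft thresholding shrinks each wavelet detail coefficient toward zero, and translation-invariant denoising (cycle spinning) averages the result over circular shifts of the input. The one-level Haar transform, the threshold value and the sample signal are assumptions chosen to keep the example short; the abstract does not state which wavelet or threshold rule was used.

```python
import math

def soft_threshold(value, threshold):
    """Shrink a coefficient toward zero by `threshold` (soft thresholding)."""
    return math.copysign(max(abs(value) - threshold, 0.0), value)

def haar_denoise(signal, threshold):
    """One-level Haar transform, soft-threshold the details, then invert.

    The signal length must be even.
    """
    s = math.sqrt(2.0)
    out = []
    for i in range(0, len(signal), 2):
        approx = (signal[i] + signal[i + 1]) / s
        detail = soft_threshold((signal[i] - signal[i + 1]) / s, threshold)
        out += [(approx + detail) / s, (approx - detail) / s]
    return out

def translation_invariant_denoise(signal, threshold, shifts=(0, 1)):
    """Average the denoised result over circular shifts (cycle spinning)."""
    n = len(signal)
    total = [0.0] * n
    for k in shifts:
        shifted = signal[k:] + signal[:k]                # circular shift left by k
        denoised = haar_denoise(shifted, threshold)
        restored = denoised[n - k:] + denoised[:n - k]   # undo the shift
        for i, v in enumerate(restored):
            total[i] += v
    return [v / len(shifts) for v in total]

# Hypothetical noisy step signal.
noisy = [0.1, -0.1, 0.05, 1.0, 1.1, 0.9, 1.05, 0.95]
print([round(v, 3) for v in translation_invariant_denoise(noisy, threshold=0.1)])
```

    For a one-level Haar transform only two shifts are distinct, which is why `shifts` defaults to (0, 1); deeper decompositions need more shifts to cover all alignments.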

  17. Economic statistics for the mining and metallurgical industries: 1990. Statistique economique des industries extractives et metallurgiques annee 1990

    Energy Technology Data Exchange (ETDEWEB)

    Rzonzef, L.

    1991-01-01

    Provides economic statistics for the Belgian mining and metallurgical industries in 1990. The review is divided into 4 parts: the extractive industries (including an analysis of the coal market and mines, quarries and associated industries); coke and briquette making; metallurgy (i.e. blast furnaces, steel making, rolling mills and manpower and materials consumption in the steel industry); and the extraction of sand from the Belgian continental shelf. 17 tabs.

  18. Data mining in Cloud Computing

    Directory of Open Access Journals (Sweden)

    Ruxandra-Ştefania PETRE

    2012-10-01

    This paper describes how data mining is used in cloud computing. Data mining is used for extracting potentially useful information from raw data. The integration of data mining techniques into normal day-to-day activities has become commonplace. Every day people are confronted with targeted advertising, and data mining techniques help businesses to become more efficient by reducing costs. Data mining techniques and applications are very much needed in the cloud computing paradigm. Implementing data mining techniques through cloud computing will allow users to retrieve meaningful information from a virtually integrated data warehouse, which reduces the costs of infrastructure and storage.

  19. 77 FR 25205 - Proposed Extension of Existing Information Collection; Roof Control Plans for Underground Coal Mines

    Science.gov (United States)

    2012-04-27

    ... collections of information in accordance with the Paperwork Reduction Act of 1995. This program helps to assure that requested data can be provided in the desired format, reporting burden (time and financial... Information Collection; Roof Control Plans for Underground Coal Mines AGENCY: Mine Safety and Health...

  20. 77 FR 38323 - Proposed Extension of Existing Information Collection; Respirable Coal Mine Dust Sampling

    Science.gov (United States)

    2012-06-27

    ... Information Collection; Respirable Coal Mine Dust Sampling AGENCY: Mine Safety and Health Administration... Sampling'' to more accurately reflect the type of information that is collected. Chronic exposure to... dust levels since 1970 and, consequently, the prevalence rate of black lung among coal miners, severe...

  1. Respiratory Information Extraction from Electrocardiogram Signals

    KAUST Repository

    Amin, Gamal El Din Fathy

    2010-12-01

    The Electrocardiogram (ECG) is a tool measuring the electrical activity of the heart, and it is extensively used for diagnosis and monitoring of heart diseases. The ECG signal reflects not only the heart activity but also many other physiological processes. The respiratory activity is a prominent process that affects the ECG signal due to the close proximity of the heart and the lungs. In this thesis, several methods for the extraction of respiratory process information from the ECG signal are presented. These methods allow an estimation of the lung volume and the lung pressure from the ECG signal. The potential benefit of this is to eliminate the corresponding sensors used to measure the respiration activity. A reduction of the number of sensors connected to patients will increase patients’ comfort and reduce the costs associated with healthcare. As a further result, the efficiency of diagnosing respirational disorders will increase since the respiration activity can be monitored with a common, widely available method. The developed methods can also improve the detection of respirational disorders that occur while patients are sleeping. Such disorders are commonly diagnosed in sleeping laboratories where the patients are connected to a number of different sensors. Any reduction of these sensors will result in a more natural sleeping environment for the patients and hence a higher sensitivity of the diagnosis.

  2. Occurrence of Acidithiobacillus ferrooxidans and Acidithiobacillus thiooxidans in uranium mine-Caldas uranium mining and extraction plant, Brazil (CUMEP)

    International Nuclear Information System (INIS)

    Gomes, H.A.; Garcia, O.; Gomes, J.E.; Rabello, E.; Cannavan, F.S.; Tsai, S.M.

    2005-01-01

    The sulfated minerals present in mining areas may cause serious environmental problems due to the action of chemolithotrophic bacteria of the genus Acidithiobacillus, represented mainly by Acidithiobacillus ferrooxidans and Acidithiobacillus thiooxidans. These microorganisms are able to oxidize mineral sulfates, elementary sulfur and ferrous ion (A. ferrooxidans), and are also capable of mobilizing radionuclides such as uranium into the environment. In this context, this study aimed at investigating the occurrence and fluctuation of A. ferrooxidans and A. thiooxidans populations within the mine effluents, tailing dam and waste rocks of the Caldas Uranium Mining and Extraction Plant (CUMEP) in Minas Gerais State, Brazil. Samples from 16 sites in the CUMEP were taken monthly over 28 months. The oxi-reduction potential, pH and temperature values were determined at the Radioecology Laboratory. The Most Probable Number technique was applied using a series of five tubes for selective counting of A. ferrooxidans and A. thiooxidans. Each sample was submitted to serial dilutions using Tween 80 and sterilized water (pH=2.0) and subsequently transferred into assay tubes containing T and K with ferrous ion and also elementary sulfur, as energy source, for detection of A. ferrooxidans and A. thiooxidans, respectively. Populations of A. ferrooxidans and A. thiooxidans presented seasonal quantitative fluctuations at the different studied sites. A. ferrooxidans showed higher or equal frequency to that observed for A. thiooxidans; as a consequence, they were considered the predominant bacteria in this environment. In the majority of the sites, the highest values for the frequency and counting of A. ferrooxidans and A. thiooxidans were observed during the rainy period (October to March). The relative seasonal behavior when several variables are evaluated simultaneously indicated that, due to the high values of oxi-reduction potential, the low values of pH, the detection of the highest

  3. Using Open Web APIs in Teaching Web Mining

    Science.gov (United States)

    Chen, Hsinchun; Li, Xin; Chau, M.; Ho, Yi-Jen; Tseng, Chunju

    2009-01-01

    With the advent of the World Wide Web, many business applications that utilize data mining and text mining techniques to extract useful business information on the Web have evolved from Web searching to Web mining. It is important for students to acquire knowledge and hands-on experience in Web mining during their education in information systems…

  4. Innovative Extraction Method for a Coal Seam with a Thick Rock-Parting for Supporting Coal Mine Sustainability

    Directory of Open Access Journals (Sweden)

    Meng Li

    2017-10-01

    As thick rock partings delay the efficient mining of coal seams and constrain the sustainable development of coal mines, an innovative extraction method for a coal seam with a thick rock parting was proposed. The coal seams were divided into different sub-zones according to the thickness of the rock parting, and the sub-zones were then mined separately using three mining schemes: full-seam mining, combined mining using backfill and caving (CMBC), and reducing height mining. Afterwards, the study introduced the basic mechanism and key devices for the CMBC and analysed the working state of the backfill support in detail. Moreover, a method for calculating the length of the backfill zone was proposed to design that length, and the influences of four factors (including the bulking coefficient of the rock parting) on the length of the backfill zone were also explored. Taking the No. 22203 panel, Buertai mine, Inner Mongolia, China as an example, the coal resource mined using the CMBC extraction method will increase by 1.83 × 10⁶ tons and the recovery ratio will rise from 56.2% to 92.4% compared with mining of the 2-2 upper coal seam alone. Moreover, by applying CMBC, a series of environmental and ecological problems caused by rock parting is reduced, which can improve the environment in mined areas. The research can provide technological guidance for mining panels of a coal seam with a thick rock parting, and for the disposal of the parting, under similar conditions.

  5. Biomedical Information Extraction: Mining Disease Associated Genes from Literature

    Science.gov (United States)

    Huang, Zhong

    2014-01-01

    Disease-associated gene discovery is a critical step to realize the future of personalized medicine. However, empirical and clinical validation of disease-associated genes is time consuming and expensive. In silico discovery of disease-associated genes from literature is therefore becoming the first essential step for biomarker discovery to…

  6. Informing child welfare policy and practice: using knowledge discovery and data mining technology via a dynamic Web site.

    Science.gov (United States)

    Duncan, Dean F; Kum, Hye-Chung; Weigensberg, Elizabeth Caplick; Flair, Kimberly A; Stewart, C Joy

    2008-11-01

    Proper management and implementation of an effective child welfare agency requires the constant use of information about the experiences and outcomes of children involved in the system, emphasizing the need for comprehensive, timely, and accurate data. In the past 20 years, there have been many advances in technology that can maximize the potential of administrative data to promote better evaluation and management in the field of child welfare. Specifically, this article discusses the use of knowledge discovery and data mining (KDD), which makes it possible to create longitudinal data files from administrative data sources, extract valuable knowledge, and make the information available via a user-friendly public Web site. This article demonstrates a successful project in North Carolina where knowledge discovery and data mining technology was used to develop a comprehensive set of child welfare outcomes available through a public Web site to facilitate information sharing of child welfare data to improve policy and practice.

  7. Improving the extraction-and-loading process in the open mining operations

    Directory of Open Access Journals (Sweden)

    Cheban A. Yu.

    2017-09-01

    Blasting is the main way to prepare solid rock for excavation, and it produces a rock mass of uneven granulometric composition, which makes it impossible to use conveyor quarry transport without preliminary coarse crushing of the blasted rock. The greatest technical and economic effect is achieved by full conveyorization of quarry transport, which ensures a sequenced flow of transport operations, automated management and high labor productivity. The extraction-and-loading machines are the determining factor in the performance of mining and transport machines in the technological flow of the quarry. When a blasted rock mass is extracted with single-bucket excavators or loaders working in combination with bottom-hole conveyors, self-propelled crushing and reloading units of various designs are used to reduce large individual pieces to fractions of the required size. The presence of a crushing and reloading unit in the pit-face along with the excavator requires additional space for its placement, complicates the maneuvering of the equipment in the pit-face, and increases the number of personnel and the cost of maintaining the extraction-and-reloading operations. The article proposes an improved method for carrying out the extraction-and-loading process, as well as the design of an extraction-and-grinding unit based on a quarry hydraulic excavator. The design of the proposed unit makes it possible to convert the cyclic process of scooping the rock mass into a continuous process of loading it onto the bottom-hole conveyor. Using the extraction-and-grinding unit makes it possible to combine the processes of excavation, preliminary crushing and loading of the rock mass, which increases the efficiency of mining operations.

  8. Intelligent Information Retrieval and Web Mining Architecture Using SOA

    Science.gov (United States)

    El-Bathy, Naser Ibrahim

    2010-01-01

    The study of this dissertation provides a solution to a very specific problem instance in the area of data mining, data warehousing, and service-oriented architecture in publishing and newspaper industries. The research question focuses on the integration of data mining and data warehousing. The research problem focuses on the development of…

  9. Uranium mining and metallurgy library information service under the network environment

    International Nuclear Information System (INIS)

    Tang Lilei

    2012-01-01

    This paper analyzes the effect of the network environment on the information service of the uranium mining and metallurgy library. It introduces measures such as strengthening the construction of specialized literature resources, changing the service mode, building information navigation, deepening services, meeting the individual needs of users, raising the quality of librarians, and promoting the co-construction and sharing of library information resources, and it puts forward a development idea for the uranium mining and metallurgy library information service under the network environment. (author)

  10. Heavy metal concentration in forage grasses and extractability from some acid mine spoils

    Energy Technology Data Exchange (ETDEWEB)

    Taylor, R.W.; Ibeabuchi, I.O.; Sistani, K.R.; Shuford, J.W. (Alabama A and M University, Normal (United States). Department of Plant and Soil Science)

    1993-06-01

    Laboratory and greenhouse studies were conducted on several forage grasses, bermudagrass (Cynodon dactylon), creeping red fescue (Festuca rubra), Kentucky 31-tall fescue (Festuca arundinacea), oat (Avena sativa), orchardgrass (Dactylis glomerata), perennial ryegrass (Lolium perenne), sorghum (Sorghum bicolor), triticale (X. triticosecale Wittmack), and winter wheat (Triticum aestivum) grown on three Alabama acid mine spoils to study heavy metal accumulation, dry matter yield and spoil metal extractability by three chemical extractants (Mehlich 1, DTPA, and 0.1 M HCl). Heavy metals removed by these extractants were correlated with their accumulation by several forage grasses. Among the forages tested, creeping red fescue did not survive the stressful conditions of any of the spoils, while orchardgrass and Kentucky 31-tall fescue did not grow in Mulberry spoil. Sorghum followed by bermudagrass generally produced the highest dry matter yield. However, the high yielding bermudagrass was most effective in accumulating high tissue levels of Mn and Zn from all spoils (compared to the other grasses) but did not remove Ni. On the average, higher levels of metals were extracted from spoils in the order of 0.1 M HCl > Mehlich 1 > DTPA. However, DTPA extracted all the metals from spoils while Mehlich 1 did not extract Pb and 0.1 M HCl did not extract detectable levels of Ni. All of the extractants were quite effective in determining plant available Zn from the spoils. For the other metals, the effective determination of plant availability depended on the crop, the extractant, and the metal in concert. 20 refs., 6 tabs.

  11. US uranium mining industry: background information on economics and emissions

    International Nuclear Information System (INIS)

    Bruno, G.A.; Dirks, J.A.; Jackson, P.O.; Young, J.K.

    1984-03-01

    A review of the US uranium mining industry has revealed a generally depressed industry situation. The 1982 U₃O₈ production from both open-pit and underground mines declined to 3800 and 6300 tons respectively with the underground portion representing 46% of total production. US exploration and development has continued downward in 1982. Employment in the mining and milling sectors has dropped 31% and 17% respectively in 1982. Representative forecasts were developed for reactor fuel demand and U₃O₈ production for the years 1983 and 1990. Reactor fuel demand is estimated to increase from 15,900 tons to 21,300 tons U₃O₈ respectively. U₃O₈ production, however, is estimated to decrease from 10,600 tons to 9600 tons respectively. A field examination was conducted of 29 selected underground uranium mines that represent 84% of the 1982 underground production. Data was gathered regarding population, land ownership and private property valuation. An analysis of the increased cost to production resulting from the installation of 20-meter high exhaust borehole vent stacks was conducted. An assessment was made of the current and future ²²²Rn emission levels for a group of 27 uranium mines. It is shown that ²²²Rn emission rates are increasing from 10 individual operating mines through 1990 by 1.2 to 3.8 times. But for the group of 27 mines as a whole, a reduction of total ²²²Rn emissions is predicted due to 17 of the mines being shutdown and sealed. The estimated total ²²²Rn emission rate for this group of mines will be 105 Ci/yr by year end 1983 or 70% of the 1978-79 measured rate and 124 Ci/yr by year end 1990 or 83% of the 1978-79 measured rate

  12. US uranium mining industry: background information on economics and emissions

    Energy Technology Data Exchange (ETDEWEB)

    Bruno, G.A.; Dirks, J.A.; Jackson, P.O.; Young, J.K.

    1984-03-01

    A review of the US uranium mining industry has revealed a generally depressed industry situation. The 1982 U₃O₈ production from both open-pit and underground mines declined to 3800 and 6300 tons respectively with the underground portion representing 46% of total production. US exploration and development has continued downward in 1982. Employment in the mining and milling sectors has dropped 31% and 17% respectively in 1982. Representative forecasts were developed for reactor fuel demand and U₃O₈ production for the years 1983 and 1990. Reactor fuel demand is estimated to increase from 15,900 tons to 21,300 tons U₃O₈ respectively. U₃O₈ production, however, is estimated to decrease from 10,600 tons to 9600 tons respectively. A field examination was conducted of 29 selected underground uranium mines that represent 84% of the 1982 underground production. Data was gathered regarding population, land ownership and private property valuation. An analysis of the increased cost to production resulting from the installation of 20-meter high exhaust borehole vent stacks was conducted. An assessment was made of the current and future ²²²Rn emission levels for a group of 27 uranium mines. It is shown that ²²²Rn emission rates are increasing from 10 individual operating mines through 1990 by 1.2 to 3.8 times. But for the group of 27 mines as a whole, a reduction of total ²²²Rn emissions is predicted due to 17 of the mines being shutdown and sealed. The estimated total ²²²Rn emission rate for this group of mines will be 105 Ci/yr by year end 1983 or 70% of the 1978-79 measured rate and 124 Ci/yr by year end 1990 or 83% of the 1978-79 measured rate.

  13. A New Framework for Textual Information Mining over Parse Trees. CRESST Report 805

    Science.gov (United States)

    Mousavi, Hamid; Kerr, Deirdre; Iseli, Markus R.

    2011-01-01

    Textual information mining is a challenging problem that has resulted in the creation of many different rule-based linguistic query languages. However, these languages generally are not optimized for the purpose of text mining. In other words, they usually consider queries as individuals and only return raw results for each query. Moreover they…

  14. Personal continuous route pattern mining

    Institute of Scientific and Technical Information of China (English)

    Qian YE; Ling CHEN; Gen-cai CHEN

    2009-01-01

    In daily life, people often repeat regular routes in certain periods. In this paper, a mining system is developed to find the continuous route patterns in a person's past trips. To account for the diversity of personal moving status, the mining system employs adaptive GPS data recording and five data filters to guarantee clean trip data. The mining system uses a client/server architecture to protect personal privacy and to reduce the computational load. The server conducts the main mining procedure, but with insufficient information to recover real personal routes. To improve the scalability of sequential pattern mining, a novel pattern mining algorithm, continuous route pattern mining (CRPM), is proposed. This algorithm can tolerate the different disturbances in real routes and extract the frequent patterns. Experimental results based on nine persons' trips show that CRPM can extract route patterns more than twice as long as those found by traditional route pattern mining algorithms.
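
    To make "continuous route pattern" concrete, the sketch below counts contiguous runs of locations that recur across trips and keeps those reaching a minimum support. It is a plain brute-force baseline and not the CRPM algorithm, whose disturbance tolerance the abstract does not specify; the trips and location names are hypothetical.

```python
from collections import Counter

def frequent_continuous_patterns(trips, min_support, min_length=2):
    """Return contiguous location sequences found in at least `min_support` trips."""
    counts = Counter()
    for trip in trips:
        seen = set()  # count each pattern at most once per trip
        for start in range(len(trip)):
            for end in range(start + min_length, len(trip) + 1):
                seen.add(tuple(trip[start:end]))
        counts.update(seen)
    return {pattern: n for pattern, n in counts.items() if n >= min_support}

# Hypothetical trips, each already reduced to an ordered list of locations.
trips = [
    ["home", "A", "B", "C", "office"],
    ["home", "A", "B", "D", "gym"],
    ["home", "A", "B", "C", "office"],
]
patterns = frequent_continuous_patterns(trips, min_support=3)
longest = max(patterns, key=len)
print(longest, patterns[longest])
```

    Enumerating every contiguous run grows quadratically with trip length, which is one reason a dedicated algorithm is needed for long GPS traces.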

  15. A COMPARATIVE ANALYSIS OF WEB INFORMATION EXTRACTION TECHNIQUES DEEP LEARNING vs. NAÏVE BAYES vs. BACK PROPAGATION NEURAL NETWORKS IN WEB DOCUMENT EXTRACTION

    OpenAIRE

    J. Sharmila; A. Subramani

    2016-01-01

    Web mining research is becoming more important because a great deal of information is managed through the web. Web usage is growing in an uncontrolled way. A dedicated framework is required to handle such a large amount of information in the web space. Web mining is divided into three major categories: web content mining, web usage mining and web structure mining. Tak-Lam Wong has proposed a web content mining methodolog...

  16. Uranium mining

    International Nuclear Information System (INIS)

    2008-01-01

    Full text: The economic and environmental sustainability of uranium mining has been analysed by Monash University researcher Dr Gavin Mudd in a paper that challenges the perception that uranium mining is an 'infinite quality source' that provides solutions to the world's demand for energy. Dr Mudd says information on the uranium industry touted by politicians and mining companies is not necessarily inaccurate, but it does not tell the whole story, being often just an average snapshot of the costs of uranium mining today without reflecting the escalating costs associated with the process in years to come. 'From a sustainability perspective, it is critical to evaluate accurately the true lifecycle costs of all forms of electricity production, especially with respect to greenhouse emissions,' he says. 'For nuclear power, a significant proportion of greenhouse emissions are derived from the fuel supply, including uranium mining, milling, enrichment and fuel manufacture.' Dr Mudd found that financial and environmental costs escalate dramatically as the uranium ore is used. The deeper the mining process required to extract the ore, the higher the cost for mining companies, the greater the impact on the environment and the more resources needed to obtain the product. It is clear that there is a strong sensitivity of energy and water consumption and greenhouse emissions to ore grade, and that ore grades are likely to continue to decline gradually in the medium to long term. These issues are critical to the current debate over nuclear power and greenhouse emissions, especially with respect to ascribing sustainability to such activities as uranium mining and milling. For example, mining at Roxby Downs is responsible for the emission of over one million tonnes of greenhouse gases per year and this could increase to four million tonnes if the mine is expanded.'

  17. Sample-based XPath Ranking for Web Information Extraction

    NARCIS (Netherlands)

    Jundt, Oliver; van Keulen, Maurice

    Web information extraction typically relies on a wrapper, i.e., program code or a configuration that specifies how to extract some information from web pages at a specific website. Manually creating and maintaining wrappers is a cumbersome and error-prone task. It may even be prohibitive as some

  18. Geographical Information System Model for Potential Mines Data Management Presentation in Kabupaten Gorontalo

    Science.gov (United States)

    Roviana, D.; Tajuddin, A.; Edi, S.

    2017-03-01

    Indonesia's mining potential is abundant, ranging from Sabang to Merauke. Kabupaten Gorontalo is one of many places in Indonesia that have different types of minerals and natural resources, which can be found in every district. This abundant mining potential must be balanced with good management and easy access to information for investors. The current issues are: (1) data and information about potential mining areas are still presented manually (maps captured from satellite images are printed and attached to an information board in the office), which makes the information difficult to obtain; (2) the high cost of map printing; (3) the difficulty the regency leader (bupati) has in obtaining information for strategic decision making about mining potential. The goal of this research is to build a Geographical Information System model that provides data management of potential mines, so that investors can easily get information according to their needs. To achieve that goal, the Research and Development method is used. The result of this research is a Geographical Information System model implemented in an application that presents mine data management.

  19. Metal speciation of historic and new copper mine tailings from Repparfjorden, Northern Norway, before and after acid, base and electrodialytic extraction

    DEFF Research Database (Denmark)

    Pedersen, Kristine B.; Jensen, Pernille Erland; Ottosen, Lisbeth M.

    2017-01-01

    In Kvalsund, Northern Norway, a permit for submarine mine tailings disposal in Repparfjorden was recently issued for a copper mine with expected operation from 2019. A copper mine was active in the same area in the 1970s and also deposited mine tailings in the fjord. Investigations of the metal... tailings. Substantial desorption (>40%) for both historic and new mine tailings occurred at pH values below 3 and above 12. These results combined with metal speciation, showing that the binding of Cu in the sediment changes around pH values 3 and 10, indicate potential for extraction of more Cu from... the new mine tailings. Electrodialysis, based on applying an electric field of low intensity to extract metals from polluted soils/sediments, was designed for acidic and alkaline extraction, and in both cases more Cu was extracted than in the pure acid/base extractions, while maintaining low mobilisation...

  20. Summary of fish and wildlife information needs to surface mine coal in the United States. Part 3. A handbook for meeting fish and wildlife information needs to surface mine coal: OSM Region V. Final report

    Energy Technology Data Exchange (ETDEWEB)

    Hinkle, C.R.; Ambrose, R.E.; Wenzel, C.R.

    1981-02-01

    This report contains information to assist in protecting, enhancing, and reducing impacts to fish and wildlife resources during surface mining of coal. It gives information on the premining, mining, reclamation and compliance phases of surface mining. This volume is specifically for the states of Washington, Idaho, Montana, North Dakota, South Dakota, Wyoming, Oregon, California, Nevada, Utah, Colorado, Arizona and New Mexico.

  1. The Agent of extracting Internet Information with Lead Order

    Science.gov (United States)

    Mo, Zan; Huang, Chuliang; Liu, Aijun

    To carry out e-commerce better, advanced technologies for accessing business information are urgently needed. An agent is described that deals with the problems of extracting internet information caused by the non-standard and skimble-scamble structure of Chinese websites. The agent includes three modules, each of which handles one stage of the information extraction process. An HTTP tree method and a Lead algorithm are proposed to generate a lead order, with which the required web pages can be retrieved easily. How to structure the extracted information using natural language is also discussed.

  2. Data mining

    CERN Document Server

    Gorunescu, Florin

    2011-01-01

    The knowledge discovery process is as old as Homo sapiens. Until some time ago, this process was solely based on the 'natural personal' computer provided by Mother Nature. Fortunately, in recent decades the problem has begun to be solved based on the development of the Data mining technology, aided by the huge computational power of the 'artificial' computers. Digging intelligently in different large databases, data mining aims to extract implicit, previously unknown and potentially useful information from data, since 'knowledge is power'. The goal of this book is to provide, in a friendly way

  3. A COMPARATIVE ANALYSIS OF WEB INFORMATION EXTRACTION TECHNIQUES DEEP LEARNING vs. NAÏVE BAYES vs. BACK PROPAGATION NEURAL NETWORKS IN WEB DOCUMENT EXTRACTION

    Directory of Open Access Journals (Sweden)

    J. Sharmila

    2016-01-01

    Web mining research is becoming more important because a great deal of information is managed through the web. Web usage is growing in an uncontrolled way. A dedicated framework is required to handle such a large amount of information in the web space. Web mining is divided into three major categories: web content mining, web usage mining and web structure mining. Tak-Lam Wong has proposed a web content mining methodology with the aid of Bayesian Networks (BN). Their methodology focuses on separating web data and discovering attributes using the Bayesian approach. Inspired by their investigation, we propose a web content mining methodology based on a Deep Learning Algorithm. The Deep Learning Algorithm is preferred over BN because BN does not involve any learning architecture design of the kind used in the proposed system. The main objective of this investigation is web document extraction using different classification algorithms and their analysis. This work extracts the data from the web URL. This work presents three classification algorithms: the Deep Learning Algorithm, the Bayesian Algorithm and the BPNN Algorithm. Deep Learning is a powerful set of techniques for learning in neural networks, applied in areas such as computer vision, speech recognition, natural language processing and biometrics. Deep Learning is a simple classification technique that is used in a subset of a broad field, and it requires less time for classification. Naive Bayes classifiers are a family of simple probabilistic classifiers based on applying Bayes' theorem with strong independence assumptions between the features. The BPNN algorithm is then used for classification. Initially, the training and testing datasets contain many URLs. We then extract the content from the dataset. The

  4. Development and application of a Chinese webpage suicide information mining system (sims).

    Science.gov (United States)

    Chen, Penglai; Chai, Jing; Zhang, Lu; Wang, Debin

    2014-11-01

    This study aims at designing and piloting a convenient Chinese webpage suicide information mining system (SIMS) to help search and filter required data from the internet and discover potential features and trends of suicide. SIMS utilizes Microsoft Visual Studio 2008, SQL 2008 and C# as development tools. It collects webpage data via popular search engines; cleans the data using trained models plus minimal manual help; translates the cleaned texts into quantitative data through models and supervised fuzzy recognition; and analyzes and visualizes related variables by self-programmed algorithms. The SIMS developed comprises such functions as suicide news and blog collection, data filtering, cleaning, extraction and translation, and data analysis and presentation. SIMS-mediated mining of one year of webpages revealed that: peak months and hours of web-reported suicide events were June-July and 10-11 am respectively, and the lowest months and hours were September-October and 1-7 am; suicide reports came mostly from Sohu, Tencent, Sina etc.; male suicide victims outnumbered female victims in most sub-regions except southwest China; homes, public places and rented houses were the top three places to commit suicide; poisoning, vein cutting and jumping from buildings were the most commonly used methods to commit suicide; love disputes, family disputes and mental diseases were the leading causes. SIMS provides a preliminary and supplementary means for monitoring and understanding suicide. It proposes useful aspects as well as tools for analyzing the features and trends of suicide using data derived from Chinese webpages. Yet given the intrinsic "dual nature" of internet-based suicide information and the tremendous difficulties experienced by ourselves and other researchers, there is still a long way to go for us to expand, refine and evaluate the system.

  5. Cause Information Extraction from Financial Articles Concerning Business Performance

    Science.gov (United States)

    Sakai, Hiroyuki; Masuyama, Shigeru

    We propose a method of extracting cause information from Japanese financial articles concerning business performance. Our method acquires cause information, e.g. “zidousya no uriage ga koutyou” (sales of cars were good). Cause information is useful for investors in selecting companies to invest in. Our method extracts cause information in the form of causal expressions by using statistical information and initial clue expressions automatically. Our method can extract causal expressions without predetermined patterns or complex rules given by hand, and is expected to be applicable to other tasks for acquiring phrases that have a particular meaning, not limited to cause information. We compared our method with our previous one, originally proposed for extracting phrases concerning traffic accident causes, and experimental results showed that our new method outperforms our previous one.

  6. Partnership in mining

    Energy Technology Data Exchange (ETDEWEB)

    Haslam, R

    1988-04-01

    This paper discusses the benefits resulting from mutual cooperation and information exchange between the UK and USA coal industries. The aim of this cooperation is to promote safe and efficient extraction and profitable use of coal. Advanced mining technologies and mechanisation of the coal mines are some of the results of research cooperation between British Coal and the US Bureau of Mines. In addition, Britain has studied and put into good use the management styles, working practices and pay structure, and mining engineering adopted in the USA.

  7. Fundus Image Features Extraction for Exudate Mining in Coordination with Content Based Image Retrieval: A Study

    Science.gov (United States)

    Gururaj, C.; Jayadevappa, D.; Tunga, Satish

    2018-06-01

    The medical field has seen phenomenal improvement over recent years. The invention of computers, with corresponding increases in processing and internet speed, has changed the face of medical technology. However, there is still scope for improvement of the technologies in use today. One of the many such technologies of medical aid is the detection of afflictions of the eye. Although a large body of research has been accomplished in this field, most of it fails to address how to take the detection forward to a stage where it will be beneficial to society at large. An automated system that can predict the current medical condition of a patient after taking the fundus image of his eye is yet to see the light of day. Such a system is explored in this paper by summarizing a number of techniques for fundus image features extraction, predominantly hard exudate mining, coupled with Content Based Image Retrieval to develop an automation tool. Knowledge of these techniques would bring about worthwhile changes in the domain of exudate extraction of the eye. This is essential in cases where patients may not have access to the best of technologies. This paper attempts a comprehensive summary of the techniques for Content Based Image Retrieval (CBIR) and fundus image features extraction, along with a few choice methods of each, and explores ways to combine the two so that the result is beneficial to all.

  8. Fundus Image Features Extraction for Exudate Mining in Coordination with Content Based Image Retrieval: A Study

    Science.gov (United States)

    Gururaj, C.; Jayadevappa, D.; Tunga, Satish

    2018-02-01

    The medical field has seen phenomenal improvement over recent years. The invention of computers, with corresponding increases in processing and internet speed, has changed the face of medical technology. However, there is still scope for improvement of the technologies in use today. One of the many such technologies of medical aid is the detection of afflictions of the eye. Although a large body of research has been accomplished in this field, most of it fails to address how to take the detection forward to a stage where it will be beneficial to society at large. An automated system that can predict the current medical condition of a patient after taking the fundus image of his eye is yet to see the light of day. Such a system is explored in this paper by summarizing a number of techniques for fundus image features extraction, predominantly hard exudate mining, coupled with Content Based Image Retrieval to develop an automation tool. Knowledge of these techniques would bring about worthwhile changes in the domain of exudate extraction of the eye. This is essential in cases where patients may not have access to the best of technologies. This paper attempts a comprehensive summary of the techniques for Content Based Image Retrieval (CBIR) and fundus image features extraction, along with a few choice methods of each, and explores ways to combine the two so that the result is beneficial to all.

  9. Challenges in service mining : record, check, discover

    NARCIS (Netherlands)

    Aalst, van der W.M.P.; Daniel, F.; Dolog, P.; Li, Q.

    2013-01-01

    Process mining aims to discover, monitor and improve real processes by extracting knowledge from event logs abundantly available in today’s information systems. Although process mining has been applied in hundreds of organizations and process mining techniques have been embedded in a variety of

  10. Recurrent process mining with live event data

    NARCIS (Netherlands)

    Syamsiyah, A.; van Dongen, B.F.; van der Aalst, W.M.P.; Teniente, E.; Weidlich, M.

    2018-01-01

    In organizations, process mining activities are typically performed in a recurrent fashion, e.g. once a week, an event log is extracted from the information systems and a process mining tool is used to analyze the process’ characteristics. Typically, process mining tools import the data from a

  11. ADA Title I allegations and the Mining, Quarrying, and Oil/Gas Extraction industry.

    Science.gov (United States)

    Van Wieren, Todd A; Rhoades, Laura; McMahon, Brian T

    2017-01-01

    The majority of research about employment discrimination in the U.S. Mining, Quarrying, and Oil/Gas Extraction (MQOGE) industries has concentrated on gender and race, while little attention has focused on disability. The aim of this study was to explore allegations of Americans with Disabilities Act (ADA) Title I discrimination made to the Equal Employment Opportunity Commission (EEOC) by individuals with disabilities against MQOGE employers. Key data available to this study included demographic characteristics of charging parties, size of employers, types of allegations, and case outcomes. Using descriptive analysis, allegation profiles were developed for MQOGE's three main sectors (i.e., Oil/Gas Extraction, Mining except Oil/Gas, and Support Activities). These three profiles were then comparatively analyzed. Lastly, regression analysis explored whether some of the available data could partially predict MQOGE case outcomes. The predominant characteristics of MQOGE allegations were found to be quite similar to the allegation profile of U.S. private-sector industry as a whole, and fairly representative of MQOGE's workforce demographics. Significant differences between MQOGE's three main sector profiles were noted on some important characteristics. Lastly, it was found that MQOGE case outcomes could be partially predicted via some of the available variables. The study's limitations were presented and recommendations were offered for further research.

  12. Safety and environmental aspect uranium mining and extraction in Kalan, Kalimantan

    International Nuclear Information System (INIS)

    Mudiar Masdja; Tampubolon, P.; Sihombing, W.

    1996-01-01

    Safety in uranium mining and extraction in Kalan, Kalimantan, as part of Batan's activities, has been observed with regard to personnel safety, monitoring of the work place and radiation surveillance. Personnel safety includes procurement of personal protective equipment, work clothes, and washing facilities. Monitoring of the work place covers climate (temperature, humidity), noise frequency, poisonous gases, and tailings management. Radiation surveillance measures Rn gas and radioactive dust. Environmental assessment of the Kalan site consists of physical, biological and cultural environments. The physical assessment covers major areas such as water and air quality, morphology and climatology. The biological assessment examines flora, fauna and aquatic biota. The cultural assessment collects data on human population and distribution, occupation and income level, education, health and public perception. Guidelines for environmental management and monitoring have been documented and are in place at the Kalan site. (author). 8 refs; 3 tabs; 3 figs

  13. Can we replace curation with information extraction software?

    Science.gov (United States)

    Karp, Peter D

    2016-01-01

    Can we use programs for automated or semi-automated information extraction from scientific texts as practical alternatives to professional curation? I show that error rates of current information extraction programs are too high to replace professional curation today. Furthermore, current information extraction programs extract single narrow slivers of information, such as individual protein interactions; they cannot extract the large breadth of information extracted by professional curators for databases such as EcoCyc. They also cannot arbitrate among conflicting statements in the literature as curators can. Therefore, funding agencies should not hobble the curation efforts of existing databases on the assumption that a problem that has stymied Artificial Intelligence researchers for more than 60 years will be solved tomorrow. Semi-automated extraction techniques appear to have significantly more potential based on a review of recent tools that enhance curator productivity. But a full cost-benefit analysis for these tools is lacking. Without such analysis it is possible to expend significant effort developing information-extraction tools that automate small parts of the overall curation workflow without achieving a significant decrease in curation costs. © The Author(s) 2016. Published by Oxford University Press.

  14. When process mining meets bioinformatics

    NARCIS (Netherlands)

    Jagadeesh Chandra Bose, R.P.; Aalst, van der W.M.P.; Nurcan, S.

    2011-01-01

    Process mining techniques can be used to extract non-trivial process related knowledge and thus generate interesting insights from event logs. Similarly, bioinformatics aims at increasing the understanding of biological processes through the analysis of information associated with biological

  15. Safety Psychology Applicating on Coal Mine Safety Management Based on Information System

    Science.gov (United States)

    Hou, Baoyue; Chen, Fei

    In recent years, with the increasing intensity of coal mining, major accidents have happened frequently, mostly due to human factors, and unsafe human behavior is driven by an unsafe state of mind. To reduce accidents and improve safety management, we apply safety psychology to analyse the causes of unsafe psychological factors in terms of human perception, personality development, motivation and incentives, reward and punishment mechanisms, and safety-related mental training, and put forward countermeasures to promote safe coal mine production and to provide information for coal mines to improve the level of safety management.

  16. Uranium in situ leach mining in the United States. Information circular

    International Nuclear Information System (INIS)

    Larson, W.C.

    1978-01-01

    This report gives an overview of uranium in situ leach mining in the United States, with the purpose of acquainting the reader with this emerging mining technology. This report is not a technical discussion of the subject matter, but rather should be used as a reference source for information on in situ leaching. An in situ leaching bibliography is included as well as engineering data tables for almost all of the active pilot-scale and commercial uranium in situ leaching operators. These tables represent a first attempt at consolidating operational data in one source, on a regional scale. Additional information is given which discusses the current Bureau of Mines uranium in situ leaching research program. Also included is a listing of various State and Federal permitting agencies, and a summary of the current uranium in situ leaching operators. Finally, a glossary of terms has been added, listing some of the more common terms used in uranium in situ leach mining.

  17. Integrating Information Extraction Agents into a Tourism Recommender System

    Science.gov (United States)

    Esparcia, Sergio; Sánchez-Anguix, Víctor; Argente, Estefanía; García-Fornes, Ana; Julián, Vicente

    Recommender systems face some problems. On the one hand, information needs to be kept up to date, which can be a costly task if it is not performed automatically. On the other hand, it may be worthwhile to include third party services in the recommendation since they improve its quality. In this paper, we present an add-on for the Social-Net Tourism Recommender System that uses information extraction and natural language processing techniques to automatically extract and classify information from the Web. Its goal is to keep the system up to date and obtain information about third party services that are not offered by service providers inside the system.

  18. Appraisal of Hydrologic Information Needed in Anticipation of Lignite Mining in Lauderdale County, Tennessee

    Science.gov (United States)

    Parks, William Scott

    1981-01-01

    Lignite in western Tennessee occurs as lenses or beds at various stratigraphic horizons in the Coastal Plain sediments of Late Cretaceous and Tertiary age. The occurrence of this lignite has been known for many decades, but not until the energy crisis was it considered an important energy resource. In recent years, several energy companies have conducted extensive exploration programs in western Tennessee, and tremendous reserves of lignite have been found. From available information, Lauderdale County was selected as one of the counties where strip-mining of lignite will most likely occur. Lignite in this county occurs in the Jackson and Cockfield Formations, undivided, of Tertiary age. The hydrology of the county is known only from regional studies and the collection of some site-specific data. Therefore, in anticipation of the future mining of lignite, a plan is needed for obtaining hydrologic and geologic information to adequately define the hydrologic system before mining begins and to monitor the effects of strip-mining once it is begun. For this planning effort, available hydrologic, geologic, land use, and associated data were located and compiled; a summary description of the surface and shallow subsurface hydrologic system was prepared; the need for additional baseline hydrologic information was outlined; and plans to monitor the effects of strip-mining were proposed. This planning approach, although limited to a county area, has transferability to other Coastal Plain areas under consideration for strip-mining of lignite.

  19. A method for extracting design rationale knowledge based on Text Mining

    Directory of Open Access Journals (Sweden)

    Liu Jihong

    2017-01-01

    Capturing design rationale (DR) knowledge and presenting it to designers in a good form is of great significance for design reuse and design innovation. Since design rationale research began to develop in the 1970s, many teams have developed their own design rationale systems. However, existing DR acquisition systems are not intelligent enough and still require designers to perform many operations. In addition, existing design documents contain a large amount of DR knowledge that has not been well mined. Therefore, a method and system are needed to better extract DR knowledge from design documents. We have proposed a DRKH (design rationale knowledge hierarchy) model for DR representation. The DRKH model has three layers: the design intent layer, the design decision layer and the design basis layer. In this paper, we use a text mining method to extract DR from design documents and construct the DR model. Finally, a welding robot design specification is taken as an example to demonstrate the system interface.
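
    The three DRKH layers named in the abstract can be pictured as a small nested data structure. This is a minimal sketch only: the class names, the top-down nesting (intent, then decision, then basis) and the sample content are assumptions made for illustration and are not taken from the paper.

```python
from dataclasses import dataclass, field
from typing import List

@dataclass
class DesignBasis:
    """Design basis layer: evidence or constraint supporting a decision."""
    text: str

@dataclass
class DesignDecision:
    """Design decision layer: a choice made to satisfy an intent."""
    text: str
    bases: List[DesignBasis] = field(default_factory=list)

@dataclass
class DesignIntent:
    """Design intent layer: what the designer wants to achieve."""
    text: str
    decisions: List[DesignDecision] = field(default_factory=list)

def flatten(intent):
    """Yield (layer, text) pairs in top-down order."""
    yield ("intent", intent.text)
    for decision in intent.decisions:
        yield ("decision", decision.text)
        for basis in decision.bases:
            yield ("basis", basis.text)

# Made-up content for illustration only; not taken from the paper.
intent = DesignIntent(
    "Keep weld seam position error small",
    [DesignDecision("Use a six-axis arm",
                    [DesignBasis("Seam geometry is three-dimensional")])],
)
rows = list(flatten(intent))
for layer, text in rows:
    print(f"{layer}: {text}")
```

    A text mining step would then only need to classify extracted sentences into one of the three layers and attach them to a parent node.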

  20. A summary of fish and wildlife information needs to surface mine coal in the United States. Part 3. A handbook for meeting fish and wildlife information needs to surface mine coal: OSM Region III. Final report

    Energy Technology Data Exchange (ETDEWEB)

    Hinkle, C.R.; Ambrose, R.E.; Wenzel, C.R.

    1981-02-01

    The report contains information to assist in protecting, enhancing, and reducing impacts to fish and wildlife resources during surface mining of coal. It gives information on the premining, mining, reclamation and compliance phases of surface mining. Methods and sources to obtain information to satisfy state and Federal regulations are presented. Considerable emphasis is placed on postmining assistance. This volume is specifically for the states of Minnesota, Wisconsin, Michigan, Illinois, Indiana and Ohio.

  1. Summary of fish and wildlife information needs to surface mine coal in the United States. Part 3. A handbook for meeting fish and wildlife information needs to surface mine coal: OSM Region IV. Final report

    Energy Technology Data Exchange (ETDEWEB)

    Hinkle, C.R.; Ambrose, R.E.; Wenzel, C.R.

    1981-02-01

    The report contains information to assist in protecting, enhancing, and reducing impacts to fish and wildlife resources during surface mining of coal. It gives information on the premining, mining, reclamation and compliance phases of surface mining. Methods and sources to obtain information to satisfy state and Federal regulations are presented. This volume is specifically for the states of Nebraska, Iowa, Kansas, Missouri, Oklahoma, Arkansas, Texas and Louisiana.

  2. Summary of fish and wildlife information needs to surface mine coal in the United States. Part 3. A handbook for meeting fish and wildlife information needs to surface mine coal: OSM Region I. Final report

    Energy Technology Data Exchange (ETDEWEB)

    Hinkle, C.R.; Ambrose, R.E.; Wenzel, C.R.

    1981-02-01

    The report contains information to assist in protecting, enhancing, and reducing impacts to fish and wildlife resources during surface mining of coal. It gives information on the premining, mining, reclamation and compliance phases of surface mining. Methods and sources to obtain information to satisfy state and Federal regulations are presented. This volume is specifically for the states of Maine, Vermont, New Hampshire, Massachusetts, Connecticut, New York, Rhode Island, Pennsylvania, New Jersey, Delaware, Maryland, West Virginia and Virginia.

  3. Summary of fish and wildlife information needs to surface mine coal in the United States. Part 3. A handbook for meeting fish and wildlife information needs to surface mine coal: OSM Region II. Final report

    Energy Technology Data Exchange (ETDEWEB)

    Hinkle, C.R.; Ambrose, R.E.; Wenzel, C.R.

    1981-02-01

    The report contains information to assist in protecting, enhancing, and reducing impacts to fish and wildlife resources during surface mining of coal. It gives information on the premining, mining, reclamation and compliance phases of surface mining. Methods and sources to obtain information to satisfy state and Federal regulations are presented. This volume is specifically for the states of Kentucky, Tennessee, North Carolina, South Carolina, Georgia, Alabama, Mississippi and Florida.

  4. Adverse Event extraction from Structured Product Labels using the Event-based Text-mining of Health Electronic Records (ETHER) system.

    Science.gov (United States)

    Pandey, Abhishek; Kreimeyer, Kory; Foster, Matthew; Botsis, Taxiarchis; Dang, Oanh; Ly, Thomas; Wang, Wei; Forshee, Richard

    2018-01-01

    Structured Product Labels follow an XML-based document markup standard approved by the Health Level Seven organization and adopted by the US Food and Drug Administration as a mechanism for exchanging medical products information. Their current organization makes their secondary use rather challenging. We used the Side Effect Resource database and DailyMed to generate a comparison dataset of 1159 Structured Product Labels. We processed the Adverse Reaction section of these Structured Product Labels with the Event-based Text-mining of Health Electronic Records system and evaluated its ability to extract and encode Adverse Event terms to Medical Dictionary for Regulatory Activities Preferred Terms. A small sample of 100 labels was then selected for further analysis. On these 100 labels, the Event-based Text-mining of Health Electronic Records system achieved a precision and recall of 81 percent and 92 percent, respectively. This study demonstrated the ability of the Event-based Text-mining of Health Electronic Records system to extract and encode Adverse Event terms from Structured Product Labels, which may potentially support multiple pharmacoepidemiological tasks.
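
    As a hedged illustration of how the precision and recall figures reported above relate to a single summary score, the sketch below computes the F1 score (the harmonic mean of precision and recall). The F1 value is not reported in the abstract; it is derived here only from the two published percentages, and the function name is an assumption made for this example.

```python
def f1_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall; 0.0 when both are zero."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Precision and recall as reported in the abstract for the 100-label sample.
precision, recall = 0.81, 0.92

# Derived value, not a figure taken from the record itself.
f1 = f1_score(precision, recall)
print(f"F1 = {f1:.3f}")
```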

  5. Applying a text mining framework to the extraction of numerical parameters from scientific literature in the biotechnology domain

    Directory of Open Access Journals (Sweden)

    André SANTOS

    2012-07-01

    Scientific publications are the main vehicle to disseminate information in the field of biotechnology for wastewater treatment. Indeed, the new research paradigms and the application of high-throughput technologies have increased the rate of publication considerably. The problem is that manual curation becomes harder, error-prone and time-consuming, leading to a probable loss of information and inefficient knowledge acquisition. As a result, research outputs are hardly reaching engineers, hampering the calibration of mathematical models used to optimize the stability and performance of biotechnological systems. In this context, we have developed a data curation workflow, based on text mining techniques, to extract numerical parameters from scientific literature, and applied it to the biotechnology domain. A workflow was built to process wastewater-related articles with the main goal of identifying physico-chemical parameters mentioned in the text. This work describes the implementation of the workflow, identifies achievements and current limitations in the overall process, and presents the results obtained for a corpus of 50 full-text documents.

  6. Applying a text mining framework to the extraction of numerical parameters from scientific literature in the biotechnology domain

    Directory of Open Access Journals (Sweden)

    Anália LOURENÇO

    2013-07-01

    Scientific publications are the main vehicle to disseminate information in the field of biotechnology for wastewater treatment. Indeed, the new research paradigms and the application of high-throughput technologies have increased the rate of publication considerably. The problem is that manual curation becomes harder, error-prone and time-consuming, leading to a probable loss of information and inefficient knowledge acquisition. As a result, research outputs are hardly reaching engineers, hampering the calibration of mathematical models used to optimize the stability and performance of biotechnological systems. In this context, we have developed a data curation workflow, based on text mining techniques, to extract numerical parameters from scientific literature, and applied it to the biotechnology domain. A workflow was built to process wastewater-related articles with the main goal of identifying physico-chemical parameters mentioned in the text. This work describes the implementation of the workflow, identifies achievements and current limitations in the overall process, and presents the results obtained for a corpus of 50 full-text documents.

  7. Fine-grained information extraction from German transthoracic echocardiography reports.

    Science.gov (United States)

    Toepfer, Martin; Corovic, Hamo; Fette, Georg; Klügl, Peter; Störk, Stefan; Puppe, Frank

    2015-11-12

    Information extraction techniques that get structured representations out of unstructured data make a large amount of clinically relevant information about patients accessible for semantic applications. These methods typically rely on standardized terminologies that guide this process. Many languages and clinical domains, however, lack appropriate resources and tools, as well as evaluations of their applications, especially if detailed conceptualizations of the domain are required. For instance, German transthoracic echocardiography reports have not been targeted sufficiently before, despite their importance for clinical trials. This work therefore aimed at the development and evaluation of an information extraction component with a fine-grained terminology that enables recognition of almost all relevant information stated in German transthoracic echocardiography reports at the University Hospital of Würzburg. A domain expert validated and iteratively refined an automatically inferred base terminology. The terminology was used by an ontology-driven information extraction system that outputs attribute value pairs. The final component has been mapped to the central elements of a standardized terminology, and it has been evaluated on documents with different layouts. The final system achieved state-of-the-art precision (micro average .996) and recall (micro average .961) on 100 test documents that represent more than 90 % of all reports. In particular, principal aspects as defined in a standardized external terminology were recognized with F1 = .989 (micro average) and F1 = .963 (macro average). As a result of keyword matching and restrained concept extraction, the system also obtained high precision on unstructured or exceptionally short documents, and on documents with uncommon layout. The developed terminology and the proposed information extraction system allow extraction of fine-grained information from German semi-structured transthoracic echocardiography reports.
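
    The abstract above reports both micro-averaged and macro-averaged scores. A minimal sketch of the difference follows, assuming per-attribute counts of true positives, false positives and false negatives. The attribute names and counts are hypothetical and are not taken from the study. Micro averaging pools the counts before scoring, so frequent attributes dominate; macro averaging scores each attribute separately and then takes the unweighted mean.

```python
def prf(tp: int, fp: int, fn: int) -> tuple[float, float, float]:
    """Precision, recall and F1 from raw counts (0.0 where undefined)."""
    p = tp / (tp + fp) if tp + fp else 0.0
    r = tp / (tp + fn) if tp + fn else 0.0
    f = 2 * p * r / (p + r) if p + r else 0.0
    return p, r, f


# Hypothetical counts per attribute: (true positives, false positives, false negatives).
counts = {"attribute_a": (90, 2, 3), "attribute_b": (8, 1, 2)}


def micro_f1(per_class: dict) -> float:
    """Pool all counts first, then compute a single F1."""
    tp = sum(c[0] for c in per_class.values())
    fp = sum(c[1] for c in per_class.values())
    fn = sum(c[2] for c in per_class.values())
    return prf(tp, fp, fn)[2]


def macro_f1(per_class: dict) -> float:
    """Compute F1 per class, then take the unweighted mean."""
    return sum(prf(*c)[2] for c in per_class.values()) / len(per_class)


print(f"micro F1 = {micro_f1(counts):.3f}, macro F1 = {macro_f1(counts):.3f}")
```

    With these made-up counts the rare attribute scores lower, so the macro average falls below the micro average, which is the same direction as the gap reported in the abstract.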

  8. Extraction of Information of Audio-Visual Contents

    Directory of Open Access Journals (Sweden)

    Carlos Aguilar

    2011-10-01

    In this article we show how it is possible to use Channel Theory (Barwise and Seligman, 1997) for modeling the process of information extraction realized by audiences of audio-visual contents. To do this, we rely on the concepts proposed by Channel Theory and, especially, its treatment of representational systems. We then show how the information that an agent is capable of extracting from the content depends on the number of channels he is able to establish between the content and the set of classifications he is able to discriminate. The agent can attempt the extraction of information through these channels from the totality of content; however, we discuss the advantages of extracting from its constituents in order to obtain a greater number of informational items that represent it. After showing how the extraction process is carried out for each channel, we propose a method of representation of all the informative values an agent can obtain from a content using a matrix constituted by the channels the agent is able to establish on the content (source classifications) and the ones he can understand as individual (destination classifications). We finally show how this representation allows reflecting the evolution of the informative items through the evolution of audio-visual content.

  9. Text mining of cancer-related information: review of current status and future directions.

    Science.gov (United States)

    Spasić, Irena; Livsey, Jacqueline; Keane, John A; Nenadić, Goran

    2014-09-01

    This paper reviews the research literature on text mining (TM) with the aim to find out (1) which cancer domains have been the subject of TM efforts, (2) which knowledge resources can support TM of cancer-related information and (3) to what extent systems that rely on knowledge and computational methods can convert text data into useful clinical information. These questions were used to determine the current state of the art in this particular strand of TM and suggest future directions in TM development to support cancer research. A review of the research on TM of cancer-related information was carried out. A literature search was conducted on the Medline database as well as IEEE Xplore and ACM digital libraries to address the interdisciplinary nature of such research. The search results were supplemented with the literature identified through Google Scholar. A range of studies have proven the feasibility of TM for extracting structured information from clinical narratives such as those found in pathology or radiology reports. In this article, we provide a critical overview of the current state of the art for TM related to cancer. The review highlighted a strong bias towards symbolic methods, e.g. named entity recognition (NER) based on dictionary lookup and information extraction (IE) relying on pattern matching. The F-measure of NER ranges between 80% and 90%, while that of IE for simple tasks is in the high 90s. To further improve the performance, TM approaches need to deal effectively with idiosyncrasies of the clinical sublanguage such as non-standard abbreviations as well as a high degree of spelling and grammatical errors. This requires a shift from rule-based methods to machine learning following the success of similar trends in biological applications of TM. Machine learning approaches require large training datasets, but clinical narratives are not readily available for TM research due to privacy and confidentiality concerns. This issue remains the main

  10. Semantic Information Extraction of Lanes Based on Onboard Camera Videos

    Science.gov (United States)

    Tang, L.; Deng, T.; Ren, C.

    2018-04-01

    In the field of autonomous driving, semantic information of lanes is very important. This paper proposes a method of automatic detection of lanes and extraction of semantic information from onboard camera videos. The proposed method first detects the edges of lanes by the grayscale gradient direction and fits them with an improved Probabilistic Hough transform; it then uses the vanishing point principle to calculate the lane geometrical position, and uses lane characteristics to extract lane semantic information by decision tree classification. In the experiment, 216 road video images captured by a camera mounted onboard a moving vehicle were used to detect lanes and extract lane semantic information. The results show that the proposed method can accurately identify lane semantics from video images.
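
    The abstract does not say how the vanishing point is computed, so the following is only a sketch of one common approach, not the authors' method: two fitted lane-edge segments are converted to homogeneous line coordinates, and their intersection is taken as the vanishing point. The segment coordinates and function names are hypothetical.

```python
def cross(a, b):
    """Cross product of two 3-vectors given as tuples."""
    return (
        a[1] * b[2] - a[2] * b[1],
        a[2] * b[0] - a[0] * b[2],
        a[0] * b[1] - a[1] * b[0],
    )


def line_through(p, q):
    """Homogeneous line through two image points (x, y)."""
    return cross((p[0], p[1], 1.0), (q[0], q[1], 1.0))


def vanishing_point(seg_a, seg_b):
    """Intersection of the lines through two segments, or None if parallel."""
    x, y, w = cross(line_through(*seg_a), line_through(*seg_b))
    if abs(w) < 1e-9:
        return None  # parallel in the image: no finite intersection
    return (x / w, y / w)


# Hypothetical left and right lane-edge segments in pixel coordinates.
left_edge = ((100.0, 400.0), (300.0, 200.0))
right_edge = ((700.0, 400.0), (500.0, 200.0))

vp = vanishing_point(left_edge, right_edge)
print(vp)
```

    In practice many segments would be fitted and the intersection estimated robustly; two segments are used here only to keep the geometry visible.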

  11. Data Mining – Innovative Method for Obtaining Information in Marketing and Business Management

    Directory of Open Access Journals (Sweden)

    Mirela-Cristina Voicu

    2011-05-01

    The existence of massive amounts of data raised the question of reorienting their use from a retrospective to a prospective operation. Data mining offers the promise of an important aid for discovering hidden patterns in data that can be used to predict the behavior of customers, products and processes. Data mining tools must be guided by users who understand the business, the general nature of the data and the analytical methods involved. Data mining discovers information within the data that queries and reports can’t effectively reveal. It is vital to collect and prepare the data properly, and to check the models against reality. Choosing the most appropriate data mining product means finding a tool with the required capabilities and an interface that matches the skills of its users, and that can be applied to a specific business problem. In this context, the purpose of this paper is to illustrate some of the problems of company activity that can be solved by using data mining techniques.

  12. Undermining the state? Informal mining and trajectories of state formation in Eastern Mindanao, Philippines

    NARCIS (Netherlands)

    Verbrugge, B.L.P.

    2015-01-01

    Building on critical perspectives on the state and the informal economy, this article provides an analysis of the "state of the state" on the eastern Mindanao mineral frontier. In the first instance, the author explains that the massive expansion of informal small-scale gold mining, instead of

  13. The viability of business data mining in the sports environment ...

    African Journals Online (AJOL)

    Data mining can be viewed as the process of extracting previously unknown information from large databases and utilising this information to make crucial business decisions (Simoudis, 1996: 26). This paper considers the viability of using data mining tools and techniques in sports, particularly with regard to mining the ...

  14. Information Mining from Heterogeneous Data Sources: A Case Study on Drought Predictions

    Directory of Open Access Journals (Sweden)

    Getachew B. Demisse

    2017-07-01

    The objective of this study was to develop information mining methodology for drought modeling and predictions using historical records of climate, satellite, environmental, and oceanic data. The classification and regression tree (CART) approach was used for extracting drought episodes at different time-lag prediction intervals. Using the CART approach, a number of successful model trees were constructed, which can easily be interpreted and used by decision makers in their drought management decisions. The regression rules produced by CART were found to have correlation coefficients of 0.71–0.95 in rules-alone modeling. The accuracies of the models were found to be higher in the instance and rules model (0.77–0.96) compared to the rules-alone model. From the experimental analysis, it was concluded that different combinations of the nearest neighbor and committee models significantly increase the performances of CART drought models. For more robust results from the developed methodology, it is recommended that future research focus on selecting relevant attributes for slow-onset drought episode identification and prediction.
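
    To make the CART idea mentioned above concrete, here is a minimal sketch of the core step of a regression tree: choosing the threshold on one predictor that minimizes the summed squared error of the two resulting groups. It is a generic illustration and not the study's model; the predictor and response values are made up, and a full tree would apply this search recursively over many predictors.

```python
def sse(values):
    """Sum of squared deviations from the mean (0.0 for an empty list)."""
    if not values:
        return 0.0
    mean = sum(values) / len(values)
    return sum((v - mean) ** 2 for v in values)


def best_split(xs, ys):
    """Return (threshold, cost) for the split of xs that minimizes SSE of ys."""
    pairs = sorted(zip(xs, ys))
    best = None
    for i in range(1, len(pairs)):
        if pairs[i][0] == pairs[i - 1][0]:
            continue  # no threshold can separate identical predictor values
        threshold = (pairs[i][0] + pairs[i - 1][0]) / 2
        left = [y for _, y in pairs[:i]]
        right = [y for _, y in pairs[i:]]
        cost = sse(left) + sse(right)
        if best is None or cost < best[1]:
            best = (threshold, cost)
    return best


# Hypothetical predictor (e.g. a lagged index) and response values.
xs = [1.0, 2.0, 3.0, 7.0, 8.0, 9.0]
ys = [1.0, 1.1, 0.9, 3.0, 3.2, 2.8]

threshold, cost = best_split(xs, ys)
print(threshold, round(cost, 3))
```

    Each leaf of such a tree reads as an if-then rule on the predictors, which is consistent with the abstract's remark that the resulting trees are easy for decision makers to interpret.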

  15. Concept and Establishment of the Mine Information System within the CROMAC GIP Project

    Directory of Open Access Journals (Sweden)

    Zvonko Biljecki

    2006-12-01

    In order to solve mine problems in the Republic of Croatia, a unique project, CROMAC GIP (Croatian Mine Action Centre Geoinformation Project), has been initiated, significantly increasing the functional quality of the existing Mine Information System (MIS). Since mine problems are closely related to space, geodata are a crucial part of MIS intended for monitoring and planning of demining. From the moment the Croatian Mine Action Centre was founded until today, the process of demining has progressed. The implementation of a topographic database in accordance with the CROTIS data model and the usage of orthophoto data produced according to the official product specifications can be pointed out in that progress. Usage of such geodata requires a sophisticated information system that enables a simultaneous usage of geodata and other data connected with solving mine problems. In order to reach all goals in demining and to use all advantages of geodata, it was indispensable to upgrade the existing Mine Information System by merging geodata and HCR data and to collect new data according to the standardized procedures, while at the same time controlling the quality and the automated procedures of uploading into the system. Apart from being constructed in accordance with the Standard Operative Procedures (SOP), the modernised MIS is also based on generally accepted standards in the field of geoinformation and it is implemented on advanced technology. The core of the system is the Oracle database, and GeoMedia is a WebMap Professional tool on the basis of which the distribution of and the work with spatial data are possible on intranet/Internet. In order to achieve full efficiency of the system, it is necessary to provide high quality and updated geodata. In this respect, photogrammetric data are the most efficient solution.

  16. Knowledge Dictionary for Information Extraction on the Arabic Text Data

    Directory of Open Access Journals (Sweden)

    Wahyu Jauharis Saputra

    2013-04-01

    Information extraction is an early stage of a process of textual data analysis. Information extraction is required to get information from textual data that can be used for process analysis, such as classification and categorization. Textual data is strongly influenced by its language. Arabic is gaining significant attention in many studies because the Arabic language is very different from others, and in contrast to other languages, tools and research on the Arabic language are still lacking. The information extracted using the knowledge dictionary is a concept of expression. A knowledge dictionary is usually constructed manually by an expert, which takes a long time and is specific to one problem only. This paper proposed a method for automatically building a knowledge dictionary. The knowledge dictionary is formed by classifying sentences having the same concept, assuming that they will have a high similarity value. The concept that has been extracted can be used as a feature for subsequent computational processes such as classification or categorization. The dataset used in this paper was an Arabic text dataset. The extraction result was tested using a decision tree classification engine; the highest precision value obtained was 71.0%, while the highest recall value was 75.0%.
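
    The abstract describes grouping sentences that share a concept on the assumption that they have a high similarity value, but it does not name the similarity measure. The sketch below assumes cosine similarity over whitespace-tokenized word counts and a simple greedy grouping; the threshold, the function names and the English sample sentences are all hypothetical stand-ins (the study itself used Arabic text).

```python
import math
from collections import Counter


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two word-count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def group_sentences(sentences, threshold=0.5):
    """Greedily assign each sentence to the first group it resembles."""
    groups = []  # each entry: (vector of the group's first sentence, members)
    for sentence in sentences:
        vec = Counter(sentence.lower().split())
        for representative, members in groups:
            if cosine(vec, representative) >= threshold:
                members.append(sentence)
                break
        else:
            groups.append((vec, [sentence]))
    return [members for _, members in groups]


# Hypothetical sample sentences standing in for the corpus.
sentences = [
    "the price of oil rose",
    "the price of oil fell",
    "the team won the match",
]

groups = group_sentences(sentences)
print(groups)
```

    Each resulting group can then be treated as one dictionary concept and used as a feature for a downstream classifier.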

  17. Ontology-Based Information Extraction for Business Intelligence

    Science.gov (United States)

    Saggion, Horacio; Funk, Adam; Maynard, Diana; Bontcheva, Kalina

    Business Intelligence (BI) requires the acquisition and aggregation of key pieces of knowledge from multiple sources in order to provide valuable information to customers or feed statistical BI models and tools. The massive amount of information available to business analysts makes information extraction and other natural language processing tools key enablers for the acquisition and use of that semantic information. We describe the application of ontology-based extraction and merging in the context of a practical e-business application for the EU MUSING Project where the goal is to gather international company intelligence and country/region information. The results of our experiments so far are very promising and we are now in the process of building a complete end-to-end solution.

  18. Community perspectives of natural resource extraction: coal-seam gas mining and social identity in Eastern Australia

    Directory of Open Access Journals (Sweden)

    David Lloyd

    2013-01-01

    Using a recent case study of community reaction to proposed coal-seam gas mining in eastern Australia, we illustrate the role of community views in issues of natural resource use. Drawing on interviews, observations and workshops, the paper explores the anti-coal-seam gas social movement from its stages of infancy through to being a national debate linking community groups across and beyond Australia. Primary community concerns of inadequate community consultation translate into fears regarding potential impacts on farmland and cumulative impacts on aquifers and future water supply, and questions regarding economic, social and environmental benefits. Many of the community activists had not previously been involved in such social action. A recurring message from affected communities is concern around perceived insufficient research and legislation for such rapid industrial expansion. A common citizen demand is the cessation of the industry until there is better understanding of underground water system interconnectivity and the methane extraction and processing life cycle. Improved scientific knowledge of the industry and its potential impacts will, in the popular view, enable better comparison of power generation efficiency with coal and renewable energy sources and better comprehension of the industry as a transition energy industry. It will also enable elected representatives and policy makers to make more informed decisions while developing appropriate legislation to ensure a sustainable future.

  19. Study of Cu and Pb partitioning in mine tailings using the Tessier sequential extraction scheme

    Energy Technology Data Exchange (ETDEWEB)

    Andrei, Mariana Lucia, E-mail: marianaluciaandrei@yahoo.com [National Institute for Research and Development of Isotopic and Molecular Technologies, 65-103 Donath, 400293 Cluj-Napoca (Romania); Babes-Bolyai University, Environmental Science and Engineering Faculty, 30 Fantanele, 400294, Cluj-Napoca (Romania); Senila, Marin; Hoaghia, Maria Alexandra; Levei, Erika-Andrea [INCDO-INOE 2000, Research Institute for Analytical Instrumentation, 67 Donath, 400293, Cluj-Napoca (Romania); Borodi, Gheorghe [National Institute for Research and Development of Isotopic and Molecular Technologies, 65-103 Donath, 400293 Cluj-Napoca (Romania)

    2015-12-23

    The Cu and Pb partitioning in nonferrous mine tailings was investigated using the Tessier sequential extraction scheme. The contents of Cu and Pb found in the five operationally defined fractions were determined by inductively coupled plasma optical emission spectrometry. The results showed different partitioning patterns for Cu and Pb in the studied tailings. The total Cu and Pb contents were higher in tailings from Brazesti than in those from Saliste, while the Cu contents in the first two fractions considered as mobile were comparable and the content of mobile Pb was the highest in Brazesti tailings. In the tailings from Saliste about 30% of Cu and 3% of Pb were found in exchangeable fraction, while in those from Brazesti no metals were found in the exchangeable fraction, but the percent of Cu and Pb found in the bound to carbonate fraction were high (20% and 26%, respectively). The highest Pb content was found in the residual fraction in Saliste tailings and in bound to Fe and Mn oxides fraction in Brazesti tailings, while the highest Cu content was found in the fraction bound to organic matter in Saliste tailings and in the residual fraction in Brazesti tailings. In case of tailings of Brazesti medium environmental risk was found both for Pb and Cu, while in case of Saliste tailings low risk for Pb and high risk for Cu were found.

  20. A methodology for semiautomatic taxonomy of concepts extraction from nuclear scientific documents using text mining techniques

    International Nuclear Information System (INIS)

    Braga, Fabiane dos Reis

    2013-01-01

    This thesis presents a text mining method for semi-automatic extraction of a taxonomy of concepts from a textual corpus composed of scientific papers related to the nuclear area. Text classification is a natural human practice and a crucial task for working with large repositories. The document clustering technique provides a logical and understandable framework that facilitates organization, browsing and searching. Most clustering algorithms use the bag of words model to represent the content of a document. This model generates a high dimensionality of the data, ignores the fact that different words can have the same meaning and does not consider the relationship between them, assuming that words are independent of each other. The methodology combines a model for document representation by concepts with a hierarchical document clustering method using the frequency of co-occurring concepts and a technique for labeling clusters with their most representative concepts, with the objective of producing a taxonomy of concepts that may reflect a structure of the knowledge domain. It is hoped that this work will contribute to the conceptual mapping of scientific production in the nuclear area and thus support the management of research activities in this area. (author)

  1. Finding occupational accident patterns in the extractive industry using a systematic data mining approach

    International Nuclear Information System (INIS)

    Silva, Joaquim F.; Jacinto, Celeste

    2012-01-01

    This paper deals with occupational accident patterns in the Portuguese Extractive Industry. It constitutes a significant advance over a previous study made in 2008, both in terms of methodology and extended knowledge on the patterns’ details. This work uses more recent data (2005–2007), and this time the identification of the “typical accident” shifts from a bivariate to a multivariate pattern, to characterise the accident mechanisms more accurately. Instead of crossing only two variables (Deviation x Contact), the new methodology developed here uses data mining techniques to associate nine variables, through their categories, and to quantify the statistical cohesion of each pattern. The results confirmed the “typical accident” of the 2008 study, but went much further: they revealed three statistically significant patterns (the top-3 categories in frequency); moreover, each pattern now includes more variables (4–5 categories) and indicates their statistical cohesion. This approach allowed a more accurate vision of the reality, which is fundamental for risk management. The methodology is best suited for large groups, such as national Authorities, Insurers or Corporate Groups, to assist them in planning target-oriented safety strategies. Not least importantly, researchers can apply the same algorithm to other study areas, as it is restricted neither to accidents nor to safety.

  2. Radiological evaluation near three old mines of uranium extraction in the department of Creuse - year 2007

    International Nuclear Information System (INIS)

    2007-01-01

    The observations made at the three sites of 'Chaumaillat, Ribiere and Grands Champs' demonstrate the existence of an atypical radiological situation that appears to be marked by past mining activities. Although the geochemical context can sometimes be the origin of anomalies in sediments and muds, the regional industrial context, combined with the high measured uranium values, leads us to favor a human origin for these anomalies. The presence of almost pure uranium is presumed to result from past on-site ore treatment (leaching) to extract the raw material (yellow cake) used to manufacture nuclear fuel. However, this observation at the 'Grands Champs' site is surprising, given that the operator declared no in situ treatment activity and no residue storage. Given the accessibility of these sites to the public and the cessation of all monitoring, a follow-up study seems necessary to estimate the extent of the radiological anomalies and their persistent impact on the environment. (N.C.)

  3. Study of Cu and Pb partitioning in mine tailings using the Tessier sequential extraction scheme

    International Nuclear Information System (INIS)

    Andrei, Mariana Lucia; Senila, Marin; Hoaghia, Maria Alexandra; Levei, Erika-Andrea; Borodi, Gheorghe

    2015-01-01

    The Cu and Pb partitioning in nonferrous mine tailings was investigated using the Tessier sequential extraction scheme. The contents of Cu and Pb found in the five operationally defined fractions were determined by inductively coupled plasma optical emission spectrometry. The results showed different partitioning patterns for Cu and Pb in the studied tailings. The total Cu and Pb contents were higher in tailings from Brazesti than in those from Saliste, while the Cu contents in the first two fractions considered as mobile were comparable and the content of mobile Pb was the highest in Brazesti tailings. In the tailings from Saliste about 30% of Cu and 3% of Pb were found in exchangeable fraction, while in those from Brazesti no metals were found in the exchangeable fraction, but the percent of Cu and Pb found in the bound to carbonate fraction were high (20% and 26%, respectively). The highest Pb content was found in the residual fraction in Saliste tailings and in bound to Fe and Mn oxides fraction in Brazesti tailings, while the highest Cu content was found in the fraction bound to organic matter in Saliste tailings and in the residual fraction in Brazesti tailings. In case of tailings of Brazesti medium environmental risk was found both for Pb and Cu, while in case of Saliste tailings low risk for Pb and high risk for Cu were found

  4. DEVELOPMENT OF AUTOMATIC EXTRACTION METHOD FOR ROAD UPDATE INFORMATION BASED ON PUBLIC WORK ORDER OUTLOOK

    Science.gov (United States)

    Sekimoto, Yoshihide; Nakajo, Satoru; Minami, Yoshitaka; Yamaguchi, Syohei; Yamada, Harutoshi; Fuse, Takashi

    Recently, the disclosure of statistical data representing the financial effects or burden of public works, through the websites of national and local governments, has enabled discussion of macroscopic financial trends. However, it is still difficult to grasp, nationwide, the basic property of how each location was changed by public works. The purpose of this research is to reasonably collect the road update information that various road managers provide, in order to realize efficient updating of various maps such as car navigation maps. In particular, we develop a system that, combining several web mining technologies, automatically extracts the relevant public works from the public work order outlooks released by each local government and registers summaries, including position information, in a database. Finally, we collect and register several tens of thousands of entries from websites all over Japan, and confirm the feasibility of our method.

  5. Imprinted magnetic graphene oxide for the mini-solid phase extraction of Eu (III) from coal mine area

    Science.gov (United States)

    Patra, Santanu; Roy, Ekta; Madhuri, Rashmi; Sharma, Prashant K.

    2017-05-01

    The present work presents the preparation of imprinted magnetic reduced graphene oxide and its application to the selective removal of Eu (III) from a local coal mine area. A simple solid phase extraction method was used for this purpose. The material shows very high adsorption as well as removal efficiency towards Eu (III), which suggests that the material has the potential to be used in future real-time applications for the removal of Eu (III) from complex matrices.

  6. A summary of fish and wildlife information needs to surface mine coal in the United States. Part 2. The status of state surface mining regulations as of January 1980 and the fish and wildlife information needs. Final report

    Energy Technology Data Exchange (ETDEWEB)

    1980-01-01

    This is part 2 of a three-part series to assist government agencies and private citizens in determining fish and wildlife information needs for new coal mining operations pursuant to the Surface Mining Control and Reclamation Act of 1977. This portion documents the status of individual state surface mining regulations as of January 1980 in those states having significant strippable reserves and/or active strip mining operations. It also provides documentation of fish and wildlife information needs identified in the state regulations for compliance with PL 95-87.

  7. The economic logic of persistent informality: Artisanal and small-scale mining in the Southern Philippines

    NARCIS (Netherlands)

    Verbrugge, B.L.P.

    2015-01-01

    This article critically evaluates existing causal explanations for the persistence of informality in artisanal and small-scale mining (ASM). These explanations share a legalistic focus on entry barriers and political impediments that prevent or discourage the formalization of poverty-driven ASM

  8. 76 FR 27355 - Agency Information Collection Activities; Submission for OMB Review; Comment Request; Mine...

    Science.gov (United States)

    2011-05-11

    ... comprehensive and reliable occupational data available concerning the mining industry. This submission has been... miners. Accident, injury, and illness data, when correlated with employment and production data, provide information that allows the MSHA to improve its safety and health enforcement programs, focus its education...

  9. 75 FR 51488 - Division of Coal Mine Workers' Compensation; Proposed Extension of Information Collection...

    Science.gov (United States)

    2010-08-20

    ... order to carry out its responsibility to administer the Black Lung Benefits Act. Agency: Office of...). SUPPLEMENTARY INFORMATION: I. Background: The Division of Coal Mine Workers' Compensation administers the Black Lung Benefits Act (30 U.S.C. 901 et seq.), which provides benefits to coal miners totally disabled due...

  10. Aspects of transport system management within mining complex using information and telecommunication systems

    Science.gov (United States)

    Semykina, A. S.; Zagorodniy, N. A.; Konev, A. A.; Duganova, E. V.

    2018-05-01

    The paper considers aspects of transport system management within the mining complex. It indicates information and telecommunication systems that are used to increase transportation efficiency. It also describes key advantages and disadvantages. It is found that software products of the Modular Company used in pits allow increasing transport performance, minimizing losses and ensuring efficient transportation of minerals.

  11. 76 FR 14647 - Proposed Information Collection; Comment Request; 2012 Economic Census Covering the Mining Sector

    Science.gov (United States)

    2011-03-17

    ... essential information for government, business and the general public. The 2012 Economic Census covering the... Economic Census Covering the Mining Sector AGENCY: U.S. Census Bureau. ACTION: Notice. SUMMARY: The... provider of timely, relevant and quality data about the people and economy of the United States. Economic...

  12. 75 FR 51487 - Division of Coal Mine Workers' Compensation; Proposed Extension of Information Collection...

    Science.gov (United States)

    2010-08-20

    ... DEPARTMENT OF LABOR Office of Workers' Compensation Programs Division of Coal Mine Workers' Compensation; Proposed Extension of Information Collection; Comment Request ACTION: Notice. SUMMARY: The Department of Labor, as part of its continuing effort to reduce paperwork and respondent burden, conducts a...

  13. Text mining of web-based medical content

    CERN Document Server

    Neustein, Amy

    2014-01-01

    Text Mining of Web-Based Medical Content examines web mining for extracting useful information that can be used for treating and monitoring the healthcare of patients. This work provides methodological approaches to designing mapping tools that exploit data found in social media postings. Specific linguistic features of medical postings are analyzed vis-a-vis available data extraction tools for culling useful information.

  14. Advances in research methods for information systems research data mining, data envelopment analysis, value focused thinking

    CERN Document Server

    Osei-Bryson, Kweku-Muata

    2013-01-01

    Advances in social science research methodologies and data analytic methods are changing the way research in information systems is conducted. New developments in statistical software technologies for data mining (DM) such as regression splines or decision tree induction can be used to assist researchers in systematic post-positivist theory testing and development. Established management science techniques like data envelopment analysis (DEA), and value focused thinking (VFT) can be used in combination with traditional statistical analysis and data mining techniques to more effectively explore

  15. Development of mechanization of extraction in underground coal mining (part I)

    Energy Technology Data Exchange (ETDEWEB)

    Strzeminski, J

    1984-01-01

    The history of underground coal mining and history of mechanizing underground operations of cutting, strata control, mine haulage, hoisting and ventilation are discussed. The following development periods are characterized: until 1769 (date of steam engine invention by J. Watt), from 1769 to 1945 (period of partial mechanization of operations in underground coal mining), from 1945 (period of comprehensive mechanization and automation). A general description of mining in the first development period is given. Evaluation of the second development period concentrates on mechanization in underground coal mining. The following equipment types are described: cutting (pneumatic picks and pneumatic drills, coal saws developed by Eickhoff, coal cutters developed after 1870, cutter loaders patented in 1925-1927, coal plows and coal cutter loaders), mine haulage (mine cars, conveyors developed in the United Kingdom, Germany and Russia, Poland), strata control at working faces (timber props, steel friction props, roof bars), strata control in the goaf (room and pillar mining, stowing, minestone utilization for stowing in Upper Silesia, hydraulic stowing in Upper Silesia). 5 references.

  16. EXTRACT

    DEFF Research Database (Denmark)

    Pafilis, Evangelos; Buttigieg, Pier Luigi; Ferrell, Barbra

    2016-01-01

    The microbial and molecular ecology research communities have made substantial progress on developing standards for annotating samples with environment metadata. However, sample manual annotation is a highly labor intensive process and requires familiarity with the terminologies used. We have the...... and text-mining-assisted curation revealed that EXTRACT speeds up annotation by 15-25% and helps curators to detect terms that would otherwise have been missed.Database URL: https://extract.hcmr.gr/......., organism, tissue and disease terms. The evaluators in the BioCreative V Interactive Annotation Task found the system to be intuitive, useful, well documented and sufficiently accurate to be helpful in spotting relevant text passages and extracting organism and environment terms. Comparison of fully manual...

  17. Optimum detection for extracting maximum information from symmetric qubit sets

    International Nuclear Information System (INIS)

    Mizuno, Jun; Fujiwara, Mikio; Sasaki, Masahide; Akiba, Makoto; Kawanishi, Tetsuya; Barnett, Stephen M.

    2002-01-01

    We demonstrate a class of optimum detection strategies for extracting the maximum information from sets of equiprobable real symmetric qubit states of a single photon. These optimum strategies have been predicted by Sasaki et al. [Phys. Rev. A 59, 3325 (1999)]. The peculiar aspect is that the detections with at least three outputs suffice for optimum extraction of information regardless of the number of signal elements. The cases of ternary (or trine), quinary, and septenary polarization signals are studied where a standard von Neumann detection (a projection onto a binary orthogonal basis) fails to access the maximum information. Our experiments demonstrate that it is possible with present technologies to attain about 96% of the theoretical limit

  18. Optimizing the Information Presentation on Mining Potential by using Web Services Technology with Restful Protocol

    Science.gov (United States)

    Abdillah, T.; Dai, R.; Setiawan, E.

    2018-02-01

    This study aims to develop an application of Web Services technology with the RESTful protocol to optimize the presentation of information on mining potential. The study used a User Interface Design approach for information accuracy and relevance, and Web Services for reliability in presenting the information. The results show that: the accuracy and relevance of the information on mining potential can be seen from the User Interface implementation in the application, which is based on the following rules: the choice of appropriate colours and objects, ease of navigation, and user interaction with the application through symbols and language understood by the users; the accuracy and relevance of the information can also be observed in its presentation using charts and Tool Tip Text, which help users understand the provided chart/figure; and the reliability of the information presentation is evident from the results of Web Services testing in Figure 4.5.6. The study finds that the User Interface Design and Web Services approaches (for access by apps on different platforms) are able to optimize the presentation. The results can be used as a reference by software developers and the Provincial Government of Gorontalo.

  19. Visualization and Integrated Data Mining of Disparate Information

    Energy Technology Data Exchange (ETDEWEB)

    Saffer, Jeffrey D.(OMNIVIZ, INC); Albright, Cory L.(BATTELLE (PACIFIC NW LAB)); Calapristi, Augustin J.(BATTELLE (PACIFIC NW LAB)); Chen, Guang (OMNIVIZ, INC); Crow, Vernon L.(BATTELLE (PACIFIC NW LAB)); Decker, Scott D.(BATTELLE (PACIFIC NW LAB)); Groch, Kevin M.(BATTELLE (PACIFIC NW LAB)); Havre, Susan L.(BATTELLE (PACIFIC NW LAB)); Malard, Joel (BATTELLE (PACIFIC NW LAB)); Martin, Tonya J.(BATTELLE (PACIFIC NW LAB)); Miller, Nancy E.(BATTELLE (PACIFIC NW LAB)); Monroe, Philip J.(OMNIVIZ, INC); Nowell, Lucy T.(BATTELLE (PACIFIC NW LAB)); Payne, Deborah A.(BATTELLE (PACIFIC NW LAB)); Reyes Spindola, Jorge F.(BATTELLE (PACIFIC NW LAB)); Scarberry, Randall E.(OMNIVIZ, INC); Sofia, Heidi J.(BATTELLE (PACIFIC NW LAB)); Stillwell, Lisa C.(OMNIVIZ, INC); Thomas, Gregory S.(BATTELLE (PACIFIC NW LAB)); Thurston, Sarah J.(OMNIVIZ, INC); Williams, Leigh K.(BATTELLE (PACIFIC NW LAB)); Zabriskie, Sean J.(OMNIVIZ, INC); MG Hicks

    2001-05-11

    The volumes and diversity of information in the discovery, development, and business processes within the chemical and life sciences industries require new approaches for analysis. Traditional list- or spreadsheet-based methods are easily overwhelmed by large amounts of data. Furthermore, generating strong hypotheses and, just as importantly, ruling out weak ones, requires integration across different experimental and informational sources. We have developed a framework for this integration, including common conceptual data models for multiple data types and linked visualizations that provide an overview of the entire data set, a measure of how each data record is related to every other record, and an assessment of the associations within the data set.

  20. Extracting Semantic Information from Visual Data: A Survey

    Directory of Open Access Journals (Sweden)

    Qiang Liu

    2016-03-01

    The traditional environment maps built by mobile robots include both metric ones and topological ones. These maps are navigation-oriented and not adequate for service robots to interact with or serve human users who normally rely on the conceptual knowledge or semantic contents of the environment. Therefore, the construction of semantic maps becomes necessary for building an effective human-robot interface for service robots. This paper reviews recent research and development in the field of visual-based semantic mapping. The main focus is placed on how to extract semantic information from visual data in terms of feature extraction, object/place recognition and semantic representation methods.

  1. Rapid automatic keyword extraction for information retrieval and analysis

    Science.gov (United States)

    Rose, Stuart J [Richland, WA]; Cowley, Wendy E [Richland, WA]; Crow, Vernon L [Richland, WA]; Cramer, Nicholas O [Richland, WA]

    2012-03-06

    Methods and systems for rapid automatic keyword extraction for information retrieval and analysis. Embodiments can include parsing words in an individual document by delimiters, stop words, or both in order to identify candidate keywords. Word scores for each word within the candidate keywords are then calculated based on a function of co-occurrence degree, co-occurrence frequency, or both. Based on a function of the word scores for words within the candidate keyword, a keyword score is calculated for each of the candidate keywords. A portion of the candidate keywords are then extracted as keywords based, at least in part, on the candidate keywords having the highest keyword scores.
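
    The steps described above (splitting at delimiters and stop words, scoring words by co-occurrence, summing word scores per candidate) can be sketched as follows. This is a minimal illustration, not the patented implementation: the stop list, the regular expressions, the choice of degree divided by frequency as the word score, and the names `candidate_keywords`, `word_scores`, `extract_keywords` and `top_fraction` are all assumptions made for the example.

```python
import re
from collections import defaultdict

# Tiny illustrative stop list (an assumption; real systems use a much longer one).
STOP_WORDS = {"a", "an", "and", "are", "as", "at", "by", "for", "from",
              "in", "is", "of", "on", "or", "the", "to", "with"}


def candidate_keywords(text):
    """Split text at delimiters and stop words into candidate keyword phrases."""
    candidates = []
    # Punctuation and line breaks act as phrase delimiters.
    for fragment in re.split(r"[.,;:!?()\[\]\n]+", text.lower()):
        phrase = []
        for word in re.findall(r"[a-z0-9][a-z0-9\-]*", fragment):
            if word in STOP_WORDS:
                # A stop word closes the current candidate phrase.
                if phrase:
                    candidates.append(tuple(phrase))
                phrase = []
            else:
                phrase.append(word)
        if phrase:
            candidates.append(tuple(phrase))
    return candidates


def word_scores(candidates):
    """Score each word as co-occurrence degree divided by frequency (one possible choice)."""
    freq = defaultdict(int)
    degree = defaultdict(int)
    for phrase in candidates:
        for word in phrase:
            freq[word] += 1
            # Degree counts every word in the same candidate, including the word itself.
            degree[word] += len(phrase)
    return {word: degree[word] / freq[word] for word in freq}


def extract_keywords(text, top_fraction=1 / 3):
    """Return the highest-scoring candidates as (keyword, score) pairs."""
    candidates = candidate_keywords(text)
    scores = word_scores(candidates)
    keyword_scores = {" ".join(p): sum(scores[w] for w in p) for p in candidates}
    ranked = sorted(keyword_scores.items(), key=lambda kv: (-kv[1], kv[0]))
    keep = max(1, int(len(ranked) * top_fraction))
    return ranked[:keep]
```

    Calling `extract_keywords(text, top_fraction=1.0)` returns every candidate ranked by score; a smaller fraction keeps only the top portion, mirroring the "portion of the candidate keywords" wording in the abstract.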

  2. The enhanced mine communications and information systems. The development of the Nexsys realtime risk management system

    Energy Technology Data Exchange (ETDEWEB)

    Haustein, K.; Rowan, G. [CSIRO Exploration and Mining (Australia)]

    2007-03-15

    The article describes two safety projects under way between JCOAL in Japan and CSIRO (Australia) which are concluding in March 2007. The first was to develop a real-time roof fall monitoring and warning system for underground coal mines. The system consisted of extensometers, stress meters and a seismic monitoring system. It was installed at the Ulan colliery in New South Wales. The output of the system is a set of probabilities of a roof fall happening within various periods of time. The three instruments have colour-coded warning lights. The second project, the enhanced mine communications and information systems for real-time risk analysis project, collects and analyses data from diverse sources with the Nexsys™ hardware and software system. It is now installed in two mines in Australia and one in Japan. The system is described in detail in the article. 2 refs., 6 figs.

  3. Information and communication technology and climate change adaptation: Evidence from selected mining companies in South Africa

    Directory of Open Access Journals (Sweden)

    Bartholomew I. Aleke

    2016-04-01

    The mining sector is a significant contributor to the gross domestic product of many global economies. Given the increasing trends in climate-induced disasters and the growing desire to find lasting solutions, information and communication technology (ICT) has been introduced into the climate change adaptation mix. Climate change-induced extreme weather events such as flooding, drought, excessive fog, and cyclones have compounded the environmental challenges faced by the mining sector. This article presents the adoption of ICT innovation as part of the adaptation strategies towards reducing the mining sector’s vulnerability and exposure to climate change disaster risks. Document analysis and systematic literature review were adopted as the methodology. Findings from the study reflect how ICT intervention orchestrated changes in communication patterns which are tailored towards the reduction in climate change vulnerability and exposure. The research concludes with a proposition that ICT intervention must be part of the bigger and ongoing climate change adaptation agenda in the mining sector. Keywords: ICT; climate change; disaster risk reduction; mining; adaptation; South Africa

  4. Robust Vehicle and Traffic Information Extraction for Highway Surveillance

    Directory of Open Access Journals (Sweden)

    Yeh Chia-Hung

    2005-01-01

    A robust vision-based traffic monitoring system for vehicle and traffic information extraction is developed in this research. It is challenging to maintain detection robustness at all times for a highway surveillance system. There are three major problems in detecting and tracking a vehicle: (1) the moving cast shadow effect, (2) the occlusion effect, and (3) nighttime detection. For moving cast shadow elimination, a 2D joint vehicle-shadow model is employed. For occlusion detection, a multiple-camera system is used to detect occlusion so as to extract the exact location of each vehicle. For vehicle nighttime detection, a rear-view monitoring technique is proposed to maintain tracking and detection accuracy. Furthermore, we propose a method to improve the accuracy of background extraction, which usually serves as the first step in any vehicle detection processing. Experimental results are given to demonstrate that the proposed techniques are effective and efficient for vision-based highway surveillance.

  5. PaperBLAST: Text Mining Papers for Information about Homologs.

    Science.gov (United States)

    Price, Morgan N; Arkin, Adam P

    2017-01-01

    Large-scale genome sequencing has identified millions of protein-coding genes whose function is unknown. Many of these proteins are similar to characterized proteins from other organisms, but much of this information is missing from annotation databases and is hidden in the scientific literature. To make this information accessible, PaperBLAST uses EuropePMC to search the full text of scientific articles for references to genes. PaperBLAST also takes advantage of curated resources (Swiss-Prot, GeneRIF, and EcoCyc) that link protein sequences to scientific articles. PaperBLAST's database includes over 700,000 scientific articles that mention over 400,000 different proteins. Given a protein of interest, PaperBLAST quickly finds similar proteins that are discussed in the literature and presents snippets of text from relevant articles or from the curators. PaperBLAST is available at http://papers.genomics.lbl.gov/. IMPORTANCE With the recent explosion of genome sequencing data, there are now millions of uncharacterized proteins. If a scientist becomes interested in one of these proteins, it can be very difficult to find information as to its likely function. Often a protein whose sequence is similar, and which is likely to have a similar function, has been studied already, but this information is not available in any database. To help find articles about similar proteins, PaperBLAST searches the full text of scientific articles for protein identifiers or gene identifiers, and it links these articles to protein sequences. Then, given a protein of interest, it can quickly find similar proteins in its database by using standard software (BLAST), and it can show snippets of text from relevant papers. We hope that PaperBLAST will make it easier for biologists to predict proteins' functions.

  6. PaperBLAST: Text Mining Papers for Information about Homologs

    International Nuclear Information System (INIS)

    Price, Morgan N.; Arkin, Adam P.

    2017-01-01

    Large-scale genome sequencing has identified millions of protein-coding genes whose function is unknown. Many of these proteins are similar to characterized proteins from other organisms, but much of this information is missing from annotation databases and is hidden in the scientific literature. To make this information accessible, PaperBLAST uses EuropePMC to search the full text of scientific articles for references to genes. PaperBLAST also takes advantage of curated resources (Swiss-Prot, GeneRIF, and EcoCyc) that link protein sequences to scientific articles. PaperBLAST’s database includes over 700,000 scientific articles that mention over 400,000 different proteins. Given a protein of interest, PaperBLAST quickly finds similar proteins that are discussed in the literature and presents snippets of text from relevant articles or from the curators. With the recent explosion of genome sequencing data, there are now millions of uncharacterized proteins. If a scientist becomes interested in one of these proteins, it can be very difficult to find information as to its likely function. Often a protein whose sequence is similar, and which is likely to have a similar function, has been studied already, but this information is not available in any database. To help find articles about similar proteins, PaperBLAST searches the full text of scientific articles for protein identifiers or gene identifiers, and it links these articles to protein sequences. Then, given a protein of interest, it can quickly find similar proteins in its database by using standard software (BLAST), and it can show snippets of text from relevant papers. We hope that PaperBLAST will make it easier for biologists to predict proteins’ functions.

  7. Mining Contextual Information for Ephemeral Digital Video Preservation

    OpenAIRE

    Shah, Chirag

    2009-01-01

    For centuries the archival community has understood and practiced the art of adding contextual information while preserving an artifact. The question now is how these practices can be transferred to the digital domain. With the growing expansion of production and consumption of digital objects (documents, audio, video, etc.) it has become essential to identify and study issues related to their representation. A curator in the digital realm may be said to have the same responsibilities as on...

  8. Advanced applications of natural language processing for performing information extraction

    CERN Document Server

    Rodrigues, Mário

    2015-01-01

    This book explains how to create information extraction (IE) applications that are able to tap the vast amount of relevant information available in natural language sources: Internet pages, official documents such as laws and regulations, books and newspapers, and the social web. Readers are introduced to the problem of IE and its current challenges and limitations, supported with examples. The book discusses the need to fill the gap between documents, data, and people, and provides a broad overview of the technology supporting IE. The authors present a generic architecture for developing systems that are able to learn how to extract relevant information from natural language documents, and illustrate how to implement working systems using state-of-the-art and freely available software tools. The book also discusses concrete applications illustrating IE uses. Provides an overview of state-of-the-art technology in information extraction (IE), discussing achievements and limitations for t...

  9. Extraction of Eu (III) in monazite from soils containing amang collected from Kampung Gajah ex-mining area

    International Nuclear Information System (INIS)

    Zaini Hamzah; Nor Monica Ahmad; Ahmad Saat

    2011-01-01

    Malaysia was once a major tin exporting country. One of the by-products of tin-mining activities is tin tailing, known as amang, which is very rich in rare earth elements, especially the lanthanides, which are present as a mixture of phosphate minerals, mainly as ilmenite, xenotime and monazite. In this study, Kg Gajah in the Kinta Valley, in the State of Perak, was chosen as the study area, since it used to be the largest mining area in the 60s and 70s. The soil samples were separated using a wet separation technique followed by magnetic separation. The monazite was then digested using a mixture of HF/HNO3 acids. The digested sample was extracted for its cerium content. The extraction behaviour of cerium in those samples was investigated as a function of Cyanex 302 concentration in the diluent and the time taken to reach equilibrium. The extractant bis(2,4,4-trimethylpentyl)-mono-thio phosphinic acid (Cyanex 302) in n-heptane was used throughout the analysis. The aqueous phase from the extraction was analyzed spectrometrically using Arsenazo (III), while the organic phase was subjected to a rotavapour followed by analysis by FTIR. The aim of this study is to find the best concentration of Cyanex 302 for extracting as much europium as possible and to confirm the transfer of Eu (III) to Cyanex 302 as the extractant. Results from UV/VIS show that 0.7 M is the best concentration of Cyanex 302 for Eu (III) extraction from the samples. Results from FTIR confirmed that the structure of Cyanex 302 has been replaced by Ce (IV). (author)

  10. Mercury Speciation in Contaminated Soils from Old Mining Activities in Mexico Using a Chemical Selective Extraction

    OpenAIRE

    Gavilán-García, Irma; Santos-Santos, Elvira; Tovar-Gálvez, Luis R.; Gavilán-García, Arturo; Suárez, Sara; Olmos, Jesús

    2008-01-01

    Amalgamation has been heavily used in mining since 1557 in the Spanish colonies. In Mexico and other parts of Latin America, this process generated tailings which were left aside in the mine backyards. In the valley of Zacatecas, tailings were carried out of the mines by run-off from the mountains and contaminated most of the Zacatecan Valley, whose most important economic activity is agriculture (crop and livestock raising). The main concern in this area is the high level of total mercury fou...

  11. Data-Throughput Enhancement Using Data Mining-Informed Cognitive Radio

    Directory of Open Access Journals (Sweden)

    Khashayar Kotobi

    2015-03-01

    We propose the data mining-informed cognitive radio, which uses non-traditional data sources and data-mining techniques for decision making and improving the performance of a wireless network. To date, the application of information other than wireless channel data in cognitive radios has not been significantly studied. We use a novel dataset (Twitter traffic) as an indicator of network load in a wireless channel. Using this dataset, we present and test a series of predictive algorithms that show an improvement in wireless channel utilization over traditional collision-detection algorithms. Our results demonstrate the viability of using these novel datasets to inform and create more efficient cognitive radio networks.

  12. A summary of fish and wildlife information needs to surface mine coal in the United States. Part 1. Fish and wildlife information needs in the federal surface mining permanent regulations. Final report

    Energy Technology Data Exchange (ETDEWEB)

    1980-01-01

    This is part 1 of a three-part series to assist government agencies and private citizens in determining fish and wildlife information needs for new coal mining operations pursuant to the Surface Mining Control and Reclamation Act of 1977. Part 2 will document the status of individual state surface mining regulations as of January 1980 in those states having significant strippable reserves and/or active strip mining operations. It will also provide documentation of fish and wildlife information needs identified in the state regulations of compliance to PL 95-87. Part 3 will be a discussion of the information needed to develop the Fish and Wildlife Plan identified in the Permanent Regulations. The objective of this three-part series is to include consideration of fish and wildlife resources in the surface mining process.

  13. Information technology and data mining for spent fuel treatment

    International Nuclear Information System (INIS)

    Vilim, R. B.

    2000-01-01

    Information technology is being used to provide interactive access to data collected from the electro-metallurgical treatment of spent fuel. The data are results from many hundreds of experiments performed to better characterize the processes by which uranium is separated from the waste products. Web-based display and relational database query capabilities facilitate the identification of trends in the data and the relating of these trends to the underlying electrochemistry. The objectives are to ensure that the process behavior is well understood, to make readily accessible the necessary data for development and validation of models, and to identify unexpected trends in the data as indications of phenomena not yet represented in the models

  14. KID - an algorithm for fast and efficient text mining used to automatically generate a database containing kinetic information of enzymes

    Directory of Open Access Journals (Sweden)

    Schomburg Dietmar

    2010-07-01

    Background: The amount of available biological information is rapidly increasing and the focus of biological research has moved from single components to networks and even larger projects aiming at the analysis, modelling and simulation of biological networks as well as large scale comparison of cellular properties. It is therefore essential that biological knowledge is easily accessible. However, most information is contained in the written literature in an unstructured way, so that methods for the systematic extraction of knowledge directly from the primary literature have to be deployed. Description: Here we present a text mining algorithm for the extraction of kinetic information such as KM, Ki, kcat etc. as well as associated information such as enzyme names, EC numbers, ligands, organisms, localisations, pH and temperatures. Using this rule- and dictionary-based approach, it was possible to extract 514,394 kinetic parameters of 13 categories (KM, Ki, kcat, kcat/KM, Vmax, IC50, S0.5, Kd, Ka, t1/2, pI, nH, specific activity, Vmax/KM) from about 17 million PubMed abstracts and combine them with other data in the abstract. A manual verification of approx. 1,000 randomly chosen results yielded a recall between 51% and 84% and a precision ranging from 55% to 96%, depending on the category searched. The results were stored in a database and are available as "KID the KInetic Database" via the internet. Conclusions: The presented algorithm delivers a considerable amount of information and therefore may help to accelerate the research and the automated analysis required for today's systems biology approaches. The database obtained by analysing PubMed abstracts may be a valuable help in the field of chemical and biological kinetics. It is completely based upon text mining and therefore complements manually curated databases. The database is available at http://kid.tu-bs.de.
The source code of the algorithm is provided under the GNU General Public
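
    To give a feel for what a rule-based extraction of kinetic parameters and its manual verification might involve, here is a small sketch. It is not the KID algorithm: the regular expression, the handful of parameter names and units it covers, and the helpers `extract_parameters` and `precision_recall` are hypothetical and chosen only for illustration.

```python
import re

# Hypothetical rule: a parameter name, a linking word or "=", a number, then a unit.
# Longer names (kcat/KM) come first so they are not cut short by shorter ones (kcat).
PATTERN = re.compile(
    r"\b(?P<param>kcat/K[Mm]|Vmax|kcat|K[Mm]|Ki|Kd|IC50)"
    r"\s*(?:value\s+)?(?:of|was|is|=)\s*"
    r"(?P<value>\d+(?:\.\d+)?)\s*"
    r"(?P<unit>[munpµ]?M|s-1|min-1)(?![A-Za-z0-9])"
)


def extract_parameters(sentence):
    """Return (parameter, value, unit) triples found by the illustrative rule."""
    return [
        (m.group("param"), float(m.group("value")), m.group("unit"))
        for m in PATTERN.finditer(sentence)
    ]


def precision_recall(predicted, gold):
    """Precision = correct / extracted; recall = correct / expected."""
    correct = len(set(predicted) & set(gold))
    precision = correct / len(predicted) if predicted else 0.0
    recall = correct / len(gold) if gold else 0.0
    return precision, recall
```

    For example, `extract_parameters("The enzyme showed a Km of 0.45 mM and a kcat of 12 s-1 at pH 7.5.")` picks up the Km and kcat values while ignoring the pH. Comparing such output against a hand-checked list with `precision_recall` mirrors, in miniature, the manual verification reported in the abstract.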

  15. Effect of high-extraction coal mining on surface and ground waters

    International Nuclear Information System (INIS)

    Kendorski, F.S.

    1993-01-01

    Since first quantified around 1979, much new data have become available. In examining the sources of data and the methods and intents of the researchers of over 65 case histories, it became apparent that the strata behaviors were being confused with overlapping vertical extents reported for the fractured zones and aquiclude zones depending on whether the researcher was interested in water intrusion into the mine or in water loss from surface or ground waters. These more recent data, and critical examination of existing data, have led to the realization that the former Aquiclude Zone defined for its ability to prevent or minimize the intrusion of ground or surface waters into mines has another important character in increasing storage of surface and shallow ground waters in response to mining with no permanent loss of waters. This zone is here named the Dilated Zone. Surface and ground waters can drain into this zone, but seldom into the mine, and can eventually be recovered through closing of dilations by mine subsidence progression away from the area, or filling of the additional void space created, or both. A revised model has been developed which accommodates the available data, by modifying the zones as follows: collapse and disaggregation extending 6 to 10 times the mined thickness above the panel; continuous fracturing extending approximately 24 times the mined thickness above the panel, allowing temporary drainage of intersected surface and ground waters; development of a zone of dilated, increased storativity, and leaky strata with little enhanced vertical permeability from 24 to 60 times the mined thickness above the panel above the continuous fracturing zone, and below the constrained or surface effects zones; maintenance of a constrained but leaky zone above the dilated zone and below the surface effects zone; and limited surface fracturing in areas of extension extending up to 50 ft or so beneath the ground surface. 119 ref., 5 figs., 2 tabs
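
    The zone limits quoted above are simple multiples of the mined thickness, so they can be tabulated for a given panel. The sketch below is an illustration only: the function name `classify_height`, the single collapse multiple (the abstract gives a range of 6 to 10), treating "50 ft or so" as exactly 50 ft, and the order in which overlapping zones are resolved are all assumptions.

```python
def classify_height(height_ft, mined_thickness_ft, cover_ft,
                    collapse_multiple=10.0, surface_zone_ft=50.0):
    """Name the zone at a given height above the panel (all lengths in feet).

    collapse_multiple: assumed value within the 6-10 range given in the abstract.
    surface_zone_ft: assumed depth of surface fracturing ("up to 50 ft or so").
    """
    if not 0 <= height_ft <= cover_ft:
        raise ValueError("height must lie between the panel and the ground surface")
    t = mined_thickness_ft
    if height_ft <= collapse_multiple * t:
        return "collapse"                 # collapse and disaggregation
    if height_ft <= 24 * t:
        return "continuous fracturing"    # temporary drainage of intersected waters
    if height_ft <= 60 * t:
        return "dilated"                  # increased storativity, leaky strata
    if height_ft >= cover_ft - surface_zone_ft:
        return "surface effects"          # limited surface fracturing
    return "constrained"                  # between dilated and surface zones
```

    With a 6 ft mined thickness under 600 ft of cover, for instance, the limits fall at 60 ft (collapse), 144 ft (continuous fracturing) and 360 ft (dilated), with surface effects in the top 50 ft. Where cover is shallow the zones would overlap; this sketch simply lets the lower zones take precedence.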

  16. Mining Contextual Information for Ephemeral Digital Video Preservation

    Directory of Open Access Journals (Sweden)

    Chirag Shah

    2009-06-01

    Full Text Available For centuries the archival community has understood and practiced the art of adding contextual information while preserving an artifact. The question now is how these practices can be transferred to the digital domain. With the growing expansion of production and consumption of digital objects (documents, audio, video, etc.), it has become essential to identify and study issues related to their representation. A curator in the digital realm may be said to have the same responsibilities as one in a traditional archival domain. However, with the mass production and spread of digital objects, it may be difficult to do all the work manually. In the present article this problem is considered in the area of digital video preservation. We show how this problem can be formulated and propose a framework for capturing contextual information for ephemeral digital video preservation. This proposal is realized in a system called ContextMiner, which allows us to cater to a digital curator's needs with its four components: digital video curation, collection visualization, browsing interfaces, and video harvesting and monitoring. While the issues and systems described here are geared toward digital videos, they can easily be applied to other kinds of digital objects.

  17. Disposal and improvement of contaminated by waste extraction of copper mining in Chile

    Science.gov (United States)

    Naranjo Lamilla, Pedro; Blanco Fernández, David; Díaz González, Marcos; Robles Castillo, Marcelo; Decinti Weiss, Alejandra; Tapia Alvarez, Carolina; Pardo Fabregat, Francisco; Vidal, Manuel Miguel Jordan; Bech, Jaume; Roca, Nuria

    2016-04-01

    This project originated from the needs of a company that mines and processes copper ore. High-purity copper is produced, with an annual output of 1,113,928 tons of concentrate at a grade of 32%. The company has generated several illegal landfills and has been required by the government to build an Industrial Solid Waste (ISW) management center. The forecast volume of waste generated is 20,000 tons/year. Chemical analysis established that the studied soil has a high copper content, either of natural origin or from the spread of contaminants from mining activities. Moreover, in some sectors, soil contamination by mercury, hydrocarbons, and oils and fats was detected, likely associated with the accumulation of waste. The waters are also affected by mining operations, specifically by copper, molybdenum, manganese and sulfates, and have an acidic pH. The ISW management center addresses the pollution of soil and water by concentrating all activities in a technically suitable place. The center provides the necessary guidelines for the treatment and disposal of soil contaminated by uncontrolled landfills, and includes a leachate collection system and a network for monitoring the physicochemical quality of water and soil. Keywords: Industrial solid waste, soil contamination, Mining waste

  18. Scholarly Information Extraction Is Going to Make a Quantum Leap with PubMed Central (PMC).

    Science.gov (United States)

    Matthies, Franz; Hahn, Udo

    2017-01-01

    With the increasing availability of complete full texts (journal articles), rather than their surrogates (titles, abstracts), as resources for text analytics, entirely new opportunities arise for information extraction and text mining from scholarly publications. Yet, we gathered evidence that a range of problems are encountered in full-text processing when biomedical text analytics simply reuse existing NLP pipelines which were developed on the basis of abstracts (rather than full texts). We conducted experiments with four different relation extraction engines, all of which were top performers in previous BioNLP Event Extraction Challenges. We found that abstract-trained engines lose up to 6.6% F-score points when run on full-text data. Hence, the reuse of existing abstract-based NLP software in a full-text scenario is considered harmful because of heavy performance losses. Given the current lack of annotated full-text resources to train on, our study quantifies the price paid for this shortcut.

  19. Improving information extraction using a probability-based approach

    DEFF Research Database (Denmark)

    Kim, S.; Ahmed, Saeema; Wallace, K.

    2007-01-01

    Information plays a crucial role during the entire life-cycle of a product. It has been shown that engineers frequently consult colleagues to obtain the information they require to solve problems. However, the industrial world is now more transient and key personnel move to other companies...... or retire. It is becoming essential to retrieve vital information from archived product documents, if it is available. There is, therefore, great interest in ways of extracting relevant and sharable information from documents. A keyword-based search is commonly used, but studies have shown...... the recall, while maintaining the high precision, a learning approach that makes identification decisions based on a probability model, rather than simply looking up the presence of the pre-defined variations, looks promising. This paper presents the results of developing such a probability-based entity...

  20. Web Mining

    Science.gov (United States)

    Fürnkranz, Johannes

    The World-Wide Web provides every internet citizen with access to an abundance of information, but it becomes increasingly difficult to identify the relevant pieces of information. Research in web mining tries to address this problem by applying techniques from data mining and machine learning to Web data and documents. This chapter provides a brief overview of web mining techniques and research areas, most notably hypertext classification, wrapper induction, recommender systems and web usage mining.

  1. TCMGeneDIT: a database for associated traditional Chinese medicine, gene and disease information using text mining

    Directory of Open Access Journals (Sweden)

    Chen Hsin-Hsi

    2008-10-01

    Full Text Available Abstract. Background: Traditional Chinese Medicine (TCM), a complementary and alternative medical system in Western countries, has been used to treat various diseases over thousands of years in East Asian countries. In recent years, many herbal medicines were found to exhibit a variety of effects through regulating a wide range of gene expressions or protein activities. As available TCM data continue to accumulate rapidly, there is an urgent need to explore these resources systematically, so as to effectively utilize the large volume of literature. Methods: TCM, gene, disease, biological pathway and protein-protein interaction information were collected from public databases. For association discovery, the TCM names, gene names, disease names, TCM ingredients and effects were used to annotate the literature corpus obtained from PubMed. The concept to mine entity associations was based on hypothesis testing and collocation analysis. The annotated corpus was processed with natural language processing tools, and rule-based approaches were applied to the sentences for extracting the relations between TCM effecters and effects. Results: We developed a database, TCMGeneDIT, to provide association information about TCMs, genes, diseases, TCM effects and TCM ingredients mined from a vast amount of biomedical literature. Integrated protein-protein interaction and biological pathway information is also available for exploring the regulations of genes associated with TCM curative effects. In addition, the transitive relationships among genes, TCMs and diseases could be inferred through the shared intermediates. Furthermore, TCMGeneDIT is useful in understanding the possible therapeutic mechanisms of TCMs via gene regulations and deducing synergistic or antagonistic contributions of the prescription components to the overall therapeutic effects. The database is now available at http://tcm.lifescience.ntu.edu.tw/. Conclusion: TCMGeneDIT is a unique database

  2. A Wireless LAN and Voice Information System for Underground Coal Mine

    Directory of Open Access Journals (Sweden)

    Yu Zhang

    2014-06-01

    Full Text Available In this paper we constructed a wireless information system, and developed a wireless voice communication subsystem based on Wireless Local Area Networks (WLAN) for underground coal mine, which employs Voice over IP (VoIP) technology and Session Initiation Protocol (SIP) to achieve wireless voice dispatching communications. The master control voice dispatching interface and call terminal software are also developed on the WLAN ground server side to manage and implement the voice dispatching communication. A testing system for voice communication was constructed in tunnels of an underground coal mine, which was used to actually test the wireless voice communication subsystem via a network analysis tool, named Clear Sight Analyzer. In tests, the actual flow charts of registration, call establishment and call removal were analyzed by capturing call signaling of SIP terminals, and the key performance indicators were evaluated in coal mine, including average subjective value of voice quality, packet loss rate, delay jitter, disorder packet transmission and end-to-end delay. Experimental results and analysis demonstrate that the wireless voice communication subsystem developed communicates well in underground coal mine environment, achieving the designed function of voice dispatching communication.

  3. Transliteration normalization for Information Extraction and Machine Translation

    Directory of Open Access Journals (Sweden)

    Yuval Marton

    2014-12-01

    Full Text Available Foreign name transliterations typically include multiple spelling variants. These variants cause data sparseness and inconsistency problems, increase the Out-of-Vocabulary (OOV) rate, and present challenges for Machine Translation, Information Extraction and other natural language processing (NLP) tasks. This work aims to identify and cluster name spelling variants using a Statistical Machine Translation method: word alignment. The variants are identified by being aligned to the same “pivot” name in another language (the source language in Machine Translation settings). Based on word-to-word translation and transliteration probabilities, as well as the string edit distance metric, names with similar spellings in the target language are clustered and then normalized to a canonical form. With this approach, tens of thousands of high-precision name transliteration spelling variants are extracted from sentence-aligned bilingual corpora in Arabic and English (in both languages). When these normalized name spelling variants are applied to Information Extraction tasks, improvements over strong baseline systems are observed. When applied to Machine Translation tasks, a large improvement potential is shown.
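
The clustering step described in this abstract can be illustrated with a simplified sketch. It assumes the word alignment has already produced (pivot, spelling) pairs, and it uses only string edit distance and frequency; the translation and transliteration probabilities the abstract mentions are left out. The function names, the `max_ratio` threshold, and the sample data are hypothetical.

```python
from collections import Counter


def edit_distance(a, b):
    """Levenshtein distance between two strings (dynamic programming)."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        cur = [i]
        for j, cb in enumerate(b, 1):
            cur.append(min(prev[j] + 1,                 # deletion
                           cur[j - 1] + 1,              # insertion
                           prev[j - 1] + (ca != cb)))   # substitution
        prev = cur
    return prev[-1]


def normalize_variants(aligned_pairs, max_ratio=0.34):
    """Map each spelling to a canonical form among those sharing a pivot.

    aligned_pairs: iterable of (pivot_id, target_spelling) tuples, assumed to
    come from a word aligner. The most frequent spelling per pivot is taken
    as canonical; spellings too far from it are left unmapped.
    """
    by_pivot = {}
    for pivot, name in aligned_pairs:
        by_pivot.setdefault(pivot, Counter())[name] += 1

    mapping = {}
    for counts in by_pivot.values():
        canonical = counts.most_common(1)[0][0]
        for name in counts:
            dist = edit_distance(name.lower(), canonical.lower())
            if dist / max(len(name), len(canonical)) <= max_ratio:
                mapping[name] = canonical
    return mapping


if __name__ == "__main__":
    pairs = [("P1", "Mohammed"), ("P1", "Mohammed"), ("P1", "Muhammad"),
             ("P1", "Mohamed"), ("P1", "Ahmed")]
    print(normalize_variants(pairs))
```

In the sample data, "Ahmed" shares the pivot only through a (hypothetical) alignment error, and the edit-distance filter keeps it out of the cluster; this is the role the abstract assigns to the string metric alongside the alignment evidence.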

  4. The Development of Financial Information System and Business Intelligence Using Data Mining Concepts

    OpenAIRE

    PVD PRASAD

    2014-01-01

    Finance is an emerging area of application, becoming more amenable to data-driven modeling as large sets of financial data become widely available. We therefore apply data mining techniques to a financial information system with Business Intelligence. A Business Intelligence System (BIS) can be described as an interactive, computer-based system designed to help decision-makers to solve unstructured problems. Using a combination of models, analytical techniques, and...

  5. A Wireless LAN and Voice Information System for Underground Coal Mine

    OpenAIRE

    Yu Zhang; Wei Yang; Dongsheng Han; Young-Il Kim

    2014-01-01

    In this paper we constructed a wireless information system, and developed a wireless voice communication subsystem based on Wireless Local Area Networks (WLAN) for underground coal mine, which employs Voice over IP (VoIP) technology and Session Initiation Protocol (SIP) to achieve wireless voice dispatching communications. The master control voice dispatching interface and call terminal software are also developed on the WLAN ground server side to manage and implement the voice dispatching co...

  6. A sequential approach to control gas for the extraction of multi-gassy coal seams from traditional gas well drainage to mining-induced stress relief

    International Nuclear Information System (INIS)

    Kong, Shengli; Cheng, Yuanping; Ren, Ting; Liu, Hongyong

    2014-01-01

    Highlights: • The gas reservoir characteristics are measured and analyzed. • A sequential approach to control gas of multi-gassy coal seams is proposed. • The design of gas drainage wells has been improved. • The utilization ways of different concentrations of gas production are shown. - Abstract: As coal resources become exhausted in shallow mines, mining operations will inevitably progress from shallow depth to deep and gassy seams due to increased demands for more coal products. However, during the extraction process of deeper and gassier coal seams, new challenges to current gas control methods have emerged; these include the conflict between coal mine safety and economic benefits, the difficulties in reservoir improvement, as well as the imbalance between pre-gas drainage, roadway development and coal mining. To solve these problems, a sequential approach is introduced in this paper. Three fundamental principles are proposed: the mining-induced stress relief effect of the first-mined coalbed should be sufficient to improve the permeability of the others; the coal resource of the first-mined seams must be abundant to guarantee the economic benefits; the arrangement of the vertical wells must fit the underground mining panel. Tunlan coal mine is taken as a typical example to demonstrate the effectiveness of this approach. The approach of integrating surface coalbed methane (CBM) exploitation with underground gas control technologies brings three major benefits: the improvement of underground coal mining safety, the implementation of CBM extraction, and the reduction of greenhouse gas emissions. This practice could be used as a valuable example for other coal mines having similar geological conditions.

  7. Systematic Review of Data Mining Applications in Patient-Centered Mobile-Based Information Systems.

    Science.gov (United States)

    Fallah, Mina; Niakan Kalhori, Sharareh R

    2017-10-01

    Smartphones represent a promising technology for patient-centered healthcare. It is claimed that data mining techniques have improved mobile apps to address patients' needs at subgroup and individual levels. This study reviewed the current literature regarding data mining applications in patient-centered mobile-based information systems. We systematically searched PubMed, Scopus, and Web of Science for original studies reported from 2014 to 2016. After screening 226 records at the title/abstract level, the full texts of 92 relevant papers were retrieved and checked against inclusion criteria. Finally, 30 papers were included in this study and reviewed. Data mining techniques have been reported in development of mobile health apps for three main purposes: data analysis for follow-up and monitoring, early diagnosis and detection for screening purpose, classification/prediction of outcomes, and risk calculation (n = 27); data collection (n = 3); and provision of recommendations (n = 2). The most accurate and frequently applied data mining method was support vector machine; however, decision tree has shown superior performance to enhance mobile apps applied for patients' self-management. Embedded data-mining-based feature in mobile apps, such as case detection, prediction/classification, risk estimation, or collection of patient data, particularly during self-management, would save, apply, and analyze patient data during and after care. More intelligent methods, such as artificial neural networks, fuzzy logic, and genetic algorithms, and even the hybrid methods may result in more patients-centered recommendations, providing education, guidance, alerts, and awareness of personalized output.

  8. Evolving spectral transformations for multitemporal information extraction using evolutionary computation

    Science.gov (United States)

    Momm, Henrique; Easson, Greg

    2011-01-01

    Remote sensing plays an important role in assessing temporal changes in land features. The challenge often resides in the conversion of large quantities of raw data into actionable information in a timely and cost-effective fashion. To address this issue, research was undertaken to develop an innovative methodology integrating biologically-inspired algorithms with standard image classification algorithms to improve information extraction from multitemporal imagery. Genetic programming was used as the optimization engine to evolve feature-specific candidate solutions in the form of nonlinear mathematical expressions of the image spectral channels (spectral indices). The temporal generalization capability of the proposed system was evaluated by addressing the task of building rooftop identification from a set of images acquired at different dates in a cross-validation approach. The proposed system generates robust solutions (kappa values > 0.75 for stage 1 and > 0.4 for stage 2) despite the statistical differences between the scenes caused by land use and land cover changes coupled with variable environmental conditions, and the lack of radiometric calibration between images. Based on our results, the use of nonlinear spectral indices enhanced the spectral differences between features improving the clustering capability of standard classifiers and providing an alternative solution for multitemporal information extraction.
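
The abstract reports classifier quality as kappa values. For readers unfamiliar with the statistic, here is a minimal sketch of how a kappa coefficient is computed from a confusion matrix, assuming the standard Cohen's kappa is meant (the abstract does not say which variant was used). The matrix values are hypothetical and are not taken from the study.

```python
def cohens_kappa(confusion):
    """Cohen's kappa for a square confusion matrix (rows: reference, cols: predicted)."""
    n = len(confusion)
    total = sum(sum(row) for row in confusion)
    if total == 0:
        raise ValueError("confusion matrix is empty")

    # Observed agreement: fraction of samples on the diagonal.
    observed = sum(confusion[i][i] for i in range(n)) / total

    # Agreement expected by chance, from the row and column marginals.
    expected = sum(
        sum(confusion[i]) * sum(row[i] for row in confusion)
        for i in range(n)
    ) / total ** 2

    return (observed - expected) / (1 - expected)


if __name__ == "__main__":
    # Hypothetical rooftop / non-rooftop counts for one scene.
    matrix = [[45, 5],
              [10, 40]]
    print(round(cohens_kappa(matrix), 3))
```

With these made-up counts the observed agreement is 0.85 and the chance agreement is 0.50, giving a kappa of 0.70, which would fall just below the stage 1 figure quoted in the abstract.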

  9. Recognition techniques for extracting information from semistructured documents

    Science.gov (United States)

    Della Ventura, Anna; Gagliardi, Isabella; Zonta, Bruna

    2000-12-01

    Archives of optical documents are more and more massively employed, the demand driven also by the new norms sanctioning the legal value of digital documents, provided they are stored on supports that are physically unalterable. On the supply side there is now a vast and technologically advanced market, where optical memories have solved the problem of the duration and permanence of data at costs comparable to those for magnetic memories. The remaining bottleneck in these systems is the indexing. The indexing of documents with a variable structure, while still not completely automated, can be machine supported to a large degree with evident advantages both in the organization of the work, and in extracting information, providing data that is much more detailed and potentially significant for the user. We present here a system for the automatic registration of correspondence to and from a public office. The system is based on a general methodology for the extraction, indexing, archiving, and retrieval of significant information from semi-structured documents. This information, in our prototype application, is distributed among the database fields of sender, addressee, subject, date, and body of the document.

  10. An Enhanced Text-Mining Framework for Extracting Disaster Relevant Data through Social Media and Remote Sensing Data Fusion

    Science.gov (United States)

    Scheele, C. J.; Huang, Q.

    2016-12-01

    In the past decade, the rise in social media has led to the development of a vast number of social media services and applications. Disaster management represents one such application, leveraging the massive data generated for event detection, response, and recovery. In order to find disaster relevant social media data, current approaches utilize natural language processing (NLP) methods based on keywords, or machine learning algorithms relying on text only. However, these approaches cannot be perfectly accurate due to the variability and uncertainty in language used on social media. To improve current methods, the enhanced text-mining framework is proposed to incorporate location information from social media and authoritative remote sensing datasets for detecting disaster relevant social media posts, which are determined by assessing the textual content using common text mining methods and how the post relates spatiotemporally to the disaster event. To assess the framework, geo-tagged Tweets were collected for three different spatial and temporal disaster events: hurricane, flood, and tornado. Remote sensing data and products for each event were then collected using RealEarth™. Both Naive Bayes and Logistic Regression classifiers were used to compare the accuracy within the enhanced text-mining framework. Finally, the accuracies from the enhanced text-mining framework were compared to the current text-only methods for each of the case study disaster events. The results from this study address the need for more authoritative data when using social media in disaster management applications.

  11. Improvements mineral dressing and extraction processes of gold-silver ores from San Pedro Frio Mining District, Colombia

    International Nuclear Information System (INIS)

    Yanez Traslavina, J. J.; Vargas Avila, M. A.; Garcia Paez, I. H.; Pedraza Rosas, J. E.

    2005-01-01

    The San Pedro Frio mining district, Colombia, is a region rich in gold-silver ores. Nowadays, the extraction processes used are amalgamation, percolation cyanidation and precipitation with zinc wood. Because the ore characteristics are not known, the gold and silver treatment processes are inadequate and inefficient. In addition, the inappropriate use of mercury and cyanide causes environmental contamination. In this research, ore characterization was carried out to obtain fundamental parameters for the technical selection of more efficient gold and silver extraction processes. Experimental work addressed two processes: agitation cyanidation and adsorption on activated carbon in pulp. As a final result, a flowsheet is proposed to improve precious metals recovery and reduce environmental contamination. (Author)

  12. KneeTex: an ontology-driven system for information extraction from MRI reports.

    Science.gov (United States)

    Spasić, Irena; Zhao, Bo; Jones, Christopher B; Button, Kate

    2015-01-01

    In the realm of knee pathology, magnetic resonance imaging (MRI) has the advantage of visualising all structures within the knee joint, which makes it a valuable tool for increasing diagnostic accuracy and planning surgical treatments. Therefore, clinical narratives found in MRI reports convey valuable diagnostic information. A range of studies have proven the feasibility of natural language processing for information extraction from clinical narratives. However, no study focused specifically on MRI reports in relation to knee pathology, possibly due to the complexity of knee anatomy and a wide range of conditions that may be associated with different anatomical entities. In this paper we describe KneeTex, an information extraction system that operates in this domain. As an ontology-driven information extraction system, KneeTex makes active use of an ontology to strongly guide and constrain text analysis. We used automatic term recognition to facilitate the development of a domain-specific ontology with sufficient detail and coverage for text mining applications. In combination with the ontology, high regularity of the sublanguage used in knee MRI reports allowed us to model its processing by a set of sophisticated lexico-semantic rules with minimal syntactic analysis. The main processing steps involve named entity recognition combined with coordination, enumeration, ambiguity and co-reference resolution, followed by text segmentation. Ontology-based semantic typing is then used to drive the template filling process. We adopted an existing ontology, TRAK (Taxonomy for RehAbilitation of Knee conditions), for use within KneeTex. The original TRAK ontology expanded from 1,292 concepts, 1,720 synonyms and 518 relationship instances to 1,621 concepts, 2,550 synonyms and 560 relationship instances. This provided KneeTex with a very fine-grained lexico-semantic knowledge base, which is highly attuned to the given sublanguage. 
Information extraction results were evaluated

  13. WEB STRUCTURE MINING

    Directory of Open Access Journals (Sweden)

    CLAUDIA ELENA DINUCĂ

    2011-01-01

    Full Text Available The World Wide Web has become one of the most valuable resources for information retrieval and knowledge discovery, owing to the steadily increasing amount of data available online. Given the size of the web, users easily get lost in its rich hyperstructure. Applying data mining methods is a suitable approach to knowledge discovery on the Web. The knowledge extracted from the Web can be used to improve the performance of Web information retrieval, question answering and Web-based data warehousing. In this paper, I provide an introduction to Web mining categories and focus on one of them: Web structure mining. Web structure mining, one of three categories of web mining, is used to identify the relationships between Web pages linked by information or direct link connection. It offers information about how different pages are linked together to form this huge web. Web structure mining finds hidden basic structures and uses hyperlinks for further web applications such as web search.
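
As one illustration of how hyperlink structure can feed web search, the sketch below ranks pages of a tiny link graph with a PageRank-style power iteration. PageRank is not named in the abstract; it is used here only as a well-known example of link analysis, and the graph, function name and parameter values are hypothetical.

```python
def pagerank(links, damping=0.85, iterations=50):
    """Rank pages by link structure using a simple power iteration.

    links: dict mapping a page to the list of pages it links to.
    Pages without outgoing links spread their rank evenly over all pages.
    """
    pages = sorted(set(links) | {t for targets in links.values() for t in targets})
    n = len(pages)
    rank = {p: 1.0 / n for p in pages}

    for _ in range(iterations):
        new_rank = {p: (1.0 - damping) / n for p in pages}
        for page in pages:
            targets = links.get(page, [])
            if targets:
                share = damping * rank[page] / len(targets)
                for target in targets:
                    new_rank[target] += share
            else:
                for target in pages:
                    new_rank[target] += damping * rank[page] / n
        rank = new_rank
    return rank


if __name__ == "__main__":
    # Hypothetical four-page site: most links point at page "c".
    graph = {"a": ["b", "c"], "b": ["c"], "c": ["a"], "d": ["c"]}
    for page, score in sorted(pagerank(graph).items(), key=lambda kv: -kv[1]):
        print(page, round(score, 3))
```

In this toy graph, page "c" receives links from three of the four pages and ends up with the highest score, while "d", which nothing links to, keeps only the baseline share.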

  14. Ultrasound-assisted extraction for total sulphur measurement in mine tailings

    International Nuclear Information System (INIS)

    Khan, Adnan Hossain; Shang, Julie Q.; Alam, Raquibul

    2012-01-01

    Highlights: ► We develop a total sulphur measuring procedure of mine tailings. ► Ultrasound is used in the sample pre-treatment process. ► Full factorial design is applied to identify the best level of effecting factors. - Abstract: A sample preparation method for percentage recovery of total sulphur (%S) in reactive mine tailings based on ultrasound-assisted digestion (USAD) and inductively coupled plasma-optical emission spectroscopy (ICP-OES) was developed. The influence of various methodological factors was screened by employing a two-level and three-factor (2^3) full factorial design and using KZK-1, a sericite schist certified reference material (CRM), to find the optimal combination of studied factors and %S. Factors such as the sonication time, temperature and acid combination were studied, with the best result identified as 20 min of sonication, 80 °C temperature and 1 ml of HNO3:1 ml of HCl, which can achieve 100% recovery for the selected CRM. Subsequently a fraction of the 2^3 full factorial design was applied to mine tailings. The percentage relative standard deviation (%RSD) for the ultrasound method is less than 3.0% for CRM and less than 6% for the mine tailings. The investigated method was verified by X-ray diffraction analysis. The USAD method compared favorably with existing methods such as hot plate assisted digestion method, X-ray fluorescence and LECO™-CNS method.
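
A two-level, three-factor (2^3) full factorial design simply enumerates all eight combinations of the factor levels. The sketch below lists those runs and computes a percentage relative standard deviation (%RSD). Only the best-performing levels (20 min, 80 °C, 1 ml HNO3 : 1 ml HCl) come from the abstract; the alternative levels, the replicate recoveries and all names are hypothetical placeholders.

```python
from itertools import product
from statistics import mean, stdev

# Two levels per factor. The first level of each factor is a hypothetical
# placeholder; the second is the best level reported in the abstract.
FACTORS = {
    "sonication_min": (10, 20),
    "temperature_c": (60, 80),
    "acid_mix": ("HNO3 only", "1 ml HNO3 : 1 ml HCl"),
}


def full_factorial(factors):
    """Return every combination of factor levels as a list of dicts."""
    names = list(factors)
    return [dict(zip(names, levels)) for levels in product(*factors.values())]


def percent_rsd(values):
    """Percentage relative standard deviation: 100 * sample stdev / mean."""
    return 100.0 * stdev(values) / mean(values)


if __name__ == "__main__":
    runs = full_factorial(FACTORS)
    print(len(runs), "runs")          # 2 * 2 * 2 combinations
    for run in runs:
        print(run)

    # Hypothetical replicate recoveries (%) for a single run.
    print(round(percent_rsd([98.0, 100.0, 102.0]), 2))
```

Running a fraction of these eight runs, as the abstract describes for the mine tailings, would amount to selecting a subset of the list returned by `full_factorial`.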

  15. Text Mining for Information Systems Researchers: An Annotated Topic Modeling Tutorial

    DEFF Research Database (Denmark)

    Debortoli, Stefan; Müller, Oliver; Junglas, Iris

    2016-01-01

    , such as manual coding. Yet, the size of text data sets obtained from the Internet makes manual analysis virtually impossible. In this tutorial, we discuss the challenges encountered when applying automated text-mining techniques in information systems research. In particular, we showcase the use of probabilistic...... researchers, this tutorial provides some guidance for conducting text mining studies on their own and for evaluating the quality of others.......It is estimated that more than 80 percent of today’s data is stored in unstructured form (e.g., text, audio, image, video); and much of it is expressed in rich and ambiguous natural language. Traditionally, the analysis of natural language has prompted the use of qualitative data analysis approaches...

  16. Information Mining Technologies to Enable Discovery of Actionable Intelligence to Facilitate Maritime Situational Awareness: I-MINE

    Science.gov (United States)

    2013-01-01

    website). Data mining tools are in-house code developed in Python, C++ and Java. • NGA The National Geospatial-Intelligence Agency (NGA) performs data...as PostgreSQL (with PostGIS), MySQL, Microsoft SQL Server, SQLite, etc. using the appropriate JDBC driver. 14 The documentation and ease to learn are...written in Java that is able to perform various types of regressions, classifications, and other data mining tasks. There is also a commercial version

  17. Sequential extraction of heavy metals in river sediments of an abandoned pyrite mining area: pollution detection and affinity series

    International Nuclear Information System (INIS)

    Pagnanelli, F.; Moscardini, E.; Giuliano, V.; Toro, L.

    2004-01-01

    In this paper heavy metal pollution at an abandoned Italian pyrite mine has been investigated by comparing total concentrations and speciation of heavy metals (Fe, Cu, Mn, Zn, Pb and As) in a red mud sample and a river sediment. Acid digestions show that all the investigated heavy metals present larger concentrations in the sediment than in the tailing. A modified Tessier's procedure has been used to discriminate heavy metal bound to organic fraction from those originally present in the mineral sulphide matrix and to detect a possible trend of metal mobilisation from red mud to river sediment. Sequential extractions on bulk and size fractionated samples denote that sediment samples present larger percent concentrations of the investigated heavy metals in the first extractive steps (I-IV) especially in lower dimension size fractionated samples suggesting that heavy metals in the sediment are significantly bound by superficial adsorption mechanisms. - Capsule: A modified Tessier's procedure, discriminating organic and sulphide bound metals, was used to detect pollutant mobilisation from red mud to river sediment in an abandoned pyrite mine

  18. Automated extraction of chemical structure information from digital raster images

    Directory of Open Access Journals (Sweden)

    Shedden Kerby A

    2009-02-01

    Full Text Available Abstract. Background: To search for chemical structures in research articles, diagrams or text representing molecules need to be translated to a standard chemical file format compatible with cheminformatic search engines. Nevertheless, chemical information contained in research articles is often referenced as analog diagrams of chemical structures embedded in digital raster images. To automate analog-to-digital conversion of chemical structure diagrams in scientific research articles, several software systems have been developed, but their algorithmic performance and utility in cheminformatic research have not been investigated. Results: This paper aims to provide critical reviews of these systems and also to report our recent development of ChemReader, a fully automated tool for extracting chemical structure diagrams in research articles and converting them into standard, searchable chemical file formats. Basic algorithms for recognizing lines and letters representing bonds and atoms in chemical structure diagrams can be independently run in sequence from a graphical user interface, and the algorithm parameters can be readily changed, to facilitate additional development specifically tailored to a chemical database annotation scheme. Our results indicate that ChemReader outperforms existing software programs such as OSRA, Kekule, and CLiDE on several sets of sample images from diverse sources in terms of the rate of correct outputs and the accuracy in extracting molecular substructure patterns. Conclusion: The availability of ChemReader as a cheminformatic tool for extracting chemical structure information from digital raster images allows research and development groups to enrich their chemical structure databases by annotating the entries with published research articles. Based on its stable performance and high accuracy, ChemReader may be sufficiently accurate for annotating the chemical database with links

  19. Information Extraction, Data Integration, and Uncertain Data Management: The State of The Art

    NARCIS (Netherlands)

    Habib, Mena Badieh; van Keulen, Maurice

    2011-01-01

    Information extraction, data integration, and uncertain data management are different areas of research that have received considerable attention in the last two decades. Many studies have tackled those areas individually. However, information extraction systems should have integrated with data integration

  20. Ultrasound-assisted extraction for total sulphur measurement in mine tailings

    Energy Technology Data Exchange (ETDEWEB)

    Khan, Adnan Hossain, E-mail: ad_li2@yahoo.com [Department of Civil and Environmental Engineering, University of Western Ontario (Canada); Shang, Julie Q.; Alam, Raquibul [Department of Civil and Environmental Engineering, University of Western Ontario (Canada)

    2012-10-15

    Highlights: • We develop a total sulphur measuring procedure for mine tailings. • Ultrasound is used in the sample pre-treatment process. • Full factorial design is applied to identify the best level of the affecting factors. - Abstract: A sample preparation method for percentage recovery of total sulphur (%S) in reactive mine tailings based on ultrasound-assisted digestion (USAD) and inductively coupled plasma-optical emission spectroscopy (ICP-OES) was developed. The influence of various methodological factors was screened by employing a two-level and three-factor (2³) full factorial design and using KZK-1, a sericite schist certified reference material (CRM), to find the optimal combination of studied factors and %S. Factors such as the sonication time, temperature and acid combination were studied, with the best result identified as 20 min of sonication, 80 °C temperature and 1 ml of HNO₃:1 ml of HCl, which can achieve 100% recovery for the selected CRM. Subsequently a fraction of the 2³ full factorial design was applied to mine tailings. The percentage relative standard deviation (%RSD) for the ultrasound method is less than 3.0% for CRM and less than 6% for the mine tailings. The investigated method was verified by X-ray diffraction analysis. The USAD method compared favorably with existing methods such as hot plate assisted digestion method, X-ray fluorescence and LECO™-CNS method.

  1. Information Extraction and Interpretation Analysis of Mineral Potential Targets Based on ETM+ Data and GIS technology: A Case Study of Copper and Gold Mineralization in Burma

    International Nuclear Information System (INIS)

    Wenhui, Du; Yongqing, Chen; Nana, Guo; Yinglong, Hao; Pengfei, Zhao; Gongwen, Wang

    2014-01-01

    Mineralization-alteration and structure information extraction plays an important role in mineral resource prospecting and assessment using remote sensing data and Geographical Information System (GIS) technology. Taking copper and gold mines in Burma as an example, the authors adopt band ratio, threshold segmentation and principal component analysis (PCA) to extract the hydroxyl alteration information from ETM+ remote sensing images. A digital elevation model (DEM, 30 m spatial resolution) and ETM+ data were used to extract linear and circular faults that are associated with copper and gold mineralization. Combining geological data and the above information, the weights of evidence method and the C-A fractal model were used to integrate and identify the ore-forming favourable zones in this area. Research results show that the high grade potential targets coincide with the known copper and gold deposits, and the integrated information can be used to guide decision-making for the next stage of mineral resource exploration.
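
    The band ratio and threshold segmentation steps named in this abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not say which ETM+ bands were ratioed or how the threshold was chosen, so the band 5 / band 7 pairing and the mean-plus-k-standard-deviations cutoff below are assumptions, and the function names are hypothetical.

```python
from statistics import mean, pstdev


def band_ratio(band_a, band_b):
    """Pixel-wise ratio of two equally sized 2-D grids (0.0 where the divisor is 0)."""
    return [
        [a / b if b else 0.0 for a, b in zip(row_a, row_b)]
        for row_a, row_b in zip(band_a, band_b)
    ]


def threshold_mask(grid, k=2.0):
    """Flag pixels above mean + k * standard deviation (an assumed cutoff rule)."""
    flat = [value for row in grid for value in row]
    cutoff = mean(flat) + k * pstdev(flat)
    return [[value > cutoff for value in row] for row in grid]


# Toy 3x3 scene: only the centre pixel has a high band 5 / band 7 ratio.
band5 = [[10, 10, 10], [10, 40, 10], [10, 10, 10]]
band7 = [[10, 10, 10], [10, 10, 10], [10, 10, 10]]
anomaly = threshold_mask(band_ratio(band5, band7))
```

    In this toy scene only the centre pixel is flagged as a candidate alteration anomaly; real imagery would need radiometric correction and a threshold tuned to the area.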

  2. Ultrasound-assisted extraction for total sulphur measurement in mine tailings.

    Science.gov (United States)

    Khan, Adnan Hossain; Shang, Julie Q; Alam, Raquibul

    2012-10-15

    A sample preparation method for percentage recovery of total sulphur (%S) in reactive mine tailings based on ultrasound-assisted digestion (USAD) and inductively coupled plasma-optical emission spectroscopy (ICP-OES) was developed. The influence of various methodological factors was screened by employing a two-level and three-factor (2³) full factorial design and using KZK-1, a sericite schist certified reference material (CRM), to find the optimal combination of studied factors and %S. Factors such as the sonication time, temperature and acid combination were studied, with the best result identified as 20 min of sonication, 80°C temperature and 1 ml of HNO₃:1 ml of HCl, which can achieve 100% recovery for the selected CRM. Subsequently a fraction of the 2³ full factorial design was applied to mine tailings. The percentage relative standard deviation (%RSD) for the ultrasound method is less than 3.0% for CRM and less than 6% for the mine tailings. The investigated method was verified by X-ray diffraction analysis. The USAD method compared favorably with existing methods such as hot plate assisted digestion method, X-ray fluorescence and LECO™-CNS method. Copyright © 2012 Elsevier B.V. All rights reserved.
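
    A two-level, three-factor full factorial design has 2 × 2 × 2 = 8 runs. The sketch below enumerates them and computes a %RSD, as a rough illustration only. The high levels are the optimum reported in the abstract (20 min, 80 °C, HNO₃:HCl 1:1); the low levels are placeholders, because the abstract does not report them, and the function names are hypothetical.

```python
from itertools import product
from statistics import mean, stdev

# High levels come from the abstract's reported optimum; low levels are
# placeholder assumptions, since the abstract does not state them.
FACTORS = {
    "sonication_min": (10, 20),
    "temperature_c": (60, 80),
    "acid_mix": ("HNO3 only", "HNO3:HCl 1:1"),
}


def full_factorial(factors):
    """Return every combination of factor levels as a list of dicts."""
    names = list(factors)
    return [dict(zip(names, levels)) for levels in product(*factors.values())]


def percent_rsd(values):
    """Percentage relative standard deviation: 100 * sample SD / mean."""
    return 100.0 * stdev(values) / mean(values)


runs = full_factorial(FACTORS)  # 8 runs for a 2^3 design
```

    Replicate recoveries measured for one run can then be passed to `percent_rsd` to compare against precision limits such as the 3.0% and 6% figures quoted above.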

  3. 30 CFR 75.1200-1 - Additional information on mine map.

    Science.gov (United States)

    2010-07-01

    ... SAFETY AND HEALTH MANDATORY SAFETY STANDARDS-UNDERGROUND COAL MINES Maps § 75.1200-1 Additional... symbols; (g) The location of railroad tracks and public highways leading to the mine, and mine buildings... permanent base line points coordinated with the underground and surface mine traverses, and the location and...

  4. 77 FR 16863 - Proposed Extension of Existing Information Collection; Mine Mapping and Records of Opening...

    Science.gov (United States)

    2012-03-22

    .... Using accurate, up-to-date maps during a disaster, mine emergency personnel can locate where miners may... coal mine inspectors, miners and their representatives, operators of adjacent coal mines, and persons... are essential to the planning and safe operation of the mine. In addition, these maps provide a...

  5. [Retrieval of Copper Pollution Information from Hyperspectral Satellite Data in a Vegetation Cover Mining Area].

    Science.gov (United States)

    Qu, Yong-hua; Jiao, Si-hong; Liu, Su-hong; Zhu, Ye-qing

    2015-11-01

    Heavy metal mining activities have had a complex influence on the ecological environment of mining regions. For example, a large amount of acidic waste water containing heavy metal ions has been produced in the process of copper mining, which can bring serious pollution to the ecological environment of the region. In previous research, bare soil was mainly taken as the research target when monitoring environmental pollution, and thus the effects of land surface vegetation were ignored. It is well known that vegetation condition is one of the most important indicators of ecological change in a region, and there is a significant linkage between vegetation spectral characteristics and heavy metal content when the vegetation is affected by heavy metal pollution. This means that vegetation is sensitive to heavy metal pollution through its physiological response to changes in its growing environment. The conventional methods, which often rely on large amounts of field survey data and laboratory chemical analysis, are time consuming and resource intensive. The spectrum analysis method using remote sensing technology can acquire information on the heavy metal content in vegetation without touching it. However, retrieving that information from hyperspectral data is not easy because of the difficulty of identifying, among a huge number of hyperspectral bands, the specific band that is sensitive to a specific heavy metal. Thus the selection of the sensitive band is the key to the spectrum analysis method. This paper proposes a statistical analysis method to find the feature band sensitive to heavy metal ions in the hyperspectral data and then to retrieve the metal content using field survey data and hyperspectral images from the China Environment Satellite HJ-1. This method selected copper ion content in the leaves as the indicator of copper pollution

  6. INFORMATION EXTRACTION IN TOMB PIT USING HYPERSPECTRAL DATA

    Directory of Open Access Journals (Sweden)

    X. Yang

    2018-04-01

    Full Text Available Hyperspectral data are characterized by many continuous bands, a large data volume, redundancy, and non-destructive acquisition. These characteristics make it possible to use hyperspectral data to study cultural relics. In this paper, hyperspectral imaging technology is adopted to recognize the bottom images of an ancient tomb located in Shanxi province. There are many black remains on the bottom surface of the tomb, which are suspected to be meaningful texts or paintings. Firstly, the hyperspectral data is preprocessed to get the reflectance of the region of interest. For convenience of computation and storage, the original reflectance value is multiplied by 10000. Secondly, this article uses three methods to extract the symbols at the bottom of the ancient tomb. Finally, we tried to use morphology to connect the symbols and gave fifteen reference images. The results show that the extraction of information based on hyperspectral data can obtain a better visual experience, which is beneficial to the study of ancient tombs by researchers, and provides some references for archaeological research findings.

  7. Information Extraction in Tomb Pit Using Hyperspectral Data

    Science.gov (United States)

    Yang, X.; Hou, M.; Lyu, S.; Ma, S.; Gao, Z.; Bai, S.; Gu, M.; Liu, Y.

    2018-04-01

    Hyperspectral data are characterized by many continuous bands, a large data volume, redundancy, and non-destructive acquisition. These characteristics make it possible to use hyperspectral data to study cultural relics. In this paper, hyperspectral imaging technology is adopted to recognize the bottom images of an ancient tomb located in Shanxi province. There are many black remains on the bottom surface of the tomb, which are suspected to be meaningful texts or paintings. Firstly, the hyperspectral data is preprocessed to get the reflectance of the region of interest. For convenience of computation and storage, the original reflectance value is multiplied by 10000. Secondly, this article uses three methods to extract the symbols at the bottom of the ancient tomb. Finally, we tried to use morphology to connect the symbols and gave fifteen reference images. The results show that the extraction of information based on hyperspectral data can obtain a better visual experience, which is beneficial to the study of ancient tombs by researchers, and provides some references for archaeological research findings.
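
    Multiplying reflectance by 10000 lets values in the range 0 to 1 be stored as small integers. The sketch below shows one way that could work; the 16-bit integer storage type and the function names are assumptions, since the abstract only states the scale factor.

```python
from array import array

SCALE = 10000  # scale factor stated in the abstract


def encode_reflectance(values):
    """Scale reflectance in [0, 1] and store it as 16-bit signed integers (assumed type)."""
    return array("h", (round(v * SCALE) for v in values))


def decode_reflectance(encoded):
    """Recover floating-point reflectance from the scaled integers."""
    return [v / SCALE for v in encoded]


stored = encode_reflectance([0.0, 0.1234, 1.0])
restored = decode_reflectance(stored)
```

    With this choice each sample keeps four decimal places of reflectance and is stored as a short integer rather than a floating-point value.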

  8. Automated Extraction of Substance Use Information from Clinical Texts.

    Science.gov (United States)

    Wang, Yan; Chen, Elizabeth S; Pakhomov, Serguei; Arsoniadis, Elliot; Carter, Elizabeth W; Lindemann, Elizabeth; Sarkar, Indra Neil; Melton, Genevieve B

    2015-01-01

    Within clinical discourse, social history (SH) includes important information about substance use (alcohol, drug, and nicotine use) as key risk factors for disease, disability, and mortality. In this study, we developed and evaluated a natural language processing (NLP) system for automated detection of substance use statements and extraction of substance use attributes (e.g., temporal and status) based on Stanford Typed Dependencies. The developed NLP system leveraged linguistic resources and domain knowledge from a multi-site social history study, Propbank and the MiPACQ corpus. The system attained F-scores of 89.8, 84.6 and 89.4 respectively for alcohol, drug, and nicotine use statement detection, as well as average F-scores of 82.1, 90.3, 80.8, 88.7, 96.6, and 74.5 respectively for extraction of attributes. Our results suggest that NLP systems can achieve good performance when augmented with linguistic resources and domain knowledge when applied to a wide breadth of substance use free text clinical notes.

  9. Genomic research and data-mining technology: implications for personal privacy and informed consent.

    Science.gov (United States)

    Tavani, Herman T

    2004-01-01

    This essay examines issues involving personal privacy and informed consent that arise at the intersection of information and communication technology (ICT) and population genomics research. I begin by briefly examining the ethical, legal, and social implications (ELSI) program requirements that were established to guide researchers working on the Human Genome Project (HGP). Next I consider a case illustration involving deCODE Genetics, a privately owned genetic company in Iceland, which raises some ethical concerns that are not clearly addressed in the current ELSI guidelines. The deCODE case also illustrates some ways in which an ICT technique known as data mining has both aided and posed special challenges for researchers working in the field of population genomics. On the one hand, data-mining tools have greatly assisted researchers in mapping the human genome and in identifying certain "disease genes" common in specific populations (which, in turn, has accelerated the process of finding cures for diseases that affect those populations). On the other hand, this technology has significantly threatened the privacy of research subjects participating in population genomics studies, who may, unwittingly, contribute to the construction of new groups (based on arbitrary and non-obvious patterns and statistical correlations) that put those subjects at risk for discrimination and stigmatization. In the final section of this paper I examine some ways in which the use of data mining in the context of population genomics research poses a critical challenge for the principle of informed consent, which traditionally has played a central role in protecting the privacy interests of research subjects participating in epidemiological studies.

  10. Domain-independent information extraction in unstructured text

    Energy Technology Data Exchange (ETDEWEB)

    Irwin, N.H. [Sandia National Labs., Albuquerque, NM (United States). Software Surety Dept.

    1996-09-01

    Extracting information from unstructured text has become an important research area in recent years due to the large amount of text now electronically available. This status report describes the findings and work done during the second year of a two-year Laboratory Directed Research and Development Project. Building on the first year's work of identifying important entities, this report details techniques used to group words into semantic categories and to output templates containing selective document content. Using word profiles and category clustering derived during a training run, the time-consuming knowledge-building task can be avoided. Though the output still lacks in completeness when compared to systems with domain-specific knowledge bases, the results do look promising. The two approaches are compatible and could complement each other within the same system. Domain-independent approaches retain appeal as a system that adapts and learns will soon outpace a system with any amount of a priori knowledge.

  11. Extracting and Using Photon Polarization Information in Radiative B Decays

    Energy Technology Data Exchange (ETDEWEB)

    Grossman, Yuval

    2000-05-09

    The authors discuss the uses of conversion electron pairs for extracting photon polarization information in weak radiative B decays. Both cases of leptons produced through a virtual and real photon are considered. Measurements of the angular correlation between the $(K\pi)$ and $(e^{+}e^{-})$ decay planes in $B \to K^{*}(\to K\pi)\,\gamma^{(*)}(\to e^{+}e^{-})$ decays can be used to determine the helicity amplitudes in the radiative $B \to K^{*}\gamma$ decays. A large right-handed helicity amplitude in $\bar{B}$ decays is a signal of new physics. The time-dependent CP asymmetry in the $B^{0}$ decay angular correlation is shown to measure $\sin 2\beta$ and $\cos 2\beta$ with little hadronic uncertainty.

  12. Extraction of neutron spectral information from Bonner-Sphere data

    CERN Document Server

    Haney, J H; Zaidins, C S

    1999-01-01

    We have extended a least-squares method of extracting neutron spectral information from Bonner-Sphere data which was previously developed by Zaidins et al. (Med. Phys. 5 (1978) 42). A pulse-height analysis with background stripping is employed which provided a more accurate count rate for each sphere. Newer response curves by Mares and Schraube (Nucl. Instr. and Meth. A 366 (1994) 461) were included for the moderating spheres and the bare detector which comprise the Bonner spectrometer system. Finally, the neutron energy spectrum of interest was divided using the philosophy of fuzzy logic into three trapezoidal regimes corresponding to slow, moderate, and fast neutrons. Spectral data was taken using a PuBe source in two different environments and the analyzed data is presented for these cases as slow, moderate, and fast neutron fluences. (author)
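
    Dividing an energy axis into three overlapping trapezoidal regimes can be sketched as fuzzy membership functions. This is a generic illustration, not the authors' unfolding code: the breakpoints (1 eV, 10 eV, 10 keV, 100 keV on a log10 energy axis) are placeholder assumptions, and the abstract does not give the actual regime boundaries.

```python
import math

INF = float("inf")


def trapezoid(x, a, b, c, d):
    """Trapezoidal membership: 0 outside (a, d), 1 on [b, c], linear on the flanks."""
    if x <= a or x >= d:
        return 0.0
    if x < b:
        return (x - a) / (b - a)
    if x <= c:
        return 1.0
    return (d - x) / (d - c)


# Breakpoints in log10(E / eV); all four values are illustrative assumptions.
REGIMES = {
    "slow": (-INF, -INF, 0.0, 1.0),
    "moderate": (0.0, 1.0, 4.0, 5.0),
    "fast": (4.0, 5.0, INF, INF),
}


def memberships(energy_ev):
    """Degree of membership of a neutron energy in each regime."""
    x = math.log10(energy_ev)
    return {name: trapezoid(x, *params) for name, params in REGIMES.items()}
```

    Because adjacent trapezoids share their sloping flanks, the three memberships sum to one at every energy, so a neutron in a transition region is split between two regimes rather than assigned abruptly to one.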

  13. ONTOGRABBING: Extracting Information from Texts Using Generative Ontologies

    DEFF Research Database (Denmark)

    Nilsson, Jørgen Fischer; Szymczak, Bartlomiej Antoni; Jensen, P.A.

    2009-01-01

    We describe principles for extracting information from texts using a so-called generative ontology in combination with syntactic analysis. Generative ontologies are introduced as semantic domains for natural language phrases. Generative ontologies extend ordinary finite ontologies with rules...... for producing recursively shaped terms representing the ontological content (ontological semantics) of NL noun phrases and other phrases. We focus here on achieving a robust, often only partial, ontology-driven parsing of and ascription of semantics to a sentence in the text corpus. The aim of the ontological...... analysis is primarily to identify paraphrases, thereby achieving a search functionality beyond mere keyword search with synsets. We further envisage use of the generative ontology as a phrase-based rather than word-based browser into text corpora....

  14. Toxicity of sediments potentially contaminated by coal mining and natural gas extraction to unionid mussels and commonly tested benthic invertebrates

    Science.gov (United States)

    Wang, Ning; Ingersoll, Christopher G.; Kunz, James L.; Brumbaugh, William G.; Kane, Cindy M.; Evans, R. Brian; Alexander, Steven; Walker, Craig; Bakaletz, Steve

    2013-01-01

    Sediment toxicity tests were conducted to assess potential effects of contaminants associated with coal mining or natural gas extraction activities in the upper Tennessee River basin and eastern Cumberland River basin in the United States. Test species included two unionid mussels (rainbow mussel, Villosa iris, and wavy-rayed lampmussel, Lampsilis fasciola, 28-d exposures), and the commonly tested amphipod, Hyalella azteca (28-d exposure) and midge, Chironomus dilutus (10-d exposure). Sediments were collected from seven test sites with mussel communities classified as impacted and in proximity to coal mining or gas extraction activities, and from five reference sites with mussel communities classified as not impacted and no or limited coal mining or gas extraction activities. Additional samples were collected from six test sites potentially with high concentrations of polycyclic aromatic hydrocarbons (PAHs) and from a test site contaminated by a coal ash spill. Mean survival, length, or biomass of one or more test species was reduced in 10 of 14 test samples (71%) from impacted areas relative to the response of organisms in the five reference samples. A higher proportion of samples was classified as toxic to mussels (63% for rainbow mussels, 50% for wavy-rayed lampmussels) compared with amphipods (38%) or midge (38%). Concentrations of total recoverable metals and total PAHs in sediments did not exceed effects-based probable effect concentrations (PECs). However, the survival, length, or biomasses of the mussels were reduced significantly with increasing PEC quotients for metals and for total PAHs, or with increasing sum equilibrium-partitioning sediment benchmark toxic units for PAHs. The growth of the rainbow mussel also significantly decreased with increasing concentrations of a major anion (chloride) and major cations (calcium and magnesium) in sediment pore water. Results of the present study indicated that (1) the findings from laboratory tests were generally

  15. Perception versus reality: Bridging the gap between quantitative and qualitative information relating to the risks of uranium mining

    International Nuclear Information System (INIS)

    Needham, S.

    2002-01-01

    Environmental impact of uranium mining in Australia is frequently raised as an issue of public concern. However, the level of concern both in terms of public agitation and political response has diminished over the last decade, largely as a consequence of many years of demonstrated high levels of environmental protection achieved at Australian uranium mines. Another reason is because of improved information now accessible to the public on mine environmental management systems, monitoring results, and audit outcomes. This paper describes some communication methods developed for the uranium mines of the Alligator Rivers Region of the Northern Territory. These methods have improved the effectiveness of dialogue between stakeholders, and better inform the public about the levels of environmental protection achieved and the level of risk to the environment and the community. A simple approach is described which has been developed to help build a mutual understanding between technocrats and the lay person on perceptions of risk and actual environmental impact. (author)

  16. Extraction panel guidelines for high production underground auger mining in Australian conditions

    Energy Technology Data Exchange (ETDEWEB)

    Paul Buddery; David Hill [Strata Engineering (Australia)

    2004-09-15

    The project involved monitoring ground behaviour during augering, with the intention of monitoring several sites with varying geotechnical environments and developing guidelines from these to assist in future layout design. This approach is appropriate where the mining layout involves the complex interaction of several components that cannot be readily simplified to the extent necessary for numerical or physical models to play the primary role. Only one site was secured within the project time frame. Consequently, the project has utilised the results from a Southern Colliery augering trial, coupled to the outcomes of numerical and physical modelling tests. The auger mining operations themselves were carried out by a Joint Venture (Coal Recovery Australia Pty Ltd) between Cutting Edge Technology Pty Ltd and SBD Services Pty Ltd. The underground trial indicated that empirical design methodologies involving pillar strength equations coupled to abutment angle models can be used to design stable augering layouts. Although the designed hole configuration was not fully achieved, there is a suggestion that a layout so determined will be conservative, holding out the possibility of future optimisation on the basis of actual performance. Monitoring and re-appraisal in the context of a formal strata management process are critical to the success of any such approach, particularly in terms of optimisation. The two-dimensional UDEC numerical modelling code was used to model augering webs, but seemed to underestimate the stability of an auger mining panel, while overestimating the strength of individual auger webs. Physical tests appeared to give a realistic quantification of the size effect. The tests suggest that determining the strength of an hourglass web by increasing the strength of an equivalent rectangular web by 25% would be a justifiable step at this stage.

  17. Applied data mining for business and industry

    CERN Document Server

    Giudici, Paolo

    2009-01-01

    The increasing availability of data in our current, information overloaded society has led to the need for valid tools for its modelling and analysis. Data mining and applied statistical methods are the appropriate tools to extract knowledge from such data. This book provides an accessible introduction to data mining methods in a consistent and application oriented statistical framework, using case studies drawn from real industry projects and highlighting the use of data mining methods in a variety of business applications. Introduces data mining methods and applications. Covers classical and Bayesian multivariate statistical methodology as well as machine learning and computational data mining methods. Includes many recent developments such as association and sequence rules, graphical Markov models, lifetime value modelling, credit risk, operational risk and web mining. Features detailed case studies based on applied projects within industry. Incorporates discussion of data mining software, with case studies a...

  18. Selenium speciation in phosphate mine soils and evaluation of a sequential extraction procedure using XAFS

    Energy Technology Data Exchange (ETDEWEB)

    Favorito, Jessica E.; Luxton, Todd P.; Eick, Matthew J.; Grossl, Paul R. (VP); (Utah SU); (EPA)

    2017-10-01

    Selenium is a trace element found in western US soils, where ingestion of Se-accumulating plants has resulted in livestock fatalities. Therefore, a reliable understanding of Se speciation and bioavailability is critical for effective mitigation. Sequential extraction procedures (SEP) are often employed to examine Se phases and speciation in contaminated soils but may be limited by experimental conditions. We examined the validity of a SEP using X-ray absorption spectroscopy (XAS) for both whole and a sequence of extracted soils. The sequence included removal of soluble, PO4-extractable, carbonate, amorphous Fe-oxide, crystalline Fe-oxide, organic, and residual Se forms. For whole soils, XANES analyses indicated Se(0) and Se(-II) predominated, with lower amounts of Se(IV) present, related to carbonates and Fe-oxides. Oxidized Se species were more elevated and residual/elemental Se was lower than previous SEP results from ICP-AES suggested. For soils from the SEP sequence, XANES results indicated only partial recovery of carbonate, Fe-oxide and organic Se. This suggests Se was incompletely removed during designated extractions, possibly due to lack of mineral solubilization or reagent specificity. Selenium fractions associated with Fe-oxides were reduced in amount or removed after using hydroxylamine HCl for most soils examined. XANES results indicate partial dissolution of solid-phases may occur during extraction processes. This study demonstrates why precautions should be taken to improve the validity of SEPs. Mineralogical and chemical characterizations should be completed prior to SEP implementation to identify extractable phases or mineral components that may influence extraction effectiveness. Sequential extraction procedures can be appropriately tailored for reliable quantification of speciation in contaminated soils.

  19. Information extraction and knowledge graph construction from geoscience literature

    Science.gov (United States)

    Wang, Chengbin; Ma, Xiaogang; Chen, Jianguo; Chen, Jingwen

    2018-03-01

    Geoscience literature published online is an important part of open data, and brings both challenges and opportunities for data analysis. Compared with studies of numerical geoscience data, there are limited works on information extraction and knowledge discovery from textual geoscience data. This paper presents a workflow and a few empirical case studies for that topic, with a focus on documents written in Chinese. First, we set up a hybrid corpus combining the generic and geology terms from geology dictionaries to train Chinese word segmentation rules of the Conditional Random Fields model. Second, we used the word segmentation rules to parse documents into individual words, and removed the stop-words from the segmentation results to get a corpus constituted of content-words. Third, we used a statistical method to analyze the semantic links between content-words, and we selected the chord and bigram graphs to visualize the content-words and their links as nodes and edges in a knowledge graph, respectively. The resulting graph presents a clear overview of key information in an unstructured document. This study proves the usefulness of the designed workflow, and shows the potential of leveraging natural language processing and knowledge graph technologies for geoscience.
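
    The second and third workflow steps (stop-word removal, then linking content-words as nodes and edges) can be sketched as follows. This is a simplified illustration, not the authors' method: it assumes the text has already been segmented into tokens (the Conditional Random Fields segmentation step is not reproduced), it uses adjacent-word counts as a stand-in for the unspecified statistical link measure, and the function names and English sample tokens are hypothetical.

```python
from collections import Counter


def content_words(tokens, stop_words):
    """Drop stop-words from an already segmented token list."""
    return [token for token in tokens if token not in stop_words]


def bigram_edges(words):
    """Count adjacent content-word pairs; each pair is a weighted graph edge."""
    return Counter(zip(words, words[1:]))


# Hypothetical, pre-segmented sample sentence and stop-word list.
tokens = ["the", "copper", "deposit", "is", "hosted", "in", "the", "copper", "deposit"]
stop_words = {"the", "is", "in"}

words = content_words(tokens, stop_words)
edges = bigram_edges(words)
nodes = set(words)
```

    The `nodes` set and the weighted `edges` counter are the raw material for a bigram graph; a plotting or graph library would be needed to draw it.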

  20. A Novel Visual Data Mining Module for the Geographical Information System gvSIG

    Directory of Open Access Journals (Sweden)

    Romel Vázquez-Rodríguez

    2013-01-01

    Full Text Available The exploration of large GIS models containing spatio-temporal information is a challenge. In this paper we propose the integration of scientific visualization (ScVis) techniques into geographic information systems (GIS) as an alternative for the visual analysis of data. Providing GIS with such tools improves the analysis and understanding of datasets with very low spatial density and makes it possible to find correlations between variables in time and space. In this regard, we present a new visual data mining tool for the GIS gvSIG. This tool has been implemented as a gvSIG module and contains several ScVis techniques for multiparameter data with a wide range of possibilities to explore the data interactively. The developed module is a powerful visual data mining and data visualization tool to obtain knowledge from multiple datasets in time and space. A real case study with meteorological data from Villa Clara province (Cuba) is presented, where the implemented visualization techniques were used to analyze the available datasets. Although it is tested with meteorological data, the developed module is of general application in the sense that it can be used in multiple application fields related to Earth Sciences.

  1. On Robust Information Extraction from High-Dimensional Data

    Czech Academy of Sciences Publication Activity Database

    Kalina, Jan

    2014-01-01

    Vol. 9, No. 1 (2014), pp. 131-144. ISSN 1452-4864. Grant - others: GA ČR (CZ) GA13-01930S. Institutional support: RVO:67985807. Keywords: data mining; high-dimensional data; robust econometrics; outliers; machine learning. Subject RIV: IN - Informatics, Computer Science

  2. Selenium speciation in phosphate mine soils and evaluation of a sequential extraction procedure using XAFS

    International Nuclear Information System (INIS)

    Favorito, Jessica E.; Luxton, Todd P.; Eick, Matthew J.; Grossl, Paul R.

    2017-01-01

    Selenium is a trace element found in western US soils, where ingestion of Se-accumulating plants has resulted in livestock fatalities. Therefore, a reliable understanding of Se speciation and bioavailability is critical for effective mitigation. Sequential extraction procedures (SEP) are often employed to examine Se phases and speciation in contaminated soils but may be limited by experimental conditions. We examined the validity of a SEP using X-ray absorption spectroscopy (XAS) for both whole and a sequence of extracted soils. The sequence included removal of soluble, PO4-extractable, carbonate, amorphous Fe-oxide, crystalline Fe-oxide, organic, and residual Se forms. For whole soils, XANES analyses indicated Se(0) and Se(-II) predominated, with lower amounts of Se(IV) present, related to carbonates and Fe-oxides. Oxidized Se species were more elevated and residual/elemental Se was lower than previous SEP results from ICP-AES suggested. For soils from the SEP sequence, XANES results indicated only partial recovery of carbonate, Fe-oxide and organic Se. This suggests Se was incompletely removed during designated extractions, possibly due to lack of mineral solubilization or reagent specificity. Selenium fractions associated with Fe-oxides were reduced in amount or removed after using hydroxylamine HCl for most soils examined. XANES results indicate partial dissolution of solid-phases may occur during extraction processes. This study demonstrates why precautions should be taken to improve the validity of SEPs. Mineralogical and chemical characterizations should be completed prior to SEP implementation to identify extractable phases or mineral components that may influence extraction effectiveness. Sequential extraction procedures can be appropriately tailored for reliable quantification of speciation in contaminated soils. - Highlights: • XANES spectra indicated whole soils consisted of mostly elemental and organic Se and lower amounts of sorbed oxidized Se.

  3. A diagnostic of the strategy employed for communicating nuclear related information to Brazilian communities around uranium mining areas

    International Nuclear Information System (INIS)

    Ferrari Dias, Fabiana; Tirollo Taddei, Maria H.

    2008-01-01

    This paper presents a diagnostic of the strategy used by the Brazilian uranium mining industry to communicate nuclear related information to communities around a mining area. The uranium mining industry in Brazil, which is run by the government, has been concerned with communication issues for quite some time. The need to communicate became more apparent after new mining operations started in the Northern region of Brazil. The fact that the government does not have a clear communication guideline made the operators of the uranium mining industry aware of the increasing demand for establishment of a good relationship with several types of stakeholders as well as employment of personnel with experience in dealing with them. A diagnostic of the current communication situation in Brazil and an analysis of the approaches over the past years was carried out through interviews with employees of the mining industry and review of institutional communication materials. The results were discussed during a Consultant's Meeting organized by the IAEA's Seibersdorf Laboratory in October 2007. The output of the meeting included an overview of modern communication strategies used by different countries and a suggestion for new uranium mining operations in developing or underdeveloped countries. The strategy for communicating nuclear related information to Brazilian communities varied according to the influence of different stakeholder groups. One initiative worth mentioning was the creation of a Mobile Nuclear Information Thematic Room, which was installed in several locations. This project was seen as one of the main tools to relate to the community. Many stakeholders were identified during the diagnostic phase in preparation for the IAEA's meeting on communication strategy: children, NGOs (non-governmental organizations), local churches, media and internal stakeholders, among others. An initial evaluation showed that the perception of a neighbouring community regarding a uranium

  4. Data Assimilation to Extract Soil Moisture Information from SMAP Observations

    Directory of Open Access Journals (Sweden)

    Jana Kolassa

    2017-11-01

    Full Text Available This study compares different methods to extract soil moisture information through the assimilation of Soil Moisture Active Passive (SMAP) observations. Neural network (NN) and physically-based SMAP soil moisture retrievals were assimilated into the National Aeronautics and Space Administration (NASA) Catchment model over the contiguous United States for April 2015 to March 2017. By construction, the NN retrievals are consistent with the global climatology of the Catchment model soil moisture. Assimilating the NN retrievals without further bias correction improved the surface and root zone correlations against in situ measurements from 14 SMAP core validation sites (CVS) by 0.12 and 0.16, respectively, over the model-only skill, and reduced the surface and root zone unbiased root-mean-square error (ubRMSE) by 0.005 m³ m⁻³ and 0.001 m³ m⁻³, respectively. The assimilation reduced the average absolute surface bias against the CVS measurements by 0.009 m³ m⁻³, but increased the root zone bias by 0.014 m³ m⁻³. Assimilating the NN retrievals after a localized bias correction yielded slightly lower surface correlation and ubRMSE improvements, but generally the skill differences were small. The assimilation of the physically-based SMAP Level-2 passive soil moisture retrievals using a global bias correction yielded similar skill improvements, as did the direct assimilation of locally bias-corrected SMAP brightness temperatures within the SMAP Level-4 soil moisture algorithm. The results show that global bias correction methods may be able to extract more independent information from SMAP observations compared to local bias correction methods, but without accurate quality control and observation error characterization they are also more vulnerable to adverse effects from retrieval errors related to uncertainties in the retrieval inputs and algorithm. Furthermore, the results show that using global bias correction approaches without a

  5. Automated information and control complex of hydro-gas endogenous mine processes

    Science.gov (United States)

    Davkaev, K. S.; Lyakhovets, M. V.; Gulevich, T. M.; Zolin, K. A.

    2017-09-01

    This paper considers an automated information and control complex designed to prevent accidents related to the aerological situation in underground workings. The complex also accounts for individual devices issued and returned, transmits and displays measurement data, and forms pre-emptive solutions. Examples are given for the automated workplace of an operator performing air-gas control by individual means. The statistical characteristics of field data characterizing the aerological situation in the mine are obtained. The studies of these statistical characteristics confirm the feasibility of creating a subsystem of controlled gas distribution with an adaptive arrangement of gas control points. An adaptive (multivariant) algorithm has been developed for processing measurement information on continuous multidimensional quantities and influencing factors.

  6. [Bronchopulmonary diseases in workers engaged in deep-mined extraction of copper-nickel ore].

    Science.gov (United States)

    Siurin, S A; Derevoedov, A A; Nikanov, A N

    2008-01-01

    Examinations were made in 220 male workers exposed to dust-gas (low-silicon dioxide, nitric oxides, and carbon oxide) mixture, physical exercises, and cooling microclimate on deep-mined output of copper-nickel ore. Twenty-eight per cent of the workers were found to have evolving chronic bronchitis that did not substantially affect the patients' working capacity; 3.2% had chronic obstructive pulmonary disease and 1.4% had asthma that had developed before the onset of professional activity. 32.3% of the examinees were ascertained to have individual clinicofunctional disorders that permit their identification as a bronchopulmonary disease risk group to carry out early preventive and rehabilitative measures.

  7. Mapping informal small-scale mining features in a data-sparse tropical environment with a small UAS

    Science.gov (United States)

    Chirico, Peter G.; Dewitt, Jessica D.

    2017-01-01

    This study evaluates the use of a small unmanned aerial system (UAS) to collect imagery over artisanal mining sites in West Africa. The purpose of this study is to consider how very high-resolution imagery and digital surface models (DSMs) derived from structure-from-motion (SfM) photogrammetric techniques from a small UAS can fill the gap in geospatial data collection between satellite imagery and data gathered during field work to map and monitor informal mining sites in tropical environments. The study compares both wide-angle and narrow field of view camera systems in the collection and analysis of high-resolution orthoimages and DSMs of artisanal mining pits. The results of the study indicate that UAS imagery and SfM photogrammetric techniques permit DSMs to be produced with a high degree of precision and relative accuracy, but highlight the challenges of mapping small artisanal mining pits in remote and data sparse terrain.

  8. Observations on strategic planning of information technology in the Indian coal mining industry

    Energy Technology Data Exchange (ETDEWEB)

    Owen, D.

    1988-05-01

    A view of the needs and plans to improve the coal mining industry of India is presented, focusing primarily on telecommunications and computerization. Further, details on mining electronics and vendor relationships with foreign firms are also discussed. 4 refs.

  9. WEB STRUCTURE MINING USING PAGERANK, IMPROVED PAGERANK – AN OVERVIEW

    Directory of Open Access Journals (Sweden)

    V. Lakshmi Praba

    2011-03-01

    Full Text Available Web mining is the extraction of interesting and potentially useful patterns and information from the Web. It covers Web documents, hyperlinks between documents, and usage logs of web sites. The significant tasks of web mining can be listed as Information Retrieval, Information Selection/Extraction, Generalization and Analysis. Web information retrieval tools consider only the text on pages and ignore information in the links. The goal of Web structure mining is to explore a structural summary of the web. Web structure mining, which focuses on link information, is an important aspect of web data. This paper presents an overview of PageRank and Improved PageRank and their working functionality in web structure mining.
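
    As an illustration of the link-based ranking this record describes, the following is a minimal sketch of the standard PageRank power iteration, not the paper's Improved PageRank variant. The function name `pagerank`, the damping factor of 0.85, the even redistribution of rank from pages without out-links, and the toy link graph are all assumptions made for this example.

```python
def pagerank(links, damping=0.85, tol=1e-10, max_iter=200):
    """Power-iteration PageRank on a dict {page: [pages it links to]}."""
    pages = sorted(set(links) | {t for targets in links.values() for t in targets})
    n = len(pages)
    rank = {p: 1.0 / n for p in pages}
    for _ in range(max_iter):
        # Rank held by dangling pages (no out-links) is spread evenly (assumption).
        dangling = sum(rank[p] for p in pages if not links.get(p))
        new_rank = {p: (1.0 - damping) / n + damping * dangling / n for p in pages}
        for page, targets in links.items():
            if targets:
                # Each page passes an equal share of its rank along every out-link.
                share = damping * rank[page] / len(targets)
                for target in targets:
                    new_rank[target] += share
        # Stop once the total change between iterations is negligible.
        if sum(abs(new_rank[p] - rank[p]) for p in pages) < tol:
            return new_rank
        rank = new_rank
    return rank


# Hypothetical four-page web: C is linked from A, B and D; nothing links to D.
toy_web = {"A": ["B", "C"], "B": ["C"], "C": ["A"], "D": ["C"]}
ranks = pagerank(toy_web)
```

    In this toy graph the page with the most in-links (C) ends up with the highest score, while D, which no page links to, keeps only the baseline share, which is the behaviour that text-only retrieval tools miss.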

  10. Earth Science Data Analytics: Preparing for Extracting Knowledge from Information

    Science.gov (United States)

    Kempler, Steven; Barbieri, Lindsay

    2016-01-01

    Data analytics is the process of examining large amounts of data of a variety of types to uncover hidden patterns, unknown correlations and other useful information. Data analytics is a broad term that includes data analysis, as well as an understanding of the cognitive processes an analyst uses to understand problems and explore data in meaningful ways. Analytics also includes data extraction, transformation, and reduction, utilizing specific tools, techniques, and methods. Turning to data science, definitions of data science sound very similar to those of data analytics (which leads to a lot of the confusion between the two). But the skills needed for both (co-analyzing large amounts of heterogeneous data, understanding and utilizing relevant tools and techniques, and subject matter expertise), although similar, serve different purposes. Data analytics takes a practitioner's approach to applying expertise and skills to solve issues and gain subject knowledge. Data science is more theoretical (research in itself) in nature, providing strategic actionable insights and new innovative methodologies. Earth Science Data Analytics (ESDA) is the process of examining, preparing, reducing, and analyzing large amounts of spatial (multi-dimensional), temporal, or spectral data using a variety of data types to uncover patterns, correlations and other information, to better understand our Earth. The large variety of datasets (temporal and spatial differences, data types, formats, etc.) creates the need for data analytics skills that combine an understanding of the science domain with data preparation, reduction, and analysis techniques, from a practitioner's point of view. The application of these skills to ESDA is the focus of this presentation. The Earth Science Information Partners (ESIP) Federation Earth Science Data Analytics (ESDA) Cluster was created in recognition of the practical need to facilitate the co-analysis of large amounts of data and information for Earth science. Thus, from a to

  11. 77 FR 62266 - Proposed Extension of Existing Information Collection; Daily Inspection of Surface Coal Mines...

    Science.gov (United States)

    2012-10-12

    ... conducting an on shift examination for hazardous conditions, mine operators better ensure a safe working environment for the miners and a reduction in accidents. II. Desired Focus of Comments The Mine Safety and... (30 CFR 77.1713) requires coal mine operators to conduct examinations of each active working area of...

  12. Information Management of Health and Safety at the Tarkwa Mine of ...

    African Journals Online (AJOL)

    The Tarkwa Mine (TM) of Goldfields Ghana Limited (GGL) undertakes open pit mining operations with gold recovery by heap leach technology. As a mine, it is susceptible to health and safety risks in its operations. In spite of health and safety policy and regulations put in place at the TM, there have been reported cases of ...

  13. Potential for improved extraction of tellurium as a byproduct of current copper mining processes

    Science.gov (United States)

    Hayes, S. M.; Spaleta, K. J.; Skidmore, A. E.

    2016-12-01

    Tellurium (Te) is classified as a critical element due to its increasing use in high technology applications, low average crustal abundance (3 μg kg⁻¹), and primary source as a byproduct of copper extraction. Although Te can be readily recovered from copper processing, previous studies have estimated a 4 percent extraction efficiency, and few studies have addressed Te behavior during the entire copper extraction process. The goals of the present study are to perform a mass balance examining Te behavior during copper extraction and to connect these observations with mineralogy of Te-bearing phases which are essential first steps in devising ways to optimize Te recovery. Our preliminary mass balance results indicate that less than 3 percent of Te present in copper ore is recovered, with particularly high losses during initial concentration of copper ore minerals by flotation. Tellurium is present in the ore in telluride minerals (e.g., Bi-Te-S phases, altaite, and Ag-S-Se-Te phases identified using electron microprobe) with limited substitution into sulfide minerals (possibly 10 mg kg⁻¹ Te in bulk pyrite and chalcopyrite). This work has also identified Te accumulation in solid-phase intermediate extraction products that could be further processed to recover Te, including smelter dusts (158 mg kg⁻¹) and pressed anode slimes (2.7 percent by mass). In both the smelter dusts and anode slimes, X-ray absorption spectroscopy indicates that about two thirds of the Te is present as reduced tellurides. In anode slimes, electron microscopy shows that the remaining Te is present in an oxidized form in a complex Te-bearing oxidate phase also containing Pb, Cu, Ag, As, Sb, and S. These results clearly indicate that more efficient, increased recovery of Te may be possible, likely at minimal expense from operating copper processing operations, thereby providing more Te for manufacturing of products such as inexpensive high-efficiency solar panels.

  14. CHANGE OF PARADIGM IN UNDERGROUND HARD COAL MINING THROUGH EXTRACTION AND CAPITALIZATION OF METHANE FOR ENERGY PRODUCTION

    Directory of Open Access Journals (Sweden)

    Valeriu PLESEA

    2014-05-01

    Full Text Available Besides oil and gas, coal is the most important fossil fuel for energy production. In our country's energy mix, domestic gas production covers 80% of the required annual consumption of about 14 billion cubic meters, with the remaining 20% ensured by imports from the Russian company Gazprom. The share of coal in the National Power System (NPS) is 24%, and coal is one of the most profitable energy production sources, taking into account the continuous increase of the gas price and its dependence on external suppliers. Given the pollution of the atmosphere and the global warming caused by the substantial release of greenhouse gases, including carbon dioxide, from coal burning for energy production in thermal power plants, new solutions are required for keeping the environment clean. One such solution is presented in the study and analysis in this paper: the extraction and capitalization of methane from coal deposits and from the underground spaces left free after mine closures. Underground methane extraction is considered even more opportune because, during coal exploitation, large quantities of this combustible gas are released and exhausted into the atmosphere by the surface degasification and ventilation stations. This represents an important pollution factor for the environment, as methane is a greenhouse gas with a high global warming potential (GWP), about 21 times higher than that of carbon dioxide.

  15. Testing the reliability of information extracted from ancient zircon

    Science.gov (United States)

    Kielman, Ross; Whitehouse, Martin; Nemchin, Alexander

    2015-04-01

    Studies combining zircon U-Pb chronology, trace element distribution as well as O and Hf isotope systematics are a powerful way to gain understanding of the processes shaping Earth's evolution, especially in detrital populations where constraints from the original host are missing. Such studies of the Hadean detrital zircon population abundant in sedimentary rocks in Western Australia have involved analysis of an unusually large number of individual grains, but also highlighted potential problems with the approach, only apparent when multiple analyses are obtained from individual grains. A common feature of the Hadean as well as many early Archaean zircon populations is their apparent inhomogeneity, which reduces confidence in conclusions based on studies combining chemistry and isotopic characteristics of zircon. In order to test the reliability of information extracted from early Earth zircon, we report results from one of the first in-depth multi-method studies of zircon from a relatively simple early Archaean magmatic rock, used as an analogue to ancient detrital zircon. The approach involves making multiple SIMS analyses in individual grains in order to be comparable to the most advanced studies of detrital zircon populations. The investigated sample is a relatively undeformed, non-migmatitic ca. 3.8 Ga tonalite collected a few km south of the Isua Greenstone Belt, southwest Greenland. Extracted zircon grains can be combined into three different groups based on the behavior of their U-Pb systems: (i) grains that show internally consistent and concordant ages and define an average age of 3805±15 Ma, taken to be the age of the rock, (ii) grains that are distributed close to the concordia line, but with significant variability between multiple analyses, suggesting an ancient Pb loss and (iii) grains that have multiple analyses distributed along a discordia pointing towards a zero intercept, indicating geologically recent Pb-loss. This overall behavior has

  16. Knowledge-driven information mining in remote-sensing image archives

    Science.gov (United States)

    Datcu, M.; Seidel, K.; D'Elia, S.; Marchetti, P. G.

    2002-05-01

    Users in all domains require information or information-related services that are focused, concise, reliable, low cost and timely and which are provided in forms and formats compatible with the user's own activities. In the current Earth Observation (EO) scenario, the archiving centres generally only offer data, images and other "low level" products. The user's needs are being only partially satisfied by a number of, usually small, value-adding companies applying time-consuming (mostly manual) and expensive processes relying on the knowledge of experts to extract information from those data or images.

  17. Text Mining in Biomedical Domain with Emphasis on Document Clustering.

    Science.gov (United States)

    Renganathan, Vinaitheerthan

    2017-07-01

    With the exponential increase in the number of articles published every year in the biomedical domain, there is a need to build automated systems to extract unknown information from the articles published. Text mining techniques enable the extraction of unknown knowledge from unstructured documents. This paper reviews text mining processes in detail and the software tools available to carry out text mining. It also reviews the roles and applications of text mining in the biomedical domain. Text mining processes, such as search and retrieval of documents, pre-processing of documents, natural language processing, methods for text clustering, and methods for text classification are described in detail. Text mining techniques can facilitate the mining of vast amounts of knowledge on a given topic from published biomedical research articles and draw meaningful conclusions that are not possible otherwise.

  18. Prospecting for dinosaurs on the mining frontier: The value of information in America's Gilded Age.

    Science.gov (United States)

    Rieppel, Lukas

    2015-04-01

    How much is a dinosaur worth? This essay offers an account of the way vertebrate fossils were priced in late 19th-century America to explore the process by which monetary values are established in science. Examining a long and drawn-out negotiation over the sale of an unusually rich dinosaur quarry in Wyoming, I argue that, on their own, abstract market principles did not suffice to mediate between supply and demand. Rather, people haggling over the price of dinosaur bones looked to social norms from the mineral industry for cues on how to value these rare and unusual objects, adopting a set of negotiation tactics that exploited asymmetries in the distribution of scarce information to secure the better end of the deal. On the mining frontier in America's Gilded Age, dinosaurs were thus valued in much the same way as any other scarce natural resource one could dig out of the ground, including gold, silver, and coal.

  19. Mining for diagnostic information in body surface potential maps: A comparison of feature selection techniques

    Directory of Open Access Journals (Sweden)

    McCullagh Paul J

    2005-09-01

    Full Text Available Abstract. Background: In body surface potential mapping, increased spatial sampling is used to allow more accurate detection of a cardiac abnormality. Although diagnostically superior to more conventional electrocardiographic techniques, the perceived complexity of the Body Surface Potential Map (BSPM) acquisition process has prohibited its acceptance in clinical practice. For this reason there is an interest in striking a compromise between the minimum number of electrocardiographic recording sites required to sample the maximum electrocardiographic information. Methods: In the current study, several techniques widely used in the domains of data mining and knowledge discovery have been employed to mine for diagnostic information in 192-lead BSPMs. In particular, the Single Variable Classifier (SVC) based filter and Sequential Forward Selection (SFS) based wrapper approaches to feature selection have been implemented and evaluated. Using a set of recordings from 116 subjects, the diagnostic ability of subsets of 3, 6, 9, 12, 24 and 32 electrocardiographic recording sites has been evaluated based on their ability to correctly assess the presence or absence of Myocardial Infarction (MI). Results: It was observed that the wrapper approach, using sequential forward selection and a 5-nearest-neighbour classifier, was capable of choosing a set of 24 recording sites that could correctly classify 82.8% of BSPMs. Although the filter method performed slightly less favourably, the performance was comparable with a classification accuracy of 79.3%. In addition, experiments were conducted to show how (a) features chosen using the wrapper approach were specific to the classifier used in the selection model, and (b) lead subsets chosen were not necessarily unique. Conclusion: It was concluded that both the filter and wrapper approaches adopted were suitable for guiding the choice of recording sites useful for determining the presence of MI. It should be noted however
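
    The wrapper approach named in this record can be sketched as follows. This is only an illustrative toy under stated assumptions, not the authors' implementation: the helper names `knn_accuracy` and `sequential_forward_selection`, the leave-one-out scoring, and the eight made-up three-feature samples are hypothetical, and k = 3 is used on the toy data because it is too small for the 5-nearest-neighbour classifier reported in the abstract.

```python
def knn_accuracy(samples, labels, features, k=5):
    """Leave-one-out accuracy of a k-NN classifier restricted to `features`."""
    correct = 0
    for i, query in enumerate(samples):
        # Squared Euclidean distance to every other sample, on the chosen features only.
        neighbours = sorted(
            (sum((query[f] - other[f]) ** 2 for f in features), j)
            for j, other in enumerate(samples) if j != i
        )[:k]
        votes = [labels[j] for _, j in neighbours]
        predicted = max(sorted(set(votes)), key=votes.count)
        correct += predicted == labels[i]
    return correct / len(samples)


def sequential_forward_selection(samples, labels, n_select, k=5):
    """Greedily add the feature that most improves the wrapped classifier."""
    selected = []
    remaining = list(range(len(samples[0])))
    while remaining and len(selected) < n_select:
        best = max(remaining,
                   key=lambda f: knn_accuracy(samples, labels, selected + [f], k))
        selected.append(best)
        remaining.remove(best)
    return selected


# Made-up data: feature 0 separates the classes, feature 1 is noise, feature 2 is constant.
samples = [(0.1, 5.0, 1.0), (0.2, 1.0, 1.0), (0.0, 3.0, 1.0), (0.3, 4.0, 1.0),
           (1.0, 4.5, 1.0), (1.1, 1.5, 1.0), (0.9, 3.5, 1.0), (1.2, 2.0, 1.0)]
labels = [0, 0, 0, 0, 1, 1, 1, 1]
chosen = sequential_forward_selection(samples, labels, n_select=1, k=3)
```

    Because the selection criterion is the accuracy of the wrapped classifier itself, the chosen subset depends on that classifier, which is consistent with finding (a) in the abstract.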

  20. Extraction of CT dose information from DICOM metadata: automated Matlab-based approach.

    Science.gov (United States)

    Dave, Jaydev K; Gingold, Eric L

    2013-01-01

    The purpose of this study was to extract exposure parameters and dose-relevant indexes of CT examinations from information embedded in DICOM metadata. DICOM dose report files were identified and retrieved from a PACS. An automated software program was used to extract from these files information from the structured elements in the DICOM metadata relevant to exposure. Extracting information from DICOM metadata eliminated potential errors inherent in techniques based on optical character recognition, yielding 100% accuracy.

  1. Applications of Geomatics in Surface Mining

    Science.gov (United States)

    Blachowski, Jan; Górniak-Zimroz, Justyna; Milczarek, Wojciech; Pactwa, Katarzyna

    2017-12-01

    In terms of the method of extracting minerals from a deposit, mining can be classified into surface, underground, and borehole mining. Surface mining is a form of mining in which the soil and the rock covering the mineral deposits are removed. Types of surface mining include mainly strip and open-cast methods, as well as quarrying. Tasks associated with surface mining of minerals include: resource estimation and deposit documentation, mine planning and deposit access, mine plant development, extraction of minerals from deposits, mineral and waste processing, and reclamation of former mining grounds. At each stage of mining, geodata describing changes occurring in space during the entire life cycle of a surface mining project should be taken into consideration, i.e. collected, analysed, processed, examined, distributed. These data result from direct (e.g. geodetic) and indirect (i.e. remote or relative) measurements and observations including airborne and satellite methods, geotechnical, geological and hydrogeological data, and data from other types of sensors, e.g. located on mining equipment and infrastructure, mine plans and maps. Management of such vast sources and sets of geodata, as well as information resulting from processing, integrated analysis and examining such data can be facilitated with geomatic solutions. Geomatics is a discipline of gathering, processing, interpreting, storing and delivering spatially referenced information. Thus, geomatics integrates methods and technologies used for collecting, management, processing, visualizing and distributing spatial data. In other words, its meaning covers practically every method and tool from spatial data acquisition to distribution. In this work examples of application of geomatic solutions in surface mining on representative case studies in various stages of mine operation have been presented. These applications include: prospecting and documenting mineral deposits, assessment of land accessibility

  2. Point Cloud Classification of Tesserae from Terrestrial Laser Data Combined with Dense Image Matching for Archaeological Information Extraction

    Science.gov (United States)

    Poux, F.; Neuville, R.; Billen, R.

    2017-08-01

    Reasoning from information extraction given by point cloud data mining allows contextual adaptation and fast decision making. However, to achieve this perceptive level, a point cloud must be semantically rich, retaining relevant information for the end user. This paper presents an automatic knowledge-based method for pre-processing multi-sensory data and classifying a hybrid point cloud from both terrestrial laser scanning and dense image matching. Using 18 features including sensor's biased data, each tessera in the high-density point cloud from the 3D captured complex mosaics of Germigny-des-Prés (France) is segmented via a colour multi-scale abstraction-based featuring extracting connectivity. A 2D surface and outline polygon of each tessera are generated by a RANSAC plane extraction and convex hull fitting. Knowledge is then used to classify every tessera based on its size, surface, shape, material properties and its neighbours' class. The detection and semantic enrichment method shows promising results of 94% correct semantization, a first step toward the creation of an archaeological smart point cloud.
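
    The outline-polygon step mentioned in this record can be illustrated with a small convex hull sketch. It assumes the points of one tessera have already been projected into the 2D coordinates of the fitted plane (the RANSAC plane extraction itself is not shown), and it uses Andrew's monotone chain algorithm as a stand-in, since the abstract does not say which hull algorithm the authors used. The function names and sample coordinates are hypothetical.

```python
def cross(o, a, b):
    """Z-component of the cross product (a - o) x (b - o); > 0 means a left turn."""
    return (a[0] - o[0]) * (b[1] - o[1]) - (a[1] - o[1]) * (b[0] - o[0])


def convex_hull(points):
    """Andrew's monotone chain: hull vertices in counter-clockwise order."""
    pts = sorted(set(points))
    if len(pts) <= 2:
        return pts
    lower, upper = [], []
    for p in pts:
        # Drop points that would make a clockwise or straight turn.
        while len(lower) >= 2 and cross(lower[-2], lower[-1], p) <= 0:
            lower.pop()
        lower.append(p)
    for p in reversed(pts):
        while len(upper) >= 2 and cross(upper[-2], upper[-1], p) <= 0:
            upper.pop()
        upper.append(p)
    # The last point of each chain is the first point of the other.
    return lower[:-1] + upper[:-1]


def polygon_area(polygon):
    """Shoelace formula for the area enclosed by an ordered polygon."""
    n = len(polygon)
    twice_area = sum(polygon[i][0] * polygon[(i + 1) % n][1]
                     - polygon[(i + 1) % n][0] * polygon[i][1] for i in range(n))
    return abs(twice_area) / 2.0


# Hypothetical in-plane coordinates of one tessera (arbitrary units).
tessera_points = [(0, 0), (2, 0), (2, 2), (0, 2), (1, 1), (1, 0)]
outline = convex_hull(tessera_points)
surface = polygon_area(outline)
```

    The hull gives the outline polygon and the shoelace area gives a 2D surface value, the kind of size and shape attributes the record says are then used for classification.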

  3. IMPROVEMENT EVALUATION ON CERAMIC ROOF EXTRACTION USING WORLDVIEW-2 IMAGERY AND GEOGRAPHIC DATA MINING APPROACH

    Directory of Open Access Journals (Sweden)

    V. S. Brum-Bastos

    2016-06-01

    Full Text Available Advances in geotechnologies and in remote sensing have improved analysis of urban environments. The new sensors are increasingly suited to urban studies, due to the enhancement in spatial, spectral and radiometric resolutions. Urban environments present high heterogeneity, which cannot be tackled using pixel–based approaches on high resolution images. Geographic Object–Based Image Analysis (GEOBIA) has been consolidated as a methodology for urban land use and cover monitoring; however, classification of high resolution images is still troublesome. This study aims to assess the improvement in ceramic roof classification using WorldView-2 images due to the addition of 4 new bands besides the standard “Blue-Green-Red-Near Infrared” bands. Our methodology combines GEOBIA, the C4.5 classification tree algorithm, Monte Carlo simulation and statistical tests for classification accuracy. Two sample groups were considered: (1) eight multispectral and panchromatic bands, and (2) four multispectral and panchromatic bands, representing previous high-resolution sensors. The C4.5 algorithm generates a decision tree that can be used for classification; smaller decision trees are closer to the semantic networks produced by experts on GEOBIA, while bigger trees are not straightforward to implement manually, but are more accurate. The choice of a big or small tree relies on the user’s skills to implement it. This study aims to determine for what kind of user the addition of the 4 new bands might be beneficial: (1) the common user (smaller trees) or (2) a more skilled user with coding and/or data mining abilities (bigger trees). Overall, the classification was improved by the addition of the four new bands for both types of users.

  4. Medicaid Analytic eXtract (MAX) General Information

    Data.gov (United States)

    U.S. Department of Health & Human Services — The Medicaid Analytic eXtract (MAX) data is a set of person-level data files on Medicaid eligibility, service utilization, and payments. The MAX data are created to...

  5. Radiological evaluation near three old mines of uranium extraction in the department of Creuse - year 2007; Evaluation radiologique aux abords de trois anciennes mines d'extraction d'uranium du departement de la Creuse - annee 2007

    Energy Technology Data Exchange (ETDEWEB)

    NONE

    2007-07-01

    The observations made at the three sites of 'Chaumaillat, Ribiere and Grands Champs' demonstrate the existence of an atypical radiological situation that appears to be marked by past mining activities. Although the geochemical context can sometimes be the origin of abnormalities in sediments and muds, the regional industrial context, combined with the high measured uranium values, leads us to favour a human origin to explain these abnormalities. The presence of almost pure uranium is presumed to result from past on-site ore treatment (leaching) to extract the raw material (yellow cake) used for manufacturing nuclear fuel. However, this observation at the 'Grands Champs' site is surprising, considering that the operator declared no in situ treatment activity and that there is no residue storage. Given the accessibility of these sites to the public and the cessation of all monitoring, a follow-up study seems necessary to estimate the importance of the radiological abnormalities and their persistent impact on the environment. (N.C.)

  6. Information from geology: Implications for soil formation and rehabilitation in the post coal mining environment, Bowen Basin, Australia

    International Nuclear Information System (INIS)

    Spain, A.V.; Esterle, J.; McLennan, T.P.T.

    1995-01-01

    The coal mining industry is likely to disturb as much as 60,000 ha of the Bowen Basin up to the year 2000. While comprising only a small proportion of the approximately 32,000 km² of the Bowen Basin, this considerable area will eventually need to be rehabilitated by creating appropriate land forms with a stabilizing and self-sustaining cover of vegetation. The job of restoring the disturbed area will fall to the practitioners of rehabilitation science. This paper briefly outlines the actual and potential significance of geological information to rehabilitation practice in the open-cut coal mining industry of the Bowen Basin. It focuses particularly on the problems of soil formation and the consequent limitations to ecosystem development due to the nature of the overburden materials and the environment. Lastly, it describes some of the distinctive features of the mine-soils of the area. Geological information can assist in the identification, classification, description and behaviour of post-mining materials. Potential inputs are not restricted to these and there is scope for wider inputs to management of the mining environment although the interface with biology requires further development. (author). 4 figs., 31 refs

  7. Opinion mining feature-level using Naive Bayes and feature extraction based analysis dependencies

    Science.gov (United States)

    Sanda, Regi; Baizal, Z. K. Abdurahman; Nhita, Fhira

    2015-12-01

    The development of the internet and technology has had a major impact and has given rise to a new kind of business called e-commerce. Many e-commerce sites make transactions convenient, and consumers can also provide reviews or opinions on the products they have purchased. These opinions can be used by both consumers and producers. Consumers can learn the advantages and disadvantages of a particular feature of a product. Producers can analyse their own strengths and weaknesses as well as those of their competitors' products. With so many opinions, a method is needed that lets the reader grasp the point of the opinions as a whole. The idea emerged from review summarization, which summarizes the overall opinion based on the sentiment and the features it contains. In this study, the domain of focus is the digital camera. The research consisted of four steps: 1) giving the system the knowledge to recognize the semantic orientation of an opinion; 2) identifying the features of the product; 3) identifying whether the opinion is positive or negative; 4) summarizing the result. The methods discussed in this research include Naïve Bayes for sentiment classification, and a feature extraction algorithm based on dependency analysis, which is one of the tools in Natural Language Processing (NLP), together with a knowledge-based dictionary that is useful for handling implicit features. The end result of the research is a summary of consumer reviews organized by feature and sentiment. With the proposed method, the accuracy of sentiment classification is 81.2% for positive test data and 80.2% for negative test data, and the accuracy of feature extraction reaches 90.3%.
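
    The sentiment classification step named in this abstract can be illustrated with a minimal sketch of a multinomial Naïve Bayes classifier with add-one (Laplace) smoothing. This is not the authors' implementation: the tiny training set, the function names train_naive_bayes and classify, and the choice to ignore unseen words are all assumptions made for illustration.

```python
import math
from collections import Counter, defaultdict

def train_naive_bayes(labelled_docs):
    """Count documents per class and words per class.

    labelled_docs is a list of (tokens, label) pairs.
    """
    class_counts = Counter()
    word_counts = defaultdict(Counter)
    vocab = set()
    for tokens, label in labelled_docs:
        class_counts[label] += 1
        word_counts[label].update(tokens)
        vocab.update(tokens)
    return class_counts, word_counts, vocab

def classify(model, tokens):
    """Return the label with the highest log posterior."""
    class_counts, word_counts, vocab = model
    total_docs = sum(class_counts.values())
    best_label, best_score = None, -math.inf
    for label in class_counts:
        # Log prior: share of training documents with this label.
        score = math.log(class_counts[label] / total_docs)
        total_words = sum(word_counts[label].values())
        for tok in tokens:
            if tok not in vocab:
                continue  # assumption: words never seen in training are skipped
            # Add-one smoothing keeps unseen (word, class) pairs non-zero.
            score += math.log(
                (word_counts[label][tok] + 1) / (total_words + len(vocab))
            )
        if score > best_score:
            best_label, best_score = label, score
    return best_label

# Hypothetical digital-camera review snippets, invented for this sketch.
TRAIN = [
    ("sharp lens great picture".split(), "positive"),
    ("battery lasts long great zoom".split(), "positive"),
    ("blurry picture poor battery".split(), "negative"),
    ("poor zoom slow focus".split(), "negative"),
]

model = train_naive_bayes(TRAIN)
print(classify(model, "great lens".split()))
print(classify(model, "poor blurry focus".split()))
```

    A real system would train on a much larger labelled corpus and would pair each sentiment with the product feature found by the dependency analysis step, which this sketch does not attempt.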

  8. The BEL information extraction workflow (BELIEF): evaluation in the BioCreative V BEL and IAT track.

    Science.gov (United States)

    Madan, Sumit; Hodapp, Sven; Senger, Philipp; Ansari, Sam; Szostak, Justyna; Hoeng, Julia; Peitsch, Manuel; Fluck, Juliane

    2016-01-01

    Network-based approaches have become extremely important in systems biology to achieve a better understanding of biological mechanisms. For network representation, the Biological Expression Language (BEL) is well designed to collate findings from the scientific literature into biological network models. To facilitate encoding and biocuration of such findings in BEL, a BEL Information Extraction Workflow (BELIEF) was developed. BELIEF provides a web-based curation interface, the BELIEF Dashboard, that incorporates text mining techniques to support the biocurator in the generation of BEL networks. The underlying UIMA-based text mining pipeline (BELIEF Pipeline) uses several named entity recognition processes and relationship extraction methods to detect concepts and BEL relationships in literature. The BELIEF Dashboard allows easy curation of the automatically generated BEL statements and their context annotations. Resulting BEL statements and their context annotations can be syntactically and semantically verified to ensure consistency in the BEL network. In summary, the workflow supports experts in different stages of systems biology network building. Based on the BioCreative V BEL track evaluation, we show that the BELIEF Pipeline automatically extracts relationships with an F-score of 36.4% and fully correct statements can be obtained with an F-score of 30.8%. Participation in the BioCreative V Interactive task (IAT) track with BELIEF revealed a systems usability scale (SUS) of 67. Considering the complexity of the task for new users (learning BEL, working with a completely new interface, and performing complex curation), a score so close to the overall SUS average highlights the usability of BELIEF. Database URL: BELIEF is available at http://www.scaiview.com/belief/. © The Author(s) 2016. Published by Oxford University Press. All rights reserved. For permissions, please e-mail: journals.permissions@oup.com.

  9. A Digital Mixed Methods Research Design: Integrating Multimodal Analysis with Data Mining and Information Visualization for Big Data Analytics

    Science.gov (United States)

    O'Halloran, Kay L.; Tan, Sabine; Pham, Duc-Son; Bateman, John; Vande Moere, Andrew

    2018-01-01

    This article demonstrates how a digital environment offers new opportunities for transforming qualitative data into quantitative data in order to use data mining and information visualization for mixed methods research. The digital approach to mixed methods research is illustrated by a framework which combines qualitative methods of multimodal…

  10. Comparative Data Mining Analysis for Information Retrieval of MODIS Images: Monitoring Lake Turbidity Changes at Lake Okeechobee, Florida

    Science.gov (United States)

    In the remote sensing field, a frequently recurring question is: Which computational intelligence or data mining algorithms are most suitable for the retrieval of essential information given that most natural systems exhibit very high non-linearity. Among potential candidates mig...

  11. Process mining can be applied to software too!

    NARCIS (Netherlands)

    Rubin, V.A.; Mitsyuk, A.A.; Lomazova, I.A.; Aalst, van der W.M.P.

    2014-01-01

    Modern information systems produce tremendous amounts of event data. The area of process mining deals with extracting knowledge from this data. Real-life processes can be effectively discovered, analyzed and optimized with the help of mature process mining techniques. There is a variety of process

  12. Introduction to the JASIST Special Topic Issue on Web Retrieval and Mining: A Machine Learning Perspective.

    Science.gov (United States)

    Chen, Hsinchun

    2003-01-01

    Discusses information retrieval techniques used on the World Wide Web. Topics include machine learning in information extraction; relevance feedback; information filtering and recommendation; text classification and text clustering; Web mining, based on data mining techniques; hyperlink structure; and Web size. (LRW)

  13. Data Mining in Education : A Review on the Knowledge Discovery Perspective

    OpenAIRE

    Pratiyush Guleria; Manu Sood

    2014-01-01

    Knowledge Discovery in Databases is the process of finding knowledge in massive amounts of data, where data mining is the core of this process. Data mining can be used to mine understandable, meaningful patterns from large databases, and these patterns may then be converted into knowledge. Data mining is the process of extracting the information and patterns derived by the KDD process, which helps in crucial decision-making. Data mining works with data warehouse and...

  14. A Remote Sensing Approach to Environmental Monitoring in a Reclaimed Mine Area

    OpenAIRE

    Rajchandar Padmanaban; Avit K. Bhowmik; Pedro Cabral

    2017-01-01

    Padmanaban, R., Bhowmik, A. K., & Cabral, P. (2017). A Remote Sensing Approach to Environmental Monitoring in a Reclaimed Mine Area. ISPRS International Journal of Geo-Information, 6(12), 1-14. [401]. DOI: 10.3390/ijgi6120401 Mining for resources extraction may lead to geological and associated environmental changes due to ground movements, collision with mining cavities, and deformation of aquifers. Geological changes may continue in a reclaimed mine area, and the deformed aquifers may en...

  15. WEKA-G: Parallel data mining on computational grids

    Directory of Open Access Journals (Sweden)

    PIMENTA, A.

    2009-12-01

    Data mining is a technology that can extract useful information from large amounts of data. However, mining a database often requires high computational power. To address this problem, this paper presents a tool (Weka-G) that runs the algorithms used in the data mining process in parallel. As the environment for doing so, we use a computational grid, adding several features within a WAN.

  16. Automated Trait Extraction using ClearEarth, a Natural Language Processing System for Text Mining in Natural Sciences

    OpenAIRE

    Thessen,Anne; Preciado,Jenette; Jain,Payoj; Martin,James; Palmer,Martha; Bhat,Riyaz

    2018-01-01

    The cTAKES package (using the ClearTK Natural Language Processing toolkit Bethard et al. 2014, http://cleartk.github.io/cleartk/) has been successfully used to automatically read clinical notes in the medical field (Albright et al. 2013, Styler et al. 2014). It is used on a daily basis to automatically process clinical notes and extract relevant information by dozens of medical institutions. ClearEarth is a collaborative project that brings together computational linguistics and domain scient...

  17. Information Extraction with Character-level Neural Networks and Free Noisy Supervision

    OpenAIRE

    Meerkamp, Philipp; Zhou, Zhengyi

    2016-01-01

    We present an architecture for information extraction from text that augments an existing parser with a character-level neural network. The network is trained using a measure of consistency of extracted data with existing databases as a form of noisy supervision. Our architecture combines the ability of constraint-based information extraction systems to easily incorporate domain knowledge and constraints with the ability of deep neural networks to leverage large amounts of data to learn compl...

  18. A Hybrid Information Mining Approach for Knowledge Discovery in Cardiovascular Disease (CVD)

    Directory of Open Access Journals (Sweden)

    Stefania Pasanisi

    2018-04-01

    The healthcare ambit is usually perceived as “information rich” yet “knowledge poor”. Nowadays, an unprecedented effort is underway to increase the use of business intelligence techniques to solve this problem. Heart disease (HD) is a major cause of mortality in modern society. This paper analyzes the risk factors that have been identified in cardiovascular disease (CVD) surveillance systems. The Heart Care study identifies attributes related to CVD risk (gender, age, smoking habit, etc.) and other dependent variables that include a specific form of CVD (diabetes, hypertension, cardiac disease, etc.). In this paper, we combine Clustering, Association Rules, and Neural Networks for the assessment of heart-event-related risk factors, targeting the reduction of CVD risk. With the use of the K-means algorithm, significant groups of patients are found. Then, the Apriori algorithm is applied in order to understand the kinds of relations between the attributes within the dataset, first looking within the whole dataset and then refining the results through the subsets defined by the clusters. Finally, both results allow us to better define patients’ characteristics in order to make predictions about CVD risk with a Multilayer Perceptron Neural Network. The results obtained with the hybrid information mining approach indicate that it is an effective strategy for knowledge discovery concerning chronic diseases, particularly for CVD risk.
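
    The Apriori step described in this abstract can be sketched as a level-wise search for frequent attribute sets, followed by a confidence calculation for one rule. The sketch is simplified (it omits Apriori's subset-pruning step and relies on the support filter alone), and the patient records, attribute names and the 0.6 support threshold are invented for illustration rather than taken from the Heart Care study.

```python
from itertools import combinations

def support(itemset, transactions):
    """Fraction of transactions that contain every item in itemset."""
    itemset = frozenset(itemset)
    return sum(itemset <= t for t in transactions) / len(transactions)

def apriori(transactions, min_support):
    """Level-wise search: frequent k-itemsets are joined into (k+1)-candidates."""
    items = sorted({item for t in transactions for item in t})
    candidates = [frozenset([item]) for item in items]
    frequent = {}
    while candidates:
        level = {c: support(c, transactions) for c in candidates}
        survivors = [c for c, s in level.items() if s >= min_support]
        frequent.update({c: level[c] for c in survivors})
        # Join pairs of survivors whose union is exactly one item larger.
        size = len(survivors[0]) + 1 if survivors else 0
        candidates = list(
            {a | b for a, b in combinations(survivors, 2) if len(a | b) == size}
        )
    return frequent

def confidence(antecedent, consequent, transactions):
    """Confidence of the rule antecedent -> consequent."""
    both = set(antecedent) | set(consequent)
    return support(both, transactions) / support(antecedent, transactions)

# Hypothetical patient records; each record is a set of observed attributes.
RECORDS = [
    frozenset({"smoker", "hypertension", "cvd"}),
    frozenset({"smoker", "hypertension"}),
    frozenset({"smoker", "cvd"}),
    frozenset({"smoker", "hypertension", "cvd"}),
    frozenset({"nonsmoker"}),
]

frequent = apriori(RECORDS, min_support=0.6)
for itemset, s in sorted(frequent.items(), key=lambda kv: (len(kv[0]), sorted(kv[0]))):
    print(sorted(itemset), round(s, 2))
print(round(confidence({"smoker"}, {"cvd"}, RECORDS), 2))
```

    Running the same functions on each K-means cluster separately, instead of on the whole record list, would mirror the refinement through cluster-defined subsets that the abstract describes.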

  19. Application of Text Mining to Extract Hotel Attributes and Construct Perceptual Map of Five Star Hotels from Online Review: Study of Jakarta and Singapore Five-Star Hotels

    Directory of Open Access Journals (Sweden)

    Arga Hananto

    2015-12-01

    The use of post-purchase online consumer reviews in hotel attribute studies is still scarce in the literature. Arguably, post-purchase online review data would yield more accurate attributes that consumers actually consider in their purchase decision. This study aims to extract attributes from two samples of five-star hotel reviews (Jakarta and Singapore) with text mining methodology. In addition, this study also aims to describe the positioning of five-star hotels in Jakarta and Singapore based on the extracted attributes using Correspondence Analysis. This study finds that reviewers of five-star hotels in both cities mentioned similar attributes such as service, staff, club, location, pool and food. Attributes derived from text mining seem to be viable input for building a fairly accurate positioning map of hotels. This study has demonstrated the viability of online reviews as a source of data for hotel attribute and positioning studies.

  20. An Effective Approach to Biomedical Information Extraction with Limited Training Data

    Science.gov (United States)

    Jonnalagadda, Siddhartha

    2011-01-01

    In the current millennium, extensive use of computers and the internet caused an exponential increase in information. Few research areas are as important as information extraction, which primarily involves extracting concepts and the relations between them from free text. Limitations in the size of training data, lack of lexicons and lack of…

  1. Information system for preserving culture heritage in areas affected by heavy industry and mining

    Science.gov (United States)

    Pacina, Jan; Kopecký, Jiří; Bedrníková, Lenka; Handrychová, Barbora; Švarcová, Martina; Holá, Markéta; Pončíková, Edita

    2014-05-01

    The natural development of the Ústí region (North-West Bohemia, the Czech Republic) has been affected by human activity during the past hundred years. Heavy industrialization and brown coal mining have completely changed the land use in the region. The open-pit coal mines are completely destroying the surrounding landscape, including settlements, communications, the hydrological network and the overall natural development of the region. The other factor affecting the natural development of the landscape, land use and settlement was the political situation in 1945 (the end of the Second World War), when the borderland was depopulated. All these factors caused more than two hundred colonies, villages and towns to vanish during this period. The task of this project is to prepare and offer for public use a comprehensive information system preserving the cultural heritage in the form of processed old maps, aerial imagery, land-use and georelief reconstructions, local studies, and text and photo documents covering the extinct landscape and settlements. A wide range of maps was used for this area: Müller's map of Bohemia (ca. 1720), followed by the 1st, 2nd and 3rd Military Surveys of the Habsburg empire (1792, 1894, 1938), maps of the Stabile cadaster (ca. 1840) and the State map derived at the scale 1:5000 (1953, 1972, 1981). All the maps were processed, georeferenced and hand digitized, and are further used as base layers for visualization and analysis. The historical aerial imagery was processed using standard photogrammetric methods and covers the years 1938 and 1953 and the current state. The other important task covered by this project is the georelief reconstruction. We use the old maps and aerial imagery to reconstruct the complete timeline of the georelief development. This timeline covers the period from 1938 until now. The derived digital terrain models are further analyzed and printed on a 3D printer. Other reconstruction tasks are performed using

  2. A rapid extraction of landslide disaster information research based on GF-1 image

    Science.gov (United States)

    Wang, Sai; Xu, Suning; Peng, Ling; Wang, Zhiyi; Wang, Na

    2015-08-01

    In recent years, landslide disasters have occurred frequently because of seismic activity. They bring great harm to people's lives and have attracted close attention from the state and extensive concern from society. In the field of geological disasters, landslide information extraction based on remote sensing has been controversial, but high-resolution remote sensing images, with their rich texture and geometry information, can effectively improve the accuracy of information extraction. Therefore, it is feasible to extract information on large-scale earthquake-triggered landslides with serious surface damage. Taking Wenchuan county as the study area, this paper uses a multi-scale segmentation method to extract landslide image objects from domestic GF-1 images and DEM data, using an estimation-of-scale-parameter tool to determine the optimal segmentation scale. After comprehensively analyzing the characteristics of landslides in high-resolution images and selecting spectral, texture, geometric and landform features of the image, we establish extraction rules to extract landslide disaster information. The extraction results show that there are 20 landslides with a total area of 521279.31. Compared with visual interpretation results, the extraction accuracy is 72.22%. This study indicates that it is efficient and feasible to extract earthquake landslide disaster information based on high-resolution remote sensing, and it provides important technical support for post-disaster emergency investigation and disaster assessment.

  3. VALUING ACID MINE DRAINAGE REMEDIATION IN WEST VIRGINIA: A HEDONIC MODELING APPROACH INCORPORATING GEOGRAPHIC INFORMATION SYSTEMS

    Science.gov (United States)

    States with active and abandoned mines face large private and public costs to remediate damage to streams and rivers from acid mine drainage (AMD). Appalachian states have an especially large number of contaminated streams and rivers, and the USGS places AMD as the primary source...

  4. The Application of Chinese High-Spatial Remote Sensing Satellite Image in Land Law Enforcement Information Extraction

    Science.gov (United States)

    Wang, N.; Yang, R.

    2018-04-01

    Chinese high-resolution (HR) remote sensing satellites have made a huge leap in the past decade. Commercial satellite datasets such as GF-1, GF-2 and ZY-3 images, with panchromatic (PAN) resolutions of 2 m, 1 m and 2.1 m and multispectral (MS) resolutions of 8 m, 4 m and 5.8 m respectively, have emerged in recent years. Chinese HR satellite imagery can be downloaded free of charge for public welfare purposes. Local governments have begun to employ more professional technicians to improve traditional land management technology. This paper focuses on analysing the actual requirements of applications in government land law enforcement in Guangxi Autonomous Region. 66 counties in Guangxi Autonomous Region were selected for illegal land utilization spot extraction with fused Chinese HR images. The procedure comprises: A. Defining the illegal land utilization spot types. B. Data collection: GF-1, GF-2 and ZY-3 datasets were acquired in the first half of 2016, and other auxiliary data were collected in 2015. C. Batch processing: HR images were collected for batch preprocessing with the ENVI/IDL tool. D. Illegal land utilization spot extraction by visual interpretation. E. Obtaining attribute data with an ArcGIS Geoprocessor (GP) model. F. Thematic mapping and surveying. By analysing the results for 42 counties, law enforcement officials found 1092 illegal land use spots and 16 suspicious illegal mining spots. The results show that Chinese HR satellite images have great potential for feature information extraction and the processing procedure appears robust.

  5. A Text Mining Approach for Extracting Lessons Learned from Project Documentation: An Illustrative Case Study

    Directory of Open Access Journals (Sweden)

    Benjamin Matthies

    2017-12-01

    Lessons learned are important building blocks for continuous learning in project-based organisations. Nonetheless, the practical reality is that lessons learned are often not consistently reused for organisational learning. Two problems are commonly described in this context: information overload and the lack of procedures and methods for the assessment and implementation of lessons learned. This paper addresses these problems, and appropriate solutions are combined in a systematic lessons learned process. Latent Dirichlet Allocation is presented to solve the first problem. Regarding the second problem, established risk management methods are adapted. The entire lessons learned process is demonstrated in a practical case study.

  6. Machine learning classification of surgical pathology reports and chunk recognition for information extraction noise reduction.

    Science.gov (United States)

    Napolitano, Giulio; Marshall, Adele; Hamilton, Peter; Gavin, Anna T

    2016-06-01

    Machine learning techniques for the text mining of cancer-related clinical documents have not been sufficiently explored. Here some techniques are presented for the pre-processing of free-text breast cancer pathology reports, with the aim of facilitating the extraction of information relevant to cancer staging. The first technique was implemented using the freely available software RapidMiner to classify the reports according to their general layout: 'semi-structured' and 'unstructured'. The second technique was developed using the open source language engineering framework GATE and aimed at the prediction of chunks of the report text containing information pertaining to the cancer morphology, the tumour size, its hormone receptor status and the number of positive nodes. The classifiers were trained and tested respectively on sets of 635 and 163 manually classified or annotated reports, from the Northern Ireland Cancer Registry. The best result of 99.4% accuracy - which included only one semi-structured report predicted as unstructured - was produced by the layout classifier with the k nearest algorithm, using the binary term occurrence word vector type with stopword filter and pruning. For chunk recognition, the best results were found using the PAUM algorithm with the same parameters for all cases, except for the prediction of chunks containing cancer morphology. For semi-structured reports the performance ranged from 0.97 to 0.94 and from 0.92 to 0.83 in precision and recall, while for unstructured reports performance ranged from 0.91 to 0.64 and from 0.68 to 0.41 in precision and recall. Poor results were found when the classifier was trained on semi-structured reports but tested on unstructured. These results show that it is possible and beneficial to predict the layout of reports and that the accuracy of prediction of which segments of a report may contain certain information is sensitive to the report layout and the type of information sought.
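
    Precision and recall figures like those reported in this abstract compare predicted chunks against manually annotated ones. The following minimal sketch shows the arithmetic, including the F1 score as the harmonic mean of the two. The (sentence index, chunk type) pairs are made up for illustration, and exact matching of pairs is an assumption; the study's own matching criteria are not given here.

```python
def precision_recall_f1(predicted, gold):
    """Compare predicted items against gold annotations by exact match."""
    predicted, gold = set(predicted), set(gold)
    true_positives = len(predicted & gold)
    precision = true_positives / len(predicted) if predicted else 0.0
    recall = true_positives / len(gold) if gold else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1

# Hypothetical annotations: (sentence index, chunk type).
GOLD = {(0, "tumour size"), (1, "morphology"), (2, "positive nodes"), (3, "receptor status")}
PREDICTED = {(0, "tumour size"), (1, "morphology"), (4, "positive nodes")}

p, r, f1 = precision_recall_f1(PREDICTED, GOLD)
print(round(p, 3), round(r, 3), round(f1, 3))
```

    Two of the three predictions are correct and two of the four annotated chunks are found, so precision is higher than recall, the same direction of imbalance the abstract reports for unstructured reports.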

  7. Object-based image analysis and data mining for building ontology of informal urban settlements

    Science.gov (United States)

    Khelifa, Dejrriri; Mimoun, Malki

    2012-11-01

    During recent decades, unplanned settlements have appeared around the big cities in most developing countries and, as a consequence, numerous problems have emerged. Thus the identification of different kinds of settlements is a major concern and challenge for the authorities of many countries. Very High Resolution (VHR) remotely sensed imagery has proved to be a very promising way to detect different kinds of settlements, especially through the use of new object-based image analysis (OBIA). The key is understanding what characteristics make unplanned settlements differ from planned ones; most experts characterize unplanned urban areas by small building sizes at high densities, no orderly road arrangement and a lack of green spaces. Knowledge about different kinds of settlements can be captured as a domain ontology that has the potential to organize knowledge in a formal, understandable and sharable way. In this work we focus on extracting knowledge from VHR images and experts' knowledge. We used an object-based strategy, segmenting a VHR image taken over an urban area into regions of homogeneous pixels at an adequate scale level and then computing spectral, spatial and textural attributes for each region to create objects. Genetic-based data mining was applied to generate highly predictive and comprehensible classification rules based on selected samples from the OBIA result. Optimized intervals of relevant attributes are found and linked with land use types to form classification rules. The unplanned areas were separated from the planned ones by analyzing the line segments detected from the input image. Finally, a simple ontology was built based on the previous processing steps. The approach has been tested on VHR images of one of the biggest Algerian cities, which has grown considerably in recent decades.

  8. Towards an information extraction and knowledge formation framework based on Shannon entropy

    Directory of Open Access Journals (Sweden)

    Iliescu Dragoș

    2017-01-01

    The subject of information quantity is approached in this paper, taking the specific domain of nonconforming product management as the information source. This work is a case study. Raw data were gathered from a heavy industrial works company, and information extraction and knowledge formation are considered. The method used for estimating information quantity is based on the Shannon entropy formula. The information and entropy spectra are decomposed and analysed for the extraction of specific information and knowledge-that formation. The results of the entropy analysis point out the information that the organisation involved needs to acquire, presented as a specific knowledge type.
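
    The Shannon entropy formula referred to in this abstract is H = -Σ p_i log2(p_i), taken over the observed categories. A minimal sketch follows, assuming the source data can be reduced to a list of category labels; the nonconformity categories and their counts are hypothetical and not taken from the case study.

```python
import math
from collections import Counter

def shannon_entropy(observations):
    """Entropy in bits of the empirical distribution of the observations."""
    counts = Counter(observations)
    total = sum(counts.values())
    return -sum((n / total) * math.log2(n / total) for n in counts.values())

# Hypothetical nonconformity reports, labelled by defect category.
REPORTS = (
    ["dimension"] * 4
    + ["surface"] * 2
    + ["material"] * 1
    + ["assembly"] * 1
)

print(shannon_entropy(REPORTS))  # probabilities 1/2, 1/4, 1/8, 1/8 give 1.75 bits
```

    A single dominant category would push the entropy towards 0 bits, while eight equally likely categories would give 3 bits, so the value indicates how much is learned, on average, from each new report.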

  9. Mining dark information resources to develop new informatics capabilities to support science

    Science.gov (United States)

    Ramachandran, Rahul; Maskey, Manil; Bugbee, Kaylin

    2016-04-01

    Dark information resources are digital resources that organizations collect, process, and store for regular business or operational activities but fail to realize their potential for other purposes. The challenge for any organization is to recognize, identify and effectively exploit these dark information stores. Metadata catalogs at different data centers store dark information resources consisting of structured information, free form descriptions of data and browse images. These information resources are never fully exploited beyond a few fields used for search and discovery. For example, the NASA Earth science catalog holds greater than 6000 data collections, 127 million records for individual files and 67 million browse images. We believe that the information contained in the metadata catalogs and the browse images can be utilized beyond their original design intent to provide new data discovery and exploration pathways to support science and education communities. In this paper we present two research applications using information stored in the metadata catalog in a completely novel way. The first application is designing a data curation service. The objective of the data curation service is to augment the existing data search capabilities. Given a specific atmospheric phenomenon, the data curation service returns the user a ranked list of relevant data sets. Different fields in the metadata records including textual descriptions are mined. A specialized relevancy ranking algorithm has been developed that uses a "bag of words" to define phenomena along with an ensemble of known approaches such as the Jaccard Coefficient, Cosine Similarity and Zone ranking to rank the data sets. This approach is also extended to map from the data set level to data file variable level. The second application is focused on providing a service where a user can search and discover browse images containing specific phenomena from the vast catalog. This service will aid researchers
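
    Two of the similarity measures named in this abstract, the Jaccard coefficient and cosine similarity over a bag of words, can be sketched as follows. This is an illustration only: the phenomenon terms, the data set descriptions, the equal weighting of the two scores and the function names are assumptions, and zone ranking is left out.

```python
import math
from collections import Counter

def jaccard(a_tokens, b_tokens):
    """Shared distinct terms divided by all distinct terms."""
    a, b = set(a_tokens), set(b_tokens)
    return len(a & b) / len(a | b) if a | b else 0.0

def cosine(a_tokens, b_tokens):
    """Cosine of the angle between the two term-count vectors."""
    a, b = Counter(a_tokens), Counter(b_tokens)
    dot = sum(a[t] * b[t] for t in a)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(
        sum(v * v for v in b.values())
    )
    return dot / norm if norm else 0.0

def rank_datasets(phenomenon_terms, descriptions):
    """Rank data set names by the mean of the two scores (assumed weighting)."""
    scored = []
    for name, text in descriptions.items():
        tokens = text.lower().split()
        score = 0.5 * jaccard(phenomenon_terms, tokens) + 0.5 * cosine(
            phenomenon_terms, tokens
        )
        scored.append((score, name))
    return [name for score, name in sorted(scored, reverse=True)]

# Hypothetical bag of words for a phenomenon and hypothetical metadata text.
HURRICANE = ["hurricane", "wind", "precipitation", "cyclone"]
DESCRIPTIONS = {
    "A": "tropical cyclone wind speed precipitation",
    "B": "sea surface temperature monthly",
    "C": "hurricane track wind",
}

print(rank_datasets(HURRICANE, DESCRIPTIONS))
```

    Data set A shares three terms with the phenomenon, C shares two and B none, so the ranking follows that order; a production ranker would also need stemming, stop-word handling and field-specific weights.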

  10. STUDY ON PHYTO-EXTRACTION BALANCE OF ZN, CD AND PB FROM MINE-WASTE POLLUTED SOILS BY USING FESTUCA ARUNDINACEA AND LOLIUM PERENNE SPECIES

    Directory of Open Access Journals (Sweden)

    B. LIXANDRU

    2009-05-01

    Through the cultivation of tall fescue (Festuca arundinacea) and of perennial ryegrass (Lolium perenne) for two years on a chernozem type of soil in the Banat plain area, we investigated the phyto-extraction potential for Zn, Cd and Pb. A quantity of 20 kg of mine waste per square meter was incorporated into the experimental plot, in a mass ratio of 1:2.5. The polluting "contribution" of the mine waste was 1209 mg Zn / kg d.s., 4.70 mg Cd / kg d.s. and 188.2 mg Pb / kg d.s. The metal content of the soil was determined at the two moments of biomass harvesting, and through balance calculations we could establish the phyto-extraction efficiency of the two forage grass species. The results obtained indicate that Festuca arundinacea has an average phyto-extraction yield of 50% for Zn and Cd in the soil; in the case of an ionic excess of 3.5 to 4 times, the phyto-extraction efficiency is reduced, more obviously in the case of Pb (lead ions). The species Lolium perenne registers a yield of almost 92% in the phyto-extraction of Zn. The yield values for Cd and Pb are lower, but comparable with the control plot. Unlike Festuca arundinacea, the Lolium perenne species better tolerates the Cd and Pb ionic excess.

  11. CONAN : Text Mining in the Biomedical Domain

    NARCIS (Netherlands)

    Malik, R.

    2006-01-01

    This thesis is about text mining: extracting important information from literature. In recent years, the number of biomedical articles and journals has been growing exponentially. Scientists might not find the information they want because of the large number of publications. Therefore a system was

  12. Text Mining of Supreme Administrative Court Jurisdictions

    OpenAIRE

    Feinerer, Ingo; Hornik, Kurt

    2007-01-01

    Within the last decade text mining, i.e., extracting sensitive information from text corpora, has become a major factor in business intelligence. The automated textual analysis of law corpora is highly valuable because of its impact on a company's legal options and the raw amount of available jurisdiction. The study of supreme court jurisdiction and international law corpora is equally important due to its effects on business sectors. In this paper we use text mining methods to investigate Au...

  13. Data mining and business analytics with R

    CERN Document Server

    Ledolter, Johannes

    2013-01-01

    Collecting, analyzing, and extracting valuable information from a large amount of data requires easily accessible, robust, computational and analytical tools. Data Mining and Business Analytics with R utilizes the open source software R for the analysis, exploration, and simplification of large high-dimensional data sets. As a result, readers are provided with the needed guidance to model and interpret complicated data and become adept at building powerful models for prediction and classification. Highlighting both underlying concepts and practical computational skills, Data Mining

  14. An Information Framework for Facilitating Cost Saving of Environmental Impacts in the Coal Mining Industry in South Africa

    Directory of Open Access Journals (Sweden)

    Mashudu D. Mbedzi

    2018-05-01

    Coal mining contributes much to the economic welfare of a country. Yet it brings along a number of challenges, notably environmental impacts, which include water pollution in a water-scarce country such as South Africa. This research is conducted in two phases. The first phase intends to establish the environmental and other challenges brought about by the coal-mining industry through a comprehensive analysis of the available literature. Combatting these challenges is costly; consequently, our work investigates how established management accounting tools and techniques such as Environmental Management Accounting (EMA), Material Flow Cost Accounting (MFCA) and Life Cycle Costing (LCC) may facilitate cost savings for the companies involved. These techniques promote increased transparency of material usage by tracing and quantifying the flows and inventories of materials within the coal-mining industry in physical and monetary terms, hence hidden costs are elicited. The researchers postulate that an Information Framework integrating these aspects may be the way forward. To this end, existing frameworks in the literature are identified. A number of research questions embodying the above aspects are defined, and the objective is to define a conceptual framework to facilitate cost savings for coal-mining companies. The main contribution of this work is an information framework presented towards the end of this article. The second phase of the research will involve fieldwork in the form of a survey among stakeholders in industry to validate the conceptual framework.

  15. Extracting local information from crowds through betting markets

    Science.gov (United States)

    Weijs, Steven

    2015-04-01

    In this research, a set-up is considered in which users can bet against a forecasting agency to challenge their probabilistic forecasts. From an information theory standpoint, a reward structure is considered that either provides the forecasting agency with better information, paying the successful providers of information for their winning bets, or funds excellent forecasting agencies through users that think they know better. Especially for local forecasts, the approach may help to diagnose model biases and to identify local predictive information that can be incorporated in the models. The challenges and opportunities for implementing such a system in practice are also discussed.
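
    The abstract does not spell out the reward structure, so the following is only a minimal sketch of one standard information-theoretic reading. It assumes (hypothetically) that the agency pays odds that are fair under its own forecast and that the user stakes wealth in proportion to their own probabilities (Kelly betting). Under those assumptions the user's expected log-wealth growth equals the Kullback-Leibler divergence between the true distribution and the agency's forecast, so a bettor only gains on average by holding better information. All function names and numbers are illustrative.

```python
import math


def expected_log_growth(true_p, agency_q, bettor_b):
    """Expected log-wealth growth per round.

    Assumption: the bettor splits their whole stake across outcomes in
    proportions bettor_b, and the agency pays odds 1/q on each outcome.
    """
    return sum(
        p * math.log(b / q)
        for p, q, b in zip(true_p, agency_q, bettor_b)
        if p > 0
    )


def kl_divergence(p, q):
    """Kullback-Leibler divergence D(p || q) in nats."""
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)


# Hypothetical local forecast: the agency says rain/no rain is 50/50,
# while the local truth (known to the user) is 80/20.
AGENCY = [0.5, 0.5]
LOCAL_TRUTH = [0.8, 0.2]

informed = expected_log_growth(LOCAL_TRUTH, AGENCY, LOCAL_TRUTH)
uninformed = expected_log_growth(LOCAL_TRUTH, AGENCY, AGENCY)
```

    Here `informed` coincides with `kl_divergence(LOCAL_TRUTH, AGENCY)`, while a user who merely repeats the agency's forecast has zero expected growth.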

  16. Mining compressing sequential problems

    NARCIS (Netherlands)

    Hoang, T.L.; Mörchen, F.; Fradkin, D.; Calders, T.G.K.

    2012-01-01

    Compression based pattern mining has been successfully applied to many data mining tasks. We propose an approach based on the minimum description length principle to extract sequential patterns that compress a database of sequences well. We show that mining compressing patterns is NP-Hard and

  17. Spoken Language Understanding Systems for Extracting Semantic Information from Speech

    CERN Document Server

    Tur, Gokhan

    2011-01-01

    Spoken language understanding (SLU) is an emerging field in between speech and language processing, investigating human/machine and human/human communication by leveraging technologies from signal processing, pattern recognition, machine learning and artificial intelligence. SLU systems are designed to extract the meaning from speech utterances and its applications are vast, from voice search in mobile devices to meeting summarization, attracting interest from both commercial and academic sectors. Both human/machine and human/human communications can benefit from the application of SLU, usin

  18. Prospecção e monitoramento informacional no processo de inteligência competitiva [Information scanning and information mining in the process of competitive intelligence]

    Directory of Open Access Journals (Sweden)

    Marta Lígia Pomim Valentim

    2004-01-01

    Information scanning and information mining are foundational activities for competitive intelligence, understood as a dynamic process composed of information management and knowledge management. The competitive intelligence (CI) process in organizations arises from different informational activities, among them those tied to scanning and mining. The role of these activities is essential: they feed the whole process with data, information and knowledge; they build various formal and informal information structures within the organization; and, in addition, they generate systematized information services and products with high added value.

  19. Sifting Through Chaos: Extracting Information from Unstructured Legal Opinions.

    Science.gov (United States)

    Oliveira, Bruno Miguel; Guimarães, Rui Vasconcellos; Antunes, Luís; Rodrigues, Pedro Pereira

    2018-01-01

    Abiding by the law is, in some cases, a delicate balance between the rights of different players. Re-using health records is such a case. While the law grants reuse rights to public administration documents, in which health records produced in public health institutions are included, it also grants privacy to personal records. To safeguard correct usage of data, public hospitals in Portugal employ jurists who are responsible for allowing or withholding access rights to health records. To help decision making, these jurists can consult the legal opinions issued by the national committee on public administration documents usage. While these legal opinions are of undeniable value, due to their doctrinal contribution, they are only available in a format best suited for printing, forcing individual consultation of each document, with no option whatsoever for clustered search, filtering or indexing, which are standard operations nowadays in a document management system. When having to decide on tens of data requests a day, it becomes unfeasible to consult the hundreds of legal opinions already available. With the objective of creating a modern document management system, we devised an open, platform-agnostic system that extracts and compiles the legal opinions, extracts their contents and produces metadata, allowing fast searching and filtering of said legal opinions.

  20. Healthcare information systems: data mining methods in the creation of a clinical recommender system

    Science.gov (United States)

    Duan, L.; Street, W. N.; Xu, E.

    2011-05-01

    Recommender systems have been extensively studied to present items, such as movies, music and books that are likely of interest to the user. Researchers have indicated that integrated medical information systems are becoming an essential part of the modern healthcare systems. Such systems have evolved to an integrated enterprise-wide system. In particular, such systems are considered as a type of enterprise information systems or ERP system addressing healthcare industry sector needs. As part of efforts, nursing care plan recommender systems can provide clinical decision support, nursing education, clinical quality control, and serve as a complement to existing practice guidelines. We propose to use correlations among nursing diagnoses, outcomes and interventions to create a recommender system for constructing nursing care plans. In the current study, we used nursing diagnosis data to develop the methodology. Our system utilises a prefix-tree structure common in itemset mining to construct a ranked list of suggested care plan items based on previously-entered items. Unlike common commercial systems, our system makes sequential recommendations based on user interaction, modifying a ranked list of suggested items at each step in care plan construction. We rank items based on traditional association-rule measures such as support and confidence, as well as a novel measure that anticipates which selections might improve the quality of future rankings. Since the multi-step nature of our recommendations presents problems for traditional evaluation measures, we also present a new evaluation method based on average ranking position and use it to test the effectiveness of different recommendation strategies.
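
    The ranking by support and confidence mentioned above can be illustrated with a minimal sketch. It scans a list of past care plans directly rather than using the prefix-tree structure the authors describe, and it omits their novel measure. The care-plan items, `PLANS`, and all function names are hypothetical and chosen only for illustration.

```python
def support(itemset, transactions):
    """Fraction of transactions that contain every item in itemset."""
    itemset = frozenset(itemset)
    return sum(itemset <= t for t in transactions) / len(transactions)


def confidence(antecedent, consequent, transactions):
    """Estimate of P(consequent | antecedent) from the transactions."""
    base = support(antecedent, transactions)
    if base == 0:
        return 0.0
    return support(set(antecedent) | {consequent}, transactions) / base


def rank_next_items(entered, transactions):
    """Rank items not yet entered by confidence, then support, then name."""
    entered = frozenset(entered)
    candidates = set().union(*transactions) - entered
    scored = [
        (item,
         confidence(entered, item, transactions),
         support(entered | {item}, transactions))
        for item in candidates
    ]
    # Drop items that never co-occur with what was already entered.
    scored = [s for s in scored if s[2] > 0]
    return sorted(scored, key=lambda s: (-s[1], -s[2], s[0]))


# Hypothetical previously recorded care plans (one frozenset per plan).
PLANS = [
    frozenset({"pain", "analgesic", "comfort"}),
    frozenset({"pain", "analgesic"}),
    frozenset({"pain", "mobility"}),
    frozenset({"anxiety", "comfort"}),
]

ranked = rank_next_items({"pain"}, PLANS)
```

    Calling `rank_next_items` again after each user selection gives the sequential, step-by-step behaviour the abstract describes, since the ranked list is recomputed from the items entered so far.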

  1. Automation of technological processes at surface mines in the GDR as one of the main directions of increased coal extraction effectiveness by surface mining

    Energy Technology Data Exchange (ETDEWEB)

    Jona, U.

    1987-12-01

    In the GDR, about 53% of brown coal is mined with the use of overburden conveyor bridges, 27% with the use of belt conveyors, and 20% with the use of rail transport. Compares efficiency and cost per 1 m³ of these transport methods. The overburden conveyor bridges, their specifications and microcomputer control are described. Describes utilization of microcomputer techniques, especially the stereochart system of Carl Zeiss Jena, for automated processing of data on surface mine geometry. Other computer applications are also presented, e.g. for surveying, slope stability calculation, and conveyor bridge control. Maintains that application of the KED/KEM microcomputer system for overburden conveyor bridge control increases its effectiveness by 10%, i.e. by 8 million m³/a.

  2. Utilizing a Value of Information Framework to Improve Ore Collection and Classification Procedures

    National Research Council Canada - National Science Library

    Phillips, Julia A

    2006-01-01

    .... We use a value of information framework (VOI) to consider the economic feasibility of a mine purchasing additional information on extracted ore type to reduce the uncertainty of extracted ore grade quality...

  3. Data Mining and Information Technology: Its Impact on Intelligence Collection and Privacy Rights

    National Research Council Canada - National Science Library

    Soderberg, Eric; Glenney, William

    2007-01-01

    .... At a time when the threat environment has shifted in emphasis to COIN, terrorism, and cyber war, IT-enhanced data mining capabilities could provide some of the critical intelligence demanded by these types of threats...

  4. The Application of Information Mining Technology to the Total Army Injury and Health Outcomes Database (TAIHOD)

    National Research Council Canada - National Science Library

    Amoroso, Paul

    2000-01-01

    .... This educational package allows the TAIHOD staff to take 60 days of training classes from SAS Institute in order to learn and exploit the data mining and warehouse administration software purchased...

  5. Hydrogeologic and stratigraphic data pertinent to uranium mining, Cheyenne Basin, Colorado. Information series 12

    International Nuclear Information System (INIS)

    Kirkham, R.M.; O'Leary, W.; Warner, J.W.

    1980-01-01

    Recoverable low-grade uranium deposits occur in the Upper Cretaceous Fox Hills Sandstone and Laramie Formation in the Cheyenne Basin, Colorado. One of these deposits, the Grover deposit, has been test mined on a pilot scale using in-situ solution-mining techniques. A second deposit, the Keota deposit, is currently being licensed and will produce about 500,000 lb/yr (227,000 kg/yr) of yellowcake also using in-situ solution-mining techniques. Other uranium deposits exist in this area and will also probably be solution mined, although open-pit mining may possibly be employed at a few locations in the Cheyenne Basin. One of the principal environmental impacts of this uranium-mining activity is the potential effect on ground-water quality and quantity. In order to fully assess potential ground-water impacts, regulatory agencies and mine planners and operators must be familiar with regional geologic and hydrologic characteristics of the basin. The Oligocene White River Group and Upper Cretaceous Laramie Formation, Fox Hills Sandstone, and Pierre Shale contain important aquifers which supply water for domestic, stock-watering, irrigation, and municipal purposes in the study area. Should uranium mining seriously impact shallower aquifers, the upper Pierre and lower Fox Hills aquifers may become important sources of water. Water samples collected and analyzed from over 100 wells during this investigation provide baseline water-quality data for much of the study area. These analyses indicate water quality is highly variable not only between aquifers, but also within a particular aquifer. Many of the wells yield water that exceeds US Public Health drinking water standards for pH, TDS, sulfate, manganese, iron and selenium. Uranium, molybdenum, and vanadium concentrations are also high in many of these wells. 8 figures

  6. Information extraction from FN plots of tungsten microemitters

    Energy Technology Data Exchange (ETDEWEB)

    Mussa, Khalil O. [Department of Physics, Mu'tah University, Al-Karak (Jordan)]; Mousa, Marwan S., E-mail: mmousa@mutah.edu.jo [Department of Physics, Mu'tah University, Al-Karak (Jordan)]; Fischer, Andreas, E-mail: andreas.fischer@physik.tu-chemnitz.de [Institut für Physik, Technische Universität Chemnitz, Chemnitz (Germany)]

    2013-09-15

    Tungsten based microemitter tips have been prepared both clean and coated with dielectric materials. For clean tungsten tips, apex radii have been varied ranging from 25 to 500 nm. These tips were manufactured by electrochemically etching a 0.1 mm diameter high purity (99.95%) tungsten wire at the meniscus of a two molar NaOH solution. The composite micro-emitters considered here consist of a tungsten core coated with different dielectric materials such as magnesium oxide (MgO), sodium hydroxide (NaOH), tetracyanoethylene (TCNE), and zinc oxide (ZnO). It is worth noting here that the rather unconventional NaOH coating has shown several interesting properties. Various properties of these emitters were measured including current–voltage (IV) characteristics and the physical shape of the tips. A conventional field emission microscope (FEM) with a tip (cathode)–screen (anode) separation standardized at 10 mm was used to electrically characterize the electron emitters. The system was evacuated down to a base pressure of ∼10⁻⁸ mbar when baked at up to ∼180 °C overnight. This allowed measurements of typical field electron emission (FE) characteristics, namely the IV characteristics and the emission images on a conductive phosphorus screen (the anode). Mechanical characterization has been performed through a FEI scanning electron microscope (SEM). Within this work, the mentioned experimental results are connected to the theory for analyzing Fowler–Nordheim (FN) plots. We compared and evaluated the data extracted from clean tungsten tips of different radii and determined deviations between the results of different extraction methods applied. In particular, we derived the apex radii of several clean and coated tungsten tips by both SEM imaging and analyzing FN plots. The aim of this analysis is to support the ongoing discussion on recently developed improvements of the theory for analyzing FN plots related to metal field electron emitters, which in

  7. Information extraction from FN plots of tungsten microemitters

    International Nuclear Information System (INIS)

    Mussa, Khalil O.; Mousa, Marwan S.; Fischer, Andreas

    2013-01-01

    Tungsten based microemitter tips have been prepared both clean and coated with dielectric materials. For clean tungsten tips, apex radii have been varied ranging from 25 to 500 nm. These tips were manufactured by electrochemically etching a 0.1 mm diameter high purity (99.95%) tungsten wire at the meniscus of a two molar NaOH solution. The composite micro-emitters considered here consist of a tungsten core coated with different dielectric materials such as magnesium oxide (MgO), sodium hydroxide (NaOH), tetracyanoethylene (TCNE), and zinc oxide (ZnO). It is worth noting here that the rather unconventional NaOH coating has shown several interesting properties. Various properties of these emitters were measured including current–voltage (IV) characteristics and the physical shape of the tips. A conventional field emission microscope (FEM) with a tip (cathode)–screen (anode) separation standardized at 10 mm was used to electrically characterize the electron emitters. The system was evacuated down to a base pressure of ∼10⁻⁸ mbar when baked at up to ∼180 °C overnight. This allowed measurements of typical field electron emission (FE) characteristics, namely the IV characteristics and the emission images on a conductive phosphorus screen (the anode). Mechanical characterization has been performed through a FEI scanning electron microscope (SEM). Within this work, the mentioned experimental results are connected to the theory for analyzing Fowler–Nordheim (FN) plots. We compared and evaluated the data extracted from clean tungsten tips of different radii and determined deviations between the results of different extraction methods applied. In particular, we derived the apex radii of several clean and coated tungsten tips by both SEM imaging and analyzing FN plots. The aim of this analysis is to support the ongoing discussion on recently developed improvements of the theory for analyzing FN plots related to metal field electron emitters, which in

  8. Optimal Information Extraction of Laser Scanning Dataset by Scale-Adaptive Reduction

    Science.gov (United States)

    Zang, Y.; Yang, B.

    2018-04-01

    3D laser technology is widely used to collect the surface information of objects. For various applications, we need to extract a point cloud of good perceptual quality from the scanned points. To solve the problem, most existing methods extract important points based on a fixed scale. However, the geometric features of a 3D object come from various geometric scales. We propose a multi-scale construction method based on radial basis functions. For each scale, important points are extracted from the point cloud based on their importance. We apply a perception metric, Just-Noticeable-Difference, to measure the degradation of each geometric scale. Finally, scale-adaptive optimal information extraction is realized. Experiments are undertaken to evaluate the effectiveness of the proposed method, suggesting a reliable solution for optimal information extraction of objects.

  9. OPTIMAL INFORMATION EXTRACTION OF LASER SCANNING DATASET BY SCALE-ADAPTIVE REDUCTION

    Directory of Open Access Journals (Sweden)

    Y. Zang

    2018-04-01

    3D laser technology is widely used to collect the surface information of objects. For various applications, we need to extract a point cloud of good perceptual quality from the scanned points. To solve the problem, most existing methods extract important points based on a fixed scale. However, the geometric features of a 3D object come from various geometric scales. We propose a multi-scale construction method based on radial basis functions. For each scale, important points are extracted from the point cloud based on their importance. We apply a perception metric, Just-Noticeable-Difference, to measure the degradation of each geometric scale. Finally, scale-adaptive optimal information extraction is realized. Experiments are undertaken to evaluate the effectiveness of the proposed method, suggesting a reliable solution for optimal information extraction of objects.

  10. USGS compilation of geographic information system (GIS) data of coal mines and coal-bearing areas in Mongolia

    Science.gov (United States)

    Trippi, Michael H.; Belkin, Harvey E.

    2015-09-10

    Geographic information system (GIS) information may facilitate energy studies, which in turn provide input for energy policy decisions. The U.S. Geological Survey (USGS) has compiled GIS data representing coal mines, deposits (including those with and without coal mines), occurrences, areas, basins, and provinces of Mongolia as of 2009. These data are now available for download, and may be used in a GIS for a variety of energy resource and environmental studies of Mongolia. Chemical data for 37 coal samples from a previous USGS study of Mongolia (Tewalt and others, 2010) are included in a downloadable GIS point shapefile and shown on the map of Mongolia. A brief report summarizes the methodology used for creation of the shapefiles and the chemical analyses run on the samples.

  11. Swamp Works: A New Approach to Develop Space Mining and Resource Extraction Technologies at the National Aeronautics Space Administration (NASA) Kennedy Space Center (KSC)

    Science.gov (United States)

    Mueller, R. P.; Sibille, L.; Leucht, K.; Smith, J. D.; Townsend, I. I.; Nick, A. J.; Schuler, J. M.

    2015-01-01

    environment and methodology, with associated laboratories that uses lean development methods and creativity-enhancing processes to invent and develop new solutions for space exploration. This paper will discuss the Swamp Works approach to developing space mining and resource extraction systems and the vision of space development it serves. The ultimate goal of the Swamp Works is to expand human civilization into the solar system via the use of local resources utilization. By mining and using the local resources in situ, it is conceivable that one day the logistics supply train from Earth can be eliminated and Earth independence of a space-based community will be enabled.

  12. Data Mining and Statistics for Decision Making

    CERN Document Server

    Tufféry, Stéphane

    2011-01-01

    Data mining is the process of automatically searching large volumes of data for models and patterns using computational techniques from statistics, machine learning and information theory; it is the ideal tool for such an extraction of knowledge. Data mining is usually associated with a business or an organization's need to identify trends and profiles, allowing, for example, retailers to discover patterns on which to base marketing objectives. This book looks at both classical and recent techniques of data mining, such as clustering, discriminant analysis, logistic regression, generalized lin

  13. Gold mineralogy and extraction

    Energy Technology Data Exchange (ETDEWEB)

    Cashion, J.D.; Brown, L.J. [Monash University, Physics Department (Australia)

    1998-12-15

    Several examples are examined in which Mössbauer spectroscopic analysis of gold mineral samples, treated concentrates and extracted species has provided information not obtainable by competing techniques. Descriptions are given of current work on bacterial oxidation of pyritic ores and on the adsorbed species from gold extracted from cyanide and chloride solutions onto activated carbon and polyurethane foams. The potential benefits for the gold mining industry from Mössbauer studies and some limitations on the use of the technique are also discussed.

  14. Gold mineralogy and extraction

    International Nuclear Information System (INIS)

    Cashion, J.D.; Brown, L.J.

    1998-01-01

    Several examples are examined in which Mössbauer spectroscopic analysis of gold mineral samples, treated concentrates and extracted species has provided information not obtainable by competing techniques. Descriptions are given of current work on bacterial oxidation of pyritic ores and on the adsorbed species from gold extracted from cyanide and chloride solutions onto activated carbon and polyurethane foams. The potential benefits for the gold mining industry from Mössbauer studies and some limitations on the use of the technique are also discussed.

  15. Information extraction from FN plots of tungsten microemitters.

    Science.gov (United States)

    Mussa, Khalil O; Mousa, Marwan S; Fischer, Andreas

    2013-09-01

    Tungsten based microemitter tips have been prepared both clean and coated with dielectric materials. For clean tungsten tips, apex radii have been varied ranging from 25 to 500 nm. These tips were manufactured by electrochemically etching a 0.1 mm diameter high purity (99.95%) tungsten wire at the meniscus of a two molar NaOH solution. The composite micro-emitters considered here consist of a tungsten core coated with different dielectric materials such as magnesium oxide (MgO), sodium hydroxide (NaOH), tetracyanoethylene (TCNE), and zinc oxide (ZnO). It is worth noting here that the rather unconventional NaOH coating has shown several interesting properties. Various properties of these emitters were measured including current–voltage (IV) characteristics and the physical shape of the tips. A conventional field emission microscope (FEM) with a tip (cathode)–screen (anode) separation standardized at 10 mm was used to electrically characterize the electron emitters. The system was evacuated down to a base pressure of ∼10⁻⁸ mbar when baked at up to ∼180 °C overnight. This allowed measurements of typical field electron emission (FE) characteristics, namely the IV characteristics and the emission images on a conductive phosphorus screen (the anode). Mechanical characterization has been performed through a FEI scanning electron microscope (SEM). Within this work, the mentioned experimental results are connected to the theory for analyzing Fowler–Nordheim (FN) plots. We compared and evaluated the data extracted from clean tungsten tips of different radii and determined deviations between the results of different extraction methods applied. In particular, we derived the apex radii of several clean and coated tungsten tips by both SEM imaging and analyzing FN plots. The aim of this analysis is to support the ongoing discussion on recently developed improvements of the theory for analyzing FN plots related to metal field electron emitters, which in particular

  16. Study on methods and techniques of aeroradiometric weak information extraction for sandstone-hosted uranium deposits based on GIS

    International Nuclear Information System (INIS)

    Han Shaoyang; Ke Dan; Hou Huiqun

    2005-01-01

    Weak information extraction is one of the important research topics in current sandstone-type uranium prospecting in China. This paper introduces the connotation of aeroradiometric weak information extraction, discusses the formation theories of aeroradiometric weak information, and establishes some effective mathematical models for weak information extraction. The models are realized on a GIS software platform. Application tests of weak information extraction are completed in known uranium mineralized areas. Research results prove that the prospective areas of sandstone-type uranium deposits can be rapidly delineated by extracting aeroradiometric weak information. (authors)

  17. Extraction of Graph Information Based on Image Contents and the Use of Ontology

    Science.gov (United States)

    Kanjanawattana, Sarunya; Kimura, Masaomi

    2016-01-01

    A graph is an effective form of data representation used to summarize complex information. Explicit information such as the relationship between the X- and Y-axes can be easily extracted from a graph by applying human intelligence. However, implicit knowledge such as information obtained from other related concepts in an ontology also resides in…

  18. Text Mining to inform construction of Earth and Environmental Science Ontologies

    Science.gov (United States)

    Schildhauer, M.; Adams, B.; Rebich Hespanha, S.

    2013-12-01

    There is a clear need for better semantic representation of Earth and environmental concepts, to facilitate more effective discovery and re-use of information resources relevant to scientists doing integrative research. In order to develop general-purpose Earth and environmental science ontologies, however, it is necessary to represent concepts and relationships that span usage across multiple disciplines and scientific specialties. Traditional knowledge modeling through ontologies utilizes expert knowledge but inevitably favors the particular perspectives of the ontology engineers, as well as the domain experts who interacted with them. This often leads to ontologies that lack robust coverage of synonymy, while also missing important relationships among concepts that can be extremely useful for working scientists to be aware of. In this presentation we will discuss methods we have developed that utilize statistical topic modeling on a large corpus of Earth and environmental science articles, to expand coverage and disclose relationships among concepts in the Earth sciences. For our work we collected a corpus of over 121,000 abstracts from many of the top Earth and environmental science journals. We performed latent Dirichlet allocation topic modeling on this corpus to discover a set of latent topics, which consist of terms that commonly co-occur in abstracts. We match terms in the topics to concept labels in existing ontologies to reveal gaps, and we examine which terms are commonly associated in natural language discourse, to identify relationships that are important to formally model in ontologies. Our text mining methodology uncovers significant gaps in the content of some popular existing ontologies, and we show how, through a workflow involving human interpretation of topic models, we can bootstrap ontologies to have much better coverage and richer semantics. Because we base our methods directly on what working scientists are communicating about their

  19. Extracting information of fixational eye movements through pupil tracking

    Science.gov (United States)

    Xiao, JiangWei; Qiu, Jian; Luo, Kaiqin; Peng, Li; Han, Peng

    2018-01-01

    Human eyes are never completely static, even when they are fixating on a stationary point. These irregular, small movements, which consist of micro-tremors, micro-saccades and drifts, can prevent the fading of the images that enter our eyes. The importance of researching fixational eye movements has been experimentally demonstrated recently. However, the characteristics of fixational eye movements and their roles in the visual process have not been explained clearly, because these signals can hardly be extracted completely so far. In this paper, we developed a new eye movement detection device with a high-speed camera. This device includes a beam splitter mirror, an infrared light source and a high-speed digital video camera with a frame rate of 200 Hz. To avoid the influence of head shaking, we made the device wearable by fixing the camera on a safety helmet. Using this device, pupil tracking experiments were conducted. By localizing the pupil center and applying spectrum analysis, the envelope frequency spectra of micro-saccades, micro-tremors and drifts are clearly shown. The experimental results show that the device is feasible and effective, so that it can be applied in further characteristic analysis.
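
    As a rough illustration of the spectrum-analysis step, the sketch below computes a plain amplitude spectrum of a pupil-center trace sampled at the stated 200 Hz frame rate. It is not the authors' envelope-spectrum method. The trace is synthetic, and the 40 Hz and 5 Hz components are arbitrary assumptions standing in for a fast tremor-like and a slow drift-like movement. At 200 Hz, only components below the 100 Hz Nyquist limit can be resolved.

```python
import cmath
import math


def amplitude_spectrum(samples, sample_rate_hz):
    """Return (frequency_hz, amplitude) pairs from a direct DFT.

    The mean is removed first so the constant pupil position does not
    dominate; only bins strictly between 0 Hz and Nyquist are returned.
    """
    n = len(samples)
    mean = sum(samples) / n
    centered = [s - mean for s in samples]
    spectrum = []
    for k in range(1, n // 2):
        coeff = sum(
            x * cmath.exp(-2j * math.pi * k * i / n)
            for i, x in enumerate(centered)
        )
        spectrum.append((k * sample_rate_hz / n, 2 * abs(coeff) / n))
    return spectrum


def dominant_frequency(samples, sample_rate_hz):
    """Frequency (Hz) of the strongest non-constant component."""
    return max(amplitude_spectrum(samples, sample_rate_hz),
               key=lambda pair: pair[1])[0]


RATE_HZ = 200  # camera frame rate given in the abstract

# Synthetic one-second horizontal pupil-center trace (arbitrary units).
trace = [
    0.5 * math.sin(2 * math.pi * 40 * i / RATE_HZ)
    + 0.1 * math.sin(2 * math.pi * 5 * i / RATE_HZ)
    for i in range(RATE_HZ)
]

peak_hz = dominant_frequency(trace, RATE_HZ)
```

    The direct DFT is quadratic in the number of samples, which is acceptable for a one-second trace; a longer recording would normally use an FFT routine instead.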

  20. Extracting Social Networks and Contact Information From Email and the Web

    National Research Council Canada - National Science Library

    Culotta, Aron; Bekkerman, Ron; McCallum, Andrew

    2005-01-01

    ...-suited for such information extraction tasks. By recursively calling itself on new people discovered on the Web, the system builds a social network with multiple degrees of separation from the user...

  1. Resistances 2.0: New Communication and Information Practices for Challenging Mining Extraction in Colombia

    OpenAIRE

    Quiñones-Torres, Aída Julieta; Pontificia Universidad Javeriana; Menéndez-Echavarría, Alfredo Luis; Pontificia Universidad Javeriana; Herrera-Santoyo, Héctor; Asociación Interamericana para la Defensa del Ambiente

    2016-01-01

    Over the last two decades, Colombia has become fertile ground for foreign direct investment (FDI), mainly in natural resources, among which mining extractivism stands out. However, given the country's cultural, water and biodiversity wealth, resistance has arisen from communities belonging to ethnic, peasant and urban peoples, who say "no to mining" and stand for the defense of life, the territories...

  2. Lithium NLP: A System for Rich Information Extraction from Noisy User Generated Text on Social Media

    OpenAIRE

    Bhargava, Preeti; Spasojevic, Nemanja; Hu, Guoning

    2017-01-01

    In this paper, we describe the Lithium Natural Language Processing (NLP) system - a resource-constrained, high-throughput and language-agnostic system for information extraction from noisy user generated text on social media. Lithium NLP extracts a rich set of information including entities, topics, hashtags and sentiment from text. We discuss several real world applications of the system currently incorporated in Lithium products. We also compare our system with existing commercial and acad...

  3. Information Extraction of High Resolution Remote Sensing Images Based on the Calculation of Optimal Segmentation Parameters

    Science.gov (United States)

    Zhu, Hongchun; Cai, Lijie; Liu, Haiying; Huang, Wei

    2016-01-01

    Multi-scale image segmentation and the selection of optimal segmentation parameters are the key processes in the object-oriented information extraction of high-resolution remote sensing images. The accuracy of remote sensing special subject information depends on this extraction. On the basis of WorldView-2 high-resolution data, the optimal segmentation parameters method of object-oriented image segmentation and high-resolution image information extraction, the following processes were conducted in this study. Firstly, the best combination of the bands and weights was determined for the information extraction of high-resolution remote sensing image. An improved weighted mean-variance method was proposed and used to calculate the optimal segmentation scale. Thereafter, the best shape factor parameter and compact factor parameters were computed with the use of the control variables and the combination of the heterogeneity and homogeneity indexes. Different types of image segmentation parameters were obtained according to the surface features. The high-resolution remote sensing images were multi-scale segmented with the optimal segmentation parameters. A hierarchical network structure was established by setting the information extraction rules to achieve object-oriented information extraction. This study presents an effective and practical method that can explain expert input judgment by reproducible quantitative measurements. Furthermore, the results of this procedure may be incorporated into a classification scheme. PMID:27362762

  4. Overview of ImageCLEF 2017: information extraction from images

    OpenAIRE

    Ionescu, Bogdan; Müller, Henning; Villegas, Mauricio; Arenas, Helbert; Boato, Giulia; Dang Nguyen, Duc Tien; Dicente Cid, Yashin; Eickhoff, Carsten; Seco de Herrera, Alba G.; Gurrin, Cathal; Islam, Bayzidul; Kovalev, Vassili; Liauchuk, Vitali; Mothe, Josiane; Piras, Luca

    2017-01-01

    This paper presents an overview of the ImageCLEF 2017 evaluation campaign, an event that was organized as part of the CLEF (Conference and Labs of the Evaluation Forum) labs 2017. ImageCLEF is an ongoing initiative (started in 2003) that promotes the evaluation of technologies for annotation, indexing and retrieval for providing information access to collections of images in various usage scenarios and domains. In 2017, the 15th edition of ImageCLEF, three main tasks were proposed and one pil...

  5. Statistical techniques to extract information during SMAP soil moisture assimilation

    Science.gov (United States)

    Kolassa, J.; Reichle, R. H.; Liu, Q.; Alemohammad, S. H.; Gentine, P.

    2017-12-01

    Statistical techniques permit the retrieval of soil moisture estimates in a model climatology while retaining the spatial and temporal signatures of the satellite observations. As a consequence, the need for bias correction prior to an assimilation of these estimates is reduced, which could result in a more effective use of the independent information provided by the satellite observations. In this study, a statistical neural network (NN) retrieval algorithm is calibrated using SMAP brightness temperature observations and modeled soil moisture estimates (similar to those used to calibrate the SMAP Level 4 DA system). Daily values of surface soil moisture are estimated using the NN and then assimilated into the NASA Catchment model. The skill of the assimilation estimates is assessed based on a comprehensive comparison to in situ measurements from the SMAP core and sparse network sites as well as the International Soil Moisture Network. The NN retrieval assimilation is found to significantly improve the model skill, particularly in areas where the model does not represent processes related to agricultural practices. Additionally, the NN method is compared to assimilation experiments using traditional bias correction techniques. The NN retrieval assimilation is found to more effectively use the independent information provided by SMAP resulting in larger model skill improvements than assimilation experiments using traditional bias correction techniques.

  6. Mining wastes

    International Nuclear Information System (INIS)

    Pradel, J.

    1981-01-01

    In this article, mining wastes means wastes obtained during extraction and processing of uranium ores, including production of uraniferous concentrates. The hazards for the population are irradiation, ingestion, and dust or radon inhalation. The different wastes produced are reviewed. Management of liquid effluents, water treatment, contaminated materials, gaseous wastes and tailings are examined. Environmental impact of wastes during and after exploitation is discussed. Monitoring and measurements are made to verify that ICRP recommendations are met. Studies in progress to improve mining waste management are given [fr

  7. Research on Crowdsourcing Emergency Information Extraction of Based on Events' Frame

    Science.gov (United States)

    Yang, Bo; Wang, Jizhou; Ma, Weijun; Mao, Xi

    2018-01-01

    At present, common information extraction methods cannot accurately extract structured emergency event information, and general information retrieval tools cannot completely identify emergency geographic information; neither offers an accurate assessment of the extraction results. This paper therefore proposes an emergency information collection technology based on an event framework to solve the problem of emergency information extraction. It mainly includes an emergency information extraction model (EIEM), a complete address recognition method (CARM) and an accuracy evaluation model of emergency information (AEMEI). EIEM extracts emergency information in structured form and compensates for the lack of network data acquisition in emergency mapping. CARM uses a hierarchical model and the shortest path algorithm to join toponym pieces into a full address. AEMEI analyzes the results for the emergency event and summarizes the advantages and disadvantages of the event framework. Experiments show that event frame technology can solve the problem of emergency information extraction and provides reference cases for other applications. When a disaster is about to occur, the relevant departments can query data on past emergencies and make arrangements ahead of time for disaster prevention and reduction. The technology decreases the number of casualties and property damage in the country and world. This is of great significance to the state and society.

  8. [Extraction of management information from the national quality assurance program].

    Science.gov (United States)

    Stausberg, Jürgen; Bartels, Claus; Bobrowski, Christoph

    2007-07-15

    Starting with clinically motivated projects, the national quality assurance program has established a legislative obligatory framework. Annual feedback of results is an important means of quality control. The annual reports cover quality-related information with high granularity. A synopsis for corporate management is missing, however. Therefore, the results of the University Clinics in Greifswald, Germany, have been analyzed and aggregated to support hospital management. Strengths were identified by the ranking of results within the state for each quality indicator, weaknesses by the comparison with national reference values. The assessment was aggregated per clinical discipline and per category (indication, process, and outcome). A composition of quality indicators was claimed multiple times. A coherent concept is still missing. The method presented establishes a plausible summary of strengths and weaknesses of a hospital from the point of view of the national quality assurance program. Nevertheless, further adaptation of the program is needed to better assist corporate management.

  9. Extracting of implicit information in English advertising texts with phonetic and lexical-morphological means

    Directory of Open Access Journals (Sweden)

    Traikovskaya Natalya Petrovna

    2015-12-01

    The article deals with phonetic and lexical-morphological language means participating in the process of extracting implicit information in English-language advertising texts for men and women. The functioning of phonetic means of the English language is not the basis for implication of information in advertising texts. Lexical and morphological means act as markers of relevant information, serving as the activator of implicit information in advertising texts.

  10. Automatic flow-through dynamic extraction: A fast tool to evaluate char-based remediation of multi-element contaminated mine soils.

    Science.gov (United States)

    Rosende, María; Beesley, Luke; Moreno-Jimenez, Eduardo; Miró, Manuel

    2016-02-01

    An automatic in-vitro bioaccessibility test based upon dynamic microcolumn extraction in a programmable flow setup is herein proposed as a screening tool to evaluate bio-char based remediation of mine soils contaminated with trace elements as a compelling alternative to conventional phyto-availability tests. The feasibility of the proposed system was evaluated by extracting the readily bioaccessible pools of As, Pb and Zn in two contaminated mine soils before and after the addition of two biochars (9% (w:w)) of diverse source origin (pine and olive). Bioaccessible fractions under worst-case scenarios were measured using 0.001 mol L(-1) CaCl2 as extractant for mimicking plant uptake, and analysis of the extracts by inductively coupled optical emission spectrometry. The t-test of comparison of means revealed an efficient metal (mostly Pb and Zn) immobilization by the action of olive pruning-based biochar against the bare (control) soil at the 0.05 significance level. In-vitro flow-through bioaccessibility tests are compared for the first time with in-vivo phyto-toxicity assays in a microcosm soil study. By assessing seed germination and shoot elongation of Lolium perenne in contaminated soils with and without biochar amendments the dynamic flow-based bioaccessibility data proved to be in good agreement with the phyto-availability tests. Experimental results indicate that the dynamic extraction method is a viable and economical in-vitro tool in risk assessment explorations to evaluate the feasibility of a given biochar amendment for revegetation and remediation of metal contaminated soils in a mere 10 min against 4 days in case of phyto-toxicity assays. Copyright © 2015 Elsevier B.V. All rights reserved.

  11. Data mining in radiology

    International Nuclear Information System (INIS)

    Kharat, Amit T; Singh, Amarjit; Kulkarni, Vilas M; Shah, Digish

    2014-01-01

    Data mining facilitates the study of radiology data in various dimensions. It converts large patient image and text datasets into useful information that helps in improving patient care and provides informative reports. Data mining technology analyzes data within the Radiology Information System and Hospital Information System using specialized software which assesses relationships and agreement in available information. By using similar data analysis tools, radiologists can make informed decisions and predict the future outcome of a particular imaging finding. Data, information and knowledge are the components of data mining. Classes, Clusters, Associations, Sequential patterns, Classification, Prediction and Decision tree are the various types of data mining. Data mining has the potential to make delivery of health care affordable and ensure that the best imaging practices are followed. It is a tool for academic research. Data mining is considered to be ethically neutral; however, concerns regarding privacy and legality exist and need to be addressed to ensure the success of data mining.

  12. Koenigstein mine. Background information, data and facts on the flooding of the former Koenigstein mine of the Wismut GmbH

    International Nuclear Information System (INIS)

    Anon.

    2001-01-01

    The Koenigstein mine is in the south-east of the German state of Sachsen, in the 'Saechsische Schweiz' district. The mine covered an area of 6 km² between the towns of Koenigstein/Hütten, Bielatal, Langenhennersdorf and Struppen/Siedlung. The small town of Leupoldishain was completely undermined. The mining site is in a nature reserve, borders on a national park, and the shortest distance from the Elbe river is 600 m. Mining and remediation work had to take account of the local and regional hydrogeological conditions, especially the 3rd aquifer, which supplies freshwater for the region [de

  13. Building a Bridge or Digging a Pipeline? Clinical Data Mining in Evidence-Informed Knowledge Building

    Science.gov (United States)

    Epstein, Irwin

    2015-01-01

    Challenging the "bridge metaphor" theme of this conference, this article contends that current practice-research integration strategies are more like research-to-practice "pipelines." The purpose of this article is to demonstrate the potential of clinical data-mining studies conducted by practitioners, practitioner-oriented PhD…

  14. Data Mining in Finance: Using Counterfactuals To Generate Knowledge from Organizational Information Systems.

    Science.gov (United States)

    Dhar, Vasant

    1998-01-01

    Shows how counterfactuals and machine learning methods can be used to guide exploration of large databases that addresses some of the fundamental problems that organizations face in learning from data. Discusses data mining, particularly in the financial arena; generating useful knowledge from data; and the evaluation of counterfactuals. (LRW)

  15. Classifying unstructed textual data using the Product Score Model: an alternative text mining algorithm

    NARCIS (Netherlands)

    He, Qiwei; Veldkamp, Bernard P.; Eggen, T.J.H.M.; Veldkamp, B.P.

    2012-01-01

    Unstructured textual data such as students’ essays and life narratives can provide helpful information in educational and psychological measurement, but often contain irregularities and ambiguities, which creates difficulties in analysis. Text mining techniques that seek to extract useful

  16. Post-processing of Deep Web Information Extraction Based on Domain Ontology

    Directory of Open Access Journals (Sweden)

    PENG, T.

    2013-11-01

    Many methods are utilized to extract and process query results in the deep Web, which rely on the different structures of Web pages and various designing modes of databases. However, some semantic meanings and relations are ignored. So, in this paper, we present an approach for post-processing deep Web query results based on domain ontology which can utilize the semantic meanings and relations. A block identification model (BIM) based on node similarity is defined to extract data blocks that are relevant to a specific domain after reducing noisy nodes. The feature vector of domain books is obtained by a result set extraction model (RSEM) based on the vector space model (VSM). RSEM, in combination with BIM, builds the domain ontology on books, which can not only remove the limit of Web page structures when extracting data information, but also make use of the semantic meanings of the domain ontology. After extracting basic information of Web pages, a ranking algorithm is adopted to offer an ordered list of data records to users. Experimental results show that BIM and RSEM extract data blocks and build the domain ontology accurately. In addition, relevant data records and basic information are extracted and ranked. The precision and recall results show that our proposed method is feasible and efficient.
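
    The vector space model (VSM) named in this abstract can be illustrated with a minimal Python sketch. This is not the authors' RSEM: the whitespace tokenizer, the raw term-frequency weighting and the names `tf_vector` and `cosine_similarity` are assumptions made purely for illustration.

```python
import math
from collections import Counter


def tf_vector(text):
    """Build a raw term-frequency vector from whitespace-separated tokens."""
    return Counter(text.lower().split())


def cosine_similarity(a, b):
    """Cosine of the angle between two sparse term-frequency vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # an empty text matches nothing
    return dot / (norm_a * norm_b)


# Hypothetical query and record, not taken from the paper's data set.
query = tf_vector("data mining book")
record = tf_vector("Data Mining concepts and techniques")
score = cosine_similarity(query, record)
```

    Records could then be ranked by `score`; a real system would more likely use weighted terms (for example TF-IDF) rather than raw counts.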

  17. Relational XES: Data management for process mining

    NARCIS (Netherlands)

    Dongen, van B.F.; Shabani, S.; Grabis, J.; Sandkuhl, K.

    2015-01-01

    Information systems log data during the execution of business processes in so called "event logs". Process mining aims to improve business processes by extracting knowledge from event logs. Currently, the de-facto standard for storing and managing event data, XES, is tailored towards sequential

  18. Relational XES : data management for process mining

    NARCIS (Netherlands)

    Dongen, van B.F.; Shabani, S.

    2015-01-01

    Information systems log data during the execution of business processes in so called "event logs". Process mining aims to improve business processes by extracting knowledge from event logs. Currently, the de-facto standard for storing and managing event data, XES, is tailored towards sequential

  19. A Study on Environmental Research Trends Using Text-Mining Method - Focus on Spatial information and ICT -

    Science.gov (United States)

    Lee, M. J.; Oh, K. Y.; Joung-ho, L.

    2016-12-01

    Recently, much research in various fields has analysed the interaction between entities by text-mining analysis. In this paper, we aimed to quantitatively analyse research trends in environmental research relating to either spatial information or ICT (Information and Communications Technology) by text-mining analysis. To do this, we applied a low-dimensional embedding method, clustering analysis, and association rules to find meaningful associative patterns of key words that frequently appeared in the articles. As the authors suppose that KCI (Korea Citation Index) articles reflect academic demands, a total of 1228 KCI articles published from 1996 to 2015 were reviewed and analysed by the text-mining method. First, we retrieved KCI articles from the NDSL (National Discovery for Science Leaders) site. We then pre-processed the key words selected from the abstracts and classified them into separable sectors. We investigated the appearance rates and association rules of key words for articles in the two fields: spatial information and ICT. In order to detect historic trends, analysis was conducted separately for four periods: 1996-2000, 2001-2005, 2006-2010 and 2011-2015. These analyses were conducted using R software. As a result, we confirmed that environmental research relating to spatial information mainly focused upon such fields as 'GIS (35%)', 'Remote-Sensing (25%)' and 'environmental theme map (15.7%)'. Next, 'ICT technology (23.6%)', 'ICT service (5.4%)', 'mobile (24%)', 'big data (10%)' and 'AI (7%)' are primarily emerging from environmental research relating to ICT. Thus, from the analysis results, this paper asserts that research trends and academic progress are well structured for reviewing recent spatial information and ICT technology, and the outcomes of the analysis can serve as adequate guidelines for establishing environmental policies and strategies.
    KEY WORDS: Big data, Text-mining, Environmental research, Spatial-information, ICT Acknowledgements: The
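
    The association-rule step described in this abstract can be sketched as follows. The study itself used R; this Python version is only an illustration, and the function name `pair_rules`, the restriction to keyword pairs and the sample keyword sets are all assumptions rather than the authors' procedure or data.

```python
from collections import Counter
from itertools import combinations


def pair_rules(keyword_sets, min_support=0.0):
    """Return {(antecedent, consequent): (support, confidence)} for keyword pairs."""
    n = len(keyword_sets)
    item_counts = Counter()
    pair_counts = Counter()
    for keywords in keyword_sets:
        items = sorted(set(keywords))  # de-duplicate within one article
        item_counts.update(items)
        pair_counts.update(combinations(items, 2))
    rules = {}
    for (a, b), count in pair_counts.items():
        support = count / n  # share of articles containing both key words
        if support < min_support:
            continue
        rules[(a, b)] = (support, count / item_counts[a])  # confidence of a -> b
        rules[(b, a)] = (support, count / item_counts[b])  # confidence of b -> a
    return rules


# Invented keyword sets for four hypothetical articles.
articles = [
    {"GIS", "remote sensing"},
    {"GIS", "big data"},
    {"GIS", "remote sensing", "ICT"},
    {"ICT", "big data"},
]
rules = pair_rules(articles, min_support=0.5)
```

    With these invented sets, only the GIS and remote sensing pair clears the 0.5 support threshold, and the rule from remote sensing to GIS has the higher confidence because every article mentioning remote sensing also mentions GIS.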

  20. A Statistical Texture Feature for Building Collapse Information Extraction of SAR Image

    Science.gov (United States)

    Li, L.; Yang, H.; Chen, Q.; Liu, X.

    2018-04-01

    Synthetic Aperture Radar (SAR) has become one of the most important ways to extract post-disaster collapsed building information, due to its extreme versatility and almost all-weather, day-and-night working capability. In view of the fact that the inherent statistical distribution of speckle in SAR images has not been used to extract collapsed building information, this paper proposes a novel texture feature based on statistical models of SAR images to extract collapsed buildings. In the proposed feature, the texture parameter of the G0 distribution from SAR images is used to reflect the uniformity of the target to extract the collapsed building. This feature not only considers the statistical distribution of SAR images, providing a more accurate description of the object texture, but can also be applied to extract collapsed building information from single-, dual- or full-polarization SAR data. The RADARSAT-2 data of the Yushu earthquake, acquired on April 21, 2010, are used to present and analyze the performance of the proposed method. In addition, the applicability of this feature to SAR data with different polarizations is also analysed, which provides decision support for the data selection of collapsed building information extraction.

  1. A method for automating the extraction of specialized information from the web

    NARCIS (Netherlands)

    Lin, L.; Liotta, A.; Hippisley, A.; Hao, Y.; Liu, J.; Wang, Y.; Cheung, Y-M.; Yin, H.; Jiao, L.; Ma, J.; Jiao, Y-C.

    2005-01-01

    The World Wide Web can be viewed as a gigantic distributed database including millions of interconnected hosts some of which publish information via web servers or peer-to-peer systems. We present here a novel method for the extraction of semantically rich information from the web in a fully

  2. Information analysis of iris biometrics for the needs of cryptology key extraction

    Directory of Open Access Journals (Sweden)

    Adamović Saša

    2013-01-01

    The paper presents a rigorous analysis of iris biometric information for the synthesis of an optimized system for the extraction of a high quality cryptology key. Estimations of local entropy and mutual information were identified as segments of the iris most suitable for this purpose. In order to optimize parameters, corresponding wavelets were transformed, in order to obtain the highest possible entropy and mutual information lower in the transformation domain, which set frameworks for the synthesis of systems for the extraction of truly random sequences of iris biometrics, without compromising authentication properties. [Project of the Ministry of Science of the Republic of Serbia, no. TR32054 and no. III44006

  3. 36 CFR 6.7 - Mining wastes.

    Science.gov (United States)

    2010-07-01

    ... 36 Parks, Forests, and Public Property 1 2010-07-01 2010-07-01 false Mining wastes. 6.7 Section 6... DISPOSAL SITES IN UNITS OF THE NATIONAL PARK SYSTEM § 6.7 Mining wastes. (a) Solid waste from mining includes but is not limited to mining overburden, mining byproducts, solid waste from the extraction...

  4. Discrimination and Privacy in the Information Society Data Mining and Profiling in Large Databases

    CERN Document Server

    Calders, Toon; Schermer, Bart; Zarsky, Tal

    2013-01-01

    Vast amounts of data are nowadays collected, stored and processed, in an effort to assist in making a variety of administrative and governmental decisions. These innovative steps considerably improve the speed, effectiveness and quality of decisions. Analyses are increasingly performed by data mining and profiling technologies that statistically and automatically determine patterns and trends. However, when such practices lead to unwanted or unjustified selections, they may result in unacceptable forms of discrimination. Processing vast amounts of data may lead to situations in which data controllers know many of the characteristics, behaviors and whereabouts of people. In some cases, analysts might know more about individuals than these individuals know about themselves. Judging people by their digital identities sheds a different light on our views of privacy and data protection. This book discusses discrimination and privacy issues related to data mining and profiling practices. It provides technologic...

  5. Radio communication in mines: information, data processing (Report on ECSC contract 7220-AF/201)

    Energy Technology Data Exchange (ETDEWEB)

    Delogne, P; de Keyser, R; Deryck, L; Fourny, R; Hellin, H; Leonard, D [INIEX

    1980-01-01

    The aim of the research was to develop and construct transmitter-receivers for use in coal mines for communication transmission of signals and remote control. The reliability, miniaturization and ease of handling of existing equipment were improved. Research was carried out into interfaces between traditional remote-sensing elements and a radio transmission line. The intelligibility of spoken messages was also investigated. (In French)

  6. Mining engineer requirements in a German coal mine

    Energy Technology Data Exchange (ETDEWEB)

    Rauhut, F J

    1985-10-01

    Basic developments in German coal mines, new definitions of working areas of mining engineers, and groups of requirements in education are discussed. These groups include: requirements of hard-coal mining at great depth and in extended collieries; application of process technology and information systems in semi-automated mines; thinking in processes and systems; organizational changes; future requirements of mining engineers; responsibility of the mining engineer for employees and society.

  7. The Raising Influence of Information Technologies on Professional Training in the Sphere of Automated Driving When Transporting Mined Rock

    Directory of Open Access Journals (Sweden)

    Kosolapov Andrey

    2017-01-01

    Revolutionary changes in the area of production, holding and exploitation of the automobile as a transport vehicle are analyzed in the article. Current state of the issue is described and the development stages of new approach to driving without human participation are predicted, taking into consideration the usage of automobiles for transportation of mined rock in Kuzbass. The influence of modern information technologies on the development of new sector of automobile industry and on the process of professional and further training of the specialists in the sphere of automobile driving is considered.

  8. Evaluation of Head Mounted and Head Down Information Displays During Simulated Mine-Countermeasures Dives to 42 msw

    Science.gov (United States)

    2008-04-01

    head-down display (HDD) or a head-mounted display to carry out their routine underwater tasks. Nine mine-countermeasures divers ...there is little empirical information on the ability of divers to use a multifunction head-down display (HDD) or a...experiment, the diver was also linked to the researchers and operations crew via audio communications.

  9. The psychological impact of the risks of mines caving-in: anxiety, perception of the environment and access to information

    International Nuclear Information System (INIS)

    Dodeler, V.; Tarquinio, C.

    2004-01-01

    Research has been conducted to assess the extent to which the risk of losing one's home or seeing it damaged due to a mine cave-in influences an individual's state of health and, in particular, of anxiety. According to the results, persons living in such risky situations have higher anxiety scores than members of a control group. Furthermore, their perception of the environment apparently affects their anxiety: the individuals most affected have a deteriorated perception of their environment. This study draws attention to the key role played by networks of associations, where inhabitants feel they can obtain reliable information. (authors)

  10. The Raising Influence of Information Technologies on Professional Training in the Sphere of Automated Driving When Transporting Mined Rock

    Science.gov (United States)

    Kosolapov, Andrey; Krysin, Sergey

    2017-11-01

    Revolutionary changes in the area of production, holding and exploitation of the automobile as a transport vehicle are analyzed in the article. Current state of the issue is described and the development stages of new approach to driving without human participation are predicted, taking into consideration the usage of automobiles for transportation of mined rock in Kuzbass. The influence of modern information technologies on the development of new sector of automobile industry and on the process of professional and further training of the specialists in the sphere of automobile driving is considered.

  11. MedTime: a temporal information extraction system for clinical narratives.

    Science.gov (United States)

    Lin, Yu-Kai; Chen, Hsinchun; Brown, Randall A

    2013-12-01

    Temporal information extraction from clinical narratives is of critical importance to many clinical applications. We participated in the EVENT/TIMEX3 track of the 2012 i2b2 clinical temporal relations challenge, and presented our temporal information extraction system, MedTime. MedTime comprises a cascade of rule-based and machine-learning pattern recognition procedures. It achieved a micro-averaged f-measure of 0.88 in both the recognitions of clinical events and temporal expressions. We proposed and evaluated three time normalization strategies to normalize relative time expressions in clinical texts. The accuracy was 0.68 in normalizing temporal expressions of dates, times, durations, and frequencies. This study demonstrates and evaluates the integration of rule-based and machine-learning-based approaches for high performance temporal information extraction from clinical narratives. Copyright © 2013 Elsevier Inc. All rights reserved.
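
    For readers unfamiliar with the micro-averaged f-measure reported in this abstract, the Python sketch below shows the standard calculation: true positives, false positives and false negatives are pooled across categories before precision and recall are computed. The function name and the counts are invented for illustration and are not MedTime's results.

```python
def micro_f_measure(counts):
    """Micro-averaged F1 from {category: (tp, fp, fn)} counts."""
    tp = sum(c[0] for c in counts.values())
    fp = sum(c[1] for c in counts.values())
    fn = sum(c[2] for c in counts.values())
    if tp == 0:
        return 0.0  # no correct predictions at all
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


# Hypothetical counts for two annotation categories.
example = {"EVENT": (8, 2, 2), "TIMEX3": (4, 1, 3)}
f1 = micro_f_measure(example)
```

    Pooling the counts lets frequent categories dominate the score, which is the main difference from macro-averaging.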

  12. Research of building information extraction and evaluation based on high-resolution remote-sensing imagery

    Science.gov (United States)

    Cao, Qiong; Gu, Lingjia; Ren, Ruizhi; Wang, Lang

    2016-09-01

    Building extraction is currently important in the application of high-resolution remote sensing imagery. At present, quite a few algorithms are available for detecting building information; however, most of them still have some obvious disadvantages, such as ignoring spectral information and the trade-off between extraction rate and extraction accuracy. The purpose of this research is to develop an effective method to detect building information for Chinese GF-1 data. Firstly, an image preprocessing technique is used to normalize the image, and image enhancement is used to highlight the useful information in the image. Secondly, multi-spectral information is analyzed. Subsequently, an improved morphological building index (IMBI) based on remote sensing imagery is proposed to get the candidate building objects. Furthermore, in order to refine building objects and further remove false objects, post-processing (e.g., the shape features, the vegetation index and the water index) is employed. To validate the effectiveness of the proposed algorithm, the omission errors (OE), commission errors (CE), the overall accuracy (OA) and Kappa are used in the final step. The proposed method can not only effectively use spectral information and other basic features, but also avoid extracting excessive interference details from high-resolution remote sensing images. Compared to the original MBI algorithm, the proposed method reduces the OE by 33.14%. At the same time, the Kappa increases by 16.09%. In experiments, IMBI achieved satisfactory results and outperformed other algorithms in terms of both accuracy and visual inspection.
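
    The four validation measures named in this abstract can be computed from a binary confusion matrix, as in the Python sketch below. It assumes the usual remote-sensing definitions with "building" as the positive class; the abstract does not state the exact formulas the authors used, and the function name and sample counts are illustrative only.

```python
def building_metrics(tp, fp, fn, tn):
    """Omission error, commission error, overall accuracy and Cohen's kappa.

    tp: building pixels detected as building
    fp: non-building pixels detected as building
    fn: building pixels missed
    tn: non-building pixels correctly rejected
    """
    n = tp + fp + fn + tn
    oe = fn / (tp + fn)  # share of reference buildings that were missed
    ce = fp / (tp + fp)  # share of detections that are not buildings
    oa = (tp + tn) / n   # share of all pixels labelled correctly
    # Agreement expected by chance, from the row and column totals.
    pe = ((tp + fp) * (tp + fn) + (fn + tn) * (fp + tn)) / (n * n)
    kappa = (oa - pe) / (1 - pe)
    return {"OE": oe, "CE": ce, "OA": oa, "Kappa": kappa}


# Invented counts, not results from the paper.
metrics = building_metrics(tp=40, fp=10, fn=10, tn=40)
```

    A lower OE means fewer missed buildings, so an improved index would be expected to reduce OE while raising OA and Kappa.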

  13. Seeking science information online: Data mining Google to better understand the roles of the media and the education system.

    Science.gov (United States)

    Segev, Elad; Baram-Tsabari, Ayelet

    2012-10-01

    Which extrinsic cues motivate people to search for science-related information? For many science-related search queries, media attention and time during the academic year are highly correlated with changes in information seeking behavior (expressed by changes in the proportion of Google science-related searches). The data mining analysis presented here shows that changes in the volume of searches for general and well-established science terms are strongly linked to the education system. By contrast, ad-hoc events and current concerns were better aligned with media coverage. The interest and ability to independently seek science knowledge in response to current events or concerns is one of the fundamental goals of the science literacy movement. This method provides a mirror of extrapolated behavior and as such can assist researchers in assessing the role of the media in shaping science interests, and inform the ways in which lifelong interests in science are manifested in real world situations.

  14. Process mining applied to the test process of wafer steppers in ASML

    NARCIS (Netherlands)

    Rozinat, A.; Jong, de I.S.M.; Günther, C.W.; Aalst, van der W.M.P.

    2009-01-01

    Process mining techniques attempt to extract nontrivial and useful information from event logs. For example, there are many process mining techniques to automatically discover a process model describing the causal dependencies between activities. Several successful case studies have been reported in

  15. Information Extraction of High-Resolution Remotely Sensed Image Based on Multiresolution Segmentation

    Directory of Open Access Journals (Sweden)

    Peng Shao

    2014-08-01

    The principle of multiresolution segmentation is presented in detail in this study, and the Canny algorithm was applied for edge detection of a remotely sensed image based on this principle. The target image was divided into regions based on object-oriented multiresolution segmentation and edge detection. Furthermore, an object hierarchy was created, and a series of features (water bodies, vegetation, roads, residential areas, bare land and other information) were extracted using spectral and geometrical features. The results indicate that edge detection has a positive effect on multiresolution segmentation, and the overall accuracy of information extraction reaches 94.6% according to the confusion matrix.

  16. End-to-end information extraction without token-level supervision

    DEFF Research Database (Denmark)

    Palm, Rasmus Berg; Hovy, Dirk; Laws, Florian

    2017-01-01

    Most state-of-the-art information extraction approaches rely on token-level labels to find the areas of interest in text. Unfortunately, these labels are time-consuming and costly to create, and consequently, not available for many real-life IE tasks. To make matters worse, token-level labels...... and output text. We evaluate our model on the ATIS data set, MIT restaurant corpus and the MIT movie corpus and compare to neural baselines that do use token-level labels. We achieve competitive results, within a few percentage points of the baselines, showing the feasibility of E2E information extraction...

  17. The Leadville Mine Drainage Tunnel Catastrophe: A Case Study of How Isotope Geochemistry Provided Forensic Evidence to Inform Policy Decisions

    Science.gov (United States)

    Williams, M. W.; Wireman, M.; Liu, F.; Gertson, J.

    2008-12-01

    A state of emergency was declared in February 2008 because of fears that a blocked drainage tunnel in the Leadville mining district of Colorado could cause a catastrophic flood. An estimated 1 billion gallons of metals-laden water poses an imminent threat to the city of Leadville and the headwaters of the Arkansas River. Within days of the declaration of a state of emergency, Governor Ritter and Senator Salazar of Colorado, along with a host of other local and statewide politicians, visited the site and emphasized the need to develop a fast yet safe mitigation plan. Here we provide information from a case study that illustrates how a suite of isotopic and hydrologic tools enables identification of critical, site-specific variables essential in developing a science plan to guide targeted remediation of the Leadville drainage tunnel. The isotopic tools, including both stable and radiogenic isotopes, provided clear and compelling evidence of water sources and flowpaths in an area that has undergone extensive perturbations, including the drilling of more than 2,000 mine shafts. This forensic evidence was the key information in developing a plan to plug the drainage tunnel several hundred feet underground, divert a major source of polluted water away from the collapsed tunnel and pipe it to an existing treatment plant, along with guidance on where to place pumps in additional mine shafts and on the drilling of new wells to pump water in case the plugging of the tunnel caused water to pool up and raise the water table to dangerous heights. This particular case of forensic hydrology using isotopic tools not only provides the scientific basis for an operational plan to defuse a life- and property-threatening situation, it also provides the basis for decommissioning an existing water treatment plant, which will result in savings of over 1 million annually in operational costs.
Decommissioning the existing water treatment plant will pay for the tunnel mitigation within several

  18. Text mining by Tsallis entropy

    Science.gov (United States)

    Jamaati, Maryam; Mehri, Ali

    2018-01-01

    Long-range correlations between the elements of natural languages enable them to convey very complex information. The complex structure of human language, as a manifestation of natural languages, motivates us to apply nonextensive statistical mechanics in text mining. Tsallis entropy appropriately ranks the terms' relevance to the document subject, taking advantage of their spatial correlation length. We apply this statistical concept as a new, powerful word ranking metric in order to extract the keywords of a single document. We carry out an experimental evaluation, which shows the capability of the presented method in keyword extraction. We find that Tsallis entropy has reliable word ranking performance, at the same level as the best previous ranking methods.
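
    For reference, the Tsallis entropy of a discrete distribution p with entropic index q is S_q = (1 - sum_i p_i^q) / (q - 1), which reduces to the Shannon entropy as q approaches 1. The sketch below computes only this general quantity; it is not the authors' word ranking procedure, and the function name is an assumption made for illustration.

```python
import math


def tsallis_entropy(probs, q):
    """Tsallis entropy S_q of a discrete probability distribution.

    probs: non-negative values that sum to 1
    q:     entropic index; q == 1 falls back to the Shannon limit
    """
    if any(p < 0 for p in probs) or abs(sum(probs) - 1.0) > 1e-9:
        raise ValueError("probs must be non-negative and sum to 1")
    if abs(q - 1.0) < 1e-12:
        # Limit q -> 1: Shannon entropy with the natural logarithm.
        return -sum(p * math.log(p) for p in probs if p > 0)
    # Zero-probability outcomes are skipped: they contribute nothing for q > 0.
    return (1.0 - sum(p ** q for p in probs if p > 0)) / (q - 1.0)


# A word spread evenly over two text segments, evaluated at q = 2.
s = tsallis_entropy([0.5, 0.5], q=2)
```

    For the uniform two-outcome case at q = 2 this gives 0.5, and at q = 1 it gives ln 2.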

  19. Extraction Method for Earthquake-Collapsed Building Information Based on High-Resolution Remote Sensing

    International Nuclear Information System (INIS)

    Chen, Peng; Wu, Jian; Liu, Yaolin; Wang, Jing

    2014-01-01

    At present, the extraction of earthquake disaster information from remote sensing data relies on visual interpretation. However, this technique cannot effectively and quickly obtain precise and efficient information for earthquake relief and emergency management. Collapsed buildings in the town of Zipingpu after the Wenchuan earthquake were used as a case study to validate two kinds of rapid extraction methods for earthquake-collapsed building information based on pixel-oriented and object-oriented theories. The pixel-oriented method is based on multi-layer regional segments that embody the core layers and segments of the object-oriented method. The key idea is to mask layer by layer all image information, including that on the collapsed buildings. Compared with traditional techniques, the pixel-oriented method is innovative because it allows considerably rapid computer processing. As for the object-oriented method, a multi-scale segment algorithm was applied to build a three-layer hierarchy. By analyzing the spectrum, texture, shape, location, and context of individual object classes in different layers, the fuzzy determined rule system was established for the extraction of earthquake-collapsed building information. We compared the two sets of results using three variables: precision assessment, visual effect, and principle. Both methods can extract earthquake-collapsed building information quickly and accurately. The object-oriented method successfully overcomes the salt-and-pepper noise caused by the spectral diversity of high-resolution remote sensing data and solves the problems of "same object, different spectra" and "same spectrum, different objects". With an overall accuracy of 90.38%, the method achieves more scientific and accurate results compared with the pixel-oriented method (76.84%). The object-oriented image analysis method can be extensively applied in the extraction of earthquake disaster information based on high-resolution remote sensing.

  20. Applying Data Mining Techniques to Improve Information Security in the Cloud: A Single Cache System Approach

    OpenAIRE

    Amany AlShawi

    2016-01-01

    Presently, the popularity of cloud computing is gradually increasing day by day. The purpose of this research was to enhance the security of the cloud using techniques such as data mining with specific reference to the single cache system. From the findings of the research, it was observed that the security in the cloud could be enhanced with the single cache system. For future purposes, an Apriori algorithm can be applied to the single cache system. This can be applied by all cloud providers...

  1. Terrain Extraction by Integrating Terrestrial Laser Scanner Data and Spectral Information

    Science.gov (United States)

    Lau, C. L.; Halim, S.; Zulkepli, M.; Azwan, A. M.; Tang, W. L.; Chong, A. K.

    2015-10-01

    The extraction of true terrain points from unstructured laser point cloud data is an important step in producing an accurate digital terrain model (DTM). However, most spatial filtering methods use only geometrical data to discriminate terrain points from non-terrain points. Point cloud filtering can also be improved by using the spectral information available with some scanners. Therefore, the objective of this study is to investigate the effectiveness of using the three channels (red, green and blue) of the colour image captured by the built-in digital camera available in some Terrestrial Laser Scanners (TLS) for terrain extraction. In this study, data acquisition was conducted at a mini replica landscape at the Universiti Teknologi Malaysia (UTM) Skudai campus using a Leica ScanStation C10. The spectral information of the coloured point clouds from selected sample classes was extracted for spectral analysis. Coloured points that fall within the corresponding preset spectral threshold are identified as points of that specific feature in the dataset. This terrain extraction process was carried out using purpose-developed Matlab code. The results demonstrate that a passive image with higher spectral resolution is required in order to improve the output, because the low quality of the colour images captured by the sensor contributes to low separability in spectral reflectance. In conclusion, this study shows that spectral information can be used as a parameter for terrain extraction.
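
    The thresholding step described in this abstract can be sketched as follows. The study itself used Matlab; this Python version is only an illustration, and the point layout, the function name and the RGB limits are all assumptions rather than values from the paper.

```python
# Each point is (x, y, z, red, green, blue), with colours in 0..255.
# Hypothetical per-channel (min, max) limits for the "terrain" class.
TERRAIN_RGB_RANGE = ((90, 160), (60, 120), (30, 90))


def filter_terrain(points, rgb_range=TERRAIN_RGB_RANGE):
    """Keep the points whose colour lies inside every channel's limits."""
    terrain = []
    for point in points:
        colour = point[3:6]
        if all(lo <= c <= hi for c, (lo, hi) in zip(colour, rgb_range)):
            terrain.append(point)
    return terrain


# Illustrative coloured points: one soil-like, one vegetation-like.
cloud = [
    (0.0, 0.0, 0.1, 120, 80, 50),
    (1.0, 1.0, 2.3, 30, 200, 40),
]
terrain_points = filter_terrain(cloud)
```

    Only the first point passes the assumed limits, so terrain_points contains a single entry.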

  2. Information retrieval and terminology extraction in online resources for patients with diabetes.

    Science.gov (United States)

    Seljan, Sanja; Baretić, Maja; Kucis, Vlasta

    2014-06-01

    Terminology use, as a means of information retrieval or document indexing, plays an important role in health literacy. Specific types of users, i.e. patients with diabetes, need access to various online resources (in a foreign and/or native language) when searching for information on self-education in basic diabetic knowledge, on self-care activities regarding the importance of dietetic food, medications and physical exercise, and on self-management of insulin pumps. Automatic extraction of corpus-based terminology from online texts, manuals or professional papers can help in building terminology lists or lists of "browsing phrases" useful in information retrieval or in document indexing. Specific terminology lists represent an intermediate step between free text search and controlled vocabulary, between users' demands and existing online resources in native and foreign languages. The research, which aims to detect the role of terminology in online resources, is conducted on English and Croatian manuals and Croatian online texts, and is divided into three interrelated parts: i) comparison of professional and popular terminology use; ii) evaluation of automatic statistically-based terminology extraction on English and Croatian texts; iii) comparison and evaluation of extracted terminology performed on an English manual using statistical and hybrid approaches. Extracted terminology candidates are evaluated by comparison with three types of reference lists: a list created by a professional medical person, a list of highly professional vocabulary contained in MeSH, and a list created by non-medical persons, made as the intersection of 15 lists. Results report on the use of popular and professional terminology in online diabetes resources, on the evaluation of automatically extracted terminology candidates in English and Croatian texts, and on the comparison of statistical and hybrid extraction methods in English text. Evaluation of automatic and semi-automatic terminology extraction methods is performed by recall

  3. A mine of energy

    International Nuclear Information System (INIS)

    Fallon, M.

    1982-01-01

    In July 1978 the then Union Corporation (a wholly-owned subsidiary of the larger Gencor Group) announced its intention to develop the Beisa mine in the Orange Free State. It started up a medium-sized uranium mine with gold as a by-product. The main idea was the processing of uranium. The planning of the uranium recovery plant, the actual mining, and the recovery and extraction of uranium are discussed.

  4. OpenCV-Based Nanomanipulation Information Extraction and the Probe Operation in SEM

    Directory of Open Access Journals (Sweden)

    Dongjie Li

    2015-02-01

    For an established telenanomanipulation system, the method of extracting location information and the strategies of probe operation were studied in this paper. First, the machine learning algorithm of OpenCV was used to extract location information from SEM images. Thus nanowires and the probe in SEM images can be automatically tracked and the region of interest (ROI) can be marked quickly. Then the location of the nanowire and probe can be extracted from the ROI. To study the probe operation strategy, the Van der Waals force between the probe and a nanowire was computed; thus relevant operating parameters can be obtained. With these operating parameters, the nanowire in a 3D virtual environment can be preoperated and an optimal path of the probe can be obtained. The actual probe runs automatically under the telenanomanipulation system's control. Finally, experiments were carried out to verify the above methods, and the results show that the designed methods have achieved the expected effect.

  5. Methods to extract information on the atomic and molecular states from scientific abstracts

    International Nuclear Information System (INIS)

    Sasaki, Akira; Ueshima, Yutaka; Yamagiwa, Mitsuru; Murata, Masaki; Kanamaru, Toshiyuki; Shirado, Tamotsu; Isahara, Hitoshi

    2005-01-01

    We propose a new application of information technology to recognize and extract expressions of atomic and molecular states from electronic forms of scientific abstracts. The present results will help scientists to understand atomic states as well as the physics discussed in the articles. Combined with internet search engines, it will make it possible to collect not only atomic and molecular data but also broader scientific information over a wide range of research fields. (author)

  6. System and method for extracting physiological information from remotely detected electromagnetic radiation

    NARCIS (Netherlands)

    2016-01-01

    The present invention relates to a device and a method for extracting physiological information indicative of at least one health symptom from remotely detected electromagnetic radiation. The device comprises an interface (20) for receiving a data stream comprising remotely detected image data

  7. System and method for extracting physiological information from remotely detected electromagnetic radiation

    NARCIS (Netherlands)

    2015-01-01

    The present invention relates to a device and a method for extracting physiological information indicative of at least one health symptom from remotely detected electromagnetic radiation. The device comprises an interface (20) for receiving a data stream comprising remotely detected image data

  8. Network and Ensemble Enabled Entity Extraction in Informal Text (NEEEEIT) final report

    Energy Technology Data Exchange (ETDEWEB)

    Kegelmeyer, Philip W. [Sandia National Lab. (SNL-NM), Albuquerque, NM (United States); Shead, Timothy M. [Sandia National Lab. (SNL-NM), Albuquerque, NM (United States); Dunlavy, Daniel M. [Sandia National Lab. (SNL-NM), Albuquerque, NM (United States)

    2013-09-01

    This SAND report summarizes the activities and outcomes of the Network and Ensemble Enabled Entity Extraction in Informal Text (NEEEEIT) LDRD project, which addressed improving the accuracy of conditional random fields for named entity recognition through the use of ensemble methods.

  9. Semi-automatic building extraction in informal settlements from high-resolution satellite imagery

    Science.gov (United States)

    Mayunga, Selassie David

    The extraction of man-made features from digital remotely sensed images is considered an important step underpinning the management of human settlements in any country. Man-made features, and buildings in particular, are required for a variety of applications such as urban planning, creation of geographical information systems (GIS) databases and urban city models. The traditional man-made feature extraction methods are very expensive in terms of equipment, are labour intensive, need well-trained personnel and cannot cope with changing environments, particularly in dense urban settlement areas. This research presents an approach for extracting buildings in dense informal settlement areas using high-resolution satellite imagery. The proposed system uses a novel strategy of extracting a building by measuring a single point at the approximate centre of the building. The fine measurement of the building outlines is then effected using a modified snake model. The original snake model on which this framework is based incorporates an external constraint energy term which is tailored to preserving the convergence properties of the snake model; its application to unstructured objects will negatively affect their actual shapes. The external constraint energy term was removed from the original snake model formulation, thereby giving the model the ability to cope with the high variability of building shapes in informal settlement areas. The proposed building extraction system was tested on two areas with different characteristics. The first area was Tungi in Dar Es Salaam, Tanzania, where three sites were tested. This area is characterized by informal settlements, which are illegally established within the city boundaries. The second area was Oromocto in New Brunswick, Canada, where two sites were tested. The Oromocto area is mostly flat and the buildings are constructed using similar materials.
Qualitative and quantitative measures were employed to evaluate the accuracy of the results as well as the performance

  10. Information Management of Health and Safety at the Tarkwa Mine of ...

    African Journals Online (AJOL)

    Michael

    2016-06-01

    Jun 1, 2016 ... Information Management of Health and Safety at the Tarkwa ... heap leach technology. ... the quality of information was assessed using the content of information ..... managing library users' expectations; and reference service.

  11. How ISO/IEC 17799 can be used for base lining information assurance among entities using data mining for defense, homeland security, commercial, and other civilian/commercial domains

    Science.gov (United States)

    Perry, William G.

    2006-04-01

    One goal of database mining is to draw unique and valid perspectives from multiple data sources. Insights that are fashioned from closely-held data stores are likely to possess a high degree of reliability. The degree of information assurance comes into question, however, when external databases are accessed, combined and analyzed to form new perspectives. ISO/IEC 17799, Information technology-Security techniques-Code of practice for information security management, can be used to establish a higher level of information assurance among disparate entities using data mining in the defense, homeland security, commercial and other civilian/commercial domains. Organizations that meet ISO/IEC information security standards have identified and assessed risks, threats and vulnerabilities and have taken significant proactive steps to meet their unique security requirements. The ISO standards address twelve domains: risk assessment and treatment, security policy, organization of information security, asset management, human resources security, physical and environmental security, communications and operations management, access control, information systems acquisition, development and maintenance, information security incident management, business continuity management, and compliance. Analysts can be relatively confident that if organizations are ISO 17799 compliant, a high degree of information assurance is likely to be a characteristic of the data sets being used. The reverse may be true. Extracting, fusing and drawing conclusions based upon databases with a low degree of information assurance may be fraught with all of the hazards that come from knowingly using bad data to make decisions. Using ISO/IEC 17799 as a baseline for information assurance can help mitigate these risks.

  12. Mining for Social Media: Usage Patterns of Small Businesses

    OpenAIRE

    Balan, Shilpa; Rege, Janhavi

    2017-01-01

    Background: Information can now be rapidly exchanged due to social media. Due to its openness, Twitter has generated massive amounts of data. In this paper, we apply data mining and analytics to extract the usage patterns of social media by small businesses. Objectives: The aim of this paper is to describe with an example how data mining can be applied to social media. This paper further examines the impact of social media on small businesses. The Twitter posts related to small businesses are...

  13. A REVIEW ON PREDICTIVE ANALYTICS IN DATA MINING

    OpenAIRE

    Arumugam.S

    2016-01-01

    The main process of data mining is to collect, extract and store valuable information, and nowadays many enterprises do this actively. Predictive analytics is a branch of advanced analytics that is mainly used to make predictions about unknown future events. Predictive analytics uses various techniques from machine learning, statistics, data mining, modeling, and artificial intelligence for analyzing current data and to make predictions about futu...

  14. Applying Data Mining Techniques to Improve Information Security in the Cloud: A Single Cache System Approach

    Directory of Open Access Journals (Sweden)

    Amany AlShawi

    2016-01-01

    Presently, the popularity of cloud computing is gradually increasing day by day. The purpose of this research was to enhance the security of the cloud using techniques such as data mining with specific reference to the single cache system. From the findings of the research, it was observed that the security in the cloud could be enhanced with the single cache system. For future purposes, an Apriori algorithm can be applied to the single cache system. This can be applied by all cloud providers, vendors, data distributors, and others. Further, data objects entered into the single cache system can be extended into 12 components. Database and SPSS modelers can be used to implement the same.

  15. Mining biological information from 3D short time-series gene expression data: the OPTricluster algorithm.

    Science.gov (United States)

    Tchagang, Alain B; Phan, Sieu; Famili, Fazel; Shearer, Heather; Fobert, Pierre; Huang, Yi; Zou, Jitao; Huang, Daiqing; Cutler, Adrian; Liu, Ziying; Pan, Youlian

    2012-04-04

    Nowadays, it is possible to collect expression levels of a set of genes from a set of biological samples during a series of time points. Such data have three dimensions: gene-sample-time (GST). Thus they are called 3D microarray gene expression data. To take advantage of the 3D data collected, and to fully understand the biological knowledge hidden in the GST data, novel subspace clustering algorithms have to be developed to effectively address the biological problem in the corresponding space. We developed a subspace clustering algorithm called Order Preserving Triclustering (OPTricluster), for 3D short time-series data mining. OPTricluster is able to identify 3D clusters with coherent evolution from a given 3D dataset using a combinatorial approach on the sample dimension, and the order preserving (OP) concept on the time dimension. The fusion of the two methodologies allows one to study similarities and differences between samples in terms of their temporal expression profile. OPTricluster has been successfully applied to four case studies: immune response in mice infected by malaria (Plasmodium chabaudi), systemic acquired resistance in Arabidopsis thaliana, similarities and differences between inner and outer cotyledon in Brassica napus during seed development, and to Brassica napus whole seed development. These studies showed that OPTricluster is robust to noise and is able to detect the similarities and differences between biological samples. Our analysis showed that OPTricluster generally outperforms other well known clustering algorithms such as the TRICLUSTER, gTRICLUSTER and K-means; it is robust to noise and can effectively mine the biological knowledge hidden in the 3D short time-series gene expression data.
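
    The order preserving (OP) idea on the time dimension can be illustrated with a small sketch: two genes follow the same pattern if their expression values are ranked in the same order across the time points. The code below shows only this grouping step for a single sample; it is not the OPTricluster algorithm, and the function names, tie handling (ties keep their original time order) and gene profiles are assumptions made for illustration.

```python
from collections import defaultdict


def order_pattern(series):
    """Time-point indices sorted by increasing expression value."""
    return tuple(sorted(range(len(series)), key=lambda t: series[t]))


def group_by_order(profiles):
    """Group genes whose time series share the same ordering pattern."""
    groups = defaultdict(list)
    for gene, series in profiles.items():
        groups[order_pattern(series)].append(gene)
    return dict(groups)


# Illustrative expression profiles over three time points.
profiles = {
    "g1": [1.0, 2.0, 3.0],     # steadily increasing
    "g2": [10.0, 20.0, 30.0],  # same ordering as g1 at a different scale
    "g3": [3.0, 2.0, 1.0],     # steadily decreasing
}
clusters = group_by_order(profiles)
```

    Here g1 and g2 fall into the same group despite their different magnitudes, while g3 forms its own group.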

  16. The Distribution of the Informative Intensity of the Text in Terms of its Structure (On Materials of the English Texts in the Mining Sphere)

    Science.gov (United States)

    Znikina, Ludmila; Rozhneva, Elena

    2017-11-01

    The article deals with the distribution of informative intensity of the English-language scientific text based on its structural features contributing to the process of formalization of the scientific text and the preservation of the adequacy of the text with derived semantic information in relation to the primary. Discourse analysis is built on specific compositional and meaningful examples of scientific texts taken from the mining field. It also analyzes the adequacy of the translation of foreign texts into another language, the relationships between elements of linguistic systems, the degree of a formal conformance, translation with the specific objectives and information needs of the recipient. Some key words and ideas are emphasized in the paragraphs of the English-language mining scientific texts. The article gives the characteristic features of the structure of paragraphs of technical text and examples of constructions in English scientific texts based on a mining theme with the aim to explain the possible ways of their adequate translation.

  17. RESEARCH ON REMOTE SENSING GEOLOGICAL INFORMATION EXTRACTION BASED ON OBJECT ORIENTED CLASSIFICATION

    Directory of Open Access Journals (Sweden)

    H. Gao

    2018-04-01

    Northern Tibet belongs to the sub-cold arid climate zone of the plateau. It is rarely visited by people, and the geological working conditions are very poor. However, the stratum exposures are good and human interference is very small. Therefore, research on the automatic classification and extraction of remote sensing geological information there has typical significance and good application prospects. Based on object-oriented classification in northern Tibet, using WorldView-2 high-resolution remote sensing data combined with tectonic information and image enhancement, the lithological spectral features, shape features, spatial locations and topological relations of various geological information are mined. By setting thresholds and using hierarchical classification, eight kinds of geological information were classified and extracted. Compared with the existing geological maps, the accuracy analysis shows that the overall accuracy reached 87.8561%, indicating that the classification-oriented method is effective and feasible for this study area and provides a new idea for the automatic extraction of remote sensing geological information.

  18. A Method for Extracting Road Boundary Information from Crowdsourcing Vehicle GPS Trajectories.

    Science.gov (United States)

    Yang, Wei; Ai, Tinghua; Lu, Wei

    2018-04-19

    Crowdsourcing trajectory data is an important approach for accessing and updating road information. In this paper, we present a novel approach for extracting road boundary information from crowdsourcing vehicle traces based on Delaunay triangulation (DT). First, an optimization and interpolation method is proposed to filter abnormal trace segments from raw global positioning system (GPS) traces and interpolate the optimization segments adaptively to ensure there are enough tracking points. Second, constructing the DT and the Voronoi diagram within interpolated tracking lines to calculate road boundary descriptors using the area of Voronoi cell and the length of triangle edge. Then, the road boundary detection model is established integrating the boundary descriptors and trajectory movement features (e.g., direction) by DT. Third, using the boundary detection model to detect road boundary from the DT constructed by trajectory lines, and a regional growing method based on seed polygons is proposed to extract the road boundary. Experiments were conducted using the GPS traces of taxis in Beijing, China, and the results show that the proposed method is suitable for extracting the road boundary from low-frequency GPS traces, multi-type road structures, and different time intervals. Compared with two existing methods, the automatically extracted boundary information was proved to be of higher quality.

  19. A Method for Extracting Road Boundary Information from Crowdsourcing Vehicle GPS Trajectories

    Directory of Open Access Journals (Sweden)

    Wei Yang

    2018-04-01

    Crowdsourcing trajectory data is an important approach for accessing and updating road information. In this paper, we present a novel approach for extracting road boundary information from crowdsourcing vehicle traces based on Delaunay triangulation (DT). First, an optimization and interpolation method is proposed to filter abnormal trace segments from raw global positioning system (GPS) traces and interpolate the optimization segments adaptively to ensure there are enough tracking points. Second, constructing the DT and the Voronoi diagram within interpolated tracking lines to calculate road boundary descriptors using the area of Voronoi cell and the length of triangle edge. Then, the road boundary detection model is established integrating the boundary descriptors and trajectory movement features (e.g., direction) by DT. Third, using the boundary detection model to detect road boundary from the DT constructed by trajectory lines, and a regional growing method based on seed polygons is proposed to extract the road boundary. Experiments were conducted using the GPS traces of taxis in Beijing, China, and the results show that the proposed method is suitable for extracting the road boundary from low-frequency GPS traces, multi-type road structures, and different time intervals. Compared with two existing methods, the automatically extracted boundary information was proved to be of higher quality.

  20. BioCreative V track 4: a shared task for the extraction of causal network information using the Biological Expression Language.

    Science.gov (United States)

    Rinaldi, Fabio; Ellendorff, Tilia Renate; Madan, Sumit; Clematide, Simon; van der Lek, Adrian; Mevissen, Theo; Fluck, Juliane

    2016-01-01

    Automatic extraction of biological network information is one of the most desired and most complex tasks in biological and medical text mining. Track 4 at BioCreative V attempts to approach this complexity using fragments of large-scale manually curated biological networks, represented in Biological Expression Language (BEL), as training and test data. BEL is an advanced knowledge representation format which has been designed to be both human readable and machine processable. The specific goal of track 4 was to evaluate text mining systems capable of automatically constructing BEL statements from given evidence text, and of retrieving evidence text for given BEL statements. Given the complexity of the task, we designed an evaluation methodology which gives credit to partially correct statements. We identified various levels of information expressed by BEL statements, such as entities, functions, relations, and introduced an evaluation framework which rewards systems capable of delivering useful BEL fragments at each of these levels. The aim of this evaluation method is to help identify the characteristics of the systems which, if combined, would be most useful for achieving the overall goal of automatically constructing causal biological networks from text. © The Author(s) 2016. Published by Oxford University Press.

  1. YAdumper: extracting and translating large information volumes from relational databases to structured flat files.

    Science.gov (United States)

    Fernández, José M; Valencia, Alfonso

    2004-10-12

    Downloading the information stored in relational databases into XML and other flat formats is a common task in bioinformatics. This periodical dumping of information requires considerable CPU time, disk and memory resources. YAdumper has been developed as a purpose-specific tool to deal with the integral structured information download of relational databases. YAdumper is a Java application that organizes database extraction following an XML template based on an external Document Type Declaration. Compared with other non-native alternatives, YAdumper substantially reduces memory requirements and considerably improves writing performance.

  2. Process-aware information systems : lessons to be learned from process mining

    NARCIS (Netherlands)

    Aalst, van der W.M.P.; Jensen, K.; Aalst, van der W.M.P.

    2009-01-01

    A Process-Aware Information System (PAIS) is a software system that manages and executes operational processes involving people, applications, and/or information sources on the basis of process models. Example PAISs are workflow management systems, case-handling systems, enterprise information

  3. Machine-learned solutions for three stages of clinical information extraction: the state of the art at i2b2 2010.

    Science.gov (United States)

    de Bruijn, Berry; Cherry, Colin; Kiritchenko, Svetlana; Martin, Joel; Zhu, Xiaodan

    2011-01-01

    As clinical text mining continues to mature, its potential as an enabling technology for innovations in patient care and clinical research is becoming a reality. A critical part of that process is rigid benchmark testing of natural language processing methods on realistic clinical narrative. In this paper, the authors describe the design and performance of three state-of-the-art text-mining applications from the National Research Council of Canada on evaluations within the 2010 i2b2 challenge. The three systems perform three key steps in clinical information extraction: (1) extraction of medical problems, tests, and treatments, from discharge summaries and progress notes; (2) classification of assertions made on the medical problems; (3) classification of relations between medical concepts. Machine learning systems performed these tasks using large-dimensional bags of features, as derived from both the text itself and from external sources: UMLS, cTAKES, and Medline. Performance was measured per subtask, using micro-averaged F-scores, as calculated by comparing system annotations with ground-truth annotations on a test set. The systems ranked high among all submitted systems in the competition, with the following F-scores: concept extraction 0.8523 (ranked first); assertion detection 0.9362 (ranked first); relationship detection 0.7313 (ranked second). For all tasks, we found that the introduction of a wide range of features was crucial to success. Importantly, our choice of machine learning algorithms allowed us to be versatile in our feature design, and to introduce a large number of features without overfitting and without encountering computing-resource bottlenecks.
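
    The micro-averaged F-score mentioned above can be sketched as follows. This is a minimal illustration, not the i2b2 scoring tool: it assumes exact-match scoring on (start, end, label) triples, and the names micro_f1, gold and system as well as the toy annotations are hypothetical.

```python
from typing import Dict, Set, Tuple

# Assumed annotation shape: (start offset, end offset, label)
Annotation = Tuple[int, int, str]


def micro_f1(gold: Dict[str, Set[Annotation]],
             system: Dict[str, Set[Annotation]]) -> float:
    """Micro-averaged F1: pool counts over all documents, then score once."""
    tp = fp = fn = 0
    for doc_id in gold.keys() | system.keys():
        g = gold.get(doc_id, set())
        s = system.get(doc_id, set())
        tp += len(g & s)   # annotations found by both
        fp += len(s - g)   # system-only annotations
        fn += len(g - s)   # missed gold annotations
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


# Toy data (hypothetical): one label error in note1, note2 fully correct
gold = {
    "note1": {(0, 8, "problem"), (15, 20, "test")},
    "note2": {(3, 9, "treatment")},
}
system = {
    "note1": {(0, 8, "problem"), (15, 20, "treatment")},
    "note2": {(3, 9, "treatment")},
}
score = micro_f1(gold, system)
```

    Pooling the counts before computing precision and recall is what makes the score micro-averaged: documents with many annotations weigh more than documents with few.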

  4. Mining Method

    Energy Technology Data Exchange (ETDEWEB)

    Kim, Young Shik; Lee, Kyung Woon; Kim, Oak Hwan; Kim, Dae Kyung [Korea Institute of Geology Mining and Materials, Taejon (Korea, Republic of)

    1996-12-01

    The shrinking coal market has forced the coal industry to make exceptional rationalization and restructuring efforts since the end of the eighties. Competition from crude oil and natural gas has been compounded by growing pressure from rising wages and rising production costs as the workings get deeper. To improve the competitive position of the coal mines against oil and gas through cost reduction, studies to improve the mining system have been carried out. To find the fields most in need of improvement, the technologies used in Tae Bak Colliery, which was selected as one of the long-running mines, were investigated and analyzed. The mining method proved to be the field most in need of improvement to reduce the production cost. The present method, the so-called inseam roadway caving method, is used to extract the steep and thick seam. However, this method has several drawbacks. To solve the problems, two mining methods are suggested as a long-term and a short-term method, respectively. The inseam roadway caving method with long-hole blasting is a variant of the present inseam roadway caving method, modified by replacing timber sets with steel arch sets and the shovel loaders with chain conveyors; long-hole blasting is introduced to promote caving. The pillar caving method with chock supports uses chock supports set in the cross-cut from the hanging wall to the footwall. Two single chain conveyors are needed: one is installed in front of the chock supports to clear coal from the cutting face, and the other is installed behind the supports to transport caved coal. This method is superior to the previous one in terms of safety from water inrushes, production rate and productivity. The only drawback is that it needs more investment. (author). 14 tabs., 34 figs.

  5. A geographical information system-based analysis of cancer mortality and population exposure to coal mining activities in West Virginia, United States of America

    Directory of Open Access Journals (Sweden)

    Michael Hendryx

    2010-05-01

    Cancer incidence and mortality rates are high in West Virginia compared to the rest of the United States of America. Previous research has suggested that exposure to activities of the coal mining industry may contribute to elevated cancer mortality, although exposure measures have been limited. This study tests alternative specifications of exposure to mining activity to determine whether a measure based on location of mines, processing plants, coal slurry impoundments and underground slurry injection sites relative to population levels is superior to a previously-reported measure of exposure based on tons mined at the county level, in the prediction of age-adjusted cancer mortality rates. To this end, we utilize two geographical information system (GIS) techniques – exploratory spatial data analysis and inverse distance mapping – to construct new statistical analyses. Total, respiratory and “other” age-adjusted cancer mortality rates in West Virginia were found to be more highly associated with the GIS-exposure measure than the tonnage measure, before and after statistical control for smoking rates. The superior performance of the GIS measure, based on where people in the state live relative to mining activity, suggests that activities of the industry contribute to cancer mortality. Further confirmation of observed phenomena is necessary with person-level studies, but the results add to the body of evidence that coal mining poses environmental risks to population health in West Virginia.

  6. USING WEB MINING IN E-COMMERCE APPLICATIONS

    Directory of Open Access Journals (Sweden)

    Claudia Elena Dinucă

    2011-09-01

    Nowadays, the web is an important part of our daily life and has become the best medium for doing business. Large companies are rethinking their business strategy and using the web to improve business. Business carried out on the Web gives potential customers or partners a place where a company's products and specific business can be found. A business presence through a company web site has several advantages, as it breaks the barriers of time and space compared with a physical office. To differentiate themselves in the Internet economy, winning companies have realized that e-commerce is more than just buying and selling; appropriate strategies are key to improving competitive power. One effective technique used for this purpose is data mining. Data mining is the process of extracting interesting knowledge from data. Web mining is the use of data mining techniques to extract information from web data. This article presents the three components of web mining: web usage mining, web structure mining and web content mining.

  7. A mine of information: Benthic algal communities as biomonitors of metal contamination from abandoned tailings

    International Nuclear Information System (INIS)

    Lavoie, Isabelle; Lavoie, Michel; Fortin, Claude

    2012-01-01

    Various biomonitoring approaches were tested in the field to assess the response of natural periphytic algal communities to chronic metal contamination downstream from an abandoned mine tailings site. The accumulation of cadmium (Cd), copper (Cu), lead (Pb) and zinc (Zn) as well as the production of phytochelatins, the presence of diatom taxa known to tolerate high metal concentrations, diatom diversity and the presence of teratologies were determined. We observed highly significant relationships between intracellular metal and calculated free metal ion concentrations. Such relationships are often observed in laboratory studies but have been rarely validated in field studies. These results suggest that the concentration of metal inside the field-collected periphyton, regardless of its species composition, is a good indicator of exposure and is an interesting proxy for bioavailable metal concentrations in natural waters. The presence of teratologies and metal-tolerant taxa at our contaminated sites provided a clear indication that diatom communities were responding to this metal stress. A multi-metric approach integrating various bioassessment methods could be used for the field monitoring of metal contamination and the quantification of its effects. Highlights: ► Various approaches for metal contamination biomonitoring were used in the field. ► Metal accumulation in periphyton is correlated to free ion concentration. ► Teratologies and metal-tolerant taxa provided a clear indication of metal stress. ► Stream periphyton shows great potential as a biomonitor of metal contamination.

  8. Application of text mining for customer evaluations in commercial banking

    Science.gov (United States)

    Tan, Jing; Du, Xiaojiang; Hao, Pengpeng; Wang, Yanbo J.

    2015-07-01

    Customer attrition is an increasingly serious problem for commercial banks. To combat this problem comprehensively, mining customer evaluation texts is as important as mining structured customer data. To extract hidden information from customer evaluations, Textual Feature Selection, Classification and Association Rule Mining are necessary techniques. This paper presents all three techniques using Chinese Word Segmentation, C5.0 and Apriori, and a set of experiments was run on a collection of real textual data comprising 823 customer evaluations taken from a Chinese commercial bank. Results, consequent solutions, and some advice for the commercial bank are given in this paper.
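
    The frequent-itemset stage of Apriori named above can be sketched as follows. This is a minimal illustration under stated assumptions, not the paper's pipeline: the keyword sets stand in for segmented customer evaluations and are invented for the example, and the function name apriori and the min_support value are hypothetical.

```python
from itertools import combinations
from typing import Dict, FrozenSet, List, Set


def apriori(transactions: List[Set[str]],
            min_support: float) -> Dict[FrozenSet[str], float]:
    """Return every itemset whose support is at least min_support."""
    n = len(transactions)

    def support(itemset: FrozenSet[str]) -> float:
        # Fraction of transactions that contain the whole itemset
        return sum(itemset <= t for t in transactions) / n

    items = {item for t in transactions for item in t}
    current = [frozenset([i]) for i in items]
    frequent: Dict[FrozenSet[str], float] = {}
    k = 1
    while current:
        level = {}
        for candidate in current:
            s = support(candidate)
            if s >= min_support:
                level[candidate] = s
        frequent.update(level)
        k += 1
        # Join step: merge frequent (k-1)-itemsets into k-itemsets
        candidates = {a | b for a in level for b in level if len(a | b) == k}
        # Prune step: every (k-1)-subset of a candidate must be frequent
        current = [c for c in candidates
                   if all(frozenset(sub) in level
                          for sub in combinations(c, k - 1))]
    return frequent


# Hypothetical keyword sets, one per customer evaluation
evaluations = [
    {"fee", "slow"},
    {"fee", "slow", "queue"},
    {"fee", "queue"},
    {"slow", "queue"},
    {"fee", "slow"},
]
frequent = apriori(evaluations, min_support=0.5)
```

    Association rules would then be derived from the frequent itemsets by comparing the support of an itemset with the support of its subsets; that step is omitted here.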

  9. Extracting information from two-dimensional electrophoresis gels by partial least squares regression

    DEFF Research Database (Denmark)

    Jessen, Flemming; Lametsch, R.; Bendixen, E.

    2002-01-01

    Two-dimensional gel electrophoresis (2-DE) produces large amounts of data and extraction of relevant information from these data demands a cautious and time consuming process of spot pattern matching between gels. The classical approach of data analysis is to detect protein markers that appear or disappear depending on the experimental conditions. Such biomarkers are found by comparing the relative volumes of individual spots in the individual gels. Multivariate statistical analysis and modelling of 2-DE data for comparison and classification is an alternative approach utilising the combination of all proteins/spots in the gels. In the present study it is demonstrated how information can be extracted by multivariate data analysis. The strategy is based on partial least squares regression followed by variable selection to find proteins that individually or in combination with other proteins vary...

  10. From remote sensing data about information extraction for 3D geovisualization - Development of a workflow

    International Nuclear Information System (INIS)

    Tiede, D.

    2010-01-01

    With an increased availability of high (spatial) resolution remote sensing imagery since the late nineties, the need to develop operative workflows for the automated extraction, provision and communication of information from such data has grown. Monitoring requirements, aimed at the implementation of environmental or conservation targets, management of (environmental-) resources, and regional planning as well as international initiatives, especially the joint initiative of the European Commission and ESA (European Space Agency) for Global Monitoring for Environment and Security (GMES) play also a major part. This thesis addresses the development of an integrated workflow for the automated provision of information derived from remote sensing data. Considering applied data and fields of application, this work aims to design the workflow as generic as possible. Following research questions are discussed: What are the requirements of a workflow architecture that seamlessly links the individual workflow elements in a timely manner and secures accuracy of the extracted information effectively? How can the workflow retain its efficiency if mounds of data are processed? How can the workflow be improved with regards to automated object-based image analysis (OBIA)? Which recent developments could be of use? What are the limitations or which workarounds could be applied in order to generate relevant results? How can relevant information be prepared target-oriented and communicated effectively? How can the more recently developed freely available virtual globes be used for the delivery of conditioned information under consideration of the third dimension as an additional, explicit carrier of information? Based on case studies comprising different data sets and fields of application it is demonstrated how methods to extract and process information as well as to effectively communicate results can be improved and successfully combined within one workflow. It is shown that (1

  11. Urban Mining: Quality and quantity of recyclable and recoverable material mechanically and physically extractable from residual waste

    International Nuclear Information System (INIS)

    Di Maria, Francesco; Micale, Caterina; Sordi, Alessio; Cirulli, Giuseppe; Marionni, Moreno

    2013-01-01

    Highlights: • Material recycling and recovery from residual waste by physical and mechanical process has been investigated. • About 6% of recyclable can be extracted by NIR and 2-3Dimension selector. • Another 2% of construction materials can be extracted by adopting modified soil washing process. • Extracted material quality is quite high even some residual heavy metal have been detected by leaching test. - Abstract: The mechanically sorted dry fraction (MSDF) and Fines (<20 mm) arising from the mechanical biological treatment of residual municipal solid waste (RMSW) contains respectively about 11% w/w each of recyclable and recoverable materials. Processing a large sample of MSDF in an existing full-scale mechanical sorting facility equipped with near infrared and 2-3 dimensional selectors led to the extraction of about 6% w/w of recyclables with respect to the RMSW weight. Maximum selection efficiency was achieved for metals, about 98% w/w, whereas it was lower for Waste Electrical and Electronic Equipment (WEEE), about 2% w/w. After a simulated lab scale soil washing treatment it was possible to extract about 2% w/w of inert exploitable substances recoverable as construction materials, with respect to the amount of RMSW. The passing curve showed that inert materials were mainly sand with a particle size ranging from 0.063 to 2 mm. Leaching tests showed quite low heavy metal concentrations with the exception of the particles retained by the 0.5 mm sieve. A minimum pollutant concentration was in the leachate from the 10 and 20 mm particle size fractions

  12. Addressing Risk Assessment for Patient Safety in Hospitals through Information Extraction in Medical Reports

    Science.gov (United States)

    Proux, Denys; Segond, Frédérique; Gerbier, Solweig; Metzger, Marie Hélène

    Hospital Acquired Infections (HAI) are a real burden for doctors and risk surveillance experts. The impact on patients' health and related healthcare cost is very significant and a major concern even for rich countries. Furthermore, the data required to evaluate the threat is generally not available to experts, which prevents a fast reaction. However, recent advances in Computational Intelligence Techniques such as Information Extraction, Risk Patterns Detection in documents and Decision Support Systems now make it possible to address this problem.

  13. From Specific Information Extraction to Inferences: A Hierarchical Framework of Graph Comprehension

    Science.gov (United States)

    2004-09-01

    The skill to interpret the information displayed in graphs is so important to have, the National Council of Teachers of Mathematics has created...guidelines to ensure that students learn these skills (NCTM: Standards for Mathematics, 2003). These guidelines are based primarily on the extraction of...graphical perception. Human Computer Interaction, 8, 353-388. NCTM: Standards for Mathematics. (2003, 2003). Peebles, D., & Cheng, P. C.-H. (2002

  14. Extracting breathing rate information from a wearable reflectance pulse oximeter sensor.

    Science.gov (United States)

    Johnston, W S; Mendelson, Y

    2004-01-01

    The integration of multiple vital physiological measurements could help combat medics and field commanders to better predict a soldier's health condition and enhance their ability to perform remote triage procedures. In this paper we demonstrate the feasibility of extracting accurate breathing rate information from a photoplethysmographic signal that was recorded by a reflectance pulse oximeter sensor mounted on the forehead and subsequently processed by a simple time domain filtering and frequency domain Fourier analysis.
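
    The two-stage idea described above (time-domain filtering, then Fourier analysis) can be sketched on a synthetic signal. This is a minimal sketch, not the authors' processing chain: the moving-average filter, the 0.1 to 0.5 Hz search band, the sampling rate and the synthetic waveform are all assumptions chosen for illustration, and the function names are hypothetical.

```python
import cmath
import math
from typing import List


def moving_average(x: List[float], width: int) -> List[float]:
    """Centered moving average; the window shrinks at the signal edges."""
    half = width // 2
    out = []
    for i in range(len(x)):
        window = x[max(0, i - half): i + half + 1]
        out.append(sum(window) / len(window))
    return out


def dominant_frequency(x: List[float], fs: float,
                       f_lo: float, f_hi: float) -> float:
    """Frequency (Hz) of the largest DFT magnitude inside [f_lo, f_hi]."""
    n = len(x)
    mean = sum(x) / n
    centered = [v - mean for v in x]  # remove the DC offset
    best_f, best_mag = 0.0, -1.0
    for k in range(1, n // 2):
        f = k * fs / n
        if f_lo <= f <= f_hi:
            # Direct DFT of bin k; adequate for short records
            coeff = sum(v * cmath.exp(-2j * math.pi * k * t / n)
                        for t, v in enumerate(centered))
            if abs(coeff) > best_mag:
                best_f, best_mag = f, abs(coeff)
    return best_f


# Synthetic stand-in for a PPG record (assumed values):
# 1.2 Hz cardiac pulse plus a 0.25 Hz respiratory baseline, 40 s at 10 Hz
fs = 10.0
n = 400
ppg = [math.sin(2 * math.pi * 1.2 * t / fs)
       + 0.4 * math.sin(2 * math.pi * 0.25 * t / fs)
       for t in range(n)]

# A window of about one cardiac period suppresses the pulse component
smoothed = moving_average(ppg, width=9)
breathing_hz = dominant_frequency(smoothed, fs, 0.1, 0.5)
breaths_per_min = 60.0 * breathing_hz
```

    Real recordings contain motion artifacts and a drifting respiratory frequency, so a practical system would likely need windowed analysis and a better filter than the moving average used here.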

  15. Extraction of land cover change information from ENVISAT-ASAR data in Chengdu Plain

    Science.gov (United States)

    Xu, Wenbo; Fan, Jinlong; Huang, Jianxi; Tian, Yichen; Zhang, Yong

    2006-10-01

    Land cover data are essential to most global change research objectives, including the assessment of current environmental conditions and the simulation of future environmental scenarios that ultimately lead to public policy development. Chinese Academy of Sciences generated a nationwide land cover database in order to carry out the quantification and spatial characterization of land use/cover changes (LUCC) in 1990s. In order to improve the reliability of the database, we will update the database anytime. But it is difficult to obtain remote sensing data to extract land cover change information in large-scale. It is hard to acquire optical remote sensing data in Chengdu plain, so the objective of this research was to evaluate multitemporal ENVISAT advanced synthetic aperture radar (ASAR) data for extracting land cover change information. Based on the fieldwork and the nationwide 1:100000 land cover database, the paper assesses several land cover changes in Chengdu plain, for example: crop to buildings, forest to buildings, and forest to bare land. The results show that ENVISAT ASAR data have great potential for the applications of extracting land cover change information.

  16. Benchmarking infrastructure for mutation text mining.

    Science.gov (United States)

    Klein, Artjom; Riazanov, Alexandre; Hindle, Matthew M; Baker, Christopher Jo

    2014-02-25

    Experimental research on the automatic extraction of information about mutations from texts is greatly hindered by the lack of consensus evaluation infrastructure for the testing and benchmarking of mutation text mining systems. We propose a community-oriented annotation and benchmarking infrastructure to support development, testing, benchmarking, and comparison of mutation text mining systems. The design is based on semantic standards, where RDF is used to represent annotations, an OWL ontology provides an extensible schema for the data and SPARQL is used to compute various performance metrics, so that in many cases no programming is needed to analyze results from a text mining system. While large benchmark corpora for biological entity and relation extraction are focused mostly on genes, proteins, diseases, and species, our benchmarking infrastructure fills the gap for mutation information. The core infrastructure comprises (1) an ontology for modelling annotations, (2) SPARQL queries for computing performance metrics, and (3) a sizeable collection of manually curated documents, that can support mutation grounding and mutation impact extraction experiments. We have developed the principal infrastructure for the benchmarking of mutation text mining tasks. The use of RDF and OWL as the representation for corpora ensures extensibility. The infrastructure is suitable for out-of-the-box use in several important scenarios and is ready, in its current state, for initial community adoption.

  17. Benchmarking infrastructure for mutation text mining

    Science.gov (United States)

    2014-01-01

    Background Experimental research on the automatic extraction of information about mutations from texts is greatly hindered by the lack of consensus evaluation infrastructure for the testing and benchmarking of mutation text mining systems. Results We propose a community-oriented annotation and benchmarking infrastructure to support development, testing, benchmarking, and comparison of mutation text mining systems. The design is based on semantic standards, where RDF is used to represent annotations, an OWL ontology provides an extensible schema for the data and SPARQL is used to compute various performance metrics, so that in many cases no programming is needed to analyze results from a text mining system. While large benchmark corpora for biological entity and relation extraction are focused mostly on genes, proteins, diseases, and species, our benchmarking infrastructure fills the gap for mutation information. The core infrastructure comprises (1) an ontology for modelling annotations, (2) SPARQL queries for computing performance metrics, and (3) a sizeable collection of manually curated documents, that can support mutation grounding and mutation impact extraction experiments. Conclusion We have developed the principal infrastructure for the benchmarking of mutation text mining tasks. The use of RDF and OWL as the representation for corpora ensures extensibility. The infrastructure is suitable for out-of-the-box use in several important scenarios and is ready, in its current state, for initial community adoption. PMID:24568600

  18. Mined-out land

    International Nuclear Information System (INIS)

    Reinsalu, Enno; Toomik, Arvi; Valgma, Ingo

    2002-01-01

    Estonian mineral resources are deposited at shallow depth and mining fields are large, therefore vast areas are affected by mining. There are at least 800 deposits with a total area of 6,000 km² and about the same number of underground mines, surface mines, peat fields, quarries, and sand and gravel pits. The deposits cover more than 10% of the Estonian mainland. The total area of operating mine claims exceeds 150 km², which makes up 0.3% of Estonia's area. The book is written mainly for people who live or work in areas influenced by mining. The observations and research could benefit those who are interested in geography and the environment and who follow the formation and appearance of mined-out landscapes. The book also contains warnings for careless people on and under the surface of mined-out land. Part of the book contains results of the research carried out in 1968-1993 by the first two authors working at the Estonian branch of the A. Skochinsky Institute of Mining. Since 1990, Arvi Toomik has continued this study at the Northeastern section of the Institute of Ecology of Tallinn Pedagogical University. Enno Reinsalu studied aftereffects of mining at the Mining Department of Tallinn Technical University from 1998 to 2000. A Geographical Information System for Mining was studied by Ingo Valgma within his doctoral dissertation, and this book is one of the applications of his study.

  19. Text mining for adverse drug events: the promise, challenges, and state of the art.

    Science.gov (United States)

    Harpaz, Rave; Callahan, Alison; Tamang, Suzanne; Low, Yen; Odgers, David; Finlayson, Sam; Jung, Kenneth; LePendu, Paea; Shah, Nigam H

    2014-10-01

    Text mining is the computational process of extracting meaningful information from large amounts of unstructured text. It is emerging as a tool to leverage underutilized data sources that can improve pharmacovigilance, including the objective of adverse drug event (ADE) detection and assessment. This article provides an overview of recent advances in pharmacovigilance driven by the application of text mining, and discusses several data sources (such as biomedical literature, clinical narratives, product labeling, social media, and Web search logs) that are amenable to text mining for pharmacovigilance. Given the state of the art, it appears text mining can be applied to extract useful ADE-related information from multiple textual sources. Nonetheless, further research is required to address remaining technical challenges associated with the text mining methodologies, and to conclusively determine the relative contribution of each textual source to improving pharmacovigilance.

  20. Text mining scientific papers: a survey on FCA-based information retrieval research

    NARCIS (Netherlands)

    Poelmans, J.; Ignatov, D.I.; Viaene, S.; Dedene, G.; Kuznetsov, S.O.

    2012-01-01

    Formal Concept Analysis (FCA) is an unsupervised clustering technique and many scientific papers are devoted to applying FCA in Information Retrieval (IR) research. We collected 103 papers published between 2003-2009 which mention FCA and information retrieval in the abstract, title or keywords.

  1. Mining royalties

    Directory of Open Access Journals (Sweden)

    Jelenković Rade J.

    2014-01-01

    Mineral resources are finite and nonrenewable in the sense that their extraction permanently depletes a country's resource inventory. The role of governments should be to manage the exploitation of these resources to maximize the economic benefits to their community, consistent with the need to attract and retain the exploration and development capital necessary to continue to realize these benefits for as long as possible. In designing mineral sector taxation systems, policy makers must carefully seek to balance tax types, rates, and incentives that satisfy the needs of both the nation and the mining investor.

  2. SAR matrices: automated extraction of information-rich SAR tables from large compound data sets.

    Science.gov (United States)

    Wassermann, Anne Mai; Haebel, Peter; Weskamp, Nils; Bajorath, Jürgen

    2012-07-23

    We introduce the SAR matrix data structure that is designed to elucidate SAR patterns produced by groups of structurally related active compounds, which are extracted from large data sets. SAR matrices are systematically generated and sorted on the basis of SAR information content. Matrix generation is computationally efficient and enables processing of large compound sets. The matrix format is reminiscent of SAR tables, and SAR patterns revealed by different categories of matrices are easily interpretable. The structural organization underlying matrix formation is more flexible than standard R-group decomposition schemes. Hence, the resulting matrices capture SAR information in a comprehensive manner.

  3. Accounting and Financial Data Analysis Data Mining Tools

    Directory of Open Access Journals (Sweden)

    Diana Elena Codreanu

    2011-05-01

    Computerized accounting systems have grown in complexity in recent years because of the competitive economic environment. With the help of data analysis solutions such as OLAP and data mining, it is possible to perform multidimensional data analysis, detect fraud, and discover knowledge hidden in data, ensuring that such information is useful for decision making within the organization. The literature offers many definitions of data mining, but all boil down to the same idea: a process that extracts new information from large data collections, information that would be very difficult to obtain without the aid of data mining tools. Information obtained through data mining has the advantage that it not only answers the question of what is happening but also explains why certain things are happening. In this paper we present advanced techniques for the analysis and exploitation of data stored in a multidimensional database.

  4. Comparison of methods of extracting information for meta-analysis of observational studies in nutritional epidemiology

    Directory of Open Access Journals (Sweden)

    Jong-Myon Bae

    2016-01-01

    OBJECTIVES: A common method for conducting a quantitative systematic review (QSR) for observational studies related to nutritional epidemiology is the “highest versus lowest intake” method (HLM), in which only the information concerning the effect size (ES) of the highest category of a food item is collected on the basis of its lowest category. However, in the interval collapsing method (ICM), a method suggested to enable a maximum utilization of all available information, the ES information is collected by collapsing all categories into a single category. This study aimed to compare the ES and summary effect size (SES) between the HLM and ICM. METHODS: A QSR for evaluating the citrus fruit intake and risk of pancreatic cancer and calculating the SES by using the HLM was selected. The ES and SES were estimated by performing a meta-analysis using the fixed-effect model. The directionality and statistical significance of the ES and SES were used as criteria for determining the concordance between the HLM and ICM outcomes. RESULTS: No significant differences were observed in the directionality of SES extracted by using the HLM or ICM. The application of the ICM, which uses a broader information base, yielded more-consistent ES and SES, and narrower confidence intervals than the HLM. CONCLUSIONS: The ICM is advantageous over the HLM owing to its higher statistical accuracy in extracting information for QSR on nutritional epidemiology. The application of the ICM should hence be recommended for future studies.
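
    A fixed-effect summary effect size of the kind referred to above is commonly computed by inverse-variance weighting on the log scale. The sketch below assumes that approach and assumes each study reports a ratio estimate with a 95% confidence interval; the abstract does not state the exact computation, and the three study values are invented for illustration, not taken from the review.

```python
import math
from typing import List, Tuple

Z95 = 1.96  # normal quantile for a 95% confidence interval


def fixed_effect_summary(
        studies: List[Tuple[float, float, float]]) -> Tuple[float, float, float]:
    """Pool (estimate, lower, upper) ratio measures by inverse variance."""
    weighted_sum = 0.0
    weight_total = 0.0
    for estimate, lower, upper in studies:
        log_es = math.log(estimate)
        # Standard error recovered from the width of the 95% CI
        se = (math.log(upper) - math.log(lower)) / (2 * Z95)
        weight = 1.0 / se ** 2
        weighted_sum += weight * log_es
        weight_total += weight
    pooled = weighted_sum / weight_total
    pooled_se = math.sqrt(1.0 / weight_total)
    return (math.exp(pooled),
            math.exp(pooled - Z95 * pooled_se),
            math.exp(pooled + Z95 * pooled_se))


# Hypothetical effect sizes with 95% CIs (illustrative only)
studies = [(0.80, 0.60, 1.07), (0.90, 0.70, 1.16), (0.70, 0.50, 0.98)]
ses, ses_lower, ses_upper = fixed_effect_summary(studies)
```

    Because each study's weight is the inverse of its variance, the pooled interval is narrower than that of any single study, which is why the width of the confidence interval serves as one of the comparison criteria between extraction methods.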

  5. Feature extraction and learning using context cue and Rényi entropy based mutual information

    DEFF Research Database (Denmark)

    Pan, Hong; Olsen, Søren Ingvor; Zhu, Yaping

    2015-01-01

    information. In particular, for feature extraction, we develop a new set of kernel descriptors, Context Kernel Descriptors (CKD), which enhance the original KDES by embedding the spatial context into the descriptors. Context cues contained in the context kernel enforce some degree of spatial consistency, thus improving the robustness of CKD. For feature learning and reduction, we propose a novel codebook learning method, based on a Rényi quadratic entropy based mutual information measure called Cauchy-Schwarz Quadratic Mutual Information (CSQMI), to learn a compact and discriminative CKD codebook. Projecting...... as the information about the underlying labels of the CKD using CSQMI. Thus the resulting codebook and reduced CKD are discriminative. We verify the effectiveness of our method on several public image benchmark datasets such as YaleB, Caltech-101 and CIFAR-10, as well as a challenging chicken feet dataset of our own...

  6. Method of extracting significant trouble information of nuclear power plants using probabilistic analysis technique

    International Nuclear Information System (INIS)

    Shimada, Yoshio; Miyazaki, Takamasa

    2005-01-01

    In order to analyze and evaluate large amounts of trouble information of overseas nuclear power plants, it is necessary to select information that is significant in terms of both safety and reliability. In this research, a method of efficiently and simply classifying degrees of importance of components in terms of safety and reliability while paying attention to root-cause components appearing in the information was developed. Regarding safety, the reactor core damage frequency (CDF), which is used in the probabilistic analysis of a reactor, was used. Regarding reliability, the automatic plant trip probability (APTP), which is used in the probabilistic analysis of automatic reactor trips, was used. These two aspects were reflected in the development of criteria for classifying degrees of importance of components. By applying these criteria, a simple method of extracting significant trouble information of overseas nuclear power plants was developed. (author)

  7. Distributed Information Search and Retrieval for Astronomical Resource Discovery and Data Mining

    Science.gov (United States)

    Murtagh, Fionn; Guillaume, Damien

    Information search and retrieval has become by nature a distributed task. We look at tools and techniques which are of importance in this area. Current technological evolution can be summarized as the growing stability and cohesiveness of distributed architectures of searchable objects. The objects themselves are more often than not multimedia, including published articles or grey literature reports, yellow page services, image data, catalogs, presentation and online display materials, and "operations" information such as scheduling and publicly accessible proposal information. The evolution towards distributed architectures, protocols and formats, and the direction of our own work, are focussed on in this paper.

  8. Analysis of arsenic speciation in mine contaminated lacustrine sediment using selective sequential extraction, HR-ICPMS and TEM

    International Nuclear Information System (INIS)

    Haus, Kelly L.; Hooper, Robert L.; Strumness, Laura A.; Mahoney, J. Brian

    2008-01-01

    In order to determine how As speciation in lacustrine sediment changes as a function of local conditions, sediment cores were taken from three lakes with differing hydrologic regimes and subjected to extensive chemical and TEM analysis. The lakes (Killarney, Thompson and Swan Lakes) are located within the Coeur d'Alene River system (northern Idaho, USA), which has been contaminated with trace metals and As, from over 100 a of sulfide mining. Previous analyses of these lakebed sediments have shown an extensive amount of contaminant metals and As associated with sub-μm grains, making them extremely difficult to analyze using standard methods (scanning electron microscopy, X-ray diffraction). Transmission electron microscopy offers great advantages in spatial resolution and can be invaluable in determining As speciation when combined with other techniques. Data indicate that because of differences in local redox conditions, As speciation and stability is dramatically different in these lakes. Killarney and Thompson Lakes experience seasonal water-level fluctuations due to drawdown on a downstream dam, causing changes in O2 content in sediment exposed during drawdown. Swan Lake has relatively constant water levels as its only inlet is dammed. Consequently, Killarney and Thompson Lakes show an increase in labile As-bearing phases with depth, while Swan Lake data indicate stable As hosts throughout the sediment profile. Based on these observations it can be stated that As in lakebed sediments is much less mobile, and therefore less bioavailable, when water is kept at a constant level.

  9. Analysis of arsenic speciation in mine contaminated lacustrine sediment using selective sequential extraction, HR-ICPMS and TEM

    Energy Technology Data Exchange (ETDEWEB)

    Haus, Kelly L. [Department of Geology, Phillips 157, University of Wisconsin - Eau Claire, Eau Claire, WI 54702-4004 (United States)], E-mail: khaus@vt.edu; Hooper, Robert L.; Strumness, Laura A.; Mahoney, J. Brian [Department of Geology, Phillips 157, University of Wisconsin - Eau Claire, Eau Claire, WI 54702-4004 (United States)

    2008-04-15

    In order to determine how As speciation in lacustrine sediment changes as a function of local conditions, sediment cores were taken from three lakes with differing hydrologic regimes and subjected to extensive chemical and TEM analysis. The lakes (Killarney, Thompson and Swan Lakes) are located within the Coeur d'Alene River system (northern Idaho, USA), which has been contaminated with trace metals and As, from over 100 a of sulfide mining. Previous analyses of these lakebed sediments have shown an extensive amount of contaminant metals and As associated with sub-μm grains, making them extremely difficult to analyze using standard methods (scanning electron microscopy, X-ray diffraction). Transmission electron microscopy offers great advantages in spatial resolution and can be invaluable in determining As speciation when combined with other techniques. Data indicate that because of differences in local redox conditions, As speciation and stability is dramatically different in these lakes. Killarney and Thompson Lakes experience seasonal water-level fluctuations due to drawdown on a downstream dam, causing changes in O2 content in sediment exposed during drawdown. Swan Lake has relatively constant water levels as its only inlet is dammed. Consequently, Killarney and Thompson Lakes show an increase in labile As-bearing phases with depth, while Swan Lake data indicate stable As hosts throughout the sediment profile. Based on these observations it can be stated that As in lakebed sediments is much less mobile, and therefore less bioavailable, when water is kept at a constant level.

  10. Automated concept-level information extraction to reduce the need for custom software and rules development.

    Science.gov (United States)

    D'Avolio, Leonard W; Nguyen, Thien M; Goryachev, Sergey; Fiore, Louis D

    2011-01-01

    Despite at least 40 years of promising empirical performance, very few clinical natural language processing (NLP) or information extraction systems currently contribute to medical science or care. The authors address this gap by reducing the need for custom software and rules development with a graphical user interface-driven, highly generalizable approach to concept-level retrieval. A 'learn by example' approach combines features derived from open-source NLP pipelines with open-source machine learning classifiers to automatically and iteratively evaluate top-performing configurations. The Fourth i2b2/VA Shared Task Challenge's concept extraction task provided the data sets and metrics used to evaluate performance. Top F-measure scores for each of the tasks were medical problems (0.83), treatments (0.82), and tests (0.83). Recall lagged precision in all experiments. Precision was near or above 0.90 in all tasks. Discussion: With no customization for the tasks and less than 5 min of end-user time to configure and launch each experiment, the average F-measure was 0.83, one point behind the mean F-measure of the 22 entrants in the competition. Strong precision scores indicate the potential of applying the approach for more specific clinical information extraction tasks. There was not one best configuration, supporting an iterative approach to model creation. Acceptable levels of performance can be achieved using fully automated and generalizable approaches to concept-level information extraction. The described implementation and related documentation is available for download.
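
To make the relationship between the reported scores concrete, the sketch below computes the balanced F-measure (harmonic mean of precision and recall) and inverts it. The function names are hypothetical, and the assumption that precision is exactly 0.90 is for illustration only; the abstract says only "near or above 0.90".

```python
def f_measure(precision, recall):
    """Balanced F-measure: harmonic mean of precision and recall."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)

def implied_recall(f_score, precision):
    """Recall that yields the given F-measure at the given precision."""
    return f_score * precision / (2 * precision - f_score)

# Assumption for illustration: precision of exactly 0.90 with F = 0.83.
recall = implied_recall(0.83, 0.90)
print(round(recall, 3))
print(round(f_measure(0.90, recall), 3))
```

Under that assumption the implied recall falls well below the precision, which matches the abstract's remark that recall lagged precision.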

  11. Data mining : open systems drill through layers of legacy data to manage the flow of information

    International Nuclear Information System (INIS)

    Polczer, S.

    1999-01-01

    Information management challenges facing the petroleum and natural gas industry are discussed in conjunction with the increasing difficulty of accessing information because of the sheer volume of it, plus the fact that most data systems are proprietary 'closed' systems. In this context, reference is made to a newly developed software system named PetroDesk, developed by Merak Petroleum. PetroDesk is a geographical information browser used for integration and analysis of public, proprietary and personal data under a common interface. The software can be used to plot land position, chart productivity of wells, and produce graphs of decline rates, reserves and production. The software, which was originally designed for engineering data, also has been found useful in determining costs, revenue projections and other information needed to obtain a real-time net present worth of a company, and also in identifying business opportunities. 2 figs

  12. 77 FR 57111 - Agency Information Collection Activities: Comment Request for the Mine, Development, and Mineral...

    Science.gov (United States)

    2012-09-17

    ...: Private sector: U.S. nonfuel minerals producers and exploration operations; Public sector: State and local.... While you can ask us in your comment to withhold your personal identifying information from public...

  13. Associations of cadmium, zinc, and lead in soils from a lead and zinc mining area as studied by single and sequential extractions.

    Science.gov (United States)

    Anju, M; Banerjee, D K

    2011-05-01

    An exploratory study of the area surrounding a historical Pb-Zn mining and smelting area in Zawar, India, detected significant contamination of the terrestrial environment by heavy metals. Soils (n=87) were analyzed for pH, EC, total organic matter (TOM), Pb, Zn, Mn, and Cd levels. The statistical analysis indicated that the frequency distribution of the analyzed parameters for these soils was not normal. The median concentrations of metals in surface soils were: Pb 420.21 μg/g, Zn 870.25 μg/g, Mn 696.70 μg/g, and Cd 2.09 μg/g. Zn concentrations were significantly correlated with Cd (r=0.867), indicating that levels of Cd are dependent on Zn. However, pH, electrical conductivity and total organic matter were not correlated significantly with Cd, Pb, Zn, and Mn. To assess the potential mobility of Cd, Pb, and Zn in soils, single (EDTA) as well as sequential extraction scheme (modified BCR) were applied to representative (n=23) soil samples. The amount of Cd, Pb, and Zn extracted by EDTA and their total concentrations showed linear positive correlation, which are statistically significant (r values for Cd, Pb, and Zn being 0.901, 0.971, and 0.795, respectively, and P values being soils from all the locations. As indicated by single extraction, the apparent mobility and potential bioavailability of metals in soils followed the order: Cd ≥ Pb >> Zn. Soil samples were sequentially extracted (modified BCR) so that solid pools of Cd, Zn, and Pb could be partitioned into four operationally defined fractions viz. acid-soluble, reducible, oxidizable, and residual. Cadmium was present appreciably (39.41%) in the acid-soluble fraction and zinc was predominantly associated (32.42%) with residual fraction. Pb (66.86%) and Zn (30.44%) were present mainly in the reducible fraction. Assuming that the mobility and bioavailability are related to solubility of geochemical forms of metals and decrease in the order of extraction, the apparent mobility and potential metal

  14. Modeling and mining term association for improving biomedical information retrieval performance.

    Science.gov (United States)

    Hu, Qinmin; Huang, Jimmy Xiangji; Hu, Xiaohua

    2012-06-11

    The growth of the biomedical information requires most information retrieval systems to provide short and specific answers in response to complex user queries. Semantic information in the form of free text that is structured in a way makes it straightforward for humans to read but more difficult for computers to interpret automatically and search efficiently. One of the reasons is that most traditional information retrieval models assume terms are conditionally independent given a document/passage. Therefore, we are motivated to consider term associations within different contexts to help the models understand semantic information and use it for improving biomedical information retrieval performance. We propose a term association approach to discover term associations among the keywords from a query. The experiments are conducted on the TREC 2004-2007 Genomics data sets and the TREC 2004 HARD data set. The proposed approach is promising and achieves superiority over the baselines and the GSP results. The parameter settings and different indices are investigated that the sentence-based index produces the best results in terms of the document-level, the word-based index for the best results in terms of the passage-level and the paragraph-based index for the best results in terms of the passage2-level. Furthermore, the best term association results always come from the best baseline. The tuning number k in the proposed recursive re-ranking algorithm is discussed and locally optimized to be 10. First, modelling term association for improving biomedical information retrieval using factor analysis, is one of the major contributions in our work. Second, the experiments confirm that term association considering co-occurrence and dependency among the keywords can produce better results than the baselines treating the keywords independently. Third, the baselines are re-ranked according to the importance and reliance of latent factors behind term associations. These latent

  15. An investigation on natural radioactivity from mining industry ...

    African Journals Online (AJOL)

    Mining originating industries such as the coal industries, petroleum extraction and processing and natural gas, mining enrichment waste, phosphate, ...

  16. The Feature Extraction Based on Texture Image Information for Emotion Sensing in Speech

    Directory of Open Access Journals (Sweden)

    Kun-Ching Wang

    2014-09-01

    Full Text Available In this paper, we present a novel texture image feature for Emotion Sensing in Speech (ESS). This idea is based on the fact that the texture images carry emotion-related information. The feature extraction is derived from time-frequency representation of spectrogram images. First, we transform the spectrogram as a recognizable image. Next, we use a cubic curve to enhance the image contrast. Then, the texture image information (TII) derived from the spectrogram image can be extracted by using Laws’ masks to characterize emotional state. In order to evaluate the effectiveness of the proposed emotion recognition in different languages, we use two open emotional databases including the Berlin Emotional Speech Database (EMO-DB) and eNTERFACE corpus and one self-recorded database (KHUSC-EmoDB), to evaluate the performance cross-corpora. The results of the proposed ESS system are presented using support vector machine (SVM) as a classifier. Experimental results show that the proposed TII-based feature extraction inspired by visual perception can provide significant classification for ESS systems. The two-dimensional (2-D) TII feature can provide the discrimination between different emotions in visual expressions except for the conveyance pitch and formant tracks. In addition, the de-noising in 2-D images can be more easily completed than de-noising in 1-D speech.
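
As a rough illustration of the Laws' masks step named in the abstract above, the sketch below builds 5×5 masks as outer products of the standard 1-D Laws vectors and computes a single filter response on a small patch. The helper names and the sample patches are assumptions; the paper's actual spectrogram preprocessing and choice of masks are not given in the abstract.

```python
# Standard 1-D Laws vectors: Level, Edge, Spot, Wave, Ripple.
LAWS_VECTORS = {
    "L5": [1, 4, 6, 4, 1],
    "E5": [-1, -2, 0, 2, 1],
    "S5": [-1, 0, 2, 0, -1],
    "W5": [-1, 2, 0, -2, 1],
    "R5": [1, -4, 6, -4, 1],
}

def laws_mask(vertical, horizontal):
    """5x5 mask: outer product of a vertical and a horizontal Laws vector."""
    col = LAWS_VECTORS[vertical]
    row = LAWS_VECTORS[horizontal]
    return [[c * r for r in row] for c in col]

def mask_response(patch, mask):
    """Sum of element-wise products of a 5x5 patch and a 5x5 mask."""
    return sum(p * m
               for patch_row, mask_row in zip(patch, mask)
               for p, m in zip(patch_row, mask_row))

# Hypothetical 5x5 grey-level patches (not real spectrogram data).
flat_patch = [[7] * 5 for _ in range(5)]             # uniform region
edge_patch = [[0, 0, 0, 9, 9] for _ in range(5)]     # vertical edge

l5e5 = laws_mask("L5", "E5")  # smooths vertically, detects horizontal change
print(mask_response(flat_patch, l5e5))
print(mask_response(edge_patch, l5e5))
```

The uniform patch gives no response because the E5 vector sums to zero, while the patch with a vertical edge gives a strong one; texture-energy features are typically built by aggregating the magnitude of such responses over a window.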

  17. An Accurate Integral Method for Vibration Signal Based on Feature Information Extraction

    Directory of Open Access Journals (Sweden)

    Yong Zhu

    2015-01-01

    Full Text Available After summarizing the advantages and disadvantages of current integral methods, a novel vibration signal integral method based on feature information extraction was proposed. This method took full advantage of the self-adaptive filter characteristic and waveform correction feature of ensemble empirical mode decomposition in dealing with nonlinear and nonstationary signals. This research merged the superiorities of kurtosis, mean square error, energy, and singular value decomposition on signal feature extraction. The values of the four aforementioned indexes were combined into a feature vector. Then, the connotative characteristic components in the vibration signal were accurately extracted by Euclidean distance search, and the desired integral signals were precisely reconstructed. With this method, the interference of invalid signals such as trend items and noise, which plague traditional methods, is commendably solved. The great cumulative error from the traditional time-domain integral is effectively overcome. Moreover, the large low-frequency error from the traditional frequency-domain integral is successfully avoided. Compared with the traditional integral methods, this method is outstanding at removing noise and retaining useful feature information and shows higher accuracy and superiority.

  18. A cascade of classifiers for extracting medication information from discharge summaries

    Directory of Open Access Journals (Sweden)

    Halgrim Scott

    2011-07-01

    Full Text Available Background: Extracting medication information from clinical records has many potential applications, and recently published research, systems, and competitions reflect an interest therein. Much of the early extraction work involved rules and lexicons, but more recently machine learning has been applied to the task. Methods: We present a hybrid system consisting of two parts. The first part, field detection, uses a cascade of statistical classifiers to identify medication-related named entities. The second part uses simple heuristics to link those entities into medication events. Results: The system achieved performance that is comparable to other approaches to the same task. This performance is further improved by adding features that reference external medication name lists. Conclusions: This study demonstrates that our hybrid approach outperforms purely statistical or rule-based systems. The study also shows that a cascade of classifiers works better than a single classifier in extracting medication information. The system is available as is upon request from the first author.

  19. Three-dimensional information extraction from GaoFen-1 satellite images for landslide monitoring

    Science.gov (United States)

    Wang, Shixin; Yang, Baolin; Zhou, Yi; Wang, Futao; Zhang, Rui; Zhao, Qing

    2018-05-01

    To more efficiently use GaoFen-1 (GF-1) satellite images for landslide emergency monitoring, a Digital Surface Model (DSM) can be generated from GF-1 across-track stereo image pairs to build a terrain dataset. This study proposes a landslide 3D information extraction method based on the terrain changes of slope objects. The slope objects are formed by merging segmented image objects that have similar aspects, and the terrain changes are calculated from the post-disaster Digital Elevation Model (DEM) from GF-1 and the pre-disaster DEM from GDEM V2. A high mountain landslide that occurred in Wenchuan County, Sichuan Province is used to conduct a 3D information extraction test. The extracted total area of the landslide is 22.58 ha; the displaced earth volume is 652,100 m³; and the average sliding direction is 263.83°. Their accuracies are 0.89, 0.87 and 0.95, respectively. Thus, the proposed method expands the application of GF-1 satellite images to the field of landslide emergency monitoring.
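
The idea of deriving area and volume from the difference between a pre-disaster and a post-disaster DEM can be sketched as follows. This is a simplified per-cell differencing under stated assumptions: both grids are co-registered with square cells, and the slope-object merging described in the abstract is not modelled. The function name, threshold parameter, and sample grids are hypothetical.

```python
def terrain_change(pre_dem, post_dem, cell_size_m, threshold_m=0.0):
    """Return (changed area in hectares, displaced volume in cubic metres).

    A cell counts as changed when its absolute elevation difference
    exceeds threshold_m; the volume sums |dz| * cell area over those cells.
    """
    cell_area = cell_size_m ** 2
    changed_cells = 0
    volume = 0.0
    for pre_row, post_row in zip(pre_dem, post_dem):
        for pre, post in zip(pre_row, post_row):
            dz = post - pre
            if abs(dz) > threshold_m:
                changed_cells += 1
                volume += abs(dz) * cell_area
    area_ha = changed_cells * cell_area / 10_000.0
    return area_ha, volume

# Hypothetical 2x2 elevation grids in metres with 10 m cells.
pre = [[10.0, 10.0], [10.0, 10.0]]
post = [[10.0, 8.0], [12.0, 10.0]]
area_ha, volume_m3 = terrain_change(pre, post, cell_size_m=10.0)
print(area_ha, volume_m3)
```

In practice a non-zero threshold would be needed to suppress the vertical noise of the two DEM sources before any area or volume is reported.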

  20. Microarray data and gene expression statistics for Saccharomyces cerevisiae exposed to simulated asbestos mine drainage

    Directory of Open Access Journals (Sweden)

    Heather E. Driscoll

    2017-08-01

    Full Text Available Here we describe microarray expression data (raw and normalized), experimental metadata, and gene-level data with expression statistics from Saccharomyces cerevisiae exposed to simulated asbestos mine drainage from the Vermont Asbestos Group (VAG) Mine on Belvidere Mountain in northern Vermont, USA. For nearly 100 years (between the late 1890s and 1993), chrysotile asbestos fibers were extracted from serpentinized ultramafic rock at the VAG Mine for use in construction and manufacturing industries. Studies have shown that water courses and streambeds nearby have become contaminated with asbestos mine tailings runoff, including elevated levels of magnesium, nickel, chromium, and arsenic, elevated pH, and chrysotile asbestos-laden mine tailings, due to leaching and gradual erosion of massive piles of mine waste covering approximately 9 km². We exposed yeast to simulated VAG Mine tailings leachate to help gain insight on how eukaryotic cells exposed to VAG Mine drainage may respond in the mine environment. Affymetrix GeneChip® Yeast Genome 2.0 Arrays were utilized to assess gene expression after 24-h exposure to simulated VAG Mine tailings runoff. The chemistry of mine-tailings leachate, mine-tailings leachate plus yeast extract peptone dextrose media, and control yeast extract peptone dextrose media is also reported. To our knowledge this is the first dataset to assess global gene expression patterns in a eukaryotic model system simulating asbestos mine tailings runoff exposure. Raw and normalized gene expression data are accessible through the National Center for Biotechnology Information Gene Expression Omnibus (NCBI GEO) Database Series GSE89875 (https://www.ncbi.nlm.nih.gov/geo/query/acc.cgi?acc=GSE89875).

  1. Characterization of auriferous ores from the mining zone of San Pedro Frio (Bolivar-Colombia) to selection the extractive processes

    International Nuclear Information System (INIS)

    Yanez Traslavina, J J; Garcia Paez, I H; Pedraza Rosas, J E; Laverde Catano, D

    2005-01-01

    The beneficiation and treatment processes currently applied to auriferous minerals in the San Pedro Frio mining district are inefficient, with gold recoveries no greater than 40%. The present work summarizes the results of the analysis and characterization of auriferous minerals from this mining zone. This article aims to encourage the miners to pursue development in partnership with the university, since according to the results obtained during this research it is possible to achieve gold dissolution of up to 85% for the minerals of San Pedro Frio. This claim is supported by the characterization analyses of the mineral. The physicochemical characterization made it possible to estimate certain conditions of the metallurgical processes involved in a possible treatment plant: the most suitable solids content for handling the pulp was 40% by weight, and the high content of clay minerals might impede solid-liquid separation, among other operating specifications. According to the mineralogical and metallurgical characterization, the average size of the gold present in the mineral is below 75 μm, which indicates that agitation cyanidation would be the most suitable process. The diagnostic leaching gave very satisfactory results, since 94% of the gold in the samples occurs as free gold, which should allow short cyanidation times.

  2. Lane-Level Road Information Mining from Vehicle GPS Trajectories Based on Naïve Bayesian Classification

    Directory of Open Access Journals (Sweden)

    Luliang Tang

    2015-11-01

    Full Text Available In this paper, we propose a novel approach for mining lane-level road network information from low-precision vehicle GPS trajectories (MLIT), which includes the number and turn rules of traffic lanes based on naïve Bayesian classification. First, the proposed method (MLIT) uses an adaptive density optimization method to remove outliers from the raw GPS trajectories based on their space-time distribution and density clustering. Second, MLIT acquires the number of lanes in two steps. The first step establishes a naïve Bayesian classifier according to the trace features of the road plane and road profiles and the real number of lanes, as found in the training samples. The second step confirms the number of lanes using test samples in reference to the naïve Bayesian classifier using the known trace features of test sample. Third, MLIT infers the turn rules of each lane through tracking GPS trajectories. Experiments were conducted using the GPS trajectories of taxis in Wuhan, China. Compared with human-interpreted results, the automatically generated lane-level road network information was demonstrated to be of higher quality in terms of displaying detailed road networks with the number of lanes and turn rules of each lane.
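
The two-step use of a naïve Bayesian classifier described above (fit on training segments with known lane counts, then classify new segments) can be sketched with a small Gaussian naïve Bayes written from scratch. The single feature used here, a trace width in metres, and all sample values are hypothetical; the abstract does not list the actual trace features.

```python
import math
from collections import defaultdict

def fit_gaussian_nb(samples, labels):
    """Estimate a log prior and per-feature (mean, variance) for each class."""
    grouped = defaultdict(list)
    for x, y in zip(samples, labels):
        grouped[y].append(x)
    model = {}
    for y, rows in grouped.items():
        stats = []
        for column in zip(*rows):
            mean = sum(column) / len(column)
            # Small constant keeps the variance strictly positive.
            var = sum((v - mean) ** 2 for v in column) / len(column) + 1e-6
            stats.append((mean, var))
        model[y] = (math.log(len(rows) / len(samples)), stats)
    return model

def predict(model, x):
    """Return the class with the highest log posterior for feature tuple x."""
    def log_posterior(y):
        log_prior, stats = model[y]
        log_likelihood = sum(
            -0.5 * math.log(2 * math.pi * var) - (v - mean) ** 2 / (2 * var)
            for v, (mean, var) in zip(x, stats)
        )
        return log_prior + log_likelihood
    return max(model, key=log_posterior)

# Hypothetical training data: (trace width in metres,) -> number of lanes.
train_x = [(3.4,), (3.8,), (3.6,), (7.0,), (7.4,), (7.2,), (10.6,), (11.0,), (10.8,)]
train_y = [1, 1, 1, 2, 2, 2, 3, 3, 3]
lane_model = fit_gaussian_nb(train_x, train_y)
print(predict(lane_model, (7.1,)))
```

The "naïve" part is the assumption that features are independent given the class, which is why the per-feature log likelihoods are simply summed.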

  3. Exploiting Structure and Conventions of Movie Scripts for Information Retrieval and Text Mining

    DEFF Research Database (Denmark)

    Jhala, Arnav

    2008-01-01

    Movie scripts are documents that describe the story, stage direction for actors and camera, and dialogue. Script writers, directors, and cinematographers have standardized the format and language that is used in script writing. Scripts contain a wealth of information about narrative patterns, cha...

  4. Availability analysis of selected mining machinery

    Directory of Open Access Journals (Sweden)

    Brodny Jarosław

    2017-06-01

    Full Text Available Underground extraction of coal is characterized by high variability of the mining and geological conditions in which it is conducted. Despite ever more effective methods and tools used to identify the factors influencing this process, mining machinery used underground works in difficult and not always foreseeable conditions, which means that these machines should be very universal and reliable. Additionally, strong competition on the coal market makes it necessary to take action to reduce the cost of production, e.g. by increasing the efficiency of machine utilization. To meet this objective, the analysis presented in this paper should be carried out. The analysis concerns the availability of selected mining machinery, conducted using the OEE model, which is a tool for quantitative evaluation of the TPM strategy. In this article we considered the machines that form part of the mechanized longwall complex, and the basis of the analysis was the data recorded by the industrial automation system. Using this data set we evaluated the availability of the studied machines and the structure of registered breaks in their work. The results should be an important source of information for maintenance staff and management of mining plants, needed to improve the economic efficiency of underground mining.
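
A minimal sketch of the availability component of OEE and of the structure of registered breaks follows. It assumes the usual definition, availability = (planned time − downtime) / planned time; the function names, shift length, and break records are hypothetical and not taken from the paper's data.

```python
from collections import defaultdict

def availability(planned_time_min, downtime_events):
    """Share of planned time in which the machine was actually running."""
    downtime = sum(minutes for _, minutes in downtime_events)
    return (planned_time_min - downtime) / planned_time_min

def downtime_structure(downtime_events):
    """Fraction of total downtime attributed to each recorded cause."""
    per_cause = defaultdict(float)
    for cause, minutes in downtime_events:
        per_cause[cause] += minutes
    total = sum(per_cause.values())
    return {cause: minutes / total for cause, minutes in per_cause.items()}

# Hypothetical break records for one 480-minute shift: (cause, minutes).
events = [("conveyor jam", 30), ("power supply", 18), ("conveyor jam", 12)]
shift_availability = availability(480, events)
shares = downtime_structure(events)
print(shift_availability, shares)
```

Full OEE would multiply this availability by performance and quality factors, which the abstract does not discuss.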

  5. A mine of information: can sports analytics provide wisdom from your data?

    OpenAIRE

    Passfield, Louis; Hopker, James G.

    2017-01-01

    This paper explores the notion that the availability and analysis of large datasets has the capacity to improve practice and change the nature of science in the sport and exercise setting. The increasing use of data and information technology in sport is giving rise to this change. Websites hold large data repositories and the development of wearable technology, mobile phone applications and related instruments for monitoring physical activity, training and competition, provide large data set...

  6. SPICE: A Geometry Information System Supporting Planetary Mapping, Remote Sensing and Data Mining

    Science.gov (United States)

    Acton, C.; Bachman, N.; Semenov, B.; Wright, E.

    2013-01-01

    SPICE is an information system providing space scientists ready access to a wide assortment of space geometry useful in planning science observations and analyzing the instrument data returned therefrom. The system includes software used to compute many derived parameters such as altitude, LAT/LON and lighting angles, and software able to find when user-specified geometric conditions are obtained. While not a formal standard, it has achieved widespread use in the worldwide planetary science community

  7. THE EXTRACTION OF INDOOR BUILDING INFORMATION FROM BIM TO OGC INDOORGML

    Directory of Open Access Journals (Sweden)

    T.-A. Teo

    2017-07-01

    Full Text Available Indoor Spatial Data Infrastructure (indoor-SDI) is an important SDI for geospatial analysis and location-based services. Building Information Model (BIM) has high degree of details in geometric and semantic information for building. This study proposed direct conversion schemes to extract indoor building information from BIM to OGC IndoorGML. The major steps of the research include (1) topological conversion from building model into indoor network model; and (2) generation of IndoorGML. The topological conversion is a major process of generating and mapping nodes and edges from IFC to indoorGML. Node represents every space (e.g. IfcSpace) and objects (e.g. IfcDoor) in the building while edge shows the relationships between nodes. According to the definition of IndoorGML, the topological model in the dual space is also represented as a set of nodes and edges. These definitions of IndoorGML are the same as in the indoor network. Therefore, we can extract the necessary data in the indoor network and easily convert them into IndoorGML based on IndoorGML Schema. The experiment utilized a real BIM model to examine the proposed method. The experimental results indicated that the 3D indoor model (i.e. IndoorGML model) can be automatically imported from IFC model by the proposed procedure. In addition, the geometric and attribute of building elements are completely and correctly converted from BIM to indoor-SDI.
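
The node-and-edge mapping described above can be sketched without any IFC library. The input here is a hand-written stand-in for data that would be read from an IFC file: a list of space identifiers and a mapping from each door to the two spaces it connects. The function name and all identifiers are hypothetical, and serialization to the IndoorGML schema is not shown.

```python
def build_indoor_graph(spaces, doors):
    """Build an indoor network: one node per space and per door,
    and one edge from each door to each of the spaces it connects."""
    nodes = {space_id: "IfcSpace" for space_id in spaces}
    edges = []
    for door_id, (space_a, space_b) in doors.items():
        nodes[door_id] = "IfcDoor"
        edges.append((space_a, door_id))
        edges.append((door_id, space_b))
    return nodes, edges

# Hypothetical building fragment: two rooms opening onto one corridor.
spaces = ["room_101", "room_102", "corridor"]
doors = {
    "door_1": ("room_101", "corridor"),
    "door_2": ("room_102", "corridor"),
}
nodes, edges = build_indoor_graph(spaces, doors)
print(nodes)
print(edges)
```

Treating doors as nodes rather than as edge attributes follows the abstract's statement that nodes represent both spaces and objects; a route between two rooms is then a path through the corresponding door nodes.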

  8. Methods from Information Extraction from LIDAR Intensity Data and Multispectral LIDAR Technology

    Science.gov (United States)

    Scaioni, M.; Höfle, B.; Baungarten Kersting, A. P.; Barazzetti, L.; Previtali, M.; Wujanz, D.

    2018-04-01

    LiDAR is a consolidated technology for topographic mapping and 3D reconstruction, which is implemented in several platforms. On the other hand, the exploitation of the geometric information has been coupled by the use of laser intensity, which may provide additional data for multiple purposes. This option has been emphasized by the availability of sensors working on different wavelengths, thus able to provide additional information for classification of surfaces and objects. Several applications of monochromatic and multi-spectral LiDAR data have been already developed in different fields: geosciences, agriculture, forestry, building and cultural heritage. The use of intensity data to extract measures of point cloud quality has been also developed. The paper would like to give an overview on the state-of-the-art of these techniques, and to present the modern technologies for the acquisition of multispectral LiDAR data. In addition, the ISPRS WG III/5 on 'Information Extraction from LiDAR Intensity Data' has collected and made available a few open data sets to support scholars to do research on this field. This service is presented and the data sets delivered so far are described.

  9. The effect of informed consent on stress levels associated with extraction of impacted mandibular third molars.

    Science.gov (United States)

    Casap, Nardy; Alterman, Michael; Sharon, Guy; Samuni, Yuval

    2008-05-01

    To evaluate the effect of informed consent on stress levels associated with removal of impacted mandibular third molars. A total of 60 patients scheduled for extraction of impacted mandibular third molars participated in this study. The patients were unaware of the study's objectives. Data from 20 patients established the baseline levels of electrodermal activity (EDA). The remaining 40 patients were randomly assigned into 2 equal groups receiving either a detailed document of informed consent, disclosing the possible risks involved with the surgery, or a simplified version. Pulse, blood pressure, and EDA were monitored before, during, and after completion of the consent document. Changes in EDA, but not in blood pressure, were measured on completion of either version of the consent document. A greater increase in EDA was associated with the detailed version of the consent document (P = .004). A similar concomitant increase (although nonsignificant) in pulse values was monitored on completion of both versions. Completion of overdisclosed document of informed consent is associated with changes in physiological parameters. The results suggest that overdetailed listing and disclosure before extraction of impacted mandibular third molars can increase patient stress.

  10. METHODS FROM INFORMATION EXTRACTION FROM LIDAR INTENSITY DATA AND MULTISPECTRAL LIDAR TECHNOLOGY

    Directory of Open Access Journals (Sweden)

    M. Scaioni

    2018-04-01

    Full Text Available LiDAR is a consolidated technology for topographic mapping and 3D reconstruction, which is implemented in several platforms. On the other hand, the exploitation of the geometric information has been coupled with the use of laser intensity, which may provide additional data for multiple purposes. This option has been emphasized by the availability of sensors working on different wavelengths, thus able to provide additional information for classification of surfaces and objects. Several applications of monochromatic and multi-spectral LiDAR data have already been developed in different fields: geosciences, agriculture, forestry, building and cultural heritage. The use of intensity data to extract measures of point cloud quality has also been developed. The paper gives an overview of the state-of-the-art of these techniques and presents the modern technologies for the acquisition of multispectral LiDAR data. In addition, the ISPRS WG III/5 on ‘Information Extraction from LiDAR Intensity Data’ has collected and made available a few open data sets to support scholars doing research in this field. This service is presented and the data sets delivered so far are described.

  11. About increasing informativity of diagnostic system of asynchronous electric motor by extracting additional information from values of consumed current parameter

    Science.gov (United States)

    Zhukovskiy, Y.; Korolev, N.; Koteleva, N.

    2018-05-01

    This article is devoted to expanding the possibilities of assessing the technical state of asynchronous electric drives from their current consumption, as well as increasing the information capacity of diagnostic methods, in conditions of limited access to equipment and incompleteness of information. The method of spectral analysis of the electric drive current can be supplemented by an analysis of the components of the Park's vector of the current. The evolution of the hodograph at the moment of appearance and development of defects was studied using the example of current asymmetry in the phases of an induction motor. The result of the study is a set of new diagnostic parameters of the asynchronous electric drive. During the research, it was proved that the proposed diagnostic parameters allow determining the type and level of the defect. At the same time, there is no need to stop the equipment and take it out of service for repair. Modern digital control and monitoring systems can use the proposed parameters based on the stator current of an electrical machine to improve the accuracy and reliability of obtaining diagnostic patterns and predicting their changes in order to improve the equipment maintenance systems. This approach can also be used in systems and objects where there are significant parasitic vibrations and unsteady loads. The extraction of useful information can be carried out in electric drive systems whose structure includes a power electric converter.
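
    As a rough illustration of the Park's vector analysis mentioned in this abstract, the sketch below computes the two Park's vector components from three phase currents using the standard transformation and traces the hodograph over one electrical period. The abstract does not state the paper's actual diagnostic parameters; the "modulus spread" used here is an assumption chosen only to show that balanced currents trace a circle while phase asymmetry distorts it.

```python
import math


def park_vector(ia, ib, ic):
    """Standard Park's vector components (i_d, i_q) of three phase currents."""
    i_d = math.sqrt(2.0 / 3.0) * ia - ib / math.sqrt(6.0) - ic / math.sqrt(6.0)
    i_q = (ib - ic) / math.sqrt(2.0)
    return i_d, i_q


def hodograph(amplitudes, samples=360):
    """Trace the Park's vector over one electrical period.

    amplitudes: per-phase current amplitudes (A, B, C); unequal values
    model a current asymmetry between phases.
    """
    points = []
    for k in range(samples):
        theta = 2.0 * math.pi * k / samples
        ia = amplitudes[0] * math.cos(theta)
        ib = amplitudes[1] * math.cos(theta - 2.0 * math.pi / 3.0)
        ic = amplitudes[2] * math.cos(theta + 2.0 * math.pi / 3.0)
        points.append(park_vector(ia, ib, ic))
    return points


def modulus_spread(points):
    """Illustrative (assumed) indicator: max minus min Park's vector modulus.
    Close to zero for a circular hodograph, larger for a distorted one."""
    moduli = [math.hypot(d, q) for d, q in points]
    return max(moduli) - min(moduli)


healthy = modulus_spread(hodograph((1.0, 1.0, 1.0)))
asymmetric = modulus_spread(hodograph((1.0, 0.8, 1.0)))
print(healthy, asymmetric)
```

    In this toy setup the phase currents are ideal sinusoids; measured stator currents would also carry harmonics and noise, which is where the spectral analysis mentioned in the abstract comes in.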

  12. Multi-Paradigm and Multi-Lingual Information Extraction as Support for Medical Web Labelling Authorities

    Directory of Open Access Journals (Sweden)

    Martin Labsky

    2010-10-01

    Full Text Available Until recently, quality labelling of medical web content has been a predominantly manual activity. However, advances in automated text processing have opened the way to computerised support of this activity. The core enabling technology is information extraction (IE). However, the heterogeneity of websites offering medical content imposes particular requirements on the IE techniques to be applied. In the paper we discuss these requirements and describe a multi-paradigm approach to IE addressing them. Experiments on multi-lingual data are reported. The research has been carried out within the EU MedIEQ project.

  13. Small scale gold mining in Brazil and Suriname: the troubles of cultural rules, legal regulations and politics of access : In the ENV - Panel Artisanal and small scale mining in Latin America: challenges for reshaping extractive governance

    NARCIS (Netherlands)

    de Theije, Marjo

    2017-01-01

    Suriname and Brazil have very different politics in relation to small scale gold mining. Nevertheless, at the same time we observe a number of similarities in the gold mining practices of both Amazonian countries. In this paper we will identify a number of reasons contributing to the commonalities

  14. 21 Recipes for Mining Twitter

    CERN Document Server

    Russell, Matthew

    2011-01-01

    Millions of public Twitter streams harbor a wealth of data, and once you mine them, you can gain some valuable insights. This short and concise book offers a collection of recipes to help you extract nuggets of Twitter information using easy-to-learn Python tools. Each recipe offers a discussion of how and why the solution works, so you can quickly adapt it to fit your particular needs. The recipes include techniques to: use OAuth to access Twitter data; create and analyze graphs of retweet relationships; use the streaming API to harvest tweets in realtime; harvest and analyze friends and followers.

  15. Trust Mines

    Science.gov (United States)

    The United States and the Navajo Nation entered into settlement agreements that provide funds to conduct investigations and any needed cleanup at 16 of the 46 priority mines, including six mines in the Northern Abandoned Uranium Mine Region.

  16. Novel ion-imprinted polymer coated on nanoporous silica as a highly selective sorbent for the extraction of ultratrace quantities of gold ions from mine stone samples

    International Nuclear Information System (INIS)

    Ebrahimzadeh, H.; Moazzen, E.; Amini, M.; Sadeghi, O.

    2013-01-01

    We have developed a gold ion-imprinted polymer (GIP) by incorporating a dipyridyl ligand into an ethylene glycol dimethacrylate matrix which was then coated onto porous silica particles. The material was used for the selective extraction of ultratrace quantities of gold ion from mine stones, followed by its quantitation by FAAS. The effects of concentration and volume of eluent, pH of the solution, flow rates of sample and eluent, and potentially interfering ions, especially palladium and platinum, were investigated. The limit of detection is -1 , the precision (RSD%) is 1.03 %, and recoveries are >99 %. In order to show the high selectivity and efficiency of the new sorbent, the results were compared to those obtained with simpler sorbents possessing the same functional groups. The accuracy of the method was demonstrated by the accurate determination of gold ions in a certified reference material. To the best of our knowledge, there is no report so far on an imprint for gold ions that has such a selectivity over Pd(II) and Pt(II) ions. (author)

  17. A Mine of Information: Can Sports Analytics Provide Wisdom From Your Data?

    Science.gov (United States)

    Passfield, Louis; Hopker, James G

    2017-08-01

    This paper explores the notion that the availability and analysis of large data sets have the capacity to improve practice and change the nature of science in the sport and exercise setting. The increasing use of data and information technology in sport is giving rise to this change. Web sites hold large data repositories, and the development of wearable technology, mobile phone applications, and related instruments for monitoring physical activity, training, and competition provide large data sets of extensive and detailed measurements. Innovative approaches conceived to more fully exploit these large data sets could provide a basis for more objective evaluation of coaching strategies and new approaches to how science is conducted. An emerging discipline, sports analytics, could help overcome some of the challenges involved in obtaining knowledge and wisdom from these large data sets. Examples of where large data sets have been analyzed, to evaluate the career development of elite cyclists and to characterize and optimize the training load of well-trained runners, are discussed. Careful verification of large data sets is time consuming and imperative before useful conclusions can be drawn. Consequently, it is recommended that prospective studies be preferred over retrospective analyses of data. It is concluded that rigorous analysis of large data sets could enhance our knowledge in the sport and exercise sciences, inform competitive strategies, and allow innovative new research and findings.

  18. Sequential chemical extraction of heavy metals in a study of the chemical alteration of mine tailings at Ticapampa (Huaraz, Peru); Extraccion quimica secuencial de metales pesados en el estudio de alteracion quimica de relaves de mina en Ticapampa (Huaraz, Peru)

    Energy Technology Data Exchange (ETDEWEB)

    Jara Facundo, M. A.

    2011-07-01

    The upper reaches of the Rio Santa (Huaraz, Peru) are highly affected by the mining activities of generally small and very small mining companies located in two specific areas, Cordillera Blanca, and Cordillera Negra, with the largest mining claims located in the districts of Recuay and Ticapampa. To assess the mobility and bioavailability of heavy metals in the abandoned tailings pond belonging to the Alianza mining company in the district of Ticapampa, and to identify the fractions to which they are associated we applied a sequential chemical extraction. The results were compared with studies into their mineralogical characterization, a quantitative chemical analysis and a determination of potential acidity and potential neutralization by the ABA (acid-base accounting) method applied to samples of tailings. The sequential extraction procedure confirmed the mode of general alteration observed in the area through mineralogical studies: a relatively easy mobility of Pb, and Cd, and considerable immobility with regard to Ag, Cr and Co, as well as an intermediate mobility of Cu, Zn, and As. Significant cadmium and lead contents found in the most mobile fractions of the tailings may represent an environmental threat, bearing in mind the toxic nature of these elements. Despite the low mobility of arsenic, the total quantities of this element are so high that the waters of the Rio Santa are being affected. (Author) 22 refs.

  19. Sentiment topic mining based on comment tags

    Science.gov (United States)

    Zhang, Daohai; Liu, Xue; Li, Juan; Fan, Mingyue

    2018-03-01

    With the development of e-commerce, various tag-based comments are generated, and how to extract valuable information from these comment tags has become an important part of business management decisions. This study takes HUAWEI mobile phone tags as an example, using sentiment analysis and the LDA topic mining method. The first step is data preprocessing and topic mining to classify the comment tags. The comment tags are then classified by sentiment. Finally, the comments are mined again to analyze the distribution of emotional themes under the different sentiment classes. The results show that the HUAWEI mobile phone offers a good user experience in terms of fluency, cost performance, appearance, etc. Meanwhile, more attention should be paid to independent research and development, product design and development. In addition, battery and speed performance should be enhanced.

  20. Technologies for Decreasing Mining Losses

    Science.gov (United States)

    Valgma, Ingo; Väizene, Vivika; Kolats, Margit; Saarnak, Martin

    2013-12-01

    In the case of stratified deposits such as the oil shale deposit in Estonia, mining losses depend on mining technologies. Current research focuses on extraction and separation possibilities of mineral resources. Selective mining, selective crushing and separation tests have been performed, showing possibilities of decreasing mining losses. Rock crushing and screening process simulations were used for optimizing rock fractions. In addition, mine backfilling, fine separation, and optimized drilling and blasting have been analyzed. All tested methods show potential and depend on mineral usage. Usage in turn depends on the utilization technology. Questions such as the stability of the material flow and the influence of quality fluctuations on the final yield are raised.

  1. Accurate facade feature extraction method for buildings from three-dimensional point cloud data considering structural information

    Science.gov (United States)

    Wang, Yongzhi; Ma, Yuqing; Zhu, A.-xing; Zhao, Hui; Liao, Lixia

    2018-05-01

    Facade features represent segmentations of building surfaces and can serve as a building framework. Extracting facade features from three-dimensional (3D) point cloud data (3D PCD) is an efficient method for 3D building modeling. By combining the advantages of 3D PCD and two-dimensional optical images, this study describes the creation of a highly accurate building facade feature extraction method from 3D PCD with a focus on structural information. The new extraction method involves three major steps: image feature extraction, exploration of the mapping method between the image features and 3D PCD, and optimization of the initial 3D PCD facade features considering structural information. Results show that the new method can extract the 3D PCD facade features of buildings more accurately and continuously. The new method is validated using a case study. In addition, the effectiveness of the new method is demonstrated by comparing it with the range image-extraction method and the optical image-extraction method in the absence of structural information. The 3D PCD facade features extracted by the new method can be applied in many fields, such as 3D building modeling and building information modeling.

  2. An Unsupervised Opinion Mining Approach for Japanese Weblog Reputation Information Using an Improved SO-PMI Algorithm

    Science.gov (United States)

    Wang, Guangwei; Araki, Kenji

    In this paper, we propose an improved SO-PMI (Semantic Orientation Using Pointwise Mutual Information) algorithm for use in Japanese Weblog Opinion Mining. SO-PMI is an unsupervised approach proposed by Turney that has been shown to work well for English. When this algorithm was applied naively to Japanese, most phrases, whether positive or negative in meaning, received a negative SO. To deal with this slanting phenomenon, we propose three improvements: expanding the reference words to sets of words, introducing a balancing factor, and detecting neutral expressions. In our experiments, the proposed improvements obtained a well-balanced result: both positive and negative accuracy exceeded 62% when evaluated on 1,200 opinion sentences sampled from three different domains (reviews of Electronic Products, Cars and Travels from Kakaku.com). In a comparative experiment on the same corpus, a supervised approach (SA-Demo) achieved a very similar accuracy to our method. This shows that our proposed approach effectively adapted SO-PMI for Japanese, and it also shows the generality of SO-PMI.
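
    A minimal Python sketch of SO-PMI scoring may help make the three improvements concrete. It follows the general form of Turney's measure (the log ratio of co-occurrence with positive versus negative reference words). The way the balancing factor and the neutral band enter the score below is an assumption made for illustration, since the abstract does not give the paper's exact formulation, and all counts are hypothetical.

```python
import math


def so_pmi(near_pos, near_neg, hits_pos, hits_neg, balance=1.0, smoothing=0.01):
    """Semantic orientation of a phrase from co-occurrence counts.

    near_pos / near_neg: co-occurrences of the phrase with the positive /
        negative reference word set (summed over all words in the set).
    hits_pos / hits_neg: total occurrences of the positive / negative set.
    balance: assumed multiplicative balancing factor that shifts the score
        to counter a systematic bias towards one polarity.
    smoothing: small constant that avoids division by zero.
    """
    ratio = ((near_pos + smoothing) * hits_neg) / ((near_neg + smoothing) * hits_pos)
    return math.log2(ratio) + math.log2(balance)


def classify(score, neutral_band=0.5):
    """Label a score; scores inside the (assumed) band count as neutral."""
    if abs(score) < neutral_band:
        return "neutral"
    return "positive" if score > 0 else "negative"


# Hypothetical counts for three phrases.
print(classify(so_pmi(40, 10, 100, 100)))   # strongly co-occurs with positive set
print(classify(so_pmi(10, 40, 100, 100)))   # strongly co-occurs with negative set
print(classify(so_pmi(10, 10, 100, 100)))   # no preference either way
```

    With `balance` greater than 1, every score is shifted upwards by the same amount, which is one simple way to offset a corpus in which most phrases would otherwise come out negative.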

  3. Analyzing Patterns of Community Interest at a Legacy Mining Waste Site to Assess and Inform Environmental Health Literacy Efforts

    Science.gov (United States)

    Ramirez-Andreotta, Monica D.; Lothrop, Nathan; Wilkinson, Sarah T.; Root, Robert A.; Artiola, Janick F.; Klimecki, Walter; Loh, Miranda

    2015-01-01

    Understanding a community’s concerns and informational needs is crucial to conducting and improving environmental health research and literacy initiatives. We hypothesized that analysis of community inquiries over time at a legacy mining site would be an effective method for assessing environmental health literacy efforts and determining whether community concerns were thoroughly addressed. Through a qualitative analysis, we determined community concerns at the time of being listed as a Superfund site. We analyzed how community concerns changed from this starting point over the subsequent years, and whether: 1) communication materials produced by the USEPA and other media were aligned with community concerns; and 2) these changes demonstrated a progression of the community’s understanding resulting from community involvement and engaged research efforts. We observed that when the Superfund site was first listed, community members were most concerned with USEPA management, remediation, site-specific issues, health effects, and environmental monitoring efforts related to air/dust and water. Over the next five years, community inquiries shifted significantly to include exposure assessment and reduction methods and issues unrelated to the site, particularly the local public water supply and home water treatment systems. Such documentation of community inquiries over time at contaminated sites is a novel method to assess environmental health literacy efforts and determine whether community concerns were thoroughly addressed. PMID:27595054

  4. Identification of Social and Environmental Conflicts Resulting from Open-Cast Mining

    Science.gov (United States)

    Górniak-Zimroz, Justyna; Pactwa, Katarzyna

    2016-10-01

    Open-cast mining is related to interference in the natural environment. It also affects human health and quality of life. This influence is, among others, dependent on the type of extracted materials, size of deposit, methods of mining and mineral processing, as well as, equally important, sensitivity of the environment within which mining is planned. The negative effects of mining include deformations of land surface or contamination of soils, air and water. What is more, in many cases, mining for minerals leads to clearing of housing and transport infrastructures located within the mining area, a decrease in values of the properties in the immediate vicinity of a deposit, and an increase in stress levels in local residents exposed to noise. The awareness of negative consequences of taking up open-cast mining activity leads to conflicts between a mining entrepreneur and self-government authorities, society or nongovernment organisations. The article attempts to identify potential social and environmental conflicts that may occur in relation to a planned mining activity. The results of the analyses were interpreted with respect to the deposits which were or have been mined. That enabled one to determine which facilities exclude mineral mining and which allow it. The research took the non-energy mineral resources into consideration which are included in the group of solid minerals located in one of the districts of Lower Silesian Province (SW Poland). The spatial analyses used the tools available in the geographical information systems

  5. Developing an Approach to Prioritize River Restoration using Data Extracted from Flood Risk Information System Databases.

    Science.gov (United States)

    Vimal, S.; Tarboton, D. G.; Band, L. E.; Duncan, J. M.; Lovette, J. P.; Corzo, G.; Miles, B.

    2015-12-01

    Prioritizing river restoration requires information on river geometry. In many states in the US detailed river geometry has been collected for floodplain mapping and is available in Flood Risk Information Systems (FRIS). In particular, North Carolina has, for its 100 Counties, developed a database of numerous HEC-RAS models which are available through its Flood Risk Information System (FRIS). These models that include over 260 variables were developed and updated by numerous contractors. They contain detailed surveyed or LiDAR derived cross-sections and modeled flood extents for different extreme event return periods. In this work, over 4700 HEC-RAS models' data was integrated and upscaled to utilize detailed cross-section information and 100-year modelled flood extent information to enable river restoration prioritization for the entire state of North Carolina. We developed procedures to extract geomorphic properties such as entrenchment ratio, incision ratio, etc. from these models. Entrenchment ratio quantifies the vertical containment of rivers and thereby their vulnerability to flooding and incision ratio quantifies the depth per unit width. A map of entrenchment ratio for the whole state was derived by linking these model results to a geodatabase. A ranking of highly entrenched counties enabling prioritization for flood allowance and mitigation was obtained. The results were shared through HydroShare and web maps developed for their visualization using Google Maps Engine API.

  6. Extracting Low-Frequency Information from Time Attenuation in Elastic Waveform Inversion

    Science.gov (United States)

    Guo, Xuebao; Liu, Hong; Shi, Ying; Wang, Weihong

    2017-03-01

    Low-frequency information is crucial for recovering background velocity, but the lack of low-frequency information in field data makes inversion impractical without accurate initial models. Laplace-Fourier domain waveform inversion can recover a smooth model from real data without low-frequency information, which can be used for subsequent inversion as an ideal starting model. In general, it also starts with low frequencies and includes higher frequencies at later inversion stages; the difference is that its ultralow frequency information comes from the Laplace-Fourier domain. Meanwhile, a direct implementation of the Laplace-transformed wavefield using frequency domain inversion is also very convenient. However, because broad frequency bands are often used in pure time domain waveform inversion, it is difficult to extract the wavefields dominated by low frequencies in this case. In this paper, low-frequency components are constructed by introducing time attenuation into the recorded residuals, and the rest of the method is identical to the traditional time domain inversion. Time windowing and frequency filtering are also applied to mitigate the ambiguity of the inverse problem. Therefore, we can start at low frequencies and move to higher frequencies. The experiment shows that the proposed method can achieve a good inversion result in the presence of a linear initial model and records without low-frequency information.
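
    The idea of creating low-frequency content by attenuating a trace in time can be illustrated with a small Python sketch. It assumes the attenuation is an exponential damping exp(-sigma * t), as in Laplace-Fourier approaches; the abstract does not spell out the exact form used in the paper, and the signal and parameter values below are made up for the example.

```python
import math


def damp_trace(trace, dt, sigma):
    """Multiply each sample by exp(-sigma * t), with t = index * dt."""
    return [x * math.exp(-sigma * i * dt) for i, x in enumerate(trace)]


def zero_frequency_component(trace, dt):
    """Approximate the zero-frequency (DC) Fourier component as sum(x) * dt."""
    return sum(trace) * dt


dt = 0.001                      # sample interval in seconds (assumed)
n = 1000                        # one second of data
freq = 10.0                     # a 10 Hz sinusoid with no DC content
trace = [math.sin(2.0 * math.pi * freq * i * dt) for i in range(n)]

dc_before = zero_frequency_component(trace, dt)
dc_after = zero_frequency_component(damp_trace(trace, dt, sigma=5.0), dt)
print(dc_before, dc_after)
```

    The undamped sinusoid has essentially no zero-frequency energy, while the damped version does, which is the sense in which time attenuation introduces usable low-frequency information into band-limited residuals.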

  7. Audio-Visual Speech Recognition Using Lip Information Extracted from Side-Face Images

    Directory of Open Access Journals (Sweden)

    Koji Iwano

    2007-03-01

    Full Text Available This paper proposes an audio-visual speech recognition method using lip information extracted from side-face images as an attempt to increase noise robustness in mobile environments. Our proposed method assumes that lip images can be captured using a small camera installed in a handset. Two different kinds of lip features, lip-contour geometric features and lip-motion velocity features, are used individually or jointly, in combination with audio features. Phoneme HMMs modeling the audio and visual features are built based on the multistream HMM technique. Experiments conducted using Japanese connected digit speech contaminated with white noise in various SNR conditions show effectiveness of the proposed method. Recognition accuracy is improved by using the visual information in all SNR conditions. These visual features were confirmed to be effective even when the audio HMM was adapted to noise by the MLLR method.

  8. Approaching the largest ‘API’: extracting information from the Internet with Python

    Directory of Open Access Journals (Sweden)

    Jonathan E. Germann

    2018-02-01

    Full Text Available This article explores the need for libraries to algorithmically access and manipulate the world’s largest API: the Internet. The billions of pages on the ‘Internet API’ (HTTP, HTML, CSS, XPath, DOM, etc.) are easily accessible and manipulable. Libraries can assist in creating meaning through the datafication of information on the world wide web. Because most information is created for human consumption, some programming is required for automated extraction. Python is an easy-to-learn programming language with extensive packages and community support for web page automation. Four packages (Urllib, Selenium, BeautifulSoup, Scrapy) in Python can automate almost any web page for projects of all sizes. An example warrant data project is explained to illustrate how well Python packages can manipulate web pages to create meaning through assembling custom datasets.
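
    As a small illustration of the kind of extraction this abstract describes, the sketch below pulls link targets out of an HTML page using only the Python standard library. Note the substitution: the standard-library `html.parser` module stands in for BeautifulSoup so that the example has no third-party dependencies, and `urllib.request` is the only one of the four named packages that appears. The sample HTML and its `/warrants/...` paths are invented for the example and are not taken from the article's warrant data project.

```python
from html.parser import HTMLParser
from urllib.request import urlopen


class LinkCollector(HTMLParser):
    """Collect the href attribute of every <a> tag fed to the parser."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            href = dict(attrs).get("href")
            if href:
                self.links.append(href)


def extract_links(html):
    """Return all link targets found in an HTML string, in document order."""
    parser = LinkCollector()
    parser.feed(html)
    return parser.links


def fetch(url):
    """Download a page as text (not called below, to avoid network access)."""
    with urlopen(url) as response:
        return response.read().decode("utf-8", errors="replace")


# Invented sample page standing in for a real listing.
SAMPLE = (
    "<html><body>"
    '<a href="/warrants/1">One</a>'
    '<a href="/warrants/2">Two</a>'
    "<a>No target</a>"
    "</body></html>"
)

print(extract_links(SAMPLE))
```

    For a live page, `extract_links(fetch(url))` would combine the two steps; pages that build their content with JavaScript generally need a browser-driving tool such as Selenium instead.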

  9. Soil-characterization and soil-amendment use on coal surface mine lands: An annotated bibliography. Information Circular/1991

    International Nuclear Information System (INIS)

    Norland, M.R.; Veith, D.L.

    1991-01-01

    The U.S. Bureau of Mines Report on United States and Canadian Literature pertaining to soil characterization and the use of soil amendments as a part of the reclamation process of coal surface-mined lands contains 1,280 references. The references were published during the 1977 to 1988 period. Each reference is evaluated by keywords, providing the reader with a means of rapidly sorting through the references to locate those articles with the coal mining regions and subjects of interest. All references are annotated

  10. Environmental stewardship for gold mining in tropical regions

    Directory of Open Access Journals (Sweden)

    A Isahak

    2013-10-01

    Full Text Available Mining has gained strong popularity in recent years due to the increase in global demand for metals and other industrial raw materials derived from the ground. However, information and good governance regarding activities related to mining are still very much lacking, especially in underdeveloped and developing countries in the tropics. In Malaysia, the importance of environmental stewardship in mining is a new phenomenon. The new National Mineral Policy 2 calls for compliance with existing standards and guidelines, stresses progressive and post-mining rehabilitation, and promotes the gathering and dissemination of information, best mining practices, public disclosure and corporate social responsibility. Our preliminary studies, however, have shown that its implementation may have been hampered by inadequate legal and administrative structures, lack of freedom of information, physical inaccessibility, lack of information and lack of public participation. In this presentation, the above issues and measures to reduce the impact of mining, particularly that of gold, on the environment are discussed, with a special focus on Malaysia. These measures include alternative gold extraction methods, appropriate tailing dam construction and management, health risk assessment and risk management, compliance with the Cyanide Code and liberalization of access to information, facilitation of access to justice, the strengthening of legal and administrative structures as well as corporate accountability to the public as part of corporate social responsibility.

  11. Geopositioning with a quadcopter: Extracted feature locations and predicted accuracy without a priori sensor attitude information

    Science.gov (United States)

    Dolloff, John; Hottel, Bryant; Edwards, David; Theiss, Henry; Braun, Aaron

    2017-05-01

    This paper presents an overview of the Full Motion Video-Geopositioning Test Bed (FMV-GTB) developed to investigate algorithm performance and issues related to the registration of motion imagery and subsequent extraction of feature locations along with predicted accuracy. A case study is included corresponding to a video taken from a quadcopter. Registration of the corresponding video frames is performed without the benefit of a priori sensor attitude (pointing) information. In particular, tie points are automatically measured between adjacent frames using standard optical flow matching techniques from computer vision, an a priori estimate of sensor attitude is then computed based on supplied GPS sensor positions contained in the video metadata and a photogrammetric/search-based structure from motion algorithm, and then a Weighted Least Squares adjustment of all a priori metadata across the frames is performed. Extraction of absolute 3D feature locations, including their predicted accuracy based on the principles of rigorous error propagation, is then performed using a subset of the registered frames. Results are compared to known locations (check points) over a test site. Throughout this entire process, no external control information (e.g. surveyed points) is used other than for evaluation of solution errors and corresponding accuracy.

  12. Inexperienced clinicians can extract pathoanatomic information from MRI narrative reports with high reproducibility for use in research/quality assurance

    DEFF Research Database (Denmark)

    Kent, Peter; Briggs, Andrew M; Albert, Hanne Birgit

    2011-01-01

    Background Although reproducibility in reading MRI images amongst radiologists and clinicians has been studied previously, no studies have examined the reproducibility of inexperienced clinicians in extracting pathoanatomic information from magnetic resonance imaging (MRI) narrative reports and t...

  13. [Extraction of buildings three-dimensional information from high-resolution satellite imagery based on Barista software].

    Science.gov (United States)

    Zhang, Pei-feng; Hu, Yuan-man; He, Hong-shi

    2010-05-01

    The demand for accurate and up-to-date spatial information of urban buildings is becoming more and more important for urban planning, environmental protection, and other vocations. Today's commercial high-resolution satellite imagery offers the potential to extract the three-dimensional information of urban buildings. This paper extracted the three-dimensional information of urban buildings from QuickBird imagery, and validated the precision of the extraction based on Barista software. It was shown that the extraction of three-dimensional information of the buildings from high-resolution satellite imagery based on Barista software had the advantages of low professional level demand, powerful universality, simple operation, and high precision. One pixel level of point positioning and height determination accuracy could be achieved if the digital elevation model (DEM) and sensor orientation model had higher precision and the off-Nadir View Angle was relatively perfect.

  14. Data Stream Mining

    Science.gov (United States)

    Gaber, Mohamed Medhat; Zaslavsky, Arkady; Krishnaswamy, Shonali

    Data mining is concerned with the process of computationally extracting hidden knowledge structures represented in models and patterns from large data repositories. It is an interdisciplinary field of study that has its roots in databases, statistics, machine learning, and data visualization. Data mining has emerged as a direct outcome of the data explosion that resulted from the success in database and data warehousing technologies over the past two decades (Fayyad, 1997; Fayyad, 1998; Kantardzic, 2003).

  15. Text Mining for Protein Docking.

    Directory of Open Access Journals (Sweden)

    Varsha D Badal

    2015-12-01

    The rapidly growing amount of publicly available information from biomedical research is readily accessible on the Internet, providing a powerful resource for predictive biomolecular modeling. The accumulated data on experimentally determined structures transformed structure prediction of proteins and protein complexes. Instead of exploring the enormous search space, predictive tools can simply proceed to the solution based on similarity to the existing, previously determined structures. A similar major paradigm shift is emerging due to the rapidly expanding amount of information, other than experimentally determined structures, which still can be used as constraints in biomolecular structure prediction. Automated text mining has been widely used in recreating protein interaction networks, as well as in detecting small ligand binding sites on protein structures. Combining and expanding these two well-developed areas of research, we applied text mining to structural modeling of protein-protein complexes (protein docking). Protein docking can be significantly improved when constraints on the docking mode are available. We developed a procedure that retrieves published abstracts on a specific protein-protein interaction and extracts information relevant to docking. The procedure was assessed on protein complexes from Dockground (http://dockground.compbio.ku.edu). The results show that correct information on binding residues can be extracted for about half of the complexes. The amount of irrelevant information was reduced by conceptual analysis of a subset of the retrieved abstracts, based on the bag-of-words (features) approach. Support Vector Machine models were trained and validated on the subset. The remaining abstracts were filtered by the best-performing models, which decreased the irrelevant information for ~25% of the complexes in the dataset. The extracted constraints were incorporated in the docking protocol and tested on the Dockground unbound

  16. Data mining, mining data : energy consumption modelling

    Energy Technology Data Exchange (ETDEWEB)

    Dessureault, S. [Arizona Univ., Tucson, AZ (United States)

    2007-09-15

    Most modern mining operations are accumulating large amounts of data on production and business processes. Data, however, provides value only if it can be translated into information that appropriate users can utilize. This paper emphasized that a new technological focus should emerge, notably how to concentrate data into information; analyze information sufficiently to become knowledge; and, act on that knowledge. Researchers at the Mining Information Systems and Operations Management (MISOM) laboratory at the University of Arizona have created a method to transform data into action. The data-to-action approach was exercised in the development of an energy consumption model (ECM), in partnership with a major US-based copper mining company, 2 software companies, and the MISOM laboratory. The approach begins by integrating several key data sources using data warehousing techniques, and increasing the existing level of integration and data cleaning. An online analytical processing (OLAP) cube was also created to investigate the data and identify a subset of several million records. Data mining algorithms were applied using the information that was isolated by the OLAP cube. The data mining results showed that traditional cost drivers of energy consumption are poor predictors. A comparison was made between traditional methods of predicting energy consumption and the prediction formed using data mining. Traditionally, in the mines for which data were available, monthly averages of tons and distance are used to predict diesel fuel consumption. However, this article showed that new information technology can be used to incorporate many more variables into the budgeting process, resulting in more accurate predictions. The ECM helped mine planners improve the prediction of energy use through more data integration, measure development, and workflow analysis. 5 refs., 11 figs.

  17. INFORMATION MINING OF SPATIO-TEMPORAL EVOLUTION OF LAKES BASED ON MULTIPLE DYNAMIC MEASUREMENTS

    Directory of Open Access Journals (Sweden)

    W. Feng

    2017-09-01

    Lakes are important water resources and integral parts of the natural ecosystem, and it is of great significance to study the evolution of lakes. Under natural conditions, the area of each lake increases and decreases at the same time, and the net change in lake area is only the result of this bidirectional evolution. In this paper, considering the effects of net fragmentation, net attenuation, swap change and the spatially invariant part in lake evolution, a set of comprehensive evaluation indexes of lake dynamic evolution was defined. The measurement contains three levels: 1) the swap dynamic degree (SDD) reflects the spatial activity of lakes in the study period; 2) the attenuation dynamic degree (ADD) reflects the net attenuation of lakes into non-lake areas; 3) the fragmentation dynamic degree (FDD) reflects the trend of lakes to be divided and broken into smaller lakes. These three levels of dynamic measurement constitute the three-dimensional "swap - attenuation - fragmentation" dynamic evolution measurement system of lakes. To show its effectiveness, the dynamic measurement was applied to lakes in the Jianghan Plain, in the middle Yangtze region of China, for a more detailed analysis of lakes from 1984 to 2014. In combination with the spatial-temporal location characteristics of lakes, the hidden information in lake evolution over the past 30 years can be revealed.

  18. GPR Detection of Buried Symmetrically Shaped Mine-like Objects using Selective Independent Component Analysis

    DEFF Research Database (Denmark)

    Karlsen, Brian; Sørensen, Helge Bjarup Dissing; Larsen, Jan

    2003-01-01

    This paper addresses the detection of mine-like objects in stepped-frequency ground penetrating radar (SF-GPR) data as a function of object size, object content, and burial depth. The detection approach is based on a Selective Independent Component Analysis (SICA). SICA provides an automatic ranking of components, which enables the suppression of clutter, hence extraction of components carrying mine information. The goal of the investigation is to evaluate various time and frequency domain ICA approaches based on SICA. Performance comparison is based on a series of mine-like objects ranging from small-scale anti-personal (AP) mines to large-scale anti-tank (AT) mines were designed. Large-scale SF-GPR measurements on this series of mine-like objects buried in soil were performed. The SF-GPR data was acquired using a wideband monostatic bow-tie antenna operating in the frequency range 750...

  19. Minimizing the Impact of Mining Activities for Sustainable Mined-Out ...

    African Journals Online (AJOL)

    Minimizing the Impact of Mining Activities for Sustainable Mined-Out Area ... sensing and Geographical Information System (GIS) in assessing environmental impact of ... Keywords: Solid mineral, Impact assessment, Mined-out area utilization, ...

  20. Mining on the Mesa

    Energy Technology Data Exchange (ETDEWEB)

    Sprouls, M.W.

    1994-10-01

    Peabody Western Coal Co. is the owner of the Black Mesa and Kayenta opencast coal mines, both sited on Hopi and Navajo lands. 93% of the employees are Native American, mostly Navajo. Kayenta is the larger mine and extracts coal with draglines. Sulphur content is high, so the coal has to be analyzed and carefully blended before use. Black Mesa also uses draglines; here quality control is not as important as it is at Kayenta. Coal is transported to power stations using slurry pipelines. Both mines are heavily involved in land reclamation, leaving a landscape that makes better grazing than it did before mining. 2 figs.

  1. Overview of image processing tools to extract physical information from JET videos

    Science.gov (United States)

    Craciunescu, T.; Murari, A.; Gelfusa, M.; Tiseanu, I.; Zoita, V.; EFDA Contributors, JET

    2014-11-01

    In magnetic confinement nuclear fusion devices such as JET, the last few years have witnessed a significant increase in the use of digital imagery, not only for the surveying and control of experiments, but also for the physical interpretation of results. More than 25 cameras are routinely used for imaging on JET in the infrared (IR) and visible spectral regions. These cameras can produce up to tens of Gbytes per shot and their information content can be very different, depending on the experimental conditions. However, the relevant information about the underlying physical processes is generally of much reduced dimensionality compared to the recorded data. The extraction of this information, which allows full exploitation of these diagnostics, is a challenging task. The image analysis consists, in most cases, of inverse problems which are typically ill-posed mathematically. The typology of objects to be analysed is very wide, and usually the images are affected by noise, low levels of contrast, low grey-level in-depth resolution, reshaping of moving objects, etc. Moreover, the plasma events have time constants of ms or tens of ms, which imposes tough conditions for real-time applications. On JET, in the last few years new tools and methods have been developed for physical information retrieval. The methodology of optical flow has allowed, under certain assumptions, the derivation of information about the dynamics of video objects associated with different physical phenomena, such as instabilities, pellets and filaments. The approach has been extended in order to approximate the optical flow within the MPEG compressed domain, allowing the manipulation of the large JET video databases and, in specific cases, even real-time data processing. The fast visible camera may provide new information that is potentially useful for disruption prediction. A set of methods, based on the extraction of structural information from the visual scene, have been developed for the

  2. Overview of image processing tools to extract physical information from JET videos

    International Nuclear Information System (INIS)

    Craciunescu, T; Tiseanu, I; Zoita, V; Murari, A; Gelfusa, M

    2014-01-01

    In magnetic confinement nuclear fusion devices such as JET, the last few years have witnessed a significant increase in the use of digital imagery, not only for the surveying and control of experiments, but also for the physical interpretation of results. More than 25 cameras are routinely used for imaging on JET in the infrared (IR) and visible spectral regions. These cameras can produce up to tens of Gbytes per shot and their information content can be very different, depending on the experimental conditions. However, the relevant information about the underlying physical processes is generally of much reduced dimensionality compared to the recorded data. The extraction of this information, which allows full exploitation of these diagnostics, is a challenging task. The image analysis consists, in most cases, of inverse problems which are typically ill-posed mathematically. The typology of objects to be analysed is very wide, and usually the images are affected by noise, low levels of contrast, low grey-level in-depth resolution, reshaping of moving objects, etc. Moreover, the plasma events have time constants of ms or tens of ms, which imposes tough conditions for real-time applications. On JET, in the last few years new tools and methods have been developed for physical information retrieval. The methodology of optical flow has allowed, under certain assumptions, the derivation of information about the dynamics of video objects associated with different physical phenomena, such as instabilities, pellets and filaments. The approach has been extended in order to approximate the optical flow within the MPEG compressed domain, allowing the manipulation of the large JET video databases and, in specific cases, even real-time data processing. The fast visible camera may provide new information that is potentially useful for disruption prediction. A set of methods, based on the extraction of structural information from the visual scene, have been developed for the

  3. Detecting the effects of coal mining, acid rain, and natural gas extraction in Appalachian basin streams in Pennsylvania (USA) through analysis of barium and sulfate concentrations.

    Science.gov (United States)

    Niu, Xianzeng; Wendt, Anna; Li, Zhenhui; Agarwal, Amal; Xue, Lingzhou; Gonzales, Matthew; Brantley, Susan L

    2018-04-01

    To understand how extraction of different energy sources impacts water resources requires assessment of how water chemistry has changed in comparison with the background values of pristine streams. With such understanding, we can develop better water quality standards and ecological interpretations. However, determination of pristine background chemistry is difficult in areas with heavy human impact. To learn to do this, we compiled a master dataset of sulfate and barium concentrations ([SO4], [Ba]) in Pennsylvania (PA, USA) streams from publicly available sources. These elements were chosen because they can represent contamination related to coal and oil/gas, respectively. We applied changepoint analysis (i.e., likelihood ratio test) to identify pristine streams, which we defined as streams with a low variability in concentrations as measured over years. From these pristine streams, we estimated the baseline concentrations for major bedrock types in PA. Overall, we found that 48,471 data values are available for [SO4] from 1904 to 2014 and 3243 data values for [Ba] from 1963 to 2014. The statewide [SO4] baseline was estimated to be 15.8 ± 9.6 mg/L, but values range from 12.4 to 26.7 mg/L for different bedrock types. The statewide [Ba] baseline is 27.7 ± 10.6 µg/L and values range from 25.8 to 38.7 µg/L. Results show that most increases in [SO4] from the baseline occurred in areas with intensive coal mining activities, confirming previous studies. Sulfate inputs from acid rain were also documented. Slight increases in [Ba] since 2007 and higher [Ba] in areas with higher densities of gas wells when compared to other areas could document impacts from shale gas development, the prevalence of basin brines, or decreases in acid rain and its coupled effects on [Ba] related to barite solubility. The largest impacts on PA stream [Ba] and [SO4] are related to releases from coal mining or burning rather than oil and gas development.
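
    The changepoint idea named in this abstract can be illustrated with a minimal sketch. It assumes a single shift in the mean of a Gaussian series with a common, unknown variance, which is one common form of likelihood ratio test and not necessarily the procedure used in the study. The function name `best_changepoint`, the `min_segment` parameter and the sample concentrations are hypothetical.

```python
import math


def best_changepoint(values, min_segment=2):
    """Return (split_index, statistic) for the single mean shift that
    maximizes the Gaussian likelihood ratio statistic n * log(RSS0 / RSS1).

    Assumption: one changepoint, equal variance on both sides.
    Returns (None, 0.0) when the series is constant.
    """
    n = len(values)
    if n < 2 * min_segment:
        raise ValueError("series too short for the requested segment size")

    def rss(segment):
        # Residual sum of squares around the segment mean.
        mean = sum(segment) / len(segment)
        return sum((v - mean) ** 2 for v in segment)

    rss_null = rss(values)  # model without any changepoint
    if rss_null == 0:
        return None, 0.0

    best_k, best_stat = None, 0.0
    for k in range(min_segment, n - min_segment + 1):
        rss_alt = rss(values[:k]) + rss(values[k:])  # model with a split at k
        stat = math.inf if rss_alt == 0 else n * math.log(rss_null / rss_alt)
        if stat > best_stat:
            best_k, best_stat = k, stat
    return best_k, best_stat


# Hypothetical sulfate series (mg/L) with an obvious jump after five samples.
series = [15.0, 16.0, 15.5, 14.8, 15.2, 40.0, 41.5, 39.0, 42.0, 40.5]
split, statistic = best_changepoint(series)
print(split, round(statistic, 2))
```

    A stream whose best statistic stays below a chosen critical value would count as low-variability in this sketch; the abstract does not state the threshold, so none is hard-coded here.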

  4. Extraction and Analysis of Information Related to Research & Development Declared Under an Additional Protocol

    International Nuclear Information System (INIS)

    Idinger, J.; Labella, R.; Rialhe, A.; Teller, N.

    2015-01-01

    The additional protocol (AP) provides important tools to strengthen and improve the effectiveness and efficiency of the safeguards system. Safeguards are designed to verify that States comply with their international commitments not to use nuclear material or to engage in nuclear-related activities for the purpose of developing nuclear weapons or other nuclear explosive devices. Under an AP based on INFCIRC/540, a State must provide to the IAEA additional information about, and inspector access to, all parts of its nuclear fuel cycle. In addition, the State has to supply information about its nuclear fuel cycle-related research and development (R&D) activities. The majority of States declare their R&D activities under the AP Articles 2.a.(i), 2.a.(x), and 2.b.(i) as part of initial declarations and their annual updates under the AP. In order to verify consistency and completeness of information provided under the AP by States, the Agency has started to analyze declared R&D information by identifying interrelationships between States in different R&D areas relevant to safeguards. The paper outlines the quality of R&D information provided by States to the Agency, describes how the extraction and analysis of relevant declarations are currently carried out at the Agency and specifies what kinds of difficulties arise during evaluation in respect to cross-linking international projects and finding gaps in reporting. In addition, the paper tries to elaborate how the reporting quality of AP information with reference to R&D activities and the assessment process of R&D information could be improved. (author)

  5. Report of exploration in the mining reserve N XIV; Informe de exploracion en la reserva minera XIV

    Energy Technology Data Exchange (ETDEWEB)

    Spoturno, J.; Lara, P.

    1991-07-01

    This report covers the geological exploration of mining reserve N XIV. Three basic units were recognized: 1) a gneissic-migmatitic granitic basement, 2) a lithologic group, and 3) a unit of quartz-feldspar granitoid rocks.

  6. Mining robotics sensors

    CSIR Research Space (South Africa)

    Green, JJ

    2012-04-01

    ... of three-dimensional cameras (SR 4000 and XBOX Kinect) and a thermal imaging sensor (FLIR A300) in order to create 3D thermal models of narrow mining stopes. This information can be used in determining the risk of rockfall in an underground mine, which is a major...

  7. Distributed genetic process mining

    NARCIS (Netherlands)

    Bratosin, C.C.; Sidorova, N.; Aalst, van der W.M.P.

    2010-01-01

    Process mining aims at discovering process models from data logs in order to offer insight into the real use of information systems. Most of the existing process mining algorithms fail to discover complex constructs or have problems dealing with noise and infrequent behavior. The genetic process

  8. Zone analysis in biology articles as a basis for information extraction.

    Science.gov (United States)

    Mizuta, Yoko; Korhonen, Anna; Mullen, Tony; Collier, Nigel

    2006-06-01

    In the field of biomedicine, an overwhelming amount of experimental data has become available as a result of the high throughput of research in this domain. The amount of results reported has now grown beyond the limits of what can be managed by manual means. This makes it increasingly difficult for the researchers in this area to keep up with the latest developments. Information extraction (IE) in the biological domain aims to provide an effective automatic means to dynamically manage the information contained in archived journal articles and abstract collections and thus help researchers in their work. However, while considerable advances have been made in certain areas of IE, pinpointing and organizing factual information (such as experimental results) remains a challenge. In this paper we propose tackling this task by incorporating into IE information about rhetorical zones, i.e. classification of spans of text in terms of argumentation and intellectual attribution. As the first step towards this goal, we introduce a scheme for annotating biological texts for rhetorical zones and provide a qualitative and quantitative analysis of the data annotated according to this scheme. We also discuss our preliminary research on automatic zone analysis, and its incorporation into our IE framework.

  9. Extract the Relational Information of Static Features and Motion Features for Human Activities Recognition in Videos

    Directory of Open Access Journals (Sweden)

    Li Yao

    2016-01-01

    Both static features and motion features have shown promising performance in the human activities recognition task. However, the information included in these features is insufficient for complex human activities. In this paper, we propose extracting relational information of static features and motion features for human activities recognition. The videos are represented by a classical Bag-of-Words (BoW) model, which has proved useful in many works. To get a compact and discriminative codebook with small dimension, we employ the divisive algorithm based on KL-divergence to reconstruct the codebook. After that, to further capture strong relational information, we construct a bipartite graph to model the relationship between words of different feature sets. Then we use a k-way partition to create a new codebook in which similar words are grouped together. With this new codebook, videos can be represented by a new BoW vector with strong relational information. Moreover, we propose a method to compute new clusters from the divisive algorithm's projective function. We test our work on several datasets and obtain very promising results.
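
    As a rough illustration of the Bag-of-Words representation mentioned in this abstract, the sketch below assigns each local descriptor to its nearest codeword and returns a normalized histogram. It is a generic BoW encoder and does not reproduce the paper's KL-divergence codebook reconstruction or its bipartite-graph partitioning; the function name `bow_histogram`, the two-dimensional descriptors and the two-word codebook are hypothetical.

```python
def bow_histogram(descriptors, codebook):
    """Encode a set of descriptors as a normalized histogram over codewords.

    Each descriptor votes for the codeword with the smallest squared
    Euclidean distance. An empty descriptor set yields an all-zero vector.
    """
    counts = [0] * len(codebook)
    for descriptor in descriptors:
        nearest = min(
            range(len(codebook)),
            key=lambda i: sum((a - b) ** 2 for a, b in zip(descriptor, codebook[i])),
        )
        counts[nearest] += 1
    total = sum(counts)
    if total == 0:
        return [0.0] * len(codebook)
    return [count / total for count in counts]


# Hypothetical codebook with two visual words and four descriptors from one video.
codebook = [(0.0, 0.0), (1.0, 1.0)]
descriptors = [(0.1, 0.0), (0.9, 1.1), (1.0, 0.8), (0.2, 0.1)]
print(bow_histogram(descriptors, codebook))  # [0.5, 0.5]
```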

  10. Comprehensive Evaluation on Employee Satisfaction of Mine Occupational Health and Safety Management System Based on Improved AHP and 2-Tuple Linguistic Information

    Directory of Open Access Journals (Sweden)

    Jiangdong Bao

    2017-01-01

    In order to comprehensively evaluate employee satisfaction with the mine occupational health and safety management system, an analytic method based on the fuzzy analytic hierarchy process and the 2-tuple linguistic model was established. Based on 5 first-grade indicators and 20 second-grade indicators, a method combining improved AHP and the time-ordered Weighted Averaging Operator (T-OWA) model is constructed. The results demonstrate that employee satisfaction with the mine occupational health and safety management system is of the 'general' rank. The method, which includes the evaluation of employee satisfaction and the quantitative analysis of language evaluation information, ensures the authenticity of the language evaluation information.
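
    The 2-tuple linguistic model referred to here represents an aggregated score as a linguistic label plus a symbolic offset. The sketch below shows only that translation step, applied to a plain weighted average; it does not reproduce the paper's improved AHP weighting or its T-OWA operator. The five-label scale, the indicator ratings and the weights are hypothetical, chosen only so that the result lands on a 'general' label like the one the abstract reports.

```python
import math

# Hypothetical five-level satisfaction scale; the abstract names only 'general'.
LABELS = ["very poor", "poor", "general", "good", "very good"]


def to_two_tuple(beta, labels=LABELS):
    """Translate a score beta in [0, len(labels) - 1] into (label, alpha),
    where alpha = beta - index lies in [-0.5, 0.5)."""
    if not 0 <= beta <= len(labels) - 1:
        raise ValueError("beta is outside the label scale")
    index = math.floor(beta + 0.5)  # round half up, unlike Python's round()
    return labels[index], beta - index


def weighted_two_tuple(label_indices, weights, labels=LABELS):
    """Aggregate label indices by a weighted mean, then express it as a 2-tuple."""
    total = sum(weights)
    beta = sum(i * w for i, w in zip(label_indices, weights)) / total
    return to_two_tuple(beta, labels)


# Hypothetical ratings of four indicators and their (assumed) weights.
label, alpha = weighted_two_tuple([2, 3, 1, 2], [0.4, 0.2, 0.2, 0.2])
print(label, round(alpha, 3))
```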

  11. A Sequential Chemical Extraction and Spectroscopic Assessment of the Potential Bioavailability of Mercury Released From the Inoperative New Idria Mercury Mine, San Benito Co., CA

    Science.gov (United States)

    Jew, A. D.; Luong, P. N.; Rytuba, J. J.; Brown, G. E.

    2012-12-01

    The inoperative New Idria mercury mine in San Benito Co., CA, is a potential point source of Hg to the Central Valley of California. To determine the phases and the potential bioavailability of Hg present in stream bed deposits downstream of the mine, sequential chemical extractions (SCEs) targeting Hg-bearing phases and synchrotron-based spectroscopic and imaging techniques were used on sediment samples taken from the acid mine drainage (AMD) system, Hg sorbed in the laboratory to ferrihydrite (synthetic 2-line and natural), and Hg associated with diatom-rich samples. In all field samples examined, from both the wet and dry seasons, removal of > 97% of the Hg required 1M KOH or harsher chemical treatments. X-ray absorption spectroscopy (XAS) showed that HgS was the dominant inorganic Hg phase present, with no detectable Hg associated with the ferrihydrite. Sorption of Hg onto both synthetic and natural ferrihydrite, followed by SCE analysis, showed that 1M MgCl2 removed ≥ 90% of the total Hg, suggesting that Hg does not sorb strongly to ferrihydrite. This finding is surprising, because in most settings ferrihydrite is considered to be a strong adsorbent of heavy metals. Due to the lack of Hg sorption to ferrihydrite in field samples, another pool for the non-HgS/HgSe fraction in sediments is needed. SEM analysis of the downstream samples showed that regardless of pH, freshwater diatoms were present. To determine if diatoms were the sink for dissolved Hg in this system, SCE analyses on commercially available and diatom-rich field samples from the New Idria site and Harley Gulch (Lake County, CA) were completed. The vast majority of Hg in diatom-rich samples was removed by 1M KOH, which corresponds to the non-HgS/HgSe fraction of the New Idria field samples. Analysis for carbon and nitrogen in the diatom-rich samples showed no detectable nitrogen, indicating little to no organic material was left in the samples. We therefore infer that Hg in the diatoms is contained in

  12. Data Preparation for Web Mining – A survey

    OpenAIRE

    Amog Rajenderan

    2012-01-01

    An accepted trend is to categorize web mining into three main areas: web content mining, web structure mining and web usage mining. Web content mining involves extracting details/information from the contents of webpages and performing things like knowledge synthesis. Web structure mining involves the usage of graph theory to understand website structure/hierarchy. Web usage mining involves the mining of useful information from things like server logs, to understand what the user does while on the inte...

  13. The Distribution of the Informative Intensity of the Text in Terms of its Structure (On Materials of the English Texts in the Mining Sphere)

    Directory of Open Access Journals (Sweden)

    Znikina Ludmila

    2017-01-01

    The article deals with the distribution of informative intensity in English-language scientific text on the basis of its structural features, which contribute to the formalization of the scientific text and to preserving the adequacy of a text with derived semantic information in relation to the primary text. The discourse analysis is built on specific compositional and meaningful examples of scientific texts taken from the mining field. The article also analyzes the adequacy of translating foreign texts into another language, the relationships between elements of linguistic systems, the degree of formal conformance, and translation in line with the specific objectives and information needs of the recipient. Some key words and ideas are emphasized in the paragraphs of English-language mining scientific texts. The article gives the characteristic features of paragraph structure in technical text and examples of constructions in English scientific texts on mining themes, with the aim of explaining possible ways of translating them adequately.

  14. Email-Based Informed Consent: Innovative Method for Reaching Large Numbers of Subjects for Data Mining Research

    Science.gov (United States)

    Lee, Lesley R.; Mason, Sara S.; Babiak-Vazquez, Adriana; Ray, Stacie L.; Van Baalen, Mary

    2015-01-01

    Since the 2010 NASA authorization to make the Life Sciences Data Archive (LSDA) and Lifetime Surveillance of Astronaut Health (LSAH) data archives more accessible by the research and operational communities, demand for data has greatly increased. Correspondingly, both the number and scope of requests have increased, from 142 requests fulfilled in 2011 to 224 in 2014, and with some datasets comprising up to 1 million data points. To meet the demand, the LSAH and LSDA Repositories project was launched, which allows active and retired astronauts to authorize full, partial, or no access to their data for research without individual, study-specific informed consent. A one-on-one personal informed consent briefing is required to fully communicate the implications of the several tiers of consent. Due to the need for personal contact to conduct Repositories consent meetings, the rate of consenting has not kept up with demand for individualized, possibly attributable data. As a result, other methods had to be implemented to allow the release of large datasets, such as release of only de-identified data. However, the compilation of large, de-identified data sets places a significant resource burden on LSAH and LSDA and may result in diminished scientific usefulness of the dataset. As a result, LSAH and LSDA worked with the JSC Institutional Review Board Chair, Astronaut Office physicians, and NASA Office of General Counsel personnel to develop a "Remote Consenting" process for retrospective data mining studies. This is particularly useful since the majority of the astronaut cohort is retired from the agency and living outside the Houston area. Originally planned as a method to send informed consent briefing slides and consent forms only by mail, Remote Consenting has evolved into a means to accept crewmember decisions on individual studies via their method of choice: email or paper copy by mail. To date, 100 emails have been sent to request participation in eight HRP

  15. Social Network Analysis and Mining to Monitor and Identify Problems with Large-Scale Information and Communication Technology Interventions.

    Science.gov (United States)

    da Silva, Aleksandra do Socorro; de Brito, Silvana Rossy; Vijaykumar, Nandamudi Lankalapalli; da Rocha, Cláudio Alex Jorge; Monteiro, Maurílio de Abreu; Costa, João Crisóstomo Weyl Albuquerque; Francês, Carlos Renato Lisboa

    2016-01-01

    The published literature reveals several arguments concerning the strategic importance of information and communication technology (ICT) interventions for developing countries where the digital divide is a challenge. Large-scale ICT interventions can be an option for countries whose regions, both urban and rural, present a high number of digitally excluded people. Our goal was to monitor and identify problems in interventions aimed at certification for a large number of participants in different geographical regions. Our case study is the training at the Telecentros.BR, a program created in Brazil to install telecenters and certify individuals to use ICT resources. We propose an approach that applies social network analysis and mining techniques to data collected from Telecentros.BR dataset and from the socioeconomics and telecommunications infrastructure indicators of the participants' municipalities. We found that (i) the analysis of interactions in different time periods reflects the objectives of each phase of training, highlighting the increased density in the phase in which participants develop and disseminate their projects; (ii) analysis according to the roles of participants (i.e., tutors or community members) reveals that the interactions were influenced by the center (or region) to which the participant belongs (that is, a community contained mainly members of the same region and always with the presence of tutors, contradicting expectations of the training project, which aimed for intense collaboration of the participants, regardless of the geographic region); (iii) the social network of participants influences the success of the training: that is, given evidence that the degree of the community member is in the highest range, the probability of this individual concluding the training is 0.689; (iv) the North region presented the lowest probability of participant certification, whereas the Northeast, which served municipalities with similar
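
    Two of the network measures this abstract relies on, node degree and network density, can be sketched in a few lines. The example treats interactions as an undirected simple graph, which is an assumption; the abstract does not say how the Telecentros.BR interactions were modeled. The function names and the participant identifiers are hypothetical.

```python
from collections import Counter


def _unique_edges(edges):
    """Drop self-loops and repeated interactions between the same pair."""
    return {frozenset(edge) for edge in edges if edge[0] != edge[1]}


def degrees(edges):
    """Number of distinct neighbours of each node."""
    degree = Counter()
    for edge in _unique_edges(edges):
        for node in edge:
            degree[node] += 1
    return dict(degree)


def density(edges):
    """Share of possible undirected links that are present: 2m / (n(n - 1))."""
    nodes = {node for edge in edges for node in edge}
    n = len(nodes)
    if n < 2:
        return 0.0
    return 2 * len(_unique_edges(edges)) / (n * (n - 1))


# Hypothetical interactions between one tutor and three community members.
interactions = [("tutor", "ana"), ("ana", "bia"), ("tutor", "bia"), ("bia", "caio")]
print(degrees(interactions)["bia"], round(density(interactions), 3))
```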

  16. MedEx: a medication information extraction system for clinical narratives

    Science.gov (United States)

    Stenner, Shane P; Doan, Son; Johnson, Kevin B; Waitman, Lemuel R; Denny, Joshua C

    2010-01-01

    Medication information is one of the most important types of clinical data in electronic medical records. It is critical for healthcare safety and quality, as well as for clinical research that uses electronic medical record data. However, medication data are often recorded in clinical notes as free-text. As such, they are not accessible to other computerized applications that rely on coded data. We describe a new natural language processing system (MedEx), which extracts medication information from clinical notes. MedEx was initially developed using discharge summaries. An evaluation using a data set of 50 discharge summaries showed it performed well on identifying not only drug names (F-measure 93.2%), but also signature information, such as strength, route, and frequency, with F-measures of 94.5%, 93.9%, and 96.0% respectively. We then applied MedEx unchanged to outpatient clinic visit notes. It performed similarly with F-measures over 90% on a set of 25 clinic visit notes. PMID:20064797
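
    The F-measures quoted in this abstract combine precision and recall into one score. The sketch below uses the standard balanced form, F = 2PR / (P + R), with an optional weighting parameter. The abstract reports only the resulting F-measures, so the precision and recall values in the usage line are hypothetical, as is the function name `f_measure`.

```python
def f_measure(precision, recall, beta=1.0):
    """Weighted harmonic mean of precision and recall.

    beta = 1 gives the balanced F1 score; the score is defined as 0.0
    when both precision and recall are zero.
    """
    if precision < 0 or recall < 0:
        raise ValueError("precision and recall must be non-negative")
    denominator = beta ** 2 * precision + recall
    if denominator == 0:
        return 0.0
    return (1 + beta ** 2) * precision * recall / denominator


# Hypothetical precision and recall for a drug-name extractor.
print(round(f_measure(0.94, 0.92), 3))
```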

  17. Videomicroscopic extraction of specific information on cell proliferation and migration in vitro

    International Nuclear Information System (INIS)

    Debeir, Olivier; Megalizzi, Veronique; Warzee, Nadine; Kiss, Robert; Decaestecker, Christine

    2008-01-01

    In vitro cell imaging is a useful exploratory tool for cell behavior monitoring with a wide range of applications in cell biology and pharmacology. Combined with appropriate image analysis techniques, this approach has been shown to provide useful information on the detection and dynamic analysis of cell events. In this context, numerous efforts have been focused on cell migration analysis. In contrast, the cell division process has been the subject of fewer investigations. The present work focuses on this latter aspect and shows that, in complement to cell migration data, interesting information related to cell division can be extracted from phase-contrast time-lapse image series, in particular cell division duration, which is not provided by standard cell assays using endpoint analyses. We illustrate our approach by analyzing the effects induced by two sigma-1 receptor ligands (haloperidol and 4-IBP) on the behavior of two glioma cell lines using two in vitro cell models, i.e., the low-density individual cell model and the high-density scratch wound model. This illustration also shows that the data provided by our approach are suggestive as to the mechanism of action of compounds, and are thus capable of informing the appropriate selection of further time-consuming and more expensive biological evaluations required to elucidate a mechanism

  18. Advanced mercury removal from gold leachate solutions prior to gold and silver extraction: a field study from an active gold mine in Peru.

    Science.gov (United States)

    Matlock, Matthew M; Howerton, Brock S; Van Aelstyn, Mike A; Nordstrom, Fredrik L; Atwood, David A

    2002-04-01

    Mercury contamination in the Gold-Cyanide Process (GCP) is a serious health and environmental problem. Following the heap leaching of gold and silver ores with NaCN solutions, portions of the mercury-cyano complexes often adhere to the activated carbon (AC) used to extract the gold. During the electrowinning and retorting steps, mercury can be (and often is) emitted to the air as a vapor. This poses a severe health hazard to plant workers and the local environment. Additional concerns relate to the safety of workers when handling the mercury-laden AC. Currently, mercury treatment of the heap leach solution is nonexistent, because chelating ligands that work effectively under the adverse pH conditions present in the heap leachate solutions do not exist. In an effort to economically and effectively treat the leachate solution prior to passing over the AC, a dipotassium salt of 1,3-benzenediamidoethanethiol (BDET2-) has been developed to irreversibly bind and precipitate the mercury. The ligand has proven to be highly effective by selectively reducing mercury levels from average initial concentrations of 34.5 ppm (parts per million) to 0.014 ppm within 10 min and to 0.008 ppm within 15 min. X-ray powder diffraction (XRD), proton nuclear magnetic resonance (1H NMR), Raman, and infrared (IR) spectroscopy demonstrate the formation of a mercury-ligand compound, which remains insoluble over pH ranges of 0.0-14.0. Leachate samples from an active gold mine in Peru have been analyzed using cold vapor atomic fluorescence (CVAF) and inductively coupled plasma optical emission spectroscopy (ICP-OES) for metal concentrations before and after treatment with the BDET2- ligand.
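
    The reported concentrations can be restated as removal percentages. The short calculation below uses only the three figures given in the abstract (34.5, 0.014, and 0.008 ppm); the function name removal_efficiency is a label chosen for this sketch, and the percentages are derived here rather than quoted from the study. They come to roughly 99.96% after 10 min and 99.98% after 15 min.

```python
def removal_efficiency(initial_ppm, final_ppm):
    """Percentage of the initial concentration that was removed."""
    if initial_ppm <= 0:
        raise ValueError("initial concentration must be positive")
    return 100.0 * (initial_ppm - final_ppm) / initial_ppm


# Average initial concentration reported in the abstract.
INITIAL_PPM = 34.5

if __name__ == "__main__":
    # (elapsed minutes, remaining concentration in ppm), as reported.
    for minutes, final_ppm in [(10, 0.014), (15, 0.008)]:
        percent = removal_efficiency(INITIAL_PPM, final_ppm)
        print(f"{minutes} min: {percent:.2f}% of mercury removed")
```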

  19. 5W1H Information Extraction with CNN-Bidirectional LSTM

    Science.gov (United States)

    Nurdin, A.; Maulidevi, N. U.

    2018-03-01

    In this work, information about who did what, when, where, why, and how was extracted from Indonesian news articles by combining a Convolutional Neural Network (CNN) and a Bidirectional Long Short-Term Memory (LSTM) network. The CNN can learn semantically meaningful representations of sentences, and the Bidirectional LSTM can analyze the relations among words in the sequence. We also use word2vec word embeddings for word representation. By combining these algorithms, we obtained an F-measure of 0.808. Our experiments show that CNN-BLSTM outperforms other shallow methods, namely IBk, C4.5, and Naïve Bayes, which achieve F-measures of 0.655, 0.645, and 0.595, respectively.
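
    The abstract compares systems by F-measure but does not say how the score was computed. As one plausible reading, the sketch below computes a micro-averaged F-measure over per-token labels. The tag names (WHO, WHAT, WHERE, WHEN, and O for tokens outside any field), the token-level scoring, and the example sequences are assumptions made for illustration, not details taken from the paper.

```python
def f_measure(gold, predicted, outside="O"):
    """Micro-averaged F-measure over token labels, ignoring the outside tag."""
    if len(gold) != len(predicted):
        raise ValueError("label sequences must have the same length")
    pairs = list(zip(gold, predicted))
    # Correctly labelled field tokens.
    tp = sum(1 for g, p in pairs if g == p and g != outside)
    # Predicted field labels that do not match the gold label.
    fp = sum(1 for g, p in pairs if p != outside and g != p)
    # Gold field labels that the prediction missed or mislabelled.
    fn = sum(1 for g, p in pairs if g != outside and g != p)
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


if __name__ == "__main__":
    # Hypothetical gold and predicted tags for a seven-token sentence.
    gold = ["WHO", "WHO", "WHAT", "WHAT", "O", "WHERE", "WHEN"]
    predicted = ["WHO", "O", "WHAT", "WHAT", "O", "WHERE", "WHERE"]
    print(f"F-measure: {f_measure(gold, predicted):.3f}")
```

    In this example four field tokens are correct, one predicted label is wrong, and two gold labels are missed, giving precision 4/5, recall 4/6, and an F-measure of 8/11.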

  20. Developing a Process Model for the Forensic Extraction of Information from Desktop Search Applications

    Directory of Open Access Journals (Sweden)

    Timothy Pavlic

    2008-03-01

    Desktop search applications can contain cached copies of files that were deleted from the file system. Forensic investigators see this as a potential source of evidence, as documents deleted by suspects may still exist in the cache. Whilst there have been attempts at recovering data collected by desktop search applications, there is no methodology governing the process, nor discussion on the most appropriate means to do so. This article seeks to address this issue by developing a process model that can be applied when developing an information extraction application for desktop search applications, discussing preferred methods and the limitations of each. This work represents a more structured approach than other forms of current research.