Sample records for polish natural language

  1. A Natural Language Architecture


    Sodiya, Adesina Simon


    Natural languages are the latest generation of programming languages, which require processing real human natural expressions. Over the years, several groups or researchers have been trying to develop widely accepted natural languages based on artificial intelligence (AI), but no true natural language has been developed. The goal of this work is to design a natural language preprocessing architecture that identifies and accepts programming instructions or sentences in their natural forms ...

  2. The corpus-driven revolution in Polish Sign Language: the interview with Dr. Paweł Rutkowski

    Iztok Kosem


    Dr. Paweł Rutkowski is head of the Section for Sign Linguistics at the University of Warsaw. He is a general linguist and a specialist in the field of syntax of natural languages, carrying out research on Polish Sign Language (polski język migowy, PJM). He has been awarded a number of prizes, grants and scholarships by such institutions as the Foundation for Polish Science, Polish Ministry of Science and Higher Education, National Science Centre, Poland, Polish–U.S. Fulbright Commission, Kosciuszko Foundation and DAAD. Dr. Rutkowski leads the team developing the Corpus of Polish Sign Language and the Corpus-based Dictionary of Polish Sign Language, the first dictionary of this language prepared in compliance with modern lexicographical standards. The dictionary is an open-access publication, available freely at the following address: This interview took place at eLex 2017, a biennial conference on electronic lexicography, where Dr. Rutkowski was awarded the Adam Kilgarriff Prize and gave a keynote address entitled Sign language as a challenge to electronic lexicography: The Corpus-based Dictionary of Polish Sign Language and beyond. The interview was conducted by Dr. Victoria Nyst from Leiden University, Faculty of Humanities, and Dr. Iztok Kosem from the University of Ljubljana, Faculty of Arts.

  3. Constraints on Negative Prefixation in Polish Sign Language. (United States)

    Tomaszewski, Piotr


    The aim of this article is to describe a negative prefix, NEG-, in Polish Sign Language (PJM) which appears to be indigenous to the language. This is of interest given the relative rarity of prefixes in sign languages. Prefixed PJM signs were analyzed on the basis of both a corpus of texts signed by 15 deaf PJM users who are either native or near-native signers, and material including a specified range of prefixed signs as demonstrated by native signers in dictionary form (i.e. signs produced in isolation, not as part of phrases or sentences). In order to define the morphological rules behind prefixation on both the phonological and morphological levels, native PJM users were consulted for their expertise. The research results can enrich models for describing processes of grammaticalization in the context of the visual-gestural modality that forms the basis for sign language structure.

  4. Natural language modeling

    Sharp, J.K. [Sandia National Labs., Albuquerque, NM (United States)]


    This seminar describes a process and methodology that uses structured natural language to enable the construction of precise information requirements directly from users, experts, and managers. The main focus of this natural language approach is to create the precise information requirements and to do it in such a way that the business and technical experts are fully accountable for the results. These requirements can then be implemented using appropriate tools and technology. This requirement set is also a universal learning tool because it has all of the knowledge that is needed to understand a particular process (e.g., expense vouchers, project management, budget reviews, tax, laws, machine function).

  5. Symbolic Natural Language Processing


    Laporte, Eric


    The connection between language processing and combinatorics on words is natural. Historically, linguists actually played a part in the beginning of the construction of theoretical combinatorics on words. Some of the terms in current use originate from linguistics: word, prefix, suffix, grammar, syntactic monoid... However, interpenetration between the formal world of computer theory and the intuitive world of linguistics is still a love story with ups and downs. We will encounter in this cha...

  6. Natural immunity factors in Polish mixed breed rabbits. (United States)

    Tokarz-Deptuła, B; Niedźwiedzka-Rystwej, P; Adamiak, M; Hukowska-Szematowicz, B; Trzeciak-Ryczek, A; Deptuła, W


    Mixed-breed rabbits in Poland are widely used for diagnostic and scientific research and as utility animals; therefore, there is a need to know their immunological status, as well as their haematological status. In this study natural immunity factors were analyzed in Polish mixed-breed rabbits and Polish mixed-breed rabbits with addition of blood of meat-breed, considering the impact of sex and season of the year (spring, summer, autumn, winter) using measurement of non-specific cellular and humoral immunity parameters in peripheral blood. The study has revealed that there is variation between the two commonly used mixed-breed types of rabbits, especially where sex and season are concerned, which is crucial for using these animals in experiments.

  7. Fighting alcoholism among railway workers in the light of early 20th Century Polish-language temperance publications

    Izabela Krasińska


    Discussion and conclusions: The Polish-language temperance periodicals provide, among other things, valuable information on the as yet unknown though essential problem of fighting alcoholism among railway workers in Europe, the USA and the Polish territories of the Three Partitions.

  8. Natural language understanding

    Yoshida, S


    Language understanding is essential for intelligent information processing. Processing of language itself involves configuration element analysis, syntactic analysis (parsing), and semantic analysis. They are not carried out in isolation. These are described for the Japanese language and their usage in understanding-systems is examined. 30 references.

  9. Polish Vocabulary Development in 2-Year-Olds: Comparisons with English Using the Language Development Survey (United States)

    Rescorla, Leslie; Constants, Holly; Bialecka-Pikul, Marta; Stepien-Nycz, Malgorzata; Ochal, Anna


    Purpose: The objective of this study was to compare vocabulary size and composition in 2-year-olds learning Polish or English as measured by the Language Development Survey (LDS; Rescorla, 1989). Method: Participants were 199 Polish toddlers (M = 24.14 months, SD = 0.35) and 422 U.S. toddlers (M = 24.69 months, SD = 0.78). Results: Test-retest…

  10. Polish as a foreign language at elementary level of instruction : crosslinguistic influences in writing

    Danuta Gabrys-Barker


    Being a minority European language, Polish has not attracted much attention in second language acquisition (SLA) research. Most studies in the area focus on English and other major languages, describing variables and processes observed in learners’ interlanguage development. This article looks at the language performance of elementary learners of Polish as a foreign language with a view to diagnosing areas of difficulty at the initial stages of language instruction. It is a case study of five learners’ written production after a year of intensive language instruction in the controlled conditions of a classroom. The objectives of the study presented here are: (1) to determine the types of error produced in a short translation task at different levels of language (morphosyntactic, lexical); (2) to observe manifestations of crosslinguistic influences between languages the subjects know (interlingual transfer) as well as those related to the language learnt itself (intralingual transfer). The small sample of texts produced does not allow for any generalized observations and conclusions; however, at the level of elementary competence in any foreign language, as other research shows, the amount of individual variation is not the most significant factor. Thus the incorrect forms produced may testify to some more universally error-prone areas of language. The value of this kind of analysis lies in its direct application to the teaching of Polish as a synthetic language. The study also demonstrates the fact that communicative teaching has a limited contribution to make in the case of this family of languages. It suggests that overt and explicit teaching of a synthetic language will give a sounder basis for further development of language competence in its communicative dimension.

  11. Teaching natural language to computers


    Corneli, Joseph; Corneli, Miriam


    "Natural Language," whether spoken and attended to by humans, or processed and generated by computers, requires networked structures that reflect creative processes in semantic, syntactic, phonetic, linguistic, social, emotional, and cultural modules. Being able to produce novel and useful behavior following repeated practice gets to the root of both artificial intelligence and human language. This paper investigates the modalities involved in language-like applications that computers -- and ...

  12. Handbook of Natural Language Processing

    Indurkhya, Nitin


    Provides a comprehensive, modern reference of practical tools and techniques for implementing natural language processing in computer systems. This title covers classical methods, empirical and statistical techniques, and various applications. It describes how the techniques can be applied to European and Asian languages as well as English

  13. Advances in natural language processing. (United States)

    Hirschberg, Julia; Manning, Christopher D


    Natural language processing employs computational techniques for the purpose of learning, understanding, and producing human language content. Early computational approaches to language research focused on automating the analysis of the linguistic structure of language and developing basic technologies such as machine translation, speech recognition, and speech synthesis. Today's researchers refine and make use of such tools in real-world applications, creating spoken dialogue systems and speech-to-speech translation engines, mining social media for information about health or finance, and identifying sentiment and emotion toward products and services. We describe successes and challenges in this rapidly advancing area. Copyright © 2015, American Association for the Advancement of Science.

  14. Stress 'deafness' in a language with fixed word stress: an ERP study on Polish

    Ulrike Domahs


    The aim of the present contribution was to examine the factors influencing the prosodic processing in a language with predictable word stress. For Polish, a language with fixed penultimate stress but several well-defined exceptions, difficulties in the processing and representation of prosodic information have been reported (e.g., Peperkamp & Dupoux, 2002). The present study utilized event-related potentials (ERPs) to investigate the factors influencing prosodic processing in Polish. These factors are (i) the predictability of stress and (ii) the prosodic structure in terms of metrical feet. Polish native speakers were presented with correctly and incorrectly stressed Polish words and instructed to judge the correctness of the perceived stress patterns. For each stress violation an early negativity was found, which was interpreted as the reflection of an error-detection mechanism; in addition, exceptional (= antepenultimate stress) and post-lexical (= initial stress) patterns evoked a task-related positivity effect (P300) whose amplitude and latency are correlated with the degree of anomaly and deviation from an expectation. Violations involving the default (= penultimate stress), in contrast, did not produce such an effect. This asymmetrical result is interpreted to reflect that Polish native speakers are less sensitive to the default pattern than to the exceptional or post-lexical patterns. Behavioral results are orthogonal to the electrophysiological results, showing that Polish speakers had difficulty rejecting any kind of stress violation. Thus, on a meta-linguistic level Polish speakers appeared to be stress-‘deaf’ for any kind of stress manipulation, whereas the neural reactions differentiate between the default and lexicalized patterns.

  15. The Factor Structure of the Polish-Language Version of the Romantic Beliefs Scale

    Katarzyna Adamczyk


    The aim of the present study was to investigate the factor structure and psychometric properties of the Polish adaptation of the Romantic Beliefs Scale (RBS; Sprecher & Metts, 1989). In a sample of 414 Polish university students aged 19-25 (227 females and 187 males), the factor structure of the original English version was confirmed for the four subscales: Love Finds a Way, One and Only, Idealization, and Love at First Sight. The present study provides evidence that the 15-item version of the Polish adaptation of the RBS possesses a factor structure and psychometric properties comparable to the English-language version of RBS. It was shown to be a reliable self-report measure for romantic beliefs within a sample of the Polish population. The development of a new Polish measure of romantic beliefs has provided further validation for the RBS, and provided evidence in support of the ideology of romanticism in various populations, and indicated the importance of differentiating between the different types of romantic beliefs.

  16. Natural language processing with Java

    CERN Document Server

    Reese, Richard M


    If you are a Java programmer who wants to learn about the fundamental tasks underlying natural language processing, this book is for you. You will be able to identify and use NLP tasks for many common problems, and integrate them in your applications to solve more difficult problems. Readers should be familiar/experienced with Java software development.



    Kowalska, Monika


    After the Polish accession to the European Union in 2004, language services have considerably grown in importance. Intensive contacts with foreign companies and institutions coupled with information technology developments have increased the role of English as a linguistic medium of international cooperation. The overall aim of this paper is to examine the Polish business environment for Language Service Providers (LSPs) offering specialized English courses and translation services (EN-PL and...


    Michał Głuszkowski


    The article discusses factors influencing language maintenance under the changing social, cultural, economic and political conditions of the Polish minority in Siberia. The village of Vershina was founded in 1910 by Polish voluntary settlers from Little Poland. During its first three decades Vershina preserved the Polish language, traditions, farming methods and machines, and also the Roman Catholic religion. Changes came to the village in the taiga in the 1930s. Vershina lost its ethnocultural homogeneity because of Russian and Buryat workers in the local kolkhoz. Nowadays the inhabitants of Vershina have regained their minority rights: religious, educational and cultural. However, during the years of sovietization and atheization, their culture and customs became much more similar to those of other Siberian villages. The Polish language in Vershina is under the strong influence of Russian, which is the language of education, administration, and the surrounding villages. Children from Polish-Russian families become monolingual and use Polish very rarely, only as a school subject and in contacts with grandparents. The process of abandoning the mother tongue in Vershina is growing rapidly. However, there are some factors which may hinder the actual changes: the activity of local Polish organisations and the Roman Catholic parish, as well as the folk group “Jazhumbek”.

  19. Empirical Methods in Natural Language Generation

    Krahmer, Emiel; Theune, Mariet

    Natural language generation (NLG) is a subfield of natural language processing (NLP) that is often characterized as the study of automatically converting non-linguistic representations (e.g., from databases or other knowledge sources) into coherent natural language text. In recent years the field

  20. Health Information in Polish (polski) (United States)


  1. Natural language processing: an introduction. (United States)

    Nadkarni, Prakash M; Ohno-Machado, Lucila; Chapman, Wendy W


    To provide an overview and tutorial of natural language processing (NLP) and modern NLP-system design. This tutorial targets the medical informatics generalist who has limited acquaintance with the principles behind NLP and/or limited knowledge of the current state of the art. We describe the historical evolution of NLP, and summarize common NLP sub-problems in this extensive field. We then provide a synopsis of selected highlights of medical NLP efforts. After providing a brief description of common machine-learning approaches that are being used for diverse NLP sub-problems, we discuss how modern NLP architectures are designed, with a summary of the Apache Foundation's Unstructured Information Management Architecture. We finally consider possible future directions for NLP, and reflect on the possible impact of IBM Watson on the medical field.

  2. Visualizing Natural Language Descriptions: A Survey


    Hassani, Kaveh; Lee, Won-Sook


    A natural language interface exploits the conceptual simplicity and naturalness of the language to create a high-level user-friendly communication channel between humans and machines. One of the promising applications of such interfaces is generating visual interpretations of semantic content of a given natural language that can be then visualized either as a static scene or a dynamic animation. This survey discusses requirements and challenges of developing such systems and reports 26 graphi...

  3. Natural language processing techniques for automatic test ...

    Natural language processing techniques for automatic test questions generation using discourse connectives. ... Journal of Computer Science and Its Application.

  4. Knowledge representation and natural language processing

    Weischedel, R.M.


    In principle, natural language and knowledge representation are closely related. This paper investigates this by demonstrating how several natural language phenomena, such as definite reference, ambiguity, ellipsis, ill-formed input, figures of speech, and vagueness, require diverse knowledge sources and reasoning. The breadth of kinds of knowledge needed to represent morphology, syntax, semantics, and pragmatics is surveyed. Furthermore, several current issues in knowledge representation, such as logic versus semantic nets, general-purpose versus special-purpose reasoners, adequacy of first-order logic, wait-and-see strategies, and default reasoning, are illustrated in terms of their relation to natural language processing and how natural language impacts the issues.

  5. Natural radioactive isotopes in food of Polish population

    Pietrzak-Lis, Z.


    The contamination of basic food products and water by natural radioactive isotopes has been measured in two regions of Poland (Central Poland and the Silesia Region). The following isotopes have been taken into account: U-234, U-238, Th-228, Th-230, Th-232, Ra-226, Ra-228, Pb-210, Po-210. The annual intake of these isotopes by the regional population and the relative doses have been assessed for the typical diet of adults in Poland.

  6. Mobile speech and advanced natural language solutions

    CERN Document Server

    Markowitz, Judith


    Mobile Speech and Advanced Natural Language Solutions provides a comprehensive and forward-looking treatment of natural speech in the mobile environment. This fourteen-chapter anthology brings together lead scientists from Apple, Google, IBM, AT&T, Yahoo! Research and other companies, along with academicians, technology developers and market analysts.  They analyze the growing markets for mobile speech, new methodological approaches to the study of natural language, empirical research findings on natural language and mobility, and future trends in mobile speech.  Mobile Speech opens with a challenge to the industry to broaden the discussion about speech in mobile environments beyond the smartphone, to consider natural language applications across different domains.   Among the new natural language methods introduced in this book are Sequence Package Analysis, which locates and extracts valuable opinion-related data buried in online postings; microintonation as a way to make TTS truly human-like; and se...

  7. Generating natural language under pragmatic constraints

    CERN Document Server

    Hovy, Eduard H


    Recognizing that the generation of natural language is a goal-driven process, where many of the goals are pragmatic (i.e., interpersonal and situational) in nature, this book provides an overview of the role of pragmatics in language generation. Each chapter states a problem that arises in generation, develops a pragmatics-based solution, and then describes how the solution is implemented in PAULINE, a language generator that can produce numerous versions of a single underlying message, depending on its setting.

  8. Three-dimensional grammar in the brain: Dissociating the neural correlates of natural sign language and manually coded spoken language. (United States)

    Jednoróg, Katarzyna; Bola, Łukasz; Mostowski, Piotr; Szwed, Marcin; Boguszewski, Paweł M; Marchewka, Artur; Rutkowski, Paweł


    In several countries natural sign languages were considered inadequate for education. Instead, new sign-supported systems were created, based on the belief that spoken/written language is grammatically superior. One such system called SJM (system językowo-migowy) preserves the grammatical and lexical structure of spoken Polish and since the 1960s has been extensively employed in schools and on TV. Nevertheless, the Deaf community avoids using SJM for everyday communication, its preferred language being PJM (polski język migowy), a natural sign language, structurally and grammatically independent of spoken Polish and featuring classifier constructions (CCs). Here, for the first time, we compare, using fMRI, the neural bases of natural vs. devised communication systems. Deaf signers were presented with three types of signed sentences (SJM and PJM with/without CCs). Consistent with previous findings, PJM with CCs compared to either SJM or PJM without CCs recruited the parietal lobes. The reverse comparison revealed activation in the anterior temporal lobes, suggesting increased semantic combinatory processes in lexical sign comprehension. Finally, PJM compared with SJM engaged left posterior superior temporal gyrus and anterior temporal lobe, areas crucial for sentence-level speech comprehension. We suggest that activity in these two areas reflects greater processing efficiency for naturally evolved sign language. Copyright © 2015 Elsevier Ltd. All rights reserved.

  9. Specialist English as a foreign language for European public health: evaluation of competencies and needs among Polish and Lithuanian students. (United States)

    Sumskas, Linas; Czabanowska, Katarzyna; Bruneviciūte, Raimonda; Kregzdyte, Rima; Krikstaponyte, Zita; Ziomkiewicz, Anna


    Foreign languages are becoming an essential prerequisite for a successful career in all professions, including public health, in many countries. The expanding role of English as a mode of communication allows university graduates to plan and seek careers in English-speaking countries. The present study was carried out in the framework of the EU Leonardo da Vinci project "Specialist English as a foreign language for European public health." The study aimed to gain deeper insight into how English is perceived as a foreign language by Polish and Lithuanian public health students, what their level of language competence is, and which level of English proficiency they expect to use in the future. MATERIAL AND METHODS. A total of 246 respondents completed the special questionnaires in the autumn semester of 2005. A questionnaire form was developed by the international project team. For evaluation of English competences, the Language Passport (Common European Framework of Reference for Languages of the Council of Europe) was applied. RESULTS. Current self-rated proficiency in English was at the same level for Lithuanian (3.47+/-1.14) and Polish (3.31+/-0.83) respondents (P>0.05). The majority of respondents (88.6% of Lithuanian and 87.8% of Polish) reported using English for their current studies. Respondents reported a significant increase in the need for a higher level of English proficiency in the future: mean scores provided by respondents changed from B1 level to B2 level. Respondents gave priority to less formal and practice-based interactive English teaching methods (going abroad, contacts with native speakers) in comparison with theory-oriented methods of learning (self-studying, Internet courses). CONCLUSIONS. Similar levels of English proficiency in all five areas of language skills were established in Polish and Lithuanian university students. Respondents gave more priority to less formal and practice-based interactive

  10. Polish origins of the Faculty of Mathematical and Natural Sciences of the University of Fribourg and the Polish contribution to the Fribourg industrial revolution (in Polish)

    Wojciech KOCUREK


    The article is dedicated to high-tech companies founded by Poles at the end of the 19th century in the rural canton of Fribourg in Switzerland. The text is divided into two parts. In the first part, the author attempts to present the economic, social and political reality of Fribourg in a period of intense industrialization in the world and the formation of the liberal free market system. In this rapidly changing reality, the new Catholic-conservative authorities of the canton tried to establish a comprehensive, but also different, system of a “Christian republic”, whose aim was to achieve social justice consistent with the teachings of the Gospel. In order to complete the project, the cantonal government did not shy away from using the possibilities and measures offered by the contemporary world. Decision-makers, led by Georges Python, needed support from a society that was aware of the changes. Due to this fact, it became necessary to establish a university capable of shaping new attitudes and views. However, the costs significantly exceeded the financial capabilities of the agricultural and relatively poor canton of Fribourg. In these less favourable circumstances, a conscious policy of industrialization was the way out of the deadlock. Newly created industrial institutions were to contribute to an increase of cash inflows to the canton and thus allow for the financing of the university, which would also become an intellectual foundation for the emerging industry. The activity of Polish scientists, which is the subject of the second part of the article, matched this philosophy perfectly. The Poles invited to cooperate with Python, i.e. Józef Wierusz-Kowalski, Ignacy Mościcki and Jan Modzelewski, created the foundations of the Faculty of Mathematical and Natural Sciences at the University of Fribourg. As members of the faculty, in addition to teaching, they conducted research into, among other things, nitric acid

  11. A System for Natural Language Sentence Generation. (United States)

    Levison, Michael; Lessard, Gregory


    Describes the natural language computer program, "Vinci." Explains that using an attribute grammar formalism, Vinci can simulate components of several current linguistic theories. Considers the design of the system and its applications in linguistic modelling and second language acquisition research. Notes Vinci's uses in linguistics…

  12. Natural Language Generation from Pictographs


    Sevens, Leen; Vandeghinste, Vincent; Schuurman, Ineke; Van Eynde, Frank


    We present a Pictograph-to-Text translation system for people with Intellectual or Developmental Disabilities (IDD). The system translates pictograph messages, consisting of one or more pictographs, into Dutch text using WordNet links and an n-gram language model. We also provide several pictograph input methods assisting the users in selecting the appropriate pictographs.
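The role an n-gram language model can play in choosing among candidate output sentences, as mentioned in the abstract above, can be sketched roughly as follows. This is only an illustration under stated assumptions: the toy English corpus, the candidate sentences, the add-one smoothing, and the function names `train_bigram` and `log_prob` are all hypothetical and are not taken from the described system, which targets Dutch and also relies on WordNet links.

```python
import math
from collections import Counter

def train_bigram(sentences):
    """Count unigram contexts and bigrams over whitespace-tokenized sentences."""
    unigrams, bigrams = Counter(), Counter()
    vocab = {"<s>", "</s>"}
    for sentence in sentences:
        tokens = ["<s>"] + sentence.split() + ["</s>"]
        vocab.update(tokens)
        unigrams.update(tokens[:-1])          # every token that starts a bigram
        bigrams.update(zip(tokens, tokens[1:]))
    return unigrams, bigrams, len(vocab)

def log_prob(sentence, model):
    """Log-probability of a sentence under an add-one-smoothed bigram model."""
    unigrams, bigrams, vocab_size = model
    tokens = ["<s>"] + sentence.split() + ["</s>"]
    return sum(
        math.log((bigrams[(a, b)] + 1) / (unigrams[a] + vocab_size))
        for a, b in zip(tokens, tokens[1:])
    )

# Hypothetical toy corpus and candidate word orders (assumptions for illustration).
corpus = ["i eat an apple", "i eat bread", "you eat an apple"]
model = train_bigram(corpus)
candidates = ["i eat apple an", "i eat an apple"]
best = max(candidates, key=lambda s: log_prob(s, model))
print(best)
```

Here the model prefers the candidate whose word pairs were seen in the training text; a real system would train on a large corpus and use a higher-order model.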

  13. Natural Language Description of Emotion (United States)

    Kazemzadeh, Abe


    This dissertation studies how people describe emotions with language and how computers can simulate this descriptive behavior. Although many non-human animals can express their current emotions as social signals, only humans can communicate about emotions symbolically. This symbolic communication of emotion allows us to talk about emotions that we…

  14. Bayesian natural language semantics and pragmatics

    CERN Document Server

    Zeevat, Henk


    The contributions in this volume focus on the Bayesian interpretation of natural languages, which is widely used in areas of artificial intelligence, cognitive science, and computational linguistics. This is the first volume to take up topics in Bayesian Natural Language Interpretation and make proposals based on information theory, probability theory, and related fields. The methodologies offered here extend to the target semantic and pragmatic analyses of computational natural language interpretation. Bayesian approaches to natural language semantics and pragmatics are based on methods from signal processing and the causal Bayesian models pioneered especially by Pearl. In signal processing, the Bayesian method finds the most probable interpretation by finding the one that maximizes the product of the prior probability and the likelihood of the interpretation. It thus stresses the importance of a production model for interpretation as in Grice's contributions to pragmatics or in interpretation by abduction.
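The selection rule described in the abstract above, choosing the interpretation that maximizes the product of prior probability and likelihood, can be sketched in a few lines. The candidate interpretations, the probability values, and the function name `most_probable_interpretation` are hypothetical, chosen only to illustrate the arithmetic; they do not come from the volume.

```python
def most_probable_interpretation(priors, likelihoods):
    """Return the interpretation maximizing prior * likelihood, plus all scores."""
    scores = {name: priors[name] * likelihoods[name] for name in priors}
    return max(scores, key=scores.get), scores

# Hypothetical example: two readings of the word "bank" in a sentence about fishing.
priors = {"financial institution": 0.7, "river edge": 0.3}        # P(interpretation)
likelihoods = {"financial institution": 0.1, "river edge": 0.6}   # P(utterance | interpretation)

best, scores = most_probable_interpretation(priors, likelihoods)
print(best, scores)
```

In this made-up case the less frequent reading wins because the utterance is much more likely to be produced under it, which is the point of weighing the production model against the prior.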

    Adelphi, MD 20783-1197. This technical note provides a brief description of a Java library for Arabic natural language processing (NLP) containing code ... for training and applying the Arabic NLP system described in the paper "A Cross-Task Flexible Transition Model for Arabic Tokenization, Affix ... and also English) natural language processing (NLP), containing code for training and applying the Arabic NLP system described in Stephen Tratz’s

  16. Evolution, brain, and the nature of language. (United States)

    Berwick, Robert C; Friederici, Angela D; Chomsky, Noam; Bolhuis, Johan J


    Language serves as a cornerstone for human cognition, yet much about its evolution remains puzzling. Recent research on this question parallels Darwin's attempt to explain both the unity of all species and their diversity. What has emerged from this research is that the unified nature of human language arises from a shared, species-specific computational ability. This ability has identifiable correlates in the brain and has remained fixed since the origin of language approximately 100 thousand years ago. Although songbirds share with humans a vocal imitation learning ability, with a similar underlying neural organization, language is uniquely human. Copyright © 2012 Elsevier Ltd. All rights reserved.

  17. Thought beyond language: neural dissociation of algebra and natural language. (United States)

    Monti, Martin M; Parsons, Lawrence M; Osherson, Daniel N


    A central question in cognitive science is whether natural language provides combinatorial operations that are essential to diverse domains of thought. In the study reported here, we addressed this issue by examining the role of linguistic mechanisms in forging the hierarchical structures of algebra. In a 3-T functional MRI experiment, we showed that processing of the syntax-like operations of algebra does not rely on the neural mechanisms of natural language. Our findings indicate that processing the syntax of language elicits the known substrate of linguistic competence, whereas algebraic operations recruit bilateral parietal brain regions previously implicated in the representation of magnitude. This double dissociation argues against the view that language provides the structure of thought across all cognitive domains.

  18. A Natural Logic for Natural-Language Knowledge Bases

    Andreasen, Troels; Styltsvig, Henrik Bulskov; Jensen, Per Anker


    We describe a natural logic for computational reasoning with a regimented fragment of natural language. The natural logic comes with intuitive inference rules enabling deductions and with an internal graph representation facilitating conceptual path finding between pairs of terms as an approach t...

  19. A Natural Logic for Natural-language Knowledge Bases

    Andreasen, Troels; Bulskov, Henrik; Jensen, Per Anker


    We describe a natural logic for computational reasoning with a regimented fragment of natural language. The natural logic comes with intuitive inference rules enabling deductions and with an internal graph representation facilitating conceptual path finding between pairs of terms as an approach t...

  20. Efficiency of a natural wetland for effluent polishing of a septic tank

    Z. Yousefi


    Full Text Available Wetlands are nowadays used as a polishing system for classical wastewater treatment, in addition to their other uses. Wetland systems are usually inexpensive compared with high-technology treatment systems. The objective of this study is to evaluate natural wetland treatment for polishing a septic tank effluent. The research extended over 10 months on a natural wetland system in Pardis of Mazandaran University of Medical Sciences, to the north-east of the health faculty. Wastewater quality indices such as pH, EC, BOD, COD, TSS, nitrate, phosphorus, ammonia and temperature were measured on samples of the influent and effluent of the system. The study showed that the system works as a buffering system for flow and pH. Results indicated that the average BOD5 and TSS efficiencies were 67.70 and 83%, respectively. COD efficiency was 65.26 and 80% for low- and moderate-strength influent, respectively. Average phosphorus, NH3 and nitrate in the effluent were 0.032 mg/L, 7.18 and 0.036 mg/L, respectively. Efficiencies for ammonia and phosphorus were slightly increased in the best condition. Based on the results of this study, a natural wetland can be successful in removing BOD, COD, and TSS from classical septic tank effluent, but it has no considerable effect on nitrogen and phosphorus removal.

  1. Prediction During Natural Language Comprehension. (United States)

    Willems, Roel M; Frank, Stefan L; Nijhof, Annabel D; Hagoort, Peter; van den Bosch, Antal


    The notion of prediction is studied in cognitive neuroscience with increasing intensity. We investigated the neural basis of 2 distinct aspects of word prediction, derived from information theory, during story comprehension. We assessed the effect of entropy of next-word probability distributions as well as surprisal. A computational model determined entropy and surprisal for each word in 3 literary stories. Twenty-four healthy participants listened to the same 3 stories while their brain activation was measured using fMRI. Reversed speech fragments were presented as a control condition. Brain areas sensitive to entropy were left ventral premotor cortex, left middle frontal gyrus, right inferior frontal gyrus, left inferior parietal lobule, and left supplementary motor area. Areas sensitive to surprisal were left inferior temporal sulcus ("visual word form area"), bilateral superior temporal gyrus, right amygdala, bilateral anterior temporal poles, and right inferior frontal sulcus. We conclude that prediction during language comprehension can occur at several levels of processing, including at the level of word form. Our study exemplifies the power of combining computational linguistics with cognitive neuroscience, and additionally underlines the feasibility of studying continuous spoken language materials with fMRI. © The Author 2015. Published by Oxford University Press. All rights reserved.
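
    The abstract does not spell out how the two measures were computed. Under the standard information-theoretic definitions, surprisal is the negative log probability of the word that actually occurred, and entropy is the expected surprisal over the next-word distribution. A minimal Python sketch, with a made-up distribution and base-2 logarithms as assumptions:

```python
import math


def surprisal(next_word_probs, word):
    """Surprisal in bits of the word that actually occurred: -log2 P(word | context)."""
    return -math.log2(next_word_probs[word])


def entropy(next_word_probs):
    """Entropy in bits of the next-word distribution (uncertainty before the word is heard)."""
    return -sum(p * math.log2(p) for p in next_word_probs.values() if p > 0)


# Hypothetical next-word distribution after some story context.
next_word_probs = {"dog": 0.5, "cat": 0.25, "bird": 0.25}

h = entropy(next_word_probs)           # uncertainty about the upcoming word
s = surprisal(next_word_probs, "cat")  # unexpectedness of the word that came
print(h, s)
```

    Entropy is defined before the next word is encountered, whereas surprisal depends on the word that actually arrives, which is why the two can be studied as distinct aspects of prediction.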


    Paulina Pietras


    Full Text Available The present article is aimed at examining the category of the reality status by discussing the dichotomy “realis / irrealis” in the context of the categories of modality, habituality and futurity. Prototype analysis is juxtaposed with scope analysis, and the category of the habitual is discussed from the typological perspective as well as from the perspective of its connection with the category of futurity. The paper presents aspect diversity of habituals (perfective and imperfective aspect) and its contextual implications as well as the differentiation between the habitual and modality. A special focus is on the prototype analysis and its application instances in Polish, English and Hebrew. The primary objective of the paper is to show that, although it is possible to treat irrealis as a notional category, the habituals in Polish and many other Slavic languages (e.g. Czech) should be identified with the modality domain rather than the irrealis category. The paper is also an attempt to provide an insight into the distinction between (ir)realis and encoding systems of modalities as the habitual aspect displays modal category features in many languages (including Polish).

  3. Natural language generation of surgical procedures. (United States)

    Wagner, J C; Rogers, J E; Baud, R H; Scherrer, J R


    A number of compositional Medical Concept Representation systems are being developed. Although these provide for a detailed conceptual representation of the underlying information, they have to be translated back to natural language for use by end-users and applications. The GALEN programme has been developing one such representation and we report here on a tool developed to generate natural language phrases from the GALEN conceptual representations. This tool can be adapted to different source modelling schemes and to different destination languages or sublanguages of a domain. It is based on a multilingual approach to natural language generation, realised through a clean separation of the domain model from the linguistic model and their link by well defined structures. Specific knowledge structures and operations have been developed for bridging between the modelling 'style' of the conceptual representation and natural language. Using the example of the scheme developed for modelling surgical operative procedures within the GALEN-IN-USE project, we show how the generator is adapted to such a scheme. The basic characteristics of the surgical procedures scheme are presented together with the basic principles of the generation tool. Using worked examples, we discuss the transformation operations which change the initial source representation into a form which can more directly be translated to a given natural language. In particular, the linguistic knowledge which has to be introduced--such as definitions of concepts and relationships--is described. We explain the overall generator strategy and how particular transformation operations are triggered by language-dependent and conceptual parameters. Results are shown for generated French phrases corresponding to surgical procedures from the urology domain.

  4. Semantic structures advances in natural language processing

    CERN Document Server

    Waltz, David L


    Natural language understanding is central to the goals of artificial intelligence. Any truly intelligent machine must be capable of carrying on a conversation: dialogue, particularly clarification dialogue, is essential if we are to avoid disasters caused by the misunderstanding of the intelligent interactive systems of the future. This book is an interim report on the grand enterprise of devising a machine that can use natural language as fluently as a human. What has really been achieved since this goal was first formulated in Turing's famous test? What obstacles still need to be overcome?

  5. Theoretical approaches to natural language understanding

    This book discusses the following: Computational Linguistics, Artificial Intelligence, Linguistics, Philosophy, and Cognitive Science and the current state of natural language understanding. Three topics form the focus for discussion; these topics include aspects of grammars, aspects of semantics/pragmatics, and knowledge representation.

  6. The nature of pragmatic language impairment

    NARCIS (Netherlands)

    Ketelaars, M.P.


    The present dissertation reports on research into the nature of Pragmatic Language Impairment (PLI) in children aged 4 to 7 in the Netherlands. First, the possibility of screening for PLI in the general population is examined. Results show that this is indeed possible as well as feasible. Second, an

  7. Natural Language Generation for dialogue: system survey

    Theune, Mariet

    Many natural language dialogue systems make use of `canned text' for output generation. This approach may be sufficient for dialogues in restricted domains where system utterances are short and simple and use fixed expressions (e.g., slot filling dialogues in the ticket reservation or travel

  8. Natural Language Navigation Support in Virtual Reality

    NARCIS (Netherlands)

    van Luin, J.; Nijholt, Antinus; op den Akker, Hendrikus J.A.; Giagourta, V.; Strintzis, M.G.


    We describe our work on designing a natural language accessible navigation agent for a virtual reality (VR) environment. The agent is part of an agent framework, which means that it can communicate with other agents. Its navigation task consists of guiding the visitors in the environment and to

  9. Brain readiness and the nature of language. (United States)

    Bouchard, Denis


    To identify the neural components that make a brain ready for language, it is important to have well defined linguistic phenotypes, to know precisely what language is. There are two central features to language: the capacity to form signs (words), and the capacity to combine them into complex structures. We must determine how the human brain enables these capacities. A sign is a link between a perceptual form and a conceptual meaning. Acoustic elements and content elements, are already brain-internal in non-human animals, but as categorical systems linked with brain-external elements. Being indexically tied to objects of the world, they cannot freely link to form signs. A crucial property of a language-ready brain is the capacity to process perceptual forms and contents offline, detached from any brain-external phenomena, so their "representations" may be linked into signs. These brain systems appear to have pleiotropic effects on a variety of phenotypic traits and not to be specifically designed for language. Syntax combines signs, so the combination of two signs operates simultaneously on their meaning and form. The operation combining the meanings long antedates its function in language: the primitive mode of predication operative in representing some information about an object. The combination of the forms is enabled by the capacity of the brain to segment vocal and visual information into discrete elements. Discrete temporal units have order and juxtaposition, and vocal units have intonation, length, and stress. These are primitive combinatorial processes. So the prior properties of the physical and conceptual elements of the sign introduce combinatoriality into the linguistic system, and from these primitive combinatorial systems derive concatenation in phonology and combination in morphosyntax. 
    Given the nature of language, a key feature to our understanding of the language-ready brain is to be found in the mechanisms in human brains that enable the unique

  10. Brain readiness and the nature of language

    Denis Bouchard


    Full Text Available To identify the neural components that make a brain ready for language, it is important to have well defined linguistic phenotypes, to know precisely what language is. There are two central features to language: the capacity to form signs (words), and the capacity to combine them into complex structures. We must determine how the human brain enables these capacities. A sign is a link between a perceptual form and a conceptual meaning. Acoustic elements and content elements, are already brain-internal in non-human animals, but as categorical systems linked with brain-external elements. Being indexically tied to objects of the world, they cannot freely link to form signs. A crucial property of a language-ready brain is the capacity to process perceptual forms and contents offline, detached from any brain-external phenomena, so their representations may be linked into signs. These brain systems appear to have pleiotropic effects on a variety of phenotypic traits and not to be specifically designed for language. Syntax combines signs, so the combination of two signs operates simultaneously on their meaning and form. The operation combining the meanings long antedates its function in language: the primitive mode of predication operative in representing some information about an object. The combination of the forms is enabled by the capacity of the brain to segment vocal and visual information into discrete elements. Discrete temporal units have order and juxtaposition, and vocal units have intonation, length, and stress. These are primitive combinatorial processes. So the prior properties of the physical and conceptual elements of the sign introduce combinatoriality into the linguistic system, and from these primitive combinatorial systems derive concatenation in phonology and combination in morphosyntax. Given the nature of language, a key feature to our understanding of the language-ready brain is to be found in the mechanisms in human brains that

  11. Natural language interface for nuclear data bases

    International Nuclear Information System (INIS)

    Heger, A.S.; Koen, B.V.


    A natural language interface has been developed for access to information from a data base, simulating a nuclear plant reliability data system (NPRDS), one of the several existing data bases serving the nuclear industry. In the last decade, the importance of information has been demonstrated by the impressive diffusion of data base management systems. The present methods that are employed to access data bases fall into two main categories of menu-driven systems and use of data base manipulation languages. Both of these methods are currently used by NPRDS. These methods have proven to be tedious, however, and require extensive training by the user for effective utilization of the data base. Artificial intelligence techniques have been used in the development of several intelligent front ends for data bases in nonnuclear domains. Lunar is a natural language program for interface to a data base describing moon rock samples brought back by Apollo. Intellect is one of the first data base question-answering systems that was commercially available in the financial area. Ladder is an intelligent data base interface that was developed as a management aid to Navy decision makers. A natural language interface for nuclear data bases that can be used by nonprogrammers with little or no training provides a means for achieving this goal for this industry

  12. Task planning systems with natural language interface

    International Nuclear Information System (INIS)

    Kambayashi, Shaw; Uenaka, Junji


    In this report, a natural language analyzer and two different task planning systems are described. In 1988, we introduced a Japanese language analyzer named CS-PARSER for the input interface of the task planning system in the Human Acts Simulation Program (HASP). For the purpose of high-speed analysis, we have modified the dictionary system of the CS-PARSER by using a C language description. It is found that the new dictionary system is very useful for high-speed analysis and efficient maintenance of the dictionary. For the study of the task planning problem, we have modified a story generating system named Micro TALE-SPIN to generate a story written in Japanese sentences. We have also constructed a planning system with a natural language interface by using the CS-PARSER. Task planning processes and related knowledge bases of these systems are explained. A concept design for a new task planning system is also discussed, based on evaluations of the above-mentioned systems. (author)

  13. Natural language processing tools for computer assisted language learning

    Vandeventer Faltin, Anne


    Full Text Available This paper illustrates the usefulness of natural language processing (NLP) tools for computer assisted language learning (CALL) through the presentation of three NLP tools integrated within a CALL software for French. These tools are (i) a sentence structure viewer; (ii) an error diagnosis system; and (iii) a conjugation tool. The sentence structure viewer helps language learners grasp the structure of a sentence, by providing lexical and grammatical information. This information is derived from a deep syntactic analysis. Two different outputs are presented. The error diagnosis system is composed of a spell checker, a grammar checker, and a coherence checker. The spell checker makes use of alpha-codes, phonological reinterpretation, and some ad hoc rules to provide correction proposals. The grammar checker employs constraint relaxation and phonological reinterpretation as diagnosis techniques. The coherence checker compares the underlying "semantic" structures of a stored answer and of the learners' input to detect semantic discrepancies. The conjugation tool is a resource with enhanced capabilities when put on an electronic format, enabling searches from inflected and ambiguous verb forms.

  14. Natural language generation in health care. (United States)

    Cawsey, A J; Webber, B L; Jones, R B


    Good communication is vital in health care, both among health care professionals, and between health care professionals and their patients. And well-written documents, describing and/or explaining the information in structured databases may be easier to comprehend, more edifying, and even more convincing than the structured data, even when presented in tabular or graphic form. Documents may be automatically generated from structured data, using techniques from the field of natural language generation. These techniques are concerned with how the content, organization and language used in a document can be dynamically selected, depending on the audience and context. They have been used to generate health education materials, explanations and critiques in decision support systems, and medical reports and progress notes.

  15. Diagnostic validity Polish language version of the questionnaire MINI-KID (Mini International Neuropsychiatry Interview for Children and Adolescent). (United States)

    Adamowska, Sylwia; Adamowski, Tomasz; Frydecka, Dorota; Kiejna, Andrzej


    For over forty years, structured interviews for clinical and epidemiological research in child and adolescent psychiatry have been developed to increase the validity and reliability of diagnoses according to classification systems (DSM and ICD). The aim of the study is to assess the validity of the Polish version of MINI-KID (Mini International Neuropsychiatric Interview for Children and Adolescents) in comparison to clinical diagnosis made by a specialist in the field of child and adolescent psychiatry. There were 140 patients included in the study (93 boys, 66.4%, mean age 11.8±3.0 and 47 girls, 33.5%, mean age 14.0±2.9). All the patients were diagnosed by the specialist in the field of child and adolescent psychiatry according to ICD-10 criteria and by an independent interviewer with the Polish version of MINI-KID (version 2.0, 2001). There was higher agreement between clinical diagnoses and diagnoses based on the MINI-KID interview with respect to eating disorders and externalizing disorders (κ 0.43-0.56) and lower agreement in internalizing disorders (κ 0.13-0.45). In the clinical interview, there was a smaller number of diagnostic categories (maximum 3 diagnoses per patient) in comparison to MINI-KID (maximum 10 diagnoses per patient), and a smaller percentage of patients with one diagnosis (65.7%) in comparison to the MINI-KID interview (72%). Our study has shown satisfactory validity parameters of the MINI-KID questionnaire, promoting its use in clinical and epidemiological settings. The Mini International Neuropsychiatry Interview for Children and Adolescent (MINI-KID) is the first structured diagnostic interview for assessing mental status in children and adolescents that has been translated into Polish. Our validation study demonstrated satisfactory psychometric properties of the questionnaire, enabling its use in clinical practice and in research projects. Copyright © 2014 Elsevier Inc. All rights reserved.
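
    The κ values above are presumably Cohen's kappa, the usual chance-corrected agreement statistic, although the abstract does not say so explicitly. A minimal Python sketch of that statistic for two raters follows; the function name and the ratings are invented for illustration and are not data from the study.

```python
from collections import Counter


def cohens_kappa(rater_a, rater_b):
    """Cohen's kappa for two equally long lists of categorical ratings.

    kappa = (p_o - p_e) / (1 - p_e), where p_o is the observed agreement and
    p_e the agreement expected by chance from each rater's label frequencies.
    Undefined (division by zero) when p_e == 1.
    """
    n = len(rater_a)
    p_o = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    count_a, count_b = Counter(rater_a), Counter(rater_b)
    labels = set(rater_a) | set(rater_b)
    p_e = sum(count_a[label] * count_b[label] for label in labels) / (n * n)
    return (p_o - p_e) / (1 - p_e)


# Hypothetical presence (1) or absence (0) of one diagnosis in ten patients.
clinician = [1, 1, 0, 0, 1, 0, 0, 0, 1, 0]
interview = [1, 0, 0, 0, 1, 0, 1, 0, 1, 0]

kappa = cohens_kappa(clinician, interview)
print(round(kappa, 3))
```

    In this toy case the raters agree on 8 of 10 patients, but because agreement of 0.52 would be expected by chance alone, kappa is noticeably lower than the raw agreement rate.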


    Maciek Czerwiński


    Full Text Available Many languages today are open to the influence of the English language. Although this is beyond doubt, the process of adaptation can run quite differently in different languages. The article focuses on such problems. The adaptation of every single loanword (and thus of every English loanword as well) takes place on three basic levels: phonological, morphological and semantic. On every level, we find tendencies particular to a given language. To research them properly, we should look at another language and compare them.

  17. The social impact of natural language processing

    Hovy, Dirk; Spruit, Shannon

    Research in natural language processing (NLP) used to be mostly performed on anonymous corpora, with the goal of enriching linguistic analysis. Authors were either largely unknown or public figures. As we increasingly use more data from social media, this situation has changed: users are now individually identifiable, and the outcome of NLP experiments and applications can have a direct effect on their lives. This change should spawn a debate about the ethical implications of NLP, but until now, the internal discourse in the field has not followed the technological development. This position paper...

  18. ‘It is English and there is no alternative’: intersectionality, language and social/organizational differentiation of Polish migrants in the UK


    Johansson, Marjana; Śliwa, Martyna


    In this paper, we employ an intersectional approach to explore language as a process of social and organizational differentiation of Polish migrant workers in the UK. In addition to intersectionality, our conceptual framework is informed by a sociolinguistic perspective on globalization, which accounts for the social differentiation produced by language in transnational contexts. Empirically, the paper is based on a qualitative study employing life history interviews. Our findings show that f...

  19. Investigating the attitudes towards learning a third language and its culture in Polish junior high school


    Kiermasz, Zuzanna


    It is believed that attitudes to languages and culture tend to affect achievement in foreign language learning (Baker, 1997). Thus, this factor may be seen as crucial when it comes to the discrepancies in attainment in different languages learnt by the same students. Therefore, it seems vital to investigate variation in attitudes towards both learning L2 together with the approach to the L2 culture and the corresponding issues with respect to L3. Nevertheless, the general at...

    Hoard, James E.


    Integrating diverse information sources and application software in a principled and general manner will require a very capable advanced information management (AIM) system. In particular, such a system will need a comprehensive addressing scheme to locate the material in its docuverse. It will also need a natural language processing (NLP) system of great sophistication. It seems that the NLP system must serve three functions. First, it provides a natural language interface (NLI) for the users. Second, it serves as the core component that understands and makes use of the real-world interpretations (RWIs) contained in the docuverse. Third, it enables the reasoning specialists (RSs) to arrive at conclusions that can be transformed into procedures that will satisfy the users' requests. The best candidate for an intelligent agent that can satisfactorily make use of RSs and transform documents (TDs) appears to be an object oriented data base (OODB). OODBs have, apparently, an inherent capacity to use the large numbers of RSs and TDs that will be required by an AIM system and an inherent capacity to use them in an effective way.

  1. An Overview of Computer-Based Natural Language Processing. (United States)

    Gevarter, William B.

    Computer-based Natural Language Processing (NLP) is the key to enabling humans and their computer-based creations to interact with machines using natural languages (English, Japanese, German, etc.) rather than formal computer languages. NLP is a major research area in the fields of artificial intelligence and computational linguistics. Commercial…

  2. On the Relationship between a Computational Natural Logic and Natural Language

    Andreasen, Troels; Bulskov, Henrik; Nilsson, Jørgen Fischer


    This paper makes a case for adopting appropriate forms of natural logic as target language for computational reasoning with descriptive natural language. Natural logics are stylized fragments of natural language where reasoning can be conducted directly by natural reasoning rules reflecting intuitive reasoning in natural language. The approach taken in this paper is to extend natural logic stepwise with a view to covering successively larger parts of natural language. We envisage applications for computational querying and reasoning, in particular within the life-sciences.

  3. Perceived teacher support and language anxiety in Polish secondary school EFL learners

    Directory of Open Access Journals (Sweden)

    Ewa Piechurska-Kuciel


    Full Text Available The teacher’s role is vital, both in respect to achieving academic goals, and with regard to the regulation of emotional and social processes. Positive perceptions of teacher support can endorse psychological wellness, and help maintain students’ academic interests, higher academic achievement and more positive peer relationships. The teacher who shows understanding, empathy and consistency in their behavior helps students start forming an identity, which will assist them in coping with stress and anxiety directly connected with the foreign language learning process (language anxiety. The main aim of this research is to investigate the relationship between teacher support and language anxiety levels. It is speculated that teacher support functions as a buffer from the effects of negative emotions, such as language anxiety experienced in the foreign language learning process. The participants of the study were 621 secondary grammar school students whose responses to a questionnaire were the main data source. The results of the study demonstrate that students with higher levels of teacher support experience lower language anxiety levels in comparison to their peers with lower levels of teacher support. Students who have a feeling that they can count on the instructor’s help, advice, assistance, or backing manage the learning process more successfully. They evaluate their language abilities highly and receive better final grades. Nevertheless, gender and residential location do not moderate teacher support and language anxiety due to the specificity of the sample consisting of novice secondary grammar school students.

  4. Phenolic content, antioxidant and antibacterial activity of selected natural sweeteners available on the Polish market. (United States)

    Grabek-Lejko, Dorota; Tomczyk-Ulanowska, Kinga


    Seventeen natural sweeteners available on the Polish market were screened for total phenolic content, by the Folin-Ciocalteu method, and for antioxidant activity, using the ferric reducing antioxidant power (FRAP) assay and the 2,2'-Azinobis (3-ethylbenzthiazoline-6-sulphonic acid) radical cation decolorization assay (ABTS(·+)). In addition, we analyzed antibacterial activities against Staphylococcus aureus strains: both those susceptible and those resistant to methicillin (MRSA). The results of the study showed that total phenolic content, antioxidant activity and antibacterial activity differ widely among different samples of sweeteners. Phenolic content, expressed as a gallic acid equivalent, ranged from 0 mg kg(-1) in white, refined sugar, xylitol and wheat malt syrup to 11.4 g kg(-1) in sugarcane molasses. Antioxidant activity was lowest in refined white sugar, xylitol, brown beet sugar, liquid fructose, and rape honey; it was average in spelt syrup and corn syrup, and highest in sugar cane, beet molasses, date and barley syrups. Despite the great variety of sweeteners, a strong correlation was noted between the concentration of phenolics and antioxidant properties, as determined by the ABTS(·+) method (r = 0.97) and the FRAP assay (r = 0.77). The strongest antibacterial activity was observed in sugarcane molasses, which was lethal to S. aureus strains at 2 and 4% concentrations in medium for susceptible and MRSA strains respectively. Other sweeteners kill bacteria in 6-15% solutions, whereas some did not show any antibacterial activities against S. aureus strains, even at 20% concentrations. Due to their high antioxidant and antibacterial activities, some of the tested sweeteners have potential therapeutic value as supporting agents in antibiotic therapy.

  5. Understanding and representing natural language meaning (United States)

    Waltz, D. L.; Maran, L. R.; Dorfman, M. H.; Dinitz, R.; Farwell, D.


  6. Mathematical Formula Search using Natural Language Queries

    Directory of Open Access Journals (Sweden)

    YANG, S.


    Full Text Available This paper presents how to search mathematical formulae written in MathML when given plain words as a query. Since the proposed method allows natural language queries like the traditional Information Retrieval for the mathematical formula search, users do not need to enter any complicated math symbols and to use any formula input tool. For this, formula data is converted into plain texts, and features are extracted from the converted texts. In our experiments, we achieve an outstanding performance, an MRR of 0.659. In addition, we introduce how to utilize formula classification for formula search. By using class information, we finally achieve an improved performance, an MRR of 0.690.
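
    MRR (mean reciprocal rank) averages, over all queries, the reciprocal of the rank at which the first correct result appears. The Python sketch below shows the usual definition; the function name, the formula identifiers, and the rankings are hypothetical and unrelated to the paper's data.

```python
def mean_reciprocal_rank(ranked_results, relevant):
    """Mean reciprocal rank over a set of queries.

    ranked_results : one ranked list of result ids per query
    relevant       : the single correct result id for each query
    A query whose correct result is not retrieved contributes 0.
    """
    total = 0.0
    for results, target in zip(ranked_results, relevant):
        for rank, item in enumerate(results, start=1):
            if item == target:
                total += 1.0 / rank
                break
    return total / len(relevant)


# Three hypothetical queries: correct formula at rank 2, at rank 1, and missing.
ranked_results = [["f2", "f1", "f3"], ["f5", "f4"], ["f9", "f8"]]
relevant = ["f1", "f5", "f7"]

mrr = mean_reciprocal_rank(ranked_results, relevant)
print(mrr)
```

    Read this way, an MRR of about 0.66 means the correct formula tends to appear near the top of the result list, between the first and second positions on average in reciprocal terms.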

  7. The social impact of natural language processing

    Hovy, Dirk; Spruit, Shannon

    Research in natural language processing (NLP) used to be mostly performed on anonymous corpora, with the goal of enriching linguistic analysis. Authors were either largely unknown or public figures. As we increasingly use more data from social media, this situation has changed: users are now individually identifiable, and the outcome of NLP experiments and applications can have a direct effect on their lives. This change should spawn a debate about the ethical implications of NLP, but until now, the internal discourse in the field has not followed the technological development. This position paper identifies a number of social implications that NLP research may have, and discusses their ethical significance, as well as ways to address them.

  8. Quantum Algorithms for Compositional Natural Language Processing

    Directory of Open Access Journals (Sweden)

    William Zeng


    Full Text Available We propose a new application of quantum computing to the field of natural language processing. Ongoing work in this field attempts to incorporate grammatical structure into algorithms that compute meaning. In (Coecke, Sadrzadeh and Clark, 2010), the authors introduce such a model (the CSC model) based on tensor product composition. While this algorithm has many advantages, its implementation is hampered by the large classical computational resources that it requires. In this work we show how computational shortcomings of the CSC approach could be resolved using quantum computation (possibly in addition to existing techniques for dimension reduction). We address the value of quantum RAM (Giovannetti, 2008) for this model and extend an algorithm from Wiebe, Braun and Lloyd (2012) into a quantum algorithm to categorize sentences in CSC. Our new algorithm demonstrates a quadratic speedup over classical methods under certain conditions.

  9. A Tableau Prover for Natural Logic and Language

    Abzianidze, Lasha


    Modeling the entailment relation over sentences is one of the generic problems of natural language understanding. In order to account for this problem, we design a theorem prover for Natural Logic, a logic whose terms resemble natural language expressions. The prover is based on an analytic tableau

  10. Capturing and Modeling Domain Knowledge Using Natural Language Processing Techniques

    National Research Council Canada - National Science Library

    Auger, Alain


    .... Initiated in 2004 at Defense Research and Development Canada (DRDC), the SACOT knowledge engineering research project is currently investigating, developing and validating innovative natural language processing (NLP...

  11. Natural language solution to a Tuff problem

    International Nuclear Information System (INIS)

    Langkopf, B.S.; Mallory, L.H.


    A scientific data base, the Tuff Data Base, is being created at Sandia National Laboratories on the Cyber 170/855, using System 2000. It is being developed for use by scientists and engineers investigating the feasibility of locating a high-level radioactive waste repository in tuff (a type of volcanic rock) at Yucca Mountain on and adjacent to the Nevada Test Site. This project, the Nevada Nuclear Waste Storage Investigations (NNWSI) Project, is managed by the Nevada Operations Office of the US Department of Energy. A user-friendly interface, PRIMER, was developed that uses the Self-Contained Facility (SCF) command SUBMIT and System 2000 Natural Language functions and parametric strings that are schema resident. The interface was designed to: (1) allow users, with or without computer experience or keyboard skill, to sporadically access data in the Tuff Data Base; (2) produce retrieval capabilities for the user quickly; and (3) acquaint the users with the data in the Tuff Data Base. This paper gives a brief description of the Tuff Data Base Schema and the interface, PRIMER, which is written in Fortran V. 3 figures

  12. Policy-Based Management Natural Language Parser (United States)

    James, Mark


    The Policy-Based Management Natural Language Parser (PBEM) is a rules-based approach to enterprise management that can be used to automate certain management tasks. This parser simplifies the management of a given endeavor by establishing policies to deal with situations that are likely to occur. Policies are operating rules that can be referred to as a means of maintaining order, security, consistency, or other ways of successfully furthering a goal or mission. PBEM provides a way of managing configuration of network elements, applications, and processes via a set of high-level rules or business policies rather than managing individual elements, thus switching the control to a higher level. This software allows unique management rules (or commands) to be specified and applied to a cross-section of the Global Information Grid (GIG). This software embodies a parser that is capable of recognizing and understanding conversational English. Because all possible dialect variants cannot be anticipated, a unique capability was developed that parses based on conversation intent rather than the exact way the words are used. This software can increase productivity by enabling a user to converse with the system in conversational English to define network policies. PBEM can be used in both manned and unmanned science-gathering programs. Because policy statements can be domain-independent, this software can be applied equally to a wide variety of applications.

  13. Natural language metaphors covertly influence reasoning.

    Paul H Thibodeau

    Full Text Available Metaphors pervade discussions of social issues like climate change, the economy, and crime. We ask how natural language metaphors shape the way people reason about such social issues. In previous work, we showed that describing crime metaphorically as a beast or a virus, led people to generate different solutions to a city's crime problem. In the current series of studies, instead of asking people to generate a solution on their own, we provided them with a selection of possible solutions and asked them to choose the best ones. We found that metaphors influenced people's reasoning even when they had a set of options available to compare and select among. These findings suggest that metaphors can influence not just what solution comes to mind first, but also which solution people think is best, even when given the opportunity to explicitly compare alternatives. Further, we tested whether participants were aware of the metaphor. We found that very few participants thought the metaphor played an important part in their decision. Further, participants who had no explicit memory of the metaphor were just as much affected by the metaphor as participants who were able to remember the metaphorical frame. These findings suggest that metaphors can act covertly in reasoning. Finally, we examined the role of political affiliation on reasoning about crime. The results confirm our previous findings that Republicans are more likely to generate enforcement and punishment solutions for dealing with crime, and are less swayed by metaphor than are Democrats or Independents.

  14. Presentation of the verbs in Bulgarian-Polish electronic dictionary

    Directory of Open Access Journals (Sweden)

    Ludmila Dimitrova


    Full Text Available This paper briefly discusses the presentation of the verbs in the first electronic Bulgarian-Polish dictionary that is currently being developed under a bilateral collaboration between IMI-BAS and ISS-PAS. Special attention is given to the digital entry classifiers that describe Bulgarian and Polish verbs. Problems related to the correspondence between natural language phenomena and their presentations are discussed. Some examples illustrate the different types of dictionary entries for verbs.

  15. Cognitive Neuroscience of Natural Language Use

    Willems, R.M.


    When we think of everyday language use, the first things that come to mind include colloquial conversations, reading and writing e-mails, sending text messages or reading a book. But can we study the brain basis of language as we use it in our daily lives? As a topic of study, the cognitive

  16. Bibliography of Research in Natural Language Generation (United States)


    593], pages International Conference of the IEEE Engineer- 351-363. ing in Medicine and Biology Society, volume 3, pages 1347-1348, New Orleans, LA...Conference on Machine Translation of Languages and Applied [1218] Ingrid Zukerman. Koalas are not bears: Gener- Language Analysis. pages 66-80. Her

  17. Do neural nets learn statistical laws behind natural language?

    Shuntaro Takahashi

    Full Text Available The performance of deep learning in natural language processing has been spectacular, but the reasons for this success remain unclear because of the inherent complexity of deep learning. This paper provides empirical evidence of its effectiveness and of a limitation of neural networks for language engineering. Precisely, we demonstrate that a neural language model based on long short-term memory (LSTM) effectively reproduces Zipf's law and Heaps' law, two representative statistical properties underlying natural language. We discuss the quality of reproducibility and the emergence of Zipf's law and Heaps' law as training progresses. We also point out that the neural language model has a limitation in reproducing long-range correlation, another statistical property of natural language. This understanding could provide a direction for improving the architectures of neural networks.
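
    The two statistical properties named in this record can be measured on any token stream, whether human-written or sampled from a language model. The following is a minimal sketch of the two measurements, not the authors' experimental code; the example sentence is a hypothetical toy corpus, far too small to exhibit the laws themselves.

```python
from collections import Counter


def rank_frequency(tokens):
    """Return (rank, frequency) pairs, most frequent word first (the Zipf's law view)."""
    counts = Counter(tokens)
    freqs = sorted(counts.values(), reverse=True)
    return list(enumerate(freqs, start=1))


def vocabulary_growth(tokens):
    """Return the vocabulary size after each token (the Heaps' law view)."""
    seen = set()
    sizes = []
    for tok in tokens:
        seen.add(tok)
        sizes.append(len(seen))
    return sizes


# Hypothetical toy corpus, used only to show the shape of the outputs.
tokens = "the cat saw the dog and the dog saw the cat again".split()
zipf = rank_frequency(tokens)
heaps = vocabulary_growth(tokens)
print(zipf)
print(heaps)
```

    On a large corpus, Zipf's law predicts that frequency falls roughly as a power of rank, and Heaps' law predicts that vocabulary size grows roughly as a power of text length; both are usually checked on log-log plots of these two sequences.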

  18. Generating and Executing Complex Natural Language Queries across Linked Data. (United States)

    Hamon, Thierry; Mougin, Fleur; Grabar, Natalia


    With the recent and intensive research in the biomedical area, the knowledge accumulated is disseminated through various knowledge bases. Links between these knowledge bases are needed in order to use them jointly. Linked Data, SPARQL language, and interfaces in Natural Language question-answering provide interesting solutions for querying such knowledge bases. We propose a method for translating natural language questions into SPARQL queries. We use Natural Language Processing tools, semantic resources, and the RDF triples description. The method is designed on 50 questions over 3 biomedical knowledge bases, and evaluated on 27 questions. It achieves 0.78 F-measure on the test set. The method for translating natural language questions into SPARQL queries is implemented as a Perl module available at thhamon/RDF-NLP-SPARQLQuery.
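
    For readers unfamiliar with the F-measure reported in this record, the sketch below shows the usual definition as the weighted harmonic mean of precision and recall. It assumes answers are compared as sets; how the authors aggregate scores over their 27 test questions is not stated here, and the answer identifiers are hypothetical.

```python
def f_measure(predicted, gold, beta=1.0):
    """F-measure of a predicted answer set against a gold answer set."""
    true_pos = len(predicted & gold)
    if true_pos == 0:
        return 0.0
    precision = true_pos / len(predicted)
    recall = true_pos / len(gold)
    b2 = beta * beta
    return (1 + b2) * precision * recall / (b2 * precision + recall)


# Hypothetical answers returned by a generated query versus the expected answers.
predicted = {"a1", "a2", "a3", "a4"}
gold = {"a1", "a2", "a3", "a5", "a6"}

score = f_measure(predicted, gold)
print(f"F1 = {score:.3f}")
```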

  19. Natural language computing an English generative grammar in Prolog

    CERN Document Server

    Dougherty, Ray C


    This book's main goal is to show readers how to use the linguistic theory of Noam Chomsky, called Universal Grammar, to represent English, French, and German on a computer using the Prolog computer language. In so doing, it presents a follow-the-dots approach to natural language processing, linguistic theory, artificial intelligence, and expert systems. The basic idea is to introduce meaningful answers to significant problems involved in representing human language data on a computer. The book offers a hands-on approach to anyone who wishes to gain a perspective on natural language

  20. Concepts and implementations of natural language query systems (United States)

    Dominick, Wayne D. (Editor); Liu, I-Hsiung


    The currently developed user language interfaces of information systems are generally intended for serious users. These interfaces commonly ignore potentially the largest user group, i.e., casual users. This project discusses the concepts and implementations of a natural query language system which satisfy the nature and information needs of casual users by allowing them to communicate with the system in the form of their native (natural) language. In addition, a framework for the development of such an interface is also introduced for the MADAM (Multics Approach to Data Access and Management) system at the University of Southwestern Louisiana.

  1. UNLization of Punjabi text for natural language processing ...

    Indian Academy of Sciences (India)

    Vaibhav Agarwal


    May 26, 2018 ... resent, and store information in a natural-language-independent format [8]. UNL is .... account semantic information available in words of the problem ...... Sentiment Analysis (SA) plays a vital role in decision making process.

  2. Finite-State Methodology in Natural Language Processing

    Michal Korzycki


    Full Text Available Recent mathematical and algorithmic results in the field of finite-state technology, as well as the increase in computing power, have laid the foundation for a new approach in natural language processing. However, the task of creating an appropriate model that would describe the phenomena of natural language is still to be achieved. In this paper I present some notions related to the finite-state modelling of syntax and morphology.

  3. The Islamic State Battle Plan: Press Release Natural Language Processing (United States)


    Institute for the Study of Violent Groups NATO North Atlantic Treaty Organization NLP Natural Language Processing PCorpus Permanent Corpus PDF...approaches, we apply Natural Language Processing (NLP) tools to a unique database of text documents collected by Whiteside (2014). His collection...from Arabic to English. Compared to other terrorism databases, Whiteside’s collection methodology limits the scope of the database and avoids coding

  4. The Arabic Natural Language Processing: Introduction and Challenges

    Boukhatem Nadera


    Full Text Available Arabic is a Semitic language spoken by more than 330 million people as a native language, in an area extending from the Arabian/Persian Gulf in the East to the Atlantic Ocean in the West. Moreover, it is the language in which 1.4 billion Muslims around the world perform their daily prayers. Over the last few years, Arabic natural language processing (ANLP) has gained increasing importance, and several state-of-the-art systems have been developed for a wide range of applications.

  5. Natural language processing in psychiatry. Artificial intelligence technology and psychopathology. (United States)

    Garfield, D A; Rapp, C; Evens, M


    The potential benefit of artificial intelligence (AI) technology as a tool of psychiatry has not been well defined. In this essay, the technology of natural language processing and its position with regard to the two main schools of AI is clearly outlined. Past experiments utilizing AI techniques in understanding psychopathology are reviewed. Natural language processing can automate the analysis of transcripts and can be used in modeling theories of language comprehension. In these ways, it can serve as a tool in testing psychological theories of psychopathology and can be used as an effective tool in empirical research on verbal behavior in psychopathology.

  6. Naturalizing language: human appraisal and (quasi) technology

    Cowley, Stephen


    Using contemporary science, the paper builds on Wittgenstein’s views of human language. Rather than ascribing reality to inscription-like entities, it links embodiment with distributed cognition. The verbal or (quasi) technological aspect of language is traced to not action, but human specific...... interactivity. This species-specific form of sense-making sustains, among other things, using texts, making/construing phonetic gestures and thinking. Human action is thus grounded in appraisals or sense-saturated coordination. To illustrate interactivity at work, the paper focuses on a case study. Over 11 s......, a crime scene investigator infers that she is probably dealing with an inside job: she uses not words, but intelligent gaze. This connects professional expertise to circumstances and the feeling of thinking. It is suggested that, as for other species, human appraisal is based in synergies. However, since...

  7. An overview of computer-based natural language processing (United States)

    Gevarter, W. B.


    Computer-based Natural Language Processing (NLP) is the key to enabling humans and their computer-based creations to interact with machines in natural language (like English, Japanese, German, etc., in contrast to formal computer languages). The doors that such an achievement can open have made this a major research area in Artificial Intelligence and Computational Linguistics. Commercial natural language interfaces to computers have recently entered the market, and the future looks bright for other applications as well. This report reviews the basic approaches to such systems, the techniques utilized, applications, the state of the art of the technology, issues and research requirements, the major participants and, finally, future trends and expectations. It is anticipated that this report will prove useful to engineering and research managers, potential users, and others who will be affected by this field as it unfolds.

  8. Handbook of natural language processing and machine translation DARPA global autonomous language exploitation

    Olive, Joseph P; McCary, John


    This comprehensive handbook, written by leading experts in the field, details the groundbreaking research conducted under the breakthrough GALE program - The Global Autonomous Language Exploitation within the Defense Advanced Research Projects Agency (DARPA), while placing it in the context of previous research in the fields of natural language and signal processing, artificial intelligence and machine translation. The most fundamental contrast between GALE and its predecessor programs was its holistic integration of previously separate or sequential processes. In earlier language research pro

  9. Statistical Language Models and Information Retrieval: Natural Language Processing Really Meets Retrieval

    NARCIS (Netherlands)

    Hiemstra, Djoerd; de Jong, Franciska M.G.


    Traditionally, natural language processing techniques for information retrieval have always been studied outside the framework of formal models of information retrieval. In this article, we introduce a new formal model of information retrieval based on the application of statistical language models.
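
    As an illustration of ranking with statistical language models, the sketch below scores documents by the probability that their language model generates the query, smoothing each document model with a collection-wide model through linear interpolation. This is one common formulation and is assumed here; the record itself does not spell out the authors' model, and the weight `lam`, the function name, and the two toy documents are hypothetical.

```python
import math
from collections import Counter


def query_log_likelihood(query, doc, collection, lam=0.5):
    """log P(query | doc), interpolating the document model with the collection model."""
    doc_counts = Counter(doc)
    coll_counts = Counter(collection)
    score = 0.0
    for term in query:
        p_doc = doc_counts[term] / len(doc)
        p_coll = coll_counts[term] / len(collection)
        p = lam * p_doc + (1 - lam) * p_coll
        if p == 0.0:
            return float("-inf")  # term unseen in the whole collection
        score += math.log(p)
    return score


# Hypothetical two-document collection.
docs = {
    "d1": "statistical language models for retrieval".split(),
    "d2": "natural language parsing with grammars".split(),
}
collection = [tok for doc in docs.values() for tok in doc]
query = "language models".split()

ranking = sorted(
    docs,
    key=lambda d: query_log_likelihood(query, docs[d], collection),
    reverse=True,
)
print(ranking)
```

    The collection model keeps a document from scoring zero merely because it lacks one query term, which is why `d2` still receives a finite score for this query.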

  10. ROPE: Recoverable Order-Preserving Embedding of Natural Language

    Widemann, David P. [Lawrence Livermore National Lab. (LLNL), Livermore, CA (United States); Wang, Eric X. [Lawrence Livermore National Lab. (LLNL), Livermore, CA (United States); Thiagarajan, Jayaraman J. [Lawrence Livermore National Lab. (LLNL), Livermore, CA (United States)


    We present a novel Recoverable Order-Preserving Embedding (ROPE) of natural language. ROPE maps natural language passages from sparse concatenated one-hot representations to distributed vector representations of predetermined fixed length. We use Euclidean distance to return search results that are both grammatically and semantically similar. ROPE is based on a series of random projections of distributed word embeddings. We show that our technique typically forms a dictionary with sufficient incoherence such that sparse recovery of the original text is possible. We then show how our embedding allows for efficient and meaningful natural search and retrieval on Microsoft’s COCO dataset and the IMDB Movie Review dataset.

  11. The Integration Hypothesis of Human Language Evolution and the Nature of Contemporary Languages

    Directory of Open Access Journals (Sweden)

    Shigeru Miyagawa


    Full Text Available How human language arose is a mystery in the evolution of Homo sapiens. Miyagawa, Berwick, & Okanoya (Frontiers 2013) put forward a proposal, which we will call the Integration Hypothesis of human language evolution, which holds that human language is composed of two components, E for expressive, and L for lexical. Each component has an antecedent in nature: E as found, for example, in birdsong, and L in, for example, the alarm calls of monkeys. E and L integrated uniquely in humans to give rise to language. A challenge to the Integration Hypothesis is that while these non-human systems are finite-state in nature, human language is known to require characterization by a non-finite state grammar. Our claim is that E and L, taken separately, are finite-state; when a grammatical process crosses the boundary between E and L, it gives rise to the non-finite state character of human language. We provide empirical evidence for the Integration Hypothesis by showing that certain processes found in contemporary languages that have been characterized as non-finite state in nature can in fact be shown to be finite-state. We also speculate on how human language actually arose in evolution through the lens of the Integration Hypothesis.

  12. Clinical Natural Language Processing in languages other than English: opportunities and challenges. (United States)

    Névéol, Aurélie; Dalianis, Hercules; Velupillai, Sumithra; Savova, Guergana; Zweigenbaum, Pierre


    Natural language processing applied to clinical text or aimed at a clinical outcome has been thriving in recent years. This paper offers the first broad overview of clinical Natural Language Processing (NLP) for languages other than English. Recent studies are summarized to offer insights and outline opportunities in this area. We envision three groups of intended readers: (1) NLP researchers leveraging experience gained in other languages, (2) NLP researchers faced with establishing clinical text processing in a language other than English, and (3) clinical informatics researchers and practitioners looking for resources in their languages in order to apply NLP techniques and tools to clinical practice and/or investigation. We review work in clinical NLP in languages other than English. We classify these studies into three groups: (i) studies describing the development of new NLP systems or components de novo, (ii) studies describing the adaptation of NLP architectures developed for English to another language, and (iii) studies focusing on a particular clinical application. We show the advantages and drawbacks of each method, and highlight the appropriate application context. Finally, we identify major challenges and opportunities that will affect the impact of NLP on clinical practice and public health studies in a context that encompasses English as well as other languages.

  13. Share capital in stock corporations under Polish law. Nature – functions – perspectives

    Directory of Open Access Journals (Sweden)

    Zdzisław Gordon


    Full Text Available Share capital of stock corporations is a monetary value whose equivalent shareholders are obliged to contribute to a company, and which cannot be paid back by a company to shareholders throughout its duration. Share capital exercises three functions: legal, economic and security-enforcing. From a traditional perspective the security (guarantee function is the most important and it entails that share capital constitutes a guarantee of protecting a company’s creditors. In the literature, however, the ability of share capital to perform this function has been more and more often undermined and consequently proposals are put forward to resign from the construction of share capital. The decision to reform share capital of a limited liability company in Polish law, too, seems already to have been decided upon. It is, however, unacceptable to completely resign from the protection of creditors’ interests since the law must provide protection for weaker participants of trading such as small entrepreneurs in relation to stock companies. A serious alternative to share capital, however, seems to be the protection of creditors through the so-called solvency test, which subjects the payments for the benefit of shareholders to the condition that a company’s assets at least balance its liabilities after such a payment. The protection of creditors based on the solvency test is not, however, free from faults. If the construction of share capital was to be resigned from and the solvency test was to be adopted to serve the function of a means of protecting creditors, it seems necessary to develop a characteristic buffer which would prevent using the construction of a limited liability company in high risk ventures and shifting a considerable amount of this risk on to the company’s business partners, not to mention defending against making use of it for common fraud. The role of such a buffer might be played by an obligatory reserve capital based on the

  14. Learning to Understand Natural Language with Less Human Effort (United States)


    Supervision Distant supervision is a recent trend in information extraction. Distantly-supervised extractors are trained using a corpus of unlabeled text...consists of fill-in-the-blank natural language questions such as “Incan emperor ” or “Cunningham directed Auchtre’s second music video .” These questions...with an unknown knowledge base, simultaneously learning how to semantically parse language and populate the knowledge base. The weakly

  15. Artificial intelligence, expert systems, computer vision, and natural language processing (United States)

    Gevarter, W. B.


    An overview of artificial intelligence (AI), its core ingredients, and its applications is presented. The knowledge representation, logic, problem solving approaches, languages, and computers pertaining to AI are examined, and the state of the art in AI is reviewed. The use of AI in expert systems, computer vision, natural language processing, speech recognition and understanding, speech synthesis, problem solving, and planning is examined. Basic AI topics, including automation, search-oriented problem solving, knowledge representation, and computational logic, are discussed.

  16. Natural-language processing applied to an ITS interface


    Antonio Gisolfi; Enrico Fischetti


    The aim of this paper is to show that with a subset of a natural language, simple systems running on PCs can be developed that can nevertheless be an effective tool for interfacing purposes in the building of an Intelligent Tutoring System (ITS). After presenting the special characteristics of the Smalltalk/V language, which provides an appropriate environment for the development of an interface, the overall architecture of the interface module is discussed. We then show how sentences are par...

  17. Natural language processing and the Now-or-Never bottleneck. (United States)

    Gómez-Rodríguez, Carlos


    Researchers, motivated by the need to improve the efficiency of natural language processing tools to handle web-scale data, have recently arrived at models that remarkably match the expected features of human language processing under the Now-or-Never bottleneck framework. This provides additional support for said framework and highlights the research potential in the interaction between applied computational linguistics and cognitive science.

  18. The Soviet-Polish expedition on the study of natural radioactivity of the Baltic sea sediments (June-July 1975)

    International Nuclear Information System (INIS)

    Aksenov, A.A.; Vypykh, K.; Nevesskij, E.N.


    Results of the work of the joint Soviet-Polish expedition on natural radioactivity of the Baltic Sea ground during June-July 1975 are presented. The work was aimed at revealing higher concentrations of heavy minerals and certain valuable mineral complexes and at establishing the rules of their localization at the sea bottom by means of radiometric and radiospectral survey of the sea bottom. Spectrometric surveying made it possible to collect, simultaneously with the ship's movement, continuous information on the distribution and contents of natural radioactive minerals, i.e. uranium, thorium and potassium, in the upper layers of marine sediments. The level of gamma-activity of the sea grounds was recorded. A correlation was found between the ground radioactivity level in certain areas of the Eastern Baltic and the contents of some minerals, in particular, zircon. Maps of bottom concretion fields for some areas were compiled. 'Splashes' of gamma-activity, primarily due to thorium, were found to be associated with the presence of local morphological elements of buried ancient relief covered by thin layers of silt at the sea bottom. It was established that iso-lines of the gamma-radiation field run primarily along the latitudes; that the total intensity of soil radiation and its richness in heavy elements increase from the West to the East, and that the enriched band apparently becomes narrower in the same direction

  19. Polish Semantic Parser

    Directory of Open Access Journals (Sweden)

    Agnieszka Grudzinska


    Full Text Available The amount of information transferred by computers is growing very rapidly, outgrowing the average person's capacity to absorb it. This increases the demand for computer programs able to perform a preliminary classification, or even selection, of information directed to a particular receiver. Due to the complexity of the problem, we restricted it to understanding short newspaper notes. Among the many conceptions formulated so far, the conceptual dependency theory worked out by Roger Schank has been chosen. It is a formal language for describing the semantics of utterances, integrated with a text understanding algorithm. A substantial part of each text transformation system is a semantic parser of the Polish language. It is the first and only module with access to the text in Polish. It acts as the element that finds relations between words of the Polish language and the formal notation, translating sentences written in the language used by people into the language of the theory. The presented structure of knowledge units and the shape of the understanding process algorithms are universal by virtue of the theory. On the other hand, the defined knowledge units and the rules used in the algorithms are only examples, because they were constructed in order to understand short newspaper notes.

  20. System reliability analysis with natural language and expert's subjectivity

    International Nuclear Information System (INIS)

    Onisawa, T.


    This paper introduces natural language expressions and expert's subjectivity to system reliability analysis. To this end, this paper defines a subjective measure of reliability and presents a method of system reliability analysis using the measure. The subjective measure of reliability corresponds to natural language expressions of reliability estimation, and is represented by a fuzzy set defined on [0,1]. The presented method deals with the dependence among subsystems and employs parametrized operations of subjective measures of reliability which can reflect the expert's subjectivity towards the analyzed system. The analysis results are also expressed by linguistic terms. Finally, this paper gives an example of system reliability analysis by the presented method

  1. Learning to rank for information retrieval and natural language processing

    CERN Document Server

    Li, Hang


    Learning to rank refers to machine learning techniques for training a model in a ranking task. Learning to rank is useful for many applications in information retrieval, natural language processing, and data mining. Intensive studies have been conducted on its problems recently, and significant progress has been made. This lecture gives an introduction to the area including the fundamental problems, major approaches, theories, applications, and future work. The author begins by showing that various ranking problems in information retrieval and natural language processing can be formalized as tw

  2. Second Language Acquisition and The Development through Nature-Nurture

    Directory of Open Access Journals (Sweden)

    Syahfitri Purnama


    Full Text Available Several individual learner factors affect second language acquisition: age, learning style, aptitude, motivation, and personality. This research concerns the English language acquisition of a fourth-year child through nature and nurture. The child acquired her second language at home and also in one of the courses in Jakarta. She was schooled by her parents in order to be able to speak English well as a target language for her future. The purpose of this paper is to examine individual learner differences, especially in using English as a second language. This study is library research; the retrieved data were collected, recorded, transcribed, and analyzed descriptively. The results can be summarized as follows: the child is able to communicate well and to construct simple sentences, complex sentences, statements, and phrase questions, and to explain something when her teacher asks her at school. She is able to communicate using well-formed simple or compound sentences (two or three clauses), even though she does not yet consistently use the past tense form and sometimes forgets the bound morpheme -s in the third person singular; she can, however, use turn-taking in her utterances. Second language acquisition is a very long process for the child. The family and teacher should participate and assist the child; the study indicates that a child can learn the first and the second language at the same time.

  3. Applications of Natural Language Processing in Biodiversity Science

    Directory of Open Access Journals (Sweden)

    Anne E. Thessen


    A computer can handle the volume but cannot make sense of the language. This paper reviews and discusses the use of natural language processing (NLP) and machine-learning algorithms to extract information from systematic literature. NLP algorithms have been used for decades, but require special development for application in the biological realm due to the special nature of the language. Many tools exist for biological information extraction (cellular processes, taxonomic names, and morphological characters), but none have been applied life wide and most still require testing and development. Progress has been made in developing algorithms for automated annotation of taxonomic text, identification of taxonomic names in text, and extraction of morphological character information from taxonomic descriptions. This manuscript will briefly discuss the key steps in applying information extraction tools to enhance biodiversity science.

  4. Learning from a Computer Tutor with Natural Language Capabilities (United States)

    Michael, Joel; Rovick, Allen; Glass, Michael; Zhou, Yujian; Evens, Martha


    CIRCSIM-Tutor is a computer tutor designed to carry out a natural language dialogue with a medical student. Its domain is the baroreceptor reflex, the part of the cardiovascular system that is responsible for maintaining a constant blood pressure. CIRCSIM-Tutor's interaction with students is modeled after the tutoring behavior of two experienced…

  5. CITE NLM: Natural-Language Searching in an Online Catalog. (United States)

    Doszkocs, Tamas E.


    The National Library of Medicine's Current Information Transfer in English public access online catalog offers unique subject search capabilities--natural-language query input, automatic medical subject headings display, closest match search strategy, ranked document output, dynamic end user feedback for search refinement. References, description…

  6. Computing an Ontological Semantics for a Natural Language Fragment

    DEFF Research Database (Denmark)

    Szymczak, Bartlomiej Antoni

    tried to establish a domain independent “ontological semantics” for relevant fragments of natural language. The purpose of this research is to develop methods and systems for taking advantage of formal ontologies for the purpose of extracting the meaning contents of texts. This functionality...

  7. Orwell's 1984: Natural Language Searching and the Contemporary Metaphor. (United States)

    Dadlez, Eva M.


    Describes a natural language searching strategy for retrieving current material which has bearing on George Orwell's "1984," and identifies four main themes (technology, authoritarianism, press and psychological/linguistic implications of surveillance, political oppression) which have emerged from cross-database searches of the "Big…

  8. Recurrent Artificial Neural Networks and Finite State Natural Language Processing. (United States)

    Moisl, Hermann

    It is argued that pessimistic assessments of the adequacy of artificial neural networks (ANNs) for natural language processing (NLP) on the grounds that they have a finite state architecture are unjustified, and that their adequacy in this regard is an empirical issue. First, arguments that counter standard objections to finite state NLP on the…

  9. Paired structures in logical and semiotic models of natural language

    DEFF Research Database (Denmark)

    Rodríguez, J. Tinguaro; Franco, Camilo; Montero, Javier


    The evidence coming from cognitive psychology and linguistics shows that pairs of reference concepts (as e.g. good/bad, tall/short, nice/ugly, etc.) play a crucial role in the way we everyday use and understand natural languages in order to analyze reality and make decisions. Different situations...

  10. Ontology Based Queries - Investigating a Natural Language Interface

    NARCIS (Netherlands)

    van der Sluis, Ielka; Hielkema, F.; Mellish, C.; Doherty, G.


    In this paper we look at what may be learned from a comparative study examining non-technical users with a background in social science browsing and querying metadata. Four query tasks were carried out with a natural language interface and with an interface that uses a web paradigm with hyperlinks.

  11. Liturgical language of the Eastern Slavonic Orthodox Churches. The Position of The Polish Autocephalous Orthodox Church’s Faithful Concerning Liturgical Language

    Directory of Open Access Journals (Sweden)

    Tomasz Stempa


    Full Text Available The analysis of collected materials from the life of the Slavic Orthodox Churches indicates that in some cases Church Slavonic language is no longer a current or justifiable liturgical language. Bilingualism was introduced or Church Slavonic language was replaced by national languages. A closer investigation into the liturgical language situation in Orthodox Churches reveals that the topicality and the validity of using Church Slavonic language as a liturgical language depends on a few factors. As in the case of the non-canonical Orthodox Churches in Macedonia and Ukraine, the Church Slavonic language has been replaced by national languages for nationalistic reasons. In the case of Bulgaria and Serbia, the main factor that has influenced this change is treating Orthodox Church as a national church. In Eastern Slavonic Orthodox Churches (Belarus, Poland and Russia), changing the liturgical language has occurred at a slow pace. The history of churches in XIX and XXI century, the temper and character of Eastern Slavs have had an influence on this. In this case, the biggest opponent of the Church Slavonic language is democracy in a broad sense. Orthodox Christians in Poland still want to pray in the Church Slavonic language. It is worth mentioning that in churches where the national language is used, Church Slavonic language has not been completely removed from liturgical life. Bilingualism of liturgical languages is common and in some cases, when the place is considered as backbone for the Orthodox Church, reversion to Church Slavonic language has been noted (Serbia, Bulgaria).

  12. Developing Formal Correctness Properties from Natural Language Requirements (United States)

    Nikora, Allen P.


    This viewgraph presentation reviews the rationale of the program to transform natural language specifications into formal notation. Specifically, automate generation of Linear Temporal Logic (LTL) correctness properties from natural language temporal specifications. There are several reasons for this approach: (1) Model-based techniques becoming more widely accepted, (2) Analytical verification techniques (e.g., model checking, theorem proving) significantly more effective at detecting types of specification design errors (e.g., race conditions, deadlock) than manual inspection, (3) Many requirements still written in natural language, which results in a high learning curve for specification languages, associated tools and increased schedule and budget pressure on projects reduce training opportunities for engineers, and (4) Formulation of correctness properties for system models can be a difficult problem. This has relevance to NASA in that it would simplify development of formal correctness properties, lead to more widespread use of model-based specification, design techniques, assist in earlier identification of defects and reduce residual defect content for space mission software systems. The presentation also discusses: potential applications, accomplishments and/or technological transfer potential and the next steps.

  13. Natural language retrieval in nuclear safety information system

    International Nuclear Information System (INIS)

    Komata, Masaoki; Oosawa, Yasuo; Ujita, Hiroshi


    A natural language retrieval program NATLANG is developed to assist in the retrieval of information from event-and-cause descriptions in Licensee Event Reports (LER). The characteristics of NATLANG are (1) the use of base forms of words to retrieve related forms altered by the addition of prefixes or suffixes or changes in inflection, (2) direct access and short time retrieval with an alphabet pointer, (3) effective determination of the items and entries for a Hitachi event classification in a two step retrieval scheme, and (4) Japanese character output with the PL-1 language. NATLANG output reduces the effort needed to re-classify licensee events in the Hitachi event classification. (author)

  14. Hoe maak je het, lakmoes? Over de (semantische) productiviteit van Nederlandse ontleningen in het Pools / How are you, litmus? On the (semantical) productivity of the Dutch borrowings in the Polish language

    Directory of Open Access Journals (Sweden)

    Kowalska-Szubert Agata


    Full Text Available Polish language contains hundreds of loan words from Dutch. They are rooted so firmly that they are capable of creating new words. This article presents the most common word-formation phenomena involving Dutch loan words. It also highlights their ability to form phrasemes and transfer meanings.

  15. Managing Fieldwork Data with Toolbox and the Natural Language Toolkit

    Directory of Open Access Journals (Sweden)

    Stuart Robinson


    Full Text Available This paper shows how fieldwork data can be managed using the program Toolbox together with the Natural Language Toolkit (NLTK) for the Python programming language. It provides background information about Toolbox and describes how it can be downloaded and installed. The basic functionality of the program for lexicons and texts is described, and its strengths and weaknesses are reviewed. Its underlying data format is briefly discussed, and Toolbox processing capabilities of NLTK are introduced, showing ways in which it can be used to extend the functionality of Toolbox. This is illustrated with a few simple scripts that demonstrate basic data management tasks relevant to language documentation, such as printing out the contents of a lexicon as HTML.
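
    As a rough illustration of the last task mentioned in this abstract (printing a lexicon as HTML), the following is a minimal sketch in plain Python. It does not use NLTK's own Toolbox support and is not taken from the paper. It assumes the backslash-marker layout of Toolbox files and the commonly used field markers \lx (lexeme), \ps (part of speech) and \ge (English gloss); the function names and the sample entries are hypothetical.

```python
import html


def parse_toolbox(text):
    """Split Toolbox-style text into records; each record is a list of (marker, value)."""
    records = []
    for line in text.splitlines():
        line = line.strip()
        if not line.startswith("\\"):
            continue  # skip blank lines and anything without a field marker
        marker, _, value = line[1:].partition(" ")
        if marker == "lx" or not records:
            records.append([])  # assumption: every \lx field starts a new entry
        records[-1].append((marker, value.strip()))
    return records


def lexicon_to_html(records):
    """Render lexeme, part of speech and gloss of each entry as an HTML table."""
    rows = []
    for rec in records:
        fields = dict(rec)  # if a marker repeats, the last value wins
        if "lx" not in fields:
            continue  # e.g. a header record without a lexeme
        cells = (fields["lx"], fields.get("ps", ""), fields.get("ge", ""))
        rows.append("<tr>" + "".join(f"<td>{html.escape(c)}</td>" for c in cells) + "</tr>")
    return "<table>\n" + "\n".join(rows) + "\n</table>"


# Invented sample entries, for illustration only.
SAMPLE = r"""
\lx kaa
\ps V
\ge gag

\lx kaakaaro
\ps N
\ge mixture <of things>
"""

print(lexicon_to_html(parse_toolbox(SAMPLE)))
```

    Escaping each cell with html.escape keeps glosses that contain angle brackets or ampersands from breaking the generated table.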

  16. Combining Natural Language Processing and Statistical Text Mining: A Study of Specialized versus Common Languages (United States)

    Jarman, Jay


    This dissertation focuses on developing and evaluating hybrid approaches for analyzing free-form text in the medical domain. This research draws on natural language processing (NLP) techniques that are used to parse and extract concepts based on a controlled vocabulary. Once important concepts are extracted, additional machine learning algorithms,…

  17. Using natural language processing techniques to inform research on nanotechnology

    Directory of Open Access Journals (Sweden)

    Nastassja A. Lewinski


    Full Text Available Literature in the field of nanotechnology is exponentially increasing with more and more engineered nanomaterials being created, characterized, and tested for performance and safety. With the deluge of published data, there is a need for natural language processing approaches to semi-automate the cataloguing of engineered nanomaterials and their associated physico-chemical properties, performance, exposure scenarios, and biological effects. In this paper, we review the different informatics methods that have been applied to patent mining, nanomaterial/device characterization, nanomedicine, and environmental risk assessment. Nine natural language processing (NLP)-based tools were identified: NanoPort, NanoMapper, TechPerceptor, a Text Mining Framework, a Nanodevice Analyzer, a Clinical Trial Document Classifier, Nanotoxicity Searcher, NanoSifter, and NEIMiner. We conclude with recommendations for sharing NLP-related tools through online repositories to broaden participation in nanoinformatics.

  18. Using of Natural Language Processing Techniques in Suicide Research

    Directory of Open Access Journals (Sweden)

    Azam Orooji


    Full Text Available It is estimated that each year many people, most of whom are teenagers and young adults, die by suicide worldwide. Suicide receives special attention, with many countries developing national strategies for prevention. Since more medical information is available in text, preventing the growing trend of suicide in communities requires analyzing various textual resources, such as patient records, information on the web, or questionnaires. For this purpose, this study systematically reviews recent studies related to the use of natural language processing techniques in the area of the health of people who have completed suicide or are at risk. After an electronic search of the PubMed and ScienceDirect databases and a study of the articles by two reviewers, 21 articles matched the inclusion criteria. This study revealed that, if a suitable data set is available, natural language processing techniques are well suited for various types of suicide related research.

  19. Exploiting Lexical Regularities in Designing Natural Language Systems. (United States)


    Artificial Intelligence Laboratory, 545 Technology Square, Cambridge, MA 02139...EXPLOITING LEXICAL REGULARITIES IN DESIGNING NATURAL LANGUAGE SYSTEMS (U), MASSACHUSETTS INST OF TECH, CAMBRIDGE, ARTIFICIAL INTELLIGENCE...This paper presents the lexical component of the START Question Answering system developed at the MIT Artificial

  20. Automatic Requirements Specification Extraction from Natural Language (ARSENAL) (United States)


    studies: the Time-Triggered Ethernet (TTEthernet) communication platform used in space, and FAA-Isolette infant incubators used in Neonatal Intensive Care Units (NICUs). We systematically evaluated various aspects of ARSENAL...effect, we present the ARSENAL methodology. ARSENAL uses state-of-the-art advances in natural language processing (NLP) and formal methods (FM) to

  1. Representing Information in Patient Reports Using Natural Language Processing and the Extensible Markup Language (United States)

    Friedman, Carol; Hripcsak, George; Shagina, Lyuda; Liu, Hongfang


    Objective: To design a document model that provides reliable and efficient access to clinical information in patient reports for a broad range of clinical applications, and to implement an automated method using natural language processing that maps textual reports to a form consistent with the model. Methods: A document model that encodes structured clinical information in patient reports while retaining the original contents was designed using the extensible markup language (XML), and a document type definition (DTD) was created. An existing natural language processor (NLP) was modified to generate output consistent with the model. Two hundred reports were processed using the modified NLP system, and the XML output that was generated was validated using an XML validating parser. Results: The modified NLP system successfully processed all 200 reports. The output of one report was invalid, and 199 reports were valid XML forms consistent with the DTD. Conclusions: Natural language processing can be used to automatically create an enriched document that contains a structured component whose elements are linked to portions of the original textual report. This integrated document model provides a representation where documents containing specific information can be accurately and efficiently retrieved by querying the structured components. If manual review of the documents is desired, the salient information in the original reports can also be identified and highlighted. Using an XML model of tagging provides an additional benefit in that software tools that manipulate XML documents are readily available. PMID:9925230

  2. Polish-German bilingualism at school. A Polish perspective

    Directory of Open Access Journals (Sweden)

    Pulaczewska, Hanna


    Full Text Available This article presents the institutional frames for the acquisition of Polish literacy skills in Germany and the maintenance of Polish-German bilingualism after the repatriation of bilingual children to Poland. These processes are examined in the context of recent developments in the European domestic job market. While the European Union has placed proficiency in several languages among its educational objectives, and foreign languages have been made obligatory school subjects in all member countries, the potential advantages of internal European migrations for producing high-proficiency bilinguals are being ignored. Bilingualism resulting from migration and biculturalism enjoys little social prestige in the host countries. In Germany, there is significant regional variation in how school authorities react to challenges posed by the presence of minority languages. In many cases, the linguistic potential of many second-generation migrants and re-emigrants gets largely wasted because of lacking interest and incentives from German and Polish institutions alike.

  3. Natural Language Processing Technologies in Radiology Research and Clinical Applications (United States)

    Cai, Tianrun; Giannopoulos, Andreas A.; Yu, Sheng; Kelil, Tatiana; Ripley, Beth; Kumamaru, Kanako K.; Rybicki, Frank J.


    The migration of imaging reports to electronic medical record systems holds great potential in terms of advancing radiology research and practice by leveraging the large volume of data continuously being updated, integrated, and shared. However, there are significant challenges as well, largely due to the heterogeneity of how these data are formatted. Indeed, although there is movement toward structured reporting in radiology (ie, hierarchically itemized reporting with use of standardized terminology), the majority of radiology reports remain unstructured and use free-form language. To effectively “mine” these large datasets for hypothesis testing, a robust strategy for extracting the necessary information is needed. Manual extraction of information is a time-consuming and often unmanageable task. “Intelligent” search engines that instead rely on natural language processing (NLP), a computer-based approach to analyzing free-form text or speech, can be used to automate this data mining task. The overall goal of NLP is to translate natural human language into a structured format (ie, a fixed collection of elements), each with a standardized set of choices for its value, that is easily manipulated by computer programs to (among other things) order into subcategories or query for the presence or absence of a finding. The authors review the fundamentals of NLP and describe various techniques that constitute NLP in radiology, along with some key applications. ©RSNA, 2016 PMID:26761536

  4. Natural Language Processing Technologies in Radiology Research and Clinical Applications. (United States)

    Cai, Tianrun; Giannopoulos, Andreas A; Yu, Sheng; Kelil, Tatiana; Ripley, Beth; Kumamaru, Kanako K; Rybicki, Frank J; Mitsouras, Dimitrios


    The migration of imaging reports to electronic medical record systems holds great potential in terms of advancing radiology research and practice by leveraging the large volume of data continuously being updated, integrated, and shared. However, there are significant challenges as well, largely due to the heterogeneity of how these data are formatted. Indeed, although there is movement toward structured reporting in radiology (ie, hierarchically itemized reporting with use of standardized terminology), the majority of radiology reports remain unstructured and use free-form language. To effectively "mine" these large datasets for hypothesis testing, a robust strategy for extracting the necessary information is needed. Manual extraction of information is a time-consuming and often unmanageable task. "Intelligent" search engines that instead rely on natural language processing (NLP), a computer-based approach to analyzing free-form text or speech, can be used to automate this data mining task. The overall goal of NLP is to translate natural human language into a structured format (ie, a fixed collection of elements), each with a standardized set of choices for its value, that is easily manipulated by computer programs to (among other things) order into subcategories or query for the presence or absence of a finding. The authors review the fundamentals of NLP and describe various techniques that constitute NLP in radiology, along with some key applications. ©RSNA, 2016.

  5. Polish Academy of Sciences Great Dictionary of Polish [Wielki słownik języka polskiego PAN]

    Directory of Open Access Journals (Sweden)

    Piotr Žmigrodzki


    Full Text Available The paper describes a lexicographical project involving the development of the newest general dictionary of the Polish language: the Polish Academy of Sciences Great Dictionary of Polish [Wielki słownik języka polskiego PAN]. The project is coordinated by the Institute of Polish Language at the Polish Academy of Sciences and carried out in collaboration with linguists and lexicographers from several other Polish academic centres. The paper offers a brief description of the genesis of the project and the scope of information included in the dictionary, the organisation of work, the life of the dictionary on the Web as well as the plans for the future.

  6. Polish visit

    CERN Document Server


    On 6 October, Professor Michal Kleiber, Polish Minister of Science and Chairman of the State Committee for Scientific Research, visited CERN and met both the current and designated Director General, Luciano Maiani and Robert Aymar. Professor Kleiber visited the CMS and ATLAS detector assembly halls, the underground cavern for ATLAS, and the LHC superconducting magnet string test hall. Michal Kleiber (left), Polish minister of science and Jan Krolikowski, scientist at Warsaw University and working for CMS, who shows the prototypes of the Muon Trigger board of CMS.

  7. Discovery of Kolmogorov Scaling in the Natural Language

    Directory of Open Access Journals (Sweden)

    Maurice H. P. M. van Putten


    Full Text Available We consider the rate $R$ and variance $\sigma^2$ of Shannon information in snippets of text based on word frequencies in the natural language. We empirically identify Kolmogorov’s scaling law in $\sigma^2 \propto k^{-1.66 \pm 0.12}$ (95% c.l.) as a function of $k = 1/N$ measured by word count $N$. This result highlights a potential association of information flow in snippets, analogous to energy cascade in turbulent eddies in fluids at high Reynolds numbers. We propose $R$ and $\sigma^2$ as robust utility functions for objective ranking of concordances in efficient search for maximal information seamlessly across different languages and as a starting point for artificial attention.
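
    To make the quantities in this abstract more concrete, here is a minimal Python sketch of one possible reading. It is an assumption, not the authors' definition: each word is assigned the surprisal -log2 p(w) from its relative frequency in the whole text, the information rate of a snippet is the average surprisal of its N words, and the mean and variance are taken across non-overlapping snippets. The function name and the toy input are hypothetical.

```python
import math
from collections import Counter
from statistics import mean, pvariance


def snippet_information(tokens, n):
    """Mean and variance, across non-overlapping n-word snippets, of the
    average per-word surprisal -log2 p(w), with p(w) the relative frequency
    of w in the whole token list (an assumed definition, see lead-in)."""
    counts = Counter(tokens)
    total = len(tokens)
    surprisal = {w: -math.log2(c / total) for w, c in counts.items()}
    rates = []
    for start in range(0, total - n + 1, n):
        snippet = tokens[start:start + n]
        rates.append(sum(surprisal[w] for w in snippet) / n)
    return mean(rates), pvariance(rates)


# Toy input: p(a) = 1/2 (1 bit), p(b) = p(c) = 1/4 (2 bits each).
tokens = ["a", "a", "b", "c"]
rate, variance = snippet_information(tokens, 2)
print(rate, variance)
```

    Repeating the call for a range of snippet lengths N and plotting the variance against k = 1/N on logarithmic axes would be one way to look for a power-law dependence of the kind the abstract reports.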

  8. Natural-language processing applied to an ITS interface

    Directory of Open Access Journals (Sweden)

    Antonio Gisolfi


    Full Text Available The aim of this paper is to show that with a subset of a natural language, simple systems running on PCs can be developed that can nevertheless be an effective tool for interfacing purposes in the building of an Intelligent Tutoring System (ITS. After presenting the special characteristics of the Smalltalk/V language, which provides an appropriate environment for the development of an interface, the overall architecture of the interface module is discussed. We then show how sentences are parsed by the interface, and how interaction takes place with the user. The knowledge-acquisition phase is subsequently described. Finally, some excerpts from a tutoring session concerned with elementary geometry are discussed, and some of the problems and limitations of the approach are illustrated.

  9. Determination of suitability of natural Polish resources for production of ceramic proppants applied in gas exploration from European shale formations (United States)

    Szymanska, Joanna; Mizera, Jaroslaw


    Poland is one of few European countries undertaking innovative research towards effective exploration of hydrocarbons from shale deposits. With regard for strict geological conditions, which occur during hydraulic fracturing, it is required to apply ceramic proppants enhancing extraction of shale gas. Ceramic proppants are granules (16/30 - 70/120 Mesh) classified as propping agents. These granules, located in the newly created fissures (due to injected high pressure fluid) in the shale rock, act as a prop, which enables gas flow up the well. It occurs if the proppants can resist high stress of the closing fractures. Commonly applied proppants are quartz sands used only for shallow reservoirs and fissile shales (in the USA). The ceramic granules, in contrast, are suitable for extraction of gas at great depths under hard geomechanical conditions (in Europe) to increase output even by 30 - 50%. In comparison to other propping materials, this kind of proppant stands out for its mechanical strength, smoother surface, lower solubility in acids and also high stability in water. Such parameters can be achieved through proper raw materials selection for further proppants production. The Polish ceramic proppants are produced from natural resources such as kaolin, bauxite and white clay mixed with water and binders. Afterwards, the slurries are subjected to granulation in a mechanical granulator and sintered at high temperatures (1200 - 1550°C). Taking into consideration presence of geomechanical barriers that prevent fracture propagation beyond shale formations, it is crucial to determine quality of applied natural deposits. The next step is to optimize the proppants production and select the best kind of granules, which was the aim of this research. Utility of the raw materials was estimated on the basis of their particle size distribution, bulk density, specific surface area (BET) and thermal analysis (thermogravimetry). Morphology and shape were determined by Scanning Electron Microscopy (SEM

  10. Recent Technological Advances in Natural Language Processing and Artificial Intelligence


    Shah, Nishal Pradeepkumar


    A recent advance in computer technology has permitted scientists to implement and test algorithms that were known for quite some time (or not) but which were computationally expensive. Two such projects are IBM's Jeopardy as a part of its DeepQA project [1] and Wolfram's Wolframalpha [2]. Both these methods implement natural language processing (another goal of AI scientists) and try to answer questions as asked by the user. Though the goal of the two projects is similar, both of them have a ...

  11. Deviations in the Zipf and Heaps laws in natural languages (United States)

    Bochkarev, Vladimir V.; Lerner, Eduard Yu; Shevlyakova, Anna V.


    This paper is devoted to verifying the empirical Zipf and Heaps laws in natural languages using Google Books Ngram corpus data. The connection between the Zipf and Heaps laws, the latter of which predicts the power dependence of the vocabulary size on the text size, is discussed. In fact, the Heaps exponent in this dependence varies as the text corpus grows. To explain this, the obtained results are compared with the probability model of text generation. Quasi-periodic variations with characteristic time periods of 60-100 years were also found.
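
    For readers unfamiliar with the two laws named in this abstract: Zipf's law relates a word's frequency to its frequency rank by an approximate power law, and Heaps' law describes vocabulary size as an approximate power of text length. The minimal Python sketch below computes the raw quantities both laws are stated on. It is an illustration only, not the authors' procedure; the function names and the toy sentence are hypothetical, and no Google Books Ngram data are involved.

```python
from collections import Counter


def zipf_table(tokens):
    """Return (rank, word, count) triples, most frequent word first.
    Ties in count are broken alphabetically so the output is deterministic."""
    counts = Counter(tokens)
    ranked = sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))
    return [(rank, word, count) for rank, (word, count) in enumerate(ranked, start=1)]


def heaps_curve(tokens):
    """Return the vocabulary size observed after each successive token."""
    seen = set()
    sizes = []
    for tok in tokens:
        seen.add(tok)
        sizes.append(len(seen))
    return sizes


# Toy input; a real check would use a large corpus.
tokens = "the cat saw the dog and the dog saw the cat".split()
table = zipf_table(tokens)
curve = heaps_curve(tokens)
print(table)
print(curve)
```

    On a large corpus, fitting a straight line to log(count) against log(rank), and to log(vocabulary size) against log(text length), gives estimates of the Zipf and Heaps exponents; the abstract reports that the latter changes as the corpus grows.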

  12. Deviations in the Zipf and Heaps laws in natural languages

    International Nuclear Information System (INIS)

    Bochkarev, Vladimir V; Lerner, Eduard Yu; Shevlyakova, Anna V


    This paper is devoted to verifying the empirical Zipf and Heaps laws in natural languages using Google Books Ngram corpus data. The connection between the Zipf and Heaps laws, the latter of which predicts the power dependence of the vocabulary size on the text size, is discussed. In fact, the Heaps exponent in this dependence varies as the text corpus grows. To explain this, the obtained results are compared with the probability model of text generation. Quasi-periodic variations with characteristic time periods of 60-100 years were also found.

  13. Box: Natural Language Processing Research Using Amazon Web Services

    Directory of Open Access Journals (Sweden)

    Axelrod Amittai


    Full Text Available We present a publicly-available state-of-the-art research and development platform for Machine Translation and Natural Language Processing that runs on the Amazon Elastic Compute Cloud. This provides a standardized research environment for all users, and enables perfect reproducibility and compatibility. Box also enables users to use their hardware budget to avoid the management and logistical overhead of maintaining a research lab, yet still participate in global research community with the same state-of-the-art tools.

  14. Query2Question: Translating Visualization Interaction into Natural Language. (United States)

    Nafari, Maryam; Weaver, Chris


    Richly interactive visualization tools are increasingly popular for data exploration and analysis in a wide variety of domains. Existing systems and techniques for recording provenance of interaction focus either on comprehensive automated recording of low-level interaction events or on idiosyncratic manual transcription of high-level analysis activities. In this paper, we present the architecture and translation design of a query-to-question (Q2Q) system that automatically records user interactions and presents them semantically using natural language (written English). Q2Q takes advantage of domain knowledge and uses natural language generation (NLG) techniques to translate and transcribe a progression of interactive visualization states into a visual log of styled text that complements and effectively extends the functionality of visualization tools. We present Q2Q as a means to support a cross-examination process in which questions rather than interactions are the focus of analytic reasoning and action. We describe the architecture and implementation of the Q2Q system, discuss key design factors and variations that affect question generation, and present several visualizations that incorporate Q2Q for analysis in a variety of knowledge domains.

  15. Suicide Note Classification Using Natural Language Processing: A Content Analysis

    Directory of Open Access Journals (Sweden)

    John Pestian


    Full Text Available Suicide is the second leading cause of death among 25–34 year olds and the third leading cause of death among 15–25 year olds in the United States. In the Emergency Department, where suicidal patients often present, estimating the risk of repeated attempts is generally left to clinical judgment. This paper presents our second attempt to determine the role of computational algorithms in understanding a suicidal patient’s thoughts, as represented by suicide notes. We focus on developing methods of natural language processing that distinguish between genuine and elicited suicide notes. We hypothesize that machine learning algorithms can categorize suicide notes as well as mental health professionals and psychiatric physician trainees do. The data used are comprised of suicide notes from 33 suicide completers and matched to 33 elicited notes from healthy control group members. Eleven mental health professionals and 31 psychiatric trainees were asked to decide if a note was genuine or elicited. Their decisions were compared to nine different machine-learning algorithms. The results indicate that trainees accurately classified notes 49% of the time, mental health professionals accurately classified notes 63% of the time, and the best machine learning algorithm accurately classified the notes 78% of the time. This is an important step in developing an evidence-based predictor of repeated suicide attempts because it shows that natural language processing can aid in distinguishing between classes of suicidal notes.

  16. Suicide Note Classification Using Natural Language Processing: A Content Analysis. (United States)

    Pestian, John; Nasrallah, Henry; Matykiewicz, Pawel; Bennett, Aurora; Leenaars, Antoon


    Suicide is the second leading cause of death among 25-34 year olds and the third leading cause of death among 15-25 year olds in the United States. In the Emergency Department, where suicidal patients often present, estimating the risk of repeated attempts is generally left to clinical judgment. This paper presents our second attempt to determine the role of computational algorithms in understanding a suicidal patient's thoughts, as represented by suicide notes. We focus on developing methods of natural language processing that distinguish between genuine and elicited suicide notes. We hypothesize that machine learning algorithms can categorize suicide notes as well as mental health professionals and psychiatric physician trainees do. The data used are comprised of suicide notes from 33 suicide completers and matched to 33 elicited notes from healthy control group members. Eleven mental health professionals and 31 psychiatric trainees were asked to decide if a note was genuine or elicited. Their decisions were compared to nine different machine-learning algorithms. The results indicate that trainees accurately classified notes 49% of the time, mental health professionals accurately classified notes 63% of the time, and the best machine learning algorithm accurately classified the notes 78% of the time. This is an important step in developing an evidence-based predictor of repeated suicide attempts because it shows that natural language processing can aid in distinguishing between classes of suicidal notes.

  17. Natural Language Processing in Radiology: A Systematic Review. (United States)

    Pons, Ewoud; Braun, Loes M M; Hunink, M G Myriam; Kors, Jan A


    Radiological reporting has generated large quantities of digital content within the electronic health record, which is potentially a valuable source of information for improving clinical care and supporting research. Although radiology reports are stored for communication and documentation of diagnostic imaging, harnessing their potential requires efficient and automated information extraction: they exist mainly as free-text clinical narrative, from which it is a major challenge to obtain structured data. Natural language processing (NLP) provides techniques that aid the conversion of text into a structured representation, and thus enables computers to derive meaning from human (ie, natural language) input. Used on radiology reports, NLP techniques enable automatic identification and extraction of information. By exploring the various purposes for their use, this review examines how radiology benefits from NLP. A systematic literature search identified 67 relevant publications describing NLP methods that support practical applications in radiology. This review takes a close look at the individual studies in terms of tasks (ie, the extracted information), the NLP methodology and tools used, and their application purpose and performance results. Additionally, limitations, future challenges, and requirements for advancing NLP in radiology will be discussed. (©) RSNA, 2016 Online supplemental material is available for this article.

  18. Advanced applications of natural language processing for performing information extraction

    CERN Document Server

    Rodrigues, Mário


    This book explains how to create information extraction (IE) applications that are able to tap the vast amount of relevant information available in natural language sources: Internet pages, official documents such as laws and regulations, books and newspapers, and the social web. Readers are introduced to the problem of IE and its current challenges and limitations, supported with examples. The book discusses the need to fill the gap between documents, data, and people, and provides a broad overview of the technology supporting IE. The authors present a generic architecture for developing systems that are able to learn how to extract relevant information from natural language documents, and illustrate how to implement working systems using state-of-the-art and freely available software tools. The book also discusses concrete applications illustrating IE uses. · Provides an overview of state-of-the-art technology in information extraction (IE), discussing achievements and limitations for t...

  19. Neurolinguistics and psycholinguistics as a basis for computer acquisition of natural language

    Energy Technology Data Exchange (ETDEWEB)

    Powers, D.M.W.


    Research into natural language understanding systems for computers has concentrated on implementing particular grammars and grammatical models of the language concerned. This paper presents a rationale for research into natural language understanding systems based on neurological and psychological principles. Important features of the approach are that it seeks to place the onus of learning the language on the computer, and that it seeks to make use of the vast wealth of relevant psycholinguistic and neurolinguistic theory. 22 references.

  20. The need for verification of the Polish lignite deposits owing to development and nature conservation protection on land at the surface

    Directory of Open Access Journals (Sweden)

    Naworyta Wojciech


    Full Text Available Poland is a country rich in lignite. The area where the lignite occurs occupies approx. 22% of the total surface area of the country. Geological resources of Polish lignite deposits are estimated at 23.5 billion Mg, but for the majority (69%) the accuracy of their identification is poor. Nevertheless the amount of coal in Polish deposits allows - at least in theory - for mining and energy production at the current level for hundreds of years to come. It is an important raw material for the energy security of the country both currently and in the future. Because the vast majority of Polish and foreign mines use an open pit method for lignite extraction, the actual amount of mineral available for extraction depends not only on the properties of the deposit but to a large extent on the method of development of the surface land above the deposit, as well as on the sensitivity of the environment in the vicinity of any future mines. After careful analysis it can be stated that only a few of the lignite deposits may be subject to cost-effective mining operations. These deposits should be subjected to special protection as a future resource base which will ensure the energy security of the country. Some examples of domestic deposits have been presented which, due to conflicts resulting from the development of the area, should be deleted from the Balance Sheet of Mineral Deposits because their exploitation is irrational and uneconomic. Keeping such deposits in the Balance Sheet, and the use of large numbers in the context of their resource base, leads to an unwarranted sense of wealth which consequently does not encourage the protection of those deposits which may actually be subject to rational exploitation in the near future. In summary there is a need to find a compromise in order to adequately protect all natural resources including mineral deposits.

    Poletiek, Fenna H; Fitz, Hartmut; Bocanegra, Bruno R


    Rey et al. (2012) present data from a study with baboons that they interpret in support of the idea that center-embedded structures in human language have their origin in low level memory mechanisms and associative learning. Critically, the authors claim that the baboons showed a behavioral preference that is consistent with center-embedded sequences over other types of sequences. We argue that the baboons' response patterns suggest that two mechanisms are involved: first, they can be trained to associate a particular response with a particular stimulus, and, second, when faced with two conditioned stimuli in a row, they respond to the most recent one first, copying behavior they had been rewarded for during training. Although Rey et al.'s (2012) experiment shows that the baboons' behavior is driven by low level mechanisms, it is not clear how the animal behavior reported bears on the phenomenon of Center Embedded structures in human syntax. Hence, (1) natural language syntax may indeed have been shaped by low level mechanisms, and (2) the baboons' behavior is driven by low level stimulus response learning, as Rey et al. propose. But is the second evidence for the first? We will discuss in what ways this study can and cannot give evidential value for explaining the origin of Center Embedded recursion in human grammar. More generally, their study provokes an interesting reflection on the use of animal studies in order to understand features of the human linguistic system. Copyright © 2015 Elsevier B.V. All rights reserved.

    Ealey, Douglas

    This thesis puts forward the view that a purely signal-based approach to natural language processing is both plausible and desirable. By questioning the veracity of symbolic representations of meaning, it argues for a unified, non-symbolic model of knowledge representation that is both biologically plausible and, potentially, highly efficient. Processes to generate a grounded, neural form of this model, dubbed the semantic filter, are discussed. The combined effects of local neural organisation, coincident with perceptual maturation, are used to hypothesise its nature. This theoretical model is then validated in light of a number of fundamental neurological constraints and milestones. The mechanisms of semantic and episodic development that the model predicts are then used to explain linguistic properties, such as propositions and verbs, syntax and scripting. To mimic the growth of locally densely connected structures upon an unbounded neural substrate, a system is developed that can grow arbitrarily large, data-dependent structures composed of individual self-organising neural networks. The maturational nature of the data used results in a structure in which the perception of concepts is refined by the networks, but demarcated by subsequent structure. As a consequence, the overall structure shows significant memory and computational benefits, as predicted by the cognitive and neural models. Furthermore, the localised nature of the neural architecture also avoids the increasing error sensitivity and redundancy of traditional systems as the training domain grows. The semantic and episodic filters have been demonstrated to perform as well, or better, than more specialist networks, whilst using significantly larger vocabularies, more complex sentence forms and more natural corpora.

    Wu, Joy T; Dernoncourt, Franck; Gehrmann, Sebastian; Tyler, Patrick D; Moseley, Edward T; Carlson, Eric T; Grant, David W; Li, Yeran; Welt, Jonathan; Celi, Leo Anthony


    Advancement of Artificial Intelligence (AI) capabilities in medicine can help address many pressing problems in healthcare. However, AI research endeavors in healthcare may not be clinically relevant, may have unrealistic expectations, or may not be explicit enough about their limitations. A diverse and well-functioning multidisciplinary team (MDT) can help identify appropriate and achievable AI research agendas in healthcare, and advance medical AI technologies by developing AI algorithms as well as addressing the shortage of appropriately labeled datasets for machine learning. In this paper, our team of engineers, clinicians and machine learning experts share their experience and lessons learned from their two-year-long collaboration on a natural language processing (NLP) research project. We highlight specific challenges encountered in cross-disciplinary teamwork, dataset creation for NLP research, and expectation setting for current medical AI technologies. Copyright © 2017. Published by Elsevier B.V.

    Kashyap, Vipul; Turchin, Alexander; Morin, Laura; Chang, Frank; Li, Qi; Hongsermeier, Tonya


    Structured Clinical Documentation is a fundamental component of the healthcare enterprise, linking both clinical (e.g., electronic health record, clinical decision support) and administrative functions (e.g., evaluation and management coding, billing). One of the challenges in creating good quality documentation templates has been the inability to address specialized clinical disciplines and adapt to local clinical practices. A one-size-fits-all approach leads to poor adoption and inefficiencies in the documentation process. On the other hand, the cost associated with manual generation of documentation templates is significant. Consequently there is a need for at least partial automation of the template generation process. We propose an approach and methodology for the creation of structured documentation templates for diabetes using Natural Language Processing (NLP).

    Deleger, Louise; Li, Qi; Lingren, Todd; Kaiser, Megan; Molnar, Katalin; Stoutenborough, Laura; Kouril, Michal; Marsolo, Keith; Solti, Imre


    We present the construction of three annotated corpora to serve as gold standards for medical natural language processing (NLP) tasks. Clinical notes from the medical record, clinical trial announcements, and FDA drug labels are annotated. We report high inter-annotator agreements (overall F-measures between 0.8467 and 0.9176) for the annotation of Personal Health Information (PHI) elements for a de-identification task and of medications, diseases/disorders, and signs/symptoms for information extraction (IE) task. The annotated corpora of clinical trials and FDA labels will be publicly released and to facilitate translational NLP tasks that require cross-corpora interoperability (e.g. clinical trial eligibility screening) their annotation schemas are aligned with a large scale, NIH-funded clinical text annotation project.

    Full Text Available Development of information technologies is growing steadily. With the latest developments in software technologies and the application of artificial intelligence and machine learning methods that embed intelligence in computers, the expectation is that in the near future computers will be able to solve problems themselves, as people do. Artificial intelligence emulates human behavior on computers. Rather than executing instructions one by one, as they are programmed, machine learning employs prior experience/data that is used in the process of the system's training. In this state of the art paper, common methods in AI, such as machine learning, pattern recognition and natural language processing (NLP), are discussed. The standard architecture of an NLP processing system and the level that is needed for understanding NLP are also given. Lastly, statistical NLP processing and multi-word expressions are described.

  7. Natural mineral bottled waters available on the Polish market as a source of minerals for the consumers. Part 2: The intake of sodium and potassium. (United States)

    Gątarska, Anna; Ciborska, Joanna; Tońska, Elżbieta

    Natural mineral waters are purchased and consumed according to consumer preferences and possible recommendations. The choice of appropriate water should take into account not only the general level of mineralization but also the content of individual components, including electrolytes such as sodium and potassium. Sodium is necessary to ensure the proper physiological functions of the body. It is defined as a health risk factor only when its excessive intake occurs. Potassium acts antagonistically towards sodium and calcium ions, contributes to a reduction of the volume of extracellular fluids and at the same time reduces muscle tension and permeability of cell membranes. The demand for sodium and potassium is of particular importance in people expending significant physical effort, where an increased electrolyte supply is recommended. The aim of the study was to estimate the content of sodium and potassium in natural mineral waters available in the Polish market and to evaluate the intake of those components with the commercially available mineral waters by different groups of consumers at the assumed volume of their consumption. The research material consisted of natural mineral waters of forty various brands available on the Polish market. The examined products were either produced in Poland or originated in other European countries. Among the products under examination, about 30% of the waters were imported from Lithuania, Latvia, the Czech Republic, France, Italy and Germany. A sample for analyses consisted of two package units of the examined water from different production lots. Samples for research were collected at random. The study was conducted with the same samples in which calcium and magnesium content was determined, which was the subject of the first part of the study. The content of sodium and potassium was determined using the emission technique (acetylene-air flame), with the use of atomic absorption spectrometer – ICE 3000 SERIES – THERMO

    Graham, Matthew; Zhang, M.; Djorgovski, S. G.; Donalek, C.; Drake, A. J.; Mahabal, A.


    The rapidly emerging field of time domain astronomy is one of the most exciting and vibrant new research frontiers, ranging in scientific scope from studies of the Solar System to extreme relativistic astrophysics and cosmology. It is being enabled by a new generation of large synoptic digital sky surveys - LSST, PanStarrs, CRTS - that cover large areas of sky repeatedly, looking for transient objects and phenomena. One of the biggest challenges facing these is the automated classification of transient events, a process that needs machine-processible astronomical knowledge. Semantic technologies enable the formal representation of concepts and relations within a particular domain. ATELs are a commonly-used means for reporting and commenting upon new astronomical observations of transient sources (supernovae, stellar outbursts, blazar flares, etc). However, they are loose and unstructured and employ scientific natural language for description: this makes automated processing of them - a necessity within the next decade with petascale data rates - a challenge. Nevertheless they represent a potentially rich corpus of information that could lead to new and valuable insights into transient phenomena. This project lies in the cutting-edge field of astrosemantics, a branch of astroinformatics, which applies semantic technologies to astronomy. The ATELs have been used to develop an appropriate concept scheme - a representation of the information they contain - for transient astronomy using hierarchical clustering of processed natural language. This allows us to automatically organize ATELs based on the vocabulary used. We conclude that we can use simple algorithms to process and extract meaning from astronomical textual data.

    Full Text Available Polish-Bulgarian-Russian, Bulgarian-Polish-Russian or Russian-Bulgarian-Polish dictionary? The trilingual dictionary (M. Duszkin, V. Koseska, J. Satoła and A. Tzoneva) is being elaborated based on a working Polish-Bulgarian-Russian electronic parallel corpus authored by Maksim Duszkin, Violetta Koseska-Toszewa and Joanna Satoła-Staśkowiak, and works by A. Tzoneva. It is the first corpus comparing languages belonging to three different Slavic language groups: western, southern and eastern. Work on the dictionary is based on Gramatyka konfrontatywna bułgarsko-polska (Bulgarian-Polish confrontative grammar) and the semantic-oriented interlanguage proposed there. Two types of classifiers have been introduced into the dictionary: classic and semantic. The trilingual dictionary will present a consistent and homogeneous set of facts of grammar and semantics. The authors point out that in a traditional dictionary it is not clear, for example, whether aspect should be understood as the imperfective / perfective form of a verb or as its meaning. Therefore in the dictionary forms and meaning are separated in a regular way. The imperfective verb form has two meanings: state, and configuration of states and events culminating in state. The perfective verb form also has two meanings: event, and configuration of states and events culminating in event. These meanings are described by the semantic classifiers, respectively, state and event, state1 and event1. The way of describing language units mentioned in the article makes it possible to present language material (Polish, Bulgarian, Russian) in any required order, hence the article's title.

    Kim, Tai-Hoon

    Natural language processing (NLP) is a subfield of artificial intelligence and computational linguistics. It studies the problems of automated generation and understanding of natural human languages. This paper outlines a framework for using computer and natural language techniques to help learners at various levels learn foreign languages in a Computer-based Learning environment. We propose some ideas for using the computer as a practical tool for learning a foreign language where most of the courseware is generated automatically. We then describe how to build Computer Based Learning tools, discuss their effectiveness, and conclude with some possibilities for using on-line resources.

    Deane, Paul; Sheehan, Kathleen

    This paper is an exploration of the conceptual issues that have arisen in the course of building a natural language generation (NLG) system for automatic test item generation. While natural language processing techniques are applicable to general verbal items, mathematics word problems are particularly tractable targets for natural language…

    Piątkowski, Włodzimierz; Skrzypek, Michał


    The cognitive identity of medical sociology has developed in a historical perspective in the context of a specific double frame of reference comprising medicine and general sociology. The purpose of this study is to reconstruct the process of the development of the subdiscipline's research specificity in Poland, drawing attention to the general-sociological context of the conceptualization of basic interpretive and analytical sociomedical categories. In this aspect, the presented study is based on the analysis of Polish sociomedical and general-sociological research published from the early 1960s until 1989. The purpose of the study is also to describe in this perspective the structure of the research field of contemporary Western medical sociology, which was a major point of reference in this process. A look at the chronology of how the scientific identity of medical sociology developed in Poland from a historical perspective shows the gradual balancing-out of the subdiscipline's medical references, typical of the early stage of its development, and manifested in the implementation of research projects for the requirements of doctors, through consistently developed and cultivated connections with general sociology manifested in complementing the knowledge of society with aspects related to health and illness. A sine qua non condition for undertaking this scope of research was to work out strictly sociological formulations of these concepts, which was accomplished as a result of the successful reception of general sociology by the subdiscipline in question. 
The contemporary understanding of the research field of Polish medical sociology defined by Magdalena Sokołowska and developed as part of the 'school of medical sociology', which she initiated, is characterized by the maintenance of close relations with general sociology (affiliations of sociomedical departments in academic sociological institutions, etc.), and at the same time, by partnership cooperation with

    Miller, William R; Johnson, Wendy R


    Client motivation for change, a topic of high interest to addiction clinicians, is multidimensional and complex, and many different approaches to measurement have been tried. The current effort drew on psycholinguistic research on natural language that is used by clients to describe their own motivation. Seven addiction treatment sites participated in the development of a simple scale to measure client motivation. Twelve items were drafted to represent six potential dimensions of motivation for change that occur in natural discourse. The maximum self-rating of motivation (10 on a 0-10 scale) was the median score on all items, and 43% of respondents rated 10 on all 12 items - a substantial ceiling effect. From 1035 responses, three factors emerged representing importance, ability, and commitment - constructs that are also reflected in several theoretical models of motivation. A 3-item version of the scale, with one marker item for each of these constructs, accounted for 81% of variance in the full scale. The three items are: 1. It is important for me to . . . 2. I could . . . and 3. I am trying to . . . This offers a quick (1-minute) assessment of clients' self-reported motivation for change.

  14. "Speaking English Naturally": The Language Ideologies of English as an Official Language at a Korean University (United States)

    Choi, Jinsook


    This study explores language ideologies of English at a Korean university where English has been adopted as an official language. This study draws on ethnographic data in order to understand how speakers respond to and experience the institutional language policy. The findings show that language ideologies in this university represent the…

    Wittrock, Merlin C.

    Concepts in cognitive psychology are applied to the language used in military situations, and a sentence classification system for use in analyzing military language is outlined. The system is designed to be used, in part, in conjunction with a natural language query system that allows a user to access a database. The discussion of military…

  16. Does Grammatical Gender Influence Perception? A Study of Polish and French Speakers

    Full Text Available Can the perception of a word be influenced by its grammatical gender? Can it happen that speakers of one language perceive an object to have masculine features, while speakers of another language perceive the same object to have feminine features? Previous studies suggest that this is the case, and also that there is some supra-language gender categorisation of objects as natural/feminine and artefact/masculine. This study was an attempt to replicate these findings on another population of subjects. This is the first Polish study of this kind, comparing the perceptions of objects by Polish- and French-speaking individuals. The results of this study show that grammatical gender may cue people to assess objects as masculine or feminine. However, the findings of some previous studies, that feminine features are more often ascribed to natural objects than artifacts, were not replicated.

    Hirschman, Lynette; Fort, Karën; Boué, Stéphanie; Kyrpides, Nikos; Islamaj Doğan, Rezarta; Cohen, Kevin Bretonnel


    Crowdsourcing is increasingly utilized for performing tasks in both natural language processing and biocuration. Although there have been many applications of crowdsourcing in these fields, there have been fewer high-level discussions of the methodology and its applicability to biocuration. This paper explores crowdsourcing for biocuration through several case studies that highlight different ways of leveraging 'the crowd'; these raise issues about the kind(s) of expertise needed, the motivations of participants, and questions related to feasibility, cost and quality. The paper is an outgrowth of a panel session held at BioCreative V (Seville, September 9-11, 2015). The session consisted of four short talks, followed by a discussion. In their talks, the panelists explored the role of expertise and the potential to improve crowd performance by training; the challenge of decomposing tasks to make them amenable to crowdsourcing; and the capture of biological data and metadata through community editing. © The Author(s) 2016. Published by Oxford University Press.

  18. Arabic text preprocessing for the natural language processing applications

    Juuso, Esko K.


    Performance improvement is taken as the primary goal in the asset management. Advanced data analysis is needed to efficiently integrate condition monitoring data into the operation and maintenance. Intelligent stress and condition indices have been developed for control and condition monitoring by combining generalized norms with efficient nonlinear scaling. These nonlinear scaling methodologies can also be used to handle performance measures used for management since management oriented indicators can be presented in the same scale as intelligent condition and stress indices. Performance indicators are responses of the process, machine or system to the stress contributions analyzed from process and condition monitoring data. Scaled values are directly used in intelligent temporal analysis to calculate fluctuations and trends. All these methodologies can be used in prognostics and fatigue prediction. The meanings of the variables are beneficial in extracting expert knowledge and representing information in natural language. The idea of dividing the problems into the variable specific meanings and the directions of interactions provides various improvements for performance monitoring and decision making. The integrated temporal analysis and uncertainty processing facilitates the efficient use of domain expertise. Measurements can be monitored with generalized statistical process control (GSPC) based on the same scaling functions.

  20. A common type system for clinical natural language processing

    Full Text Available Background: One challenge in reusing clinical data stored in electronic medical records is that these data are heterogenous. Clinical Natural Language Processing (NLP) plays an important role in transforming information in clinical text to a standard representation that is comparable and interoperable. Information may be processed and shared when a type system specifies the allowable data structures. Therefore, we aim to define a common type system for clinical NLP that enables interoperability between structured and unstructured data generated in different clinical settings. Results: We describe a common type system for clinical NLP that has an end target of deep semantics based on Clinical Element Models (CEMs), thus interoperating with structured data and accommodating diverse NLP approaches. The type system has been implemented in UIMA (Unstructured Information Management Architecture) and is fully functional in a popular open-source clinical NLP system, cTAKES (clinical Text Analysis and Knowledge Extraction System) versions 2.0 and later. Conclusions: We have created a type system that targets deep semantics, thereby allowing for NLP systems to encapsulate knowledge from text and share it alongside heterogenous clinical data sources. Rather than surface semantics that are typically the end product of NLP algorithms, CEM-based semantics explicitly build in deep clinical semantics as the point of interoperability with more structured data types.

    Wu, Stephen T; Kaggal, Vinod C; Dligach, Dmitriy; Masanz, James J; Chen, Pei; Becker, Lee; Chapman, Wendy W; Savova, Guergana K; Liu, Hongfang; Chute, Christopher G


    One challenge in reusing clinical data stored in electronic medical records is that these data are heterogenous. Clinical Natural Language Processing (NLP) plays an important role in transforming information in clinical text to a standard representation that is comparable and interoperable. Information may be processed and shared when a type system specifies the allowable data structures. Therefore, we aim to define a common type system for clinical NLP that enables interoperability between structured and unstructured data generated in different clinical settings. We describe a common type system for clinical NLP that has an end target of deep semantics based on Clinical Element Models (CEMs), thus interoperating with structured data and accommodating diverse NLP approaches. The type system has been implemented in UIMA (Unstructured Information Management Architecture) and is fully functional in a popular open-source clinical NLP system, cTAKES (clinical Text Analysis and Knowledge Extraction System) versions 2.0 and later. We have created a type system that targets deep semantics, thereby allowing for NLP systems to encapsulate knowledge from text and share it alongside heterogenous clinical data sources. Rather than surface semantics that are typically the end product of NLP algorithms, CEM-based semantics explicitly build in deep clinical semantics as the point of interoperability with more structured data types.

  2. Template-based generation of natural language expressions with Controlled M-Grammar

    A method is described for the generation of related natural-language expressions. The method is based on a formal grammar of the natural language in question, specified in the Controlled M-Grammar (CMG) formalism. In the CMG framework the generation of an utterance is controlled by a derivation

    Despite the literature on the role of input in adult second-language (L2) acquisition and on artificial and statistical language learning, surprisingly little is known about how adults break into a new language in the wild. This article reports on a series of behavioral and neuroimaging studies that

    Lee, Ming Che; Chang, Jia Wei; Hsieh, Tung Cheng


    This paper presents a grammar and semantic corpus based similarity algorithm for natural language sentences. Natural language, in opposition to "artificial language", such as computer programming languages, is the language used by the general public for daily communication. Traditional information retrieval approaches, such as vector models, LSA, HAL, or even the ontology-based approaches that extend to include concept similarity comparison instead of cooccurrence terms/words, may not always determine the perfect matching while there is no obvious relation or concept overlap between two natural language sentences. This paper proposes a sentence similarity algorithm that takes advantage of corpus-based ontology and grammatical rules to overcome the addressed problems. Experiments on two famous benchmarks demonstrate that the proposed algorithm has a significant performance improvement in sentences/short-texts with arbitrary syntax and structure.

    McNamara, Danielle S; Crossley, Scott A; Roscoe, Rod


    The Writing Pal is an intelligent tutoring system that provides writing strategy training. A large part of its artificial intelligence resides in the natural language processing algorithms to assess essay quality and guide feedback to students. Because writing is often highly nuanced and subjective, the development of these algorithms must consider a broad array of linguistic, rhetorical, and contextual features. This study assesses the potential for computational indices to predict human ratings of essay quality. Past studies have demonstrated that linguistic indices related to lexical diversity, word frequency, and syntactic complexity are significant predictors of human judgments of essay quality but that indices of cohesion are not. The present study extends prior work by including a larger data sample and an expanded set of indices to assess new lexical, syntactic, cohesion, rhetorical, and reading ease indices. Three models were assessed. The model reported by McNamara, Crossley, and McCarthy (Written Communication 27:57-86, 2010) including three indices of lexical diversity, word frequency, and syntactic complexity accounted for only 6% of the variance in the larger data set. A regression model including the full set of indices examined in prior studies of writing predicted 38% of the variance in human scores of essay quality with 91% adjacent accuracy (i.e., within 1 point). A regression model that also included new indices related to rhetoric and cohesion predicted 44% of the variance with 94% adjacent accuracy. The new indices increased accuracy but, more importantly, afford the means to provide more meaningful feedback in the context of a writing tutoring system.
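
    The "adjacent accuracy" figure quoted above (a prediction counts as correct when it falls within 1 point of the human rating) can be illustrated with a minimal sketch. The function name, the 1-6 score scale and the sample scores are assumptions made for illustration; they are not taken from the study.

```python
def adjacent_accuracy(predicted, human, tolerance=1):
    """Share of predicted scores within `tolerance` points of the human score."""
    if len(predicted) != len(human):
        raise ValueError("score lists must have the same length")
    if not predicted:
        raise ValueError("score lists must not be empty")
    hits = sum(abs(p - h) <= tolerance for p, h in zip(predicted, human))
    return hits / len(predicted)


# Hypothetical essay scores on a 1-6 scale (not data from the study).
human_scores = [3, 4, 2, 5, 4]
model_scores = [3, 5, 4, 5, 3]

print(adjacent_accuracy(model_scores, human_scores))               # within 1 point
print(adjacent_accuracy(model_scores, human_scores, tolerance=0))  # exact matches only
```

    Setting `tolerance=0` gives exact-match accuracy, which makes it easy to see how much of a reported adjacent accuracy comes from near misses.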

  6. Automation of a problem list using natural language processing

    Full Text Available Abstract Background The medical problem list is an important part of the electronic medical record in development in our institution. To serve the functions it is designed for, the problem list has to be as accurate and timely as possible. However, the current problem list is usually incomplete and inaccurate, and is often totally unused. To alleviate this issue, we are building an environment where the problem list can be easily and effectively maintained. Methods For this project, 80 medical problems were selected for their frequency of use in our future clinical field of evaluation (cardiovascular). We have developed an Automated Problem List system composed of two main components: a background and a foreground application. The background application uses Natural Language Processing (NLP) to harvest potential problem list entries from the list of 80 targeted problems detected in the multiple free-text electronic documents available in our electronic medical record. These proposed medical problems drive the foreground application designed for management of the problem list. Within this application, the extracted problems are proposed to the physicians for addition to the official problem list. Results The set of 80 targeted medical problems selected for this project covered about 5% of all possible diagnoses coded in ICD-9-CM in our study population (cardiovascular adult inpatients), but about 64% of all instances of these coded diagnoses. The system contains algorithms to detect first document sections, then sentences within these sections, and finally potential problems within the sentences. The initial evaluation of the section and sentence detection algorithms demonstrated a sensitivity and positive predictive value of 100% when detecting sections, and a sensitivity of 89% and a positive predictive value of 94% when detecting sentences. Conclusion The global aim of our project is to automate the process of creating and maintaining a problem

    Redd, Andrew; Pickard, Steve; Meystre, Stephane; Scehnet, Jeffrey; Bolton, Dan; Heavirland, Julia; Weaver, Allison Lynn; Hope, Carol; Garvin, Jennifer Hornung


    We introduce and evaluate a new, easily accessible tool using a common statistical analysis and business analytics software suite, SAS, which can be programmed to remove specific protected health information (PHI) from a text document. Removal of PHI is important because the quantity of text documents used for research with natural language processing (NLP) is increasing. When using existing data for research, an investigator must remove all PHI not needed for the research to comply with human subjects' right to privacy. This process is similar, but not identical, to de-identification of a given set of documents. PHI Hunter removes PHI from free-form text. It is a set of rules to identify and remove patterns in text. PHI Hunter was applied to 473 Department of Veterans Affairs (VA) text documents randomly drawn from a research corpus stored as unstructured text in VA files. PHI Hunter performed well with PHI in the form of identification numbers such as Social Security numbers, phone numbers, and medical record numbers. The most commonly missed PHI items were names and locations. Incorrect removal of information occurred with text that looked like identification numbers. PHI Hunter fills a niche role that is related to but not equal to the role of de-identification tools. It gives research staff a tool to reasonably increase patient privacy. It performs well for highly sensitive PHI categories that are rarely used in research, but still shows possible areas for improvement. More development for patterns of text and linked demographic tables from electronic health records (EHRs) would improve the program so that more precise identifiable information can be removed. PHI Hunter is an accessible tool that can flexibly remove PHI not needed for research. If it can be tailored to the specific data set via linked demographic tables, its performance will improve in each new document set.
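
    The rule-based idea described above, where patterns in free text are identified and removed, can be sketched as follows. This is not PHI Hunter itself, which the abstract says is built on SAS; the two regular expressions, the labels and the sample note are illustrative assumptions only.

```python
import re

# Illustrative patterns only; these are assumptions, not PHI Hunter's rules.
PHI_PATTERNS = {
    "SSN": re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),
    "PHONE": re.compile(r"\b\d{3}[-. ]\d{3}[-. ]\d{4}\b"),
}


def redact(text):
    """Replace every match of each pattern with a bracketed label."""
    for label, pattern in PHI_PATTERNS.items():
        text = pattern.sub(f"[{label}]", text)
    return text


# Hypothetical note, invented for this example.
note = "Pt called from 555-123-4567; SSN 123-45-6789 on file. Seen by Dr. Smith."
print(redact(note))
```

    In this sketch the number-like identifiers are replaced while the name is left untouched, which mirrors the abstract's observation that names and locations were the most commonly missed items. Pattern rules of this kind will also remove any unrelated text that happens to look like an identification number.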

  8. HomeNL: Homecare Assistance in Natural Language. An Intelligent Conversational Agent for Hypertensive Patients Management.


  9. Towards multilingual access to textual databases in natural language

    The Cross-Lingual Information Retrieval system (CLIR) or Multilingual Information Retrieval (MIR) has become the key issue in electronic documents management systems in a multinational environment. We propose here a multilingual information retrieval system consisting of a morpho-syntactic analyser, a transfer system from source language to target language and an information retrieval system. A thorough investigation into the system architecture and the transfer mechanisms is proposed in that report, using two different performance evaluation methods. (author) [fr

  10. Of Substance: The Nature of Language Effects on Entity Construal (United States)

    Li, Peggy; Dunham, Yarrow; Carey, Susan


    Shown an entity (e.g., a plastic whisk) labeled by a novel noun in neutral syntax, speakers of Japanese, a classifier language, are more likely to assume the noun refers to the substance (plastic) than are speakers of English, a count/mass language, who are instead more likely to assume it refers to the object kind [whisk; Imai, M., & Gentner, D.…

    Pelucchi, Bruna; Hay, Jessica F; Saffran, Jenny R


    Numerous studies over the past decade support the claim that infants are equipped with powerful statistical language learning mechanisms. The primary evidence for statistical language learning in word segmentation comes from studies using artificial languages, continuous streams of synthesized syllables that are highly simplified relative to real speech. To what extent can these conclusions be scaled up to natural language learning? In the current experiments, English-learning 8-month-old infants' ability to track transitional probabilities in fluent infant-directed Italian speech was tested (N = 72). The results suggest that infants are sensitive to transitional probability cues in unfamiliar natural language stimuli, and support the claim that statistical learning is sufficiently robust to support aspects of real-world language acquisition.
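
    The transitional probability that infants are said to track is conventionally estimated as the frequency of the syllable pair XY divided by the frequency of X. A minimal sketch of that estimate follows; the toy syllable stream is invented for illustration and is not the Italian material used in the experiments.

```python
from collections import Counter


def transitional_probabilities(syllables):
    """Return {(x, y): P(y | x)} estimated from adjacent syllable pairs."""
    pair_counts = Counter(zip(syllables, syllables[1:]))
    # Count each syllable only where it has a successor in the stream.
    first_counts = Counter(syllables[:-1])
    return {(x, y): n / first_counts[x] for (x, y), n in pair_counts.items()}


# Hypothetical toy stream, not the stimuli from the study.
stream = "fu ga bi me lo fu ga ca sa fu ga bi me lo".split()
tp = transitional_probabilities(stream)

print(tp[("fu", "ga")])  # "fu" is always followed by "ga"
print(tp[("ga", "bi")])  # "ga" is followed by "bi" in two of three cases
```

    Pairs inside a recurring unit receive high values, while pairs that span a boundary between units tend to receive lower ones. That contrast is the cue a learner could use for segmentation.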

    Full Text Available Natural Language Processing is one of the fastest-developing fields of research. In most Natural Language Processing applications, the results of Morphological Analysis and Morphological Generation are very important, because morphological analysis is the technique used to recognise a word and its output can be used in later stages. In view of this importance, this paper describes how Morphological Analysis and Morphological Generation form an important part of various Natural Language Processing fields, such as spell checking and machine translation.

    Full Text Available This paper presents a grammar and semantic corpus based similarity algorithm for natural language sentences. Natural language, in opposition to “artificial language”, such as computer programming languages, is the language used by the general public for daily communication. Traditional information retrieval approaches, such as vector models, LSA, HAL, or even the ontology-based approaches that extend to include concept similarity comparison instead of cooccurrence terms/words, may not always determine the perfect matching while there is no obvious relation or concept overlap between two natural language sentences. This paper proposes a sentence similarity algorithm that takes advantage of corpus-based ontology and grammatical rules to overcome the addressed problems. Experiments on two famous benchmarks demonstrate that the proposed algorithm has a significant performance improvement in sentences/short-texts with arbitrary syntax and structure.

  14. A Grammar-Based Semantic Similarity Algorithm for Natural Language Sentences (United States)

    Chang, Jia Wei; Hsieh, Tung Cheng


    This paper presents a grammar and semantic corpus based similarity algorithm for natural language sentences. Natural language, in opposition to “artificial language”, such as computer programming languages, is the language used by the general public for daily communication. Traditional information retrieval approaches, such as vector models, LSA, HAL, or even the ontology-based approaches that extend to include concept similarity comparison instead of cooccurrence terms/words, may not always determine the perfect matching while there is no obvious relation or concept overlap between two natural language sentences. This paper proposes a sentence similarity algorithm that takes advantage of corpus-based ontology and grammatical rules to overcome the addressed problems. Experiments on two famous benchmarks demonstrate that the proposed algorithm has a significant performance improvement in sentences/short-texts with arbitrary syntax and structure. PMID:24982952

    Kiraz, George Anton

    This book presents a tractable computational model that can cope with complex morphological operations, especially in Semitic languages, and less complex morphological systems present in Western languages. It outlines a new generalized regular rewrite rule system that uses multiple finite-state automata to cater to root-and-pattern morphology,…

  16. Natural mineral bottled waters available on the Polish market as a source of minerals for the consumers. Part 1. Calcium and magnesium. (United States)

    Gątarska, Anna; Tońska, Elżbieta; Ciborska, Joanna


    Natural mineral waters may be an essential source of calcium, magnesium and other minerals. In bottled waters, minerals occur in an ionized form which is very easily digestible. However, the concentration of minerals in underground waters (which constitute the material for the production of bottled waters) varies. In view of the above, the type of water consumed is essential. The aim of the study was to estimate the calcium and magnesium contents in products available on the market and to evaluate calcium and magnesium consumption with natural mineral water by different consumer groups with an assumed volume of the consumed product. The samples represented forty different brands of natural mineral waters available on the Polish market. These waters were produced in Poland or other European countries. Among the studied products, about 30% of the waters were imported from Lithuania, Latvia, the Czech Republic, France, Italy and Germany. The content of calcium and magnesium in mineral waters was determined using flame atomic absorption spectrometry in an acetylene-air flame. Further determinations were carried out using an atomic absorption spectrometer (ICE 3000 Series, Thermo, England) equipped with a GLITE data station, background correction (a deuterium lamp) as well as other cathode lamps. Over half of the analysed natural mineral waters were medium-mineralized. The natural mineral waters available on the market can be characterized by a varied content of calcium and magnesium, and a high degree of product mineralization does not guarantee significant amounts of these components. Among the natural mineral waters available on the market, only a few feature the optimum calcium-magnesium proportion (2:1). Considering the mineralization degree of the studied products, it can be stated that the largest percentage of products with significant calcium and magnesium contents can be found in the high-mineralized water group. For some natural mineral waters, the consumption of 1 litre daily may

    Wagner, J C; Solomon, W D; Michel, P A; Juge, C; Baud, R H; Rector, A L; Scherrer, J R


    Re-usable and sharable, and therefore language-independent concept models are of increasing importance in the medical domain. The GALEN project (Generalized Architecture for Languages Encyclopedias and Nomenclatures in Medicine) aims at developing language-independent concept representation systems as the foundations for the next generation of multilingual coding systems. For use within clinical applications, the content of the model has to be mapped to natural language. A so-called Multilingual Information Module (MM) establishes the link between the language-independent concept model and different natural languages. This text generation software must be versatile enough to cope at the same time with different languages and with different parts of a compositional model. It has to meet, on the one hand, the properties of the language as used in the medical domain and, on the other hand, the specific characteristics of the underlying model and its representation formalism. We propose a semantic-oriented approach to natural language generation that is based on linguistic annotations to a concept model. This approach is realized as an integral part of a Terminology Server, built around the concept model and offering different terminological services for clinical applications.

    Jamil, Hasan M


    One of the many unique features of biological databases is that the mere existence of a ground data item is not always a precondition for a query response. It may be argued that from a biologist's standpoint, queries are not always best posed using a structured language. By this we mean that approximate and flexible responses to natural language like queries are well suited for this domain. This is partly due to biologists' tendency to seek simpler interfaces and partly due to the fact that questions in biology involve high level concepts that are open to interpretations computed using sophisticated tools. In such highly interpretive environments, rigidly structured databases do not always perform well. In this paper, our goal is to propose a semantic correspondence plug-in to aid natural language query processing over arbitrary biological database schema with an aim to providing cooperative responses to queries tailored to users' interpretations. Natural language interfaces for databases are generally effective when they are tuned to the underlying database schema and its semantics. Therefore, changes in database schema become impossible to support, or a substantial reorganization cost must be absorbed to reflect any change. We leverage developments in natural language parsing, rule languages and ontologies, and data integration technologies to assemble a prototype query processor that is able to transform a natural language query into a semantically equivalent structured query over the database. We allow knowledge rules and their frequent modifications as part of the underlying database schema. The approach we adopt in our plug-in overcomes some of the serious limitations of many contemporary natural language interfaces, including support for schema modifications and independence from underlying database schema. The plug-in introduced in this paper is generic and facilitates connecting user selected natural language interfaces to arbitrary databases using a

    Kotarba, M.J.; Curtis, John B.; Lewan, M.D.


    This study examined the molecular and isotopic compositions of gases generated from different kerogen types (i.e., Types I/II, II, IIS and III) in Menilite Shales by sequential hydrous pyrolysis experiments. The experiments were designed to simulate gas generation from source rocks at pre-oil-cracking thermal maturities. Initially, rock samples were heated in the presence of liquid water at 330 °C for 72 h to simulate early gas generation dominated by the overall reaction of kerogen decomposition to bitumen. Generated gas and oil were quantitatively collected at the completion of the experiments and the reactor with its rock and water was resealed and heated at 355 °C for 72 h. This condition simulates late petroleum generation in which the dominant overall reaction is bitumen decomposition to oil. This final heating equates to a cumulative thermal maturity of 1.6% Rr, which represents pre-oil-cracking conditions. In addition to the generated gases from these two experiments being characterized individually, they are also summed to characterize a cumulative gas product. These results are compared with natural gases produced from sandstone reservoirs within or directly overlying the Menilite Shales. The experimentally generated gases show no molecular compositions that are distinct for the different kerogen types, but on a total organic carbon (TOC) basis, oil prone kerogens (i.e., Types I/II, II and IIS) generate more hydrocarbon gas than gas prone Type III kerogen. Although the proportionality of methane to ethane in the experimental gases is lower than that observed in the natural gases, the proportionality of ethane to propane and i-butane to n-butane are similar to those observed for the natural gases. δ13C values of the experimentally generated methane, ethane and propane show distinctions among the kerogen types. This distinction is related to the δ13C of the original kerogen, with 13C enriched kerogen generating more 13C enriched hydrocarbon gases than

    Barrera, Rosalinda B.; Aleman, Magdalena


    Described is a newspaper project in which elementary students report life as it was in the Middle Ages. Students are involved in a variety of language-centered activities. For example, they gather and evaluate information about medieval times and write, edit, and proofread articles for the newspaper. (RM)

    This paper describes how a language generation system that was originally designed for monologue generation, has been adapted for use in the OVIS spoken dialogue system. To meet the requirement that in a dialogue, the system’s utterances should make up a single, coherent dialogue turn, several

    Where Humans Meet Machines: Innovative Solutions for Knotty Natural-Language Problems brings humans and machines closer together by showing how linguistic complexities that confound the speech systems of today can be handled effectively by sophisticated natural-language technology. Some of the most vexing natural-language problems that are addressed in this book entail recognizing and processing idiomatic expressions, understanding metaphors, matching an anaphor correctly with its antecedent, performing word-sense disambiguation, and handling out-of-vocabulary words and phrases. This fourteen-chapter anthology consists of contributions from industry scientists and from academicians working at major universities in North America and Europe. They include researchers who have played a central role in DARPA-funded programs and developers who craft real-world solutions for corporations. These contributing authors analyze the role of natural language technology in the global marketplace; they explore the need f...

    polish is demonstrated by comparing its performance with the traditional functional ANOVA fitted by means under different outlier models in simulation studies. The functional median polish is illustrated on various applications in climate science

    Snefjella, Bryor; Kuperman, Victor


    Existing evidence shows that more abstract mental representations are formed and more abstract language is used to characterize phenomena that are more distant from the self. Yet the precise form of the functional relationship between distance and linguistic abstractness is unknown. In four studies, we tested whether more abstract language is used in textual references to more geographically distant cities (Study 1), time points further into the past or future (Study 2), references to more socially distant people (Study 3), and references to a specific topic (Study 4). Using millions of linguistic productions from thousands of social-media users, we determined that linguistic concreteness is a curvilinear function of the logarithm of distance, and we discuss psychological underpinnings of the mathematical properties of this relationship. We also demonstrated that gradient curvilinear effects of geographic and temporal distance on concreteness are nearly identical, which suggests uniformity in representation of abstractness along multiple dimensions. © The Author(s) 2015.

    Full Text Available Autism spectrum disorders (ASD) are pervasive neurodevelopmental disorders involving a number of deficits to linguistic cognition. The gap between genetics and the pathophysiology of ASD remains open, in particular regarding its distinctive linguistic profile. The goal of this paper is to attempt to bridge this gap, focusing on how the autistic brain processes language, particularly through the perspective of brain rhythms. Due to the phenomenon of pleiotropy, which may take some decades to overcome, we believe that studies of brain rhythms, which are not faced with problems of this scale, may constitute a more tractable route to interpreting language deficits in ASD and eventually other neurocognitive disorders. Building on recent attempts to link neural oscillations to certain computational primitives of language, we show that interpreting language deficits in ASD as oscillopathic traits is a potentially fruitful way to construct successful endophenotypes of this condition. Additionally, we will show that candidate genes for ASD are overrepresented among the genes that played a role in the evolution of language. These genes include (and are related to) genes involved in brain rhythmicity. We hope that the type of steps taken here will additionally lead to a better understanding of the comorbidity, heterogeneity, and variability of ASD, and may help achieve a better treatment of the affected populations.

    Developing a sufficient understanding of environmental processes and exposure pathways that permit observations to be explained and robust predictions to be made over spatial and temporal scales is a clear challenge that radioecology needs to address. This scientific challenge has been developed as a separate section of the Strategic Research Agenda (SRA), a document produced by the STAR Network of Excellence in Radioecology that outlines a suggested prioritisation of research topics in radioecology. In reality, bringing the SRA to fruition requires, besides considerable resources and time, an available proving ground. The sole sources of such data are areas affected by nuclear accidents, but the conditions there do not meet the requirements of a scientific experiment. On the other hand, it is hard to imagine anyone deliberately releasing a substantial amount of radioactivity into the environment in order to observe what would happen. Some coal mines in the Upper Silesian Coal Basin have discharged radium-rich brines continuously for many years. The total amount of radium released to inland water is quite well known and varies with time or exploitation conditions. This phenomenon has been observed for more than 30 years, and many contaminated sites in different states have been identified. Natural radionuclides (mainly radium isotopes) present in mine water after its release into the environment are subject to different chemical and/or physical processes influencing their final fate. The processes of concern are, e.g., precipitation, sedimentation, adsorption, absorption, ion exchange, desorption, leaching, erosion, sequential decay, etc. Based on physical and chemical rules, available data and real environmental conditions, the key processes that govern the behaviour of radium and its progeny after discharge with mine water, the associated transfers among environmental compartments and the resulting exposures of both non-human and human populations have been identified. The

  12. Cancer morbidity among polishers. (United States)

    Järvholm, B; Thiringer, G; Axelson, O


    The mortality pattern among 86 men was determined to investigate the possible hazards of polishing steel. The men had polished steel with polishing paste for at least five years. The polishing pastes had contained tallow, beeswax, carnauba wax, alundum, carborundum, ferric oxide, and chalk. A total of 18 men had died compared with 13.3 expected. Four had died of stomach cancer compared with 0.44 expected (p < 0.005). The mortality for other causes of death was not increased. The study does not permit any definite conclusion but indicates a possible cancer hazard among polishers. PMID:7066237
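
    The observed and expected counts quoted above can be turned into observed-to-expected ratios and a tail probability with a few lines of code. The abstract does not say which statistical test the authors used; the one-sided Poisson tail below is an assumption, chosen because it is a common way to compare an observed count with an expected one.

```python
import math


def poisson_upper_tail(observed, expected):
    """P(X >= observed) for X ~ Poisson(expected)."""
    below = sum(
        math.exp(-expected) * expected**k / math.factorial(k)
        for k in range(observed)
    )
    return 1.0 - below


# Counts taken from the abstract: observed deaths vs. expected deaths.
all_causes_ratio = 18 / 13.3
stomach_ratio = 4 / 0.44

# Assumed test: one-sided Poisson tail for the stomach cancer deaths.
stomach_p = poisson_upper_tail(4, 0.44)

print(f"all causes O/E = {all_causes_ratio:.2f}")
print(f"stomach cancer O/E = {stomach_ratio:.2f}, Poisson tail p = {stomach_p:.4f}")
```

    Under this assumption the tail probability for four stomach cancer deaths falls below the 0.005 threshold reported in the abstract.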

  13. From language to nature: The semiotic metaphor in biology

    be of considerable value, not only heuristically, but in order to comprehend the irreducible nature of living organisms. In arguing for a semiotic perspective on living nature, it makes a marked difference whether the departure is made from the tradition of F. de Saussure´s structural linguistics or from...

    Artificial Intelligence federates numerous scientific fields in the aim of developing machines able to assist human operators performing complex treatments---most of which demand high cognitive skills (e.g. learning or decision processes). Central to this quest is to give machines the ability to estimate the likeness or similarity between things in the way human beings estimate the similarity between stimuli. In this context, this book focuses on semantic measures: approaches designed for comparing semantic entities such as units of language, e.g. words, sentences, or concepts and instances def

    Using the back propagation algorithm, we have trained the feed forward neural network to pronounce Polish language, more precisely to translate Polish text into its phonematic counterpart. Depending on the input coding and network architecture, 88%-95% translation efficiency was achieved. (author)

    Full Text Available The paper makes an attempt to analyse the forms of co-functioning of world languages, taking into account the fact that popular prestigious languages exert influence upon a number of less popular ones, thereby dictating the forms of their further development. Thus, the thesis that media-favoured languages used by politically salient super-powers effectively influence the expressions accepted in a number of “less successful” languages is identified, evidenced and diagnosed. Furthermore, the latter part of the paper stresses the issues concerning the observation that English, recognized as the most prominent donor language, creates many forms of description generally used in many other languages to denote and define similar forms of experiences. Research aimed at discovering the ways in which English influences recipient languages, in this case Polish, was carried out. Our principal assumption was that there exist at least two types of numerous contacts between a donor and recipient language: ones that can be called external (when the donor language mostly influences the recipient one) and ones possibly labelled as internal (when various, normally observed forms of co-operation between the two languages can be traced). As in both cases some semantic bonds can be found, the subsequent research describes said bonds, naming them and uncovering the nature of such co-existence. The results of the research indicate clear forms of semantic co-existence showing that numerous borrowings and loanwords found in the recipient language are widely verbalized and deeply ingrained in the cultural linguistic interdependencies.

    Barker-Plummer, Dave; Dale, Robert; Cox, Richard; Romanczuk, Alex


    We have assembled a large corpus of student submissions to an automatic grading system, where the subject matter involves the translation of natural language sentences into propositional logic. Of the 2.3 million translation instances in the corpus, 286,000 (approximately 12%) are categorized as being in error. We want to understand the nature of…

  19. Polish natural bee honeys are anti-proliferative and anti-metastatic agents in human glioblastoma multiforme U87MG cell line.

    Full Text Available Honey has been used as food and a traditional medicament since ancient times. However, recently many scientists have been concentrating on the anti-oxidant, anti-proliferative, anti-inflammatory and other properties of honey. In this study, we investigated for the first time an anticancer effect of different honeys from Poland on tumor cell line - glioblastoma multiforme U87MG. Anti-proliferative activity of honeys and its interferences with temozolomide were determined by a cytotoxicity test and DNA binding by [H3]-thymidine incorporation. A gelatin zymography was used to conduct an evaluation of metalloproteinases (MMP-2 and MMP-9) expression in U87MG treatment with honey samples. The honeys were previously tested qualitatively (diastase activity, total phenolic content, lead and cadmium content). The data demonstrated that the examined honeys have a potent anti-proliferative effect on U87MG cell line in a time- and dose-dependent manner, being effective at concentrations as low as 0.5% (multifloral light honey - viability 53% after 72 h of incubation). We observed that after 48 h, combining honey with temozolomide showed a significantly higher inhibitory effect than the samples of honey alone. We observed a strong inhibition of MMP-2 and MMP-9 for the tested honeys (from 20 to 56% and from 5 to 58% compared to control, respectively). Our results suggest that Polish honeys have an anti-proliferative and anti-metastatic effect on U87MG cell line. Therefore, natural bee honey can be considered as a promising adjuvant treatment for brain tumors.

    Daltrozzo, Jerome; Emerson, Samantha N; Deocampo, Joanne; Singh, Sonia; Freggens, Marjorie; Branum-Martin, Lee; Conway, Christopher M


    Statistical learning (SL) is believed to enable language acquisition by allowing individuals to learn regularities within linguistic input. However, neural evidence supporting a direct relationship between SL and language ability is scarce. We investigated whether there are associations between event-related potential (ERP) correlates of SL and language abilities while controlling for the general level of selective attention. Seventeen adults completed tests of visual SL, receptive vocabulary, grammatical ability, and sentence completion. Response times and ERPs showed that SL is related to receptive vocabulary and grammatical ability. ERPs indicated that the relationship between SL and grammatical ability was independent of attention while the association between SL and receptive vocabulary depended on attention. The implications of these dissociative relationships in terms of underlying mechanisms of SL and language are discussed. These results further elucidate the cognitive nature of the links between SL mechanisms and language abilities. Copyright © 2017 Elsevier Inc. All rights reserved.

    Full Text Available The task of evaluating uncertainty in the measurement of sense in natural language constructions (NLCs) was researched through formalization of the notions of the language image, formalization of artificial cognitive systems (ACSs) and the formalization of units of meaning. The method for measuring the sense of natural language constructions incorporated fuzzy relations of meaning, which ensures that information about the links between lemmas of the text is taken into account, permitting the evaluation of two types of measurement uncertainty of sense characteristics. Using the developed application programs, experiments were conducted to investigate how the proposed method identifies informative characteristics of text. The experiments yielded parameter dependencies that allow the Pareto distribution law to be used to define relations between lemmas; analysis of these relations identifies the exponents of the average number of connections of the language image as the most informative characteristics of text.

    Clody, Michael C


    The essay argues that Francis Bacon's considerations of parables and cryptography reflect larger interpretative concerns of his natural philosophic project. Bacon describes nature as having a language distinct from those of God and man, and, in so doing, establishes a central problem of his natural philosophy—namely, how can the language of nature be accessed through scientific representation? Ultimately, Bacon's solution relies on a theory of differential and duplicitous signs that conceal within them the hidden voice of nature, which is best recognized in the natural forms of efficient causality. The "alphabet of nature"—those tables of natural occurrences—consequently plays a central role in his program, as it renders nature's language susceptible to a process and decryption that mirrors the model of the bilateral cipher. It is argued that while the writing of Bacon's natural philosophy strives for literality, its investigative process preserves a space for alterity within scientific representation, that is made accessible to those with the interpretative key.

    Heinrich, Stefan; Wermter, Stefan


    For the complex human brain that enables us to communicate in natural language, we have gathered a good understanding of the principles underlying language acquisition and processing, knowledge about sociocultural conditions, and insights into activity patterns in the brain. However, we have not yet been able to understand the behavioural and mechanistic characteristics of natural language and how mechanisms in the brain allow us to acquire and process language. In bridging the insights from behavioural psychology and neuroscience, the goal of this paper is to contribute a computational understanding of appropriate characteristics that favour language acquisition. Accordingly, we provide concepts and refinements in cognitive modelling regarding principles and mechanisms in the brain and propose a neurocognitively plausible model for embodied language acquisition from real-world interaction of a humanoid robot with its environment. In particular, the architecture consists of a continuous time recurrent neural network, where parts have different leakage characteristics and thus operate on multiple timescales for every modality, and of the association of the higher level nodes of all modalities into cell assemblies. The model is capable of learning language production grounded in both temporal dynamic somatosensation and vision, and features hierarchical concept abstraction, concept decomposition, multi-modal integration, and self-organisation of latent representations.
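
    The idea of units with different leakage characteristics can be illustrated with a generic leaky-integrator update. The following is only a minimal sketch under stated assumptions: the function name ctrnn_step, the tanh activation, the unit time step, and all weights and time constants are hypothetical and are not taken from the authors' model.

```python
import math

def ctrnn_step(u, x, w_rec, w_in, tau):
    """One Euler step (dt = 1) of a leaky-integrator recurrent layer.

    u     : internal states of the units (list of floats)
    x     : external input vector for this time step
    w_rec : recurrent weights, w_rec[i][j] = weight from unit j to unit i
    w_in  : input weights, w_in[i][k] = weight from input k to unit i
    tau   : per-unit time constants (>= 1); a larger tau means slower leakage
    """
    y = [math.tanh(v) for v in u]  # firing rates computed from the previous states
    new_u = []
    for i in range(len(u)):
        drive = sum(w_rec[i][j] * y[j] for j in range(len(y)))
        drive += sum(w_in[i][k] * x[k] for k in range(len(x)))
        # Leaky integration: keep (1 - 1/tau) of the old state, add 1/tau of the drive.
        new_u.append((1.0 - 1.0 / tau[i]) * u[i] + drive / tau[i])
    return new_u

def run_demo(steps=10):
    """Drive a fast unit (tau = 2) and a slow unit (tau = 50) with the same constant input."""
    tau = [2.0, 50.0]                  # illustrative values only
    w_rec = [[0.0, 0.0], [0.0, 0.0]]   # recurrence switched off to isolate the leak
    w_in = [[1.0], [1.0]]
    u = [0.0, 0.0]
    for _ in range(steps):
        u = ctrnn_step(u, [1.0], w_rec, w_in, tau)
    return u
```

    In this toy setting the unit with the small time constant follows the input almost immediately, while the unit with the large time constant changes slowly, which is one common way to obtain several timescales within a single recurrent network.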

    Liu, Haitao; Xu, Chunshan; Liang, Junying


    Dependency distance, measured by the linear distance between two syntactically related words in a sentence, is generally held as an important index of memory burden and an indicator of syntactic difficulty. Since this constraint of memory is common for all human beings, there may well be a universal preference for dependency distance minimization (DDM) for the sake of reducing memory burden. This human-driven language universal is supported by big data analyses of various corpora that consistently report shorter overall dependency distance in natural languages than in artificial random languages and long-tailed distributions featuring a majority of short dependencies and a minority of long ones. Human languages, as complex systems, seem to have evolved to come up with diverse syntactic patterns under the universal pressure for dependency distance minimization. However, there always exist a small number of long-distance dependencies in natural languages, which may reflect some other biological or functional constraints. Language system may adapt itself to these sporadic long-distance dependencies. It is these universal constraints that have shaped such a rich diversity of syntactic patterns in human languages. Copyright © 2017. Published by Elsevier B.V.
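
    The measure described above can be computed directly from a dependency parse. The following is a minimal sketch, assuming each sentence is given as a list of 1-based head positions with 0 marking the root; the function names and the example parse are illustrative and not taken from the paper.

```python
def dependency_distances(heads):
    """Return the linear distance of each dependency in one sentence.

    heads[i] is the 1-based position of the head of word i + 1,
    or 0 if that word is the root (the root has no incoming dependency).
    """
    return [abs(head - pos) for pos, head in enumerate(heads, start=1) if head != 0]

def mean_dependency_distance(heads):
    """Average the dependency distances of a sentence (0.0 if it has no dependencies)."""
    distances = dependency_distances(heads)
    return sum(distances) / len(distances) if distances else 0.0

# Assumed parse of "The dog chased the cat":
# The -> dog, dog -> chased, chased = root, the -> cat, cat -> chased
example_heads = [2, 3, 0, 5, 3]
```

    For the example sentence, three dependencies link adjacent words and one spans two positions, so short dependencies dominate, in line with the long-tailed pattern the abstract reports.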

    Scientists increasingly use workflows to represent and share their computational experiments. Because of their declarative nature, focus on pre-existing component composition and the availability of visual editors, workflows provide a valuable start for creating user-friendly environments for end

  9. Linguistic fundamentals for natural language processing: 100 essentials from morphology and syntax

    Full Text Available We propose a stochastic model for the number of different words in a given database which incorporates the dependence on the database size and historical changes. The main feature of our model is the existence of two different classes of words: (i) a finite number of core words, which have higher frequency and do not affect the probability of a new word to be used, and (ii) the remaining virtually infinite number of noncore words, which have lower frequency and, once used, reduce the probability of a new word to be used in the future. Our model relies on a careful analysis of the Google Ngram database of books published in the last centuries, and its main consequence is the generalization of Zipf’s and Heaps’ law to two-scaling regimes. We confirm that these generalizations yield the best simple description of the data among generic descriptive models and that the two free parameters depend only on the language but not on the database. From the point of view of our model, the main change on historical time scales is the composition of the specific words included in the finite list of core words, which we observe to decay exponentially in time with a rate of approximately 30 words per year for English.
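
    The two quantities that Zipf's and Heaps' laws describe can be measured on any tokenized text. The sketch below only computes these empirical curves; it does not implement the authors' two-class stochastic model, and the function names and the toy sentence are assumptions made for illustration.

```python
from collections import Counter

def vocabulary_growth(tokens):
    """Return the number of distinct words seen after each token (Heaps-type curve)."""
    seen = set()
    growth = []
    for tok in tokens:
        seen.add(tok)
        growth.append(len(seen))
    return growth

def rank_frequency(tokens):
    """Return word counts sorted from most to least frequent (Zipf-type curve)."""
    return [count for _, count in Counter(tokens).most_common()]

# Toy input; a real analysis would use a large corpus.
tokens = "the cat saw the dog and the dog saw the cat".split()
```

    On a large corpus, plotting both lists on log-log axes is a common way to check whether one power law or two scaling regimes fit the data better.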

  11. An algorithm to transform natural language into SQL queries for relational databases

    Full Text Available An intelligent interface that enables efficient interaction between users and databases is a need of database applications. Databases must be intelligent enough to make access faster. However, not every user is familiar with Structured Query Language (SQL) queries: users may not be aware of the structure of the database, and they would otherwise have to learn SQL. Non-expert users therefore need a system that lets them interact with relational databases in a natural language such as English. For this, a Database Management System (DBMS) must be able to understand Natural Language (NL). In this research, an intelligent interface is developed using a semantic matching technique that translates a natural language query to SQL using a set of production rules and a data dictionary. The data dictionary consists of semantic sets for relations and attributes. A series of steps (lower-case conversion, tokenization, part-of-speech tagging, and database element and SQL element extraction) is used to convert a Natural Language Query (NLQ) to an SQL query. The transformed query is executed and the results are returned to the user. Such an intelligent interface is what database applications need for efficient interaction between users and the DBMS.
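
    The steps listed in the abstract can be sketched on a toy scale. The following is a minimal illustration, not the authors' system: the data dictionary, the employee table and its columns are hypothetical, part-of-speech tagging and production rules are omitted, and only a single-table SELECT is produced.

```python
import re

# Hypothetical data dictionary: synonym -> database element (illustrative only).
RELATIONS = {"employees": "employee", "employee": "employee", "staff": "employee"}
ATTRIBUTES = {"name": "name", "names": "name", "salary": "salary", "pay": "salary"}

def nlq_to_sql(query):
    """Translate a very simple natural language query into a SELECT statement."""
    # Lower-case conversion and tokenization in one step.
    tokens = re.findall(r"[a-z]+", query.lower())
    table = None
    columns = []
    for tok in tokens:
        # Database element extraction: match tokens against the data dictionary.
        if table is None and tok in RELATIONS:
            table = RELATIONS[tok]
        if tok in ATTRIBUTES and ATTRIBUTES[tok] not in columns:
            columns.append(ATTRIBUTES[tok])
    if table is None:
        raise ValueError("no known relation mentioned in the query")
    # SQL element construction: fall back to * when no attribute was recognized.
    select_list = ", ".join(columns) if columns else "*"
    return "SELECT {} FROM {}".format(select_list, table)
```

    For example, the query "Show the names and pay of all staff" maps the synonyms "names", "pay" and "staff" to the assumed schema and yields a SELECT over the name and salary columns of the employee table.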

  12. Selecting the Best Mobile Information Service with Natural Language User Input (United States)

    Information services accessed via mobile phones provide information directly relevant to subscribers’ daily lives and are an area of dynamic market growth worldwide. Although many information services are currently offered by mobile operators, many of the existing solutions require a unique gateway for each service, and it is inconvenient for users to have to remember a large number of such gateways. Furthermore, the Short Message Service (SMS) is very popular in China and Chinese users would prefer to access these services in natural language via SMS. This chapter describes a Natural Language Based Service Selection System (NL3S) for use with a large number of mobile information services. The system can accept user queries in natural language and route them to the required service. Since it is difficult for existing methods to achieve high accuracy and high coverage and to anticipate which other services a user might want to query, the NL3S is developed based on a Multi-service Ontology (MO) and Multi-service Query Language (MQL). The MO and MQL provide semantic and linguistic knowledge, respectively, to facilitate service selection for a user query and to provide adaptive service recommendations. Experiments show that the NL3S can achieve 75-95% accuracy and 85-95% user satisfaction when processing various styles of natural language queries. A trial involving navigation of 30 different mobile services shows that the NL3S can provide a viable commercial solution for mobile operators.

    Full Text Available Lesley Milroy's Observing and Analysing Natural Language is a recent addition to an ever-growing number of publications in the field of sociolinguistics. It carries the weight of one of the field's experienced present-day authors and should offer basic information to both newcomers and established investigators of natural language.

    Köhler, Piotr


    from oral testimony, that the times of Lysenkoism were a terrible period in Polish botany, with all kinds of pressures exerted on botanists who did not adopt it. Fortunately, no Polish botanists lost their lives. The Lysenkoist period in Polish botany retarded the development of many of its branches. In the last fifty years many of the setbacks have been made up for, but it is in the biological education of the general public that Lysenkoism has had a more serious effect. Several generations of young people failed to be introduced to genetics, or at least its foundations, at any level of schooling. Instead they were inculcated with the erroneous belief of man's limitless possibilities in transforming nature, including the view that species can be shaped freely in line with economic needs. (ABSTRACT TRUNCATED)

  16. Automated Trait Extraction using ClearEarth, a Natural Language Processing System for Text Mining in Natural Sciences


    Topac, Vasile; Jurcau, Daniel-Alexandru; Stoicu-Tivadar, Vasile


    Medical terminology appears in the natural language in multiple forms: canonical, derived or inflected form. This research presents an analysis of the form in which medical terminology appears in Romanian and English language. The sources of medical language used for the study are web pages presenting medical information for patients and other lay users. The results show that, in English, medical terminology tends to appear more in canonical form while, in the case of Romanian, it is the opposite. This paper also presents the service that was created to perform this analysis. This tool is available for the general public, and it is designed to be easily extensible, allowing the addition of other languages.

    We provide dramatic evidence that 'Mellin space' is the natural home for correlation functions in CFTs with weakly coupled bulk duals. In Mellin space, CFT correlators have poles corresponding to an OPE decomposition into 'left' and 'right' sub-correlators, in direct analogy with the factorization channels of scattering amplitudes. In the regime where these correlators can be computed by tree level Witten diagrams in AdS, we derive an explicit formula for the residues of Mellin amplitudes at the corresponding factorization poles, and we use the conformal Casimir to show that these amplitudes obey algebraic finite difference equations. By analyzing the recursive structure of our factorization formula we obtain simple diagrammatic rules for the construction of Mellin amplitudes corresponding to tree-level Witten diagrams in any bulk scalar theory. We prove the diagrammatic rules using our finite difference equations. Finally, we show that our factorization formula and our diagrammatic rules morph into the flat space S-Matrix of the bulk theory, reproducing the usual Feynman rules, when we take the flat space limit of AdS/CFT. Throughout we emphasize a deep analogy with the properties of flat space scattering amplitudes in momentum space, which suggests that the Mellin amplitude may provide a holographic definition of the flat space S-Matrix.

    Full Text Available The Polish Cartographical Review (PCR) journal has been published in English four times a year since 2015. The journal is in open access and it is published by De Gruyter Open. It is edited by Polish scientists in collaboration with international experts.

    Leung, Constant; Scarino, Angela


    Transformations associated with the increasing speed, scale, and complexity of mobilities, together with the information technology revolution, have changed the demography of most countries of the world and brought about accompanying social, cultural, and economic shifts (Heugh, 2013). This complex diversity has changed the very nature of…

  2. AutoTutor and Family: A Review of 17 Years of Natural Language Tutoring (United States)

    Nye, Benjamin D.; Graesser, Arthur C.; Hu, Xiangen


  3. Speech perception and reading: two parallel modes of understanding language and implications for acquiring literacy naturally. (United States)

    Massaro, Dominic W


    I review 2 seminal research reports published in this journal during its second decade more than a century ago. Given psychology's subdisciplines, they would not normally be reviewed together because one involves reading and the other speech perception. The small amount of interaction between these domains might have limited research and theoretical progress. In fact, the 2 early research reports revealed common processes involved in these 2 forms of language processing. Their illustration of the role of Wundt's apperceptive process in reading and speech perception anticipated descriptions of contemporary theories of pattern recognition, such as the fuzzy logical model of perception. Based on the commonalities between reading and listening, one can question why they have been viewed so differently. It is commonly believed that learning to read requires formal instruction and schooling, whereas spoken language is acquired from birth onward through natural interactions with people who talk. Most researchers and educators believe that spoken language is acquired naturally from birth onward and even prenatally. Learning to read, on the other hand, is not possible until the child has acquired spoken language, reaches school age, and receives formal instruction. If an appropriate form of written text is made available early in a child's life, however, the current hypothesis is that reading will also be learned inductively and emerge naturally, with no significant negative consequences. If this proposal is true, it should soon be possible to create an interactive system, Technology Assisted Reading Acquisition, to allow children to acquire literacy naturally.

    Higginbotham, D Jeffery; Lesher, Gregory W; Moulton, Bryan J; Roark, Brian


    Significant progress has been made in the application of natural language processing (NLP) to augmentative and alternative communication (AAC), particularly in the areas of interface design and word prediction. This article will survey the current state-of-the-science of NLP in AAC and discuss its future applications for the development of next generation of AAC technology.

    We are pleased to present the Proceedings of the 12th European Workshop on Natural Language Generation (ENLG 2009). ENLG 2009 was held in Athens, Greece, as a workshop at the 12th Conference of the European Chapter of the Association for Computational Linguistics (EACL 2009). Following our call, we

    Gerzymisch-Arbogast, Heidrun


    A theoretical discussion is offered on whether the subjunctive in the Romance languages is by nature thematic, as suggested in previous studies. English and Spanish samples are used to test the hypothesis; one conclusion is that the subjunctive seems to offer speaker-related information and may express the intensity of the speaker's involvement.…

    Laski, Karen E.; And Others


    Parents of four nonverbal and four echolalic autistic children, aged five-nine, were trained to increase their children's speech by using the Natural Language Paradigm. Following training, parents increased the frequency with which they required their children to speak, and children increased the frequency of their verbalizations in three…

  8. Modelling the phonotactic structure of natural language words with simple recurrent networks

    Simple Recurrent Networks (SRN) are Neural Network (connectionist) models able to process natural language. Phonotactics concerns the order of symbols in words. We continued an earlier unsuccessful trial to model the phonotactics of Dutch words with SRNs. In order to overcome the previously reported

    Ingram, D. E.

    The nature and development of the recently released International English Language Testing System (IELTS) instrument are described. The test is the result of a joint Australian-British project to develop a new test for use with foreign students planning to study in English-speaking countries. It is expected that the modular instrument will become…

    Tierney, Patrick J.


    This paper introduces a method of extending natural language-based processing of qualitative data analysis with the use of a very quantitative tool--graph theory. It is not an attempt to convert qualitative research to a positivist approach with a mathematical black box, nor is it a "graphical solution". Rather, it is a method to help qualitative…

    Balyan, Renu; McCarthy, Kathryn S.; McNamara, Danielle S.


    This study examined how machine learning and natural language processing (NLP) techniques can be leveraged to assess the interpretive behavior that is required for successful literary text comprehension. We compared the accuracy of seven different machine learning classification algorithms in predicting human ratings of student essays about…

    Wong, Wing-Kwong; Yin, Sheng-Kai; Yang, Chang-Zhe


    This paper presents a tool for drawing dynamic geometric figures by understanding the texts of geometry problems. With the tool, teachers and students can construct dynamic geometric figures on a web page by inputting a geometry problem in natural language. First we need to build the knowledge base for understanding geometry problems. With the…

    Kyle, Kristopher; Crossley, Scott A; McNamara, Danielle S.


    This study explores the construct validity of speaking tasks included in the TOEFL iBT (e.g., integrated and independent speaking tasks). Specifically, advanced natural language processing (NLP) tools, MANOVA difference statistics, and discriminant function analyses (DFA) are used to assess the degree to which and in what ways responses to these…

    Fan, Christina Siu-Dschu; Zhu, Xingyu; Dosch, Hans Günter; von Stutterheim, Christiane; Rupp, André


    In tonal languages, such as Mandarin Chinese, the pitch contour of vowels discriminates lexical meaning, which is not the case in non-tonal languages such as German. Recent data provide evidence that pitch processing is influenced by language experience. However, there are still many open questions concerning the representation of such phonological and language-related differences at the level of the auditory cortex (AC). Using magnetoencephalography (MEG), we recorded transient and sustained auditory evoked fields (AEF) in native Chinese and German speakers to investigate language related phonological and semantic aspects in the processing of acoustic stimuli. AEF were elicited by spoken meaningful and meaningless syllables, by vowels, and by a French horn tone. Speech sounds were recorded from a native speaker and showed frequency-modulations according to the pitch-contours of Mandarin. The sustained field (SF) evoked by natural speech signals was significantly larger for Chinese than for German listeners. In contrast, the SF elicited by a horn tone was not significantly different between groups. Furthermore, the SF of Chinese subjects was larger when evoked by meaningful syllables compared to meaningless ones, but there was no significant difference regarding whether vowels were part of the Chinese phonological system or not. Moreover, the N100m gave subtle but clear evidence that for Chinese listeners other factors than purely physical properties play a role in processing meaningful signals. These findings show that the N100 and the SF generated in Heschl's gyrus are influenced by language experience, which suggests that AC activity related to specific pitch contours of vowels is influenced in a top-down fashion by higher, language related areas. Such interactions are in line with anatomical findings and neuroimaging data, as well as with the dual-stream model of language of Hickok and Poeppel that highlights the close and reciprocal interaction between

    Schwartz, Geoffrey


    Acoustic and perceptual studies investigate B2-level Polish learners' acquisition of second language (L2) English word-boundaries involving word-initial vowels. In production, participants were less likely to produce glottalization of phrase-medial initial vowels in L2 English than in first language (L1) Polish. Perception studies employing word…

    Mathematics and the Laws of Nature, Revised Edition describes the evolution of the idea that nature can be described in the language of mathematics. Colorful chapters explore the earliest attempts to apply deductive methods to the study of the natural world. This revised resource goes on to examine the development of classical conservation laws, including the conservation of momentum, the conservation of mass, and the conservation of energy. Chapters have been updated and revised to reflect recent information, including the mathematical pioneers who introduced new ideas about what it meant to

  19. Assessing repetitive negative thinking using categorical and transdiagnostic approaches: A comparison and validation of three Polish language adaptations of self-report questionnaires

    Full Text Available Repetitive negative thinking (RNT) is a transdiagnostic process involved in the risk, maintenance, and relapse of serious conditions including mood disorders, anxiety, eating disorders, and addictions. Processing mode theory provides a theoretical model to assess, research, and treat RNT using a transdiagnostic approach. Clinical researchers also often employ categorical approaches to RNT, including a focus on depressive rumination or worry, for similar purposes. Three widely used self-report questionnaires have been developed to assess these related constructs: the Ruminative Response Scale (RRT), the Perseverative Thinking Questionnaire (PTQ), and the Mini-Cambridge Exeter Repetitive Thought Scale (Mini-CERTS). Yet these scales have not previously been used in conjunction, despite useful theoretical distinctions only available in Mini-CERTS. The present validation of the methods in a Polish-speaking population provides psychometric parameter estimates that contribute to current efforts to increase reliable replication of theoretical outcomes. Moreover, the study aims to present particular characteristics and a comparison of the three methods. Although there has been some exploration of the categorical approach, the comparison of transdiagnostic methods is still lacking. These methods are particularly relevant for developing and evaluating theoretically based interventions like concreteness training, an emerging field of increasing interest, which can be used to address the maladaptive processing mode in RNT that can lead to depression and other disorders. Furthermore, the translation of these measures enables the examination of possible cross-cultural structural differences that may lead to important theoretical progress in the measurement and classification of RNT. The results support the theoretical hypothesis. As expected, the dimensions of brooding, general Repetitive Negative Thinking and Abstract Analytic Thinking, can all be

  20. The Nature of the Language Faculty and Its Implications for Evolution of Language (Reply to Fitch, Hauser, and Chomsky) (United States)

    Jackendoff, Ray; Pinker, Steven


    In a continuation of the conversation with Fitch, Chomsky, and Hauser on the evolution of language, we examine their defense of the claim that the uniquely human, language-specific part of the language faculty (the ''narrow language faculty'') consists only of recursion, and that this part cannot be considered an adaptation to communication. We…

    Full Text Available Communicative interactions involve a kind of procedural knowledge that is used by the human brain for processing verbal and nonverbal inputs and for language production. Although considerable work has been done on modeling human language abilities, it has been difficult to bring them together into a comprehensive tabula rasa system compatible with current knowledge of how verbal information is processed in the brain. This work presents a cognitive system, entirely based on a large-scale neural architecture, which was developed to shed light on the procedural knowledge involved in language elaboration. The main component of this system is the central executive, which is a supervising system that coordinates the other components of the working memory. In our model, the central executive is a neural network that takes as input the neural activation states of the short-term memory and yields as output mental actions, which control the flow of information among the working memory components through neural gating mechanisms. The proposed system is capable of learning to communicate through natural language starting from tabula rasa, without any a priori knowledge of the structure of phrases, the meaning of words, or the role of the different classes of words, only by interacting with a human through a text-based interface, using an open-ended incremental learning process. It is able to learn nouns, verbs, adjectives, pronouns and other word classes, and to use them in expressive language. The model was validated on a corpus of 1587 input sentences, based on literature on early language assessment, at the level of a child of about 4 years, and produced 521 output sentences, expressing a broad range of language processing functionalities.

    Rassinoux, Anne-Marie; Baud, Robert H; Rodrigues, Jean-Marie; Lovis, Christian; Geissbühler, Antoine


    The importance of clinical communication between providers, consumers and others, as well as the requisite for computer interoperability, strengthens the need for sharing common accepted terminologies. Under the directives of the World Health Organization (WHO), an approach is currently being conducted in Australia to adopt a standardized terminology for medical procedures that is intended to become an international reference. In order to achieve such a standard, a collaborative approach is adopted, in line with the successful experiment conducted for the development of the new French coding system CCAM. Different coding centres are involved in setting up a semantic representation of each term using a formal ontological structure expressed through a logic-based representation language. From this language-independent representation, multilingual natural language generation (NLG) is performed to produce noun phrases in various languages that are further compared for consistency with the original terms. Outcomes are presented for the assessment of the International Classification of Health Interventions (ICHI) and its translation into Portuguese. The initial results clearly emphasize the feasibility and cost-effectiveness of the proposed method for handling both a different classification and an additional language. NLG tools, based on ontology driven semantic representation, facilitate the discovery of ambiguous and inconsistent terms, and, as such, should be promoted for establishing coherent international terminologies.

    Full Text Available The article describes Polish research and discoveries in the Arctic and the Antarctic since the 19th century. The author is a geologist and since 1956 has been engaged in scientific field research on Spitsbergen, Greenland and Antarctica (23 expeditions). For many years chairman of the Committee on Polar Research of the Polish Academy of Sciences, he is now its Honorary Chairman.

  4. Harnessing Biomedical Natural Language Processing Tools to Identify Medicinal Plant Knowledge from Historical Texts. (United States)

    Sharma, Vivekanand; Law, Wayne; Balick, Michael J; Sarkar, Indra Neil


  5. Using Open Geographic Data to Generate Natural Language Descriptions for Hydrological Sensor Networks. (United States)

    Molina, Martin; Sanchez-Soriano, Javier; Corcho, Oscar


  6. An ontology model for nursing narratives with natural language generation technology. (United States)

    Min, Yul Ha; Park, Hyeoun-Ae; Jeon, Eunjoo; Lee, Joo Yun; Jo, Soo Jung


  7. BT-Nurse: computer generation of natural language shift summaries from complex heterogeneous medical data. (United States)

    Hunter, James; Freer, Yvonne; Gatt, Albert; Reiter, Ehud; Sripada, Somayajulu; Sykes, Cindy; Westwater, Dave


    The BT-Nurse system uses data-to-text technology to automatically generate a natural language nursing shift summary in a neonatal intensive care unit (NICU). The summary is solely based on data held in an electronic patient record system, no additional data-entry is required. BT-Nurse was tested for two months in the Royal Infirmary of Edinburgh NICU. Nurses were asked to rate the understandability, accuracy, and helpfulness of the computer-generated summaries; they were also asked for free-text comments about the summaries. The nurses found the majority of the summaries to be understandable, accurate, and helpful (pgenerated summaries. In conclusion, natural language NICU shift summaries can be automatically generated from an electronic patient record, but our proof-of-concept software needs considerable additional development work before it can be deployed.

    Full Text Available Providing descriptions of isolated sensors and sensor networks in natural language, understandable by the general public, is useful to help users find relevant sensors and analyze sensor data. In this paper, we discuss the feasibility of using geographic knowledge from public databases available on the Web (such as OpenStreetMap, Geonames, or DBpedia) to automatically construct such descriptions. We present a general method that uses such information to generate sensor descriptions in natural language. The results of the evaluation of our method in a hydrologic national sensor network showed that this approach is feasible and capable of generating adequate sensor descriptions with a lower development effort compared to other approaches. In the paper we also analyze certain problems that we found in public databases (e.g., heterogeneity, non-standard use of labels, or rigid search methods) and their impact on the generation of sensor descriptions.

  9. Natural language processing-based COTS software and related technologies survey.

    Natural language processing-based knowledge management software, traditionally developed for security organizations, is now becoming commercially available. An informal survey was conducted to discover and examine current NLP and related technologies and potential applications for information retrieval, information extraction, summarization, categorization, terminology management, link analysis, and visualization for possible implementation at Sandia National Laboratories. This report documents our current understanding of the technologies, lists software vendors and their products, and identifies potential applications of these technologies.

  10. Medical subdomain classification of clinical notes using a machine learning-based natural language processing approach


  11. Human Computer Collaboration at the Edge: Enhancing Collective Situation Understanding with Controlled Natural Language (United States)


    Will, Herbert A.; Mackin, Michael A.


    PC software is described which provides flexible natural language process control capability with an IBM PC or compatible machine. Hardware requirements include the PC, and suitable hardware interfaces to all controlled devices. Software required includes the Microsoft Disk Operating System (MS-DOS) operating system, a PC-based FORTRAN-77 compiler, and user-written device drivers. Instructions for use of the software are given as well as a description of an application of the system.

  13. Quantization, Frobenius and Bi algebras from the Categorical Framework of Quantum Mechanics to Natural Language Semantics (United States)

    Sadrzadeh, Mehrnoosh


  14. Quantization, Frobenius and Bi Algebras from the Categorical Framework of Quantum Mechanics to Natural Language Semantics

    Full Text Available Compact Closed categories and Frobenius and Bi algebras have been applied to model and reason about Quantum protocols. The same constructions have also been applied to reason about natural language semantics under the name: “categorical distributional compositional” semantics, or in short, the “DisCoCat” model. This model combines the statistical vector models of word meaning with the compositional models of grammatical structure. It has been applied to natural language tasks such as disambiguation, paraphrasing and entailment of phrases and sentences. The passage from the grammatical structure to vectors is provided by a functor, similar to the Quantization functor of Quantum Field Theory. The original DisCoCat model only used compact closed categories. Later, Frobenius algebras were added to it to model long distance dependencies such as relative pronouns. Recently, bialgebras have been added to the pack to reason about quantifiers. This paper reviews these constructions and their application to natural language semantics. We go over the theory and present some of the core experimental results.

    Hinton, Leanne


    Surveys developments in language revitalization and language death. Focusing on indigenous languages, discusses the role and nature of appropriate linguistic documentation, possibilities for bilingual education, and methods of promoting oral fluency and intergenerational transmission in affected languages. (Author/VWL)

    Sutherland, Dawn


    One dimension of early Canadian education is the attempt of the government to use the education system as an assimilative tool to integrate the First Nations and Métis people into Euro-Canadian society. Despite these attempts, many First Nations and Métis people retained their culture and their indigenous language. Few science educators have examined First Nations and Western scientific worldviews and the impact they may have on science learning. This study explored the views some First Nations (Cree) and Euro-Canadian Grade-7-level students in Manitoba had about the nature of science. Both qualitative (open-ended questions and interviews) and quantitative (a Likert-scale questionnaire) instruments were used to explore student views. A central hypothesis to this research programme is the possibility that the different world-views of two student populations, Cree and Euro-Canadian, are likely to influence their perceptions of science. This preliminary study explored a range of methodologies to probe the perceptions of the nature of science in these two student populations. It was found that the two cultural groups differed significantly between some of the tenets in a Nature of Scientific Knowledge Scale (NSKS). Cree students significantly differed from Euro-Canadian students on the developmental, testable and unified tenets of the nature of scientific knowledge scale. No significant differences were found in NSKS scores between language groups (Cree students who speak English in the home and those who speak English and Cree or Cree only). The differences found between language groups were primarily in the open-ended questions where preformulated responses were absent. Interviews about critical incidents provided more detailed accounts of the Cree students' perception of the nature of science. The implications of the findings of this study are discussed in relation to the challenges related to research methodology, further areas for investigation, science

    Woo, Chong Woo; Evens, Martha W; Freedman, Reva; Glass, Michael; Shim, Leem Seop; Zhang, Yuemei; Zhou, Yujian; Michael, Joel


    The objective of this research was to build an intelligent tutoring system capable of carrying on a natural language dialogue with a student who is solving a problem in physiology. Previous experiments have shown that students need practice in qualitative causal reasoning to internalize new knowledge and to apply it effectively and that they learn by putting their ideas into words. Analysis of a corpus of 75 hour-long tutoring sessions carried on in keyboard-to-keyboard style by two professors of physiology at Rush Medical College tutoring first-year medical students provided the rules used in tutoring strategies and tactics, parsing, and text generation. The system presents the student with a perturbation to the blood pressure, asks for qualitative predictions of the changes produced in seven important cardiovascular variables, and then launches a dialogue to correct any errors and to probe for possible misconceptions. The natural language understanding component uses a cascade of finite-state machines. The generation is based on lexical functional grammar. Results of experiments with pretests and posttests have shown that using the system for an hour produces significant learning gains and also that even this brief use improves the student's ability to solve problems more than reading textual material on the topic. Student surveys tell us that students like the system and feel that they learn from it. The system is now in regular use in the first-year physiology course at Rush Medical College. We conclude that the CIRCSIM-Tutor system demonstrates that intelligent tutoring systems can implement effective natural language dialogue with current language technology.

  18. Polish Toxic Currency Options

    Full Text Available Toxic currency options are defined on the basis of the opposition to the nature (essence) of an option contract, which is justified in terms of norms founded on the general law clause of characteristics (nature) of a relation (which represents an independent premise for imposing restrictions on the freedom of contracts). So-understood toxic currency options are unlawful. Indeed they contravene iuris cogentis regulations. These include for instance option contracts, which are concluded with a bank, if the bank has not informed about option risk before concluding the contract; or the barrier options, which focus only on the protection of bank’s interests. Therefore, such options may appear to be invalid. Therefore, performing contracts for toxic currency options may be qualified as criminal mismanagement. For the sake of security, the manager should then take into consideration filing a claim for stating invalidity (which can be made in a court verdict). At the same time, if the supervisory board member in a commercial company, who can also be a subject to mismanagement offences, commits an omission involving lack of reaction (for example, if he/she fails to notify of the suspected offence committed by the management board members acting to the company’s detriment) when the management board makes the company conclude option contracts which are charged with absolute invalidity, the supervisory board member so acting may be considered to act to the company’s detriment. In the most recent Polish jurisprudence and judicature the standard of a “good host” is treated to be the last resort for determining whether the manager’s powers resulting from criminal regulations were performed. The manager of the exporter should not, as a rule, issue any options. Issuing options always means assuming an obligation. In the case of currency put options it is an absolute obligation to purchase a given amount in euro at exchange rate set in advance. On the

    KAUST Repository

    Sun, Ying


    This article proposes functional median polish, an extension of univariate median polish, for one-way and two-way functional analysis of variance (ANOVA). The functional median polish estimates the functional grand effect and functional main factor effects based on functional medians in an additive functional ANOVA model assuming no interaction among factors. A functional rank test is used to assess whether the functional main factor effects are significant. The robustness of the functional median polish is demonstrated by comparing its performance with the traditional functional ANOVA fitted by means under different outlier models in simulation studies. The functional median polish is illustrated on various applications in climate science, including one-way and two-way ANOVA when functional data are either curves or images. Specifically, Canadian temperature data, U. S. precipitation observations and outputs of global and regional climate models are considered, which can facilitate the research on the close link between local climate and the occurrence or severity of some diseases and other threats to human health. © 2012 International Biometric Society.
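
    For orientation, the classical univariate median polish that the abstract says is being extended can be sketched as follows. This is a minimal illustration of Tukey's procedure on a two-way table of numbers, not the functional version proposed in the article; the function name `median_polish`, its parameters, and the sample table are assumptions made for this sketch.

```python
from statistics import median


def median_polish(table, max_iter=10, tol=1e-9):
    """Classical (univariate) median polish of a two-way table.

    Decomposes table[i][j] into grand + row_eff[i] + col_eff[j] + resid[i][j]
    by alternately sweeping out row medians and column medians.
    """
    n_rows, n_cols = len(table), len(table[0])
    resid = [[float(x) for x in row] for row in table]
    grand = 0.0
    row_eff = [0.0] * n_rows
    col_eff = [0.0] * n_cols

    for _ in range(max_iter):
        # Sweep the median of each row of residuals into the row effects.
        row_med = [median(row) for row in resid]
        for i in range(n_rows):
            row_eff[i] += row_med[i]
            for j in range(n_cols):
                resid[i][j] -= row_med[i]
        # Move the median of the column effects into the grand effect.
        shift = median(col_eff)
        grand += shift
        col_eff = [c - shift for c in col_eff]

        # Sweep the median of each column of residuals into the column effects.
        col_med = [median(resid[i][j] for i in range(n_rows)) for j in range(n_cols)]
        for j in range(n_cols):
            col_eff[j] += col_med[j]
            for i in range(n_rows):
                resid[i][j] -= col_med[j]
        # Move the median of the row effects into the grand effect.
        shift = median(row_eff)
        grand += shift
        row_eff = [r - shift for r in row_eff]

        # Stop once a full sweep no longer changes anything.
        if max(abs(m) for m in row_med + col_med) < tol:
            break

    return grand, row_eff, col_eff, resid


# Hypothetical additive table: 10 + row effect + column effect.
grand, row_eff, col_eff, resid = median_polish([[7, 9, 11], [8, 10, 12], [9, 11, 13]])
```

    Because medians rather than means are swept out, a single outlying cell mostly ends up in its residual instead of distorting the row and column effects, which is the robustness property the abstract refers to.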

    Pahisa-Solé, Joan; Herrera-Joancomartí, Jordi


    In this article, we describe a compansion system that transforms the telegraphic language that comes from the use of pictogram-based augmentative and alternative communication (AAC) into natural language. The system was tested with four participants with severe cerebral palsy and ranging degrees of linguistic competence and intellectual disabilities. Participants had used pictogram-based AAC at least for the past 30 years each and presented a stable linguistic profile. During tests, which consisted of a total of 40 sessions, participants were able to learn new linguistic skills, such as the use of basic verb tenses, while using the compansion system, which proved a source of motivation. The system can be adapted to the linguistic competence of each person and required no learning curve during tests when none of its special features, like gender, number, verb tense, or sentence type modifiers, were used. Furthermore, qualitative and quantitative results showed a mean communication rate increase of 41.59%, compared to the same communication device without the compansion system, and an overall improvement in the communication experience when the output is in natural language. Tests were conducted in Catalan and Spanish.

    Newspaper cartoons can graphically display the result of ambiguity in human speech; the result can be unexpected and funny. Likewise, computer analysis of natural language statements also needs to successfully resolve ambiguous situations. Computer techniques already developed use restricted world knowledge in resolving ambiguous language use. This paper illustrates how these techniques can be used in resolving ambiguous situations arising in cartoons. 8 references.


    Full Text Available The semantic web extends the current World Wide Web by adding facilities for the machine-understandable description of meaning. The ontology-based search model is used to enhance the efficiency and accuracy of information retrieval. Ontology is the core technology for the semantic web and the mechanism for representing formal and shared domain descriptions. In this paper, we propose an ontology-based meaningful search using semantic web and Natural Language Processing (NLP) techniques in the educational domain. First we build the educational ontology, then we present the semantic search system. The search model consists of three parts: embedded spell-check, finding synonyms using the WordNet API, and querying the ontology using the SPARQL language. The results are sensitive to both spell check and synonymous context. This paper provides more accurate results and the complete details for the selected field in a single page.
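
    The three-stage search model described above (spell-check, synonym expansion, ontology query) might be sketched roughly as follows. This is only an illustration under stated assumptions: the vocabulary, the synonym table standing in for WordNet, and the use of `rdfs:label` in the query are all hypothetical and are not taken from the paper.

```python
import difflib

# Hypothetical ontology vocabulary and synonym table (a stand-in for WordNet).
VOCABULARY = ["course", "lecturer", "syllabus", "exam"]
SYNONYMS = {
    "course": ["class", "module"],
    "lecturer": ["teacher", "instructor"],
}


def correct_spelling(word):
    """Stage 1: map a possibly misspelled word to the closest vocabulary term."""
    word = word.lower()
    matches = difflib.get_close_matches(word, VOCABULARY, n=1, cutoff=0.7)
    return matches[0] if matches else word


def expand_synonyms(word):
    """Stage 2: return the word followed by its known synonyms."""
    return [word] + SYNONYMS.get(word, [])


def build_sparql(term):
    """Stage 3: build a SPARQL query matching any of the expanded labels."""
    labels = expand_synonyms(correct_spelling(term))
    values = " ".join(f'"{label}"' for label in labels)
    return (
        "PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>\n"
        "SELECT ?resource ?label WHERE {\n"
        f"  VALUES ?label {{ {values} }}\n"
        "  ?resource rdfs:label ?label .\n"
        "}"
    )


query = build_sparql("corse")  # misspelling of the vocabulary term "course"
```

    The query string would then be sent to whatever SPARQL endpoint hosts the educational ontology; that step is omitted because the abstract does not describe it.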

    Tymoteusz Król

  4. Ulisse Aldrovandi's Color Sensibility: Natural History, Language and the Lay Color Practices of Renaissance Virtuosi. (United States)

    Pugliano, Valentina


  5. Systemic functional grammar in natural language generation linguistic description and computational representation

    CERN Document Server

    This paper presents a data-driven approach to graphically presenting text-based patient journals while still maintaining all textual information. The system first creates a timeline representation of a patients’ physiological condition during an admission, which is assessed by electronically...... monitoring vital signs and then combining these into Early Warning Scores (EWS). Hereafter, techniques from Natural Language Processing (NLP) are applied on the existing patient journal to extract all entries. Finally, the two methods are combined into an interactive timeline featuring the ability to see...... drastic changes in the patients’ health, and thereby enabling staff to see where in the journal critical events have taken place....

    Full Text Available The Quran is a scripture that acts as the main reference for people whose religion is Islam. It covers information from politics to science, with a vast amount of information that requires effort to uncover the knowledge behind it. Today, the emergence of smartphones has led to the development of a wide range of applications for enhancing knowledge-seeking activities. This project proposes a mobile application that takes a natural language approach to searching topics in the Quran based on keyword searching. The benefit of the application is two-fold: it is intuitive and it saves time.

  8. On the Possibility of ESP Data Use in Natural Language Processing


    Knopp, Tomáš


    Gomez, Fernando


    It is shown how certain kinds of domain independent expert systems based on classification problem-solving methods can be constructed directly from natural language descriptions by a human expert. The expert knowledge is not translated into production rules. Rather, it is mapped into conceptual structures which are integrated into long-term memory (LTM). The resulting system is one in which problem-solving, retrieval and memory organization are integrated processes. In other words, the same algorithm and knowledge representation structures are shared by these processes. As a result of this, the system can answer questions, solve problems or reorganize LTM.

    Full Text Available Abstract Background Incident reporting is the most common method for detecting adverse events in a hospital. However, under-reporting or non-reporting and delay in submission of reports are problems that prevent early detection of serious adverse events. The aim of this study was to determine whether it is possible to promptly detect serious injuries after inpatient falls by using a natural language processing method and to determine which data source is the most suitable for this purpose. Methods We tried to detect adverse events from narrative text data of electronic medical records by using a natural language processing method. We made syntactic category decision rules to detect inpatient falls from text data in electronic medical records. We compared how often the true fall events were recorded in various sources of data including progress notes, discharge summaries, image order entries and incident reports. We applied the rules to these data sources and compared F-measures to detect falls between these data sources with reference to the results of a manual chart review. The lag time between event occurrence and data submission and the degree of injury were compared. Results We made 170 syntactic rules to detect inpatient falls by using a natural language processing method. Information on true fall events was most frequently recorded in progress notes (100%), incident reports (65.0%) and image order entries (12.5%). However, F-measure to detect falls using the rules was poor when using progress notes (0.12) and discharge summaries (0.24) compared with that when using incident reports (1.00) and image order entries (0.91). Since the results suggested that incident reports and image order entries were possible data sources for prompt detection of serious falls, we focused on a comparison of falls found by incident reports and image order entries. 
Injury caused by falls found by image order entries was significantly more severe than falls detected by
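
    The F-measure used above to compare data sources can be computed as in the sketch below. It is the standard definition (the weighted harmonic mean of precision and recall) evaluated against a manual review; the record identifiers are made up for illustration and are not data from the study.

```python
def f_measure(predicted, actual, beta=1.0):
    """F-measure of a set of flagged records against a reference set.

    predicted: records flagged by the detection rules
    actual:    records confirmed as true events by manual review
    beta:      weight of recall relative to precision (1.0 gives F1)
    """
    true_positives = len(predicted & actual)
    if true_positives == 0:
        return 0.0
    precision = true_positives / len(predicted)
    recall = true_positives / len(actual)
    b2 = beta * beta
    return (1 + b2) * precision * recall / (b2 * precision + recall)


# Hypothetical record IDs: the rules flag four notes, manual review confirms three falls.
flagged_by_rules = {1, 2, 3, 4}
confirmed_falls = {2, 3, 5}
score = f_measure(flagged_by_rules, confirmed_falls)  # precision 1/2, recall 2/3
```

    A source in which the rules flag many records that are not falls has low precision and therefore a low F-measure, even when every true fall is mentioned somewhere in that source.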

  11. Semi-supervised learning and domain adaptation in natural language processing

    Musial, Joanna


    This study analyzes degrees of differences between the private and public sectors of Polish higher education. It finds them to be strong: Polish private institutions function very differently from Polish public institutions and these differences correspond with those found in the literature on higher education elsewhere in the world. Polish…

  13. Constructed Action, the Clause and the Nature of Syntax in Finnish Sign Language

    Directory of Open Access Journals (Sweden)

    Jantunen Tommi


    Full Text Available This paper investigates the interplay of constructed action and the clause in Finnish Sign Language (FinSL. Constructed action is a form of gestural enactment in which the signers use their hands, face and other parts of the body to represent the actions, thoughts or feelings of someone they are referring to in the discourse. With the help of frequencies calculated from corpus data, this article shows firstly that when FinSL signers are narrating a story, there are differences in how they use constructed action. Then the paper argues that there are differences also in the prototypical structure, linkage type and non-manual activity of clauses, depending on the presence or non-presence of constructed action. Finally, taking the view that gesturality is an integral part of language, the paper discusses the nature of syntax in sign languages and proposes a conceptualization in which syntax is seen as a set of norms distributed on a continuum between a categorial-conventional end and a gradient-unconventional end.

    Rice, Mabel L


    Future perspectives on children with language impairments are framed from what is known about children with specific language impairment (SLI). A summary of the current state of services is followed by discussion of how these children can be overlooked and misunderstood and consideration of why it is so hard for some children to acquire language when it is effortless for most children. Genetic influences are highlighted, with the suggestion that nature plus nurture should be considered in present as well as future intervention approaches. A nurture perspective highlights the family context of the likelihood of SLI for some of the children. Future models of the causal pathways may provide more specific information to guide gene-treatment decisions, in ways parallel to current personalized medicine approaches. Future treatment options can build on the potential of electronic technologies and social media to provide personalized treatment methods available at a time and place convenient for the person to use as often as desired. The speech-language pathologist could oversee a wide range of treatment options and monitor evidence provided electronically to evaluate progress and plan future treatment steps. Most importantly, future methods can provide lifelong language acquisition activities that maintain the privacy and dignity of persons with language impairment, and in so doing will in turn enhance the effectiveness of speech-language pathologists. Thieme Medical Publishers 333 Seventh Avenue, New York, NY 10001, USA.

    Full Text Available Working in tandem with the use of information and communication technologies is well-known and frequently used as a method of supporting learning of foreign languages in authentic communication. It is based on a constructivist approach to teaching. In the reported case study Polish and Chinese students discussed pre-prepared topics in English. The work shows the potential of e-learning at the micro level, as the language and intercultural task is implemented into an academic course without modification of the objectives and learning outcomes of the course. Evaluation carried out at the end of the project indicates that both groups perceived the task as a significant linguistic, cultural and personal experience. They stressed the importance of sharing “culture for culture” as the partner culture was new for most of them. The ability to talk and respond to information which was often strange, from the point of view of their own culture, allowed for learning intercultural competence ‘in action’.

    The document presents the idea of the CNGI Norm, called The Polish Standard of the Technical Safety of Transmission Gas Pipelines, and the way companies associated in the Chamber of the Natural Gas Industry use it in their business activity. It will be applied to improve the quality and reliability of gas transmission after the full opening of the Polish natural gas market. (author)

  17. Language of the Earth: Exploring Natural Hazards through a Literary Anthology (United States)

    This paper explores natural hazards teaching and communications through the use of a literary anthology of writings about the earth aimed at non-experts. Teaching natural hazards in high-school and university introductory Earth Science and Geography courses revolves mostly around lectures, examinations, and laboratory demonstrations/activities. Often the results of such a course are that a student 'memorizes' the answers, and is penalized when they miss a given fact [e.g., "You lost one point because you were off by 50 km/hr on the wind speed of an F5 tornado."] Although facts and general methodologies are certainly important when teaching natural hazards, it is a strong motivation to a student's assimilation of, and enthusiasm for, this knowledge, if supplemented by writings about the Earth. In this paper, we discuss a literary anthology which we developed [Language of the Earth, Rhodes, Stone, Malamud, Wiley-Blackwell, 2008] which includes many descriptions about natural hazards. Using first- and second-hand accounts of landslides, earthquakes, tsunamis, floods and volcanic eruptions, through the writings of McPhee, Gaskill, Voltaire, Austin, Cloos, and many others, hazards become 'alive', and more than 'just' a compilation of facts and processes. Using short excerpts such as these, or other similar anthologies, of remarkably written accounts and discussions about natural hazards results in 'dry' facts becoming more than just facts. These often highly personal viewpoints of our catastrophic world provide a useful supplement to a student's understanding of the turbulent world in which we live.

  18. Sexual activity of Polish adults

    Full Text Available Aim. The purpose of this research was to explore the subject of sexual activity in the Polish population, with special focus on age and gender differences, and sexual infidelity. Sexual activity is one of the basic factors in initiating and maintaining relationships. On the one hand, sexual activity enables us to meet natural needs and maintain an intimate relationship with another human being; on the other, it may allow us to overcome loneliness and social isolation by providing the opportunity to express feelings of closeness and unity. Material and method. The research was conducted on a representative group of 3,200 Poles aged between 15–49, with the support of a well-known Polish research company – TNS OBOP. Face-to-face and Pencil and Paper (PAPI) interviews were carried out. Results. The results focus on two main issues: the age and motives of sexual initiation among teenagers (with a significant percentage starting their sexual activity at the age of 15), and the quality of the sexual lives of adults (average number of sexual partners, sexual infidelity and sexual satisfaction). Conclusion. There is dependence between the type of relationship and the performance or non-performance of sexual activity, as well as the quality of the relationship. Among both adolescents and adults, remaining in a stable relationship (partnership or marriage) promotes loyalty. The performance of sexual goals turns out to be an important mechanism regulating the interpersonal aspects of a relationship, influencing their perception and evaluation.

  19. Comparative study on the customization of natural language interfaces to databases. (United States)

    In the last decades the popularity of natural language interfaces to databases (NLIDBs) has increased, because in many cases information obtained from them is used for making important business decisions. Unfortunately, the complexity of their customization by database administrators makes them difficult to use. In order for an NLIDB to obtain a high percentage of correctly translated queries, it is necessary that it is correctly customized for the database to be queried. In most cases the performance reported in NLIDB literature is the highest possible; i.e., the performance obtained when the interfaces were customized by the implementers. However, for end users the more important figure is the performance that the interface can yield when the NLIDB is customized by someone different from the implementers. Unfortunately, there exist very few articles that report NLIDB performance when the NLIDBs are not customized by the implementers. This article presents a semantically-enriched data dictionary (which permits solving many of the problems that occur when translating from natural language to SQL) and an experiment in which two groups of undergraduate students customized our NLIDB and English language frontend (ELF), considered one of the best available commercial NLIDBs. The experimental results show that, when customized by the first group, our NLIDB obtained 44.69 % correctly answered queries and ELF 11.83 % for the ATIS database, and when customized by the second group, our NLIDB attained 77.05 % and ELF 13.48 %. The performance attained by our NLIDB, when customized by ourselves, was 90 %.

    The study, entitled “Language and Interactional Discourse: Deconstructing the Talk-Generating Machinery in Natural Conversation,” is an analysis of spontaneous and informal conversation. The study, carried out in the theoretical and methodological tradition of Ethnomethodology, was aimed at explicating how ordinary talk is organized and produced, how people coordinate their talk-in-interaction, how meanings are determined, and the role of talk in the wider social processes. The study followed the basic assumption of conversation analysis, which is that talk is not just a product of two ‘speaker-hearers’ who attempt to exchange information or convey messages to each other. Rather, participants in conversation are seen to be mutually orienting to each other and collaborating in order to achieve orderly and meaningful communication. The analytic objective is therefore to make clear the procedures on which speakers rely to produce utterances and by which they make sense of other speakers’ talk. The datum used for this study was a recorded informal conversation between two (and later three) middle-class civil servants who are friends. The recording was done in such a way that the participants were not aware that they were being recorded. The recording was later transcribed in a way that we believe is faithful to the spontaneity and informality of the talk. Our findings showed that conversation has its own features and is an ordered and structured social day-by-day event. Specifically, utterances are designed and informed by organized procedures, methods and resources which are tied to the contexts in which they are produced, and which participants are privy to by virtue of their membership of a culture or a natural language community. Keywords: Language, Discourse and Conversation

    The aim of the paper is to assess whether, and in what fashion, managers of Polish cluster organizations perceive the attractiveness of foreign direct investment (FDI) in Polish clusters. This research is exploratory and qualitative in nature. The complex nature of Polish clusters, which can both benefit from and be competitively challenged by FDI, is identified, and a conceptual framework for assessing that nature is proposed; specifically, research using the grounded theory method (GTM).

    Lane, H. Chad; Vanlehn, Kurt


    For beginning programmers, inadequate problem solving and planning skills are among the most salient of their weaknesses. In this paper, we test the efficacy of natural language tutoring to teach and scaffold acquisition of these skills. We describe ProPL (Pro-PELL), a dialogue-based intelligent tutoring system that elicits goal decompositions and program plans from students in natural language. The system uses a variety of tutoring tactics that leverage students' intuitive understandings of the problem, how it might be solved, and the underlying concepts of programming. We report the results of a small-scale evaluation comparing students who used ProPL with a control group who read the same content. Our primary findings are that students who received tutoring from ProPL seem to have developed an improved ability to solve the composition problem and displayed behaviors that suggest they were able to think at greater levels of abstraction than students in the read-only group.

    Kreimeyer, Kory; Foster, Matthew; Pandey, Abhishek; Arya, Nina; Halford, Gwendolyn; Jones, Sandra F; Forshee, Richard; Walderhaug, Mark; Botsis, Taxiarchis


    We followed a systematic approach based on the Preferred Reporting Items for Systematic Reviews and Meta-Analyses to identify existing clinical natural language processing (NLP) systems that generate structured information from unstructured free text. Seven literature databases were searched with a query combining the concepts of natural language processing and structured data capture. Two reviewers screened all records for relevance during two screening phases, and information about clinical NLP systems was collected from the final set of papers. A total of 7149 records (after removing duplicates) were retrieved and screened, and 86 were determined to fit the review criteria. These papers contained information about 71 different clinical NLP systems, which were then analyzed. The NLP systems address a wide variety of important clinical and research tasks. Certain tasks are well addressed by the existing systems, while others remain as open challenges that only a small number of systems attempt, such as extraction of temporal information or normalization of concepts to standard terminologies. This review has identified many NLP systems capable of processing clinical free text and generating structured output, and the information collected and evaluated here will be important for prioritizing development of new approaches for clinical NLP. Copyright © 2017 Elsevier Inc. All rights reserved.

    Language is a systematic means of communication that uses sounds or symbols with meaning, spoken by mouth. Language is also written according to established rules. One of the most widely used languages in the world is English. However, learning from a teacher or instructor has several limitations. The time a teacher can give is limited to school or tutoring hours. Once students leave school or a tutoring session, they have to study English on their own. These problems gave rise to the idea of a study on building an application that can teach students how to learn English independently, covering the transformation of positive sentences into negative and interrogative sentences. In addition, the application can teach how to pronounce sentences in English. In essence, the contribution of this research is that the relevant parties, from junior high school (SMP) through senior and vocational high school (SMU/SMK), can use a text-to-speech application based on natural language processing to learn English tenses. The application can read English sentences aloud and can construct interrogative and negative sentences from their positive form in several English tenses. Keywords: Natural language processing, Text to speech


    Directory of Open Access Journals (Sweden)

    Alexandr I Krupnov


    The article discusses the results of an empirical study of the association between variables of persistence and academic achievement in foreign languages. The sample includes students of the Faculty of Physics, Mathematics and Natural Science at the RUDN University (n = 115), divided into 5 subsamples, two of which are featured in the present study (the most and the least successful student subsamples). Persistence as a personality trait is studied within A.I. Krupnov’s system-functional approach. A.I. Krupnov’s paper-and-pencil test was used to measure persistence variables. Academic achievement was measured according to four parameters: Phonetics, Grammar, Speaking and Political vocabulary, based on the grades students received during the academic year. The analysis revealed that persistence displays different associations with academic achievement variables in the more and less successful student subsamples; the general prominence of this trait is more important for unsuccessful students. Phonetics is the academic achievement variable most associated with persistence due to its nature: it is a skill one can acquire through hard work and practice, which is the definition of persistence. Grammar as an academic achievement variable is not associated with persistence and probably relates to other factors. Unsuccessful students may have difficulties in separating various aspects of language acquisition from each other, which should be taken into consideration by teachers.

  6. Automatic generation of natural language nursing shift summaries in neonatal intensive care: BT-Nurse. (United States)

    Hunter, James; Freer, Yvonne; Gatt, Albert; Reiter, Ehud; Sripada, Somayajulu; Sykes, Cindy


    Our objective was to determine whether and how a computer system could automatically generate helpful natural language nursing shift summaries solely from an electronic patient record system, in a neonatal intensive care unit (NICU). A system was developed which automatically generates partial NICU shift summaries (for the respiratory and cardiovascular systems), using data-to-text technology. It was evaluated for 2 months in the NICU at the Royal Infirmary of Edinburgh, under supervision. In an on-ward evaluation, a substantial majority of the summaries was found by outgoing and incoming nurses to be understandable (90%), and a majority was found to be accurate (70%), and helpful (59%). The evaluation also served to identify some outstanding issues, especially with regard to extra content the nurses wanted to see in the computer-generated summaries. It is technically possible automatically to generate limited natural language NICU shift summaries from an electronic patient record. However, it proved difficult to handle electronic data that was intended primarily for display to the medical staff, and considerable engineering effort would be required to create a deployable system from our proof-of-concept software. Copyright © 2012 Elsevier B.V. All rights reserved.

  7. Knowledge-based machine indexing from natural language text: Knowledge base design, development, and maintenance (United States)

    Genuardi, Michael T.


    One strategy for machine-aided indexing (MAI) is to provide a concept-level analysis of the textual elements of documents or document abstracts. In such systems, natural-language phrases are analyzed in order to identify and classify concepts related to a particular subject domain. The overall performance of these MAI systems is largely dependent on the quality and comprehensiveness of their knowledge bases. These knowledge bases function to (1) define the relations between a controlled indexing vocabulary and natural language expressions; (2) provide a simple mechanism for disambiguation and the determination of relevancy; and (3) allow the extension of concept-hierarchical structure to all elements of the knowledge file. After a brief description of the NASA Machine-Aided Indexing system, concerns related to the development and maintenance of MAI knowledge bases are discussed. Particular emphasis is given to statistically-based text analysis tools designed to aid the knowledge base developer. One such tool, the Knowledge Base Building (KBB) program, presents the domain expert with a well-filtered list of synonyms and conceptually-related phrases for each thesaurus concept. Another tool, the Knowledge Base Maintenance (KBM) program, functions to identify areas of the knowledge base affected by changes in the conceptual domain (for example, the addition of a new thesaurus term). An alternate use of the KBM as an aid in thesaurus construction is also discussed.

  8. A semantic-based approach for querying linked data using natural language

    Paredes-Valverde, Mario Andrés


    The Semantic Web aims to provide Web information with a well-defined meaning and make it understandable not only by humans but also by computers, thus allowing the automation, integration and reuse of high-quality information across different applications. However, current information retrieval mechanisms for semantic knowledge bases are intended to be used only by expert users. In this work, we propose a natural language interface that gives non-expert users access to this kind of information by formulating queries in natural language. The present approach uses a domain-independent ontology model to represent the question's structure and context. This model also allows determination of the answer type expected by the user based on a proposed question classification. To prove the effectiveness of our approach, we have conducted an evaluation in the music domain using LinkedBrainz, an effort to provide the MusicBrainz information as structured data on the Web by means of Semantic Web technologies. Our proposal obtained encouraging results based on the F-measure metric, ranging from 0.74 to 0.82 for a corpus of questions generated by a group of real-world end users. © The Author(s) 2015.

  9. A semantic-based approach for querying linked data using natural language

    Paredes-Valverde, Mario Andrés; Valencia-García, Rafael; Rodriguez-Garcia, Miguel Angel; Colomo-Palacios, Ricardo; Alor-Hernández, Giner


  10. Selected Topics on Systems Modeling and Natural Language Processing: Editorial Introduction to the Issue 7 of CSIMQ

    Witold Andrzejewski


    The seventh issue of Complex Systems Informatics and Modeling Quarterly presents five papers devoted to two distinct research topics: systems modeling and natural language processing (NLP). Both of these subjects are very important in computer science. Through modeling we can simplify the studied problem by concentrating on only one aspect at a time. Moreover, a properly constructed model allows the modeler to work on higher levels of abstraction without having to concentrate on details. Since the size and complexity of information systems grow rapidly, creating good models of such systems is crucial. The analysis of natural language is slowly becoming a widely used tool in commerce and day-to-day life. Opinion mining allows recommender systems to provide accurate recommendations based on user-generated reviews. Speech recognition and NLP are the basis for such widely used personal assistants as Apple’s Siri, Microsoft’s Cortana, and Google Now. While a lot of work has already been done on natural language processing, the research usually concerns widely used languages, such as English. Consequently, natural language processing in languages other than English is a very relevant subject and is addressed in this issue.

  11. Gesture language use in natural UI: pen-based sketching in conceptual design (United States)

    Ma, Cuixia; Dai, Guozhong


    Natural user interfaces are an important form of next-generation interaction. Computers are no longer tools only for specialists or particular fields but for most people. Ubiquitous computing makes the world magic and more comfortable. In the design domain, current systems, which require detailed information, cannot conveniently support conceptual design in the early phase. Pen and paper are natural and simple tools in daily life, especially in the design domain. Gestures are a useful and natural mode of pen-based interaction. In a natural UI, gestures can be introduced and used in a manner similar to existing interaction resources. However, gestures are usually defined beforehand without regard to the users' intentions, and they are recognized to represent something in a particular application without being portable to others. We provide the gesture description language (GDL) as an attempt to make useful gestures conveniently available to applications. It can be used as an independent control resource, like menus or icons in applications. We therefore present the idea from two perspectives: an application-dependent point of view and an application-independent point of view.

  12. Prediction of Emergency Department Hospital Admission Based on Natural Language Processing and Neural Networks. (United States)

    Zhang, Xingyu; Kim, Joyce; Patzer, Rachel E; Pitts, Stephen R; Patzer, Aaron; Schrager, Justin D


    To describe and compare logistic regression and neural network modeling strategies to predict hospital admission or transfer following initial presentation to Emergency Department (ED) triage with and without the addition of natural language processing elements. Using data from the National Hospital Ambulatory Medical Care Survey (NHAMCS), a cross-sectional probability sample of United States EDs from the 2012 and 2013 survey years, we developed several predictive models with the outcome being admission to the hospital or transfer vs. discharge home. We included patient characteristics immediately available after the patient has presented to the ED and undergone a triage process. We used this information to construct logistic regression (LR) and multilayer neural network (MLNN) models which included natural language processing (NLP) and principal component analysis from the patient's reason for visit. Ten-fold cross validation was used to test the predictive capacity of each model, and the area under the receiver operating characteristic curve (AUC) was then calculated for each model. Of the 47,200 ED visits from 642 hospitals, 6,335 (13.42%) resulted in hospital admission (or transfer). A total of 48 principal components were extracted by NLP from the reason for visit fields, which explained 75% of the overall variance for hospitalization. In the model including only structured variables, the AUC was 0.824 (95% CI 0.818-0.830) for logistic regression and 0.823 (95% CI 0.817-0.829) for MLNN. Models including only free-text information generated an AUC of 0.742 (95% CI 0.731-0.753) for logistic regression and 0.753 (95% CI 0.742-0.764) for MLNN. When both structured variables and free text variables were included, the AUC reached 0.846 (95% CI 0.839-0.853) for logistic regression and 0.844 (95% CI 0.836-0.852) for MLNN. The predictive accuracy of hospital admission or transfer for patients who presented to ED triage overall was good, and was improved with the inclusion of free text data from a patient

  13. How many kinds of reasoning? Inference, probability, and natural language semantics. (United States)

    Lassiter, Daniel; Goodman, Noah D


    The "new paradigm" unifying deductive and inductive reasoning in a Bayesian framework (Oaksford & Chater, 2007; Over, 2009) has been claimed to be falsified by results which show sharp differences between reasoning about necessity vs. plausibility (Heit & Rotello, 2010; Rips, 2001; Rotello & Heit, 2009). We provide a probabilistic model of reasoning with modal expressions such as "necessary" and "plausible" informed by recent work in formal semantics of natural language, and show that it predicts the possibility of non-linear response patterns which have been claimed to be problematic. Our model also makes a strong monotonicity prediction, while two-dimensional theories predict the possibility of reversals in argument strength depending on the modal word chosen. Predictions were tested using a novel experimental paradigm that replicates the previously-reported response patterns with a minimal manipulation, changing only one word of the stimulus between conditions. We found a spectrum of reasoning "modes" corresponding to different modal words, and strong support for our model's monotonicity prediction. This indicates that probabilistic approaches to reasoning can account in a clear and parsimonious way for data previously argued to falsify them, as well as new, more fine-grained, data. It also illustrates the importance of careful attention to the semantics of language employed in reasoning experiments. Copyright © 2014 Elsevier B.V. All rights reserved.

  14. CLAMP - a toolkit for efficiently building customized clinical natural language processing pipelines. (United States)

    Soysal, Ergin; Wang, Jingqi; Jiang, Min; Wu, Yonghui; Pakhomov, Serguei; Liu, Hongfang; Xu, Hua


    Existing general clinical natural language processing (NLP) systems such as MetaMap and Clinical Text Analysis and Knowledge Extraction System have been successfully applied to information extraction from clinical text. However, end users often have to customize existing systems for their individual tasks, which can require substantial NLP skills. Here we present CLAMP (Clinical Language Annotation, Modeling, and Processing), a newly developed clinical NLP toolkit that provides not only state-of-the-art NLP components, but also a user-friendly graphic user interface that can help users quickly build customized NLP pipelines for their individual applications. Our evaluation shows that the CLAMP default pipeline achieved good performance on named entity recognition and concept encoding. We also demonstrate the efficiency of the CLAMP graphic user interface in building customized, high-performance NLP pipelines with 2 use cases, extracting smoking status and lab test values. CLAMP is publicly available for research use, and we believe it is a unique asset for the clinical NLP community. © The Author 2017. Published by Oxford University Press on behalf of the American Medical Informatics Association. All rights reserved.

  15. Gender differences in natural language factors of subjective intoxication in college students: an experimental vignette study. (United States)

    Levitt, Ash; Schlauch, Robert C; Bartholow, Bruce D; Sher, Kenneth J


    Examining the natural language college students use to describe various levels of intoxication can provide important insight into subjective perceptions of college alcohol use. Previous research (Levitt et al., Alcohol Clin Exp Res 2009; 33: 448) has shown that intoxication terms reflect moderate and heavy levels of intoxication and that self-use of these terms differs by gender among college students. However, it is still unknown whether these terms similarly apply to other individuals and, if so, whether similar gender differences exist. To address these issues, the current study examined the application of intoxication terms to characters in experimentally manipulated vignettes of naturalistic drinking situations within a sample of university undergraduates (n = 145). Findings supported and extended previous research by showing that other-directed applications of intoxication terms are similar to self-directed applications and depend on the gender of both the target and the user. Specifically, moderate intoxication terms were applied to and from women more than men, even when the character was heavily intoxicated, whereas heavy intoxication terms were applied to and from men more than women. The findings suggest that gender differences in the application of intoxication terms are other-directed as well as self-directed and that intoxication language can inform gender-specific prevention and intervention efforts targeting problematic alcohol use among college students. Copyright © 2013 by the Research Society on Alcoholism.

  16. Conceptual dissonance: evaluating the efficacy of natural language processing techniques for validating translational knowledge constructs. (United States)

    Payne, Philip R O; Kwok, Alan; Dhaval, Rakesh; Borlawsky, Tara B


    The conduct of large-scale translational studies presents significant challenges related to the storage, management and analysis of integrative data sets. Ideally, the application of methodologies such as conceptual knowledge discovery in databases (CKDD) provides a means for moving beyond intuitive hypothesis discovery and testing in such data sets, and towards the high-throughput generation and evaluation of knowledge-anchored relationships between complex bio-molecular and phenotypic variables. However, the induction of such high-throughput hypotheses is non-trivial, and requires correspondingly high-throughput validation methodologies. In this manuscript, we describe an evaluation of the efficacy of a natural language processing-based approach to validating such hypotheses. As part of this evaluation, we will examine a phenomenon that we have labeled as "Conceptual Dissonance" in which conceptual knowledge derived from two or more sources of comparable scope and granularity cannot be readily integrated or compared using conventional methods and automated tools.

  17. From Imitation to Prediction, Data Compression vs Recurrent Neural Networks for Natural Language Processing

    Juan Andres Laura


    In recent studies, Recurrent Neural Networks were used for generative processes, and their surprising performance can be explained by their ability to create good predictions. Data Compression is also based on prediction. The question is therefore whether a data compressor could perform as well as recurrent neural networks in the natural language processing tasks of sentiment analysis and automatic text generation. If this is possible, the next question is whether a compression algorithm is even more intelligent than a neural network in such tasks. In our journey, a fundamental difference between a Data Compression Algorithm and Recurrent Neural Networks has been discovered.
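    The link between compression and prediction can be illustrated with a minimal sketch. This is not the authors' method: it assumes a general-purpose compressor (zlib) and labels a sample by asking which labelled corpus lets the compressor encode it in the fewest extra bytes. The function names and the toy corpora are hypothetical.

```python
import zlib

# Toy labelled corpora, invented purely for illustration.
TOY_CORPORA = {
    "positive": "great wonderful excellent loved it fantastic amazing superb brilliant",
    "negative": "terrible awful boring hated it dreadful horrible poor disappointing",
}


def compressed_size(text: str) -> int:
    """Number of bytes zlib needs for the UTF-8 encoding of text."""
    return len(zlib.compress(text.encode("utf-8"), 9))


def extra_bytes(corpus: str, sample: str) -> int:
    """Extra compressed bytes needed for the sample once the corpus is known."""
    return compressed_size(corpus + " " + sample) - compressed_size(corpus)


def classify(sample: str, corpora: dict) -> str:
    """Pick the label whose corpus 'predicts' the sample best."""
    return min(corpora, key=lambda label: extra_bytes(corpora[label], sample))


if __name__ == "__main__":
    print(classify("loved it fantastic amazing superb", TOY_CORPORA))
```

    A sample that repeats phrases from one corpus costs the compressor only a short back-reference, so that corpus wins; real sentiment data would need far larger corpora and a stronger compressor than this sketch uses.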

  18. On application of image analysis and natural language processing for music search (United States)

    Gwardys, Grzegorz


    In this paper, I investigate the problem of finding the most similar music tracks using techniques popular in Natural Language Processing, such as TF-IDF and LDA. I define a document as a music track. Each music track is transformed into a spectrogram, so that well-known techniques can be used to extract words from images. I used SURF to detect characteristic points and a novel approach for their description. Standard k-means was used for clustering. Clustering here is equivalent to building a dictionary, so afterwards I can transform spectrograms into text documents and perform TF-IDF and LDA. Finally, I can run a query in the resulting vector space. The research was done on 16 music tracks for training and 336 for testing, split into four categories: Hiphop, Jazz, Metal and Pop. Although the technique used is completely unsupervised, the results are satisfactory and encourage further research.
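    The last stage of such a pipeline, querying a TF-IDF vector space, can be sketched as follows. This is an illustration under stated assumptions, not the author's implementation: it assumes each track has already been reduced to a list of visual-word identifiers (the SURF and k-means steps are omitted), and the names `tfidf_vectors`, `cosine`, `most_similar` and the toy tracks are hypothetical.

```python
import math
from collections import Counter


def tfidf_vectors(docs):
    """Turn each document (a list of word ids) into a sparse TF-IDF dict."""
    n_docs = len(docs)
    doc_freq = Counter(word for doc in docs for word in set(doc))
    idf = {word: math.log(n_docs / df) for word, df in doc_freq.items()}
    vectors = []
    for doc in docs:
        counts = Counter(doc)
        vectors.append({w: (c / len(doc)) * idf[w] for w, c in counts.items()})
    return vectors


def cosine(a, b):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(value * b.get(word, 0.0) for word, value in a.items())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def most_similar(query_index, vectors):
    """Index of the document closest to the query document."""
    scored = [(cosine(vectors[query_index], vec), i)
              for i, vec in enumerate(vectors) if i != query_index]
    return max(scored)[1]


# Toy "tracks": each is a bag of visual words (hypothetical cluster ids).
TOY_TRACKS = [
    ["w1", "w2", "w2", "w3"],
    ["w1", "w2", "w2", "w4"],
    ["w5", "w6", "w6", "w7"],
]

if __name__ == "__main__":
    print(most_similar(0, tfidf_vectors(TOY_TRACKS)))
```

    Tracks 0 and 1 share two visual words while track 2 shares none, so the query for track 0 returns track 1. An LDA stage would replace the TF-IDF weights with topic proportions but could reuse the same similarity query.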

  19. Natural Language Processing in Serious Games: A state of the art.

    Davide Picca


    In the last decades, Natural Language Processing (NLP) has achieved a high level of success. Interactions between NLP and Serious Games have started, and some Serious Games already include NLP techniques. The objectives of this paper are twofold: on the one hand, providing a simple framework to enable analysis of potential uses of NLP in Serious Games and, on the other hand, applying the NLP framework to existing Serious Games and giving an overview of the use of NLP in pedagogical Serious Games. In this paper we present 11 serious games exploiting NLP techniques. We present them systematically, according to the following structure: first, we highlight possible uses of NLP techniques in Serious Games; second, we describe the type of NLP implemented in each specific Serious Game; and, third, we provide a link to possible purposes of use for the different actors interacting in the Serious Game.

  20. Harmonization and development of resources and tools for Italian natural language processing within the PARLI project

    Bosco, Cristina; Delmonte, Rodolfo; Moschitti, Alessandro; Simi, Maria


    The papers collected in this volume are selected as a sample of the progress in Natural Language Processing (NLP) made within the Italian NLP community and especially attested by the PARLI project. PARLI (Portale per l’Accesso alle Risorse in Lingua Italiana) is a project partially funded by the Ministero Italiano per l’Università e la Ricerca (PRIN 2008) from 2008 to 2012 for monitoring and fostering the harmonic growth and coordination of the activities of Italian NLP. It was proposed by various teams of researchers working in Italian universities and research institutions. In the spirit of the PARLI project, most of the resources and tools created within the project and described here are freely distributed, and they did not end with the project itself, in the hope that they could be a key factor in the future development of computational linguistics.

  1. Workshop on using natural language processing applications for enhancing clinical decision making: an executive summary. (United States)

    Pai, Vinay M; Rodgers, Mary; Conroy, Richard; Luo, James; Zhou, Ruixia; Seto, Belinda


    In April 2012, the National Institutes of Health organized a two-day workshop entitled 'Natural Language Processing: State of the Art, Future Directions and Applications for Enhancing Clinical Decision-Making' (NLP-CDS). This report is a summary of the discussions during the second day of the workshop. Collectively, the workshop presenters and participants emphasized the need for unstructured clinical notes to be included in the decision making workflow and the need for individualized longitudinal data tracking. The workshop also discussed the need to: (1) combine evidence-based literature and patient records with machine-learning and prediction models; (2) provide trusted and reproducible clinical advice; (3) prioritize evidence and test results; and (4) engage healthcare professionals, caregivers, and patients. The overall consensus of the NLP-CDS workshop was that there are promising opportunities for NLP and CDS to deliver cognitive support for healthcare professionals, caregivers, and patients.

  2. Accurate Identification of Fatty Liver Disease in Data Warehouse Utilizing Natural Language Processing. (United States)

    Redman, Joseph S; Natarajan, Yamini; Hou, Jason K; Wang, Jingqi; Hanif, Muzammil; Feng, Hua; Kramer, Jennifer R; Desiderio, Roxanne; Xu, Hua; El-Serag, Hashem B; Kanwal, Fasiha


    Natural language processing is a powerful technique of machine learning capable of maximizing data extraction from complex electronic medical records. We utilized this technique to develop algorithms capable of "reading" full-text radiology reports to accurately identify the presence of fatty liver disease. Abdominal ultrasound, computerized tomography, and magnetic resonance imaging reports were retrieved from the Veterans Affairs Corporate Data Warehouse from a random national sample of 652 patients. Radiographic fatty liver disease was determined by manual review by two physicians and verified with an expert radiologist. A split validation method was utilized for algorithm development. For all three imaging modalities, the algorithms could identify fatty liver disease with >90% recall and precision, with F-measures >90%. These algorithms could be used to rapidly screen patient records to establish a large cohort to facilitate epidemiological and clinical studies and examine the clinical course and outcomes of patients with radiographic hepatic steatosis.
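    For readers unfamiliar with the reported metrics, the sketch below shows how precision, recall and the F-measure (here the balanced F1 score, an assumption since the abstract does not name the variant) follow from confusion counts. The function name and the example counts are hypothetical and are not taken from the study.

```python
def precision_recall_f1(true_pos: int, false_pos: int, false_neg: int):
    """Compute precision, recall and balanced F-measure from raw counts."""
    predicted_pos = true_pos + false_pos
    actual_pos = true_pos + false_neg
    precision = true_pos / predicted_pos if predicted_pos else 0.0
    recall = true_pos / actual_pos if actual_pos else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return precision, recall, f1


if __name__ == "__main__":
    # Made-up counts: 90 reports correctly flagged, 10 flagged in error,
    # 10 true cases missed.
    print(precision_recall_f1(90, 10, 10))
```

    Because the F-measure is the harmonic mean of precision and recall, it can exceed 90% only when neither component falls far below that level.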

  3. Optimizing annotation resources for natural language de-identification via a game theoretic framework. (United States)

    Li, Muqun; Carrell, David; Aberdeen, John; Hirschman, Lynette; Kirby, Jacqueline; Li, Bo; Vorobeychik, Yevgeniy; Malin, Bradley A


    Electronic medical records (EMRs) are increasingly repurposed for activities beyond clinical care, such as to support translational research and public policy analysis. To mitigate privacy risks, healthcare organizations (HCOs) aim to remove potentially identifying patient information. A substantial quantity of EMR data is in natural language form and there are concerns that automated tools for detecting identifiers are imperfect and leak information that can be exploited by ill-intentioned data recipients. Thus, HCOs have been encouraged to invest as much effort as possible to find and detect potential identifiers, but such a strategy assumes the recipients are sufficiently incentivized and capable of exploiting leaked identifiers. In practice, such an assumption may not hold true and HCOs may overinvest in de-identification technology. The goal of this study is to design a natural language de-identification framework, rooted in game theory, which enables an HCO to optimize their investments given the expected capabilities of an adversarial recipient. We introduce a Stackelberg game to balance risk and utility in natural language de-identification. This game represents a cost-benefit model that enables an HCO with a fixed budget to minimize their investment in the de-identification process. We evaluate this model by assessing the overall payoff to the HCO and the adversary using 2100 clinical notes from Vanderbilt University Medical Center. We simulate several policy alternatives using a range of parameters, including the cost of training a de-identification model and the loss in data utility due to the removal of terms that are not identifiers. In addition, we compare policy options where, when an attacker is fined for misuse, a monetary penalty is paid to the publishing HCO as opposed to a third party (e.g., a federal regulator). Our results show that when an HCO is forced to exhaust a limited budget (set to $2000 in the study), the precision and recall of the

  4. Generation of Natural-Language Textual Summaries from Longitudinal Clinical Records. (United States)

    Goldstein, Ayelet; Shahar, Yuval


    Physicians are required to interpret, abstract and present in free-text large amounts of clinical data in their daily tasks. This is especially true for chronic-disease domains, but holds also in other clinical domains. We have recently developed a prototype system, CliniText, which, given a time-oriented clinical database, and appropriate formal abstraction and summarization knowledge, combines the computational mechanisms of knowledge-based temporal data abstraction, textual summarization, abduction, and natural-language generation techniques, to generate an intelligent textual summary of longitudinal clinical data. We demonstrate our methodology, and the feasibility of providing a free-text summary of longitudinal electronic patient records, by generating summaries in two very different domains - Diabetes Management and Cardiothoracic surgery. In particular, we explain the process of generating a discharge summary of a patient who had undergone a Coronary Artery Bypass Graft operation, and a brief summary of the treatment of a diabetes patient for five years.

  5. A Natural Language Intelligent Tutoring System for Training Pathologists - Implementation and Evaluation (United States)

    El Saadawi, Gilan M.; Tseytlin, Eugene; Legowski, Elizabeth; Jukic, Drazen; Castine, Melissa; Fine, Jeffrey; Gormley, Robert; Crowley, Rebecca S.


    Introduction: We developed and evaluated a Natural Language Interface (NLI) for an Intelligent Tutoring System (ITS) in Diagnostic Pathology. The system teaches residents to examine pathologic slides and write accurate pathology reports while providing immediate feedback on errors they make in their slide review and diagnostic reports. Residents can ask for help at any point in the case, and will receive context-specific feedback. Research Questions: We evaluated (1) the performance of our natural language system, (2) the effect of the system on learning, (3) the effect of feedback timing on learning gains and (4) the effect of ReportTutor on performance to self-assessment correlations. Methods: The study uses a crossover 2×2 factorial design. We recruited 20 subjects from 4 academic programs. Subjects were randomly assigned to one of the four conditions - two conditions for the immediate interface, and two for the delayed interface. An expert dermatopathologist created a reference standard and 2 board certified AP/CP pathology fellows manually coded the residents' assessment reports. Subjects were given the opportunity to self-grade their performance and we used a survey to determine student response to both interfaces. Results: Our results show a highly significant improvement in report writing after one tutoring session, with a 4-fold increase in the learning gains with both interfaces but no effect of feedback timing on performance gains. Residents who used the immediate feedback interface first experienced a feature learning gain that is correlated with the number of cases they viewed. There was no correlation between performance and self-assessment in either condition. PMID:17934789


    Will, H.


    The complex environment of the typical research laboratory requires flexible process control. This program provides natural language process control from an IBM PC or compatible machine. Sometimes process control schedules require changes frequently, even several times per day. These changes may include adding, deleting, and rearranging steps in a process. This program sets up a process control system that can either run without an operator, or be run by workers with limited programming skills. The software system includes three programs. Two of the programs, written in FORTRAN77, record data and control research processes. The third program, written in Pascal, generates the FORTRAN subroutines used by the other two programs to identify the user commands with the user-written device drivers. The software system also includes an input data set which allows the user to define the user commands which are to be executed by the computer. To set the system up the operator writes device driver routines for all of the controlled devices. Once set up, this system requires only an input file containing natural language command lines which tell the system what to do and when to do it. The operator can make up custom commands for operating and taking data from external research equipment at any time of the day or night without the operator in attendance. This process control system requires a personal computer operating under MS-DOS with suitable hardware interfaces to all controlled devices. The program requires a FORTRAN77 compiler and user-written device drivers. This program was developed in 1989 and has a memory requirement of about 62 Kbytes.

  7. Building an ontology of pulmonary diseases with natural language processing tools using textual corpora. (United States)

    Baneyx, Audrey; Charlet, Jean; Jaulent, Marie-Christine


    Pathologies and acts are classified in thesauri to help physicians code their activity. In practice, the use of thesauri is not sufficient to reduce variability in coding and thesauri are not suitable for computer processing. We think the automation of the coding task requires a conceptual modeling of medical items: an ontology. Our task is to help lung specialists code acts and diagnoses with software that represents the medical knowledge of this specialty by an ontology. The objective of the reported work was to build an ontology of pulmonary diseases dedicated to the coding process. To carry out this objective, we develop a precise methodological process for the knowledge engineer in order to build various types of medical ontologies. This process is based on the need to express precisely in natural language the meaning of each concept using differential semantics principles. A differential ontology is a hierarchy of concepts and relationships organized according to their similarities and differences. Our main research hypothesis is to apply natural language processing tools to corpora to develop the resources needed to build the ontology. We consider two corpora, one composed of patient discharge summaries and the other being a teaching book. We propose to combine two approaches to enrich the ontology building: (i) a method which consists of building terminological resources through distributional analysis and (ii) a method based on the observation of corpus sequences in order to reveal semantic relationships. Our ontology currently includes 1550 concepts and the software implementing the coding process is still under development. Results show that the proposed approach is operational and indicate that the combination of these methods and the comparison of the resulting terminological structures give interesting clues to a knowledge engineer for the building of an ontology.

  8. Terminologia anatomica in the past and the future from perspective of 110th anniversary of Polish Anatomical Terminology. (United States)

    Gielecki, J; Zurada, A; Osman, N


    Professional terminology is commonplace, particularly in the fields of mathematics, medicine, veterinary and natural sciences. The use of the terminology can be international, as it is with Anatomical Terminology (AT). In the early age of modern education, anatomists adopted Latin as the international language for AT. However, at the end of the 20th century, the English language became more predominant around the world. It can be said that the AT is a specific collection of scientific terms. One of the major flaws in early AT was that body structures were described by varying names, while some of the terms were irrational in nature and confusing. At this time, different international committees were working on preparing a unified final version of the AT, which in the end consisted of 5,640 terms (4,286 originally from the Basle Nomina Anatomica, BNA). Also, each country wanted to have its own nomenclature. In order to accomplish this, each country based its nomenclature on the international AT, and then translated it into its own language. The history of the Polish Anatomical Terminology (PAT) is unique, and follows the events of history. It was first published in 1898, at a time when its neighbours partitioned the territory of Poland. For 150 years, Polish culture and language were subject to Russification and Germanization policies. It is important to note that, even under such difficult circumstances, the PAT was the first national AT in the world. The PAT was a union of the accepted first BNA in Latin and the original Polish anatomical equivalents. This union formed the basis for theoretical and clinical medicine in Poland.

  9. Creation of a simple natural language processing tool to support an imaging utilization quality dashboard. (United States)

    Swartz, Jordan; Koziatek, Christian; Theobald, Jason; Smith, Silas; Iturrate, Eduardo


    Testing for venous thromboembolism (VTE) is associated with cost and risk to patients (e.g. radiation). To assess the appropriateness of imaging utilization at the provider level, it is important to know that provider's diagnostic yield (percentage of tests positive for the diagnostic entity of interest). However, determining diagnostic yield typically requires either time-consuming, manual review of radiology reports or the use of complex and/or proprietary natural language processing software. The objectives of this study were twofold: 1) to develop and implement a simple, user-configurable, and open-source natural language processing tool to classify radiology reports with high accuracy and 2) to use the results of the tool to design a provider-specific VTE imaging dashboard, consisting of both utilization rate and diagnostic yield. Two physicians reviewed a training set of 400 lower extremity ultrasound (UTZ) and computed tomography pulmonary angiogram (CTPA) reports to understand the language used in VTE-positive and VTE-negative reports. The insights from this review informed the arguments to the five modifiable parameters of the NLP tool. A validation set of 2,000 studies was then independently classified by the reviewers and by the tool; the classifications were compared and the performance of the tool was calculated. The tool was highly accurate in classifying the presence and absence of VTE for both the UTZ (sensitivity 95.7%; 95% CI 91.5-99.8, specificity 100%; 95% CI 100-100) and CTPA reports (sensitivity 97.1%; 95% CI 94.3-99.9, specificity 98.6%; 95% CI 97.8-99.4). The diagnostic yield was then calculated at the individual provider level and the imaging dashboard was created. We have created a novel NLP tool designed for users without a background in computer programming, which has been used to classify venous thromboembolism reports with a high degree of accuracy. The tool is open-source and available for download at http

  10. A UMLS-based spell checker for natural language processing in vaccine safety

    Liu Fang


    Full Text Available Abstract Background: The Institute of Medicine has identified patient safety as a key goal for health care in the United States. Detecting vaccine adverse events is an important public health activity that contributes to patient safety. Reports about adverse events following immunization (AEFI) from surveillance systems contain free-text components that can be analyzed using natural language processing. To extract Unified Medical Language System (UMLS) concepts from free text and classify AEFI reports based on concepts they contain, we first needed to clean the text by expanding abbreviations and shortcuts and correcting spelling errors. Our objective in this paper was to create a UMLS-based spelling error correction tool as a first step in the natural language processing (NLP) pipeline for AEFI reports. Methods: We developed spell checking algorithms using open source tools. We used de-identified AEFI surveillance reports to create free-text data sets for analysis. After expansion of abbreviated clinical terms and shortcuts, we performed spelling correction in four steps: (1) error detection, (2) word list generation, (3) word list disambiguation and (4) error correction. We then measured the performance of the resulting spell checker by comparing it to manual correction. Results: We used 12,056 words to train the spell checker and tested its performance on 8,131 words. During testing, sensitivity, specificity, and positive predictive value (PPV) for the spell checker were 74% (95% CI: 74–75), 100% (95% CI: 100–100), and 47% (95% CI: 46%–48%), respectively. Conclusion: We created a prototype spell checker that can be used to process AEFI reports. We used the UMLS Specialist Lexicon as the primary source of dictionary terms and the WordNet lexicon as a secondary source. We used the UMLS as a domain-specific source of dictionary terms to compare potentially misspelled words in the corpus. The prototype sensitivity was comparable to currently available

  11. The place of Polish in the multilingual space of the European Union

    Directory of Open Access Journals (Sweden)

    T. I. Neprytska


    Full Text Available The article studies the position of the Polish language in the multilingual space of the European Union and determines the key factors which facilitate its growing popularity and spread in Europe. A large territory and population determine the significant presence of Polish in the European Union. Intense economic development facilitates the popularization of learning and using Polish in the business environment; however, English was and remains the dominant language of business. Active work by the state on improving the country's reputation abroad, civilizational (value-based) unity with other nations of the EU, a favorable geographical position, the common Indo-European roots of the Germanic, Romance and Slavonic languages, and the use of the Latin script create favorable conditions for the development and popularization of Polish in the EU. The article also mentions a number of concerns rooted in the historical past of the dependent or semi-dependent existence of the Polish people, namely the existence of the Polish language and culture in the shadow of the German and Russian cultural space, the negative international image of modern Poland formed at the beginning of the 1990s, the low level of Europeans’ familiarity with Polish culture, and the lack of popularity and economic necessity of learning Polish abroad.

  12. Integrating Multi-Purpose Natural Language Understanding, Robot's Memory, and Symbolic Planning for Task Execution in Humanoid Robots

    DEFF Research Database (Denmark)

    Wächter, Mirko; Ovchinnikova, Ekaterina; Wittenbeck, Valerij


    We propose an approach for instructing a robot using natural language to solve complex tasks in a dynamic environment. In this study, we elaborate on a framework that allows a humanoid robot to understand natural language, derive symbolic representations of its sensorimotor experience, generate....... The framework is implemented within the robot development environment ArmarX. We evaluate the framework on the humanoid robot ARMAR-III in the context of two experiments: a demonstration of the real execution of a complex task in the kitchen environment on ARMAR-III and an experiment with untrained users...

  13. Classifying a Person's Degree of Accessibility From Natural Body Language During Social Human-Robot Interactions. (United States)

    McColl, Derek; Jiang, Chuan; Nejat, Goldie


    For social robots to be successfully integrated and accepted within society, they need to be able to interpret human social cues that are displayed through natural modes of communication. In particular, a key challenge in the design of social robots is developing the robot's ability to recognize a person's affective states (emotions, moods, and attitudes) in order to respond appropriately during social human-robot interactions (HRIs). In this paper, we present and discuss social HRI experiments we have conducted to investigate the development of an accessibility-aware social robot able to autonomously determine a person's degree of accessibility (rapport, openness) toward the robot based on the person's natural static body language. In particular, we present two one-on-one HRI experiments to: 1) determine the performance of our automated system in being able to recognize and classify a person's accessibility levels and 2) investigate how people interact with an accessibility-aware robot which determines its own behaviors based on a person's speech and accessibility levels.

  14. Genes, language, and the nature of scientific explanations: the case of Williams syndrome. (United States)

    Musolino, Julien; Landau, Barbara


    In this article, we discuss two experiments of nature and their implications for the sciences of the mind. The first, Williams syndrome, bears on one of cognitive science's holy grails: the possibility of unravelling the causal chain between genes and cognition. We sketch the outline of a general framework to study the relationship between genes and cognition, focusing as our case study on the development of language in individuals with Williams syndrome. Our approach emphasizes the role of three key ingredients: the need to specify a clear level of analysis, the need to provide a theoretical account of the relevant cognitive structure at that level, and the importance of the (typical) developmental process itself. The promise offered by the case of Williams syndrome has also given rise to two strongly conflicting theoretical approaches, modularity and neuroconstructivism, themselves offshoots of a perennial debate between nativism and empiricism. We apply our framework to explore the tension created by these two conflicting perspectives. To this end, we discuss a second experiment of nature, which allows us to compare the two competing perspectives in what comes close to a controlled experimental setting. From this comparison, we conclude that the "meaningful debate assumption", a widespread assumption suggesting that neuroconstructivism and modularity address the same questions and represent genuine theoretical alternatives, rests on a fallacy.

  15. Automatic Lung-RADS™ classification with a natural language processing system. (United States)

    Beyer, Sebastian E; McKee, Brady J; Regis, Shawn M; McKee, Andrea B; Flacke, Sebastian; El Saadawi, Gilan; Wald, Christoph


    Our aim was to train a natural language processing (NLP) algorithm to capture imaging characteristics of lung nodules reported in a structured CT report and suggest the applicable Lung-RADS™ (LR) category. Our study included structured, clinical reports of consecutive CT lung screening (CTLS) exams performed from 08/2014 to 08/2015 at an ACR accredited Lung Cancer Screening Center. All patients screened were at high risk for lung cancer according to the NCCN Guidelines®. All exams were interpreted by one of three radiologists credentialed to read CTLS exams using LR, using a standard reporting template. Training and test sets consisted of consecutive exams. Lung screening exams were divided into two groups: three training sets (500, 120, and 383 reports each) and one final evaluation set (498 reports). NLP algorithm results were compared with the gold standard of LR category assigned by the radiologist. The sensitivity/specificity of the NLP algorithm to correctly assign LR categories for suspicious nodules (LR 4) and positive nodules (LR 3/4) were 74.1%/98.6% and 75.0%/98.8% respectively. The majority of mismatches occurred in cases where pulmonary findings not currently addressed by LR were present. Misclassifications also resulted from the failure to identify exams as follow-up and the failure to completely characterize part-solid nodules. In a sub-group analysis among structured reports with standardized language, the sensitivity and specificity to detect LR 4 nodules were 87.0% and 99.5%, respectively. An NLP system can accurately suggest the appropriate LR category from CTLS exam findings when standardized reporting is used.

  16. A Discussion about Upgrading the Quick Script Platform to Create Natural Language based IoT Systems

    DEFF Research Database (Denmark)

    Khanna, Anirudh; Das, Bhagwan; Pandey, Bishwajeet


    With the advent of AI and IoT, the idea of incorporating smart things/appliances in our day to day life is converting into a reality. The paper discusses the possibilities and potential of designing IoT systems which can be controlled via natural language, with help of Quick Script as a development...

  17. Automated assessment of patients' self-narratives for posttraumatic stress disorder screening using natural language processing and text mining

    He, Qiwei; Veldkamp, Bernard P.; Glas, Cornelis A.W.; de Vries, Theo


    Patients’ narratives about traumatic experiences and symptoms are useful in clinical screening and diagnostic procedures. In this study, we presented an automated assessment system to screen patients for posttraumatic stress disorder via a natural language processing and text-mining approach. Four

  18. AIED 2009 Workshops Proceedings Volume 10: Natural Language Processing in Support of Learning: Metrics, Feedback and Connectivity

    Dessus, Philippe; Trausan-Matu, Stefan; Van Rosmalen, Peter; Wild, Fridolin


    Dessus, P., Trausan-Matu, S., Van Rosmalen, P., & Wild, F. (Eds.) (2009). AIED 2009 Workshops Proceedings Volume 10 Natural Language Processing in Support of Learning: Metrics, Feedback and Connectivity. In S. D. Craig & D. Dicheva (Eds.), AIED 2009: 14th International Conference in Artificial

  20. Voice-enabled Knowledge Engine using Flood Ontology and Natural Language Processing (United States)

    The Iowa Flood Information System (IFIS) is a web-based platform developed by the Iowa Flood Center (IFC) to provide access to flood inundation maps, real-time flood conditions, flood forecasts, flood-related data, information and interactive visualizations for communities in Iowa. The IFIS is designed for use by the general public, often people with no domain knowledge and a limited general science background. To improve communication with such an audience, we have introduced a voice-enabled knowledge engine on flood-related issues in IFIS. Instead of navigating within many features and interfaces of the information system and web-based sources, the system provides dynamic computations based on a collection of built-in data, analysis, and methods. The IFIS Knowledge Engine connects to real-time stream gauges, in-house data sources, analysis and visualization tools to answer natural language questions. Our goal is the systematization of data and modeling results on flood-related issues in Iowa, and to provide an interface for definitive answers to factual queries. The goal of the knowledge engine is to make all flood-related knowledge in Iowa easily accessible to everyone, and to support voice-enabled natural language input. We aim to integrate and curate all flood-related data, implement analytical and visualization tools, and make it possible to compute answers from questions. The IFIS explicitly implements analytical methods and models as algorithms, and curates all flood-related data and resources so that all these resources are computable. The IFIS Knowledge Engine computes the answer by deriving it from its computational knowledge base. The knowledge engine processes the statement, accesses the data warehouse, runs complex database queries on the server side and returns outputs in various formats. This presentation provides an overview of IFIS Knowledge Engine, its unique information interface and functionality as an educational tool, and discusses the future plans

  1. Symmetry or asymmetry? Cross-border openness of service providers in Polish-Czech and Polish-German border towns

    Directory of Open Access Journals (Sweden)

    Dołzbłasz Sylwia


    Full Text Available The symmetry and/or asymmetry in terms of cross-border openness of service providers is examined in this article, for the cases of two border twin towns: Cieszyn/Český Těšín at the Polish-Czech border, and Gubin/Guben at the Polish-German border. To assess the level of openness of firms towards clients from the other side of the border, four trans-border categories were examined: neighbour’s language visible at store location; business offers in the language of the neighbour; the possibilities of payment in the neighbour’s currency; and the staff’s knowledge of the language. This enabled a comparison of both parts of the particular twin towns in relation to the character of cross-border openness, as well as an assessment of their symmetry/asymmetry. Comparisons of Gubin/Guben and Cieszyn/Český Těšín with respect to the analysed features were also carried out. The analysis shows significant variation in the level of cross-border openness towards clients from neighbouring countries. Whereas in the Polish-Czech town a relative symmetry was observed, in the Polish-German case, significant asymmetry was noted.

  2. Surmounting the Tower of Babel: Monolingual and bilingual 2-year-olds' understanding of the nature of foreign language words. (United States)

    Byers-Heinlein, Krista; Chen, Ke Heng; Xu, Fei


    Languages function as independent and distinct conventional systems, and so each language uses different words to label the same objects. This study investigated whether 2-year-old children recognize that speakers of their native language and speakers of a foreign language do not share the same knowledge. Two groups of children unfamiliar with Mandarin were tested: monolingual English-learning children (n=24) and bilingual children learning English and another language (n=24). An English speaker taught children the novel label fep. On English mutual exclusivity trials, the speaker asked for the referent of a novel label (wug) in the presence of the fep and a novel object. Both monolingual and bilingual children disambiguated the reference of the novel word using a mutual exclusivity strategy, choosing the novel object rather than the fep. On similar trials with a Mandarin speaker, children were asked to find the referent of a novel Mandarin label kuò. Monolinguals again chose the novel object rather than the object with the English label fep, even though the Mandarin speaker had no access to conventional English words. Bilinguals did not respond systematically to the Mandarin speaker, suggesting that they had enhanced understanding of the Mandarin speaker's ignorance of English words. The results indicate that monolingual children initially expect words to be conventionally shared across all speakers, native and foreign. Early bilingual experience facilitates children's discovery of the nature of foreign language words. Copyright © 2013 Elsevier Inc. All rights reserved.

  3. Qualitative spatial logic descriptors from 3D indoor scenes to generate explanations in natural language. (United States)

    Falomir, Zoe; Kluth, Thomas


    The challenge of describing 3D real scenes is tackled in this paper using qualitative spatial descriptors. A key point to study is which qualitative descriptors to use and how these qualitative descriptors must be organized to produce a suitable cognitive explanation. In order to find answers, a survey test was carried out with human participants who openly described a scene containing some pieces of furniture. The data obtained in this survey are analysed, and taking this into account, the QSn3D computational approach was developed, which uses an Xbox 360 Kinect to obtain 3D data from a real indoor scene. Object features are computed on these 3D data to identify objects in indoor scenes. The object orientation is computed, and qualitative spatial relations between the objects are extracted. These qualitative spatial relations are the input to a grammar which applies saliency rules obtained from the survey study and generates cognitive natural language descriptions of scenes. Moreover, these qualitative descriptors can be expressed as first-order logical facts in Prolog for further reasoning. Finally, a validation study is carried out to test whether the descriptions provided by the QSn3D approach are human readable. The obtained results show that their acceptability is higher than 82%.

  4. Characterization of Change and Significance for Clinical Findings in Radiology Reports Through Natural Language Processing. (United States)

    Hassanpour, Saeed; Bay, Graham; Langlotz, Curtis P


    We built a natural language processing (NLP) method to automatically extract clinical findings in radiology reports and characterize their level of change and significance according to a radiology-specific information model. We utilized a combination of machine learning and rule-based approaches for this purpose. Our method is unique in capturing different features and levels of abstractions at surface, entity, and discourse levels in text analysis. This combination has enabled us to recognize the underlying semantics of radiology report narratives for this task. We evaluated our method on radiology reports from four major healthcare organizations. Our evaluation showed the efficacy of our method in highlighting important changes (accuracy 99.2%, precision 96.3%, recall 93.5%, and F1 score 94.7%) and identifying significant observations (accuracy 75.8%, precision 75.2%, recall 75.7%, and F1 score 75.3%) to characterize radiology reports. This method can help clinicians quickly understand the key observations in radiology reports and facilitate clinical decision support, review prioritization, and disease surveillance.
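
The four figures quoted in this abstract (accuracy, precision, recall, F1) are standard functions of a confusion matrix. A minimal sketch of how they relate is given below; the counts are hypothetical illustration values, not data from the study, and the function name is an assumption.

```python
def classification_metrics(tp, fp, fn, tn):
    """Return accuracy, precision, recall and F1 from confusion-matrix counts."""
    total = tp + fp + fn + tn
    accuracy = (tp + tn) / total if total else 0.0
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    # F1 is the harmonic mean of precision and recall.
    if precision + recall:
        f1 = 2 * precision * recall / (precision + recall)
    else:
        f1 = 0.0
    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "f1": f1}


# Hypothetical counts, chosen only to make the arithmetic easy to follow.
metrics = classification_metrics(tp=90, fp=10, fn=30, tn=870)
```

Averaging per-class scores (macro averaging) instead of pooling counts can give slightly different F1 values than this single-class formula.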

  5. Predicting judicial decisions of the European Court of Human Rights: a Natural Language Processing perspective

    Nikolaos Aletras


    Full Text Available Recent advances in Natural Language Processing and Machine Learning provide us with the tools to build predictive models that can be used to unveil patterns driving judicial decisions. This can be useful, for both lawyers and judges, as an assisting tool to rapidly identify cases and extract patterns which lead to certain decisions. This paper presents the first systematic study on predicting the outcome of cases tried by the European Court of Human Rights based solely on textual content. We formulate a binary classification task where the input of our classifiers is the textual content extracted from a case and the target output is the actual judgment as to whether there has been a violation of an article of the convention of human rights. Textual information is represented using contiguous word sequences, i.e., N-grams, and topics. Our models can predict the court’s decisions with a strong accuracy (79% on average). Our empirical analysis indicates that the formal facts of a case are the most important predictive factor. This is consistent with the theory of legal realism suggesting that judicial decision-making is significantly affected by the stimulus of the facts. We also observe that the topical content of a case is another important feature in this classification task and explore this relationship further by conducting a qualitative analysis.
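
As an illustration of the "contiguous word sequences, i.e., N-grams" representation mentioned in this abstract, the sketch below counts word unigrams and bigrams in a sentence. It is a minimal example under stated assumptions: whitespace tokenization, lowercasing, and an invented sample sentence; it is not the authors' feature pipeline.

```python
from collections import Counter


def word_ngrams(text, n):
    """Return the contiguous word n-grams of `text` as tuples of tokens."""
    tokens = text.lower().split()  # assumption: simple whitespace tokenization
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


def ngram_counts(text, max_n=2):
    """Count all n-grams of length 1..max_n, usable as bag-of-n-grams features."""
    counts = Counter()
    for n in range(1, max_n + 1):
        counts.update(word_ngrams(text, n))
    return counts


# Hypothetical sentence, used only to show the shape of the features.
sample = "the applicant complained that the detention was unlawful"
features = ngram_counts(sample, max_n=2)
```

Counts of this kind can then be fed to any binary classifier; the choice of classifier is outside the scope of this sketch.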

  6. Is Natural Language a Perigraphic Process? The Theorem about Facts and Words Revisited

    Łukasz Dębowski


    Full Text Available As we discuss, a stationary stochastic process is nonergodic when a random persistent topic can be detected in the infinite random text sampled from the process, whereas we call the process strongly nonergodic when an infinite sequence of independent random bits, called probabilistic facts, is needed to describe this topic completely. Replacing probabilistic facts with an algorithmically random sequence of bits, called algorithmic facts, we adapt this property back to ergodic processes. Subsequently, we call a process perigraphic if the number of algorithmic facts which can be inferred from a finite text sampled from the process grows like a power of the text length. We present a simple example of such a process. Moreover, we demonstrate an assertion which we call the theorem about facts and words. This proposition states that the number of probabilistic or algorithmic facts which can be inferred from a text drawn from a process must be roughly smaller than the number of distinct word-like strings detected in this text by means of the Prediction by Partial Matching (PPM) compression algorithm. We also observe that the number of the word-like strings for a sample of plays by Shakespeare follows an empirical stepwise power law, in stark contrast to Markov processes. Hence, we suppose that natural language considered as a process is not only non-Markov but also perigraphic.

  7. Natural Language Use and Couples’ Adjustment to Head and Neck Cancer (United States)

    Badr, Hoda; Milbury, Kathrin; Majeed, Nadia; Carmack, Cindy L.; Ahmad, Zeba; Gritz, Ellen R.


    Objective: This multimethod prospective study examined whether emotional disclosure and coping focus as conveyed through natural language use is associated with the psychological and marital adjustment of head and neck cancer patients and their spouses. Methods: One-hundred twenty-three patients (85% men; mean age = 56.8 years, SD = 10.4) and their spouses completed surveys prior to, following, and 4 months after engaging in a videotaped discussion about cancer in the laboratory. Linguistic Inquiry and Word Count (LIWC) software assessed counts of positive/negative emotion words and first-person singular (I-talk), second person (you-talk), and first-person plural (we-talk) pronouns. Using a Grounded Theory approach, discussions were also analyzed to describe how emotion words and pronouns were used and what was being discussed. Results: Emotion words were most often used to disclose thoughts/feelings or worry/uncertainty about the future, and to express gratitude or acknowledgment to one’s partner. Although patients who disclosed more negative emotion during the discussion reported more positive mood following the discussion (p [...] psychological and marital adjustment were found. Patients used significantly more I-talk than spouses and spouses used significantly more you-talk than patients (p’s [...] distress at the 4-month follow-up assessment when their partners used more we-talk (p [...] disclosure may be less important to one’s cancer adjustment than having a partner who one sees as instrumental to the coping process. PMID:27441867

  8. Detecting Target Objects by Natural Language Instructions Using an RGB-D Camera

    Jiatong Bao


    Full Text Available Controlling robots by natural language (NL) is attracting increasing attention for its versatility and convenience, and because users need no extensive training. Grounding, which enables robots to understand NL instructions from humans, is a crucial challenge in this problem. This paper mainly explores the object grounding problem and concretely studies how to detect target objects by the NL instructions using an RGB-D camera in robotic manipulation applications. In particular, a simple yet robust vision algorithm is applied to segment objects of interest. With the metric information of all segmented objects, the object attributes and relations between objects are further extracted. The NL instructions that incorporate multiple cues for object specifications are parsed into domain-specific annotations. The annotations from NL and extracted information from the RGB-D camera are matched in a computational state estimation framework to search all possible object grounding states. The final grounding is accomplished by selecting the states which have the maximum probabilities. An RGB-D scene dataset associated with different groups of NL instructions, based on different cognition levels of the robot, is collected. Quantitative evaluations on the dataset illustrate the advantages of the proposed method. The experiments of NL controlled object manipulation and NL-based task programming using a mobile manipulator show its effectiveness and practicability in robotic applications.

  9. A natural language processing pipeline for pairing measurements uniquely across free-text CT reports. (United States)

    Sevenster, Merlijn; Bozeman, Jeffrey; Cowhy, Andrea; Trost, William


    To standardize and objectivize treatment response assessment in oncology, guidelines have been proposed that are driven by radiological measurements, which are typically communicated in free-text reports defying automated processing. We study through inter-annotator agreement and natural language processing (NLP) algorithm development the task of pairing measurements that quantify the same finding across consecutive radiology reports, such that each measurement is paired with at most one other ("partial uniqueness"). Ground truth is created based on 283 abdomen and 311 chest CT reports of 50 patients each. A pre-processing engine segments reports and extracts measurements. Thirteen features are developed based on volumetric similarity between measurements, semantic similarity between their respective narrative contexts and structural properties of their report positions. A Random Forest classifier (RF) integrates all features. A "mutual best match" (MBM) post-processor ensures partial uniqueness. In an end-to-end evaluation, RF has precision 0.841, recall 0.807, F-measure 0.824 and AUC 0.971; with MBM, which performs above chance level (P [...] 0.960) indicates that the task is well defined. Domain properties and inter-section differences are discussed to explain superior performance in abdomen. Enforcing partial uniqueness has mixed but minor effects on performance. A combined machine learning-filtering approach is proposed for pairing measurements, which can support prospective (supporting treatment response assessment) and retrospective purposes (data mining). Copyright © 2014 Elsevier Inc. All rights reserved.
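
    One plausible reading of the "mutual best match" post-processing step named in the abstract above can be sketched as follows: a pair is kept only if each measurement is the other's highest-scoring candidate, so that no measurement appears in more than one pair. The function `mutual_best_matches` and the score matrix are hypothetical; the record does not say how the authors' post-processor is implemented, and it may differ from this sketch.

```python
def mutual_best_matches(scores):
    """Return pairs (i, j) where j is the top-scoring column for row i and vice versa.

    scores[i][j] is an assumed classifier score for pairing measurement i of the
    earlier report with measurement j of the later report.
    """
    if not scores or not scores[0]:
        return []
    n_rows, n_cols = len(scores), len(scores[0])
    # Best later-report candidate for each earlier-report measurement.
    best_col = [max(range(n_cols), key=lambda j: scores[i][j]) for i in range(n_rows)]
    # Best earlier-report candidate for each later-report measurement.
    best_row = [max(range(n_rows), key=lambda i: scores[i][j]) for j in range(n_cols)]
    return [(i, j) for i, j in enumerate(best_col) if best_row[j] == i]


# Invented scores: rows 0 and 1 both prefer column 0, but column 0 prefers row 0.
example_scores = [
    [0.9, 0.2],
    [0.8, 0.1],
    [0.3, 0.7],
]
print(mutual_best_matches(example_scores))
```

    In this example measurement 1 stays unpaired, which is consistent with each measurement being paired with at most one other.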

  10. Bringing Chatbots into education: Towards Natural Language Negotiation of Open Learner Models (United States)

    Kerlyl, Alice; Hall, Phil; Bull, Susan

    There is an extensive body of work on Intelligent Tutoring Systems: computer environments for education, teaching and training that adapt to the needs of the individual learner. Work on personalisation and adaptivity has included research into allowing the student user to enhance the system's adaptivity by improving the accuracy of the underlying learner model. Open Learner Modelling, where the system's model of the user's knowledge is revealed to the user, has been proposed to support student reflection on their learning. Increased accuracy of the learner model can be obtained by the student and system jointly negotiating the learner model. We present the initial investigations into a system to allow people to negotiate the model of their understanding of a topic in natural language. This paper discusses the development and capabilities of both conversational agents (or chatbots) and Intelligent Tutoring Systems, in particular Open Learner Modelling. We describe a Wizard-of-Oz experiment to investigate the feasibility of using a chatbot to support negotiation, and conclude that a fusion of the two fields can lead to developing negotiation techniques for chatbots and the enhancement of the Open Learner Model. This technology, if successful, could have widespread application in schools, universities and other training scenarios.


    A. E. Pismak


    Full Text Available Subject of Research. The paper focuses on the structural organization of Wiktionary articles with a view to using it as the basis for a semantic network. Wiktionary community references, article templates and article markup features are analyzed. The problem of numerically estimating the semantic similarity of structural elements in Wiktionary articles is considered. Existing software for estimating the semantic similarity of such elements is analyzed; the algorithms behind it are studied; their advantages and disadvantages are shown. Methods. Methods of mathematical statistics were used to analyze the markup features of Wiktionary articles. A method of computing semantic similarity based on statistical data for the compared structural elements was proposed. Main Results. We have concluded that Wiktionary articles cannot be used directly as the source for a semantic network. We have proposed to find hidden similarity between article elements, and for that purpose we have developed an algorithm for calculating confidence coefficients indicating that each pair of sentences is semantically close. A study of the quantitative and qualitative characteristics of the developed algorithm has shown a major performance advantage over other existing solutions, at the cost of an insignificantly higher error rate. Practical Relevance. The resulting algorithm may be useful in developing tools for automatic parsing of Wiktionary articles. The developed method could be used to compute semantic similarity for short text fragments in natural language when performance requirements outweigh accuracy requirements.

  12. Natural language processing using online analytic processing for assessing recommendations in radiology reports. (United States)

    Dang, Pragya A; Kalra, Mannudeep K; Blake, Michael A; Schultz, Thomas J; Stout, Markus; Lemay, Paul R; Freshman, David J; Halpern, Elkan F; Dreyer, Keith J


    The study purpose was to describe the use of natural language processing (NLP) and online analytic processing (OLAP) for assessing patterns in recommendations in unstructured radiology reports on the basis of patient and imaging characteristics, such as age, gender, referring physicians, radiology subspecialty, modality, indications, diseases, and patient status (inpatient vs outpatient). A database of 4,279,179 radiology reports from a single tertiary health care center during a 10-year period (1995-2004) was created. The database includes reports of computed tomography, magnetic resonance imaging, fluoroscopy, nuclear medicine, ultrasound, radiography, mammography, angiography, special procedures, and unclassified imaging tests with patient demographics. A clinical data mining and analysis NLP program (Leximer, Nuance Inc, Burlington, Massachusetts) in conjunction with OLAP was used for classifying reports into those with recommendations (I(REC)) and without recommendations (N(REC)) for imaging and determining I(REC) rates for different patient age groups, gender, imaging modalities, indications, diseases, subspecialties, and referring physicians. In addition, temporal trends for I(REC) were also determined. There was a significant difference in the I(REC) rates in different age groups, varying between 4.8% (10-19 years) and 9.5% (>70 years) (P OLAP revealed considerable differences between recommendation trends for different imaging modalities and other patient and imaging characteristics.


    Yu. S. Hetsevich


    Full Text Available The article focuses on problems in text-to-speech synthesis. Morphological, lexical and syntactic elements were localized with the help of the Belarusian module of the NooJ program. The types of errors that occur in Belarusian texts were analyzed and corrected. A language model and a part-of-speech tagging model were built. The Belarusian corpus was processed with the developed algorithm, which uses machine learning. The precision of the developed machine learning models was 80–90%. The dictionary was enriched with new words for further use in Belarusian speech synthesis systems.

  14. Toward a Theory-Based Natural Language Capability in Robots and Other Embodied Agents: Evaluating Hausser's SLIM Theory and Database Semantics (United States)

    Burk, Robin K.


    Computational natural language understanding and generation have been a goal of artificial intelligence since McCarthy, Minsky, Rochester and Shannon first proposed to spend the summer of 1956 studying this and related problems. Although statistical approaches dominate current natural language applications, two current research trends bring…

  15. Medical subdomain classification of clinical notes using a machine learning-based natural language processing approach. (United States)

    Weng, Wei-Hung; Wagholikar, Kavishwar B; McCray, Alexa T; Szolovits, Peter; Chueh, Henry C


    The medical subdomain of a clinical note, such as cardiology or neurology, is useful content-derived metadata for developing machine learning downstream applications. To classify the medical subdomain of a note accurately, we have constructed a machine learning-based natural language processing (NLP) pipeline and developed medical subdomain classifiers based on the content of the note. We constructed the pipeline using the clinical NLP system, clinical Text Analysis and Knowledge Extraction System (cTAKES), the Unified Medical Language System (UMLS) Metathesaurus, Semantic Network, and learning algorithms to extract features from two datasets - clinical notes from Integrating Data for Analysis, Anonymization, and Sharing (iDASH) data repository (n = 431) and Massachusetts General Hospital (MGH) (n = 91,237), and built medical subdomain classifiers with different combinations of data representation methods and supervised learning algorithms. We evaluated the performance of classifiers and their portability across the two datasets. The convolutional recurrent neural network with neural word embeddings trained-medical subdomain classifier yielded the best performance measurement on iDASH and MGH datasets with area under receiver operating characteristic curve (AUC) of 0.975 and 0.991, and F1 scores of 0.845 and 0.870, respectively. Considering better clinical interpretability, linear support vector machine-trained medical subdomain classifier using hybrid bag-of-words and clinically relevant UMLS concepts as the feature representation, with term frequency-inverse document frequency (tf-idf)-weighting, outperformed other shallow learning classifiers on iDASH and MGH datasets with AUC of 0.957 and 0.964, and F1 scores of 0.932 and 0.934 respectively. We trained classifiers on one dataset, applied to the other dataset and yielded the threshold of F1 score of 0.7 in classifiers for half of the medical subdomains we studied. Our study shows that a supervised

  17. Dialogue-Games: Meta-Communication Structures for Natural Language Interaction (United States)


    analogy from Wittgenstein’s term "language game" (Wittgenstein, 1958). However, Dialogue-games represent knowledge people have about language as used to...and memory of narrative discourse. Cognitive Psychology, 1977, 9, 77-110. Wittgenstein, L. Philosophical investigations (3rd ed.). New York

  18. The written language of signals as a means of natural literacy of deaf children

    Giovana Fracari Hautrive


    Full Text Available Taking up the theme of the literacy of deaf children currently directs attention to teaching practice, whose demands extend beyond the school. Questions arising from daily practice became a challenge that requires an investigative attitude. The article aims to problematize the literacy process of deaf children. The proposed reflection emerges from daily practice and is structured around theoretical studies by Vigotskii (1989, 1994, 1996, 1998); Stumpf (2005); Quadros (1997); Bolzan (1998, 2002); and Skliar (1997a, 1997b, 1998). On this basis, it problematizes the processes involved in the construction of written language. The result is the importance of the instrumentalization of sign language as the first language in the education of deaf students and of learning sign language writing. The condition of being literate in their mother tongue is observed to be important for deaf students. The article points out the need to redirect the literacy of deaf children so that important aspects of language, its role in the structuring of thought and its communicative aspect are respected and considered in this process. It thus emphasizes that learning to write sign language is fundamental and should occupy a central role in the teaching proposed for the class, encouraging the contradictions that put the student in a situation of cognitive conflict, while respecting the diversity inherent to each human being. The production of sign language writing is considered an appropriate tool for deaf students to record their visual language.

  19. Study of Profile Changes during Mechanical Polishing using Relocation Profilometry (United States)

    Kumaran, S. Chidambara; Shunmugam, M. S.


    Mechanical polishing is a finishing process practiced conventionally to enhance quality of surface. Surface finish is improved by mechanical cutting action of abrasive particles on work surface. Polishing is complex in nature and research efforts have been focused on understanding the polishing mechanism. Study of changes in profile is a useful method of understanding behavior of the polishing process. Such a study requires tracing same profile at regular process intervals, which is a tedious job. An innovative relocation technique is followed in the present work to study profile changes during mechanical polishing of austenitic stainless steel specimen. Using special locating fixture, micro-indentation mark and cross-correlation technique, the same profile is traced at certain process intervals. Comparison of different parameters of profiles shows the manner in which metal removal takes place in the polishing process. Mass removal during process estimated by the same relocation technique is checked with that obtained using weight measurement. The proposed approach can be extended to other micro/nano finishing processes and favorable process conditions can be identified.

  20. Population-Based Analysis of Histologically Confirmed Melanocytic Proliferations Using Natural Language Processing. (United States)

    Lott, Jason P; Boudreau, Denise M; Barnhill, Ray L; Weinstock, Martin A; Knopp, Eleanor; Piepkorn, Michael W; Elder, David E; Knezevich, Steven R; Baer, Andrew; Tosteson, Anna N A; Elmore, Joann G


    Population-based information on the distribution of histologic diagnoses associated with skin biopsies is unknown. Electronic medical records (EMRs) enable automated extraction of pathology report data to improve our epidemiologic understanding of skin biopsy outcomes, specifically those of melanocytic origin. To determine population-based frequencies and distribution of histologically confirmed melanocytic lesions. A natural language processing (NLP)-based analysis of EMR pathology reports of adult patients who underwent skin biopsies at a large integrated health care delivery system in the US Pacific Northwest from January 1, 2007, through December 31, 2012. Skin biopsy procedure. The primary outcome was histopathologic diagnosis, obtained using an NLP-based system to process EMR pathology reports. We determined the percentage of diagnoses classified as melanocytic vs nonmelanocytic lesions. Diagnoses classified as melanocytic were further subclassified using the Melanocytic Pathology Assessment Tool and Hierarchy for Diagnosis (MPATH-Dx) reporting schema into the following categories: class I (nevi and other benign proliferations such as mildly dysplastic lesions typically requiring no further treatment), class II (moderately dysplastic and other low-risk lesions that may merit narrow reexcision with skin biopsies, performed on 47 529 patients, were examined. Nearly 1 in 4 skin biopsies were of melanocytic lesions (23%; n = 18 715), which were distributed according to MPATH-Dx categories as follows: class I, 83.1% (n = 15 558); class II, 8.3% (n = 1548); class III, 4.5% (n = 842); class IV, 2.2% (n = 405); and class V, 1.9% (n = 362). Approximately one-quarter of skin biopsies resulted in diagnoses of melanocytic proliferations. These data provide the first population-based estimates across the spectrum of melanocytic lesions ranging from benign through dysplastic to malignant. These results may serve as a foundation for future

  1. Towards natural language question generation for the validation of ontologies and mappings. (United States)

    Ben Abacha, Asma; Dos Reis, Julio Cesar; Mrabet, Yassine; Pruski, Cédric; Da Silveira, Marcos


    The increasing number of open-access ontologies and their key role in several applications such as decision-support systems highlight the importance of their validation. Human expertise is crucial for the validation of ontologies from a domain point-of-view. However, the growing number of ontologies and their fast evolution over time make manual validation challenging. We propose a novel semi-automatic approach based on the generation of natural language (NL) questions to support the validation of ontologies and their evolution. The proposed approach includes the automatic generation, factorization and ordering of NL questions from medical ontologies. The final validation and correction is performed by submitting these questions to domain experts and automatically analyzing their feedback. We also propose a second approach for the validation of mappings impacted by ontology changes. The method exploits the context of the changes to propose correction alternatives presented as Multiple Choice Questions. This research provides a question optimization strategy to maximize the validation of ontology entities with a reduced number of questions. We evaluate our approach for the validation of three medical ontologies. We also evaluate the feasibility and efficiency of our mappings validation approach in the context of ontology evolution. These experiments are performed with different versions of SNOMED-CT and ICD9. The obtained experimental results suggest the feasibility and adequacy of our approach to support the validation of interconnected and evolving ontologies. Results also suggest that taking into account RDFS and OWL entailment helps reducing the number of questions and validation time. The application of our approach to validate mapping evolution also shows the difficulty of adapting mapping evolution over time and highlights the importance of semi-automatic validation.

  2. Natural Language Search Interfaces: Health Data Needs Single-Field Variable Search. (United States)

    Jay, Caroline; Harper, Simon; Dunlop, Ian; Smith, Sam; Sufi, Shoaib; Goble, Carole; Buchan, Iain


    Data discovery, particularly the discovery of key variables and their inter-relationships, is key to secondary data analysis, and in turn, the evolving field of data science. Interface designers have presumed that their users are domain experts, and so they have provided complex interfaces to support these "experts." Such interfaces hark back to a time when searches needed to be accurate first time as there was a high computational cost associated with each search. Our work is part of a governmental research initiative between the medical and social research funding bodies to improve the use of social data in medical research. The cross-disciplinary nature of data science can make no assumptions regarding the domain expertise of a particular scientist, whose interests may intersect multiple domains. Here we consider the common requirement for scientists to seek archived data for secondary analysis. This has more in common with search needs of the "Google generation" than with their single-domain, single-tool forebears. Our study compares a Google-like interface with traditional ways of searching for noncomplex health data in a data archive. Two user interfaces are evaluated for the same set of tasks in extracting data from surveys stored in the UK Data Archive (UKDA). One interface, Web search, is "Google-like," enabling users to browse, search for, and view metadata about study variables, whereas the other, traditional search, has a standard multioption user interface. Using a comprehensive set of tasks with 20 volunteers, we found that the Web search interface met data discovery needs and expectations better than the traditional search. A task × interface repeated measures analysis showed a main effect indicating that answers found through the Web search interface were more likely to be correct (F1,19=37.3, P [...] natural language search interfaces for variable search supporting in particular: query reformulation; data browsing; faceted search; surrogates; relevance

  3. NOBLE - Flexible concept recognition for large-scale biomedical natural language processing. (United States)

    Tseytlin, Eugene; Mitchell, Kevin; Legowski, Elizabeth; Corrigan, Julia; Chavan, Girish; Jacobson, Rebecca S


    Natural language processing (NLP) applications are increasingly important in biomedical data analysis, knowledge engineering, and decision support. Concept recognition is an important component task for NLP pipelines, and can be either general-purpose or domain-specific. We describe a novel, flexible, and general-purpose concept recognition component for NLP pipelines, and compare its speed and accuracy against five commonly used alternatives on both a biological and clinical corpus. NOBLE Coder implements a general algorithm for matching terms to concepts from an arbitrary vocabulary set. The system's matching options can be configured individually or in combination to yield specific system behavior for a variety of NLP tasks. The software is open source, freely available, and easily integrated into UIMA or GATE. We benchmarked speed and accuracy of the system against the CRAFT and ShARe corpora as reference standards and compared it to MMTx, MGrep, Concept Mapper, cTAKES Dictionary Lookup Annotator, and cTAKES Fast Dictionary Lookup Annotator. We describe key advantages of the NOBLE Coder system and associated tools, including its greedy algorithm, configurable matching strategies, and multiple terminology input formats. These features provide unique functionality when compared with existing alternatives, including state-of-the-art systems. On two benchmarking tasks, NOBLE's performance exceeded commonly used alternatives, performing almost as well as the most advanced systems. Error analysis revealed differences in error profiles among systems. NOBLE Coder is comparable to other widely used concept recognition systems in terms of accuracy and speed. Advantages of NOBLE Coder include its interactive terminology builder tool, ease of configuration, and adaptability to various domains and tasks. NOBLE provides a term-to-concept matching system suitable for general concept recognition in biomedical NLP pipelines.

  4. A natural language processing program effectively extracts key pathologic findings from radical prostatectomy reports. (United States)

    Kim, Brian J; Merchant, Madhur; Zheng, Chengyi; Thomas, Anil A; Contreras, Richard; Jacobsen, Steven J; Chien, Gary W


    Natural language processing (NLP) software programs have been widely developed to transform complex free text into simplified organized data. Potential applications in the field of medicine include automated report summaries, physician alerts, patient repositories, electronic medical record (EMR) billing, and quality metric reports. Despite these prospects and the recent widespread adoption of EMR, NLP has been relatively underutilized. The objective of this study was to evaluate the performance of an internally developed NLP program in extracting select pathologic findings from radical prostatectomy specimen reports in the EMR. An NLP program was generated by a software engineer to extract key variables from prostatectomy reports in the EMR within our healthcare system, which included the TNM stage, Gleason grade, presence of a tertiary Gleason pattern, histologic subtype, size of dominant tumor nodule, seminal vesicle invasion (SVI), perineural invasion (PNI), angiolymphatic invasion (ALI), extracapsular extension (ECE), and surgical margin status (SMS). The program was validated by comparing NLP results to a gold standard compiled by two blinded manual reviewers for 100 random pathology reports. NLP demonstrated 100% accuracy for identifying the Gleason grade, presence of a tertiary Gleason pattern, SVI, ALI, and ECE. It also demonstrated near-perfect accuracy for extracting histologic subtype (99.0%), PNI (98.9%), TNM stage (98.0%), SMS (97.0%), and dominant tumor size (95.7%). The overall accuracy of NLP was 98.7%. NLP generated a result in [...] report. This novel program demonstrated high accuracy and efficiency identifying key pathologic details from the prostatectomy report within an EMR system. NLP has the potential to assist urologists by summarizing and highlighting relevant information from verbose pathology reports. It may also facilitate future urologic research through the rapid and automated creation of large databases.

  5. Using natural language processing and machine learning to identify gout flares from electronic clinical notes. (United States)

    Zheng, Chengyi; Rashid, Nazia; Wu, Yi-Lin; Koblick, River; Lin, Antony T; Levy, Gerald D; Cheetham, T Craig


    Gout flares are not well documented by diagnosis codes, making it difficult to conduct accurate database studies. We implemented a computer-based method to automatically identify gout flares using natural language processing (NLP) and machine learning (ML) from electronic clinical notes. Of 16,519 patients, 1,264 and 1,192 clinical notes from 2 separate sets of 100 patients were selected as the training and evaluation data sets, respectively, which were reviewed by rheumatologists. We created separate NLP searches to capture different aspects of gout flares. For each note, the NLP search outputs became the ML system inputs, which provided the final classification decisions. The note-level classifications were grouped into patient-level gout flares. Our NLP+ML results were validated using a gold standard data set and compared with the claims-based method used in the prior literature. For 16,519 patients with a diagnosis of gout and a prescription for a urate-lowering therapy, we identified 18,869 clinical notes as gout flare positive (sensitivity 82.1%, specificity 91.5%): 1,402 patients with ≥3 flares (sensitivity 93.5%, specificity 84.6%), 5,954 with 1 or 2 flares, and 9,163 with no flare (sensitivity 98.5%, specificity 96.4%). Our method identified more flare cases (18,869 versus 7,861) and patients with ≥3 flares (1,402 versus 516) when compared to the claims-based method. We developed a computer-based method (NLP and ML) to identify gout flares from the clinical notes. Our method was validated as an accurate tool for identifying gout flares with higher sensitivity and specificity compared to previous studies. Copyright © 2014 by the American College of Rheumatology.
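
    For readers less familiar with the sensitivity and specificity figures reported in the abstract above, the following minimal sketch shows how the two measures are computed from gold-standard labels and predicted labels. The function name and the eight example labels are invented for illustration and are not data from the study.

```python
def sensitivity_specificity(gold, predicted):
    """Return (sensitivity, specificity) for two equal-length lists of 0/1 labels."""
    tp = sum(1 for g, p in zip(gold, predicted) if g and p)          # flares correctly flagged
    fn = sum(1 for g, p in zip(gold, predicted) if g and not p)      # flares missed
    tn = sum(1 for g, p in zip(gold, predicted) if not g and not p)  # non-flares correctly passed
    fp = sum(1 for g, p in zip(gold, predicted) if not g and p)      # false alarms
    if tp + fn == 0 or tn + fp == 0:
        raise ValueError("need at least one positive and one negative gold label")
    return tp / (tp + fn), tn / (tn + fp)


# Invented note-level labels: 1 = gout flare, 0 = no flare.
gold_labels = [1, 1, 1, 0, 0, 0, 0, 1]
predicted_labels = [1, 1, 0, 0, 0, 1, 0, 1]
sens, spec = sensitivity_specificity(gold_labels, predicted_labels)
print(f"sensitivity={sens:.2f} specificity={spec:.2f}")
```

    Sensitivity is the share of true flares that the method finds, and specificity is the share of non-flares that it correctly leaves unflagged.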

  6. Validation of natural language processing to extract breast cancer pathology procedures and results

    Arika E Wieneke


    Full Text Available Background: Pathology reports typically require manual review to abstract research data. We developed a natural language processing (NLP) system to automatically interpret free-text breast pathology reports with limited assistance from manual abstraction. Methods: We used an iterative approach of machine learning algorithms and constructed groups of related findings to identify breast-related procedures and results from free-text pathology reports. We evaluated the NLP system using an all-or-nothing approach to determine which reports could be processed entirely using NLP and which reports needed manual review beyond NLP. We divided 3234 reports for development (2910, 90%) and evaluation (324, 10%) purposes using manually reviewed pathology data as our gold standard. Results: NLP correctly coded 12.7% of the evaluation set, flagged 49.1% of reports for manual review, incorrectly coded 30.8%, and correctly omitted 7.4% from the evaluation set due to irrelevancy (i.e. not breast-related). Common procedures and results were identified correctly (e.g. invasive ductal) with 95.5% precision and 94.0% sensitivity, but entire reports were flagged for manual review because of rare findings and substantial variation in pathology report text. Conclusions: The NLP system we developed did not perform sufficiently for abstracting entire breast pathology reports. The all-or-nothing approach resulted in too broad a scope of work and limited our flexibility to identify breast pathology procedures and results. Our NLP system was also limited by the lack of the gold standard data on rare findings and wide variation in pathology text. Focusing on individual, common elements and improving pathology text report standardization may improve performance.

  11. Causal knowledge extraction by natural language processing in material science: a case study in chemical vapor deposition

    Directory of Open Access Journals (Sweden)

    Yuya Kajikawa


    Full Text Available Scientific publications written in natural language still play a central role as our knowledge source. However, due to the flood of publications, the literature survey process has become a highly time-consuming and tangled process, especially for novices of the discipline. Therefore, tools supporting the literature-survey process may help the individual scientist to explore new useful domains. Natural language processing (NLP) is expected to be one of the promising techniques to retrieve, abstract, and extract knowledge. In this contribution, NLP is first applied to the literature of chemical vapor deposition (CVD), which is a sub-discipline of materials science and is a complex and interdisciplinary field of research involving chemists, physicists, engineers, and materials scientists. Causal knowledge extraction from the literature is demonstrated using NLP.

  12. The Natural History of Human Language: Bridging the Gaps without Magic (United States)

    Merker, Bjorn; Okanoya, Kazuo

    Human languages are quintessentially historical phenomena. Every known aspect of linguistic form and content is subject to change in historical time (Lehmann, 1995; Bybee, 2004). Many facts of language, syntactic no less than semantic, find their explanation in the historical processes that generated them. If adpositions were once verbs, then the fact that they tend to occur on the same side of their arguments as do verbs ("cross-category harmony": Hawkins, 1983) is a matter of historical contingency rather than a reflection of inherent structural constraints on human language (Delancey, 1993).

  13. Crowdsourcing a normative natural language dataset: a comparison of Amazon Mechanical Turk and in-lab data collection. (United States)

    Saunders, Daniel R; Bex, Peter J; Woods, Russell L


    Crowdsourcing has become a valuable method for collecting medical research data. This approach, recruiting through open calls on the Web, is particularly useful for assembling large normative datasets. However, it is not known how natural language datasets collected over the Web differ from those collected under controlled laboratory conditions. To compare the natural language responses obtained from a crowdsourced sample of participants with responses collected in a conventional laboratory setting from participants recruited according to specific age and gender criteria. We collected natural language descriptions of 200 half-minute movie clips, from Amazon Mechanical Turk workers (crowdsourced) and 60 participants recruited from the community (lab-sourced). Crowdsourced participants responded to as many clips as they wanted and typed their responses, whereas lab-sourced participants gave spoken responses to 40 clips, and their responses were transcribed. The content of the responses was evaluated using a take-one-out procedure, which compared responses to other responses to the same clip and to other clips, with a comparison of the average number of shared words. In contrast to the 13 months of recruiting that was required to collect normative data from 60 lab-sourced participants (with specific demographic characteristics), only 34 days were needed to collect normative data from 99 crowdsourced participants (contributing a median of 22 responses). The majority of crowdsourced workers were female, and the median age was 35 years, lower than the lab-sourced median of 62 years but similar to the median age of the US population. The responses contributed by the crowdsourced participants were longer on average, that is, 33 words compared to 28 words (Pcrowdsourced participants had more shared words (P=.004 and .01 respectively), whereas younger participants had higher numbers of shared words in the lab-sourced population (P=.01). 
Crowdsourcing is an effective approach

  14. Unpacking Big Systems -- Natural Language Processing Meets Network Analysis. A Study of Smart Grid Development in Denmark

    Jurowetzki, Roman

    and contained technological trajectories on a national level using a combination of methods from statistical natural language processing, vector space modelling and network analysis. The proposed approach does not aim at replacing the researcher or expert but rather offers the possibility to algorithmically...... in Denmark. Results show that in the explored case it is not mainly new technologies and applications that are driving change but innovative re-combinations of old and new technologies....

    Elastic emission polishing, also called elastic emission machining (EEM), is a process where a stream of abrasive slurry is used to remove material from a substrate and produce damage-free surfaces with controlled surface form. It is a noncontacting method utilizing a thick elasto-hydrodynamic film formed between a soft rotating ball and the workpiece to control the flow of the abrasive. An apparatus was built in the Center, which consists of a stationary spindle, a two-axis table for the workpiece, and a pump to circulate the working fluid. The process is controlled by a programmable computer numerical controller (CNC), which presently can operate the spindle speed and movement of the workpiece in one axis only. This apparatus has been used to determine material removal rates on different material samples as a function of time, utilizing zirconium oxide (ZrO2) particles suspended in distilled water as the working fluid. By continuing a study of removal rates the process should become predictable, and thus create a new, effective, yet simple tool for ultra-precision mechanical machining of surfaces.

  16. Production of rare earth polishing powders in Russia

    in a suspension; polishing powder Ftoropol with addition of fluorine and higher contents of cerium dioxide (at least 70% by mass) that has a higher polishing ability and is attrition-proof, used for high-speed treatment of optical lenses, mirrors, TV screens and eyeglasses. The rare earth polishing powders made in Russia possess the following physico-chemical properties and performance characteristics: cerium dioxide content in solid REE solution - 50-90% by mass; F-ion content (in Ftoropol powder) - 8-14% by mass; non-REE content of sodium, calcium, strontium and iron impurities - at most 0.1% by mass of each element; natural radionuclide content of thorium, uranium, actinium, potassium-40 series, total standard specific activity - 0.45-0.85 Bq/g; average particle size - 2.0-3.5 μm; density - 6.3-6.8 g/cm³; pH of aqueous extract - 6-7; sedimentary stability - 10-20 minutes; polishing ability - 45-60 mg per 31 minutes (for polishing resin); abrasive inclusions - none. The report gives an analysis of the Russian powders compared against the best world analogues such as Cerox (Rhone Poulenc Company, France), Regipol (London and Scandinavian Division Chemical Company, England), etc. The analysis results imply that the chief characteristics (granulometric composition, polishing ability and service life) of the Russian samples do not yield to the best foreign analogues, and in some properties (radionuclide content, sedimentary stability and scratching inclusions quantity) even surpass them

  17. Steering the conversation: A linguistic exploration of natural language interactions with a digital assistant during simulated driving. (United States)

    Given the proliferation of 'intelligent' and 'socially-aware' digital assistants embodying everyday mobile technology - and the undeniable logic that utilising voice-activated controls and interfaces in cars reduces the visual and manual distraction of interacting with in-vehicle devices - it appears inevitable that next generation vehicles will be embodied by digital assistants and utilise spoken language as a method of interaction. From a design perspective, defining the language and interaction style that a digital driving assistant should adopt is contingent on the role that they play within the social fabric and context in which they are situated. We therefore conducted a qualitative, Wizard-of-Oz study to explore how drivers might interact linguistically with a natural language digital driving assistant. Twenty-five participants drove for 10 min in a medium-fidelity driving simulator while interacting with a state-of-the-art, high-functioning, conversational digital driving assistant. All exchanges were transcribed and analysed using recognised linguistic techniques, such as discourse and conversation analysis, normally reserved for interpersonal investigation. Language usage patterns demonstrate that interactions with the digital assistant were fundamentally social in nature, with participants affording the assistant equal social status and high-level cognitive processing capability. For example, participants were polite, actively controlled turn-taking during the conversation, and used back-channelling, fillers and hesitation, as they might in human communication. Furthermore, participants expected the digital assistant to understand and process complex requests mitigated with hedging words and expressions, and peppered with vague language and deictic references requiring shared contextual information and mutual understanding. Findings are presented in six themes which emerged during the analysis - formulating responses; turn-taking; back

  18. The Robbers and the Others – A Serious Game Using Natural Language Processing

    Learning a new language includes multiple aspects, from vocabulary acquisition to exercising words in sentences, and developing discourse building capabilities. In most learning scenarios, students learn individually and interact only during classes; therefore, it is difficult to enhance their

  19. Word Sense Disambiguation Based on Large Scale Polish CLARIN Heterogeneous Lexical Resources

    Paweł Kędzia


    Full Text Available Lexical resources can be applied in many different Natural Language Engineering tasks, but the most fundamental task is the recognition of word senses used in text contexts. The problem is difficult and not yet fully solved, and different lexical resources provide varied support for it. Polish CLARIN lexical semantic resources are based on plWordNet — a very large wordnet for Polish — as a central structure which is a basis for linking together several resources of different types. In this paper, several Word Sense Disambiguation (henceforth WSD) methods developed for Polish that utilise plWordNet are discussed. Textual sense descriptions in the traditional lexicon can be compared with text contexts using Lesk’s algorithm in order to find best matching senses. In the case of a wordnet, lexico-semantic relations provide the main description of word senses. Thus, first, we adapted and applied to Polish a WSD method based on Page Rank. According to it, text words are mapped onto their senses in the plWordNet graph and the Page Rank algorithm is run to find senses with the highest scores. The method presents results lower than but comparable to those reported for English. The error analysis showed that the main problems are fine-grained sense distinctions in plWordNet and a limited number of connections between words of different parts of speech. In the second approach, plWordNet expanded with the mapping onto the SUMO ontology concepts was used. Two scenarios for WSD were investigated: two-step disambiguation and disambiguation based on combined networks of plWordNet and SUMO. In the former scenario, words are first assigned SUMO concepts and next plWordNet senses are disambiguated. In the latter, plWordNet and SUMO are combined in one large network used next for the disambiguation of senses. The additional knowledge sources used in WSD improved the performance
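
    The gloss-overlap idea behind Lesk's algorithm mentioned in the abstract above can be sketched briefly. This is a simplified variant under stated assumptions: the sense inventory, English glosses and stopword list below are invented toy data, not plWordNet content, and the paper's actual methods are not reproduced here.

```python
def simplified_lesk(word, context, sense_glosses, stopwords=frozenset()):
    """Pick the sense whose gloss shares the most words with the context.

    sense_glosses -- hypothetical mapping: word -> {sense label: gloss text}
    """
    context_words = {w.lower() for w in context.split()} - stopwords
    best_sense, best_overlap = None, -1
    for sense, gloss in sense_glosses[word].items():
        gloss_words = {w.lower() for w in gloss.split()} - stopwords
        overlap = len(context_words & gloss_words)
        if overlap > best_overlap:  # ties keep the first sense listed
            best_sense, best_overlap = sense, overlap
    return best_sense


# Toy inventory for the Polish noun "zamek" (glosses written in English
# purely for readability; they are illustrative, not dictionary entries).
TOY_GLOSSES = {
    "zamek": {
        "castle": "a large fortified building where a king lived",
        "lock": "a mechanism that secures a door and is opened with a key",
    }
}
STOPWORDS = frozenset({"the", "a", "and", "is", "with", "that", "where", "in"})

sense_1 = simplified_lesk("zamek", "the key broke inside the door", TOY_GLOSSES, STOPWORDS)
sense_2 = simplified_lesk("zamek", "the king lived in a fortified building", TOY_GLOSSES, STOPWORDS)
```

    A real system would lemmatise both context and glosses first, which matters a great deal for an inflected language such as Polish.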

  20. Dependency distance: A new perspective on the syntactic development in second language acquisition. Comment on "Dependency distance: A new perspective on syntactic patterns in natural language" by Haitao Liu et al. (United States)

    Jiang, Jingyang; Ouyang, Jinghui


    Liu et al. [1] offer a clear and informative account of the use of dependency distance in studying natural languages, with a focus on the viewpoint that dependency distance minimization (DDM) can be regarded as a linguistic universal. We would like to add the perspective of employing dependency distance in the study of second language acquisition (SLA), particularly the study of syntactic development.

  1. Natural Language Search Interfaces: Health Data Needs Single-Field Variable Search (United States)

    Smith, Sam; Sufi, Shoaib; Goble, Carole; Buchan, Iain


    Background Data discovery, particularly the discovery of key variables and their inter-relationships, is key to secondary data analysis and, in turn, the evolving field of data science. Interface designers have presumed that their users are domain experts, and so they have provided complex interfaces to support these “experts.” Such interfaces hark back to a time when searches needed to be accurate first time as there was a high computational cost associated with each search. Our work is part of a governmental research initiative between the medical and social research funding bodies to improve the use of social data in medical research. Objective The cross-disciplinary nature of data science can make no assumptions regarding the domain expertise of a particular scientist, whose interests may intersect multiple domains. Here we consider the common requirement for scientists to seek archived data for secondary analysis. This has more in common with search needs of the “Google generation” than with their single-domain, single-tool forebears. Our study compares a Google-like interface with traditional ways of searching for noncomplex health data in a data archive. Methods Two user interfaces are evaluated for the same set of tasks in extracting data from surveys stored in the UK Data Archive (UKDA). One interface, Web search, is “Google-like,” enabling users to browse, search for, and view metadata about study variables, whereas the other, traditional search, has a standard multioption user interface. Results Using a comprehensive set of tasks with 20 volunteers, we found that the Web search interface met data discovery needs and expectations better than the traditional search. A task × interface repeated measures analysis showed a main effect indicating that answers found through the Web search interface were more likely to be correct (F 1,19=37.3, Peffect of task (F 3,57=6.3, Pinterface (F 1,19=18.0, Peffect of task (F 2,38=4.1, P=.025, Greenhouse

  2. Interpretation of Ukrainian and Polish Adverbial Word Equivalents Form and Meaning Interaction in National Explanatory Lexicography

    Alla Luchyk


    Full Text Available The article argues that it is both necessary and possible to compile dictionaries whose glossary units have an intermediate existence status, a class to which word equivalents belong. To form the glossary of a Ukrainian-Polish dictionary of this type, the form and meaning of Ukrainian and Polish word equivalents are analysed, the common and distinctive features of these language system elements are described, and the principles for compiling such a dictionary are clarified.

  3. Sports metaphors in Polish written commentaries on politics


    This paper seeks to investigate what sports metaphors are used in Polish written commentaries on politics and what special purpose they serve. In particular, the paper examines structural metaphors that come from the lexicon of popular sports, such as boxing, racing, track and field athletics, sailing, etc. The language data, derived from English Internet websites, has been grouped and discussed according to source domains. Applying George Lakoff and Mark Johnson’s approach to metaphor, the p...

  4. How to propose an action as objectively necessary: The case of Polish Trzeba x ("One needs to x")


    Zinken, J; Ogiermann, E


    The present study demonstrates that language-specific grammatical resources can afford speakers language-specific ways of organizing cooperative practical action. On the basis of video recordings of Polish families in their homes, we describe action affordances of the Polish impersonal modal declarative construction trzeba x (“one needs to x”) in the accomplishment of everyday domestic activities, such as cutting bread, bringing recalcitrant children back to the dinner table, or making phone ...

  5. Terminology extraction from medical texts in Polish. (United States)

    Marciniak, Małgorzata; Mykowiecka, Agnieszka


    Hospital documents contain free text describing the most important facts relating to patients and their illnesses. These documents are written in specific language containing medical terminology related to hospital treatment. Their automatic processing can help in verifying the consistency of hospital documentation and obtaining statistical data. To perform this task we need information on the phrases we are looking for. At the moment, clinical Polish resources are sparse. The existing terminologies, such as Polish Medical Subject Headings (MeSH), do not provide sufficient coverage for clinical tasks. It would be helpful therefore if it were possible to automatically prepare, on the basis of a data sample, an initial set of terms which, after manual verification, could be used for the purpose of information extraction. Using a combination of linguistic and statistical methods for processing over 1200 children hospital discharge records, we obtained a list of single and multiword terms used in hospital discharge documents written in Polish. The phrases are ordered according to their presumed importance in domain texts measured by the frequency of use of a phrase and the variety of its contexts. The evaluation showed that the automatically identified phrases cover about 84% of terms in domain texts. At the top of the ranked list, only 4% out of 400 terms were incorrect while out of the final 200, 20% of expressions were either not domain related or syntactically incorrect. We also observed that 70% of the obtained terms are not included in the Polish MeSH. Automatic terminology extraction can give results which are of a quality high enough to be taken as a starting point for building domain related terminological dictionaries or ontologies. This approach can be useful for preparing terminological resources for very specific subdomains for which no relevant terminologies already exist. The evaluation performed showed that none of the tested ranking procedures were

  6. In silico Evolutionary Developmental Neurobiology and the Origin of Natural Language (United States)

    Szathmáry, Eörs; Szathmáry, Zoltán; Ittzés, Péter; Orbán, Gergő; Zachár, István; Huszár, Ferenc; Fedor, Anna; Varga, Máté; Számadó, Szabolcs

    It is justified to assume that part of our genetic endowment contributes to our language skills, yet it is impossible to tell at this moment exactly how genes affect the language faculty. We complement experimental biological studies by an in silico approach in that we simulate the evolution of neuronal networks under selection for language-related skills. At the heart of this project is the Evolutionary Neurogenetic Algorithm (ENGA) that is deliberately biomimetic. The design of the system was inspired by important biological phenomena such as brain ontogenesis, neuron morphologies, and indirect genetic encoding. Neuronal networks were selected and were allowed to reproduce as a function of their performance in the given task. The selected neuronal networks in all scenarios were able to solve the communication problem they had to face. The most striking feature of the model is that it works with highly indirect genetic encoding--just as brains do.

  7. Mirror neurons and the social nature of language: the neural exploitation hypothesis. (United States)

    Gallese, Vittorio


    This paper discusses the relevance of the discovery of mirror neurons in monkeys and of the mirror neuron system in humans to a neuroscientific account of primates' social cognition and its evolution. It is proposed that mirror neurons and the functional mechanism they underpin, embodied simulation, can ground within a unitary neurophysiological explanatory framework important aspects of human social cognition. In particular, the main focus is on language, here conceived according to a neurophenomenological perspective, grounding meaning on the social experience of action. A neurophysiological hypothesis--the "neural exploitation hypothesis"--is introduced to explain how key aspects of human social cognition are underpinned by brain mechanisms originally evolved for sensorimotor integration. It is proposed that these mechanisms were later on adapted as new neurofunctional architecture for thought and language, while retaining their original functions as well. By neural exploitation, social cognition and language can be linked to the experiential domain of action.

  8. Computing Accurate Grammatical Feedback in a Virtual Writing Conference for German-Speaking Elementary-School Children: An Approach Based on Natural Language Generation (United States)

    Harbusch, Karin; Itsova, Gergana; Koch, Ulrich; Kuhner, Christine


    We built a natural language processing (NLP) system implementing a "virtual writing conference" for elementary-school children, with German as the target language. Currently, state-of-the-art computer support for writing tasks is restricted to multiple-choice questions or quizzes because automatic parsing of the often ambiguous and fragmentary…

  9. Learning homophones in context: Easy cases are favored in the lexicon of natural languages. (United States)

    Dautriche, Isabelle; Fibla, Laia; Fievet, Anne-Caroline; Christophe, Anne


    Even though ambiguous words are common in languages, children find it hard to learn homophones, where a single label applies to several distinct meanings (e.g., Mazzocco, 1997). The present work addresses this apparent discrepancy between learning abilities and typological pattern, with respect to homophony in the lexicon. In a series of five experiments, 20-month-old French children easily learnt a pair of homophones if the two meanings associated with the phonological form belonged to different syntactic categories, or to different semantic categories. However, toddlers failed to learn homophones when the two meanings were distinguished only by different grammatical genders. In parallel, we analyzed the lexicon of four languages, Dutch, English, French and German, and observed that homophones are distributed non-arbitrarily in the lexicon, such that easily learnable homophones are more frequent than hard-to-learn ones: pairs of homophones are preferentially distributed across syntactic and semantic categories, but not across grammatical gender. We show that learning homophones is easier than previously thought, at least when the meanings of the same phonological form are made sufficiently distinct by their syntactic or semantic context. Following this, we propose that this learnability advantage translates into the overall structure of the lexicon, i.e., the kinds of homophones present in languages exhibit the properties that make them learnable by toddlers, thus allowing them to remain in languages. Copyright © 2018 Elsevier Inc. All rights reserved.

  10. Implementation of Danish in the Natural Language Generator of Angus2

    Larsen, Søren Støvelbæk; Fihl, Preben; Moeslund, Thomas B.

    The purpose of this technical report is to cover the implementation of the Danish language and grammar in the Angus2 software. This includes a brief description of the Angus2 software, and the Danish grammar with relevance to the implementation in Angus2, and detailed description of how...

  11. Real versus template-based Natural Language Generation: a false opposition?

    van Deemter, Kees; Krahmer, Emiel; Theune, Mariet


    This paper challenges the received wisdom that template-based approaches to the generation of language are necessarily inferior to other approaches as regards their maintainability, linguistic well-foundedness and quality of output. Some recent NLG systems that call themselves `templatebased' will


    Pascual Cantos Gomez


    Full Text Available This paper aims at presenting a survey of computational linguistic tools that are presently available but whose potential has been neither fully considered nor fully exploited in modern CALL. It starts with a discussion of the rationale of DDL for language learning, presenting typical DDL activities, DDL software and potential extensions of non-typical DDL software (electronic dictionaries and electronic dictionary facilities) to DDL. An extended section is devoted to describing NLP technology and how it can be integrated into CALL, within already existing software or as stand-alone resources. A range of NLP tools is presented (MT programs, taggers, lemmatizers, parsers and speech technologies), with special emphasis on tagged concordancing. The paper finishes with a number of reflections and ideas on how language technologies can be used efficiently within the language learning context and how extensive exploration and integration of these technologies might change and extend both modern CALL and the present language learning paradigm.

  13. The Sentence Fairy: A Natural-Language Generation System to Support Children's Essay Writing (United States)

    Harbusch, Karin; Itsova, Gergana; Koch, Ulrich; Kuhner, Christine


    We built an NLP system implementing a "virtual writing conference" for elementary-school children, with German as the target language. Currently, state-of-the-art computer support for writing tasks is restricted to multiple-choice questions or quizzes because automatic parsing of the often ambiguous and fragmentary texts produced by pupils…

  14. School Meaning Systems: The Symbiotic Nature of Culture and "Language-In-Use" (United States)

    Abawi, Lindy


    Recent research has produced evidence to suggest a strong reciprocal link between school context-specific language constructions that reflect a school's vision and schoolwide pedagogy, and the way that meaning making occurs, and a school's culture is characterized. This research was conducted within three diverse settings: one school in the Sydney…

  15. Genetic and Environmental Links between Natural Language Use and Cognitive Ability in Toddlers (United States)

    Canfield, Caitlin F.; Edelson, Lisa R.; Saudino, Kimberly J.


    Although the phenotypic correlation between language and nonverbal cognitive ability is well-documented, studies examining the etiology of the covariance between these abilities are scant, particularly in very young children. The goal of this study was to address this gap in the literature by examining the genetic and environmental links between…

  16. Sensing roughness and polish direction

    Jakobsen, Michael Linde; Olesen, Anders Sig; Larsen, Henning Engelbrecht


    As a part of the work carried out in a project supported by the Danish Council for Technology and Innovation, we have investigated the option of smoothing standard CNC-machined surfaces. In the process of constructing optical prototypes, involving custom-designed optics, the development cost...... and time consumption can become prohibitive in a research budget. Machining the optical surfaces directly is expensive and time consuming. Alternatively, a more standardized and cheaper machining method can be used, calling for the object to be manually polished. During the polishing process, the operator...... needs information about the RMS-value of the surface roughness and the current direction of the scratches introduced by the polishing process. The RMS-value indicates to the operator how far he is from the final finish, and the scratch orientation is often specified by the customer in order to avoid...

  17. Detecting Novel and Emerging Drug Terms Using Natural Language Processing: A Social Media Corpus Study. (United States)

    Simpson, Sean S; Adams, Nikki; Brugman, Claudia M; Conners, Thomas J


    With the rapid development of new psychoactive substances (NPS) and changes in the use of more traditional drugs, it is increasingly difficult for researchers and public health practitioners to keep up with emerging drugs and drug terms. Substance use surveys and diagnostic tools need to be able to ask about substances using the terms that drug users themselves are likely to be using. Analyses of social media may offer new ways for researchers to uncover and track changes in drug terms in near real time. This study describes the initial results from an innovative collaboration between substance use epidemiologists and linguistic scientists employing techniques from the field of natural language processing to examine drug-related terms in a sample of tweets from the United States. The objective of this study was to assess the feasibility of using distributed word-vector embeddings trained on social media data to uncover previously unknown (to researchers) drug terms. In this pilot study, we trained a continuous bag of words (CBOW) model of distributed word-vector embeddings on a Twitter dataset collected during July 2016 (roughly 884.2 million tokens). We queried the trained word embeddings for terms with high cosine similarity (a proxy for semantic relatedness) to well-known slang terms for marijuana to produce a list of candidate terms likely to function as slang terms for this substance. This candidate list was then compared with an expert-generated list of marijuana terms to assess the accuracy and efficacy of using word-vector embeddings to search for novel drug terminology. The method described here produced a list of 200 candidate terms for the target substance (marijuana). Of these 200 candidates, 115 were determined to in fact relate to marijuana (65 terms for the substance itself, 50 terms related to paraphernalia). 
This included 30 terms which were used to refer to the target substance in the corpus yet did not appear on the expert-generated list and were
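
    The query step described in the abstract above, ranking vocabulary items by cosine similarity to a seed term, can be sketched as follows. This is only an illustration under stated assumptions: the three-dimensional vectors and term names are invented placeholders, whereas the study used CBOW embeddings trained on a Twitter corpus.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length, non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def most_similar(seed, embeddings, top_n=2):
    """Return the top_n (term, similarity) pairs closest to the seed term."""
    seed_vec = embeddings[seed]
    scores = [(term, cosine(seed_vec, vec))
              for term, vec in embeddings.items() if term != seed]
    return sorted(scores, key=lambda item: item[1], reverse=True)[:top_n]


# Hypothetical toy embeddings; real models have hundreds of dimensions.
TOY_EMBEDDINGS = {
    "seed_term":      [1.0, 0.1, 0.0],
    "candidate_a":    [0.9, 0.2, 0.0],
    "candidate_b":    [0.7, 0.6, 0.1],
    "unrelated_term": [0.0, 0.2, 1.0],
}

candidates = most_similar("seed_term", TOY_EMBEDDINGS)
```

    The ranked candidates would then be checked against an expert-generated list, as the abstract describes, because high cosine similarity is only a proxy for semantic relatedness.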

  18. On the relation between dependency distance, crossing dependencies, and parsing. Comment on "Dependency distance: a new perspective on syntactic patterns in natural languages" by Haitao Liu et al. (United States)

    Gómez-Rodríguez, Carlos


    Liu et al. [1] provide a comprehensive account of research on dependency distance in human languages. While the article is a very rich and useful report on this complex subject, here I will expand on a few specific issues where research in computational linguistics (specifically natural language processing) can inform DDM research, and vice versa. These aspects have not been explored much in [1] or elsewhere, probably due to the little overlap between both research communities, but they may provide interesting insights for improving our understanding of the evolution of human languages, the mechanisms by which the brain processes and understands language, and the construction of effective computer systems to achieve this goal.

    A Pokrywka; Z Obmiński; D Kwiatkowska; R Grucza


    The aim of this study was to investigate the number of cases and the profiles of Polish athletes who had occasionally been using marijuana or hashish throughout the period of 1998-2004, with respect to sex, age, and discipline of sport as well as the period of testing (in- and out-of-competition). Results of the study were compared with some data reported by other WADA accredited anti-doping laboratories. In total, 13,631 urine samples taken from Polish athletes of both sexes, aged 10-67 year...

  20. Graphite Composite Panel Polishing Fixture (United States)

    Hagopian, John; Strojny, Carl; Budinoff, Jason


    The use of high-strength, lightweight composites for the fixture is the novel feature of this innovation. The main advantage is the light weight and high stiffness-to-mass ratio relative to aluminum. Meter-class optics require support during the grinding/polishing process with large tools. The use of aluminum as a polishing fixture is standard, with pitch providing a compliant layer to allow support without deformation. Unfortunately, with meter-scale optics, a meter-scale fixture weighs over 120 lb (~55 kg) and may distort the optics being fabricated by loading the mirror and/or tool used in fabrication. The use of composite structures that are lightweight yet stiff allows standard techniques to be used while providing for a decrease in fixture weight by almost 70 percent. Mounts classically used to support large mirrors during fabrication are especially heavy and difficult to handle. The mount must be especially stiff to avoid deformation during the optical fabrication process, where a very large and heavy lap often can distort the mount and optic being fabricated. If the optic is placed on top of the lapping tool, the weight of the optic and the fixture can distort the lap. Fixtures to support the mirror during fabrication are often very large plates of aluminum, often 2 in. (~5 cm) or more in thickness and weighing upwards of 150 lb (68 kg). With the addition of a backing material such as pitch and the mirror itself, the assembly can often weigh over 250 lb (~113 kg) for a meter-class optic. This innovation is the use of a lightweight graphite panel with an aluminum honeycomb core for use as the polishing fixture. These materials have been used in the aerospace industry as structural members due to their light weight and high stiffness. The grinding/polishing fixture consists of the graphite composite panel, fittings, and fixtures to allow interface to the polishing machine, and introduction of pitch buttons to support the optic under fabrication. In its

  1. Globes and Teaching Aids Manufactured by Jan Felkl Company for the Polish Market

    Malgorzata Taborska


    Full Text Available The Jan Felkl company from Roztoky (Roztok) near Prague manufactured globes in seventeen language versions, and from 1861 also in Polish. The company was active until 1952, but it ceased to manufacture Polish-language globes as early as 1914. In the aftermath of the First World War, and with the development of the printing business, the demand for Czech globes shrank. It is difficult to estimate the overall output of Polish-language globes manufactured by Felkl’s company throughout the 53 years it operated. From catalogues and the surviving globes we know that terrestrial globes in six sizes, folding globes in two sizes, celestial globes (probably in four sizes), as well as telluria, lunaria and planetaria were manufactured for the Polish market. It is difficult to decide how many editions of individual types of globes were issued. Polish names were compiled by Franciszek Waligórski (one globe) and Mirosław Suchecki. Only 28 globes have survived to this day, including one celestial globe. Most of them are globes of an 8-inch diameter, approved by the Austrian ministries as teaching aids for schools. Nearly half of the surviving globes date from the years 1894–1914. Only ten items are in museums.

  2. Context Analysis of Customer Requests using a Hybrid Adaptive Neuro Fuzzy Inference System and Hidden Markov Models in the Natural Language Call Routing Problem (United States)

    Rustamov, Samir; Mustafayev, Elshan; Clements, Mark A.


    The paper investigates the context analysis of customer requests in a natural language call routing problem. One of the most significant problems in natural language call routing is comprehension of the client's request. To address this issue, hybrid HMM and ANFIS models are examined. Combining different types of models (ANFIS and HMM) can prevent the system from misidentifying user intention in a dialogue system. Because the models do not use lexical or syntactic analysis in the classification process, the hybrid system may be employed in various language and call routing domains.

  3. Context Analysis of Customer Requests using a Hybrid Adaptive Neuro Fuzzy Inference System and Hidden Markov Models in the Natural Language Call Routing Problem

    Rustamov Samir


    Full Text Available The paper investigates the context analysis of customer requests in a natural language call routing problem. One of the most significant problems in natural language call routing is comprehension of the client's request. To address this issue, hybrid HMM and ANFIS models are examined. Combining different types of models (ANFIS and HMM) can prevent the system from misidentifying user intention in a dialogue system. Because the models do not use lexical or syntactic analysis in the classification process, the hybrid system may be employed in various language and call routing domains.

  4. A natural language query system for Hubble Space Telescope proposal selection (United States)

    Hornick, Thomas; Cohen, William; Miller, Glenn


    The proposal selection process for the Hubble Space Telescope is assisted by a robust and easy to use query program (TACOS). The system parses an English subset language sentence regardless of the order of the keyword phrases, allowing the user a greater flexibility than a standard command query language. Capabilities for macro and procedure definition are also integrated. The system was designed for flexibility in both use and maintenance. In addition, TACOS can be applied to any knowledge domain that can be expressed in terms of a single reaction. The system was implemented mostly in Common LISP. The TACOS design is described in detail, with particular attention given to the implementation methods of sentence processing.

    regarded as a fairly complete dictionary contains about 18,000 items at solution to the domain-restricted task at translating present, and will be... dictionary access and so on, with an article. Unfortunately, the Weidner system did but as time goes on, one might imagine functionality not know that...superfast type. looped that it will be built with taste by peo. writer ought to be possible in the monolingual case pie who understand languages and

  6. FMS: A Format Manipulation System for Automatic Production of Natural Language Documents, Second Edition. Final Report. (United States)

    Silver, Steven S.

    FMS/3 is a system for producing hard copy documentation at high speed from free format text and command input. The system was originally written in assembler language for a 12K IBM 360 model 20 using a high speed 1403 printer with the UCS-TN chain option (upper and lower case). Input was from an IBM 2560 Multi-function Card Machine. The model 20…

  7. Zipf’s word frequency law in natural language: A critical review and future directions (United States)


    The frequency distribution of words has been a key object of study in statistical linguistics for the past 70 years. This distribution approximately follows a simple mathematical form known as Zipf’s law. This article first shows that human language has a highly complex, reliable structure in the frequency distribution over and above this classic law, although prior data visualization methods have obscured this fact. A number of empirical phenomena related to word frequencies are then reviewed. These facts are chosen to be informative about the mechanisms giving rise to Zipf’s law and are then used to evaluate many of the theoretical explanations of Zipf’s law in language. No prior account straightforwardly explains all the basic facts or is supported with independent evaluation of its underlying assumptions. To make progress at understanding why language obeys Zipf’s law, studies must seek evidence beyond the law itself, testing assumptions and evaluating novel predictions with new, independent data. PMID:24664880
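
    For readers unfamiliar with the law, the classical form states that the frequency of the word at rank r is roughly proportional to 1/r^alpha, with alpha near 1. The sketch below estimates alpha by a least-squares line through the log-log rank-frequency points. It is a rough illustration under stated assumptions: the function names are invented for this example, the input is synthetic, and the review itself argues that simple fits of this kind hide structure beyond the law.

```python
import math
from collections import Counter

def rank_frequency(tokens):
    """Word counts sorted from most to least frequent."""
    counts = Counter(tokens)
    return sorted(counts.values(), reverse=True)

def fit_zipf_exponent(frequencies):
    """Least-squares slope of log(frequency) against log(rank), sign-flipped."""
    xs = [math.log(r) for r in range(1, len(frequencies) + 1)]
    ys = [math.log(f) for f in frequencies]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    return -sxy / sxx

# Synthetic frequencies constructed to follow f(r) = 1200 / r exactly.
synthetic = [1200 // r for r in (1, 2, 3, 4, 5, 6)]
alpha = fit_zipf_exponent(synthetic)
```

    On real corpora the estimate depends on how the counts are binned and on which ranks are included, so a single fitted exponent should be read with caution.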

  8. Descriptive Metaphysics, Natural Language Metaphysics, Sapir-Whorf, and All That Stuff: Evidence from the Mass-Count Distinction

    Directory of Open Access Journals (Sweden)

    Francis Jeffry Pelletier


    Full Text Available Strawson (1959) described ‘descriptive metaphysics’, Bach (1986a) described ‘natural language metaphysics’, Sapir (1929) and Whorf (1940a,b, 1941) describe, well, Sapir-Whorfianism. And there are other views concerning the relation between correct semantic analysis of linguistic phenomena and the “reality” that is supposed to be thereby described. I think some considerations from the analyses of the mass-count distinction can shed some light on that very dark topic.

    References

    Bach, Emmon. 1986a. ‘Natural Language Metaphysics’. In Ruth Barcan Marcus, G.J.W. Dorn & Paul Weingartner (eds.) ‘Logic, Methodology, and Philosophy of Science, VII’, 573–595. Amsterdam: North Holland.
    Bach, Emmon. 1986b. ‘The Algebra of Events’. Linguistics and Philosophy 9: 5–16.
    Berger, Peter & Luckmann, Thomas. 1966. The Social Construction of Reality: A Treatise in the Sociology of Knowledge. New York: Doubleday.
    Boroditsky, Lera, Schmidt, Lauren & Phillips, Webb. 2003. ‘Sex, Syntax, and Semantics’. In Dedre Gentner & Susan Goldin-Meadow (eds.) ‘Language in Mind: Advances in the Study of Language and Cognition’, 59–80. Cambridge, MA: MIT Press.
    Cheng, L. & Sybesma, R. 1999. ‘Bare and Not-So-Bare Nouns and the structure of NP’. Linguistic Inquiry 30: 509–542.
    Chierchia, Gennaro. 1998a. ‘Reference to Kinds across Languages’. Natural Language Semantics 6: 339–405.
    Chierchia, Gennaro. 1998b. ‘Plurality of Mass Nouns and the Notion of ‘Semantic Parameter’ ’. In S. Rothstein (ed.) ‘Events and Grammar’, 53–103. Dordrecht: Kluwer.
    Chierchia, Gennaro. 2010. ‘Mass Nouns, Vagueness and Semantic Variation’. Synthèse 174: 99–149.
    Doetjes, Jenny. 1997. Quantifiers and Selection: On the Distribution of Quantifying Expressions in French, Dutch and English. Ph.D. thesis, University of Leiden, Holland

  9. Treating conduct disorder: An effectiveness and natural language analysis study of a new family-centred intervention program. (United States)

    Stevens, Kimberly A; Ronan, Prof Kevin; Davies, Gene


    This paper reports on a new family-centred, feedback-informed intervention focused on evaluating therapeutic outcomes and language changes across treatment for conduct disorder (CD). The study included 26 youth and families from a larger randomised, controlled trial (Ronan et al., in preparation). Outcome measures reflected family functioning/youth compliance, delinquency, and family goal attainment. First- and last-treatment session audio files were transcribed into more than 286,000 words and evaluated through the Linguistic Inquiry and Word Count Analysis program (Pennebaker et al., 2007). Significant outcomes across family functioning/youth compliance, delinquency, goal attainment and word usage reflected moderate-strong effect sizes. Benchmarking findings also revealed reduced time of treatment delivery compared to a gold standard approach. Linguistic analysis revealed specific language changes across treatment. For caregivers, increased first person, action-oriented, present tense, and assent type words and decreased sadness words were found; for youth, significant reduction in use of leisure words. This study is the first using lexical analyses of natural language to assess change across treatment for conduct disordered youth and families. Such findings provided strong support for program tenets; others, more speculative support. Copyright © 2016. Published by Elsevier B.V.

  10. Cannabinoids cases in Polish athletes

    A Pokrywka


    Full Text Available The aim of this study was to investigate the number of cases and the profiles of Polish athletes who had occasionally been using marijuana or hashish throughout the period of 1998-2004, with respect to: sex, age, and discipline of sport as well as the period of testing (in- and out-of-competition). Results of the study were compared with some data reported by other WADA accredited anti-doping laboratories. In total, 13,631 urine samples taken from Polish athletes of both sexes, aged 10-67 years, performing 46 disciplines of sport were tested. Cannabinoids were detected in 267 samples. Among Polish athletes the relative number of positive THC (tetrahydrocannabinol) samples was one of the highest in Europe. The group of young Polish athletes (aged 16-24 years) was the most THC-positive. THC-positive cases were noted more frequently in male athletes tested out of competition. The so-called contact sports (rugby, ice hockey), skating, boxing, badminton, body building and acrobatic sports were the sports in which the higher risk of cannabis use was observed. The legal interpretation of some positive cannabinoid results would be difficult because of some accidental and unintentional use of the narcotics by sportsmen. It was concluded that national anti-doping organizations (NADOs), which are competent to judge whether the anti-doping rules were violated, should take into account the possibility of non-intentional doping use of cannabinoids via passive smoking of marijuana.

  11. ATLAS brochure (Polish version)

    Lefevre, C


    ATLAS is the largest detector at the LHC, the most powerful particle accelerator in the world, which will start up in 2008. ATLAS is a multi-purpose detector, designed to throw light on fundamental questions such as the origin of mass and the nature of the Universe's dark matter.

  12. The dynamic nature of motivation in language learning: A classroom perspective

    Mirosław Pawlak


    Full Text Available When we examine the empirical investigations of motivation in second and foreign language learning, even those drawing upon the latest theoretical paradigms, such as the L2 motivational self system (Dörnyei, 2009), it becomes clear that many of them still fail to take account of its dynamic character and temporal variation. This may be surprising in view of the fact that the need to adopt such a process-oriented approach has been emphasized by a number of theorists and researchers (e.g., Dörnyei, 2000, 2001, 2009; Ushioda, 1996; Williams & Burden, 1997), and it lies at the heart of the model of second language motivation proposed by Dörnyei and Ottó (1998). It is also unfortunate that few research projects have addressed the question of how motivation changes during a language lesson as well as a series of lessons, and what factors might be responsible for fluctuations of this kind. The present paper aims to rectify this problem by reporting the findings of a classroom-based study which investigated the changes in the motivation of 28 senior high school students, both in terms of their goals and intentions, and their interest and engagement in classroom activities and tasks over the period of four weeks. The analysis of the data collected by means of questionnaires, observations and interviews showed that although the reasons for learning remain relatively stable, the intensity of motivation is indeed subject to variation on a minute-to-minute basis and this fact has to be recognized even in large-scale, cross-sectional research in this area.

  13. Identification of methicillin-resistant Staphylococcus aureus within the Nation’s Veterans Affairs Medical Centers using natural language processing

    Jones Makoto


    Full Text Available Abstract. Background: Accurate information is needed to direct healthcare systems’ efforts to control methicillin-resistant Staphylococcus aureus (MRSA). Assembling complete and correct microbiology data is vital to understanding and addressing the multiple drug-resistant organisms in our hospitals. Methods: Herein, we describe a system that securely gathers microbiology data from the Department of Veterans Affairs (VA) network of databases. Using natural language processing methods, we applied an information extraction process to extract organisms and susceptibilities from the free-text data. We then validated the extraction against independently derived electronic data and expert annotation. Results: We estimate that the collected microbiology data are 98.5% complete and that methicillin-resistant Staphylococcus aureus was extracted accurately 99.7% of the time. Conclusions: Applying natural language processing methods to microbiology records appears to be a promising way to extract accurate and useful nosocomial pathogen surveillance data. Both scientific inquiry and the data’s reliability will be dependent on the surveillance system’s capability to compare from multiple sources and circumvent systematic error. The dataset constructed and methods used for this investigation could contribute to a comprehensive infectious disease surveillance system or other pressing needs.

  14. im4Things: An Ontology-Based Natural Language Interface for Controlling Devices in the Internet of Things

    KAUST Repository

    Noguera-Arnaldos, José Ángel


    The Internet of Things (IoT) offers opportunities for new applications and services that enable users to access and control their working and home environment from local and remote locations, aiming to perform daily life activities in an easy way. However, the IoT also introduces new challenges, some of which arise from the large range of devices currently available and the heterogeneous interfaces provided for their control. The control and management of this variety of devices and interfaces represent a new challenge for non-expert users, instead of making their life easier. Based on this understanding, in this work we present a natural language interface for the IoT, which takes advantage of Semantic Web technologies to allow non-expert users to control their home environment through an instant messaging application in an easy and intuitive way. We conducted several experiments with a group of end users aiming to evaluate the effectiveness of our approach to control home appliances by means of natural language instructions. The evaluation results proved that without the need for technicalities, the user was able to control the home appliances in an efficient way.

  15. Dual Sticky Hierarchical Dirichlet Process Hidden Markov Model and Its Application to Natural Language Description of Motions. (United States)

    Hu, Weiming; Tian, Guodong; Kang, Yongxin; Yuan, Chunfeng; Maybank, Stephen


    In this paper, a new nonparametric Bayesian model called the dual sticky hierarchical Dirichlet process hidden Markov model (HDP-HMM) is proposed for mining activities from a collection of time series data such as trajectories. All the time series data are clustered. Each cluster of time series data, corresponding to a motion pattern, is modeled by an HMM. Our model postulates a set of HMMs that share a common set of states (topics in an analogy with topic models for document processing), but have unique transition distributions. For the application to motion trajectory modeling, topics correspond to motion activities. The learnt topics are clustered into atomic activities which are assigned predicates. We propose a Bayesian inference method to decompose a given trajectory into a sequence of atomic activities. On combining the learnt sources and sinks, semantic motion regions, and the learnt sequence of atomic activities, the action represented by the trajectory can be described in natural language in as automatic a way as possible. The effectiveness of our dual sticky HDP-HMM is validated on several trajectory datasets. The effectiveness of the natural language descriptions for motions is demonstrated on the vehicle trajectories extracted from a traffic scene.

  16. A natural language-based presentation of cognitive stimulation to people with dementia in assistive technology: A pilot study. (United States)

    Dethlefs, Nina; Milders, Maarten; Cuayáhuitl, Heriberto; Al-Salkini, Turkey; Douglas, Lorraine


    Currently, an estimated 36 million people worldwide are affected by Alzheimer's disease or related dementias. In the absence of a cure, non-pharmacological interventions, such as cognitive stimulation, which slow down the rate of deterioration can benefit people with dementia and their caregivers. Such interventions have been shown to improve well-being and slow down the rate of cognitive decline. It has further been shown that cognitive stimulation in interaction with a computer is as effective as with a human. However, the need to operate a computer often represents a difficulty for the elderly and stands in the way of widespread adoption. A possible solution to this obstacle is to provide a spoken natural language interface that allows people with dementia to interact with the cognitive stimulation software in the same way as they would interact with a human caregiver. This makes the assistive technology accessible to users regardless of their technical skills and provides a fully intuitive user experience. This article describes a pilot study that evaluated the feasibility of computer-based cognitive stimulation through a spoken natural language interface. Prototype software was evaluated with 23 users, including healthy elderly people and people with dementia. Feedback was overwhelmingly positive.

  17. On the nature and evolution of the neural bases of human language (United States)

    Lieberman, Philip


    The traditional theory equating the brain bases of language with Broca's and Wernicke's neocortical areas is wrong. Neural circuits linking activity in anatomically segregated populations of neurons in subcortical structures and the neocortex throughout the human brain regulate complex behaviors such as walking, talking, and comprehending the meaning of sentences. When we hear or read a word, neural structures involved in the perception or real-world associations of the word are activated as well as posterior cortical regions adjacent to Wernicke's area. Many areas of the neocortex and subcortical structures support the cortical-striatal-cortical circuits that confer complex syntactic ability, speech production, and a large vocabulary. However, many of these structures also form part of the neural circuits regulating other aspects of behavior. For example, the basal ganglia, which regulate motor control, are also crucial elements in the circuits that confer human linguistic ability and abstract reasoning. The cerebellum, traditionally associated with motor control, is active in motor learning. The basal ganglia are also key elements in reward-based learning. Data from studies of Broca's aphasia, Parkinson's disease, hypoxia, focal brain damage, and a genetically transmitted brain anomaly (the putative "language gene," family KE), and from comparative studies of the brains and behavior of other species, demonstrate that the basal ganglia sequence the discrete elements that constitute a complete motor act, syntactic process, or thought process. Imaging studies of intact human subjects and electrophysiologic and tracer studies of the brains and behavior of other species confirm these findings. As Dobzhansky put it, "Nothing in biology makes sense except in the light of evolution" (cited in Mayr, 1982). That applies with as much force to the human brain and the neural bases of language as it does to the human foot or jaw.
The converse follows: the mark of evolution on

  18. Knowledge-Based Natural Language Understanding: A AAAI-87 Survey Talk (United States)


    easily transformed into a regrettable mistake (don’t cry over spilt milk) if G is not characterized as a fleeting goal and a recovery plan therefore...technical literature is characterized by very dry and literal language. If there is one place where metaphors might not intrude, it must be when people...from the point of view of both evidential support and falsification? I ask it because you didn’t say anything about it. A: Well, I think there’s a lot

  19. Modelling language

    CERN Document Server

    Cardey, Sylviane


    In response to the need for reliable results from natural language processing, this book presents an original way of decomposing a language(s) in a microscopic manner by means of intra/inter‑language norms and divergences, going progressively from languages as systems to the linguistic, mathematical and computational models, which being based on a constructive approach are inherently traceable. Languages are described with their elements aggregating or repelling each other to form viable interrelated micro‑systems. The abstract model, which contrary to the current state of the art works in int

  20. Natural Language Processing Based Instrument for Classification of Free Text Medical Records

    Manana Khachidze


    Full Text Available According to the Ministry of Labor, Health and Social Affairs of Georgia, a new health management system is to be introduced in the near future. In this context arises the problem of structuring and classifying documents containing the full history of medical services provided. The present work introduces an instrument for the classification of medical records written in the Georgian language. It is the first attempt at such a classification of Georgian-language medical records. In total, 24,855 examination records were studied. The documents were classified into three main groups (ultrasonography, endoscopy, and X-ray) and 13 subgroups using two well-known methods: Support Vector Machine (SVM) and K-Nearest Neighbor (KNN). The results demonstrated that both machine learning methods performed successfully, with SVM performing slightly better. In the process of classification a “shrink” method, based on feature selection, was introduced and applied. At the first stage of classification the results of the “shrink” case were better; however, at the second stage of classification into subclasses, 23% of all documents could not be linked to only one definite individual subclass (liver or binary system) due to common features characterizing these subclasses. The overall results of the study were successful.
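
    One of the two methods named in this abstract, K-Nearest Neighbor, can be illustrated with a very small sketch. Everything below is an assumption made for illustration: the bag-of-words representation, the cosine similarity measure, the value of k, and the English training snippets are invented, whereas the study worked with Georgian-language records and also used SVM and a feature-selection step that are not shown here. Only the three group labels come from the abstract.

```python
import math
from collections import Counter

def bag_of_words(text):
    """Lower-cased whitespace tokens with their counts."""
    return Counter(text.lower().split())

def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0

def knn_classify(train, text, k=3):
    """Majority label among the k training documents most similar to text."""
    query = bag_of_words(text)
    ranked = sorted(
        train,
        key=lambda item: cosine(bag_of_words(item[0]), query),
        reverse=True,
    )
    votes = Counter(label for _, label in ranked[:k])
    return votes.most_common(1)[0][0]

# Hypothetical training snippets, one label per document.
train = [
    ("ultrasound scan of liver", "ultrasonography"),
    ("ultrasound scan of kidney", "ultrasonography"),
    ("endoscope examination of stomach", "endoscopy"),
    ("endoscope examination of colon", "endoscopy"),
    ("chest radiograph image", "X-ray"),
    ("hand radiograph image", "X-ray"),
]
prediction = knn_classify(train, "ultrasound scan of spleen", k=3)
```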

  1. A Requirements-Based Exploration of Open-Source Software Development Projects--Towards a Natural Language Processing Software Analysis Framework (United States)

    Vlas, Radu Eduard


    Open source projects do have requirements; they are, however, mostly informal, text descriptions found in requests, forums, and other correspondence. Understanding such requirements provides insight into the nature of open source projects. Unfortunately, manual analysis of natural language requirements is time-consuming, and for large projects,…

  2. The Phonotactic Influence on the Perception of a Consonant Cluster /pt/ by Native English and Native Polish Listeners: A Behavioral and Event Related Potential (ERP) Study (United States)

    Wagner, Monica; Shafer, Valerie L.; Martin, Brett; Steinschneider, Mitchell


    The effect of exposure to the contextual features of the /pt/ cluster was investigated in native-English and native-Polish listeners using behavioral and event-related potential (ERP) methodology. Both groups experience the /pt/ cluster in their languages, but only the Polish group experiences the cluster in the context of word onset examined in…


    Full Text Available The aim of this study was to determine the semantic meaning of the predicates Ngajengan, Daharan, Ngelor, Mangan, Ngrodok (Eating), Kaken (Eating), Suap, Bejijit (Eating), Bekeruak (Eating), Ngerasak (Eating) and Nyangklok (Eating). It also aimed to determine the lexical meaning of each word and the function of the words in sentences, especially the meaning of eating in the Sasaknese language. The lexical meaning of Ngajengan, Daharan, Ngelor, Mangan, Ngrodok (Eating), Kaken (Eating), Suap, Bejijit (Eating), Bekeruak (Eating), Ngerasak (Eating) and Nyangklok (Eating) is doing something to eat, but the words differ in their usage in sentences. In addition, word usage depends on the subject and object, and some predicates require a tool to express eating meals or food.

  4. The development of a natural language interface to a geographical information system (United States)

    Toledo, Sue Walker; Davis, Bruce


    This paper will discuss a two and a half year long project undertaken to develop an English-language interface for the geographical information system GRASS. The work was carried out for NASA by a small business, Netrologic, based in San Diego, California, under Phase 1 and 2 Small Business Innovative Research contracts. We consider here the potential value of this system whose current functionality addresses numerical, categorical and boolean raster layers and includes the display of point sets defined by constraints on one or more layers, answers yes/no and numerical questions, and creates statistical reports. It also handles complex queries and lexical ambiguities, and allows temporarily switching to UNIX or GRASS.

  5. Polish energy-system modernisation

    International Nuclear Information System (INIS)

    Drozdz, M.


    The Polish energy-system needs intensive investments in new technologies, which are energy efficient, clean and cost effective. Since the early 1990s, the Polish economy has had practically full access to modern technological devices, equipment and technologies. Introducing new technologies is a difficult task for project teams, constructors and investors. The author presents a set of principles for project teams useful in planning and energy modernisation. Several essential features are discussed: Energy-efficient appliances and systems; Choice of energy carriers, media and fuels; Optimal tariffs, maximum power and installed power; Intelligent, integrated, steering systems; Waste-energy recovery; Renewable-energy recovery. In practice there are several difficulties connected with planning and realising good technological and economic solutions. The author presents his own experiences of energy-system modernisation of industrial processes and building new objects. (Author)

  6. Energy savings in Polish buildings

    Energy Technology Data Exchange (ETDEWEB)

    Markel, L.C.; Gula, A.; Reeves, G.


    A demonstration of low-cost insulation and weatherization techniques was a part of phase 1 of the Krakow Clean Fossil Fuels and Energy Efficient Project. The objectives were to identify a cost-effective set of measures to reduce energy used for space heating, determine how much energy could be saved, and foster widespread implementation of those measures. The demonstration project focused on 4 11-story buildings in a Krakow housing cooperative. Energy savings of over 20% were obtained. Most important, the procedures and materials implemented in the demonstration project have been adapted to Polish conditions and applied to other housing cooperatives, schools, and hospitals. Additional projects are being planned, in Krakow and other cities, under the direction of FEWE-Krakow, the Polish Energie Cities Network, and Biuro Rozwoju Krakowa.


    Katarzyna CHRUZIK


    Full Text Available An analysis of the level of Polish transport safety culture shows that it currently depends on the culture of safety management within the organization and on the legal requirements and recommendations in this field for the different modes of transport (air, rail, road, water). Of the four basic types of transport, requirements are widely developed in aviation, rail, and water (sea) transport. In order to harmonize the requirements for transport safety, it appears advisable to develop a platform for the exchange of safety information across modes of transport and to develop multimodal good practices offering the possibility of improving Polish transport safety. In addition to pooling experience and knowledge in the field of transport safety across all modes, the platform proposed in the publication can also be a tool for improving new public transport operators.

  8. Polishing compound for plastic surfaces (United States)

    Stowell, M.S.


    This invention is comprised of a polishing compound for plastic materials. The compound includes, by weight, approximately 25 to 80 parts of at least one petroleum distillate lubricant, 1 to 12 parts mineral spirits, 50 to 155 parts abrasive paste, and 15 to 60 parts water. Preferably, the compound includes approximately 37 to 42 parts of at least one petroleum distillate lubricant, up to 8 parts mineral spirits, 95 to 110 parts abrasive paste, and 50 to 55 parts water. The proportions of the ingredients are varied in accordance with the particular application. The compound is used on PLEXIGLAS™, LEXAN™, LUCITE™, polyvinyl chloride (PVC), and similar plastic materials whenever a smooth, clear polished surface is desired.

  9. Computer simulation as an important approach to explore language universal. Comment on "Dependency distance: a new perspective on syntactic patterns in natural languages" by Haitao Liu et al. (United States)

    Lu, Qian


    Exploring language universals is one of the major goals of linguistic research, which is largely devoted to answering the "Platonic questions" in linguistics, that is, what language knowledge is and how this knowledge is acquired and used. However, if solely guided by linguistic intuition, it is very difficult for syntactic studies to answer these questions, or to achieve abstractions in the scientific sense. This suggests that linguistic analyses based on probability theory may provide effective ways to investigate language universals in terms of biological motivations or cognitive psychological mechanisms. With the view that "Language is a human-driven system", Liu, Xu & Liang's review [1] pointed out that dependency distance minimization (DDM), which has been corroborated by big data analysis of corpora, may be a language universal shaped in language evolution, a universal that has a profound effect on syntactic patterns.
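
    The abstract above names dependency distance without defining it. As a minimal sketch, assuming the definition commonly used in this literature (the absolute difference between the linear positions of a word and its syntactic head, averaged over all non-root words), the mean dependency distance of one sentence could be computed as follows. The function name and the example parse are illustrative assumptions, not taken from the record.

```python
def mean_dependency_distance(heads):
    """Mean dependency distance (MDD) of one sentence.

    heads[i] is the 1-based position of the head of word i + 1;
    0 marks the root, which has no incoming dependency.
    """
    distances = [
        abs(position - head)
        for position, head in enumerate(heads, start=1)
        if head != 0
    ]
    if not distances:
        return 0.0
    return sum(distances) / len(distances)


# Hypothetical parse of "The dog chased the cat":
# The -> dog, dog -> chased, chased = root, the -> cat, cat -> chased
heads = [2, 3, 0, 5, 3]
mdd = mean_dependency_distance(heads)
print(mdd)
```

    Under this definition, a language or text with a lower average value keeps syntactically related words closer together, which is the tendency that DDM refers to.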

  10. Automatic Determination of the Need for Intravenous Contrast in Musculoskeletal MRI Examinations Using IBM Watson's Natural Language Processing Algorithm. (United States)

    Trivedi, Hari; Mesterhazy, Joseph; Laguna, Benjamin; Vu, Thienkhai; Sohn, Jae Ho


    Magnetic resonance imaging (MRI) protocoling can be time- and resource-intensive, and protocols can often be suboptimal dependent upon the expertise or preferences of the protocoling radiologist. Providing a best-practice recommendation for an MRI protocol has the potential to improve efficiency and decrease the likelihood of a suboptimal or erroneous study. The goal of this study was to develop and validate a machine learning-based natural language classifier that can automatically assign the use of intravenous contrast for musculoskeletal MRI protocols based upon the free-text clinical indication of the study, thereby improving efficiency of the protocoling radiologist and potentially decreasing errors. We utilized a deep learning-based natural language classification system from IBM Watson, a question-answering supercomputer that gained fame after challenging the best human players on Jeopardy! in 2011. We compared this solution to a series of traditional machine learning-based natural language processing techniques that utilize a term-document frequency matrix. Each classifier was trained with 1240 MRI protocols plus their respective clinical indications and validated with a test set of 280. Ground truth of contrast assignment was obtained from the clinical record. For evaluation of inter-reader agreement, a blinded second reader radiologist analyzed all cases and determined contrast assignment based on only the free-text clinical indication. In the test set, Watson demonstrated overall accuracy of 83.2% when compared to the original protocol. This was similar to the overall accuracy of 80.2% achieved by an ensemble of eight traditional machine learning algorithms based on a term-document matrix. When compared to the second reader's contrast assignment, Watson achieved 88.6% agreement. When evaluating only the subset of cases where the original protocol and second reader were concordant (n = 251), agreement climbed further to 90.0%. The classifier was
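
    The traditional baselines in the abstract above rely on a term-document frequency matrix. The following is a minimal sketch of such a matrix using only the Python standard library; the whitespace tokenizer and the two example indications are assumptions made for illustration and do not reproduce the study's preprocessing or data.

```python
from collections import Counter


def term_document_matrix(documents):
    """Return (vocabulary, matrix), where matrix[d][t] is the number of
    times vocabulary[t] occurs in documents[d]."""
    tokenized = [doc.lower().split() for doc in documents]
    vocabulary = sorted({token for doc in tokenized for token in doc})
    matrix = []
    for doc in tokenized:
        counts = Counter(doc)
        matrix.append([counts[term] for term in vocabulary])
    return vocabulary, matrix


# Hypothetical free-text clinical indications (not from the study).
indications = ["knee pain rule out infection", "knee pain after fall"]
vocabulary, matrix = term_document_matrix(indications)
print(vocabulary)
print(matrix)
```

    Each row can then be passed to a conventional classifier as a feature vector, with the contrast assignment as the label.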

  11. Confocal Raman spectroscopy for the analysis of nail polish evidence. (United States)

    López-López, Maria; Vaz, Joana; García-Ruiz, Carmen


    Nail polishes are cosmetic paints that may be amenable to forensic analysis, offering useful information to assist in a crime reconstruction. Although the nail polish appearance could allow a quick visual identification of the sample, this analysis is subject to the perception and subjective interpretation of the forensic examiner. The chemical analysis of nail polishes offers a great deal of information not subject to analyst interpretation. Confocal Raman spectroscopy is a well-suited technique for the analysis of paints due to its non-invasive and non-destructive nature and its ability to supply information about the organic and inorganic components of the sample. In this work, 77 regular and gel nail polishes were analyzed with confocal Raman spectroscopy using two laser wavelengths (532 and 780 nm). The sample behavior under the two laser wavelengths and the differences in the spectra taken at different points of the sample were studied for each nail polish. Additionally, the spectra obtained for all the nail polishes were visually compared. The results showed that the longer laser wavelength prevents sample burning and fluorescence effects; the similarity among the spectra collected within the sample is not directly related to the presence of glitter particles; and 64% of the samples analyzed showed a characteristic spectrum. Additionally, the use of confocal Raman spectroscopy for the forensic analysis of nail polish evidence in the form of flakes or smudges on different surfaces was studied. The results showed that both types of evidence can be analyzed by the technique. Also, two non-invasive sampling methods for the collection of the evidence from the nails of the suspect or the victim were proposed: (i) to use acetone-soaked cotton swabs to remove the nail varnishes and (ii) to scrape the nail polish from the nail with a blade. Both approaches, each exhibiting advantages and drawbacks in terms of transport and handling were appropriate

  12. Careers of young Polish chemists


    Kosmulski, Marek


    A typical young Polish scientist is an alumnus of doctoral studies at the same university and department where he/she completed his/her Master's degree. The career is continued by receiving a habilitation at the same university and department. Then a holder of habilitation is promoted to a tenured position at the same university and department. Detailed analysis of scientific careers of 154 recent Ph.D. recipients and of 16 habilitation candidates in chemistry from University of Warsaw is present...

  13. 19th Polish Control Conference

    Kacprzyk, Janusz; Oprzędkiewicz, Krzysztof; Skruch, Paweł


    This volume contains the proceedings of the KKA 2017 – the 19th Polish Control Conference, organized by the Department of Automatics and Biomedical Engineering, AGH University of Science and Technology in Kraków, Poland on June 18–21, 2017, under the auspices of the Committee on Automatic Control and Robotics of the Polish Academy of Sciences, and the Commission for Engineering Sciences of the Polish Academy of Arts and Sciences. Part 1 deals with general issues of modeling and control, notably flow modeling and control, sliding mode, predictive, dual, etc. control. In turn, Part 2 focuses on optimization, estimation and prediction for control. Part 3 is concerned with autonomous vehicles, while Part 4 addresses applications. Part 5 discusses computer methods in control, and Part 6 examines fractional order calculus in the modeling and control of dynamic systems. Part 7 focuses on modern robotics. Part 8 deals with modeling and identification, while Part 9 deals with problems related to security, fault ...

  14. Performance analysis of CRF-based learning for processing WoT application requests expressed in natural language. (United States)

    Yoon, Young


    In this paper, we investigate the effectiveness of a CRF-based learning method for identifying necessary Web of Things (WoT) application components that would satisfy the users' requests issued in natural language. For instance, a user request such as "archive all sports breaking news" can be satisfied by composing a WoT application that consists of ESPN breaking news service and Dropbox as a storage service. We built an engine that can identify the necessary application components by recognizing a main act (MA) or named entities (NEs) from a given request. We trained this engine with the descriptions of WoT applications (called recipes) that were collected from IFTTT WoT platform. IFTTT hosts over 300 WoT entities that offer thousands of functions referred to as triggers and actions. There are more than 270,000 publicly-available recipes composed with those functions by real users. Therefore, the set of these recipes is well-qualified for the training of our MA and NE recognition engine. We share our unique experience of generating the training and test set from these recipe descriptions and assess the performance of the CRF-based language method. Based on the performance evaluation, we introduce further research directions.

  15. Building a Natural Language Processing Tool to Identify Patients With High Clinical Suspicion for Kawasaki Disease from Emergency Department Notes. (United States)

    Doan, Son; Maehara, Cleo K; Chaparro, Juan D; Lu, Sisi; Liu, Ruiling; Graham, Amanda; Berry, Erika; Hsu, Chun-Nan; Kanegaye, John T; Lloyd, David D; Ohno-Machado, Lucila; Burns, Jane C; Tremoulet, Adriana H


    Delayed diagnosis of Kawasaki disease (KD) may lead to serious cardiac complications. We sought to create and test the performance of a natural language processing (NLP) tool, the KD-NLP, in the identification of emergency department (ED) patients for whom the diagnosis of KD should be considered. We developed an NLP tool that recognizes the KD diagnostic criteria based on standard clinical terms and medical word usage using 22 pediatric ED notes augmented by Unified Medical Language System vocabulary. With high suspicion for KD defined as fever and three or more KD clinical signs, KD-NLP was applied to 253 ED notes from children ultimately diagnosed with either KD or another febrile illness. We evaluated KD-NLP performance against ED notes manually reviewed by clinicians and compared the results to a simple keyword search. KD-NLP identified high-suspicion patients with a sensitivity of 93.6% and specificity of 77.5% compared to notes manually reviewed by clinicians. The tool outperformed a simple keyword search (sensitivity = 41.0%; specificity = 76.3%). KD-NLP showed comparable performance to clinician manual chart review for identification of pediatric ED patients with a high suspicion for KD. This tool could be incorporated into the ED electronic health record system to alert providers to consider the diagnosis of KD. KD-NLP could serve as a model for decision support for other conditions in the ED. © 2016 by the Society for Academic Emergency Medicine.

  16. Analyzing discourse and text complexity for learning and collaborating: a cognitive approach based on natural language processing

    CERN Document Server

    Dascălu, Mihai


    With the advent and increasing popularity of Computer Supported Collaborative Learning (CSCL) and e-learning technologies, the need of automatic assessment and of teacher/tutor support for the two tightly intertwined activities of comprehension of reading materials and of collaboration among peers has grown significantly. In this context, a polyphonic model of discourse derived from Bakhtin’s work as a paradigm is used for analyzing both general texts and CSCL conversations in a unique framework focused on different facets of textual cohesion. As specificity of our analysis, the individual learning perspective is focused on the identification of reading strategies and on providing a multi-dimensional textual complexity model, whereas the collaborative learning dimension is centered on the evaluation of participants’ involvement, as well as on collaboration assessment. Our approach based on advanced Natural Language Processing techniques provides a qualitative estimation of the learning process and enhance...

  17. Automated Assessment of Patients' Self-Narratives for Posttraumatic Stress Disorder Screening Using Natural Language Processing and Text Mining. (United States)

    He, Qiwei; Veldkamp, Bernard P; Glas, Cees A W; de Vries, Theo


    Patients' narratives about traumatic experiences and symptoms are useful in clinical screening and diagnostic procedures. In this study, we presented an automated assessment system to screen patients for posttraumatic stress disorder via a natural language processing and text-mining approach. Four machine-learning algorithms-including decision tree, naive Bayes, support vector machine, and an alternative classification approach called the product score model-were used in combination with n-gram representation models to identify patterns between verbal features in self-narratives and psychiatric diagnoses. With our sample, the product score model with unigrams attained the highest prediction accuracy when compared with practitioners' diagnoses. The addition of multigrams contributed most to balancing the metrics of sensitivity and specificity. This article also demonstrates that text mining is a promising approach for analyzing patients' self-expression behavior, thus helping clinicians identify potential patients from an early stage.
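
    The abstract above combines its classifiers with n-gram representation models. As a minimal sketch of what such a representation looks like, the helper below extracts unigrams and bigrams from a tokenized narrative; the function name and the example sentence are assumptions for illustration and are not drawn from the study.

```python
def ngrams(tokens, n):
    """Return all contiguous n-grams of a token list as tuples."""
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


# Hypothetical fragment of a self-narrative (not from the study's data).
tokens = "i could not sleep".split()
unigrams = ngrams(tokens, 1)
bigrams = ngrams(tokens, 2)
print(unigrams)
print(bigrams)
```

    Counting how often each n-gram occurs per diagnostic group gives the kind of verbal features that the classifiers in the record are trained on.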

  18. Computer based extraction of phenotypic features of human congenital anomalies from the digital literature with natural language processing techniques. (United States)

    Karakülah, Gökhan; Dicle, Oğuz; Koşaner, Ozgün; Suner, Aslı; Birant, Çağdaş Can; Berber, Tolga; Canbek, Sezin


    The lack of laboratory tests for the diagnosis of most congenital anomalies renders the physical examination of the case crucial for the diagnosis of the anomaly, and cases in the diagnostic phase are mostly evaluated in the light of the literature knowledge. In this respect, for accurate diagnosis, it is of great importance to provide the decision maker with decision support by presenting the literature knowledge about a particular case. Here, we demonstrated a methodology for automated scanning and determining of the phenotypic features from the case reports related to congenital anomalies in the literature with text and natural language processing methods, and we created a framework of an information source for a potential diagnostic decision support system for congenital anomalies.

  19. Reproducibility in Natural Language Processing: A Case Study of Two R Libraries for Mining PubMed/MEDLINE (United States)

    Cohen, K. Bretonnel; Xia, Jingbo; Roeder, Christophe; Hunter, Lawrence E.


    There is currently a crisis in science related to highly publicized failures to reproduce large numbers of published studies. The current work proposes, by way of case studies, a methodology for moving the study of reproducibility in computational work to a full stage beyond that of earlier work. Specifically, it presents a case study in attempting to reproduce the reports of two R libraries for doing text mining of the PubMed/MEDLINE repository of scientific publications. The main findings are that a rational paradigm for reproduction of natural language processing papers can be established; the advertised functionality was difficult, but not impossible, to reproduce; and reproducibility studies can produce additional insights into the functioning of the published system. Additionally, the work on reproducibility led to the production of novel user-centered documentation that has been accessed 260 times since its publication, an average of once a day per library. PMID:29568821

  20. Validation of Case Finding Algorithms for Hepatocellular Cancer From Administrative Data and Electronic Health Records Using Natural Language Processing. (United States)

    Sada, Yvonne; Hou, Jason; Richardson, Peter; El-Serag, Hashem; Davila, Jessica


    Accurate identification of hepatocellular cancer (HCC) cases from automated data is needed for efficient and valid quality improvement initiatives and research. We validated HCC International Classification of Diseases, 9th Revision (ICD-9) codes, and evaluated whether natural language processing by the Automated Retrieval Console (ARC) for document classification improves HCC identification. We identified a cohort of patients with ICD-9 codes for HCC during 2005-2010 from Veterans Affairs administrative data. Pathology and radiology reports were reviewed to confirm HCC. The positive predictive value (PPV), sensitivity, and specificity of ICD-9 codes were calculated. A split validation study of pathology and radiology reports was performed to develop and validate ARC algorithms. Reports were manually classified as diagnostic of HCC or not. ARC generated document classification algorithms using the Clinical Text Analysis and Knowledge Extraction System. ARC performance was compared with manual classification. PPV, sensitivity, and specificity of ARC were calculated. A total of 1138 patients with HCC were identified by ICD-9 codes. On the basis of manual review, 773 had HCC. The HCC ICD-9 code algorithm had a PPV of 0.67, sensitivity of 0.95, and specificity of 0.93. For a random subset of 619 patients, we identified 471 pathology reports for 323 patients and 943 radiology reports for 557 patients. The pathology ARC algorithm had PPV of 0.96, sensitivity of 0.96, and specificity of 0.97. The radiology ARC algorithm had PPV of 0.75, sensitivity of 0.94, and specificity of 0.68. A combined approach of ICD-9 codes and natural language processing of pathology and radiology reports improves HCC case identification in automated data.
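
    The abstract above reports positive predictive value (PPV), sensitivity, and specificity for each case-finding algorithm. A minimal sketch of how these three measures follow from a 2x2 confusion table is given below, using their standard definitions. The counts in the example are hypothetical and are not the study's data.

```python
def diagnostic_metrics(tp, fp, fn, tn):
    """Standard validation measures from confusion-table counts.

    tp/fp: algorithm-positive cases confirmed / not confirmed on review
    fn/tn: algorithm-negative cases that were / were not true cases
    """
    return {
        "ppv": tp / (tp + fp),          # share of flagged cases that are true
        "sensitivity": tp / (tp + fn),  # share of true cases that are flagged
        "specificity": tn / (tn + fp),  # share of non-cases left unflagged
    }


# Hypothetical counts chosen only to illustrate the arithmetic.
metrics = diagnostic_metrics(tp=90, fp=10, fn=10, tn=190)
print(metrics)
```

    A high sensitivity with a modest PPV, as reported for the ICD-9 code algorithm, means that few true cases are missed but a substantial share of flagged patients need manual confirmation.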

  1. Polish students at the Académie Julian until 1919

    Zgórniak, Marek


    Full Text Available The subject of the article is the presence of Polish students in the most important private artistic school in Paris in the second half of the 19th century. The extant records regarding the atelier for male students made it possible to compile a list of about 165 Polish painters and sculptors studying there in the period from 1880 to 1919. The text presents the criteria used when preparing the list and the diagrams show the fluctuations in registration and the number of Polish artists in particular ateliers in successive years. The observations contained in the article have a summary nature and are illustrated only with selected examples.

  2. Using STED and ELSM confocal microscopy for a better knowledge of fused silica polished glass interface

    International Nuclear Information System (INIS)

    Catrin, Rodolphe; Neauport, Jerome; Taroux, Daniel; Corbineau, Thomas; Cormont, Philippe; Maunier, Cedric; Legros, Philippe


    Characteristics and nature of close surface defects existing in fused silica polished optical surfaces were explored. Samples were deliberately scratched using a modified polishing process in presence of different fluorescent dyes. Various techniques including Epi-fluorescence Laser Scanning Mode (ELSM) or Stimulated Emission Depletion (STED) confocal microscopy were used to measure and quantify scratches that are sometimes embedded under the polished layer. We show using a nondestructive technique that depth of the modified region extends far below the surface. Moreover cracks of 120 nm width can be present ten micrometers below the surface. (authors)

  3. What you say is not what you get: arguing for artificial languages instead of natural languages in human robot speech interaction

    NARCIS (Netherlands)

    Mubin, O.; Bartneck, C.; Feijs, L.M.G.


    The project described hereunder focuses on the design and implementation of an "Artificial Robotic Interaction Language", where the research goal is to find a balance between the effort necessary from the user to learn a new language and the resulting benefit of optimized automatic speech recognition

  4. Polish Experts’ Communication Encounters with Locals in a Chinese Subsidiary of a Western MNC

    Wilczewski, Michał; Søderberg, Anne-Marie; Gut, Arkadiusz

    and thematic analysis. Their stories reveal prerequisites for intercultural communication, language and culture-related communication problems with strategies to mitigate them, and factors that affect communication. The study offers important insights into the Polish-Chinese communication in a specific...

  5. From telegraphic to natural language: an expansion system in a pictogram-based AAC application


    Pahisa Solé, Joan


    In this doctoral thesis, we present a compansion system that transforms telegraphic language (sentences formed by uninflected content words), derived from pictogram-based augmentative and alternative communication (AAC), into natural language in Catalan and Spanish. The system has been designed to improve the communication of AAC users, who typically have severe speech impairments as well as motor impairments, and who use communication methods based...

  6. The tourism attractiveness of Polish libraries


    Miedzińska, Magdalena; Tanaś, Sławoj


  7. Natural Language-based Machine Learning Models for the Annotation of Clinical Radiology Reports. (United States)

    Zech, John; Pain, Margaret; Titano, Joseph; Badgeley, Marcus; Schefflein, Javin; Su, Andres; Costa, Anthony; Bederson, Joshua; Lehar, Joseph; Oermann, Eric Karl


    critical finding of 0.951 for unigram BOW versus 0.966 for the best-performing model. The Yule I of the head CT corpus was 34, markedly lower than that of the Reuters corpus (at 103) or I2B2 discharge summaries (at 271), indicating lower linguistic complexity. Conclusion Automated methods can be used to identify findings in radiology reports. The success of this approach benefits from the standardized language of these reports. With this method, a large labeled corpus can be generated for applications such as deep learning. © RSNA, 2018 Online supplemental material is available for this article.

  8. Dependency distance in language evolution. Comment on "Dependency distance: A new perspective on syntactic patterns in natural languages" by Haitao Liu et al. (United States)

    Liu, Bingli; Chen, Xinying


    In the target article [1], Liu et al. provide an informative introduction to dependency distance studies and proclaim that language syntactic patterns that relate to dependency distance are associated with human cognitive mechanisms, such as limited working memory and syntax processing. Therefore, such syntactic patterns are probably 'human-driven' language universals. Sufficient evidence based on big data analysis is also given in the article to support this idea. The hypotheses generally seem very convincing yet still need further tests from various perspectives. Diachronic linguistic study based on authentic language data, in our opinion, can be one of those 'further tests'.

  9. Polish Adaptation of Wrist Evaluation Questionnaires. (United States)

    Czarnecki, Piotr; Wawrzyniak-Bielęda, Anna; Romanowski, Leszek


    Questionnaires evaluating hand and wrist function are a very useful tool allowing for objective and systematic recording of symptoms reported by the patients. Most questionnaires generally accepted in clinical practice are available in English and need to be appropriately adapted in translation and undergo subsequent validation before they can be used in another culture and language. The process of translation of the questionnaires was based on the generally accepted guidelines of the International Quality of Life Assessment Project (IQOLA). First, the questionnaires were translated from English into Polish by two independent translators. Then, a joint version of the translation was prepared collectively and translated back into English. Each stage was followed by a written report. The translated questionnaires were then evaluated by a group of patients. We selected 31 patients with wrist problems and asked them to complete the PRWE, Mayo, Michigan and DASH questionnaires twice at intervals of 3-10 days. The results were submitted for statistical analysis. We found a statistically significant (pquestionnaires. A comparison of the PRWE and Mayo questionnaires with the DASH questionnaire also showed a statistically significant correlation (pquestionnaires was successful and that the questionnaires may be used in clinical practice.

  10. Corporate Politics on Polish Millennials


    Natalia Roślik


    At the beginning of this paper, the author attempts to determine and describe who Millennials actually are. Then, on the basis of this definition, corporations' activity over the past years regarding this age group is analysed. The main goal of the thesis is to bring out their specific features and describe what corporations on the Polish job market are doing to encourage them to work in their offices. Especially in Poland within the last years, it is observed that big multinat...

  11. Polish Foundation for Energy Efficiency

    Energy Technology Data Exchange (ETDEWEB)



    The Polish Foundation for Energy Efficiency (FEWE) was established in Poland at the end of 1990. FEWE, as an independent and non-profit organization, has the following objectives: to strive towards an energy efficient national economy, and to show the way and methods by use of which energy efficiency can be increased. The activity of the Foundation covers the entire territory of Poland through three regional centers: in Warsaw, Katowice and Cracow. FEWE employs well-known and experienced specialists within thermal and power engineering, civil engineering, economy and applied sciences. The organizer of the Foundation has been Battelle Memorial Institute - Pacific Northwest Laboratories from the USA.

  12. Droughts in historical times in Polish territory (United States)

    Limanowka, Danuta; Cebulak, Elzbieta; Pyrc, Robert; Doktor, Radoslaw


    Climate change is one of the key environmental, social and economic issues, and it also has political consequences. The impact of climate conditions on countries' economies is increasingly recognized and receives considerable attention, both on the global scale and from individual national governments. In 2008-2010, the KLIMAT Project on Impact of climate change on environment, economy and society (changes, effects and methods of reducing them, conclusions for science, engineering practice and economic planning), No. POIG01-03-01-14-011/08, was carried out at the Institute of Meteorology and Water Management - National Research Institute in Poland. The project was financed by the European Union and the Polish state budget within the framework of the Innovative Economy Operational Programme. A very wide range of research was carried out in different thematic areas. One of them was "Natural disasters and internal safety of the country (civil and economical)." The problem of drought in Poland was examined in terms of meteorology and hydrology. "Proxy" data descriptions very often report dry years and seasons and hot periods without precipitation. Analysis of historical material made it possible to extract the years that experienced prolonged periods of high temperatures and rainfall shortages. The weather phenomenon defined as drought belongs to extreme events. This information was very helpful in the process of indexing and thus in reconstructing the course and intensity of climatic elements in the past. The analysis covered the period from the year 1000 to modern times. Due to the limited information from the period 1000-1500, the authors focused primarily on the period from 1500 to 2010. Analysis of the collected material has allowed the development of a highly precise temporal structure of the possible occurrence of dry periods in Polish territory.

  13. Language and human nature: Kurt Goldstein's neurolinguistic foundation of a holistic philosophy. (United States)

    Ludwig, David


    Holism in interwar Germany provides an excellent example for social and political influences on scientific developments. Deeply impressed by the ubiquitous invocation of a cultural crisis, biologists, physicians, and psychologists presented holistic accounts as an alternative to the "mechanistic worldview" of the nineteenth century. Although the ideological background of these accounts is often blatantly obvious, many holistic scientists did not content themselves with a general opposition to a mechanistic worldview but aimed at a rational foundation of their holistic projects. This article will discuss the work of Kurt Goldstein, who is known for both his groundbreaking contributions to neuropsychology and his holistic philosophy of human nature. By focusing on Goldstein's neurolinguistic research, I want to reconstruct the empirical foundations of his holistic program without ignoring its cultural background. In this sense, Goldstein's work provides a case study for the formation of a scientific theory through the complex interplay between specific empirical evidences and the general cultural developments of the Weimar Republic. © 2012 Wiley Periodicals, Inc.

  14. Evaluation of natural language processing from emergency department computerized medical records for intra-hospital syndromic surveillance

    Pagliaroli Véronique


    Full Text Available Abstract Background The identification of patients who pose an epidemic hazard when they are admitted to a health facility plays a role in preventing the risk of hospital acquired infection. An automated clinical decision support system to detect suspected cases, based on the principle of syndromic surveillance, is being developed at the University of Lyon's Hôpital de la Croix-Rousse. This tool will analyse structured data and narrative reports from computerized emergency department (ED) medical records. The first step consists of developing an application (UrgIndex) which automatically extracts and encodes information found in narrative reports. The purpose of the present article is to describe and evaluate this natural language processing system. Methods Narrative reports have to be pre-processed before utilizing the French-language medical multi-terminology indexer (ECMT) for standardized encoding. UrgIndex identifies and excludes syntagmas containing a negation and replaces non-standard terms (abbreviations, acronyms, spelling errors...). Then, the phrases are sent to the ECMT through an Internet connection. The indexer's reply, based on Extensible Markup Language, returns codes and literals corresponding to the concepts found in phrases. UrgIndex filters codes corresponding to suspected infections. Recall is defined as the number of relevant processed medical concepts divided by the number of concepts evaluated (coded manually by the medical epidemiologist). Precision is defined as the number of relevant processed concepts divided by the number of concepts proposed by UrgIndex. Recall and precision were assessed for respiratory and cutaneous syndromes. Results Evaluation of 1,674 processed medical concepts contained in 100 ED medical records (50 for respiratory syndromes and 50 for cutaneous syndromes) showed an overall recall of 85.8% (95% CI: 84.1-87.3). Recall varied from 84.5% for respiratory syndromes to 87.0% for cutaneous syndromes. The
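
    The definitions of recall and precision given in the abstract above can be written out as a short sketch. It assumes that the manually coded concepts and the concepts proposed by the system are each available as a set of labels, and it treats the concepts found in both sets as the "relevant processed" ones. The function name and the example labels are hypothetical.

```python
def recall_precision(manual_concepts, proposed_concepts):
    """Recall and precision as described in the abstract.

    manual_concepts:   concepts coded manually by the reviewer
    proposed_concepts: concepts proposed by the automated system
    """
    relevant = manual_concepts & proposed_concepts
    recall = len(relevant) / len(manual_concepts)
    precision = len(relevant) / len(proposed_concepts)
    return recall, precision


# Hypothetical concept labels for one record (not from the study).
manual = {"fever", "cough", "dyspnea", "rash"}
proposed = {"fever", "cough", "rash", "headache", "nausea"}
recall, precision = recall_precision(manual, proposed)
print(recall, precision)
```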


    Harvard Univ., Cambridge, MA. Graduate School of Education.


  16. Linguistics in Language Education (United States)

    Kumar, Rajesh; Yunus, Reva


    This article looks at the contribution of insights from theoretical linguistics to an understanding of language acquisition and the nature of language in terms of their potential benefit to language education. We examine the ideas of innateness and universal language faculty, as well as multilingualism and the language-society relationship. Modern…

  17. Buffered Electrochemical Polishing of Niobium

    Energy Technology Data Exchange (ETDEWEB)

    Ciovati, Gianluigi [Thomas Jefferson National Accelerator Facility (TJNAF), Newport News, VA (United States); Tian, Hui [Thomas Jefferson National Accelerator Facility (TJNAF), Newport News, VA (United States); College of William and Mary, Williamsburg, VA (United States); Corcoran, Sean [Virginia Polytechnic Inst. and State Univ. (Virginia Tech), Blacksburg, VA (United States)


    The standard preparation of superconducting radio-frequency (SRF) cavities made of pure niobium includes the removal of a 'damaged' surface layer, by buffered chemical polishing (BCP) or electropolishing (EP), after the cavities are formed. The performance of the cavities is characterized by a sharp degradation of the quality factor when the surface magnetic field exceeds about 90 mT, a phenomenon referred to as 'Q-drop.' In cavities made of polycrystalline fine grain (ASTM 5) niobium, the Q-drop can be significantly reduced by a low-temperature (~120 °C) 'in-situ' baking of the cavity if the chemical treatment was EP rather than BCP. As part of the effort to understand this phenomenon, we investigated the effect of introducing a polarization potential during buffered chemical polishing, creating a process which is between the standard BCP and EP. While preliminary results on the application of this process to Nb cavities have been previously reported, in this contribution we focus on the characterization of this novel electrochemical process by measuring polarization curves, etching rates, surface finish, electrochemical impedance and the effects of temperature and electrolyte composition. In particular, it is shown that the anodic potential of Nb during BCP reduces the etching rate and improves the surface finish.

  18. Natural language query system design for interactive information storage and retrieval systems. Presentation visuals. M.S. Thesis Final Report, 1 Jul. 1985 - 31 Dec. 1987 (United States)

    Dominick, Wayne D. (Editor); Liu, I-Hsiung


    This Working Paper Series entry represents a collection of presentation visuals associated with the companion report entitled Natural Language Query System Design for Interactive Information Storage and Retrieval Systems, USL/DBMS NASA/RECON Working Paper Series report number DBMS.NASA/RECON-17.

  19. Development of a user-friendly interface for the searching of a data base in natural language while using concepts and means of artificial intelligence

    International Nuclear Information System (INIS)

    Pujo, Pascal


    This research thesis aimed at the development of a natural-language-based user-friendly interface for the searching of relational data bases. The author first addresses how to store data which will be accessible through an interface in natural language: this organisation must result in as few constraints as possible in query formulation. He briefly presents techniques related to the automatic processing of natural language, and highlights the need for a more user-friendly interface. Then, he presents the developed interface and outlines the user-friendliness and ergonomics of implemented procedures. He shows how the interface has been designed to deliver information and explanations on its processing. This allows the user to control the relevance of the answer. He also indicates the classification of mistakes and errors which may be present in queries in natural language. He finally gives an overview of possible evolutions of the interface, briefly presents deductive functionalities which could expand data management. The handling of complex objects is also addressed [fr

  20. Planned experiments and corpus based research play a complementary role. Comment on "Dependency distance: A new perspective on syntactic patterns in natural languages" by Haitao Liu et al. (United States)

    Vasishth, Shravan


    This interesting and informative review by Liu and colleagues [17] in this issue covers the full spectrum of research on the idea that in natural language, dependency distance tends to be small. The authors discuss two distinct research threads: experimental work from psycholinguistics on online processes in comprehension and production, and text-corpus studies of dependency length distributions.

  1. What Has Personality and Emotional Intelligence to Do with "Feeling Different" while Using a Foreign Language? (United States)

    Ozanska-Ponikwia, Katarzyna


    The present study investigates the link between personality traits (OCEAN Personality test), emotional intelligence (EI) (Trait Emotional Intelligence Questionnaire) and the notion of "feeling different" while using a foreign language among 102 Polish-English bilinguals and Polish L2 users of English who were immersed in a foreign language and…

  2. Polish Industry and Art at CERN

    CERN Multimedia


    On 17 October 2000 the second Polish industrial and technological exhibition opened at CERN. The first one was held five years ago and nine of the companies that were present then have come back again this year. Six of those companies were awarded contracts with CERN in 1995. Three Polish officials were present at the Opening Ceremony today: Mrs Malgorzata Kozlowska, Under-secretary of State in the State Committee for Scientific Research, Mr Henryk Ogryczak, Under-secretary of State in Ministry of Economy and Prof. Jerzy Niewodniczanski, President of National Atomic Energy Agency. Professor Luciano Maiani welcomed the Polish delegation to CERN and stressed the important contribution of Polish scientists and industrialists to the work of the laboratory. Director General Luciano Maiani (back left) and head of SPL division Karl-Heinz Kissler (back right) visit the Poland at CERN exhibition… The exhibition offers Polish companies the opportunity to establish professional contacts with CERN. Nineteen companies...


    Przemysław Wiatrowski


    Full Text Available This article discusses Indonesian set phrases, a research area not previously investigated by Polish scholars. The aim is to analyze expressions which reveal the cultural specificity of the Indonesian speech community. Specifically, the author is concerned with two categories of multiword expressions. One of them is lexical combinations which preserve observations characteristic of the Indonesian speech community. These are reflected in a system of lexical connotations drawn upon in the process of semantic motivation of idioms. The other is expressions made up of units which are specific to Indonesian culture. The cultural relevance of Indonesian multi-word combinations is examined against the background of the Polish language. By examining research material derived from dictionaries of phrases and collocations and general dictionaries of the Indonesian language, the author provides insights into the way of thinking and responding to reality which is embedded in the language and in the collective experience of members of the Indonesian community.

  4. Automated visual inspection for polished stone manufacture (United States)

    Smith, Melvyn L.; Smith, Lyndon N.


    Increased globalisation of the ornamental stone market has led to increased competition and more rigorous product quality requirements. As such, there are strong motivators to introduce new, more effective, inspection technologies that will help enable stone processors to reduce costs, improve quality and improve productivity. Natural stone surfaces may contain a mixture of complex two-dimensional (2D) patterns and three-dimensional (3D) features. The challenge in terms of automated inspection is to develop systems able to reliably identify 3D topographic defects, either naturally occurring or resulting from polishing, in the presence of concomitant complex 2D stochastic colour patterns. The resulting real-time analysis of the defects may be used in adaptive process control, in order to avoid the wasteful production of defective product. An innovative approach, using structured light and based upon an adaptation of the photometric stereo method, has been pioneered and developed at UWE to isolate and characterize mixed 2D and 3D surface features. The method is able to undertake tasks considered beyond the capabilities of existing surface inspection techniques. The approach has been successfully applied to real stone samples, and a selection of experimental results is presented.

  5. Impact of polishing on the light scattering at aerogel surface

    International Nuclear Information System (INIS)

    Barnyakov, A.Yu.; Barnyakov, M.Yu.; Bobrovnikov, V.S.; Buzykaev, A.R.; Danilyuk, A.F.; Katcin, A.A.; Kononov, S.A.; Kirilenko, P.S.; Kravchenko, E.A.; Kuyanov, I.A.; Onuchin, A.P.; Ovtin, I.V.; Predein, A.Yu.; Protsenko, R.S.


    Particle identification power of modern aerogel RICH detectors strongly depends on optical quality of radiators. It was shown that wavelength dependence of aerogel tile transparency after polishing cannot be described by the standard Hunt formula. The Hunt formula has been modified to describe scattering in a thin layer of silica dust on the surface of aerogel tile. Several procedures of polishing of aerogel tile have been tested. The best result has been achieved while using natural silk tissue. The resulting block has optically smooth surfaces. The measured decrease of aerogel transparency due to surface scattering is about a few percent. This result could be used for production of radiators for the Focusing Aerogel RICH detectors.

  6. Africa and Its People in the Polish Media

    Directory of Open Access Journals (Sweden)

    Full Text Available The African continent is treated by the Polish media marginally and usually seen through the lens of four domains of stereotypical perceptions that are associated with difficult life conditions, threats and dangers, beautiful and wild nature, as well as original and diverse cultures. Monitoring of the Polish media has become very important in this situation. That is why the results of the first media monitoring report were published in 2011 by the ‘Africa Another Way’ Foundation. Five years later the monitoring was repeated. It is hard to resist the impression that Africa is still viewed as a poor, underdeveloped and dangerous continent. And the way it is presented translates into the way individuals of African descent are perceived.

  7. A new strategy for the restructuring of Polish energy sector

    International Nuclear Information System (INIS)

    Kozlowski, R.H.; Tallat, J.


    In accordance with strategic planning in the military, the leader (in this case the Minister of Economy) is responsible for setting goals, finding the right people to accomplish these goals (those working in the energy sector), analysing the current situation (state of the energy sector) and evaluating available resources (conventional and renewable energy resources). In terms of economic planning (this term is proper for an economy that sets numerous laws and quotas), the goal is to get the Polish economy out of economic slump, which is the result of seventeen years of improper government practices, into a state of prosperity corresponding to no less than the European average. The only way of accomplishing this goal of high economic growth and catching up with highly-developed countries is to develop local inexpensive energy resources. This study focuses on the potential to develop abundant Polish geothermal resources as well as natural gas based co-generation. (author)

  8. Does it really matter whether students' contributions are spoken versus typed in an intelligent tutoring system with natural language? (United States)

    D'Mello, Sidney K; Dowell, Nia; Graesser, Arthur


    There is the question of whether learning differs when students speak versus type their responses when interacting with intelligent tutoring systems with natural language dialogues. Theoretical bases exist for three contrasting hypotheses. The speech facilitation hypothesis predicts that spoken input will increase learning, whereas the text facilitation hypothesis predicts typed input will be superior. The modality equivalence hypothesis claims that learning gains will be equivalent. Previous experiments that tested these hypotheses were confounded by automated speech recognition systems with substantial error rates that were detected by learners. We addressed this concern in two experiments via a Wizard of Oz procedure, where a human intercepted the learner's speech and transcribed the utterances before submitting them to the tutor. The overall pattern of the results supported the following conclusions: (1) learning gains associated with spoken and typed input were on par and quantitatively higher than a no-intervention control, (2) participants' evaluations of the session were not influenced by modality, and (3) there were no modality effects associated with differences in prior knowledge and typing proficiency. Although the results generally support the modality equivalence hypothesis, highly motivated learners reported lower cognitive load and demonstrated increased learning when typing compared with speaking. We discuss the implications of our findings for intelligent tutoring systems that can support typed and spoken input.

  9. Combining natural language processing and network analysis to examine how advocacy organizations stimulate conversation on social media. (United States)

    Bail, Christopher Andrew


    Social media sites are rapidly becoming one of the most important forums for public deliberation about advocacy issues. However, social scientists have not explained why some advocacy organizations produce social media messages that inspire far-ranging conversation among social media users, whereas the vast majority of them receive little or no attention. I argue that advocacy organizations are more likely to inspire comments from new social media audiences if they create "cultural bridges," or produce messages that combine conversational themes within an advocacy field that are seldom discussed together. I use natural language processing, network analysis, and a social media application to analyze how cultural bridges shaped public discourse about autism spectrum disorders on Facebook over the course of 1.5 years, controlling for various characteristics of advocacy organizations, their social media audiences, and the broader social context in which they interact. I show that organizations that create substantial cultural bridges provoke 2.52 times more comments about their messages from new social media users than those that do not, controlling for these factors. This study thus offers a theory of cultural messaging and public deliberation and computational techniques for text analysis and application-based survey research.

  10. Integrating natural language processing expertise with patient safety event review committees to improve the analysis of medication events. (United States)

    Fong, Allan; Harriott, Nicole; Walters, Donna M; Foley, Hanan; Morrissey, Richard; Ratwani, Raj R


    Many healthcare providers have implemented patient safety event reporting systems to better understand and improve patient safety. Reviewing and analyzing these reports is often time consuming and resource intensive because of both the quantity of reports and length of free-text descriptions in the reports. Natural language processing (NLP) experts collaborated with clinical experts on a patient safety committee to assist in the identification and analysis of medication related patient safety events. Different NLP algorithmic approaches were developed to identify four types of medication related patient safety events and the models were compared. Well performing NLP models were generated to categorize medication related events into pharmacy delivery delays, dispensing errors, Pyxis discrepancies, and prescriber errors with receiver operating characteristic areas under the curve of 0.96, 0.87, 0.96, and 0.81 respectively. We also found that modeling the brief without the resolution text generally improved model performance. These models were integrated into a dashboard visualization to support the patient safety committee review process. We demonstrate the capabilities of various NLP models and the use of two text inclusion strategies at categorizing medication related patient safety events. The NLP models and visualization could be used to improve the efficiency of patient safety event data review and analysis. Copyright © 2017 Elsevier B.V. All rights reserved.

  11. Informatics in radiology: RADTF: a semantic search-enabled, natural language processor-generated radiology teaching file. (United States)

    Do, Bao H; Wu, Andrew; Biswal, Sandip; Kamaya, Aya; Rubin, Daniel L


    Storing and retrieving radiology cases is an important activity for education and clinical research, but this process can be time-consuming. In the process of structuring reports and images into organized teaching files, incidental pathologic conditions not pertinent to the primary teaching point can be omitted, as when a user saves images of an aortic dissection case but disregards the incidental osteoid osteoma. An alternate strategy for identifying teaching cases is text search of reports in radiology information systems (RIS), but retrieved reports are unstructured, teaching-related content is not highlighted, and patient identifying information is not removed. Furthermore, searching unstructured reports requires sophisticated retrieval methods to achieve useful results. An open-source, RadLex®-compatible teaching file solution called RADTF, which uses natural language processing (NLP) methods to process radiology reports, was developed to create a searchable teaching resource from the RIS and the picture archiving and communication system (PACS). The NLP system extracts and de-identifies teaching-relevant statements from full reports to generate a stand-alone database, thus converting existing RIS archives into an on-demand source of teaching material. Using RADTF, the authors generated a semantic search-enabled, Web-based radiology archive containing over 700,000 cases with millions of images. RADTF combines a compact representation of the teaching-relevant content in radiology reports and a versatile search engine with the scale of the entire RIS-PACS collection of case material. ©RSNA, 2010

  12. Per-service supervised learning for identifying desired WoT apps from user requests in natural language.

    Young Yoon

    Full Text Available Web of Things (WoT) platforms are growing fast, and so is the need for composing WoT apps more easily and efficiently. We have recently commenced the campaign to develop an interface where users can issue requests for WoT apps entirely in natural language. This requires an effort to build a system that can learn to identify relevant WoT functions that fulfill users' requests. In our preceding work, we trained a supervised learning system with thousands of publicly-available IFTTT app recipes based on conditional random fields (CRF). However, the sub-par accuracy and excessive training time motivated us to devise a better approach. In this paper, we present a novel solution that creates a separate learning engine for each trigger service. With this approach, parallel and incremental learning becomes possible. For inference, our system first identifies the most relevant trigger service for a given user request by using an information retrieval technique. Then, the learning engine associated with the trigger service predicts the most likely pair of trigger and action functions. We expect that such a two-phase inference method, given parallel learning engines, would improve the accuracy of identifying related WoT functions. We verify our new solution through the empirical evaluation with training and test sets sampled from a pool of refined IFTTT app recipes. We also meticulously analyze the characteristics of the recipes to find future research directions.

  13. An Introduction to Natural Language Processing: How You Can Get More From Those Electronic Notes You Are Generating. (United States)

    Kimia, Amir A; Savova, Guergana; Landschaft, Assaf; Harper, Marvin B


    Electronically stored clinical documents may contain both structured data and unstructured data. The use of structured clinical data varies by facility, but clinicians are familiar with coded data such as International Classification of Diseases, Ninth Revision, Systematized Nomenclature of Medicine-Clinical Terms codes, and commonly other data including patient chief complaints or laboratory results. Most electronic health records have much more clinical information stored as unstructured data, for example, clinical narrative such as history of present illness, procedure notes, and clinical decision making are stored as unstructured data. Despite the importance of this information, electronic capture or retrieval of unstructured clinical data has been challenging. The field of natural language processing (NLP) is undergoing rapid development, and existing tools can be successfully used for quality improvement, research, healthcare coding, and even billing compliance. In this brief review, we provide examples of successful uses of NLP using emergency medicine physician visit notes for various projects and the challenges of retrieving specific data and finally present practical methods that can run on a standard personal computer as well as high-end state-of-the-art funded processes run by leading NLP informatics researchers.

  14. Experiments with a First Prototype of a Spatial Model of Cultural Meaning through Natural-Language Human-Robot Interaction

    Directory of Open Access Journals (Sweden)

    Full Text Available When using assistive systems, the consideration of individual and cultural meaning is crucial for the utility and acceptance of technology. Orientation, communication and interaction are rooted in perception and therefore always happen in material space. We understand that a major problem lies in the difference between human and technical perception of space. Cultural policies are based on meanings including their spatial situation and their rich relationships. Therefore, we have developed an approach where the different perception systems share a hybrid spatial model that is generated by artificial intelligence—a joint effort by humans and assistive systems. The aim of our project is to create a spatial model of cultural meaning based on interaction between humans and robots. We define the role of humanoid robots as becoming our companions. This calls for technical systems to include still inconceivable human and cultural agendas for the perception of space. In two experiments, we tested a first prototype of the communication module that allows a humanoid to learn cultural meanings through a machine learning system. Interaction is achieved by non-verbal and natural-language communication between humanoids and test persons. This helps us to better understand how a spatial model of cultural meaning can be developed.

  15. Per-service supervised learning for identifying desired WoT apps from user requests in natural language. (United States)

    Yoon, Young


    Web of Things (WoT) platforms are growing fast, and so is the need for composing WoT apps more easily and efficiently. We have recently commenced the campaign to develop an interface where users can issue requests for WoT apps entirely in natural language. This requires an effort to build a system that can learn to identify relevant WoT functions that fulfill users' requests. In our preceding work, we trained a supervised learning system with thousands of publicly-available IFTTT app recipes based on conditional random fields (CRF). However, the sub-par accuracy and excessive training time motivated us to devise a better approach. In this paper, we present a novel solution that creates a separate learning engine for each trigger service. With this approach, parallel and incremental learning becomes possible. For inference, our system first identifies the most relevant trigger service for a given user request by using an information retrieval technique. Then, the learning engine associated with the trigger service predicts the most likely pair of trigger and action functions. We expect that such a two-phase inference method, given parallel learning engines, would improve the accuracy of identifying related WoT functions. We verify our new solution through the empirical evaluation with training and test sets sampled from a pool of refined IFTTT app recipes. We also meticulously analyze the characteristics of the recipes to find future research directions.
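
    The two-phase inference described in the record above (retrieve the most relevant trigger service, then let that service's own engine predict a trigger and action pair) might be sketched roughly as follows. Everything here is an assumption made for illustration: the service names, the keyword catalogue, and the rule-based functions that stand in for the trained per-service learning engines are hypothetical and are not the authors' models or data.

```python
from collections import Counter

# Hypothetical catalogue: trigger service -> descriptive keywords.
SERVICE_KEYWORDS = {
    "weather": "rain snow forecast temperature tomorrow weather",
    "email": "email inbox message attachment mail",
}


def tokenize(text: str) -> list[str]:
    return text.lower().split()


def rank_service(request: str) -> str:
    """Phase 1: pick the service whose keywords overlap the request the most."""
    request_counts = Counter(tokenize(request))

    def overlap(service: str) -> int:
        keyword_counts = Counter(tokenize(SERVICE_KEYWORDS[service]))
        return sum((request_counts & keyword_counts).values())

    return max(SERVICE_KEYWORDS, key=overlap)


# Phase 2 stand-ins: simple rules in place of per-service trained engines.
def weather_engine(request: str) -> tuple[str, str]:
    trigger = "rain_forecast" if "rain" in tokenize(request) else "temperature_change"
    return trigger, "send_notification"


def email_engine(request: str) -> tuple[str, str]:
    trigger = "new_attachment" if "attachment" in tokenize(request) else "new_email"
    return trigger, "save_to_storage"


ENGINES = {"weather": weather_engine, "email": email_engine}


def infer(request: str) -> tuple[str, str, str]:
    """Run both phases and return (service, trigger, action)."""
    service = rank_service(request)
    trigger, action = ENGINES[service](request)
    return service, trigger, action


if __name__ == "__main__":
    print(infer("notify me if rain is in the forecast tomorrow"))
    print(infer("save every email attachment"))
```

    Because each service has its own engine, one engine can be retrained or added without touching the others, which is the property the abstract associates with parallel and incremental learning.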

  16. Corporate Politics on Polish Millennials

    Natalia Roślik


    Full Text Available At the beginning of this paper, the author tries to determine and describe who Millennials actually are. Then, on the basis of this definition, corporations' activity regarding this age group over the past years is analysed. The main goal of the thesis is to bring out their specific features and describe what corporations on the Polish job market are doing to encourage them to work in their offices. Especially in Poland, it has been observed in recent years that big multinational companies are paying special attention to Millennials and trying to hire them before competitors do. As part of this paper, the author describes corporate politics and practices using the examples of Thomson Reuters and BNY Mellon. The author also discusses key features of this generation and the differences between it and the generation of Millennials' parents. Additionally, there is a reference to the concept of corporate social responsibility and to work-life balance issues.

  17. Determinants of Polish public debt

    Directory of Open Access Journals (Sweden)

    Tomasz Stryjewski


    Full Text Available The crisis, which had its beginning in 2007, turned into the debt crisis of the countries. The examples of Greece, Ireland, Iceland or Spain showed the category of public debt in a new light. Poland, at the turn of 2010/2011 also achieved the upper level of public debt acceptable by the law. In the present situation of the European Union countries being in debt, and even insolvent, the situation in Poland becomes riskier. This article attempts at an empirical verification of the determinants of Polish public debt within 95 months (the data link with the period of time from January 2003 to November 2010. The verification of the main factors which cause the formation of public debt takes place by means of an appropriately verified econometric model.

  18. Zerodur polishing process for high surface quality and high efficiency

    Tesar, A.; Fuchs, B.


    Zerodur is a glass-ceramic composite of importance in applications where temperature instabilities influence optical and mechanical performance, such as in earthbound and spaceborne telescope mirror substrates. Polished Zerodur surfaces of high quality have been required for laser gyro mirrors. Polished surface quality of substrates affects performance of high reflection coatings. Thus, the interest in improving Zerodur polished surface quality has become more general. Beyond eliminating subsurface damage, high quality surfaces are produced by reducing the amount of hydrated material redeposited on the surface during polishing. With the proper control of polishing parameters, such surfaces exhibit roughnesses of < 1 Angstrom rms. Zerodur polishing was studied to recommend a high surface quality polishing process which could be easily adapted to standard planetary continuous polishing machines and spindles. This summary contains information on a polishing process developed at LLNL which reproducibly provides high quality polished Zerodur surfaces at very high polishing efficiencies

  19. Convergent Polishing: A Simple, Rapid, Full Aperture Polishing Process of High Quality Optical Flats & Spheres (United States)

    Suratwala, Tayyab; Steele, Rusty; Feit, Michael; Dylla-Spears, Rebecca; Desjardin, Richard; Mason, Dan; Wong, Lana; Geraghty, Paul; Miller, Phil; Shen, Nan


    Convergent Polishing is a novel polishing system and method for finishing flat and spherical glass optics in which a workpiece, independent of its initial shape (i.e., surface figure), will converge to final surface figure with excellent surface quality under a fixed, unchanging set of polishing parameters in a single polishing iteration. In contrast, conventional full aperture polishing methods require multiple, often long, iterative cycles involving polishing, metrology and process changes to achieve the desired surface figure. The Convergent Polishing process is based on the concept of workpiece-lap height mismatch resulting in pressure differential that decreases with removal and results in the workpiece converging to the shape of the lap. The successful implementation of the Convergent Polishing process is a result of the combination of a number of technologies to remove all sources of non-uniform spatial material removal (except for workpiece-lap mismatch) for surface figure convergence and to reduce the number of rogue particles in the system for low scratch densities and low roughness. The Convergent Polishing process has been demonstrated for the fabrication of both flats and spheres of various shapes, sizes, and aspect ratios on various glass materials. The practical impact is that high quality optical components can be fabricated more rapidly, more repeatedly, with less metrology, and with less labor, resulting in lower unit costs. In this study, the Convergent Polishing protocol is specifically described for fabricating 26.5 cm square fused silica flats from a fine ground surface to a polished ~λ/2 surface figure after polishing 4 hr per surface on an 81 cm diameter polisher. PMID:25489745

  20. Understanding Language in Education and Grade 4 Reading Performance Using a "Natural Experiment" of Botswana and South Africa (United States)

    Shepherd, Debra Lynne


    The regional and cultural closeness of Botswana and South Africa, as well as differences in their political histories and language policy stances, offers a unique opportunity to evaluate the role of language in reading outcomes. This study aims to empirically test the effect of exposure to mother tongue and English instruction on the reading…

  1. The Importance of Natural Change in Planning School-Based Intervention for Children with Developmental Language Impairment (DLI) (United States)

    Botting, Nicola; Gaynor, Marguerite; Tucker, Katie; Orchard-Lisle, Ginnie


    Some reports suggest that there is an increase in the number of children identified as having developmental language impairment (Bercow, 2008), yet resource issues have meant that many speech and language therapy services have compromised provision in some way. Thus, efficient ways of identifying need and prioritizing intervention are required.…

  2. Technological Advances of Robot Assisted Polishing

    DEFF Research Database (Denmark)

    Lazarev, Ruslan; Top, Søren; Grønbæk, Jens

    The efficient polishing of surfaces is very important in the mould and die industry. Fine abrasive processes are widely used in industry for the first steps for the production of tools of high quality in terms of finishing accuracy, form and surface integrity. While manufacturing of most components....... In this study, the influence of polishing parameters and type of polishing media on fine abrasive surface finishing is investigated. The experimental study covers 2D rotational surfaces, which are widely used in the mould and die industry. Application of it is essential for process intelligent control, condition...... monitoring and quality inspection....

  3. Common data model for natural language processing based on two existing standard information models: CDA+GrAF. (United States)

    An increasing need for collaboration and resources sharing in the Natural Language Processing (NLP) research and development community motivates efforts to create and share a common data model and a common terminology for all information annotated and extracted from clinical text. We have combined two existing standards: the HL7 Clinical Document Architecture (CDA), and the ISO Graph Annotation Format (GrAF; in development), to develop such a data model entitled "CDA+GrAF". We experimented with several methods to combine these existing standards, and eventually selected a method wrapping separate CDA and GrAF parts in a common standoff annotation (i.e., separate from the annotated text) XML document. Two use cases, clinical document sections, and the 2010 i2b2/VA NLP Challenge (i.e., problems, tests, and treatments, with their assertions and relations), were used to create examples of such standoff annotation documents, and were successfully validated with the XML schemata provided with both standards. We developed a tool to automatically translate annotation documents from the 2010 i2b2/VA NLP Challenge format to GrAF, and automatically generated 50 annotation documents using this tool, all successfully validated. Finally, we adapted the XSL stylesheet provided with HL7 CDA to allow viewing annotation XML documents in a web browser, and plan to adapt existing tools for translating annotation documents between CDA+GrAF and the UIMA and GATE frameworks. This common data model may ease directly comparing NLP tools and applications, combining their output, transforming and "translating" annotations between different NLP applications, and eventually "plug-and-play" of different modules in NLP applications. Copyright © 2011 Elsevier Inc. All rights reserved.

  4. Identification of Long Bone Fractures in Radiology Reports Using Natural Language Processing to support Healthcare Quality Improvement. (United States)

    Grundmeier, Robert W; Masino, Aaron J; Casper, T Charles; Dean, Jonathan M; Bell, Jamie; Enriquez, Rene; Deakyne, Sara; Chamberlain, James M; Alpern, Elizabeth R


    Important information to support healthcare quality improvement is often recorded in free text documents such as radiology reports. Natural language processing (NLP) methods may help extract this information, but these methods have rarely been applied outside the research laboratories where they were developed. To implement and validate NLP tools to identify long bone fractures for pediatric emergency medicine quality improvement. Using freely available statistical software packages, we implemented NLP methods to identify long bone fractures from radiology reports. A sample of 1,000 radiology reports was used to construct three candidate classification models. A test set of 500 reports was used to validate the model performance. Blinded manual review of radiology reports by two independent physicians provided the reference standard. Each radiology report was segmented and word stem and bigram features were constructed. Common English "stop words" and rare features were excluded. We used 10-fold cross-validation to select optimal configuration parameters for each model. Accuracy, recall, precision and the F1 score were calculated. The final model was compared to the use of diagnosis codes for the identification of patients with long bone fractures. There were 329 unique word stems and 344 bigrams in the training documents. A support vector machine classifier with Gaussian kernel performed best on the test set with accuracy=0.958, recall=0.969, precision=0.940, and F1 score=0.954. Optimal parameters for this model were cost=4 and gamma=0.005. The three classification models that we tested all performed better than diagnosis codes in terms of accuracy, precision, and F1 score (diagnosis code accuracy=0.932, recall=0.960, precision=0.896, and F1 score=0.927). NLP methods using a corpus of 1,000 training documents accurately identified acute long bone fractures from radiology reports. Strategic use of straightforward NLP methods, implemented with freely available
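    As a quick check on the figures quoted in this abstract, the F1 score is the harmonic mean of precision and recall. The sketch below recomputes it from the reported precision and recall values; the function name `f1_score` is chosen here for illustration and is not taken from the study's software.

```python
def f1_score(precision: float, recall: float) -> float:
    """Return the harmonic mean of precision and recall."""
    if precision + recall == 0:
        return 0.0  # avoid division by zero when both inputs are zero
    return 2 * precision * recall / (precision + recall)


# Precision and recall as reported in the abstract above.
svm_f1 = f1_score(precision=0.940, recall=0.969)
diagnosis_code_f1 = f1_score(precision=0.896, recall=0.960)

print(round(svm_f1, 3))             # 0.954
print(round(diagnosis_code_f1, 3))  # 0.927
```

    Both results agree with the F1 scores stated in the abstract (0.954 for the support vector machine and 0.927 for diagnosis codes).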

  5. Development of a Natural Language Processing Engine to Generate Bladder Cancer Pathology Data for Health Services Research. (United States)

    Schroeck, Florian R; Patterson, Olga V; Alba, Patrick R; Pattison, Erik A; Seigne, John D; DuVall, Scott L; Robertson, Douglas J; Sirovich, Brenda; Goodney, Philip P


    To take the first step toward assembling population-based cohorts of patients with bladder cancer with longitudinal pathology data, we developed and validated a natural language processing (NLP) engine that abstracts pathology data from full-text pathology reports. Using 600 bladder pathology reports randomly selected from the Department of Veterans Affairs, we developed and validated an NLP engine to abstract data on histology, invasion (presence vs absence and depth), grade, the presence of muscularis propria, and the presence of carcinoma in situ. Our gold standard was based on an independent review of reports by 2 urologists, followed by adjudication. We assessed the NLP performance by calculating the accuracy, the positive predictive value, and the sensitivity. We subsequently applied the NLP engine to pathology reports from 10,725 patients with bladder cancer. When comparing the NLP output to the gold standard, NLP achieved the highest accuracy (0.98) for the presence vs the absence of carcinoma in situ. Accuracy for histology, invasion (presence vs absence), grade, and the presence of muscularis propria ranged from 0.83 to 0.96. The most challenging variable was depth of invasion (accuracy 0.68), with an acceptable positive predictive value for lamina propria (0.82) and for muscularis propria (0.87) invasion. The validated engine was capable of abstracting pathologic characteristics for 99% of the patients with bladder cancer. NLP had high accuracy for 5 of 6 variables and abstracted data for the vast majority of the patients. This now allows for the assembly of population-based cohorts with longitudinal pathology data. Published by Elsevier Inc.

  6. Improving performance of natural language processing part-of-speech tagging on clinical narratives through domain adaptation. (United States)

    Ferraro, Jeffrey P; Daumé, Hal; Duvall, Scott L; Chapman, Wendy W; Harkema, Henk; Haug, Peter J


    Natural language processing (NLP) tasks are commonly decomposed into subtasks, chained together to form processing pipelines. The residual error produced in these subtasks propagates, adversely affecting the end objectives. Limited availability of annotated clinical data remains a barrier to reaching state-of-the-art operating characteristics using statistically based NLP tools in the clinical domain. Here we explore the unique linguistic constructions of clinical texts and demonstrate the loss in operating characteristics when out-of-the-box part-of-speech (POS) tagging tools are applied to the clinical domain. We test a domain adaptation approach integrating a novel lexical-generation probability rule used in a transformation-based learner to boost POS performance on clinical narratives. Two target corpora from independent healthcare institutions were constructed from high frequency clinical narratives. Four leading POS taggers with their out-of-the-box models trained from general English and biomedical abstracts were evaluated against these clinical corpora. A high performing domain adaptation method, Easy Adapt, was compared to our newly proposed method ClinAdapt. The evaluated POS taggers drop in accuracy by 8.5-15% when tested on clinical narratives. The highest performing tagger reports an accuracy of 88.6%. Domain adaptation with Easy Adapt reports accuracies of 88.3-91.0% on clinical texts. ClinAdapt reports 93.2-93.9%. ClinAdapt successfully boosts POS tagging performance through domain adaptation requiring a modest amount of annotated clinical data. Improving the performance of critical NLP subtasks is expected to reduce pipeline error propagation leading to better overall results on complex processing tasks.
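    The abstract names Easy Adapt without describing it. As commonly described in Daumé's published work on feature augmentation, each feature vector is copied into a shared block plus a domain-specific block, so a standard learner can decide which features transfer across domains. The following is a minimal sketch under that assumption; the function name and the "source"/"target" labels are illustrative, and the sketch does not represent ClinAdapt, whose details are not given here.

```python
from typing import List


def easy_adapt(features: List[float], domain: str) -> List[float]:
    """Augment a feature vector into [shared, source-only, target-only] blocks.

    Assumption: "source" stands for the general-English or biomedical training
    data and "target" for the clinical narratives.
    """
    zeros = [0.0] * len(features)
    if domain == "source":
        return features + features + zeros
    if domain == "target":
        return features + zeros + features
    raise ValueError(f"unknown domain: {domain!r}")


# A two-feature example: the shared block is always filled,
# and only the block for the example's own domain is filled.
print(easy_adapt([1.0, 2.0], "source"))  # [1.0, 2.0, 1.0, 2.0, 0.0, 0.0]
print(easy_adapt([1.0, 2.0], "target"))  # [1.0, 2.0, 0.0, 0.0, 1.0, 2.0]
```

    Any classifier trained on the augmented vectors then learns separate weights for shared and domain-specific behaviour without changes to the learning algorithm itself.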

  7. The Common Alerting Protocol (CAP) and Emergency Data Exchange Language (EDXL) - Application in Early Warning Systems for Natural Hazard (United States)

    Lendholt, Matthias; Hammitzsch, Martin; Wächter, Joachim


    The Common Alerting Protocol (CAP) [1] is an XML-based data format for exchanging public warnings and emergencies between alerting technologies. In conjunction with the Emergency Data Exchange Language (EDXL) Distribution Element (-DE) [2] these data formats can be used for warning message dissemination in early warning systems for natural hazards. Application took place in the DEWS (Distance Early Warning System) [3] project where CAP serves as central message format containing both human readable warnings and structured data for automatic processing by message receivers. In particular the spatial reference capabilities are of paramount importance both in CAP and EDXL. Affected areas are addressable via geo codes like HASC (Hierarchical Administrative Subdivision Codes) [4] or UN/LOCODE [5] but also with arbitrary polygons that can be directly generated out of GML [6]. For each affected area standardized criticality values (urgency, severity and certainty) have to be set but also application specific key-value-pairs like estimated time of arrival or maximum inundation height can be specified. This enables - together with multilingualism, message aggregation and message conversion for different dissemination channels - the generation of user-specific tailored warning messages. [1] CAP, [2] EDXL-DE, [3] DEWS, [4] HASC, "Administrative Subdivisions of Countries: A Comprehensive World Reference, 1900 Through 1998" ISBN 0-7864-0729-8 [5] UN/LOCODE, [6] GML,
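    To make the message structure described above concrete, the sketch below assembles a small CAP alert with Python's standard library. It assumes the CAP 1.2 namespace and element names from the OASIS specification; the abstract does not say which CAP version DEWS used. The identifier, sender, polygon, geocode, and parameter values are all invented for illustration.

```python
import xml.etree.ElementTree as ET

CAP_NS = "urn:oasis:names:tc:emergency:cap:1.2"  # assumption: CAP 1.2
ET.register_namespace("", CAP_NS)


def add(parent: ET.Element, tag: str, text: str = None) -> ET.Element:
    """Append a CAP-namespaced child element, optionally with text."""
    child = ET.SubElement(parent, f"{{{CAP_NS}}}{tag}")
    if text is not None:
        child.text = text
    return child


def build_alert() -> ET.Element:
    """Build a minimal alert; every value below is a made-up example."""
    alert = ET.Element(f"{{{CAP_NS}}}alert")
    add(alert, "identifier", "example-alert-0001")
    add(alert, "sender", "warning-centre@example.org")
    add(alert, "sent", "2010-01-01T00:00:00+00:00")
    add(alert, "status", "Exercise")
    add(alert, "msgType", "Alert")
    add(alert, "scope", "Public")

    info = add(alert, "info")
    add(info, "category", "Geo")
    add(info, "event", "Tsunami")
    # The three standardized criticality values named in the abstract.
    add(info, "urgency", "Immediate")
    add(info, "severity", "Extreme")
    add(info, "certainty", "Observed")
    # Application-specific key-value pair (hypothetical name and value).
    parameter = add(info, "parameter")
    add(parameter, "valueName", "estimatedTimeOfArrival")
    add(parameter, "value", "2010-01-01T00:45:00+00:00")

    area = add(info, "area")
    add(area, "areaDesc", "Example coastal district")
    # Closed polygon of "lat,lon" pairs; first and last points coincide.
    add(area, "polygon", "5.0,95.0 5.0,96.0 6.0,96.0 5.0,95.0")
    geocode = add(area, "geocode")
    add(geocode, "valueName", "HASC")
    add(geocode, "value", "ID.AC")  # illustrative HASC-style code
    return alert


print(ET.tostring(build_alert(), encoding="unicode"))
```

    A receiver can read the structured fields for automatic processing while the human-readable parts of the same message are shown to people, which is the dual role the abstract assigns to CAP.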

  8. Scaling properties of Polish rain series (United States)

    Licznar, P.


    Scaling properties as well as multifractal nature of precipitation time series have not been studied for local Polish conditions until recently due to lack of long series of high-resolution data. The first Polish study of precipitation time series scaling phenomena was made on the basis of pluviograph data from the Wroclaw University of Environmental and Life Sciences meteorological station located in the south-western part of the country. The 38 annual rainfall records from years 1962-2004 were converted into digital format and transformed into a standard format of 5-minute time series. The scaling properties and multifractal character of this material were studied by means of several different techniques: power spectral density analysis, functional box-counting, probability distribution/multiple scaling and trace moment methods. The result proved the general scaling character of time series over time scales ranging from 5 minutes up to at least 24 hours. At the same time some characteristic breaks in scaling behavior were recognized. It is believed that the breaks were artificial and arose from the pluviograph rain gauge measuring precision limitations. Especially strong limitations in the precision of recording low-intensity precipitation by the pluviograph rain gauge were found to be the main reason for the artificial break in the energy spectra, as was reported by other authors before. The analysis of co-dimension and moments scaling functions showed the signs of the first-order multifractal phase transition. Such behavior is typical for dressed multifractal processes that are observed by spatial or temporal averaging on scales larger than the inner-scale of those processes. The fractal dimension of rainfall process support derived from codimension and moments scaling functions geometry analysis was found to be 0.45. The same fractal dimension estimated by means of the functional box-counting method was equal to 0.58. At the final part of the study
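    The box-counting idea mentioned in this abstract can be illustrated on a one-dimensional time series: count how many boxes of a given length contain at least one value above a threshold, then estimate the dimension as the slope of log(count) against log(1/box length). The sketch below is a simplified illustration in that spirit, not the study's procedure; the function names, the box sizes, and the toy series are assumptions made here.

```python
import math
from typing import List, Sequence


def box_count(series: Sequence[float], box: int, threshold: float = 0.0) -> int:
    """Count boxes of `box` time steps holding at least one value above threshold."""
    return sum(
        1
        for start in range(0, len(series), box)
        if any(value > threshold for value in series[start:start + box])
    )


def box_dimension(series: Sequence[float], boxes: List[int],
                  threshold: float = 0.0) -> float:
    """Least-squares slope of log(count) versus log(1 / box size)."""
    counts = [box_count(series, box, threshold) for box in boxes]
    if min(counts) == 0:
        raise ValueError("no values exceed the threshold")
    xs = [math.log(1.0 / box) for box in boxes]
    ys = [math.log(count) for count in counts]
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    return sxy / sxx


box_sizes = [1, 2, 4, 8, 16]          # in units of one time step
always_wet = [1.0] * 256              # rain in every time step
single_burst = [0.0] * 255 + [1.0]    # rain in one time step only

print(round(box_dimension(always_wet, box_sizes), 6))    # 1.0
print(round(box_dimension(single_burst, box_sizes), 6))  # 0.0
```

    The two toy series mark the extremes: a series that is wet everywhere fills the time axis (dimension 1), while a single wet step behaves like a point (dimension 0). Values such as the 0.45 and 0.58 reported in the abstract lie between these limits.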

  9. Automated identification of wound information in clinical notes of patients with heart diseases: Developing and validating a natural language processing application. (United States)

    Topaz, Maxim; Lai, Kenneth; Dowding, Dawn; Lei, Victor J; Zisberg, Anna; Bowles, Kathryn H; Zhou, Li


    Electronic health records are being increasingly used by nurses with up to 80% of the health data recorded as free text. However, only a few studies have developed nursing-relevant tools that help busy clinicians to identify information they need at the point of care. This study developed and validated one of the first automated natural language processing applications to extract wound information (wound type, pressure ulcer stage, wound size, anatomic location, and wound treatment) from free text clinical notes. First, two human annotators manually reviewed a purposeful training sample (n=360) and random test sample (n=1100) of clinical notes (including 50% discharge summaries and 50% outpatient notes), identified wound cases, and created a gold standard dataset. We then trained and tested our natural language processing system (known as MTERMS) to process the wound information. Finally, we assessed our automated approach by comparing system-generated findings against the gold standard. We also compared the prevalence of wound cases identified from free-text data with coded diagnoses in the structured data. The testing dataset included 101 notes (9.2%) with wound information. The overall system performance was good (F-measure, a composite measure of the system's accuracy, was 92.7%), with best results for wound treatment (F-measure=95.7%) and poorest results for wound size (F-measure=81.9%). Only 46.5% of wound notes had a structured code for a wound diagnosis. The natural language processing system achieved good performance on a subset of randomly selected discharge summaries and outpatient notes. In more than half of the wound notes, there were no coded wound diagnoses, which highlights the significance of using natural language processing to enrich clinical decision making. Our future steps will include expansion of the application's information coverage to other relevant wound factors and validation of the model with external data. Copyright © 2016 Elsevier Ltd. All

  10. Jewish problem in the Polish Communist Party

    Directory of Open Access Journals (Sweden)

    Cimek Henryk


    Full Text Available Jews accounted for approx. 8-10% of the population of the Second Republic, and in the communist movement (Polish Communist Party and Polish Communist Youth Union) the rate was approx. 30%, while in subsequent years it fluctuated considerably. The percentage of Jews was the highest in the authorities of the party and in the KZMP. This had a negative impact on the position of the KPP on many issues, especially in its relation to the Second Republic.

  11. Electrolytic polishing system for space age materials

    International Nuclear Information System (INIS)

    Coons, W.C.; Iosty, L.R.


    A simple electrolytic polishing technique was developed for preparing Cr, Co, Hf, Mo, Ni, Re, Ti, V, Zr, and their alloys for structural analysis on the optical microscope. The base electrolyte contains 5 g ZnCl2 and 15 g AlCl3·6H2O in 200 ml methyl alcohol, plus an amount of H2SO4 depending on the metal being polished. Five etchants are listed.

  12. Trace element analysis of nail polishes

    International Nuclear Information System (INIS)

    Misra, G.; Mittal, V.K.; Sahota, H.S.


    Instrumental neutron activation analysis (INAA) technique was used to measure the concentrations of various trace elements in nail polishes of popular Indian and foreign brands. The aim of the present experiment was to see whether trace elements could distinguish nail polishes of different Indian and foreign brands from a forensic point of view. It was found that cesium can act as a marker to differentiate foreign and Indian brands. (author)

  13. Experimental Polish-Lithuanian Corpus with the Semantic Annotation Elements

    Directory of Open Access Journals (Sweden)

    Full Text Available In the article the authors present the experimental Polish-Lithuanian corpus (ECorpPL-LT) formed for the idea of Polish-Lithuanian theoretical contrastive studies, a Polish-Lithuanian electronic dictionary, and as help for a sworn translator. The semantic annotation being brought into ECorpPL-LT is extremely useful in Polish-Lithuanian contrastive studies, and also proves helpful in translation work.

  14. Development of a user friendly interface for database querying in natural language by using concepts and means related to artificial intelligence

    International Nuclear Information System (INIS)

    Pujo, Pascal


    This research thesis reports the development of a user-friendly interface in natural language for querying a relational database. The developed system differs from usual approaches in its integrated architecture, as the relational model management is totally controlled by the interface. The author first addresses the way to store data in order to make them accessible through an interface in natural language, and more precisely to store data with an organisation which would result in the fewest possible constraints in query formulation. The author then briefly presents techniques related to automatic processing in natural language, and discusses the implications for better user-friendliness and for error processing. The next part reports the study of the developed interface: selection of data processing tools, interface development, data management at the interface level, information input by the user. The last chapter proposes an overview of possible evolutions for the interface: use of deductive functionalities, use of an extensional base and of an intensional base to deduce facts from knowledge stored in the extensional base, and handling of complex objects [fr

  15. A corpus of full-text journal articles is a robust evaluation tool for revealing differences in performance of biomedical natural language processing tools. (United States)

    Verspoor, Karin; Cohen, Kevin Bretonnel; Lanfranchi, Arrick; Warner, Colin; Johnson, Helen L; Roeder, Christophe; Choi, Jinho D; Funk, Christopher; Malenkiy, Yuriy; Eckert, Miriam; Xue, Nianwen; Baumgartner, William A; Bada, Michael; Palmer, Martha; Hunter, Lawrence E


    We introduce the linguistic annotation of a corpus of 97 full-text biomedical publications, known as the Colorado Richly Annotated Full Text (CRAFT) corpus. We further assess the performance of existing tools for performing sentence splitting, tokenization, syntactic parsing, and named entity recognition on this corpus. Many biomedical natural language processing systems demonstrated large differences between their previously published results and their performance on the CRAFT corpus when tested with the publicly available models or rule sets. Trainable systems differed widely with respect to their ability to build high-performing models based on this data. The finding that some systems were able to train high-performing models based on this corpus is additional evidence, beyond high inter-annotator agreement, that the quality of the CRAFT corpus is high. The overall poor performance of various systems indicates that considerable work needs to be done to enable natural language processing systems to work well when the input is full-text journal articles. The CRAFT corpus provides a valuable resource to the biomedical natural language processing community for evaluation and training of new models for biomedical full text publications.

  16. Laser polishing of additive manufactured Ti alloys (United States)

    Ma, C. P.; Guan, Y. C.; Zhou, W.


    Laser-based additive manufacturing has attracted much attention as a promising 3D printing method for metallic components in recent years. However, surface roughness of additive manufactured components has been considered as a challenge to achieve high performance. In this work, we demonstrate the capability of fiber laser in polishing rough surface of additive manufactured Ti-based alloys such as Ti-6Al-4V and TC11. Both as-received surface and laser-polished surfaces as well as cross-section subsurfaces were analyzed carefully by White-Light Interference, Confocal Microscope, Focused Ion Beam, Scanning Electron Microscopy, Energy Dispersive Spectrometer, and X-ray Diffraction. Results revealed that the surface roughness of as-received Ti-based alloys, initially more than 5 μm, could be reduced to less than 1 μm through the laser polishing process. Moreover, microstructure, microhardness and wear resistance of the laser-polished zone were investigated in order to examine the thermal effect of laser polishing processing on the substrate of additive manufactured Ti alloys. This proof-of-concept process has the potential to effectively improve the surface roughness of additive manufactured metallic alloy by local polishing method without damage to the substrate.

  17. Conformal polishing approach: Tool footprint analysis

    Directory of Open Access Journals (Sweden)

    José A Dieste


    Full Text Available The polishing process is one of the most critical manufacturing processes in metal part production because it determines the final quality of the product. Free-form surface polishing is a manual process with many rejected parts, scrap generation, and high time and energy consumption. Two different research lines are being developed: prediction models of the final surface quality parameters and an analysis of the amount of material removed depending on the polishing parameters to predict the tool footprint during the polishing task. This research lays the foundations for a future automatic conformal polishing system. It is based on a rotational and translational tool with dry abrasive at the front, mounted at the end of a robot. A tool-to-part concept is used, which is useful for large or heavy workpieces. Results are applied to different curved parts typically used in the tooling, aeronautics and automotive industries. A mathematical model has been developed to predict the amount of material removed as a function of polishing parameters. The model has been fitted for different abrasives and raw materials. Results show deviations under 20%, which implies a reliable and controllable process. Smaller amounts of material can be removed in controlled areas of a three-dimensional workpiece.

  18. Polishing of silicon based advanced ceramics (United States)

    Klocke, Fritz; Dambon, Olaf; Zunke, Richard; Waechter, D.


    Silicon based advanced ceramics show advantages in comparison to other materials due to their extreme hardness, wear and creep resistance, low density and low coefficient of thermal expansion. As a matter of course, machining requires high efforts. In order to reach the low roughness demanded for optical or tribological applications a defect free surface is indispensable. In this paper, polishing of silicon nitride and silicon carbide is investigated. The objective is to elaborate scientific understanding of the process interactions. Based on this knowledge, the optimization of removal rate, surface quality and form accuracy can be realized. For this purpose, fundamental investigations of polishing silicon based ceramics are undertaken and evaluated. Former scientific publications discuss removal mechanisms and wear behavior, but the scientific insight is mainly based on investigations in grinding and lapping. The removal mechanisms in polishing are not fully understood due to complexity of interactions. The role of, e.g., process parameters, slurry and abrasives, and their influence on the output parameters is still uncertain. Extensive technological investigations demonstrate the influence of the polishing system and the machining parameters on the stability and the reproducibility. It is shown that the interactions between the advanced ceramics and the polishing systems are of great relevance. Depending on the kind of slurry and polishing agent the material removal mechanisms differ. The observed effects can be explained by dominating mechanical or chemo-mechanical removal mechanisms. Therefore, hypotheses to state adequate explanations are presented and validated by advanced metrology devices, such as SEM, AFM and TEM.

  19. A case of "order insensitivity"? Natural and artificial language processing in a man with primary progressive aphasia.


    Zimmerer, V. C.; Varley, R. A.


    Processing of linear word order (linear configuration) is important for virtually all languages and essential to languages such as English which have little functional morphology. Damage to systems underpinning configurational processing may specifically affect word-order reliant sentence structures. We explore order processing in WR, a man with primary progressive aphasia (PPA). In a previous report, we showed how WR showed impaired processing of actives, which rely strongly on word order, b...

  20. Web 2.0-based crowdsourcing for high-quality gold standard development in clinical natural language processing. (United States)

    Zhai, Haijun; Lingren, Todd; Deleger, Louise; Li, Qi; Kaiser, Megan; Stoutenborough, Laura; Solti, Imre


    A high-quality gold standard is vital for supervised, machine learning-based, clinical natural language processing (NLP) systems. In clinical NLP projects, expert annotators traditionally create the gold standard. However, traditional annotation is expensive and time-consuming. To reduce the cost of annotation, general NLP projects have turned to crowdsourcing based on Web 2.0 technology, which involves submitting smaller subtasks to a coordinated marketplace of workers on the Internet. Many studies have been conducted in the area of crowdsourcing, but only a few have focused on tasks in the general NLP field and only a handful in the biomedical domain, usually based upon very small pilot sample sizes. In addition, the quality of the crowdsourced biomedical NLP corpora was never exceptional when compared to traditionally-developed gold standards. The previously reported results on a medical named entity annotation task showed a 0.68 F-measure based agreement between crowdsourced and traditionally-developed corpora. Building upon previous work from the general crowdsourcing research, this study investigated the usability of crowdsourcing in the clinical NLP domain with special emphasis on achieving high agreement between crowdsourced and traditionally-developed corpora. To build the gold standard for evaluating the crowdsourcing workers' performance, 1042 clinical trial announcements (CTAs) from the website were randomly selected and double annotated for medication names, medication types, and linked attributes. For the experiments, we used CrowdFlower, an Amazon Mechanical Turk-based crowdsourcing platform. We calculated sensitivity, precision, and F-measure to evaluate the quality of the crowd's work and tested the statistical significance (Pcrowdsourced and traditionally-developed annotations. The agreement between the crowd's annotations and the traditionally-generated corpora was high for: (1) annotations (0.87, F-measure for medication names

    Directory of Open Access Journals (Sweden)

    Andrea F. De Carlo


    Full Text Available The article analyses the critical voices raised against the young poets and artists who promoted Futurism in Poland during the first half of the twentieth century. Futurist manifestos influenced the new Polish poetry, stimulating a lively debate among intellectuals of the calibre of Stefan Żeromski and Karol Irzykowski. In general, the coeval criticism of Polish Futurism focused on three main points: the lack of originality and servile imitation of foreign literary models; the repudiation of the past and national traditions; Futurism as an expression of ideologies such as Fascism in Italy and Bolshevism in Russia. In this article, specific attention is devoted to an analysis of the essay Snobizm i postęp (Snobbery and Progress, 1923) by Żeromski. The writer, criticising Polish imitators of Russian Futurism, affirmed that Polish literature and culture, in the context of national reconstruction after three partitions of Poland, needed to maintain its natural connection with the past and at the same time, without losing its national nature, to weave some universal suggestions into the plot of purely Polish themes. The goal of this article is to reveal that Żeromski and Irzykowski's critical stance towards the Polish Futurists, which influenced the critics of the next generation, was dictated by a shallow analysis of Futuristic works and by their inability to understand Futuristic efforts to modernise Polish art and literature.

  2. Evaluation of the effect of polishing on flexural strength of feldspathic porcelain and its comparison with autoglazing and over glazing

    Directory of Open Access Journals (Sweden)

    Jalali H.


    Full Text Available Statement of Problem: Ceramic restorations are popular because they can provide the most natural replacement for teeth. However, the brittleness of ceramics is a primary disadvantage. There are various methods for strengthening ceramics such as metal framework, ceramic cores, and surface strengthening mechanisms through glazing, work hardening and ion exchange. Purpose: The purpose of this study was to evaluate the effect of polish on flexural strength of feldspathic porcelain and to compare it with overglaze and autoglaze. Materials and Methods: In this experimental study, one brand of feldspathic porcelain (colorlogic, Ceramco) was used and forty bars (25×6×3 mm) were prepared according to ISO 6872 and ADA No. 69. The specimens were randomly divided into four groups: overglazed, autoglazed, fine polish and coarse polish (clinic polish). Flexural strength of each specimen was determined by three point bending test (Universal Testing Machine, Zwick 1494, Germany). Collected data was analyzed by ANOVA and post-hoc test with P<0.05 as the limit of significance. Results: A significant difference was observed among the studied groups (P<0.0001). According to the post-hoc test, flexural strength in the overglaze and fine polish groups was significantly higher than in the clinic polish and autoglaze groups (P<0.001). Although the mean value for the overglazed group was higher than for the fine polish group, this was not statistically significant (P=0.9). Also, no statistical difference was seen between the autoglazed and coarse polish groups (P=0.2). Conclusion: Based on the findings of this study, flexural strength achieved by fine polish (used in this study) can compete with overglazing of feldspathic porcelains. It can also be concluded that a final finishing procedure that involves fine polishing may be preferred to simple staining followed by self-glazing.

  3. Polish Listening SPAN: A new tool for measuring verbal working memory

    Directory of Open Access Journals (Sweden)

    Katarzyna Zychowicz


    Full Text Available Individual differences in second language acquisition (SLA) encompass differences in working memory capacity, which is believed to be one of the most crucial factors influencing language learning. However, in Poland research on the role of working memory in SLA is scarce due to a lack of proper Polish instruments for measuring this construct. The purpose of this paper is to discuss the process of construction and validation of the Polish Listening Span (PLSPAN) as a tool intended to measure verbal working memory of adults. The article presents the requisite theoretical background as well as information about the PLSPAN, that is, the structure of the test, the scoring procedures and the steps taken with the aim of validating it.

  4. Five Martyr Brothers. First Polish hermits and their worship

    Directory of Open Access Journals (Sweden)

    Kinga Blaschke


    Full Text Available Brothers Benedict and John, students of Romuald, came to Poland at the invitation of Otto III to convert pagans. Soon the Italian hermits were joined by Polish brothers Isaac and Matthew, who helped them in learning the Slavic language. The hermits, as well as Christinus, were killed in 1003 by thugs who wanted to steal money given by Duke Boleslav for an expedition to Rome, which was aimed at obtaining papal consent for conducting missionary work. Although the hermits died as victims of a robbery, killed by fellow Christians, the pope canonized them as martyrs. Their lives are relatively well-documented: the earliest and most credible story of the five brothers, by Bruno of Querfurt, was written as early as five years after their death, although it remained unknown until 1883. Another early account is the life of St. Romuald by Piotr Damiani of 1041. The martyrs have also been associated with yet another mysterious work – a gravestone unearthed in 1959 at the external wall of the north Roman apse of the Gniezno Cathedral, considered by most researchers to be the oldest epigraphic item on Polish soil. However, the identification of the warriors mentioned in the inscription with the 11th century martyrs raises many doubts. The article discusses the above matters, as well as the development of the worship of the martyr brothers.

  5. Circular motion and Polish Doughnuts in NUT spacetime (United States)

    Jefremov, Paul I.

    The astrophysical relevance of the NUT spacetime(s) is a matter of debate due to pathological properties exhibited by this solution. However, if it is realised in nature, then we should look for the characteristic imprints of it on possible observations. One of the major sources of data on black hole astrophysics is the accretion process. Using a simple but fully analytical "Polish Doughnuts" model of an accretion disk, one gets both qualitative and quantitative differences from the Kerr spacetime produced by the presence of the gravitomagnetic charge. The present paper is based on our work Jefremov & Perlick (2016).

  6. Specificity of Geotechnical Measurements and Practice of Polish Offshore Operations

    Directory of Open Access Journals (Sweden)

    Bogumil Laczynski


    Full Text Available As the offshore market in Europe grows ever faster, new sea areas are being managed and new ideas on how to use the sea's potential are being developed. In the North Sea, where the offshore industry has been expanding intensively since the late 1960s, numerous wind farms, oil and gas platforms and pipelines have been put into operation following extensive research, including geotechnical measurements. Recently, a great number of similar projects have been under development in the Baltic Sea, inter alia in the Polish EEZ, whose natural conditions differ significantly from those of the North Sea. In this paper, those differences are described together with some solutions to the problems arising from them.

  7. Polish migrant youth in Scottish schools : conflicted identity and family capital.


    Moskal, M.


    The perspectives of migrant children and young people have been largely omitted in youth studies. Existing literature focuses predominantly on young people born to migrant parents in the host country, while the problems of first generation of migrant youth have received limited attention. This paper focuses on first-generation Polish migrants and their experiences in relation to school transition, new language learning and the changing family relationships in the new social environment. It dr...

  8. Smoking characteristics of Polish immigrants in Dublin.

    Kabir, Zubair


    BACKGROUND: This study examined two main hypotheses: (a) Polish immigrants' smoking estimates are greater than their Irish counterparts; (b) Polish immigrants purchasing cigarettes from Poland smoke "heavier" (≥ 20 cigarettes a day) when compared to those purchasing cigarettes from Ireland. The study also set out to identify significant predictors of 'current' smoking (some days and everyday) among the Polish immigrants. METHODS: Dublin residents of Polish origin (n = 1,545) completed a previously validated Polish questionnaire in response to an advertisement in a local Polish lifestyle magazine over 5 weekends (July-August, 2007). The Office of Tobacco Control telephone-based monthly survey data were analyzed for the Irish population in Dublin for the same period (n = 484). RESULTS: Age-sex adjusted smoking estimates were: 47.6% (95% Confidence Interval [CI]: 47.3%; 48.0%) among the Poles and 27.8% (95% CI: 27.2%; 28.4%) among the general Irish population (p < 0.001). Of the 57% of smokers (n = 345/606) who purchased cigarettes solely from Poland and the 33% (n = 198/606) who purchased only from Ireland, 42.6% (n = 147/345) and 41.4% (n = 82/198) were "heavy" smokers, respectively (p = 0.79). Employment (Odds Ratio [OR]: 2.89; 95% CI: 1.25-6.69), lower education (OR: 3.76; 95% CI: 2.46-5.74), and a longer stay in Ireland (>24 months) were significant predictors of current smoking among the Poles. An objective validation of the self-reported smoking history of a randomly selected sub-sample immigrant group, using expired carbon monoxide (CO) measurements, showed a highly significant correlation coefficient (r = 0.64) of expired CO levels with the reported number of cigarettes consumed (p < 0.0001). CONCLUSION: Polish immigrants' smoking estimates are higher than their Irish counterparts, and particularly if employed, with only primary-level education, and are overseas >2 years.
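    The heavy-smoker comparison in the RESULTS above can be checked from the reported counts alone. The sketch below recomputes the two proportions (147/345 and 82/198) and applies a pooled two-proportion z-test. The abstract does not say which test the authors used, so the choice of test and the function name `two_proportion_z_test` are assumptions made for illustration only.

```python
import math


def two_proportion_z_test(x1, n1, x2, n2):
    """Pooled two-sided z-test for the difference between two proportions.

    Assumption: the abstract does not name the test behind p = 0.79;
    this is one common choice, used here only as a plausibility check.
    """
    p1, p2 = x1 / n1, x2 / n2
    pooled = (x1 + x2) / (n1 + n2)
    standard_error = math.sqrt(pooled * (1 - pooled) * (1 / n1 + 1 / n2))
    z = (p1 - p2) / standard_error
    # Two-sided p-value from the standard normal distribution.
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return p1, p2, z, p_value


# Counts reported in the abstract: heavy smokers among those buying
# cigarettes solely from Poland (147 of 345) and only from Ireland (82 of 198).
heavy_poland, heavy_ireland, z, p_value = two_proportion_z_test(147, 345, 82, 198)

print(f"Poland:  {heavy_poland:.1%}")
print(f"Ireland: {heavy_ireland:.1%}")
print(f"z = {z:.2f}, two-sided p = {p_value:.2f}")
```

    Under these assumptions the recomputed shares round to the reported 42.6% and 41.4%, and the p-value lands close to the reported 0.79, which is consistent with the abstract's finding of no significant difference between the two purchasing groups.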

  9. Four Riders of the Apocalypse of the Polish Bureaucracy

    Directory of Open Access Journals (Sweden)

    Kieżun Witold


    Full Text Available This article was originally published in Polish as: Witold Kieżun, Czterej jeźdźcy apokalipsy polskiej biurokracji, Kultura, No. 3/630, Paris 2000. The Literary Institute (publisher of the “Kultura” monthly) expressed its interest in and consent to the publication of an English-language version of this article. Professor Witold Kieżun also gave his consent to the translation of his text and its posting in the pages of the journal Foundations of Management. The text was written in 1999 and concerns the reform of the administrative division of Poland introduced as of January 1 of that year, which involved, among other things, a three-stage structure of territorial division and the introduction of counties as administrative units (editorial note).

  10. Introducing a gender-neutral pronoun in a natural gender language: the influence of time on attitudes and behavior. (United States)

    Gustafsson Sendén, Marie; Bäck, Emma A; Lindqvist, Anna


    The implementation of gender-fair language is often associated with negative reactions and hostile attacks on people who propose a change. This was also the case in Sweden in 2012 when a third gender-neutral pronoun hen was proposed as an addition to the already existing Swedish pronouns for she (hon) and he (han). The pronoun hen can be used both generically, when gender is unknown or irrelevant, and as a transgender pronoun for people who categorize themselves outside the gender dichotomy. In this article we review the process from 2012 to 2015. No other language has so far added a third gender-neutral pronoun, existing in parallel with two gendered pronouns, that has actually reached the broader population of language users. This makes the situation in Sweden unique. We present data on attitudes toward hen during the past 4 years and analyze how time is associated with the attitudes in the process of introducing hen to the Swedish language. In 2012 the majority of the Swedish population was negative to the word, but already in 2014 there was a significant shift to more positive attitudes. Time was one of the strongest predictors of attitudes, even when other relevant factors were controlled for. The actual use of the word also increased, although to a lesser extent than the attitudes shifted. We conclude that new words challenging the binary gender system evoke hostile and negative reactions, but also that attitudes can normalize rather quickly. We see this finding as very positive and hope it could motivate language amendments and initiatives for gender-fair language, although the first responses may be negative.

  11. Comparison Between Manual Auditing and a Natural Language Process With Machine Learning Algorithm to Evaluate Faculty Use of Standardized Reports in Radiology. (United States)

    Guimaraes, Carolina V; Grzeszczuk, Robert; Bisset, George S; Donnelly, Lane F


    When implementing or monitoring department-sanctioned standardized radiology reports, feedback about individual faculty performance has been shown to be a useful driver of faculty compliance. Most commonly, these data are derived from manual audit, which can be both time-consuming and subject to sampling error. The purpose of this study was to evaluate whether a software program using natural language processing and machine learning could accurately audit radiologist compliance with the use of standardized reports compared with performed manual audits. Radiology reports from a 1-month period were loaded into such a software program, and faculty compliance with use of standardized reports was calculated. For that same period, manual audits were performed (25 reports audited for each of 42 faculty members). The mean compliance rates calculated by automated auditing were then compared with the confidence interval of the mean rate by manual audit. The mean compliance rate for use of standardized reports as determined by manual audit was 91.2% with a confidence interval between 89.3% and 92.8%. The mean compliance rate calculated by automated auditing was 92.0%, within that confidence interval. This study shows that by use of natural language processing and machine learning algorithms, an automated analysis can accurately define whether reports are compliant with use of standardized report templates and language, compared with manual audits. This may avoid significant labor costs related to conducting the manual auditing process. Copyright © 2017 American College of Radiology. Published by Elsevier Inc. All rights reserved.
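    The comparison reported above reduces to a simple interval check: the automated mean compliance rate is accepted if it falls inside the confidence interval of the manual audit. A minimal sketch using only the figures quoted in the abstract follows; the function name `within_interval` is hypothetical and this is not the authors' software.

```python
def within_interval(value, lower, upper):
    """Return True when value lies inside the closed interval [lower, upper]."""
    return lower <= value <= upper


# Figures quoted in the abstract (percentages).
manual_ci = (89.3, 92.8)   # confidence interval of the manual-audit mean
automated_mean = 92.0      # mean compliance rate from automated auditing

# Size of the manual audit: 25 reports for each of 42 faculty members.
reports_audited = 25 * 42

agrees = within_interval(automated_mean, *manual_ci)
print(f"Manually audited reports: {reports_audited}")
print(f"Automated mean inside manual CI: {agrees}")
```

    The check only shows that the two mean rates are compatible at the level reported; it says nothing about agreement on individual reports, which the abstract does not describe.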

  12. Adaptation and Validation of the ADOS-2, Polish Version

    Izabela Chojnicka


    Full Text Available The Autism Diagnostic Observation Schedule (ADOS) is one of the most popular instruments used worldwide in the diagnosis of autism spectrum disorders (ASD). Unfortunately, there are only a few studies of the psychometric properties of non-English language versions of this instrument and none of the adaptation of its second edition (ADOS-2). The objective of this study was to verify the psychometric properties of the Polish version of the Autism Diagnostic Observation Schedule, Second Edition (ADOS-2-PL). The authors recruited 401 participants: 193 with ASDs (ASD group) and 78 with non-spectrum disorders, plus 130 typically developing participants (control group). ADOS-2-PL was found to have high interrater reliability, internal consistency and test–retest reliability. Confirmatory factor analysis confirmed a good fit of the Polish data to the two-factor model of ADOS-2. As no significant differences were found between participants with childhood autism and other ASDs, only one cut-off was established for Modules 1–4. The sensitivity, specificity and positive predictive value of ADOS-2-PL are high: sensitivity was over 90% (except for the “Older with some words” algorithm in the Toddler Module, where sensitivity was 71%, and the “Aged 5 years or older” algorithm in Module 2, where sensitivity was 84%), and specificity was above 80% (with the exception of Module 4 and the Module 2 “Aged 5 years or older” algorithm, where it was above 70%). The results support the use of ADOS-2-PL in clinical practice and scientific research. To the best of our knowledge, there have been no reports to date about adaptations of ADOS-2 and the psychometric properties of non-English language versions. As such, this constitutes the first attempt at adapting ADOS-2, and its results could be of interest for researchers outside of Poland.

  13. APS 3D: a new benchmark in aspherical polishing (United States)

    Gauch, Daniel; Mikulic, Dalibor; Veit, Christian


    The APS 3D system performs polishing and form correction in one step in order to reduce overall process time, reduce the number of polishing steps required and eliminate the need for highly skilled operators while providing a repeatable polishing process. This new 3D Polishing system yields better surface quality, and a better slope error, automatically determining the optimum speeds, feed rates and polish pressures to achieve a deterministic process based on the required quality parameters input by the operator. The process flow is always the same to ensure consistent quality and target quality values are defined before polishing begins.

  14. Paying Attention to Attention Allocation in Second-Language Learning: Some Insights into the Nature of Linguistic Thresholds. (United States)

    Hawson, Anne


    Three threshold hypotheses proposed by Cummins (1976) and Diaz (1985) as explanations of data on the cognitive consequences of bilingualism are examined in depth and compared to one another. A neuroscientifically updated information-processing perspective on the interaction of second-language comprehension and visual-processing ability is…

  15. The formation of the polish opposite movement in Western Ukraine at the beginning of the Second world war

    Viktoriya V. Dashko


    Full Text Available The article highlights the nature of the Soviet totalitarian ethnic policy and its influence on the origin of the Polish opposition movement in Western Ukraine at the beginning of the Second World War. It also clarifies the main factors in the formation of active opposition movements among the Polish part of the population in the territory of Western Ukraine, which passed to the Soviet Union following the partition of Poland under the Molotov-Ribbentrop Pact. The author defines the category of persons of Polish nationality who were dissatisfied with Stalin’s repressive policies in 1939-1941 and became the social environment in which a movement opposed to totalitarianism, and in particular to the regime's policies directed against the Polish ethnic group, finally formed, and from which the activists of this movement later emerged. The author analyzes the activity of the Soviet authorities in the occupied territories of the former Eastern Borderlands («Kresy Wschodnie») of the Second Polish Republic and determines the main factors that led to dissatisfaction with the rigid Soviet policy towards former government officials, military settlers and servants of the Roman Catholic Church. The article investigates and determines the features of the formation of the Polish opposition movement in the former eastern Polish territories occupied in 1939 by the Soviet Union and seized in 1941 by Nazi Germany. It also describes the origin and activity of the first underground Polish armed forces on Ukrainian territory.

  16. Surface roughness and morphology of dental nanocomposites polished by four different procedures evaluated by a multifractal approach

    Energy Technology Data Exchange (ETDEWEB)

    Ţălu, Ştefan [Technical University of Cluj-Napoca, Faculty of Mechanical Engineering, Department of AET, Discipline of Descriptive Geometry and Engineering Graphics, 103-105 B-dul Muncii St., Cluj-Napoca 400641, Cluj (Romania)]; Stach, Sebastian [University of Silesia, Faculty of Computer Science and Materials Science, Institute of Informatics, Department of Biomedical Computer Systems, Będzińska 39, 41-205 Sosnowiec (Poland)]; Lainović, Tijana [University of Novi Sad, Faculty of Medicine, School of Dentistry, Hajduk Veljkova 3, 21000 Novi Sad (Serbia)]; Vilotić, Marko [University of Novi Sad, Faculty of Technical Sciences, Department for Production Engineering, Trg Dositeja Obradovića 6, 21000 Novi Sad (Serbia)]; Blažić, Larisa [University of Novi Sad, Faculty of Medicine, School of Dentistry, Clinic of Dentistry of Vojvodina, Department of Restorative Dentistry and Endodontics, Hajduk Veljkova 3, 21000 Novi Sad (Serbia)]; Alb, Sandu Florin [“Iuliu Haţieganu” University of Medicine and Pharmacy, Faculty of Dentistry, Department of Periodontology, 8 Victor Babeş St., 400012 Cluj-Napoca (Romania)]; Kakaš, Damir [University of Novi Sad, Faculty of Technical Sciences, Department for Production Engineering, Trg Dositeja Obradovića 6, 21000 Novi Sad (Serbia)]


    Graphical abstract: - Highlights: • Multifractals are good indicators of polished dental composites 3-D surface structure. • The nanofilled composite had superior 3-D surface properties than the nanohybrid one. • Composite polishing with diamond paste created improved 3-D multifractal structure. • Recommendation: polish the composite with diamond paste if using the one-step tool. • Multifractal analysis could become essential in designing new dental surfaces. - Abstract: The objective of this study was to determine the effect of different dental polishing methods on surface texture parameters of dental nanocomposites. The 3-D surface morphology was investigated by atomic force microscopy (AFM) and multifractal analysis. Two representative dental resin-based nanocomposites were investigated: a nanofilled and a nanohybrid composite. The samples were polished by two dental polishing protocols using multi-step and one-step system. Both protocols were then followed by diamond paste polishing. The 3-D surface roughness of samples was studied by AFM on square areas of topography on the 80 × 80 μm² scanning area. The multifractal spectrum theory based on computational algorithms was applied for AFM data and multifractal spectra were calculated. The generalized dimension D_q and the singularity spectrum f(α) provided quantitative values that characterize the local scale properties of dental nanocomposites polished by four different dental polishing protocols at nanometer scale. The results showed that the larger the spectrum width Δα (Δα = α_max − α_min) of the multifractal spectra f(α), the more non-uniform was the surface morphology. Also, the 3-D surface topography was described by statistical parameters, according to ISO 25178-2:2012. The 3-D surface of samples had a multifractal nature. Nanofilled composite had lower values of height parameters than nanohybrid composites, due to its composition. Multi-step polishing protocol

  17. Surface roughness and morphology of dental nanocomposites polished by four different procedures evaluated by a multifractal approach

    International Nuclear Information System (INIS)

    Ţălu, Ştefan; Stach, Sebastian; Lainović, Tijana; Vilotić, Marko; Blažić, Larisa; Alb, Sandu Florin; Kakaš, Damir


    Graphical abstract: - Highlights: • Multifractals are good indicators of polished dental composites 3-D surface structure. • The nanofilled composite had superior 3-D surface properties than the nanohybrid one. • Composite polishing with diamond paste created improved 3-D multifractal structure. • Recommendation: polish the composite with diamond paste if using the one-step tool. • Multifractal analysis could become essential in designing new dental surfaces. - Abstract: The objective of this study was to determine the effect of different dental polishing methods on surface texture parameters of dental nanocomposites. The 3-D surface morphology was investigated by atomic force microscopy (AFM) and multifractal analysis. Two representative dental resin-based nanocomposites were investigated: a nanofilled and a nanohybrid composite. The samples were polished by two dental polishing protocols using multi-step and one-step system. Both protocols were then followed by diamond paste polishing. The 3-D surface roughness of samples was studied by AFM on square areas of topography on the 80 × 80 μm² scanning area. The multifractal spectrum theory based on computational algorithms was applied for AFM data and multifractal spectra were calculated. The generalized dimension D_q and the singularity spectrum f(α) provided quantitative values that characterize the local scale properties of dental nanocomposites polished by four different dental polishing protocols at nanometer scale. The results showed that the larger the spectrum width Δα (Δα = α_max − α_min) of the multifractal spectra f(α), the more non-uniform was the surface morphology. Also, the 3-D surface topography was described by statistical parameters, according to ISO 25178-2:2012. The 3-D surface of samples had a multifractal nature. Nanofilled composite had lower values of height parameters than nanohybrid composites, due to its composition. Multi-step polishing protocol created a better

  18. Introducing a gender-neutral pronoun in a natural gender language: The influence of time on attitudes and behavior

    Marie Gustafsson Sendén


    Full Text Available The implementation of gender-fair language is often associated with negative reactions and hostile attacks on people who propose a change. This was also the case in Sweden in 2012 when a third gender-neutral pronoun hen was proposed as an addition to the already existing Swedish pronouns for she and he. The pronoun hen can be used both generically, when gender is unknown or irrelevant, and as a transgender pronoun for people who categorize themselves outside the gender dichotomy. In this article we review the process from 2012 to 2015, when hen was introduced in the Swedish Dictionary. No other language has so far added a third gender-neutral pronoun that has actually reached the broader population of language users, which makes the situation in Sweden unique. We present data on attitudes toward hen during the past four years and study how time is associated with the attitudes. In 2012 the majority of the Swedish population was negative to the word, but already in 2014 there was a significant shift to more positive attitudes. Time was one of the strongest predictors of attitudes, even when other relevant factors were controlled for. The actual use of the word has also increased, although to a lesser extent than the attitudes shifted. We conclude that new words challenging the binary gender system evoke hostile and negative reactions, but also that attitudes can normalize rather quickly. This is very positive because it should motivate language amendments and initiatives for gender-fair language even when the first responses are negative.

  19. First Language Acquisition and Teaching (United States)

    Cruz-Ferreira, Madalena


    "First language acquisition" commonly means the acquisition of a single language in childhood, regardless of the number of languages in a child's natural environment. Language acquisition is variously viewed as predetermined, wondrous, a source of concern, and as developing through formal processes. "First language teaching" concerns schooling in…

  20. Advisory Functions of Selected Polish Business Institutions in the Innovation Process in Enterprises – Research Conclusions

    Directory of Open Access Journals (Sweden)



    Full Text Available This paper contains an analysis of innovation processes in enterprises from the perspective of demand for knowledge, which companies increasingly obtain from the business environment. This requires identifying valuable partners in the environment who will provide a professional transfer of knowledge to the company. Knowledge has therefore become a product, and Polish universities and specialized commercial knowledge providers are competing for customers on the knowledge market. While knowledge transfer to companies is a secondary activity for Polish universities, specialized commercial knowledge suppliers make an effort to acquire as many orders as possible. A natural competition between these two types of entities therefore arises. In this paper the Author examines the possibilities for Polish universities and commercial providers of knowledge to support innovation-oriented enterprises, and formulates terms and conditions that will make cooperation between these two groups of entities possible, so as to transform the competition into cooperation that is beneficial for both sides and supports innovative processes in enterprises.