Sample records for polish natural language

  1. Polish as a Foreign Language: New Context, Regulations and Prospects

    Popławska Anna


    The article provides an overview of developments in Poland after the collapse of communism in 1989 from the point of view of foreign language teaching, including the swift transition from a public school system with Russian as the main foreign language to a diversified language teaching market focused on English and other languages. Particular emphasis is placed on a relatively new phenomenon: the increased demand and new opportunities for teaching Polish as a foreign language, which are expected to be further promoted by anticipated amendments to the legal regulations governing the status and certification of the Polish language.

  2. Natural Language Processing


    Preeti; Brahmaleen Kaur Sidhu


    Natural language processing (NLP) work began more than sixty years ago; it is a field of computer science and linguistics devoted to creating computer systems that use human (natural) language. Natural Language Processing holds great promise for making computer interfaces that are easier to use for people, since people will be able to talk to the computer in their own language, rather than learn a specialized language of computer commands. Natural Language processing techniques can make possi...

  3. Interactive System for Polish Signed Language Learning

    Karolina Olga Nurzyńska


    The aim of this study is to present an overview of a computer-based signed language course with a module for automatic signed language recognition as part of a language acquisition test. The idea of creating an interactive sign language learning system seems to be new. We hope that this solution helps to overcome the barrier between the silent and hearing worlds. At the same time, we concentrate our efforts on creating a system for home use that does not need any sophisticated hardware. Moreover, we emphasize the use of an already proposed and popular description scheme: the MPEG-7 standard, formally called the Multimedia Content Description Interface, has been chosen. This standard provides a rich set of tools for complete multimedia content description. Its most important application for sign language is the possibility of describing both static and dynamic features of objects in image sequences. This description scheme makes it possible to describe a signing person at the required level of granularity. The article gives a brief description of many suggested solutions for semiautomatic or automatic sign language recognition systems. It also describes some implemented learning applications whose aim was to teach sign languages. The main groups that can be distinguished are: observation of animated avatars, messengers for deaf people, and testing progress in learning sign languages by using education platforms.

  4. The corpus-driven revolution in Polish Sign Language: the interview with Dr. Paweł Rutkowski

    Iztok Kosem


    Dr. Paweł Rutkowski is head of the Section for Sign Linguistics at the University of Warsaw. He is a general linguist and a specialist in the syntax of natural languages, carrying out research on Polish Sign Language (polski język migowy, PJM). He has been awarded a number of prizes, grants and scholarships by such institutions as the Foundation for Polish Science, Polish Ministry of Science and Higher Education, National Science Centre, Poland, Polish–U.S. Fulbright Commission, Kosciuszko Foundation and DAAD. Dr. Rutkowski leads the team developing the Corpus of Polish Sign Language and the Corpus-based Dictionary of Polish Sign Language, the first dictionary of this language prepared in compliance with modern lexicographical standards. The dictionary is an open-access publication, freely available online. This interview took place at eLex 2017, a biennial conference on electronic lexicography, where Dr. Rutkowski was awarded the Adam Kilgarriff Prize and gave a keynote address entitled "Sign language as a challenge to electronic lexicography: The Corpus-based Dictionary of Polish Sign Language and beyond". The interview was conducted by Dr. Victoria Nyst from Leiden University, Faculty of Humanities, and Dr. Iztok Kosem from the University of Ljubljana, Faculty of Arts.

  5. Gender-dependent language anxiety in Polish communication apprehensives

    Ewa Piechurska-Kuciel


    This paper analyzes the relationship between communication apprehension and language anxiety from the perspective of gender. As virtually no empirical studies have addressed the explicit influence of gender on language anxiety in communication apprehensives, this paper proposes that females are generally more sensitive to anxiety, as reflected in various spheres of communication. For this reason, language anxiety levels in communication apprehensive females should be higher than those of communication apprehensive males. Comparisons between them were made using Student's t-test, two-way ANOVA, and a post-hoc Tukey test. The results revealed that Polish communication apprehensive secondary grammar school males and females do not differ in their levels of language anxiety, although nonapprehensive males experience significantly lower language anxiety than their female peers. It is argued that the finding can be attributed to developmental patterns, gender socialization processes, classroom practices, and the uniqueness of the FL learning process, which is a stereotypically female domain.

  6. Proficiency Testing and Language Teaching: Russian and Polish

    Rimma Garn


    This paper explores the potential application of proficiency testing in U.S. colleges and universities. Specific consideration is given to: the Oral Proficiency Interview, based on ILR or ACTFL guidelines, administered on a large scale at the Defense Language Institute and occasionally employed in American academia; the Diagnostic Assessment Interview, the assessment tool of choice at DLI, basically unheard of in academia; and the new Polish proficiency test, which is part and parcel of the standardized series of language tests administered throughout Europe, based on Language Testers of Europe guidelines. The author proposes that introducing the underlying principles of proficiency testing into American academia and promoting a better awareness of level tasks and expectations on the part of language teachers could help to eliminate the disconnect between testing and teaching. It could benefit instruction from early to advanced stages.

  7. Constraints on Negative Prefixation in Polish Sign Language

    Tomaszewski, Piotr


    The aim of this article is to describe a negative prefix, NEG-, in Polish Sign Language (PJM) which appears to be indigenous to the language. This is of interest given the relative rarity of prefixes in sign languages. Prefixed PJM signs were analyzed on the basis of both a corpus of texts signed by 15 deaf PJM users who are either native or near-native signers, and material including a specified range of prefixed signs as demonstrated by native signers in dictionary form (i.e. signs produced in isolation, not as part of phrases or sentences). In order to define the morphological rules behind prefixation on both the phonological and morphological levels, native PJM users were consulted for their expertise. The research results can enrich models for describing processes of grammaticalization in the context of the visual-gestural modality that forms the basis for sign language structure. PMID:26619066

  10. Patterns of Language Use: Polish Migrants from the 1980s and Their Children in Melbourne

    Leuner, Beata


    This paper investigates the retention of Polish language and culture by first generation Polish migrants from the 1980s and their second generation offspring (aged 15-24) from endogamous and exogamous marriages. We examine various domains such as the home, social networks, visits to Poland, institutions of learning, the Polish media, the Polish…

  11. Natural Language Sourcebook.

    Baker, Eva; And Others

    This sourcebook is intended to provide researchers and users of natural language computer systems with a classification scheme to describe language-related problems associated with such systems. Methods from the disciplines of artificial intelligence (AI), education, linguistics, psychology, anthropology, and psychometrics were applied in an…

  12. Natural language generation

    Maybury, Mark T.

    The goal of natural language generation is to replicate human writers or speakers: to generate fluent, grammatical, and coherent text or speech. Produced language, using both explicit and implicit means, must clearly and effectively express some intended message. This demands the use of a lexicon and a grammar together with mechanisms which exploit semantic, discourse and pragmatic knowledge to constrain production. Furthermore, special processors may be required to guide focus, extract presuppositions, and maintain coherency. As with interpretation, generation may require knowledge of the world, including information about the discourse participants as well as knowledge of the specific domain of discourse. All of these processes and knowledge sources must cooperate to produce well-written, unambiguous language. Natural language generation has received less attention than language interpretation due to the nature of language: it is important to interpret all the ways of expressing a message but we need to generate only one. Furthermore, the generative task can often be accomplished by canned text (e.g., error messages or user instructions). The advent of more sophisticated computer systems, however, has intensified the need to express multisentential English.

  13. Natural language modeling

    Energy Technology Data Exchange (ETDEWEB)

    Sharp, J.K. [Sandia National Labs., Albuquerque, NM (United States)]


    This seminar describes a process and methodology that uses structured natural language to enable the construction of precise information requirements directly from users, experts, and managers. The main focus of this natural language approach is to create the precise information requirements and to do it in such a way that the business and technical experts are fully accountable for the results. These requirements can then be implemented using appropriate tools and technology. This requirement set is also a universal learning tool because it has all of the knowledge that is needed to understand a particular process (e.g., expense vouchers, project management, budget reviews, tax, laws, machine function).

  14. The NCL natural constraint language

    CERN Document Server

    Zhou, Jianyang


    This book presents the Natural Constraint Language (NCL), a description language in conventional mathematical logic for modeling and solving constraint satisfaction problems. It uses illustrations and tutorials to detail NCL and its applications.

  15. Fighting alcoholism among railway workers in the light of early 20th Century Polish-language temperance publications

    Izabela Krasińska


    Discussion and conclusions: The Polish-language temperance periodicals provide, among other things, valuable information on the as yet unknown, though essential, problem of fighting alcoholism among railway workers in Europe, the USA and the Polish territories of the Three Partitions.

  16. Polish Vocabulary Development in 2-Year-Olds: Comparisons With English Using the Language Development Survey.

    Rescorla, Leslie; Constants, Holly; Bialecka-Pikul, Marta; Stepien-Nycz, Malgorzata; Ochal, Anna


    The objective of this study was to compare vocabulary size and composition in 2-year-olds learning Polish or English as measured by the Language Development Survey (LDS; Rescorla, 1989). Participants were 199 Polish toddlers (M = 24.14 months, SD = 0.35) and 422 U.S. toddlers (M = 24.69 months, SD = 0.78). Test-retest reliability was .92, internal consistency was .99, and concurrent validity was .55. Girls had higher vocabulary scores than boys. Mean LDS score was significantly lower in Polish than in English, and fewer Polish children had LDS scores >200 words. Also, more words were reported for English. The cross-linguistic correlation for word frequencies was .44. Noun dominance was comparable in the two languages, and 55 cross-linguistic word matches were found among the top 100 words. Although more Polish than U.S. children had Vocabulary acquisition appeared to be slower in Polish than in English, probably because of the complexity of the language. However, the languages were very similar with respect to vocabulary composition findings.


    Agricultural production risk is of a special nature due to the great number of hazards, the relative weakness of production entities on the market, and high ambiguity, which is greater than in industrial production. Natural disasters occur very frequently and, given the low percentage of insured farmers, cause damage on a scale that forces the state to organise current financial aid (for instance in the form of preferential natural disaster loans). This aid is usually not sufficient. On the other hand, regional diversity of the risk level does not positively affect the development of insurance. From the perspective of insurance companies and policymakers it becomes highly important to investigate the spatial structure of losses in agriculture caused by natural disasters. The purpose of the research is to classify the 16 Polish voivodeships into clusters in order to show differences between them according to the level of damage to agricultural farms caused by natural disasters. On the basis of the cluster analysis it was demonstrated that 11 voivodeships form quite a homogeneous group in terms of the size of damage in agriculture (the value of damage to crops and the acreage of destroyed crops are the two most important factors determining cluster membership); however, the profile of losses in the other five voivodeships is highly individual and requires separate handling in the actuarial sense. It was also shown that a high absolute value of agricultural losses in a given voivodeship does not necessarily mean high vulnerability of its agricultural farms to natural risks.

  18. Sociolinguistics in selected textbooks used for teaching Polish as a native language in a primary school

    Szymańska Marta


    The text attempts to present a change that took place at the turn of the century in teaching Polish as a native language. It concerns, first of all, a new sociolinguistic perspective on teaching Polish that appeared in schools. The author analyses four selected series of textbooks used for teaching Polish in primary school. Special attention is paid to activity books, which are analysed for the presence of situational exercises that make students analyse communication situations and their typical language behaviours, and create effective utterances adequate to a specific context. The research shows that a communication perspective is not well represented in school textbooks. Activities focusing on the development of communication competence are rare, scattered, or separated from other language activities. Thus, they do not fit into a general textbook concept, and they are often only a decoration required by the core curriculum.

  20. Polish as a foreign language at elementary level of instruction : crosslinguistic influences in writing

    Danuta Gabrys-Barker


    Being a minority European language, Polish has not attracted the attention of second language research (SLA) very much. Most studies in the area focus on English and other major languages, describing variables and processes observed in learners’ interlanguage development. This article looks at the language performance of elementary learners of Polish as a foreign language with a view to diagnosing areas of difficulty at the initial stages of language instruction. It is a case study of five learners’ written production after a year of intensive language instruction in the controlled conditions of a classroom. The objective of the study presented here is: 1. to determine the types of error produced in a short translation task at different levels of language (morphosyntactic, lexical); 2. to observe manifestations of crosslinguistic influences between languages the subjects know (interlingual transfer) as well as those related to the language learnt itself (intralingual transfer). The small sample of texts produced does not allow for any generalized observations and conclusions; however, at the level of elementary competence in any foreign language, as other research shows, the amount of individual variation is not the most significant factor. Thus the incorrect forms produced may testify to some more universally error-prone areas of language. The value of this kind of analysis lies in its direct application to the teaching of Polish as a synthetic language. The study also demonstrates the fact that communicative teaching has a limited contribution to make in the case of this family of languages. It suggests that overt and explicit teaching of a synthetic language will give a sounder basis for further development of language competence in its communicative dimension.

  1. Teaching natural language to computers


    Corneli, Joseph; Corneli, Miriam


    "Natural Language," whether spoken and attended to by humans, or processed and generated by computers, requires networked structures that reflect creative processes in semantic, syntactic, phonetic, linguistic, social, emotional, and cultural modules. Being able to produce novel and useful behavior following repeated practice gets to the root of both artificial intelligence and human language. This paper investigates the modalities involved in language-like applications that computers -- and ...

  2. Stress 'deafness' in a language with fixed word stress: an ERP study on Polish

    Ulrike Domahs


    The aim of the present contribution was to examine the factors influencing prosodic processing in a language with predictable word stress. For Polish, a language with fixed penultimate stress but several well-defined exceptions, difficulties in the processing and representation of prosodic information have been reported (e.g., Peperkamp & Dupoux, 2002). The present study utilized event-related potentials (ERPs) to investigate the factors influencing prosodic processing in Polish. These factors are (i) the predictability of stress and (ii) the prosodic structure in terms of metrical feet. Polish native speakers were presented with correctly and incorrectly stressed Polish words and instructed to judge the correctness of the perceived stress patterns. For each stress violation an early negativity was found, which was interpreted as a reflection of an error-detection mechanism; in addition, exceptional (= antepenultimate stress) and post-lexical (= initial stress) patterns evoked a task-related positivity effect (P300) whose amplitude and latency are correlated with the degree of anomaly and deviation from an expectation. Violations involving the default (= penultimate stress), in contrast, did not produce such an effect. This asymmetrical result is interpreted to reflect that Polish native speakers are less sensitive to the default pattern than to the exceptional or post-lexical patterns. Behavioral results are orthogonal to the electrophysiological results, showing that Polish speakers had difficulty rejecting any kind of stress violation. Thus, on a meta-linguistic level Polish speakers appeared to be stress-‘deaf’ for any kind of stress manipulation, whereas the neural reactions differentiate between the default and lexicalized patterns.

  3. Handbook of Natural Language Processing

    CERN Document Server

    Indurkhya, Nitin


    Provides a comprehensive, modern reference of practical tools and techniques for implementing natural language processing in computer systems. This title covers classical methods, empirical and statistical techniques, and various applications. It describes how the techniques can be applied to European and Asian languages as well as English.

  4. Advances in natural language processing.

    Hirschberg, Julia; Manning, Christopher D


    Natural language processing employs computational techniques for the purpose of learning, understanding, and producing human language content. Early computational approaches to language research focused on automating the analysis of the linguistic structure of language and developing basic technologies such as machine translation, speech recognition, and speech synthesis. Today's researchers refine and make use of such tools in real-world applications, creating spoken dialogue systems and speech-to-speech translation engines, mining social media for information about health or finance, and identifying sentiment and emotion toward products and services. We describe successes and challenges in this rapidly advancing area. Copyright © 2015, American Association for the Advancement of Science.



    Kowalska, Monika


    After the Polish accession to the European Union in 2004, language services have considerably grown in importance. Intensive contacts with foreign companies and institutions coupled with information technology developments have increased the role of English as a linguistic medium of international cooperation. The overall aim of this paper is to examine the Polish business environment for Language Service Providers (LSPs) offering specialized English courses and translation services (EN-PL and...


    Michał Głuszkowski


    The article discusses factors influencing language maintenance under the changing social, cultural, economic and political conditions of the Polish minority in Siberia. The village of Vershina was founded in 1910 by Polish voluntary settlers from Little Poland. During its first three decades Vershina preserved the Polish language, traditions, farming methods and machines, and also the Roman Catholic religion. Changes came to this village in the taiga in the 1930s. Vershina lost its ethnocultural homogeneity because of Russian and Buryat workers in the local kolkhoz. Nowadays the inhabitants of Vershina have regained their minority rights: religious, educational and cultural. However, during the years of sovietization and atheization, their culture and customs became much more similar to those of other Siberian villages. The Polish language in Vershina is under the strong influence of Russian, which is the language of education, administration, and the surrounding villages. Children from Polish-Russian families become monolingual and use Polish very rarely, only as a school subject and in contacts with grandparents. The process of abandoning the mother tongue in Vershina is accelerating. However, there are some factors which may hinder these changes: the activity of local Polish organisations and the Roman Catholic parish, as well as the folk group “Jazhumbek”.

  7. A Portable Natural Language Interface.


    and that would integrate graphics, mouse deixis, and natural language. Although the project was originally intended to last several years, it has been...planning program, an expert system used to plan air attack missions for the Air Force. This interface combined English with graphics and mouse deixis

  8. Natural language processing with Java

    CERN Document Server

    Reese, Richard M


    If you are a Java programmer who wants to learn about the fundamental tasks underlying natural language processing, this book is for you. You will be able to identify and use NLP tasks for many common problems, and integrate them in your applications to solve more difficult problems. Readers should be familiar/experienced with Java software development.

  9. New trends in natural language processing: statistical natural language processing.


    Marcus, M


    The field of natural language processing (NLP) has seen a dramatic shift in both research direction and methodology in the past several years. In the past, most work in computational linguistics tended to focus on purely symbolic methods. Recently, more and more work is shifting toward hybrid methods that combine new empirical corpus-based methods, including the use of probabilistic and information-theoretic techniques, with traditional symbolic methods. This work is made possible by the rece...

  10. The Book of Psalms in the Church Slavonic, Greek, and Polish Languages from Simon Azarjin’s Library

    Jelena A. Celunova


    This article is devoted to research on a manuscript of the Book of Psalms, written in the first half of the 17th century, from Simon Azarjin’s book collection. The Book of Psalms is written interlinearly in three languages: Church Slavonic, Greek, and Polish. The presence of a Polish text in an Orthodox psalter makes this manuscript unique. The research concentrates on clarifying the aim that led to the creation of this Book of Psalms. The lack of a preface or any other evidence of its author, or of the time or place of its translation, forces us to turn to indirect evidence, namely, textological research and an analysis of the Church Slavonic and Polish texts. Textological research on the Church Slavonic version reveals its similarity to pre-Nikonian texts, and the analysis of the Polish text allows us to affirm that the author used the Catholic Leopolita Bible of 1561 and subjected it to profound revision, both textological and linguistic. The analysis of the changes made to the Polish text and of the altered language itself enables us to assume that the author of the manuscript might have been a native of West Russia, while the text itself was probably created in the Trinity Monastery of St. Sergius. The efforts to adapt the Polish text to the Church Slavonic text suggest that the trilingual Book of Psalms might have been created for inhabitants of the former territories of the Grand Duchy of Lithuania who had converted from the Catholic or the Greek Catholic Church to the Orthodox Church. The Polish text was thus needed especially by those Orthodox believers in order to understand the Church Slavonic language of worship.

  11. Natural Language Processing and the Language-Impaired.

    Ward, R. D.


    Describes ideas for making the best use of simple language processing interfaces in computer-based learning activities. These ideas are based on classroom observations of hearing-impaired, language-impaired, and unimpaired children using programs with a natural language interface which allows them to communicate with the computer about…

  12. Natural language processing: an introduction.

    Nadkarni, Prakash M; Ohno-Machado, Lucila; Chapman, Wendy W


    To provide an overview and tutorial of natural language processing (NLP) and modern NLP-system design. This tutorial targets the medical informatics generalist who has limited acquaintance with the principles behind NLP and/or limited knowledge of the current state of the art. We describe the historical evolution of NLP, and summarize common NLP sub-problems in this extensive field. We then provide a synopsis of selected highlights of medical NLP efforts. After providing a brief description of common machine-learning approaches that are being used for diverse NLP sub-problems, we discuss how modern NLP architectures are designed, with a summary of the Apache Foundation's Unstructured Information Management Architecture. We finally consider possible future directions for NLP, and reflect on the possible impact of IBM Watson on the medical field.

  13. Visualizing Natural Language Descriptions: A Survey


    Hassani, Kaveh; Lee, Won-Sook


    A natural language interface exploits the conceptual simplicity and naturalness of the language to create a high-level user-friendly communication channel between humans and machines. One of the promising applications of such interfaces is generating visual interpretations of semantic content of a given natural language that can be then visualized either as a static scene or a dynamic animation. This survey discusses requirements and challenges of developing such systems and reports 26 graphi...

  14. Knowledge representation and natural language processing

    Energy Technology Data Exchange (ETDEWEB)

    Weischedel, R.M.


    In principle, natural language and knowledge representation are closely related. This paper investigates this by demonstrating how several natural language phenomena, such as definite reference, ambiguity, ellipsis, ill-formed input, figures of speech, and vagueness, require diverse knowledge sources and reasoning. The breadth of kinds of knowledge needed to represent morphology, syntax, semantics, and pragmatics is surveyed. Furthermore, several current issues in knowledge representation, such as logic versus semantic nets, general-purpose versus special-purpose reasoners, adequacy of first-order logic, wait-and-see strategies, and default reasoning, are illustrated in terms of their relation to natural language processing and how natural language impacts the issues.

  15. Forms of Address and their Meaning in Contrast in Polish and Russian Languages

    Wojciech Sosnowski


    Full Text Available Many studies in contemporary linguistics focus on investigating politeness and rudeness in language. This paper, however, has not been intended as a contrastive study of the phenomena in question. Language politeness and rudeness are conveyed by means of expressions of politeness and rudeness which are perceived as entrenched and recurring in specific situations. These expressions convey the expected meaning of politeness and rudeness accepted in the model of social behaviour. If one uses the explicative method, such expressions could be reduced to the following formula: ‘I inform you that I follow a verbal conduct defined as polite’. Owing to the emergence of parallel corpora of particular languages, it is nowadays easier to collect data for research on forms of address as well as on expressions of politeness in the first half of the 21st century. Investigating the meaning of forms of address, which are part of the linguistic repertoire used to express politeness and rudeness, should be regarded as an interesting area of research. It is the consequence of the increasing importance of intercultural communication, expansion of international cooperation, and formation of new standards of interpersonal communication aimed at achieving mutual understanding without resorting to violence. It is worth mentioning that currently there are no bilingual dictionaries which would include practical rules for using forms of address. Moreover, dictionaries (especially bilingual ones) also do not list classifiers of politeness, which becomes a shortcoming as regards the purposes of translation and teaching foreign languages. The aforementioned problems apply to print as well as computer dictionaries. A reliable list of forms of address and their meaning may become helpful in intercultural communication. It would be also important to create a Contemporary Dictionary of Expressions of

  16. Mobile speech and advanced natural language solutions

    CERN Document Server

    Markowitz, Judith


    Mobile Speech and Advanced Natural Language Solutions provides a comprehensive and forward-looking treatment of natural speech in the mobile environment. This fourteen-chapter anthology brings together lead scientists from Apple, Google, IBM, AT&T, Yahoo! Research and other companies, along with academicians, technology developers and market analysts.  They analyze the growing markets for mobile speech, new methodological approaches to the study of natural language, empirical research findings on natural language and mobility, and future trends in mobile speech.  Mobile Speech opens with a challenge to the industry to broaden the discussion about speech in mobile environments beyond the smartphone, to consider natural language applications across different domains.   Among the new natural language methods introduced in this book are Sequence Package Analysis, which locates and extracts valuable opinion-related data buried in online postings; microintonation as a way to make TTS truly human-like; and se...

  17. [Lublin - the capital of Polish speech and language therapy. Half a century of SLT education in UMCS]. (United States)

    Woźniak, Tomasz

    Lublin is the capital of Polish speech and language therapy (SLT), a claim justified both historically and by its potential in research and teaching, particularly in connection with the activities of the Department of Logopedics/SLT and Applied Linguistics of University of Maria Curie-Skłodowska (UMCS) and the Polish Logopedic Society. The article discusses the history of the formation of SLT in Poland, strongly associated with Lublin, and also presents Lublin SLT educational traditions and the current teaching and research activities of the Department of Logopedics/SLT and Applied Linguistics of UMCS.

  18. Three-dimensional grammar in the brain: Dissociating the neural correlates of natural sign language and manually coded spoken language. (United States)

    Jednoróg, Katarzyna; Bola, Łukasz; Mostowski, Piotr; Szwed, Marcin; Boguszewski, Paweł M; Marchewka, Artur; Rutkowski, Paweł


    In several countries natural sign languages were considered inadequate for education. Instead, new sign-supported systems were created, based on the belief that spoken/written language is grammatically superior. One such system, called SJM (system językowo-migowy), preserves the grammatical and lexical structure of spoken Polish and since the 1960s has been extensively employed in schools and on TV. Nevertheless, the Deaf community avoids using SJM for everyday communication, its preferred language being PJM (polski język migowy), a natural sign language, structurally and grammatically independent of spoken Polish and featuring classifier constructions (CCs). Here, for the first time, we compare, using fMRI, the neural bases of natural vs. devised communication systems. Deaf signers were presented with three types of signed sentences (SJM and PJM with/without CCs). Consistent with previous findings, PJM with CCs compared to either SJM or PJM without CCs recruited the parietal lobes. The reverse comparison revealed activation in the anterior temporal lobes, suggesting increased semantic combinatory processes in lexical sign comprehension. Finally, PJM compared with SJM engaged left posterior superior temporal gyrus and anterior temporal lobe, areas crucial for sentence-level speech comprehension. We suggest that activity in these two areas reflects greater processing efficiency for naturally evolved sign language. Copyright © 2015 Elsevier Ltd. All rights reserved.

  19. Generating natural language under pragmatic constraints

    CERN Document Server

    Hovy, Eduard H


    Recognizing that the generation of natural language is a goal-driven process, where many of the goals are pragmatic (i.e., interpersonal and situational) in nature, this book provides an overview of the role of pragmatics in language generation. Each chapter states a problem that arises in generation, develops a pragmatics-based solution, and then describes how the solution is implemented in PAULINE, a language generator that can produce numerous versions of a single underlying message, depending on its setting.

  20. Polish origins of the Faculty of Mathematical and Natural Sciences of the University of Fribourg and the Polish contribution to the Fribourg industrial revolution (in Polish)

    Directory of Open Access Journals (Sweden)

    Wojciech KOCUREK


    Full Text Available The article is dedicated to high-tech companies founded by Poles at the end of the 19th century in the rural canton of Fribourg in Switzerland. The text is divided into two parts. In the first part, the author attempts to present the economic, social and political reality of Fribourg in a period of intense industrialization in the world and the formation of the liberal free market system. In this rapidly changing reality, the new Catholic-conservative authorities of the canton tried to establish a comprehensive, but also different, system of a “Christian republic”, whose aim was to achieve social justice consistent with the teachings of the Gospel. In order to complete the project, the cantonal government did not shy away from using the possibilities and measures offered by the contemporary world. Decision-makers, led by Georges Python, needed support from a society that was aware of the changes. Due to this fact, it became necessary to establish a university capable of shaping new attitudes and views. However, the costs significantly exceeded the financial capabilities of the agricultural and relatively poor canton of Fribourg. In these less favourable circumstances, a conscious policy of industrialization was the way out of the deadlock. Newly created industrial institutions were to contribute to an increase of cash inflows to the canton and thus allow for the financing of the university, which would also become an intellectual foundation for the emerging industry. The activity of Polish scientists, which is the subject of the second part of the article, matched this philosophy perfectly. The Poles invited to cooperate with Python, i.e. Józef Wierusz-Kowalski, Ignacy Mościcki and Jan Modzelewski, created the foundations of the Faculty of Mathematical and Natural Sciences at the University of Fribourg. As members of the faculty, in addition to teaching, they conducted research into, among other things, nitric acid

  1. Specialist English as a foreign language for European public health: evaluation of competencies and needs among Polish and Lithuanian students. (United States)

    Sumskas, Linas; Czabanowska, Katarzyna; Bruneviciūte, Raimonda; Kregzdyte, Rima; Krikstaponyte, Zita; Ziomkiewicz, Anna


    Foreign languages are becoming an essential prerequisite for a successful career among all professions, including public health professionals, in many countries. The expanding role of English as a mode of communication allows university graduates to plan and seek their career in English-speaking countries. The present study was carried out in the framework of the EU Leonardo da Vinci project "Specialist English as a foreign language for European public health." The study aimed to gain deeper insight into how the English language is perceived as a foreign language by Polish and Lithuanian public health students, what their level of language competence is, and which level of English proficiency they expect to use in the future. MATERIAL AND METHODS. A total of 246 respondents completed the special questionnaires in the autumn semester of 2005. A questionnaire form was developed by the international project team. For evaluation of English competences, the Language Passport (Common European Framework of Reference for Languages of the Council of Europe) was applied. RESULTS. Current self-rated proficiency in the English language was at the same level for Lithuanian (3.47+/-1.14) and Polish (3.31+/-0.83) respondents (P>0.05). The majority of respondents (88.6% of Lithuanian and 87.8% of Polish) reported using the English language for their current studies. Respondents reported a significant increase in the need for a higher level of English proficiency in the future: mean scores provided by respondents changed from B1 level to B2 level. Respondents gave priority to less formal and practice-based interactive English teaching methods (going abroad, contacts with native speakers) in comparison with theory-oriented methods of learning (self-studying, Internet courses). CONCLUSIONS. Similar levels of English language in all five areas of language skills were established in Polish and Lithuanian university students. Respondents gave more priorities to less formal and practice-based interactive

  2. A Natural Logic for Natural-Language Knowledge Bases

    DEFF Research Database (Denmark)

    Andreasen, Troels; Styltsvig, Henrik Bulskov; Jensen, Per Anker


    We describe a natural logic for computational reasoning with a regimented fragment of natural language. The natural logic comes with intuitive inference rules enabling deductions and with an internal graph representation facilitating conceptual path finding between pairs of terms as an approach t... ...-conservative constructs in order to approach scientific use of natural language. Finally, we outline a prototype system addressing life science for the natural logic knowledge base setup being under continuous development.

  3. Efficiency of a natural wetland for effluent polishing of a septic tank

    Directory of Open Access Journals (Sweden)

    Z. Yousefi


    Full Text Available Wetlands are nowadays used, among other purposes, as polishing systems for conventional wastewater treatment. Wetland systems are usually inexpensive compared with high-technology treatment systems. The objective of this study is to evaluate a natural wetland for polishing septic tank effluent. The research extended over 10 months on a natural wetland system in the Pardis of Mazandaran University of Medical Sciences, northeast of the health faculty. Wastewater quality indices such as pH, EC, BOD, COD, TSS, nitrate, phosphorus, ammonia and temperature were measured on samples of the influent and effluent of the system. The study showed that the system works as a buffering system for flow and pH. Results indicated that the average BOD5 and TSS removal efficiencies were 67.70% and 83%, respectively. COD removal efficiency was 65.26% and 80% for low- and moderate-strength influent, respectively. Average phosphorus, NH3 and nitrate in the effluent were 0.032 mg/L, 7.18 and 0.036 mg/L, respectively. The efficiency for ammonia and phosphorus increased slightly under the best conditions. Based on the results of this study, a natural wetland can be successful in removing BOD, COD, and TSS from conventional septic tank effluent, but has no considerable effect on nitrogen and phosphorus removal.
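
    The abstract above reports removal efficiencies without stating how they were computed. A minimal sketch, assuming the conventional definition (percentage drop in concentration from influent to effluent), is given below; the function name and the concentrations are illustrative only and are not taken from the study.

```python
def removal_efficiency(influent_mg_l: float, effluent_mg_l: float) -> float:
    """Percentage of a pollutant removed between influent and effluent.

    Assumes the conventional definition (C_in - C_out) / C_in * 100;
    the source abstract does not state its formula.
    """
    if influent_mg_l <= 0:
        raise ValueError("influent concentration must be positive")
    return (influent_mg_l - effluent_mg_l) / influent_mg_l * 100.0


# Hypothetical concentrations, chosen only to illustrate the arithmetic.
bod_efficiency = removal_efficiency(150.0, 45.0)
print(f"BOD removal efficiency: {bod_efficiency:.1f}%")
```

    A negative result would indicate that the effluent concentration exceeded the influent concentration for that parameter.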

  4. Arabic Natural Language Processing System Code Library (United States)


    POS Tagging, and Dependency Parsing. Fourth Workshop on Statistical Parsing of Morphologically Rich Languages (SPMRL). English (Note: These are for...Detection, Affix Labeling, POS Tagging, and Dependency Parsing" by Stephen Tratz presented at the Statistical Parsing of Morphologically Rich Languages ...and also English) natural language processing (NLP), containing code for training and applying the Arabic NLP system described in Stephen Tratz’s

  5. Bayesian natural language semantics and pragmatics

    CERN Document Server

    Zeevat, Henk


    The contributions in this volume focus on the Bayesian interpretation of natural languages, which is widely used in areas of artificial intelligence, cognitive science, and computational linguistics. This is the first volume to take up topics in Bayesian Natural Language Interpretation and make proposals based on information theory, probability theory, and related fields. The methodologies offered here extend to the target semantic and pragmatic analyses of computational natural language interpretation. Bayesian approaches to natural language semantics and pragmatics are based on methods from signal processing and the causal Bayesian models pioneered especially by Pearl. In signal processing, the Bayesian method finds the most probable interpretation by finding the one that maximizes the product of the prior probability and the likelihood of the interpretation. It thus stresses the importance of a production model for interpretation as in Grice's contributions to pragmatics or in interpretation by abduction.
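
    The selection rule described in this abstract (choose the interpretation that maximizes prior times likelihood) can be sketched in a few lines. The example below is only an illustration under stated assumptions: the candidate readings, the probabilities and the function name are hypothetical and do not come from the volume.

```python
def most_probable_interpretation(priors, likelihoods):
    """Return the interpretation maximizing prior * likelihood, plus all scores.

    priors[i]      : P(interpretation i), the prior probability
    likelihoods[i] : P(observed utterance | interpretation i)
    The scores are unnormalized posteriors; normalization does not change the argmax.
    """
    scores = {i: priors[i] * likelihoods[i] for i in priors}
    best = max(scores, key=scores.get)
    return best, scores


# Hypothetical toy case: the word "bank" heard in a conversation about fishing.
priors = {"river_bank": 0.3, "financial_bank": 0.7}
likelihoods = {"river_bank": 0.8, "financial_bank": 0.2}

best, scores = most_probable_interpretation(priors, likelihoods)
print(best, scores)
```

    In this toy case the less frequent reading wins because the context makes it far more likely to have produced the utterance, which is the role the abstract assigns to the production model.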

  6. Natural Language Description of Emotion (United States)

    Kazemzadeh, Abe


    This dissertation studies how people describe emotions with language and how computers can simulate this descriptive behavior. Although many non-human animals can express their current emotions as social signals, only humans can communicate about emotions symbolically. This symbolic communication of emotion allows us to talk about emotions that we…

  7. Trainable Methods for Surface Natural Language Generation


    Ratnaparkhi, Adwait


    We present three systems for surface natural language generation that are trainable from annotated corpora. The first two systems, called NLG1 and NLG2, require a corpus marked only with domain-specific semantic attributes, while the last system, called NLG3, requires a corpus marked with both semantic attributes and syntactic dependency information. All systems attempt to produce a grammatical natural language phrase from a domain-specific semantic representation. NLG1 serves a baseline syst...

  8. Evolution, brain, and the nature of language. (United States)

    Berwick, Robert C; Friederici, Angela D; Chomsky, Noam; Bolhuis, Johan J


    Language serves as a cornerstone for human cognition, yet much about its evolution remains puzzling. Recent research on this question parallels Darwin's attempt to explain both the unity of all species and their diversity. What has emerged from this research is that the unified nature of human language arises from a shared, species-specific computational ability. This ability has identifiable correlates in the brain and has remained fixed since the origin of language approximately 100 thousand years ago. Although songbirds share with humans a vocal imitation learning ability, with a similar underlying neural organization, language is uniquely human. Copyright © 2012 Elsevier Ltd. All rights reserved.

  9. Research in Natural Language Understanding (United States)


    of the lexical material to explain how many actions there were, how many actors, etc., and the nature of the map from actor onto action, etc. For...direction and make a measurement there, or may scan from the current focus in a specified "direction" (or by some other specification of a trajectory

  10. Semantic structures advances in natural language processing

    CERN Document Server

    Waltz, David L


    Natural language understanding is central to the goals of artificial intelligence. Any truly intelligent machine must be capable of carrying on a conversation: dialogue, particularly clarification dialogue, is essential if we are to avoid disasters caused by the misunderstanding of the intelligent interactive systems of the future. This book is an interim report on the grand enterprise of devising a machine that can use natural language as fluently as a human. What has really been achieved since this goal was first formulated in Turing's famous test? What obstacles still need to be overcome?

  11. The social impact of natural language processing

    DEFF Research Database (Denmark)

    Hovy, Dirk; Spruit, Shannon

    Research in natural language processing (NLP) used to be mostly performed on anonymous corpora, with the goal of enriching linguistic analysis. Authors were either largely unknown or public figures. As we increasingly use more data from social media, this situation has changed: users are now...

  12. Natural Language Navigation Support in Virtual Reality

    NARCIS (Netherlands)

    van Luin, J.; Nijholt, Antinus; op den Akker, Hendrikus J.A.; Giagourta, V.; Strintzis, M.G.


    We describe our work on designing a natural language accessible navigation agent for a virtual reality (VR) environment. The agent is part of an agent framework, which means that it can communicate with other agents. Its navigation task consists of guiding the visitors in the environment and to

  13. Theoretical approaches to natural language understanding

    Energy Technology Data Exchange (ETDEWEB)


    This book discusses the following: Computational Linguistics, Artificial Intelligence, Linguistics, Philosophy, and Cognitive Science and the current state of natural language understanding. Three topics form the focus for discussion; these topics include aspects of grammars, aspects of semantics/pragmatics, and knowledge representation.

  14. Brain readiness and the nature of language. (United States)

    Bouchard, Denis


    To identify the neural components that make a brain ready for language, it is important to have well defined linguistic phenotypes, to know precisely what language is. There are two central features to language: the capacity to form signs (words), and the capacity to combine them into complex structures. We must determine how the human brain enables these capacities. A sign is a link between a perceptual form and a conceptual meaning. Acoustic elements and content elements are already brain-internal in non-human animals, but as categorical systems linked with brain-external elements. Being indexically tied to objects of the world, they cannot freely link to form signs. A crucial property of a language-ready brain is the capacity to process perceptual forms and contents offline, detached from any brain-external phenomena, so their "representations" may be linked into signs. These brain systems appear to have pleiotropic effects on a variety of phenotypic traits and not to be specifically designed for language. Syntax combines signs, so the combination of two signs operates simultaneously on their meaning and form. The operation combining the meanings long antedates its function in language: the primitive mode of predication operative in representing some information about an object. The combination of the forms is enabled by the capacity of the brain to segment vocal and visual information into discrete elements. Discrete temporal units have order and juxtaposition, and vocal units have intonation, length, and stress. These are primitive combinatorial processes. So the prior properties of the physical and conceptual elements of the sign introduce combinatoriality into the linguistic system, and from these primitive combinatorial systems derive concatenation in phonology and combination in morphosyntax. Given the nature of language, a key feature to our understanding of the language-ready brain is to be found in the mechanisms in human brains that enable the unique

  15. Brain readiness and the nature of language

    Denis Bouchard


    Full Text Available To identify the neural components that make a brain ready for language, it is important to have well defined linguistic phenotypes, to know precisely what language is. There are two central features to language: the capacity to form signs (words), and the capacity to combine them into complex structures. We must determine how the human brain enables these capacities. A sign is a link between a perceptual form and a conceptual meaning. Acoustic elements and content elements are already brain-internal in non-human animals, but as categorical systems linked with brain-external elements. Being indexically tied to objects of the world, they cannot freely link to form signs. A crucial property of a language-ready brain is the capacity to process perceptual forms and contents offline, detached from any brain-external phenomena, so their representations may be linked into signs. These brain systems appear to have pleiotropic effects on a variety of phenotypic traits and not to be specifically designed for language. Syntax combines signs, so the combination of two signs operates simultaneously on their meaning and form. The operation combining the meanings long antedates its function in language: the primitive mode of predication operative in representing some information about an object. The combination of the forms is enabled by the capacity of the brain to segment vocal and visual information into discrete elements. Discrete temporal units have order and juxtaposition, and vocal units have intonation, length, and stress. These are primitive combinatorial processes. So the prior properties of the physical and conceptual elements of the sign introduce combinatoriality into the linguistic system, and from these primitive combinatorial systems derive concatenation in phonology and combination in morphosyntax. Given the nature of language, a key feature to our understanding of the language-ready brain is to be found in the mechanisms in human brains that

  16. Learning procedures from interactive natural language instructions (United States)

    Huffman, Scott B.; Laird, John E.


    Despite its ubiquity in human learning, very little work has been done in artificial intelligence on agents that learn from interactive natural language instructions. In this paper, the problem of learning procedures from interactive, situated instruction is examined in which the student is attempting to perform tasks within the instructional domain, and asks for instruction when it is needed. Presented is Instructo-Soar, a system that behaves and learns in response to interactive natural language instructions. Instructo-Soar learns completely new procedures from sequences of instruction, and also learns how to extend its knowledge of previously known procedures to new situations. These learning tasks require both inductive and analytic learning. Instructo-Soar exhibits a multiple execution learning process in which initial learning has a rote, episodic flavor, and later executions allow the initially learned knowledge to be generalized properly.

  17. Henkin semantics for reasoning with natural language

    Directory of Open Access Journals (Sweden)

    Michael Hahn


    Full Text Available The frequency of intensional and non-first-order definable operators in natural languages constitutes a challenge for automated reasoning with the kind of logical translations that are deemed adequate by formal semanticists. Whereas linguists employ expressive higher-order logics in their theories of meaning, the most successful logical reasoning strategies with natural language to date rely on sophisticated first-order theorem provers and model builders. In order to bridge the fundamental mathematical gap between linguistic theory and computational practice, we present a general translation from a higher-order logic frequently employed in the linguistics literature, two-sorted Type Theory, to first-order logic under Henkin semantics. We investigate alternative formulations of the translation, discuss their properties, and evaluate the availability of linguistically relevant inferences with standard theorem provers in a test suite of inference problems stated in English. The results of the experiment indicate that translation from higher-order logic to first-order logic under Henkin semantics is a promising strategy for automated reasoning with natural languages. The paper is accompanied by the source code (cf. SUPP. FILES) of the grammar and reasoning architecture described in the paper.

  18. Natural language processing tools for computer assisted language learning

    Directory of Open Access Journals (Sweden)

    Vandeventer Faltin, Anne


    Full Text Available This paper illustrates the usefulness of natural language processing (NLP) tools for computer assisted language learning (CALL) through the presentation of three NLP tools integrated within a CALL software for French. These tools are (i) a sentence structure viewer; (ii) an error diagnosis system; and (iii) a conjugation tool. The sentence structure viewer helps language learners grasp the structure of a sentence, by providing lexical and grammatical information. This information is derived from a deep syntactic analysis. Two different outputs are presented. The error diagnosis system is composed of a spell checker, a grammar checker, and a coherence checker. The spell checker makes use of alpha-codes, phonological reinterpretation, and some ad hoc rules to provide correction proposals. The grammar checker employs constraint relaxation and phonological reinterpretation as diagnosis techniques. The coherence checker compares the underlying "semantic" structures of a stored answer and of the learners' input to detect semantic discrepancies. The conjugation tool is a resource with enhanced capabilities when put in an electronic format, enabling searches from inflected and ambiguous verb forms.

  19. Diagnostic validity Polish language version of the questionnaire MINI-KID (Mini International Neuropsychiatry Interview for Children and Adolescent). (United States)

    Adamowska, Sylwia; Adamowski, Tomasz; Frydecka, Dorota; Kiejna, Andrzej


    For over forty years, structured interviews for clinical and epidemiological research in child and adolescent psychiatry have been developed to increase the validity and reliability of diagnoses according to classification systems (DSM and ICD). The aim of the study is to assess the validity of the Polish version of MINI-KID (Mini International Neuropsychiatric Interview for Children and Adolescents) in comparison to clinical diagnosis made by a specialist in the field of child and adolescent psychiatry. There were 140 patients included in the study (93 boys, 66.4%, mean age 11.8±3.0 and 47 girls, 33.5%, mean age 14.0±2.9). All the patients were diagnosed by the specialist in the field of child and adolescent psychiatry according to ICD-10 criteria and by the independent interviewer with the Polish version of MINI-KID (version 2.0, 2001). There was higher agreement between clinical diagnoses and diagnoses based on the MINI-KID interview with respect to eating disorders and externalizing disorders (κ 0.43-0.56) and lower agreement in internalizing disorders (κ 0.13-0.45). In the clinical interview, there was a smaller number of diagnostic categories (maximum 3 diagnoses per patient) in comparison to MINI-KID (maximum 10 diagnoses per patient), and a smaller percentage of patients with one diagnosis (65.7%) in comparison to the MINI-KID interview (72%). Our study has shown satisfactory validity parameters of the MINI-KID questionnaire, promoting its use for clinical and epidemiological settings. The Mini International Neuropsychiatry Interview for Children and Adolescent (MINI-KID) is the first structured diagnostic interview for assessing mental status in children and adolescents that has been translated into Polish. Our validation study demonstrated satisfactory psychometric properties of the questionnaire, enabling its use in clinical practice and in research projects. Copyright © 2014 Elsevier Inc. All rights reserved.
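
    The κ values quoted in this abstract measure agreement between two sources of diagnosis beyond what chance would produce. The abstract does not say which kappa statistic was used; the sketch below assumes Cohen's kappa for two raters and uses made-up diagnostic labels purely to show the calculation.

```python
from collections import Counter


def cohens_kappa(rater_a, rater_b):
    """Cohen's kappa for two raters labelling the same cases.

    kappa = (p_o - p_e) / (1 - p_e), where p_o is the observed agreement and
    p_e the agreement expected by chance from each rater's label frequencies.
    """
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("both raters must label the same non-empty set of cases")
    n = len(rater_a)
    observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    count_a, count_b = Counter(rater_a), Counter(rater_b)
    expected = sum(count_a[label] * count_b[label] for label in count_a) / (n * n)
    if expected == 1.0:
        return 1.0  # both raters used one identical label throughout
    return (observed - expected) / (1.0 - expected)


# Hypothetical labels for six patients (not data from the study).
clinician = ["ADHD", "ADHD", "anxiety", "none", "anxiety", "none"]
interview = ["ADHD", "anxiety", "anxiety", "none", "none", "none"]

kappa = cohens_kappa(clinician, interview)
print(f"kappa = {kappa:.2f}")
```

    Here the raters agree on four of six cases (about 0.67), chance agreement is about 0.33, and kappa works out to 0.5.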

  20. Natural Language Question Answering in Open Domains

    Directory of Open Access Journals (Sweden)

    Dan Tufis


    Full Text Available With the ever-growing volume of information on the web, the traditional search engines, returning hundreds or thousands of documents per query, become more and more demanding on the user's patience in satisfying his/her information needs. Question Answering in Open Domains is a top research and development topic in current language technology. Unlike the standard search engines, based on the latest Information Retrieval (IR) methods, open domain question-answering systems are expected to deliver not a list of documents that might be relevant for the user's query, but a sentence or a paragraph answering the question asked in natural language. This paper reports on the construction and testing of a Question Answering (QA) system which builds on several web services developed at the Research Institute for Artificial Intelligence (ICIA/RACAI). The evaluation of the system has been independently done by the organizers of the ResPubliQA 2009 exercise and has been rated the best performing system with the highest improvement due to the natural language processing technology over a baseline state-of-the-art IR system. The system was trained on a specific corpus, but its functionality is independent of the linguistic register of the training data.

  1. Natural Language Generation in Health Care (United States)

    Cawsey, Alison J.; Webber, Bonnie L.; Jones, Ray B.


    Good communication is vital in health care, both among health care professionals, and between health care professionals and their patients. And well-written documents, describing and/or explaining the information in structured databases may be easier to comprehend, more edifying, and even more convincing than the structured data, even when presented in tabular or graphic form. Documents may be automatically generated from structured data, using techniques from the field of natural language generation. These techniques are concerned with how the content, organization and language used in a document can be dynamically selected, depending on the audience and context. They have been used to generate health education materials, explanations and critiques in decision support systems, and medical reports and progress notes. PMID:9391935

  2. On the Relationship between a Computational Natural Logic and Natural Language

    DEFF Research Database (Denmark)

    Andreasen, Troels; Bulskov, Henrik; Nilsson, Jørgen Fischer


    This paper makes a case for adopting appropriate forms of natural logic as target language for computational reasoning with descriptive natural language. Natural logics are stylized fragments of natural language where reasoning can be conducted directly by natural reasoning rules reflecting intui...

  3. Origin of natural gases in the Paleozoic-Mesozoic basement of the Polish Carpathian Foredeep (United States)

    Kotarba, Maciej


    Hydrocarbon gases from Upper Devonian and Lower Carboniferous reservoirs in the Paleozoic basement of the Polish Carpathian Foredeep were generated mainly during low-temperature thermogenic processes ("oil window"). They contain only insignificant amounts of microbial methane and ethane. These gaseous hydrocarbons were generated from Lower Carboniferous and/or Middle Jurassic mixed Type III/II kerogen and from Ordovician-Silurian Type II kerogen, respectively. Methane, ethane and carbon dioxide of natural gas from the Middle Devonian reservoir contain a significant microbial component whereas their small thermogenic component is most probably genetically related to Ordovician-Silurian Type II kerogen. The gaseous hydrocarbons from the Upper Jurassic and the Upper Cretaceous reservoirs of the Mesozoic basement were generated both by microbial carbon dioxide reduction and thermogenic processes. The presence of microbial methane generated by carbon dioxide reduction suggests that in some deposits the traps had already been formed and sealed during the migration of microbial methane, presumably in the immature source rock environment. The traps were successively supplied with thermogenic methane and higher hydrocarbons generated at successively higher maturation stages of kerogen. The higher hydrocarbons of the majority of deposits were generated from mixed Type III/II kerogen deposited in the Middle Jurassic, Lower Carboniferous and/or Devonian strata. Type II or mixed Type II/III kerogen could be the source for hydrocarbons in both the Tarnów and Brzezówka deposits. In the Cenomanian sandstone reservoir of the Brzezowiec deposit and one Upper Jurassic carbonate block of the Lubaczów deposit microbial methane prevails. It migrated from the autochthonous Miocene strata.

  4. Two interpretive systems for natural language? (United States)

    Frazier, Lyn


    It is proposed that humans have available to them two systems for interpreting natural language. One system is familiar from formal semantics. It is a type based system that pairs a syntactic form with its interpretation using grammatical rules of composition. This system delivers both plausible and implausible meanings. The other proposed system is one that uses the grammar together with knowledge of how the human production system works. It is token based and only delivers plausible meanings, including meanings based on a repaired input when the input might have been produced as a speech error.

  5. Investigating the attitudes towards learning a third language and its culture in Polish junior high school


    Kiermasz, Zuzanna


    It is believed that attitudes to languages and culture tend to affect achievement in foreign language learning (Baker, 1997). Thus, this factor may be seen as crucial when it comes to the discrepancies in attainment in different languages learnt by the same students. Therefore, it seems vital to investigate variation in attitudes towards both learning L2 together with the approach to the L2 culture and the corresponding issues with respect to L3. Nevertheless, the general at...

  6. Perceived teacher support and language anxiety in Polish secondary school EFL learners

    Ewa Piechurska-Kuciel


    Full Text Available The teacher’s role is vital, both in respect to achieving academic goals, and with regard to the regulation of emotional and social processes. Positive perceptions of teacher support can endorse psychological wellness, and help maintain students’ academic interests, higher academic achievement and more positive peer relationships. The teacher who shows understanding, empathy and consistency in their behavior helps students start forming an identity, which will assist them in coping with stress and anxiety directly connected with the foreign language learning process (language anxiety). The main aim of this research is to investigate the relationship between teacher support and language anxiety levels. It is speculated that teacher support functions as a buffer from the effects of negative emotions, such as language anxiety experienced in the foreign language learning process. The participants of the study were 621 secondary grammar school students whose responses to a questionnaire were the main data source. The results of the study demonstrate that students with higher levels of teacher support experience lower language anxiety levels in comparison to their peers with lower levels of teacher support. Students who have a feeling that they can count on the instructor’s help, advice, assistance, or backing manage the learning process more successfully. They evaluate their language abilities highly and receive better final grades. Nevertheless, gender and residential location do not moderate teacher support and language anxiety due to the specificity of the sample consisting of novice secondary grammar school students.

  7. The social impact of natural language processing

    DEFF Research Database (Denmark)

    Hovy, Dirk; Spruit, Shannon

    Research in natural language processing (NLP) used to be mostly performed on anonymous corpora, with the goal of enriching linguistic analysis. Authors were either largely unknown or public figures. As we increasingly use more data from social media, this situation has changed: users are now individually identifiable, and the outcome of NLP experiments and applications can have a direct effect on their lives. This change should spawn a debate about the ethical implications of NLP, but until now, the internal discourse in the field has not followed the technological development. This position paper identifies a number of social implications that NLP research may have, and discusses their ethical significance, as well as ways to address them....

  8. Presentation of the verbs in Bulgarian-Polish electronic dictionary

    Ludmila Dimitrova


    Full Text Available This paper briefly discusses the presentation of the verbs in the first electronic Bulgarian-Polish dictionary that is currently being developed under a bilateral collaboration between IMI-BAS and ISS-PAS. Special attention is given to the digital entry classifiers that describe Bulgarian and Polish verbs. Problems related to the correspondence between natural language phenomena and their presentations are discussed. Some examples illustrate the different types of dictionary entries for verbs.

  9. Landscape Design and the language of Nature

    Directory of Open Access Journals (Sweden)

    Stephen Perry


    Full Text Available Recognition that we need to live in a more ecologically sustainable way and that the physical forms of designed landscapes are an expression of the social values and cultural drivers of the time has underpinned the call by some landscape design professionals for a new design aesthetic - one that reflects modern ecological concerns. However, for an 'ecological aesthetic' to be accepted, it must be capable of generating landscape forms that are pleasurable to the general public, as it is the general public who will be responsible for delivering ecological sustainability in the long term. The growth in understanding of the mathematical properties of natural systems and processes has led some authors to suggest that fractal geometry, called the language of nature, could play a role in developing such an aesthetic. This is supported by recent research that suggests human perceptual systems have evolved to process fractal patterning and that we have a visual preference for images with certain fractal qualities. However, how fractal geometry can be used, and what form an aesthetic based on this geometry might take, remains elusive and undefined. To develop an aesthetic based on fractal geometry it is necessary to understand why fractal geometry should be considered as a potential tool and whether the application of fractal analysis can differentiate between the types of landscape forms encountered every day.

  10. Capturing and Modeling Domain Knowledge Using Natural Language Processing Techniques

    National Research Council Canada - National Science Library

    Auger, Alain


    .... Initiated in 2004 at Defense Research and Development Canada (DRDC), the SACOT knowledge engineering research project is currently investigating, developing and validating innovative natural language processing (NLP...


    Joanna Kostecka


    Full Text Available Food production based on intensive farming contributes to high and constantly increasing pollution of soils and other environmental resources. Given this, the search for non-conventional sources of animal protein seems justified. The present study was designed to examine the opinions of selected Polish consumers regarding their acceptance of insect-based food as an alternative source of nutrients. The assessment of attitudes towards alternative sources of nutrients was based on a survey developed at the Faculty of Science, University of Porto in Portugal. Representatives of Polish consumers in the region of Podkarpackie generally did not show open-mindedness towards incorporating insect-based food into their diet. The majority of the respondents, however, recognized the importance of food sector operation based on respect for natural resources. Therefore, it seems important that consumers be informed about the advantages of production or use of insect biomass originating from natural ecosystems. This may contribute to increased acceptance of alternative sources of protein, which consequently may lead to reduced environmental pressure from traditional livestock farming and to a slowing of ecosystem transformation and loss of biological diversity.

  12. Semiotic Nature of Language Teaching Methods in Foreign Language Learning and Teaching


    Erton, İsmail


    This paper aims to cover the semiotic nature of language teaching methods, and their sample applications in the language classroom. The verbal and the non-verbal aspects of language teaching should not be kept separate since they are closely interrelated and interdependent. The use of signs, symbols and visual aids by teachers helps enhance the learning capacity of the language learner at both cognitive and meta-cognitive levels as they listen and try to learn a foreign language...

  13. The Islamic State Battle Plan: Press Release Natural Language Processing (United States)


    we apply Natural Language Processing (NLP) tools to a unique database constructed from approximately 3,000 English translated press the English language. It denies any bias introduced by limiting sources to English language media reports. IBC critics claim that its body counts...added benefit to the understanding of the text. There are variations of stopwords for each language. The System for the Mechanical Analysis and

  14. Natural language solution to a Tuff problem

    International Nuclear Information System (INIS)

    Langkopf, B.S.; Mallory, L.H.


    A scientific data base, the Tuff Data Base, is being created at Sandia National Laboratories on the Cyber 170/855, using System 2000. It is being developed for use by scientists and engineers investigating the feasibility of locating a high-level radioactive waste repository in tuff (a type of volcanic rock) at Yucca Mountain on and adjacent to the Nevada Test Site. This project, the Nevada Nuclear Waste Storage Investigations (NNWSI) Project, is managed by the Nevada Operations Office of the US Department of Energy. A user-friendly interface, PRIMER, was developed that uses the Self-Contained Facility (SCF) command SUBMIT and System 2000 Natural Language functions and parametric strings that are schema resident. The interface was designed to: (1) allow users, with or without computer experience or keyboard skill, to sporadically access data in the Tuff Data Base; (2) produce retrieval capabilities for the user quickly; and (3) acquaint the users with the data in the Tuff Data Base. This paper gives a brief description of the Tuff Data Base Schema and the interface, PRIMER, which is written in Fortran V. 3 figures

  15. Policy-Based Management Natural Language Parser (United States)

    James, Mark


    The Policy-Based Management Natural Language Parser (PBEM) is a rules-based approach to enterprise management that can be used to automate certain management tasks. This parser simplifies the management of a given endeavor by establishing policies to deal with situations that are likely to occur. Policies are operating rules that can be referred to as a means of maintaining order, security, consistency, or other ways of successfully furthering a goal or mission. PBEM provides a way of managing configuration of network elements, applications, and processes via a set of high-level rules or business policies rather than managing individual elements, thus switching the control to a higher level. This software allows unique management rules (or commands) to be specified and applied to a cross-section of the Global Information Grid (GIG). This software embodies a parser that is capable of recognizing and understanding conversational English. Because all possible dialect variants cannot be anticipated, a unique capability was developed that parses based on conversation intent rather than the exact way the words are used. This software can increase productivity by enabling a user to converse with the system in conversational English to define network policies. PBEM can be used in both manned and unmanned science-gathering programs. Because policy statements can be domain-independent, this software can be applied equally to a wide variety of applications.

  16. Natural language metaphors covertly influence reasoning. (United States)

    Thibodeau, Paul H; Boroditsky, Lera


    Metaphors pervade discussions of social issues like climate change, the economy, and crime. We ask how natural language metaphors shape the way people reason about such social issues. In previous work, we showed that describing crime metaphorically as a beast or a virus led people to generate different solutions to a city's crime problem. In the current series of studies, instead of asking people to generate a solution on their own, we provided them with a selection of possible solutions and asked them to choose the best ones. We found that metaphors influenced people's reasoning even when they had a set of options available to compare and select among. These findings suggest that metaphors can influence not just what solution comes to mind first, but also which solution people think is best, even when given the opportunity to explicitly compare alternatives. Further, we tested whether participants were aware of the metaphor. We found that very few participants thought the metaphor played an important part in their decision. Moreover, participants who had no explicit memory of the metaphor were just as much affected by the metaphor as participants who were able to remember the metaphorical frame. These findings suggest that metaphors can act covertly in reasoning. Finally, we examined the role of political affiliation on reasoning about crime. The results confirm our previous findings that Republicans are more likely to generate enforcement and punishment solutions for dealing with crime, and are less swayed by metaphor than are Democrats or Independents.

  17. Natural Language Video Description using Deep Recurrent Neural Networks (United States)


    language with a single deep neural network. We use deep recurrent nets (RNNs), which have recently demonstrated strong results for machine translation (MT...Donahue, Marcus Rohrbach, Raymond Mooney, and Kate Saenko. Translating videos to natural language using deep recurrent neural networks. In NAACL, 2015...Natural Language Video Description using Deep Recurrent Neural Networks Subhashini Venugopalan University of Texas at Austin

  18. Cognitive Neuroscience of Natural Language Use

    NARCIS (Netherlands)

    Willems, R.M.


    When we think of everyday language use, the first things that come to mind include colloquial conversations, reading and writing e-mails, sending text messages or reading a book. But can we study the brain basis of language as we use it in our daily lives? As a topic of study, the cognitive

  19. Do neural nets learn statistical laws behind natural language?

    Directory of Open Access Journals (Sweden)

    Full Text Available The performance of deep learning in natural language processing has been spectacular, but the reasons for this success remain unclear because of the inherent complexity of deep learning. This paper provides empirical evidence of its effectiveness and of a limitation of neural networks for language engineering. Specifically, we demonstrate that a neural language model based on long short-term memory (LSTM) effectively reproduces Zipf's law and Heaps' law, two representative statistical properties underlying natural language. We discuss the quality of reproducibility and the emergence of Zipf's law and Heaps' law as training progresses. We also point out that the neural language model has a limitation in reproducing long-range correlation, another statistical property of natural language. This understanding could provide a direction for improving the architectures of neural networks.
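
    As a rough illustration of the two properties named in this abstract: Zipf's law says that a word's frequency falls off approximately as a power of its frequency rank, and Heaps' law says that vocabulary size grows approximately as a power of text length. The sketch below only computes the two raw curves (rank-frequency and vocabulary growth) from a token list. The function names and the toy sentence are assumptions made for illustration; this is not the paper's evaluation code.

```python
from collections import Counter


def rank_frequency(tokens):
    """Return word frequencies sorted in descending order (rank 1 first).

    Under Zipf's law, plotting these values against rank on log-log axes
    gives an approximately straight line for large natural-language corpora.
    """
    return sorted(Counter(tokens).values(), reverse=True)


def vocabulary_growth(tokens):
    """Return the number of distinct words seen after each token.

    Under Heaps' law, this curve grows roughly as a power of text length.
    """
    seen = set()
    growth = []
    for tok in tokens:
        seen.add(tok)
        growth.append(len(seen))
    return growth


# Toy input (hypothetical); a real check needs a corpus of many thousands of tokens.
tokens = "the cat sat on the mat and the dog sat on the log".split()
freqs = rank_frequency(tokens)
growth = vocabulary_growth(tokens)
print(freqs)
print(growth)
```

    On text sampled from a language model, the same two curves can be compared with those of the training corpus; a toy sentence like the one above is far too short to show either law.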

  20. Whole language and deaf bilingual-bicultural education--naturally! (United States)

    Mason, D; Ewoldt, C


    This position paper discusses how the tenets of Whole Language and Deaf Bilingual-Bicultural Education complement each other. It stresses that Whole Language is based on natural processes through which children can translate their constructs of personal experiences, observations, and perspectives into modes of communication that include written language and, in the present case, American Sign Language. The paper is based on two emphases: (a) Whole Language emphasizes a two-way teaching/learning process, teachers learning from children, and vice versa; and (b) Deaf Bilingual-Bicultural Education emphasizes American Sign Language as a language of instruction and builds on mutual respect for the similarities and differences in the sociocultural and socioeducational experiences and values of Deaf and hearing people. Both Whole Language and Deaf Bilingual-Bicultural Education attempt to authenticate curriculum by integrating Deaf persons' worldviews as part of educational experience.

  1. Some Uses of Natural Language Interfaces in Computer Assisted Language Learning. (United States)

    Ward, R. D.


    Presents a theoretical rationale for the idea that computer programs simulating written conversation, and using natural language, could be effective in language teaching and remediation, and reports empirical studies of its potential. Studies with 10- to 14-year-old language-impaired children are described, software is explained, and future…

  2. Natural language computing: an English generative grammar in Prolog

    CERN Document Server

    Dougherty, Ray C


    This book's main goal is to show readers how to use the linguistic theory of Noam Chomsky, called Universal Grammar, to represent English, French, and German on a computer using the Prolog computer language. In so doing, it presents a follow-the-dots approach to natural language processing, linguistic theory, artificial intelligence, and expert systems. The basic idea is to introduce meaningful answers to significant problems involved in representing human language data on a computer. The book offers a hands-on approach to anyone who wishes to gain a perspective on natural language

  3. Finite-state pre-processing for natural language analysis

    NARCIS (Netherlands)

    Prins, Robbert Paul


    Wide-coverage natural language parsers are typically not very efficient. Finite-state techniques are less powerful, but offer the advantage of being very fast, and good at representing language locally. This dissertation constitutes empirical research into the construction and use of a finite-state

  4. Understanding and Representing Natural Language Meaning. (United States)


    Pragmatics, in press. Collins, A. and M. R. Quillian, "Experiments on Semantic Memory and Language Comprehension," in L. W. Gregg (Ed.), Cognition in Learning...ed Anaphora in Basque," Proceedings of the 8th Annual Meeting of the Berkeley Linguistics Society, Berkeley, CA, 1982. (2) Azkarate, M., D. Far

  5. Natural Language Assistant: A Dialog System for Online Product Recommendation


    Chai, Joyce; Horvath, Veronika; Nicolov, Nicolas; Stys, Margo; Kambhatla, Nanda; Zadrozny, Wlodek; Melville, Prem


    With the emergence of electronic-commerce systems, successful information access on electronic-commerce web sites becomes essential. Menu-driven navigation and keyword search currently provided by most commercial sites have considerable limitations because they tend to overwhelm and frustrate users with lengthy, rigid, and ineffective interactions. To provide an efficient solution for information access, we have built the Natural Language Assistant (NLA), a web-based natural language dialog sy...

  6. State of the Art of Natural Language Processing (United States)


    computers. Noam Chomsky, Aspects of the Theory of Syntax (Cambridge, Mass.: MIT Press, 1965). One of the earliest attempts at Natural Language...of computers that a machine which understood natural languages was highly desirable. It also was evident from the work of Chomsky and others that...20 years. All the interviewees were educated to the Ph.D. level and most had extensively published in AI literature. The interviewees were evenly

  7. Finite-State Methodology in Natural Language Processing

    Michal Korzycki


    Full Text Available Recent mathematical and algorithmic results in the field of finite-state technology, as well as the increase in computing power, have laid the foundation for a new approach in natural language processing. However, the task of creating an appropriate model that would describe the phenomena of natural language is still to be achieved. In this paper I present some notions related to the finite-state modelling of syntax and morphology.

  8. The nature of written language deficits in children with SLI. (United States)

    Mackie, Clare; Dockrell, Julie E


    Children with specific language impairment (SLI) have associated difficulties in reading decoding and reading comprehension. To date, few research studies have examined the children's written language. The aim of the present study was to (a) evaluate the nature and extent of the children's difficulties with writing and (b) investigate the relationship between oral and written language. Eleven children with SLI were identified (mean age = 11 years) and were compared with a group of children matched for chronological age (CA; mean age = 11;2 [years;months]) and language age (LA; mean CA = 7;3). All groups completed standardized measures of language production, writing, and reading decoding. The writing assessment revealed that the SLI group wrote fewer words and produced proportionately more syntax errors than the CA group, but they did not differ on a measure of content of written language or on the proportion of spelling errors. The SLI group also produced proportionately more syntax errors than the LA group. The relationships among oral language, reading, and writing differed for the 3 groups. The nature and extent of the children's written language problems are considered in the context of difficulties with spoken language.

  9. Natural language processing in psychiatry. Artificial intelligence technology and psychopathology. (United States)

    Garfield, D A; Rapp, C; Evens, M


    The potential benefit of artificial intelligence (AI) technology as a tool of psychiatry has not been well defined. In this essay, the technology of natural language processing and its position with regard to the two main schools of AI is clearly outlined. Past experiments utilizing AI techniques in understanding psychopathology are reviewed. Natural language processing can automate the analysis of transcripts and can be used in modeling theories of language comprehension. In these ways, it can serve as a tool in testing psychological theories of psychopathology and can be used as an effective tool in empirical research on verbal behavior in psychopathology.

  10. Naturalizing language: human appraisal and (quasi) technology

    DEFF Research Database (Denmark)

    Cowley, Stephen


    Using contemporary science, the paper builds on Wittgenstein’s views of human language. Rather than ascribing reality to inscription-like entities, it links embodiment with distributed cognition. The verbal or (quasi) technological aspect of language is traced to not action, but human specific interactivity. This species-specific form of sense-making sustains, among other things, using texts, making/construing phonetic gestures and thinking. Human action is thus grounded in appraisals or sense-saturated coordination. To illustrate interactivity at work, the paper focuses on a case study. Over 11 s, a crime scene investigator infers that she is probably dealing with an inside job: she uses not words, but intelligent gaze. This connects professional expertise to circumstances and the feeling of thinking. It is suggested that, as for other species, human appraisal is based in synergies. However, since...

  11. Handbook of natural language processing and machine translation DARPA global autonomous language exploitation

    CERN Document Server

    Olive, Joseph P; McCary, John


    This comprehensive handbook, written by leading experts in the field, details the groundbreaking research conducted under the breakthrough GALE program - The Global Autonomous Language Exploitation within the Defense Advanced Research Projects Agency (DARPA), while placing it in the context of previous research in the fields of natural language and signal processing, artificial intelligence and machine translation. The most fundamental contrast between GALE and its predecessor programs was its holistic integration of previously separate or sequential processes. In earlier language research pro

  12. Polish Semantic Parser

    Directory of Open Access Journals (Sweden)

    Agnieszka Grudzinska


    Full Text Available The amount of information transferred by computers grows very rapidly, outgrowing the average person's capacity to absorb it. This implies an increase in the demand for computer programs able to perform a preliminary classification or even selection of information directed to a particular receiver. Due to the complexity of the problem, we restricted it to understanding short newspaper notes. Among the many conceptions formulated so far, the conceptual dependency theory worked out by Roger Schank has been chosen. It is a formal language for describing the semantics of an utterance, integrated with a text understanding algorithm. A substantial part of each text transformation system is a semantic parser of the Polish language. It is the first and only module that has access to the text in the Polish language. It plays the role of an element which finds relations between words of the Polish language and the formal representation. It translates sentences written in the language used by people into the language of the theory. The presented structure of knowledge units and the shape of the understanding process algorithms are universal by virtue of the theory. On the other hand, the defined knowledge units and the rules used in the algorithms are only examples, because they are constructed in order to understand short newspaper notes.

  13. A Natural Logic for Natural-language Knowledge Bases

    DEFF Research Database (Denmark)

    Andreasen, Troels; Bulskov, Henrik; Jensen, Per Anker


    to semantic querying. Our core natural logic proposal covers formal ontologies and generative extensions thereof. It further provides means of expressing general relationships between classes in an application. We discuss extensions of the core natural logic with various conservative as well as non-conservative...

  15. Statistical Language Models and Information Retrieval: Natural Language Processing Really Meets Retrieval

    NARCIS (Netherlands)

    Hiemstra, Djoerd; de Jong, Franciska M.G.


    Traditionally, natural language processing techniques for information retrieval have always been studied outside the framework of formal models of information retrieval. In this article, we introduce a new formal model of information retrieval based on the application of statistical language models.
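
    To make the idea of ranking with statistical language models concrete, here is a minimal sketch of query-likelihood scoring with linear interpolation smoothing: each document is scored by the probability that a unigram model mixing document and collection statistics generates the query. The function name, the mixing weight `lam`, and the toy documents are assumptions chosen for illustration; the article's own model may differ in its details.

```python
from collections import Counter


def query_likelihood(query, doc, collection, lam=0.5):
    """Score a document by the probability of generating the query.

    Each query term's probability is a linear mix of its relative
    frequency in the document and in the whole collection, so a term
    missing from the document does not force the score to zero.
    `lam` is a hypothetical smoothing weight, not a value from the article.
    """
    doc_counts = Counter(doc)
    coll_counts = Counter(collection)
    score = 1.0
    for term in query:
        p_doc = doc_counts[term] / len(doc)
        p_coll = coll_counts[term] / len(collection)
        score *= lam * p_doc + (1 - lam) * p_coll
    return score


# Toy collection of two tokenized documents (hypothetical data).
d1 = ["natural", "language", "processing"]
d2 = ["information", "retrieval", "models"]
collection = d1 + d2
query = ["language"]

score1 = query_likelihood(query, d1, collection)
score2 = query_likelihood(query, d2, collection)
print(score1, score2)
```

    Documents are then ranked by descending score; for long queries, summing log-probabilities instead of multiplying avoids numerical underflow.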

  16. ROPE: Recoverable Order-Preserving Embedding of Natural Language

    Energy Technology Data Exchange (ETDEWEB)

    Widemann, David P. [Lawrence Livermore National Lab. (LLNL), Livermore, CA (United States); Wang, Eric X. [Lawrence Livermore National Lab. (LLNL), Livermore, CA (United States); Thiagarajan, Jayaraman J. [Lawrence Livermore National Lab. (LLNL), Livermore, CA (United States)


    We present a novel Recoverable Order-Preserving Embedding (ROPE) of natural language. ROPE maps natural language passages from sparse concatenated one-hot representations to distributed vector representations of predetermined fixed length. We use Euclidean distance to return search results that are both grammatically and semantically similar. ROPE is based on a series of random projections of distributed word embeddings. We show that our technique typically forms a dictionary with sufficient incoherence such that sparse recovery of the original text is possible. We then show how our embedding allows for efficient and meaningful natural search and retrieval on Microsoft’s COCO dataset and the IMDB Movie Review dataset.

  17. From language to nature: The semiotic metaphor in biology

    DEFF Research Database (Denmark)

    Emmeche, Claus; Hoffmeyer, Jesper Normann


    The development of form in living organisms continues to challenge biological research. The concept of biological information encoded in the genetic program that controls development forms a major part of the semiotic metaphor in biology. Development is here seen in analogy to an execution of a program, written in a formal language in the computer. Other versions of the semiotic or "nature-as-language" metaphor use other formal or informal aspects of language to comprehend the specific structural relations in nature as explored by molecular and evolutionary biology. This intuitively appealing complex of related ideas, which has a long history in the philosophy of nature and biology, is critically reviewed. The general nature of metaphor in science is considered, and different levels of metaphorical transfer of signification are distinguished. It is argued that the metaphors may...

  18. Semiotic Nature of Language Teaching Methods in Foreign Language Learning and Teaching

    Directory of Open Access Journals (Sweden)

    İsmail ERTON


    Full Text Available This paper aims to cover the semiotic nature of language teaching methods, and their sample applications in the language classroom. The verbal and the non-verbal aspects of language teaching should not be kept separate since they are closely interrelated and interdependent. The use of signs, symbols and visual aids by teachers helps enhance the learning capacity of the language learner at both cognitive and meta-cognitive levels as they listen and try to learn a foreign language component in the classroom.

  19. Artificial intelligence, expert systems, computer vision, and natural language processing (United States)

    Gevarter, W. B.


    An overview of artificial intelligence (AI), its core ingredients, and its applications is presented. The knowledge representation, logic, problem solving approaches, languages, and computers pertaining to AI are examined, and the state of the art in AI is reviewed. The use of AI in expert systems, computer vision, natural language processing, speech recognition and understanding, speech synthesis, problem solving, and planning is examined. Basic AI topics, including automation, search-oriented problem solving, knowledge representation, and computational logic, are discussed.

  20. Natural Language Direction Following for Robots in Unstructured Unknown Environments (United States)


    music is not to be found in the notes. Gustav Mahler Our approach so far has only considered the user’s natural language command as a specification... Electronic Lexical Database. Language, Speech, and Communication. 1998. 2.1.1, 3.4 [46] Dave Ferguson and Anthony Stentz. Field D*: An interpolation-based...and Brain Sciences, 1993. 3.1 [84] Christian Landsiedel, Roderick De Nijs, Kolja Kuhnlenz, Dirk Wollherr, and Martin Buss. Route description

  1. Natural language processing and the Now-or-Never bottleneck. (United States)

    Gómez-Rodríguez, Carlos


    Researchers, motivated by the need to improve the efficiency of natural language processing tools to handle web-scale data, have recently arrived at models that remarkably match the expected features of human language processing under the Now-or-Never bottleneck framework. This provides additional support for said framework and highlights the research potential in the interaction between applied computational linguistics and cognitive science.

  2. Clinical Natural Language Processing in languages other than English: opportunities and challenges. (United States)

    Névéol, Aurélie; Dalianis, Hercules; Velupillai, Sumithra; Savova, Guergana; Zweigenbaum, Pierre


    Natural language processing applied to clinical text or aimed at a clinical outcome has been thriving in recent years. This paper offers the first broad overview of clinical Natural Language Processing (NLP) for languages other than English. Recent studies are summarized to offer insights and outline opportunities in this area. We envision three groups of intended readers: (1) NLP researchers leveraging experience gained in other languages, (2) NLP researchers faced with establishing clinical text processing in a language other than English, and (3) clinical informatics researchers and practitioners looking for resources in their languages in order to apply NLP techniques and tools to clinical practice and/or investigation. We review work in clinical NLP in languages other than English. We classify these studies into three groups: (i) studies describing the development of new NLP systems or components de novo, (ii) studies describing the adaptation of NLP architectures developed for English to another language, and (iii) studies focusing on a particular clinical application. We show the advantages and drawbacks of each method, and highlight the appropriate application context. Finally, we identify major challenges and opportunities that will affect the impact of NLP on clinical practice and public health studies in a context that encompasses English as well as other languages.

  3. Computing an Ontological Semantics for a Natural Language Fragment

    DEFF Research Database (Denmark)

    Szymczak, Bartlomiej Antoni

    The key objective of the research that has been carried out has been to establish theoretically sound connections between the following two areas: (1) computational processing of texts in natural language by means of logical methods, and (2) theories and methods for engineering of formal ontologies. We have tried to establish a domain-independent “ontological semantics” for relevant fragments of natural language. The purpose of this research is to develop methods and systems for taking advantage of formal ontologies for the purpose of extracting the meaning contents of texts. This functionality is desirable e.g. for future content-based search systems in contrast to today’s keyword-based search systems (viz., Google) which rely chiefly on recognition of stated keywords in the targeted text. Logical methods were introduced into semantic theories for natural language already during the 60’s in what...

  4. Quickly location determination based on geographic keywords of natural language (United States)

    Guo, Danhuai; Cui, Weihong


    In location determination based on natural language, a location is commonly found by describing the relationship between the undetermined position and one or several known positions. The uncertainty of location determination therefore derives from three sources: the natural language processing itself, the description of spatial position, and the description of spatial relationships. Most current research and mainstream GIS software take certainty as a prerequisite and try to avoid uncertainty and its influence. The research reported in this paper attempts to create a new method, Quickly Location Determination based on Geographic Keywords (QLDGK), that combines Artificial Intelligence (AI), fuzzy set theory and spatial information science to meet the challenge of natural-language-based location searching. QLDGK has two key technical elements. The first is a geographic keyword library together with a special natural language separation model library, which increase language processing efficiency. The second is a fuzzy-theory-based definition of spatial relationship, spatial metric and spatial orientation, which extends the search scope and assigns different confidences to different search outcomes. QLDGK aims at both higher query efficiency and a lower omission rate. The method has been shown to be workable and efficient by a QLDGK prototype system, which was tested on about 12000 emergency call reports from K-city, Southwest China, and achieved 78% accuracy at the highest confidence level with an 8% omission rate.
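
    A fuzzy definition of a spatial relationship that yields graded confidences for search outcomes can be sketched as below. This is a minimal illustration, not the QLDGK implementation: the linear membership function for "near", the thresholds `full_until` and `zero_from`, and the helper `rank_candidates` are all assumptions made for the example.

```python
import math


def near(distance_m, full_until=200.0, zero_from=1000.0):
    """Fuzzy membership of 'near': 1.0 up to full_until metres,
    falling linearly to 0.0 at zero_from metres (hypothetical thresholds)."""
    if distance_m <= full_until:
        return 1.0
    if distance_m >= zero_from:
        return 0.0
    return (zero_from - distance_m) / (zero_from - full_until)


def rank_candidates(landmark, candidates):
    """Return (name, confidence) pairs sorted by descending membership
    in the relation 'near landmark'. Coordinates are planar, in metres."""
    scored = []
    for name, (x, y) in candidates.items():
        distance = math.hypot(x - landmark[0], y - landmark[1])
        scored.append((name, near(distance)))
    scored.sort(key=lambda item: item[1], reverse=True)
    return scored
```

    For instance, `near(600.0)` evaluates to 0.5, so a candidate 600 m from the named landmark is returned with lower confidence than one 100 m away instead of being discarded, which is how a fuzzy definition widens the search scope.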

  5. Learning to rank for information retrieval and natural language processing

    CERN Document Server

    Li, Hang


    Learning to rank refers to machine learning techniques for training a model in a ranking task. Learning to rank is useful for many applications in information retrieval, natural language processing, and data mining. Intensive studies have been conducted on its problems recently, and significant progress has been made. This lecture gives an introduction to the area including the fundamental problems, major approaches, theories, applications, and future work. The author begins by showing that various ranking problems in information retrieval and natural language processing can be formalized as tw

  6. System reliability analysis with natural language and expert's subjectivity

    International Nuclear Information System (INIS)

    Onisawa, T.


    This paper introduces natural language expressions and expert's subjectivity to system reliability analysis. To this end, this paper defines a subjective measure of reliability and presents a method of system reliability analysis using the measure. The subjective measure of reliability corresponds to natural language expressions of reliability estimation, which are represented by a fuzzy set defined on [0,1]. The presented method deals with the dependence among subsystems and employs parametrized operations of subjective measures of reliability which can reflect the expert's subjectivity towards the analyzed system. The analysis results are also expressed by linguistic terms. Finally, this paper gives an example of system reliability analysis by the presented method.

  7. Second Language Acquisition and The Development through Nature-Nurture

    Directory of Open Access Journals (Sweden)

    Syahfitri Purnama


    Full Text Available Several individual learner factors affect second language acquisition: age, learning style, aptitude, motivation, and personality. This research concerns the English language acquisition of a fourth-year child through nature and nurture. The child acquired her second language at home and also in one of the courses in Jakarta. She was schooled by her parents so that she would be able to speak English well as a target language for her future. The purpose of this paper is to examine individual learner differences, especially in using English as a second language. This study is library research; the retrieved data were collected, recorded, transcribed, and analyzed descriptively. The results are as follows: the child is able to communicate well and to construct simple sentences, complex sentences, statements, and phrase questions, and to explain something when her teacher asks her at school. She is able to communicate by making a well-formed simple or compound sentence (two or three clauses), even though she does not yet consistently use the past tense form and sometimes forgets the bound morpheme -s in the third person singular, but she can use turn-taking in her utterances. Second language acquisition is a very long process for the child. The family and teacher should participate and assist the child; it is shown that a child can learn the first and the second language at the same time.

  8. Applications of Natural Language Processing in Biodiversity Science

    Directory of Open Access Journals (Sweden)

    Anne E. Thessen


    A computer can handle the volume but cannot make sense of the language. This paper reviews and discusses the use of natural language processing (NLP) and machine-learning algorithms to extract information from systematic literature. NLP algorithms have been used for decades, but require special development for application in the biological realm due to the special nature of the language. Many tools exist for biological information extraction (cellular processes, taxonomic names, and morphological characters), but none have been applied life-wide and most still require testing and development. Progress has been made in developing algorithms for automated annotation of taxonomic text, identification of taxonomic names in text, and extraction of morphological character information from taxonomic descriptions. This manuscript will briefly discuss the key steps in applying information extraction tools to enhance biodiversity science.

  9. Natural Language Processing for the Swiss German Dialect Area


    Scherrer, Yves; Rambow, Owen


    This paper discusses work on data collection for Swiss German dialects taking into account the continuous nature of the dialect landscape, and proposes to integrate these data into natural language processing models. We present knowledge-based models for machine translation into any Swiss German dialect, for dialect identification, and for multi-dialectal parsing. In a dialect continuum, rules cannot be applied uniformly, but have restricted validity in well-defined geographic areas. Therefor...

  10. A Tutorial on Techniques and Applications for Natural Language Processing (United States)


    machines through natural language. The emphasis is pragmatic. It is less important in applied NLP whether the machine "understands" its natural...between man and machine or communication between two people, entails discourse phenomena that transcend individual sentences. Anaphora - Pronouns and...identifying the referents of these place-holder words. Interactive dialogues invite the use of anaphora, much more than simpler data base query situations

  11. Spatial Extent Models for Natural Language Phrases Involving Directional Containment

    NARCIS (Netherlands)

    Singh, G.; de By, R.A.


    We study the problem of assigning a spatial extent to a text phrase such as 'central northern California', with the objective of allowing spatial interpretations of natural language, and consistency testing of complex utterances that involve multiple phrases from which spatial extent can be derived.

  12. Generating natural language descriptions using speaker-dependent information

    NARCIS (Netherlands)

    Castro Ferreira, Thiago; Paraboni, Ivandré


    This paper discusses the issue of human variation in natural language referring expression generation. We introduce a model of content selection that takes speaker-dependent information into account to produce descriptions that closely resemble those produced by each individual, as seen in a number

  13. Perspectives on Bayesian Natural Language Semantics and Pragmatics

    NARCIS (Netherlands)

    Zeevat, H.; Zeevat, H.; Schmitz, H.-C.


    Bayesian interpretation is a technique in signal processing and its application to natural language semantics and pragmatics (BNLSP from here on and BNLI if there is no particular emphasis on semantics and pragmatics) is basically an engineering decision. It is a cognitive science hypothesis that

  14. Recurrent Artificial Neural Networks and Finite State Natural Language Processing. (United States)

    Moisl, Hermann

    It is argued that pessimistic assessments of the adequacy of artificial neural networks (ANNs) for natural language processing (NLP) on the grounds that they have a finite state architecture are unjustified, and that their adequacy in this regard is an empirical issue. First, arguments that counter standard objections to finite state NLP on the…

  15. Spinoza II: Conceptual Case-Based Natural Language Analysis. (United States)

    Schank, Roger C.; And Others

    This paper presents the theoretical changes that have developed in Conceptual Dependency Theory and their ramifications in computer analysis of natural language. The major items of concern are: the elimination of reliance on "grammar rules" for parsing with the emphasis given to conceptual rule based parsing; the development of a…

  16. CITE NLM: Natural-Language Searching in an Online Catalog. (United States)

    Doszkocs, Tamas E.


    The National Library of Medicine's Current Information Transfer in English public access online catalog offers unique subject search capabilities--natural-language query input, automatic medical subject headings display, closest match search strategy, ranked document output, dynamic end user feedback for search refinement. References, description…

  17. Orwell's 1984: Natural Language Searching and the Contemporary Metaphor. (United States)

    Dadlez, Eva M.


    Describes a natural language searching strategy for retrieving current material which has bearing on George Orwell's "1984," and identifies four main themes (technology, authoritarianism, press and psychological/linguistic implications of surveillance, political oppression) which have emerged from cross-database searches of the "Big…

  18. The Nature of Object Marking in American Sign Language (United States)

    Gokgoz, Kadir


    In this dissertation, I examine the nature of object marking in American Sign Language (ASL). I investigate object marking by means of directionality (the movement of the verb towards a certain location in signing space) and by means of handling classifiers (certain handshapes accompanying the verb). I propose that object marking in ASL is…

  19. Paired structures in logical and semiotic models of natural language

    DEFF Research Database (Denmark)

    Rodríguez, J. Tinguaro; Franco, Camilo; Montero, Javier


    The evidence coming from cognitive psychology and linguistics shows that pairs of reference concepts (e.g. good/bad, tall/short, nice/ugly, etc.) play a crucial role in the way we use and understand natural languages every day in order to analyze reality and make decisions. Different situations...

  20. Liturgical language of the Eastern Slavonic Orthodox Churches. The Position of The Polish Autocephalous Orthodox Church’s Faithful Concerning Liturgical Language

    Directory of Open Access Journals (Sweden)

    Tomasz Stempa


    Full Text Available The analysis of collected materials from the life of the Slavic Orthodox Churches indicates that in some cases the Church Slavonic language is no longer a current or justifiable liturgical language. Bilingualism was introduced or the Church Slavonic language was replaced by national languages. A closer investigation into the liturgical language situation in Orthodox Churches reveals that the topicality and the validity of using the Church Slavonic language as a liturgical language depend on a few factors. In the case of the non-canonical Orthodox Churches in Macedonia and Ukraine, the Church Slavonic language has been replaced by national languages for nationalistic reasons. In the case of Bulgaria and Serbia, the main factor that has influenced this change is treating the Orthodox Church as a national church. In the Eastern Slavonic Orthodox Churches (Belarus, Poland and Russia), changing the liturgical language has occurred at a slow pace. The history of churches in the XIX and XXI centuries and the temper and character of Eastern Slavs have had an influence on this. In this case, the biggest opponent of the Church Slavonic language is democracy in a broad sense. Orthodox Christians in Poland still want to pray in the Church Slavonic language. It is worth mentioning that in churches where the national language is used, the Church Slavonic language has not been completely removed from liturgical life. Bilingualism of liturgical languages is common and in some cases, when the place is considered as a backbone for the Orthodox Church, reversion to the Church Slavonic language has been noted (Serbia, Bulgaria).

  1. Blurring the Inputs: A Natural Language Approach to Sensitivity Analysis (United States)

    Kleb, William L.; Thompson, Richard A.; Johnston, Christopher O.


    To document model parameter uncertainties and to automate sensitivity analyses for numerical simulation codes, a natural-language-based method to specify tolerances has been developed. With this new method, uncertainties are expressed in a natural manner, i.e., as one would on an engineering drawing, namely, 5.25 +/- 0.01. This approach is robust and readily adapted to various application domains because it does not rely on parsing the particular structure of input file formats. Instead, tolerances of a standard format are added to existing fields within an input file. As a demonstration of the power of this simple, natural language approach, a Monte Carlo sensitivity analysis is performed for three disparate simulation codes: fluid dynamics (LAURA), radiation (HARA), and ablation (FIAT). Effort required to harness each code for sensitivity analysis was recorded to demonstrate the generality and flexibility of this new approach.
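
    The tolerance notation described above can be handled without knowing the structure of the input file. The sketch below is an illustration under stated assumptions, not the authors' tool: the regular expression, the function names, and the choice of a uniform distribution within the tolerance band are all hypothetical, since the abstract does not specify them.

```python
import random
import re

# Matches "<nominal> +/- <tolerance>", e.g. "5.25 +/- 0.01".
TOLERANCE = re.compile(r"(-?\d+(?:\.\d+)?)\s*\+/-\s*(\d+(?:\.\d+)?)")


def find_tolerances(text):
    """List (nominal, tolerance) pairs found anywhere in an input file's text."""
    return [(float(n), float(t)) for n, t in TOLERANCE.findall(text)]


def sample_input(text, rng):
    """Return a copy of the text in which every toleranced field is replaced
    by one random draw (assumption: uniform within nominal +/- tolerance)."""
    def draw(match):
        nominal, tol = float(match.group(1)), float(match.group(2))
        return repr(rng.uniform(nominal - tol, nominal + tol))
    return TOLERANCE.sub(draw, text)
```

    Calling `sample_input` repeatedly with the same `random.Random` instance produces one perturbed input file per Monte Carlo run, while fields without a `+/-` annotation pass through unchanged.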

  2. Developing Formal Correctness Properties from Natural Language Requirements (United States)

    Nikora, Allen P.


    This viewgraph presentation reviews the rationale of the program to transform natural language specifications into formal notation. Specifically, automate generation of Linear Temporal Logic (LTL) correctness properties from natural language temporal specifications. There are several reasons for this approach: (1) Model-based techniques becoming more widely accepted, (2) Analytical verification techniques (e.g., model checking, theorem proving) significantly more effective at detecting types of specification design errors (e.g., race conditions, deadlock) than manual inspection, (3) Many requirements still written in natural language, which results in a high learning curve for specification languages, associated tools and increased schedule and budget pressure on projects reduce training opportunities for engineers, and (4) Formulation of correctness properties for system models can be a difficult problem. This has relevance to NASA in that it would simplify development of formal correctness properties, lead to more widespread use of model-based specification, design techniques, assist in earlier identification of defects and reduce residual defect content for space mission software systems. The presentation also discusses: potential applications, accomplishments and/or technological transfer potential and the next steps.

  3. Medical problem and document model for natural language understanding. (United States)

    Meystre, Stephanie; Haug, Peter J


    We are developing tools to help maintain a complete, accurate and timely problem list within a general purpose Electronic Medical Record system. As a part of this project, we have designed a system to automatically retrieve medical problems from free-text documents. Here we describe an information model based on XML (eXtensible Markup Language) and compliant with the CDA (Clinical Document Architecture). This model is used to ease the exchange of clinical data between the Natural Language Understanding application that retrieves potential problems from narrative documents, and the problem list management application.

  4. Managing Fieldwork Data with Toolbox and the Natural Language Toolkit

    Directory of Open Access Journals (Sweden)

    Stuart Robinson


    Full Text Available This paper shows how fieldwork data can be managed using the program Toolbox together with the Natural Language Toolkit (NLTK) for the Python programming language. It provides background information about Toolbox and describes how it can be downloaded and installed. The basic functionality of the program for lexicons and texts is described, and its strengths and weaknesses are reviewed. Its underlying data format is briefly discussed, and Toolbox processing capabilities of NLTK are introduced, showing ways in which it can be used to extend the functionality of Toolbox. This is illustrated with a few simple scripts that demonstrate basic data management tasks relevant to language documentation, such as printing out the contents of a lexicon as HTML.

  5. Polish visit

    CERN Document Server


    On 6 October, Professor Michal Kleiber, Polish Minister of Science and Chairman of the State Committee for Scientific Research, visited CERN and met both the current and designated Director General, Luciano Maiani and Robert Aymar. Professor Kleiber visited the CMS and ATLAS detector assembly halls, the underground cavern for ATLAS, and the LHC superconducting magnet string test hall. Michal Kleiber (left), Polish minister of science and Jan Krolikowski, scientist at Warsaw University and working for CMS, who shows the prototypes of the Muon Trigger board of CMS.

  6. Combining Natural Language Processing and Statistical Text Mining: A Study of Specialized versus Common Languages (United States)

    Jarman, Jay


    This dissertation focuses on developing and evaluating hybrid approaches for analyzing free-form text in the medical domain. This research draws on natural language processing (NLP) techniques that are used to parse and extract concepts based on a controlled vocabulary. Once important concepts are extracted, additional machine learning algorithms,…

  7. Using natural language processing techniques to inform research on nanotechnology

    Directory of Open Access Journals (Sweden)

    Nastassja A. Lewinski


    Full Text Available Literature in the field of nanotechnology is exponentially increasing with more and more engineered nanomaterials being created, characterized, and tested for performance and safety. With the deluge of published data, there is a need for natural language processing approaches to semi-automate the cataloguing of engineered nanomaterials and their associated physico-chemical properties, performance, exposure scenarios, and biological effects. In this paper, we review the different informatics methods that have been applied to patent mining, nanomaterial/device characterization, nanomedicine, and environmental risk assessment. Nine natural language processing (NLP)-based tools were identified: NanoPort, NanoMapper, TechPerceptor, a Text Mining Framework, a Nanodevice Analyzer, a Clinical Trial Document Classifier, Nanotoxicity Searcher, NanoSifter, and NEIMiner. We conclude with recommendations for sharing NLP-related tools through online repositories to broaden participation in nanoinformatics.

  8. Conclusiveness of natural languages and recognition of images

    Energy Technology Data Exchange (ETDEWEB)

    Wojcik, Z.M.


    The conclusiveness is investigated using recognition processes and one-to-one correspondence between expressions of a natural language and graphs representing events. The graphs, as conceived in psycholinguistics, are obtained as a result of perception processes. It is possible to generate and process the graphs automatically, using computers, and then to convert the resulting graphs into expressions of a natural language. Correctness and conclusiveness of the graphs and sentences are investigated using the fundamental condition for events representation processes. Some consequences of the conclusiveness are discussed, e.g. undecidability of arithmetic, human brain asymmetry, correctness of statistical calculations and operations research. It is suggested that the group theory should be imposed on mathematical models of any real system. Proof of the fundamental condition is also presented. 14 references.

  9. Polish Academy of Sciences Great Dictionary of Polish [Wielki słownik języka polskiego PAN]

    Directory of Open Access Journals (Sweden)

    Piotr Żmigrodzki


    Full Text Available The paper describes a lexicographical project involving the development of the newest general dictionary of the Polish language: the Polish Academy of Sciences Great Dictionary of Polish [Wielki słownik języka polskiego PAN]. The project is coordinated by the Institute of Polish Language at the Polish Academy of Sciences and carried out in collaboration with linguists and lexicographers from several other Polish academic centres. The paper offers a brief description of the genesis of the project and the scope of information included in the dictionary, the organisation of work, the life of the dictionary on the Web as well as the plans for the future.

  10. Exploiting Lexical Regularities in Designing Natural Language Systems. (United States)


    Artificial Intelligence Laboratory, 545 Technology Square, Cambridge, MA 02139...EXPLOITING LEXICAL REGULARITIES IN DESIGNING NATURAL LANGUAGE SYSTEMS (U) MASSACHUSETTS INST OF TECH CAMBRIDGE ARTIFICIAL INTELLIGENCE...This paper presents the lexical component of the START Question Answering system developed at the MIT Artificial

  11. Generalized Hebbian Algorithm for Dimensionality Reduction in Natural Language Processing


    Gorrell, Genevieve


    The current surge of interest in search and comparison tasks in natural language processing has brought with it a focus on vector space approaches and vector space dimensionality reduction techniques. Presenting data as points in hyperspace provides opportunities to use a variety of well-developed tools pertinent to this representation. Dimensionality reduction allows data to be compressed and generalised. Eigen decomposition and related algorithms are one category of approaches to dimensional...

  12. ARSENAL: Automatic Requirements Specification Extraction from Natural Language


    Ghosh, Shalini; Elenius, Daniel; Li, Wenchao; Lincoln, Patrick; Shankar, Natarajan; Steiner, Wilfried


    Requirements are informal and semi-formal descriptions of the expected behavior of a complex system from the viewpoints of its stakeholders (customers, users, operators, designers, and engineers). However, for the purpose of design, testing, and verification for critical systems, we can transform requirements into formal models that can be analyzed automatically. ARSENAL is a framework and methodology for systematically transforming natural language (NL) requirements into analyzable formal mo...

  13. Determination of suitability of natural Polish resources for production of ceramic proppants applied in gas exploration from European shale formations (United States)

    Szymanska, Joanna; Mizera, Jaroslaw


    Poland is one of few European countries undertaking innovative research towards effective exploration of hydrocarbons from shale deposits. Given the demanding geological conditions that occur during hydraulic fracturing, ceramic proppants are required to enhance the extraction of shale gas. Ceramic proppants are granules (16/30 - 70/120 Mesh) classified as propping agents. These granules, located in the fissures newly created in the shale rock by injected high-pressure fluid, act as a prop, which enables gas to flow up the well. This works only if the proppants can resist the high stress of the closing fractures. Commonly applied proppants are quartz sands, used only for shallow reservoirs and fissile shales (in the USA), whereas ceramic granules are suitable for gas extraction at great depths under hard geomechanical conditions (in Europe), increasing output by as much as 30 - 50%. Compared with other propping materials, this kind of proppant excels in mechanical strength, smoother surface, lower solubility in acids and high stability in water. Such parameters can be achieved through proper selection of raw materials for proppant production. The Polish ceramic proppants are produced from natural resources such as kaolin, bauxite and white clay mixed with water and binders. Afterwards, the slurries are granulated in a mechanical granulator and sintered at high temperatures (1200 - 1550°C). Considering the presence of geomechanical barriers that prevent fracture propagation beyond shale formations, it is crucial to determine the quality of the natural deposits used. The next step is to optimize proppant production and select the best kind of granules, which was the aim of this research. The utility of the raw materials was estimated on the basis of their particle size distribution, bulk density, specific surface area (BET) and thermal analysis (thermogravimetry). Morphology and shape were determined by Scanning Electron Microscopy (SEM

  14. Anaphora and Logical Form: On Formal Meaning Representations for Natural Language. Technical Report No. 36. (United States)

    Nash-Webber, Bonnie; Reiter, Raymond

    This paper describes a computational approach to certain problems of anaphora in natural language and argues in favor of formal meaning representation languages (MRLs) for natural language. After presenting arguments in favor of formal meaning representation languages, appropriate MRLs are discussed. Minimal requirements include provisions for…

  15. Natural Language Processing Technologies in Radiology Research and Clinical Applications (United States)

    Cai, Tianrun; Giannopoulos, Andreas A.; Yu, Sheng; Kelil, Tatiana; Ripley, Beth; Kumamaru, Kanako K.; Rybicki, Frank J.


    The migration of imaging reports to electronic medical record systems holds great potential in terms of advancing radiology research and practice by leveraging the large volume of data continuously being updated, integrated, and shared. However, there are significant challenges as well, largely due to the heterogeneity of how these data are formatted. Indeed, although there is movement toward structured reporting in radiology (ie, hierarchically itemized reporting with use of standardized terminology), the majority of radiology reports remain unstructured and use free-form language. To effectively “mine” these large datasets for hypothesis testing, a robust strategy for extracting the necessary information is needed. Manual extraction of information is a time-consuming and often unmanageable task. “Intelligent” search engines that instead rely on natural language processing (NLP), a computer-based approach to analyzing free-form text or speech, can be used to automate this data mining task. The overall goal of NLP is to translate natural human language into a structured format (ie, a fixed collection of elements), each with a standardized set of choices for its value, that is easily manipulated by computer programs to (among other things) order into subcategories or query for the presence or absence of a finding. The authors review the fundamentals of NLP and describe various techniques that constitute NLP in radiology, along with some key applications. ©RSNA, 2016 PMID:26761536

  16. Natural Language Processing Technologies in Radiology Research and Clinical Applications. (United States)

    Cai, Tianrun; Giannopoulos, Andreas A; Yu, Sheng; Kelil, Tatiana; Ripley, Beth; Kumamaru, Kanako K; Rybicki, Frank J; Mitsouras, Dimitrios


    The migration of imaging reports to electronic medical record systems holds great potential in terms of advancing radiology research and practice by leveraging the large volume of data continuously being updated, integrated, and shared. However, there are significant challenges as well, largely due to the heterogeneity of how these data are formatted. Indeed, although there is movement toward structured reporting in radiology (ie, hierarchically itemized reporting with use of standardized terminology), the majority of radiology reports remain unstructured and use free-form language. To effectively "mine" these large datasets for hypothesis testing, a robust strategy for extracting the necessary information is needed. Manual extraction of information is a time-consuming and often unmanageable task. "Intelligent" search engines that instead rely on natural language processing (NLP), a computer-based approach to analyzing free-form text or speech, can be used to automate this data mining task. The overall goal of NLP is to translate natural human language into a structured format (ie, a fixed collection of elements), each with a standardized set of choices for its value, that is easily manipulated by computer programs to (among other things) order into subcategories or query for the presence or absence of a finding. The authors review the fundamentals of NLP and describe various techniques that constitute NLP in radiology, along with some key applications. ©RSNA, 2016.

  17. Discovery of Kolmogorov Scaling in the Natural Language

    Directory of Open Access Journals (Sweden)

    Maurice H. P. M. van Putten


    Full Text Available We consider the rate R and variance $\sigma^2$ of Shannon information in snippets of text based on word frequencies in the natural language. We empirically identify Kolmogorov’s scaling law in $\sigma^2 \propto k^{-1.66 \pm 0.12}$ (95% c.l.) as a function of $k = 1/N$ measured by word count N. This result highlights a potential association of information flow in snippets, analogous to energy cascade in turbulent eddies in fluids at high Reynolds numbers. We propose R and $\sigma^2$ as robust utility functions for objective ranking of concordances in efficient search for maximal information seamlessly across different languages and as a starting point for artificial attention.

  18. 'Fly Like This': Natural Language Interface for UAV Mission Planning (United States)

    Chandarana, Meghan; Meszaros, Erica L.; Trujillo, Anna; Allen, B. Danette


    With the increasing presence of unmanned aerial vehicles (UAVs) in everyday environments, the user base of these powerful and potentially intelligent machines is expanding beyond exclusively highly trained vehicle operators to include non-expert system users. Scientists seeking to augment costly and often inflexible methods of data collection historically used are turning towards lower cost and reconfigurable UAVs. These new users require more intuitive and natural methods for UAV mission planning. This paper explores two natural language interfaces - gesture and speech - for UAV flight path generation through individual user studies. Subjects who participated in the user studies also used a mouse-based interface for a baseline comparison. Each interface allowed the user to build flight paths from a library of twelve individual trajectory segments. Individual user studies evaluated performance, efficacy, and ease-of-use of each interface using background surveys, subjective questionnaires, and observations on time and correctness. Analysis indicates that natural language interfaces are promising alternatives to traditional interfaces. The user study data collected on the efficacy and potential of each interface will be used to inform future intuitive UAV interface design for non-expert users.

  19. Deviations in the Zipf and Heaps laws in natural languages

    International Nuclear Information System (INIS)

    Bochkarev, Vladimir V; Lerner, Eduard Yu; Shevlyakova, Anna V


    This paper is devoted to verifying the empirical Zipf and Heaps laws in natural languages using Google Books Ngram corpus data. The connection between the Zipf and Heaps laws, which predicts a power-law dependence of vocabulary size on text size, is discussed. In fact, the Heaps exponent in this dependence varies as the text corpus grows. To explain this, the results obtained are compared with a probabilistic model of text generation. Quasi-periodic variations with characteristic time periods of 60-100 years were also found.
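
    The power-law dependence of vocabulary size on text size mentioned above is commonly written V(n) = K n^β. The Python sketch below estimates K and β from a token sequence with an ordinary least-squares fit in log-log space. It is only an illustration of the idea: the function names are invented here, and the fitting procedure is an assumption, not the method used by the authors.

```python
import math


def vocabulary_growth(tokens):
    """Return (n, V(n)) pairs: V(n) is the number of distinct
    tokens among the first n tokens."""
    seen = set()
    growth = []
    for n, token in enumerate(tokens, start=1):
        seen.add(token)
        growth.append((n, len(seen)))
    return growth


def fit_heaps(growth):
    """Fit log V = log K + beta * log n by least squares.
    Needs at least two (n, V) points; returns (K, beta)."""
    xs = [math.log(n) for n, _ in growth]
    ys = [math.log(v) for _, v in growth]
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    beta = sxy / sxx
    k = math.exp(mean_y - beta * mean_x)
    return k, beta


# Limiting case: every token is new, so V(n) = n and beta should be 1.
k_all_new, beta_all_new = fit_heaps(vocabulary_growth([str(i) for i in range(100)]))
# Opposite limit: one token repeated, so V(n) = 1 and beta should be 0.
k_one, beta_one = fit_heaps(vocabulary_growth(["the"] * 100))
```

    Fitting the same function on successively larger prefixes of a corpus is one simple way to observe whether the estimated exponent drifts with corpus size.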

  20. Context and Natural Language in Formal Concept Analysis

    DEFF Research Database (Denmark)

    Wray, Tim; Eklund, Peter


    CollectionWeb is a framework that uses Formal Concept Analysis (FCA) to link contextually related objects within museum collections. These connections are used to drive a number of user interactions that are intended to promote exploration and discovery. The idea is based on museological perspect...... narratives based on conceptual pathways. The framework has been applied to a number of user facing applications and provides insights on how FCA and natural language pipelines can be used to provide contextual, linked navigation within museum collections....

  1. VnCoreNLP: A Vietnamese Natural Language Processing Toolkit


    Vu, Thanh; Nguyen, Dat Quoc; Nguyen, Dai Quoc; Dras, Mark; Johnson, Mark


    We present an easy-to-use and fast toolkit, namely VnCoreNLP---a Java NLP annotation pipeline for Vietnamese. Our VnCoreNLP supports key natural language processing (NLP) tasks including word segmentation, part-of-speech (POS) tagging, named entity recognition (NER) and dependency parsing, and obtains state-of-the-art (SOTA) results for these tasks. We release VnCoreNLP to provide rich linguistic annotations to facilitate research work on Vietnamese NLP. Our VnCoreNLP is open-source under GPL...

  2. The need for verification of the Polish lignite deposits owing to development and nature conservation protection on land at the surface

    Directory of Open Access Journals (Sweden)

    Naworyta Wojciech


    Full Text Available Poland is a country rich in lignite. The area where the lignite occurs occupies approx. 22% of the total surface area of the country. Geological resources of Polish lignite deposits are estimated at 23.5 billion Mg, but in the majority (69%) the accuracy of their identification is poor. Nevertheless the amount of coal in Polish deposits allows - at least in theory - for mining and energy production at the current level for hundreds of years to come. It is an important raw material for the energy security of the country both currently and in the future. Because the vast majority of Polish and foreign mines use an open pit method for lignite extraction the actual amount of mineral available for the extraction depends not only on the properties of the deposit but to a large extent on the method of development of the surface land above the deposit, as well as on the sensitivity of the environment in the vicinity of any future mines. After careful analysis it can be stated that only a few of the lignite deposits may be subject to cost-effective mining operations. These deposits should be subjected to special protection as a future resource base which will ensure the energy security of the country. Some examples of domestic deposits are presented which, owing to conflicts arising from the development of the area, should be deleted from the Balance Sheet of Mineral Deposits because their exploitation is irrational and uneconomic. Keeping such deposits in the Balance Sheet, and the use of large numbers in the context of their resource base leads to an unwarranted sense of wealth which consequently does not encourage the protection of those deposits that may actually be subject to rational exploitation in the near future. In summary there is a need to find a compromise in order to adequately protect all natural resources including mineral deposits.

  3. Natural, social, economical and political influences on fisheries: a review of the transitional area of the Polish waters of the Vistula Lagoon. (United States)

    Psuty, Iwona


    A 60 year (1948-2007) dataset gathered by Polish researchers working on the Vistula Lagoon fish assemblages and fisheries has shown this stressed transitional environment to have always been dominated by a few highly abundant fish species. During this period, the surrounding countries Poland and Russia (Kaliningrad) were transformed from centrally-planned economies with fixed prices to free market systems. The organization of the fishery evolved from one in which the majority of the fishing effort was expended by cooperatives, to one which was characterized by individual economic activity. The fishing gear deployed also evolved from cotton to monofilament, as well as from large sailing vessels with small-sized pair trawls to fyke nets targeting eel (Anguilla anguilla) and pound nets targeting herring (Clupea harengus). Small-sized gillnets targeting perch (Perca fluviatilis) grew in popularity as eel and pikeperch (Sander lucioperca) catches decreased. Cooperation between Polish and Russian fishery managers began in 1952 with the aim of implementing joint agreements to establish protection guidelines. The substantial nutrient loads into the lagoon in 1970 and 1980 put very large pressure on the environment, and contributed to the loss of macrophytes as well as the development of non-commercial fish populations. One of the consequences of these changes was the rapid growth of a black cormorant (Phalacrocorax carbo) breeding colony. These multi-faceted changes are considered to be the factors that have influenced the exploitation of fish assemblages in the Polish part of the Vistula Lagoon. The most evident change in the fish assemblage structure during the study period was the permanent decrease in the basin's top predators--pike (Esox lucius) and pikeperch. Eel stocking was initiated in 1970 following a crucial decline in yield from natural recruitment, and stocking was successful in increasing eel abundance. Copyright 2010 Elsevier Ltd. All rights reserved.

  4. Natural mineral bottled waters available on the Polish market as a source of minerals for the consumers. Part 2: The intake of sodium and potassium. (United States)

    Gątarska, Anna; Ciborska, Joanna; Tońska, Elżbieta

    Natural mineral waters are purchased and consumed according to consumer preferences and possible recommendations. The choice of appropriate water should take into account not only the general level of mineralization but also the content of individual components, including electrolytes such as sodium and potassium. Sodium is necessary to ensure the proper physiological functions of the body. It is defined as a health risk factor only when its excessive intake occurs. Potassium acts antagonistically towards sodium and calcium ions, contributes to a reduction of the volume of extracellular fluids and at the same time reduces muscle tension and permeability of cell membranes. The demand for sodium and potassium is of particular importance in people expending significant physical effort, where an increased electrolyte supply is recommended. The aim of the study was to estimate the content of sodium and potassium in natural mineral waters available in the Polish market and to evaluate the intake of those components with the commercially available mineral waters by different groups of consumers at the assumed volume of their consumption. The research material consisted of natural mineral waters of forty various brands available on the Polish market. The examined products were either produced in Poland or originated in other European countries. Among the products under examination, about 30% of the waters were imported from Lithuania, Latvia, the Czech Republic, France, Italy and Germany. A sample for analyses consisted of two package units of the examined water from different production lots. Samples for research were collected at random. The study was conducted with the same samples in which calcium and magnesium content was determined, which was the subject of the first part of the study. The content of sodium and potassium was determined using the emission technique (acetylene-air flame), with the use of atomic absorption spectrometer – ICE 3000 SERIES – THERMO

  5. Does textual feedback hinder spoken interaction in natural language? (United States)

    Le Bigot, Ludovic; Terrier, Patrice; Jamet, Eric; Botherel, Valerie; Rouet, Jean-Francois


    The aim of the study was to determine the influence of textual feedback on the content and outcome of spoken interaction with a natural language dialogue system. More specifically, the assumption that textual feedback could disrupt spoken interaction was tested in a human-computer dialogue situation. In total, 48 adult participants, familiar with the system, had to find restaurants based on simple or difficult scenarios using a real natural language service system in a speech-only (phone), speech plus textual dialogue history (multimodal) or text-only (web) modality. The linguistic contents of the dialogues differed as a function of modality, but were similar whether the textual feedback was included in the spoken condition or not. These results add to burgeoning research efforts on multimodal feedback, in suggesting that textual feedback may have little or no detrimental effect on information searching with a real system. STATEMENT OF RELEVANCE: The results suggest that adding textual feedback to interfaces for human-computer dialogue could enhance spoken interaction rather than create interference. The literature currently suggests that adding textual feedback to tasks that depend on the visual sense benefits human-computer interaction. The addition of textual output when the spoken modality is heavily taxed by the task was investigated.

  6. Natural Language Processing in Radiology: A Systematic Review. (United States)

    Pons, Ewoud; Braun, Loes M M; Hunink, M G Myriam; Kors, Jan A


    Radiological reporting has generated large quantities of digital content within the electronic health record, which is potentially a valuable source of information for improving clinical care and supporting research. Although radiology reports are stored for communication and documentation of diagnostic imaging, harnessing their potential requires efficient and automated information extraction: they exist mainly as free-text clinical narrative, from which it is a major challenge to obtain structured data. Natural language processing (NLP) provides techniques that aid the conversion of text into a structured representation, and thus enables computers to derive meaning from human (ie, natural language) input. Used on radiology reports, NLP techniques enable automatic identification and extraction of information. By exploring the various purposes for their use, this review examines how radiology benefits from NLP. A systematic literature search identified 67 relevant publications describing NLP methods that support practical applications in radiology. This review takes a close look at the individual studies in terms of tasks (ie, the extracted information), the NLP methodology and tools used, and their application purpose and performance results. Additionally, limitations, future challenges, and requirements for advancing NLP in radiology will be discussed. ©RSNA, 2016 Online supplemental material is available for this article.

  7. Suicide Note Classification Using Natural Language Processing: A Content Analysis

    Directory of Open Access Journals (Sweden)

    John Pestian


    Full Text Available Suicide is the second leading cause of death among 25–34 year olds and the third leading cause of death among 15–25 year olds in the United States. In the Emergency Department, where suicidal patients often present, estimating the risk of repeated attempts is generally left to clinical judgment. This paper presents our second attempt to determine the role of computational algorithms in understanding a suicidal patient’s thoughts, as represented by suicide notes. We focus on developing methods of natural language processing that distinguish between genuine and elicited suicide notes. We hypothesize that machine learning algorithms can categorize suicide notes as well as mental health professionals and psychiatric physician trainees do. The data used are comprised of suicide notes from 33 suicide completers and matched to 33 elicited notes from healthy control group members. Eleven mental health professionals and 31 psychiatric trainees were asked to decide if a note was genuine or elicited. Their decisions were compared to nine different machine-learning algorithms. The results indicate that trainees accurately classified notes 49% of the time, mental health professionals accurately classified notes 63% of the time, and the best machine learning algorithm accurately classified the notes 78% of the time. This is an important step in developing an evidence-based predictor of repeated suicide attempts because it shows that natural language processing can aid in distinguishing between classes of suicidal notes.

  8. Suicide Note Classification Using Natural Language Processing: A Content Analysis. (United States)

    Pestian, John; Nasrallah, Henry; Matykiewicz, Pawel; Bennett, Aurora; Leenaars, Antoon


    Suicide is the second leading cause of death among 25-34 year olds and the third leading cause of death among 15-25 year olds in the United States. In the Emergency Department, where suicidal patients often present, estimating the risk of repeated attempts is generally left to clinical judgment. This paper presents our second attempt to determine the role of computational algorithms in understanding a suicidal patient's thoughts, as represented by suicide notes. We focus on developing methods of natural language processing that distinguish between genuine and elicited suicide notes. We hypothesize that machine learning algorithms can categorize suicide notes as well as mental health professionals and psychiatric physician trainees do. The data used are comprised of suicide notes from 33 suicide completers and matched to 33 elicited notes from healthy control group members. Eleven mental health professionals and 31 psychiatric trainees were asked to decide if a note was genuine or elicited. Their decisions were compared to nine different machine-learning algorithms. The results indicate that trainees accurately classified notes 49% of the time, mental health professionals accurately classified notes 63% of the time, and the best machine learning algorithm accurately classified the notes 78% of the time. This is an important step in developing an evidence-based predictor of repeated suicide attempts because it shows that natural language processing can aid in distinguishing between classes of suicidal notes.

  9. Advanced applications of natural language processing for performing information extraction

    CERN Document Server

    Rodrigues, Mário


    This book explains how to create information extraction (IE) applications that are able to tap the vast amount of relevant information available in natural language sources: Internet pages, official documents such as laws and regulations, books and newspapers, and the social web. Readers are introduced to the problem of IE and its current challenges and limitations, supported with examples. The book discusses the need to fill the gap between documents, data, and people, and provides a broad overview of the technology supporting IE. The authors present a generic architecture for developing systems that are able to learn how to extract relevant information from natural language documents, and illustrate how to implement working systems using state-of-the-art and freely available software tools. The book also discusses concrete applications illustrating IE uses. · Provides an overview of state-of-the-art technology in information extraction (IE), discussing achievements and limitations for t...

  10. Polish-Bulgarian-Russian, Bulgarian-Polish-Russian or Russian-Bulgarian-Polish dictionary?

    Directory of Open Access Journals (Sweden)

    Violetta Koseska-Toszewa


    Full Text Available Polish-Bulgarian-Russian, Bulgarian-Polish-Russian or Russian-Bulgarian-Polish dictionary? The trilingual dictionary (M. Duszkin, V. Koseska, J. Satoła and A. Tzoneva) is being elaborated based on a working Polish-Bulgarian-Russian electronic parallel corpus authored by Maksim Duszkin, Violetta Koseska-Toszewa and Joanna Satoła-Staśkowiak, and works by A. Tzoneva. It is the first corpus comparing languages belonging to three different Slavic language groups: western, southern and eastern. Work on the dictionary is based on Gramatyka konfrontatywna bułgarsko-polska (Bulgarian-Polish confrontative grammar) and the semantic-oriented interlanguage proposed there. Two types of classifiers have been introduced into the dictionary: classic and semantic. The trilingual dictionary will present a consistent and homogeneous set of facts of grammar and semantics. The authors point out that in a traditional dictionary it is not clear, for example, whether aspect should be understood as the imperfective / perfective form of a verb or as its meaning. Therefore in the dictionary forms and meaning are separated in a regular way. The imperfective verb form has two meanings: state, and configuration of states and events culminating in state. The perfective verb form also has two meanings: event, and configuration of states and events culminating in event. These meanings are described by the semantic classifiers, respectively, state and event, state1 and event1. The way of describing language units mentioned in the article makes it possible to present language material (Polish, Bulgarian, Russian) in any required order, hence the article’s title.

  11. The social nature of health and illness--evolution of research approaches in Polish classical medical sociology. (United States)

    Piątkowski, Włodzimierz; Skrzypek, Michał


    The cognitive identity of medical sociology has developed in a historical perspective in the context of a specific double frame of reference comprising medicine and general sociology. The purpose of this study is to reconstruct the process of the development of the subdiscipline's research specificity in Poland, drawing attention to the general-sociological context of the conceptualization of basic interpretive and analytical sociomedical categories. In this aspect, the presented study is based on the analysis of Polish sociomedical and general-sociological research published from the early 1960s until 1989. The purpose of the study is also to describe in this perspective the structure of the research field of contemporary Western medical sociology, which was a major point of reference in this process. A look at the chronology of how the scientific identity of medical sociology developed in Poland from a historical perspective shows the gradual balancing-out of the subdiscipline's medical references, typical of the early stage of its development, and manifested in the implementation of research projects for the requirements of doctors, through consistently developed and cultivated connections with general sociology manifested in complementing the knowledge of society with aspects related to health and illness. A sine qua non condition for undertaking this scope of research was to work out strictly sociological formulations of these concepts, which was accomplished as a result of the successful reception of general sociology by the subdiscipline in question. 
    The contemporary understanding of the research field of Polish medical sociology defined by Magdalena Sokołowska and developed as part of the 'school of medical sociology', which she initiated, is characterized by the maintenance of close relations with general sociology (affiliations of sociomedical departments in academic sociological institutions, etc.), and at the same time, by partnership cooperation with

  12. Nature of phonological delay in children with specific language impairment. (United States)

    Orsolini, M; Sechi, E; Maronato, C; Bonvino, E; Corcelli, A


    This study investigated the nature of phonological delay in a group of children with specific language impairment. It was asked whether phonological errors in this group of children were generated by a slow but normal language learning process or whether they reflected a selective impairment in some representations that enhance normal acquisition and use of a language phonology. A group of 10 children with SLI (mean age = 5.1) was compared with three groups of normal children who were matched in age (age control group, mean age = 5.1), in sentence comprehension and recalling (grammar control group, mean age = 3.7), or who exhibited a phonological performance lower than the age average (group with low phonological performance, mean age = 4.4). The four groups of children were assessed in terms of: (1) responses to a mispronunciation detection task; and (2) error profiles with complex and simple syllabic structures. Performance on the mispronunciation detection task showed that the group with SLI could distinguish a target lexical item from acoustic non-word stimuli that were highly similar to it in terms of phonetic characteristics. An analysis of overall error rate at this task showed, however, that four children with SLI had a much lower performance than normal children of the same age, even when the auditory stimuli were tokens of the target word, or non-words that were phonetically different from the target. A difficulty in coordinating vocal actions in an articulatory plan accounted for error profiles with simple syllabic structures both for some children with SLI and normal children with phonological performance lower than the age average. A severe difficulty with representing complex syllabic structures was a homogeneous characteristic of the group with SLI and worked as the main indicator of impaired, rather than simply slow, phonological development.

  13. Neurolinguistics and psycholinguistics as a basis for computer acquisition of natural language

    Energy Technology Data Exchange (ETDEWEB)

    Powers, D.M.W.


    Research into natural language understanding systems for computers has concentrated on implementing particular grammars and grammatical models of the language concerned. This paper presents a rationale for research into natural language understanding systems based on neurological and psychological principles. Important features of the approach are that it seeks to place the onus of learning the language on the computer, and that it seeks to make use of the vast wealth of relevant psycholinguistic and neurolinguistic theory. 22 references.

  14. Connectionist natural language parsing with BrainC (United States)

    Mueller, Adrian; Zell, Andreas


    A close examination of pure neural parsers shows that they either could not guarantee the correctness of their derivations or had to hard-code seriality into the structure of the net. The authors therefore decided to use a hybrid architecture, consisting of a serial parsing algorithm and a trainable net. The system fulfills the following design goals: (1) parsing of sentences without length restriction, (2) soundness and completeness for any context-free language, and (3) learning the applicability of parsing rules with a neural network to increase the efficiency of the whole system. BrainC (backtracking and backpropagation in C) combines the well-known shift-reduce parsing technique with backtracking with a backpropagation network to learn and represent typical structures of the trained natural language grammars. The system has been implemented as a subsystem of the Rochester Connectionist Simulator (RCS) on SUN workstations and was tested with several grammars for English and German. The design of the system and then the results are discussed.
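
    The serial half of such a hybrid can be sketched as a shift-reduce recognizer with backtracking. The Python sketch below is only an illustration under stated assumptions: the toy grammar, the function name recognize and the fixed rule order are invented here, and no neural network is involved. In the system described above, a trained net would instead suggest which rules are applicable, so that fewer dead ends are explored.

```python
# Toy grammar (hypothetical): each rule is (left-hand side, right-hand side).
# The sketch assumes no empty right-hand sides and no cyclic unit rules,
# otherwise the recursion below might not terminate.
RULES = [
    ("S", ("NP", "VP")),
    ("NP", ("Det", "N")),
    ("VP", ("V", "NP")),
    ("Det", ("the",)),
    ("N", ("dog",)),
    ("N", ("cat",)),
    ("V", ("sees",)),
]


def recognize(tokens, rules=RULES, start="S", stack=()):
    """Return True if the token tuple can be reduced to the start symbol."""
    # Success: all input consumed and only the start symbol is left.
    if not tokens and stack == (start,):
        return True
    # Try every reduction whose right-hand side matches the top of the stack.
    for lhs, rhs in rules:
        n = len(rhs)
        if n and stack[-n:] == rhs:
            if recognize(tokens, rules, start, stack[:-n] + (lhs,)):
                return True
            # Otherwise fall through: this is the backtracking step.
    # Try shifting the next input token onto the stack.
    if tokens:
        return recognize(tokens[1:], rules, start, stack + (tokens[0],))
    return False


accepted = recognize(("the", "dog", "sees", "the", "cat"))
rejected = recognize(("dog", "the", "sees"))
```

    Because every reduce and shift alternative is eventually tried, the recognizer accepts exactly the sentences derivable from the toy grammar, at the cost of potentially exploring many failing branches.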

  15. Natural language acquisition in large scale neural semantic networks (United States)

    Ealey, Douglas

    This thesis puts forward the view that a purely signal-based approach to natural language processing is both plausible and desirable. By questioning the veracity of symbolic representations of meaning, it argues for a unified, non-symbolic model of knowledge representation that is both biologically plausible and, potentially, highly efficient. Processes to generate a grounded, neural form of this model (dubbed the semantic filter) are discussed. The combined effects of local neural organisation, coincident with perceptual maturation, are used to hypothesise its nature. This theoretical model is then validated in light of a number of fundamental neurological constraints and milestones. The mechanisms of semantic and episodic development that the model predicts are then used to explain linguistic properties, such as propositions and verbs, syntax and scripting. To mimic the growth of locally densely connected structures upon an unbounded neural substrate, a system is developed that can grow arbitrarily large, data-dependent structures composed of individual self-organising neural networks. The maturational nature of the data used results in a structure in which the perception of concepts is refined by the networks, but demarcated by subsequent structure. As a consequence, the overall structure shows significant memory and computational benefits, as predicted by the cognitive and neural models. Furthermore, the localised nature of the neural architecture also avoids the increasing error sensitivity and redundancy of traditional systems as the training domain grows. The semantic and episodic filters have been demonstrated to perform as well as, or better than, more specialist networks, whilst using significantly larger vocabularies, more complex sentence forms and more natural corpora.

  16. Behind the scenes: A medical natural language processing project. (United States)

    Wu, Joy T; Dernoncourt, Franck; Gehrmann, Sebastian; Tyler, Patrick D; Moseley, Edward T; Carlson, Eric T; Grant, David W; Li, Yeran; Welt, Jonathan; Celi, Leo Anthony


    Advancement of Artificial Intelligence (AI) capabilities in medicine can help address many pressing problems in healthcare. However, AI research endeavors in healthcare may not be clinically relevant, may have unrealistic expectations, or may not be explicit enough about their limitations. A diverse and well-functioning multidisciplinary team (MDT) can help identify appropriate and achievable AI research agendas in healthcare, and advance medical AI technologies by developing AI algorithms as well as addressing the shortage of appropriately labeled datasets for machine learning. In this paper, our team of engineers, clinicians and machine learning experts share their experience and lessons learned from their two-year-long collaboration on a natural language processing (NLP) research project. We highlight specific challenges encountered in cross-disciplinary teamwork, dataset creation for NLP research, and expectation setting for current medical AI technologies. Copyright © 2017. Published by Elsevier B.V.

  17. Natural language processing in biomedicine: a unified system architecture overview. (United States)

    Doan, Son; Conway, Mike; Phuong, Tu Minh; Ohno-Machado, Lucila


    In contemporary electronic medical records much of the clinically important data (signs and symptoms, symptom severity, disease status, etc.) are not provided in structured data fields but rather are encoded in clinician-generated narrative text. Natural language processing (NLP) provides a means of unlocking this important data source for applications in clinical decision support, quality assurance, and public health. This chapter provides an overview of representative NLP systems in biomedicine based on a unified architectural view. A general architecture in an NLP system consists of two main components: background knowledge that includes biomedical knowledge resources and a framework that integrates NLP tools to process text. Systems differ in both components, which we review briefly. Additionally, the challenge facing current research efforts in biomedical NLP includes the paucity of large, publicly available annotated corpora, although initiatives that facilitate data sharing, system evaluation, and collaborative work between researchers in clinical NLP are starting to emerge.

  18. Natural Language Based Multimodal Interface for UAV Mission Planning (United States)

    Chandarana, Meghan; Meszaros, Erica L.; Trujillo, Anna; Allen, B. Danette


    As the number of viable applications for unmanned aerial vehicle (UAV) systems increases at an exponential rate, interfaces that reduce the reliance on highly skilled engineers and pilots must be developed. Recent work aims to make use of common human communication modalities such as speech and gesture. This paper explores a multimodal natural language interface that uses a combination of speech and gesture input modalities to build complex UAV flight paths by defining trajectory segment primitives. Gesture inputs are used to define the general shape of a segment while speech inputs provide additional geometric information needed to fully characterize a trajectory segment. A user study is conducted in order to evaluate the efficacy of the multimodal interface.

  19. Pattern Recognition and Natural Language Processing: State of the Art

    Directory of Open Access Journals (Sweden)

    Mirjana Kocaleva


    Full Text Available Development of information technologies is growing steadily. With the latest developments in software technologies and the application of artificial intelligence and machine learning methods embedded in computers, the expectation is that in the near future computers will be able to solve problems themselves, as people do. Artificial intelligence emulates human behavior on computers. Rather than executing instructions one by one, as they are programmed, machine learning employs prior experience/data that is used in the process of the system's training. In this state-of-the-art paper, common methods in AI, such as machine learning, pattern recognition and natural language processing (NLP), are discussed. The standard architecture of an NLP processing system and the level that is needed for understanding NLP are also given. Lastly, statistical NLP processing and multi-word expressions are described.

  20. Constructing Concept Schemes From Astronomical Telegrams Via Natural Language Clustering (United States)

    Graham, Matthew; Zhang, M.; Djorgovski, S. G.; Donalek, C.; Drake, A. J.; Mahabal, A.


    The rapidly emerging field of time domain astronomy is one of the most exciting and vibrant new research frontiers, ranging in scientific scope from studies of the Solar System to extreme relativistic astrophysics and cosmology. It is being enabled by a new generation of large synoptic digital sky surveys - LSST, PanStarrs, CRTS - that cover large areas of sky repeatedly, looking for transient objects and phenomena. One of the biggest challenges facing these is the automated classification of transient events, a process that needs machine-processible astronomical knowledge. Semantic technologies enable the formal representation of concepts and relations within a particular domain. ATELs are a commonly-used means for reporting and commenting upon new astronomical observations of transient sources (supernovae, stellar outbursts, blazar flares, etc). However, they are loose and unstructured and employ scientific natural language for description: this makes automated processing of them - a necessity within the next decade with petascale data rates - a challenge. Nevertheless they represent a potentially rich corpus of information that could lead to new and valuable insights into transient phenomena. This project lies in the cutting-edge field of astrosemantics, a branch of astroinformatics, which applies semantic technologies to astronomy. The ATELs have been used to develop an appropriate concept scheme - a representation of the information they contain - for transient astronomy using hierarchical clustering of processed natural language. This allows us to automatically organize ATELs based on the vocabulary used. We conclude that we can use simple algorithms to process and extract meaning from astronomical textual data.
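
    As a rough illustration of organizing short reports by the vocabulary they use, the sketch below runs a small agglomerative (bottom-up hierarchical) clustering in plain Python. The Jaccard distance, the single-linkage rule, the stopping threshold and the function names are all assumptions made for this example; the abstract does not say which distance or linkage the project used.

```python
import re


def vocabulary(text):
    """Reduce a report to its set of lowercase word tokens."""
    return set(re.findall(r"[a-z]+", text.lower()))


def jaccard_distance(a, b):
    """One minus the share of words the two vocabularies have in common."""
    union = a | b
    return 1.0 - len(a & b) / len(union) if union else 0.0


def cluster(texts, threshold):
    """Single-linkage agglomerative clustering (illustrative choice).

    Starts with one cluster per text and repeatedly merges the two closest
    clusters until the closest pair is farther apart than `threshold`.
    Returns the clusters as sorted lists of indices into `texts`.
    """
    vocab = [vocabulary(t) for t in texts]
    clusters = [[i] for i in range(len(texts))]

    def linkage(c1, c2):
        # Distance between clusters = distance of their closest members.
        return min(jaccard_distance(vocab[i], vocab[j]) for i in c1 for j in c2)

    while len(clusters) > 1:
        pairs = [(linkage(clusters[x], clusters[y]), x, y)
                 for x in range(len(clusters))
                 for y in range(x + 1, len(clusters))]
        dist, x, y = min(pairs)
        if dist > threshold:
            break
        clusters[x] = sorted(clusters[x] + clusters[y])
        del clusters[y]
    return sorted(clusters)
```

    Called on four invented one-line reports, two about a supernova candidate and two about a blazar flare, with a threshold of 0.6, the function groups the first two and the last two together. Recording the order of the merges instead of stopping at a threshold would give the full hierarchy.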

  1. Emerging Approach of Natural Language Processing in Opinion Mining: A Review (United States)

    Kim, Tai-Hoon

    Natural language processing (NLP) is a subfield of artificial intelligence and computational linguistics. It studies the problems of automated generation and understanding of natural human languages. This paper outlines a framework for using computer and natural language techniques to help learners at various levels learn foreign languages in a computer-based learning environment. We propose some ideas for using the computer as a practical tool for learning a foreign language, where most of the courseware is generated automatically. We then describe how to build computer-based learning tools, discuss their effectiveness, and conclude with some possibilities for using on-line resources.

  2. Second-language instinct and instruction effects: nature and nurture in second-language acquisition. (United States)

    Yusa, Noriaki; Koizumi, Masatoshi; Kim, Jungho; Kimura, Naoki; Uchida, Shinya; Yokoyama, Satoru; Miura, Naoki; Kawashima, Ryuta; Hagiwara, Hiroko


    Adults seem to have greater difficulties than children in acquiring a second language (L2) because of the alleged "window of opportunity" around puberty. Postpuberty Japanese participants learned a new English rule with simplex sentences during one month of instruction, and then they were tested on "uninstructed complex sentences" as well as "instructed simplex sentences." The behavioral data show that they can acquire more knowledge than is instructed, suggesting the interweaving of nature (universal principles of grammar, UG) and nurture (instruction) in L2 acquisition. The comparison in the "uninstructed complex sentences" between post-instruction and pre-instruction using functional magnetic resonance imaging reveals a significant activation in Broca's area. Thus, this study provides new insight into Broca's area, where nature and nurture cooperate to produce L2 learners' rich linguistic knowledge. It also shows neural plasticity of adult L2 acquisition, arguing against a critical period hypothesis, at least in the domain of UG.

  3. Does Grammatical Gender Influence Perception? A Study of Polish and French Speakers

    Directory of Open Access Journals (Sweden)

    Haertlé Izabella


    Full Text Available Can the perception of a word be influenced by its grammatical gender? Can it happen that speakers of one language perceive an object to have masculine features, while speakers of another language perceive the same object to have feminine features? Previous studies suggest that this is the case, and also that there is some supra-language gender categorisation of objects as natural/feminine and artefact/masculine. This study was an attempt to replicate these findings on another population of subjects. This is the first Polish study of this kind, comparing the perceptions of objects by Polish- and French-speaking individuals. The results of this study show that grammatical gender may cue people to assess objects as masculine or feminine. However, the findings of some previous studies, that feminine features are more often ascribed to natural objects than artifacts, were not replicated.

  4. The Nature of Spanish versus English Language Use at Home (United States)

    Branum-Martin, Lee; Mehta, Paras D.; Carlson, Coleen D.; Francis, David J.; Goldenberg, Claude


    Home language experiences are important for children's development of language and literacy. However, the home language context is complex, especially for Spanish-speaking children in the United States. A child's use of Spanish or English likely ranges along a continuum, influenced by preferences of particular people involved, such as parents,…

  5. A Classification of Sentences Used in Natural Language Processing in the Military Services. (United States)

    Wittrock, Merlin C.

    Concepts in cognitive psychology are applied to the language used in military situations, and a sentence classification system for use in analyzing military language is outlined. The system is designed to be used, in part, in conjunction with a natural language query system that allows a user to access a database. The discussion of military…

  6. Understanding the Nature of Learners' Out-of-Class Language Learning Experience with Technology (United States)

    Lai, Chun; Hu, Xiao; Lyu, Boning


    Out-of-class learning with technology comprises an essential context of second language development. Understanding the nature of out-of-class language learning with technology is the initial step towards safeguarding its quality. This study examined the types of learning experiences that language learners engaged in outside the classroom and the…

  7. Automatic retrieval of bone fracture knowledge using natural language processing. (United States)

    Do, Bao H; Wu, Andrew S; Maley, Joan; Biswal, Sandip


    Natural language processing (NLP) techniques to extract data from unstructured text into formal computer representations are valuable for creating robust, scalable methods to mine data in medical documents and radiology reports. As voice recognition (VR) becomes more prevalent in radiology practice, there is opportunity for implementing NLP in real time for decision-support applications such as context-aware information retrieval. For example, as the radiologist dictates a report, an NLP algorithm can extract concepts from the text and retrieve relevant classification or diagnosis criteria or calculate disease probability. NLP can work in parallel with VR to potentially facilitate evidence-based reporting (for example, automatically retrieving the Bosniak classification when the radiologist describes a kidney cyst). For these reasons, we developed and validated an NLP system which extracts fracture and anatomy concepts from unstructured text and retrieves relevant bone fracture knowledge. We implement our NLP in an HTML5 web application to demonstrate a proof-of-concept feedback NLP system which retrieves bone fracture knowledge in real time.

  8. Intelligent Performance Analysis with a Natural Language Interface (United States)

    Juuso, Esko K.


    Performance improvement is taken as the primary goal in the asset management. Advanced data analysis is needed to efficiently integrate condition monitoring data into the operation and maintenance. Intelligent stress and condition indices have been developed for control and condition monitoring by combining generalized norms with efficient nonlinear scaling. These nonlinear scaling methodologies can also be used to handle performance measures used for management since management oriented indicators can be presented in the same scale as intelligent condition and stress indices. Performance indicators are responses of the process, machine or system to the stress contributions analyzed from process and condition monitoring data. Scaled values are directly used in intelligent temporal analysis to calculate fluctuations and trends. All these methodologies can be used in prognostics and fatigue prediction. The meanings of the variables are beneficial in extracting expert knowledge and representing information in natural language. The idea of dividing the problems into the variable specific meanings and the directions of interactions provides various improvements for performance monitoring and decision making. The integrated temporal analysis and uncertainty processing facilitates the efficient use of domain expertise. Measurements can be monitored with generalized statistical process control (GSPC) based on the same scaling functions.

  9. Arabic text preprocessing for the natural language processing applications

    International Nuclear Information System (INIS)

    Awajan, A.


    A new approach for processing vowelized and unvowelized Arabic texts in order to prepare them for Natural Language Processing (NLP) purposes is described. The developed approach is rule-based and made up of four phases: text tokenization, word light stemming, word's morphological analysis and text annotation. The first phase preprocesses the input text in order to isolate the words and represent them in a formal way. The second phase applies a light stemmer in order to extract the stem of each word by eliminating the prefixes and suffixes. The third phase is a rule-based morphological analyzer that determines the root and the morphological pattern for each extracted stem. The last phase produces an annotated text where each word is tagged with its morphological attributes. The preprocessor presented in this paper is capable of dealing with vowelized and unvowelized words, and provides the input words along with relevant linguistics information needed by different applications. It is designed to be used with different NLP applications such as machine translation, text summarization, text correction, information retrieval and automatic vowelization of Arabic Text. (author)
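
    The phases described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the author's rule set: the affix lists are a small hand-picked sample, the minimum stem length of three letters is an arbitrary guard, and the third phase (root and pattern analysis) is left out entirely. All function names are hypothetical.

```python
import re

# Illustrative affix samples only (an assumption, not the paper's rules).
PREFIXES = ["وال", "بال", "كال", "فال", "لل", "ال", "و"]
SUFFIXES = ["ها", "ان", "ات", "ون", "ين", "ية", "ه", "ة", "ي"]

# Short-vowel and related marks, U+064B to U+0652.
DIACRITICS = re.compile("[\u064B-\u0652]")


def tokenize(text):
    """Phase 1: isolate runs of Arabic letters and diacritics."""
    return re.findall("[\u0621-\u0652]+", text)


def light_stem(word):
    """Phase 2: strip at most one prefix and one suffix."""
    # Removing diacritics first lets vowelized and unvowelized
    # spellings of the same word produce the same stem.
    stem = DIACRITICS.sub("", word)
    for prefix in PREFIXES:
        if stem.startswith(prefix) and len(stem) - len(prefix) >= 3:
            stem = stem[len(prefix):]
            break
    for suffix in SUFFIXES:
        if stem.endswith(suffix) and len(stem) - len(suffix) >= 3:
            stem = stem[:-len(suffix)]
            break
    return stem


def annotate(text):
    """Phase 4, simplified: tag each word with its light stem.

    Phase 3 (rule-based root and pattern analysis) is omitted here.
    """
    return [{"word": w, "stem": light_stem(w)} for w in tokenize(text)]
```

    With these sample lists, `annotate("والكتاب")` tags the single word with the stem `كتاب`, and the fully vowelized spelling of `الكتاب` reduces to the same stem. A real light stemmer would need far larger affix tables and rules for when stripping is allowed.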

  10. Crowdsourcing and curation: perspectives from biology and natural language processing. (United States)

    Hirschman, Lynette; Fort, Karën; Boué, Stéphanie; Kyrpides, Nikos; Islamaj Doğan, Rezarta; Cohen, Kevin Bretonnel


    Crowdsourcing is increasingly utilized for performing tasks in both natural language processing and biocuration. Although there have been many applications of crowdsourcing in these fields, there have been fewer high-level discussions of the methodology and its applicability to biocuration. This paper explores crowdsourcing for biocuration through several case studies that highlight different ways of leveraging 'the crowd'; these raise issues about the kind(s) of expertise needed, the motivations of participants, and questions related to feasibility, cost and quality. The paper is an outgrowth of a panel session held at BioCreative V (Seville, September 9-11, 2015). The session consisted of four short talks, followed by a discussion. In their talks, the panelists explored the role of expertise and the potential to improve crowd performance by training; the challenge of decomposing tasks to make them amenable to crowdsourcing; and the capture of biological data and metadata through community editing. © The Author(s) 2016. Published by Oxford University Press.

  11. A common type system for clinical natural language processing

    Directory of Open Access Journals (Sweden)

    Wu Stephen T


    Full Text Available Abstract Background: One challenge in reusing clinical data stored in electronic medical records is that these data are heterogenous. Clinical Natural Language Processing (NLP) plays an important role in transforming information in clinical text to a standard representation that is comparable and interoperable. Information may be processed and shared when a type system specifies the allowable data structures. Therefore, we aim to define a common type system for clinical NLP that enables interoperability between structured and unstructured data generated in different clinical settings. Results: We describe a common type system for clinical NLP that has an end target of deep semantics based on Clinical Element Models (CEMs), thus interoperating with structured data and accommodating diverse NLP approaches. The type system has been implemented in UIMA (Unstructured Information Management Architecture) and is fully functional in a popular open-source clinical NLP system, cTAKES (clinical Text Analysis and Knowledge Extraction System) versions 2.0 and later. Conclusions: We have created a type system that targets deep semantics, thereby allowing for NLP systems to encapsulate knowledge from text and share it alongside heterogenous clinical data sources. Rather than surface semantics that are typically the end product of NLP algorithms, CEM-based semantics explicitly build in deep clinical semantics as the point of interoperability with more structured data types.

  12. A common type system for clinical natural language processing. (United States)

    Wu, Stephen T; Kaggal, Vinod C; Dligach, Dmitriy; Masanz, James J; Chen, Pei; Becker, Lee; Chapman, Wendy W; Savova, Guergana K; Liu, Hongfang; Chute, Christopher G


    One challenge in reusing clinical data stored in electronic medical records is that these data are heterogenous. Clinical Natural Language Processing (NLP) plays an important role in transforming information in clinical text to a standard representation that is comparable and interoperable. Information may be processed and shared when a type system specifies the allowable data structures. Therefore, we aim to define a common type system for clinical NLP that enables interoperability between structured and unstructured data generated in different clinical settings. We describe a common type system for clinical NLP that has an end target of deep semantics based on Clinical Element Models (CEMs), thus interoperating with structured data and accommodating diverse NLP approaches. The type system has been implemented in UIMA (Unstructured Information Management Architecture) and is fully functional in a popular open-source clinical NLP system, cTAKES (clinical Text Analysis and Knowledge Extraction System) versions 2.0 and later. We have created a type system that targets deep semantics, thereby allowing for NLP systems to encapsulate knowledge from text and share it alongside heterogenous clinical data sources. Rather than surface semantics that are typically the end product of NLP algorithms, CEM-based semantics explicitly build in deep clinical semantics as the point of interoperability with more structured data types.

  13. The language of nature matters: we need a more public ecology (United States)

    Bruce R. Hull; David P. Robertson


    The language we use to describe nature matters. It is used by policy analysts to set goals for ecological restoration and management, by scientists to describe the nature that did, does, or could exist, and by all of us to imagine possible and acceptable conditions of environmental quality. Participants in environmental decision making demand a lot of the language and...

  14. Natural Language Understanding Systems Within the A. I. Paradigm: A Survey and Some Comparisons. (United States)

    Wilks, Yorick

    The paper surveys the major projects on the understanding of natural language that fall within what may now be called the artificial intelligence paradigm of natural language systems. Some space is devoted to arguing that the paradigm is now a reality and different in significant respects from the generative paradigm of present-day linguistics.…

  15. On the neurolinguistic nature of language abnormalities in Huntington's disease. (United States)

    Wallesch, C W; Fehrenbach, R A


    Spontaneous language of 18 patients suffering from Huntington's disease and 15 dysarthric controls suffering from Friedreich's ataxia were investigated. In addition, language functions in various modalities were assessed with the Aachen Aphasia Test (AAT). The Huntington patients exhibited deficits in the syntactical complexity of spontaneous speech and in the Token Test, confrontation naming, and language comprehension subtests of the AAT, which are interpreted as resulting from their dementia. Errors affecting word access mechanisms and production of syntactical structures as such were not encountered.

  16. Sign language: its history and contribution to the understanding of the biological nature of language. (United States)

    Ruben, Robert J


    The development of conceptualization of a biological basis of language during the 20th century has come about, in part, through the appreciation of the central nervous system's ability to utilize varied sensory inputs, and particularly vision, to develop language. Sign language has been a part of the linguistic experience from prehistory to the present day. Data suggest that human language may have originated as a visual language and became primarily auditory with the later development of our voice/speech tract. Sign language may be categorized into two types. The first is used by individuals who have auditory/oral language and the signs are used for special situations, such as communication in a monastery in which there is a vow of silence. The second is used by those who do not have access to auditory/oral language, namely the deaf. The history of the two forms of sign language and the development of the concept of the biological basis of language are reviewed from the fourth century BC to the present day. Sign languages of the deaf have been recognized since at least the fourth century BC. The codification of a monastic sign language occurred in the seventh to eighth centuries AD. Probable synergy between the two forms of sign language occurred in the 16th century. Among other developments, the Abbey de L'Epée introduced, in the 18th century, an oral syntax, French, into a sign language based upon indigenous signs of the deaf and newly created signs. During the 19th century, the concept of a "critical" period for the acquisition of language developed; this was an important stimulus for the exploration of the biological basis of language. The introduction of techniques, e.g. evoked potentials and functional MRI, during the 20th century allowed study of the brain functions associated with language.

  17. Bibliography of English-Polish Contrastive Studies in Poland (as of August 1976). (United States)

    Mieszek, Aleksandra

    This bibliography lists books, articles, papers, theses and dissertations describing English-Polish contrastive studies conducted in Poland. There are 403 works listed in both languages, divided into two groups: General Works and English-Polish Contrastive Studies. (CHK)

  18. On the nature of language – Heidegger and African Philosophy ...

    African Journals Online (AJOL)

    My contention is that Heidegger's daring phenomenology of language is also found and even radicalised within the framework of African philosophy, particularly the philosophy of myth. I argue that the exploration of the relation between these views of language offers the possibility not only to expand on the conventional ...

  19. "Homo Pedagogicus": The Evolutionary Nature of Second Language Teaching (United States)

    Atkinson, Dwight


    Second language (SL) teacher educators tirelessly teach others how to teach. But how often do we actually define teaching? Without explicit definitional activity on this fundamental concept in second language teaching (SLT), it remains implicit and intuitive--the opposite of clear, productive understanding. I therefore explore the question,…

  20. Natural mineral bottled waters available on the Polish market as a source of minerals for the consumers. Part 1. Calcium and magnesium. (United States)

    Gątarska, Anna; Tońska, Elżbieta; Ciborska, Joanna


    Natural mineral waters may be an essential source of calcium, magnesium and other minerals. In bottled waters, minerals occur in an ionized form which is very well digestible. However, the concentration of minerals in underground waters (which constitute the material for the production of bottled waters) varies. In view of the above, the type of water consumed is essential. The aim of the study was to estimate the calcium and magnesium contents in products available on the market and to evaluate calcium and magnesium consumption with natural mineral water by different consumer groups with an assumed volume of the consumed product. The products represented forty different brands of natural mineral waters available on the Polish market. These waters were produced in Poland or other European countries. Among the studied products, about 30% of the waters were imported from Lithuania, Latvia, Czech Republic, France, Italy and Germany. The content of calcium and magnesium in mineral waters was determined using flame atomic absorption spectrometry in an acetylene-air flame. Further determinations were carried out using an atomic absorption spectrometer (ICE 3000 Series, Thermo, England) equipped with a GLITE data station, background correction (a deuterium lamp) as well as other cathode lamps. Over half of the analysed natural mineral waters were medium-mineralized. The natural mineral waters available on the market can be characterized by a varied content of calcium and magnesium and a high degree of product mineralization does not guarantee significant amounts of these components. Among the natural mineral waters available on the market, only a few feature the optimum calcium-magnesium proportion (2:1). Considering the mineralization degree of the studied products, it can be stated that the largest percentage of products with significant calcium and magnesium contents can be found in the high-mineralized water group.
For some natural mineral waters, the consumption of 1 litre daily may

  1. A grammar-based semantic similarity algorithm for natural language sentences. (United States)

    Lee, Ming Che; Chang, Jia Wei; Hsieh, Tung Cheng


    This paper presents a grammar and semantic corpus based similarity algorithm for natural language sentences. Natural language, in opposition to "artificial language", such as computer programming languages, is the language used by the general public for daily communication. Traditional information retrieval approaches, such as vector models, LSA, HAL, or even the ontology-based approaches that extend to include concept similarity comparison instead of cooccurrence terms/words, may not always determine the perfect matching while there is no obvious relation or concept overlap between two natural language sentences. This paper proposes a sentence similarity algorithm that takes advantage of corpus-based ontology and grammatical rules to overcome the addressed problems. Experiments on two famous benchmarks demonstrate that the proposed algorithm has a significant performance improvement in sentences/short-texts with arbitrary syntax and structure.
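
    To make the limitation of the word-overlap approaches mentioned above concrete, here is a minimal bag-of-words cosine similarity in Python. It is a sketch of the traditional vector-model baseline only, not of the grammar-based algorithm the paper proposes; the tokenizer and the function names are assumptions for illustration.

```python
import math
import re
from collections import Counter


def bag_of_words(sentence):
    """Lowercase the sentence and count its word tokens."""
    return Counter(re.findall(r"[a-z']+", sentence.lower()))


def cosine_similarity(a, b):
    """Cosine of the angle between two word-count vectors."""
    va, vb = bag_of_words(a), bag_of_words(b)
    # Only words present in both sentences contribute to the dot product.
    dot = sum(va[t] * vb[t] for t in va.keys() & vb.keys())
    sq_a = sum(c * c for c in va.values())
    sq_b = sum(c * c for c in vb.values())
    norm = math.sqrt(sq_a * sq_b)
    return dot / norm if norm else 0.0
```

    Two paraphrases with no shared words, such as "The physician examined the child" and "A doctor checked a kid", score 0 under this measure, while "the dog bit the man" and "the man bit the dog" score 1 even though their meanings differ. Both failures come from ignoring word meaning and sentence structure, which is the gap that ontology-based and grammar-based methods aim to close.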

  2. Automation of a problem list using natural language processing

    Directory of Open Access Journals (Sweden)

    Haug Peter J


    Full Text Available Abstract Background: The medical problem list is an important part of the electronic medical record in development in our institution. To serve the functions it is designed for, the problem list has to be as accurate and timely as possible. However, the current problem list is usually incomplete and inaccurate, and is often totally unused. To alleviate this issue, we are building an environment where the problem list can be easily and effectively maintained. Methods: For this project, 80 medical problems were selected for their frequency of use in our future clinical field of evaluation (cardiovascular). We have developed an Automated Problem List system composed of two main components: a background and a foreground application. The background application uses Natural Language Processing (NLP) to harvest potential problem list entries from the list of 80 targeted problems detected in the multiple free-text electronic documents available in our electronic medical record. These proposed medical problems drive the foreground application designed for management of the problem list. Within this application, the extracted problems are proposed to the physicians for addition to the official problem list. Results: The set of 80 targeted medical problems selected for this project covered about 5% of all possible diagnoses coded in ICD-9-CM in our study population (cardiovascular adult inpatients), but about 64% of all instances of these coded diagnoses. The system contains algorithms to detect first document sections, then sentences within these sections, and finally potential problems within the sentences. The initial evaluation of the section and sentence detection algorithms demonstrated a sensitivity and positive predictive value of 100% when detecting sections, and a sensitivity of 89% and a positive predictive value of 94% when detecting sentences.
Conclusion: The global aim of our project is to automate the process of creating and maintaining a problem
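
    For readers unfamiliar with the two evaluation measures reported in the results, the following Python sketch shows how sensitivity and positive predictive value are computed from detection counts. The counts in the usage note are invented for illustration and are not taken from the study.

```python
def sensitivity(true_positives, false_negatives):
    """Share of the items actually present that the detector found."""
    return true_positives / (true_positives + false_negatives)


def positive_predictive_value(true_positives, false_positives):
    """Share of the detector's outputs that are correct."""
    return true_positives / (true_positives + false_positives)
```

    As a hypothetical case, a sentence detector that finds 89 of 100 reference sentences and also emits 6 spurious ones has a sensitivity of 89/100 = 0.89 and a positive predictive value of 89/95, which rounds to 0.94.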

  3. Using Neural Networks to Generate Inferential Roles for Natural Language

    Directory of Open Access Journals (Sweden)

    Peter Blouw


    Full Text Available Neural networks have long been used to study linguistic phenomena spanning the domains of phonology, morphology, syntax, and semantics. Of these domains, semantics is somewhat unique in that there is little clarity concerning what a model needs to be able to do in order to provide an account of how the meanings of complex linguistic expressions, such as sentences, are understood. We argue that one thing such models need to be able to do is generate predictions about which further sentences are likely to follow from a given sentence; these define the sentence's “inferential role.” We then show that it is possible to train a tree-structured neural network model to generate very simple examples of such inferential roles using the recently released Stanford Natural Language Inference (SNLI) dataset. On an empirical front, we evaluate the performance of this model by reporting entailment prediction accuracies on a set of test sentences not present in the training data. We also report the results of a simple study that compares human plausibility ratings for both human-generated and model-generated entailments for a random selection of sentences in this test set. On a more theoretical front, we argue in favor of a revision to some common assumptions about semantics: understanding a linguistic expression is not only a matter of mapping it onto a representation that somehow constitutes its meaning; rather, understanding a linguistic expression is mainly a matter of being able to draw certain inferences. Inference should accordingly be at the core of any model of semantic cognition.

  4. One grammar or two? Sign Languages and the Nature of Human Language. (United States)

    Lillo-Martin, Diane C; Gajewski, Jon


    Linguistic research has identified abstract properties that seem to be shared by all languages; such properties may be considered defining characteristics. In recent decades, the recognition that human language is found not only in the spoken modality but also in the form of sign languages has led to a reconsideration of some of these potential linguistic universals. In large part, the linguistic analysis of sign languages has led to the conclusion that universal characteristics of language can be stated at an abstract enough level to include languages in both spoken and signed modalities. For example, languages in both modalities display hierarchical structure at sub-lexical and phrasal level, and recursive rule application. However, this does not mean that modality-based differences between signed and spoken languages are trivial. In this article, we consider several candidate domains for modality effects, in light of the overarching question: are signed and spoken languages subject to the same abstract grammatical constraints, or is a substantially different conception of grammar needed for the sign language case? We look at differences between language types based on the use of space, iconicity, and the possibility for simultaneity in linguistic expression. The inclusion of sign languages does support some broadening of the conception of human language, in ways that are applicable for spoken languages as well. Still, the overall conclusion is that one grammar applies for human language, no matter the modality of expression. WIREs Cogn Sci 2014, 5:387-401. doi: 10.1002/wcs.1297 This article is categorized under: Linguistics > Linguistic Theory. © 2014 The Authors. WIREs Cognitive Science published by John Wiley & Sons, Ltd.

  5. From quantum foundations via natural language meaning to a theory of everything


    Coecke, Bob


    In this paper we argue for a paradigmatic shift from 'reductionism' to 'togetherness'. In particular, we show how interaction between systems in quantum theory naturally carries over to modelling how word meanings interact in natural language. Since meaning in natural language, depending on the subject domain, encompasses discussions within any scientific discipline, we obtain a template for theories such as social interaction, animal behaviour, and many others.

  6. Comparison of natural gases accumulated in Oligocene strata with hydrous pyrolysis gases from Menilite Shales of the Polish Outer Carpathians (United States)

    Kotarba, M.J.; Curtis, John B.; Lewan, M.D.


    This study examined the molecular and isotopic compositions of gases generated from different kerogen types (i.e., Types I/II, II, IIS and III) in Menilite Shales by sequential hydrous pyrolysis experiments. The experiments were designed to simulate gas generation from source rocks at pre-oil-cracking thermal maturities. Initially, rock samples were heated in the presence of liquid water at 330 °C for 72 h to simulate early gas generation dominated by the overall reaction of kerogen decomposition to bitumen. Generated gas and oil were quantitatively collected at the completion of the experiments and the reactor with its rock and water was resealed and heated at 355 °C for 72 h. This condition simulates late petroleum generation in which the dominant overall reaction is bitumen decomposition to oil. This final heating equates to a cumulative thermal maturity of 1.6% Rr, which represents pre-oil-cracking conditions. In addition to the generated gases from these two experiments being characterized individually, they are also summed to characterize a cumulative gas product. These results are compared with natural gases produced from sandstone reservoirs within or directly overlying the Menilite Shales. The experimentally generated gases show no molecular compositions that are distinct for the different kerogen types, but on a total organic carbon (TOC) basis, oil prone kerogens (i.e., Types I/II, II and IIS) generate more hydrocarbon gas than gas prone Type III kerogen. Although the proportionality of methane to ethane in the experimental gases is lower than that observed in the natural gases, the proportionality of ethane to propane and i-butane to n-butane are similar to those observed for the natural gases. δ13C values of the experimentally generated methane, ethane and propane show distinctions among the kerogen types. This distinction is related to the δ13C of the original kerogen, with 13C enriched kerogen generating more 13C enriched hydrocarbon gases than

  7. Metal polish poisoning (United States)

    Metal polishes are used to clean metals, including brass, copper, or silver. This article discusses the harmful effects from swallowing metal polish. This article is for information only. DO NOT use ...

  8. Williamson Polishing & Plating Site (United States)

    Williamson Polishing & Plating Co. Inc. was a plating shop located in the Martindale-Brightwood neighborhood of Indianapolis. The facility conducted job shop polishing and electroplating services. The vacant site contains a 14,651-square-foot building.

  9. Towards multilingual access to textual databases in natural language

    International Nuclear Information System (INIS)

    Radwan, Khaled


    The Cross-Lingual Information Retrieval system (CLIR) or Multilingual Information Retrieval (MIR) has become the key issue in electronic documents management systems in a multinational environment. We propose here a multilingual information retrieval system consisting of a morpho-syntactic analyser, a transfer system from source language to target language and an information retrieval system. A thorough investigation into the system architecture and the transfer mechanisms is proposed in that report, using two different performance evaluation methods. (author) [fr

  10. Applications Associated With Morphological Analysis And Generation In Natural Language Processing

    Directory of Open Access Journals (Sweden)

    Neha Yadav


    Full Text Available Natural Language Processing is one of the fastest-developing fields of research. In most applications related to Natural Language Processing, the findings of Morphological Analysis and Morphological Generation can be considered very important, since morphological study is the technique used to recognise a word and its output can be used in later stages. Keeping this importance in view, this paper describes how Morphological Analysis and Morphological Generation can prove to be an important part of various Natural Language Processing fields such as spell checking, machine translation, etc.

  11. Induction of the morphology of natural language : unsupervised morpheme segmentation with application to automatic speech recognition


    Creutz, Mathias


    In order to develop computer applications that successfully process natural language data (text and speech), one needs good models of the vocabulary and grammar of as many languages as possible. According to standard linguistic theory, words consist of morphemes, which are the smallest individually meaningful elements in a language. Since an immense number of word forms can be constructed by combining a limited set of morphemes, the capability of understanding and producing new word forms dep...

  12. A Grammar-Based Semantic Similarity Algorithm for Natural Language Sentences (United States)

    Chang, Jia Wei; Hsieh, Tung Cheng


    This paper presents a grammar and semantic corpus based similarity algorithm for natural language sentences. Natural language, in opposition to “artificial language”, such as computer programming languages, is the language used by the general public for daily communication. Traditional information retrieval approaches, such as vector models, LSA, HAL, or even the ontology-based approaches that extend to include concept similarity comparison instead of cooccurrence terms/words, may not always determine the perfect matching while there is no obvious relation or concept overlap between two natural language sentences. This paper proposes a sentence similarity algorithm that takes advantage of corpus-based ontology and grammatical rules to overcome the addressed problems. Experiments on two famous benchmarks demonstrate that the proposed algorithm has a significant performance improvement in sentences/short-texts with arbitrary syntax and structure. PMID:24982952

  13. A Grammar-Based Semantic Similarity Algorithm for Natural Language Sentences

    Directory of Open Access Journals (Sweden)

    Ming Che Lee


    Full Text Available This paper presents a grammar and semantic corpus based similarity algorithm for natural language sentences. Natural language, in opposition to “artificial language”, such as computer programming languages, is the language used by the general public for daily communication. Traditional information retrieval approaches, such as vector models, LSA, HAL, or even the ontology-based approaches that extend to include concept similarity comparison instead of cooccurrence terms/words, may not always determine the perfect matching while there is no obvious relation or concept overlap between two natural language sentences. This paper proposes a sentence similarity algorithm that takes advantage of corpus-based ontology and grammatical rules to overcome the addressed problems. Experiments on two famous benchmarks demonstrate that the proposed algorithm has a significant performance improvement in sentences/short-texts with arbitrary syntax and structure.

  14. Computational Nonlinear Morphology with Emphasis on Semitic Languages. Studies in Natural Language Processing. (United States)

    Kiraz, George Anton

    This book presents a tractable computational model that can cope with complex morphological operations, especially in Semitic languages, and less complex morphological systems present in Western languages. It outlines a new generalized regular rewrite rule system that uses multiple finite-state automata to cater to root-and-pattern morphology,…

  15. Dynamic changes in network activations characterize early learning of a natural language. (United States)

    Plante, Elena; Patterson, Dianne; Dailey, Natalie S; Kyle, R Almyrde; Fridriksson, Julius


    Those who are initially exposed to an unfamiliar language have difficulty separating running speech into individual words, but over time will recognize both words and the grammatical structure of the language. Behavioral studies have used artificial languages to demonstrate that humans are sensitive to distributional information in language input, and can use this information to discover the structure of that language. This is done without direct instruction and learning occurs over the course of minutes rather than days or months. Moreover, learners may attend to different aspects of the language input as their own learning progresses. Here, we examine processing associated with the early stages of exposure to a natural language, using fMRI. Listeners were exposed to an unfamiliar language (Icelandic) while undergoing four consecutive fMRI scans. The Icelandic stimuli were constrained in ways known to produce rapid learning of aspects of language structure. After approximately 4 min of exposure to the Icelandic stimuli, participants began to differentiate between correct and incorrect sentences at above chance levels, with significant improvement between the first and last scan. An independent component analysis of the imaging data revealed four task-related components, two of which were associated with behavioral performance early in the experiment, and two with performance later in the experiment. This outcome suggests dynamic changes occur in the recruitment of neural resources even within the initial period of exposure to an unfamiliar natural language. Copyright © 2014 Elsevier Ltd. All rights reserved.

  16. The Nature of Chinese Language Classroom Learning Environments in Singapore Secondary Schools (United States)

    Chua, Siew Lian; Wong, Angela F. L.; Chen, Der-Thanq V.


    This article reports findings from a classroom environment study which was designed to investigate the nature of Chinese Language classroom environments in Singapore secondary schools. We used a perceptual instrument, the Chinese Language Classroom Environment Inventory, to investigate teachers' and students' perceptions towards their Chinese…

  17. Using the Natural Language Paradigm (NLP) to Increase Vocalizations of Older Adults with Cognitive Impairments (United States)

    LeBlanc, Linda A.; Geiger, Kaneen B.; Sautter, Rachael A.; Sidener, Tina M.


    The Natural Language Paradigm (NLP) has proven effective in increasing spontaneous verbalizations for children with autism. This study investigated the use of NLP with older adults with cognitive impairments served at a leisure-based adult day program for seniors. Three individuals with limited spontaneous use of functional language participated…

  18. Where humans meet machines innovative solutions for knotty natural-language problems

    CERN Document Server

    Markowitz, Judith


    Where Humans Meet Machines: Innovative Solutions for Knotty Natural-Language Problems brings humans and machines closer together by showing how linguistic complexities that confound the speech systems of today can be handled effectively by sophisticated natural-language technology. Some of the most vexing natural-language problems that are addressed in this book entail recognizing and processing idiomatic expressions, understanding metaphors, matching an anaphor correctly with its antecedent, performing word-sense disambiguation, and handling out-of-vocabulary words and phrases. This fourteen-chapter anthology consists of contributions from industry scientists and from academicians working at major universities in North America and Europe. They include researchers who have played a central role in DARPA-funded programs and developers who craft real-world solutions for corporations. These contributing authors analyze the role of natural language technology in the global marketplace; they explore the need f...

  19. From Monologue to Dialogue: Natural Language Generation in OVIS

    NARCIS (Netherlands)

    Theune, Mariet; Freedman, R.; Callaway, C.

    This paper describes how a language generation system that was originally designed for monologue generation, has been adapted for use in the OVIS spoken dialogue system. To meet the requirement that in a dialogue, the system’s utterances should make up a single, coherent dialogue turn, several

  20. Evolutionary explanations for natural language: criteria from evolutionary biology

    NARCIS (Netherlands)

    Zuidema, W.; de Boer, B.


    Theories of the evolutionary origins of language must be informed by empirical and theoretical results from a variety of different fields. Complementing recent surveys of relevant work from linguistics, animal behaviour and genetics, this paper surveys the requirements on evolutionary scenarios that

  1. Evolutionary Developmental Linguistics: Naturalization of the Faculty of Language (United States)

    Locke, John L.


    Since language is a biological trait, it is necessary to investigate its evolution, development, and functions, along with the mechanisms that have been set aside, and are now recruited, for its acquisition and use. It is argued here that progress toward each of these goals can be facilitated by new programs of research, carried out within a new…

  2. Validation of the Polish language version of the SF-36 Health Survey in patients suffering from lumbar spinal stenosis

    Directory of Open Access Journals (Sweden)

    Michał Kłosiński


    Full Text Available Introduction and objective. Patient-reported outcome (PRO) questionnaires have become the standard measure for treatment effectiveness after spinal surgery. One of the most widely used generic PROs is the SF-36 Health Survey. The aim of this study was to validate the SF-36 Health Survey to confirm that the tool is an acceptable and psychometrically robust measure for collecting HRQoL data in Polish patients with spinal stenosis. Materials and methods. Patients were eligible if they were above 18 years of age and had been qualified for spine surgery of the lumbar region due to either discopathy or non-traumatic spinal stenosis. All patients filled in the Polish version of the SF-36 and a demographic questionnaire. Standard validity and reliability analyses were performed. Results. A total of 192 patients (83 women, 43.2%) agreed to take part in the study (mean age: 57.5±11.4 years). In 47 patients (24.5%), ossification of the ligamenta flava was found using MRI. Cronbach’s alpha coefficients showed positive internal consistency (0.70–0.92). Interclass correlations for the SF-36 ranged from 0.72 to 0.86 and proved appropriate test-retest reliability. Satisfactory convergent and discriminant validity in multi-trait scaling analyses was seen. Conclusions. The Polish version of the SF-36 is a reliable and valid tool for measuring HRQoL in patients with spinal stenosis. It can be recommended for use in clinical and epidemiological settings in the Polish population. However, caution is warranted when interpreting the results of the ‘role limitations due to physical health problems’ and the ‘role limitations due to emotional problems’ scales because of floor and ceiling effects.

  3. Social Obstacles Towards Success of Pupils in Polish Primary Schools. (United States)

    Nakielska, Zofia

    In 1973, the Polish Minister of Education ordered objective competitions at the primary school level in the fields of Polish studies, Russian language, and math. In order to determine whether such subject competitions were justified and if they contributed equally to the development of interests and abilities among the rural and urban and…

  4. Polish natural bee honeys are anti-proliferative and anti-metastatic agents in human glioblastoma multiforme U87MG cell line.

    Directory of Open Access Journals (Sweden)

    Justyna Moskwa

    Full Text Available Honey has been used as food and a traditional medicament since ancient times. However, recently many scientists have been concentrating on the anti-oxidant, anti-proliferative, anti-inflammatory and other properties of honey. In this study, we investigated for the first time an anticancer effect of different honeys from Poland on tumor cell line - glioblastoma multiforme U87MG. Anti-proliferative activity of honeys and its interferences with temozolomide were determined by a cytotoxicity test and DNA binding by [H3]-thymidine incorporation. A gelatin zymography was used to conduct an evaluation of metalloproteinases (MMP-2 and MMP-9) expression in U87MG treatment with honey samples. The honeys were previously tested qualitatively (diastase activity, total phenolic content, lead and cadmium content). The data demonstrated that the examined honeys have a potent anti-proliferative effect on U87MG cell line in a time- and dose-dependent manner, being effective at concentrations as low as 0.5% (multifloral light honey - viability 53% after 72 h of incubation). We observed that after 48 h, combining honey with temozolomide showed a significantly higher inhibitory effect than the samples of honey alone. We observed a strong inhibition of MMP-2 and MMP-9 for the tested honeys (from 20 to 56% and from 5 to 58% compared to control, respectively). Our results suggest that Polish honeys have an anti-proliferative and anti-metastatic effect on U87MG cell line. Therefore, natural bee honey can be considered as a promising adjuvant treatment for brain tumors.

  5. The Language Faculty that Wasn't: A Usage-Based Account of Natural Language Recursion

    Directory of Open Access Journals (Sweden)

    Morten H Christiansen


    Full Text Available In the generative tradition, the language faculty has been shrinking, perhaps to include only the mechanism of recursion. This paper argues that even this view of the language faculty is too expansive. We first argue that a language faculty is difficult to reconcile with evolutionary considerations. We then focus on recursion as a detailed case study, arguing that our ability to process recursive structure does not rely on recursion as a property of the grammar, but instead emerges gradually by piggybacking on domain-general sequence learning abilities. Evidence from genetics, comparative work on non-human primates, and cognitive neuroscience suggests that humans have evolved complex sequence learning skills, which were subsequently pressed into service to accommodate language. Constraints on sequence learning therefore have played an important role in shaping the cultural evolution of linguistic structure, including our limited abilities for processing recursive structure. Finally, we re-evaluate some of the key considerations that have often been taken to require the postulation of a language faculty.

  6. Semantic similarity from natural language and ontology analysis

    CERN Document Server

    Harispe, Sébastien; Janaqi, Stefan


    Artificial Intelligence federates numerous scientific fields in the aim of developing machines able to assist human operators performing complex treatments, most of which demand high cognitive skills (e.g. learning or decision processes). Central to this quest is to give machines the ability to estimate the likeness or similarity between things in the way human beings estimate the similarity between stimuli. In this context, this book focuses on semantic measures: approaches designed for comparing semantic entities such as units of language, e.g. words, sentences, or concepts and instances def

  7. [Lysenkoism in Polish botany]. (United States)

    Köhler, Piotr


    from oral testimony, that the times of Lysenkoism were a terrible period in Polish botany, with all kinds of pressures exerted on botanists who did not adopt it. Fortunately, no Polish botanists lost their lives. The Lysenkoist period in Polish botany retarded the development of many of its branches. In the last fifty years many of the setbacks have been made up for, but it is in the biological education of the general public that Lysenkoism has had a more serious effect. Several generations of young people failed to be introduced to genetics, or at least its foundations, at any level of schooling. Instead they were inculcated with the erroneous belief of man's limitless possibilities in transforming nature, including the view that species can be shaped freely in line with economic needs. (ABSTRACT TRUNCATED)

  8. Lexical exponents of hypothetical modality in Polish and Lithuanian


    Roman Roszko


    The article focuses on the lexical exponents of hypothetical modality in Polish and Lithuanian. The purpose of comparing and contrasting the lexical exponents of hypothetical modality is not only to identify all the lexemes in both languages but also to find the answer to the following question: whether the morphological exponents of hypothetical modality (so-called modus relativus) familiar to the Lithuanian language have/...

  9. Deciphering the language of nature: cryptography, secrecy, and alterity in Francis Bacon. (United States)

    Clody, Michael C


    The essay argues that Francis Bacon's considerations of parables and cryptography reflect larger interpretative concerns of his natural philosophic project. Bacon describes nature as having a language distinct from those of God and man, and, in so doing, establishes a central problem of his natural philosophy: namely, how can the language of nature be accessed through scientific representation? Ultimately, Bacon's solution relies on a theory of differential and duplicitous signs that conceal within them the hidden voice of nature, which is best recognized in the natural forms of efficient causality. The "alphabet of nature" (those tables of natural occurrences) consequently plays a central role in his program, as it renders nature's language susceptible to a process of decryption that mirrors the model of the bilateral cipher. It is argued that while the writing of Bacon's natural philosophy strives for literality, its investigative process preserves a space for alterity within scientific representation that is made accessible to those with the interpretative key.

  10. Evaluation of uncertainty in the measurement of sense of natural language constructions

    Directory of Open Access Journals (Sweden)

    Bisikalo Oleg V.


    Full Text Available The task of evaluating uncertainty in the measurement of sense in natural language constructions (NLCs was researched through formalization of the notions of the language image, formalization of artificial cognitive systems (ACSs and the formalization of units of meaning. The method for measuring the sense of natural language constructions incorporated fuzzy relations of meaning, which ensures that information about the links between lemmas of the text is taken into account, permitting the evaluation of two types of measurement uncertainty of sense characteristics. Using developed applications programs, experiments were conducted to investigate the proposed method to tackle the identification of informative characteristics of text. The experiments resulted in dependencies of parameters being obtained in order to utilise the Pareto distribution law to define relations between lemmas, analysis of which permits the identification of exponents of an average number of connections of the language image as the most informative characteristics of text.

  11. LOVE in English and Polish

    Directory of Open Access Journals (Sweden)

    Małgorzata Brożyna Reczko


    Full Text Available The paper presents a sample contrastive analysis of the linguistic picture of love in English and Polish. The material used in the survey is drawn from lexicographic data, including the British National Corpus and Narodowy Korpus Języka Polskiego [National Corpus of Polish]. The paper focuses on the similarities and differences in conceptualizing the abstract concept of love in the English and Polish languages. An analytical method, developed by Bartmiński and associates, serves as the theoretical basis for the reconstruction of the linguistic picture of the world.

  12. Linguistic fundamentals for natural language processing 100 essentials from morphology and syntax

    CERN Document Server

    Bender, Emily M


    Many NLP tasks have at their core a subtask of extracting the dependencies (who did what to whom) from natural language sentences. This task can be understood as the inverse of the problem solved in different ways by diverse human languages, namely, how to indicate the relationship between different parts of a sentence. Understanding how languages solve the problem can be extremely useful in both feature design and error analysis in the application of machine learning to NLP. Likewise, understanding cross-linguistic variation can be important for the design of MT systems and other multilingual a

  13. Polish Cartographical Review

    Directory of Open Access Journals (Sweden)

    Nedjeljko Frančula


    Full Text Available The Polish Cartographical Review (PCR journal has been published in English four times a year since 2015. The journal is in open access and it is published by De Gruyter Open. It is edited by Polish scientists in collaboration with international experts.

  14. Stochastic Model for the Vocabulary Growth in Natural Languages (United States)

    Gerlach, Martin; Altmann, Eduardo G.


    We propose a stochastic model for the number of different words in a given database which incorporates the dependence on the database size and historical changes. The main feature of our model is the existence of two different classes of words: (i) a finite number of core words, which have higher frequency and do not affect the probability of a new word to be used, and (ii) the remaining virtually infinite number of noncore words, which have lower frequency and, once used, reduce the probability of a new word to be used in the future. Our model relies on a careful analysis of the Google Ngram database of books published in the last centuries, and its main consequence is the generalization of Zipf’s and Heaps’ law to two-scaling regimes. We confirm that these generalizations yield the best simple description of the data among generic descriptive models and that the two free parameters depend only on the language but not on the database. From the point of view of our model, the main change on historical time scales is the composition of the specific words included in the finite list of core words, which we observe to decay exponentially in time with a rate of approximately 30 words per year for English.
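
    The quantity this model describes, the number of different words as a function of database size, can be measured directly from a token stream. The following is only a counting sketch under stated assumptions: the function name `vocabulary_growth` and the toy input are illustrative, and the code does not implement the authors' stochastic model or its core/noncore word classes.

```python
def vocabulary_growth(tokens):
    """Return, for each prefix of the token stream, the number of distinct words seen.

    Illustrative helper (not from the paper): the resulting curve is the kind of
    empirical data against which a Heaps'-law-style growth model would be compared.
    """
    seen = set()
    curve = []
    for token in tokens:
        seen.add(token)          # a repeated word leaves the vocabulary unchanged
        curve.append(len(seen))  # vocabulary size after reading this token
    return curve


if __name__ == "__main__":
    sample = "the cat saw the dog and the dog saw a cat".split()
    for size, vocab in enumerate(vocabulary_growth(sample), start=1):
        print(size, vocab)
```

    Plotting the curve on log-log axes for a large corpus is the usual way to inspect whether one or two scaling regimes are visible.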

  15. Ontology-Based Controlled Natural Language Editor Using CFG with Lexical Dependency (United States)

    Namgoong, Hyun; Kim, Hong-Gee

    In recent years, CNL (Controlled Natural Language) has received much attention with regard to ontology-based knowledge acquisition systems. CNLs, as subsets of natural languages, can be useful for both humans and computers by eliminating ambiguity of natural languages. Our previous work, OntoPath [10], proposed to edit natural language-like narratives that are structured in RDF (Resource Description Framework) triples, using a domain-specific ontology as their language constituents. However, our previous work and other systems employing CFG for grammar definition have difficulties in enlarging the expression capacity. A newly developed editor, which we propose in this paper, permits grammar definitions through CFG-LD (Context-Free Grammar with Lexical Dependency) that includes sequential and semantic structures of the grammars. With CFG describing the sequential structure of grammar, lexical dependencies between sentence elements can be designated in the definition system. Through the defined grammars, the implemented editor guides users' narratives in more familiar expressions with a domain-specific ontology and translates the content into RDF triples.

  16. An algorithm to transform natural language into SQL queries for relational databases

    Directory of Open Access Journals (Sweden)

    Garima Singh


    Full Text Available An intelligent interface that enables efficient interaction between users and databases is a need of database applications. Databases must be intelligent enough to make access faster. However, not every user is familiar with Structured Query Language (SQL) queries, as users may not be aware of the structure of the database and would therefore have to learn SQL. Non-expert users thus need a system that lets them interact with relational databases in a natural language such as English. For this, the Database Management System (DBMS) must be able to understand Natural Language (NL). In this research, an intelligent interface is developed using a semantic matching technique that translates a natural language query into SQL using a set of production rules and a data dictionary. The data dictionary consists of semantic sets for relations and attributes. A series of steps, namely lower-case conversion, tokenization, part-of-speech tagging, and database element and SQL element extraction, is used to convert a Natural Language Query (NLQ) into an SQL query. The transformed query is executed and the results are returned to the user. Such an intelligent interface is needed by database applications to make interaction between the user and the DBMS more efficient.
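
    A few of the steps named in this abstract (lower-case conversion, tokenization, and matching tokens against a data dictionary of semantic sets) can be illustrated with a toy sketch. Everything below is an assumption made for illustration: the dictionary contents, the table and column names, and the functions `tokenize`, `lookup` and `to_sql` are hypothetical, and tagging and production rules are omitted, so this is not the authors' algorithm.

```python
import re

# Hypothetical data dictionary: semantic sets (synonyms) for relations and attributes.
DATA_DICTIONARY = {
    "relations": {"student": {"student", "students", "pupil", "pupils"}},
    "attributes": {
        "name": {"name", "names"},
        "age": {"age", "ages"},
    },
}


def tokenize(query):
    """Lower-case the query and split it into alphanumeric word tokens."""
    return re.findall(r"[a-z0-9]+", query.lower())


def lookup(tokens, semantic_sets):
    """Return the dictionary keys whose semantic set contains a token, in token order."""
    found = []
    for token in tokens:
        for key, synonyms in semantic_sets.items():
            if token in synonyms and key not in found:
                found.append(key)
    return found


def to_sql(query):
    """Build a single-table SELECT statement from the database elements found in the query."""
    tokens = tokenize(query)
    relations = lookup(tokens, DATA_DICTIONARY["relations"])
    attributes = lookup(tokens, DATA_DICTIONARY["attributes"])
    if not relations:
        raise ValueError("no known relation mentioned in the query")
    columns = ", ".join(attributes) if attributes else "*"
    return f"SELECT {columns} FROM {relations[0]};"


if __name__ == "__main__":
    print(to_sql("Show the names and ages of all students"))
```

    A real system would also need conditions (WHERE clauses), joins and disambiguation, which is where tagging and production rules would come in.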

  17. Selecting the Best Mobile Information Service with Natural Language User Input (United States)

    Feng, Qiangze; Qi, Hongwei; Fukushima, Toshikazu

    Information services accessed via mobile phones provide information directly relevant to subscribers’ daily lives and are an area of dynamic market growth worldwide. Although many information services are currently offered by mobile operators, many of the existing solutions require a unique gateway for each service, and it is inconvenient for users to have to remember a large number of such gateways. Furthermore, the Short Message Service (SMS) is very popular in China and Chinese users would prefer to access these services in natural language via SMS. This chapter describes a Natural Language Based Service Selection System (NL3S) for use with a large number of mobile information services. The system can accept user queries in natural language and route them to the required service. Since it is difficult for existing methods to achieve high accuracy and high coverage and to anticipate which other services a user might want to query, the NL3S is developed based on a Multi-service Ontology (MO) and Multi-service Query Language (MQL). The MO and MQL provide semantic and linguistic knowledge, respectively, to facilitate service selection for a user query and to provide adaptive service recommendations. Experiments show that the NL3S can achieve 75-95% accuracy and 85-95% user satisfaction in processing various styles of natural language queries. A trial involving navigation of 30 different mobile services shows that the NL3S can provide a viable commercial solution for mobile operators.


    Directory of Open Access Journals (Sweden)

    Endang Fauziati


    Full Text Available This study deals with learner language, known as interlanguage; in particular, it investigates the nature of interlanguage. For this purpose, an empirical study was conducted, using Indonesian senior high school learners of English as the research subjects. This study used error analysis as its methodological framework. The data were in the form of interlanguage errors collected from the learners’ free compositions before and after an error treatment. The data were analyzed qualitatively. The research indicates that error treatment made a significant contribution to the destabilization process; that is to say, it helped the learners’ interlanguage errors change their nature: at a certain period of learning, some particular errors appear as an inevitable part of the learning process; as a result of error treatment they change their nature. It was observed that the change of state of interlanguage errors was stimulated by several classroom aspects, namely input, feedback, explicit grammar explanation, and practice. The conclusion is that learner language is dynamic in nature.

  19. Natural Language Processing and Fuzzy Tools for Business Processes in a Geolocation Context

    Directory of Open Access Journals (Sweden)

    Isis Truck


    Full Text Available In the geolocation field, where high-level programs and low-level devices coexist, it is often difficult to find a user-friendly interface to configure all the parameters. The challenge addressed in this paper is to propose intuitive and simple, and thus natural language, interfaces to interact with low-level devices. Such interfaces contain natural language processing (NLP) and fuzzy representations of words that facilitate the elicitation of business-level objectives in our context. A complete methodology is proposed, from the lexicon construction to a dialogue software agent including a fuzzy linguistic representation, based on synonymy.


    Directory of Open Access Journals (Sweden)

    Ri Yong-Sok


    Full Text Available The precedent studies on the validity of Modus ponens and Modus tollens have been carried out with most regard to a major type of conditionals in which the conditional clause is a sufficient condition for the main clause. But we sometimes, in natural language arguments, find other types of conditionals in which the conditional clause is a necessary or necessary and sufficient condition for the main clause. In this paper I reappraise, on the basis of new definitions of Modus ponens and Modus tollens, their validity/invalidity in natural language arguments in consideration of all types of conditionals.

  1. A Natural Language for AdS/CFT Correlators

    Energy Technology Data Exchange (ETDEWEB)

    Fitzpatrick, A.Liam; /Boston U.; Kaplan, Jared; /SLAC; Penedones, Joao; /Perimeter Inst. Theor. Phys.; Raju, Suvrat; /Harish-Chandra Res. Inst.; van Rees, Balt C.; /YITP, Stony Brook


    We provide dramatic evidence that 'Mellin space' is the natural home for correlation functions in CFTs with weakly coupled bulk duals. In Mellin space, CFT correlators have poles corresponding to an OPE decomposition into 'left' and 'right' sub-correlators, in direct analogy with the factorization channels of scattering amplitudes. In the regime where these correlators can be computed by tree level Witten diagrams in AdS, we derive an explicit formula for the residues of Mellin amplitudes at the corresponding factorization poles, and we use the conformal Casimir to show that these amplitudes obey algebraic finite difference equations. By analyzing the recursive structure of our factorization formula we obtain simple diagrammatic rules for the construction of Mellin amplitudes corresponding to tree-level Witten diagrams in any bulk scalar theory. We prove the diagrammatic rules using our finite difference equations. Finally, we show that our factorization formula and our diagrammatic rules morph into the flat space S-Matrix of the bulk theory, reproducing the usual Feynman rules, when we take the flat space limit of AdS/CFT. Throughout we emphasize a deep analogy with the properties of flat space scattering amplitudes in momentum space, which suggests that the Mellin amplitude may provide a holographic definition of the flat space S-Matrix.

  2. Word Boundaries in L2 Speech: Evidence from Polish Learners of English (United States)

    Schwartz, Geoffrey


    Acoustic and perceptual studies investigate B2-level Polish learners' acquisition of second language (L2) English word-boundaries involving word-initial vowels. In production, participants were less likely to produce glottalization of phrase-medial initial vowels in L2 English than in first language (L1) Polish. Perception studies employing word…

  3. The Rape of Mother Nature? Women in the Language of Environmental Discourse. (United States)

    Berman, Tzeporah


    Argues that the structure of language reflects and reproduces the dominant model, and reinforces many of the dualistic assumptions which underlie the separation of male and female, nature and culture, mind from body, emotion from reason, and intuition from fact. (LZ)

  4. Combining Machine Learning and Natural Language Processing to Assess Literary Text Comprehension (United States)

    Balyan, Renu; McCarthy, Kathryn S.; McNamara, Danielle S.


    This study examined how machine learning and natural language processing (NLP) techniques can be leveraged to assess the interpretive behavior that is required for successful literary text comprehension. We compared the accuracy of seven different machine learning classification algorithms in predicting human ratings of student essays about…

  5. Speech perception and reading: two parallel modes of understanding language and implications for acquiring literacy naturally. (United States)

    Massaro, Dominic W


    I review 2 seminal research reports published in this journal during its second decade more than a century ago. Given psychology's subdisciplines, they would not normally be reviewed together because one involves reading and the other speech perception. The small amount of interaction between these domains might have limited research and theoretical progress. In fact, the 2 early research reports revealed common processes involved in these 2 forms of language processing. Their illustration of the role of Wundt's apperceptive process in reading and speech perception anticipated descriptions of contemporary theories of pattern recognition, such as the fuzzy logical model of perception. Based on the commonalities between reading and listening, one can question why they have been viewed so differently. It is commonly believed that learning to read requires formal instruction and schooling, whereas spoken language is acquired from birth onward through natural interactions with people who talk. Most researchers and educators believe that spoken language is acquired naturally from birth onward and even prenatally. Learning to read, on the other hand, is not possible until the child has acquired spoken language, reaches school age, and receives formal instruction. If an appropriate form of written text is made available early in a child's life, however, the current hypothesis is that reading will also be learned inductively and emerge naturally, with no significant negative consequences. If this proposal is true, it should soon be possible to create an interactive system, Technology Assisted Reading Acquisition, to allow children to acquire literacy naturally.

  6. Visualization of health information with predications extracted using natural language processing and filtered using the UMLS. (United States)

    Miller, Trudi; Leroy, Gondy


    Increased availability of and reliance on written health information can tax the abilities of unskilled readers. We are developing a system that uses natural language processing to extract phrases, identify medical terms using the UMLS, and visualize the propositions. This system substantially reduces the amount of information a consumer must read, while providing an alternative to traditional prose based text.

  7. Using natural language processing to improve biomedical concept normalization and relation mining

    NARCIS (Netherlands)

    N. Kang (Ning)


    This thesis concerns the use of natural language processing for improving biomedical concept normalization and relation mining. We begin with introducing the background of biomedical text mining, and subsequently we will continue by describing a typical text mining pipeline, some key

  8. Modelling the phonotactic structure of natural language words with simple recurrent networks

    NARCIS (Netherlands)

    Stoianov; Nerbonne, J; Bouma, H; Coppen, PA; vanHalteren, H; Teunissen, L


    Simple Recurrent Networks (SRN) are Neural Network (connectionist) models able to process natural language. Phonotactics concerns the order of symbols in words. We continued an earlier unsuccessful trial to model the phonotactics of Dutch words with SRNs. In order to overcome the previously reported

  9. Construct Validity in TOEFL iBT Speaking Tasks: Insights from Natural Language Processing (United States)

    Kyle, Kristopher; Crossley, Scott A.; McNamara, Danielle S.


    This study explores the construct validity of speaking tasks included in the TOEFL iBT (e.g., integrated and independent speaking tasks). Specifically, advanced natural language processing (NLP) tools, MANOVA difference statistics, and discriminant function analyses (DFA) are used to assess the degree to which and in what ways responses to these…

  10. Dimensional Reduction in Vector Space Methods for Natural Language Processing: Products and Projections (United States)

    Aerts, Sven


    We introduce vector space based approaches to natural language processing and some of their similarities with quantum theory when applied to information retrieval. We explain how dimensional reduction is called for from both a practical and theoretical point of view and how this can be achieved through choice of product or through projectors onto subspaces.

  11. Drawing Dynamic Geometry Figures Online with Natural Language for Junior High School Geometry (United States)

    Wong, Wing-Kwong; Yin, Sheng-Kai; Yang, Chang-Zhe


    This paper presents a tool for drawing dynamic geometric figures by understanding the texts of geometry problems. With the tool, teachers and students can construct dynamic geometric figures on a web page by inputting a geometry problem in natural language. First we need to build the knowledge base for understanding geometry problems. With the…

  12. You Are Your Words: Modeling Students' Vocabulary Knowledge with Natural Language Processing Tools (United States)

    Allen, Laura K.; McNamara, Danielle S.


    The current study investigates the degree to which the lexical properties of students' essays can inform stealth assessments of their vocabulary knowledge. In particular, we used indices calculated with the natural language processing tool, TAALES, to predict students' performance on a measure of vocabulary knowledge. To this end, two corpora were…

  13. A semi-automated approach for generating natural language requirements documents based on business process models

    NARCIS (Netherlands)

    Aysolmaz, Banu; Leopold, Henrik; Reijers, Hajo A.; Demirörs, Onur


    Context: The analysis of requirements for business-related software systems is often supported by using business process models. However, the final requirements are typically still specified in natural language. This means that the knowledge captured in process models must be consistently

  14. Plurilingualism and Polish teenage learners of English

    Directory of Open Access Journals (Sweden)

    Agnieszka Otwinowska‑Kasztelanic


    Full Text Available Using several languages has become a norm for those who want to learn and work in the European Union. However, teaching for plurilingualism is also a challenge. The present paper first clarifies the notions of plurilingualism and multilingualism, then discusses the role of crosslinguistic similarity in language learning in the case of European languages. It also shows how lexical crosslinguistic similarity can be used in teaching typologically related and unrelated languages, and discusses the key factors in noticing such similarity. The research presented reports on examining and raising language awareness of Polish-English cognate vocabulary in the case of a group of Polish teenage learners of English. It presents the results of a small-scale study in quasi-experimental design, as well as qualitative research on the learners’ opinions and attitudes. Finally, the paper presents implications for language pedagogy and focuses on the fact that awareness raising may affect the learners’ plurilingual competence.

  15. Reconceptualizing the Nature of Goals and Outcomes in Language/s Education (United States)

    Leung, Constant; Scarino, Angela


    Transformations associated with the increasing speed, scale, and complexity of mobilities, together with the information technology revolution, have changed the demography of most countries of the world and brought about accompanying social, cultural, and economic shifts (Heugh, 2013). This complex diversity has changed the very nature of…


    Directory of Open Access Journals (Sweden)

    D.F. Spinu


    Full Text Available Twenty-six years ago the international community witnessed one of the most dramatic changes in economic systems. Naturally, the fall of communism in Eastern Europe and its consequences were events difficult to judge and anticipate in their immediate aftermath. Today, we have gained a much more coherent perspective on their meaning. The political liberalization of Poland in 1989 and its transition to the market economy was generally perceived as the most successful of all post-communist countries. From 1990 to 2013, Poland experienced the most outstanding economic growth within the former communist bloc. It doubled its GDP in real terms and became the only country to experience economic growth during the financial crisis of 2008-09. However, the Polish secret recipe lies in the "shock therapy" adopted at the beginning of the 1990s. The aim of this paper is to examine the importance of the Balcerowicz program in creating the basis for economic stability and growth through privatization, liberalization of foreign trade, monetary reform and an open economy. We will also review the impact of this unprecedented transformation in shaping a strong, market-oriented economy.

  17. The Exploring Nature of Definitions and Classifications of Language Learning Strategies (LLSs) in the Current Studies of Second/Foreign Language Learning (United States)

    Fazeli, Seyed Hossein


    This study aims to explore the nature of definitions and classifications of Language Learning Strategies (LLSs) in the current studies of second/foreign language learning in order to show the current problems regarding such definitions and classifications. The present study shows that there is no universally agreed definition and…

  18. Assessing Repetitive Negative Thinking Using Categorical and Transdiagnostic Approaches: A Comparison and Validation of Three Polish Language Adaptations of Self-Report Questionnaires. (United States)

    Kornacka, Monika; Buczny, Jacek; Layton, Rebekah L


    Repetitive negative thinking (RNT) is a transdiagnostic process involved in the risk, maintenance, and relapse of serious conditions including mood disorders, anxiety, eating disorders, and addictions. Processing mode theory provides a theoretical model to assess, research, and treat RNT using a transdiagnostic approach. Clinical researchers also often employ categorical approaches to RNT, including a focus on depressive rumination or worry, for similar purposes. Three widely used self-report questionnaires have been developed to assess these related constructs: the Ruminative Response Scale (RRS), the Perseverative Thinking Questionnaire (PTQ), and the Mini-Cambridge Exeter Repetitive Thought Scale (Mini-CERTS). Yet these scales have not previously been used in conjunction, despite useful theoretical distinctions only available in Mini-CERTS. The present validation of the methods in a Polish-speaking population provides psychometric parameter estimates that contribute to current efforts to increase reliable replication of theoretical outcomes. Moreover, the following study aims to present particular characteristics and a comparison of the three methods. Although there has been some exploration of a categorical approach, the comparison of transdiagnostic methods is still lacking. These methods are particularly relevant for developing and evaluating theoretically based interventions like concreteness training, an emerging field of increasing interest, which can be used to address the maladaptive processing mode in RNT that can lead to depression and other disorders. Furthermore, the translation of these measures enables the examination of possible cross-cultural structural differences that may lead to important theoretical progress in the measurement and classification of RNT. The results support the theoretical hypothesis. As expected, the dimensions of brooding, general repetitive negative thinking, as well as abstract analytical thinking, can all be classified

  19. Assessing repetitive negative thinking using categorical and transdiagnostic approaches: A comparison and validation of three Polish language adaptations of self-report questionnaires

    Directory of Open Access Journals (Sweden)

    Monika Kornacka


    Full Text Available Repetitive negative thinking (RNT) is a transdiagnostic process involved in the risk, maintenance, and relapse of serious conditions including mood disorders, anxiety, eating disorders, and addictions. Processing mode theory provides a theoretical model to assess, research, and treat RNT using a transdiagnostic approach. Clinical researchers also often employ categorical approaches to RNT, including a focus on depressive rumination or worry, for similar purposes. Three widely used self-report questionnaires have been developed to assess these related constructs: the Ruminative Response Scale (RRS), the Perseverative Thinking Questionnaire (PTQ), and the Mini-Cambridge Exeter Repetitive Thought Scale (Mini-CERTS). Yet these scales have not previously been used in conjunction, despite useful theoretical distinctions only available in Mini-CERTS. The present validation of the methods in a Polish-speaking population provides psychometric parameter estimates that contribute to current efforts to increase reliable replication of theoretical outcomes. Moreover, the following study aims to present particular characteristics and a comparison of the three methods. Although there has been some exploration of the categorical approach, the comparison of transdiagnostic methods is still lacking. These methods are particularly relevant for developing and evaluating theoretically based interventions like concreteness training, an emerging field of increasing interest, which can be used to address the maladaptive processing mode in RNT that can lead to depression and other disorders. Furthermore, the translation of these measures enables the examination of possible cross-cultural structural differences that may lead to important theoretical progress in the measurement and classification of RNT. The results support the theoretical hypothesis. As expected, the dimensions of brooding, general Repetitive Negative Thinking and Abstract Analytic Thinking, can all be

  20. Mathematics and the Laws of Nature Developing the Language of Science (Revised Edition)

    CERN Document Server

    Tabak, John


    Mathematics and the Laws of Nature, Revised Edition describes the evolution of the idea that nature can be described in the language of mathematics. Colorful chapters explore the earliest attempts to apply deductive methods to the study of the natural world. This revised resource goes on to examine the development of classical conservation laws, including the conservation of momentum, the conservation of mass, and the conservation of energy. Chapters have been updated and revised to reflect recent information, including the mathematical pioneers who introduced new ideas about what it meant to

  1. Natural language indicators of differential gene regulation in the human immune system. (United States)

    Mehl, Matthias R; Raison, Charles L; Pace, Thaddeus W W; Arevalo, Jesusa M G; Cole, Steve W


    Adverse social conditions have been linked to a conserved transcriptional response to adversity (CTRA) in circulating leukocytes that may contribute to social gradients in disease. However, the CNS mechanisms involved remain obscure, in part because CTRA gene-expression profiles often track external social-environmental variables more closely than they do self-reported internal affective states such as stress, depression, or anxiety. This study examined the possibility that variations in patterns of natural language use might provide more sensitive indicators of the automatic threat-detection and -response systems that proximally regulate autonomic induction of the CTRA. In 22,627 audio samples of natural speech sampled from the daily interactions of 143 healthy adults, both total language output and patterns of function-word use covaried with CTRA gene expression. These language features predicted CTRA gene expression substantially better than did conventional self-report measures of stress, depression, and anxiety and did so independently of demographic and behavioral factors (age, sex, race, smoking, body mass index) and leukocyte subset distributions. This predictive relationship held when language and gene expression were sampled more than a week apart, suggesting that associations reflect stable individual differences or chronic life circumstances. Given the observed relationship between personal expression and gene expression, patterns of natural language use may provide a useful behavioral indicator of nonconsciously evaluated well-being (implicit safety vs. threat) that is distinct from conscious affective experience and more closely tracks the neurobiological processes involved in peripheral gene regulation. Copyright © 2017 the Author(s). Published by PNAS.

  2. Polish Toxic Currency Options

    Directory of Open Access Journals (Sweden)

    Waldemar Gontarski


    Full Text Available Toxic currency options are defined on the basis of the opposition to the nature (essence) of an option contract, which is justified in terms of norms founded on the general law clause of characteristics (nature) of a relation (which represents an independent premise for imposing restrictions on the freedom of contracts). So-understood toxic currency options are unlawful. Indeed, they contravene iuris cogentis regulations. These include, for instance, option contracts which are concluded with a bank, if the bank has not informed about option risk before concluding the contract; or the barrier options, which focus only on the protection of the bank's interests. Therefore, such options may appear to be invalid. Therefore, performing contracts for toxic currency options may be qualified as criminal mismanagement. For the sake of security, the manager should then take into consideration filing a claim for stating invalidity (which can be made in a court verdict). At the same time, if the supervisory board member in a commercial company, who can also be subject to mismanagement offences, commits an omission involving lack of reaction (for example, if he/she fails to notify of the suspected offence committed by the management board members acting to the company's detriment) when the management board makes the company conclude option contracts which are charged with absolute invalidity, the supervisory board member so acting may be considered to act to the company's detriment. In the most recent Polish jurisprudence and judicature the standard of a good host is treated to be the last resort for determining whether the manager's powers resulting from criminal regulations were performed. The manager of the exporter should not, as a rule, issue any options. Issuing options always means assuming an obligation. In the case of currency put options it is an absolute obligation to purchase a given amount in euro at an exchange rate set in advance. On the other hand issuing

  3. Functional Median Polish

    KAUST Repository

    Sun, Ying


    This article proposes functional median polish, an extension of univariate median polish, for one-way and two-way functional analysis of variance (ANOVA). The functional median polish estimates the functional grand effect and functional main factor effects based on functional medians in an additive functional ANOVA model assuming no interaction among factors. A functional rank test is used to assess whether the functional main factor effects are significant. The robustness of the functional median polish is demonstrated by comparing its performance with the traditional functional ANOVA fitted by means under different outlier models in simulation studies. The functional median polish is illustrated on various applications in climate science, including one-way and two-way ANOVA when functional data are either curves or images. Specifically, Canadian temperature data, U. S. precipitation observations and outputs of global and regional climate models are considered, which can facilitate the research on the close link between local climate and the occurrence or severity of some diseases and other threats to human health. © 2012 International Biometric Society.
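
    For orientation, the classical univariate median polish that the abstract above takes as its starting point can be sketched as follows. This is a minimal illustration of Tukey's procedure on a plain two-way table of numbers, not the functional extension proposed in the article; the function name `median_polish` and its parameters `max_iter` and `tol` are hypothetical choices made for this sketch.

```python
from statistics import median


def median_polish(table, max_iter=10, tol=1e-9):
    """Classical (univariate) median polish of a two-way table.

    Decomposes table[i][j] into grand + row_eff[i] + col_eff[j] + resid[i][j]
    by alternately sweeping row and column medians out of the residuals.
    Hypothetical helper for illustration only.
    """
    n_rows, n_cols = len(table), len(table[0])
    resid = [[float(v) for v in row] for row in table]
    grand = 0.0
    row_eff = [0.0] * n_rows
    col_eff = [0.0] * n_cols

    for _ in range(max_iter):
        # Move each row's median from the residuals into the row effects.
        row_med = [median(r) for r in resid]
        for i in range(n_rows):
            row_eff[i] += row_med[i]
            resid[i] = [v - row_med[i] for v in resid[i]]
        # Re-centre the column effects; their median goes to the grand effect.
        shift = median(col_eff)
        col_eff = [c - shift for c in col_eff]
        grand += shift

        # Move each column's median from the residuals into the column effects.
        col_med = [median(resid[i][j] for i in range(n_rows))
                   for j in range(n_cols)]
        for j in range(n_cols):
            col_eff[j] += col_med[j]
            for i in range(n_rows):
                resid[i][j] -= col_med[j]
        # Re-centre the row effects in the same way.
        shift = median(row_eff)
        row_eff = [r - shift for r in row_eff]
        grand += shift

        # Stop once a full sweep no longer changes anything.
        if max(abs(m) for m in row_med + col_med) < tol:
            break

    return grand, row_eff, col_eff, resid


if __name__ == "__main__":
    # Additive table with one gross outlier in the middle cell.
    data = [[7, 9, 11], [8, 100, 12], [9, 11, 13]]
    grand, rows, cols, resid = median_polish(data)
    print(grand, rows, cols)
    print(resid)
```

    Because medians rather than means are swept out, a single outlying cell tends to stay in the residuals instead of distorting the row and column effects, which is the robustness property the abstract refers to.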

  4. Naturalism and Ideological Work: How Is Family Language Policy Renegotiated as Both Parents and Children Learn a Threatened Minority Language? (United States)

    Armstrong, Timothy Currie


    Parents who enroll their children to be educated through a threatened minority language frequently do not speak that language themselves and classes in the language are sometimes offered to parents in the expectation that this will help them to support their children's education and to use the minority language in the home. Providing…

  5. The Nature of the Language Faculty and Its Implications for Evolution of Language (Reply to Fitch, Hauser, and Chomsky) (United States)

    Jackendoff, Ray; Pinker, Steven


    In a continuation of the conversation with Fitch, Chomsky, and Hauser on the evolution of language, we examine their defense of the claim that the uniquely human, language-specific part of the language faculty (the ''narrow language faculty'') consists only of recursion, and that this part cannot be considered an adaptation to communication. We…

  6. Adapting existing natural language processing resources for cardiovascular risk factors identification in clinical notes. (United States)

    Khalifa, Abdulrahman; Meystre, Stéphane


    The 2014 i2b2 natural language processing shared task focused on identifying cardiovascular risk factors such as high blood pressure, high cholesterol levels, obesity and smoking status among other factors found in health records of diabetic patients. In addition, the task involved detecting medications, and time information associated with the extracted data. This paper presents the development and evaluation of a natural language processing (NLP) application conceived for this i2b2 shared task. For increased efficiency, the application main components were adapted from two existing NLP tools implemented in the Apache UIMA framework: Textractor (for dictionary-based lookup) and cTAKES (for preprocessing and smoking status detection). The application achieved a final (micro-averaged) F1-measure of 87.5% on the final evaluation test set. Our attempt was mostly based on existing tools adapted with minimal changes and allowed for satisfying performance with limited development efforts. Copyright © 2015 Elsevier Inc. All rights reserved.

  7. Feature Selection for Natural Language Call Routing Based on Self-Adaptive Genetic Algorithm (United States)

    Koromyslova, A.; Semenkina, M.; Sergienko, R.


    The text classification problem for natural language call routing was considered in the paper. Seven different term weighting methods were applied. As dimensionality reduction methods, the feature selection based on self-adaptive GA is considered. k-NN, linear SVM and ANN were used as classification algorithms. The tasks of the research are the following: perform research of text classification for natural language call routing with different term weighting methods and classification algorithms and investigate the feature selection method based on self-adaptive GA. The numerical results showed that the most effective term weighting is TRR. The most effective classification algorithm is ANN. Feature selection with self-adaptive GA provides improvement of classification effectiveness and significant dimensionality reduction with all term weighting methods and with all classification algorithms.

  8. The wear of polished and glazed zirconia against enamel. (United States)

    Janyavula, Sridhar; Lawson, Nathaniel; Cakir, Deniz; Beck, Preston; Ramp, Lance C; Burgess, John O


    The wear of tooth structure opposing anatomically contoured zirconia crowns requires further investigation. The purpose of this in vitro study was to measure the roughness and wear of polished, glazed, and polished then reglazed zirconia against human enamel antagonists and compare the measurements to those of veneering porcelain and natural enamel. Zirconia specimens were divided into polished, glazed, and polished then reglazed groups (n=8). A veneering porcelain (Ceramco3) and enamel were used as controls. The surface roughness of all pretest specimens was measured. Wear testing was performed in the newly designed Alabama wear testing device. The mesiobuccal cusps of extracted molars were standardized and used as antagonists. Three-dimensional (3D) scans of the specimens and antagonists were obtained at baseline and after 200 000 and 400 000 cycles with a profilometer. The baseline scans were superimposed on the posttesting scans to determine volumetric wear. Data were analyzed with a 1-way ANOVA and Tukey Honestly Significant Difference (HSD) post hoc tests (α=.05). Surface roughness ranked in order of least rough to roughest was: polished zirconia, glazed zirconia, polished then reglazed zirconia, veneering porcelain, and enamel. For ceramic, there was no measurable loss on polished zirconia, moderate loss on the surface of enamel, and significant loss on glazed and polished then reglazed zirconia. The highest ceramic wear was exhibited by the veneering ceramic. For enamel antagonists, polished zirconia caused the least wear, and enamel caused moderate wear. Glazed and polished then reglazed zirconia showed significant opposing enamel wear, and veneering porcelain demonstrated the most. Within the limitations of the study, polished zirconia is wear-friendly to the opposing tooth. Glazed zirconia causes more material and antagonist wear than polished zirconia. The surface roughness of the zirconia aided in predicting the wear of the opposing dentition

  9. Laboratory process control using natural language commands from a personal computer (United States)

    Will, Herbert A.; Mackin, Michael A.


    PC software is described which provides flexible natural language process control capability with an IBM PC or compatible machine. Hardware requirements include the PC, and suitable hardware interfaces to all controlled devices. Software required includes the Microsoft Disk Operating System (MS-DOS) operating system, a PC-based FORTRAN-77 compiler, and user-written device drivers. Instructions for use of the software are given as well as a description of an application of the system.

  10. Natural language processing-based COTS software and related technologies survey.

    Energy Technology Data Exchange (ETDEWEB)

    Stickland, Michael G.; Conrad, Gregory N.; Eaton, Shelley M.


    Natural language processing-based knowledge management software, traditionally developed for security organizations, is now becoming commercially available. An informal survey was conducted to discover and examine current NLP and related technologies and potential applications for information retrieval, information extraction, summarization, categorization, terminology management, link analysis, and visualization for possible implementation at Sandia National Laboratories. This report documents our current understanding of the technologies, lists software vendors and their products, and identifies potential applications of these technologies.

  11. Human Computer Collaboration at the Edge: Enhancing Collective Situation Understanding with Controlled Natural Language (United States)


    has conceptually noted limitations of COPs [26]; our research empirically illustrates the tradeoffs with a COP even if all users have a shared group size and dynamics. To further assess the effects of a COP on information quality and quantity, we plan to run a conceptual replication of the...2] T. Kuhn, “A survey and classification of controlled natural languages,” Computational Linguistics, vol. 40, pp. 121–170, 2014. [3] E. Cambria

  12. Effects of speech- and text-based interaction modes in natural language human-computer dialogue. (United States)

    Le Bigot, Ludovic; Rouet, Jean-François; Jamet, Eric


    This study examined the effects of user production (speaking and typing) and user reception (listening and reading) modes on natural language human-computer dialogue. Text-based dialogue is often more efficient than speech-based dialogue, but the latter is more dynamic and more suitable for mobile environments and hands-busy situations. The respective contributions of user production and reception modes have not previously been assessed. Eighteen participants performed several information search tasks using a natural language information system in four experimental conditions: phone (speaking and listening), Web (typing and reading), and mixed (speaking and reading or typing and listening). Mental workload was greater and participants' repetitions of commands were more frequent when speech (speaking or listening) was used for both the user production and reception modes rather than text (typing or reading). Completion times were longer for listening than for reading. Satisfaction was lower, utterances were longer, and the interaction error rate was higher for speaking than typing. The production and reception modes both contribute to dialogue and mental workload. They have distinct contributions to performance, satisfaction, and the form of the discourse. The most efficient configuration for interacting in natural language would appear to be speech for production and system prompts in text, as this combination decreases the time on task while improving dialogue involvement.

  13. Classifying free-text triage chief complaints into syndromic categories with natural language processing. (United States)

    Chapman, Wendy W; Christensen, Lee M; Wagner, Michael M; Haug, Peter J; Ivanov, Oleg; Dowling, John N; Olszewski, Robert T


    Develop and evaluate a natural language processing application for classifying chief complaints into syndromic categories for syndromic surveillance. Much of the input data for artificial intelligence applications in the medical field are free-text patient medical records, including dictated medical reports and triage chief complaints. To be useful for automated systems, the free-text must be translated into encoded form. We implemented a biosurveillance detection system from Pennsylvania to monitor the 2002 Winter Olympic Games. Because input data was in free-text format, we used a natural language processing text classifier to automatically classify free-text triage chief complaints into syndromic categories used by the biosurveillance system. The classifier was trained on 4700 chief complaints from Pennsylvania. We evaluated the ability of the classifier to classify free-text chief complaints into syndromic categories with a test set of 800 chief complaints from Utah. The classifier produced the following areas under the ROC curve: Constitutional = 0.95; Gastrointestinal = 0.97; Hemorrhagic = 0.99; Neurological = 0.96; Rash = 1.0; Respiratory = 0.99; Other = 0.96. Using information stored in the system's semantic model, we extracted from the Respiratory classifications lower respiratory complaints and lower respiratory complaints with fever with a precision of 0.97 and 0.96, respectively. Results suggest that a trainable natural language processing text classifier can accurately extract data from free-text chief complaints for biosurveillance.

  14. Modeling virtual organizations with Latent Dirichlet Allocation: a case for natural language processing. (United States)

    Gross, Alexander; Murthy, Dhiraj


    This paper explores a variety of methods for applying the Latent Dirichlet Allocation (LDA) automated topic modeling algorithm to the modeling of the structure and behavior of virtual organizations found within modern social media and social networking environments. As the field of Big Data reveals, an increase in the scale of social data available presents new challenges which are not tackled by merely scaling up hardware and software. Rather, they necessitate new methods and, indeed, new areas of expertise. Natural language processing provides one such method. This paper applies LDA to the study of scientific virtual organizations whose members employ social technologies. Because of the vast data footprint in these virtual platforms, we found that natural language processing was needed to 'unlock' and render visible latent, previously unseen conversational connections across large textual corpora (spanning profiles, discussion threads, forums, and other social media incarnations). We introduce variants of LDA and ultimately make the argument that natural language processing is a critical interdisciplinary methodology to make better sense of social 'Big Data' and we were able to successfully model nested discussion topics from forums and blog posts using LDA. Importantly, we found that LDA can move us beyond the state-of-the-art in conventional Social Network Analysis techniques. Copyright © 2014 Elsevier Ltd. All rights reserved.

  15. Using Natural Language And Voice To Control High Level Tasks In A Robotics Environment (United States)

    Hackenberg, Robert G.


    RCA's Advanced Technology Laboratories (ATL) has implemented an integrated system which permits control of high level tasks in a robotics environment through voice input in the form of natural language syntax. The paper to be presented will outline the architecture used to integrate voice recognition and synthesis hardware and natural language and intelligent reasoning software with a supervisory processor that controls robotic and vision operations in the robotic testbed. The application is intended to give the human operator of a Puma 782 industrial robot the ability to combine joystick teleoperation with voice input in order to provide a flexible man-machine interface in a hands-busy environment. The system is designed to give the operator a speech interface which is unobtrusive and undemanding in terms of predetermined syntax requirements. The voice recognizer accepts continuous speech and the natural language processor accepts full and partial sentence fragments and can perform a fair amount of disambiguation and context analysis. Output to the operator comes via the parallel channel of speech synthesis so that the operator does not have to consult the computer's CRT for messages. The messages are generated from the software and offer warnings about unacceptable situations, confirmations of actions completed, and feedback of system data.

  16. Language

    DEFF Research Database (Denmark)

    Sanden, Guro Refsum


    Purpose: – The purpose of this paper is to analyse the consequences of globalisation in the area of corporate communication, and investigate how language may be managed as a strategic resource. Design/methodology/approach: – A review of previous studies on the effects of globalisation on corporate...... communication and the implications of language management initiatives in international business. Findings: – Efficient language management can turn language into a strategic resource. Language needs analyses, i.e. linguistic auditing/language check-ups, can be used to determine the language situation...

  17. Verbalization the Concept of Dignity in Polish Dictionaries

    Directory of Open Access Journals (Sweden)

    Chaban Vasylyna


    Full Text Available Background: The object of the study is the concept of DIGNITY. The justification of the research is predetermined by the lack of such investigation in Polish linguistics. The background for the article is also determined by the lack of similar studies in the sphere of Cognitive Linguistics. Purpose: The article aims at analysing the cultural concept of DIGNITY in the minds of the bearers of the Polish language and culture based on the data incorporated from the lexicographical sources and collection of folk sayings and proverbs. Results: The article dwells upon the most important features of this concept in the Polish language and culture. The study investigates the language conceptualization of this concept in the system of the Polish language and folk maxims and sayings. The definition of the lexeme of dignity reveals the compliance with higher moral principles and the understanding of value; good name, good character claiming attention and respect; the last name. DIGNITY is the principal regulating mechanism of behaviour and development of positive human values which become apparent in the process of human interaction. A man of worth respects himself and is as well proud to be an honourable man. DIGNITY concerns both a separate human and the society in general; it could be regarded as both individual and collective notion. Discussion: The suggested analysis revealed cultural and linguistic peculiarities of the concept of DIGNITY in the Polish language and culture. The conducted research is not exhaustive since it provides potential perspectives for further research of the Polish concept of DIGNITY, in particular, in terms of comparative studies with correlative concepts in Ukrainian and English languages and cultures with the help of the cognitive definition method.

  18. Testing an AAC system that transforms pictograms into natural language with persons with cerebral palsy. (United States)

    Pahisa-Solé, Joan; Herrera-Joancomartí, Jordi


    In this article, we describe a compansion system that transforms the telegraphic language that comes from the use of pictogram-based augmentative and alternative communication (AAC) into natural language. The system was tested with four participants with severe cerebral palsy and ranging degrees of linguistic competence and intellectual disabilities. Participants had used pictogram-based AAC at least for the past 30 years each and presented a stable linguistic profile. During tests, which consisted of a total of 40 sessions, participants were able to learn new linguistic skills, such as the use of basic verb tenses, while using the compansion system, which proved a source of motivation. The system can be adapted to the linguistic competence of each person and required no learning curve during tests when none of its special features, like gender, number, verb tense, or sentence type modifiers, were used. Furthermore, qualitative and quantitative results showed a mean communication rate increase of 41.59%, compared to the same communication device without the compansion system, and an overall improvement in the communication experience when the output is in natural language. Tests were conducted in Catalan and Spanish.

  19. Resolution of ambiguities in cartoons as an illustration of the role of pragmatics in natural language understanding by computers

    Energy Technology Data Exchange (ETDEWEB)

    Mazlack, L.J.; Paz, N.M.


    Newspaper cartoons can graphically display the result of ambiguity in human speech; the result can be unexpected and funny. Likewise, computer analysis of natural language statements also needs to successfully resolve ambiguous situations. Computer techniques already developed use restricted world knowledge in resolving ambiguous language use. This paper illustrates how these techniques can be used in resolving ambiguous situations arising in cartoons. 8 references.

  20. Language Revitalization. (United States)

    Hinton, Leanne


    Surveys developments in language revitalization and language death. Focusing on indigenous languages, discusses the role and nature of appropriate linguistic documentation, possibilities for bilingual education, and methods of promoting oral fluency and intergenerational transmission in affected languages. (Author/VWL)

  1. How to Investigate Polish Clusters’ Attractiveness for Inward FDI? Addressing Ambiguity Problem

    Directory of Open Access Journals (Sweden)

    Götz Marta


    Full Text Available The aim of the paper is to assess whether, and in what fashion, managers of Polish cluster organizations perceive the attractiveness of foreign direct investment in Polish clusters. This research is exploratory and qualitative in nature. The complex nature of Polish clusters, which can both benefit from and be competitively challenged by FDI, is identified, and a conceptual framework for assessing that nature is proposed; specifically, research using the grounded theory method (GTM).

  2. Sexual activity of Polish adults. (United States)

    Pastwa-Wojciechowska, Beata; Izdebski, Zbigniew


    The purpose of this research was to explore the subject of sexual activity in the Polish population, with special focus on age and gender differences, and sexual infidelity. Sexual activity is one of the basic factors in initiating and maintaining relationships. On the one hand, sexual activity enables us to meet natural needs and maintain an intimate relationship with another human being; on the other, it may allow us to overcome loneliness and social isolation by providing the opportunity to express feelings of closeness and unity. The research was conducted on a representative group of 3,200 Poles aged between 15-49, with the support of a well-known Polish research company - TNS OBOP. Face-to-face and Pencil and Paper (PAPI) interviews were carried out. The results focus on two main issues: the age and motives of sexual initiation among teenagers (with a significant percentage starting their sexual activity at the age of 15), and the quality of the sexual lives of adults (average number of sexual partners, sexual infidelity and sexual satisfaction). There is dependence between the type of relationship and the performance or non-performance of sexual activity, as well as the quality of the relationship. Among both adolescents and adults, remaining in a stable relationship (partnership or marriage) promotes loyalty. The performance of sexual goals turns out to be an important mechanism regulating the interpersonal aspects of a relationship, influencing their perception and evaluation.

  3. Sexual activity of Polish adults

    Directory of Open Access Journals (Sweden)

    Beata Pastwa-Wojciechowska


    Full Text Available Aim. The purpose of this research was to explore the subject of sexual activity in the Polish population, with special focus on age and gender differences, and sexual infidelity. Sexual activity is one of the basic factors in initiating and maintaining relationships. On the one hand, sexual activity enables us to meet natural needs and maintain an intimate relationship with another human being; on the other, it may allow us to overcome loneliness and social isolation by providing the opportunity to express feelings of closeness and unity. Material and method. The research was conducted on a representative group of 3,200 Poles aged between 15–49, with the support of a well-known Polish research company – TNS OBOP. Face-to-face and Pencil and Paper (PAPI) interviews were carried out. Results. The results focus on two main issues: the age and motives of sexual initiation among teenagers (with a significant percentage starting their sexual activity at the age of 15), and the quality of the sexual lives of adults (average number of sexual partners, sexual infidelity and sexual satisfaction). Conclusion. There is dependence between the type of relationship and the performance or non-performance of sexual activity, as well as the quality of the relationship. Among both adolescents and adults, remaining in a stable relationship (partnership or marriage) promotes loyalty. The performance of sexual goals turns out to be an important mechanism regulating the interpersonal aspects of a relationship, influencing their perception and evaluation.

  4. Intercultural Polish-Chinese QQing

    Directory of Open Access Journals (Sweden)

    Elżbieta Gajek


    Full Text Available Working in tandem with the use of information and communication technologies is a well-known and frequently used method of supporting the learning of foreign languages in authentic communication. It is based on a constructivist approach to teaching. In the reported case study, Polish and Chinese students discussed pre-prepared topics in English. The work shows the potential of e-learning at the micro level, as the language and intercultural task is implemented into an academic course without modification of the objectives and learning outcomes of the course. Evaluation carried out at the end of the project indicates that both groups perceived the task as a significant linguistic, cultural and personal experience. They stressed the importance of sharing “culture for culture” as the partner culture was new for most of them. The ability to talk and respond to information which was often strange, from the point of view of their own culture, allowed for learning intercultural competence ‘in action’.

  5. Analytic tendencies in modern Polish and Russian

    Directory of Open Access Journals (Sweden)

    Wojciech Sosnowski


    Full Text Available Modern Polish and Russian are characterized by some features which demonstrate an increasing level of analytism. In the process of transformation from a synthetic to an analytical language, a crucial role is played by prepositional units. In this research, analytism is understood in a traditional way as a morphological and syntactic phenomenon. The fact that the synthetic structure of a language may, in some conditions, turn into an analytical one, as happened in the case of Bulgarian and Macedonian, has long intrigued linguists, and has made me attempt to answer the question: What is the condition of modern Polish and Russian, which are languages with a rich literary tradition and solid grammatical norms, and which belong to the group of synthetic languages? The analytical tendencies in morphology include the following: a decrease in the number of cases in all inflected parts of speech; a more frequent use of uninflected nouns and adjectives; the growing importance of nouns with common gender, and, in particular, the use of forms of masculine gender to depict feminine gender; differences in expressing collectiveness in a group of nouns (using collective meaning for forms that have singular meaning); substituting case forms with prepositions; substituting case forms with subordinate clauses; substituting case forms with “helper” words. Analytical tendencies in the area of numeral functioning include: substituting inflected forms of ordinal numerals with cardinal ones; the gradual disappearance of the inflection of numerals; confusing the forms of noun cases after numerals; the disappearing declension of collective numerals; displacing other cases with so-called simple cases; changing the syntactical position in which the numeral should be inflected; abandoning the declension of the first elements of collective numerals. During the study of analytic tendencies in morphology, it was necessary to examine

  6. Ulisse Aldrovandi's Color Sensibility: Natural History, Language and the Lay Color Practices of Renaissance Virtuosi. (United States)

    Pugliano, Valentina


    Famed for his collection of drawings of naturalia and his thoughts on the relationship between painting and natural knowledge, it now appears that the Bolognese naturalist Ulisse Aldrovandi (1522-1605) also pondered specifically color and pigments, compiling not only lists and diagrams of color terms but also a full-length unpublished manuscript entitled De coloribus or Trattato dei colori. Introducing these writings for the first time, this article portrays a scholar not so much interested in the materiality of pigment production, as in the cultural history of hues. It argues that these writings constituted an effort to build a language of color, in the sense both of a standard nomenclature of hues and of a lexicon, a dictionary of their denotations and connotations as documented in the literature of ancients and moderns. This language would serve the naturalist in his artistic patronage and his natural historical studies, where color was considered one of the most reliable signs for the correct identification of specimens, and a guarantee of accuracy in their illustration. Far from being an exception, Aldrovandi's 'color sensibility' spoke of that of his university-educated nature-loving peers.

  7. Natural Language Processing Approach for Searching the Quran: Quick and Intuitive

    Directory of Open Access Journals (Sweden)

    Zainal Abidah


    Full Text Available The Quran is a scripture that acts as the main reference for people whose religion is Islam. It covers information from politics to science, with a vast amount of information that requires effort to uncover the knowledge behind it. Today, the emergence of smartphones has led to the development of a wide range of applications for enhancing knowledge-seeking activities. This project proposes a mobile application that takes a natural language approach to searching topics in the Quran based on keyword searching. The benefit of the application is two-fold: it is intuitive and it saves time.

  8. Semi-supervised learning and domain adaptation in natural language processing

    CERN Document Server

    Søgaard, Anders


    This book introduces basic supervised learning algorithms applicable to natural language processing (NLP) and shows how the performance of these algorithms can often be improved by exploiting the marginal distribution of large amounts of unlabeled data. One reason for that is data sparsity, i.e., the limited amounts of data we have available in NLP. However, in most real-world NLP applications our labeled data is also heavily biased. This book introduces extensions of supervised learning algorithms to cope with data sparsity and different kinds of sampling bias. This book is intended to be both

  9. Knowledge acquisition from natural language for expert systems based on classification problem-solving methods (United States)

    Gomez, Fernando


    It is shown how certain kinds of domain independent expert systems based on classification problem-solving methods can be constructed directly from natural language descriptions by a human expert. The expert knowledge is not translated into production rules. Rather, it is mapped into conceptual structures which are integrated into long-term memory (LTM). The resulting system is one in which problem-solving, retrieval and memory organization are integrated processes. In other words, the same algorithm and knowledge representation structures are shared by these processes. As a result of this, the system can answer questions, solve problems or reorganize LTM.

  10. Detecting inpatient falls by using natural language processing of electronic medical records

    Directory of Open Access Journals (Sweden)

    Toyabe Shin-ichi


    Full Text Available Abstract Background: Incident reporting is the most common method for detecting adverse events in a hospital. However, under-reporting or non-reporting and delay in submission of reports are problems that prevent early detection of serious adverse events. The aim of this study was to determine whether it is possible to promptly detect serious injuries after inpatient falls by using a natural language processing method and to determine which data source is the most suitable for this purpose. Methods: We tried to detect adverse events from narrative text data of electronic medical records by using a natural language processing method. We made syntactic category decision rules to detect inpatient falls from text data in electronic medical records. We compared how often the true fall events were recorded in various sources of data including progress notes, discharge summaries, image order entries and incident reports. We applied the rules to these data sources and compared F-measures to detect falls between these data sources with reference to the results of a manual chart review. The lag time between event occurrence and data submission and the degree of injury were compared. Results: We made 170 syntactic rules to detect inpatient falls by using a natural language processing method. Information on true fall events was most frequently recorded in progress notes (100%), incident reports (65.0%) and image order entries (12.5%). However, the F-measure to detect falls using the rules was poor when using progress notes (0.12) and discharge summaries (0.24) compared with that when using incident reports (1.00) and image order entries (0.91). Since the results suggested that incident reports and image order entries were possible data sources for prompt detection of serious falls, we focused on a comparison of falls found by incident reports and image order entries. Injury caused by falls found by image order entries was significantly more severe than falls detected by

  11. Visualizing Patient Journals by Combining Vital Signs Monitoring and Natural Language Processing

    DEFF Research Database (Denmark)

    Vilic, Adnan; Petersen, John Asger; Hoppe, Karsten


    This paper presents a data-driven approach to graphically presenting text-based patient journals while still maintaining all textual information. The system first creates a timeline representation of a patient’s physiological condition during an admission, which is assessed by electronically...... monitoring vital signs and then combining these into Early Warning Scores (EWS). Hereafter, techniques from Natural Language Processing (NLP) are applied on the existing patient journal to extract all entries. Finally, the two methods are combined into an interactive timeline featuring the ability to see...... drastic changes in the patient’s health, and thereby enabling staff to see where in the journal critical events have taken place....

  12. Systemic functional grammar in natural language generation linguistic description and computational representation

    CERN Document Server

    Teich, Elke


    This volume deals with the computational application of systemic functional grammar (SFG) for natural language generation. In particular, it describes the implementation of a fragment of the grammar of German in the computational framework of KOMET-PENMAN for multilingual generation. The text also presents a specification of explicit well-formedness constraints on syntagmatic structure which are defined in the form of typed feature structures. It thus achieves a model of systemic functional grammar that unites both the strengths of systemics, such as stratification, functional diversification

  13. Visualizing Patient Journals by Combining Vital Signs Monitoring and Natural Language Processing

    DEFF Research Database (Denmark)

    Vilic, Adnan; Petersen, John Asger; Hoppe, Karsten


    This paper presents a data-driven approach to graphically presenting text-based patient journals while still maintaining all textual information. The system first creates a timeline representation of a patient’s physiological condition during an admission, which is assessed by electronically...... monitoring vital signs and then combining these into Early Warning Scores (EWS). Hereafter, techniques from Natural Language Processing (NLP) are applied on the existing patient journal to extract all entries. Finally, the two methods are combined into an interactive timeline featuring the ability to see...... drastic changes in the patient’s health, and thereby enabling staff to see where in the journal critical events have taken place....

  14. Language and Interactional Discourse: Deconstructing the Talk-Generating Machinery in Natural Conversation

    Directory of Open Access Journals (Sweden)

    Amaechi Uneke Enyi


    Full Text Available The study, entitled “Language and Interactional Discourse: Deconstructing the Talk-Generating Machinery in Natural Conversation,” is an analysis of spontaneous and informal conversation. The study, carried out in the theoretical and methodological tradition of Ethnomethodology, was aimed at explicating how ordinary talk is organized and produced, how people coordinate their talk-in-interaction, how meanings are determined, and the role of talk in the wider social processes. The study followed the basic assumption of conversation analysis, which is that talk is not just a product of two ‘speakers-hearers’ who attempt to exchange information or convey messages to each other. Rather, participants in conversation are seen to be mutually orienting to, and collaborating in order to achieve, orderly and meaningful communication. The analytic objective is therefore to make clear the procedures on which speakers rely to produce utterances and by which they make sense of other speakers’ talk. The datum used for this study was a recorded informal conversation between two (and later three) middle-class civil servants who are friends. The recording was done in such a way that the participants were not aware that they were being recorded. The recording was later transcribed in a way that we believe is faithful to the spontaneity and informality of the talk. Our findings showed that conversation has its own features and is an ordered and structured day-by-day social event. Specifically, utterances are designed and informed by organized procedures, methods and resources which are tied to the contexts in which they are produced, and which participants are privy to by virtue of their membership of a culture or a natural language community. Keywords: Language, Discourse and Conversation

  15. Natural Conversation Reconstruction Tasks: The Language Classroom as a Meeting Place

    Directory of Open Access Journals (Sweden)

    Jun Ohashi


    Full Text Available This paper, drawing on Pratt’s notion of ‘transculturation’ and Bhabha’s ‘third space’, presents an example of language learning tasks that empower learners’ agency and promote their cross-cultural awareness and sensitivities to a different set of cultural expectations, using naturally occurring Japanese thanking episodes. The paper discusses the merits of Natural Conversation Reconstruction Tasks (NCRTs) as a practical method for helping L2 learners develop this ‘intercultural competence’. It is based on a qualitative study of the results of one NCRT created for use in the context of teaching Japanese as an L2 in a multicultural society. It suggests the NCRT encourages the learners to explore the intersection where language use, speaker intention and L1 and L2 cultural norms meet. Such a process helps the learners become aware of socially expected patterns of communication in L1 and L2 in terms of the choices of speech act, formulaic expressions, sequential organization and politeness orientation. The learners’ comments suggest that the NCRT helps learners transcend their cultural boundaries by overcoming their narrow understanding of ‘thanking’ as ‘expressions of gratitude and appreciation’ and by cross-culturally widening their views of what counts as thanking. The NCRT with rich contextual information promotes the learners’ intercultural awareness, sensitivity to context and intercultural exploration in the space between L1 and L2, where they have authority and freedom of making sense of conversations, and pragmatics is fully integrated into language pedagogy.

  16. Formal ontology for natural language processing and the integration of biomedical databases. (United States)

    Simon, Jonathan; Dos Santos, Mariana; Fielding, James; Smith, Barry


    The central hypothesis underlying this communication is that the methodology and conceptual rigor of a philosophically inspired formal ontology can bring significant benefits in the development and maintenance of application ontologies [A. Flett, M. Dos Santos, W. Ceusters, Some Ontology Engineering Procedures and their Supporting Technologies, EKAW2002, 2003]. This hypothesis has been tested in the collaboration between Language and Computing (L&C), a company specializing in software for supporting natural language processing especially in the medical field, and the Institute for Formal Ontology and Medical Information Science (IFOMIS), an academic research institution concerned with the theoretical foundations of ontology. In the course of this collaboration L&C's ontology, LinKBase, which is designed to integrate and support reasoning across a plurality of external databases, has been subjected to a thorough auditing on the basis of the principles underlying IFOMIS's Basic Formal Ontology (BFO) [B. Smith, Basic Formal Ontology, 2002.]. The goal is to transform a large terminology-based ontology into one with the ability to support reasoning applications. Our general procedure has been the implementation of a meta-ontological definition space in which the definitions of all the concepts and relations in LinKBase are standardized in the framework of first-order logic. In this paper we describe how this principles-based standardization has led to a greater degree of internal coherence of the LinKBase structure, and how it has facilitated the construction of mappings between external databases using LinKBase as translation hub. We argue that the collaboration here described represents a new phase in the quest to solve the so-called "Tower of Babel" problem of ontology integration [F. Montayne, J. Flanagan, Formal Ontology: The Foundation for Natural Language Processing, 2003.].

  17. Natural language processing systems for capturing and standardizing unstructured clinical information: A systematic review. (United States)

    Kreimeyer, Kory; Foster, Matthew; Pandey, Abhishek; Arya, Nina; Halford, Gwendolyn; Jones, Sandra F; Forshee, Richard; Walderhaug, Mark; Botsis, Taxiarchis


    We followed a systematic approach based on the Preferred Reporting Items for Systematic Reviews and Meta-Analyses to identify existing clinical natural language processing (NLP) systems that generate structured information from unstructured free text. Seven literature databases were searched with a query combining the concepts of natural language processing and structured data capture. Two reviewers screened all records for relevance during two screening phases, and information about clinical NLP systems was collected from the final set of papers. A total of 7149 records (after removing duplicates) were retrieved and screened, and 86 were determined to fit the review criteria. These papers contained information about 71 different clinical NLP systems, which were then analyzed. The NLP systems address a wide variety of important clinical and research tasks. Certain tasks are well addressed by the existing systems, while others remain as open challenges that only a small number of systems attempt, such as extraction of temporal information or normalization of concepts to standard terminologies. This review has identified many NLP systems capable of processing clinical free text and generating structured output, and the information collected and evaluated here will be important for prioritizing development of new approaches for clinical NLP. Copyright © 2017 Elsevier Inc. All rights reserved.

  18. Efficient Queries of Stand-off Annotations for Natural Language Processing on Electronic Medical Records. (United States)

    Luo, Yuan; Szolovits, Peter


    In natural language processing, stand-off annotation uses the starting and ending positions of an annotation to anchor it to the text and stores the annotation content separately from the text. We address the fundamental problem of efficiently storing stand-off annotations when applying natural language processing on narrative clinical notes in electronic medical records (EMRs) and efficiently retrieving such annotations that satisfy position constraints. Efficient storage and retrieval of stand-off annotations can facilitate tasks such as mapping unstructured text to electronic medical record ontologies. We first formulate this problem as the interval query problem, for which optimal query/update time is in general logarithmic. We next perform a tight time complexity analysis on the basic interval tree query algorithm and show its nonoptimality when applied to a collection of 13 query types from Allen's interval algebra. We then study two closely related state-of-the-art interval query algorithms, proposed query reformulations, and augmentations to the second algorithm. Our proposed algorithm achieves logarithmic time stabbing-max query time complexity and solves the stabbing-interval query tasks on all of Allen's relations in logarithmic time, attaining the theoretical lower bound. Updating time is kept logarithmic and the space requirement is kept linear at the same time. We also discuss interval management in external memory models and higher dimensions.
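
    To make the idea of a position-constrained query over stand-off annotations concrete, the following is a minimal sketch of a textbook centered interval tree answering a stabbing query ("which annotations cover this character offset?"). It is not the algorithm proposed in the record above; the `Annotation` class, the labels, and the sample sentence are hypothetical and chosen only for illustration, and spans are assumed to be half-open and non-empty.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Annotation:
    start: int  # inclusive character offset into the note text
    end: int    # exclusive character offset; assumed to be > start
    label: str  # annotation content, stored apart from the text


class IntervalTree:
    """Static centered interval tree over half-open spans [start, end)."""

    def __init__(self, annotations: List[Annotation]):
        if any(a.start >= a.end for a in annotations):
            raise ValueError("annotations must cover at least one character")
        self.root = self._build(list(annotations))

    def _build(self, anns):
        if not anns:
            return None
        # The median start offset is the split point; every annotation that
        # starts exactly there covers it, so each child is strictly smaller.
        starts = sorted(a.start for a in anns)
        center = starts[len(starts) // 2]
        here = [a for a in anns if a.start <= center < a.end]
        return {
            "center": center,
            "by_start": sorted(here, key=lambda a: a.start),
            "by_end": sorted(here, key=lambda a: a.end, reverse=True),
            "left": self._build([a for a in anns if a.end <= center]),
            "right": self._build([a for a in anns if a.start > center]),
        }

    def stab(self, pos: int) -> List[Annotation]:
        """Return every annotation whose span covers offset `pos`."""
        found = []
        node = self.root
        while node is not None:
            if pos < node["center"]:
                # Spans stored here end after center, so only start matters.
                for a in node["by_start"]:
                    if a.start > pos:
                        break
                    found.append(a)
                node = node["left"]
            else:
                # Spans stored here start at or before center, so only end matters.
                for a in node["by_end"]:
                    if a.end <= pos:
                        break
                    found.append(a)
                node = node["right"] if pos > node["center"] else None
        return found


# Hypothetical annotations over the text "Patient denies chest pain."
tree = IntervalTree([
    Annotation(0, 7, "PERSON_ROLE"),
    Annotation(8, 14, "NEGATION"),
    Annotation(15, 25, "SYMPTOM"),
    Annotation(8, 25, "NEGATED_FINDING"),
])
labels_at_16 = sorted(a.label for a in tree.stab(16))
```

    In this sketch a query visits one root-to-leaf path and scans only the matching spans at each node, which is why interval trees are a common baseline for this kind of lookup; updates are not supported here.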


    Directory of Open Access Journals (Sweden)

    Alexandr I Krupnov


    Full Text Available The article discusses the results of an empirical study of the association between persistence variables and academic achievement in foreign languages. The sample includes students of the Faculty of Physics, Mathematics and Natural Science at the RUDN University (n = 115), divided into 5 subsamples, two of which are featured in the present study (the most and the least successful student subsamples). Persistence as a personality trait is studied within A.I. Krupnov’s system-functional approach. A.I. Krupnov’s paper-and-pencil test was used to measure persistence variables. Academic achievement was measured according to four parameters: Phonetics, Grammar, Speaking and Political vocabulary, based on the grades students received during the academic year. The analysis revealed that persistence displays different associations with academic achievement variables in the more and less successful student subsamples; the general prominence of this trait is more important for unsuccessful students. Phonetics is the academic achievement variable most associated with persistence due to its nature: it is a skill one can acquire through hard work and practice, which is the definition of persistence. Grammar as an academic achievement variable is not associated with persistence and probably relates to other factors. Unsuccessful students may have difficulties in separating various aspects of language acquisition from each other, which teachers should take into consideration.

  20. Text to Speech Berbasis Natural Language pada Aplikasi Pembelajaran Tenses Bahasa Inggris [Natural Language-Based Text to Speech in an Application for Learning English Tenses]

    Directory of Open Access Journals (Sweden)

    Amak Yunus


    Full Text Available Language is a systematic way of communicating using sounds or symbols that carry meaning, spoken through the mouth. Language is also written following the applicable rules. One of the most widely used languages in the world is English. However, there are some constraints when learning from a teacher or instructor. The time a teacher can give is limited to school or tutoring hours. Once students leave school or tutoring, they must study English independently. From these problems arose the idea of a study on building an application that can teach students how to learn English independently, covering the transformation of positive sentences into negative and interrogative sentences. In addition, the application can teach how to pronounce sentences in English. In essence, the contribution of this research is that stakeholders from junior high school (SMP) through senior high and vocational school (SMU/SMK) level can use a text-to-speech application based on natural language processing to learn English tenses. The application can play back English sentences aloud and can construct interrogative and negative sentences from their positive form in several English tenses. Keywords: Natural language processing, Text to speech


    Directory of Open Access Journals (Sweden)

    Heriyanto Heriyanto


    Full Text Available Natural Language Processing (NLP) for identifying Al-Quran recitation rules can analyse text input in the form of sentences in everyday human language. Processing begins by recognizing the syntax rules and the existing production rules through scanning and token identification; the resulting tokens are then parsed and matched against the existing production rules. The result of the matching is either accepted or rejected; input that does not satisfy the existing production rules produces an error message. Input that conforms to the production rules yields the requested sentence presenting the Al-Quran recitation rule. Knowing the correct Al-Quran recitation rules in accordance with tartil is, as a science, fardhu kifayah; studying and identifying the recitation rules by means of NLP can therefore support learning to read the Al-Quran in accordance with the science of tajwid. The NLP can recognize the recitation rules requested by the user for the surahs Al-Fatihah and Al-Baqarah, Juz 1. The resulting application software displays the Al-Quran recitation rules, with the NLP analysing the requested recitation rule from text input. The text is read according to the production rules in use, so that the recitation rule corresponding to the entered text can be determined.

  2. Teaching the tacit knowledge of programming to novices with natural language tutoring (United States)

    Lane, H. Chad; Vanlehn, Kurt


    For beginning programmers, inadequate problem solving and planning skills are among the most salient of their weaknesses. In this paper, we test the efficacy of natural language tutoring to teach and scaffold acquisition of these skills. We describe ProPL (Pro-PELL), a dialogue-based intelligent tutoring system that elicits goal decompositions and program plans from students in natural language. The system uses a variety of tutoring tactics that leverage students' intuitive understandings of the problem, how it might be solved, and the underlying concepts of programming. We report the results of a small-scale evaluation comparing students who used ProPL with a control group who read the same content. Our primary findings are that students who received tutoring from ProPL seem to have developed an improved ability to solve the composition problem and displayed behaviors that suggest they were able to think at greater levels of abstraction than students in the read-only group.

  3. A semantic-based approach for querying linked data using natural language

    KAUST Repository

    Paredes-Valverde, Mario Andrés


    The Semantic Web aims to provide Web information with a well-defined meaning and make it understandable not only by humans but also by computers, thus allowing the automation, integration and reuse of high-quality information across different applications. However, current information retrieval mechanisms for semantic knowledge bases are intended to be used only by expert users. In this work, we propose a natural language interface that gives non-expert users access to this kind of information by formulating queries in natural language. The present approach uses a domain-independent ontology model to represent the question's structure and context. This model also allows determination of the answer type expected by the user based on a proposed question classification. To prove the effectiveness of our approach, we have conducted an evaluation in the music domain using LinkedBrainz, an effort to provide the MusicBrainz information as structured data on the Web by means of Semantic Web technologies. Our proposal obtained encouraging results based on the F-measure metric, ranging from 0.74 to 0.82 for a corpus of questions generated by a group of real-world end users. © The Author(s) 2015.

  4. Knowledge-based machine indexing from natural language text: Knowledge base design, development, and maintenance (United States)

    Genuardi, Michael T.


    One strategy for machine-aided indexing (MAI) is to provide a concept-level analysis of the textual elements of documents or document abstracts. In such systems, natural-language phrases are analyzed in order to identify and classify concepts related to a particular subject domain. The overall performance of these MAI systems is largely dependent on the quality and comprehensiveness of their knowledge bases. These knowledge bases function to (1) define the relations between a controlled indexing vocabulary and natural language expressions; (2) provide a simple mechanism for disambiguation and the determination of relevancy; and (3) allow the extension of concept-hierarchical structure to all elements of the knowledge file. After a brief description of the NASA Machine-Aided Indexing system, concerns related to the development and maintenance of MAI knowledge bases are discussed. Particular emphasis is given to statistically-based text analysis tools designed to aid the knowledge base developer. One such tool, the Knowledge Base Building (KBB) program, presents the domain expert with a well-filtered list of synonyms and conceptually-related phrases for each thesaurus concept. Another tool, the Knowledge Base Maintenance (KBM) program, functions to identify areas of the knowledge base affected by changes in the conceptual domain (for example, the addition of a new thesaurus term). An alternate use of the KBM as an aid in thesaurus construction is also discussed.

  5. Selected Topics on Systems Modeling and Natural Language Processing: Editorial Introduction to the Issue 7 of CSIMQ

    Directory of Open Access Journals (Sweden)

    Witold Andrzejewski


    Full Text Available The seventh issue of Complex Systems Informatics and Modeling Quarterly presents five papers devoted to two distinct research topics: systems modeling and natural language processing (NLP). Both of these subjects are very important in computer science. Through modeling we can simplify the studied problem by concentrating on only one aspect at a time. Moreover, a properly constructed model allows the modeler to work on higher levels of abstraction without having to concentrate on details. Since the size and complexity of information systems grow rapidly, creating good models of such systems is crucial. The analysis of natural language is slowly becoming a widely used tool in commerce and day-to-day life. Opinion mining allows recommender systems to provide accurate recommendations based on user-generated reviews. Speech recognition and NLP are the basis for such widely used personal assistants as Apple’s Siri, Microsoft’s Cortana, and Google Now. While a lot of work has already been done on natural language processing, the research usually concerns widely used languages, such as English. Consequently, natural language processing in languages other than English is a very relevant subject and is addressed in this issue.

  6. The Teaching of English in Polish Educational Institutions (United States)

    Hughes, Teresa Ann; Butler, Norman L.; Kritsonis, William Allan; Herrington, David


    This article discusses the strengths and weaknesses of native and non-native teachers of English in Polish schools, and is the result of Dr. Butler's experience as a teacher of English in Poland. It is argued that native teachers of English should be employed in Poland because they teach in their own language, use current idioms, provide…

  7. The Meanings of Learning as Described by Polish Migrant Bloggers (United States)

    Popow, Monika


    This paper addresses the meanings given to learning by Polish migrant bloggers. It presents the result of an analysis of ten blogs, written by Poles living abroad. The blogs under analysis were chosen on the basis of a random sample. The analysed material was categorised by recurring themes, which included: learning in Poland, language acquisition,…

    Gielecki, J; Zurada, A; Osman, N


    Professional terminology is commonplace, particularly in the fields of mathematics, medicine, veterinary and natural sciences. The use of the terminology can be international, as it is with Anatomical Terminology (AT). In the early age of modern education, anatomists adopted Latin as the international language for AT. However, at the end of the 20th century, the English language became more predominant around the world. It can be said that the AT is a specific collection of scientific terms. One of the major flaws in early AT was that body structures were described by varying names, while some of the terms were irrational in nature and confusing. At this time, different international committees were working on preparing a unified final version of the AT, which in the end consisted of 5,640 terms (4,286 originally from the Basle Nomina Anatomica, BNA). Also, each country wanted to have its own nomenclature. In order to accomplish this, each country based its nomenclature on the international AT, and then translated it into its own language. The history of the Polish Anatomical Terminology (PAT) is unique, and follows the events of history. It was first published in 1898, at a time when the territory of Poland was partitioned by its neighbours. For 150 years, Polish culture and language were subject to Russification and Germanization policies. It is important to note that, even under such difficult circumstances, the PAT was the first national AT in the world. The PAT was a union of the accepted first BNA in Latin and the original Polish anatomical equivalents. This union formed the basis for theoretical and clinical medicine in Poland.

  9. A study of the very high order natural user language (with AI capabilities) for the NASA space station common module (United States)

    Gill, E. N.


    The requirements are identified for a very high order natural language to be used by crew members on board the Space Station. The hardware facilities, databases, realtime processes, and software support are discussed. The operations and capabilities that will be required in both normal (routine) and abnormal (nonroutine) situations are evaluated. A structure and syntax for an interface (front-end) language to satisfy the above requirements are recommended.

  10. CLAMP - a toolkit for efficiently building customized clinical natural language processing pipelines. (United States)

    Soysal, Ergin; Wang, Jingqi; Jiang, Min; Wu, Yonghui; Pakhomov, Serguei; Liu, Hongfang; Xu, Hua


    Existing general clinical natural language processing (NLP) systems such as MetaMap and Clinical Text Analysis and Knowledge Extraction System have been successfully applied to information extraction from clinical text. However, end users often have to customize existing systems for their individual tasks, which can require substantial NLP skills. Here we present CLAMP (Clinical Language Annotation, Modeling, and Processing), a newly developed clinical NLP toolkit that provides not only state-of-the-art NLP components, but also a user-friendly graphic user interface that can help users quickly build customized NLP pipelines for their individual applications. Our evaluation shows that the CLAMP default pipeline achieved good performance on named entity recognition and concept encoding. We also demonstrate the efficiency of the CLAMP graphic user interface in building customized, high-performance NLP pipelines with 2 use cases, extracting smoking status and lab test values. CLAMP is publicly available for research use, and we believe it is a unique asset for the clinical NLP community. © The Author 2017. Published by Oxford University Press on behalf of the American Medical Informatics Association. All rights reserved.

  11. Towards symbiosis in knowledge representation and natural language processing for structuring clinical practice guidelines. (United States)

    Weng, Chunhua; Payne, Philip R O; Velez, Mark; Johnson, Stephen B; Bakken, Suzanne


    The successful adoption by clinicians of evidence-based clinical practice guidelines (CPGs) contained in clinical information systems requires efficient translation of free-text guidelines into computable formats. Natural language processing (NLP) has the potential to improve the efficiency of such translation. However, it is laborious to develop NLP to structure free-text CPGs using existing formal knowledge representations (KR). In response to this challenge, this vision paper discusses the value and feasibility of supporting symbiosis in text-based knowledge acquisition (KA) and KR. We compare two ontologies: (1) an ontology manually created by domain experts for CPG eligibility criteria and (2) an upper-level ontology derived from a semantic pattern-based approach for automatic KA from CPG eligibility criteria text. Then we discuss the strengths and limitations of interweaving KA and NLP for KR purposes and important considerations for achieving the symbiosis of KR and NLP for structuring CPGs to achieve evidence-based clinical practice.

  12. On application of image analysis and natural language processing for music search (United States)

    Gwardys, Grzegorz


    In this paper, I investigate the problem of finding the most similar music tracks using techniques popular in Natural Language Processing, such as TF-IDF and LDA. I define a document as a music track. Each music track is transformed into a spectrogram, which makes it possible to use well-known techniques for extracting words from images. I used the SURF operation to detect characteristic points and a novel approach for their description. Standard k-means was used for clustering. Clustering here amounts to building the dictionary, so afterwards I can transform spectrograms into text documents and perform TF-IDF and LDA. Finally, I can run a query in the resulting vector space. The research was done on 16 music tracks for training and 336 for testing, split into four categories: Hiphop, Jazz, Metal and Pop. Although the technique used is completely unsupervised, the results are satisfactory and encourage further research.
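
    The last two steps of that pipeline (turning tracks into "documents" of cluster labels and querying a TF-IDF vector space) can be sketched as follows. This is only an illustration under stated assumptions: the spectrogram, SURF and k-means stages are skipped, the visual-word labels such as `w1` are made up, the function names are hypothetical, and the weighting shown is one common TF-IDF variant rather than the one used in the record above.

```python
import math
from collections import Counter


def build_tfidf(docs):
    """Return (vectors, idf) for a list of non-empty token lists."""
    n = len(docs)
    # Document frequency: in how many tracks each visual word occurs.
    df = Counter(word for doc in docs for word in set(doc))
    idf = {word: math.log(n / count) for word, count in df.items()}
    vectors = [weigh(doc, idf) for doc in docs]
    return vectors, idf


def weigh(doc, idf):
    """Sparse TF-IDF vector for one document; unseen words get weight 0."""
    tf = Counter(doc)
    return {w: (c / len(doc)) * idf.get(w, 0.0) for w, c in tf.items()}


def cosine(u, v):
    dot = sum(x * v.get(w, 0.0) for w, x in u.items())
    norm_u = math.sqrt(sum(x * x for x in u.values()))
    norm_v = math.sqrt(sum(x * x for x in v.values()))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def most_similar(query_doc, vectors, idf):
    """Index of the indexed track closest to the query in TF-IDF space."""
    q = weigh(query_doc, idf)
    scores = [cosine(q, v) for v in vectors]
    return max(range(len(vectors)), key=scores.__getitem__)


# Hypothetical tracks, each already reduced to a sequence of visual words
# (cluster labels of keypoint descriptors taken from its spectrogram).
tracks = [
    ["w1", "w1", "w2", "w5"],
    ["w3", "w3", "w4", "w5"],
    ["w1", "w2", "w2", "w5"],
]
vectors, idf = build_tfidf(tracks)
best = most_similar(["w3", "w4", "w4"], vectors, idf)
```

    Here `best` is the index of the second track, the only one sharing informative visual words with the query; `w5` occurs in every track and therefore carries zero weight.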

  13. Workshop on using natural language processing applications for enhancing clinical decision making: an executive summary. (United States)

    Pai, Vinay M; Rodgers, Mary; Conroy, Richard; Luo, James; Zhou, Ruixia; Seto, Belinda


    In April 2012, the National Institutes of Health organized a two-day workshop entitled 'Natural Language Processing: State of the Art, Future Directions and Applications for Enhancing Clinical Decision-Making' (NLP-CDS). This report is a summary of the discussions during the second day of the workshop. Collectively, the workshop presenters and participants emphasized the need for unstructured clinical notes to be included in the decision making workflow and the need for individualized longitudinal data tracking. The workshop also discussed the need to: (1) combine evidence-based literature and patient records with machine-learning and prediction models; (2) provide trusted and reproducible clinical advice; (3) prioritize evidence and test results; and (4) engage healthcare professionals, caregivers, and patients. The overall consensus of the NLP-CDS workshop was that there are promising opportunities for NLP and CDS to deliver cognitive support for healthcare professionals, caregivers, and patients.

  14. Optimizing annotation resources for natural language de-identification via a game theoretic framework. (United States)

    Li, Muqun; Carrell, David; Aberdeen, John; Hirschman, Lynette; Kirby, Jacqueline; Li, Bo; Vorobeychik, Yevgeniy; Malin, Bradley A


    Electronic medical records (EMRs) are increasingly repurposed for activities beyond clinical care, such as to support translational research and public policy analysis. To mitigate privacy risks, healthcare organizations (HCOs) aim to remove potentially identifying patient information. A substantial quantity of EMR data is in natural language form and there are concerns that automated tools for detecting identifiers are imperfect and leak information that can be exploited by ill-intentioned data recipients. Thus, HCOs have been encouraged to invest as much effort as possible to find and detect potential identifiers, but such a strategy assumes the recipients are sufficiently incentivized and capable of exploiting leaked identifiers. In practice, such an assumption may not hold true and HCOs may overinvest in de-identification technology. The goal of this study is to design a natural language de-identification framework, rooted in game theory, which enables an HCO to optimize their investments given the expected capabilities of an adversarial recipient. We introduce a Stackelberg game to balance risk and utility in natural language de-identification. This game represents a cost-benefit model that enables an HCO with a fixed budget to minimize their investment in the de-identification process. We evaluate this model by assessing the overall payoff to the HCO and the adversary using 2100 clinical notes from Vanderbilt University Medical Center. We simulate several policy alternatives using a range of parameters, including the cost of training a de-identification model and the loss in data utility due to the removal of terms that are not identifiers. In addition, we compare policy options where, when an attacker is fined for misuse, a monetary penalty is paid to the publishing HCO as opposed to a third party (e.g., a federal regulator). 
    Our results show that when an HCO is forced to exhaust a limited budget (set to $2000 in the study), the precision and recall of the

  15. Accurate Identification of Fatty Liver Disease in Data Warehouse Utilizing Natural Language Processing. (United States)

    Redman, Joseph S; Natarajan, Yamini; Hou, Jason K; Wang, Jingqi; Hanif, Muzammil; Feng, Hua; Kramer, Jennifer R; Desiderio, Roxanne; Xu, Hua; El-Serag, Hashem B; Kanwal, Fasiha


    Natural language processing is a powerful technique of machine learning capable of maximizing data extraction from complex electronic medical records. We utilized this technique to develop algorithms capable of "reading" full-text radiology reports to accurately identify the presence of fatty liver disease. Abdominal ultrasound, computerized tomography, and magnetic resonance imaging reports were retrieved from the Veterans Affairs Corporate Data Warehouse from a random national sample of 652 patients. Radiographic fatty liver disease was determined by manual review by two physicians and verified with an expert radiologist. A split validation method was utilized for algorithm development. For all three imaging modalities, the algorithms could identify fatty liver disease with >90% recall and precision, with F-measures >90%. These algorithms could be used to rapidly screen patient records to establish a large cohort to facilitate epidemiological and clinical studies and examine the clinical course and outcomes of patients with radiographic hepatic steatosis.
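
    For readers unfamiliar with the metrics quoted above, the following small sketch shows how precision, recall and the F-measure are typically computed from an algorithm's per-report output and a manual reference standard. The labels below are invented for illustration and are not data from the study; the function name is hypothetical.

```python
def precision_recall_f1(predicted, reference):
    """Compute precision, recall and F1 for parallel lists of booleans.

    predicted[i] is the algorithm's call for report i (True = fatty liver),
    reference[i] is the manually reviewed label for the same report.
    """
    pairs = list(zip(predicted, reference))
    tp = sum(1 for p, r in pairs if p and r)        # true positives
    fp = sum(1 for p, r in pairs if p and not r)    # false positives
    fn = sum(1 for p, r in pairs if not p and r)    # false negatives
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    if precision + recall == 0.0:
        return precision, recall, 0.0
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1


# Invented example: six reports, one false positive and one false negative.
predicted = [True, True, True, False, False, True]
reference = [True, True, False, False, True, True]
precision, recall, f1 = precision_recall_f1(predicted, reference)
```

    With three true positives, one false positive and one false negative, all three values come out at 0.75; the F-measure is the harmonic mean of precision and recall, so it only exceeds 90% when both do reasonably well.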

  16. Natural Language Processing in Serious Games: A state of the art.

    Directory of Open Access Journals (Sweden)

    Davide Picca


    Full Text Available In the last decades, Natural Language Processing (NLP) has obtained a high level of success. Interactions between NLP and Serious Games have started, and some Serious Games already include NLP techniques. The objectives of this paper are twofold: on the one hand, providing a simple framework to enable analysis of potential uses of NLP in Serious Games and, on the other hand, applying the NLP framework to existing Serious Games and giving an overview of the use of NLP in pedagogical Serious Games. In this paper we present 11 serious games exploiting NLP techniques. We present them systematically, according to the following structure: first, we highlight possible uses of NLP techniques in Serious Games; second, we describe the type of NLP implemented in each specific Serious Game; and, third, we provide a link to possible purposes of use for the different actors interacting in the Serious Game.

  17. Knowledge Extraction from MEDLINE by Combining Clustering with Natural Language Processing. (United States)

    Miñarro-Giménez, Jose A; Kreuzthaler, Markus; Schulz, Stefan


    The identification of relevant predicates between co-occurring concepts in scientific literature databases like MEDLINE is crucial for using these sources for knowledge extraction, in order to obtain meaningful biomedical predications as subject-predicate-object triples. We consider the manually assigned MeSH indexing terms (main headings and subheadings) in MEDLINE records as a rich resource for extracting a broad range of domain knowledge. In this paper, we explore the combination of a clustering method for co-occurring concepts based on their related MeSH subheadings in MEDLINE with the use of SemRep, a natural language processing engine, which extracts predications from free text documents. As a result, we generated sets of clusters of co-occurring concepts and identified the most significant predicates for each cluster. The association of such predicates with the co-occurrences of the resulting clusters produces the list of predications, which were checked for relevance.

  18. Harmonization and development of resources and tools for Italian natural language processing within the PARLI project

    CERN Document Server

    Bosco, Cristina; Delmonte, Rodolfo; Moschitti, Alessandro; Simi, Maria


    The papers collected in this volume are selected as a sample of the progress in Natural Language Processing (NLP) made within the Italian NLP community and especially attested by the PARLI project. PARLI (Portale per l’Accesso alle Risorse in Lingua Italiana) is a project partially funded by the Ministero Italiano per l’Università e la Ricerca (PRIN 2008) from 2008 to 2012 for monitoring and fostering the harmonic growth and coordination of the activities of Italian NLP. It was proposed by various teams of researchers working in Italian universities and research institutions. In the spirit of the PARLI project, most of the resources and tools created within the project and described here are freely distributed; they did not end with the project itself, in the hope that they can be a key factor in the future development of computational linguistics.

  19. Integrating natural language processing and web GIS for interactive knowledge domain visualization (United States)

    Du, Fangming

    Recent years have seen a powerful shift towards data-rich environments throughout society. This has extended to a change in how the artifacts and products of scientific knowledge production can be analyzed and understood. Bottom-up approaches are on the rise that combine access to huge amounts of academic publications with advanced computer graphics and data processing tools, including natural language processing. Knowledge domain visualization is one of those multi-technology approaches, with its aim of turning domain-specific human knowledge into highly visual representations in order to better understand the structure and evolution of domain knowledge. For example, network visualizations built from co-author relations contained in academic publications can provide insight on how scholars collaborate with each other in one or multiple domains, and visualizations built from the text content of articles can help us understand the topical structure of knowledge domains. These knowledge domain visualizations need to support interactive viewing and exploration by users. Such spatialization efforts are increasingly looking to geography and GIS as a source of metaphors and practical technology solutions, even when non-georeferenced information is managed, analyzed, and visualized. When it comes to deploying spatialized representations online, web mapping and web GIS can provide practical technology solutions for interactive viewing of knowledge domain visualizations, from panning and zooming to the overlay of additional information. This thesis presents a novel combination of advanced natural language processing - in the form of topic modeling - with dimensionality reduction through self-organizing maps and the deployment of web mapping/GIS technology towards intuitive, GIS-like, exploration of a knowledge domain visualization. A complete workflow is proposed and implemented that processes any corpus of input text documents into a map form and leverages a web

  20. A Natural Language Intelligent Tutoring System for Training Pathologists - Implementation and Evaluation (United States)

    El Saadawi, Gilan M.; Tseytlin, Eugene; Legowski, Elizabeth; Jukic, Drazen; Castine, Melissa; Fine, Jeffrey; Gormley, Robert; Crowley, Rebecca S.


    Introduction: We developed and evaluated a Natural Language Interface (NLI) for an Intelligent Tutoring System (ITS) in Diagnostic Pathology. The system teaches residents to examine pathologic slides and write accurate pathology reports while providing immediate feedback on errors they make in their slide review and diagnostic reports. Residents can ask for help at any point in the case, and will receive context-specific feedback. Research Questions: We evaluated (1) the performance of our natural language system, (2) the effect of the system on learning, (3) the effect of feedback timing on learning gains, and (4) the effect of ReportTutor on performance to self-assessment correlations. Methods: The study uses a crossover 2×2 factorial design. We recruited 20 subjects from 4 academic programs. Subjects were randomly assigned to one of the four conditions - two conditions for the immediate interface, and two for the delayed interface. An expert dermatopathologist created a reference standard and 2 board certified AP/CP pathology fellows manually coded the residents' assessment reports. Subjects were given the opportunity to self-grade their performance and we used a survey to determine student response to both interfaces. Results: Our results show a highly significant improvement in report writing after one tutoring session, with a 4-fold increase in the learning gains with both interfaces but no effect of feedback timing on performance gains. Residents who used the immediate feedback interface first experienced a feature learning gain that is correlated with the number of cases they viewed. There was no correlation between performance and self-assessment in either condition. PMID:17934789

  1. Wikipedia and medicine: quantifying readership, editors, and the significance of natural language. (United States)

    Heilman, James M; West, Andrew G


    Wikipedia is a collaboratively edited encyclopedia. One of the most popular websites on the Internet, it is known to be a frequently used source of health care information by both professionals and the lay public. This paper quantifies the production and consumption of Wikipedia's medical content along 4 dimensions. First, we measured the amount of medical content in both articles and bytes and, second, the citations that supported that content. Third, we analyzed the medical readership against that of other health care websites between Wikipedia's natural language editions and its relationship with disease prevalence. Fourth, we surveyed the quantity/characteristics of Wikipedia's medical contributors, including year-over-year participation trends and editor demographics. Using a well-defined categorization infrastructure, we identified medically pertinent English-language Wikipedia articles and links to their foreign language equivalents. With these, Wikipedia can be queried to produce metadata and full texts for entire article histories. Wikipedia also makes available hourly reports that aggregate reader traffic at per-article granularity. An online survey was used to determine the background of contributors. Standard mining and visualization techniques (eg, aggregation queries, cumulative distribution functions, and/or correlation metrics) were applied to each of these datasets. Analysis focused on year-end 2013, but historical data permitted some longitudinal analysis. Wikipedia's medical content (at the end of 2013) was made up of more than 155,000 articles and 1 billion bytes of text across more than 255 languages. This content was supported by more than 950,000 references. Content was viewed more than 4.88 billion times in 2013. This makes it one of if not the most viewed medical resource(s) globally. The core editor community numbered less than 300 and declined over the past 5 years. The members of this community were half health care providers and 85

  2. A UMLS-based spell checker for natural language processing in vaccine safety. (United States)

    Tolentino, Herman D; Matters, Michael D; Walop, Wikke; Law, Barbara; Tong, Wesley; Liu, Fang; Fontelo, Paul; Kohl, Katrin; Payne, Daniel C


    The Institute of Medicine has identified patient safety as a key goal for health care in the United States. Detecting vaccine adverse events is an important public health activity that contributes to patient safety. Reports about adverse events following immunization (AEFI) from surveillance systems contain free-text components that can be analyzed using natural language processing. To extract Unified Medical Language System (UMLS) concepts from free text and classify AEFI reports based on concepts they contain, we first needed to clean the text by expanding abbreviations and shortcuts and correcting spelling errors. Our objective in this paper was to create a UMLS-based spelling error correction tool as a first step in the natural language processing (NLP) pipeline for AEFI reports. We developed spell checking algorithms using open source tools. We used de-identified AEFI surveillance reports to create free-text data sets for analysis. After expansion of abbreviated clinical terms and shortcuts, we performed spelling correction in four steps: (1) error detection, (2) word list generation, (3) word list disambiguation and (4) error correction. We then measured the performance of the resulting spell checker by comparing it to manual correction. We used 12,056 words to train the spell checker and tested its performance on 8,131 words. During testing, sensitivity, specificity, and positive predictive value (PPV) for the spell checker were 74% (95% CI: 74-75), 100% (95% CI: 100-100), and 47% (95% CI: 46%-48%), respectively. We created a prototype spell checker that can be used to process AEFI reports. We used the UMLS Specialist Lexicon as the primary source of dictionary terms and the WordNet lexicon as a secondary source. We used the UMLS as a domain-specific source of dictionary terms to compare potentially misspelled words in the corpus. The prototype sensitivity was comparable to currently available tools, but the specificity was much superior. 
The slow processing
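
    The four correction steps named in the abstract (error detection, word list generation, word list disambiguation, error correction) can be illustrated with a minimal Python sketch. This is not the authors' implementation: the tiny LEXICON stands in for the UMLS Specialist Lexicon and WordNet, string similarity from the standard-library difflib module stands in for whatever candidate generation the tool used, and "disambiguation" is reduced to taking the top-ranked candidate. All names below are hypothetical.

```python
import difflib

# Hypothetical mini-lexicon; the paper used the UMLS Specialist Lexicon and WordNet.
LEXICON = {"fever", "swelling", "injection", "site", "rash", "vaccine", "after", "at", "and"}


def detect_errors(tokens, lexicon=LEXICON):
    """Step 1 (error detection): flag tokens that are absent from the lexicon."""
    return [t for t in tokens if t.lower() not in lexicon]


def candidates(word, lexicon=LEXICON, n=3, cutoff=0.7):
    """Step 2 (word list generation): similar lexicon entries, best match first."""
    return difflib.get_close_matches(word.lower(), sorted(lexicon), n=n, cutoff=cutoff)


def correct(text, lexicon=LEXICON):
    """Steps 3-4, simplified: choose the top candidate and substitute it."""
    out = []
    for tok in text.split():
        if tok.lower() in lexicon:
            out.append(tok)
            continue
        cands = candidates(tok, lexicon)
        # Leave the token unchanged when no candidate is similar enough.
        out.append(cands[0] if cands else tok)
    return " ".join(out)


def sensitivity_specificity_ppv(tp, fp, tn, fn):
    """Evaluation against manual correction, using the usual definitions."""
    return tp / (tp + fn), tn / (tn + fp), tp / (tp + fp)
```

    For instance, correct("fevr and sweling at injecton site") maps the three misspelled tokens onto their nearest lexicon entries. A context-aware disambiguation step would replace the "take the first candidate" shortcut.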

  3. A UMLS-based spell checker for natural language processing in vaccine safety

    Directory of Open Access Journals (Sweden)

    Liu Fang


    Full Text Available Abstract Background: The Institute of Medicine has identified patient safety as a key goal for health care in the United States. Detecting vaccine adverse events is an important public health activity that contributes to patient safety. Reports about adverse events following immunization (AEFI) from surveillance systems contain free-text components that can be analyzed using natural language processing. To extract Unified Medical Language System (UMLS) concepts from free text and classify AEFI reports based on concepts they contain, we first needed to clean the text by expanding abbreviations and shortcuts and correcting spelling errors. Our objective in this paper was to create a UMLS-based spelling error correction tool as a first step in the natural language processing (NLP) pipeline for AEFI reports. Methods: We developed spell checking algorithms using open source tools. We used de-identified AEFI surveillance reports to create free-text data sets for analysis. After expansion of abbreviated clinical terms and shortcuts, we performed spelling correction in four steps: (1) error detection, (2) word list generation, (3) word list disambiguation and (4) error correction. We then measured the performance of the resulting spell checker by comparing it to manual correction. Results: We used 12,056 words to train the spell checker and tested its performance on 8,131 words. During testing, sensitivity, specificity, and positive predictive value (PPV) for the spell checker were 74% (95% CI: 74–75), 100% (95% CI: 100–100), and 47% (95% CI: 46%–48%), respectively. Conclusion: We created a prototype spell checker that can be used to process AEFI reports. We used the UMLS Specialist Lexicon as the primary source of dictionary terms and the WordNet lexicon as a secondary source. We used the UMLS as a domain-specific source of dictionary terms to compare potentially misspelled words in the corpus. The prototype sensitivity was comparable to currently available

  4. Creation of a simple natural language processing tool to support an imaging utilization quality dashboard. (United States)

    Swartz, Jordan; Koziatek, Christian; Theobald, Jason; Smith, Silas; Iturrate, Eduardo


    Testing for venous thromboembolism (VTE) is associated with cost and risk to patients (e.g. radiation). To assess the appropriateness of imaging utilization at the provider level, it is important to know that provider's diagnostic yield (percentage of tests positive for the diagnostic entity of interest). However, determining diagnostic yield typically requires either time-consuming, manual review of radiology reports or the use of complex and/or proprietary natural language processing software. The objectives of this study were twofold: 1) to develop and implement a simple, user-configurable, and open-source natural language processing tool to classify radiology reports with high accuracy and 2) to use the results of the tool to design a provider-specific VTE imaging dashboard, consisting of both utilization rate and diagnostic yield. Two physicians reviewed a training set of 400 lower extremity ultrasound (UTZ) and computed tomography pulmonary angiogram (CTPA) reports to understand the language used in VTE-positive and VTE-negative reports. The insights from this review informed the arguments to the five modifiable parameters of the NLP tool. A validation set of 2,000 studies was then independently classified by the reviewers and by the tool; the classifications were compared and the performance of the tool was calculated. The tool was highly accurate in classifying the presence and absence of VTE for both the UTZ (sensitivity 95.7%; 95% CI 91.5-99.8, specificity 100%; 95% CI 100-100) and CTPA reports (sensitivity 97.1%; 95% CI 94.3-99.9, specificity 98.6%; 95% CI 97.8-99.4). The diagnostic yield was then calculated at the individual provider level and the imaging dashboard was created. We have created a novel NLP tool designed for users without a background in computer programming, which has been used to classify venous thromboembolism reports with a high degree of accuracy. The tool is open-source and available for download at http
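
    As a rough illustration of the kind of user-configurable classifier and provider-level diagnostic yield described above, the following Python sketch flags a report as VTE-positive when a finding term appears without a preceding negation cue in the same sentence. The abstract does not name the tool's five modifiable parameters, so the term lists and function names here are hypothetical stand-ins, not the published configuration.

```python
import re

# Hypothetical configuration; the real tool's parameters are not listed in the abstract.
POSITIVE_TERMS = ["thrombus", "thrombosis", "embolus", "embolism", "filling defect"]
NEGATION_CUES = ["no", "without", "negative for", "no evidence of"]


def sentence_is_positive(sentence):
    """True if a finding term occurs with no negation cue earlier in the sentence."""
    for term in POSITIVE_TERMS:
        idx = sentence.find(term)
        if idx == -1:
            continue
        preceding = sentence[:idx]
        negated = any(
            re.search(r"\b" + re.escape(cue) + r"\b", preceding)
            for cue in NEGATION_CUES
        )
        if not negated:
            return True
    return False


def classify(report):
    """Classify a whole report as VTE-positive (True) or VTE-negative (False)."""
    sentences = re.split(r"[.\n]+", report.lower())
    return any(sentence_is_positive(s) for s in sentences)


def diagnostic_yield(reports):
    """Percentage of reports classified positive for the entity of interest."""
    if not reports:
        return 0.0
    return 100.0 * sum(classify(r) for r in reports) / len(reports)
```

    Grouping reports by ordering provider and calling diagnostic_yield on each group would give the per-provider figure that the dashboard pairs with utilization rate.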

  5. Polished Stone Value Test and its relationship with petrographic parameters (hardness contrast and modal composition) and surface micro-roughness in natural and artificial aggregates

    Directory of Open Access Journals (Sweden)

    Fernández, A.


    Full Text Available The goal of this work was first to establish the relationships between the PSV values and the microstructural and mineralogical features of the aggregates and surface micro-roughness, and then to establish the behavioural differences between natural and artificial aggregates. The results obtained indicate that the surface micro-roughness and the different PSV values of the natural aggregates are strongly governed by the existence of minerals with different degrees of hardness, together with the proportion of these minerals in the different lithologies. In contrast, the different degree of porosity in artificial aggregates (a furnace slag) was seen to be responsible for its high surface micro-roughness and PSV values. Finally, the PSV and a petrographic parameter (Overall Hardness Contrast, ΔH) were seen to be related by an exponential curve (PSV = 39.726 ΔH^0.057) with an extremely good fit, providing a good tool to estimate PSVs in natural and artificial aggregates from petrographic parameters.
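
    The fitted relationship can be turned into a one-line estimator. The sketch below assumes that the flattened expression in the abstract reads PSV = 39.726 · ΔH^0.057, i.e. that the trailing "0.057" was originally a superscript; the abstract calls the curve exponential, but the expression as printed has this power form. The function name is hypothetical, and the valid range and units of ΔH are not given in the abstract.

```python
def estimate_psv(delta_h, a=39.726, b=0.057):
    """Estimate Polished Stone Value from Overall Hardness Contrast (delta_h).

    Assumes the relationship PSV = a * delta_h ** b, with a and b read from the
    abstract; the exponent placement is an assumption about the garbled formula.
    """
    if delta_h <= 0:
        raise ValueError("delta_h must be positive")
    return a * delta_h ** b
```

    With such a small exponent the estimate changes slowly with ΔH, so the coefficient 39.726 dominates the predicted value.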

  6. Classifying a Person's Degree of Accessibility From Natural Body Language During Social Human-Robot Interactions. (United States)

    McColl, Derek; Jiang, Chuan; Nejat, Goldie


    For social robots to be successfully integrated and accepted within society, they need to be able to interpret human social cues that are displayed through natural modes of communication. In particular, a key challenge in the design of social robots is developing the robot's ability to recognize a person's affective states (emotions, moods, and attitudes) in order to respond appropriately during social human-robot interactions (HRIs). In this paper, we present and discuss social HRI experiments we have conducted to investigate the development of an accessibility-aware social robot able to autonomously determine a person's degree of accessibility (rapport, openness) toward the robot based on the person's natural static body language. In particular, we present two one-on-one HRI experiments to: 1) determine the performance of our automated system in being able to recognize and classify a person's accessibility levels and 2) investigate how people interact with an accessibility-aware robot which determines its own behaviors based on a person's speech and accessibility levels.

  7. Image statistics of American Sign Language: comparison with faces and natural scenes (United States)

    Bosworth, Rain G.; Bartlett, Marian Stewart; Dobkins, Karen R.


    Several lines of evidence suggest that the image statistics of the environment shape visual abilities. To date, the image statistics of natural scenes and faces have been well characterized using Fourier analysis. We employed Fourier analysis to characterize images of signs in American Sign Language (ASL). These images are highly relevant to signers who rely on ASL for communication, and thus the image statistics of ASL might influence signers' visual abilities. Fourier analysis was conducted on 105 static images of signs, and these images were compared with analyses of 100 natural scene images and 100 face images. We obtained two metrics from our Fourier analysis: mean amplitude and entropy of the amplitude across the image set (which is a measure from information theory) as a function of spatial frequency and orientation. The results of our analyses revealed interesting differences in image statistics across the three different image sets, setting up the possibility that ASL experience may alter visual perception in predictable ways. In addition, for all image sets, the mean amplitude results were markedly different from the entropy results, which raises the interesting question of which aspect of an image set (mean amplitude or entropy of the amplitude) is better able to account for known visual abilities.

  8. Subjective Quality of Life of Polish, Polish-Immigrant, and Polish-American Elderly. (United States)

    Berdes, Celia; Zych, Adam A.


    Compares subjective quality of life of elderly Poles living in Poland, and Polish immigrants and Polish-American ethnics living in Chicago as part of a secondary data analysis of a study initially conducted in Poland. Conclusions lend support to the idea that U.S.-born elderly people and elderly immigrants to the United States have a significantly…

  9. Symmetry or asymmetry? Cross-border openness of service providers in Polish-Czech and Polish-German border towns

    Directory of Open Access Journals (Sweden)

    Dołzbłasz Sylwia


    Full Text Available The symmetry and/or asymmetry in terms of cross-border openness of service providers is examined in this article, for the cases of two border twin towns: Cieszyn/Český Těšín at the Polish-Czech border, and Gubin/Guben at the Polish-German border. To assess the level of openness of firms towards clients from the other side of the border, four trans-border categories were examined: neighbour’s language visible at store location; business offers in the language of the neighbour; the possibilities of payment in the neighbour’s currency; and the staff’s knowledge of the language. This enabled a comparison of both parts of the particular twin towns in relation to the character of cross-border openness, as well as an assessment of their symmetry/asymmetry. Comparisons of Gubin/Guben and Cieszyn/Český Těšín with respect to the analysed features were also carried out. The analysis shows significant variation in the level of cross-border openness towards clients from neighbouring countries. Whereas in the Polish-Czech town a relative symmetry was observed, in the Polish-German case, significant asymmetry was noted.

  10. Statistical Learning in a Natural Language by 8-Month-Old Infants (United States)

    Pelucchi, Bruna; Hay, Jessica F.; Saffran, Jenny R.


    Numerous studies over the past decade support the claim that infants are equipped with powerful statistical language learning mechanisms. The primary evidence for statistical language learning in word segmentation comes from studies using artificial languages, continuous streams of synthesized syllables that are highly simplified relative to real…

  11. Natural language processing pipelines to annotate BioC collections with an application to the NCBI disease corpus. (United States)

    Comeau, Donald C; Liu, Haibin; Islamaj Doğan, Rezarta; Wilbur, W John


    BioC is a new format and associated code libraries for sharing text and annotations. We have implemented BioC natural language preprocessing pipelines in two popular programming languages: C++ and Java. The current implementations interface with the well-known MedPost and Stanford natural language processing tool sets. The pipeline functionality includes sentence segmentation, tokenization, part-of-speech tagging, lemmatization and sentence parsing. These pipelines can be easily integrated along with other BioC programs into any BioC compliant text mining systems. As an application, we converted the NCBI disease corpus to BioC format, and the pipelines have successfully run on this corpus to demonstrate their functionality. Code and data can be downloaded from Database URL: © The Author(s) 2014. Published by Oxford University Press.

  12. AIED 2009 Workshops Proceedings Volume 10: Natural Language Processing in Support of Learning: Metrics, Feedback and Connectivity

    NARCIS (Netherlands)

    Dessus, Philippe; Trausan-Matu, Stefan; Van Rosmalen, Peter; Wild, Fridolin


    Dessus, P., Trausan-Matu, S., Van Rosmalen, P., & Wild, F. (Eds.) (2009). AIED 2009 Workshops Proceedings Volume 10 Natural Language Processing in Support of Learning: Metrics, Feedback and Connectivity. In S. D. Craig & D. Dicheva (Eds.), AIED 2009: 14th International Conference in Artificial

  13. Voice-enabled Knowledge Engine using Flood Ontology and Natural Language Processing (United States)

    Sermet, M. Y.; Demir, I.; Krajewski, W. F.


    The Iowa Flood Information System (IFIS) is a web-based platform developed by the Iowa Flood Center (IFC) to provide access to flood inundation maps, real-time flood conditions, flood forecasts, flood-related data, information and interactive visualizations for communities in Iowa. The IFIS is designed for use by the general public, often people with no domain knowledge and a limited general science background. To communicate more effectively with such an audience, we have introduced a voice-enabled knowledge engine on flood-related issues in IFIS. Instead of requiring users to navigate the many features and interfaces of the information system and web-based sources, the system provides dynamic computations based on a collection of built-in data, analysis, and methods. The IFIS Knowledge Engine connects to real-time stream gauges, in-house data sources, and analysis and visualization tools to answer natural language questions. Our goal is to systematize data and modeling results on flood-related issues in Iowa and to provide an interface that gives definitive answers to factual queries. The goal of the knowledge engine is to make all flood-related knowledge in Iowa easily accessible to everyone, and to support voice-enabled natural language input. We aim to integrate and curate all flood-related data, implement analytical and visualization tools, and make it possible to compute answers from questions. The IFIS explicitly implements analytical methods and models as algorithms, and curates all flood-related data and resources so that all these resources are computable. The IFIS Knowledge Engine computes the answer by deriving it from its computational knowledge base. The knowledge engine processes the statement, accesses the data warehouse, runs complex database queries on the server side, and returns outputs in various formats. This presentation provides an overview of the IFIS Knowledge Engine, its unique information interface and functionality as an educational tool, and discusses the future plans

  14. Surmounting the Tower of Babel: Monolingual and bilingual 2-year-olds' understanding of the nature of foreign language words. (United States)

    Byers-Heinlein, Krista; Chen, Ke Heng; Xu, Fei


    Languages function as independent and distinct conventional systems, and so each language uses different words to label the same objects. This study investigated whether 2-year-old children recognize that speakers of their native language and speakers of a foreign language do not share the same knowledge. Two groups of children unfamiliar with Mandarin were tested: monolingual English-learning children (n=24) and bilingual children learning English and another language (n=24). An English speaker taught children the novel label fep. On English mutual exclusivity trials, the speaker asked for the referent of a novel label (wug) in the presence of the fep and a novel object. Both monolingual and bilingual children disambiguated the reference of the novel word using a mutual exclusivity strategy, choosing the novel object rather than the fep. On similar trials with a Mandarin speaker, children were asked to find the referent of a novel Mandarin label kuò. Monolinguals again chose the novel object rather than the object with the English label fep, even though the Mandarin speaker had no access to conventional English words. Bilinguals did not respond systematically to the Mandarin speaker, suggesting that they had enhanced understanding of the Mandarin speaker's ignorance of English words. The results indicate that monolingual children initially expect words to be conventionally shared across all speakers-native and foreign. Early bilingual experience facilitates children's discovery of the nature of foreign language words. Copyright © 2013 Elsevier Inc. All rights reserved.

  15. Water Relationships in the U.S. Southwest: Characterizing Water Management Networks Using Natural Language Processing

    Directory of Open Access Journals (Sweden)

    John T. Murphy


    Full Text Available Natural language processing (NLP) and named entity recognition (NER) techniques are applied to collections of newspaper articles from four cities in the U.S. Southwest. The results are used to generate a network of water management institutions that reflect public perceptions of water management and the structure of water management in these areas. This structure can be highly centralized or fragmented; in the latter case, multiple peer institutions exist that may cooperate or be in conflict. This is reflected in the public discourse of the water consumers in these areas and can, we contend, impact the potential responses of management agencies to challenges of water supply and quality and, in some cases, limit their effectiveness. Flagstaff, AZ, Tucson, AZ, Las Vegas, NV, and the Grand Valley, CO, are examined, including more than 110,000 articles from 2004–2012. Documents are scored by association with water topics, and phrases likely to be institutions are extracted via custom NLP and NER algorithms; those institutions associated with water-related documents are used to form networks via document co-location. The Grand Valley is shown to have a markedly different structure, which we contend reflects the different historical trajectory of its development and its current state, which includes multiple institutions of roughly equal scope and size. These results demonstrate the utility of using NLP and NER methods for understanding the structure and variation of water management systems.
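
    Forming a network "via document co-location" can be sketched as counting, for every pair of institutions, the number of documents that mention both. The Python below is a minimal illustration under that assumption; it takes the NER output as given (one list of institution names per water-related document), and the function names are hypothetical rather than taken from the paper.

```python
from collections import Counter
from itertools import combinations


def build_network(doc_institutions):
    """Weighted edges: number of documents in which two institutions co-occur."""
    edges = Counter()
    for institutions in doc_institutions:
        # De-duplicate within a document and sort so each pair has one canonical key.
        for a, b in combinations(sorted(set(institutions)), 2):
            edges[(a, b)] += 1
    return edges


def weighted_degree(edges):
    """Sum of edge weights per institution; a crude indicator of centrality."""
    degree = Counter()
    for (a, b), weight in edges.items():
        degree[a] += weight
        degree[b] += weight
    return degree
```

    In a centralized system one node would carry most of the weighted degree, whereas a fragmented system of peer institutions would show several nodes with similar totals.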

  16. ReportTutor – An Intelligent Tutoring System that Uses a Natural Language Interface (United States)

    Crowley, Rebecca S.; Tseytlin, Eugene; Jukic, Drazen


    ReportTutor is an extension to our work on Intelligent Tutoring Systems for visual diagnosis. ReportTutor combines a virtual microscope and a natural language interface to allow students to visually inspect a virtual slide as they type a diagnostic report on the case. The system monitors both actions in the virtual microscope interface as well as text created by the student in the reporting interface. It provides feedback about the correctness, completeness, and style of the report. ReportTutor uses MMTx with a custom data-source created with the NCI Metathesaurus. A separate ontology of cancer specific concepts is used to structure the domain knowledge needed for evaluation of the student’s input including co-reference resolution. As part of the early evaluation of the system, we collected data from 4 pathology residents who typed in their reports without the tutoring aspects of the system, and compared responses to an expert dermatopathologist. We analyzed the resulting reports to (1) identify the error rates and distribution among student reports, (2) determine the performance of the system in identifying features within student reports, and (3) measure the accuracy of the system in distinguishing between correct and incorrect report elements. PMID:16779024

  17. Bringing Chatbots into education: Towards Natural Language Negotiation of Open Learner Models (United States)

    Kerly, Alice; Hall, Phil; Bull, Susan

    There is an extensive body of work on Intelligent Tutoring Systems: computer environments for education, teaching and training that adapt to the needs of the individual learner. Work on personalisation and adaptivity has included research into allowing the student user to enhance the system's adaptivity by improving the accuracy of the underlying learner model. Open Learner Modelling, where the system's model of the user's knowledge is revealed to the user, has been proposed to support student reflection on their learning. Increased accuracy of the learner model can be obtained by the student and system jointly negotiating the learner model. We present the initial investigations into a system to allow people to negotiate the model of their understanding of a topic in natural language. This paper discusses the development and capabilities of both conversational agents (or chatbots) and Intelligent Tutoring Systems, in particular Open Learner Modelling. We describe a Wizard-of-Oz experiment to investigate the feasibility of using a chatbot to support negotiation, and conclude that a fusion of the two fields can lead to developing negotiation techniques for chatbots and the enhancement of the Open Learner Model. This technology, if successful, could have widespread application in schools, universities and other training scenarios.

  18. Natural language processing of clinical notes for identification of critical limb ischemia. (United States)

    Afzal, Naveed; Mallipeddi, Vishnu Priya; Sohn, Sunghwan; Liu, Hongfang; Chaudhry, Rajeev; Scott, Christopher G; Kullo, Iftikhar J; Arruda-Olson, Adelaide M


    Critical limb ischemia (CLI) is a complication of advanced peripheral artery disease (PAD) with diagnosis based on the presence of clinical signs and symptoms. However, automated identification of cases from electronic health records (EHRs) is challenging due to absence of a single definitive International Classification of Diseases (ICD-9 or ICD-10) code for CLI. In this study, we extend a previously validated natural language processing (NLP) algorithm for PAD identification to develop and validate a subphenotyping NLP algorithm (CLI-NLP) for identification of CLI cases from clinical notes. We compared performance of the CLI-NLP algorithm with CLI-related ICD-9 billing codes. The gold standard for validation was human abstraction of clinical notes from EHRs. Compared to billing codes the CLI-NLP algorithm had higher positive predictive value (PPV) (CLI-NLP 96%, billing codes 67%, p tools and support a learning healthcare system. Copyright © 2017 The Authors. Published by Elsevier B.V. All rights reserved.

  19. Detecting Target Objects by Natural Language Instructions Using an RGB-D Camera

    Directory of Open Access Journals (Sweden)

    Jiatong Bao


    Full Text Available Controlling robots by natural language (NL) is increasingly attracting attention for its versatility and convenience, and because users need no extensive training. Grounding, which enables robots to understand NL instructions from humans, is a crucial challenge of this problem. This paper mainly explores the object grounding problem and concretely studies how to detect target objects by the NL instructions using an RGB-D camera in robotic manipulation applications. In particular, a simple yet robust vision algorithm is applied to segment objects of interest. With the metric information of all segmented objects, the object attributes and relations between objects are further extracted. The NL instructions that incorporate multiple cues for object specifications are parsed into domain-specific annotations. The annotations from NL and extracted information from the RGB-D camera are matched in a computational state estimation framework to search all possible object grounding states. The final grounding is accomplished by selecting the states which have the maximum probabilities. An RGB-D scene dataset associated with different groups of NL instructions based on different cognition levels of the robot is collected. Quantitative evaluations on the dataset illustrate the advantages of the proposed method. The experiments of NL controlled object manipulation and NL-based task programming using a mobile manipulator show its effectiveness and practicability in robotic applications.

  20. Predicting judicial decisions of the European Court of Human Rights: a Natural Language Processing perspective

    Directory of Open Access Journals (Sweden)

    Nikolaos Aletras


    Full Text Available Recent advances in Natural Language Processing and Machine Learning provide us with the tools to build predictive models that can be used to unveil patterns driving judicial decisions. This can be useful, for both lawyers and judges, as an assisting tool to rapidly identify cases and extract patterns which lead to certain decisions. This paper presents the first systematic study on predicting the outcome of cases tried by the European Court of Human Rights based solely on textual content. We formulate a binary classification task where the input of our classifiers is the textual content extracted from a case and the target output is the actual judgment as to whether there has been a violation of an article of the convention of human rights. Textual information is represented using contiguous word sequences, i.e., N-grams, and topics. Our models can predict the court’s decisions with a strong accuracy (79% on average). Our empirical analysis indicates that the formal facts of a case are the most important predictive factor. This is consistent with the theory of legal realism suggesting that judicial decision-making is significantly affected by the stimulus of the facts. We also observe that the topical content of a case is another important feature in this classification task and explore this relationship further by conducting a qualitative analysis.
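
    The N-gram representation mentioned above (contiguous word sequences) can be illustrated with a short Python sketch that turns a case text into a bag of unigram and bigram counts. Whitespace tokenization and the function names are simplifying assumptions; the abstract does not specify the tokenizer, the maximum N, or the classifier that consumes these features.

```python
from collections import Counter


def ngrams(tokens, n):
    """All contiguous word sequences of length n, in order of appearance."""
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


def ngram_features(text, max_n=2):
    """Bag of 1..max_n-gram counts for one document (lower-cased, whitespace-split)."""
    tokens = text.lower().split()
    features = Counter()
    for n in range(1, max_n + 1):
        features.update(ngrams(tokens, n))
    return features
```

    Feature dictionaries of this kind, one per case, would then be vectorized and passed to a binary classifier whose target is violation versus no violation.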

  1. Is Natural Language a Perigraphic Process? The Theorem about Facts and Words Revisited

    Directory of Open Access Journals (Sweden)

    Łukasz Dębowski


    Full Text Available As we discuss, a stationary stochastic process is nonergodic when a random persistent topic can be detected in the infinite random text sampled from the process, whereas we call the process strongly nonergodic when an infinite sequence of independent random bits, called probabilistic facts, is needed to describe this topic completely. Replacing probabilistic facts with an algorithmically random sequence of bits, called algorithmic facts, we adapt this property back to ergodic processes. Subsequently, we call a process perigraphic if the number of algorithmic facts which can be inferred from a finite text sampled from the process grows like a power of the text length. We present a simple example of such a process. Moreover, we demonstrate an assertion which we call the theorem about facts and words. This proposition states that the number of probabilistic or algorithmic facts which can be inferred from a text drawn from a process must be roughly smaller than the number of distinct word-like strings detected in this text by means of the Prediction by Partial Matching (PPM) compression algorithm. We also observe that the number of the word-like strings for a sample of plays by Shakespeare follows an empirical stepwise power law, in stark contrast to Markov processes. Hence, we suppose that natural language considered as a process is not only non-Markov but also perigraphic.

  2. Rethinking information delivery: using a natural language processing application for point-of-care data discovery. (United States)

    Workman, T Elizabeth; Stoddart, Joan M


    This paper examines the use of Semantic MEDLINE, a natural language processing application enhanced with a statistical algorithm known as Combo, as a potential decision support tool for clinicians. Semantic MEDLINE summarizes text in PubMed citations, transforming it into compact declarations that are filtered according to a user's information need that can be displayed in a graphic interface. Integration of the Combo algorithm enables Semantic MEDLINE to deliver information salient to many diverse needs. The authors selected three disease topics and crafted PubMed search queries to retrieve citations addressing the prevention of these diseases. They then processed the citations with Semantic MEDLINE, with the Combo algorithm enhancement. To evaluate the results, they constructed a reference standard for each disease topic consisting of preventive interventions recommended by a commercial decision support tool. Semantic MEDLINE with Combo produced an average recall of 79% in primary and secondary analyses, an average precision of 45%, and a final average F-score of 0.57. This new approach to point-of-care information delivery holds promise as a decision support tool for clinicians. Health sciences libraries could implement such technologies to deliver tailored information to their users.
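
    As a quick check of how the reported figures relate, the sketch below computes an F-score from precision and recall. It assumes the balanced F1 measure (harmonic mean of precision and recall), which the abstract does not state explicitly, and it applies the formula to the reported averages, whereas the authors may have averaged per-topic scores instead. The function name `f_score` is hypothetical.

```python
def f_score(precision, recall, beta=1.0):
    """F-beta score; beta=1 gives the harmonic mean of precision and recall."""
    if precision == 0 and recall == 0:
        return 0.0  # avoid division by zero when nothing was retrieved correctly
    b2 = beta * beta
    return (1 + b2) * precision * recall / (b2 * precision + recall)


# Average precision (45%) and recall (79%) as reported in the abstract.
print(round(f_score(0.45, 0.79), 2))  # prints 0.57
```

    Under these assumptions the result agrees with the reported final average F-score of 0.57.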

  3. Natural language processing to extract medical problems from electronic clinical documents: performance evaluation. (United States)

    Meystre, Stéphane; Haug, Peter J


    In this study, we evaluate the performance of a Natural Language Processing (NLP) application designed to extract medical problems from narrative text clinical documents. The documents come from a patient's electronic medical record and medical problems are proposed for inclusion in the patient's electronic problem list. This application has been developed to help maintain the problem list and make it more accurate, complete, and up-to-date. The NLP part of this system, analyzed in this study, uses the UMLS MetaMap Transfer (MMTx) application and a negation detection algorithm called NegEx to extract 80 different medical problems selected for their frequency of use in our institution. When using MMTx with its default data set, we measured a recall of 0.74 and a precision of 0.756. A custom data subset for MMTx was created, making it faster and significantly improving the recall to 0.896 with a non-significant reduction in precision.

  4. Automated Radiology Report Summarization Using an Open-Source Natural Language Processing Pipeline. (United States)

    Goff, Daniel J; Loehfelm, Thomas W


    Diagnostic radiologists are expected to review and assimilate findings from prior studies when constructing their overall assessment of the current study. Radiology information systems facilitate this process by presenting the radiologist with a subset of prior studies that are more likely to be relevant to the current study, usually by comparing anatomic coverage of both the current and prior studies. It is incumbent on the radiologist to review the full text report and/or images from those prior studies, a process that is time-consuming and confers substantial risk of overlooking a relevant prior study or finding. This risk is compounded when patients have dozens or even hundreds of prior imaging studies. Our goal is to assess the feasibility of natural language processing techniques to automatically extract asserted and negated disease entities from free-text radiology reports as a step towards automated report summarization. We compared automatically extracted disease mentions to a gold-standard set of manual annotations for 50 radiology reports from CT abdomen and pelvis examinations. The automated report summarization pipeline found perfect or overlapping partial matches for 86% of the manually annotated disease mentions (sensitivity 0.86, precision 0.66, accuracy 0.59, F1 score 0.74). The performance of the automated pipeline was good, and the overall accuracy was similar to the interobserver agreement between the two manual annotators.

  5. Natural Language Processing for EHR-Based Pharmacovigilance: A Structured Review. (United States)

    Luo, Yuan; Thompson, William K; Herr, Timothy M; Zeng, Zexian; Berendsen, Mark A; Jonnalagadda, Siddhartha R; Carson, Matthew B; Starren, Justin


    The goal of pharmacovigilance is to detect, monitor, characterize and prevent adverse drug events (ADEs) with pharmaceutical products. This article is a comprehensive structured review of recent advances in applying natural language processing (NLP) to electronic health record (EHR) narratives for pharmacovigilance. We review methods of varying complexity and problem focus, summarize the current state-of-the-art in methodology advancement, discuss limitations and point out several promising future directions. The ability to accurately capture both semantic and syntactic structures in clinical narratives becomes increasingly critical to enable efficient and accurate ADE detection. Significant progress has been made in algorithm development and resource construction since 2000. Since 2012, statistical analysis and machine learning methods have gained traction in automation of ADE mining from EHR narratives. Current state-of-the-art methods for NLP-based ADE detection from EHRs show promise regarding their integration into production pharmacovigilance systems. In addition, integrating multifaceted, heterogeneous data sources has shown promise in improving ADE detection and has become increasingly adopted. On the other hand, challenges and opportunities remain across the frontier of NLP application to EHR-based pharmacovigilance, including proper characterization of ADE context, differentiation between off- and on-label drug-use ADEs, recognition of the importance of polypharmacy-induced ADEs, better integration of heterogeneous data sources, creation of shared corpora, and organization of shared-task challenges to advance the state-of-the-art.

  6. Qualitative spatial logic descriptors from 3D indoor scenes to generate explanations in natural language. (United States)

    Falomir, Zoe; Kluth, Thomas


    The challenge of describing 3D real scenes is tackled in this paper using qualitative spatial descriptors. A key point to study is which qualitative descriptors to use and how these qualitative descriptors must be organized to produce a suitable cognitive explanation. In order to find answers, a survey test was carried out with human participants who openly described a scene containing some pieces of furniture. The data obtained in this survey are analysed, and taking this into account, the QSn3D computational approach was developed, which uses an Xbox 360 Kinect to obtain 3D data from a real indoor scene. Object features are computed on these 3D data to identify objects in indoor scenes. The object orientation is computed, and qualitative spatial relations between the objects are extracted. These qualitative spatial relations are the input to a grammar which applies saliency rules obtained from the survey study and generates cognitive natural language descriptions of scenes. Moreover, these qualitative descriptors can be expressed as first-order logical facts in Prolog for further reasoning. Finally, a validation study is carried out to test whether the descriptions provided by the QSn3D approach are human readable. The obtained results show that their acceptability is higher than 82%.


    A. E. Pismak


    Full Text Available Subject of Research. The paper focuses on the structural organization of Wiktionary articles with a view to using them as the base for a semantic network. Wiktionary community references, article templates and article markup features are analyzed. The problem of numerical estimation of semantic similarity between structural elements in Wiktionary articles is considered. Existing software for estimating the semantic similarity of such elements is analyzed; its algorithms are studied; its advantages and disadvantages are shown. Methods. Mathematical statistics methods were used to analyze Wiktionary article markup features. A method of computing semantic similarity based on statistical data for the compared structural elements was proposed. Main Results. We have concluded that Wiktionary articles cannot be used directly as the source for a semantic network. We have proposed to find hidden similarity between article elements, and for that purpose we have developed an algorithm that calculates confidence coefficients indicating that each pair of sentences is semantically near. The study of the quantitative and qualitative characteristics of the developed algorithm has shown a major performance advantage over the other existing solutions, at the cost of an insignificantly higher error rate. Practical Relevance. The resulting algorithm may be useful in developing tools for automatic parsing of Wiktionary articles. The developed method could be used to compute semantic similarity for short text fragments in natural language in cases where performance requirements are stricter than accuracy requirements.

  8. Natural language processing using online analytic processing for assessing recommendations in radiology reports. (United States)

    Dang, Pragya A; Kalra, Mannudeep K; Blake, Michael A; Schultz, Thomas J; Stout, Markus; Lemay, Paul R; Freshman, David J; Halpern, Elkan F; Dreyer, Keith J


    The study purpose was to describe the use of natural language processing (NLP) and online analytic processing (OLAP) for assessing patterns in recommendations in unstructured radiology reports on the basis of patient and imaging characteristics, such as age, gender, referring physicians, radiology subspecialty, modality, indications, diseases, and patient status (inpatient vs outpatient). A database of 4,279,179 radiology reports from a single tertiary health care center during a 10-year period (1995-2004) was created. The database includes reports of computed tomography, magnetic resonance imaging, fluoroscopy, nuclear medicine, ultrasound, radiography, mammography, angiography, special procedures, and unclassified imaging tests with patient demographics. A clinical data mining and analysis NLP program (Leximer, Nuance Inc, Burlington, Massachusetts) in conjunction with OLAP was used for classifying reports into those with recommendations (I(REC)) and without recommendations (N(REC)) for imaging and determining I(REC) rates for different patient age groups, gender, imaging modalities, indications, diseases, subspecialties, and referring physicians. In addition, temporal trends for I(REC) were also determined. There was a significant difference in the I(REC) rates in different age groups, varying between 4.8% (10-19 years) and 9.5% (>70 years) (P OLAP revealed considerable differences between recommendation trends for different imaging modalities and other patient and imaging characteristics.


    La Yani


    Full Text Available This paper aims at investigating the meaning of “to bring” in Ciacia language based on the Natural Semantic Metalanguage (NSM) theory, an approach to investigating form, structure, and meaning as a whole with the principle of “one form for one meaning and one meaning for one form”. The data were collected through interview and note taking techniques. The result of this study shows that the meaning of “to bring” in Ciacia can be expressed by a number of lexicons. Each form has a certain or distinctive meaning. First is suu, meaning ‘to bring something by putting on head’; second are tongku and lemba, meaning ‘to bring something by putting on shoulder’; third is temba, meaning ‘to bring something by putting on chest’; fourth are solo and rongo, meaning ‘to bring something by putting on the back’; fifth are bimbi and sele, meaning ‘to bring something by putting on waist’; and lastly ntai, kopo, and tape, meaning ‘to bring something by putting on finger’.

  10. Tooth polishing: The current status

    Directory of Open Access Journals (Sweden)

    Madhuri Alankar Sawai


    Full Text Available Healthy teeth and gums make a person feel confident and fit. As people go about their daily routines, and with different eating and drinking habits, the tooth enamel turns yellowish or gets stained. Polishing has traditionally been associated with the prophylaxis procedure in most dental practices, which patients know and expect. However, overzealous use of the polishing procedure wears away the superficial tooth structure. This leads to more accumulation of local deposits. Also, it takes a long time for the fluoride-rich layer of the tooth to form again. Hence, polishing is nowadays not advised as a part of the routine oral prophylaxis procedure but is done selectively based on the patient's needs. This article gives an insight into the different aspects of the polishing process along with the different methods and agents used for it.

  11. Lysenko affair and Polish botany. (United States)

    Köhler, Piotr


    This article describes the slight impact of Lysenkoism upon Polish botany. I begin with an account of the development of plant genetics in Poland, as well as the attitude of scientists and the Polish intelligentsia toward Marxist philosophy prior to World War II. Next I provide a short history of the introduction and demise of Lysenkoism in Polish science, with a focus on events in botany, in the context of key events in Polish science from 1939 to 1958. The article outlines the limited effects of Lysenkoism upon botanists and their research, as well as how botanists for the most part rejected what was often termed the "new biology." My paper shows that though Lysenko's theories received political support, and were actively promoted by a small circle of scientists and Communist party activists, they were never accepted by most botanists. Once the political climate in Poland altered after the events of 1956, Lysenko's theories were immediately abandoned.

  12. Toward a Theory-Based Natural Language Capability in Robots and Other Embodied Agents: Evaluating Hausser's SLIM Theory and Database Semantics (United States)

    Burk, Robin K.


    Computational natural language understanding and generation have been a goal of artificial intelligence since McCarthy, Minsky, Rochester and Shannon first proposed to spend the summer of 1956 studying this and related problems. Although statistical approaches dominate current natural language applications, two current research trends bring…


    Yu. S. Hetsevich


  14. Production of rare earth polishing powders in Russia

    International Nuclear Information System (INIS)

    Kosynkin, V.D.; Ivanov, E.N.; Kotrekhov, V.A.; Shtutza, M.G.; Grabko, A.I.


    in a suspension; polishing powder Ftoropol with addition of fluorine and higher contents of cerium dioxide (at least 70% by mass) that has a higher polishing ability and is attrition-proof, used for high-speed treatment of optical lenses, mirrors, TV screens and eyeglasses. The rare earth polishing powders made in Russia possess the following physico-chemical properties and performance characteristics: cerium dioxide content in solid REE solution - 50-90% by mass; F-ion content (in Ftoropol powder) - 8-14% by mass; non-REE content of sodium, calcium, strontium and iron impurities - at most 0.1% by mass of each element; natural radionuclide content of thorium, uranium, actinium, potassium-40 series, total standard specific activity - 0.45-0.85 Bq/g; average particle size - 2.0-3.5 μm; density - 6.3-6.8 g/cm³; pH of aqueous extract - 6-7; sedimentary stability - 10-20 minutes; polishing ability - 45-60 mg per 31 minutes (for polishing resin); abrasive inclusions - none. The report gives an analysis of the Russian powders compared against the best world analogues such as Cerox (Rhone Poulenc Company, France), Regipol (London and Scandinavian Division Chemical Company, England), etc. The analysis results imply that the chief characteristics (granulometric composition, polishing ability and service life) of the Russian samples are not inferior to the best foreign analogues, and in some properties (radionuclide content, sedimentary stability and scratching inclusions quantity) even surpass them.

  15. Elastic emission polishing

    Energy Technology Data Exchange (ETDEWEB)

    Loewenthal, M.; Loseke, K.; Dow, T.A.; Scattergood, R.O.


    Elastic emission polishing, also called elastic emission machining (EEM), is a process where a stream of abrasive slurry is used to remove material from a substrate and produce damage free surfaces with controlled surface form. It is a noncontacting method utilizing a thick elasto-hydrodynamic film formed between a soft rotating ball and the workpiece to control the flow of the abrasive. An apparatus was built in the Center, which consists of a stationary spindle, a two-axis table for the workpiece, and a pump to circulate the working fluid. The process is controlled by a programmable computer numerical controller (CNC), which presently can operate the spindle speed and movement of the workpiece in one axis only. This apparatus has been used to determine material removal rates on different material samples as a function of time, utilizing zirconium oxide (ZrO₂) particles suspended in distilled water as the working fluid. By continuing a study of removal rates the process should become predictable, and thus create a new, effective, yet simple tool for ultra-precision mechanical machining of surfaces.

  16. Language Background and Learners' Attitudes to Own-Language Use (United States)

    Scheffler, Pawel; Horverak, May Olaug; Krzebietke, Weronika; Askland, Sigrunn


    Learners' language background is one of the factors which may influence the amount and functions of own-language use in English instruction. This article reports a study in which a group of almost 400 Polish and Norwegian secondary school learners of English were asked how their own languages are used in the classroom, how they use them when they…

  17. Medical subdomain classification of clinical notes using a machine learning-based natural language processing approach. (United States)

    Weng, Wei-Hung; Wagholikar, Kavishwar B; McCray, Alexa T; Szolovits, Peter; Chueh, Henry C


    The medical subdomain of a clinical note, such as cardiology or neurology, is useful content-derived metadata for developing machine learning downstream applications. To classify the medical subdomain of a note accurately, we have constructed a machine learning-based natural language processing (NLP) pipeline and developed medical subdomain classifiers based on the content of the note. We constructed the pipeline using the clinical NLP system, clinical Text Analysis and Knowledge Extraction System (cTAKES), the Unified Medical Language System (UMLS) Metathesaurus, Semantic Network, and learning algorithms to extract features from two datasets: clinical notes from the Integrating Data for Analysis, Anonymization, and Sharing (iDASH) data repository (n = 431) and from Massachusetts General Hospital (MGH) (n = 91,237). We then built medical subdomain classifiers with different combinations of data representation methods and supervised learning algorithms. We evaluated the performance of classifiers and their portability across the two datasets. The medical subdomain classifier trained with a convolutional recurrent neural network and neural word embeddings yielded the best performance on the iDASH and MGH datasets, with areas under the receiver operating characteristic curve (AUC) of 0.975 and 0.991, and F1 scores of 0.845 and 0.870, respectively. Considering better clinical interpretability, the linear support vector machine-trained medical subdomain classifier using hybrid bag-of-words and clinically relevant UMLS concepts as the feature representation, with term frequency-inverse document frequency (tf-idf) weighting, outperformed other shallow learning classifiers on the iDASH and MGH datasets with AUC of 0.957 and 0.964, and F1 scores of 0.932 and 0.934, respectively. We trained classifiers on one dataset and applied them to the other; classifiers for half of the medical subdomains we studied reached the F1 score threshold of 0.7. Our study shows that a supervised

  18. Word Sense Disambiguation Based on Large Scale Polish CLARIN Heterogeneous Lexical Resources

    Directory of Open Access Journals (Sweden)

    Paweł Kędzia


    Full Text Available Lexical resources can be applied in many different Natural Language Engineering tasks, but the most fundamental task is the recognition of word senses used in text contexts. The problem is difficult and not yet fully solved, and different lexical resources provide varied support for it. Polish CLARIN lexical semantic resources are based on plWordNet, a very large wordnet for Polish, as a central structure which is a basis for linking together several resources of different types. In this paper, several Word Sense Disambiguation (henceforth WSD) methods developed for Polish that utilise plWordNet are discussed. Textual sense descriptions in the traditional lexicon can be compared with text contexts using Lesk’s algorithm in order to find the best matching senses. In the case of a wordnet, lexico-semantic relations provide the main description of word senses. Thus, first, we adapted and applied to Polish a WSD method based on PageRank. According to it, text words are mapped onto their senses in the plWordNet graph and the PageRank algorithm is run to find senses with the highest scores. The method presents results lower than but comparable to those reported for English. The error analysis showed that the main problems are fine-grained sense distinctions in plWordNet and a limited number of connections between words of different parts of speech. In the second approach, plWordNet expanded with the mapping onto the SUMO ontology concepts was used. Two scenarios for WSD were investigated: two-step disambiguation and disambiguation based on combined networks of plWordNet and SUMO. In the former scenario, words are first assigned SUMO concepts and next plWordNet senses are disambiguated. In the latter, plWordNet and SUMO are combined in one large network used next for the disambiguation of senses. The additional knowledge sources used in WSD improved the performance

  19. Polish Listening SPAN: A New Tool for Measuring Verbal Working Memory (United States)

    Zychowicz, Katarzyna; Biedron, Adriana; Pawlak, Miroslaw


    Individual differences in second language acquisition (SLA) encompass differences in working memory capacity, which is believed to be one of the most crucial factors influencing language learning. However, in Poland research on the role of working memory in SLA is scarce due to a lack of proper Polish instruments for measuring this construct. The…

  20. The application of computerized content analysis of natural language in psychotherapy research now and in the future. (United States)

    Gottschalk, L A


    For many years the author and his colleagues have been involved in studying the roots and processes of the conveyance of semantic messages via spoken language and verbal texts. After establishing that reliable and valid measurements of highly relevant neuropsychiatric categories, such as anxiety, depression, and cognitive impairment, can be made by identifying and counting the occurrence per grammatical clause of language content and form categories typifying specific content-analysis scales, the research focus has turned towards computerizing this process of content analysis. This report summarizes the achievements and applications of the current empirical status of this method of computerized content analysis of natural language to psychotherapy research, and it speculates on possible future applications in the millennium.

  1. The written language of signals as a means of natural literacy of deaf children

    Directory of Open Access Journals (Sweden)

    Giovana Fracari Hautrive


    Full Text Available Addressing the literacy of deaf children today means looking at teaching practices that extend beyond the school. Questions arising in daily practice became a challenge that required an investigative attitude. The article aims to problematize the process of literacy of deaf children, and the proposed reflection emerges from daily practice. It is structured around threads drawn from theoretical studies by Vigotskii (1989, 1994, 1996, 1998), Stumpf (2005), Quadros (1997), Bolzan (1998, 2002) and Skliar (1997a, 1997b, 1998), on the basis of which it problematizes the processes involved in the construction of written language. The result is the importance of establishing sign language as the first language in the education of the deaf and of learning sign language writing. Being literate in their mother tongue is observed to be important for deaf students. The article points out the need to redirect the literacy of deaf children so that important aspects of language, its role in the structuring of thought and its communicative aspect are respected and considered in this process. Thus, it emphasizes the learning of sign language writing as fundamental: it should occupy a central role in classroom teaching proposals, encouraging the contradictions that put the student in a situation of cognitive conflict, while respecting the diversity inherent to each human being. The production of sign language writing is considered an appropriate tool for deaf students to record their visual language.

  2. Using natural language processing and machine learning to identify gout flares from electronic clinical notes. (United States)

    Zheng, Chengyi; Rashid, Nazia; Wu, Yi-Lin; Koblick, River; Lin, Antony T; Levy, Gerald D; Cheetham, T Craig


    Gout flares are not well documented by diagnosis codes, making it difficult to conduct accurate database studies. We implemented a computer-based method to automatically identify gout flares using natural language processing (NLP) and machine learning (ML) from electronic clinical notes. Of 16,519 patients, 1,264 and 1,192 clinical notes from 2 separate sets of 100 patients were selected as the training and evaluation data sets, respectively, which were reviewed by rheumatologists. We created separate NLP searches to capture different aspects of gout flares. For each note, the NLP search outputs became the ML system inputs, which provided the final classification decisions. The note-level classifications were grouped into patient-level gout flares. Our NLP+ML results were validated using a gold standard data set and compared with the claims-based method used in prior literature. For 16,519 patients with a diagnosis of gout and a prescription for a urate-lowering therapy, we identified 18,869 clinical notes as gout flare positive (sensitivity 82.1%, specificity 91.5%): 1,402 patients with ≥3 flares (sensitivity 93.5%, specificity 84.6%), 5,954 with 1 or 2 flares, and 9,163 with no flare (sensitivity 98.5%, specificity 96.4%). Our method identified more flare cases (18,869 versus 7,861) and patients with ≥3 flares (1,402 versus 516) when compared to the claims-based method. We developed a computer-based method (NLP and ML) to identify gout flares from the clinical notes. Our method was validated as an accurate tool for identifying gout flares with higher sensitivity and specificity compared to previous studies. Copyright © 2014 by the American College of Rheumatology.

  3. Population-Based Analysis of Histologically Confirmed Melanocytic Proliferations Using Natural Language Processing. (United States)

    Lott, Jason P; Boudreau, Denise M; Barnhill, Ray L; Weinstock, Martin A; Knopp, Eleanor; Piepkorn, Michael W; Elder, David E; Knezevich, Steven R; Baer, Andrew; Tosteson, Anna N A; Elmore, Joann G


    Population-based information on the distribution of histologic diagnoses associated with skin biopsies is unknown. Electronic medical records (EMRs) enable automated extraction of pathology report data to improve our epidemiologic understanding of skin biopsy outcomes, specifically those of melanocytic origin. To determine population-based frequencies and distribution of histologically confirmed melanocytic lesions. A natural language processing (NLP)-based analysis of EMR pathology reports of adult patients who underwent skin biopsies at a large integrated health care delivery system in the US Pacific Northwest from January 1, 2007, through December 31, 2012. Skin biopsy procedure. The primary outcome was histopathologic diagnosis, obtained using an NLP-based system to process EMR pathology reports. We determined the percentage of diagnoses classified as melanocytic vs nonmelanocytic lesions. Diagnoses classified as melanocytic were further subclassified using the Melanocytic Pathology Assessment Tool and Hierarchy for Diagnosis (MPATH-Dx) reporting schema into the following categories: class I (nevi and other benign proliferations such as mildly dysplastic lesions typically requiring no further treatment), class II (moderately dysplastic and other low-risk lesions that may merit narrow reexcision with skin biopsies, performed on 47 529 patients, were examined. Nearly 1 in 4 skin biopsies were of melanocytic lesions (23%; n = 18 715), which were distributed according to MPATH-Dx categories as follows: class I, 83.1% (n = 15 558); class II, 8.3% (n = 1548); class III, 4.5% (n = 842); class IV, 2.2% (n = 405); and class V, 1.9% (n = 362). Approximately one-quarter of skin biopsies resulted in diagnoses of melanocytic proliferations. These data provide the first population-based estimates across the spectrum of melanocytic lesions ranging from benign through dysplastic to malignant. These results may serve as a foundation for future

  4. Validation of natural language processing to extract breast cancer pathology procedures and results

    Directory of Open Access Journals (Sweden)

    Arika E Wieneke


    Full Text Available Background: Pathology reports typically require manual review to abstract research data. We developed a natural language processing (NLP) system to automatically interpret free-text breast pathology reports with limited assistance from manual abstraction. Methods: We used an iterative approach of machine learning algorithms and constructed groups of related findings to identify breast-related procedures and results from free-text pathology reports. We evaluated the NLP system using an all-or-nothing approach to determine which reports could be processed entirely using NLP and which reports needed manual review beyond NLP. We divided 3234 reports for development (2910, 90%) and evaluation (324, 10%) purposes, using manually reviewed pathology data as our gold standard. Results: NLP correctly coded 12.7% of the evaluation set, flagged 49.1% of reports for manual review, incorrectly coded 30.8%, and correctly omitted 7.4% from the evaluation set due to irrelevancy (i.e. not breast-related). Common procedures and results were identified correctly (e.g. invasive ductal) with 95.5% precision and 94.0% sensitivity, but entire reports were flagged for manual review because of rare findings and substantial variation in pathology report text. Conclusions: The NLP system we developed did not perform sufficiently for abstracting entire breast pathology reports. The all-or-nothing approach resulted in too broad a scope of work and limited our flexibility to identify breast pathology procedures and results. Our NLP system was also limited by the lack of gold standard data on rare findings and by wide variation in pathology text. Focusing on individual, common elements and improving pathology text report standardization may improve performance.

  5. Natural Language Processing for Cohort Discovery in a Discharge Prediction Model for the Neonatal ICU. (United States)

    Temple, Michael W; Lehmann, Christoph U; Fabbri, Daniel


    Discharging patients from the Neonatal Intensive Care Unit (NICU) can be delayed for non-medical reasons including the procurement of home medical equipment, parental education, and the need for children's services. We previously created a model to identify patients that will be medically ready for discharge in the subsequent 2-10 days. In this study we use Natural Language Processing to improve upon that model and discern why the model performed poorly on certain patients. We retrospectively examined the text of the Assessment and Plan section from daily progress notes of 4,693 patients (103,206 patient-days) from the NICU of a large, academic children's hospital. A matrix was constructed using words from NICU notes (single words and bigrams) to train a supervised machine learning algorithm to determine the most important words differentiating poorly performing patients compared to well performing patients in our original discharge prediction model. NLP using a bag of words (BOW) analysis revealed several cohorts that performed poorly in our original model. These included patients with surgical diagnoses, pulmonary hypertension, retinopathy of prematurity, and psychosocial issues. The BOW approach aided in cohort discovery and will allow further refinement of our original discharge model prediction. Adequately identifying patients discharged home on g-tube feeds alone could improve the AUC of our original model by 0.02. Additionally, this approach identified social issues as a major cause for delayed discharge. A BOW analysis provides a method to improve and refine our NICU discharge prediction model and could potentially avoid over 900 (0.9%) hospital days.
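    The word-matrix step described in this abstract can be sketched as follows. This is a minimal illustration, not the authors' pipeline: the abstract does not name the supervised learner, so a smoothed log-odds ranking stands in for it, and the notes, the function names (`ngrams`, `term_document_counts`, `rank_terms`) and the add-one smoothing are all assumptions made for the example.

```python
from collections import Counter
import math


def ngrams(text):
    """Return the unigrams and bigrams of a lower-cased, whitespace-tokenised note."""
    tokens = text.lower().split()
    bigrams = [" ".join(pair) for pair in zip(tokens, tokens[1:])]
    return tokens + bigrams


def term_document_counts(notes):
    """Count, for each term, the number of notes that contain it."""
    counts = Counter()
    for note in notes:
        counts.update(set(ngrams(note)))
    return counts


def rank_terms(poor_notes, well_notes, top_k=5):
    """Rank terms by smoothed log-odds of appearing in poorly vs well predicted notes."""
    poor = term_document_counts(poor_notes)
    well = term_document_counts(well_notes)
    scores = {}
    for term in set(poor) | set(well):
        # Add-one smoothing keeps the odds finite for terms seen in only one group.
        p = (poor[term] + 1) / (len(poor_notes) + 2)
        w = (well[term] + 1) / (len(well_notes) + 2)
        scores[term] = math.log(p / (1 - p)) - math.log(w / (1 - w))
    # Highest score first; ties are broken alphabetically for a stable order.
    return sorted(scores, key=lambda t: (-scores[t], t))[:top_k]


# Invented toy notes (hypothetical), not data from the study.
poor_notes = ["g-tube feeds continue", "plan g-tube placement", "social work consult"]
well_notes = ["tolerating oral feeds", "oral feeds advancing", "room air stable"]
top_terms = rank_terms(poor_notes, well_notes)
print(top_terms)
```

    Terms that rank highest are those concentrated in the notes of patients the original model handled poorly, which is the kind of signal the abstract reports using for cohort discovery.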

  6. Using natural language processing to provide personalized learning opportunities from trainee clinical notes. (United States)

    Denny, Joshua C; Spickard, Anderson; Speltz, Peter J; Porier, Renee; Rosenstiel, Donna E; Powers, James S


    Assessment of medical trainee learning through pre-defined competencies is now commonplace in schools of medicine. We describe a novel electronic advisor system using natural language processing (NLP) to identify two geriatric medicine competencies from medical student clinical notes in the electronic medical record: advance directives (AD) and altered mental status (AMS). Clinical notes from third year medical students were processed using a general-purpose NLP system to identify biomedical concepts and their section context. The system analyzed these notes for relevance to AD or AMS and generated custom email alerts to students with embedded supplemental learning material customized to their notes. Recall and precision of the two advisors were evaluated by physician review. Students were given pre and post multiple choice question tests broadly covering geriatrics. Of 102 students approached, 66 students consented and enrolled. The system sent 393 email alerts to 54 students (82%), including 270 for AD and 123 for AMS. Precision was 100% for AD and 93% for AMS. Recall was 69% for AD and 100% for AMS. Students mentioned ADs for 43 patients, with all mentions occurring after first having received an AD reminder. Students accessed educational links 34 times from the 393 email alerts. There was no difference in pre (mean 62%) and post (mean 60%) test scores. The system effectively identified two educational opportunities using NLP applied to clinical notes and demonstrated a small change in student behavior. Use of electronic advisors such as these may provide a scalable model to assess specific competency elements and deliver educational opportunities. Copyright © 2015 Elsevier Inc. All rights reserved.

  7. Comparison of natural language processing biosurveillance methods for identifying influenza from encounter notes. (United States)

    Elkin, Peter L; Froehling, David A; Wahner-Roedler, Dietlind L; Brown, Steven H; Bailey, Kent R


    An effective national biosurveillance system expedites outbreak recognition and facilitates response coordination at the federal, state, and local levels. The BioSense system, used at the Centers for Disease Control and Prevention, incorporates chief complaints but not data from the whole encounter note into its surveillance algorithms. To evaluate whether biosurveillance by using data from the whole encounter note is superior to that using data from the chief complaint field alone. 6-year retrospective case-control cohort study. Mayo Clinic, Rochester, Minnesota. 17,243 persons tested for influenza A or B virus between 1 January 2000 and 31 December 2006. The accuracy of a model based on signs and symptoms to predict influenza virus infection in patients with upper respiratory tract symptoms, and the ability of a natural language processing technique to identify definitional clinical features from free-text encounter notes. Surveillance based on the whole encounter note was superior to the chief complaint field alone. For the case definition used by surveillance of the whole encounter note, the normalized partial area under the receiver-operating characteristic curve (specificity, 0.1 to 0.4) for surveillance using the whole encounter note was 92.9% versus 70.3% for surveillance with the chief complaint field (difference, 22.6%; P ...) ... biosurveillance monitoring was not studied. A biosurveillance model for influenza using the whole encounter note is more accurate than a model that uses only the chief complaint field. Because case-defining signs and symptoms of influenza are commonly available in health records, the investigators believe that the national strategy for biosurveillance should be changed to incorporate data from the whole health record. Centers for Disease Control and Prevention.

  8. A natural language processing program effectively extracts key pathologic findings from radical prostatectomy reports. (United States)

    Kim, Brian J; Merchant, Madhur; Zheng, Chengyi; Thomas, Anil A; Contreras, Richard; Jacobsen, Steven J; Chien, Gary W


    Natural language processing (NLP) software programs have been widely developed to transform complex free text into simplified organized data. Potential applications in the field of medicine include automated report summaries, physician alerts, patient repositories, electronic medical record (EMR) billing, and quality metric reports. Despite these prospects and the recent widespread adoption of EMR, NLP has been relatively underutilized. The objective of this study was to evaluate the performance of an internally developed NLP program in extracting select pathologic findings from radical prostatectomy specimen reports in the EMR. An NLP program was generated by a software engineer to extract key variables from prostatectomy reports in the EMR within our healthcare system, which included the TNM stage, Gleason grade, presence of a tertiary Gleason pattern, histologic subtype, size of dominant tumor nodule, seminal vesicle invasion (SVI), perineural invasion (PNI), angiolymphatic invasion (ALI), extracapsular extension (ECE), and surgical margin status (SMS). The program was validated by comparing NLP results to a gold standard compiled by two blinded manual reviewers for 100 random pathology reports. NLP demonstrated 100% accuracy for identifying the Gleason grade, presence of a tertiary Gleason pattern, SVI, ALI, and ECE. It also demonstrated near-perfect accuracy for extracting histologic subtype (99.0%), PNI (98.9%), TNM stage (98.0%), SMS (97.0%), and dominant tumor size (95.7%). The overall accuracy of NLP was 98.7%. NLP generated a result in ... report. This novel program demonstrated high accuracy and efficiency identifying key pathologic details from the prostatectomy report within an EMR system. NLP has the potential to assist urologists by summarizing and highlighting relevant information from verbose pathology reports. It may also facilitate future urologic research through the rapid and automated creation of large databases.

  9. Measuring information acquisition from sensory input using automated scoring of natural-language descriptions.

    Directory of Open Access Journals (Sweden)

    Daniel R Saunders

    Full Text Available Information acquisition, the gathering and interpretation of sensory information, is a basic function of mobile organisms. We describe a new method for measuring this ability in humans, using free-recall responses to sensory stimuli which are scored objectively using a "wisdom of crowds" approach. As an example, we demonstrate this metric using perception of video stimuli. Immediately after viewing a 30 s video clip, subjects responded to a prompt to give a short description of the clip in natural language. These responses were scored automatically by comparison to a dataset of responses to the same clip by normally-sighted viewers (the crowd). In this case, the normative dataset consisted of responses to 200 clips by 60 subjects who were stratified by age (range 22 to 85 y) and viewed the clips in the lab, for 2,400 responses, and by 99 crowdsourced participants (age range 20 to 66 y) who viewed clips in their Web browser, for 4,000 responses. We compared different algorithms for computing these similarities and found that a simple count of the words in common had the best performance. It correctly matched 75% of the lab-sourced and 95% of crowdsourced responses to their corresponding clips. We validated the measure by showing that when the amount of information in the clip was degraded using defocus lenses, the shared word score decreased across the five predetermined visual-acuity levels, demonstrating a dose-response effect (N = 15). This approach, of scoring open-ended immediate free recall of the stimulus, is applicable not only to video, but also to other situations where a measure of the information that is successfully acquired is desirable. Information acquired will be affected by stimulus quality, sensory ability, and cognitive processes, so our metric can be used to assess each of these components when the others are controlled.
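    The "simple count of the words in common" can be illustrated with a short sketch. The abstract does not give the exact formula, so the details here are assumptions: words are lower-cased and split on whitespace, each distinct word counts once, and a response's score for a clip is the mean overlap with that clip's normative responses. The clip names and sentences are invented.

```python
def words(text):
    """Return the set of distinct lower-cased words in a response."""
    return set(text.lower().split())


def shared_word_score(response, normative_responses):
    """Mean number of distinct words the response shares with each normative response."""
    target = words(response)
    overlaps = [len(target & words(other)) for other in normative_responses]
    return sum(overlaps) / len(overlaps)


def best_matching_clip(response, normative):
    """Return the clip whose normative responses share the most words with the response."""
    return max(normative, key=lambda clip: shared_word_score(response, normative[clip]))


# Hypothetical normative dataset: clip id -> descriptions given by "the crowd".
normative = {
    "clip_a": ["a man rides a horse across a field", "man riding horse in a field"],
    "clip_b": ["two women talk in a kitchen", "women cooking and talking in kitchen"],
}
response = "a man on a horse in a green field"
scores = {clip: shared_word_score(response, refs) for clip, refs in normative.items()}
matched = best_matching_clip(response, normative)
print(scores, matched)
```

    Matching a response to the clip with the highest score mirrors the validation reported above, where responses were matched back to their corresponding clips.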

  10. Mining peripheral arterial disease cases from narrative clinical notes using natural language processing. (United States)

    Afzal, Naveed; Sohn, Sunghwan; Abram, Sara; Scott, Christopher G; Chaudhry, Rajeev; Liu, Hongfang; Kullo, Iftikhar J; Arruda-Olson, Adelaide M


    Lower extremity peripheral arterial disease (PAD) is highly prevalent and affects millions of individuals worldwide. We developed a natural language processing (NLP) system for automated ascertainment of PAD cases from clinical narrative notes and compared the performance of the NLP algorithm with billing code algorithms, using ankle-brachial index test results as the gold standard. We compared the performance of the NLP algorithm to (1) results of gold standard ankle-brachial index; (2) previously validated algorithms based on relevant International Classification of Diseases, Ninth Revision diagnostic codes (simple model); and (3) a combination of International Classification of Diseases, Ninth Revision codes with procedural codes (full model). A dataset of 1569 patients with PAD and controls was randomly divided into training (n = 935) and testing (n = 634) subsets. We iteratively refined the NLP algorithm in the training set, including narrative note sections, note types, and service types, to maximize its accuracy. In the testing dataset, when compared with both simple and full models, the NLP algorithm had better accuracy (NLP, 91.8%; full model, 81.8%; simple model, 83%; P ...), ... (NLP, 92.9%; full model, 74.3%; simple model, 79.9%; P ...), and ... (NLP, 92.5%; full model, 64.2%; simple model, 75.9%; P ...). ... NLP algorithm for automatic ascertainment of PAD cases from clinical notes had greater accuracy than billing code algorithms. Our findings highlight the potential of NLP tools for rapid and efficient ascertainment of PAD cases from electronic health records to facilitate clinical investigation and eventually improve care by clinical decision support. Copyright © 2016 The Authors. Published by Elsevier Inc. All rights reserved.

  11. Using Natural Language Processing to Extract Abnormal Results From Cancer Screening Reports. (United States)

    Moore, Carlton R; Farrag, Ashraf; Ashkin, Evan


    Numerous studies show that follow-up of abnormal cancer screening results, such as mammography and Papanicolaou (Pap) smears, is frequently not performed in a timely manner. A contributing factor is that abnormal results may go unrecognized because they are buried in free-text documents in electronic medical records (EMRs), and, as a result, patients are lost to follow-up. By identifying abnormal results from free-text reports in EMRs and generating alerts to clinicians, natural language processing (NLP) technology has the potential for improving patient care. The goal of the current study was to evaluate the performance of NLP software for extracting abnormal results from free-text mammography and Pap smear reports stored in an EMR. A sample of 421 and 500 free-text mammography and Pap reports, respectively, were manually reviewed by a physician, and the results were categorized for each report. We tested the performance of NLP to extract results from the reports. The 2 assessments (criterion standard versus NLP) were compared to determine the precision, recall, and accuracy of NLP. When NLP was compared with manual review for mammography reports, the results were as follows: precision, 98% (96%-99%); recall, 100% (98%-100%); and accuracy, 98% (96%-99%). For Pap smear reports, the precision, recall, and accuracy of NLP were all 100%. Our study developed NLP models that accurately extract abnormal results from mammography and Pap smear reports. Plans include using NLP technology to generate real-time alerts and reminders for providers to facilitate timely follow-up of abnormal results.

  12. Automated chart review utilizing natural language processing algorithm for asthma predictive index. (United States)

    Kaur, Harsheen; Sohn, Sunghwan; Wi, Chung-Il; Ryu, Euijung; Park, Miguel A; Bachman, Kay; Kita, Hirohito; Croghan, Ivana; Castro-Rodriguez, Jose A; Voge, Gretchen A; Liu, Hongfang; Juhn, Young J


    Thus far, no algorithms have been developed to automatically identify patients who meet Asthma Predictive Index (API) criteria from electronic health records (EHRs). Our objective is to develop and validate a natural language processing (NLP) algorithm to identify patients who meet API criteria. This is a cross-sectional study nested in a birth cohort study in Olmsted County, MN. Asthma status ascertained by manual chart review based on API criteria served as the gold standard. NLP-API was developed on a training cohort (n = 87) and validated on a test cohort (n = 427). Criterion validity was measured by sensitivity, specificity, positive predictive value and negative predictive value of the NLP algorithm against manual chart review for asthma status. Construct validity was determined by associations of asthma status defined by NLP-API with known risk factors for asthma. Among the eligible 427 subjects of the test cohort, 48% were males and 74% were White. Median age was 5.3 years (interquartile range 3.6-6.8). 35 (8%) had a history of asthma by NLP-API vs. 36 (8%) by abstractor, with 31 by both approaches. NLP-API predicted asthma status with sensitivity 86%, specificity 98%, positive predictive value 88%, negative predictive value 98%. Asthma status by both NLP and manual chart review was significantly associated with the known asthma risk factors, such as history of allergic rhinitis, eczema, family history of asthma, and maternal history of smoking during pregnancy (p value ... NLP-API and abstractor), and the effect sizes were similar between the reviews, with 4.4 vs 4.2, respectively. NLP-API was able to ascertain asthma status in children by mining the EHR and has the potential to enhance asthma care and research through population management and large-scale studies when identifying children who meet API criteria.
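    The criterion-validity measures named in this abstract follow from a 2x2 table, which can be rebuilt from the counts it reports. The sketch below assumes that the 35, 36 and 31 counts refer to the full test cohort of 427; the function name `diagnostic_metrics` is introduced here for illustration. Under that assumption the computed values come out within about one percentage point of the reported 86%, 98%, 88% and 98%, and the small differences may reflect rounding in the abstract.

```python
def diagnostic_metrics(tp, fp, fn, tn):
    """Standard criterion-validity measures from the cells of a 2x2 table."""
    return {
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
        "ppv": tp / (tp + fp),
        "npv": tn / (tn + fn),
    }


# Counts taken from the abstract; treating them as one 2x2 table is an assumption.
total, nlp_positive, reference_positive, both_positive = 427, 35, 36, 31

tp = both_positive                       # asthma by NLP-API and by abstractor
fp = nlp_positive - both_positive        # asthma by NLP-API only
fn = reference_positive - both_positive  # asthma by abstractor only
tn = total - tp - fp - fn                # asthma by neither

metrics = diagnostic_metrics(tp, fp, fn, tn)
for name, value in metrics.items():
    print(f"{name}: {value:.1%}")
```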

  13. Interpretation of Ukrainian and Polish Adverbial Word Equivalents Form and Meaning Interaction in National Explanatory Lexicography

    Directory of Open Access Journals (Sweden)

    Alla Luchyk


    Full Text Available The article proves the necessity and possibility of compiling dictionaries whose glossary units have an intermediate existence status, a category to which word equivalents belong. To build a Ukrainian-Polish dictionary glossary of this type, the form and meaning of Ukrainian and Polish word equivalents are analysed, the common and distinctive features of these language system elements are described, and the principles for compiling such a dictionary are clarified.

  14. Intercultural pragmatics: an investigation of expressing opinions in Irish English amongst Irish and Polish students.


    Gąsior, Weronika Zofia


    peer-reviewed Research in cross-cultural pragmatics has been limited to a handful of speech acts, and opinions remain rather poorly documented. The aim of this research was to explore the speech act of opinions from the dual perspective of pragmalinguistics-sociopragmatics, focusing additionally on the Irish variety of the English language and the Irish-Polish intercultural context. An empirical study of the expression of opinions among Polish and Irish students was conducte...

  15. Natural Language Search Interfaces: Health Data Needs Single-Field Variable Search (United States)

    Smith, Sam; Sufi, Shoaib; Goble, Carole; Buchan, Iain


    search interface for learnability (P=.002, 95% CI [0.6-2.4]), ease of use (P<.001, 95% CI [1.2-3.2]), and satisfaction (P<.001, 95% CI [1.8-3.5]). The results show superior cross-domain usability of Web search, which is consistent with its general familiarity and with enabling queries to be refined as the search proceeds, which treats serendipity as part of the refinement. Conclusions: The results provide clear evidence that data science should adopt single-field natural language search interfaces for variable search supporting in particular: query reformulation; data browsing; faceted search; surrogates; relevance feedback; summarization, analytics, and visual presentation. PMID:26769334

  16. Assessing the feasibility of large-scale natural language processing in a corpus of ordinary medical records: a lexical analysis. (United States)

    Identify the lexical content of a large corpus of ordinary medical records to assess the feasibility of large-scale natural language processing. A corpus of 560 megabytes of medical record text from an academic medical center was broken into individual words and compared with the words in six medical vocabularies, a common word list, and a database of patient names. Unrecognized words were assessed for algorithmic and contextual approaches to identifying more words, while the remainder were analyzed for spelling correctness. About 60% of the words occurred in the medical vocabularies, common word list, or names database. Of the remainder, one-third were recognizable by other means. Of the remaining unrecognizable words, over three-fourths represented correctly spelled real words and the rest were misspellings. Large-scale generalized natural language processing methods for the medical record will require expansion of existing vocabularies, spelling error correction, and other algorithmic approaches to map words into those from clinical vocabularies.
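    The lexical analysis described here amounts to splitting the corpus into words and checking each one against a series of word lists. A minimal sketch follows; the tokeniser, the three tiny lexicons and the choice to measure coverage over distinct words (rather than running tokens) are assumptions for illustration, since the abstract does not specify them.

```python
import re
from collections import Counter


def tokenize(text):
    """Split text into lower-cased alphabetic words."""
    return re.findall(r"[a-z]+", text.lower())


def lexical_coverage(corpus, lexicons):
    """Assign each distinct word to the first lexicon that contains it."""
    distinct = set(tokenize(corpus))
    tally = Counter()
    unrecognized = set()
    for word in distinct:
        for name, vocabulary in lexicons.items():
            if word in vocabulary:
                tally[name] += 1
                break
        else:
            # No lexicon matched: candidate for spelling or contextual analysis.
            unrecognized.add(word)
    coverage = sum(tally.values()) / len(distinct) if distinct else 0.0
    return coverage, tally, unrecognized


# Hypothetical miniature lexicons standing in for the vocabularies in the study.
lexicons = {
    "medical": {"hypertension", "aspirin", "edema"},
    "common": {"patient", "has", "and", "takes", "daily", "no"},
    "names": {"smith"},
}
corpus = "Patient Smith has hypertension and takes aspirin daily. No edema, no hypertenson."
coverage, tally, unrecognized = lexical_coverage(corpus, lexicons)
print(f"{coverage:.0%} recognized; unrecognized: {sorted(unrecognized)}")
```

    The leftover set is where the second stage described in the abstract would start: deciding whether each unrecognized word is a real word missing from the vocabularies or a misspelling.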

    Full Text Available Scientific publications written in natural language still play a central role as our knowledge source. However, due to the flood of publications, the literature survey process has become highly time-consuming and tangled, especially for novices of the discipline. Therefore, tools supporting the literature-survey process may help the individual scientist to explore new useful domains. Natural language processing (NLP) is expected to be one of the promising techniques to retrieve, abstract, and extract knowledge. In this contribution, NLP is first applied to the literature of chemical vapor deposition (CVD), which is a sub-discipline of materials science and is a complex and interdisciplinary field of research involving chemists, physicists, engineers, and materials scientists. Causal knowledge extraction from the literature is demonstrated using NLP.

  18. Crowdsourcing a normative natural language dataset: a comparison of Amazon Mechanical Turk and in-lab data collection. (United States)

    Saunders, Daniel R; Bex, Peter J; Woods, Russell L


    Crowdsourcing has become a valuable method for collecting medical research data. This approach, recruiting through open calls on the Web, is particularly useful for assembling large normative datasets. However, it is not known how natural language datasets collected over the Web differ from those collected under controlled laboratory conditions. To compare the natural language responses obtained from a crowdsourced sample of participants with responses collected in a conventional laboratory setting from participants recruited according to specific age and gender criteria. We collected natural language descriptions of 200 half-minute movie clips, from Amazon Mechanical Turk workers (crowdsourced) and 60 participants recruited from the community (lab-sourced). Crowdsourced participants responded to as many clips as they wanted and typed their responses, whereas lab-sourced participants gave spoken responses to 40 clips, and their responses were transcribed. The content of the responses was evaluated using a take-one-out procedure, which compared responses to other responses to the same clip and to other clips, with a comparison of the average number of shared words. In contrast to the 13 months of recruiting that was required to collect normative data from 60 lab-sourced participants (with specific demographic characteristics), only 34 days were needed to collect normative data from 99 crowdsourced participants (contributing a median of 22 responses). The majority of crowdsourced workers were female, and the median age was 35 years, lower than the lab-sourced median of 62 years but similar to the median age of the US population. The responses contributed by the crowdsourced participants were longer on average, that is, 33 words compared to 28 words (P ...) ... crowdsourced participants had more shared words (P=.004 and .01 respectively), whereas younger participants had higher numbers of shared words in the lab-sourced population (P=.01). Crowdsourcing is an effective approach ...

  19. Efficient polishing of aspheric optics

    The objectives of this project are to develop, evaluate, and optimize novel designs for a polishing tool intended for ultra-precise figure corrections on aspheric optics with tolerances typical of those required for use in extreme ultraviolet (EUV) projection lithography. This work may lead to an enhanced US industrial capability for producing optics for EUV, x-ray, and other high precision applications. LLNL benefits from developments in computer-controlled polishing and the insertion of fluid mechanics modeling into the precision manufacturing area. Our accomplishments include the numerical estimation of the hydrodynamic shear stress distribution for a new polishing tool that directs and controls the interaction of an abrasive slurry with an optical surface. A key milestone is establishing a correlation between the shear stress predicted using our fluid mechanics model and the observed removal footprint created by a prototype tool. In addition, we demonstrate the ability to remove 25 nm layers of optical glass in a manner qualitatively similar to macroscopic milling operations using a numerically controlled machine tool. Other accomplishments include the development of computer control software for directing the polishing tool and the construction of a polishing testbed.

  20. Steering the conversation: A linguistic exploration of natural language interactions with a digital assistant during simulated driving. (United States)

    Large, David R; Clark, Leigh; Quandt, Annie; Burnett, Gary; Skrypchuk, Lee


    Given the proliferation of 'intelligent' and 'socially-aware' digital assistants embodying everyday mobile technology - and the undeniable logic that utilising voice-activated controls and interfaces in cars reduces the visual and manual distraction of interacting with in-vehicle devices - it appears inevitable that next generation vehicles will be embodied by digital assistants and utilise spoken language as a method of interaction. From a design perspective, defining the language and interaction style that a digital driving assistant should adopt is contingent on the role that they play within the social fabric and context in which they are situated. We therefore conducted a qualitative, Wizard-of-Oz study to explore how drivers might interact linguistically with a natural language digital driving assistant. Twenty-five participants drove for 10 min in a medium-fidelity driving simulator while interacting with a state-of-the-art, high-functioning, conversational digital driving assistant. All exchanges were transcribed and analysed using recognised linguistic techniques, such as discourse and conversation analysis, normally reserved for interpersonal investigation. Language usage patterns demonstrate that interactions with the digital assistant were fundamentally social in nature, with participants affording the assistant equal social status and high-level cognitive processing capability. For example, participants were polite, actively controlled turn-taking during the conversation, and used back-channelling, fillers and hesitation, as they might in human communication. Furthermore, participants expected the digital assistant to understand and process complex requests mitigated with hedging words and expressions, and peppered with vague language and deictic references requiring shared contextual information and mutual understanding. Findings are presented in six themes which emerged during the analysis - formulating responses; turn-taking; back

  1. Starting over: international adoption as a natural experiment in language development. (United States)

    Snedeker, Jesse; Geren, Joy; Shafto, Carissa L


    Language development is characterized by predictable shifts in the words children produce and the complexity of their utterances. Because acquisition typically occurs simultaneously with maturation and cognitive development, it is difficult to determine the causes of these shifts. We explored how acquisition proceeds in the absence of possible cognitive or maturational roadblocks, by examining the acquisition of English in internationally adopted preschoolers. Like infants, and unlike other second-language learners, these children acquire language from child-directed speech, without access to bilingual informants. Parental reports and speech samples were collected from 27 preschoolers, 3 to 18 months after they were adopted from China. These children showed the same developmental patterns in language production as monolingual infants (matched for vocabulary size). Early on, their vocabularies were dominated by nouns, their utterances were short, and grammatical morphemes were generally omitted. Children at later stages had more diverse vocabularies and produced longer utterances with more grammatical morphemes.

  2. A Discussion about Upgrading the Quick Script Platform to Create Natural Language based IoT Systems

    Khanna, Anirudh; Das, Bhagwan; Pandey, Bishwajeet


    With the advent of AI and IoT, the idea of incorporating smart things/appliances in our day to day life is converting into a reality. The paper discusses the possibilities and potential of designing IoT systems which can be controlled via natural language, with help of Quick Script as a development ..., and where all the necessary changes/additions are to be made. The benefits of this will include sharing the power of controlling and even programming (up to some extent) to the user end. As well as providing a simple intermediary to make communication between man and his machines a little more natural...

  3. Natural Language Processing Based Instrument for Classification of Free Text Medical Records


    Khachidze, Manana; Tsintsadze, Magda; Archuadze, Maia


    According to the Ministry of Labor, Health and Social Affairs of Georgia, a new health management system has to be introduced in the near future. In this context, the problem arises of structuring and classifying documents containing the full history of medical services provided. The present work introduces an instrument for classifying medical records written in the Georgian language. It is the first attempt at such classification of Georgian-language medical records. On the w...

  4. Meeting the Needs of Polonia: A Comparison of the Polish-American Press Before and After 1980-1981 (United States)


    Meeting the Needs of Polonia: A Comparison of the Polish-American Press Before and After 1980-1981 ... Author(s): Zajackowski, Donald Thomas ... Purpose of the study: The ethnic, foreign-language press, including the Polish-language press, has existed in this country to serve the needs of its

  5. Dynamical Languages (United States)

    Xie, Huimin

    The following sections are included: * Definition of Dynamical Languages * Distinct Excluded Blocks * Definition and Properties * L and L″ in Chomsky Hierarchy * A Natural Equivalence Relation * Symbolic Flows * Symbolic Flows and Dynamical Languages * Subshifts of Finite Type * Sofic Systems * Graphs and Dynamical Languages * Graphs and Shannon-Graphs * Transitive Languages * Topological Entropy

  6. Dependency distance: A new perspective on the syntactic development in second language acquisition. Comment on "Dependency distance: A new perspective on syntactic patterns in natural language" by Haitao Liu et al. (United States)

    Jiang, Jingyang; Ouyang, Jinghui


    Liu et al. [1] offer a clear and informative account of the use of dependency distance in studying natural languages, with a focus on the viewpoint that dependency distance minimization (DDM) can be regarded as a linguistic universal. We would like to add the perspective of employing dependency distance in studies of second language acquisition (SLA), particularly studies of syntactic development.

  7. Cannabinoids cases in Polish athletes


    A Pokrywka; Z Obmiński; D Kwiatkowska; R Grucza


    The aim of this study was to investigate the number of cases and the profiles of Polish athletes who had occasionally been using marijuana or hashish throughout the period of 1998-2004, with respect to: sex, age, and discipline of sport as well as the period of testing (in- and out-of-competition). Results of the study were compared with some data reported by other WADA accredited anti-doping laboratories. In total, 13 631 urine samples taken from Polish athletes of both sexes, aged 10-67 year...

  8. Natural Language Search Interfaces: Health Data Needs Single-Field Variable Search. (United States)

    Jay, Caroline; Harper, Simon; Dunlop, Ian; Smith, Sam; Sufi, Shoaib; Goble, Carole; Buchan, Iain


    ,19=18.0, P<.001). There was also a main effect of task (F2,38=4.1, P=.025, Greenhouse-Geisser correction applied). Overall, participants were asked to rate learnability, ease of use, and satisfaction. Paired mean comparisons showed that the Web search interface received significantly higher ratings than the traditional search interface for learnability (P=.002, 95% CI [0.6-2.4]), ease of use (P<.001, 95% CI [1.2-3.2]), and satisfaction (P<.001, 95% CI [1.8-3.5]). The results show superior cross-domain usability of Web search, which is consistent with its general familiarity and with enabling queries to be refined as the search proceeds, which treats serendipity as part of the refinement. The results provide clear evidence that data science should adopt single-field natural language search interfaces for variable search supporting in particular: query reformulation; data browsing; faceted search; surrogates; relevance feedback; summarization, analytics, and visual presentation.

  9. Natural language morphology integration in off-line Arabic optical text recognition. (United States)

    Kanoun, Slim; Alimi, Adel M; Lecourtier, Yves


    In this paper, we propose a new linguistic-based approach called the affixal approach for Arabic word and text image recognition. Most of the existing works in the field integrate knowledge of the Arabic language into the recognition process in one of two ways: either in post-recognition, using a language dictionary (dictionary of words) to validate the word hypotheses suggested by the OCR, or in the course of the recognition process (recognition directed by a lexicon), using a statistical model of the language (Hidden Markov Model or N-gram). The proposed approach uses the linguistic concepts of the vocabulary to direct and simplify the recognition process. The principal contribution of the proposed approach is to be able to categorize the word hypotheses into words that are either derived or not derived from roots and to characterize each word hypothesis morphologically in order to prepare the text hypotheses for later analyses (for example, syntactic analysis to filter the sentence hypotheses).

  10. In silico Evolutionary Developmental Neurobiology and the Origin of Natural Language (United States)

    Szathmáry, Eörs; Szathmáry, Zoltán; Ittzés, Péter; Orbán, Gergő; Zachár, István; Huszár, Ferenc; Fedor, Anna; Varga, Máté; Számadó, Szabolcs

    It is justified to assume that part of our genetic endowment contributes to our language skills, yet it is impossible to tell at this moment exactly how genes affect the language faculty. We complement experimental biological studies by an in silico approach in that we simulate the evolution of neuronal networks under selection for language-related skills. At the heart of this project is the Evolutionary Neurogenetic Algorithm (ENGA) that is deliberately biomimetic. The design of the system was inspired by important biological phenomena such as brain ontogenesis, neuron morphologies, and indirect genetic encoding. Neuronal networks were selected and were allowed to reproduce as a function of their performance in the given task. The selected neuronal networks in all scenarios were able to solve the communication problem they had to face. The most striking feature of the model is that it works with highly indirect genetic encoding--just as brains do.

  11. Mirror neurons and the social nature of language: the neural exploitation hypothesis. (United States)

    Gallese, Vittorio


    This paper discusses the relevance of the discovery of mirror neurons in monkeys and of the mirror neuron system in humans to a neuroscientific account of primates' social cognition and its evolution. It is proposed that mirror neurons and the functional mechanism they underpin, embodied simulation, can ground within a unitary neurophysiological explanatory framework important aspects of human social cognition. In particular, the main focus is on language, here conceived according to a neurophenomenological perspective, grounding meaning on the social experience of action. A neurophysiological hypothesis--the "neural exploitation hypothesis"--is introduced to explain how key aspects of human social cognition are underpinned by brain mechanisms originally evolved for sensorimotor integration. It is proposed that these mechanisms were later on adapted as new neurofunctional architecture for thought and language, while retaining their original functions as well. By neural exploitation, social cognition and language can be linked to the experiential domain of action.

  12. Cannabinoids cases in Polish athletes

    Full Text Available The aim of this study was to investigate the number of cases and the profiles of Polish athletes who had occasionally been using marijuana or hashish throughout the period of 1998-2004, with respect to: sex, age, and discipline of sport as well as the period of testing (in- and out-of-competition). Results of the study were compared with some data reported by other WADA accredited anti-doping laboratories. In total, 13 631 urine samples taken from Polish athletes of both sexes, aged 10-67 years, performing 46 disciplines of sport were tested. Cannabinoids were detected in 267 samples. Among Polish athletes the relative number of positive THC (tetrahydrocannabinol) samples was one of the highest in Europe. The group of young Polish athletes (aged 16-24 years) was the most THC-positive. THC-positive cases were noted more frequently in male athletes tested out of competition. The so-called contact sports (rugby, ice hockey), skating, boxing, badminton, body building and acrobatic sports were those sports where the higher risk of cannabis use was observed. The legal interpretation of some positive cannabinoid results would be difficult because of some accidental and unintentional use of the narcotics by sportsmen. It was concluded that national anti-doping organizations (NADOs), which are competent to judge whether the anti-doping rules were violated, should take into account the possibility of non-intentional doping use of cannabinoids via passive smoking of marijuana.

  13. Sensing roughness and polish direction

    Jakobsen, Michael Linde; Olesen, Anders Sig; Larsen, Henning Engelbrecht


    needs information about the RMS-value of the surface roughness and the current direction of the scratches introduced by the polishing process. The RMS-value indicates to the operator how far he is from the final finish, and the scratch orientation is often specified by the customer in order to avoid...... structures and light scattered from scratches....

  14. ATLAS brochure (Polish version)

    CERN Multimedia

    Lefevre, C


    ATLAS is the largest detector at the LHC, the most powerful particle accelerator in the world, which will start up in 2008. ATLAS is a multi-purpose detector, designed to throw light on fundamental questions such as the origin of mass and the nature of the Universe's dark matter.

  15. Genetic and Environmental Links between Natural Language Use and Cognitive Ability in Toddlers (United States)

    Canfield, Caitlin F.; Edelson, Lisa R.; Saudino, Kimberly J.


    Although the phenotypic correlation between language and nonverbal cognitive ability is well-documented, studies examining the etiology of the covariance between these abilities are scant, particularly in very young children. The goal of this study was to address this gap in the literature by examining the genetic and environmental links between…

  16. Implementation of Danish in the Natural Language Generator of Angus2

    DEFF Research Database (Denmark)

    Larsen, Søren Støvelbæk; Fihl, Preben; Moeslund, Thomas B.

    The purpose of this technical report is to cover the implementation of the Danish language and grammar in the Angus2 software. This includes a brief description of the Angus2 software, and the Danish grammar with relevance to the implementation in Angus2, and detailed description of how...

  17. The substantive nature of psycholexical personality factors : A comparison across languages

    NARCIS (Netherlands)

    Peabody, D; De Raad, B.


    The psycholexical approach to personality structure in American English has led to the Big Five factors. The present study considers whether this result is similar or different in other languages. Instead of placing the usual emphasis on quantitative indices, this study examines the substantive


    Full Text Available This paper aims at presenting a survey of computational linguistic tools presently available but whose potential has been neither fully considered nor exploited to its full in modern CALL. It starts with a discussion on the rationale of DDL to language learning, presenting typical DDL-activities, DDL-software and potential extensions of non-typical DDL-software (electronic dictionaries and electronic dictionary facilities) to DDL. An extended section is devoted to describing NLP-technology and how it can be integrated into CALL, within already existing software or as stand-alone resources. A range of NLP-tools is presented (MT programs, taggers, lemmatizers, parsers and speech technologies) with special emphasis on tagged concordancing. The paper finishes with a number of reflections and ideas on how language technologies can be used efficiently within the language learning context and how extensive exploration and integration of these technologies might change and extend both modern CALL and the present language learning paradigm.

  19. Detecting Novel and Emerging Drug Terms Using Natural Language Processing: A Social Media Corpus Study (United States)

    Simpson, Sean S; Brugman, Claudia M; Conners, Thomas J


    Background: With the rapid development of new psychoactive substances (NPS) and changes in the use of more traditional drugs, it is increasingly difficult for researchers and public health practitioners to keep up with emerging drugs and drug terms. Substance use surveys and diagnostic tools need to be able to ask about substances using the terms that drug users themselves are likely to be using. Analyses of social media may offer new ways for researchers to uncover and track changes in drug terms in near real time. This study describes the initial results from an innovative collaboration between substance use epidemiologists and linguistic scientists employing techniques from the field of natural language processing to examine drug-related terms in a sample of tweets from the United States. Objective: The objective of this study was to assess the feasibility of using distributed word-vector embeddings trained on social media data to uncover previously unknown (to researchers) drug terms. Methods: In this pilot study, we trained a continuous bag of words (CBOW) model of distributed word-vector embeddings on a Twitter dataset collected during July 2016 (roughly 884.2 million tokens). We queried the trained word embeddings for terms with high cosine similarity (a proxy for semantic relatedness) to well-known slang terms for marijuana to produce a list of candidate terms likely to function as slang terms for this substance. This candidate list was then compared with an expert-generated list of marijuana terms to assess the accuracy and efficacy of using word-vector embeddings to search for novel drug terminology. Results: The method described here produced a list of 200 candidate terms for the target substance (marijuana). Of these 200 candidates, 115 were determined to in fact relate to marijuana (65 terms for the substance itself, 50 terms related to paraphernalia). This included 30 terms which were used to refer to the target substance in the corpus yet did not appear
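
    The cosine-similarity query described in this record can be illustrated with a minimal sketch. It is not the authors' code: the three-dimensional toy vectors and the term names (seed_term, candidate_a, and so on) are hypothetical stand-ins for trained CBOW embeddings, and only the final figure (115 relevant terms out of 200 candidates) comes from the abstract.

```python
import math


def cosine_similarity(u, v):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def nearest_terms(seed, embeddings, top_n):
    """Rank every other term by cosine similarity to the seed term."""
    seed_vec = embeddings[seed]
    scored = [
        (term, cosine_similarity(seed_vec, vec))
        for term, vec in embeddings.items()
        if term != seed
    ]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:top_n]


# Hypothetical toy embeddings; real ones would come from a trained model.
toy_embeddings = {
    "seed_term": [1.0, 0.0, 0.0],
    "candidate_a": [0.9, 0.1, 0.0],
    "candidate_b": [0.0, 1.0, 0.0],
    "candidate_c": [0.7, 0.7, 0.0],
}

ranked = nearest_terms("seed_term", toy_embeddings, top_n=2)

# Share of candidates judged relevant, using the counts in the abstract.
precision = 115 / 200
```

    With the reported counts, the share of relevant candidates works out to 57.5%.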

  20. On the relation between dependency distance, crossing dependencies, and parsing. Comment on "Dependency distance: a new perspective on syntactic patterns in natural languages" by Haitao Liu et al. (United States)

    Gómez-Rodríguez, Carlos


    Liu et al. [1] provide a comprehensive account of research on dependency distance in human languages. While the article is a very rich and useful report on this complex subject, here I will expand on a few specific issues where research in computational linguistics (specifically natural language processing) can inform DDM research, and vice versa. These aspects have not been explored much in [1] or elsewhere, probably due to the little overlap between both research communities, but they may provide interesting insights for improving our understanding of the evolution of human languages, the mechanisms by which the brain processes and understands language, and the construction of effective computer systems to achieve this goal.

  1. Context Analysis of Customer Requests using a Hybrid Adaptive Neuro Fuzzy Inference System and Hidden Markov Models in the Natural Language Call Routing Problem (United States)

    Rustamov, Samir; Mustafayev, Elshan; Clements, Mark A.


    The paper investigates context analysis of customer requests in the natural language call routing problem. One of the most significant problems in natural language call routing is comprehension of the client's request. To address this issue, hybrid HMM and ANFIS models are examined. Combining different types of models (ANFIS and HMM) can prevent the system from misidentifying user intention in a dialogue system. Because the classification process uses no lexical or syntactic analysis, the hybrid system based on these models may be employed in various languages and call routing domains.

  2. A natural language query system for Hubble Space Telescope proposal selection (United States)

    Hornick, Thomas; Cohen, William; Miller, Glenn


    The proposal selection process for the Hubble Space Telescope is assisted by a robust and easy to use query program (TACOS). The system parses an English subset language sentence regardless of the order of the keyword phrases, allowing the user greater flexibility than a standard command query language. Capabilities for macro and procedure definition are also integrated. The system was designed for flexibility in both use and maintenance. In addition, TACOS can be applied to any knowledge domain that can be expressed in terms of a single relation. The system was implemented mostly in Common LISP. The TACOS design is described in detail, with particular attention given to the implementation methods of sentence processing.

  3. Research and Development in Natural Language Understanding as Part of the Strategic Computing Program. (United States)


  4. Descriptive Metaphysics, Natural Language Metaphysics, Sapir-Whorf, and All That Stuff: Evidence from the Mass-Count Distinction

    Directory of Open Access Journals (Sweden)

    Francis Jeffry Pelletier


    Full Text Available Strawson (1959) described ‘descriptive metaphysics’, Bach (1986a) described ‘natural language metaphysics’, Sapir (1929) and Whorf (1940a,b, 1941) describe, well, Sapir-Whorfianism. And there are other views concerning the relation between correct semantic analysis of linguistic phenomena and the “reality” that is supposed to be thereby described. I think some considerations from the analyses of the mass-count distinction can shed some light on that very dark topic. References: Bach, Emmon. 1986a. ‘Natural Language Metaphysics’. In Ruth Barcan Marcus, G.J.W. Dorn & Paul Weingartner (eds.) ‘Logic, Methodology, and Philosophy of Science, VII’, 573–595. Amsterdam: North Holland. Bach, Emmon. 1986b. ‘The Algebra of Events’. Linguistics and Philosophy 9: 5–16. Berger, Peter & Luckmann, Thomas. 1966. The Social Construction of Reality: A Treatise in the Sociology of Knowledge. New York: Doubleday. Boroditsky, Lera, Schmidt, Lauren & Phillips, Webb. 2003. ‘Sex, Syntax, and Semantics’. In Dedre Gentner & Susan Goldin-Meadow (eds.) ‘Language in Mind: Advances in the Study of Language and Cognition’, 59–80. Cambridge, MA: MIT Press. Cheng, L. & Sybesma, R. 1999. ‘Bare and Not-So-Bare Nouns and the structure of NP’. Linguistic Inquiry 30: 509–542. Chierchia, Gennaro. 1998a. ‘Reference to Kinds across Languages’. Natural Language Semantics 6: 339–405. Chierchia, Gennaro. 1998b. ‘Plurality of Mass Nouns and the Notion of ‘Semantic Parameter’ ’. In S. Rothstein (ed.) ‘Events and Grammar’, 53–103. Dordrecht: Kluwer. Chierchia, Gennaro. 2010. ‘Mass Nouns, Vagueness and Semantic Variation’. Synthèse 174: 99–149., Jenny. 1997. Quantifiers and Selection: On the Distribution of Quantifying Expressions in French, Dutch and English. Ph.D. thesis, University of Leiden, Holland

  5. Treating conduct disorder: An effectiveness and natural language analysis study of a new family-centred intervention program. (United States)

    Stevens, Kimberly A; Ronan, Prof Kevin; Davies, Gene


    This paper reports on a new family-centred, feedback-informed intervention focused on evaluating therapeutic outcomes and language changes across treatment for conduct disorder (CD). The study included 26 youth and families from a larger randomised, controlled trial (Ronan et al., in preparation). Outcome measures reflected family functioning/youth compliance, delinquency, and family goal attainment. First- and last-treatment session audio files were transcribed into more than 286,000 words and evaluated through the Linguistic Inquiry and Word Count Analysis program (Pennebaker et al., 2007). Significant outcomes across family functioning/youth compliance, delinquency, goal attainment and word usage reflected moderate-strong effect sizes. Benchmarking findings also revealed reduced time of treatment delivery compared to a gold standard approach. Linguistic analysis revealed specific language changes across treatment. For caregivers, increased first person, action-oriented, present tense, and assent type words and decreased sadness words were found; for youth, significant reduction in use of leisure words. This study is the first using lexical analyses of natural language to assess change across treatment for conduct disordered youth and families. Such findings provided strong support for program tenets; others, more speculative support. Copyright © 2016. Published by Elsevier B.V.

  6. The Phonotactic Influence on the Perception of a Consonant Cluster /pt/ by Native English and Native Polish Listeners: A Behavioral and Event Related Potential (ERP) Study (United States)

    Wagner, Monica; Shafer, Valerie L.; Martin, Brett; Steinschneider, Mitchell


    The effect of exposure to the contextual features of the /pt/ cluster was investigated in native-English and native-Polish listeners using behavioral and event-related potential (ERP) methodology. Both groups experience the /pt/ cluster in their languages, but only the Polish group experiences the cluster in the context of word onset examined in…

  7. Polish energy-system modernisation

    International Nuclear Information System (INIS)

    Drozdz, M.


    The Polish energy-system needs intensive investments in new technologies, which are energy efficient, clean and cost effective. Since the early 1990s, the Polish economy has had practically full access to modern technological devices, equipment and technologies. Introducing new technologies is a difficult task for project teams, constructors and investors. The author presents a set of principles for project teams useful in planning and energy modernisation. Several essential features are discussed: Energy-efficient appliances and systems; Choice of energy carriers, media and fuels; Optimal tariffs, maximum power and installed power; Intelligent, integrated, steering systems; Waste-energy recovery; Renewable-energy recovery. In practice there are several difficulties connected with planning and realising good technological and economic solutions. The author presents his own experiences of energy-system modernisation of industrial processes and building new objects. (Author)


    Full Text Available A household’s financial security is essential for the satisfaction of the needs and wants of its members, both communal and individual. It constitutes a kind of foundation for all of a household’s financial decisions that impact its standard of living. The article aims to assess the level of financial security of Polish households in 2005–2013. The research draws on data from Genworth Index, HBS conducted by the Central Statistical Office (GUS) and Social Diagnosis (Diagnoza społeczna) overseen by the Social Monitoring Council. The study shows that Poland is characterized by a low level of financial security relative to other European countries, especially Western and Scandinavian. More than three-quarters of Polish households experience financial problems and exhibit both a low propensity to save, and low savings rates.

  9. Energy savings in Polish buildings

    Markel, L.C.; Gula, A.; Reeves, G.


    A demonstration of low-cost insulation and weatherization techniques was a part of phase 1 of the Krakow Clean Fossil Fuels and Energy Efficient Project. The objectives were to identify a cost-effective set of measures to reduce energy used for space heating, determine how much energy could be saved, and foster widespread implementation of those measures. The demonstration project focused on four 11-story buildings in a Krakow housing cooperative. Energy savings of over 20% were obtained. Most importantly, the procedures and materials implemented in the demonstration project have been adapted to Polish conditions and applied to other housing cooperatives, schools, and hospitals. Additional projects are being planned, in Krakow and other cities, under the direction of FEWE-Krakow, the Polish Energie Cities Network, and Biuro Rozwoju Krakowa.

  10. Confocal Raman spectroscopy for the analysis of nail polish evidence. (United States)

    López-López, Maria; Vaz, Joana; García-Ruiz, Carmen


    Nail polishes are cosmetic paints that may be amenable to forensic analysis, offering useful information to assist in crime reconstruction. Although the appearance of a nail polish could allow quick visual identification of the sample, this analysis is subject to the perception and subjective interpretation of the forensic examiner. The chemical analysis of nail polishes offers a great deal of information not subject to analyst interpretation. Confocal Raman spectroscopy is a well-suited technique for the analysis of paints due to its non-invasive and non-destructive nature and its ability to supply information about the organic and inorganic components of the sample. In this work, 77 regular and gel nail polishes were analyzed with confocal Raman spectroscopy using two laser wavelengths (532 and 780 nm). The sample behavior under the two laser wavelengths and the differences in the spectra taken at different points of the sample were studied for each nail polish. Additionally, the spectra obtained for all the nail polishes were visually compared. The results showed that the longer laser wavelength prevents sample burning and fluorescence effects; the similarity among the spectra collected within the sample is not directly related to the presence of glitter particles; and 64% of the samples analyzed showed a characteristic spectrum. Additionally, the use of confocal Raman spectroscopy for the forensic analysis of nail polish evidence in the form of flakes or smudges on different surfaces was studied. The results showed that both types of evidence can be analyzed by the technique. Also, two non-invasive sampling methods for the collection of the evidence from the nails of the suspect or the victim were proposed: (i) to use acetone-soaked cotton swabs to remove the nail varnishes and (ii) to scrape the nail polish from the nail with a blade. Both approaches, each exhibiting advantages and drawbacks in terms of transport and handling, were appropriate

  11. Earnings Management in Polish Companies


    Brzeszczyński, Janusz; Gajdka, Jerzy; Schabek, Tomasz


    This paper presents the results of an investigation of a phenomenon known as "earnings management" (EM) among the companies listed on the Polish stock market. The distribution of earnings per share (EPS) for the stocks around the threshold value of "zero" and the threshold of "recent performance" was analyzed over the period 1997-2010. Moreover, the changes in earnings for companies suspected of manipulating their earnings were also investigated. The results, which indicate as...
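
    One common way to examine a distribution around a threshold is to count observations in narrow intervals just below and just above it; a marked shortfall below the threshold relative to the count above it is then read as a possible sign of earnings management. The sketch below illustrates that idea only. It is not taken from the paper, and the interval width and the sample EPS values are hypothetical.

```python
def threshold_counts(eps_values, threshold=0.0, width=0.05):
    """Count observations in [threshold - width, threshold) and in
    [threshold, threshold + width). The width is an assumed parameter."""
    below = sum(1 for x in eps_values if threshold - width <= x < threshold)
    above = sum(1 for x in eps_values if threshold <= x < threshold + width)
    return below, above


# Hypothetical EPS observations, for illustration only.
sample_eps = [-0.30, -0.04, -0.01, 0.00, 0.01, 0.02, 0.03, 0.04, 0.20]

below, above = threshold_counts(sample_eps)
print(f"just below zero: {below}, just above zero: {above}")
```

    The same function can be pointed at a "recent performance" threshold by passing the prior period's EPS as the threshold argument.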

  12. Identification of methicillin-resistant Staphylococcus aureus within the Nation’s Veterans Affairs Medical Centers using natural language processing

    Full Text Available Background: Accurate information is needed to direct healthcare systems’ efforts to control methicillin-resistant Staphylococcus aureus (MRSA). Assembling complete and correct microbiology data is vital to understanding and addressing the multiple drug-resistant organisms in our hospitals. Methods: Herein, we describe a system that securely gathers microbiology data from the Department of Veterans Affairs (VA) network of databases. Using natural language processing methods, we applied an information extraction process to extract organisms and susceptibilities from the free-text data. We then validated the extraction against independently derived electronic data and expert annotation. Results: We estimate that the collected microbiology data are 98.5% complete and that methicillin-resistant Staphylococcus aureus was extracted accurately 99.7% of the time. Conclusions: Applying natural language processing methods to microbiology records appears to be a promising way to extract accurate and useful nosocomial pathogen surveillance data. Both scientific inquiry and the data’s reliability will be dependent on the surveillance system’s capability to compare from multiple sources and circumvent systematic error. The dataset constructed and methods used for this investigation could contribute to a comprehensive infectious disease surveillance system or other pressing needs.

  13. Dual Sticky Hierarchical Dirichlet Process Hidden Markov Model and Its Application to Natural Language Description of Motions. (United States)

    Hu, Weiming; Tian, Guodong; Kang, Yongxin; Yuan, Chunfeng; Maybank, Stephen


    In this paper, a new nonparametric Bayesian model called the dual sticky hierarchical Dirichlet process hidden Markov model (HDP-HMM) is proposed for mining activities from a collection of time series data such as trajectories. All the time series data are clustered. Each cluster of time series data, corresponding to a motion pattern, is modeled by an HMM. Our model postulates a set of HMMs that share a common set of states (topics in an analogy with topic models for document processing), but have unique transition distributions. For the application to motion trajectory modeling, topics correspond to motion activities. The learnt topics are clustered into atomic activities which are assigned predicates. We propose a Bayesian inference method to decompose a given trajectory into a sequence of atomic activities. On combining the learnt sources and sinks, semantic motion regions, and the learnt sequence of atomic activities, the action represented by the trajectory can be described in natural language in as automatic a way as possible. The effectiveness of our dual sticky HDP-HMM is validated on several trajectory datasets. The effectiveness of the natural language descriptions for motions is demonstrated on the vehicle trajectories extracted from a traffic scene.

  14. im4Things: An Ontology-Based Natural Language Interface for Controlling Devices in the Internet of Things

    KAUST Repository

    Noguera-Arnaldos, José Ángel


    The Internet of Things (IoT) offers opportunities for new applications and services that enable users to access and control their working and home environment from local and remote locations, aiming to perform daily life activities in an easy way. However, the IoT also introduces new challenges, some of which arise from the large range of devices currently available and the heterogeneous interfaces provided for their control. The control and management of this variety of devices and interfaces represent a new challenge for non-expert users, instead of making their life easier. Based on this understanding, in this work we present a natural language interface for the IoT, which takes advantage of Semantic Web technologies to allow non-expert users to control their home environment through an instant messaging application in an easy and intuitive way. We conducted several experiments with a group of end users aiming to evaluate the effectiveness of our approach to control home appliances by means of natural language instructions. The evaluation results proved that without the need for technicalities, the user was able to control the home appliances in an efficient way.

  15. The dynamic nature of motivation in language learning: A classroom perspective

    Directory of Open Access Journals (Sweden)

    Mirosław Pawlak


    Full Text Available When we examine the empirical investigations of motivation in second and foreign language learning, even those drawing upon the latest theoretical paradigms, such as the L2 motivational self system (Dörnyei, 2009), it becomes clear that many of them still fail to take account of its dynamic character and temporal variation. This may be surprising in view of the fact that the need to adopt such a process-oriented approach has been emphasized by a number of theorists and researchers (e.g., Dörnyei, 2000, 2001, 2009; Ushioda, 1996; Williams & Burden, 1997), and it lies at the heart of the model of second language motivation proposed by Dörnyei and Ottó (1998). It is also unfortunate that few research projects have addressed the question of how motivation changes during a language lesson as well as a series of lessons, and what factors might be responsible for fluctuations of this kind. The present paper aims to rectify this problem by reporting the findings of a classroom-based study which investigated the changes in the motivation of 28 senior high school students, both in terms of their goals and intentions, and their interest and engagement in classroom activities and tasks over the period of four weeks. The analysis of the data collected by means of questionnaires, observations and interviews showed that although the reasons for learning remain relatively stable, the intensity of motivation is indeed subject to variation on a minute-to-minute basis and this fact has to be recognized even in large-scale, cross-sectional research in this area.

  16. 19th Polish Control Conference

    Kacprzyk, Janusz; Oprzędkiewicz, Krzysztof; Skruch, Paweł


    This volume contains the proceedings of the KKA 2017 – the 19th Polish Control Conference, organized by the Department of Automatics and Biomedical Engineering, AGH University of Science and Technology in Kraków, Poland on June 18–21, 2017, under the auspices of the Committee on Automatic Control and Robotics of the Polish Academy of Sciences, and the Commission for Engineering Sciences of the Polish Academy of Arts and Sciences. Part 1 deals with general issues of modeling and control, notably flow modeling and control, sliding mode, predictive, dual, etc. control. In turn, Part 2 focuses on optimization, estimation and prediction for control. Part 3 is concerned with autonomous vehicles, while Part 4 addresses applications. Part 5 discusses computer methods in control, and Part 6 examines fractional order calculus in the modeling and control of dynamic systems. Part 7 focuses on modern robotics. Part 8 deals with modeling and identification, while Part 9 deals with problems related to security, fault ...

  17. Directly polished lightweight aluminum mirror (United States)

    ter Horst, Rik; Tromp, Niels; de Haan, Menno; Navarro, Ramon; Venema, Lars; Pragt, Johan


    During the last ten years, Astron has been a major contractor for the design and manufacturing of astronomical instruments for Space- and Earth based observatories, such as VISIR, MIDI, SPIFFI, X-Shooter and MIRI. Driven by the need to reduce the weight of optically ultra-stiff structures, two promising techniques have been developed in the last years: ASTRON Extreme Lightweighting [1][2] for mechanical structures and an improved Polishing Technique for Aluminum Mirrors. Using one single material for both optical components and mechanical structure simplifies the design of a cryogenic instrument significantly, it is very beneficial during instrument test and verification, and makes the instrument insensitive to temperature changes. Aluminum has been the main material used for cryogenic optical instruments, and optical aluminum mirrors are generally diamond turned. The application of a polishable hard top coating like nickel removes excess stray light caused by the groove pattern, but limits the degree of lightweighting of the mirrors due to the bi-metal effect. By directly polishing the aluminum mirror surface, the recent developments at Astron allow for using a non-exotic material for light weighted yet accurate optical mirrors, with a lower surface roughness ( 1nm RMS), higher surface accuracy and reduced light scattering. This paper presents the techniques, obtained results and a global comparison with alternative lightweight mirror solutions. Recent discussions indicate possible extensions of the extreme light weight technology to alternative materials such as Zerodur or Silicon Carbide.

  18. On the nature and evolution of the neural bases of human language (United States)

    Lieberman, Philip


    The traditional theory equating the brain bases of language with Broca's and Wernicke's neocortical areas is wrong. Neural circuits linking activity in anatomically segregated populations of neurons in subcortical structures and the neocortex throughout the human brain regulate complex behaviors such as walking, talking, and comprehending the meaning of sentences. When we hear or read a word, neural structures involved in the perception or real-world associations of the word are activated as well as posterior cortical regions adjacent to Wernicke's area. Many areas of the neocortex and subcortical structures support the cortical-striatal-cortical circuits that confer complex syntactic ability, speech production, and a large vocabulary. However, many of these structures also form part of the neural circuits regulating other aspects of behavior. For example, the basal ganglia, which regulate motor control, are also crucial elements in the circuits that confer human linguistic ability and abstract reasoning. The cerebellum, traditionally associated with motor control, is active in motor learning. The basal ganglia are also key elements in reward-based learning. Data from studies of Broca's aphasia, Parkinson's disease, hypoxia, focal brain damage, and a genetically transmitted brain anomaly (the putative "language gene," family KE), and from comparative studies of the brains and behavior of other species, demonstrate that the basal ganglia sequence the discrete elements that constitute a complete motor act, syntactic process, or thought process. Imaging studies of intact human subjects and electrophysiologic and tracer studies of the brains and behavior of other species confirm these findings. As Dobzansky put it, "Nothing in biology makes sense except in the light of evolution" (cited in Mayr, 1982). That applies with as much force to the human brain and the neural bases of language as it does to the human foot or jaw. 
The converse follows: the mark of evolution on

  19. Italian, French, Polish: in the Bermuda Triangle of linguistic interference

    Directory of Open Access Journals (Sweden)

    Małgorzata Balicka


    Full Text Available Increasing contacts between nations make people learn foreign languages. Unfortunately, the use of more than one language system may cause errors due to linguistic interference. In the present paper we consider different definitions of this phenomenon and describe the conditions required for its presence, along with proposals on how to eliminate interference mistakes. A specific case is false friends, words similar in form but different in meaning, which can lead to misunderstandings or even disgrace. As interference implies the knowledge of at least two languages, bilingualism is another problem discussed in the paper. We present different approaches to interference and study the relation between bilingualism and interference. Finally, we discuss loanwords as one of the results of interference. With all these considerations in mind we can make a specific study of interference mistakes made by Polish students of Italian philology.

  20. Natural Language Processing Based Instrument for Classification of Free Text Medical Records

    Full Text Available According to the Ministry of Labor, Health and Social Affairs of Georgia, a new health management system has to be introduced in the near future. In this context arises the problem of structuring and classifying documents containing the full history of medical services provided. The present work introduces an instrument for the classification of medical records written in the Georgian language. It is the first attempt at such classification of Georgian-language medical records. In total, 24,855 examination records were studied. The documents were classified into three main groups (ultrasonography, endoscopy, and X-ray) and 13 subgroups using two well-known methods: Support Vector Machine (SVM) and K-Nearest Neighbor (KNN). The results obtained demonstrated that both machine learning methods performed successfully, with a slight advantage for SVM. In the process of classification a “shrink” method, based on feature selection, was introduced and applied. At the first stage of classification the results of the “shrink” case were better; however, at the second stage of classification into subclasses, 23% of all documents could not be linked to only one definite individual subclass (liver or binary system) due to common features characterizing these subclasses. The overall results of the study were successful.
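
    As a rough illustration of the K-Nearest Neighbor side of such a comparison, the following minimal sketch classifies short free-text records by cosine similarity over bag-of-words counts. The toy records, the labels, and the function names (`bag_of_words`, `cosine`, `knn_classify`) are hypothetical and are not taken from the study, which worked on Georgian-language records with its own feature selection.

```python
from collections import Counter
import math


def bag_of_words(text):
    """Lower-case the text, split on whitespace, and count the tokens."""
    return Counter(text.lower().split())


def cosine(a, b):
    """Cosine similarity between two token-count vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def knn_classify(query, training, k=3):
    """Return the majority label among the k training records most similar to query."""
    q = bag_of_words(query)
    ranked = sorted(
        training,
        key=lambda item: cosine(q, bag_of_words(item[0])),
        reverse=True,
    )
    votes = Counter(label for _, label in ranked[:k])
    return votes.most_common(1)[0][0]


# Hypothetical toy records (English stand-ins, invented for illustration).
TRAINING = [
    ("ultrasound of liver shows normal echogenicity", "ultrasonography"),
    ("ultrasound of kidney shows no stones", "ultrasonography"),
    ("endoscopy of stomach shows mild gastritis", "endoscopy"),
    ("endoscopy of esophagus shows no lesions", "endoscopy"),
    ("x-ray of chest shows clear lungs", "x-ray"),
    ("x-ray of wrist shows no fracture", "x-ray"),
]

print(knn_classify("ultrasound of gallbladder shows stones", TRAINING))
```

    A real system would replace the whitespace tokenizer with language-specific preprocessing and would tune k on held-out data; the sketch only shows the nearest-neighbor voting step.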

  1. Language of sociology: the problem of artificial and natural (everyday) concepts

    M. V. Naumenko


    The author analyzes the possible negative consequences of the widespread use of artificial concepts in sociology and the advantages and disadvantages of using natural (everyday) concepts in sociology, and proposes a way to resolve the situation of naive reading of everyday concepts.

  2. Polish students at the Académie Julian until 1919

    Full Text Available The subject of the article is the presence of Polish students in the most important private artistic school in Paris in the second half of the 19th century. The extant records regarding the atelier for male students made it possible to compile a list of about 165 Polish painters and sculptors studying there in the period from 1880 to 1919. The text presents the criteria used when preparing the list and the diagrams show the fluctuations in registration and the number of Polish artists in particular ateliers in successive years. The observations contained in the article have a summary nature and are illustrated only with selected examples.

  3. A Requirements-Based Exploration of Open-Source Software Development Projects--Towards a Natural Language Processing Software Analysis Framework (United States)

    Vlas, Radu Eduard


    Open source projects do have requirements; they are, however, mostly informal, text descriptions found in requests, forums, and other correspondence. Understanding such requirements provides insight into the nature of open source projects. Unfortunately, manual analysis of natural language requirements is time-consuming, and for large projects,…

  4. An Overview of Polish Martial Arts

    Full Text Available The purpose of this study is to explain the revival of Polish martial arts from the perspectives of cultural sociology, the sciences of physical culture, and the humanistic theory of martial arts. The Polish Martial Arts (Polskie Sztuki Walki are a subject still requiring serious scientific examination, even in Poland. There are few works concerning the history of Polish weapons, and most only describe techniques for wielding specific types of edged weapons. Nevertheless, there is a large group of enthusiasts trying to restore and cultivate the old Polish tradition, a tradition with heavy emphasis on the art of fencing. The author knows many of the people and facts presented here, from personal observation and from direct participation in these arts. As a disciple of the late Master Yoshio Sugino (10th-dan Kobudo Katori Shinto-ryu, he fought against the Polish saber champion, and he has taken part in joint exhibitions of Polish and Japanese fencing.


    Full Text Available The aim of this study was to determine the semantic meaning of the predicates Ngajengan, Daharan, Ngelor, Mangan, Ngrodok (Eating), Kaken (Eating), Suap, Bejijit (Eating), Bekeruak (Eating), Ngerasak (Eating) and Nyangklok (Eating), as well as the lexical meaning of each word and its function in sentences, especially the meaning of eating in the Sasaknese language. The lexical meaning of Ngajengan, Daharan, Ngelor, Mangan, Ngrodok (Eating), Kaken (Eating), Suap, Bejijit (Eating), Bekeruak (Eating), Ngerasak (Eating) and Nyangklok (Eating) is doing something in order to eat, but the words differ in their usage in sentences. In addition, word usage depends on the subject and object, and some predicates require a tool to express eating a meal or food.

  6. The embodied nature of medical concepts: image schemas and language for PAIN. (United States)

    Prieto Velasco, Juan Antonio; Tercedor Sánchez, Maribel


    Cognitive linguistics assumes that knowledge is both embodied and situated as far as it is acquired through our bodily interaction with the world in a specific environment (e.g. Barsalou in Lang Cogn Process 18:513-562, 2003; Connell et al. in PLoS One 7:3, 2012). Therefore, embodiment provides an explanation to the mental representation and linguistic expression of concepts. Among the first, we find multimodal conceptual structures, like image schemas, which are schematic representations of embodied experiences resulting from our conceptualization of the surrounding environment (Tercedor Sánchez et al. in J Spec Transl 18:187-205, 2012). Furthermore, the way we interact with the environment and its objects is dynamic and configures how we refer to concepts both by means of images and lexicalizations. In this article, we investigate how image schemas underlie verbal and visual representations. They both evoke concepts based on exteroception, interoception and proprioception which can be lexicalized through language. More specifically, we study (1) a multimodal corpus of medical texts to examine how image schemas lexicalize in the language of medicine to represent specialized concepts and (2) medical pictures to explore the depiction of image-schematic concepts, in order to account for the verbal and visual representation of embodied concepts. We explore the concept PAIN, a sensory and emotional experience associated with actual or potential tissue damage, using corpus analysis tools (Sketch Engine) to extract information about the lexicalization of underlying image schemas in definitions and defining contexts. Then, we use the image schemas behind medical concepts to consistently select images which depict our experience of pain and the way we understand it. Finally, such lexicalizations and visualizations will help us assess how we refer to PAIN both verbally and visually.

  7. Modelling language

    Cardey, Sylviane


    In response to the need for reliable results from natural language processing, this book presents an original way of decomposing a language(s) in a microscopic manner by means of intra/inter‑language norms and divergences, going progressively from languages as systems to the linguistic, mathematical and computational models, which being based on a constructive approach are inherently traceable. Languages are described with their elements aggregating or repelling each other to form viable interrelated micro‑systems. The abstract model, which contrary to the current state of the art works in int

  8. Multilingual Acquisition of Vowels in L1 Polish, L2 Danish and L3 English (United States)

    Sypianska, Jolanta


    The aim of this paper is to determine whether all languages in the linguistic repertoire of a multilingual speaker manifest cross-linguistic influence (CLI) and establish the directions of CLI on the basis of chosen vowels from the linguistic repertoire of two groups: the Bilingual group (L1 Polish/L2 Danish) and the Multilingual group (L1…

  9. Cross-Cultural Perspective of FL Teaching and Learning in the Polish Context (United States)

    Sobkowiak, Pawel


    This study examines whether learners' capacity to use a foreign language (FL) successfully in the global world is developed in the FL classroom in Polish high schools. The article reports results of the quantitative research which aimed at assessing whether and to what extent homogeneous FL classes in Poland are conducive to developing learners'…

  10. The tourism attractiveness of Polish libraries


    Miedzińska, Magdalena; Tanaś, Sławoj


    The aim of the article is to draw the reader's attention to the tourism attractiveness of renowned Polish libraries. These have attained a tourism function due to tourism exploration and penetration, but remain in the shadow of other Polish cultural assets. The article outlines the historical geography of Polish libraries, an analysis of tourism assets and an attempt to classify and catalogue libraries in Poland.

  11. Automatic Determination of the Need for Intravenous Contrast in Musculoskeletal MRI Examinations Using IBM Watson's Natural Language Processing Algorithm. (United States)

    Trivedi, Hari; Mesterhazy, Joseph; Laguna, Benjamin; Vu, Thienkhai; Sohn, Jae Ho


    Magnetic resonance imaging (MRI) protocoling can be time- and resource-intensive, and protocols can often be suboptimal dependent upon the expertise or preferences of the protocoling radiologist. Providing a best-practice recommendation for an MRI protocol has the potential to improve efficiency and decrease the likelihood of a suboptimal or erroneous study. The goal of this study was to develop and validate a machine learning-based natural language classifier that can automatically assign the use of intravenous contrast for musculoskeletal MRI protocols based upon the free-text clinical indication of the study, thereby improving efficiency of the protocoling radiologist and potentially decreasing errors. We utilized a deep learning-based natural language classification system from IBM Watson, a question-answering supercomputer that gained fame after challenging the best human players on Jeopardy! in 2011. We compared this solution to a series of traditional machine learning-based natural language processing techniques that utilize a term-document frequency matrix. Each classifier was trained with 1240 MRI protocols plus their respective clinical indications and validated with a test set of 280. Ground truth of contrast assignment was obtained from the clinical record. For evaluation of inter-reader agreement, a blinded second reader radiologist analyzed all cases and determined contrast assignment based on only the free-text clinical indication. In the test set, Watson demonstrated overall accuracy of 83.2% when compared to the original protocol. This was similar to the overall accuracy of 80.2% achieved by an ensemble of eight traditional machine learning algorithms based on a term-document matrix. When compared to the second reader's contrast assignment, Watson achieved 88.6% agreement. When evaluating only the subset of cases where the original protocol and second reader were concordant (n = 251), agreement climbed further to 90.0%. The classifier was
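
    The term-document frequency matrix mentioned above as the input to the traditional classifiers can be sketched as follows. This is a minimal illustration under stated assumptions, not the study's pipeline: the clinical indications are invented, the tokenizer is a plain whitespace split, and the function name `term_document_matrix` is hypothetical. Rows here are documents and columns are terms; transposing gives the term-major layout.

```python
from collections import Counter


def term_document_matrix(documents):
    """Return (vocabulary, matrix) where matrix[i][j] counts term j in document i."""
    tokenized = [doc.lower().split() for doc in documents]
    # Sorted vocabulary gives every term a fixed column index.
    vocabulary = sorted({token for tokens in tokenized for token in tokens})
    matrix = []
    for tokens in tokenized:
        counts = Counter(tokens)
        matrix.append([counts[term] for term in vocabulary])
    return vocabulary, matrix


# Hypothetical free-text clinical indications, invented for illustration.
INDICATIONS = [
    "knee pain after fall",
    "suspected infection knee",
    "shoulder pain chronic pain",
]

vocab, tdm = term_document_matrix(INDICATIONS)
print(vocab)
for row in tdm:
    print(row)
```

    Each row can then be passed as a feature vector to a conventional classifier, which is the general idea behind the term-document approaches the abstract compares against.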

  12. Lexical exponents of hypothetical modality in Polish and Lithuanian

    To analyse both languages, the method of theoretical contrastive studies is used, the most important features of which are: (1) orienting the studies from the content grounds to the formal grounds, (2) using a semantic interlanguage as tertium comparationis. First of all, the content of hypothetical modality, its definition and its paraphrase are given. Next, the gradational character of this category is discussed. Six groups of lexemes are distinguished, expressing the corresponding degrees of hypothetical modality, from a shadow of uncertainty (minimal degree of probability) to an almost complete certainty (maximum degree of probability). The experimental Polish-Lithuanian corpus is widely applied in the studies.

  13. Writing in science: Exploring teachers' and students' views of the nature of science in language enriched environments (United States)

    Decoito, Isha

    Writing in science can be used to address some of the issues relevant to contemporary scientific literacy, such as the nature of science, which describes the scientific enterprise for science education. This has implications for the kinds of writing tasks students should attempt in the classroom, and for how students should understand the rationale and claims of these tasks. While scientific writing may train the mind to think scientifically in a disciplined and structured way thus encouraging students to gain access to the public domain of scientific knowledge, the counter-argument is that students need to be able to express their thoughts freely in their own language. Writing activities must aim to promote philosophical and epistemological views of science that accurately portray contemporary science. This mixed-methods case study explored language-enriched environments, in this case, secondary science classrooms with a focus on teacher-developed activities, involving diversified writing styles, that were directly linked to the science curriculum. The research foci included: teachers' implementation of these activities in their classrooms; how the activities reflected the teachers' nature of science views; common attributes between students' views of science and how they represented science in their writings; and if, and how the activities influenced students' nature of science views. Teachers' and students' views of writing and the nature of science are illustrated through pre-and post-questionnaire responses; interviews; student work; and classroom observations. Results indicated that diversified writing activities have the potential to accurately portray science to students, personalize learning in science, improve students' overall attitude towards science, and enhance scientific literacy through learning science, learning about science, and doing science. Further research is necessary to develop an understanding of whether the choice of genre has an

  14. The Phonetics and Phonology of the Polish Calling Melodies. (United States)

    Arvaniti, Amalia; Żygis, Marzena; Jaskuła, Marek


    Two calling melodies of Polish were investigated, the routine call, used to call someone for an everyday reason, and the urgent call, which conveys disapproval of the addressee's actions. A Discourse Completion Task was used to elicit the two melodies from Polish speakers using twelve names from one to four syllables long; there were three names per syllable count, and speakers produced three tokens of each name with each melody. The results, based on eleven speakers, show that the routine calling melody consists of a low F0 stretch followed by a rise-fall-rise; the urgent calling melody, on the other hand, is a simple rise-fall. Systematic differences were found in the scaling and alignment of tonal targets: the routine call showed late alignment of the accentual pitch peak, and in most instances lower scaling of targets. The accented vowel was also affected, being overall louder in the urgent call. Based on the data and comparisons with other Polish melodies, we analyze the routine call as LH* !H-H% and the urgent call as H* L-L%. We discuss the results and our analysis in light of recent findings on calling melodies in other languages, and explore their repercussions for intonational phonology and the modeling of intonation. © 2017 S. Karger AG, Basel.

  15. Computer simulation as an important approach to explore language universal. Comment on "Dependency distance: a new perspective on syntactic patterns in natural languages" by Haitao Liu et al. (United States)

    Lu, Qian


    Exploring language universals is one of the major goals of linguistic research, which is largely devoted to answering the "Platonic questions" in linguistics, that is, what language knowledge is and how it is acquired and used. However, if solely guided by linguistic intuition, it is very difficult for syntactic studies to answer these questions, or to achieve abstractions in the scientific sense. This suggests that linguistic analyses based on probability theory may provide effective ways to investigate language universals in terms of biological motivations or cognitive psychological mechanisms. With the view that "Language is a human-driven system", Liu, Xu & Liang's review [1] pointed out that dependency distance minimization (DDM), which has been corroborated by big data analysis of corpora, may be a language universal shaped in language evolution, a universal that has a profound effect on syntactic patterns.
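
    To make the quantity behind DDM concrete, the sketch below computes the mean dependency distance of a single sentence. It assumes the definition commonly used in this literature, namely that the dependency distance of a word is the absolute difference between its linear position and that of its head, with the root excluded from the average. The example parse and the function name `mean_dependency_distance` are hypothetical.

```python
def mean_dependency_distance(heads):
    """Average |head position - word position| over all non-root words.

    heads[i] is the 1-based position of the head of word i + 1;
    a value of 0 marks the root, which has no head and is skipped.
    """
    distances = [
        abs(head - position)
        for position, head in enumerate(heads, start=1)
        if head != 0
    ]
    if not distances:
        return 0.0
    return sum(distances) / len(distances)


# Hypothetical parse of "The dog chased the cat":
# The -> dog, dog -> chased, chased = root, the -> cat, cat -> chased
HEADS = [2, 3, 0, 5, 3]
print(mean_dependency_distance(HEADS))  # 1.25
```

    Averaging this value over the sentences of a treebank and comparing it with random baselines is the kind of corpus-based evidence the comment refers to.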

  16. Perceptions of and Attitudes towards Regional Varieties of Polish: Views from Two Polish Provinces (United States)

    Milobog, Magdalena; Garrett, Peter


    This paper reports a study of perceptions and attitudes relating to regional varieties of Polish. The methodology followed folk linguistic approaches to attitudes research. Respondents in two Polish provinces were asked to draw on a map of Poland where they thought the main regional varieties of Polish were spoken, and then to name and…

  17. Performance analysis of CRF-based learning for processing WoT application requests expressed in natural language. (United States)

    Yoon, Young


    In this paper, we investigate the effectiveness of a CRF-based learning method for identifying necessary Web of Things (WoT) application components that would satisfy the users' requests issued in natural language. For instance, a user request such as "archive all sports breaking news" can be satisfied by composing a WoT application that consists of ESPN breaking news service and Dropbox as a storage service. We built an engine that can identify the necessary application components by recognizing a main act (MA) or named entities (NEs) from a given request. We trained this engine with the descriptions of WoT applications (called recipes) that were collected from IFTTT WoT platform. IFTTT hosts over 300 WoT entities that offer thousands of functions referred to as triggers and actions. There are more than 270,000 publicly-available recipes composed with those functions by real users. Therefore, the set of these recipes is well-qualified for the training of our MA and NE recognition engine. We share our unique experience of generating the training and test set from these recipe descriptions and assess the performance of the CRF-based language method. Based on the performance evaluation, we introduce further research directions.

  18. Droughts in historical times in Polish territory (United States)

    Limanowka, Danuta; Cebulak, Elzbieta; Pyrc, Robert; Doktor, Radoslaw


    Climate change is one of the key environmental, social and economic issues, and it also has political consequences. The impact of climate conditions on national economies is increasingly recognized and receives much attention, both on the global scale and from individual national governments. In the years 2008-2010, the Institute of Meteorology and Water Management - National Research Institute in Poland carried out the KLIMAT Project, "Impact of climate change on environment, economy and society (changes, effects and methods of reducing them, conclusions for science, engineering practice and economic planning)", No. POIG01-03-01-14-011/08. The project was financed by the European Union and the Polish state budget within the framework of the Innovative Economy Operational Programme. A very wide range of research was carried out in different thematic areas. One of them was "Natural disasters and internal safety of the country (civil and economical)." The problem of drought in Poland was examined in terms of meteorology and hydrology. "Proxy" data descriptions very often report dry years and seasons and hot periods without precipitation. Analysis of historical material made it possible to identify the years that experienced prolonged periods of high temperatures and rainfall shortages. The weather phenomenon defined as drought belongs to extreme events. This information was very helpful in the process of indexing and thus in reconstructing the course and intensity of climatic elements in the past. The analysis covered the period from the year 1000 to modern times. Due to the limited information from the period 1000-1500, the authors focused primarily on the period from 1500 to 2010. Analysis of the collected material allowed the development of a highly precise temporal structure of the possible occurrence of dry periods on Polish territory.

  19. Polish Geophysical Solid Earth Infrastructure Contributing to EPOS (United States)

    Debski, W.; Mutke, G.; Suchcicki, J.; Jozwiak, W.; Wiejacz, P.; Trojanowski, J.


    In this poster we present the current state of the main Polish solid-earth-oriented infrastructures and briefly describe the history of their development, their current state, and some plans for their future development. The presentation concentrates only on the classical infrastructure, leaving aside for the time being the geodetic-oriented infrastructure, such as the GPS network and the GPS data processing centers, gravimetric infrastructure and others of this type. The Polish broadband seismic infrastructure consists of 7 permanent broadband stations incorporated into the VEBSN initiative and running on Polish territory, plus one station operated in collaboration with NORSAR at the Polish polar station at Hornsund (Svalbard). All stations are equipped with STS-2 seismometers and Polish MK-6 seismic stations providing 120 dB dynamics, 100 Hz sampling and real-time data transmission to the processing center. Besides this permanent broadband seismic network (PLSN), the Central Institute of Mining is running a permanent regional short-period network in the Upper Silesia area dedicated to the detailed monitoring of seismicity induced by black coal mining in this area. The network consists of ... As mining activity is the main source of seismicity in Poland, all mines are also running underground short-period networks, for example the Rudna-Polkowice copper mine seismic network, which consists of 64 short-period seismometers located underground. In that area, especially around Zelazny Most, the huge post-flotation artificial lake, IGF PAS is running a local seismic array consisting of 4 short-period seismometers. Besides these permanent networks, IGF PAN is running a portable seismic network for detailed mapping of possible natural seismic activity in selected regions of Poland. An important contribution to classical geophysical observation of the electromagnetic field is provided by three permanent geomagnetic observatories (one at Hornsund) and a supporting set of 10

  20. History of Polish gastrointestinal radiology. (United States)

    Urbanik, A


    As early as several days after the publication of the information concerning Roentgen's discovery the first radiological examinations were performed in Poland. The new method was immediately introduced into medical practice, including gastroenterology. In that pioneer period the most important works were those by Walery Jaworski who was the first man in the world to perform an X-ray of gall stones as well as the stomach with the use of a contrast medium. In its more-than-a-hundred-year history Polish gastrointestinal radiology has attempted not only to catch up with the world science, but it also has made a considerable contribution to its development.

  1. Corporate Politics on Polish Millennials


    Natalia Roślik


    At the beginning of the paper, the author determines and describes who Millennials actually are. Then, based on this definition of Millennials, the paper analyses corporations' activity regarding this age group over recent years. The main goal of the thesis is to bring out their specific features and describe what corporations on the Polish job market are doing to encourage them to work in their offices. Especially in Poland within the last years, it is observed that big multinat...

  2. Polish Foundation for Energy Efficiency

    The Polish Foundation for Energy Efficiency (FEWE) was established in Poland at the end of 1990. FEWE, as an independent and non-profit organization, has the following objectives: to strive towards an energy efficient national economy, and to show the way and methods by use of which energy efficiency can be increased. The activity of the Foundation covers the entire territory of Poland through three regional centers: in Warsaw, Katowice and Cracow. FEWE employs well-known and experienced specialists within thermal and power engineering, civil engineering, economy and applied sciences. The organizer of the Foundation has been Battelle Memorial Institute - Pacific Northwest Laboratories from the USA.

  3. Reproducibility in Natural Language Processing: A Case Study of Two R Libraries for Mining PubMed/MEDLINE (United States)

    Cohen, K. Bretonnel; Xia, Jingbo; Roeder, Christophe; Hunter, Lawrence E.


    There is currently a crisis in science related to highly publicized failures to reproduce large numbers of published studies. The current work proposes, by way of case studies, a methodology for moving the study of reproducibility in computational work to a full stage beyond that of earlier work. Specifically, it presents a case study in attempting to reproduce the reports of two R libraries for doing text mining of the PubMed/MEDLINE repository of scientific publications. The main findings are that a rational paradigm for reproduction of natural language processing papers can be established; the advertised functionality was difficult, but not impossible, to reproduce; and reproducibility studies can produce additional insights into the functioning of the published system. Additionally, the work on reproducibility led to the production of novel user-centered documentation that has been accessed 260 times since its publication, an average of once a day per library.

  4. Computer-Aided TRIZ Ideality and Level of Invention Estimation Using Natural Language Processing and Machine Learning (United States)

    Adams, Christopher; Tate, Derrick

    Patent textual descriptions provide a wealth of information that can be used to understand the underlying design approaches that result in the generation of novel and innovative technology. This article will discuss a new approach for estimating Degree of Ideality and Level of Invention metrics from the theory of inventive problem solving (TRIZ) using patent textual information. Patent text includes information that can be used to model both the functions performed by a design and the associated costs and problems that affect a design’s value. The motivation of this research is to use patent data with calculation of TRIZ metrics to help designers understand which combinations of system components and functions result in creative and innovative design solutions. This article will discuss in detail methods to estimate these TRIZ metrics using natural language processing and machine learning with the use of neural networks.

  5. Analyzing discourse and text complexity for learning and collaborating: a cognitive approach based on natural language processing

    CERN Document Server

    Dascălu, Mihai


    With the advent and increasing popularity of Computer Supported Collaborative Learning (CSCL) and e-learning technologies, the need for automatic assessment and for teacher/tutor support for the two tightly intertwined activities of comprehension of reading materials and of collaboration among peers has grown significantly. In this context, a polyphonic model of discourse derived from Bakhtin’s work as a paradigm is used for analyzing both general texts and CSCL conversations in a unique framework focused on different facets of textual cohesion. Specifically, the individual learning perspective focuses on the identification of reading strategies and on providing a multi-dimensional textual complexity model, whereas the collaborative learning dimension is centered on the evaluation of participants’ involvement, as well as on collaboration assessment. Our approach based on advanced Natural Language Processing techniques provides a qualitative estimation of the learning process and enhance...

  6. Automated Assessment of Patients' Self-Narratives for Posttraumatic Stress Disorder Screening Using Natural Language Processing and Text Mining. (United States)

    He, Qiwei; Veldkamp, Bernard P; Glas, Cees A W; de Vries, Theo


    Patients' narratives about traumatic experiences and symptoms are useful in clinical screening and diagnostic procedures. In this study, we presented an automated assessment system to screen patients for posttraumatic stress disorder via a natural language processing and text-mining approach. Four machine-learning algorithms (decision tree, naive Bayes, support vector machine, and an alternative classification approach called the product score model) were used in combination with n-gram representation models to identify patterns between verbal features in self-narratives and psychiatric diagnoses. With our sample, the product score model with unigrams attained the highest prediction accuracy when compared with practitioners' diagnoses. The addition of multigrams contributed most to balancing the metrics of sensitivity and specificity. This article also demonstrates that text mining is a promising approach for analyzing patients' self-expression behavior, thus helping clinicians identify potential patients from an early stage.
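
    The n-gram representation mentioned in the abstract can be illustrated with a minimal sketch. The function names and the example sentence below are hypothetical and are not taken from the study; the classifiers themselves (including the product score model) are not reproduced here.

```python
from collections import Counter


def ngrams(tokens, n):
    """Return the contiguous n-grams (as tuples) found in a list of tokens."""
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


def ngram_features(text, n_values=(1, 2)):
    """Count n-grams of each requested order in a lowercased, whitespace-split text."""
    tokens = text.lower().split()
    counts = Counter()
    for n in n_values:
        counts.update(ngrams(tokens, n))
    return counts


# Invented example sentence, not study data.
features = ngram_features("I could not sleep I could not eat")
```

    Counts of this kind (unigrams alone, or unigrams plus longer n-grams) are the sort of input a text classifier could be trained on.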

  7. A Sibling-Mediated Intervention for Children with Autism Spectrum Disorder: Using the Natural Language Paradigm (NLP). (United States)

    Spector, Vicki; Charlop, Marjorie H


    We taught three typically developing siblings to occasion speech by implementing the Natural Language Paradigm (NLP) with their brothers with autism spectrum disorder (ASD). A non-concurrent multiple baseline design across children with ASD and sibling dyads was used. Ancillary behaviors of happiness, play, and joint attention for the children with ASD were recorded. Generalization of speech for the children with ASD across setting and peers was also measured. During baseline, the children with ASD displayed few target speech behaviors and the siblings inconsistently occasioned speech from their brothers. After sibling training, however, they successfully delivered NLP, and in turn, for two of the brothers with ASD, speech reached criterion. Implications of this research suggest the inclusion of siblings in interventions.

  8. Arbitrary symbolism in natural language revisited: when word forms carry meaning.

    Cognitive science has a rich history of interest in the ways that languages represent abstract and concrete concepts (e.g., idea vs. dog). Until recently, this focus has centered largely on aspects of word meaning and semantic representation. However, recent corpus analyses have demonstrated that abstract and concrete words are also marked by phonological, orthographic, and morphological differences. These regularities in sound-meaning correspondence potentially allow listeners to infer certain aspects of semantics directly from word form. We investigated this relationship between form and meaning in a series of four experiments. In Experiments 1-2 we examined the role of metalinguistic knowledge in semantic decision by asking participants to make semantic judgments for aurally presented nonwords selectively varied by specific acoustic and phonetic parameters. Participants consistently associated increased word length and diminished wordlikeness with abstract concepts. In Experiment 3, participants completed a semantic decision task (i.e., abstract or concrete) for real words varied by length and concreteness. Participants were more likely to misclassify longer, inflected words (e.g., "apartment") as abstract and shorter uninflected abstract words (e.g., "fate") as concrete. In Experiment 4, we used a multiple regression to predict trial-level naming data from a large corpus of nouns, which revealed significant interaction effects between concreteness and word form. Together these results provide converging evidence for the hypothesis that listeners map sound to meaning through a non-arbitrary process using prior knowledge about statistical regularities in the surface forms of words.

  9. Neurolinguistic approach to natural language processing with applications to medical text analysis. (United States)

    Duch, Włodzisław; Matykiewicz, Paweł; Pestian, John


    Understanding written or spoken language presumably involves spreading neural activation in the brain. This process may be approximated by spreading activation in semantic networks, providing enhanced representations that involve concepts not found directly in the text. The approximation of this process is of great practical and theoretical interest. Although activations of neural circuits involved in representation of words rapidly change in time, snapshots of these activations spreading through associative networks may be captured in a vector model. Concepts of similar type activate larger clusters of neurons, priming areas in the left and right hemisphere. Analysis of recent brain imaging experiments shows the importance of the right hemisphere non-verbal clusterization. Medical ontologies enable development of a large-scale practical algorithm to re-create pathways of spreading neural activations. First, concepts of a specific semantic type are identified in the text, and then all related concepts of the same type are added to the text, providing expanded representations. To avoid rapid growth of the extended feature space, after each step only the most useful features that increase document clusterization are retained. Short hospital discharge summaries are used to illustrate how this process works on real, very noisy data. Expanded texts show significantly improved clustering and may be classified with much higher accuracy. Although better approximations to the spreading of neural activations may be devised, the practical approach presented in this paper helps to discover pathways used by the brain to process specific concepts, and may be used in large-scale applications.
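
    The idea of spreading activation through a semantic network can be sketched as follows. This is only an illustration under stated assumptions: the toy graph, the decay factor and the fixed activation threshold are hypothetical, and the threshold stands in for the clustering-based feature selection described in the abstract, which is not reproduced here.

```python
def spread_activation(graph, seeds, decay=0.5, threshold=0.2, steps=2):
    """Propagate activation from seed concepts through a concept graph.

    graph maps each concept to a list of related concepts (assumed structure).
    Activation is multiplied by `decay` at every hop and is no longer passed
    on once it falls below `threshold`.
    """
    activation = {concept: 1.0 for concept in seeds}
    frontier = dict(activation)
    for _ in range(steps):
        next_frontier = {}
        for concept, level in frontier.items():
            passed = level * decay
            if passed < threshold:
                continue  # too weak to activate neighbours
            for neighbour in graph.get(concept, ()):
                if passed > activation.get(neighbour, 0.0):
                    activation[neighbour] = passed
                    next_frontier[neighbour] = passed
        frontier = next_frontier
    return activation


# Hypothetical toy network; a real system would draw relations from a medical ontology.
toy_graph = {
    "asthma": ["wheezing", "bronchus"],
    "wheezing": ["cough"],
    "cough": ["throat"],
}
expanded = spread_activation(toy_graph, ["asthma"], steps=3)
```

    In this toy run the concept found in the text ("asthma") pulls in related concepts with decreasing weight, which mirrors the expanded representations the abstract describes.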

  10. Buffered Electrochemical Polishing of Niobium

    Ciovati, Gianluigi [Thomas Jefferson National Accelerator Facility (TJNAF), Newport News, VA (United States)]; Tian, Hui [Thomas Jefferson National Accelerator Facility (TJNAF), Newport News, VA (United States); College of William and Mary, Williamsburg, VA (United States)]; Corcoran, Sean [Virginia Polytechnic Inst. and State Univ. (Virginia Tech), Blacksburg, VA (United States)]


    The standard preparation of superconducting radio-frequency (SRF) cavities made of pure niobium includes the removal of a 'damaged' surface layer, by buffered chemical polishing (BCP) or electropolishing (EP), after the cavities are formed. The performance of the cavities is characterized by a sharp degradation of the quality factor when the surface magnetic field exceeds about 90 mT, a phenomenon referred to as 'Q-drop.' In cavities made of polycrystalline fine grain (ASTM 5) niobium, the Q-drop can be significantly reduced by a low-temperature (about 120 °C) 'in-situ' baking of the cavity if the chemical treatment was EP rather than BCP. As part of the effort to understand this phenomenon, we investigated the effect of introducing a polarization potential during buffered chemical polishing, creating a process which is between the standard BCP and EP. While preliminary results on the application of this process to Nb cavities have been previously reported, in this contribution we focus on the characterization of this novel electrochemical process by measuring polarization curves, etching rates, surface finish, electrochemical impedance and the effects of temperature and electrolyte composition. In particular, it is shown that the anodic potential of Nb during BCP reduces the etching rate and improves the surface finish.

  11. From telegraphic to natural language: an expansion system in a pictogram-based AAC application


    Pahisa Solé, Joan


    In this doctoral thesis, we present a compansion system that transforms telegraphic language (sentences formed of uninflected content words), derived from pictogram-based augmentative and alternative communication (AAC), into natural language in Catalan and Spanish. The system was designed to improve the communication of AAC users, who usually have severe speech impairments as well as motor impairments, and who use communication methods based...

  12. Method of polishing nickel-base alloys and stainless steels (United States)

    Steeves, Arthur F.; Buono, Donald P.


    A chemical attack polish and polishing procedure for use on metal surfaces such as nickel base alloys and stainless steels. The chemical attack polish comprises Fe(NO3)3, concentrated CH3COOH, concentrated H2SO4 and H2O. The polishing procedure includes saturating a polishing cloth with the chemical attack polish and submicron abrasive particles and buffing the metal surface.

  13. Chemical language and warfare of bacterial natural products in bacteria-nematode-insect interactions. (United States)

    Shi, Yi-Ming; Bode, Helge B


    Covering: up to November 2017. Organismic interaction is one of the fundamental principles for survival in any ecosystem. Today, numerous examples show the interaction between microorganisms like bacteria and higher eukaryotes that can be anything from mutualistic to parasitic/pathogenic symbioses. There is also increasing evidence that microorganisms are used by higher eukaryotes not only for the supply of essential factors like vitamins but also as biological weapons to protect themselves or to kill other organisms. Excellent examples for such systems are entomopathogenic nematodes of the genera Heterorhabditis and Steinernema that live in mutualistic symbiosis with bacteria of the genera Photorhabdus and Xenorhabdus, respectively. Although these systems have been used successfully in organic farming on an industrial scale, it was only shown during the last 15 years that several different natural products (NPs) produced by the bacteria play key roles in the complex life cycle of the bacterial symbionts, the nematode host and the insect prey that is killed by and provides nutrients for the nematode-bacteria pair. Since the bacteria can switch from mutualistic to pathogenic lifestyle, interacting with two different types of higher eukaryotes, and since the full system with all players can be established in the lab, they are promising model systems to elucidate the natural function of microbial NPs. This review summarizes the current knowledge as well as open questions for NPs from Photorhabdus and Xenorhabdus and tries to assign their roles in the tritrophic relationship.

  14. Advanced techniques for computer-controlled polishing (United States)

    Schinhaerl, Markus; Stamp, Richard; Pitschke, Elmar; Rascher, Rolf; Smith, Lyndon; Smith, Gordon; Geiss, Andreas; Sperber, Peter


    Computer-controlled polishing has introduced determinism into the finishing of high-quality surfaces, for example those used as optical interfaces. Computer-controlled polishing may overcome many of the disadvantages of traditional polishing techniques. The polishing procedure is computed in terms of the surface error-profile and the material removal characteristic of the polishing tool, the influence function. Determinism and predictability not only enable more economical manufacture but also facilitate considerably increased processing accuracy. However, there are several disadvantages that serve to limit the capabilities of computer-controlled polishing; many of these are considered to be issues associated with determination of the influence function. Magnetorheological finishing has been investigated and various new techniques and approaches that dramatically enhance the potential as well as the economics of computer-controlled polishing have been developed and verified experimentally. Recent developments and advancements in computer-controlled polishing are discussed. The generic results of this research may be used in a wide variety of alternative applications in which controlled material removal is employed to achieve a desired surface specification, ranging from surface treatment processes in technical disciplines, to manipulation of biological surface textures in medical technologies.
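
    The abstract gives no equations, so the following is only a sketch under an assumption: that predicted material removal is modelled as the convolution of a dwell-time map with the tool influence function, a model often used for deterministic polishing. The function name and the one-dimensional example values are hypothetical.

```python
def predicted_removal(dwell_time, influence):
    """Discrete 1D convolution of a dwell-time map with a tool influence function.

    dwell_time[i] is the time the tool spends at position i; influence[j] is the
    removal rate at offset j within the tool footprint (both are assumed inputs).
    """
    n, m = len(dwell_time), len(influence)
    removal = [0.0] * (n + m - 1)
    for i, t in enumerate(dwell_time):
        for j, rate in enumerate(influence):
            removal[i + j] += t * rate
    return removal


# Hypothetical values: three dwell positions and a symmetric three-point footprint.
removal_profile = predicted_removal([1.0, 2.0, 0.0], [0.5, 1.0, 0.5])
```

    Under this model, planning a polishing run amounts to choosing dwell times so that the predicted removal matches the measured surface error-profile, which is why an inaccurate influence function limits the achievable accuracy.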

  15. Polish Industry and Art at CERN

    CERN Multimedia


    On 17 October 2000 the second Polish industrial and technological exhibition opened at CERN. The first one was held five years ago and nine of the companies that were present then have come back again this year. Six of those companies were awarded contracts with CERN in 1995. Three Polish officials were present at the Opening Ceremony today: Mrs Malgorzata Kozlowska, Under-secretary of State in the State Committee for Scientific Research, Mr Henryk Ogryczak, Under-secretary of State in the Ministry of Economy and Prof. Jerzy Niewodniczanski, President of the National Atomic Energy Agency. Professor Luciano Maiani welcomed the Polish delegation to CERN and stressed the important contribution of Polish scientists and industrialists to the work of the laboratory. Director General Luciano Maiani (back left) and head of SPL division Karl-Heinz Kissler (back right) visit the Poland at CERN exhibition… The exhibition offers Polish companies the opportunity to establish professional contacts with CERN. Nineteen companies...

  16. Response of the Polish Wheat Prices to the World's Crude Oil Prices

    Agricultural commodity prices play a crucial role both in determining farmers' income and in establishing price relationships for the whole economy. Among the factors influencing wheat prices, crude oil prices are considered one of the most important. The aim of this paper was to assess the character of the linkage between world crude oil prices and Polish wheat prices. Results of the research confirm the existence of such a linkage, although the nature and the strength of this relationship change over time. However, long-run relationships between crude oil and Polish wheat prices were not proven. Moreover, a growing impact of crude oil prices on Polish wheat prices over time was not detected. The results suggest that exchange rates may strongly influence wheat prices. This in turn may weaken the response of Polish wheat prices to world crude oil prices.

  17. Natural language processing of radiology reports for the detection of thromboembolic diseases and clinically relevant incidental findings. (United States)

    Pham, Anne-Dominique; Névéol, Aurélie; Lavergne, Thomas; Yasunaga, Daisuke; Clément, Olivier; Meyer, Guy; Morello, Rémy; Burgun, Anita


    Natural Language Processing (NLP) has been shown to be effective for analyzing the content of radiology reports and identifying diagnoses or patient characteristics. We evaluate the combination of NLP and machine learning to detect thromboembolic disease diagnosis and incidental clinically relevant findings from angiography and venography reports written in French. We model thromboembolic diagnosis and incidental findings as a set of concepts, modalities and relations between concepts that can be used as features by a supervised machine learning algorithm. A corpus of 573 radiology reports was de-identified and manually annotated with the support of NLP tools by a physician for relevant concepts, modalities and relations. A machine learning classifier was trained on the dataset interpreted by a physician for diagnosis of deep-vein thrombosis, pulmonary embolism and clinically relevant incidental findings. Decision models accounted for the imbalanced nature of the data and exploited the structure of the reports. The best model achieved an F measure of 0.98 for pulmonary embolism identification, 1.00 for deep vein thrombosis, and 0.80 for incidental clinically relevant findings. The use of concepts, modalities and relations improved performance in all cases. This study demonstrates the benefits of developing an automated method to identify medical concepts, modality and relations from radiology reports in French. An end-to-end automatic system for annotation and classification which could be applied to other radiology reports databases would be valuable for epidemiological surveillance, performance monitoring, and accreditation in French hospitals.
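
    The abstract reports F measures without stating how precision and recall were weighted. Assuming the usual definition, with P for precision, R for recall and the balanced case (beta = 1) as the likely choice, the measure can be written as:

```latex
F_\beta = \frac{(1 + \beta^2)\, P R}{\beta^2 P + R},
\qquad
F_1 = \frac{2 P R}{P + R}
```

    Under this definition an F measure of 1.00 is reached only when both precision and recall equal 1.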

  18. Africa and Its People in the Polish Media

    The African continent is treated marginally by the Polish media and is usually seen through the lens of four domains of stereotypical perceptions that are associated with difficult life conditions, threats and dangers, beautiful and wild nature, as well as original and diverse cultures. Monitoring of the Polish media has become very important in this situation. That is why the results of the first media monitoring report were published in 2011 by the ‘Africa Another Way’ Foundation. Five years later the monitoring was repeated. It is hard to resist the impression that Africa is still viewed as a poor, underdeveloped and dangerous continent. The way it is presented translates into the way individuals of African descent are perceived.

  19. A new strategy for the restructuring of Polish energy sector

    International Nuclear Information System (INIS)

    Kozlowski, R.H.; Tallat, J.


    In accordance with strategic planning in the military, the leader (in this case the Minister of Economy) is responsible for setting goals, finding the right people to accomplish these goals (those working in the energy sector), analysing the current situation (state of the energy sector) and evaluating available resources (conventional and renewable energy resources). In terms of economic planning (this term is proper for an economy that sets numerous laws and quotas), the goal is to get the Polish economy out of economic slump, which is the result of seventeen years of improper government practices, into a state of prosperity corresponding to no less than the European average. The only way of accomplishing this goal of high economic growth and catching up with highly-developed countries is to develop local inexpensive energy resources. This study focuses on the potential to develop abundant Polish geothermal resources as well as natural gas based co-generation. (author)

  20. Automated visual inspection for polished stone manufacture (United States)

    Smith, Melvyn L.; Smith, Lyndon N.


    Increased globalisation of the ornamental stone market has led to increased competition and more rigorous product quality requirements. As such, there are strong motivators to introduce new, more effective, inspection technologies that will help enable stone processors to reduce costs, improve quality and improve productivity. Natural stone surfaces may contain a mixture of complex two-dimensional (2D) patterns and three-dimensional (3D) features. The challenge in terms of automated inspection is to develop systems able to reliably identify 3D topographic defects, either naturally occurring or resulting from polishing, in the presence of concomitant complex 2D stochastic colour patterns. The resulting real-time analysis of the defects may be used in adaptive process control, in order to avoid the wasteful production of defective product. An innovative approach, using structured light and based upon an adaptation of the photometric stereo method, has been pioneered and developed at UWE to isolate and characterize mixed 2D and 3D surface features. The method is able to undertake tasks considered beyond the capabilities of existing surface inspection techniques. The approach has been successfully applied to real stone samples, and a selection of experimental results is presented.

  1. Dependency distance in language evolution. Comment on "Dependency distance: A new perspective on syntactic patterns in natural languages" by Haitao Liu et al. (United States)

    Liu, Bingli; Chen, Xinying


    In the target article [1], Liu et al. provide an informative introduction to dependency distance studies and claim that syntactic patterns of language that relate to dependency distance are associated with human cognitive mechanisms, such as limited working memory and syntax processing. Therefore, such syntactic patterns are probably 'human-driven' language universals. Sufficient evidence based on big data analysis is also given in the article to support this idea. The hypotheses generally seem very convincing yet still need further tests from various perspectives. Diachronic linguistic study based on authentic language data, in our opinion, can be one of those 'further tests'.

  2. Task Effects on Linguistic Complexity and Accuracy: A Large-Scale Learner Corpus Analysis Employing Natural Language Processing Techniques (United States)

    Alexopoulou, Theodora; Michel, Marije; Murakami, Akira; Meurers, Detmar


    Large-scale learner corpora collected from online language learning platforms, such as the EF-Cambridge Open Language Database (EFCAMDAT), provide opportunities to analyze learner data at an unprecedented scale. However, interpreting the learner language in such corpora requires a precise understanding of tasks: How does the prompt and input of a…

  3. Language and human nature: Kurt Goldstein's neurolinguistic foundation of a holistic philosophy. (United States)

    Ludwig, David


    Holism in interwar Germany provides an excellent example for social and political influences on scientific developments. Deeply impressed by the ubiquitous invocation of a cultural crisis, biologists, physicians, and psychologists presented holistic accounts as an alternative to the "mechanistic worldview" of the nineteenth century. Although the ideological background of these accounts is often blatantly obvious, many holistic scientists did not content themselves with a general opposition to a mechanistic worldview but aimed at a rational foundation of their holistic projects. This article will discuss the work of Kurt Goldstein, who is known for both his groundbreaking contributions to neuropsychology and his holistic philosophy of human nature. By focusing on Goldstein's neurolinguistic research, I want to reconstruct the empirical foundations of his holistic program without ignoring its cultural background. In this sense, Goldstein's work provides a case study for the formation of a scientific theory through the complex interplay between specific empirical evidences and the general cultural developments of the Weimar Republic. © 2012 Wiley Periodicals, Inc.

  4. Evaluation of natural language processing from emergency department computerized medical records for intra-hospital syndromic surveillance

    Background: The identification of patients who pose an epidemic hazard when they are admitted to a health facility plays a role in preventing the risk of hospital-acquired infection. An automated clinical decision support system to detect suspected cases, based on the principle of syndromic surveillance, is being developed at the University of Lyon's Hôpital de la Croix-Rousse. This tool will analyse structured data and narrative reports from computerized emergency department (ED) medical records. The first step consists of developing an application (UrgIndex) which automatically extracts and encodes information found in narrative reports. The purpose of the present article is to describe and evaluate this natural language processing system. Methods: Narrative reports have to be pre-processed before utilizing the French-language medical multi-terminology indexer (ECMT) for standardized encoding. UrgIndex identifies and excludes syntagmas containing a negation and replaces non-standard terms (abbreviations, acronyms, spelling errors, etc.). Then, the phrases are sent to the ECMT through an Internet connection. The indexer's reply, based on Extensible Markup Language, returns codes and literals corresponding to the concepts found in phrases. UrgIndex filters codes corresponding to suspected infections. Recall is defined as the number of relevant processed medical concepts divided by the number of concepts evaluated (coded manually by the medical epidemiologist). Precision is defined as the number of relevant processed concepts divided by the number of concepts proposed by UrgIndex. Recall and precision were assessed for respiratory and cutaneous syndromes. Results: Evaluation of 1,674 processed medical concepts contained in 100 ED medical records (50 for respiratory syndromes and 50 for cutaneous syndromes) showed an overall recall of 85.8% (95% CI: 84.1-87.3). Recall varied from 84.5% for respiratory syndromes to 87.0% for cutaneous syndromes. The
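
    The recall and precision definitions given in the Methods can be written out directly. The sketch below is illustrative only: the function and parameter names are hypothetical, and the counts in the example are invented rather than taken from the study.

```python
def recall(relevant_processed, manually_coded):
    """Relevant processed concepts divided by concepts coded manually by the evaluator."""
    if manually_coded == 0:
        raise ValueError("no manually coded concepts to compare against")
    return relevant_processed / manually_coded


def precision(relevant_processed, proposed):
    """Relevant processed concepts divided by concepts proposed by the system."""
    if proposed == 0:
        raise ValueError("the system proposed no concepts")
    return relevant_processed / proposed


# Invented counts for illustration: 80 relevant concepts, 100 coded manually, 90 proposed.
example_recall = recall(80, 100)
example_precision = precision(80, 90)
```

    With these invented counts, recall is 0.8 and precision is 80/90; the two measures differ only in the denominator.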

  5. Corporate Politics on Polish Millennials

    Directory of Open Access Journals (Sweden)

    Natalia Roślik


    In the very beginning of this particular paper, the author tries to determine and describe who Millennials actually are. Then, the definition of Millennials is based on an analysis of corporations’ activity regarding this age group over the past years. The main goal of the thesis is to bring out their specific features and describe what corporations on the Polish job market are doing to encourage them to work in their offices. Especially in Poland within the last years, it is observed that big multinational companies are paying special attention to Millennials and trying to hire them before competitors do so. As a part of this paper, the author describes corporate politics and practices using the examples of Thomson Reuters and BNY Mellon. Within this work, the author also discusses key features of this generation and the differences between it and the generation of Millennials’ parents. Additionally, there is a reference to the corporate social responsibility concept and work-life balance issues.

  6. Planned experiments and corpus based research play a complementary role. Comment on "Dependency distance: A new perspective on syntactic patterns in natural languages" by Haitao Liu et al. (United States)

    Vasishth, Shravan


    This interesting and informative review by Liu and colleagues [17] in this issue covers the full spectrum of research on the idea that in natural language, dependency distance tends to be small. The authors discuss two distinct research threads: experimental work from psycholinguistics on online processes in comprehension and production, and text-corpus studies of dependency length distributions.

  7. Development of a user-friendly interface for the searching of a data base in natural language while using concepts and means of artificial intelligence

    International Nuclear Information System (INIS)

    Pujo, Pascal


    This research thesis aimed at the development of a natural-language-based user-friendly interface for the searching of relational data bases. The author first addresses how to store data which will be accessible through an interface in natural language: this organisation must result in as few constraints as possible in query formulation. He briefly presents techniques related to the automatic processing of natural language, and highlights the need for a more user-friendly interface. Then, he presents the developed interface and outlines the user-friendliness and ergonomics of implemented procedures. He shows how the interface has been designed to deliver information and explanations on its processing. This allows the user to control the relevance of the answer. He also indicates the classification of mistakes and errors which may be present in queries in natural language. He finally gives an overview of possible evolutions of the interface, and briefly presents deductive functionalities which could expand data management. The handling of complex objects is also addressed [fr]

  8. Natural language query system design for interactive information storage and retrieval systems. Presentation visuals. M.S. Thesis Final Report, 1 Jul. 1985 - 31 Dec. 1987 (United States)

    Dominick, Wayne D. (Editor); Liu, I-Hsiung


    This Working Paper Series entry represents a collection of presentation visuals associated with the companion report entitled Natural Language Query System Design for Interactive Information Storage and Retrieval Systems, USL/DBMS NASA/RECON Working Paper Series report number DBMS.NASA/RECON-17.

  9. What Has Personality and Emotional Intelligence to Do with "Feeling Different" while Using a Foreign Language? (United States)

    Ozanska-Ponikwia, Katarzyna


    The present study investigates the link between personality traits (OCEAN Personality test), emotional intelligence (EI) (Trait Emotional Intelligence Questionnaire) and the notion of "feeling different" while using a foreign language among 102 Polish-English bilinguals and Polish L2 users of English who were immersed in a foreign language and…

  10. Acculturation strategy and language experience in expert ESL speakers: An exploratory study

    Directory of Open Access Journals (Sweden)

    Acculturation and language proficiency have been found to be inter-related both from the perspective of second language acquisition (Schumann, 1978, 1986) and socio-psychological adaptation in cross-cultural contacts (Ward, Bochner, & Furnham, 2001). However, the predictions as to the effect of a particular strategy on success differ, with assimilation believed to create the most favourable conditions for SLA and integration for general well-being. The present study explores acculturation patterns in three expert users of English as a second language, recent Polish immigrants to the UK, in relation to their language experience. The qualitative data were collected with the use of a questionnaire and analysed with respect to language experience and socio-affective factors. The analysis aimed at better understanding of the relationship between language learning in a formal context and language use in a natural setting on the one hand and the relationship between language expertise and acculturation strategy choice on the other. The results show that in spite of individual differences, expert language users tend to adopt an assimilation rather than integration acculturation strategy. This may suggest that attitudes are related to expertise in English as a second language in a more conservative way than advocated by cross-cultural approaches.

  11. Smoking characteristics of Polish immigrants in Dublin

    Full Text Available Background: This study examined two main hypotheses: (a) Polish immigrants' smoking estimates are greater than their Irish counterparts; (b) Polish immigrants purchasing cigarettes from Poland smoke "heavier" (≥ 20 cigarettes a day) when compared to those purchasing cigarettes from Ireland. The study also set out to identify significant predictors of 'current' smoking (some days and everyday) among the Polish immigrants. Methods: Dublin residents of Polish origin (n = 1,545) completed a previously validated Polish questionnaire in response to an advertisement in a local Polish lifestyle magazine over 5 weekends (July–August, 2007). The Office of Tobacco Control telephone-based monthly survey data were analyzed for the Irish population in Dublin for the same period (n = 484). Results: Age-sex adjusted smoking estimates were: 47.6% (95% Confidence Interval [CI]: 47.3%; 48.0%) among the Poles and 27.8% (95% CI: 27.2%; 28.4%) among the general Irish population (p 24 months were significant predictors of current smoking among the Poles. An objective validation of the self-reported smoking history of a randomly selected sub-sample immigrant group, using expired carbon monoxide (CO) measurements, showed a highly significant correlation coefficient (r = 0.64) of expired CO levels with the reported number of cigarettes consumed (p Conclusion: Polish immigrants' smoking estimates are higher than their Irish counterparts, and particularly if employed, with only primary-level education, and are overseas >2 years.

  12. Automated extraction of lexical meanings from Polish corpora: potentialities and limitations

    Directory of Open Access Journals (Sweden)

    Maciej Piasecki


    Full Text Available Large corpora are often consulted by linguists as a knowledge source with respect to lexicon, morphology or syntax. However, there are also several methods of automated extraction of semantic properties of language units from corpora. In the paper we focus on emerging potentialities of these methods, as well as on their identified limitations. Evidence that can be collected from corpora is confronted with the existing models of formalised description of lexical meanings. Two basic paradigms of lexical semantics extraction are briefly described. Their properties are analysed on the basis of several experiments performed on Polish corpora. Several potential applications of the methods, including a system supporting expansion of a Polish wordnet, are discussed. Finally, perspectives on the potential further development are discussed.

  13. Language Policy, Language Choice and Language Use in the ...

    African Journals Online (AJOL)

    The paper examines the pros and cons of the checkered nature of language use in the Tanzanian Parliament. It focuses on language policy, language choice and the practicality of language use in parliamentary discourse. Right from the eve of independence, the medium of communication in the Tanzanian parliament has ...

  14. Linguistics in Language Education (United States)

    Kumar, Rajesh; Yunus, Reva


    This article looks at the contribution of insights from theoretical linguistics to an understanding of language acquisition and the nature of language in terms of their potential benefit to language education. We examine the ideas of innateness and universal language faculty, as well as multilingualism and the language-society relationship. Modern…

  15. Super-polishing of Zerodur aspheres by means of conventional polishing technology (United States)

    Polak, Jaroslav; Klepetková, Eva; Pošmourný, Josef; Šulc, Miroslav; Procháska, František; Tomka, David; Matoušek, Ondřej; Poláková, Ivana; Šubert, Eduard


    This paper describes a quest to find a simple technique to superpolish a Zerodur asphere (55μm departure from best fit sphere) that could be employed on an old-fashioned 1-excenter optical polishing machine. The work focuses on selection of polishing technology, study of different polishing slurries and optimization of the polishing setup. It is demonstrated that either by use of fine colloidal CeO2 slurry or by use of a bowl-feed polishing setup with CeO2 charged pitch we could reach 0.4nm RMS roughness while removing <30nm of surface layer. This technique, although not optimized, was successfully used to improve surface roughness on already prepolished Zerodur aspheres without the necessity to involve sophisticated super-polishing technology and highly trained manpower.

  16. Making Sense of Big Textual Data for Health Care: Findings from the Section on Clinical Natural Language Processing. (United States)

    Névéol, A; Zweigenbaum, P


    Objectives: To summarize recent research and present a selection of the best papers published in 2016 in the field of clinical Natural Language Processing (NLP). Method: A survey of the literature was performed by the two section editors of the IMIA Yearbook NLP section. Bibliographic databases were searched for papers with a focus on NLP efforts applied to clinical texts or aimed at a clinical outcome. Papers were automatically ranked and then manually reviewed based on titles and abstracts. A shortlist of candidate best papers was first selected by the section editors before being peer-reviewed by independent external reviewers. Results: The five clinical NLP best papers provide a contribution that ranges from emerging original foundational methods to transitioning solid established research results to a practical clinical setting. They offer a framework for abbreviation disambiguation and coreference resolution, a classification method to identify clinically useful sentences, an analysis of counseling conversations to improve support to patients with mental disorder and grounding of gradable adjectives. Conclusions: Clinical NLP continued to thrive in 2016, with an increasing number of contributions towards applications compared to fundamental methods. Fundamental work addresses increasingly complex problems such as lexical semantics, coreference resolution, and discourse analysis. Research results translate into freely available tools, mainly for English. Georg Thieme Verlag KG Stuttgart.

  17. Integrating natural language processing expertise with patient safety event review committees to improve the analysis of medication events. (United States)

    Fong, Allan; Harriott, Nicole; Walters, Donna M; Foley, Hanan; Morrissey, Richard; Ratwani, Raj R


    Many healthcare providers have implemented patient safety event reporting systems to better understand and improve patient safety. Reviewing and analyzing these reports is often time consuming and resource intensive because of both the quantity of reports and length of free-text descriptions in the reports. Natural language processing (NLP) experts collaborated with clinical experts on a patient safety committee to assist in the identification and analysis of medication related patient safety events. Different NLP algorithmic approaches were developed to identify four types of medication related patient safety events and the models were compared. Well performing NLP models were generated to categorize medication related events into pharmacy delivery delays, dispensing errors, Pyxis discrepancies, and prescriber errors with receiver operating characteristic areas under the curve of 0.96, 0.87, 0.96, and 0.81 respectively. We also found that modeling the brief without the resolution text generally improved model performance. These models were integrated into a dashboard visualization to support the patient safety committee review process. We demonstrate the capabilities of various NLP models and the use of two text inclusion strategies at categorizing medication related patient safety events. The NLP models and visualization could be used to improve the efficiency of patient safety event data review and analysis. Copyright © 2017 Elsevier B.V. All rights reserved.

  18. An Introduction to Natural Language Processing: How You Can Get More From Those Electronic Notes You Are Generating. (United States)

    Kimia, Amir A; Savova, Guergana; Landschaft, Assaf; Harper, Marvin B


    Electronically stored clinical documents may contain both structured data and unstructured data. The use of structured clinical data varies by facility, but clinicians are familiar with coded data such as International Classification of Diseases, Ninth Revision, Systematized Nomenclature of Medicine-Clinical Terms codes, and commonly other data including patient chief complaints or laboratory results. Most electronic health records have much more clinical information stored as unstructured data, for example, clinical narrative such as history of present illness, procedure notes, and clinical decision making are stored as unstructured data. Despite the importance of this information, electronic capture or retrieval of unstructured clinical data has been challenging. The field of natural language processing (NLP) is undergoing rapid development, and existing tools can be successfully used for quality improvement, research, healthcare coding, and even billing compliance. In this brief review, we provide examples of successful uses of NLP using emergency medicine physician visit notes for various projects and the challenges of retrieving specific data and finally present practical methods that can run on a standard personal computer as well as high-end state-of-the-art funded processes run by leading NLP informatics researchers.

  19. Rethinking information delivery: using a natural language processing application for point-of-care data discovery (United States)

    Workman, T. Elizabeth; Stoddart, Joan M


    Objective: This paper examines the use of Semantic MEDLINE, a natural language processing application enhanced with a statistical algorithm known as Combo, as a potential decision support tool for clinicians. Semantic MEDLINE summarizes text in PubMed citations, transforming it into compact declarations that are filtered according to a user's information need that can be displayed in a graphic interface. Integration of the Combo algorithm enables Semantic MEDLINE to deliver information salient to many diverse needs. Methods: The authors selected three disease topics and crafted PubMed search queries to retrieve citations addressing the prevention of these diseases. They then processed the citations with Semantic MEDLINE, with the Combo algorithm enhancement. To evaluate the results, they constructed a reference standard for each disease topic consisting of preventive interventions recommended by a commercial decision support tool. Results: Semantic MEDLINE with Combo produced an average recall of 79% in primary and secondary analyses, an average precision of 45%, and a final average F-score of 0.57. Conclusion: This new approach to point-of-care information delivery holds promise as a decision support tool for clinicians. Health sciences libraries could implement such technologies to deliver tailored information to their users. PMID:22514507

  20. Per-service supervised learning for identifying desired WoT apps from user requests in natural language. (United States)

    Yoon, Young


    Web of Things (WoT) platforms are growing fast, and so is the need for composing WoT apps more easily and efficiently. We have recently commenced a campaign to develop an interface where users can issue requests for WoT apps entirely in natural language. This requires an effort to build a system that can learn to identify relevant WoT functions that fulfill users' requests. In our preceding work, we trained a supervised learning system with thousands of publicly-available IFTTT app recipes based on conditional random fields (CRF). However, the sub-par accuracy and excessive training time motivated us to devise a better approach. In this paper, we present a novel solution that creates a separate learning engine for each trigger service. With this approach, parallel and incremental learning becomes possible. For inference, our system first identifies the most relevant trigger service for a given user request by using an information retrieval technique. Then, the learning engine associated with the trigger service predicts the most likely pair of trigger and action functions. We expect that such a two-phase inference method given parallel learning engines would improve the accuracy of identifying related WoT functions. We verify our new solution through an empirical evaluation with training and test sets sampled from a pool of refined IFTTT app recipes. We also meticulously analyze the characteristics of the recipes to find future research directions.

  1. Interpreting the Fuzzy Semantics of Natural-Language Spatial Relation Terms with the Fuzzy Random Forest Algorithm

    Directory of Open Access Journals (Sweden)

    Full Text Available Naïve Geography, intelligent geographical information systems (GIS), and spatial data mining especially from social media all rely on natural-language spatial relations (NLSR) terms to incorporate commonsense spatial knowledge into conventional GIS and to enhance the semantic interoperability of spatial information in social media data. Yet, the inherent fuzziness of NLSR terms makes them challenging to interpret. This study proposes to interpret the fuzzy semantics of NLSR terms using the fuzzy random forest (FRF) algorithm. Based on a large number of fuzzy samples acquired by transforming a set of crisp samples with the random forest algorithm, two FRF models with different membership assembling strategies are trained to obtain the fuzzy interpretation of three line-region geometric representations using 69 NLSR terms. Experimental results demonstrate that the two FRF models achieve good accuracy in interpreting line-region geometric representations using fuzzy NLSR terms. In addition, fuzzy classification of FRF can interpret the fuzzy semantics of NLSR terms more fully than their crisp counterparts.

  2. Informatics in radiology: RADTF: a semantic search-enabled, natural language processor-generated radiology teaching file. (United States)

    Do, Bao H; Wu, Andrew; Biswal, Sandip; Kamaya, Aya; Rubin, Daniel L


    Storing and retrieving radiology cases is an important activity for education and clinical research, but this process can be time-consuming. In the process of structuring reports and images into organized teaching files, incidental pathologic conditions not pertinent to the primary teaching point can be omitted, as when a user saves images of an aortic dissection case but disregards the incidental osteoid osteoma. An alternate strategy for identifying teaching cases is text search of reports in radiology information systems (RIS), but retrieved reports are unstructured, teaching-related content is not highlighted, and patient identifying information is not removed. Furthermore, searching unstructured reports requires sophisticated retrieval methods to achieve useful results. An open-source, RadLex®-compatible teaching file solution called RADTF, which uses natural language processing (NLP) methods to process radiology reports, was developed to create a searchable teaching resource from the RIS and the picture archiving and communication system (PACS). The NLP system extracts and de-identifies teaching-relevant statements from full reports to generate a stand-alone database, thus converting existing RIS archives into an on-demand source of teaching material. Using RADTF, the authors generated a semantic search-enabled, Web-based radiology archive containing over 700,000 cases with millions of images. RADTF combines a compact representation of the teaching-relevant content in radiology reports and a versatile search engine with the scale of the entire RIS-PACS collection of case material. ©RSNA, 2010

  3. Comparative study of performance of shoe polishes formulated from ...

    African Journals Online (AJOL)


    This should make the polish remain as discrete solid particles held mechanically within the leather. This work intends to explore the use of polyethylene pigment in the production of shoe polish. The shoe polish produced will be applied alongside shoe polish from carbon black (CI black Pigment 7) on finished black leather ...


    Full Text Available Solutions in the field of ecological architecture appear more and more often in Poland. There are two approaches to eco-design: high-tech and low-tech. High-tech focuses on the use of the latest technological solutions. These means are often used in newly designed commercial buildings, such as the first Polish office building which uses passive technology, built in Katowice, in Euro-Centrum Science and Technology Park. It is intended especially for companies focusing on energy observance issues. Low-tech is usually used in small-scale buildings (for example a cottage in Jartypory village), and is focused on the use of inexpensive, traditional technologies and the daily conscious management of natural resources. Thinking about the impact on the environment and principles of sustainable development is also present in urban planning. In Siewierz, near Katowice, Poland’s first eco-village is being built, with full infrastructure, high-quality residential buildings, shops, offices and hotels. The range of applied solutions will allow residents to use these buildings economically.

  5. Cross-language diversity, head-direction and grammars. Comment on "Dependency distance: A new perspective on syntactic patterns in natural languages" by Haitao Liu et al. (United States)

    Hudson, Richard


    This paper [4] - referred to below as 'LXL' - is an excellent example of cross-disciplinary work which brings together three very different disciplines, each with its different methods: quantitative computational linguistics (exploring big data), psycholinguistics (using experiments with human subjects) and theoretical linguistics (building models based on language descriptions). The measured unit is the dependency between two words, as defined by theoretical linguistics, and the question is how the length of this dependency affects the choices made by writers, as revealed in big data from a wide range of languages.

  6. New Environmental Practices in Polish Production Firms

    DEFF Research Database (Denmark)

    Kræmer, Trine Pipi


    Based on five case studies in Poland, the paper discusses how a specific environmental policy influences the firms' industrial environmental practices. The study illustrates how the Polish environmental policy, dominated by environmental charges on emissions, is extremely effective in improving...

  7. Electro Polishing of Niobium Cavities at DESY

    CERN Document Server

    Matheisen, A; Morales, H; Petersen, B; Schmoekel, M; Steinhau-Kühl, N


    At DESY a facility for electro polishing (EP) of the super conducting (s.c.) TESLA/TTF cavities has been built and has been operational since summer 2003. The EP infrastructure is capable of handling single-cell structures and the standard TESLA/TTF nine-cell cavities. Several electro polishing processes have been carried out since then, and acceleration voltages up to 40 MV/m have been reached in nine-cell structures. We report on measurements and experiences gained since 2003 as well as on handling procedures developed for the preparation of electro polished resonators. Specific data like heat production, variation of current density and bath aging will be presented. Another important point for reproducible results is the quality control of the electro polishing process. First quality control steps to be implemented in the EP procedure for large-scale production will be described.

  8. Semiological analysis of Polish theater posters

    Full Text Available Through the application of semiological analysis to theater posters made by two Polish authors, the paper uncovers signs, meanings, codes and specifics of the „Polish school of poster-making“ and contemporary Polish posters. Aside from this, I suggest a methodological framework for studying the issue of coding and shaping a theater poster as a culturally specific form of visual communication. The aesthetic and semiotic outlook of the Polish theater posters which were chosen is analyzed using the semiological method which highlights their differences and similarities. By pointing out the system of codes through the examples given here, a graphic designer is informed about the existence and the possibilities of a more systematic approach to shaping a theater poster.

  9. Report on Polish Public Libraries 2001


    Bednarek-Michalska, Bożena; Szatkowska, Olga


    The report is an overview of Polish public libraries and their services to a variety of communities. It also provides statistics, an explanation of the funding policy for public libraries in Poland, and information about the governance of public libraries in the country.

  10. Electrolytic polishing system for space age materials

    International Nuclear Information System (INIS)

    Coons, W.C.; Iosty, L.R.


    A simple electrolytic polishing technique was developed for preparing Cr, Co, Hf, Mo, Ni, Re, Ti, V, Zr, and their alloys for structural analysis on the optical microscope. The base electrolyte contains 5 g ZnCl2 and 15 g AlCl3·6H2O in 200 ml methyl alcohol, plus an amount of H2SO4 depending on the metal being polished. Five etchants are listed.

  11. Trace element analysis of nail polishes

    International Nuclear Information System (INIS)

    Misra, G.; Mittal, V.K.; Sahota, H.S.


    The instrumental neutron activation analysis (INAA) technique was used to measure the concentrations of various trace elements in nail polishes of popular Indian and foreign brands. The aim of the present experiment was to see whether trace elements could distinguish nail polishes of different Indian and foreign brands from a forensic point of view. It was found that cesium can act as a marker to differentiate foreign and Indian brands. (author)

  12. Interlocking directorships in Polish joint stock companies:


    Pawlak, Marek


    Studies concerning interlocking directorships have been carried out among Polish joint stock corporations. The main source of data have been the announcements that are to be published by corporations regularly in a journal called Business and Court Gazette (BCG). Interlocking directorships constitute a network among corporations the use of which enables co-ordinated management of the whole group. The phenomenon of interlocking directorships in Polish joint stock companies can be compared to t...

  13. Jewish problem in the Polish Communist Party

    Directory of Open Access Journals (Sweden)

    Cimek Henryk


    Full Text Available Jews accounted for approx. 8-10% of the population of the Second Republic, and in the communist movement (Polish Communist Party and Polish Communist Youth Union) the rate was approx. 30%, while in subsequent years it fluctuated considerably. The percentage of Jews was the highest in the authorities of the party and in the KZMP. This had a negative impact on the position of the KPP on many issues, especially in its relation to the Second Republic.

  14. Attack polish for nickel-base alloys and stainless steels (United States)

    Not Available


    A chemical attack polish and polishing procedure for use on metal surfaces such as nickel base alloys and stainless steels is described. The chemical attack polish comprises FeNO3, concentrated CH3COOH, concentrated H2SO4 and H2O. The polishing procedure includes saturating a polishing cloth with the chemical attack polish and submicron abrasive particles and buffing the metal surface.

  15. Experimental Polish-Lithuanian Corpus with the Semantic Annotation Elements

    Full Text Available In the article the authors present the experimental Polish-Lithuanian corpus (ECorpPL-LT) formed for the idea of Polish-Lithuanian theoretical contrastive studies, a Polish-Lithuanian electronic dictionary, and as help for a sworn translator. The semantic annotation being brought into ECorpPL-LT is extremely useful in Polish-Lithuanian contrastive studies, and also proves helpful in translation work.

  16. Il futurismo polacco nella critica letteraria dell’epoca [Polish Futurism in Literary Criticism of the Early Twentieth Century]

    Directory of Open Access Journals (Sweden)

    Andrea F. De Carlo


    Full Text Available The article analyses the critical voices raised against the young poets and artists who promoted Futurism in Poland during the first half of the Twentieth century. Futurist manifestos influenced the new Polish poetry, stimulating a lively debate among intellectuals of the calibre of Stefan Żeromski and Karol Irzykowski. In general, the coeval criticism of Polish Futurism focused on three main points: the lack of originality and servile imitation of foreign literary models; the repudiation of the past and national traditions; Futurism as an expression of ideologies such as Fascism in Italy and Bolshevism in Russia. In this article, specific attention is devoted to an analysis of the essay Snobizm i postęp (Snobbery and Progress, 1923) by Żeromski. The writer, criticising Polish imitators of Russian Futurism, affirmed that Polish literature and culture, in the context of national reconstruction after three partitions of Poland, needed to maintain its natural connection with the past and at the same time, without losing its national nature, to weave some universal suggestions into the plot of purely Polish themes. The goal of this article is to reveal that Żeromski and Irzykowski’s critical stance towards the Polish Futurists, which influenced the critics of the next generation, was dictated by a shallow analysis of Futuristic works and by their inability to understand Futuristic efforts to modernise Polish art and literature.

  17. Conformal polishing approach: Tool footprint analysis

    Directory of Open Access Journals (Sweden)

    José A Dieste


    Full Text Available The polishing process is one of the most critical manufacturing processes in metal part production because it determines the final quality of the product. Free-form surface polishing is a manual process with many rejected parts, scrap generation, and high time and energy consumption. Two different research lines are being developed: prediction models of the final surface quality parameters and an analysis of the amount of material removed depending on the polishing parameters to predict the tool footprint during the polishing task. This research lays the foundations for a future automatic conformal polishing system. It is based on a rotational and translational tool with dry abrasive on the front, mounted at the end of a robot. A tool-to-part concept is used, which is useful for large or heavy workpieces. Results are applied to different curved parts typically used in the tooling, aeronautics and automotive industries. A mathematical model has been developed to predict the amount of material removed as a function of the polishing parameters. The model has been fitted for different abrasives and raw materials. Results have shown deviations under 20%, which implies a reliable and controllable process. Smaller amounts of material can be removed in controlled areas of a three-dimensional workpiece.

  18. Laser polishing of additive manufactured Ti alloys (United States)

    Ma, C. P.; Guan, Y. C.; Zhou, W.


    Laser-based additive manufacturing has attracted much attention as a promising 3D printing method for metallic components in recent years. However, surface roughness of additive manufactured components has been considered as a challenge to achieve high performance. In this work, we demonstrate the capability of fiber laser in polishing rough surface of additive manufactured Ti-based alloys such as Ti-6Al-4V and TC11. Both as-received surface and laser-polished surfaces as well as cross-section subsurfaces were analyzed carefully by White-Light Interference, Confocal Microscope, Focused Ion Beam, Scanning Electron Microscopy, Energy Dispersive Spectrometer, and X-ray Diffraction. Results revealed that the surface roughness of as-received Ti-based alloys, initially more than 5 μm, could be reduced to less than 1 μm through the laser polishing process. Moreover, microstructure, microhardness and wear resistance of the laser-polished zone were investigated in order to examine the thermal effect of laser polishing processing on the substrate of additive manufactured Ti alloys. This proof-of-concept process has the potential to effectively improve the surface roughness of additive manufactured metallic alloy by local polishing method without damage to the substrate.

  19. The Importance of Natural Change in Planning School-Based Intervention for Children with Developmental Language Impairment (DLI) (United States)

    Botting, Nicola; Gaynor, Marguerite; Tucker, Katie; Orchard-Lisle, Ginnie


    Some reports suggest that there is an increase in the number of children identified as having developmental language impairment (Bercow, 2008), yet resource issues have meant that many speech and language therapy services have compromised provision in some way. Thus, efficient ways of identifying need and prioritizing intervention are required.…

  20. Understanding Language in Education and Grade 4 Reading Performance Using a "Natural Experiment" of Botswana and South Africa (United States)

    Shepherd, Debra Lynne


    The regional and cultural closeness of Botswana and South Africa, as well as differences in their political histories and language policy stances, offers a unique opportunity to evaluate the role of language in reading outcomes. This study aims to empirically test the effect of exposure to mother tongue and English instruction on the reading…

  1. Evaluation of the effect of polishing on flexural strength of feldspathic porcelain and its comparison with autoglazing and over glazing

    Directory of Open Access Journals (Sweden)

    Jalali H.


    Full Text Available Statement of Problem: Ceramic restorations are popular because they can provide the most natural replacement for teeth. However, the brittleness of ceramics is a primary disadvantage. There are various methods for strengthening ceramics such as metal framework, ceramic cores, and surface strengthening mechanisms through glazing, work hardening and ion exchange. Purpose: The purpose of this study was to evaluate the effect of polish on flexural strength of feldspathic porcelain and to compare it with overglaze and autoglaze. Materials and Methods: In this experimental study, one brand of feldspathic porcelain (colorlogic, Ceramco) was used and forty bars (25×6×3 mm) were prepared according to ISO 6872 and ADA No. 69. The specimens were randomly divided into four groups: overglazed, autoglazed, fine polish and coarse polish (clinic polish). Flexural strength of each specimen was determined by three-point bending test (Universal Testing Machine, Zwick 1494, Germany). Collected data were analyzed by ANOVA and post-hoc test with P<0.05 as the limit of significance. Results: A significant difference was observed among the studied groups (P<0.0001). According to the post-hoc test, flexural strength in the overglaze and fine polish groups was significantly higher than in the clinic polish and autoglaze groups (P<0.001). Although the mean value for the overglazed group was higher than for the fine polish group, this was not statistically significant (P=0.9). Also, no statistical difference was seen between the autoglazed and coarse polish groups (P=0.2). Conclusion: Based on the findings of this study, flexural strength achieved by fine polish (as used in this study) can compete with overglazing of feldspathic porcelains. It can also be concluded that a final finishing procedure that involves fine polishing may be preferred to simple staining followed by self-glazing.

  2. Specialized languages

    Mousten, Birthe; Laursen, Anne Lise


    -disciplinarily, because they work with both derivative and contributory approaches. Derivative, because specialized language retrieves its philosophy of science as well as methods from both the natural sciences, social sciences and humanistic sciences. Contributory because language results support the communication...... science fields communicate their findings. With this article, we want to create awareness of the work in this special area of language studies and of the inherent cross-disciplinarity that makes LSP special compared to common-core language. An acknowledgement of the importance of this field both in terms...

  3. Facilitating pharmacogenetic studies using electronic health records and natural-language processing: a case study of warfarin. (United States)

    Xu, Hua; Jiang, Min; Oetjens, Matt; Bowton, Erica A; Ramirez, Andrea H; Jeff, Janina M; Basford, Melissa A; Pulley, Jill M; Cowan, James D; Wang, Xiaoming; Ritchie, Marylyn D; Masys, Daniel R; Roden, Dan M; Crawford, Dana C; Denny, Joshua C


    DNA biobanks linked to comprehensive electronic health records systems are potentially powerful resources for pharmacogenetic studies. This study sought to develop natural-language-processing algorithms to extract drug-dose information from clinical text, and to assess the capabilities of such tools to automate the data-extraction process for pharmacogenetic studies. A manually validated warfarin pharmacogenetic study identified a cohort of 1125 patients with a stable warfarin dose, in which 776 patients were managed by Coumadin Clinic physicians, and the remaining 349 patients were managed by their providers. The authors developed two algorithms to extract weekly warfarin doses from both data sets: a regular expression-based program for semistructured Coumadin Clinic notes; and an advanced weekly dose calculator based on an existing medication information extraction system (MedEx) for narrative providers' notes. The authors then conducted an association analysis between an automatically extracted stable weekly dose of warfarin and four genetic variants of VKORC1 and CYP2C9 genes. The performance of the weekly dose-extraction program was evaluated by comparing it with a gold standard containing manually curated weekly doses. Precision, recall, F-measure, and overall accuracy were reported. Associations between known variants in VKORC1 and CYP2C9 and warfarin stable weekly dose were performed with linear regression adjusted for age, gender, and body mass index. The authors' evaluation showed that the MedEx-based system could determine patients' warfarin weekly doses with 99.7% recall, 90.8% precision, and 93.8% accuracy. Using the automatically extracted weekly doses of warfarin, the authors successfully replicated the previous known associations between warfarin stable dose and genetic variants in VKORC1 and CYP2C9.
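
The evaluation described above compares extracted doses with a manually curated gold standard and reports precision, recall, F-measure and accuracy. A minimal sketch of how these four figures follow from confusion counts is given here. The counts are invented for illustration and are not the study's data, and the abstract does not say exactly how the authors defined accuracy, so the usual (TP + TN) / total definition is assumed.

```python
def extraction_metrics(tp, fp, fn, tn=0):
    """Return precision, recall, F-measure and accuracy from confusion counts."""
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    if precision + recall:
        f_measure = 2 * precision * recall / (precision + recall)
    else:
        f_measure = 0.0
    total = tp + fp + fn + tn
    accuracy = (tp + tn) / total if total else 0.0
    return {"precision": precision, "recall": recall,
            "f_measure": f_measure, "accuracy": accuracy}


# Illustrative counts only (hypothetical, not from the warfarin study).
metrics = extraction_metrics(tp=90, fp=10, fn=30, tn=70)
for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```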

  4. The Common Alerting Protocol (CAP) and Emergency Data Exchange Language (EDXL) - Application in Early Warning Systems for Natural Hazard (United States)

    Lendholt, Matthias; Hammitzsch, Martin; Wächter, Joachim


    The Common Alerting Protocol (CAP) [1] is an XML-based data format for exchanging public warnings and emergencies between alerting technologies. In conjunction with the Emergency Data Exchange Language (EDXL) Distribution Element (-DE) [2] these data formats can be used for warning message dissemination in early warning systems for natural hazards. Application took place in the DEWS (Distance Early Warning System) [3] project where CAP serves as central message format containing both human readable warnings and structured data for automatic processing by message receivers. In particular the spatial reference capabilities are of paramount importance both in CAP and EDXL. Affected areas are addressable via geo codes like HASC (Hierarchical Administrative Subdivision Codes) [4] or UN/LOCODE [5] but also with arbitrary polygons that can be directly generated out of GML [6]. For each affected area standardized criticality values (urgency, severity and certainty) have to be set but also application specific key-value-pairs like estimated time of arrival or maximum inundation height can be specified. This enables - together with multilingualism, message aggregation and message conversion for different dissemination channels - the generation of user-specific tailored warning messages. [1] CAP, [2] EDXL-DE, [3] DEWS, [4] HASC, "Administrative Subdivisions of Countries: A Comprehensive World Reference, 1900 Through 1998" ISBN 0-7864-0729-8 [5] UN/LOCODE, [6] GML,
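
To make the message structure described above more concrete, the following sketch builds a small CAP-style alert with Python's standard library. Element names and the namespace follow the CAP 1.2 schema as far as the author of this sketch is aware; it is not the DEWS implementation. Every value in it (identifier, sender, event, coordinates, geocode, and the parameter names `estimatedTimeOfArrival` and `maxInundationHeight`) is a hypothetical placeholder.

```python
import xml.etree.ElementTree as ET

CAP_NS = "urn:oasis:names:tc:emergency:cap:1.2"  # CAP 1.2 XML namespace


def _child(parent, tag, text=None):
    """Append a namespaced child element, optionally with text content."""
    node = ET.SubElement(parent, f"{{{CAP_NS}}}{tag}")
    if text is not None:
        node.text = text
    return node


def build_cap_alert(identifier, sender, sent, event, urgency, severity,
                    certainty, area_desc, polygon, geocodes, parameters):
    """Serialize one alert with a single info block and a single area."""
    ET.register_namespace("", CAP_NS)
    alert = ET.Element(f"{{{CAP_NS}}}alert")
    _child(alert, "identifier", identifier)
    _child(alert, "sender", sender)
    _child(alert, "sent", sent)
    _child(alert, "status", "Exercise")   # not a real warning
    _child(alert, "msgType", "Alert")
    _child(alert, "scope", "Public")

    info = _child(alert, "info")
    _child(info, "category", "Geo")
    _child(info, "event", event)
    # Standardized criticality values for the affected area.
    _child(info, "urgency", urgency)
    _child(info, "severity", severity)
    _child(info, "certainty", certainty)
    # Application-specific key-value pairs.
    for name, value in parameters.items():
        param = _child(info, "parameter")
        _child(param, "valueName", name)
        _child(param, "value", value)

    area = _child(info, "area")
    _child(area, "areaDesc", area_desc)
    _child(area, "polygon", polygon)      # "lat,lon" pairs, closed ring
    for name, value in geocodes.items():
        code = _child(area, "geocode")
        _child(code, "valueName", name)
        _child(code, "value", value)
    return ET.tostring(alert, encoding="unicode")


# All values below are placeholders chosen for illustration.
xml_text = build_cap_alert(
    identifier="example-alert-0001",
    sender="warning-centre@example.org",
    sent="2010-01-01T00:00:00+00:00",
    event="Tsunami",
    urgency="Immediate", severity="Extreme", certainty="Likely",
    area_desc="Example coastal district",
    polygon="0.0,100.0 0.0,101.0 -1.0,101.0 -1.0,100.0 0.0,100.0",
    geocodes={"HASC": "XX.YY"},
    parameters={"estimatedTimeOfArrival": "2010-01-01T00:30:00+00:00",
                "maxInundationHeight": "2.5"},
)
print(xml_text)
```

Keeping criticality values and key-value parameters as structured elements, rather than only in free text, is what lets receivers filter and convert messages automatically.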

  5. Specificity of Geotechnical Measurements and Practice of Polish Offshore Operations

    Directory of Open Access Journals (Sweden)

    Bogumil Laczynski


    Full Text Available As the offshore market in Europe grows ever faster, new sea areas are being managed and new ideas on how to use the sea's potential are being developed. In the North Sea, where the offshore industry has been expanding intensively since the late 1960s, numerous wind farms, oil and gas platforms and pipelines have been put into operation following extensive research, including geotechnical measurements. Recently, a great number of similar projects have been under development in the Baltic Sea, inter alia in the Polish EEZ, whose natural conditions differ significantly from those of the North Sea. In this paper, those differences are described together with some solutions to the problems arising from them.

  6. Polish Listening SPAN: A new tool for measuring verbal working memory

    Directory of Open Access Journals (Sweden)

    Katarzyna Zychowicz


    Full Text Available Individual differences in second language acquisition (SLA) encompass differences in working memory capacity, which is believed to be one of the most crucial factors influencing language learning. However, in Poland research on the role of working memory in SLA is scarce due to a lack of proper Polish instruments for measuring this construct. The purpose of this paper is to discuss the process of construction and validation of the Polish Listening Span (PLSPAN) as a tool intended to measure verbal working memory of adults. The article presents the requisite theoretical background as well as information about the PLSPAN, that is, the structure of the test, the scoring procedures and the steps taken with the aim of validating it.

  7. Five Martyr Brothers. First Polish hermits and their worship

    Directory of Open Access Journals (Sweden)

    Kinga Blaschke


    Full Text Available Brothers Benedict and John, students of Romuald, came to Poland at the invitation of Otto III to convert pagans. Soon the Italian hermits were joined by Polish brothers Isaac and Matthew, who helped them in learning the Slavic language. The hermits, as well as Christinus, were killed in 1003 by thugs who wanted to steal money given by Duke Boleslav for an expedition to Rome, which was aimed at obtaining papal consent for conducting missionary work. Although the hermits died as victims of a robbery, killed by fellow Christians, the pope canonized them as martyrs. Their lives are relatively well-documented: the earliest and the most credible story of the five brothers, by Bruno of Querfurt, was written as early as five years after their death, although it remained unknown until 1883. Another early account is the life of St. Romuald by Piotr Damiani of 1041. The martyrs have also been associated with yet another mysterious work: a gravestone unearthed in 1959 at the external wall of the north Roman apse of the Gniezno Cathedral, considered by most researchers the oldest epigraphic item on Polish soil. However, the identification of the warriors mentioned in the inscription with the 11th century martyrs raises many doubts. The article discusses the above matters, as well as the development of the worship of the martyr brothers.

  8. Publishing Academic Texts in English: A Polish Perspective (United States)

    Duszak, Anna; Lewkowicz, Jo


    The language in which to publish is a complex issue for academics in Poland. With the growth of English as the global lingua franca it may appear to be the obvious language of choice. Yet, publishing in English inevitably brings with it linguistic challenges. It also raises concerns of a social and ideological nature. Choosing to publish in Polish…

  9. Polishing for glass ceramics: which protocol? (United States)

    Silva, Tânia Mara da; Salvia, Ana Carolina Rodrigues Danzi; Carvalho, Rodrigo Furtado de; Pagani, Clovis; Rocha, Daniel Maranha da; Silva, Eduardo Galera da


    Adjustments to ceramic restorations are sometimes necessary to correct occlusion or inadequate contours, or to improve esthetics. Clinically, the surfaces are worn by fine-grinding diamond burs, which remove the superficial glazing layer. Several materials for ceramic polishing have been used in an attempt to reach a satisfactory surface smoothness. The aim of this study was to perform a literature review on different polishing protocols of several dental ceramics. This is a literature review performed through scientific articles published between 2004 and 2012, indexed in MEDLINE, PubMed and Scielo databases. The study selected and analyzed a total of 20 relevant articles that evaluated different types of ceramics, polishing treatment and surface roughness. After an extensive literature review, this study observed: 1 - after the rupture of the glazing layer due to adjustments of the restorations, the best choice for polishing the surface will depend on the type of ceramic used; 2 - the glazing procedure provides excellent results with regard to surface smoothness; however, if reglazing is impossible, either abrasive rubber cups/points or sandpaper discs followed by diamond polishing pastes result in a satisfactory surface smoothness; 3 - clinical studies that take into account the behavior of the polishing protocols are scarce and should be encouraged; 4 - the large number of variables that influence the final outcome of polishing should be considered. Methodologies need to be standardized to enable comparison among studies. Copyright © 2014 Japan Prosthodontic Society. Published by Elsevier Ltd. All rights reserved.

  10. Smoking characteristics of Polish immigrants in Dublin.

    Kabir, Zubair


    BACKGROUND: This study examined two main hypotheses: (a) Polish immigrants' smoking estimates are greater than those of their Irish counterparts; (b) Polish immigrants purchasing cigarettes from Poland smoke "heavier" (≥ 20 cigarettes a day) than those purchasing cigarettes from Ireland. The study also set out to identify significant predictors of 'current' smoking (some days and everyday) among the Polish immigrants. METHODS: Dublin residents of Polish origin (n = 1,545) completed a previously validated Polish questionnaire in response to an advertisement in a local Polish lifestyle magazine over 5 weekends (July-August, 2007). The Office of Tobacco Control telephone-based monthly survey data were analyzed for the Irish population in Dublin for the same period (n = 484). RESULTS: Age-sex adjusted smoking estimates were: 47.6% (95% Confidence Interval [CI]: 47.3%; 48.0%) among the Poles and 27.8% (95% CI: 27.2%; 28.4%) among the general Irish population (p < 0.001). Of the 57% of smokers (n = 345/606) who purchased cigarettes solely from Poland and the 33% (n = 198/606) who purchased only from Ireland, 42.6% (n = 147/345) and 41.4% (n = 82/198) were "heavy" smokers, respectively (p = 0.79). Employment (Odds Ratio [OR]: 2.89; 95% CI: 1.25-6.69), lower education (OR: 3.76; 95% CI: 2.46-5.74), and a longer stay in Ireland (>24 months) were significant predictors of current smoking among the Poles. An objective validation of the self-reported smoking history of a randomly selected sub-sample immigrant group, using expired carbon monoxide (CO) measurements, showed a highly significant correlation coefficient (r = 0.64) of expired CO levels with the reported number of cigarettes consumed (p < 0.0001). CONCLUSION: Polish immigrants' smoking estimates are higher than those of their Irish counterparts, particularly among those who are employed, have only primary-level education, and have been overseas for more than 2 years.
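
The comparison of "heavy" smokers between the two purchasing groups can be checked with a simple calculation. The abstract does not state which test the authors used; the sketch below assumes a pooled two-proportion z-test (an assumption, chosen because it needs only the reported counts) and applies it to the counts given in the RESULTS.

```python
import math


def two_proportion_z_test(x1, n1, x2, n2):
    """Pooled two-sided z-test for the difference between two proportions."""
    p1, p2 = x1 / n1, x2 / n2
    pooled = (x1 + x2) / (n1 + n2)
    se = math.sqrt(pooled * (1 - pooled) * (1 / n1 + 1 / n2))
    z = (p1 - p2) / se
    # Two-sided p-value from the standard normal distribution.
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return z, p_value


# Counts reported in the abstract: heavy smokers among those buying
# cigarettes only from Poland (147/345) and only from Ireland (82/198).
z, p_value = two_proportion_z_test(147, 345, 82, 198)
print(f"z = {z:.2f}, p = {p_value:.2f}")
```

Under this assumed test the p-value rounds to 0.79, in line with the reported figure, which supports the reading that the two groups do not differ in the share of heavy smokers.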

  11. The acquisition of gender and case in Polish and Russian: A study of monolingual and bilingual children

    Janssen, B.E.


    Polish and Russian are typologically closely related Slavic languages that have highly comparable nominal morphology within their gender and case systems in their written form. In their spoken form, however, they show crucial differences, specifically in the phonetic realisation of unstressed

  12. Automated identification of wound information in clinical notes of patients with heart diseases: Developing and validating a natural language processing application. (United States)

    Topaz, Maxim; Lai, Kenneth; Dowding, Dawn; Lei, Victor J; Zisberg, Anna; Bowles, Kathryn H; Zhou, Li


    Electronic health records are being increasingly used by nurses with up to 80% of the health data recorded as free text. However, only a few studies have developed nursing-relevant tools that help busy clinicians to identify information they need at the point of care. This study developed and validated one of the first automated natural language processing applications to extract wound information (wound type, pressure ulcer stage, wound size, anatomic location, and wound treatment) from free text clinical notes. First, two human annotators manually reviewed a purposeful training sample (n=360) and random test sample (n=1100) of clinical notes (including 50% discharge summaries and 50% outpatient notes), identified wound cases, and created a gold standard dataset. We then trained and tested our natural language processing system (known as MTERMS) to process the wound information. Finally, we assessed our automated approach by comparing system-generated findings against the gold standard. We also compared the prevalence of wound cases identified from free-text data with coded diagnoses in the structured data. The testing dataset included 101 notes (9.2%) with wound information. The overall system performance was good (F-measure, a composite measure of the system's accuracy, was 92.7%), with best results for wound treatment (F-measure=95.7%) and poorest results for wound size (F-measure=81.9%). Only 46.5% of wound notes had a structured code for a wound diagnosis. The natural language processing system achieved good performance on a subset of randomly selected discharge summaries and outpatient notes. In more than half of the wound notes, there were no coded wound diagnoses, which highlights the significance of using natural language processing to enrich clinical decision making. Our future steps will include expansion of the application's information coverage to other relevant wound factors and validation of the model with external data. Copyright © 2016 Elsevier Ltd. All

  13. Development of a user friendly interface for database querying in natural language by using concepts and means related to artificial intelligence

    International Nuclear Information System (INIS)

    Pujo, Pascal


    This research thesis reports the development of a user-friendly natural language interface for querying a relational database. The developed system differs from usual approaches in its integrated architecture, as the relational model management is totally controlled by the interface. The author first addresses how to store data so that they are accessible through a natural language interface, and more precisely how to organise the stored data so as to impose the fewest possible constraints on query formulation. The author then briefly presents techniques related to automatic natural language processing, and discusses their implications for user-friendliness and for error processing. The next part reports the study of the developed interface: selection of data processing tools, interface development, data management at the interface level, information input by the user. The last chapter proposes an overview of possible evolutions for the interface: use of deductive functionalities, use of an extensional base and of an intensional base to deduce facts from knowledge stored in the extensional base, and handling of complex objects [fr

  14. A corpus of full-text journal articles is a robust evaluation tool for revealing differences in performance of biomedical natural language processing tools (United States)


    Background We introduce the linguistic annotation of a corpus of 97 full-text biomedical publications, known as the Colorado Richly Annotated Full Text (CRAFT) corpus. We further assess the performance of existing tools for performing sentence splitting, tokenization, syntactic parsing, and named entity recognition on this corpus. Results Many biomedical natural language processing systems demonstrated large differences between their previously published results and their performance on the CRAFT corpus when tested with the publicly available models or rule sets. Trainable systems differed widely with respect to their ability to build high-performing models based on this data. Conclusions The finding that some systems were able to train high-performing models based on this corpus is additional evidence, beyond high inter-annotator agreement, that the quality of the CRAFT corpus is high. The overall poor performance of various systems indicates that considerable work needs to be done to enable natural language processing systems to work well when the input is full-text journal articles. The CRAFT corpus provides a valuable resource to the biomedical natural language processing community for evaluation and training of new models for biomedical full text publications. PMID:22901054

  15. Web 2.0-based crowdsourcing for high-quality gold standard development in clinical natural language processing. (United States)

    Zhai, Haijun; Lingren, Todd; Deleger, Louise; Li, Qi; Kaiser, Megan; Stoutenborough, Laura; Solti, Imre


    A high-quality gold standard is vital for supervised, machine learning-based, clinical natural language processing (NLP) systems. In clinical NLP projects, expert annotators traditionally create the gold standard. However, traditional annotation is expensive and time-consuming. To reduce the cost of annotation, general NLP projects have turned to crowdsourcing based on Web 2.0 technology, which involves submitting smaller subtasks to a coordinated marketplace of workers on the Internet. Many studies have been conducted in the area of crowdsourcing, but only a few have focused on tasks in the general NLP field and only a handful in the biomedical domain, usually based upon very small pilot sample sizes. In addition, the quality of the crowdsourced biomedical NLP corpora was never exceptional when compared to traditionally-developed gold standards. The previously reported results on a medical named entity annotation task showed a 0.68 F-measure based agreement between crowdsourced and traditionally-developed corpora. Building upon previous work from the general crowdsourcing research, this study investigated the usability of crowdsourcing in the clinical NLP domain with special emphasis on achieving high agreement between crowdsourced and traditionally-developed corpora. To build the gold standard for evaluating the crowdsourcing workers' performance, 1042 clinical trial announcements (CTAs) from the website were randomly selected and double annotated for medication names, medication types, and linked attributes. For the experiments, we used CrowdFlower, an Amazon Mechanical Turk-based crowdsourcing platform. We calculated sensitivity, precision, and F-measure to evaluate the quality of the crowd's work and tested the statistical significance (P […] crowdsourced and traditionally-developed annotations. The agreement between the crowd's annotations and the traditionally-generated corpora was high for: (1) annotations (0.87, F-measure for medication names

  16. APS 3D: a new benchmark in aspherical polishing (United States)

    Gauch, Daniel; Mikulic, Dalibor; Veit, Christian


    The APS 3D system performs polishing and form correction in one step in order to reduce overall process time, reduce the number of polishing steps required and eliminate the need for highly skilled operators while providing a repeatable polishing process. This new 3D Polishing system yields better surface quality, and a better slope error, automatically determining the optimum speeds, feed rates and polish pressures to achieve a deterministic process based on the required quality parameters input by the operator. The process flow is always the same to ensure consistent quality and target quality values are defined before polishing begins.

  17. The formation of the polish opposite movement in Western Ukraine at the beginning of the Second world war

    Full Text Available The article highlights the nature of the Soviet totalitarian ethnic policy and its influence on the origin of the Polish opposition movement in Western Ukraine at the beginning of the Second World War. It also clarifies the main factors behind the formation of active opposition movements among the Polish population in the territory of Western Ukraine, which passed to the Soviet Union when Poland was partitioned under the Molotov-Ribbentrop Pact. The author defines the categories of persons of Polish nationality who were dissatisfied with Stalin's repressive policies in 1939-1941 and who became the social environment in which opposition to totalitarianism, and in particular to the regime's anti-national policy toward the Polish ethnic group, finally took shape, and from which the activists of this movement later emerged. The author analyzes the activity of the Soviet authorities in the occupied territories of the former «Wshodnih kresuv» (Eastern Borderlands) of the Second Polish Republic and determines the main factors that led to dissatisfaction with the rigid Soviet policy against former government officials, military settlers and servants of the Roman Catholic Church. The features of the formation of the Polish opposition movement in the former eastern Polish territories, occupied in 1939 by the Soviet Union and seized in 1941 by Nazi Germany, are investigated and determined. The article also describes the origin and activity of the first underground Polish armed forces on Ukrainian territory.

  18. Towards Recognition of Spatial Relations between Entities for Polish

    Full Text Available In this paper, the problem of spatial relation recognition in Polish is examined. We present the different ways of distributing spatial information throughout a sentence by reviewing the lexical and grammatical signals of various relations between objects. We focus on the spatial usage of prepositions and their meaning, determined by the ‘conceptual’ schemes they constitute. We also discuss the feasibility of a comprehensive recognition of spatial relations between objects expressed in different ways by reviewing the existing tools and resources for text processing in Polish. As a result, we propose a heuristic method for the recognition of spatial relations expressed in various phrase structures called spatial expressions. We propose a definition of spatial expressions by taking into account the limitations of the available tools for the Polish language. A set of rules is used to generate candidates of spatial expressions which are later tested against a set of semantic constraints. The results of our work on recognition of spatial expressions in Polish texts were partially presented in (Marcińczuk, Oleksy, & Wieczorek, 2016). In that paper we focused on a detailed analysis of errors obtained using a set of basic morphosyntactic patterns for generating spatial expression candidates: we identified and described the most common sources of errors, i.e. incorrectly recognized or unrecognized expressions. In this paper we focused mainly on the preliminary stages of spatial expression recognition. We presented an extensive review of how spatial information can be encoded in the text, types of spatial triggers in Polish and a detailed evaluation of morphosyntactic patterns which can be used to generate spatial expression candidates.   Recognition of spatial relations between physical objects in Polish. The article concerns the problem of recognizing relations

  19. Surface roughness and morphology of dental nanocomposites polished by four different procedures evaluated by a multifractal approach

    Energy Technology Data Exchange (ETDEWEB)

    Highlights: • Multifractals are good indicators of polished dental composites 3-D surface structure. • The nanofilled composite had superior 3-D surface properties to the nanohybrid one. • Composite polishing with diamond paste created improved 3-D multifractal structure. • Recommendation: polish the composite with diamond paste if using the one-step tool. • Multifractal analysis could become essential in designing new dental surfaces. Abstract: The objective of this study was to determine the effect of different dental polishing methods on surface texture parameters of dental nanocomposites. The 3-D surface morphology was investigated by atomic force microscopy (AFM) and multifractal analysis. Two representative dental resin-based nanocomposites were investigated: a nanofilled and a nanohybrid composite. The samples were polished by two dental polishing protocols using multi-step and one-step systems. Both protocols were then followed by diamond paste polishing. The 3-D surface roughness of samples was studied by AFM on square areas of topography with an 80 × 80 μm² scanning area. The multifractal spectrum theory based on computational algorithms was applied to AFM data and multifractal spectra were calculated. The generalized dimension D_q and the singularity spectrum f(α) provided quantitative values that characterize the local scale properties of dental nanocomposites polished by four different dental polishing protocols at nanometer scale. The results showed that the larger the spectrum width Δα (Δα = α_max − α_min) of the multifractal spectra f(α), the more non-uniform was the surface morphology. Also, the 3-D surface topography was described by statistical parameters, according to ISO 25178-2:2012. The 3-D surface of samples had a multifractal nature. Nanofilled composite had lower values of height parameters than nanohybrid composites, due to its composition. Multi-step polishing protocol

  20. Development and Validation of a Natural Language Processing Tool to Identify Patients Treated for Pneumonia across VA Emergency Departments. (United States)

    Jones, B E; South, B R; Shao, Y; Lu, C C; Leng, J; Sauer, B C; Gundlapalli, A V; Samore, M H; Zeng, Q


    Identifying pneumonia using diagnosis codes alone may be insufficient for research on clinical decision making. Natural language processing (NLP) may enable the inclusion of cases missed by diagnosis codes. This article (1) develops an NLP tool that identifies the clinical assertion of pneumonia from physician emergency department (ED) notes, and (2) compares classification methods using diagnosis codes versus NLP against a gold standard of manual chart review to identify patients initially treated for pneumonia. Among a national population of ED visits occurring between 2006 and 2012 across the Veterans Affairs health system, we extracted 811 physician documents containing search terms for pneumonia for training, and 100 random documents for validation. Two reviewers annotated span- and document-level classifications of the clinical assertion of pneumonia. An NLP tool using a support vector machine was trained on the enriched documents. We extracted diagnosis codes assigned in the ED and upon hospital discharge and calculated performance characteristics for diagnosis codes, NLP, and NLP plus diagnosis codes against manual review in training and validation sets. Among the training documents, 51% contained clinical assertions of pneumonia; in the validation set, 9% were classified with pneumonia, of which 100% contained pneumonia search terms. After enriching with search terms, the NLP system alone demonstrated a recall/sensitivity of 0.72 (training) and 0.55 (validation), and a precision/positive predictive value (PPV) of 0.89 (training) and 0.71 (validation). ED-assigned diagnostic codes demonstrated lower recall/sensitivity (0.48 and 0.44) but higher precision/PPV (0.95 in training, 1.0 in validation); the NLP system identified more "possible-treated" cases than diagnostic coding. An approach combining NLP and ED-assigned diagnostic coding classification achieved the best performance (sensitivity 0.89 and PPV 0.80). System-wide application of NLP to