Pon-Barry, Heather Roberta
The field of spoken language processing is concerned with creating computer programs that can understand human speech and produce human-like speech. Regarding the problem of understanding human speech, there is currently growing interest in moving beyond speech recognition (the task of transcribing the words in an audio stream) and towards machine listening—interpreting the full spectrum of information in an audio stream. One part of machine listening, the problem that this thesis focuses on, ...
Jednoróg, Katarzyna; Bola, Łukasz; Mostowski, Piotr; Szwed, Marcin; Boguszewski, Paweł M; Marchewka, Artur; Rutkowski, Paweł
In several countries, natural sign languages were considered inadequate for education. Instead, new sign-supported systems were created, based on the belief that spoken/written language is grammatically superior. One such system, called SJM (system językowo-migowy), preserves the grammatical and lexical structure of spoken Polish and since the 1960s has been extensively employed in schools and on TV. Nevertheless, the Deaf community avoids using SJM for everyday communication, its preferred language being PJM (polski język migowy), a natural sign language, structurally and grammatically independent of spoken Polish and featuring classifier constructions (CCs). Here, for the first time, we use fMRI to compare the neural bases of natural vs. devised communication systems. Deaf signers were presented with three types of signed sentences (SJM and PJM with/without CCs). Consistent with previous findings, PJM with CCs, compared to either SJM or PJM without CCs, recruited the parietal lobes. The reverse comparison revealed activation in the anterior temporal lobes, suggesting increased semantic combinatory processes in lexical sign comprehension. Finally, PJM compared with SJM engaged the left posterior superior temporal gyrus and anterior temporal lobe, areas crucial for sentence-level speech comprehension. We suggest that activity in these two areas reflects greater processing efficiency for naturally evolved sign language. Copyright © 2015 Elsevier Ltd. All rights reserved.
D'Mello, Sidney K; Dowell, Nia; Graesser, Arthur
A key question is whether learning differs when students speak versus type their responses while interacting with intelligent tutoring systems that use natural language dialogue. Theoretical bases exist for three contrasting hypotheses. The speech facilitation hypothesis predicts that spoken input will increase learning, whereas the text facilitation hypothesis predicts that typed input will be superior. The modality equivalence hypothesis claims that learning gains will be equivalent. Previous experiments that tested these hypotheses were confounded by automated speech recognition systems with substantial error rates that were detected by learners. We addressed this concern in two experiments via a Wizard of Oz procedure, in which a human intercepted the learner's speech and transcribed the utterances before submitting them to the tutor. The overall pattern of results supported the following conclusions: (1) learning gains associated with spoken and typed input were comparable and both higher than a no-intervention control, (2) participants' evaluations of the session were not influenced by modality, and (3) there were no modality effects associated with differences in prior knowledge or typing proficiency. Although the results generally support the modality equivalence hypothesis, highly motivated learners reported lower cognitive load and demonstrated increased learning when typing compared with speaking. We discuss the implications of our findings for intelligent tutoring systems that can support typed and spoken input.
Crowe, Kathryn; McLeod, Sharynne
The purpose of this research was to investigate factors that influence professionals' guidance of parents of children with hearing loss regarding spoken language multilingualism and spoken language choice. Sixteen professionals who provide services to children and young people with hearing loss completed an online survey, rating the importance of…
This article addresses key issues and considerations for teachers wanting to incorporate spoken grammar activities into their own teaching and also focuses on six common features of spoken grammar, with practical activities and suggestions for teaching them in the language classroom. The hope is that this discussion of spoken grammar and its place…
Houston, K. Todd; Perigoe, Christina B.
Determining the most effective methods and techniques to facilitate the spoken language development of individuals with hearing loss has been a focus of practitioners for centuries. Due to modern advances in hearing technology, earlier identification of hearing loss, and immediate enrollment in early intervention, children with hearing loss are…
A key problem in spoken language identification (LID) is to design effective representations which are specific to language information. For example, in recent years, representations based on both phonotactic and acoustic features have proven their effectiveness for LID. Although advances in machine learning have led to significant improvements, LID performance is still lacking, especially for short-duration speech utterances. With the hypothesis that language information is weak and represented only latently in speech, and is largely dependent on the statistical properties of the speech content, existing representations may be insufficient. Furthermore, they may be susceptible to the variations caused by different speakers, the specific content of the speech segments, and background noise. To address this, we propose using Deep Bottleneck Features (DBF) for spoken LID, motivated by the success of Deep Neural Networks (DNN) in speech recognition. We show that DBFs can form a low-dimensional compact representation of the original inputs with a powerful descriptive and discriminative capability. To evaluate the effectiveness of this, we design two acoustic models, termed DBF-TV and parallel DBF-TV (PDBF-TV), using a DBF-based i-vector representation for each speech utterance. Results on the NIST language recognition evaluation 2009 (LRE09) show significant improvements over state-of-the-art systems. By fusing the output of phonotactic and acoustic approaches, we achieve an EER of 1.08%, 1.89% and 7.01% for 30 s, 10 s and 3 s test utterances respectively. Furthermore, various DBF configurations have been extensively evaluated, and an optimal system proposed.
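The abstract above reports an EER after fusing phonotactic and acoustic system outputs. As a hedged illustration of what that evaluation step involves (the function names, scores, and fusion weight below are invented; the paper's actual calibration and fusion are not specified here), a minimal sketch:

```python
# Sketch: score-level fusion of two LID systems and an equal-error-rate (EER)
# computation. All values are toy examples, not results from the paper.

def fuse(phonotactic_scores, acoustic_scores, w=0.5):
    """Weighted sum of per-utterance detection scores from two systems."""
    return [w * p + (1 - w) * a
            for p, a in zip(phonotactic_scores, acoustic_scores)]

def eer(target_scores, nontarget_scores):
    """Equal error rate: the operating point where the false-accept rate
    (accepting a nontarget utterance) meets the false-reject rate
    (rejecting a target-language utterance)."""
    best_gap, best_rate = float("inf"), 1.0
    for t in sorted(target_scores + nontarget_scores):
        frr = sum(s < t for s in target_scores) / len(target_scores)
        far = sum(s >= t for s in nontarget_scores) / len(nontarget_scores)
        if abs(far - frr) < best_gap:
            best_gap, best_rate = abs(far - frr), (far + frr) / 2
    return best_rate

# Toy usage: fused scores for target-language vs. other-language utterances.
fused_targets = fuse([0.9, 0.7], [0.9, 0.7])
fused_others = fuse([0.2, 0.4], [0.0, 0.2])
rate = eer(fused_targets, fused_others)
```

In practice the fusion weight would be tuned on held-out data, but the principle is the same: combine calibrated scores, then sweep a threshold to find where the two error rates cross.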
Apr 12, 2018 … languages and can be used for the purposes of spoken language identification. Keywords: SLID … branch of linguistics to study the sound structure of human language … work in the area of Indian language identification has not … English … speech database has been collected over tele…
The objective of this effort was to develop a prototype, hand-held or body-mounted spoken language translator to assist military and law enforcement personnel in interacting with non-English-speaking people...
Bates, Madeleine; Ellard, Dan; Peterson, Pat; Shaked, Varda
In an effort to demonstrate the relevance of SIS technology to real-world military applications, BBN has undertaken the task of providing a spoken language interface to DART, a system for military...
Jelena Kuvač Kraljević
Interest in spoken-language corpora has increased over the past two decades, leading to the development of new corpora and the discovery of new facets of spoken language. These types of corpora represent the most comprehensive data source about the language of ordinary speakers. Such corpora are based on spontaneous, unscripted speech defined by a variety of styles, registers and dialects. The aim of this paper is to present the Croatian Adult Spoken Language Corpus (HrAL), its structure and its possible applications in different linguistic subfields. HrAL was built by sampling spontaneous conversations among 617 speakers from all Croatian counties, and it comprises more than 250,000 tokens and more than 100,000 types. Data were collected during three time slots: from 2010 to 2012, from 2014 to 2015 and during 2016. HrAL is today available within TalkBank, a large database of spoken-language corpora covering different languages (https://talkbank.org), in the Conversational Analyses corpora within the subsection titled Conversational Banks. Data were transcribed, coded and segmented using the transcription format Codes for Human Analysis of Transcripts (CHAT) and the Computerised Language Analysis (CLAN) suite of programmes within the TalkBank toolkit. Speech streams were segmented into communication units (C-units) based on syntactic criteria. Most transcripts were linked to their source audios. TalkBank is publicly available free of charge, i.e. all data stored in it can be shared by the wider community in accordance with the basic rules of the TalkBank. HrAL provides information about spoken grammar and lexicon, discourse skills, error production and productivity in general. It may be useful for sociolinguistic research and studies of synchronic language changes in Croatian.
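The token and type counts the abstract reports are standard corpus statistics over CHAT-formatted transcripts. As an illustrative sketch only (the speaker codes and utterances below are invented; this is not HrAL's actual CLAN pipeline), counting tokens and types on the speaker tiers of a toy transcript might look like:

```python
# Sketch: token and type counts over the speaker tiers ("*XXX: ...") of a
# toy CHAT-style transcript. Dependent tiers such as "%com:" are skipped.

def token_type_counts(chat_lines):
    """Return (token count, distinct type count) for all speaker tiers."""
    tokens = []
    for line in chat_lines:
        if not line.startswith("*"):          # skip dependent tiers, headers
            continue
        _, _, utterance = line.partition(":")
        words = (w.strip(".?!,").lower() for w in utterance.split())
        tokens.extend(w for w in words if w)  # drop bare punctuation marks
    return len(tokens), len(set(tokens))

sample = [
    "*ANA: dobar dan .",             # invented utterances
    "*IVO: dobar dan , dobro .",
    "%com: background noise",        # dependent tier, ignored
]
counts = token_type_counts(sample)   # 5 tokens, 3 types
```

A real pipeline would use CLAN's own `freq` command, which additionally handles CHAT's annotation codes; the sketch only conveys the token/type distinction the abstract relies on.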
Moore, Robert C.; Cohen, Michael H.
Under this effort, SRI has developed spoken-language technology for interactive problem solving, featuring real-time performance for up to several thousand word vocabularies, high semantic accuracy, habitability within the domain, and robustness to many sources of variability. Although the technology is suitable for many applications, efforts to date have focused on developing an Air Travel Information System (ATIS) prototype application. SRI's ATIS system has been evaluated in four ARPA benchmark evaluations, and has consistently been at or near the top in performance. These achievements are the result of SRI's technical progress in speech recognition, natural-language processing, and speech and natural-language integration.
Curtiss, S; de Bode, S; Mathern, G W
We analyzed postsurgery linguistic outcomes of 43 hemispherectomy patients operated on at UCLA. We rated spoken language (Spoken Language Rank, SLR) on a scale from 0 (no language) to 6 (mature grammar) and examined the effects of side of resection/damage, age at surgery/seizure onset, seizure control postsurgery, and etiology on language development. Etiology was defined as developmental (cortical dysplasia and prenatal stroke) and acquired pathology (Rasmussen's encephalitis and postnatal stroke). We found that clinical variables were predictive of language outcomes only when they were considered within distinct etiology groups. Specifically, children with developmental etiologies had lower SLRs than those with acquired pathologies (p =.0006); age factors correlated positively with higher SLRs only for children with acquired etiologies (p =.0006); right-sided resections led to higher SLRs only for the acquired group (p =.0008); and postsurgery seizure control correlated positively with SLR only for those with developmental etiologies (p =.0047). We argue that the variables considered are not independent predictors of spoken language outcome posthemispherectomy but should be viewed instead as characteristics of etiology. Copyright 2001 Elsevier Science.
Lee, Sungjin; Noh, Hyungjong; Lee, Jonghoon; Lee, Kyusong; Lee, Gary Geunbae
Although there have been enormous investments in English education all around the world, little has changed in the style of English instruction. Considering the shortcomings of current teaching-learning methodology, we have been investigating advanced computer-assisted language learning (CALL) systems. This paper aims to summarize a set of POSTECH approaches, including theories, technologies, systems, and field studies, and to provide relevant pointers. On top of state-of-the-art spoken dialog system technologies, a variety of adaptations have been applied to overcome problems caused by the numerous errors and variations naturally produced by non-native speakers. Furthermore, a number of methods have been developed for generating educational feedback that helps learners become proficient. Integrating these efforts resulted in intelligent educational robots, Mero and Engkey, and virtual 3D language learning games, Pomy. To verify the effects of our approaches on students' communicative abilities, we conducted a field study at an elementary school in Korea. The results showed that our CALL approaches can be enjoyable and fruitful activities for students. Although the results of this study bring us a step closer to understanding computer-based education, more studies are needed to consolidate the findings.
Spoken text differs from written text in its context dependency, turn-taking organization, and dynamic structure. EFL learners, however, sometimes find it difficult to produce typical characteristics of spoken language, particularly in casual talk. When they are asked to conduct a conversation, some of them tend to be script-based, which is considered unnatural. Using the theory of Thornburry (2005), this paper aims to analyze characteristics of spoken language in casual conversation, which cover spontaneity, interactivity, interpersonality, and coherence. This study used discourse analysis to reveal the four features in the turns and moves of three casual conversations. The findings indicate that not all sub-features were used in the conversations. In this case, the spontaneity features were used 132 times; the interactivity features were used 1081 times; the interpersonality features were used 257 times; while the coherence features (negotiation features) were used 526 times. The results also show that some participants dominantly produce some sub-features naturally, and vice versa. This finding is expected to be beneficial in providing a model of how spoken interaction should be carried out. More importantly, it could raise English teachers' and lecturers' awareness of teaching features of spoken language, so that students can develop their communicative competence as native speakers of English do.
The Prosodic Parallelism hypothesis claims that adjacent prosodic categories prefer identical branching of internal adjacent constituents. According to Wiese and Speyer (2015), this preference implies that feet contained in the same phonological phrase display either binary or unary branching, but not different types of branching. The seemingly free schwa-zero alternations at the end of some words in German make it possible to test this hypothesis. The hypothesis was successfully tested by conducting a corpus study which used large-scale bodies of written German. As some open questions remain, and as it is unclear whether Prosodic Parallelism is valid for the spoken modality as well, the present study extends this inquiry to spoken German. As in the previous study, the results of a corpus analysis recruiting a variety of linguistic constructions are presented. The Prosodic Parallelism hypothesis can be demonstrated to be valid for spoken German as well as for written German. The paper thus contributes to the question of whether prosodic preferences are similar between the spoken and written modes of a language. Some consequences of the results for the production of language are discussed.
Corneli, Joseph; Corneli, Miriam
"Natural Language," whether spoken and attended to by humans, or processed and generated by computers, requires networked structures that reflect creative processes in semantic, syntactic, phonetic, linguistic, social, emotional, and cultural modules. Being able to produce novel and useful behavior following repeated practice gets to the root of both artificial intelligence and human language. This paper investigates the modalities involved in language-like applications that computers -- and ...
Spoken language understanding (SLU) is an emerging field between speech and language processing, investigating human/machine and human/human communication by leveraging technologies from signal processing, pattern recognition, machine learning and artificial intelligence. SLU systems are designed to extract the meaning from speech utterances, and their applications are vast, from voice search in mobile devices to meeting summarization, attracting interest from both commercial and academic sectors. Both human/machine and human/human communications can benefit from the application of SLU, usin
Moeller, Aleidine J.; Theiler, Janine
Communicative approaches to teaching language have emphasized the centrality of oral proficiency in the language acquisition process, but research investigating oral proficiency has been surprisingly limited, yielding an incomplete understanding of spoken language development. This study investigated the development of spoken language at the high…
Office of English Language Acquisition, US Department of Education, 2015
The Office of English Language Acquisition (OELA) has synthesized key data on English learners (ELs) into two-page PDF sheets, by topic, with graphics, plus key contacts. The topics for this report on Asian/Pacific Islander languages spoken by English Learners (ELs) include: (1) Top 10 Most Common Asian/Pacific Islander Languages Spoken Among ELs:…
Carson, J; Walker, L A; Sanders, B J; Jones, J E; Weddell, J A; Tomlin, A M
The purpose of this study was to assess dmft, the number of decayed, missing (due to caries), and/or filled primary teeth, of English-speaking and non-English-speaking patients of a hospital-based pediatric dental clinic under the age of 72 months, to determine whether native language is a risk marker for tooth decay. Records from an outpatient dental clinic which met the inclusion criteria were reviewed. Patient demographics and dmft score were recorded, and the patients were separated into three groups by the native language spoken by their parents: English, Spanish and all other languages. A total of 419 charts were assessed: 253 English-speaking, 126 Spanish-speaking, and 40 other native languages. After accounting for patient characteristics, dmft was significantly higher for the other-language group than for the English-speaking group (p < 0.05). Those patients under 72 months of age whose parents' native language is not English or Spanish have the highest risk for increased dmft when compared to English- and Spanish-speaking patients. Providers should consider taking additional time to educate patients and their parents, in their native language, on the importance of routine dental care and oral hygiene.
Barberà, Gemma; Zwets, Martine
In both signed and spoken languages, pointing serves to direct an addressee's attention to a particular entity. This entity may be either present or absent in the physical context of the conversation. In this article we focus on pointing directed to nonspeaker/nonaddressee referents in Sign Language of the Netherlands (Nederlandse Gebarentaal,…
Hirschberg, Julia; Manning, Christopher D
Natural language processing employs computational techniques for the purpose of learning, understanding, and producing human language content. Early computational approaches to language research focused on automating the analysis of the linguistic structure of language and developing basic technologies such as machine translation, speech recognition, and speech synthesis. Today's researchers refine and make use of such tools in real-world applications, creating spoken dialogue systems and speech-to-speech translation engines, mining social media for information about health or finance, and identifying sentiment and emotion toward products and services. We describe successes and challenges in this rapidly advancing area. Copyright © 2015, American Association for the Advancement of Science.
Bradham, Tamala S.; Fonnesbeck, Christopher; Toll, Alice; Hecht, Barbara F.
Purpose: The purpose of the Listening and Spoken Language Data Repository (LSL-DR) was to address a critical need for a systemwide outcome data-monitoring program for the development of listening and spoken language skills in highly specialized educational programs for children with hearing loss highlighted in Goal 3b of the 2007 Joint Committee…
Li, Bei; Soli, Sigfrid D; Zheng, Yun; Li, Gang; Meng, Zhaoli
The purpose of this study was to evaluate early spoken language development in young Mandarin-speaking children during the first 24 months after cochlear implantation, as measured by receptive and expressive vocabulary growth rates. Growth rates were compared with those of normally hearing children and with growth rates for English-speaking children with cochlear implants. Receptive and expressive vocabularies were measured with the simplified short form (SSF) version of the Mandarin Communicative Development Inventory (MCDI) in a sample of 112 pediatric implant recipients at baseline, 3, 6, 12, and 24 months after implantation. Implant ages ranged from 1 to 5 years. Scores were expressed in terms of normal equivalent ages, allowing normalized vocabulary growth rates to be determined. Scores for English-speaking children were re-expressed in these terms, allowing direct comparisons of Mandarin and English early spoken language development. Vocabulary growth rates during the first 12 months after implantation were similar to those for normally hearing children less than 16 months of age. Comparisons with growth rates for normally hearing children 16-30 months of age showed that the youngest implant age group (1-2 years) had an average growth rate of 0.68 that of normally hearing children; while the middle implant age group (2-3 years) had an average growth rate of 0.65; and the oldest implant age group (>3 years) had an average growth rate of 0.56, significantly less than the other two rates. Growth rates for English-speaking children with cochlear implants were 0.68 in the youngest group, 0.54 in the middle group, and 0.57 in the oldest group. Growth rates in the middle implant age groups for the two languages differed significantly. The SSF version of the MCDI is suitable for assessment of Mandarin language development during the first 24 months after cochlear implantation. Effects of implant age and duration of implantation can be compared directly across
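The abstract above expresses vocabulary growth as a rate relative to normally hearing peers (e.g., 0.68 of the normal rate). As a minimal sketch with invented numbers (the scores-to-age mapping and assessment values are illustrative, not the study's data), such a normalized rate can be read as the slope of normal-equivalent age against time since implantation:

```python
# Sketch: normalized vocabulary growth rate -- months of normal-equivalent
# age gained per month of chronological time, fit by least squares across
# the assessment points. All numbers below are hypothetical.

def growth_rate(equiv_ages, times):
    """Least-squares slope of normal-equivalent age (months) against
    time since implantation (months)."""
    n = len(times)
    mx = sum(times) / n
    my = sum(equiv_ages) / n
    num = sum((x - mx) * (y - my) for x, y in zip(times, equiv_ages))
    den = sum((x - mx) ** 2 for x in times)
    return num / den

# Hypothetical child assessed at baseline, 3, 6, 12 and 24 months after
# implantation, gaining normal-equivalent age at 0.68 of the normal rate:
times = [0, 3, 6, 12, 24]
equiv_ages = [10 + 0.68 * t for t in times]
rate = growth_rate(equiv_ages, times)   # ~0.68
```

A rate of 1.0 would mean keeping pace with normally hearing children; values below 1.0, like those the abstract reports, mean the gap widens over time.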
Matluck, Joseph H.; Mace-Matluck, Betty J.
This paper discusses the sociolinguistic problems inherent in multilingual testing, and the accompanying dangers of cultural bias in either the visuals or the language used in a given test. The first section discusses English-speaking Americans' perception of foreign speakers in terms of: (1) physical features; (2) speech, specifically vocabulary,…
Pa, Judy; Wilson, Stephen M; Pickell, Herbert; Bellugi, Ursula; Hickok, Gregory
Despite decades of research, there is still disagreement regarding the nature of the information that is maintained in linguistic short-term memory (STM). Some authors argue for abstract phonological codes, whereas others argue for more general sensory traces. We assess these possibilities by investigating linguistic STM in two distinct sensory-motor modalities, spoken and signed language. Hearing bilingual participants (native in English and American Sign Language) performed equivalent STM tasks in both languages during functional magnetic resonance imaging. Distinct, sensory-specific activations were seen during the maintenance phase of the task for spoken versus signed language. These regions have been previously shown to respond to nonlinguistic sensory stimulation, suggesting that linguistic STM tasks recruit sensory-specific networks. However, maintenance-phase activations common to the two languages were also observed, implying some form of common process. We conclude that linguistic STM involves sensory-dependent neural networks, but suggest that sensory-independent neural networks may also exist.
Hoog, B.E. de; Langereis, M.C.; Weerdenburg, M. van; Keuning, J.; Knoors, H.; Verhoeven, L.
BACKGROUND: Large variability in individual spoken language outcomes remains a persistent finding in the group of children with cochlear implants (CIs), particularly in their grammatical development. AIMS: In the present study, we examined the extent of delay in lexical and morphosyntactic spoken
Fitzpatrick, Elizabeth M; Hamel, Candyce; Stevens, Adrienne; Pratt, Misty; Moher, David; Doucet, Suzanne P; Neuss, Deirdre; Bernstein, Anita; Na, Eunjung
Permanent hearing loss affects 1 to 3 per 1000 children and interferes with typical communication development. Early detection through newborn hearing screening and hearing technology provide most children with the option of spoken language acquisition. However, no consensus exists on optimal interventions for spoken language development. To conduct a systematic review of the effectiveness of early sign and oral language intervention compared with oral language intervention only for children with permanent hearing loss. An a priori protocol was developed. Electronic databases (eg, Medline, Embase, CINAHL) from 1995 to June 2013 and gray literature sources were searched. Studies in English and French were included. Two reviewers screened potentially relevant articles. Outcomes of interest were measures of auditory, vocabulary, language, and speech production skills. All data collection and risk of bias assessments were completed and then verified by a second person. Grades of Recommendation, Assessment, Development, and Evaluation (GRADE) was used to judge the strength of evidence. Eleven cohort studies met inclusion criteria, of which 8 included only children with severe to profound hearing loss with cochlear implants. Language development was the most frequently reported outcome. Other reported outcomes included speech and speech perception. Several measures and metrics were reported across studies, and descriptions of interventions were sometimes unclear. Very limited, and hence insufficient, high-quality evidence exists to determine whether sign language in combination with oral language is more effective than oral language therapy alone. More research is needed to supplement the evidence base. Copyright © 2016 by the American Academy of Pediatrics.
Salverda, Anne Pier; Altmann, Gerry T. M.
Participants saw a small number of objects in a visual display and performed a visual detection or visual-discrimination task in the context of task-irrelevant spoken distractors. In each experiment, a visual cue was presented 400 ms after the onset of a spoken word. In experiments 1 and 2, the cue was an isoluminant color change and participants…
Laveson, J. I.; Silver, C. A.
Assessment of the merits of a limited spoken language (56 words) computer in a simulated air traffic control (ATC) task. An airport zone approximately 60 miles in diameter with a traffic flow simulation ranging from single-engine to commercial jet aircraft provided the workload for the controllers. This research determined that, under the circumstances of the experiments carried out, the use of a spoken-language computer would not improve the controller performance.
Current views about language are dominated by the idea of arbitrary connections between linguistic form and meaning. However, if we look beyond the more familiar Indo-European languages and also include both spoken and signed language modalities, we find that motivated, iconic form-meaning mappings are, in fact, pervasive in language. In this paper, we review the different types of iconic mappings that characterize languages in both modalities, including the predominantly visually iconic mappings in signed languages. Having shown that iconic mappings are present across languages, we then proceed to review evidence showing that language users (signers and speakers) exploit iconicity in language processing and language acquisition. While not discounting the presence and importance of arbitrariness in language, we put forward the idea that iconicity also needs to be recognized as a general property of language, which may serve the function of reducing the gap between linguistic form and conceptual representation, allowing the language system to hook up to motor and perceptual experience.
Loucas, Tom; Riches, Nick; Baird, Gillian; Pickles, Andrew; Simonoff, Emily; Chandler, Susie; Charman, Tony
Spoken word recognition, during gating, appears intact in specific language impairment (SLI). This study used gating to investigate the process in adolescents with autism spectrum disorders plus language impairment (ALI). Adolescents with ALI, SLI, and typical language development (TLD), matched on nonverbal IQ listened to gated words that varied…
Marshall, C. R.; Jones, A.; Fastelli, A.; Atkinson, J.; Botting, N.; Morgan, G.
Background: Deafness has an adverse impact on children's ability to acquire spoken languages. Signed languages offer a more accessible input for deaf children, but because the vast majority are born to hearing parents who do not sign, their early exposure to sign language is limited. Deaf children as a whole are therefore at high risk of language…
Sodiya, Adesina Simon
Natural languages are the latest generation of programming languages, which require processing real human natural expressions. Over the years, several groups of researchers have been trying to develop widely accepted natural programming languages based on artificial intelligence (AI), but no true natural language has been developed. The goal of this work is to design a natural language preprocessing architecture that identifies and accepts programming instructions or sentences in their natural forms ...
To assess the effects of data-driven instruction (DDI) on spoken language outcomes of children with cochlear implants and hearing aids. Retrospective, matched-pairs comparison of post-treatment speech/language data of children who did and did not receive DDI. Private, spoken-language preschool for children with hearing loss. Eleven matched pairs of children with cochlear implants who attended the same spoken language preschool. Groups were matched for age of hearing device fitting, time in the program, degree of predevice fitting hearing loss, sex, and age at testing. Daily informal language samples were collected and analyzed over a 2-year period, per preschool protocol. Annual informal and formal spoken language assessments in articulation, vocabulary, and omnibus language were administered at the end of three time intervals: baseline, end of year one, and end of year two. The primary outcome measures were total raw score performance of spontaneous utterance sentence types and syntax element use as measured by the Teacher Assessment of Spoken Language (TASL). In addition, standardized assessments (the Clinical Evaluation of Language Fundamentals--Preschool Version 2 (CELF-P2), the Expressive One-Word Picture Vocabulary Test (EOWPVT), the Receptive One-Word Picture Vocabulary Test (ROWPVT), and the Goldman-Fristoe Test of Articulation 2 (GFTA2)) were also administered and compared with the control group. The DDI group demonstrated significantly higher raw scores on the TASL each year of the study. The DDI group also achieved statistically significantly higher scores for total language on the CELF-P2 and expressive vocabulary on the EOWPVT, but not for articulation or receptive vocabulary. Post-hoc assessment revealed that 78% of the students in the DDI group achieved scores in the average range compared with 59% in the control group. The preliminary results of this study support further investigation of DDI to determine whether this method can consistently
Stevens, Catherine J.; Keller, Peter E.; Tyler, Michael D.
An experiment investigated the effect of tonal language background on discrimination of pitch contour in short spoken and musical items. It was hypothesized that extensive exposure to a tonal language attunes perception of pitch contour. Accuracy and reaction times of adult participants from tonal (Thai) and non-tonal (Australian English) language…
de Hoog, Brigitte E; Langereis, Margreet C; van Weerdenburg, Marjolijn; Keuning, Jos; Knoors, Harry; Verhoeven, Ludo
Large variability in individual spoken language outcomes remains a persistent finding in the group of children with cochlear implants (CIs), particularly in their grammatical development. In the present study, we examined the extent of delay in lexical and morphosyntactic spoken language levels of children with CIs as compared to those of a normative sample of age-matched children with normal hearing. Furthermore, the predictive value of auditory and verbal memory factors in the spoken language performance of implanted children was analyzed. Thirty-nine profoundly deaf children with CIs were assessed using a test battery including measures of lexical, grammatical, auditory and verbal memory tests. Furthermore, child-related demographic characteristics were taken into account. The majority of the children with CIs did not reach age-equivalent lexical and morphosyntactic language skills. Multiple linear regression analyses revealed that lexical spoken language performance in children with CIs was best predicted by age at testing, phoneme perception, and auditory word closure. The morphosyntactic language outcomes of the CI group were best predicted by lexicon, auditory word closure, and auditory memory for words. Qualitatively good speech perception skills appear to be crucial for lexical and grammatical development in children with CIs. Furthermore, strongly developed vocabulary skills and verbal memory abilities predict morphosyntactic language skills. Copyright © 2016 Elsevier Ltd. All rights reserved.
Huettig, Falk; Brouwer, Susanne
It is now well established that anticipation of upcoming input is a key characteristic of spoken language comprehension. It has also frequently been observed that literacy influences spoken language processing. Here, we investigated whether anticipatory spoken language processing is related to individuals' word reading abilities. Dutch adults with dyslexia and a control group participated in two eye-tracking experiments. Experiment 1 was conducted to assess whether adults with dyslexia show the typical language-mediated eye gaze patterns. Eye movements of both adults with and without dyslexia closely replicated earlier research: spoken language is used to direct attention to relevant objects in the environment in a closely time-locked manner. In Experiment 2, participants received instructions (e.g., 'Kijk naar de(COM) afgebeelde piano(COM)', look at the displayed piano) while viewing four objects. Articles (Dutch 'het' or 'de') were gender marked such that the article agreed in gender only with the target, and thus, participants could use gender information from the article to predict the target object. The adults with dyslexia anticipated the target objects but much later than the controls. Moreover, participants' word reading scores correlated positively with their anticipatory eye movements. We conclude by discussing the mechanisms by which reading abilities may influence predictive language processing. Copyright © 2015 John Wiley & Sons, Ltd.
Hampton, L. H.; Kaiser, A. P.
Background: Although spoken-language deficits are not core to an autism spectrum disorder (ASD) diagnosis, many children with ASD do present with delays in this area. Previous meta-analyses have assessed the effects of intervention on reducing autism symptomatology, but have not determined if intervention improves spoken language. This analysis…
Mani, Nivedita; Huettig, Falk
Despite the efficiency with which language users typically process spoken language, a growing body of research finds substantial individual differences in both the speed and accuracy of spoken language processing potentially attributable to participants' literacy skills. Against this background, the current study examined the role of word reading skill in listeners' anticipation of upcoming spoken language input in children at the cusp of learning to read; if reading skills affect predictive language processing, then children at this stage of literacy acquisition should be most susceptible to the effects of reading skills on spoken language processing. We tested 8-year-olds on their prediction of upcoming spoken language input in an eye-tracking task. Although the children, as in previous studies, were successfully able to anticipate upcoming spoken language input, there was a strong positive correlation between children's word reading skills (but not their pseudo-word reading and meta-phonological awareness or their spoken word recognition skills) and their prediction skills. We suggest that these findings are most compatible with the notion that the process of learning orthographic representations during reading acquisition sharpens pre-existing lexical representations, which in turn also supports anticipation of upcoming spoken words. Copyright © 2014 Elsevier Inc. All rights reserved.
Tur-Kaspa, Hana; Dromi, Esther
The present study reports a detailed analysis of written and spoken language samples of Hebrew-speaking children aged 11-13 years who are deaf. It focuses on the description of various grammatical deviations in the two modalities. Participants were 13 students with hearing impairments (HI) attending special classrooms integrated into two elementary schools in Tel Aviv, Israel, and 9 students with normal hearing (NH) in regular classes in these same schools. Spoken and written language samples were collected from all participants using the same five preplanned elicitation probes. Students with HI were found to display significantly more grammatical deviations than their NH peers in both their spoken and written language samples. Most importantly, between-modality differences were noted. The participants with HI exhibited significantly more grammatical deviations in their written language samples than in their spoken samples. However, the distribution of grammatical deviations across categories was similar in the two modalities. The most common grammatical deviations in order of their frequency were failure to supply obligatory morphological markers, failure to mark grammatical agreement, and the omission of a major syntactic constituent in a sentence. Word order violations were rarely recorded in the Hebrew samples. Performance differences in the two modalities encourage clinicians and teachers to facilitate target linguistic forms in diverse communication contexts. Furthermore, the identification of linguistic targets for intervention must be based on the unique grammatical structure of the target language.
The paper explores similarities and differences in the strategies of structuring information at sentence level in spoken and written language, respectively. In particular, it is concerned with the position of the rheme in the sentence in the two different modalities of language, and with the application and correlation of the end-focus and end-weight principles. The assumption is that while there is a general tendency in both written and spoken language to place the focus in or close to the final position, owing to the limitations imposed by short-term memory capacity (and possibly by other factors), for the sake of easy processability it may occasionally be more felicitous in spoken language to place the rhematic element in the initial position, or at least close to the beginning of the sentence. The paper aims to identify differences in the function of selected grammatical structures in written and spoken language, respectively, and to point out circumstances under which initial focus is a convenient alternative to the usual end-focus principle.
Evans, Julia L; Gillam, Ronald B; Montgomery, James W
This study examined the influence of cognitive factors on spoken word recognition in children with developmental language disorder (DLD) and typically developing (TD) children. Participants included 234 children (aged 7;0-11;11 years;months), 117 with DLD and 117 TD children, propensity matched for age, gender, socioeconomic status, and maternal education. Children completed a series of standardized assessment measures, a forward gating task, a rapid automatic naming task, and a series of tasks designed to examine cognitive factors hypothesized to influence spoken word recognition including phonological working memory, updating, attention shifting, and interference inhibition. Spoken word recognition for both initial and final accept gate points did not differ for children with DLD and TD controls after controlling for target word knowledge in both groups. The 2 groups also did not differ on measures of updating, attention switching, and interference inhibition. Despite the lack of difference on these measures, for children with DLD, attention shifting and interference inhibition were significant predictors of spoken word recognition, whereas updating and receptive vocabulary were significant predictors of speed of spoken word recognition for the children in the TD group. Contrary to expectations, after controlling for target word knowledge, spoken word recognition did not differ for children with DLD and TD controls; however, the cognitive processing factors that influenced children's ability to recognize the target word in a stream of speech differed qualitatively for children with and without DLDs.
The purpose of this study was to describe the application of the Communicative Language Teaching (CLT) method to the teaching of spoken recount. This study examined qualitative data, describing the phenomena occurring in the classroom. The data of the study were the behaviour and responses of the students learning spoken recount through the CLT method. The subjects of the study were the 34 students of class X of SMA Negeri 1 Kuaro. Observations and interviews were conducted in order to collect data on teaching spoken recount through three activities (presentation, role-play, and performing procedures). The study found, among other things, that CLT improved the students' speaking ability in learning recount. Based on the improvement charts, it was concluded that the students' grammar, vocabulary, pronunciation, fluency, and performance all improved; that is, the students' spoken recount performance improved. Had the presentation been placed at the end of the sequence of activities, the students' spoken recount performance would have been even better. In conclusion, the implementation of the CLT method and its three practices contributed to improving the students' speaking ability in learning recount, and moreover the CLT method led them to have the courage to construct meaningful communication with confidence. Keywords: Communicative Language Teaching (CLT), recount, speaking, student responses
One fundamental difference between spoken and written language has to do with the "linearity" of speaking in time, in that the temporal structure of speaking is inherently the outcome of an interactive process between speaker and listener. But despite the status of "linearity" as one of Saussure's fundamental principles, in practice little more…
von Tetzchner, S; Øvreeide, K D; Jørgensen, K K; Ormhaug, B M; Oxholm, B; Warme, R
To describe a graphic-mode communication intervention involving a girl with intellectual impairment and autism who did not develop comprehension of spoken language. The aim was to teach graphic-mode vocabulary that reflected her interests, preferences, and the activities and routines of her daily life, by providing sufficient cues to the meanings of the graphic representations so that she would not need to comprehend spoken instructions. An individual case study design was selected, including the use of written records, participant observation, and registration of the girl's graphic vocabulary and use of graphic signs and other communicative expressions. While the girl's comprehension (and hence use) of spoken language remained lacking over a 3-year period, she acquired an active use of over 80 photographs and pictograms. The girl was able to cope better with the cognitive and attentional requirements of graphic communication than those of spoken language and manual signs, which had been the focus of earlier interventions. Her achievements demonstrate that it is possible for communication-impaired children to learn to use an augmentative and alternative communication system without speech comprehension, provided the intervention utilizes functional strategies and non-language cues to the meaning of the graphic representations that are taught.
Nicholas, Johanna G.; Geers, Ann E.
Purpose: The major purpose of this study was to provide information about expected spoken language skills of preschool-age children who are deaf and who use a cochlear implant. A goal was to provide "benchmarks" against which those skills could be compared, for a given age at implantation. We also examined whether parent-completed…
Singh, Jitendra K.; Misra, Girishwar; De Raad, Boele
The psycho-lexical approach is extended to Hindi, a major language spoken in India. From both the dictionary and from Hindi novels, a huge set of personality descriptors was put together, ultimately reduced to a manageable set of 295 trait terms. Both self and peer ratings were collected on those
Gràcia, Marta; Vega, Fàtima; Galván-Bovaira, Maria José
Broadly speaking, the teaching of spoken language in Spanish schools has not been approached in a systematic way. Changes in school practices are needed in order to allow all children to become competent speakers and to understand and construct oral texts that are appropriate in different contexts and for different audiences both inside and…
McDuffie, Andrea; Machalicek, Wendy; Bullard, Lauren; Nelson, Sarah; Mello, Melissa; Tempero-Feigles, Robyn; Castignetti, Nancy; Abbeduto, Leonard
Using a single case design, a parent-mediated spoken-language intervention was delivered to three mothers and their school-aged sons with fragile X syndrome, the leading inherited cause of intellectual disability. The intervention was embedded in the context of shared storytelling using wordless picture books and targeted three empirically derived…
The authors investigate the addition of a new language, for which limited resources are available, to a phonotactic language identification system. Two classes of approaches are studied: in the first class, only existing phonetic recognizers...
Soleymani, Zahra; Keramati, Nasrin; Rohani, Farzaneh; Jalaei, Shohre
To determine verbal intelligence and spoken language of children with phenylketonuria and to study the effect of age at diagnosis and phenylalanine plasma level on these abilities. Cross-sectional. Children with phenylketonuria were recruited from pediatric hospitals in 2012. Normal control subjects were recruited from kindergartens in Tehran. 30 phenylketonuria and 42 control subjects aged 4-6.5 years. Skills were compared between 3 phenylketonuria groups categorized by age at diagnosis/treatment, and between the phenylketonuria and control groups. Scores on Wechsler Preschool and Primary Scale of Intelligence for verbal and total intelligence, and Test of Language Development-Primary, third edition for spoken language, listening, speaking, semantics, syntax, and organization. The performance of control subjects was significantly better than that of early-treated subjects for all composite quotients from Test of Language Development and verbal intelligence (P < 0.05) ... phenylketonuria subjects.
Peterson, Nathaniel R.; Pisoni, David B.; Miyamoto, Richard T.
Cochlear implants (CIs) process sounds electronically and then transmit electric stimulation to the cochlea of individuals with sensorineural deafness, restoring some sensation of auditory perception. Many congenitally deaf CI recipients achieve a high degree of accuracy in speech perception and develop near-normal language skills. Post-lingually deafened implant recipients often regain the ability to understand and use spoken language with or without the aid of visual input (i.e. lip reading...
Fitzpatrick, Elizabeth M; Stevens, Adrienne; Garritty, Chantelle; Moher, David
Permanent childhood hearing loss affects 1 to 3 per 1000 children and frequently disrupts typical spoken language acquisition. Early identification of hearing loss through universal newborn hearing screening and the use of new hearing technologies including cochlear implants make spoken language an option for most children. However, there is no consensus on what constitutes optimal interventions for children when spoken language is the desired outcome. Intervention and educational approaches ranging from oral language only to oral language combined with various forms of sign language have evolved. Parents are therefore faced with important decisions in the first months of their child's life. This article presents the protocol for a systematic review of the effects of using sign language in combination with oral language intervention on spoken language acquisition. Studies addressing early intervention will be selected in which therapy involving oral language intervention and any form of sign language or sign support is used. Comparison groups will include children in early oral language intervention programs without sign support. The primary outcomes of interest to be examined include all measures of auditory, vocabulary, language, speech production, and speech intelligibility skills. We will include randomized controlled trials, controlled clinical trials, and other quasi-experimental designs that include comparator groups as well as prospective and retrospective cohort studies. Case-control, cross-sectional, case series, and case studies will be excluded. Several electronic databases will be searched (for example, MEDLINE, EMBASE, CINAHL, PsycINFO) as well as grey literature and key websites. We anticipate that a narrative synthesis of the evidence will be required. We will carry out meta-analysis for outcomes if clinical similarity, quantity and quality permit quantitative pooling of data. We will conduct subgroup analyses if possible according to severity
Jon-Ruben van Rhijn
Speech requires precise motor control and rapid sequencing of highly complex vocal musculature. Despite its complexity, most people produce spoken language effortlessly. This is due to activity in distributed neuronal circuitry including cortico-striato-thalamic loops that control speech-motor output. Understanding the neurogenetic mechanisms that encode these pathways will shed light on how humans can effortlessly and innately use spoken language and could elucidate what goes wrong in speech-language disorders. FOXP2 was the first single gene identified to cause speech and language disorder. Individuals with FOXP2 mutations display a severe speech deficit that also includes receptive and expressive language impairments. The underlying neuro-molecular mechanisms controlled by FOXP2, which will give insight into our capacity for speech-motor control, are only beginning to be unraveled. Recently FOXP2 was found to regulate genes involved in retinoic acid signaling and to modify the cellular response to retinoic acid, a key regulator of brain development. Herein we explore the evidence that FOXP2 and retinoic acid signaling function in the same pathways. We present evidence at the molecular, cellular and behavioral levels suggesting an interplay between FOXP2 and retinoic acid that may be important for fine motor control and speech-motor output. We propose that retinoic acid signaling is an exciting new angle from which to investigate how neurogenetic mechanisms can contribute to the spoken-language-ready brain.
Boons, Tinne; Brokx, Jan P L; Dhooge, Ingeborg; Frijns, Johan H M; Peeraer, Louis; Vermeulen, Anneke; Wouters, Jan; van Wieringen, Astrid
Although deaf children with cochlear implants (CIs) are able to develop good language skills, the large variability in outcomes remains a significant concern. The first aim of this study was to evaluate language skills in children with CIs to establish benchmarks. The second aim was to make an estimation of the optimal age at implantation to provide maximal opportunities for the child to achieve good language skills afterward. The third aim was to gain more insight into the causes of variability to set recommendations for optimizing the rehabilitation process of prelingually deaf children with CIs. Receptive and expressive language development of 288 children who received CIs by age five was analyzed in a retrospective multicenter study. Outcome measures were language quotients (LQs) on the Reynell Developmental Language Scales and Schlichting Expressive Language Test at 1, 2, and 3 years after implantation. Independent predictive variables were nine child-related, environmental, and auditory factors. A series of multiple regression analyses determined the amount of variance in expressive and receptive language outcomes attributable to each predictor when controlling for the other variables. Simple linear regressions with age at first fitting and independent samples t tests demonstrated that children implanted before the age of two performed significantly better on all tests than children who were implanted at an older age. The mean LQ was 0.78 with an SD of 0.18. A child with an LQ lower than 0.60 (= 0.78-0.18) within 3 years after implantation was labeled as a weak performer compared with other deaf children implanted before the age of two. Contralateral stimulation with a second CI or a hearing aid and the absence of additional disabilities were related to better language outcomes. The effect of environmental factors, comprising multilingualism, parental involvement, and communication mode increased over time. Three years after implantation, the total multiple
Jul 1, 2009 ... and cultural metaphors of illness as part of language learning. The theory of .... role.21 Even in a military setting, where soldiers learnt Korean or Spanish as part of ... own language – a cross-cultural survey. Brit J Gen Pract ...
Kovelman, Ioulia; Norton, Elizabeth S; Christodoulou, Joanna A; Gaab, Nadine; Lieberman, Daniel A; Triantafyllou, Christina; Wolf, Maryanne; Whitfield-Gabrieli, Susan; Gabrieli, John D E
Phonological awareness, knowledge that speech is composed of syllables and phonemes, is critical for learning to read. Phonological awareness precedes and predicts successful transition from language to literacy, and weakness in phonological awareness is a leading cause of dyslexia, but the brain basis of phonological awareness for spoken language in children is unknown. We used functional magnetic resonance imaging to identify the neural correlates of phonological awareness using an auditory word-rhyming task in children who were typical readers or who had dyslexia (ages 7-13) and a younger group of kindergarteners (ages 5-6). Typically developing children, but not children with dyslexia, recruited left dorsolateral prefrontal cortex (DLPFC) when making explicit phonological judgments. Kindergarteners, who were matched to the older children with dyslexia on standardized tests of phonological awareness, also recruited left DLPFC. Left DLPFC may play a critical role in the development of phonological awareness for spoken language critical for reading and in the etiology of dyslexia.
Emmorey, Karen; McCullough, Stephen; Mehta, Sonya; Grabowski, Thomas J.
To investigate the impact of sensory-motor systems on the neural organization for language, we conducted an H₂¹⁵O-PET study of sign and spoken word production (picture-naming) and an fMRI study of sign and audio-visual spoken language comprehension (detection of a semantically anomalous sentence) with hearing bilinguals who are native users of American Sign Language (ASL) and English. Directly contrasting speech and sign production revealed greater activation in bilateral parietal cortex for signing, while speaking resulted in greater activation in bilateral superior temporal cortex (STC) and right frontal cortex, likely reflecting auditory feedback control. Surprisingly, the language production contrast revealed a relative increase in activation in bilateral occipital cortex for speaking. We speculate that greater activation in visual cortex for speaking may actually reflect cortical attenuation when signing, which functions to distinguish self-produced from externally generated visual input. Directly contrasting speech and sign comprehension revealed greater activation in bilateral STC for speech and greater activation in bilateral occipital-temporal cortex for sign. Sign comprehension, like sign production, engaged bilateral parietal cortex to a greater extent than spoken language. We hypothesize that posterior parietal activation in part reflects processing related to spatial classifier constructions in ASL and that anterior parietal activation may reflect covert imitation that functions as a predictive model during sign comprehension. The conjunction analysis for comprehension revealed that both speech and sign bilaterally engaged the inferior frontal gyrus (with more extensive activation on the left) and the superior temporal sulcus, suggesting an invariant bilateral perisylvian language system. We conclude that surface level differences between sign and spoken languages should not be dismissed and are critical for understanding the neurobiology of language.
Hulme, Charles; Snowling, Margaret J
We review current knowledge about reading development and the origins of difficulties in learning to read. We distinguish between the processes involved in learning to decode print, and the processes involved in reading for meaning (reading comprehension). At a cognitive level, difficulties in learning to read appear to be predominantly caused by deficits in underlying oral language skills. The development of decoding skills appears to depend critically upon phonological language skills, and variations in phoneme awareness, letter-sound knowledge and rapid automatized naming each appear to be causally related to problems in learning to read. Reading comprehension difficulties in contrast appear to be critically dependent on a range of oral language comprehension skills (including vocabulary knowledge and grammatical, morphological and pragmatic skills).
Albadr, Musatafa Abbas Abbood; Tiun, Sabrina; Al-Dhief, Fahad Taha; Sammour, Mahmoud A. M.
Spoken Language Identification (LID) is the process of determining and classifying the natural language of a given content and dataset. Typically, data must be processed to extract useful features to perform LID. Feature extraction for LID is, according to the literature, a mature process: standard features have already been developed using Mel-Frequency Cepstral Coefficients (MFCC), Shifted Delta Cepstral (SDC) coefficients, the Gaussian Mixture Model (GMM), and, most recently, the i-vector based framework. However, the process of learning from the extracted features remains to be improved (i.e. optimised) to capture all the knowledge embedded in them. The Extreme Learning Machine (ELM) is an effective learning model used to perform classification and regression analysis and is extremely useful for training a single-hidden-layer neural network. Nevertheless, the learning process of this model is not entirely effective (i.e. optimised) due to the random selection of weights within the input hidden layer. In this study, the ELM is selected as the learning model for LID based on standard feature extraction. One of the optimisation approaches to ELM, the Self-Adjusting Extreme Learning Machine (SA-ELM), is selected as the benchmark and improved by altering the selection phase of the optimisation process. The selection process is performed by incorporating both the Split-Ratio and K-Tournament methods; the improved SA-ELM is named the Enhanced Self-Adjusting Extreme Learning Machine (ESA-ELM). Results are generated on LID datasets created from eight different languages. The results of the study showed the superior performance of the Enhanced Self-Adjusting Extreme Learning Machine LID (ESA-ELM LID) compared with the SA-ELM LID, with ESA-ELM LID achieving an accuracy of 96.25%, as compared to 95.00% for SA-ELM LID. PMID:29672546
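The core ELM training rule this record relies on (random, untrained input-layer weights; output weights solved in closed form) can be sketched in a few lines. This is a minimal illustration on synthetic data, not the paper's ESA-ELM: the feature dimension, hidden-layer size, and toy clusters are assumptions standing in for real MFCC/SDC features.

```python
import numpy as np

def train_elm(X, T, hidden=64, seed=0):
    """Single-hidden-layer ELM: input weights are random and never
    trained; output weights are the least-squares solution."""
    rng = np.random.default_rng(seed)
    W = rng.standard_normal((X.shape[1], hidden))  # random input weights
    b = rng.standard_normal(hidden)                # random biases
    H = np.tanh(X @ W + b)                         # hidden activations
    beta = np.linalg.pinv(H) @ T                   # closed-form output weights
    return W, b, beta

def predict_elm(X, W, b, beta):
    return np.argmax(np.tanh(X @ W + b) @ beta, axis=1)

# Toy stand-in for per-utterance feature vectors (one cluster per "language")
rng = np.random.default_rng(1)
n, d, langs = 200, 12, 3
y = rng.integers(0, langs, n)
X = rng.standard_normal((n, d)) + y[:, None]       # well-separated classes
T = np.eye(langs)[y]                               # one-hot targets
W, b, beta = train_elm(X, T)
acc = float((predict_elm(X, W, b, beta) == y).mean())
```

The single pseudo-inverse solve is what makes ELM training fast; the SA-ELM/ESA-ELM work described above optimises the otherwise random input-layer weights.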
le Fevre Jakobsen, Bjarne
The tempo of Danish television news broadcasts has changed markedly over the past 40 years, while the language has essentially always been conservative, and remains so today. The development in the tempo of the broadcasts has gone through a number of phases from a newsreader in a rigid structure...
In the face of globalisation, the scale of communication is increasing from being merely .... capital goods and services across national frontiers involving too, political contexts of ... auditory and audiovisual entertainment, the use of English dominates. The language .... manners, entertainment, sports, the legal system, etc.
Vukotic, Vedran; Raymond, Christian; Gravier, Guillaume
Recurrent Neural Network (RNN) architectures have recently become a very popular choice for Spoken Language Understanding (SLU) problems; however, they represent a big family of different architectures that can furthermore be combined to form more complex neural networks. In this work, we compare different recurrent networks, such as simple Recurrent Neural Networks (RNN), Long Short-Term Memory (LSTM) networks, Gated Recurrent Units (GRU) and their bidirectional versions,...
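The GRU variant mentioned in the abstract differs from a plain RNN in using two gates to control how much of the previous hidden state is kept. A minimal single-step sketch of the standard GRU equations in numpy (weight shapes and scales are illustrative; biases are omitted for brevity):

```python
import numpy as np

rng = np.random.default_rng(1)

def gru_step(x, h, Wz, Uz, Wr, Ur, Wh, Uh):
    """One GRU time step: the update gate z blends the old hidden
    state with a candidate state computed under the reset gate r."""
    z = 1 / (1 + np.exp(-(x @ Wz + h @ Uz)))   # update gate
    r = 1 / (1 + np.exp(-(x @ Wr + h @ Ur)))   # reset gate
    h_tilde = np.tanh(x @ Wh + (r * h) @ Uh)   # candidate state
    return (1 - z) * h + z * h_tilde           # convex blend

d_in, d_h = 8, 16
params = [rng.normal(scale=0.1, size=s)
          for s in [(d_in, d_h), (d_h, d_h)] * 3]

h = np.zeros(d_h)
for t in range(5):                  # unroll over a short input sequence
    h = gru_step(rng.normal(size=d_in), h, *params)
```

A bidirectional version, as compared in the paper, simply runs a second such recurrence over the sequence in reverse and concatenates the two hidden states per time step.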
Kasyidi, Fatan; Puji Lestari, Dessi
One of the important aspects of human-to-human communication is understanding the emotions of each party. Interactions between humans and computers continue to develop, especially affective interaction, in which emotion recognition is an important component. This paper presents our extended work on emotion recognition for spoken Indonesian, identifying four main classes of emotion: Happy, Sad, Angry, and Contentment, using a combination of acoustic/prosodic features and lexical features. We constructed an emotional speech corpus from Indonesian television talk shows, where the situations are as close as possible to natural ones. After constructing the corpus, the acoustic/prosodic and lexical features were extracted to train the emotion model. We employed machine learning algorithms such as Support Vector Machines (SVM), Naive Bayes, and Random Forest to obtain the best model. Experimental results on the test data show that the best model achieves an F-measure of 0.447 using only acoustic/prosodic features, and 0.488 using both acoustic/prosodic and lexical features, to recognize the four emotion classes with an SVM using an RBF kernel.
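Two of the simplest acoustic/prosodic descriptors used in this kind of emotion recognition are short-time energy and zero-crossing rate, computed per frame. A sketch with a synthetic "utterance" standing in for real corpus audio (frame and hop sizes are common 25 ms/10 ms choices at 16 kHz, not values from the paper):

```python
import numpy as np

def frame_features(signal, frame_len=400, hop=160):
    """Per-frame short-time energy and zero-crossing rate, two basic
    acoustic/prosodic features for emotion classifiers."""
    feats = []
    for start in range(0, len(signal) - frame_len + 1, hop):
        frame = signal[start:start + frame_len]
        energy = float(np.mean(frame ** 2))
        zcr = float(np.mean(np.abs(np.diff(np.sign(frame))) > 0))
        feats.append((energy, zcr))
    return np.array(feats)

# 1 s at 16 kHz: a quiet low tone followed by a loud higher one,
# mimicking a rise in vocal intensity.
sr = 16000
t = np.arange(sr) / sr
signal = np.concatenate([0.1 * np.sin(2 * np.pi * 120 * t[:sr // 2]),
                         0.8 * np.sin(2 * np.pi * 240 * t[:sr // 2])])
feats = frame_features(signal)
```

Frame-level features like these are typically aggregated into utterance-level statistics (mean, variance, range) before being fed to an SVM or similar classifier.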
Williams, Joshua T.; Newman, Sharlene D.
A large body of literature has characterized unimodal monolingual and bilingual lexicons and how neighborhood density affects lexical access; however there have been relatively fewer studies that generalize these findings to bimodal (M2) second language (L2) learners of sign languages. The goal of the current study was to investigate parallel…
This article discusses the attitudes and motivations of two Saudi children learning Japanese as a foreign language (henceforth JFL), a language which is rarely spoken in the country. Studies regarding children's motivation for learning foreign languages that are not widely spread in their contexts in informal settings are scarce. The aim of the study…
Moats, L C
Reading research supports the necessity for directly teaching concepts about linguistic structure to beginning readers and to students with reading and spelling difficulties. In this study, experienced teachers of reading, language arts, and special education were tested to determine if they have the requisite awareness of language elements (e.g., phonemes, morphemes) and of how these elements are represented in writing (e.g., knowledge of sound-symbol correspondences). The results were surprisingly poor, indicating that even motivated and experienced teachers typically understand too little about spoken and written language structure to be able to provide sufficient instruction in these areas. The utility of language structure knowledge for instructional planning, for assessment of student progress, and for remediation of literacy problems is discussed. The teachers participating in the study subsequently took a course focusing on phonemic awareness training, spoken-written language relationships, and careful analysis of spelling and reading behavior in children. At the end of the course, the teachers judged this information to be essential for teaching and advised that it become a prerequisite for certification. Recommendations for requirements and content of teacher education programs are presented.
Fragadasilva, Francisco Jose; Saotome, Osamu; Deoliveira, Carlos Alberto
An automatic speech-to-text transformation system, suited to unlimited vocabulary, is presented. The basic acoustic units considered are the allophones of the phonemes of the Portuguese language spoken in Brazil (PLB). The input to the system is a phonetic sequence produced by an earlier step of isolated-word recognition of slowly spoken speech. In a first stage, the system eliminates phonetic elements that do not belong to PLB. Using knowledge sources such as phonetics, phonology, orthography, and a PLB-specific lexicon, the output is a sequence of written words, ordered by a probabilistic criterion, that constitutes the set of graphemic possibilities for the input sequence. Pronunciation differences across regions of Brazil are considered, but only those that cause differences in phonological transcription, because differences at the phonetic level are absorbed during the transformation to the phonological level. In the final stage, all candidate written words are analyzed from an orthographic and grammatical point of view, to eliminate the incorrect ones.
Williams, Joshua T.; Darcy, Isabelle; Newman, Sharlene D.
Understanding how language modality (i.e., signed vs. spoken) affects second language outcomes in hearing adults is important both theoretically and pedagogically, as it can determine the specificity of second language (L2) theory and inform how best to teach a language that uses a new modality. The present study investigated which…
Sarant, Julia Z; Holt, Colleen M; Dowell, Richard C; Rickards, Field W; Blamey, Peter J
This article documented spoken language outcomes for preschool children with hearing loss and examined the relationships between language abilities and characteristics of children such as degree of hearing loss, cognitive abilities, age at entry to early intervention, and parent involvement in children's intervention programs. Participants were evaluated using a combination of the Child Development Inventory, the Peabody Picture Vocabulary Test, and the Preschool Clinical Evaluation of Language Fundamentals depending on their age at the time of assessment. Maternal education, cognitive ability, and family involvement were also measured. Over half of the children who participated in this study had poor language outcomes overall. No significant differences were found in language outcomes on any of the measures for children who were diagnosed early and those diagnosed later. Multiple regression analyses showed that family participation, degree of hearing loss, and cognitive ability significantly predicted language outcomes and together accounted for almost 60% of the variance in scores. This article highlights the importance of family participation in intervention programs to enable children to achieve optimal language outcomes. Further work may clarify the effects of early diagnosis on language outcomes for preschool children.
As with biological systems, spoken languages are strikingly robust against perturbations. This paper shows that languages achieve robustness in a way that is highly similar to many biological systems. For example, speech sounds are encoded via multiple acoustically diverse, temporally distributed and functionally redundant cues, characteristics that bear similarities to what biologists call "degeneracy". Speech is furthermore adequately characterized by neutrality, with many different tongue configurations leading to similar acoustic outputs, and different acoustic variants understood as the same by recipients. This highlights the presence of a large neutral network of acoustic neighbors for every speech sound. Such neutrality ensures that a steady backdrop of variation can be maintained without impeding communication, assuring that there is "fodder" for subsequent evolution. Thus, studying linguistic robustness is not only important for understanding how linguistic systems maintain their functioning upon the background of noise, but also for understanding the preconditions for language evolution. © 2014 WILEY Periodicals, Inc.
Hirschmüller, Sarah; Egloff, Boris
How do individuals emotionally cope with the imminent real-world salience of mortality? DeWall and Baumeister as well as Kashdan and colleagues previously provided support that an increased use of positive emotion words serves as a way to protect and defend against mortality salience of one's own contemplated death. Although these studies provide important insights into the psychological dynamics of mortality salience, it remains an open question how individuals cope with the immense threat of mortality prior to their imminent actual death. In the present research, we therefore analyzed positivity in the final words spoken immediately before execution by 407 death row inmates in Texas. By using computerized quantitative text analysis as an objective measure of emotional language use, our results showed that the final words contained a significantly higher proportion of positive than negative emotion words. This emotional positivity was significantly higher than (a) positive emotion word usage base rates in spoken and written materials and (b) positive emotional language use with regard to contemplated death and attempted or actual suicide. Additional analyses showed that emotional positivity in final statements was associated with a greater frequency of language use that was indicative of self-references, social orientation, and present-oriented time focus as well as with fewer instances of cognitive-processing, past-oriented, and death-related word use. Taken together, our findings offer new insights into how individuals cope with the imminent real-world salience of mortality.
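The "computerized quantitative text analysis" used in this study rests on dictionary-based word counting: the proportion of tokens falling in a positive or negative emotion category. A minimal stdlib-only sketch (the word lists here are tiny illustrative stand-ins, not the actual LIWC categories):

```python
# Dictionary-based emotion word counting, the core idea behind
# LIWC-style quantitative text analysis. Word lists are illustrative.
POSITIVE = {"love", "peace", "thank", "hope", "happy", "free"}
NEGATIVE = {"fear", "pain", "hate", "sorry", "death", "sad"}

def emotion_proportions(text):
    """Return (positive, negative) emotion-word proportions of a text."""
    words = [w.strip(".,!?").lower() for w in text.split()]
    n = len(words) or 1                      # avoid division by zero
    pos = sum(w in POSITIVE for w in words)
    neg = sum(w in NEGATIVE for w in words)
    return pos / n, neg / n

pos_rate, neg_rate = emotion_proportions(
    "I love you all and I hope you find peace. Thank you.")
```

Comparing such proportions against base rates from large spoken and written corpora is what allows the study to call the final statements unusually positive.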
Pyykkönen, Pirita; Hyönä, Jukka; van Gompel, Roger P G
This study used the visual world eye-tracking method to investigate activation of general world knowledge related to gender-stereotypical role names in online spoken language comprehension in Finnish. The results showed that listeners activated gender stereotypes elaboratively in story contexts where this information was not needed to build coherence. Furthermore, listeners made additional inferences based on gender stereotypes to revise an already established coherence relation. Both results are consistent with mental models theory (e.g., Garnham, 2001). They are harder to explain by the minimalist account (McKoon & Ratcliff, 1992) which suggests that people limit inferences to those needed to establish coherence in discourse.
Al-Surmi, Mansoor Ali
TV shows, especially soap operas and sitcoms, are usually considered by ESL practitioners as a source of authentic spoken conversational materials presumably because they reflect the linguistic features of natural conversation. However, practitioners might be faced with the dilemma of how to evaluate whether such conversational materials reflect…
Nussbaum, Debra; Waddy-Smith, Bettie; Doyle, Jane
There is a core body of knowledge, experience, and skills integral to facilitating auditory, speech, and spoken language development when working with the general population of students who are deaf and hard of hearing. There are additional issues, strategies, and challenges inherent in speech habilitation/rehabilitation practices essential to the population of deaf and hard of hearing students who also use sign language. This article will highlight philosophical and practical considerations related to practices used to facilitate spoken language development and associated literacy skills for children and adolescents who sign. It will discuss considerations for planning and implementing practices that acknowledge and utilize a student's abilities in sign language, and address how to link these skills to developing and using spoken language. Included will be considerations for children from early childhood through high school with a broad range of auditory access, language, and communication characteristics.
Sharp, J.K. [Sandia National Labs., Albuquerque, NM (United States)]
This seminar describes a process and methodology that uses structured natural language to enable the construction of precise information requirements directly from users, experts, and managers. The main focus of this natural language approach is to create the precise information requirements and to do it in such a way that the business and technical experts are fully accountable for the results. These requirements can then be implemented using appropriate tools and technology. This requirement set is also a universal learning tool because it has all of the knowledge that is needed to understand a particular process (e.g., expense vouchers, project management, budget reviews, tax, laws, machine function).
Diao, Yali; Chandler, Paul; Sweller, John
Based on cognitive load theory, this study investigated the effect of simultaneous written presentations on comprehension of spoken English as a foreign language. Learners' language comprehension was compared while they used 3 instructional formats: listening with auditory materials only, listening with a full, written script, and listening with simultaneous subtitled text. Listening with the presence of a script and subtitles led to better understanding of the scripted and subtitled passage but poorer performance on a subsequent auditory passage than listening with the auditory materials only. These findings indicated that where the intention was learning to listen, the use of a full script or subtitles had detrimental effects on the construction and automation of listening comprehension schemas.
Deng, Zhizhou; Chandrasekaran, Bharath; Wang, Suiping; Wong, Patrick C.M.
A major challenge in language learning studies is to identify objective, pre-training predictors of success. Variation in the low-frequency fluctuations (LFFs) of spontaneous brain activity measured by resting-state functional magnetic resonance imaging (RS-fMRI) has been found to reflect individual differences in cognitive measures. In the present study, we aimed to investigate the extent to which initial spontaneous brain activity is related to individual differences in spoken language learning. We acquired RS-fMRI data and subsequently trained participants on a sound-to-word learning paradigm in which they learned to use foreign pitch patterns (from Mandarin Chinese) to signal word meaning. We performed amplitude of spontaneous low-frequency fluctuation (ALFF) analysis, graph theory-based analysis, and independent component analysis (ICA) to identify functional components of the LFFs in the resting-state. First, we examined the ALFF as a regional measure and showed that regional ALFFs in the left superior temporal gyrus were positively correlated with learning performance, whereas ALFFs in the default mode network (DMN) regions were negatively correlated with learning performance. Furthermore, the graph theory-based analysis indicated that the degree and local efficiency of the left superior temporal gyrus were positively correlated with learning performance. Finally, the default mode network and several task-positive resting-state networks (RSNs) were identified via the ICA. The “competition” (i.e., negative correlation) between the DMN and the dorsal attention network was negatively correlated with learning performance. Our results demonstrate that a) spontaneous brain activity can predict future language learning outcome without prior hypotheses (e.g., selection of regions of interest – ROIs) and b) both regional dynamics and network-level interactions in the resting brain can account for individual differences in future spoken language learning success
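The ALFF measure at the heart of this analysis is simply the mean FFT amplitude of a voxel or ROI time series within a low-frequency band (conventionally 0.01–0.08 Hz). A sketch on synthetic data (TR, scan length, and noise level are illustrative, not the study's acquisition parameters):

```python
import numpy as np

def alff(timeseries, tr=2.0, band=(0.01, 0.08)):
    """Amplitude of low-frequency fluctuation: mean FFT amplitude of a
    demeaned time series inside the given frequency band (in Hz)."""
    ts = timeseries - timeseries.mean()
    amp = np.abs(np.fft.rfft(ts)) / len(ts)
    freqs = np.fft.rfftfreq(len(ts), d=tr)
    mask = (freqs >= band[0]) & (freqs <= band[1])
    return float(amp[mask].mean())

rng = np.random.default_rng(2)
tr, n = 2.0, 240                        # 240 volumes at TR = 2 s (8 min)
t = np.arange(n) * tr
slow = np.sin(2 * np.pi * 0.05 * t)     # 0.05 Hz fluctuation, in-band
noise = rng.normal(scale=0.1, size=n)

alff_signal = alff(slow + noise, tr=tr)  # strong low-frequency power
alff_noise = alff(noise, tr=tr)          # white noise only
```

Correlating per-region ALFF values like these with behavioral learning scores across participants is what yields the regional predictors reported in the abstract.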
Arabic is a Semitic language spoken by more than 330 million people as a native language, in an area extending from the Arabian/Persian Gulf in the East to the Atlantic Ocean in the West. Moreover, it is the language in which 1.4 billion Muslims around the world perform their daily prayers. Over the last few years, Arabic natural language processing (ANLP) has gained increasing importance, and several state-of-the-art systems have been developed for a wide range of applications.
Ihle, Andreas; Oris, Michel; Fagot, Delphine; Kliegel, Matthias
Findings on the association of speaking different languages with cognitive functioning in old age are inconsistent and inconclusive so far. Therefore, the present study set out to investigate the relation of the number of languages spoken to cognitive performance and its interplay with several other markers of cognitive reserve in a large sample of older adults. Two thousand eight hundred and twelve older adults served as sample for the present study. Psychometric tests on verbal abilities, basic processing speed, and cognitive flexibility were administered. In addition, individuals were interviewed on their different languages spoken on a regular basis, educational attainment, occupation, and engaging in different activities throughout adulthood. Higher number of languages regularly spoken was significantly associated with better performance in verbal abilities and processing speed, but unrelated to cognitive flexibility. Regression analyses showed that the number of languages spoken predicted cognitive performance over and above leisure activities/physical demand of job/gainful activity as respective additional predictor, but not over and above educational attainment/cognitive level of job as respective additional predictor. There was no significant moderation of the association of the number of languages spoken with cognitive performance in any model. Present data suggest that speaking different languages on a regular basis may additionally contribute to the build-up of cognitive reserve in old age. Yet, this may not be universal, but linked to verbal abilities and basic cognitive processing speed. Moreover, it may be dependent on other types of cognitive stimulation that individuals also engaged in during their life course.
Gautreau, Aurore; Hoen, Michel; Meunier, Fanny
This study aimed to characterize the linguistic interference that occurs during speech-in-speech comprehension by combining offline and online measures, which included an intelligibility task (at a -5 dB Signal-to-Noise Ratio) and 2 lexical decision tasks (at a -5 dB and 0 dB SNR) that were performed with French spoken target words. In these 3 experiments we always compared the masking effects of speech backgrounds (i.e., 4-talker babble) that were produced in the same language as the target language (i.e., French) or in unknown foreign languages (i.e., Irish and Italian) to the masking effects of corresponding non-speech backgrounds (i.e., speech-derived fluctuating noise). The fluctuating noise contained similar spectro-temporal information as babble but lacked linguistic information. At -5 dB SNR, both tasks revealed significantly divergent results between the unknown languages (i.e., Irish and Italian) with Italian and French hindering French target word identification to a similar extent, whereas Irish led to significantly better performances on these tasks. By comparing the performances obtained with speech and fluctuating noise backgrounds, we were able to evaluate the effect of each language. The intelligibility task showed a significant difference between babble and fluctuating noise for French, Irish and Italian, suggesting acoustic and linguistic effects for each language. However, the lexical decision task, which reduces the effect of post-lexical interference, appeared to be more accurate, as it only revealed a linguistic effect for French. Thus, although French and Italian had equivalent masking effects on French word identification, the nature of their interference was different. This finding suggests that the differences observed between the masking effects of Italian and Irish can be explained at an acoustic level but not at a linguistic level.
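Presenting targets "at a -5 dB Signal-to-Noise Ratio", as in these experiments, means scaling the masker so the target-to-masker power ratio hits the desired value. A sketch of that mixing step on synthetic signals (the tone and white noise stand in for the French target words and the 4-talker babble):

```python
import numpy as np

def mix_at_snr(target, masker, snr_db):
    """Scale the masker so that the target/masker power ratio equals
    snr_db, then return the mixture (target + scaled masker)."""
    p_t = np.mean(target ** 2)
    p_m = np.mean(masker ** 2)
    scale = np.sqrt(p_t / (p_m * 10 ** (snr_db / 10)))
    return target + scale * masker

rng = np.random.default_rng(3)
target = np.sin(2 * np.pi * 220 * np.arange(16000) / 16000)  # "speech"
babble = rng.normal(size=16000)                              # "masker"
mixture = mix_at_snr(target, babble, snr_db=-5.0)
```

At -5 dB SNR the masker carries roughly three times the power of the target, which is why the intelligibility and lexical decision tasks become sensitive to the masker's acoustic and linguistic content.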
Laporte, Eric
The connection between language processing and combinatorics on words is natural. Historically, linguists actually played a part in the beginning of the construction of theoretical combinatorics on words. Some of the terms in current use originate from linguistics: word, prefix, suffix, grammar, syntactic monoid... However, interpenetration between the formal world of computer theory and the intuitive world of linguistics is still a love story with ups and downs. We will encounter in this cha...
Nicholas, Johanna Grant; Geers, Ann E
By age 3, typically developing children have achieved extensive vocabulary and syntax skills that facilitate both cognitive and social development. Substantial delays in spoken language acquisition have been documented for children with severe to profound deafness, even those with auditory oral training and early hearing aid use. This study documents the spoken language skills achieved by orally educated 3-yr-olds whose profound hearing loss was identified and hearing aids fitted between 1 and 30 mo of age and who received a cochlear implant between 12 and 38 mo of age. The purpose of the analysis was to examine the effects of age, duration, and type of early auditory experience on spoken language competence at age 3.5 yr. The spoken language skills of 76 children who had used a cochlear implant for at least 7 mo were evaluated via standardized 30-minute language sample analysis, a parent-completed vocabulary checklist, and a teacher language-rating scale. The children were recruited from and enrolled in oral education programs or therapy practices across the United States. Inclusion criteria included presumed deaf since birth, English the primary language of the home, no other known conditions that interfere with speech/language development, enrolled in programs using oral education methods, and no known problems with the cochlear implant lasting more than 30 days. Strong correlations were obtained among all language measures. Therefore, principal components analysis was used to derive a single Language Factor score for each child. A number of possible predictors of language outcome were examined, including age at identification and intervention with a hearing aid, duration of use of a hearing aid, pre-implant pure-tone average (PTA) threshold with a hearing aid, PTA threshold with a cochlear implant, and duration of use of a cochlear implant/age at implantation (the last two variables were practically identical because all children were tested between 40 and 44
Peterson, Nathaniel R; Pisoni, David B; Miyamoto, Richard T
Cochlear implants (CIs) process sounds electronically and then transmit electric stimulation to the cochlea of individuals with sensorineural deafness, restoring some sensation of auditory perception. Many congenitally deaf CI recipients achieve a high degree of accuracy in speech perception and develop near-normal language skills. Post-lingually deafened implant recipients often regain the ability to understand and use spoken language with or without the aid of visual input (i.e. lip reading). However, there is wide variation in individual outcomes following cochlear implantation, and some CI recipients never develop useable speech and oral language skills. The causes of this enormous variation in outcomes are only partly understood at the present time. The variables most strongly associated with language outcomes are age at implantation and mode of communication in rehabilitation. Thus, some of the more important factors determining success of cochlear implantation are broadly related to neural plasticity that appears to be transiently present in deaf individuals. In this article we review the expected outcomes of cochlear implantation, potential predictors of those outcomes, the basic science regarding critical and sensitive periods, and several new research directions in the field of cochlear implantation.
Lund, Emily; Douglas, W. Michael; Schuele, C. Melanie
Children with hearing loss who are developing spoken language tend to lag behind children with normal hearing in vocabulary knowledge. Thus, researchers must validate instructional practices that lead to improved vocabulary outcomes for children with hearing loss. The purpose of this study was to investigate how semantic richness of instruction…
This article re-examines the notion of spoken fluency. Fluent and fluency are terms commonly used in everyday, lay language, and fluency, or lack of it, has social consequences. The article reviews the main approaches to understanding and measuring spoken fluency, suggests that spoken fluency is best understood as an interactive achievement, and offers the metaphor of ‘confluence’ to replace the term fluency. Many measures of spoken fluency are internal and monologue-based, whereas evidence...
Choroomi, S; Curotta, J
To review foreign body aspiration cases encountered over a 10-year period in a tertiary paediatric hospital, and to assess correlation between foreign body type and language spoken at home. Retrospective chart review of all children undergoing direct laryngobronchoscopy for foreign body aspiration over a 10-year period. Age, sex, foreign body type, complications, hospital stay and home language were analysed. At direct laryngobronchoscopy, 132 children had foreign body aspiration (male:female ratio 1.31:1; mean age 32 months (2.67 years)). Mean hospital stay was 2.0 days. Foreign bodies most commonly comprised food matter (53/132; 40.1 per cent), followed by non-food matter (44/132; 33.33 per cent), a negative endoscopy (11/132; 8.33 per cent) and unknown composition (24/132; 18.2 per cent). Most parents spoke English (92/132, 69.7 per cent; vs non-English-speaking 40/132, 30.3 per cent), but non-English-speaking patients had disproportionately more food foreign bodies, and significantly more nut aspirations (p = 0.0065). Results constitute level 2b evidence. Patients from non-English speaking backgrounds had a significantly higher incidence of food (particularly nut) aspiration. Awareness-raising and public education is needed in relevant communities to prevent certain foods, particularly nuts, being given to children too young to chew and swallow them adequately.
Wang, Jie; Wong, Andus Wing-Kuen; Wang, Suiping; Chen, Hsuan-Chih
It is widely acknowledged in Germanic languages that segments are the primary planning units at the phonological encoding stage of spoken word production. Mixed results, however, have been found in Chinese, and it is still unclear what roles syllables and segments play in planning Chinese spoken word production. In the current study, participants were asked to first prepare and later produce disyllabic Mandarin words upon picture prompts and a response cue while electroencephalogram (EEG) signals were recorded. Each two consecutive pictures implicitly formed a pair of prime and target, whose names shared the same word-initial atonal syllable or the same word-initial segments, or were unrelated in the control conditions. Only syllable repetition induced significant effects on event-related brain potentials (ERPs) after target onset: a widely distributed positivity in the 200- to 400-ms interval and an anterior positivity in the 400- to 600-ms interval. We interpret these to reflect syllable-size representations at the phonological encoding and phonetic encoding stages. Our results provide the first electrophysiological evidence for the distinct role of syllables in producing Mandarin spoken words, supporting a language specificity hypothesis about the primary phonological units in spoken word production.
Harris, Michael S; Kronenberger, William G; Gao, Sujuan; Hoen, Helena M; Miyamoto, Richard T; Pisoni, David B
Cochlear implants (CIs) help many deaf children achieve near-normal speech and language (S/L) milestones. Nevertheless, high levels of unexplained variability in S/L outcomes are limiting factors in improving the effectiveness of CIs in deaf children. The objective of this study was to longitudinally assess the role of verbal short-term memory (STM) and working memory (WM) capacity as a progress-limiting source of variability in S/L outcomes after CI in children. Longitudinal study of 66 children with CIs for prelingual severe-to-profound hearing loss. Outcome measures included performance on digit span forward (DSF), digit span backward (DSB), and four conventional S/L measures that examined spoken-word recognition (Phonetically Balanced Kindergarten word test), receptive vocabulary (Peabody Picture Vocabulary Test ), sentence-recognition skills (Hearing in Noise Test), and receptive and expressive language functioning (Clinical Evaluation of Language Fundamentals Fourth Edition Core Language Score; CELF). Growth curves for DSF and DSB in the CI sample over time were comparable in slope, but consistently lagged in magnitude relative to norms for normal-hearing peers of the same age. For DSF and DSB, 50.5% and 44.0%, respectively, of the CI sample scored more than 1 SD below the normative mean for raw scores across all ages. The first (baseline) DSF score significantly predicted all endpoint scores for the four S/L measures, and DSF slope (growth) over time predicted CELF scores. DSF baseline and slope accounted for an additional 13 to 31% of variance in S/L scores after controlling for conventional predictor variables such as: chronological age at time of testing, age at time of implantation, communication mode (auditory-oral communication versus total communication), and maternal education. Only DSB baseline scores predicted endpoint language scores on Peabody Picture Vocabulary Test and CELF. DSB slopes were not significantly related to any endpoint S/L measures
Werfel, Krystal L.
Purpose: The purpose of this study was to compare change in emergent literacy skills of preschool children with and without hearing loss over a 6-month period. Method: Participants included 19 children with hearing loss and 14 children with normal hearing. Children with hearing loss used amplification and spoken language. Participants completed…
Language understanding is essential for intelligent information processing. Processing of language involves morphological analysis, syntactic analysis (parsing), and semantic analysis; these are not carried out in isolation. These analyses are described for the Japanese language, and their use in understanding systems is examined. 30 references.
Purpose: The current study sought to investigate the separate effects of dysarthria and cognitive status on global speech timing, speech hesitation, and linguistic complexity characteristics, and how these speech behaviors shape listener impressions, for three connected speech tasks presumed to differ in cognitive-linguistic demand, across four carefully defined speaker groups: (1) MS with cognitive deficits (MSCI), (2) MS with clinically diagnosed dysarthria and intact cognition (MSDYS), (3) MS without dysarthria or cognitive deficits (MS), and (4) healthy talkers (CON). The relationship between neuropsychological test scores and speech-language production and perceptual variables for speakers with cognitive deficits was also explored. Methods: 48 speakers, including 36 individuals reporting a neurological diagnosis of MS and 12 healthy talkers, participated. The three MS groups and the control group each contained 12 speakers (8 women and 4 men). Cognitive function was quantified using standard clinical tests of memory, information processing speed, and executive function. A standard z-score of ≤ -1.50 indicated deficits in a given cognitive domain. Three certified speech-language pathologists determined the clinical diagnosis of dysarthria for speakers with MS. Experimental speech tasks of interest included audio-recordings of an oral reading of the Grandfather passage and two spontaneous speech samples in the form of Familiar and Unfamiliar descriptive discourse. Various measures of spoken language were of interest. Suprasegmental acoustic measures included speech and articulatory rate. Linguistic speech hesitation measures included pause frequency (i.e., silent and filled pauses), mean silent pause duration, grammatical appropriateness of pauses, and interjection frequency. For the two discourse samples, three standard measures of language complexity were obtained: subordination index, inter-sentence cohesion adequacy, and lexical diversity. Ten listeners
Leni Amalia Suek
Full Text Available The maintenance of the community languages of migrant students is heavily determined by language use and language attitudes. The dominance of a majority language over a community language shapes migrant students' attitudes toward their native languages: when they perceive their native language as unimportant, they reduce the frequency of using it, even in the home domain. Solutions to the problem of maintaining community languages should therefore address language use and language attitudes, which develop mostly in two important domains, school and family. Hence, the valorization of community languages should be promoted not only in the family but also in the school domain. Programs such as community language schools and community language programs can enable migrant students to practice and use their native languages. Since educational resources such as class sessions, teachers, and government support are limited, the family plays a significant role in stimulating positive attitudes toward the community language and in developing the use of the native language.
Petkov, Christopher I; Jarvis, Erich D
Vocal learners such as humans and songbirds can learn to produce elaborate patterns of structurally organized vocalizations, whereas many other vertebrates such as non-human primates and most other bird groups either cannot or do so only to a very limited degree. To explain the similarities between humans and vocal-learning birds and the differences with other species, various theories have been proposed. One set of theories comprises motor theories, which underscore the role of the motor system as an evolutionary substrate for vocal production learning. For instance, the motor theory of speech and song perception proposes enhanced auditory perceptual learning of speech in humans and song in birds, which suggests a considerable level of neurobiological specialization. Another, the motor theory of vocal learning origin, proposes that the brain pathways that control the learning and production of song and speech were derived from adjacent motor brain pathways. A second set of theories comprises cognitive theories, which address the interface between cognition and the auditory-vocal domains to support language learning in humans. Here we critically review the behavioral and neurobiological evidence for parallels and differences between the so-called vocal learners and vocal non-learners in the context of motor and cognitive theories. In doing so, we note that, behaviorally, vocal-production learning abilities are more distributed than categorical, as are the auditory-learning abilities of animals. We propose testable hypotheses on the extent of the specializations and cross-species correspondences suggested by motor and cognitive theories. We believe that determining how spoken language evolved is likely to become clearer with concerted efforts in testing comparative data from many non-human animal species.
Full Text Available Spoken English may sometimes cause us to face a peculiar problem in the reception and decoding of auditory signals, which can lead to mishearings. Arising from erroneous perception, from a failure to understand the communication, and from an involuntary mental replacement of a certain element or structure by a more familiar one, these mistakes are most frequently encountered when listening to songs, where the melodic line can facilitate confusion through its somewhat altered intonation, producing the so-called mondegreens. Still, instances can be met in all domains of verbal communication, as proven by several examples noticed during classes of English as a foreign language (EFL) taught to non-philological subjects. Production and perception of language depend on a series of elements that influence the encoding and decoding of the message. These filters belong to both psychological and semantic categories, and they can interfere with the accuracy of both emission and reception. Poor understanding of a notion or concept, combined with greater familiarity with a similar-sounding one, results in unconsciously picking the structure that is better known. This means ‘hearing’ something other than what was said, something closer to the receiver’s preoccupations and background knowledge than the original structure or word. Some mishearings become particularly relevant as they concern teaching English for Specific Purposes (ESP), such as those encountered during classes of Business English or English for Law. Though not very likely to occur too often, given an intuitively felt inaccuracy (as users know the terms need to be more specialised), such examples are still not ignorable. Thus, we consider they deserve a higher degree of attention, as they might become quite relevant in the global context of increasing workforce migration and the spread of multinational companies.
Full Text Available …and complicates the design of the system as a whole. Current benchmark results are established by the National Institute of Standards and Technology (NIST) Language Recognition Evaluation (LRE). Initially started in 1996, the next evaluation was in 2003…
Adank, P.M.; Noordzij, M.L.; Hagoort, P.
A repetition-suppression functional magnetic resonance imaging paradigm was used to explore the neuroanatomical substrates of processing two types of acoustic variation (speaker and accent) during spoken sentence comprehension. Recordings were made for two speakers and two accents: Standard Dutch and a
Willems, Roel M; Frank, Stefan L; Nijhof, Annabel D; Hagoort, Peter; van den Bosch, Antal
The notion of prediction is studied in cognitive neuroscience with increasing intensity. We investigated the neural basis of 2 distinct aspects of word prediction, derived from information theory, during story comprehension. We assessed the effect of entropy of next-word probability distributions as well as surprisal. A computational model determined entropy and surprisal for each word in 3 literary stories. Twenty-four healthy participants listened to the same 3 stories while their brain activation was measured using fMRI. Reversed speech fragments were presented as a control condition. Brain areas sensitive to entropy were left ventral premotor cortex, left middle frontal gyrus, right inferior frontal gyrus, left inferior parietal lobule, and left supplementary motor area. Areas sensitive to surprisal were left inferior temporal sulcus ("visual word form area"), bilateral superior temporal gyrus, right amygdala, bilateral anterior temporal poles, and right inferior frontal sulcus. We conclude that prediction during language comprehension can occur at several levels of processing, including at the level of word form. Our study exemplifies the power of combining computational linguistics with cognitive neuroscience, and additionally underlines the feasibility of studying continuous spoken language materials with fMRI. © The Author 2015. Published by Oxford University Press. All rights reserved. For Permissions, please e-mail: email@example.com.
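The two information-theoretic measures used in this study have simple definitions: entropy is the uncertainty of the next-word probability distribution before the word is heard, and surprisal is the negative log probability of the word that actually occurs. A minimal sketch (the toy distribution below is invented for illustration, not drawn from the study's language model):

```python
import math

def surprisal(p_word):
    """Surprisal of an observed word: -log2 P(word | context), in bits."""
    return -math.log2(p_word)

def entropy(distribution):
    """Shannon entropy of a next-word probability distribution, in bits."""
    return -sum(p * math.log2(p) for p in distribution.values() if p > 0)

# Toy distribution over possible next words given some story context
next_word = {"cat": 0.5, "ball": 0.25, "mailman": 0.25}

print(entropy(next_word))           # 1.5 bits of uncertainty before the word
print(surprisal(next_word["cat"]))  # 1.0 bit once "cat" is what occurs
```

Entropy is a property of the whole distribution (computed before the word), while surprisal depends on the single word that is actually presented, which is why the two can dissociate in the fMRI contrasts.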
Provides a comprehensive, modern reference of practical tools and techniques for implementing natural language processing in computer systems. This title covers classical methods, empirical and statistical techniques, and various applications. It describes how the techniques can be applied to European and Asian languages as well as English.
Oryadi-Zanjani, Mohammad Majid; Vahab, Maryam; Bazrafkan, Mozhdeh; Haghjoo, Asghar
The aim of this study was to examine the role of audiovisual speech recognition as a clinical criterion of cochlear implant or hearing aid efficiency in Persian-language children with severe-to-profound hearing loss. This research was administered as a cross-sectional study. The sample size was 60 Persian 5-7 year old children. The assessment tool was one of the subtests of the Persian version of the Test of Language Development-Primary 3. The study included two experiments: auditory-only and audiovisual presentation conditions. The test was a closed set including 30 words which were orally presented by a speech-language pathologist. The scores of audiovisual word perception were significantly higher than in the auditory-only condition in the children with normal hearing, whereas no significant difference was found between the auditory-only and audiovisual presentation conditions (P>0.05). The audiovisual spoken word recognition can be applied as a clinical criterion to assess children with severe to profound hearing loss in order to find out whether a cochlear implant or hearing aid has been efficient for them; i.e., if a child with hearing impairment who uses a CI or HA can obtain higher scores in audiovisual spoken word recognition than in the auditory-only condition, his/her auditory skills have developed appropriately thanks to an effective CI or HA, one of the main factors of auditory habilitation. Copyright © 2015 Elsevier Ireland Ltd. All rights reserved.
The current research examined how Arabic diglossia affects verbal learning memory. Thirty native Arab college students were tested using auditory verbal memory test that was adapted according to the Rey Auditory Verbal Learning Test and developed in three versions: Pure spoken language version (SL), pure standard language version (SA), and…
Constantinescu-Sharpe, Gabriella; Phillips, Rebecca L; Davis, Aleisha; Dornan, Dimity; Hogan, Anthony
Social inclusion is a common focus of listening and spoken language (LSL) early intervention for children with hearing loss. This exploratory study compared the social inclusion of young children with hearing loss educated using a listening and spoken language approach with population data. A framework for understanding the scope of social inclusion is presented in the Background. This framework guided the use of a shortened, modified version of the Longitudinal Study of Australian Children (LSAC) to measure two of the five facets of social inclusion ('education' and 'interacting with society and fulfilling social roles'). The survey was completed by parents of children with hearing loss aged 4-5 years who were educated using an LSL approach (n = 78; 37% response rate). These responses were compared to those obtained for typical hearing children in the LSAC dataset (n = 3265). Analyses revealed that most children with hearing loss had comparable outcomes to those with typical hearing on the 'education' and 'interacting with society and fulfilling social roles' facets of social inclusion. These exploratory findings are positive and warrant further investigation across all five facets of the framework to identify which factors influence social inclusion.
Gooskens, Charlotte; van Heuven, Vincent J.; van Bezooijen, Renee; Pacilly, Jos J. A.
The most straightforward way to explain why Danes understand spoken Swedish relatively better than Swedes understand spoken Danish would be that spoken Danish is intrinsically a more difficult language to understand than spoken Swedish. We discuss circumstantial evidence suggesting that Danish is
Boudewyn, Megan A.; Gordon, Peter C.; Long, Debra; Polse, Lara; Swaab, Tamara Y.
The goal of this study was to examine how lexical association and discourse congruence affect the time course of processing incoming words in spoken discourse. In an ERP norming study, we presented prime-target pairs in the absence of a sentence context to obtain a baseline measure of lexical priming. We observed a typical N400 effect when participants heard critical associated and unassociated target words in word pairs. In a subsequent experiment, we presented the same word pairs in spoken discourse contexts. Target words were always consistent with the local sentence context, but were congruent or not with the global discourse (e.g., “Luckily Ben had picked up some salt and pepper/basil”, preceded by a context in which Ben was preparing marinara sauce (congruent) or dealing with an icy walkway (incongruent)). ERP effects of global discourse congruence preceded those of local lexical association, suggesting an early influence of the global discourse representation on lexical processing, even in locally congruent contexts. Furthermore, effects of lexical association occurred earlier in the congruent than in the incongruent condition. These results differ from those that have been obtained in studies of reading, suggesting that the effects may be unique to spoken word recognition. PMID:23002319
LANGUAGE POLICIES PURSUED IN THE AXIS OF OTHERING AND IN THE PROCESS OF CONVERTING SPOKEN LANGUAGE OF TURKS LIVING IN RUSSIA INTO THEIR WRITTEN LANGUAGE / RUSYA'DA YASAYAN TÜRKLERİN KONUSMA DİLLERİNİN YAZI DİLİNE DÖNÜSTÜRÜLME SÜRECİ VE ÖTEKİLESTİRME EKSENİNDE İZLENEN DİL POLİTİKALARI
Süleyman Kaan YALÇIN (M.A.H.
Full Text Available Language is an object realized in two ways: spoken language and written language. Every language has the characteristics of a spoken language; however, not every language can have the characteristics of a written language, since there are certain requirements for a language to be deemed a written language. These requirements are selection, coding, standardization and becoming widespread. A language must meet these requirements, whether in a natural or an artificial way, to be deemed a written (standard) language. Turkish, which developed as a single written language until the 13th century, divided into West Turkish and North-East Turkish by meeting the requirements of a written language in a natural way. Following this separation, and through a natural process, it showed some internal differences; however, the policy of converting the spoken language of each Turkish clan into its own written language, pursued by Russia in a planned way, turned Turkish, which entered the 20th century as a few written languages, into 20 different written languages. The implementation of discriminatory language policies suggested to the Russian government by missionaries such as Ilminsky and Ostroumov, the forcible imposition on each Turkish clan of a Cyrillic alphabet full of different and unnecessary signs, and the othering activities of the Soviet boarding schools that were opened all had considerable effects on this process. This study aims to explain that the conversion of the spoken languages of the Turkish societies in Russia into written languages did not result from a natural process; to trace the historical development of the Turkish language, which was shaped into 20 separate written languages only because of the pressure exerted by political will; and to show how Russia subjected the concept of language, the memory of a nation, to an artificial process.
Reese, Richard M
If you are a Java programmer who wants to learn about the fundamental tasks underlying natural language processing, this book is for you. You will be able to identify and use NLP tasks for many common problems, and integrate them in your applications to solve more difficult problems. Readers should be familiar/experienced with Java software development.
Full Text Available In this paper the use and quality of the evaluative language produced by a bilingual child in a story-telling situation is analysed. The subject, an 11-year-old Finnish boy, Jimmy, is bilingual in Finnish Sign Language (FinSL) and spoken Finnish. He was born deaf but received a cochlear implant at the age of five. The data consist of a spoken and a signed version of “The Frog Story”. The analysis shows that evaluative devices and expressions differ in the spoken and signed stories told by the child. In his Finnish story he uses mostly lexical devices: comments on a character and the character’s actions, as well as quoted speech occasionally combined with prosodic features. In his FinSL story he uses both lexical and paralinguistic devices in a balanced way.
Wilder Yesid Escobar
Full Text Available Recognizing that developing the competences needed to use linguistic resources appropriately according to contextual characteristics (pragmatics) is as important as the culturally embedded linguistic knowledge itself (semantics), and that both are equally essential to forming competent speakers of English in foreign-language contexts, this research relies on corpus linguistics to analyze both the scope and the limitations of the sociolinguistic knowledge and communicative skills of English students at the university level. To that end, a linguistic corpus was assembled, compared to an existing corpus of native speakers, and analyzed in terms of the frequency, overuse, underuse, misuse, ambiguity, success, and failure of the linguistic parameters used in speech acts. The findings describe the linguistic configurations employed to modify levels and degrees of descriptions (a salient semantic theme exhibited in the EFL learners' corpus), appealing to the sociolinguistic principles governing meaning-making and language use, which are constructed under the social conditions of the environments where the language is naturally spoken for sociocultural exchange.
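Overuse and underuse relative to a native-speaker corpus are typically quantified with a keyness statistic such as Dunning's log-likelihood, which compares a word's observed frequency in each corpus against the frequency expected if usage were identical across both. A sketch of that calculation (the word and counts below are invented, not drawn from this study's corpus):

```python
import math

def log_likelihood(freq_a, total_a, freq_b, total_b):
    """Dunning's log-likelihood keyness for one word across two corpora.
    Larger values indicate a stronger frequency difference."""
    combined = freq_a + freq_b
    expected_a = total_a * combined / (total_a + total_b)
    expected_b = total_b * combined / (total_a + total_b)
    ll = 0.0
    for observed, expected in ((freq_a, expected_a), (freq_b, expected_b)):
        if observed > 0:
            ll += observed * math.log(observed / expected)
    return 2 * ll

# Hypothetical counts: "very" in a 10,000-word learner corpus
# versus a 100,000-word native-speaker reference corpus
score = log_likelihood(120, 10_000, 500, 100_000)
print(round(score, 2))  # values above 3.84 are significant at p < .05 (1 df)
```

Whether the word is over- or underused is read off the sign of the observed-minus-expected difference in the learner corpus; the statistic itself only measures how unlikely the difference is under identical usage.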
Considerable progress has been made in recent years in the development of dialogue systems that support robust and efficient human-machine interaction using spoken language. Spoken dialogue technology allows various interactive applications to be built and used for practical purposes, and research focuses on issues that aim to increase the system's communicative competence by including aspects of error correction, cooperation, multimodality, and adaptation in context. This book gives a comprehensive view of state-of-the-art techniques that are used to build spoken dialogue systems. It provides
De Angelis, Gessica
The present study adopts a multilingual approach to analysing the standardized test results of primary school immigrant children living in the bi-/multilingual context of South Tyrol, Italy. The standardized test results are from the Invalsi test administered across Italy in 2009/2010. In South Tyrol, several languages are spoken on a daily basis…
Li, Xiao-qing; Ren, Gui-qin
An event-related brain potentials (ERP) experiment was carried out to investigate how and when accentuation influences temporally selective attention and subsequent semantic processing during on-line spoken language comprehension, and how the effect of accentuation on attention allocation and semantic processing changed with the degree of…
Gollan, Tamar H.; Weissberger, Gali H.; Runnqvist, Elin; Montoya, Rosa I.; Cera, Cynthia M.
This study investigated correspondence between different measures of bilingual language proficiency contrasting self-report, proficiency interview, and picture naming skills. Fifty-two young (Experiment 1) and 20 aging (Experiment 2) Spanish-English bilinguals provided self-ratings of proficiency level, were interviewed for spoken proficiency, and…
Full Text Available Mismatch negativity (MMN), a primary response to an acoustic change and an index of sensory memory, was used to investigate the processing of the discrimination between familiar and unfamiliar consonant-vowel (CV) speech contrasts. The MMN was elicited by rare familiar words presented among repetitive unfamiliar words. Phonetic and phonological contrasts were identical in all conditions. The MMN elicited by the familiar word deviant was larger than that elicited by the unfamiliar word deviant. The presence of syllable contrast did not significantly alter the word-elicited MMN in amplitude or scalp voltage field distribution. Thus, our results indicate the existence of word-related MMN enhancement largely independent of the word status of the standard stimulus. This enhancement may reflect the presence of a long-term memory trace for familiar spoken words in tonal languages.
Zimmer, Patricia Moore
Describes the author's experiences directing a play translated and acted in Korean. Notes that she had to get familiar with the sound of the language spoken fluently, to see how an actor's thought is discerned when the verbal language is not understood. Concludes that so much of understanding and communication unfolds in ways other than with…
Massaro, Dominic W
I review 2 seminal research reports published in this journal during its second decade more than a century ago. Given psychology's subdisciplines, they would not normally be reviewed together because one involves reading and the other speech perception. The small amount of interaction between these domains might have limited research and theoretical progress. In fact, the 2 early research reports revealed common processes involved in these 2 forms of language processing. Their illustration of the role of Wundt's apperceptive process in reading and speech perception anticipated descriptions of contemporary theories of pattern recognition, such as the fuzzy logical model of perception. Based on the commonalities between reading and listening, one can question why they have been viewed so differently. It is commonly believed that learning to read requires formal instruction and schooling, whereas spoken language is acquired from birth onward through natural interactions with people who talk. Most researchers and educators believe that spoken language is acquired naturally from birth onward and even prenatally. Learning to read, on the other hand, is not possible until the child has acquired spoken language, reaches school age, and receives formal instruction. If an appropriate form of written text is made available early in a child's life, however, the current hypothesis is that reading will also be learned inductively and emerge naturally, with no significant negative consequences. If this proposal is true, it should soon be possible to create an interactive system, Technology Assisted Reading Acquisition, to allow children to acquire literacy naturally.
Social Security Administration — This data set provides quarterly volumes for language preferences at the national level of individuals filing claims for Retirement and Survivor benefits for fiscal...
Social Security Administration — This data set provides quarterly volumes for language preferences at the national level of individuals filing claims for Retirement and Survivor benefits from fiscal...
Krahmer, Emiel; Theune, Mariet
Natural language generation (NLG) is a subfield of natural language processing (NLP) that is often characterized as the study of automatically converting non-linguistic representations (e.g., from databases or other knowledge sources) into coherent natural language text. In recent years the field
Rämä, Pia; Sirri, Louah; Serres, Josette
Our aim was to investigate whether the developing language system, as measured by a priming task for spoken words, is organized by semantic categories. Event-related potentials (ERPs) were recorded during a priming task for spoken words in 18- and 24-month-old monolingual French-learning children. Spoken word pairs were either semantically related (e.g., train-bike) or unrelated (e.g., chicken-bike). The results showed that the N400-like priming effect occurred in 24-month-olds over the right parietal-occipital recording sites. In 18-month-olds the effect was observed similarly to 24-month-olds only in those children with higher word production ability. The results suggest that words are categorically organized in the mental lexicon of children at the age of 2 years, and even earlier in children with a high vocabulary. Copyright © 2013 Elsevier Inc. All rights reserved.
Kwon, Oh-Woog; Kim, Young-Kil; Lee, Yunkeun
This paper introduces a Dialog-Based Computer Assisted second-Language Learning (DB-CALL) system using task-oriented dialogue processing technology. The system promotes dialogue with a second-language learner for a specific task, such as purchasing tour tickets, ordering food, passing through immigration, etc. The dialog system plays a role of a…
Full Text Available The European ‘Lifelong Learning Programme’ (LLP) project ‘Games Online for Basic Language learning’ (GOBL) aimed to provide youths and adults wishing to improve their basic language skills with access to materials for the development of communicative...
Smolík, Filip; Stepankova, Hana; Vyhnálek, Martin; Nikolai, Tomáš; Horáková, Karolína; Matejka, Štepán
Purpose Propositional density (PD) is a measure of content richness in language production that declines in normal aging and more profoundly in dementia. The present study aimed to develop a PD scoring system for Czech and use it to compare PD in language productions of older people with amnestic mild cognitive impairment (aMCI) and control…
Pimperton, Hannah; Kreppner, Jana; Mahon, Merle; Stevenson, Jim; Terlektsi, Emmanouela; Worsfold, Sarah; Yuen, Ho Ming; Kennedy, Colin R
This study aimed to examine whether (a) exposure to universal newborn hearing screening (UNHS) and (b) early confirmation of hearing loss were associated with benefits to expressive and receptive language outcomes in the teenage years for a cohort of spoken language users. It also aimed to determine whether either of these two variables was associated with benefits to relative language gain from middle childhood to adolescence within this cohort. The participants were drawn from a prospective cohort study of a population sample of children with bilateral permanent childhood hearing loss, who varied in their exposure to UNHS and who had previously had their language skills assessed at 6-10 years. Sixty deaf or hard of hearing teenagers who were spoken language users and a comparison group of 38 teenagers with normal hearing completed standardized measures of their receptive and expressive language ability at 13-19 years. Teenagers exposed to UNHS did not show significantly better expressive (adjusted mean difference, 0.40; 95% confidence interval [CI], -0.26 to 1.05; d = 0.32) or receptive (adjusted mean difference, 0.68; 95% CI, -0.56 to 1.93; d = 0.28) language skills than those who were not. Those who had their hearing loss confirmed by 9 months of age did not show significantly better expressive (adjusted mean difference, 0.43; 95% CI, -0.20 to 1.05; d = 0.35) or receptive (adjusted mean difference, 0.95; 95% CI, -0.22 to 2.11; d = 0.42) language skills than those who had it confirmed later. In all cases, effect sizes were small and in favor of those exposed to UNHS or confirmed by 9 months. Subgroup analysis indicated larger beneficial effects of early confirmation for those deaf or hard of hearing teenagers without cochlear implants (N = 48; 80% of the sample), and these benefits were significant in the case of receptive language outcomes (adjusted mean difference, 1.55; 95% CI, 0.38 to 2.71; d = 0.78). Exposure to UNHS did not account for significant
Sarant, Julia; Harris, David; Bennet, Lisa; Bant, Sharyn
Although it has been established that bilateral cochlear implants (CIs) offer additional speech perception and localization benefits to many children with severe to profound hearing loss, whether these improved perceptual abilities facilitate significantly better language development has not yet been clearly established. The aims of this study were to compare language abilities of children having unilateral and bilateral CIs to quantify the rate of any improvement in language attributable to bilateral CIs and to document other predictors of language development in children with CIs. The receptive vocabulary and language development of 91 children was assessed when they were aged either 5 or 8 years old by using the Peabody Picture Vocabulary Test (fourth edition), and either the Preschool Language Scales (fourth edition) or the Clinical Evaluation of Language Fundamentals (fourth edition), respectively. Cognitive ability, parent involvement in children's intervention or education programs, and family reading habits were also evaluated. Language outcomes were examined by using linear regression analyses. The influence of elements of parenting style, child characteristics, and family background as predictors of outcomes were examined. Children using bilateral CIs achieved significantly better vocabulary outcomes and significantly higher scores on the Core and Expressive Language subscales of the Clinical Evaluation of Language Fundamentals (fourth edition) than did comparable children with unilateral CIs. Scores on the Preschool Language Scales (fourth edition) did not differ significantly between children with unilateral and bilateral CIs. Bilateral CI use was found to predict significantly faster rates of vocabulary and language development than unilateral CI use; the magnitude of this effect was moderated by child age at activation of the bilateral CI. In terms of parenting style, high levels of parental involvement, low amounts of screen time, and more time spent
Social Security Administration — This data set provides annual volumes for language preferences at the national level of individuals filing claims for SSI Blind and Disabled benefits from federal...
Social Security Administration — This data set provides annual volumes for language preferences at the national level of individuals filing claims for ESRD Medicare benefits for federal fiscal years...
Social Security Administration — This data set provides annual volumes for language preferences at the national level of individuals filing claims for SSI Aged benefits from federal fiscal year 2011...
Social Security Administration — This data set provides annual volumes for language preferences at the national level of individuals filing claims for ESRD Medicare benefits for federal fiscal years...
Social Security Administration — This data set provides annual volumes for language preferences at the national level of individuals filing claims for ESRD Medicare benefits for federal fiscal year...
Social Security Administration — This data set provides quarterly volumes for language preferences at the national level of individuals filing claims for SSI Aged benefits for fiscal years 2014 -...
Social Security Administration — This data set provides quarterly volumes for language preferences at the national level of individuals filing claims for ESRD Medicare benefits for fiscal years 2014...
Social Security Administration — This data set provides annual volumes for language preferences at the national level of individuals filing claims for ESRD Medicare benefits for federal fiscal year...
Schneider, Bruce A; Avivi-Reich, Meital; Daneman, Meredyth
Comprehending spoken discourse in noisy situations is likely to be more challenging to older adults than to younger adults due to potential declines in the auditory, cognitive, or linguistic processes supporting speech comprehension. These challenges might force older listeners to reorganize the ways in which they perceive and process speech, thereby altering the balance between the contributions of bottom-up versus top-down processes to speech comprehension. The authors review studies that investigated the effect of age on listeners' ability to follow and comprehend lectures (monologues), and two-talker conversations (dialogues), and the extent to which individual differences in lexical knowledge and reading comprehension skill relate to individual differences in speech comprehension. Comprehension was evaluated after each lecture or conversation by asking listeners to answer multiple-choice questions regarding its content. Once individual differences in speech recognition for words presented in babble were compensated for, age differences in speech comprehension were minimized if not eliminated. However, younger listeners benefited more from spatial separation than did older listeners. Vocabulary knowledge predicted the comprehension scores of both younger and older listeners when listening was difficult, but not when it was easy. However, the contribution of reading comprehension to listening comprehension appeared to be independent of listening difficulty in younger adults but not in older adults. The evidence suggests (1) that most of the difficulties experienced by older adults are due to age-related auditory declines, and (2) that these declines, along with listening difficulty, modulate the degree to which selective linguistic and cognitive abilities are engaged to support listening comprehension in difficult listening situations. When older listeners experience speech recognition difficulties, their attentional resources are more likely to be deployed to
Francis, Alexander L; Ho, Diana Wai Lam
There have been only two reports of multilingual cochlear implant users to date, and both of these were postlingually deafened adults. Here we report the case of a 6-year-old early-deafened child who is acquiring Cantonese, English and Mandarin in Hong Kong. He and two age-matched peers with similar educational backgrounds were tested using common, standardized tests of vocabulary and expressive and receptive language skills (Peabody Picture Vocabulary Test (Revised) and Reynell Developmental Language Scales version II). Results show that this child is acquiring Cantonese, English and Mandarin to a degree comparable to two classmates with normal hearing and similar educational and social backgrounds.
Klein, Evelyn R.; Armstrong, Sharon Lee; Shipon-Blum, Elisa
Children with selective mutism (SM) display a failure to speak in select situations despite speaking when comfortable. The purpose of this study was to obtain valid assessments of receptive and expressive language in 33 children (ages 5 to 12) with SM. Because some children with SM will speak to parents but not a professional, another purpose was…
Lexical sound symbolism in language appears to exploit the feature associations embedded in cross-sensory correspondences. For example, words incorporating relatively high acoustic frequencies (i.e., front/close rather than back/open vowels) are deemed more appropriate as names for concepts associated with brightness, lightness in weight,…
Nadkarni, Prakash M; Ohno-Machado, Lucila; Chapman, Wendy W
To provide an overview and tutorial of natural language processing (NLP) and modern NLP-system design. This tutorial targets the medical informatics generalist who has limited acquaintance with the principles behind NLP and/or limited knowledge of the current state of the art. We describe the historical evolution of NLP, and summarize common NLP sub-problems in this extensive field. We then provide a synopsis of selected highlights of medical NLP efforts. After providing a brief description of common machine-learning approaches that are being used for diverse NLP sub-problems, we discuss how modern NLP architectures are designed, with a summary of the Apache Foundation's Unstructured Information Management Architecture. We finally consider possible future directions for NLP, and reflect on the possible impact of IBM Watson on the medical field.
Werfel, Krystal L
The purpose of this study was to compare change in emergent literacy skills of preschool children with and without hearing loss over a 6-month period. Participants included 19 children with hearing loss and 14 children with normal hearing. Children with hearing loss used amplification and spoken language. Participants completed measures of oral language, phonological processing, and print knowledge twice at a 6-month interval. A series of repeated-measures analyses of variance were used to compare change across groups. Main effects of time were observed for all variables except phonological recoding. Main effects of group were observed for vocabulary, morphosyntax, phonological memory, and concepts of print. Interaction effects were observed for phonological awareness and concepts of print. Children with hearing loss performed more poorly than children with normal hearing on measures of oral language, phonological memory, and conceptual print knowledge. Two interaction effects were present. For phonological awareness and concepts of print, children with hearing loss demonstrated less positive change than children with normal hearing. Although children with hearing loss generally demonstrated positive growth in emergent literacy skills, their initial performance was lower than that of children with normal hearing, and their rates of change were not sufficient to catch up to their peers over time.
Recent studies of eye movements in world-situated language comprehension have demonstrated that rapid processing of morphosyntactic information – e.g., grammatical gender and number marking – can produce anticipatory eye movements to referents in the visual scene. We investigated how type of morphosyntactic information and the goals of language users in comprehension affected eye movements, focusing on the processing of grammatical number morphology in English-speaking adults. Participants' eye movements were recorded as they listened to simple English declarative (There are the lions.) and interrogative (Where are the lions?) sentences. In Experiment 1, no differences were observed in speed to fixate target referents when grammatical number information was informative relative to when it was not. The same result was obtained in a speeded task (Experiment 2) and in a task using mixed sentence types (Experiment 3). We conclude that grammatical number processing in English and eye movements to potential referents are not tightly coordinated. These results suggest limits on the role of predictive eye movements in concurrent linguistic and scene processing. We discuss how these results can inform and constrain predictive approaches to language processing.
Hassani, Kaveh; Lee, Won-Sook
A natural language interface exploits the conceptual simplicity and naturalness of the language to create a high-level user-friendly communication channel between humans and machines. One of the promising applications of such interfaces is generating visual interpretations of semantic content of a given natural language that can be then visualized either as a static scene or a dynamic animation. This survey discusses requirements and challenges of developing such systems and reports 26 graphi...
Natural language processing techniques for automatic test questions generation using discourse connectives. Journal of Computer Science and Its Application.
In principle, natural language and knowledge representation are closely related. This paper investigates this by demonstrating how several natural language phenomena, such as definite reference, ambiguity, ellipsis, ill-formed input, figures of speech, and vagueness, require diverse knowledge sources and reasoning. The breadth of kinds of knowledge needed to represent morphology, syntax, semantics, and pragmatics is surveyed. Furthermore, several current issues in knowledge representation, such as logic versus semantic nets, general-purpose versus special-purpose reasoners, adequacy of first-order logic, wait-and-see strategies, and default reasoning, are illustrated in terms of their relation to natural language processing and how natural language impacts these issues.
Mobile Speech and Advanced Natural Language Solutions provides a comprehensive and forward-looking treatment of natural speech in the mobile environment. This fourteen-chapter anthology brings together lead scientists from Apple, Google, IBM, AT&T, Yahoo! Research and other companies, along with academicians, technology developers and market analysts. They analyze the growing markets for mobile speech, new methodological approaches to the study of natural language, empirical research findings on natural language and mobility, and future trends in mobile speech. Mobile Speech opens with a challenge to the industry to broaden the discussion about speech in mobile environments beyond the smartphone, to consider natural language applications across different domains. Among the new natural language methods introduced in this book are Sequence Package Analysis, which locates and extracts valuable opinion-related data buried in online postings; microintonation as a way to make TTS truly human-like; and se...
Hovy, Eduard H
Recognizing that the generation of natural language is a goal- driven process, where many of the goals are pragmatic (i.e., interpersonal and situational) in nature, this book provides an overview of the role of pragmatics in language generation. Each chapter states a problem that arises in generation, develops a pragmatics-based solution, and then describes how the solution is implemented in PAULINE, a language generator that can produce numerous versions of a single underlying message, depending on its setting.
Chandrasekaran, Bharath; Kraus, Nina; Wong, Patrick C M
A challenge to learning words of a foreign language is encoding nonnative phonemes, a process typically attributed to cortical circuitry. Using multimodal imaging methods [functional magnetic resonance imaging-adaptation (fMRI-A) and auditory brain stem responses (ABR)], we examined the extent to which pretraining pitch encoding in the inferior colliculus (IC), a primary midbrain structure, related to individual variability in learning to successfully use nonnative pitch patterns to distinguish words in American English-speaking adults. fMRI-A indexed the efficiency of pitch representation localized to the IC, whereas ABR quantified midbrain pitch-related activity with millisecond precision. In line with neural "sharpening" models, we found that efficient IC pitch pattern representation (indexed by fMRI) related to superior neural representation of pitch patterns (indexed by ABR), and consequently more successful word learning following sound-to-meaning training. Our results establish a critical role for the IC in speech-sound representation, consistent with the established role for the IC in the representation of communication signals in other animal models.
Caminha, Guilherme Pilla; Melo Junior, José Tavares de; Hopkins, Claire; Pizzichini, Emilio; Pizzichini, Marcia Margaret Menezes
Rhinosinusitis is a highly prevalent disease and a major cause of high medical costs. It has been proven to have an impact on the quality of life through generic health-related quality of life assessments. However, generic instruments may not be able to factor in the effects of interventions and treatments. SNOT-22 is a major disease-specific instrument to assess quality of life for patients with rhinosinusitis. Nevertheless, there is still no validated SNOT-22 version in our country. Cross-cultural adaptation of the SNOT-22 into Brazilian Portuguese and assessment of its psychometric properties. The Brazilian version of the SNOT-22 was developed according to international guidelines and was broken down into nine stages: 1) Preparation 2) Translation 3) Reconciliation 4) Back-translation 5) Comparison 6) Evaluation by the author of the SNOT-22 7) Revision by committee of experts 8) Cognitive debriefing 9) Final version. Second phase: prospective study consisting of a verification of the psychometric properties, by analyzing internal consistency and test-retest reliability. Cultural adaptation showed adequate understanding, acceptability and psychometric properties. We followed the recommended steps for the cultural adaptation of the SNOT-22 into Portuguese language, producing a tool for the assessment of patients with sinonasal disorders of clinical importance and for scientific studies.
Colin, C; Zuinen, T; Bayard, C; Leybaert, J
Sign languages (SL), like oral languages (OL), organize elementary, meaningless units into meaningful semantic units. Our aim was to compare, at behavioral and neurophysiological levels, the processing of the location parameter in French Belgian SL to that of the rhyme in oral French. Ten hearing and 10 profoundly deaf adults performed a rhyme judgment task in OL and a similarity judgment on location in SL. Stimuli were pairs of pictures. As regards OL, deaf subjects' performances, although above chance level, were significantly lower than those of hearing subjects, suggesting that a metaphonological analysis is possible for deaf people but rests on phonological representations that are less precise than in hearing people. As regards SL, deaf subjects' scores indicated that a metaphonological judgment may be performed on location. The contingent negative variation (CNV) evoked by the first picture of a pair was similar in hearing subjects in OL and in deaf subjects in OL and SL. However, an N400 evoked by the second picture of the non-rhyming pairs was evidenced only in hearing subjects in OL. The absence of N400 in deaf subjects may be interpreted as the failure to associate two words according to their rhyme in OL or to their location in SL. Although deaf participants can perform metaphonological judgments in OL, they differ from hearing participants both behaviorally and in ERP. Judgment of location in SL is possible for deaf signers, but, contrary to rhyme judgment in hearing participants, does not elicit any N400. Copyright © 2013 Elsevier Masson SAS. All rights reserved.
Kwon, Youan; Choi, Sungmook; Lee, Yoonhyoung
This study examines whether orthographic information is used during prelexical processes in spoken word recognition by investigating ERPs during spoken word processing for Korean words. Differential effects due to orthographic syllable neighborhood size and sound-to-spelling consistency on P200 and N320 were evaluated by recording ERPs from 42 participants during a lexical decision task. The results indicate that P200 was smaller for words with many orthographic syllable neighbors than for words with few. In addition, a word with a large orthographic syllable neighborhood elicited a smaller N320 effect than a word with a small orthographic syllable neighborhood only when the word had inconsistent sound-to-spelling mapping. The results provide support for the assumption that orthographic information is used early during the prelexical spoken word recognition process. © 2015 Society for Psychophysiological Research.
Levison, Michael; Lessard, Gregory
Describes the natural language computer program, "Vinci." Explains that using an attribute grammar formalism, Vinci can simulate components of several current linguistic theories. Considers the design of the system and its applications in linguistic modelling and second language acquisition research. Notes Vinci's uses in linguistics…
Schreibman, Laura; Stahmer, Aubyn C
Presently there is no consensus on the specific behavioral treatment of choice for targeting language in young nonverbal children with autism. This randomized clinical trial compared the effectiveness of a verbally-based intervention, Pivotal Response Training (PRT) to a pictorially-based behavioral intervention, the Picture Exchange Communication System (PECS) on the acquisition of spoken language by young (2-4 years), nonverbal or minimally verbal (≤9 words) children with autism. Thirty-nine children were randomly assigned to either the PRT or PECS condition. Participants received on average 247 h of intervention across 23 weeks. Dependent measures included overall communication, expressive vocabulary, pictorial communication and parent satisfaction. Children in both intervention groups demonstrated increases in spoken language skills, with no significant difference between the two conditions. Seventy-eight percent of all children exited the program with more than 10 functional words. Parents were very satisfied with both programs but indicated PECS was more difficult to implement.
Sevens, Leen; Vandeghinste, Vincent; Schuurman, Ineke; Van Eynde, Frank
We present a Pictograph-to-Text translation system for people with Intellectual or Developmental Disabilities (IDD). The system translates pictograph messages, consisting of one or more pictographs, into Dutch text using WordNet links and an n-gram language model. We also provide several pictograph input methods assisting the users in selecting the appropriate pictographs.
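The n-gram ranking step described above can be sketched in a few lines. This is a generic illustration, not the authors' system: the toy Dutch corpus, the candidate word sequences, and the smoothing constants are all invented for the example.

```python
import math
from collections import defaultdict

def train_bigram(corpus):
    """Count bigrams over a toy corpus of whitespace-tokenized sentences."""
    counts = defaultdict(lambda: defaultdict(int))
    for sentence in corpus:
        tokens = ["<s>"] + sentence.split() + ["</s>"]
        for a, b in zip(tokens, tokens[1:]):
            counts[a][b] += 1
    return counts

def score(counts, sentence, alpha=1.0, vocab=1000):
    """Add-alpha smoothed bigram log-probability of a candidate sentence."""
    tokens = ["<s>"] + sentence.split() + ["</s>"]
    logp = 0.0
    for a, b in zip(tokens, tokens[1:]):
        total = sum(counts[a].values())
        logp += math.log((counts[a][b] + alpha) / (total + alpha * vocab))
    return logp

# Invented toy corpus and candidate translations of a pictograph message.
corpus = ["ik drink koffie", "ik drink thee", "ik eet brood"]
model = train_bigram(corpus)
candidates = ["ik drink koffie", "koffie drink ik"]
best = max(candidates, key=lambda c: score(model, c))
# The fluent ordering "ik drink koffie" outscores the scrambled one.
```

The same scoring scheme extends to ranking alternative lexicalizations of each pictograph, with the language model preferring the sequence that best matches the training text.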
This dissertation studies how people describe emotions with language and how computers can simulate this descriptive behavior. Although many non-human animals can express their current emotions as social signals, only humans can communicate about emotions symbolically. This symbolic communication of emotion allows us to talk about emotions that we…
The contributions in this volume focus on the Bayesian interpretation of natural languages, which is widely used in areas of artificial intelligence, cognitive science, and computational linguistics. This is the first volume to take up topics in Bayesian Natural Language Interpretation and make proposals based on information theory, probability theory, and related fields. The methodologies offered here extend to the target semantic and pragmatic analyses of computational natural language interpretation. Bayesian approaches to natural language semantics and pragmatics are based on methods from signal processing and the causal Bayesian models pioneered by especially Pearl. In signal processing, the Bayesian method finds the most probable interpretation by finding the one that maximizes the product of the prior probability and the likelihood of the interpretation. It thus stresses the importance of a production model for interpretation as in Grice's contributions to pragmatics or in interpretation by abduction.
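The maximization the volume describes, choosing the interpretation with the highest product of prior probability and likelihood, can be made concrete with a toy sketch. The interpretation labels and the probability values below are invented purely for illustration.

```python
def map_interpretation(utterance, interpretations, prior, likelihood):
    """Return the interpretation maximizing prior(i) * likelihood(utterance | i)."""
    return max(interpretations,
               key=lambda i: prior[i] * likelihood[(utterance, i)])

# Invented toy distributions: requests are a priori more common,
# and the indirect phrasing is far more likely under a request reading.
prior = {"request": 0.7, "statement": 0.3}
likelihood = {
    ("can you pass the salt", "request"): 0.9,
    ("can you pass the salt", "statement"): 0.2,
}

best = map_interpretation("can you pass the salt",
                          ["request", "statement"], prior, likelihood)
# 0.7 * 0.9 = 0.63 beats 0.3 * 0.2 = 0.06, so the request reading wins.
```

The prior here plays the role of the production model the volume emphasizes: it encodes what a speaker would plausibly intend before the signal is considered.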
This technical note provides a brief description of a Java library for Arabic (and also English) natural language processing (NLP), containing code for training and applying the Arabic NLP system described in Stephen Tratz's paper "A Cross-Task Flexible Transition Model for Arabic Tokenization, Affix…"
Berwick, Robert C; Friederici, Angela D; Chomsky, Noam; Bolhuis, Johan J
Language serves as a cornerstone for human cognition, yet much about its evolution remains puzzling. Recent research on this question parallels Darwin's attempt to explain both the unity of all species and their diversity. What has emerged from this research is that the unified nature of human language arises from a shared, species-specific computational ability. This ability has identifiable correlates in the brain and has remained fixed since the origin of language approximately 100 thousand years ago. Although songbirds share with humans a vocal imitation learning ability, with a similar underlying neural organization, language is uniquely human. Copyright © 2012 Elsevier Ltd. All rights reserved.
Monti, Martin M; Parsons, Lawrence M; Osherson, Daniel N
A central question in cognitive science is whether natural language provides combinatorial operations that are essential to diverse domains of thought. In the study reported here, we addressed this issue by examining the role of linguistic mechanisms in forging the hierarchical structures of algebra. In a 3-T functional MRI experiment, we showed that processing of the syntax-like operations of algebra does not rely on the neural mechanisms of natural language. Our findings indicate that processing the syntax of language elicits the known substrate of linguistic competence, whereas algebraic operations recruit bilateral parietal brain regions previously implicated in the representation of magnitude. This double dissociation argues against the view that language provides the structure of thought across all cognitive domains.
Andreasen, Troels; Styltsvig, Henrik Bulskov; Jensen, Per Anker
We describe a natural logic for computational reasoning with a regimented fragment of natural language. The natural logic comes with intuitive inference rules enabling deductions and with an internal graph representation facilitating conceptual path finding between pairs of terms as an approach t...
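The conceptual path finding mentioned above can be illustrated with a generic breadth-first search over a relation graph. The graph below is invented for the example and is not the authors' knowledge representation.

```python
from collections import deque

def find_path(graph, start, goal):
    """Return a shortest path of terms from start to goal, or None if absent."""
    queue = deque([(start, [start])])
    visited = {start}
    while queue:
        node, path = queue.popleft()
        if node == goal:
            return path
        for neighbor in graph.get(node, []):
            if neighbor not in visited:
                visited.add(neighbor)
                queue.append((neighbor, path + [neighbor]))
    return None

# Invented toy ontology: each term points to its more general terms.
graph = {
    "insulin": ["hormone"],
    "hormone": ["signaling substance"],
    "signaling substance": ["substance"],
}
path = find_path(graph, "insulin", "substance")
```

A path such as this one connects a pair of terms through intermediate concepts, which is the kind of deduction chain a natural logic can then verify with its inference rules.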
The current research examined how Arabic diglossia affects verbal learning memory. Thirty native Arab college students were tested using an auditory verbal memory test that was adapted from the Rey Auditory Verbal Learning Test and developed in three versions: a pure spoken language version (SL), a pure standard language version (SA), and a phonologically similar version (PS). The results showed that for immediate free recall, performance was better in the SL and PS conditions than in the SA condition. However, for delayed recall and recognition, the results did not reveal any significant consistent effect of diglossia. Accordingly, it was suggested that diglossia has a significant effect on storage and short-term memory functions but not on long-term memory functions. The results were discussed in light of different approaches in the field of bilingual memory.
Petitto, Laura Ann; Holowka, Siobhan
Examines whether early simultaneous bilingual language exposure causes children to be language delayed or confused. Cites research suggesting normal and parallel linguistic development occurs in each language in young children and young children's dual language developments are similar to monolingual language acquisition. Research on simultaneous…
Wagner, J C; Rogers, J E; Baud, R H; Scherrer, J R
A number of compositional Medical Concept Representation systems are being developed. Although these provide for a detailed conceptual representation of the underlying information, they have to be translated back to natural language for use by end-users and applications. The GALEN programme has been developing one such representation, and we report here on a tool developed to generate natural language phrases from the GALEN conceptual representations. This tool can be adapted to different source modelling schemes and to different destination languages or sublanguages of a domain. It is based on a multilingual approach to natural language generation, realised through a clean separation of the domain model from the linguistic model and their link by well defined structures. Specific knowledge structures and operations have been developed for bridging between the modelling 'style' of the conceptual representation and natural language. Using the example of the scheme developed for modelling surgical operative procedures within the GALEN-IN-USE project, we show how the generator is adapted to such a scheme. The basic characteristics of the surgical procedures scheme are presented together with the basic principles of the generation tool. Using worked examples, we discuss the transformation operations which change the initial source representation into a form which can more directly be translated to a given natural language. In particular, the linguistic knowledge which has to be introduced--such as definitions of concepts and relationships--is described. We explain the overall generator strategy and how particular transformation operations are triggered by language-dependent and conceptual parameters. Results are shown for generated French phrases corresponding to surgical procedures from the urology domain.
Reading plays a key role in education and communication in modern society. Learning to read establishes the connections between the visual word form area (VWFA) and language areas responsible for speech processing. Using resting-state functional connectivity (RSFC) and Granger Causality Analysis (GCA) methods, the current developmental study aimed to identify the difference in the relationship between the connections of VWFA-language areas and reading performance in both adults and children. The results showed that: (1) the spontaneous connectivity between VWFA and the spoken language areas, i.e., the left inferior frontal gyrus/supramarginal gyrus (LIFG/LSMG), was stronger in adults compared with children; (2) the spontaneous functional patterns of connectivity between VWFA and the language network were negatively correlated with reading ability in adults but not in children; (3) the causal influence from LIFG to VWFA was negatively correlated with reading ability in adults but not in children; (4) the RSFCs between the left posterior middle frontal gyrus (LpMFG) and VWFA/LIFG were positively correlated with reading ability in both adults and children; and (5) the causal influence from LIFG to LSMG was positively correlated with reading ability in both groups. These findings provide insights into the relationship between the VWFA and the language network for reading, and the role of the unique features of Chinese in the neural circuits of reading.
Theune, Mariet; Freedman, R.; Callaway, C.
This paper describes how a language generation system that was originally designed for monologue generation, has been adapted for use in the OVIS spoken dialogue system. To meet the requirement that in a dialogue, the system’s utterances should make up a single, coherent dialogue turn, several modifications had to be made to the system. The paper also discusses the influence of dialogue context on information status, and its consequences for the generation of referring expressions and accentu...
Waltz, David L
Natural language understanding is central to the goals of artificial intelligence. Any truly intelligent machine must be capable of carrying on a conversation: dialogue, particularly clarification dialogue, is essential if we are to avoid disasters caused by the misunderstanding of the intelligent interactive systems of the future. This book is an interim report on the grand enterprise of devising a machine that can use natural language as fluently as a human. What has really been achieved since this goal was first formulated in Turing's famous test? What obstacles still need to be overcome?
Sedgwick, Carole; Garner, Mark
Non-native speakers of English who hold nursing qualifications from outside the UK are required to provide evidence of English language competence by achieving a minimum overall score of Band 7 on the International English Language Testing System (IELTS) academic test. To describe the English language required to deal with the daily demands of nursing in the UK. To compare these abilities with the stipulated levels on the language test. A tracking study was conducted with 4 nurses, and focus groups with 11 further nurses. The transcripts of the interviews and focus groups were analysed thematically for recurrent themes. These findings were then compared with the requirements of the IELTS spoken test. The study was conducted outside the participants' working shifts in busy London hospitals. The participants in the tracking study were selected opportunistically; all were trained in non-English speaking countries. Snowball sampling was used for the focus groups, of whom 4 were non-native and 7 native speakers of English. In the tracking study, each of the 4 nurses was interviewed on four occasions, outside the workplace, and as close to the end of a shift as possible. They were asked to recount their spoken interactions during the course of their shift. The participants in the focus groups were asked to describe their typical interactions with patients, family members, doctors, and nursing colleagues. They were prompted to recall specific instances of frequently-occurring communication problems. All interactions were audio-recorded, with the participants' permission, and transcribed. Nurses are at the centre of communication for patient care. They have to use appropriate registers to communicate with a range of health professionals, patients and their families. They must elicit information, calm and reassure, instruct, check procedures, ask for and give opinions, agree and disagree. Politeness strategies are needed to avoid threats to face. They participate in medical
This book discusses the following: Computational Linguistics, Artificial Intelligence, Linguistics, Philosophy, and Cognitive Science and the current state of natural language understanding. Three topics form the focus for discussion; these topics include aspects of grammars, aspects of semantics/pragmatics, and knowledge representation.
The present dissertation reports on research into the nature of Pragmatic Language Impairment (PLI) in children aged 4 to 7 in the Netherlands. First, the possibility of screening for PLI in the general population is examined. Results show that this is indeed possible as well as feasible. Second, an
Many natural language dialogue systems make use of 'canned text' for output generation. This approach may be sufficient for dialogues in restricted domains where system utterances are short and simple and use fixed expressions (e.g., slot filling dialogues in the ticket reservation or travel
van Luin, J.; Nijholt, Antinus; op den Akker, Hendrikus J.A.; Giagourta, V.; Strintzis, M.G.
We describe our work on designing a natural language accessible navigation agent for a virtual reality (VR) environment. The agent is part of an agent framework, which means that it can communicate with other agents. Its navigation task consists of guiding the visitors in the environment and to
To identify the neural components that make a brain ready for language, it is important to have well-defined linguistic phenotypes, to know precisely what language is. There are two central features to language: the capacity to form signs (words), and the capacity to combine them into complex structures. We must determine how the human brain enables these capacities. A sign is a link between a perceptual form and a conceptual meaning. Acoustic elements and content elements are already brain-internal in non-human animals, but as categorical systems linked with brain-external elements. Being indexically tied to objects of the world, they cannot freely link to form signs. A crucial property of a language-ready brain is the capacity to process perceptual forms and contents offline, detached from any brain-external phenomena, so their "representations" may be linked into signs. These brain systems appear to have pleiotropic effects on a variety of phenotypic traits and not to be specifically designed for language. Syntax combines signs, so the combination of two signs operates simultaneously on their meaning and form. The operation combining the meanings long antedates its function in language: the primitive mode of predication operative in representing some information about an object. The combination of the forms is enabled by the capacity of the brain to segment vocal and visual information into discrete elements. Discrete temporal units have order and juxtaposition, and vocal units have intonation, length, and stress. These are primitive combinatorial processes. So the prior properties of the physical and conceptual elements of the sign introduce combinatoriality into the linguistic system, and from these primitive combinatorial systems derive concatenation in phonology and combination in morphosyntax. Given the nature of language, a key feature to our understanding of the language-ready brain is to be found in the mechanisms in human brains that enable the unique
Heger, A.S.; Koen, B.V.
A natural language interface has been developed for access to information from a data base, simulating a nuclear plant reliability data system (NPRDS), one of the several existing data bases serving the nuclear industry. In the last decade, the importance of information has been demonstrated by the impressive diffusion of data base management systems. The present methods that are employed to access data bases fall into two main categories of menu-driven systems and use of data base manipulation languages. Both of these methods are currently used by NPRDS. These methods have proven to be tedious, however, and require extensive training by the user for effective utilization of the data base. Artificial intelligence techniques have been used in the development of several intelligent front ends for data bases in nonnuclear domains. Lunar is a natural language program for interface to a data base describing moon rock samples brought back by Apollo. Intellect is one of the first data base question-answering systems that was commercially available in the financial area. Ladder is an intelligent data base interface that was developed as a management aid to Navy decision makers. A natural language interface for nuclear data bases that can be used by nonprogrammers with little or no training provides a means for achieving this goal for this industry
Kambayashi, Shaw; Uenaka, Junji
In this report, a natural language analyzer and two different task planning systems are described. In 1988, we introduced a Japanese language analyzer named CS-PARSER for the input interface of the task planning system in the Human Acts Simulation Program (HASP). For the purpose of high-speed analysis, we have modified the dictionary system of the CS-PARSER using a C language description. It is found that the new dictionary system is very useful for high-speed analysis and efficient maintenance of the dictionary. For the study of the task planning problem, we have modified a story generating system named Micro TALE-SPIN to generate a story written in Japanese sentences. We have also constructed a planning system with a natural language interface by using the CS-PARSER. Task planning processes and related knowledge bases of these systems are explained. A concept design for a new task planning system will also be discussed based on evaluations of the above-mentioned systems. (author)
Vandeventer Faltin, Anne
This paper illustrates the usefulness of natural language processing (NLP) tools for computer-assisted language learning (CALL) through the presentation of three NLP tools integrated within a CALL software package for French. These tools are (i) a sentence structure viewer; (ii) an error diagnosis system; and (iii) a conjugation tool. The sentence structure viewer helps language learners grasp the structure of a sentence by providing lexical and grammatical information. This information is derived from a deep syntactic analysis. Two different outputs are presented. The error diagnosis system is composed of a spell checker, a grammar checker, and a coherence checker. The spell checker makes use of alpha-codes, phonological reinterpretation, and some ad hoc rules to provide correction proposals. The grammar checker employs constraint relaxation and phonological reinterpretation as diagnosis techniques. The coherence checker compares the underlying "semantic" structures of a stored answer and of the learners' input to detect semantic discrepancies. The conjugation tool is a resource with enhanced capabilities when put into electronic format, enabling searches from inflected and ambiguous verb forms.
Social Security Administration — This data set provides quarterly volumes for language preferences at the national level of individuals filing claims for SSI Blind and Disabled benefits for fiscal...
Social Security Administration — This data set provides quarterly volumes for language preferences at the national level of individuals filing claims for SSI Blind and Disabled benefits from fiscal...
Reviews what is known about Esperanto as a home language and first language. Recorded cases of Esperanto-speaking families are known since 1919, and in nearly all of the approximately 350 families documented, the language is spoken to the children by the father. The data suggests that this "artificial bilingualism" can be as successful…
Cawsey, A J; Webber, B L; Jones, R B
Good communication is vital in health care, both among health care professionals and between health care professionals and their patients. And well-written documents, describing and/or explaining the information in structured databases, may be easier to comprehend, more edifying, and even more convincing than the structured data, even when presented in tabular or graphic form. Documents may be automatically generated from structured data, using techniques from the field of natural language generation. These techniques are concerned with how the content, organization and language used in a document can be dynamically selected, depending on the audience and context. They have been used to generate health education materials, explanations and critiques in decision support systems, and medical reports and progress notes.
Marshall, Chloë; Mason, Kathryn; Rowley, Katherine; Herman, Rosalind; Atkinson, Joanna; Woll, Bencie; Morgan, Gary
Children with specific language impairment (SLI) perform poorly on sentence repetition tasks across different spoken languages, but until now, this methodology has not been investigated in children who have SLI in a signed language. Users of a natural sign language encode different sentence meanings through their choice of signs and by altering…
Hovy, Dirk; Spruit, Shannon
Research in natural language processing (NLP) used to be mostly performed on anonymous corpora, with the goal of enriching linguistic analysis. Authors were either largely unknown or public figures. As we increasingly use more data from social media, this situation has changed: users are now individually identifiable, and the outcome of NLP experiments and applications can have a direct effect on their lives. This change should spawn a debate about the ethical implications of NLP, but until now, the internal discourse in the field has not followed the technological development. This position paper identifies a number of social implications that NLP research may have, and discusses their ethical significance, as well as ways to address them.
Ma, Weiyi; Zhou, Peng; Singh, Leher; Gao, Liqun
The majority of the world's languages rely on both segmental (vowels, consonants) and suprasegmental (lexical tones) information to contrast the meanings of individual words. However, research on early language development has mostly focused on the acquisition of vowel-consonant languages. Developmental research comparing sensitivity to segmental and suprasegmental features in young tone learners is extremely rare. This study examined 2- and 3-year-old monolingual tone learners' sensitivity to vowels and tones. Experiment 1a tested the influence of vowel and tone variation on novel word learning. Vowel and tone variation hindered word recognition efficiency in both age groups. However, tone variation hindered word recognition accuracy only in 2-year-olds, while 3-year-olds were insensitive to tone variation. Experiment 1b demonstrated that 3-year-olds could use tones to learn new words when additional support was provided, and additionally, that Tone 3 words were exceptionally difficult to learn. Experiment 2 confirmed a similar pattern of results when children were presented with familiar words. This study is the first to show that despite the importance of tones in tone languages, vowels maintain primacy over tones in young children's word recognition and that tone sensitivity in word learning and recognition changes between 2 and 3 years of age. The findings suggest that early lexical processes are more tightly constrained by variation in vowels than by tones. Copyright © 2016 Elsevier B.V. All rights reserved.
Chen, Pei-Hua; Liu, Ting-Wei
Telepractice provides an alternative form of auditory-verbal therapy (eAVT) intervention through videoconferencing; this can be of immense benefit for children with hearing loss, especially those living in rural or remote areas. The effectiveness of eAVT for the language development of Mandarin-speaking preschoolers with hearing loss was…
Cupples, Linda; Ching, Teresa Yc; Button, Laura; Seeto, Mark; Zhang, Vicky; Whitfield, Jessica; Gunnourie, Miriam; Martin, Louise; Marnane, Vivienne
This study investigated the factors influencing 5-year language, speech and everyday functioning of children with congenital hearing loss. Standardised tests including PLS-4, PPVT-4 and DEAP were directly administered to children. Parent reports on language (CDI) and everyday functioning (PEACH) were collected. Regression analyses were conducted to examine the influence of a range of demographic variables on outcomes. Participants were 339 children enrolled in the Longitudinal Outcomes of Children with Hearing Impairment (LOCHI) study. Children's average receptive and expressive language scores were approximately 1 SD below the mean of typically developing children, and scores on speech production and everyday functioning were more than 1 SD below. Regression models accounted for 23-70% of variance in scores across different tests. Earlier CI switch-on and higher non-verbal ability were associated with better outcomes in most domains. Earlier HA fitting and use of oral communication were associated with better outcomes on directly administered language assessments. Severity of hearing loss and maternal education influenced outcomes of children with HAs. The presence of additional disabilities affected outcomes of children with CIs. The findings provide strong evidence for the benefits of early HA fitting and early CI for improving children's outcomes.
Hoard, James E.
Integrating diverse information sources and application software in a principled and general manner will require a very capable advanced information management (AIM) system. In particular, such a system will need a comprehensive addressing scheme to locate the material in its docuverse. It will also need a natural language processing (NLP) system of great sophistication. It seems that the NLP system must serve three functions. First, it provides a natural language interface (NLI) for the users. Second, it serves as the core component that understands and makes use of the real-world interpretations (RWIs) contained in the docuverse. Third, it enables the reasoning specialists (RSs) to arrive at conclusions that can be transformed into procedures that will satisfy the users' requests. The best candidate for an intelligent agent that can satisfactorily make use of RSs and transform documents (TDs) appears to be an object oriented data base (OODB). OODBs have, apparently, an inherent capacity to use the large numbers of RSs and TDs that will be required by an AIM system and an inherent capacity to use them in an effective way.
Gevarter, William B.
Computer-based Natural Language Processing (NLP) is the key to enabling humans and their computer-based creations to interact with machines using natural languages (English, Japanese, German, etc.) rather than formal computer languages. NLP is a major research area in the fields of artificial intelligence and computational linguistics. Commercial…
Andreasen, Troels; Bulskov, Henrik; Nilsson, Jørgen Fischer
This paper makes a case for adopting appropriate forms of natural logic as the target language for computational reasoning with descriptive natural language. Natural logics are stylized fragments of natural language where reasoning can be conducted directly by natural reasoning rules reflecting intuitive reasoning in natural language. The approach taken in this paper is to extend natural logic stepwise with a view to covering successively larger parts of natural language. We envisage applications for computational querying and reasoning, in particular within the life sciences.
Fan, Christina Siu-Dschu; Zhu, Xingyu; Dosch, Hans Günter; von Stutterheim, Christiane; Rupp, André
In tonal languages, such as Mandarin Chinese, the pitch contour of vowels discriminates lexical meaning, which is not the case in non-tonal languages such as German. Recent data provide evidence that pitch processing is influenced by language experience. However, there are still many open questions concerning the representation of such phonological and language-related differences at the level of the auditory cortex (AC). Using magnetoencephalography (MEG), we recorded transient and sustained auditory evoked fields (AEF) in native Chinese and German speakers to investigate language related phonological and semantic aspects in the processing of acoustic stimuli. AEF were elicited by spoken meaningful and meaningless syllables, by vowels, and by a French horn tone. Speech sounds were recorded from a native speaker and showed frequency-modulations according to the pitch-contours of Mandarin. The sustained field (SF) evoked by natural speech signals was significantly larger for Chinese than for German listeners. In contrast, the SF elicited by a horn tone was not significantly different between groups. Furthermore, the SF of Chinese subjects was larger when evoked by meaningful syllables compared to meaningless ones, but there was no significant difference regarding whether vowels were part of the Chinese phonological system or not. Moreover, the N100m gave subtle but clear evidence that for Chinese listeners other factors than purely physical properties play a role in processing meaningful signals. These findings show that the N100 and the SF generated in Heschl's gyrus are influenced by language experience, which suggests that AC activity related to specific pitch contours of vowels is influenced in a top-down fashion by higher, language related areas. Such interactions are in line with anatomical findings and neuroimaging data, as well as with the dual-stream model of language of Hickok and Poeppel that highlights the close and reciprocal interaction between
Chen, Wei; Mostow, Jack; Aist, Gregory
Free-form spoken input would be the easiest and most natural way for young children to communicate to an intelligent tutoring system. However, achieving such a capability poses a challenge both to instruction design and to automatic speech recognition. To address the difficulties of accepting such input, we adopt the framework of predictable…
Waltz, D. L.; Maran, L. R.; Dorfman, M. H.; Dinitz, R.; Farwell, D.
During this contract period the authors have: (1) continued investigation of events and actions by means of representation schemes called 'event shape diagrams'; (2) written a parsing program which selects appropriate word and sentence meanings by a parallel process known as activation and inhibition; (3) begun investigation of the point of a story or event by modeling the motivations and emotional behaviors of story characters; (4) started work on combining and translating two machine-readable dictionaries into a lexicon and knowledge base which will form an integral part of our natural language understanding programs; (5) made substantial progress toward a general model for the representation of cognitive relations by comparing English scene and event descriptions with similar descriptions in other languages; (6) constructed a general model for the representation of tense and aspect of verbs; (7) made progress toward the design of an integrated robotics system which accepts English requests, and uses visual and tactile inputs in making decisions and learning new tasks.
This paper presents how to search for mathematical formulae written in MathML when given plain words as a query. Since the proposed method allows natural language queries, as in traditional Information Retrieval, for mathematical formula search, users do not need to enter complicated math symbols or use a formula input tool. For this, formula data is converted into plain texts, and features are extracted from the converted texts. In our experiments, we achieve strong performance, an MRR of 0.659. In addition, we introduce how to utilize formula classification for formula search. By using class information, we achieve an improved performance, an MRR of 0.690.
We propose a new application of quantum computing to the field of natural language processing. Ongoing work in this field attempts to incorporate grammatical structure into algorithms that compute meaning. In (Coecke, Sadrzadeh and Clark, 2010), the authors introduce such a model (the CSC model) based on tensor product composition. While this algorithm has many advantages, its implementation is hampered by the large classical computational resources that it requires. In this work we show how computational shortcomings of the CSC approach could be resolved using quantum computation (possibly in addition to existing techniques for dimension reduction). We address the value of quantum RAM (Giovannetti, 2008) for this model and extend an algorithm from Wiebe, Braun and Lloyd (2012) into a quantum algorithm to categorize sentences in CSC. Our new algorithm demonstrates a quadratic speedup over classical methods under certain conditions.
Modeling the entailment relation over sentences is one of the generic problems of natural language understanding. In order to account for this problem, we design a theorem prover for Natural Logic, a logic whose terms resemble natural language expressions. The prover is based on an analytic tableau
Šimáčková, Š.; Podlipský, V.J.; Chládková, K.
As a western Slavic language of the Indo-European family, Czech is closest to Slovak and Polish. It is spoken as a native language by nearly 10 million people in the Czech Republic (Czech Statistical Office n.d.). About two million people living abroad, mostly in the USA, Canada, Austria, Germany,
Initiated in 2004 at Defense Research and Development Canada (DRDC), the SACOT knowledge engineering research project is currently investigating, developing and validating innovative natural language processing (NLP...
In recent years, there has been a growing debate in the United States, Europe, and Australia about the nature of the Deaf community as a cultural community, and the recognition of signed languages as “real” or “legitimate” languages comparable in all meaningful ways to spoken languages. An important element of this ...
Langkopf, B.S.; Mallory, L.H.
A scientific data base, the Tuff Data Base, is being created at Sandia National Laboratories on the Cyber 170/855, using System 2000. It is being developed for use by scientists and engineers investigating the feasibility of locating a high-level radioactive waste repository in tuff (a type of volcanic rock) at Yucca Mountain on and adjacent to the Nevada Test Site. This project, the Nevada Nuclear Waste Storage Investigations (NNWSI) Project, is managed by the Nevada Operations Office of the US Department of Energy. A user-friendly interface, PRIMER, was developed that uses the Self-Contained Facility (SCF) command SUBMIT and System 2000 Natural Language functions and parametric strings that are schema resident. The interface was designed to: (1) allow users, with or without computer experience or keyboard skill, to sporadically access data in the Tuff Data Base; (2) produce retrieval capabilities for the user quickly; and (3) acquaint the users with the data in the Tuff Data Base. This paper gives a brief description of the Tuff Data Base schema and the interface, PRIMER, which is written in Fortran V.
The Policy-Based Management Natural Language Parser (PBEM) is a rules-based approach to enterprise management that can be used to automate certain management tasks. This parser simplifies the management of a given endeavor by establishing policies to deal with situations that are likely to occur. Policies are operating rules that can be referred to as a means of maintaining order, security, consistency, or other ways of successfully furthering a goal or mission. PBEM provides a way of managing configuration of network elements, applications, and processes via a set of high-level rules or business policies rather than managing individual elements, thus switching the control to a higher level. This software allows unique management rules (or commands) to be specified and applied to a cross-section of the Global Information Grid (GIG). This software embodies a parser that is capable of recognizing and understanding conversational English. Because all possible dialect variants cannot be anticipated, a unique capability was developed that parses based on conversational intent rather than the exact way the words are used. This software can increase productivity by enabling a user to converse with the system in conversational English to define network policies. PBEM can be used in both manned and unmanned science-gathering programs. Because policy statements can be domain-independent, this software can be applied equally to a wide variety of applications.
Paul H Thibodeau
Metaphors pervade discussions of social issues like climate change, the economy, and crime. We ask how natural language metaphors shape the way people reason about such social issues. In previous work, we showed that describing crime metaphorically as a beast or a virus led people to generate different solutions to a city's crime problem. In the current series of studies, instead of asking people to generate a solution on their own, we provided them with a selection of possible solutions and asked them to choose the best ones. We found that metaphors influenced people's reasoning even when they had a set of options available to compare and select among. These findings suggest that metaphors can influence not just what solution comes to mind first, but also which solution people think is best, even when given the opportunity to explicitly compare alternatives. Further, we tested whether participants were aware of the metaphor. We found that very few participants thought the metaphor played an important part in their decision. Moreover, participants who had no explicit memory of the metaphor were just as affected by it as participants who were able to remember the metaphorical frame. These findings suggest that metaphors can act covertly in reasoning. Finally, we examined the role of political affiliation in reasoning about crime. The results confirm our previous findings that Republicans are more likely to generate enforcement and punishment solutions for dealing with crime, and are less swayed by metaphor than are Democrats or Independents.
Dethlefs, Nina; Milders, Maarten; Cuayáhuitl, Heriberto; Al-Salkini, Turkey; Douglas, Lorraine
Currently, an estimated 36 million people worldwide are affected by Alzheimer's disease or related dementias. In the absence of a cure, non-pharmacological interventions, such as cognitive stimulation, which slow down the rate of deterioration can benefit people with dementia and their caregivers. Such interventions have shown to improve well-being and slow down the rate of cognitive decline. It has further been shown that cognitive stimulation in interaction with a computer is as effective as with a human. However, the need to operate a computer often represents a difficulty for the elderly and stands in the way of widespread adoption. A possible solution to this obstacle is to provide a spoken natural language interface that allows people with dementia to interact with the cognitive stimulation software in the same way as they would interact with a human caregiver. This makes the assistive technology accessible to users regardless of their technical skills and provides a fully intuitive user experience. This article describes a pilot study that evaluated the feasibility of computer-based cognitive stimulation through a spoken natural language interface. Prototype software was evaluated with 23 users, including healthy elderly people and people with dementia. Feedback was overwhelmingly positive.
When we think of everyday language use, the first things that come to mind include colloquial conversations, reading and writing e-mails, sending text messages or reading a book. But can we study the brain basis of language as we use it in our daily lives? As a topic of study, the cognitive
The performance of deep learning in natural language processing has been spectacular, but the reasons for this success remain unclear because of the inherent complexity of deep learning. This paper provides empirical evidence of its effectiveness and of a limitation of neural networks for language engineering. Precisely, we demonstrate that a neural language model based on long short-term memory (LSTM) effectively reproduces Zipf's law and Heaps' law, two representative statistical properties underlying natural language. We discuss the quality of reproducibility and the emergence of Zipf's law and Heaps' law as training progresses. We also point out that the neural language model has a limitation in reproducing long-range correlation, another statistical property of natural language. This understanding could provide a direction for improving the architectures of neural networks.
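The two statistical regularities named in this abstract are straightforward to measure on any token stream, real or model-generated; a minimal sketch (with an invented toy text standing in for a corpus) is:

```python
from collections import Counter

def zipf_rank_freq(tokens):
    """Return (rank, frequency) pairs; Zipf's law predicts freq ~ C / rank."""
    counts = Counter(tokens)
    freqs = sorted(counts.values(), reverse=True)
    return list(enumerate(freqs, start=1))

def heaps_curve(tokens):
    """Vocabulary size after each token; Heaps' law predicts |V| ~ k * n^beta."""
    seen, curve = set(), []
    for tok in tokens:
        seen.add(tok)
        curve.append(len(seen))
    return curve

text = "the cat sat on the mat and the dog sat on the log".split()
print(zipf_rank_freq(text)[:3])  # most frequent ranks first
print(heaps_curve(text))         # vocabulary growth over the stream
```

On a real corpus, plotting both quantities on log-log axes should give the near-straight lines the two laws predict.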
Hamon, Thierry; Mougin, Fleur; Grabar, Natalia
With the recent and intensive research in the biomedical area, the knowledge accumulated is disseminated through various knowledge bases. Links between these knowledge bases are needed in order to use them jointly. Linked Data, the SPARQL language, and natural language question-answering interfaces provide interesting solutions for querying such knowledge bases. We propose a method for translating natural language questions into SPARQL queries. We use Natural Language Processing tools, semantic resources, and the RDF triples description. The method is designed on 50 questions over 3 biomedical knowledge bases, and evaluated on 27 questions. It achieves 0.78 F-measure on the test set. The method for translating natural language questions into SPARQL queries is implemented as a Perl module available at http://search.cpan.org/thhamon/RDF-NLP-SPARQLQuery.
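The paper's actual pipeline relies on NLP tools, semantic resources, and the RDF triple descriptions of the target knowledge bases; as a hedged illustration of the general idea only, a single question pattern can be mapped to a SPARQL template like this (the question pattern and the `ex:hasSideEffect` predicate are invented, not from the paper):

```python
import re

# Illustrative template; the predicate IRI is a made-up example.
TEMPLATE = """SELECT ?answer WHERE {{
  ?drug rdfs:label "{drug}" .
  ?drug ex:hasSideEffect ?answer .
}}"""

def question_to_sparql(question):
    """Map one hypothetical question pattern to a SPARQL query string."""
    m = re.match(r"What are the side effects of (\w+)\?", question)
    if not m:
        raise ValueError("unsupported question pattern")
    return TEMPLATE.format(drug=m.group(1))

print(question_to_sparql("What are the side effects of aspirin?"))
```

A real system generalizes this by parsing the question, resolving entities against semantic resources, and selecting predicates from the knowledge base's schema rather than from fixed templates.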
Dougherty, Ray C
This book's main goal is to show readers how to use the linguistic theory of Noam Chomsky, called Universal Grammar, to represent English, French, and German on a computer using the Prolog computer language. In so doing, it presents a follow-the-dots approach to natural language processing, linguistic theory, artificial intelligence, and expert systems. The basic idea is to introduce meaningful answers to significant problems involved in representing human language data on a computer. The book offers a hands-on approach to anyone who wishes to gain a perspective on natural language
Zou, Lijuan; Desroches, Amy S.; Liu, Youyi; Xia, Zhichao; Shu, Hua
Orthographic influences in spoken word recognition have been previously examined in alphabetic languages. However, it is unknown whether orthographic information affects spoken word recognition in Chinese, which has a clean dissociation between orthography (O) and phonology (P). The present study investigated orthographic effects using event…
Conner, Peggy S.
A high percentage of individuals with dyslexia struggle to learn unfamiliar spoken words, creating a significant obstacle to foreign language learning after early childhood. The origin of spoken-word learning difficulties in this population, generally thought to be related to the underlying literacy deficit, is not well defined (e.g., Di Betta…
Dominick, Wayne D. (Editor); Liu, I-Hsiung
The currently developed user language interfaces of information systems are generally intended for serious users. These interfaces commonly ignore what is potentially the largest user group, i.e., casual users. This project discusses the concepts and implementation of a natural query language system that satisfies the nature and information needs of casual users by allowing them to communicate with the system in their native (natural) language. In addition, a framework for the development of such an interface is introduced for the MADAM (Multics Approach to Data Access and Management) system at the University of Southwestern Louisiana.
Loukina, Anastassia; Buzick, Heather
This study is an evaluation of the performance of automated speech scoring for speakers with documented or suspected speech impairments. Given that the use of automated scoring of open-ended spoken responses is relatively nascent and there is little research to date that includes test takers with disabilities, this small exploratory study focuses…
A proposal to transform Spanish into a universal language because it possesses the prerequisites: it is a living language, spoken in several countries; it is a natural language; and it uses the ordinary alphabet. Details on simplification and standardization are given. (Text is in Spanish.) (AMH)
May 26, 2018 ... represent, and store information in a natural-language-independent format. UNL is ... account semantic information available in words of the problem ... Sentiment Analysis (SA) plays a vital role in the decision-making process.
Recent mathematical and algorithmic results in the field of finite-state technology, as well as the increase in computing power, have laid the foundation for a new approach to natural language processing. However, the task of creating an appropriate model that would describe the phenomena of natural language is still to be achieved. In this paper I present some notions related to the finite-state modelling of syntax and morphology.
We apply Natural Language Processing (NLP) tools to a unique database of text documents collected by Whiteside (2014). His collection ... from Arabic to English. Compared to other terrorism databases, Whiteside's collection methodology limits the scope of the database and avoids coding
Garfield, D A; Rapp, C; Evens, M
The potential benefit of artificial intelligence (AI) technology as a tool of psychiatry has not been well defined. In this essay, the technology of natural language processing and its position with regard to the two main schools of AI is clearly outlined. Past experiments utilizing AI techniques in understanding psychopathology are reviewed. Natural language processing can automate the analysis of transcripts and can be used in modeling theories of language comprehension. In these ways, it can serve as a tool in testing psychological theories of psychopathology and can be used as an effective tool in empirical research on verbal behavior in psychopathology.
Using contemporary science, the paper builds on Wittgenstein's views of human language. Rather than ascribing reality to inscription-like entities, it links embodiment with distributed cognition. The verbal or (quasi) technological aspect of language is traced not to action but to human-specific interactivity. This species-specific form of sense-making sustains, among other things, using texts, making/construing phonetic gestures, and thinking. Human action is thus grounded in appraisals or sense-saturated coordination. To illustrate interactivity at work, the paper focuses on a case study. Over 11 seconds, a crime scene investigator infers that she is probably dealing with an inside job: she uses not words, but intelligent gaze. This connects professional expertise to circumstances and the feeling of thinking. It is suggested that, as for other species, human appraisal is based in synergies. However, since...
Currently, the concept of spoken grammar has been mentioned among Chinese teachers. However, teachers in China still have a vague idea of spoken grammar. This dissertation therefore examines what spoken grammar is and argues that native speakers' model of spoken grammar needs to be highlighted in classroom teaching.
Gevarter, W. B.
Computer-based Natural Language Processing (NLP) is the key to enabling humans and their computer-based creations to interact with machines in natural language (like English, Japanese, German, etc., in contrast to formal computer languages). The doors that such an achievement can open have made this a major research area in Artificial Intelligence and Computational Linguistics. Commercial natural language interfaces to computers have recently entered the market and the future looks bright for other applications as well. This report reviews the basic approaches to such systems, the techniques utilized, applications, the state of the art of the technology, issues and research requirements, the major participants and, finally, future trends and expectations. It is anticipated that this report will prove useful to engineering and research managers, potential users, and others who will be affected by this field as it unfolds.
Olive, Joseph P; McCary, John
This comprehensive handbook, written by leading experts in the field, details the groundbreaking research conducted under the breakthrough GALE (Global Autonomous Language Exploitation) program of the Defense Advanced Research Projects Agency (DARPA), while placing it in the context of previous research in the fields of natural language and signal processing, artificial intelligence and machine translation. The most fundamental contrast between GALE and its predecessor programs was its holistic integration of previously separate or sequential processes. In earlier language research pro
Trujillo, Anna C.; Meszaros, Erica
With a relatively well-defined and simple vocabulary, the operator can input the vast majority of the mission parameters using simple, intuitive voice commands. However, voice input may be more applicable to initial mission specification rather than for critical commands such as the need to land immediately due to time and feedback constraints. It would also be convenient to retrieve relevant mission information using voice input. Therefore, further on-going research is looking at using intent from operator utterances to provide the relevant mission information to the operator. The information displayed will be inferred from the operator's utterances just before key phrases are spoken. Linguistic analysis of the context of verbal communication provides insight into the intended meaning of commonly heard phrases such as "What's it doing now?" Analyzing the semantic sphere surrounding these common phrases enables us to predict the operator's intent and supply the operator's desired information to the interface. This paper also describes preliminary investigations into the generation of the semantic space of UAV operation and the success at providing information to the interface based on the operator's utterances.
Hiemstra, Djoerd; de Jong, Franciska M.G.
Traditionally, natural language processing techniques for information retrieval have always been studied outside the framework of formal models of information retrieval. In this article, we introduce a new formal model of information retrieval based on the application of statistical language models.
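A query-likelihood model, one common form of the statistical language models the abstract refers to, can be sketched in a few lines (the toy corpus and smoothing weight are invented for illustration):

```python
from collections import Counter

def score(query, doc_tokens, collection_tokens, lam=0.5):
    """Query likelihood with Jelinek-Mercer smoothing:
    P(q|d) = product over query terms of lam*P(t|d) + (1-lam)*P(t|C)."""
    d, c = Counter(doc_tokens), Counter(collection_tokens)
    p = 1.0
    for t in query:
        p_doc = d[t] / len(doc_tokens)
        p_col = c[t] / len(collection_tokens)
        p *= lam * p_doc + (1 - lam) * p_col
    return p

docs = [["information", "retrieval", "models"],
        ["spoken", "language", "processing"]]
collection = [t for d in docs for t in d]  # background collection model
query = ["language", "processing"]
ranked = sorted(docs, key=lambda d: score(query, d, collection), reverse=True)
print(ranked[0])
```

Smoothing against the collection model is what keeps a single unseen query term from zeroing out a document's score.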
Widemann, David P. [Lawrence Livermore National Lab. (LLNL), Livermore, CA (United States); Wang, Eric X. [Lawrence Livermore National Lab. (LLNL), Livermore, CA (United States); Thiagarajan, Jayaraman J. [Lawrence Livermore National Lab. (LLNL), Livermore, CA (United States)
We present a novel Recoverable Order-Preserving Embedding (ROPE) of natural language. ROPE maps natural language passages from sparse concatenated one-hot representations to distributed vector representations of predetermined fixed length. We use Euclidean distance to return search results that are both grammatically and semantically similar. ROPE is based on a series of random projections of distributed word embeddings. We show that our technique typically forms a dictionary with sufficient incoherence such that sparse recovery of the original text is possible. We then show how our embedding allows for efficient and meaningful natural search and retrieval on Microsoft’s COCO dataset and the IMDB Movie Review dataset.
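ROPE itself (random projections of trained word embeddings with sparse-recovery guarantees) is beyond a few lines, but the search idea, mapping passages to fixed-length vectors and ranking by Euclidean distance, can be sketched with pseudo-random stand-in word vectors (everything below is an illustrative assumption, not the paper's construction):

```python
import math, random

DIM = 16  # fixed embedding length

def word_vec(word):
    """Deterministic pseudo-random vector per word, a stand-in
    for a distributed word embedding."""
    rng = random.Random(word)
    return [rng.gauss(0, 1) for _ in range(DIM)]

def embed(passage):
    """Sum of word vectors: a fixed-length passage embedding."""
    vecs = [word_vec(w) for w in passage.split()]
    return [sum(col) for col in zip(*vecs)]

def euclidean(a, b):
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

corpus = ["a dog runs in the park",
          "stock prices fell sharply",
          "a puppy runs in the park"]
query = embed("a dog runs in the park")
best = min(corpus, key=lambda p: euclidean(embed(p), query))
print(best)
```

With real word embeddings rather than random stand-ins, near-synonymous passages ("dog"/"puppy") would also land close together, which is the behavior the paper exploits.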
Shook, Anthony; Goldrick, Matthew; Engstler, Caroline; Marian, Viorica
When bilinguals process written language, they show delays in accessing lexical items relative to monolinguals. The present study investigated whether this effect extended to spoken language comprehension, examining the processing of sentences with either low or high semantic constraint in both first and second languages. English-German…
How human language arose is a mystery in the evolution of Homo sapiens. Miyagawa, Berwick, & Okanoya (Frontiers, 2013) put forward a proposal, which we will call the Integration Hypothesis of human language evolution, which holds that human language is composed of two components, E for expressive, and L for lexical. Each component has an antecedent in nature: E as found, for example, in birdsong, and L in, for example, the alarm calls of monkeys. E and L integrated uniquely in humans to give rise to language. A challenge to the Integration Hypothesis is that while these non-human systems are finite-state in nature, human language is known to require characterization by a non-finite state grammar. Our claim is that E and L, taken separately, are finite-state; when a grammatical process crosses the boundary between E and L, it gives rise to the non-finite state character of human language. We provide empirical evidence for the Integration Hypothesis by showing that certain processes found in contemporary languages that have been characterized as non-finite state in nature can in fact be shown to be finite-state. We also speculate on how human language actually arose in evolution through the lens of the Integration Hypothesis.
Névéol, Aurélie; Dalianis, Hercules; Velupillai, Sumithra; Savova, Guergana; Zweigenbaum, Pierre
Natural language processing applied to clinical text or aimed at a clinical outcome has been thriving in recent years. This paper offers the first broad overview of clinical Natural Language Processing (NLP) for languages other than English. Recent studies are summarized to offer insights and outline opportunities in this area. We envision three groups of intended readers: (1) NLP researchers leveraging experience gained in other languages, (2) NLP researchers faced with establishing clinical text processing in a language other than English, and (3) clinical informatics researchers and practitioners looking for resources in their languages in order to apply NLP techniques and tools to clinical practice and/or investigation. We review work in clinical NLP in languages other than English. We classify these studies into three groups: (i) studies describing the development of new NLP systems or components de novo, (ii) studies describing the adaptation of NLP architectures developed for English to another language, and (iii) studies focusing on a particular clinical application. We show the advantages and drawbacks of each method, and highlight the appropriate application context. Finally, we identify major challenges and opportunities that will affect the impact of NLP on clinical practice and public health studies in a context that encompasses English as well as other languages.
Qu, Qingqing; Damian, Markus F
Extensive evidence from alphabetic languages demonstrates a role of orthography in the processing of spoken words. Because alphabetic systems explicitly code speech sounds, such effects are perhaps not surprising. However, it is less clear whether orthographic codes are involuntarily accessed from spoken words in languages with non-alphabetic systems, in which the sound-spelling correspondence is largely arbitrary. We investigated the role of orthography via a semantic relatedness judgment task: native Mandarin speakers judged whether or not spoken word pairs were related in meaning. Word pairs were either semantically related, orthographically related, or unrelated. Results showed that relatedness judgments were made faster for word pairs that were semantically related than for unrelated word pairs. Critically, orthographic overlap on semantically unrelated word pairs induced a significant increase in response latencies. These findings indicate that orthographic information is involuntarily accessed in spoken-word recognition, even in a non-alphabetic language such as Chinese.
Distant supervision is a recent trend in information extraction. Distantly-supervised extractors are trained using a corpus of unlabeled text ... consists of fill-in-the-blank natural language questions such as “Incan emperor” or “Cunningham directed Auchtre’s second music video.” These questions ... with an unknown knowledge base, simultaneously learning how to semantically parse language and populate the knowledge base. The weakly
Gevarter, W. B.
An overview of artificial intelligence (AI), its core ingredients, and its applications is presented. The knowledge representation, logic, problem solving approaches, languages, and computers pertaining to AI are examined, and the state of the art in AI is reviewed. The use of AI in expert systems, computer vision, natural language processing, speech recognition and understanding, speech synthesis, problem solving, and planning is examined. Basic AI topics, including automation, search-oriented problem solving, knowledge representation, and computational logic, are discussed.
Antonio Gisolfi; Enrico Fischetti
The aim of this paper is to show that with a subset of a natural language, simple systems running on PCs can be developed that can nevertheless be an effective tool for interfacing purposes in the building of an Intelligent Tutoring System (ITS). After presenting the special characteristics of the Smalltalk/V language, which provides an appropriate environment for the development of an interface, the overall architecture of the interface module is discussed. We then show how sentences are par...
Researchers, motivated by the need to improve the efficiency of natural language processing tools to handle web-scale data, have recently arrived at models that remarkably match the expected features of human language processing under the Now-or-Never bottleneck framework. This provides additional support for said framework and highlights the research potential in the interaction between applied computational linguistics and cognitive science.
This paper introduces natural language expressions and expert subjectivity into system reliability analysis. To this end, it defines a subjective measure of reliability and presents a method of system reliability analysis using this measure. The subjective measure of reliability corresponds to natural language expressions of reliability estimation and is represented by a fuzzy set defined on [0,1]. The presented method deals with the dependence among subsystems and employs parametrized operations on subjective measures of reliability which can reflect the expert's subjectivity towards the analyzed system. The analysis results are also expressed in linguistic terms. Finally, this paper gives an example of system reliability analysis by the presented method.
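As a rough illustration only (not the paper's parametrized operations), linguistic reliability terms can be modelled as triangular fuzzy numbers on [0,1] and combined for independent subsystems in series; the term set and combination rule below are invented assumptions:

```python
# Linguistic terms as triangular fuzzy numbers (low, peak, high) on [0,1].
TERMS = {
    "very reliable":     (0.8, 0.9, 1.0),
    "fairly reliable":   (0.6, 0.75, 0.9),
    "somewhat reliable": (0.4, 0.55, 0.7),
}

def series(x, y):
    """Two independent subsystems in series: a common fuzzy-arithmetic
    approximation multiplies lows, peaks, and highs pointwise."""
    return tuple(round(a * b, 3) for a, b in zip(x, y))

r = series(TERMS["very reliable"], TERMS["fairly reliable"])
print(r)
```

The result is itself a triangular fuzzy number, which can be mapped back to the closest linguistic term to report the system-level estimate in natural language.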
Learning to rank refers to machine learning techniques for training a model in a ranking task. Learning to rank is useful for many applications in information retrieval, natural language processing, and data mining. Intensive studies have been conducted on its problems recently, and significant progress has been made. This lecture gives an introduction to the area including the fundamental problems, major approaches, theories, applications, and future work.The author begins by showing that various ranking problems in information retrieval and natural language processing can be formalized as tw
There are several individual learner factors that affect second language acquisition: age, learning style, aptitude, motivation, and personality. This research is about the English language acquisition of a four-year-old child by nature and nurture. The child acquired her second language at home and also in one of the courses in Jakarta. She was schooled by her parents so that she would be able to speak English well as a target language for her future. The purpose of this paper is to examine individual learner differences, especially in using English as a second language. This study is a library research; the data were collected, recorded, transcribed, and analyzed descriptively. The results can be concluded as follows: the child is able to communicate well and to construct simple sentences, complex sentences, sentence statements, and phrase questions, and to explain something when her teacher asks her at school. She is able to communicate by making well-formed simple or compound sentences (two or three clauses), even though she does not yet consistently use the past tense form and sometimes forgets the bound morpheme -s in the third person singular, but she can use turn-taking in her utterances. Second language acquisition is a very long process. The family and teacher should participate and assist the child; as this case shows, a child can learn the first and the second language at the same time.
Anne E. Thessen
A computer can handle the volume but cannot make sense of the language. This paper reviews and discusses the use of natural language processing (NLP) and machine-learning algorithms to extract information from systematic literature. NLP algorithms have been used for decades, but require special development for application in the biological realm due to the special nature of the language. Many tools exist for biological information extraction (cellular processes, taxonomic names, and morphological characters), but none have been applied life-wide and most still require testing and development. Progress has been made in developing algorithms for automated annotation of taxonomic text, identification of taxonomic names in text, and extraction of morphological character information from taxonomic descriptions. This manuscript briefly discusses the key steps in applying information extraction tools to enhance biodiversity science.
Michael, Joel; Rovick, Allen; Glass, Michael; Zhou, Yujian; Evens, Martha
CIRCSIM-Tutor is a computer tutor designed to carry out a natural language dialogue with a medical student. Its domain is the baroreceptor reflex, the part of the cardiovascular system that is responsible for maintaining a constant blood pressure. CIRCSIM-Tutor's interaction with students is modeled after the tutoring behavior of two experienced…
Doszkocs, Tamas E.
The National Library of Medicine's Current Information Transfer in English public access online catalog offers unique subject search capabilities--natural-language query input, automatic medical subject headings display, closest match search strategy, ranked document output, dynamic end user feedback for search refinement. References, description…
Szymczak, Bartlomiej Antoni
tried to establish a domain independent “ontological semantics” for relevant fragments of natural language. The purpose of this research is to develop methods and systems for taking advantage of formal ontologies for the purpose of extracting the meaning contents of texts. This functionality...
Dadlez, Eva M.
Describes a natural language searching strategy for retrieving current material which has bearing on George Orwell's "1984," and identifies four main themes (technology, authoritarianism, press and psychological/linguistic implications of surveillance, political oppression) which have emerged from cross-database searches of the "Big…
It is argued that pessimistic assessments of the adequacy of artificial neural networks (ANNs) for natural language processing (NLP) on the grounds that they have a finite state architecture are unjustified, and that their adequacy in this regard is an empirical issue. First, arguments that counter standard objections to finite state NLP on the…
Rodríguez, J. Tinguaro; Franco, Camilo; Montero, Javier
The evidence coming from cognitive psychology and linguistics shows that pairs of reference concepts (e.g. good/bad, tall/short, nice/ugly, etc.) play a crucial role in the way we use and understand natural languages every day in order to analyze reality and make decisions. Different situations...
van der Sluis, Ielka; Hielkema, F.; Mellish, C.; Doherty, G.
In this paper we look at what may be learned from a comparative study examining non-technical users with a background in social science browsing and querying metadata. Four query tasks were carried out with a natural language interface and with an interface that uses a web paradigm with hyperlinks.
Wiseheart, Rebecca; Altmann, Lori J. P.
Background: Individuals with dyslexia demonstrate syntactic difficulties on tasks of language comprehension, yet little is known about spoken language production in this population. Aims: To investigate whether spoken sentence production in college students with dyslexia is less proficient than in typical readers, and to determine whether group…
Computer Science 81 (2016) 128–135. 5th Workshop on Spoken Language Technology for Under-resourced Languages, SLTU 2016, 9-12 May 2016, Yogyakarta, Indonesia. Code-switched English Pronunciation Modeling for Swahili Spoken Term Detection. Neil...
Saunders, Daniel R; Bex, Peter J; Woods, Russell L
Crowdsourcing has become a valuable method for collecting medical research data. This approach, recruiting through open calls on the Web, is particularly useful for assembling large normative datasets. However, it is not known how natural language datasets collected over the Web differ from those collected under controlled laboratory conditions. To compare the natural language responses obtained from a crowdsourced sample of participants with responses collected in a conventional laboratory setting from participants recruited according to specific age and gender criteria, we collected natural language descriptions of 200 half-minute movie clips, from Amazon Mechanical Turk workers (crowdsourced) and 60 participants recruited from the community (lab-sourced). Crowdsourced participants responded to as many clips as they wanted and typed their responses, whereas lab-sourced participants gave spoken responses to 40 clips, and their responses were transcribed. The content of the responses was evaluated using a take-one-out procedure, which compared responses to other responses to the same clip and to other clips, with a comparison of the average number of shared words. In contrast to the 13 months of recruiting that was required to collect normative data from 60 lab-sourced participants (with specific demographic characteristics), only 34 days were needed to collect normative data from 99 crowdsourced participants (contributing a median of 22 responses). The majority of crowdsourced workers were female, and the median age was 35 years, lower than the lab-sourced median of 62 years but similar to the median age of the US population. The responses contributed by the crowdsourced participants were longer on average, that is, 33 words compared to 28 words (P ...). Crowdsourced participants had more shared words (P=.004 and P=.01, respectively), whereas younger participants had higher numbers of shared words in the lab-sourced population (P=.01). Crowdsourcing is an effective approach
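The take-one-out comparison described in this abstract, scoring each response by the words it shares with other responses to the same clip versus a different clip, can be sketched in a few lines (the responses here are invented):

```python
def shared_words(a, b):
    """Number of distinct words two responses have in common."""
    return len(set(a.lower().split()) & set(b.lower().split()))

def mean_shared(target, others):
    """Average shared-word count between one response and a set of others."""
    return sum(shared_words(target, o) for o in others) / len(others)

same_clip = ["a man walks a dog", "man walking his dog", "a dog on a leash"]
other_clip = ["children play soccer", "kids kick a ball"]

for resp in same_clip:
    rest = [r for r in same_clip if r is not resp]
    print(round(mean_shared(resp, rest), 2),
          round(mean_shared(resp, other_clip), 2))
```

Responses describing the same clip should score consistently higher against their own clip's pool than against another clip's, which is the signal used to validate content quality.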
The study of discourse is the study of language in actual use. In this article, the writer investigates the phonological features, both segmental and supra-segmental, in the spoken discourse of Indonesian university students. The data were taken from recordings of 15 conversations by 30 students of Bina Nusantara University who are taking the English Entrant subject (TOEFL-iBT). Finally, the writer is of the opinion that the students are still influenced by their first language in their spoken discourse. This results in English with an Indonesian accent. Even though it does not cause misunderstanding at the moment, this may become problematic if they have to communicate in the real world.
Van Lancker Sidtis, Diana
Although interest in the language sciences was previously focused on newly created sentences, more recently much attention has turned to the importance of formulaic expressions in normal and disordered communication. Also referred to as formulaic expressions and made up of speech formulas, idioms, expletives, serial and memorized speech, slang, sayings, clichés, and conventional expressions, non-propositional language forms a large proportion of every speaker's competence, and may be differentially disturbed in neurological disorders. This review aims to examine non-propositional speech with respect to linguistic descriptions, psycholinguistic experiments, sociolinguistic studies, child language development, clinical language disorders, and neurological studies. Evidence from numerous sources reveals differentiated and specialized roles for novel and formulaic verbal functions, and suggests that generation of novel sentences and management of prefabricated expressions represent two legitimate and separable processes in language behaviour. A preliminary model of language behaviour that encompasses unitary and compositional properties and their integration in everyday language use is proposed. Integration and synchronizing of two disparate processes in language behaviour, formulaic and novel, characterizes normal communicative function and contributes to creativity in language. This dichotomy is supported by studies arising from other disciplines in neurology and psychology. Further studies are necessary to determine in what ways the various categories of formulaic expressions are related, and how these categories are processed by the brain. Better understanding of how non-propositional categories of speech are stored and processed in the brain can lead to better informed treatment strategies in language disorders.
Nikora, Allen P.
This viewgraph presentation reviews the rationale of the program to transform natural language specifications into formal notation; specifically, to automate generation of Linear Temporal Logic (LTL) correctness properties from natural language temporal specifications. There are several reasons for this approach: (1) model-based techniques are becoming more widely accepted; (2) analytical verification techniques (e.g., model checking, theorem proving) are significantly more effective at detecting certain types of specification design errors (e.g., race conditions, deadlock) than manual inspection; (3) many requirements are still written in natural language, and the high learning curve for specification languages and associated tools, combined with schedule and budget pressure on projects, reduces training opportunities for engineers; and (4) formulation of correctness properties for system models can be a difficult problem. This is relevant to NASA because it would simplify development of formal correctness properties, lead to more widespread use of model-based specification and design techniques, assist in earlier identification of defects, and reduce residual defect content for space mission software systems. The presentation also discusses potential applications, accomplishments and/or technological transfer potential, and the next steps.
Komata, Masaoki; Oosawa, Yasuo; Ujita, Hiroshi
A natural language retrieval program NATLANG is developed to assist in the retrieval of information from event-and-cause descriptions in Licensee Event Reports (LER). The characteristics of NATLANG are (1) the use of base forms of words to retrieve related forms altered by the addition of prefixes or suffixes or changes in inflection, (2) direct access and short time retrieval with an alphabet pointer, (3) effective determination of the items and entries for a Hitachi event classification in a two step retrieval scheme, and (4) Japanese character output with the PL-1 language. NATLANG output reduces the effort needed to re-classify licensee events in the Hitachi event classification. (author)
This paper shows how fieldwork data can be managed using the program Toolbox together with the Natural Language Toolkit (NLTK) for the Python programming language. It provides background information about Toolbox and describes how it can be downloaded and installed. The basic functionality of the program for lexicons and texts is described, and its strengths and weaknesses are reviewed. Its underlying data format is briefly discussed, and the Toolbox processing capabilities of NLTK are introduced, showing ways in which they can be used to extend the functionality of Toolbox. This is illustrated with a few simple scripts that demonstrate basic data management tasks relevant to language documentation, such as printing out the contents of a lexicon as HTML.
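The HTML-export task mentioned at the end can be sketched without NLTK at all: Toolbox lexicons are backslash-coded plain text, so a few lines of stdlib Python suffice. The field markers below (\lx headword, \ps part of speech, \ge gloss) follow common Toolbox conventions; the sample entries and HTML layout are my own.

```python
# Parse Toolbox-style backslash-coded lexicon records and print them as HTML.

RAW = r"""\lx kaakauroro
\ps N
\ge thunder

\lx kaapa
\ps V
\ge gape
"""

def parse_records(text):
    records, current = [], {}
    for line in text.splitlines():
        if not line.strip():                 # blank line ends a record
            if current:
                records.append(current)
                current = {}
            continue
        marker, _, value = line.partition(" ")
        current[marker.lstrip("\\")] = value
    if current:
        records.append(current)
    return records

def to_html(records):
    rows = [f"<li><b>{r['lx']}</b> <i>{r['ps']}</i> '{r['ge']}'</li>"
            for r in records]
    return "<ul>\n" + "\n".join(rows) + "\n</ul>"

print(to_html(parse_records(RAW)))
```

NLTK's toolbox module offers a more robust parser for real field data; this sketch only shows the shape of the task.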
This dissertation focuses on developing and evaluating hybrid approaches for analyzing free-form text in the medical domain. This research draws on natural language processing (NLP) techniques that are used to parse and extract concepts based on a controlled vocabulary. Once important concepts are extracted, additional machine learning algorithms,…
Nastassja A. Lewinski
Literature in the field of nanotechnology is exponentially increasing, with more and more engineered nanomaterials being created, characterized, and tested for performance and safety. With the deluge of published data, there is a need for natural language processing approaches to semi-automate the cataloguing of engineered nanomaterials and their associated physico-chemical properties, performance, exposure scenarios, and biological effects. In this paper, we review the different informatics methods that have been applied to patent mining, nanomaterial/device characterization, nanomedicine, and environmental risk assessment. Nine natural language processing (NLP)-based tools were identified: NanoPort, NanoMapper, TechPerceptor, a Text Mining Framework, a Nanodevice Analyzer, a Clinical Trial Document Classifier, Nanotoxicity Searcher, NanoSifter, and NEIMiner. We conclude with recommendations for sharing NLP-related tools through online repositories to broaden participation in nanoinformatics.
It is estimated that each year many people worldwide, most of them teenagers and young adults, die by suicide. Suicide receives special attention, with many countries developing national prevention strategies. Since more and more medical information is available in text form, countering the growing trend of suicide requires analyzing various textual resources, such as patient records, information on the web, and questionnaires. For this purpose, this study systematically reviews recent studies on the use of natural language processing techniques in research on the health of people who have died by suicide or are at risk. After electronic searches of the PubMed and ScienceDirect databases and review of the articles by two reviewers, 21 articles matched the inclusion criteria. This study revealed that, when a suitable data set is available, natural language processing techniques are well suited for various types of suicide-related research.
Exploiting Lexical Regularities in Designing Natural Language Systems. Massachusetts Institute of Technology, Artificial Intelligence Laboratory, 545 Technology Square, Cambridge, MA 02139. This paper presents the lexical component of the START Question Answering system developed at the MIT Artificial Intelligence Laboratory.
To this effect, we present the ARSENAL methodology. ARSENAL uses state-of-the-art advances in natural language processing (NLP) and formal methods (FM). We systematically evaluated various aspects of ARSENAL on two case studies: the Time-Triggered Ethernet (TTEthernet) communication platform used in space, and FAA-Isolette infant incubators used in Neonatal Intensive Care Units (NICUs).
Pfau, R.; Steinbach, M.; Woll, B.
Sign language linguists show here that all the questions relevant to the linguistic investigation of spoken languages can be asked about sign languages. Conversely, questions that sign language linguists consider - even if spoken language researchers have not asked them yet - should also be asked of
Friedman, Carol; Hripcsak, George; Shagina, Lyuda; Liu, Hongfang
Objective: To design a document model that provides reliable and efficient access to clinical information in patient reports for a broad range of clinical applications, and to implement an automated method using natural language processing that maps textual reports to a form consistent with the model. Methods: A document model that encodes structured clinical information in patient reports while retaining the original contents was designed using the extensible markup language (XML), and a document type definition (DTD) was created. An existing natural language processor (NLP) was modified to generate output consistent with the model. Two hundred reports were processed using the modified NLP system, and the XML output that was generated was validated using an XML validating parser. Results: The modified NLP system successfully processed all 200 reports. The output of one report was invalid, and 199 reports were valid XML forms consistent with the DTD. Conclusions: Natural language processing can be used to automatically create an enriched document that contains a structured component whose elements are linked to portions of the original textual report. This integrated document model provides a representation where documents containing specific information can be accurately and efficiently retrieved by querying the structured components. If manual review of the documents is desired, the salient information in the original reports can also be identified and highlighted. Using an XML model of tagging provides an additional benefit in that software tools that manipulate XML documents are readily available. PMID:9925230
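The enriched-document idea described above, structured elements linked back to spans of the original text, can be sketched with the standard library. The element names, the sample report, and the character offsets below are my own illustration, not the paper's actual DTD.

```python
import xml.etree.ElementTree as ET

# An enriched document: the original report text is retained, and a
# structured component links each extracted finding back to a character
# span of that text.

report = "Chest radiograph shows no acute infiltrate. Mild cardiomegaly."
findings = [  # (concept, certainty, start, end) as an NLP system might emit
    ("infiltrate", "negated", 32, 42),
    ("cardiomegaly", "present", 49, 61),
]

doc = ET.Element("document")
ET.SubElement(doc, "text").text = report
structured = ET.SubElement(doc, "structured")
for concept, certainty, start, end in findings:
    ET.SubElement(structured, "finding", concept=concept,
                  certainty=certainty, start=str(start), end=str(end))

xml = ET.tostring(doc, encoding="unicode")
print(xml)

# A query for a finding can then use the structured component alone:
negated = [f.get("concept") for f in doc.iter("finding")
           if f.get("certainty") == "negated"]
print(negated)  # ['infiltrate']
```

Because the structured elements carry offsets into the retained text, the salient passages of the original report can also be highlighted for manual review, as the abstract describes.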
Simmons, Noreen; Johnston, Judith
Speech-language pathologists often advise families about interaction patterns that will facilitate language learning. This advice is typically based on research with North American families of European heritage and may not be culturally suited for non-Western families. The goal of the project was to identify differences in the beliefs and practices of Indian and Euro-Canadian mothers that would affect patterns of talk to children. A total of 47 Indian mothers and 51 Euro-Canadian mothers of preschool age children completed a written survey concerning child-rearing practices and beliefs, especially those about talk to children. Discriminant analyses indicated clear cross-cultural differences and produced functions that could predict group membership with a 96% accuracy rate. Items contributing most to these functions concerned the importance of family, perceptions of language learning, children's use of language in family and society, and interactions surrounding text. Speech-language pathologists who wish to adapt their services for families of Indian heritage should remember the centrality of the family, the likelihood that there will be less emphasis on early independence and achievement, and the preference for direct instruction.
Large, David R; Clark, Leigh; Quandt, Annie; Burnett, Gary; Skrypchuk, Lee
Given the proliferation of 'intelligent' and 'socially-aware' digital assistants embodying everyday mobile technology - and the undeniable logic that utilising voice-activated controls and interfaces in cars reduces the visual and manual distraction of interacting with in-vehicle devices - it appears inevitable that next generation vehicles will be embodied by digital assistants and utilise spoken language as a method of interaction. From a design perspective, defining the language and interaction style that a digital driving assistant should adopt is contingent on the role that they play within the social fabric and context in which they are situated. We therefore conducted a qualitative, Wizard-of-Oz study to explore how drivers might interact linguistically with a natural language digital driving assistant. Twenty-five participants drove for 10 min in a medium-fidelity driving simulator while interacting with a state-of-the-art, high-functioning, conversational digital driving assistant. All exchanges were transcribed and analysed using recognised linguistic techniques, such as discourse and conversation analysis, normally reserved for interpersonal investigation. Language usage patterns demonstrate that interactions with the digital assistant were fundamentally social in nature, with participants affording the assistant equal social status and high-level cognitive processing capability. For example, participants were polite, actively controlled turn-taking during the conversation, and used back-channelling, fillers and hesitation, as they might in human communication. Furthermore, participants expected the digital assistant to understand and process complex requests mitigated with hedging words and expressions, and peppered with vague language and deictic references requiring shared contextual information and mutual understanding. Findings are presented in six themes which emerged during the analysis - formulating responses; turn-taking; back
Cai, Tianrun; Giannopoulos, Andreas A.; Yu, Sheng; Kelil, Tatiana; Ripley, Beth; Kumamaru, Kanako K.; Rybicki, Frank J.
The migration of imaging reports to electronic medical record systems holds great potential in terms of advancing radiology research and practice by leveraging the large volume of data continuously being updated, integrated, and shared. However, there are significant challenges as well, largely due to the heterogeneity of how these data are formatted. Indeed, although there is movement toward structured reporting in radiology (ie, hierarchically itemized reporting with use of standardized terminology), the majority of radiology reports remain unstructured and use free-form language. To effectively “mine” these large datasets for hypothesis testing, a robust strategy for extracting the necessary information is needed. Manual extraction of information is a time-consuming and often unmanageable task. “Intelligent” search engines that instead rely on natural language processing (NLP), a computer-based approach to analyzing free-form text or speech, can be used to automate this data mining task. The overall goal of NLP is to translate natural human language into a structured format (ie, a fixed collection of elements), each with a standardized set of choices for its value, that is easily manipulated by computer programs to (among other things) order into subcategories or query for the presence or absence of a finding. The authors review the fundamentals of NLP and describe various techniques that constitute NLP in radiology, along with some key applications. ©RSNA, 2016 PMID:26761536
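As a concrete taste of the "presence or absence of a finding" query mentioned above, here is a toy negation check in the spirit of rule-based systems such as NegEx. It is drastically simplified relative to the techniques the review covers; the cue list and report text are my own.

```python
import re

# Toy rule-based check: a finding is "absent" when a negation cue precedes
# it in the same sentence, otherwise "present".

NEGATION_CUES = ("no ", "without ", "negative for ")

def finding_status(report: str, finding: str):
    for sentence in re.split(r"[.!?]", report.lower()):
        pos = sentence.find(finding.lower())
        if pos == -1:
            continue
        prefix = sentence[:pos]
        return "absent" if any(c in prefix for c in NEGATION_CUES) else "present"
    return "not mentioned"

report = "Lungs are clear. No pleural effusion. Small nodule in left lobe."
print(finding_status(report, "pleural effusion"))  # absent
print(finding_status(report, "nodule"))            # present
print(finding_status(report, "fracture"))          # not mentioned
```

Real clinical NLP must additionally handle negation scope, hedging, and anatomical context, which is precisely why the dedicated NLP systems the authors review exist.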
Maurice H. P. M. van Putten
We consider the rate R and variance σ² of Shannon information in snippets of text based on word frequencies in natural language. We empirically identify a Kolmogorov-like scaling law σ² ∝ k^(−1.66 ± 0.12) (95% c.l.) as a function of k = 1/N, where N is the word count of the snippet. This result highlights a potential association of information flow in snippets, analogous to the energy cascade in turbulent eddies in fluids at high Reynolds numbers. We propose R and σ² as robust utility functions for objective ranking of concordances in efficient search for maximal information seamlessly across different languages, and as a starting point for artificial attention.
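A simplified reading of the quantities R and σ² can be computed directly: estimate word probabilities from a corpus, then score a snippet by the mean and variance of the per-word Shannon information. The toy corpus below is mine; the paper's analysis is done at natural-language scale.

```python
import math
from collections import Counter

# Estimate unigram probabilities, then compute the mean information per
# word R and its variance sigma^2 for a snippet.

corpus = ("the cat sat on the mat the dog sat on the log "
          "the cat saw the dog").split()
counts = Counter(corpus)
total = sum(counts.values())

def snippet_information(snippet):
    info = [-math.log2(counts[w] / total) for w in snippet.split()]
    rate = sum(info) / len(info)                          # R, bits per word
    var = sum((x - rate) ** 2 for x in info) / len(info)  # sigma^2
    return rate, var

rate, var = snippet_information("the cat sat")
print(f"R = {rate:.3f} bits/word, sigma^2 = {var:.3f}")
```

Rare words carry more information than common ones, so snippets mixing frequent and infrequent words have higher σ²; it is the scaling of this variance with 1/N that the paper studies.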
The aim of this paper is to show that, with a subset of a natural language, simple systems running on PCs can be developed that can nevertheless be an effective tool for interfacing purposes in the building of an Intelligent Tutoring System (ITS). After presenting the special characteristics of the Smalltalk/V language, which provides an appropriate environment for the development of an interface, the overall architecture of the interface module is discussed. We then show how sentences are parsed by the interface, and how interaction takes place with the user. The knowledge-acquisition phase is subsequently described. Finally, some excerpts from a tutoring session concerned with elementary geometry are discussed, and some of the problems and limitations of the approach are illustrated.
Shah, Nishal Pradeepkumar
Recent advances in computer technology have permitted scientists to implement and test algorithms that have been known for quite some time (or not) but were computationally expensive. Two such projects are IBM's Jeopardy system, part of its DeepQA project, and Wolfram's WolframAlpha. Both of these systems implement natural language processing (another goal of AI scientists) and try to answer questions as asked by the user. Though the goals of the two projects are similar, both of them have a ...
Bochkarev, Vladimir V.; Lerner, Eduard Yu; Shevlyakova, Anna V.
This paper is devoted to verifying the empirical Zipf and Heaps laws in natural languages using Google Books Ngram corpus data. The connection between the Zipf law and the Heaps law, which predicts a power dependence of vocabulary size on text size, is discussed. In fact, the Heaps exponent in this dependence varies as the text corpus grows. To explain this, the obtained results are compared with a probability model of text generation. Quasi-periodic variations with characteristic time periods of 60-100 years were also found.
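The two laws the paper verifies at scale can be checked qualitatively on any small text (the snippet below is mine): Zipf says word frequency falls with frequency rank, and Heaps says the vocabulary size V(n) grows sublinearly, as a power, with the number of tokens n.

```python
from collections import Counter

text = ("to be or not to be that is the question whether tis nobler "
        "in the mind to suffer the slings and arrows of outrageous fortune").split()

# Zipf: frequencies sorted by rank decrease monotonically.
freqs = sorted(Counter(text).values(), reverse=True)
print(freqs[:5])

# Heaps: V(n) = number of distinct words among the first n tokens.
seen, growth = set(), []
for word in text:
    seen.add(word)
    growth.append(len(seen))
print(growth[::5])  # vocabulary grows more slowly than the token count
```

Fitting log V(n) against log n on a large corpus yields the Heaps exponent whose drift over time the paper investigates.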
We present a publicly-available state-of-the-art research and development platform for Machine Translation and Natural Language Processing that runs on the Amazon Elastic Compute Cloud. This provides a standardized research environment for all users, and enables perfect reproducibility and compatibility. Box also enables users to use their hardware budget to avoid the management and logistical overhead of maintaining a research lab, yet still participate in global research community with the same state-of-the-art tools.
Leech, Geoffrey; Wilson, Andrew (All Of Lancaster University)
Word Frequencies in Written and Spoken English is a landmark volume in the development of vocabulary frequency studies. Whereas previous books have in general given frequency information about the written language only, this book provides information on both speech and writing. It not only gives information about the language as a whole, but also about the differences between spoken and written English, and between different spoken and written varieties of the language. The frequencies are derived from a wide ranging and up-to-date corpus of English: the British Na
The trends emerging in the natural language processing (NLP) of African languages spoken in South Africa, are explored in order to determine whether research in and development of such NLP is keeping abreast of international developments. This is done by investigating the past, present and future of NLP of African ...
Nafari, Maryam; Weaver, Chris
Richly interactive visualization tools are increasingly popular for data exploration and analysis in a wide variety of domains. Existing systems and techniques for recording provenance of interaction focus either on comprehensive automated recording of low-level interaction events or on idiosyncratic manual transcription of high-level analysis activities. In this paper, we present the architecture and translation design of a query-to-question (Q2Q) system that automatically records user interactions and presents them semantically using natural language (written English). Q2Q takes advantage of domain knowledge and uses natural language generation (NLG) techniques to translate and transcribe a progression of interactive visualization states into a visual log of styled text that complements and effectively extends the functionality of visualization tools. We present Q2Q as a means to support a cross-examination process in which questions rather than interactions are the focus of analytic reasoning and action. We describe the architecture and implementation of the Q2Q system, discuss key design factors and variations that affect question generation, and present several visualizations that incorporate Q2Q for analysis in a variety of knowledge domains.
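The core query-to-question translation can be sketched with domain-aware templates. The event fields, template wording, and example log below are my inventions for illustration, not the Q2Q system's actual design.

```python
# Map a low-level interaction event to the analytic question it embodies.

TEMPLATES = {
    "filter": "Which {item} have {attribute} {op} {value}?",
    "sort":   "How do the {item} rank by {attribute}?",
    "select": "What are the details of {item} '{value}'?",
}

def event_to_question(event: dict) -> str:
    return TEMPLATES[event["action"]].format(**event)

log = [
    {"action": "filter", "item": "counties", "attribute": "population",
     "op": ">", "value": "100000"},
    {"action": "sort", "item": "counties", "attribute": "median income"},
]
for event in log:
    print(event_to_question(event))
```

Rendering the interaction history as questions rather than raw events is what enables the cross-examination style of analysis the paper proposes.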
Pestian, John; Nasrallah, Henry; Matykiewicz, Pawel; Bennett, Aurora; Leenaars, Antoon
Suicide is the second leading cause of death among 25-34 year olds and the third leading cause of death among 15-25 year olds in the United States. In the Emergency Department, where suicidal patients often present, estimating the risk of repeated attempts is generally left to clinical judgment. This paper presents our second attempt to determine the role of computational algorithms in understanding a suicidal patient's thoughts, as represented by suicide notes. We focus on developing methods of natural language processing that distinguish between genuine and elicited suicide notes. We hypothesize that machine learning algorithms can categorize suicide notes as well as mental health professionals and psychiatric physician trainees do. The data used are comprised of suicide notes from 33 suicide completers and matched to 33 elicited notes from healthy control group members. Eleven mental health professionals and 31 psychiatric trainees were asked to decide if a note was genuine or elicited. Their decisions were compared to nine different machine-learning algorithms. The results indicate that trainees accurately classified notes 49% of the time, mental health professionals accurately classified notes 63% of the time, and the best machine learning algorithm accurately classified the notes 78% of the time. This is an important step in developing an evidence-based predictor of repeated suicide attempts because it shows that natural language processing can aid in distinguishing between classes of suicidal notes.
Pons, Ewoud; Braun, Loes M M; Hunink, M G Myriam; Kors, Jan A
Radiological reporting has generated large quantities of digital content within the electronic health record, which is potentially a valuable source of information for improving clinical care and supporting research. Although radiology reports are stored for communication and documentation of diagnostic imaging, harnessing their potential requires efficient and automated information extraction: they exist mainly as free-text clinical narrative, from which it is a major challenge to obtain structured data. Natural language processing (NLP) provides techniques that aid the conversion of text into a structured representation, and thus enables computers to derive meaning from human (ie, natural language) input. Used on radiology reports, NLP techniques enable automatic identification and extraction of information. By exploring the various purposes for their use, this review examines how radiology benefits from NLP. A systematic literature search identified 67 relevant publications describing NLP methods that support practical applications in radiology. This review takes a close look at the individual studies in terms of tasks (ie, the extracted information), the NLP methodology and tools used, and their application purpose and performance results. Additionally, limitations, future challenges, and requirements for advancing NLP in radiology will be discussed. (©) RSNA, 2016 Online supplemental material is available for this article.
This book explains how information extraction (IE) applications can be created that are able to tap the vast amount of relevant information available in natural language sources: Internet pages, official documents such as laws and regulations, books and newspapers, and the social web. Readers are introduced to the problem of IE and its current challenges and limitations, supported with examples. The book discusses the need to fill the gap between documents, data, and people, and provides a broad overview of the technology supporting IE. The authors present a generic architecture for developing systems that are able to learn how to extract relevant information from natural language documents, and illustrate how to implement working systems using state-of-the-art and freely available software tools. The book also discusses concrete applications illustrating IE uses. · Provides an overview of state-of-the-art technology in information extraction (IE), discussing achievements and limitations for t...
Hannagan, Thomas; Magnuson, James S.; Grainger, Jonathan
How do we map the rapid input of spoken language onto phonological and lexical representations over time? Attempts at psychologically-tractable computational models of spoken word recognition tend either to ignore time or to transform the temporal input into a spatial representation. TRACE, a connectionist model with broad and deep coverage of speech perception and spoken word recognition phenomena, takes the latter approach, using exclusively time-specific units at every level of representation. TRACE reduplicates featural, phonemic, and lexical inputs at every time step in a large memory trace, with rich interconnections (excitatory forward and backward connections between levels and inhibitory links within levels). As the length of the memory trace is increased, or as the phoneme and lexical inventory of the model is increased to a realistic size, this reduplication of time- (temporal position) specific units leads to a dramatic proliferation of units and connections, begging the question of whether a more efficient approach is possible. Our starting point is the observation that models of visual object recognition—including visual word recognition—have grappled with the problem of spatial invariance, and arrived at solutions other than a fully-reduplicative strategy like that of TRACE. This inspires a new model of spoken word recognition that combines time-specific phoneme representations similar to those in TRACE with higher-level representations based on string kernels: temporally independent (time invariant) diphone and lexical units. This reduces the number of necessary units and connections by several orders of magnitude relative to TRACE. Critically, we compare the new model to TRACE on a set of key phenomena, demonstrating that the new model inherits much of the behavior of TRACE and that the drastic computational savings do not come at the cost of explanatory power. PMID:24058349
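The time-invariant diphone idea described above can be illustrated with a small sketch: a word's phoneme string is reduced to a bag of ordered phoneme pairs (an open-bigram, string-kernel style code), so no time-specific copies of units are needed. The phoneme inventory, gap parameter, and similarity function here are my simplifications, not the model's actual specification.

```python
from collections import Counter
from itertools import combinations

def diphone_code(phonemes, max_gap=2):
    """Count ordered phoneme pairs at most max_gap positions apart."""
    pairs = Counter()
    for (i, a), (j, b) in combinations(enumerate(phonemes), 2):
        if j - i <= max_gap:
            pairs[(a, b)] += 1
    return pairs

def similarity(code_a, code_b):
    """Unnormalized string-kernel similarity: dot product of pair counts."""
    return sum(code_a[p] * code_b[p] for p in code_a)

cat = diphone_code(["k", "ae", "t"])
cab = diphone_code(["k", "ae", "b"])
dog = diphone_code(["d", "ao", "g"])
print(similarity(cat, cab), similarity(cat, dog))  # cat~cab > cat~dog
```

Because the code is position-independent, similar words overlap in representation regardless of when they occur in the input, which is the source of the computational savings over TRACE's reduplicated units.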
Boulet, J R; van Zanten, M; McKinley, D W; Gary, N E
The purpose of this study was to gather additional evidence for the validity and reliability of spoken English proficiency ratings provided by trained standardized patients (SPs) in high-stakes clinical skills examination. Over 2500 candidates who took the Educational Commission for Foreign Medical Graduates' (ECFMG) Clinical Skills Assessment (CSA) were studied. The CSA consists of 10 or 11 timed clinical encounters. Standardized patients evaluate spoken English proficiency and interpersonal skills in every encounter. Generalizability theory was used to estimate the consistency of spoken English ratings. Validity coefficients were calculated by correlating summary English ratings with CSA scores and other external criterion measures. Mean spoken English ratings were also compared by various candidate background variables. The reliability of the spoken English ratings, based on 10 independent evaluations, was high. The magnitudes of the associated variance components indicated that the evaluation of a candidate's spoken English proficiency is unlikely to be affected by the choice of cases or SPs used in a given assessment. Proficiency in spoken English was related to native language (English versus other) and scores from the Test of English as a Foreign Language (TOEFL). The pattern of the relationships, both within assessment components and with external criterion measures, suggests that valid measures of spoken English proficiency are obtained. This result, combined with the high reproducibility of the ratings over encounters and SPs, supports the use of trained SPs to measure spoken English skills in a simulated medical environment.
Spoken dialog systems have the potential to offer highly intuitive user interfaces, as they allow systems to be controlled using natural language. However, the complexity inherent in natural language dialogs means that careful testing of the system must be carried out from the very beginning of the design process. This book examines how user models can be used to support such early evaluations in two ways: by running simulations of dialogs, and by estimating the quality judgments of users. First, a design environment supporting the creation of dialog flows, the simulation of dialogs, and the analysis of the simulated data is proposed. How the quality of user simulations may be quantified with respect to their suitability for both formative and summative evaluation is then discussed. The remainder of the book is dedicated to the problem of predicting quality judgments of users based on interaction data. New modeling approaches are presented, which process the dialogs as sequences, and which allow knowl...
Advances in spoken corpora analysis have brought about new insights into language pedagogy and have led to an awareness of the characteristics of spoken language. Current findings have shown that the grammar of spoken language differs from that of written language. However, most listening and speaking materials are constructed based on written grammar and lack core spoken language features. The aim of the present study was to explore whether awareness of spoken grammar features could affect learners' comprehension of real-life conversations. To this end, 45 university students in two intact classes participated in a listening course employing corpus-based materials. The instruction of the spoken grammar features to the experimental group was done overtly through awareness-raising tasks, whereas the control group, though exposed to the same materials, was not provided with such tasks for learning the features. The results of independent samples t tests revealed that the learners in the experimental group comprehended everyday conversations much better than those in the control group. Additionally, the highly positive views of spoken grammar held by the learners, elicited by means of a retrospective questionnaire, were generally comparable to those reported in the literature.
Behrns, Ingrid; Wengelin, Asa; Broberg, Malin; Hartelius, Lena
The aim of the present study was to explore how a personal narrative told by a group of eight persons with aphasia differed between written and spoken language, and to compare this with findings from 10 participants in a reference group. The stories were analysed through holistic assessments made by 60 participants without experience of aphasia…
Ordelman, Roeland J.F.; van Hessen, Adrianus J.; de Jong, Franciska M.G.
In this paper, ongoing work concerning the language modelling and lexicon optimization of a Dutch speech recognition system for Spoken Document Retrieval is described: the collection and normalization of a training data set and the optimization of our recognition lexicon. Effects on lexical coverage
Kobayashi, Yuichiro; Abe, Mariko
The purpose of the present study is to assess second language (L2) spoken English using automated scoring techniques. Automated scoring aims to classify a large set of learners' oral performance data into a small number of discrete oral proficiency levels. In automated scoring, objectively measurable features such as the frequencies of lexical and…
de Jong, Franciska M.G.; Heeren, W.F.L.; van Hessen, Adrianus J.; Ordelman, Roeland J.F.; Nijholt, Antinus; Ruiz Miyares, L.; Alvarez Silva, M.R.
Archival practice is shifting from the analogue to the digital world. A specific subset of heritage collections that impose interesting challenges for the field of language and speech technology are spoken word archives. Given the enormous backlog at audiovisual archives of unannotated materials and
Krahmer, E.J.; Swerts, M.G.J.; Theune, M.; Weegels, M.F.
Given the state of the art of current language and speech technology, errors are unavoidable in present-day spoken dialogue systems. Therefore, one of the main concerns in dialogue design is how to decide whether or not the system has understood the user correctly. In human-human communication,
Juan Manuel Montero
Full Text Available We describe the work on infusion of emotion into a limited-task autonomous spoken conversational agent situated in the domestic environment, using a need-inspired task-independent emotion model (NEMO). In order to demonstrate the generation of affect through the use of the model, we describe the work of integrating it with a natural-language mixed-initiative HiFi-control spoken conversational agent (SCA). NEMO and the host system communicate externally, removing the need for the Dialog Manager to be modified, as is done in most existing dialog systems, in order to be adaptive. The first part of the paper concerns the integration between NEMO and the host agent. The second part summarizes the work on automatic affect prediction, namely, frustration and contentment, from dialog features, a non-conventional source, in an attempt to move towards a more user-centric approach. The final part reports the evaluation results obtained from a user study, in which both versions of the agent (non-adaptive and emotionally-adaptive) were compared. The results provide substantial evidence of the benefits of adding emotion in a spoken conversational agent, especially in mitigating users’ frustrations and, ultimately, improving their satisfaction.
Research into natural language understanding systems for computers has concentrated on implementing particular grammars and grammatical models of the language concerned. This paper presents a rationale for research into natural language understanding systems based on neurological and psychological principles. Important features of the approach are that it seeks to place the onus of learning the language on the computer, and that it seeks to make use of the vast wealth of relevant psycholinguistic and neurolinguistic theory. 22 references.
Poletiek, Fenna H; Fitz, Hartmut; Bocanegra, Bruno R
Rey et al. (2012) present data from a study with baboons that they interpret in support of the idea that center-embedded structures in human language have their origin in low-level memory mechanisms and associative learning. Critically, the authors claim that the baboons showed a behavioral preference that is consistent with center-embedded sequences over other types of sequences. We argue that the baboons' response patterns suggest that two mechanisms are involved: first, they can be trained to associate a particular response with a particular stimulus, and, second, when faced with two conditioned stimuli in a row, they respond to the most recent one first, copying behavior they had been rewarded for during training. Although Rey et al.'s (2012) experiment shows that the baboons' behavior is driven by low-level mechanisms, it is not clear how the reported animal behavior bears on the phenomenon of center-embedded structures in human syntax. Hence, (1) natural language syntax may indeed have been shaped by low-level mechanisms, and (2) the baboons' behavior is driven by low-level stimulus-response learning, as Rey et al. propose. But is the second evidence for the first? We will discuss in what ways this study can and cannot have evidential value for explaining the origin of center-embedded recursion in human grammar. More generally, their study provokes an interesting reflection on the use of animal studies in order to understand features of the human linguistic system. Copyright © 2015 Elsevier B.V. All rights reserved.
This thesis puts forward the view that a purely signal-based approach to natural language processing is both plausible and desirable. By questioning the veracity of symbolic representations of meaning, it argues for a unified, non-symbolic model of knowledge representation that is both biologically plausible and, potentially, highly efficient. Processes to generate a grounded, neural form of this model, dubbed the semantic filter, are discussed. The combined effects of local neural organisation, coincident with perceptual maturation, are used to hypothesise its nature. This theoretical model is then validated in light of a number of fundamental neurological constraints and milestones. The mechanisms of semantic and episodic development that the model predicts are then used to explain linguistic properties, such as propositions and verbs, syntax and scripting. To mimic the growth of locally densely connected structures upon an unbounded neural substrate, a system is developed that can grow arbitrarily large, data-dependent structures composed of individual self-organising neural networks. The maturational nature of the data used results in a structure in which the perception of concepts is refined by the networks, but demarcated by subsequent structure. As a consequence, the overall structure shows significant memory and computational benefits, as predicted by the cognitive and neural models. Furthermore, the localised nature of the neural architecture also avoids the increasing error sensitivity and redundancy of traditional systems as the training domain grows. The semantic and episodic filters have been demonstrated to perform as well as, or better than, more specialist networks, whilst using significantly larger vocabularies, more complex sentence forms and more natural corpora.
Wu, Joy T; Dernoncourt, Franck; Gehrmann, Sebastian; Tyler, Patrick D; Moseley, Edward T; Carlson, Eric T; Grant, David W; Li, Yeran; Welt, Jonathan; Celi, Leo Anthony
Advancement of Artificial Intelligence (AI) capabilities in medicine can help address many pressing problems in healthcare. However, AI research endeavors in healthcare may not be clinically relevant, may have unrealistic expectations, or may not be explicit enough about their limitations. A diverse and well-functioning multidisciplinary team (MDT) can help identify appropriate and achievable AI research agendas in healthcare, and advance medical AI technologies by developing AI algorithms as well as addressing the shortage of appropriately labeled datasets for machine learning. In this paper, our team of engineers, clinicians and machine learning experts share their experience and lessons learned from their two-year-long collaboration on a natural language processing (NLP) research project. We highlight specific challenges encountered in cross-disciplinary teamwork, dataset creation for NLP research, and expectation setting for current medical AI technologies. Copyright © 2017. Published by Elsevier B.V.
Kashyap, Vipul; Turchin, Alexander; Morin, Laura; Chang, Frank; Li, Qi; Hongsermeier, Tonya
Structured Clinical Documentation is a fundamental component of the healthcare enterprise, linking both clinical (e.g., electronic health record, clinical decision support) and administrative functions (e.g., evaluation and management coding, billing). One of the challenges in creating good quality documentation templates has been the inability to address specialized clinical disciplines and adapt to local clinical practices. A one-size-fits-all approach leads to poor adoption and inefficiencies in the documentation process. On the other hand, the cost associated with manual generation of documentation templates is significant. Consequently there is a need for at least partial automation of the template generation process. We propose an approach and methodology for the creation of structured documentation templates for diabetes using Natural Language Processing (NLP).
Deleger, Louise; Li, Qi; Lingren, Todd; Kaiser, Megan; Molnar, Katalin; Stoutenborough, Laura; Kouril, Michal; Marsolo, Keith; Solti, Imre
We present the construction of three annotated corpora to serve as gold standards for medical natural language processing (NLP) tasks. Clinical notes from the medical record, clinical trial announcements, and FDA drug labels are annotated. We report high inter-annotator agreements (overall F-measures between 0.8467 and 0.9176) for the annotation of Personal Health Information (PHI) elements for a de-identification task and of medications, diseases/disorders, and signs/symptoms for information extraction (IE) task. The annotated corpora of clinical trials and FDA labels will be publicly released and to facilitate translational NLP tasks that require cross-corpora interoperability (e.g. clinical trial eligibility screening) their annotation schemas are aligned with a large scale, NIH-funded clinical text annotation project.
Full Text Available Information technologies are developing steadily. With the latest software technologies, and with methods of artificial intelligence and machine learning embedding intelligence in computers, the expectation is that in the near future computers will be able to solve problems themselves, much as people do. Artificial intelligence emulates human behavior on computers. Rather than executing instructions one by one, as they are programmed, machine learning employs prior experience/data in the process of a system’s training. In this state-of-the-art paper, common methods in AI, such as machine learning, pattern recognition and natural language processing (NLP), are discussed. The standard architecture of an NLP system and the levels of processing needed for NLP understanding are also given. Lastly, statistical NLP processing and multi-word expressions are described.
Graham, Matthew; Zhang, M.; Djorgovski, S. G.; Donalek, C.; Drake, A. J.; Mahabal, A.
The rapidly emerging field of time domain astronomy is one of the most exciting and vibrant new research frontiers, ranging in scientific scope from studies of the Solar System to extreme relativistic astrophysics and cosmology. It is being enabled by a new generation of large synoptic digital sky surveys - LSST, PanStarrs, CRTS - that cover large areas of sky repeatedly, looking for transient objects and phenomena. One of the biggest challenges facing these is the automated classification of transient events, a process that needs machine-processible astronomical knowledge. Semantic technologies enable the formal representation of concepts and relations within a particular domain. ATELs (http://www.astronomerstelegram.org) are a commonly-used means for reporting and commenting upon new astronomical observations of transient sources (supernovae, stellar outbursts, blazar flares, etc). However, they are loose and unstructured and employ scientific natural language for description: this makes automated processing of them - a necessity within the next decade with petascale data rates - a challenge. Nevertheless they represent a potentially rich corpus of information that could lead to new and valuable insights into transient phenomena. This project lies in the cutting-edge field of astrosemantics, a branch of astroinformatics, which applies semantic technologies to astronomy. The ATELs have been used to develop an appropriate concept scheme - a representation of the information they contain - for transient astronomy using hierarchical clustering of processed natural language. This allows us to automatically organize ATELs based on the vocabulary used. We conclude that we can use simple algorithms to process and extract meaning from astronomical textual data.
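The clustering step described above can be sketched in a few lines: build TF-IDF vectors from the report text and compare them by cosine similarity, so that reports sharing vocabulary (e.g. two supernova notices) group together. This is a minimal stand-in, not the project's actual pipeline, and the ATEL-like snippets are invented for illustration.

```python
import math
from collections import Counter

def tfidf_vectors(docs):
    """Term-frequency / inverse-document-frequency vectors for a list of texts."""
    tokenised = [doc.lower().split() for doc in docs]
    df = Counter()                      # document frequency per term
    for toks in tokenised:
        df.update(set(toks))
    n = len(docs)
    vecs = []
    for toks in tokenised:
        tf = Counter(toks)
        vecs.append({t: (c / len(toks)) * math.log(n / df[t]) for t, c in tf.items()})
    return vecs

def cosine(a, b):
    """Cosine similarity between two sparse (dict) vectors."""
    dot = sum(w * b.get(t, 0.0) for t, w in a.items())
    na = math.sqrt(sum(w * w for w in a.values()))
    nb = math.sqrt(sum(w * w for w in b.values()))
    return dot / (na * nb) if na and nb else 0.0

# Hypothetical ATEL-like snippets: two supernova reports and one blazar flare.
atels = [
    "spectroscopic confirmation of a type Ia supernova",
    "photometry of the type Ia supernova candidate",
    "gamma-ray flare detected from the blazar",
]
vecs = tfidf_vectors(atels)
```

Feeding such pairwise similarities into a standard hierarchical (agglomerative) clustering routine would then organize the reports by shared vocabulary, which is the spirit of the concept-scheme construction described.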
Natural language processing (NLP) is a subfield of artificial intelligence and computational linguistics. It studies the problems of automated generation and understanding of natural human languages. This paper outlines a framework that uses computer and natural language techniques so that learners at various levels can learn foreign languages in a computer-based learning environment. We propose some ideas for using the computer as a practical tool for learning foreign languages, where most of the courseware is generated automatically. We then describe how to build computer-based learning tools, discuss their effectiveness, and conclude with some possibilities for using on-line resources.
Deane, Paul; Sheehan, Kathleen
This paper is an exploration of the conceptual issues that have arisen in the course of building a natural language generation (NLG) system for automatic test item generation. While natural language processing techniques are applicable to general verbal items, mathematics word problems are particularly tractable targets for natural language…
Miller, William R; Johnson, Wendy R
Client motivation for change, a topic of high interest to addiction clinicians, is multidimensional and complex, and many different approaches to measurement have been tried. The current effort drew on psycholinguistic research on natural language that is used by clients to describe their own motivation. Seven addiction treatment sites participated in the development of a simple scale to measure client motivation. Twelve items were drafted to represent six potential dimensions of motivation for change that occur in natural discourse. The maximum self-rating of motivation (10 on a 0-10 scale) was the median score on all items, and 43% of respondents rated 10 on all 12 items - a substantial ceiling effect. From 1035 responses, three factors emerged representing importance, ability, and commitment - constructs that are also reflected in several theoretical models of motivation. A 3-item version of the scale, with one marker item for each of these constructs, accounted for 81% of variance in the full scale. The three items are: 1. It is important for me to . . . 2. I could . . . and 3. I am trying to . . . This offers a quick (1-minute) assessment of clients' self-reported motivation for change.
This study explores language ideologies of English at a Korean university where English has been adopted as an official language. This study draws on ethnographic data in order to understand how speakers respond to and experience the institutional language policy. The findings show that language ideologies in this university represent the…
Wittrock, Merlin C.
Concepts in cognitive psychology are applied to the language used in military situations, and a sentence classification system for use in analyzing military language is outlined. The system is designed to be used, in part, in conjunction with a natural language query system that allows a user to access a database. The discussion of military…
Yu, Ping; Pan, Yingxin; Li, Chen; Zhang, Zengxiu; Shi, Qin; Chu, Wenpei; Liu, Mingzhuo; Zhu, Zhiting
Oral production is an important part in English learning. Lack of a language environment with efficient instruction and feedback is a big issue for non-native speakers' English spoken skill improvement. A computer-assisted language learning system can provide many potential benefits to language learners. It allows adequate instructions and instant…
Vývoj sociální kognice českých neslyšících dětí — uživatelů českého znakového jazyka a uživatelů mluvené češtiny: adaptace testové baterie : Development of Social Cognition in Czech Deaf Children — Czech Sign Language Users and Czech Spoken Language Users: Adaptation of a Test Battery
Full Text Available The present paper describes the process of adapting a set of tasks for testing theory-of-mind competencies, the Theory of Mind Task Battery, for use with the population of Czech Deaf children — both users of Czech Sign Language and those using spoken Czech.
Hirschman, Lynette; Fort, Karën; Boué, Stéphanie; Kyrpides, Nikos; Islamaj Doğan, Rezarta; Cohen, Kevin Bretonnel
Crowdsourcing is increasingly utilized for performing tasks in both natural language processing and biocuration. Although there have been many applications of crowdsourcing in these fields, there have been fewer high-level discussions of the methodology and its applicability to biocuration. This paper explores crowdsourcing for biocuration through several case studies that highlight different ways of leveraging 'the crowd'; these raise issues about the kind(s) of expertise needed, the motivations of participants, and questions related to feasibility, cost and quality. The paper is an outgrowth of a panel session held at BioCreative V (Seville, September 9-11, 2015). The session consisted of four short talks, followed by a discussion. In their talks, the panelists explored the role of expertise and the potential to improve crowd performance by training; the challenge of decomposing tasks to make them amenable to crowdsourcing; and the capture of biological data and metadata through community editing.Database URL: http://www.mitre.org/publications/technical-papers/crowdsourcing-and-curation-perspectives. © The Author(s) 2016. Published by Oxford University Press.
A new approach for processing vowelized and unvowelized Arabic texts in order to prepare them for Natural Language Processing (NLP) purposes is described. The developed approach is rule-based and made up of four phases: text tokenization, word light stemming, word morphological analysis and text annotation. The first phase preprocesses the input text in order to isolate the words and represent them in a formal way. The second phase applies a light stemmer in order to extract the stem of each word by eliminating the prefixes and suffixes. The third phase is a rule-based morphological analyzer that determines the root and the morphological pattern for each extracted stem. The last phase produces an annotated text where each word is tagged with its morphological attributes. The preprocessor presented in this paper is capable of dealing with vowelized and unvowelized words, and provides the input words along with relevant linguistic information needed by different applications. It is designed to be used with different NLP applications such as machine translation, text summarization, text correction, information retrieval and automatic vowelization of Arabic text. (author)
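As a rough illustration of the second phase, a light stemmer can strip one matching prefix and one matching suffix from a surface form. The affix lists below are a small assumed subset chosen for the example; real Arabic light stemmers use longer, ordered affix lists plus the morphological constraints supplied by the analyzer phase.

```python
# Illustrative light stemmer in the spirit of the paper's second phase.
# The affix lists are assumptions for this sketch, not the paper's rules.
PREFIXES = ["وال", "بال", "كال", "فال", "ال", "و"]   # longest first
SUFFIXES = ["ات", "ون", "ين", "ها", "ية", "ه", "ة"]

def light_stem(word):
    """Strip at most one prefix and one suffix, keeping a stem of >= 3 letters."""
    for p in PREFIXES:
        if word.startswith(p) and len(word) - len(p) >= 3:
            word = word[len(p):]
            break
    for s in SUFFIXES:
        if word.endswith(s) and len(word) - len(s) >= 3:
            word = word[:-len(s)]
            break
    return word
```

For example, a form like والمدرسة ("and the school") loses the conjunction-plus-article prefix and the taa marbuta suffix, leaving the stem مدرس for the morphological analyzer of the third phase.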
Juuso, Esko K.
Performance improvement is taken as the primary goal in asset management. Advanced data analysis is needed to efficiently integrate condition monitoring data into operation and maintenance. Intelligent stress and condition indices have been developed for control and condition monitoring by combining generalized norms with efficient nonlinear scaling. These nonlinear scaling methodologies can also be used to handle performance measures used for management, since management-oriented indicators can be presented on the same scale as intelligent condition and stress indices. Performance indicators are responses of the process, machine or system to the stress contributions analyzed from process and condition monitoring data. Scaled values are directly used in intelligent temporal analysis to calculate fluctuations and trends. All these methodologies can be used in prognostics and fatigue prediction. The meanings of the variables are beneficial in extracting expert knowledge and representing information in natural language. The idea of dividing the problems into the variable-specific meanings and the directions of interactions provides various improvements for performance monitoring and decision making. The integrated temporal analysis and uncertainty processing facilitates the efficient use of domain expertise. Measurements can be monitored with generalized statistical process control (GSPC) based on the same scaling functions.
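The core idea of combining a generalized (Hölder-type) norm with scaling onto a common index range can be sketched as follows. The piecewise-linear scaling function and its parameter names (`lo`, `center`, `hi`) are simplified stand-ins for the paper's nonlinear membership-style scaling functions, assumed here for illustration.

```python
def generalized_norm(values, p):
    """Hölder mean (1/N * sum(x**p)) ** (1/p): arithmetic mean at p=1, RMS at p=2."""
    return (sum(v ** p for v in values) / len(values)) ** (1.0 / p)

def scale_index(x, lo, center, hi):
    """Piecewise-linear mapping of a feature onto the common index range [-2, 2].

    `center` maps to 0; `lo` and `hi` map to -2 and +2; values beyond are clipped.
    """
    if x <= center:
        return max(-2.0, 2.0 * (x - center) / (center - lo))
    return min(2.0, 2.0 * (x - center) / (hi - center))
```

Once stress features, condition features and management indicators are all mapped onto the same [-2, 2] range, they can be compared and combined directly in temporal analysis, which is the point the abstract makes.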
Wu, Stephen T; Kaggal, Vinod C; Dligach, Dmitriy; Masanz, James J; Chen, Pei; Becker, Lee; Chapman, Wendy W; Savova, Guergana K; Liu, Hongfang; Chute, Christopher G
One challenge in reusing clinical data stored in electronic medical records is that these data are heterogenous. Clinical Natural Language Processing (NLP) plays an important role in transforming information in clinical text to a standard representation that is comparable and interoperable. Information may be processed and shared when a type system specifies the allowable data structures. Therefore, we aim to define a common type system for clinical NLP that enables interoperability between structured and unstructured data generated in different clinical settings. We describe a common type system for clinical NLP that has an end target of deep semantics based on Clinical Element Models (CEMs), thus interoperating with structured data and accommodating diverse NLP approaches. The type system has been implemented in UIMA (Unstructured Information Management Architecture) and is fully functional in a popular open-source clinical NLP system, cTAKES (clinical Text Analysis and Knowledge Extraction System) versions 2.0 and later. We have created a type system that targets deep semantics, thereby allowing for NLP systems to encapsulate knowledge from text and share it alongside heterogenous clinical data sources. Rather than surface semantics that are typically the end product of NLP algorithms, CEM-based semantics explicitly build in deep clinical semantics as the point of interoperability with more structured data types.
Appelo, Lisette; Leermakers, M.C.J.; Rous, J.H.G.
A method is described for the generation of related natural-language expressions. The method is based on a formal grammar of the natural language in question, specified in the Controlled M-Grammar (CMG) formalism. In the CMG framework the generation of an utterance is controlled by a derivation
Carmichael, Lesley; Wright, Richard; Wassink, Alicia Beckford
We are developing a novel, searchable corpus as a research tool for investigating phonetic and phonological phenomena across various speech styles. Five speech styles have been well studied independently in previous work: reduced (casual), careful (hyperarticulated), citation (reading), Lombard effect (speech in noise), and "motherese" (child-directed speech). Few studies to date have collected a wide range of styles from a single set of speakers, and fewer yet have provided publicly available corpora. The pilot corpus includes recordings of (1) a set of speakers participating in a variety of tasks designed to elicit the five speech styles, and (2) casual peer conversations and wordlists to illustrate regional vowels. The data include high-quality recordings and time-aligned transcriptions linked to text files that can be queried. Initial measures drawn from the database provide comparison across speech styles along the following acoustic dimensions: MLU (changes in unit duration); relative intra-speaker intensity changes (mean and dynamic range); and intra-speaker pitch values (minimum, maximum, mean, range). The corpus design will allow for a variety of analyses requiring control of demographic and style factors, including hyperarticulation variety, disfluencies, intonation, discourse analysis, and detailed spectral measures.
In Monitoring Adaptive Spoken Dialog Systems, authors Alexander Schmitt and Wolfgang Minker investigate statistical approaches that allow for recognition of negative dialog patterns in Spoken Dialog Systems (SDS). The presented stochastic methods allow flexible, portable and accurate use. Beginning with the foundations of machine learning and pattern recognition, this monograph examines how frequently users show negative emotions in spoken dialog systems and develops novel approaches to speech-based emotion recognition using a hybrid approach to model emotions. The authors make use of statistical methods based on acoustic, linguistic and contextual features to examine the relationship between the interaction flow and the occurrence of emotions, using non-acted recordings of several thousand real users from commercial and non-commercial SDS. Additionally, the authors present novel statistical methods that spot problems within a dialog based on interaction patterns. The approaches enable future SDS to offer m...
Gullberg, M.; Robert, L.; Dimroth, C.; Veroude, K.; Indefrey, P.
Despite the literature on the role of input in adult second-language (L2) acquisition and on artificial and statistical language learning, surprisingly little is known about how adults break into a new language in the wild. This article reports on a series of behavioral and neuroimaging studies that
Lee, Ming Che; Chang, Jia Wei; Hsieh, Tung Cheng
This paper presents a grammar and semantic corpus based similarity algorithm for natural language sentences. Natural language, in opposition to "artificial language", such as computer programming languages, is the language used by the general public for daily communication. Traditional information retrieval approaches, such as vector models, LSA, HAL, or even the ontology-based approaches that extend to include concept similarity comparison instead of co-occurrence terms/words, may not always determine a good match when there is no obvious relation or concept overlap between two natural language sentences. This paper proposes a sentence similarity algorithm that takes advantage of a corpus-based ontology and grammatical rules to overcome the addressed problems. Experiments on two famous benchmarks demonstrate that the proposed algorithm has a significant performance improvement on sentences/short texts with arbitrary syntax and structure.
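For contrast with the ontology-based measure the paper proposes, a bare-bones sentence similarity can blend lexical overlap (Jaccard) with a word-order term. This sketch is not the published algorithm, which additionally draws on a corpus-based ontology and grammatical rules; the blend weight `alpha` is an assumption.

```python
def jaccard(a, b):
    """Overlap of the word sets of two tokenised sentences."""
    sa, sb = set(a), set(b)
    return len(sa & sb) / len(sa | sb) if sa | sb else 1.0

def order_similarity(a, b):
    """Agreement of the relative positions of the words shared by a and b."""
    shared = [t for t in a if t in b]
    if not shared:
        return 0.0
    ra = [a.index(t) for t in shared]   # positions in sentence a
    rb = [b.index(t) for t in shared]   # positions in sentence b
    num = sum((x - y) ** 2 for x, y in zip(ra, rb))
    den = sum((x + y) ** 2 for x, y in zip(ra, rb))
    return (1.0 - (num / den) ** 0.5) if den else 1.0

def sentence_similarity(s1, s2, alpha=0.7):
    """Weighted blend of lexical overlap and word order (alpha is assumed)."""
    a, b = s1.lower().split(), s2.lower().split()
    return alpha * jaccard(a, b) + (1 - alpha) * order_similarity(a, b)
```

Note that "the dog chased the cat" and "the cat chased the dog" share an identical word set, so a pure overlap measure scores them 1.0; the order term is what pushes their similarity below 1.0, illustrating why lexical overlap alone is insufficient.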
Langus, Alan; Mehler, Jacques; Nespor, Marina
Spoken language is governed by rhythm. Linguistic rhythm is hierarchical and the rhythmic hierarchy partially mimics the prosodic as well as the morpho-syntactic hierarchy of spoken language. It can thus provide learners with cues about the structure of the language they are acquiring. We identify three universal levels of linguistic rhythm - the segmental level, the level of the metrical feet and the phonological phrase level - and discuss why primary lexical stress is not rhythmic. We survey experimental evidence on rhythm perception in young infants and native speakers of various languages to determine the properties of linguistic rhythm that are present at birth, those that mature during the first year of life and those that are shaped by the linguistic environment of language learners. We conclude with a discussion of the major gaps in current knowledge on linguistic rhythm and highlight areas of interest for future research that are most likely to yield significant insights into the nature, the perception, and the usefulness of linguistic rhythm. Copyright © 2016 Elsevier Ltd. All rights reserved.
McNamara, Danielle S; Crossley, Scott A; Roscoe, Rod
The Writing Pal is an intelligent tutoring system that provides writing strategy training. A large part of its artificial intelligence resides in the natural language processing algorithms to assess essay quality and guide feedback to students. Because writing is often highly nuanced and subjective, the development of these algorithms must consider a broad array of linguistic, rhetorical, and contextual features. This study assesses the potential for computational indices to predict human ratings of essay quality. Past studies have demonstrated that linguistic indices related to lexical diversity, word frequency, and syntactic complexity are significant predictors of human judgments of essay quality but that indices of cohesion are not. The present study extends prior work by including a larger data sample and an expanded set of indices to assess new lexical, syntactic, cohesion, rhetorical, and reading ease indices. Three models were assessed. The model reported by McNamara, Crossley, and McCarthy (Written Communication 27:57-86, 2010) including three indices of lexical diversity, word frequency, and syntactic complexity accounted for only 6% of the variance in the larger data set. A regression model including the full set of indices examined in prior studies of writing predicted 38% of the variance in human scores of essay quality with 91% adjacent accuracy (i.e., within 1 point). A regression model that also included new indices related to rhetoric and cohesion predicted 44% of the variance with 94% adjacent accuracy. The new indices increased accuracy but, more importantly, afford the means to provide more meaningful feedback in the context of a writing tutoring system.
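Toy versions of three of the index families mentioned above (lexical diversity, word frequency, syntactic complexity) can be computed directly from text. The real system relies on far richer Coh-Metrix-style measures, and the frequency table here is a hypothetical stand-in for a reference frequency list.

```python
import re

def indices(essay, freq_table):
    """Return (lexical diversity, mean word frequency, mean sentence length).

    freq_table maps words to corpus frequencies; unseen words default to 1.
    """
    words = re.findall(r"[a-z']+", essay.lower())
    sentences = [s for s in re.split(r"[.!?]+", essay) if s.strip()]
    ttr = len(set(words)) / len(words)                    # type-token ratio
    mean_freq = sum(freq_table.get(w, 1) for w in words) / len(words)
    mean_len = len(words) / len(sentences)                # crude complexity proxy
    return ttr, mean_freq, mean_len
```

Indices like these would then serve as predictors in the regression models the study evaluates against human essay scores.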
Haug Peter J
Full Text Available Abstract Background The medical problem list is an important part of the electronic medical record in development in our institution. To serve the functions it is designed for, the problem list has to be as accurate and timely as possible. However, the current problem list is usually incomplete and inaccurate, and is often totally unused. To alleviate this issue, we are building an environment where the problem list can be easily and effectively maintained. Methods For this project, 80 medical problems were selected for their frequency of use in our future clinical field of evaluation (cardiovascular). We have developed an Automated Problem List system composed of two main components: a background and a foreground application. The background application uses Natural Language Processing (NLP) to harvest potential problem list entries from the list of 80 targeted problems detected in the multiple free-text electronic documents available in our electronic medical record. These proposed medical problems drive the foreground application designed for management of the problem list. Within this application, the extracted problems are proposed to the physicians for addition to the official problem list. Results The set of 80 targeted medical problems selected for this project covered about 5% of all possible diagnoses coded in ICD-9-CM in our study population (cardiovascular adult inpatients), but about 64% of all instances of these coded diagnoses. The system contains algorithms that first detect document sections, then sentences within these sections, and finally potential problems within the sentences. The initial evaluation of the section and sentence detection algorithms demonstrated a sensitivity and positive predictive value of 100% when detecting sections, and a sensitivity of 89% and a positive predictive value of 94% when detecting sentences. Conclusion The global aim of our project is to automate the process of creating and maintaining a problem
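The three detection stages (sections, then sentences, then problems) can be caricatured with a rule-based sketch. The section headers, the tiny problem lexicon, and the single negation cue below are illustrative assumptions, not the system's actual rules.

```python
import re

# Assumed, illustrative header set and problem lexicon.
SECTION_HEADER = re.compile(r"^(HISTORY|ASSESSMENT|PLAN|IMPRESSION):", re.M)
PROBLEMS = {"atrial fibrillation", "heart failure", "hypertension"}

def detect_problems(note):
    found = set()
    for chunk in SECTION_HEADER.split(note):               # 1. section detection
        for sentence in re.split(r"(?<=[.;])\s+", chunk):  # 2. sentence detection
            low = sentence.lower()
            for problem in PROBLEMS:                       # 3. problem detection
                # keep mentions that are not trivially negated
                if problem in low and "no " + problem not in low:
                    found.add(problem)
    return found
```

A real clinical NLP pipeline would add concept normalization and proper negation/uncertainty handling, but the sketch shows why sentence boundaries matter: the negation cue must be checked within the same sentence as the mention.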
Redd, Andrew; Pickard, Steve; Meystre, Stephane; Scehnet, Jeffrey; Bolton, Dan; Heavirland, Julia; Weaver, Allison Lynn; Hope, Carol; Garvin, Jennifer Hornung
We introduce and evaluate a new, easily accessible tool using a common statistical analysis and business analytics software suite, SAS, which can be programmed to remove specific protected health information (PHI) from a text document. Removal of PHI is important because the quantity of text documents used for research with natural language processing (NLP) is increasing. When using existing data for research, an investigator must remove all PHI not needed for the research to comply with human subjects' right to privacy. This process is similar, but not identical, to de-identification of a given set of documents. PHI Hunter removes PHI from free-form text. It is a set of rules to identify and remove patterns in text. PHI Hunter was applied to 473 Department of Veterans Affairs (VA) text documents randomly drawn from a research corpus stored as unstructured text in VA files. PHI Hunter performed well with PHI in the form of identification numbers such as Social Security numbers, phone numbers, and medical record numbers. The most commonly missed PHI items were names and locations. Incorrect removal of information occurred with text that looked like identification numbers. PHI Hunter fills a niche role that is related to but not equal to the role of de-identification tools. It gives research staff a tool to reasonably increase patient privacy. It performs well for highly sensitive PHI categories that are rarely used in research, but still shows possible areas for improvement. More development for patterns of text and linked demographic tables from electronic health records (EHRs) would improve the program so that more precise identifiable information can be removed. PHI Hunter is an accessible tool that can flexibly remove PHI not needed for research. If it can be tailored to the specific data set via linked demographic tables, its performance will improve in each new document set.
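The rule-based removal of identifier-style PHI described above can be sketched in a few lines. The following Python fragment is a toy illustration only: the patterns and placeholder labels are assumptions for the example, not PHI Hunter's actual SAS rules, and a real de-identification pass would need far broader coverage.

```python
import re

# Illustrative identifier patterns (assumed for this sketch, not PHI Hunter's rules)
PHI_PATTERNS = {
    "ssn": re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),
    "phone": re.compile(r"\(?\b\d{3}\)?[-. ]\d{3}[-. ]\d{4}\b"),
    "mrn": re.compile(r"\bMRN[:#]?\s*\d{6,10}\b", re.IGNORECASE),
}

def scrub(text):
    """Replace each matched PHI pattern with a category placeholder."""
    for label, pattern in PHI_PATTERNS.items():
        text = pattern.sub(f"[{label.upper()}]", text)
    return text

print(scrub("Patient SSN 123-45-6789, call (801) 555-0142, MRN: 00123456."))
# Patient SSN [SSN], call [PHONE], [MRN].
```

As the abstract notes, patterns like these do well on structured identifiers but miss names and locations, which require demographic lookup tables or statistical models rather than regular expressions.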
Rojas Barahona , Lina Maria; Quaglini , Silvana; Stefanelli , Mario
The prospective home-care management will probably offer intelligent conversational assistants for supporting patients at home through natural language interfaces. Homecare assistance in natural language, HomeNL, is a proof-of-concept dialogue system for the management of patients with hypertension. It follows up a conversation with a patient in which the patient is able to take the initiative. HomeNL processes natural language, makes an internal representation...
The Cross-Lingual Information Retrieval (CLIR) or Multilingual Information Retrieval (MIR) system has become a key issue in electronic document management systems in multinational environments. We propose here a multilingual information retrieval system consisting of a morpho-syntactic analyser, a transfer system from source language to target language, and an information retrieval system. A thorough investigation into the system architecture and the transfer mechanisms is presented in this report, using two different performance evaluation methods.
Li, Peggy; Dunham, Yarrow; Carey, Susan
Shown an entity (e.g., a plastic whisk) labeled by a novel noun in neutral syntax, speakers of Japanese, a classifier language, are more likely to assume the noun refers to the substance (plastic) than are speakers of English, a count/mass language, who are instead more likely to assume it refers to the object kind [whisk; Imai, M., & Gentner, D.…
I Nengah Sudipa
This article investigates the spoken ability of German students using Bahasa Indonesia (BI). They have studied it for six weeks in the IBSN Program at Udayana University, Bali, Indonesia. The data was collected at the time the students sat for the mid-term oral test and was further analyzed with reference to the standard usage of BI. The result suggests that most students managed to express several concepts related to (1) LOCATION; (2) TIME; (3) TRANSPORT; (4) PURPOSE; (5) TRANSACTION; (6) IMPRESSION; (7) REASON; (8) FOOD AND BEVERAGE; and (9) NUMBER AND PERSON. The only problem a few students might encounter is due to influence from their own language system, called interference, especially in word order.
Pelucchi, Bruna; Hay, Jessica F; Saffran, Jenny R
Numerous studies over the past decade support the claim that infants are equipped with powerful statistical language learning mechanisms. The primary evidence for statistical language learning in word segmentation comes from studies using artificial languages, continuous streams of synthesized syllables that are highly simplified relative to real speech. To what extent can these conclusions be scaled up to natural language learning? In the current experiments, English-learning 8-month-old infants' ability to track transitional probabilities in fluent infant-directed Italian speech was tested (N = 72). The results suggest that infants are sensitive to transitional probability cues in unfamiliar natural language stimuli, and support the claim that statistical learning is sufficiently robust to support aspects of real-world language acquisition.
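The transitional-probability cue tested in studies like this one has a simple definition: TP(X→Y) = frequency(XY) / frequency(X), computed over adjacent syllables. A minimal sketch, with made-up syllables rather than the Italian stimuli used in the experiments:

```python
from collections import Counter

def transitional_probabilities(syllables):
    """TP(X -> Y) = count(XY) / count(X), over an ordered syllable stream."""
    pair_counts = Counter(zip(syllables, syllables[1:]))
    unit_counts = Counter(syllables[:-1])
    return {pair: n / unit_counts[pair[0]] for pair, n in pair_counts.items()}

# Toy stream: the "word" bi-da always occurs intact, so its internal TP is 1.0,
# while syllable pairs spanning word boundaries get lower TPs.
stream = ["bi", "da", "ku", "bi", "da", "go", "bi", "da", "ku"]
tps = transitional_probabilities(stream)
print(tps[("bi", "da")])  # 1.0
```

High-TP syllable pairs are candidate word-internal transitions; dips in TP mark likely word boundaries, which is the segmentation signal infants are hypothesized to exploit.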
Natural Language Processing is one of the most rapidly developing research fields. In most applications related to Natural Language Processing, the findings of Morphological Analysis and Morphological Generation can be considered very important, as morphological study is the technique used to recognise a word, and its output can be used at later stages. Keeping in view this importance, this paper describes how Morphological Analysis and Morphological Generation can prove to be an important part of various Natural Language Processing fields such as spell checkers, machine translation, etc.
Kowal, Sabine; O'Connell, Daniel C
The following article presents basic concepts and methods of Ragnar Rommetveit's (born 1924) hermeneutic-dialogical approach to everyday spoken dialogue with a focus on both shared consciousness and linguistically mediated meaning. He developed this approach originally in his engagement of mainstream linguistic and psycholinguistic research of the 1960s and 1970s. He criticized this research tradition for its individualistic orientation and its adherence to experimental methodology which did not allow the engagement of interactively established meaning and understanding in everyday spoken dialogue. As a social psychologist influenced by phenomenological philosophy, Rommetveit opted for an alternative conceptualization of such dialogue as a contextualized, partially private world, temporarily co-established by interlocutors on the basis of shared consciousness. He argued that everyday spoken dialogue should be investigated from within, i.e., from the perspectives of the interlocutors and from a psychology of the second person. Hence, he developed his approach with an emphasis on intersubjectivity, perspectivity and perspectival relativity, meaning potential of utterances, and epistemic responsibility of interlocutors. In his methods, he limited himself for the most part to casuistic analyses, i.e., logical analyses of fictitious examples to argue for the plausibility of his approach. After many years of experimental research on language, he pursued his phenomenologically oriented research on dialogue in English-language publications from the late 1980s up to 2003. During that period, he engaged psycholinguistic research on spoken dialogue carried out by Anglo-American colleagues only occasionally. Although his work remained unfinished and open to development, it provides both a challenging alternative and supplement to current Anglo-American research on spoken dialogue and some overlap therewith.
Mast, Marion; Maier, Elisabeth; Schmitz, Birte
This report describes how spoken language turns are segmented into utterances in the framework of the verbmobil project. The problem of segmenting turns is directly related to the task of annotating a discourse with dialogue act information: an utterance can be characterized as a stretch of dialogue that is attributed one dialogue act. Unfortunately, this rule in many cases is insufficient and many doubtful cases remain. We tried to at least reduce the number of unclear cases by providing a n...
Ming Che Lee
This paper presents a grammar- and semantic-corpus-based similarity algorithm for natural language sentences. Natural language, in opposition to "artificial language" such as computer programming languages, is the language used by the general public for daily communication. Traditional information retrieval approaches, such as vector models, LSA, HAL, or even the ontology-based approaches that extend to include concept similarity comparison instead of co-occurrence terms/words, may not always determine the perfect matching when there is no obvious relation or concept overlap between two natural language sentences. This paper proposes a sentence similarity algorithm that takes advantage of a corpus-based ontology and grammatical rules to overcome the addressed problems. Experiments on two famous benchmarks demonstrate that the proposed algorithm yields a significant performance improvement on sentences/short texts with arbitrary syntax and structure.
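The paper's actual algorithm is not reproduced here, but the core idea it builds on, scoring sentence pairs through word-level semantic similarity rather than exact term overlap, can be sketched as follows. The similarity table is a toy stand-in for the corpus-based ontology, and the greedy best-match averaging is one common aggregation choice, not necessarily the authors' own.

```python
def word_sim(w1, w2, sim_table):
    """Look up semantic similarity between two words (1.0 for identity)."""
    if w1 == w2:
        return 1.0
    return sim_table.get((w1, w2), sim_table.get((w2, w1), 0.0))

def sentence_sim(s1, s2, sim_table):
    """Symmetric greedy matching: each word pairs with its most similar counterpart."""
    words1, words2 = s1.lower().split(), s2.lower().split()
    def directed(a, b):
        return sum(max(word_sim(w, v, sim_table) for v in b) for w in a) / len(a)
    return (directed(words1, words2) + directed(words2, words1)) / 2

# Toy stand-in for a corpus-derived word-similarity table (assumed values)
table = {("car", "automobile"): 0.9, ("fast", "quick"): 0.8}
print(round(sentence_sim("the car is fast", "the automobile is quick", table), 3))
# 0.925
```

A plain bag-of-words cosine would score these two sentences low because "car"/"automobile" and "fast"/"quick" never co-occur literally; the word-level similarity table is what lets the sentence score stay high, which is the gap the paper targets.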
Kiraz, George Anton
This book presents a tractable computational model that can cope with complex morphological operations, especially in Semitic languages, and less complex morphological systems present in Western languages. It outlines a new generalized regular rewrite rule system that uses multiple finite-state automata to cater to root-and-pattern morphology,…
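The root-and-pattern morphology that this model targets can be illustrated with a toy interdigitation function. This is a deliberate simplification for illustration only: Kiraz's system uses generalized rewrite rules over multiple finite-state automata, not a direct string fill.

```python
def interdigitate(root, pattern):
    """Fill the C slots of a CV template with root consonants (toy illustration)."""
    consonants = iter(root)
    return "".join(next(consonants) if ch == "C" else ch for ch in pattern)

# Semitic-style example: the triconsonantal root k-t-b combined with
# different vocalic templates yields different word forms.
print(interdigitate("ktb", "CaCaC"))  # katab
print(interdigitate("ktb", "CuCiC"))  # kutib
```

The point of the multi-tape finite-state formulation is that root, template, and vocalism can each live on their own tape, so the same machinery also handles the simpler concatenative morphology of Western languages as a special case.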
Wagner, J C; Solomon, W D; Michel, P A; Juge, C; Baud, R H; Rector, A L; Scherrer, J R
Re-usable and sharable, and therefore language-independent concept models are of increasing importance in the medical domain. The GALEN project (Generalized Architecture for Languages, Encyclopedias and Nomenclatures in Medicine) aims at developing language-independent concept representation systems as the foundations for the next generation of multilingual coding systems. For use within clinical applications, the content of the model has to be mapped to natural language. A so-called Multilingual Information Module (MM) establishes the link between the language-independent concept model and different natural languages. This text generation software must be versatile enough to cope at the same time with different languages and with different parts of a compositional model. It has to meet, on the one hand, the properties of the language as used in the medical domain and, on the other hand, the specific characteristics of the underlying model and its representation formalism. We propose a semantic-oriented approach to natural language generation that is based on linguistic annotations to a concept model. This approach is realized as an integral part of a Terminology Server, built around the concept model and offering different terminological services for clinical applications.
Jamil, Hasan M
One of the many unique features of biological databases is that the mere existence of a ground data item is not always a precondition for a query response. It may be argued that from a biologist's standpoint, queries are not always best posed using a structured language. By this we mean that approximate and flexible responses to natural language like queries are well suited for this domain. This is partly due to biologists' tendency to seek simpler interfaces and partly due to the fact that questions in biology involve high level concepts that are open to interpretations computed using sophisticated tools. In such highly interpretive environments, rigidly structured databases do not always perform well. In this paper, our goal is to propose a semantic correspondence plug-in to aid natural language query processing over arbitrary biological database schema with an aim to providing cooperative responses to queries tailored to users' interpretations. Natural language interfaces for databases are generally effective when they are tuned to the underlying database schema and its semantics. Therefore, changes in database schema become impossible to support, or a substantial reorganization cost must be absorbed to reflect any change. We leverage developments in natural language parsing, rule languages and ontologies, and data integration technologies to assemble a prototype query processor that is able to transform a natural language query into a semantically equivalent structured query over the database. We allow knowledge rules and their frequent modifications as part of the underlying database schema. The approach we adopt in our plug-in overcomes some of the serious limitations of many contemporary natural language interfaces, including support for schema modifications and independence from underlying database schema. The plug-in introduced in this paper is generic and facilitates connecting user selected natural language interfaces to arbitrary databases using a
Barrera, Rosalinda B.; Aleman, Magdalena
Described is a newspaper project in which elementary students report life as it was in the Middle Ages. Students are involved in a variety of language-centered activities. For example, they gather and evaluate information about medieval times and write, edit, and proofread articles for the newspaper. (RM)
Where Humans Meet Machines: Innovative Solutions for Knotty Natural-Language Problems brings humans and machines closer together by showing how linguistic complexities that confound the speech systems of today can be handled effectively by sophisticated natural-language technology. Some of the most vexing natural-language problems that are addressed in this book entail recognizing and processing idiomatic expressions, understanding metaphors, matching an anaphor correctly with its antecedent, performing word-sense disambiguation, and handling out-of-vocabulary words and phrases. This fourteen-chapter anthology consists of contributions from industry scientists and from academicians working at major universities in North America and Europe. They include researchers who have played a central role in DARPA-funded programs and developers who craft real-world solutions for corporations. These contributing authors analyze the role of natural language technology in the global marketplace; they explore the need f...
Nakai, Satsuki; Lindsay, Shane; Ota, Mitsuhiko
When both members of a phonemic contrast in L2 (second language) are perceptually mapped to a single phoneme in one's L1 (first language), L2 words containing a member of that contrast can spuriously activate L2 words in spoken-word recognition. For example, upon hearing cattle, Dutch speakers of English are reported to experience activation…
Zhao, Jingjing; Guo, Jingjing; Zhou, Fengying; Shu, Hua
Evidence from event-related potential (ERP) analyses of English spoken words suggests that the time course of English word recognition in monosyllables is cumulative. Different types of phonological competitors (i.e., rhymes and cohorts) modulate the temporal grain of ERP components differentially (Desroches, Newman, & Joanisse, 2009). The time course of Chinese monosyllabic spoken word recognition could be different from that of English due to the differences in syllable structure between the two languages (e.g., lexical tones). The present study investigated the time course of Chinese monosyllabic spoken word recognition using ERPs to record brain responses online while subjects listened to spoken words. During the experiment, participants were asked to compare a target picture with a subsequent picture by judging whether or not these two pictures belonged to the same semantic category. The spoken word was presented between the two pictures, and participants were not required to respond during its presentation. We manipulated phonological competition by presenting spoken words that either matched or mismatched the target picture in one of the following four ways: onset mismatch, rime mismatch, tone mismatch, or syllable mismatch. In contrast to the English findings, our findings showed that the three partial mismatches (onset, rime, and tone mismatches) equally modulated the amplitudes and time courses of the N400 (a negative component that peaks about 400ms after the spoken word), whereas, the syllable mismatched words elicited an earlier and stronger N400 than the three partial mismatched words. The results shed light on the important role of syllable-level awareness in Chinese spoken word recognition and also imply that the recognition of Chinese monosyllabic words might rely more on global similarity of the whole syllable structure or syllable-based holistic processing rather than phonemic segment-based processing. We interpret the differences in spoken word
We describe basic concepts and software architectures for the integration of shallow and deep (linguistics-based, semantics-oriented) natural language processing (NLP) components. The main goal of this novel, hybrid integration paradigm is improving robustness of deep processing. After an introduction to constraint-based natural language parsing, we give an overview of typical shallow processing tasks. We introduce XML standoff markup as an additional abstraction layer that eases integration ...
Service-oriented chatbot systems are used to inform users in a conversational manner about a particular service or product on a website. Our research shows that current systems are time-consuming to build and not very accurate or satisfying to users. We find that natural language understanding and natural language generation methods are central to creating an efficient and useful system. In this thesis we investigate current and past methods in this research area and place particular emph...
It is difficult to find the exact number of other languages spoken besides Dutch in the Netherlands. A study showed that a total of 96 other languages are spoken by students attending Dutch primary and secondary schools. The variety of languages spoken shows the growth of linguistic diversity in the
Snefjella, Bryor; Kuperman, Victor
Existing evidence shows that more abstract mental representations are formed and more abstract language is used to characterize phenomena that are more distant from the self. Yet the precise form of the functional relationship between distance and linguistic abstractness is unknown. In four studies, we tested whether more abstract language is used in textual references to more geographically distant cities (Study 1), time points further into the past or future (Study 2), references to more socially distant people (Study 3), and references to a specific topic (Study 4). Using millions of linguistic productions from thousands of social-media users, we determined that linguistic concreteness is a curvilinear function of the logarithm of distance, and we discuss psychological underpinnings of the mathematical properties of this relationship. We also demonstrated that gradient curvilinear effects of geographic and temporal distance on concreteness are nearly identical, which suggests uniformity in representation of abstractness along multiple dimensions. © The Author(s) 2015.
Botha, Jan A.; Pitler, Emily; Ma, Ji; Bakalov, Anton; Salcianu, Alex; Weiss, David; McDonald, Ryan; Petrov, Slav
We show that small and shallow feed-forward neural networks can achieve near state-of-the-art results on a range of unstructured and structured language processing tasks while being considerably cheaper in memory and computational requirements than deep recurrent models. Motivated by resource-constrained environments like mobile phones, we showcase simple techniques for obtaining such small neural network models, and investigate different tradeoffs when deciding how to allocate a small memory...
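A minimal sketch of the kind of architecture described, sparse hashed character n-gram features feeding one small hidden layer, follows. The dimensions and the random (untrained) weights are assumptions purely for illustration; the paper's models are trained and tuned per task.

```python
import numpy as np

rng = np.random.default_rng(0)

def hash_features(word, n=3, dim=64):
    """Hash character n-grams into a small fixed-size feature vector."""
    vec = np.zeros(dim)
    padded = f"^{word}$"  # boundary markers so prefixes/suffixes are distinct n-grams
    for i in range(len(padded) - n + 1):
        vec[hash(padded[i:i + n]) % dim] += 1.0
    return vec

# One small hidden layer: 64 hashed features -> 16 hidden units -> 3 tag scores.
# Weights are random here; in practice they would be trained on labeled data.
W1, b1 = rng.normal(size=(64, 16)), np.zeros(16)
W2, b2 = rng.normal(size=(16, 3)), np.zeros(3)

def predict(word):
    h = np.maximum(hash_features(word) @ W1 + b1, 0.0)  # ReLU hidden layer
    return int((h @ W2 + b2).argmax())
```

Feature hashing bounds memory regardless of vocabulary size, which is the key to the mobile-friendly footprint the abstract emphasizes: the whole model above is a few thousand floats, versus millions of parameters for a deep recurrent model.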
Strauß, Antje; Kotz, Sonja A; Scharinger, Mathias; Obleser, Jonas
Slow neural oscillations (~1-15 Hz) are thought to orchestrate the neural processes of spoken language comprehension. However, functional subdivisions within this broad range of frequencies are disputed, with most studies hypothesizing only about single frequency bands. The present study utilizes an established paradigm of spoken word recognition (lexical decision) to test the hypothesis that within the slow neural oscillatory frequency range, distinct functional signatures and cortical networks can be identified at least for theta- (~3-7 Hz) and alpha-frequencies (~8-12 Hz). Listeners performed an auditory lexical decision task on a set of items that formed a word-pseudoword continuum: ranging from (1) real words over (2) ambiguous pseudowords (deviating from real words only in one vowel; comparable to natural mispronunciations in speech) to (3) pseudowords (clearly deviating from real words by randomized syllables). By means of time-frequency analysis and spatial filtering, we observed a dissociation into distinct but simultaneous patterns of alpha power suppression and theta power enhancement. Alpha exhibited a parametric suppression as items increasingly matched real words, in line with lowered functional inhibition in a left-dominant lexical processing network for more word-like input. Simultaneously, theta power in a bilateral fronto-temporal network was selectively enhanced for ambiguous pseudowords only. Thus, enhanced alpha power can neurally 'gate' lexical integration, while enhanced theta power might index functionally more specific ambiguity-resolution processes. To this end, a joint analysis of both frequency bands provides neural evidence for parallel processes in achieving spoken word recognition. Copyright © 2014 Elsevier Inc. All rights reserved.
Autism spectrum disorders (ASD) are pervasive neurodevelopmental disorders involving a number of deficits to linguistic cognition. The gap between genetics and the pathophysiology of ASD remains open, in particular regarding its distinctive linguistic profile. The goal of this paper is to attempt to bridge this gap, focusing on how the autistic brain processes language, particularly through the perspective of brain rhythms. Due to the phenomenon of pleiotropy, which may take some decades to overcome, we believe that studies of brain rhythms, which are not faced with problems of this scale, may constitute a more tractable route to interpreting language deficits in ASD and eventually other neurocognitive disorders. Building on recent attempts to link neural oscillations to certain computational primitives of language, we show that interpreting language deficits in ASD as oscillopathic traits is a potentially fruitful way to construct successful endophenotypes of this condition. Additionally, we show that candidate genes for ASD are overrepresented among the genes that played a role in the evolution of language. These genes include (and are related to) genes involved in brain rhythmicity. We hope that the type of steps taken here will additionally lead to a better understanding of the comorbidity, heterogeneity, and variability of ASD, and may help achieve a better treatment of the affected populations.
It has long been speculated whether communication between humans and machines based on natural speech-related cortical activity is possible. Over the past decade, studies have suggested that it is feasible to recognize isolated aspects of speech from neural signals, such as auditory features, phones, or one of a few isolated words. However, until now it remained an unsolved challenge to decode continuously spoken speech from the neural substrate associated with speech and language processing. Here, we show for the first time that continuously spoken speech can be decoded into the expressed words from intracranial electrocorticographic (ECoG) recordings. Specifically, we implemented a system, which we call Brain-To-Text, that models single phones, employs techniques from automatic speech recognition (ASR), and thereby transforms brain activity while speaking into the corresponding textual representation. Our results demonstrate that our system achieved word error rates as low as 25% and phone error rates below 50%. Additionally, our approach contributes to the current understanding of the neural basis of continuous speech production by identifying those cortical regions that hold substantial information about individual phones. In conclusion, the Brain-To-Text system described in this paper represents an important step towards human-machine communication based on imagined speech.
Emmeche, Claus; Hoffmeyer, Jesper Normann
be of considerable value, not only heuristically, but in order to comprehend the irreducible nature of living organisms. In arguing for a semiotic perspective on living nature, it makes a marked difference whether the departure is made from the tradition of F. de Saussure's structural linguistics or from...
Harispe, Sébastien; Janaqi, Stefan
Artificial Intelligence federates numerous scientific fields in the aim of developing machines able to assist human operators performing complex treatments---most of which demand high cognitive skills (e.g. learning or decision processes). Central to this quest is to give machines the ability to estimate the likeness or similarity between things in the way human beings estimate the similarity between stimuli.In this context, this book focuses on semantic measures: approaches designed for comparing semantic entities such as units of language, e.g. words, sentences, or concepts and instances def
Kandler, Anne; Unger, Roman; Steele, James
'Language shift' is the process whereby members of a community in which more than one language is spoken abandon their original vernacular language in favour of another. The historical shifts to English by Celtic language speakers of Britain and Ireland are particularly well-studied examples for which good census data exist for the most recent 100-120 years in many areas where Celtic languages were once the prevailing vernaculars. We model the dynamics of language shift as a competition process in which the numbers of speakers of each language (both monolingual and bilingual) vary as a function both of internal recruitment (as the net outcome of birth, death, immigration and emigration rates of native speakers), and of gains and losses owing to language shift. We examine two models: a basic model in which bilingualism is simply the transitional state for households moving between alternative monolingual states, and a diglossia model in which there is an additional demand for the endangered language as the preferred medium of communication in some restricted sociolinguistic domain, superimposed on the basic shift dynamics. Fitting our models to census data, we successfully reproduce the demographic trajectories of both languages over the past century. We estimate the rates of recruitment of new Scottish Gaelic speakers that would be required each year (for instance, through school education) to counteract the 'natural wastage' as households with one or more Gaelic speakers fail to transmit the language to the next generation informally, for different rates of loss during informal intergenerational transmission.
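The basic (non-diglossia) competition dynamic resembles the well-known Abrams-Strogatz language-death model. A minimal sketch with assumed parameter values follows; note that the paper's full model additionally tracks bilingual households and demographic rates (birth, death, migration), which this toy version omits.

```python
def simulate_shift(x0, s=0.6, a=1.3, c=0.1, years=300, dt=0.1):
    """Euler integration of an Abrams-Strogatz-style competition model.

    x is the fraction speaking language A:
        dx/dt = c * [ (1-x) * x**a * s  -  x * (1-x)**a * (1-s) ]
    where s is the relative status of A and a is the 'volatility' exponent.
    Parameter values here are illustrative assumptions, not fitted estimates.
    """
    x = x0
    for _ in range(int(years / dt)):
        dx = c * ((1 - x) * x**a * s - x * (1 - x)**a * (1 - s))
        x += dx * dt
    return x

# With higher status (s > 0.5), language A drifts toward fixation even
# from a minority starting share.
print(simulate_shift(0.4))
```

Fitting such a model to census time series, as the authors do for Scottish Gaelic and other Celtic languages, amounts to estimating s, a, and the demographic rates so the trajectory reproduces the observed decline, after which interventions (e.g. school recruitment) can be expressed as added inflow terms.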
Introducing Spoken Dialogue Systems into Intelligent Environments outlines the formalisms of a novel knowledge-driven framework for spoken dialogue management and presents the implementation of a model-based Adaptive Spoken Dialogue Manager (ASDM) called OwlSpeak. The authors have identified three stakeholders that potentially influence the behavior of the ASDM: the user, the SDS, and a complex Intelligent Environment (IE) consisting of various devices, services, and task descriptions. The theoretical foundation of a working ontology-based spoken dialogue description framework, the prototype implementation of the ASDM, and the evaluation activities that are presented as part of this book contribute to the ongoing spoken dialogue research by establishing the fertile ground of model-based adaptive spoken dialogue management. This monograph is ideal for advanced undergraduate students, PhD students, and postdocs as well as academic and industrial researchers and developers in speech and multimodal interactive ...
Canfield, K.; Bray, B.; Huff, S.; Warner, H.
We describe a prototype system for semi-automatic database capture of free-text echocardiography reports. The system is very simple and uses a Unified Medical Language System compatible architecture. We use this system and a large body of texts to create a patient database and develop a comprehensive hierarchical dictionary for echocardiography.
Shafiee Nahrkhalaji, Saeedeh; Lotfi, Ahmad Reza; Koosha, Mansour
The present study aims to reveal some facts concerning first language (L1) and second language (L2) spoken-word processing in unbalanced proficient bilinguals using behavioral measures. The intention here is to examine the effects of auditory repetition word priming and semantic priming in the first and second languages of…
Barker-Plummer, Dave; Dale, Robert; Cox, Richard; Romanczuk, Alex
We have assembled a large corpus of student submissions to an automatic grading system, where the subject matter involves the translation of natural language sentences into propositional logic. Of the 2.3 million translation instances in the corpus, 286,000 (approximately 12%) are categorized as being in error. We want to understand the nature of…
Daltrozzo, Jerome; Emerson, Samantha N; Deocampo, Joanne; Singh, Sonia; Freggens, Marjorie; Branum-Martin, Lee; Conway, Christopher M
Statistical learning (SL) is believed to enable language acquisition by allowing individuals to learn regularities within linguistic input. However, neural evidence supporting a direct relationship between SL and language ability is scarce. We investigated whether there are associations between event-related potential (ERP) correlates of SL and language abilities while controlling for the general level of selective attention. Seventeen adults completed tests of visual SL, receptive vocabulary, grammatical ability, and sentence completion. Response times and ERPs showed that SL is related to receptive vocabulary and grammatical ability. ERPs indicated that the relationship between SL and grammatical ability was independent of attention while the association between SL and receptive vocabulary depended on attention. The implications of these dissociative relationships in terms of underlying mechanisms of SL and language are discussed. These results further elucidate the cognitive nature of the links between SL mechanisms and language abilities. Copyright © 2017 Elsevier Inc. All rights reserved.
Bisikalo Oleg V.
The task of evaluating uncertainty in the measurement of sense in natural language constructions (NLCs) was researched through formalization of the notion of the language image, formalization of artificial cognitive systems (ACSs), and formalization of units of meaning. The method for measuring the sense of natural language constructions incorporated fuzzy relations of meaning, which ensures that information about the links between lemmas of the text is taken into account, permitting the evaluation of two types of measurement uncertainty of sense characteristics. Using the developed application programs, experiments were conducted to investigate the proposed method for identifying informative characteristics of text. The experiments yielded parameter dependencies that allow the Pareto distribution law to be used to define relations between lemmas; analysis of these dependencies permits the identification of exponents of the average number of connections of the language image as the most informative characteristics of text.
Clody, Michael C
The essay argues that Francis Bacon's considerations of parables and cryptography reflect larger interpretative concerns of his natural philosophic project. Bacon describes nature as having a language distinct from those of God and man, and, in so doing, establishes a central problem of his natural philosophy—namely, how can the language of nature be accessed through scientific representation? Ultimately, Bacon's solution relies on a theory of differential and duplicitous signs that conceal within them the hidden voice of nature, which is best recognized in the natural forms of efficient causality. The "alphabet of nature"—those tables of natural occurrences—consequently plays a central role in his program, as it renders nature's language susceptible to a process of decryption that mirrors the model of the bilateral cipher. It is argued that while the writing of Bacon's natural philosophy strives for literality, its investigative process preserves a space for alterity within scientific representation that is made accessible to those with the interpretative key.
Heinrich, Stefan; Wermter, Stefan
For the complex human brain that enables us to communicate in natural language, we have gathered a good understanding of the principles underlying language acquisition and processing, knowledge about sociocultural conditions, and insights into activity patterns in the brain. However, we do not yet understand the behavioural and mechanistic characteristics of natural language processing, or how mechanisms in the brain allow us to acquire and process language. Bridging insights from behavioural psychology and neuroscience, the goal of this paper is to contribute a computational understanding of the characteristics that favour language acquisition. Accordingly, we provide concepts and refinements in cognitive modelling regarding principles and mechanisms in the brain, and propose a neurocognitively plausible model for embodied language acquisition from real-world interaction of a humanoid robot with its environment. In particular, the architecture consists of a continuous-time recurrent neural network in which different parts have different leakage characteristics and thus operate on multiple timescales for every modality, with the higher-level nodes of all modalities associated into cell assemblies. The model is capable of learning language production grounded in both temporally dynamic somatosensation and vision, and features hierarchical concept abstraction, concept decomposition, multi-modal integration, and self-organisation of latent representations.
Liu, Haitao; Xu, Chunshan; Liang, Junying
Dependency distance, measured by the linear distance between two syntactically related words in a sentence, is generally held as an important index of memory burden and an indicator of syntactic difficulty. Since this constraint of memory is common for all human beings, there may well be a universal preference for dependency distance minimization (DDM) for the sake of reducing memory burden. This human-driven language universal is supported by big data analyses of various corpora that consistently report shorter overall dependency distance in natural languages than in artificial random languages and long-tailed distributions featuring a majority of short dependencies and a minority of long ones. Human languages, as complex systems, seem to have evolved to come up with diverse syntactic patterns under the universal pressure for dependency distance minimization. However, there always exist a small number of long-distance dependencies in natural languages, which may reflect some other biological or functional constraints. Language system may adapt itself to these sporadic long-distance dependencies. It is these universal constraints that have shaped such a rich diversity of syntactic patterns in human languages.
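The measure described above is simple to compute once each word's head is known. The following sketch (a hypothetical encoding; treebank conventions vary) takes, for each 1-based word position, the position of its syntactic head, with 0 marking the root, and returns the mean absolute linear distance over all dependencies:

```python
def mean_dependency_distance(heads):
    """Mean absolute linear distance between each word and its head.

    `heads[i-1]` is the 1-based position of word i's head; 0 marks the
    root, which has no incoming dependency and is skipped.
    """
    distances = [abs(dep - head)
                 for dep, head in enumerate(heads, start=1)
                 if head != 0]
    return sum(distances) / len(distances)

# "The dog chased the cat", under one plausible analysis:
# The->dog(2), dog->chased(3), chased->root(0), the->cat(5), cat->chased(3)
print(mean_dependency_distance([2, 3, 0, 5, 3]))  # 1.25
```

Corpus-level comparisons of the kind reported in the abstract average this quantity over all sentences, contrasting natural corpora with randomly reordered baselines.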
Groth, P.T.; Gil, Y
Scientists increasingly use workflows to represent and share their computational experiments. Because of their declarative nature, focus on pre-existing component composition and the availability of visual editors, workflows provide a valuable start for creating user-friendly environments for end
Bender, Emily M
Many NLP tasks have at their core a subtask of extracting the dependencies (who did what to whom) from natural language sentences. This task can be understood as the inverse of the problem solved in different ways by diverse human languages, namely, how to indicate the relationship between different parts of a sentence. Understanding how languages solve the problem can be extremely useful in both feature design and error analysis in the application of machine learning to NLP. Likewise, understanding cross-linguistic variation can be important for the design of MT systems and other multilingual a
We propose a stochastic model for the number of different words in a given database which incorporates the dependence on the database size and historical changes. The main feature of our model is the existence of two different classes of words: (i) a finite number of core words, which have higher frequency and do not affect the probability of a new word being used, and (ii) the remaining virtually infinite number of noncore words, which have lower frequency and, once used, reduce the probability of a new word being used in the future. Our model relies on a careful analysis of the Google Ngram database of books published in the last centuries, and its main consequence is the generalization of Zipf's and Heaps' laws to two scaling regimes. We confirm that these generalizations yield the best simple description of the data among generic descriptive models and that the two free parameters depend only on the language but not on the database. From the point of view of our model, the main change on historical time scales is the composition of the specific words included in the finite list of core words, which we observe to decay exponentially in time at a rate of approximately 30 words per year for English.
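The two-class mechanism described above can be illustrated with a toy simulation (all parameters and the damping form here are hypothetical stand-ins, not the paper's fitted model): a fixed core vocabulary plus an open pool of noncore words whose innovation probability decays as more noncore words enter use, yielding sublinear, Heaps-like vocabulary growth.

```python
import random

def simulate_vocabulary(n_tokens, n_core=100, p_core=0.6, alpha=0.02, seed=1):
    """Toy two-class model: core tokens never damp innovation; each
    distinct noncore word already in use damps future innovation."""
    rng = random.Random(seed)
    noncore_used = 0
    sizes = []
    for _ in range(n_tokens):
        if rng.random() >= p_core:  # this token is a noncore word
            # chance that it is a brand-new word decays with prior use
            if rng.random() < 1.0 / (1.0 + alpha * noncore_used):
                noncore_used += 1
        # distinct words seen so far (core vocabulary counted once)
        sizes.append(n_core + noncore_used)
    return sizes

growth = simulate_vocabulary(10_000)
print(growth[99], growth[999], growth[9999])  # vocabulary keeps growing...
```

Doubling the corpus late in the simulation adds fewer new words than doubling it early, the qualitative signature of the two-scaling-regime behaviour the abstract reports.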
Whalen, D. H.; Giulivi, Sara; Nam, Hosung; Levitt, Andrea G.; Hallé, Pierre; Goldstein, Louis M.
Certain consonant/vowel (CV) combinations are more frequent than would be expected from the individual C and V frequencies alone, both in babbling and, to a lesser extent, in adult language, based on dictionary counts: Labial consonants co-occur with central vowels more often than chance would dictate; coronals co-occur with front vowels, and velars with back vowels (Davis & MacNeilage, 1994). Plausible biomechanical explanations have been proposed, but it is also possible that infants are mirroring the frequency of the CVs that they hear. As noted, previous assessments of adult language were based on dictionaries; these “type” counts are incommensurate with the babbling measures, which are necessarily “token” counts. We analyzed the tokens in two spoken corpora for English, two for French and one for Mandarin. We found that the adult spoken CV preferences correlated with the type counts for Mandarin and French, not for English. Correlations between the adult spoken corpora and the babbling results had all three possible outcomes: significantly positive (French), uncorrelated (Mandarin), and significantly negative (English). There were no correlations of the dictionary data with the babbling results when we consider all nine combinations of consonants and vowels. The results indicate that spoken frequencies of CV combinations can differ from dictionary (type) counts and that the CV preferences apparent in babbling are biomechanically driven and can ignore the frequencies of CVs in the ambient spoken language. PMID:23420980
Intelligent interfaces that enable efficient interaction between users and databases are a pressing need for database applications. Databases must be intelligent enough to make access faster. However, not every user is familiar with Structured Query Language (SQL) queries, as they may not be aware of the structure of the database and would thus have to learn SQL. Non-expert users therefore need a system that lets them interact with relational databases in their natural language, such as English. For this, the Database Management System (DBMS) must be able to understand Natural Language (NL). In this research, an intelligent interface is developed using a semantic matching technique which translates a natural language query to SQL using a set of production rules and a data dictionary. The data dictionary consists of semantic sets for relations and attributes. A series of steps (lower-case conversion, tokenization, part-of-speech tagging, and database element and SQL element extraction) is used to convert a Natural Language Query (NLQ) to a SQL query. The transformed query is executed and the results are returned to the user.
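The pipeline sketched in the abstract above (lower-casing, tokenization, data-dictionary lookup, SQL assembly) can be illustrated minimally. The schema and dictionary below are hypothetical, and a real system would add part-of-speech tagging, WHERE-clause handling, and far richer production rules:

```python
# Hypothetical data dictionary mapping surface words to schema elements.
DATA_DICT = {
    "students": ("table", "student"),
    "names": ("column", "name"),
    "ages": ("column", "age"),
}

def nlq_to_sql(query):
    """Translate a natural language query to SQL by dictionary lookup."""
    tokens = query.lower().replace("?", "").split()  # lower-case + tokenize
    columns, table = [], None
    for tok in tokens:
        kind, name = DATA_DICT.get(tok, (None, None))
        if kind == "table":
            table = name
        elif kind == "column":
            columns.append(name)
    cols = ", ".join(columns) if columns else "*"
    return f"SELECT {cols} FROM {table}"

print(nlq_to_sql("Show the names and ages of students"))
# SELECT name, age FROM student
```

Words absent from the dictionary ("show", "the", "of") are simply ignored here; the semantic-set matching described in the abstract generalizes this lookup so that synonyms map to the same relation or attribute.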
Feng, Qiangze; Qi, Hongwei; Fukushima, Toshikazu
Information services accessed via mobile phones provide information directly relevant to subscribers’ daily lives and are an area of dynamic market growth worldwide. Although many information services are currently offered by mobile operators, many of the existing solutions require a unique gateway for each service, and it is inconvenient for users to have to remember a large number of such gateways. Furthermore, the Short Message Service (SMS) is very popular in China and Chinese users would prefer to access these services in natural language via SMS. This chapter describes a Natural Language Based Service Selection System (NL3S) for use with a large number of mobile information services. The system can accept user queries in natural language and navigate it to the required service. Since it is difficult for existing methods to achieve high accuracy and high coverage and anticipate which other services a user might want to query, the NL3S is developed based on a Multi-service Ontology (MO) and Multi-service Query Language (MQL). The MO and MQL provide semantic and linguistic knowledge, respectively, to facilitate service selection for a user query and to provide adaptive service recommendations. Experiments show that the NL3S can achieve 75-95% accuracies and 85-95% satisfactions for processing various styles of natural language queries. A trial involving navigation of 30 different mobile services shows that the NL3S can provide a viable commercial solution for mobile operators.
MILROY, Lesley. Observing and Analysing Natural Language: A Critical Account of Sociolinguistic Method. Oxford: Basil Blackwell, 1987. 230pp.
Iria Werlang Garcia
Lesley Milroy's Observing and Analysing Natural Language is a recent addition to an ever-growing number of publications in the field of Sociolinguistics. It carries the weight of one of the most experienced authors currently working in the field and should offer basic information to both newcomers and established investigators of natural language.
Bisson, Marie-Josée; van Heuven, Walter J B; Conklin, Kathy; Tunney, Richard J
First language acquisition requires relatively little effort compared to foreign language acquisition and happens more naturally through informal learning. Informal exposure can also benefit foreign language learning, although evidence for this has been limited to speech perception and production. An important question is whether informal exposure to spoken foreign language also leads to vocabulary learning through the creation of form-meaning links. Here we tested the impact of exposure to foreign language words presented with pictures in an incidental learning phase on subsequent explicit foreign language learning. In the explicit learning phase, we asked adults to learn translation equivalents of foreign language words, some of which had appeared in the incidental learning phase. Results revealed rapid learning of the foreign language words in the incidental learning phase showing that informal exposure to multi-modal foreign language leads to foreign language vocabulary acquisition. The creation of form-meaning links during the incidental learning phase is discussed.
interpretation would not be too bad if one were to believe that a frame "is intended to represent a ’ stereotypical situation’" ( , p. 48). We...natural kind-like concepts - some form of definitional structuring is necessary. The internal structure of non atomic concepts (e.g., proximate genus ...types of beer, bottles of wine, etc.; <x> need not be any sort of Onatural genus .’ For example, in Dll the definite pronoun Othem" is not meant to I
Schiff, Rachel; Saiegh-Haddad, Elinor
This study addressed the development of and the relationship between foundational metalinguistic skills and word reading skills in Arabic. It compared Arabic-speaking children’s phonological awareness (PA), morphological awareness, and voweled and unvoweled word reading skills in spoken and standard language varieties separately in children across five grade levels from childhood to adolescence. Second, it investigated whether skills developed in the spoken variety of Arabic predict reading in the standard variety. Results indicate that although individual differences between students in PA are eliminated toward the end of elementary school in both spoken and standard language varieties, gaps in morphological awareness and in reading skills persisted through junior and high school years. The results also show that the gap in reading accuracy and fluency between Spoken Arabic (SpA) and Standard Arabic (StA) was evident in both voweled and unvoweled words. Finally, regression analyses showed that morphological awareness in SpA contributed to reading fluency in StA, i.e., children’s early morphological awareness in SpA explained variance in children’s gains in reading fluency in StA. These findings have important theoretical and practical contributions for Arabic reading theory in general and they extend the previous work regarding the cross-linguistic relevance of foundational metalinguistic skills in the first acquired language to reading in a second language, as in societal bilingualism contexts, or a second language variety, as in diglossic contexts. PMID:29686633
Of the approximately 7,000 languages spoken today, some 2,500 are generally considered endangered. Here we argue that this consensus figure vastly underestimates the danger of digital language death, in that less than 5% of all languages can still ascend to the digital realm. We present evidence of a massive die-off caused by the digital divide. PMID:24167559
Okano, Kana; Grainger, Jonathan; Holcomb, Phillip J
In a masked cross-modal priming experiment with ERP recordings, spoken Japanese words were primed with words written in one of the two syllabary scripts of Japanese. An early priming effect, peaking at around 200ms after onset of the spoken word target, was seen in left lateral electrode sites for Katakana primes, and later effects were seen for both Hiragana and Katakana primes on the N400 ERP component. The early effect is thought to reflect the efficiency with which words in Katakana script make contact with sublexical phonological representations involved in spoken language comprehension, due to the particular way this script is used by Japanese readers. This demonstrates fast-acting influences of visual primes on the processing of auditory target words, and suggests that briefly presented visual primes can influence sublexical processing of auditory target words. The later N400 priming effects, on the other hand, most likely reflect cross-modal influences on activity at the level of whole-word phonology and semantics.
Thessen, Anne; Preciado, Jenette; Jain, Payoj; Martin, James; Palmer, Martha; Bhat, Riyaz
The cTAKES package (using the ClearTK Natural Language Processing toolkit Bethard et al. 2014, http://cleartk.github.io/cleartk/) has been successfully used to automatically read clinical notes in the medical field (Albright et al. 2013, Styler et al. 2014). It is used on a daily basis to automatically process clinical notes and extract relevant information by dozens of medical institutions. ClearEarth is a collaborative project that brings together computational linguistics and domain scient...
Topac, Vasile; Jurcau, Daniel-Alexandru; Stoicu-Tivadar, Vasile
Medical terminology appears in the natural language in multiple forms: canonical, derived or inflected form. This research presents an analysis of the form in which medical terminology appears in Romanian and English language. The sources of medical language used for the study are web pages presenting medical information for patients and other lay users. The results show that, in English, medical terminology tends to appear more in canonical form while, in the case of Romanian, it is the opposite. This paper also presents the service that was created to perform this analysis. This tool is available for the general public, and it is designed to be easily extensible, allowing the addition of other languages.
Casey, Laura Baylot; Bicard, David F.
Language development in typically developing children has a very predictable pattern beginning with crying, cooing, babbling, and gestures along with the recognition of spoken words, comprehension of spoken words, and then one word utterances. This predictable pattern breaks down for children with language disorders. This article will discuss…
Fitzpatrick, A. Liam (Boston U.); Kaplan, Jared (SLAC); Penedones, Joao (Perimeter Inst. Theor. Phys.); Raju, Suvrat (Harish-Chandra Res. Inst.); van Rees, Balt C. (YITP, Stony Brook)
We provide dramatic evidence that 'Mellin space' is the natural home for correlation functions in CFTs with weakly coupled bulk duals. In Mellin space, CFT correlators have poles corresponding to an OPE decomposition into 'left' and 'right' sub-correlators, in direct analogy with the factorization channels of scattering amplitudes. In the regime where these correlators can be computed by tree level Witten diagrams in AdS, we derive an explicit formula for the residues of Mellin amplitudes at the corresponding factorization poles, and we use the conformal Casimir to show that these amplitudes obey algebraic finite difference equations. By analyzing the recursive structure of our factorization formula we obtain simple diagrammatic rules for the construction of Mellin amplitudes corresponding to tree-level Witten diagrams in any bulk scalar theory. We prove the diagrammatic rules using our finite difference equations. Finally, we show that our factorization formula and our diagrammatic rules morph into the flat space S-Matrix of the bulk theory, reproducing the usual Feynman rules, when we take the flat space limit of AdS/CFT. Throughout we emphasize a deep analogy with the properties of flat space scattering amplitudes in momentum space, which suggests that the Mellin amplitude may provide a holographic definition of the flat space S-Matrix.
Arndt, Karen Barako; Schuele, C. Melanie
Complex syntax production emerges shortly after the emergence of two-word combinations in oral language and continues to develop through the school-age years. This article defines a framework for the analysis of complex syntax in the spontaneous language of preschool- and early school-age children. The purpose of this article is to provide…
Leung, Constant; Scarino, Angela
Transformations associated with the increasing speed, scale, and complexity of mobilities, together with the information technology revolution, have changed the demography of most countries of the world and brought about accompanying social, cultural, and economic shifts (Heugh, 2013). This complex diversity has changed the very nature of…
BBN is developing a series of increasingly sophisticated natural language understanding systems which will serve as an integrated interface...
Nye, Benjamin D.; Graesser, Arthur C.; Hu, Xiangen
AutoTutor is a natural language tutoring system that has produced learning gains across multiple domains (e.g., computer literacy, physics, critical thinking). In this paper, we review the development, key research findings, and systems that have evolved from AutoTutor. First, the rationale for developing AutoTutor is outlined and the advantages…
Higginbotham, D Jeffery; Lesher, Gregory W; Moulton, Bryan J; Roark, Brian
Significant progress has been made in the application of natural language processing (NLP) to augmentative and alternative communication (AAC), particularly in the areas of interface design and word prediction. This article will survey the current state-of-the-science of NLP in AAC and discuss its future applications for the development of next generation of AAC technology.
Krahmer, E.; Krahmer, E.; Theune, Mariet
We are pleased to present the Proceedings of the 12th European Workshop on Natural Language Generation (ENLG 2009). ENLG 2009 was held in Athens, Greece, as a workshop at the 12th Conference of the European Chapter of the Association for Computational Linguistics (EACL 2009). Following our call, we
A theoretical discussion is offered on whether the subjunctive in the Romance languages is by nature thematic, as suggested in previous studies. English and Spanish samples are used to test the hypothesis; one conclusion is that the subjunctive seems to offer speaker-related information and may express the intensity of the speaker's involvement.…
Laski, Karen E.; And Others
Parents of four nonverbal and four echolalic autistic children, aged five-nine, were trained to increase their children's speech by using the Natural Language Paradigm. Following training, parents increased the frequency with which they required their children to speak, and children increased the frequency of their verbalizations in three…
Stoianov; Nerbonne, J; Bouma, H; Coppen, PA; van Halteren, H; Teunissen, L
Simple Recurrent Networks (SRNs) are neural network (connectionist) models able to process natural language. Phonotactics concerns the order of symbols in words. We continued an earlier, unsuccessful attempt to model the phonotactics of Dutch words with SRNs. In order to overcome the previously reported
Ingram, D. E.
The nature and development of the recently released International English Language Testing System (IELTS) instrument are described. The test is the result of a joint Australian-British project to develop a new test for use with foreign students planning to study in English-speaking countries. It is expected that the modular instrument will become…
Tierney, Patrick J.
This paper introduces a method of extending natural language-based processing of qualitative data analysis with the use of a very quantitative tool--graph theory. It is not an attempt to convert qualitative research to a positivist approach with a mathematical black box, nor is it a "graphical solution". Rather, it is a method to help qualitative…
Balyan, Renu; McCarthy, Kathryn S.; McNamara, Danielle S.
This study examined how machine learning and natural language processing (NLP) techniques can be leveraged to assess the interpretive behavior that is required for successful literary text comprehension. We compared the accuracy of seven different machine learning classification algorithms in predicting human ratings of student essays about…
Wong, Wing-Kwong; Yin, Sheng-Kai; Yang, Chang-Zhe
This paper presents a tool for drawing dynamic geometric figures by understanding the texts of geometry problems. With the tool, teachers and students can construct dynamic geometric figures on a web page by inputting a geometry problem in natural language. First we need to build the knowledge base for understanding geometry problems. With the…
Kyle, Kristopher; Crossley, Scott A.; McNamara, Danielle S.
This study explores the construct validity of speaking tasks included in the TOEFL iBT (e.g., integrated and independent speaking tasks). Specifically, advanced natural language processing (NLP) tools, MANOVA difference statistics, and discriminant function analyses (DFA) are used to assess the degree to which and in what ways responses to these…
Nguyen, Dong-Phuong; Dogruoz, A. Seza
Multilingual speakers switch between languages in online and spoken communication. Analyses of large scale multilingual data require automatic language identification at the word level. For our experiments with multilingual online discussions, we first tag the language of individual words using
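Word-level language tagging of the kind described above can be sketched as lookup in per-language word lists (the tiny lists and the English-Turkish example here are hypothetical stand-ins; real systems combine dictionaries with character n-gram models and contextual smoothing for out-of-vocabulary words):

```python
# Hypothetical miniature word lists for two languages.
WORDLISTS = {
    "en": {"i", "love", "you", "very", "much"},
    "tr": {"seni", "çok", "seviyorum"},
}

def tag_words(sentence):
    """Assign each word the language whose word list contains it;
    words found in zero or several lists are tagged 'unk'."""
    tags = []
    for word in sentence.lower().split():
        langs = [lang for lang, words in WORDLISTS.items() if word in words]
        tags.append(langs[0] if len(langs) == 1 else "unk")
    return tags

print(tag_words("I love you çok"))  # ['en', 'en', 'en', 'tr']
```

The ambiguity branch matters in practice: cognates and borrowings appear in several lists, and it is exactly those tokens for which context models are needed.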
Wang, Zhen; Zechner, Klaus; Sun, Yu
As automated scoring systems for spoken responses are increasingly used in language assessments, testing organizations need to analyze their performance, as compared to human raters, across several dimensions, for example, on individual items or based on subgroups of test takers. In addition, there is a need in testing organizations to establish…
With questions and answer sections throughout, this book helps you to improve your written and spoken English through understanding the structure of the English language. This is a thorough and useful book with all parts of speech and grammar explained. Used by ELT self-study students.
Čermáková, Anna; Komrsková, Zuzana; Kopřivová, Marie; Poukarová, Petra
25.04.2017 (2017), pp. 393-414. ISSN 2509-9507. R&D Projects: GA ČR GA15-01116S. Institutional support: RVO:68378092. Keywords: causality, discourse marker, spoken language, Czech. Subject RIV: AI - Linguistics. OECD field: Linguistics. https://link.springer.com/content/pdf/10.1007%2Fs41701-017-0014-y.pdf
Stone, Adam; Petitto, Laura-Ann; Bosworth, Rain
The infant brain may be predisposed to identify perceptually salient cues that are common to both signed and spoken languages. Recent theory based on spoken languages has advanced sonority as one of these potential language acquisition cues. Using a preferential looking paradigm with an infrared eye tracker, we explored visual attention of hearing…
Hofmann, Kristin; Chilla, Solveig
Adopting a bimodal bilingual language acquisition model, this qualitative case study is the first in Germany to investigate the spoken and sign language development of hearing children of deaf adults (codas). The spoken language competence of six codas within the age range of 3;10 to 6;4 is assessed by a series of standardised tests (SETK 3-5,…
Aalberse, S.; Moro, F.; Braunmüller, K.; Höder, S.; Kühl, K.
This article discusses Malay and Chinese heritage languages as spoken in the Netherlands. Heritage speakers are dominant in another language and use their heritage language less. Moreover, they have qualitatively and quantitatively different input from monolinguals. Heritage languages are often
Mathematics and the Laws of Nature, Revised Edition describes the evolution of the idea that nature can be described in the language of mathematics. Colorful chapters explore the earliest attempts to apply deductive methods to the study of the natural world. This revised resource goes on to examine the development of classical conservation laws, including the conservation of momentum, the conservation of mass, and the conservation of energy. Chapters have been updated and revised to reflect recent information, including the mathematical pioneers who introduced new ideas about what it meant to
Wong, Miranda Kit-Yi; So, Wing Chee
This study developed a spoken narrative (i.e., storytelling) assessment as a supplementary measure of children's creativity. Both spoken and gestural contents of children's spoken narratives were coded to assess their verbal and nonverbal creativity. The psychometric properties of the coding system for the spoken narrative assessment were…
Sanden, Guro Refsum
Purpose: The purpose of this paper is to analyse the consequences of globalisation in the area of corporate communication, and investigate how language may be managed as a strategic resource. Design/methodology/approach: A review of previous studies on the effects of globalisation on corporate communication and the implications of language management initiatives in international business. Findings: Efficient language management can turn language into a strategic resource. Language needs analyses, i.e. linguistic auditing/language check-ups, can be used to determine the language situation of a company. Language policies and/or strategies can be used to regulate a company's internal modes of communication. Language management tools can be deployed to address existing and expected language needs. Continuous feedback from the front line ensures strategic learning and reduces the risk of suboptimal…
Van Heerden, C
Spoken dialogue systems (SDSs) have great potential for information access in the developing world. However, the realisation of that potential requires the solution of several challenging problems, including the development of sufficiently accurate...
Morales, Luis; Paolieri, Daniela; Dussias, Paola E.; Valdés Kroff, Jorge R.; Gerfen, Chip; Bajo, María Teresa
We investigate the ‘gender-congruency’ effect during a spoken-word recognition task using the visual world paradigm. Eye movements of Italian–Spanish bilinguals and Spanish monolinguals were monitored while they viewed a pair of objects on a computer screen. Participants listened to instructions in Spanish (encuentra la bufanda / ‘find the scarf’) and clicked on the object named in the instruction. Grammatical gender of the objects’ name was manipulated so that pairs of objects had the same (congruent) or different (incongruent) gender in Italian, but gender in Spanish was always congruent. Results showed that bilinguals, but not monolinguals, looked at target objects less when they were incongruent in gender, suggesting a between-language gender competition effect. In addition, bilinguals looked at target objects more when the definite article in the spoken instructions provided a valid cue to anticipate its selection (different-gender condition). The temporal dynamics of gender processing and cross-language activation in bilinguals are discussed. PMID:28018132
Jackendoff, Ray; Pinker, Steven
In a continuation of the conversation with Fitch, Chomsky, and Hauser on the evolution of language, we examine their defense of the claim that the uniquely human, language-specific part of the language faculty (the ''narrow language faculty'') consists only of recursion, and that this part cannot be considered an adaptation to communication. We…
Communicative interactions involve a kind of procedural knowledge that is used by the human brain for processing verbal and nonverbal inputs and for language production. Although considerable work has been done on modeling human language abilities, it has been difficult to bring them together into a comprehensive tabula rasa system compatible with current knowledge of how verbal information is processed in the brain. This work presents a cognitive system, entirely based on a large-scale neural architecture, which was developed to shed light on the procedural knowledge involved in language elaboration. The main component of this system is the central executive, which is a supervising system that coordinates the other components of the working memory. In our model, the central executive is a neural network that takes as input the neural activation states of the short-term memory and yields as output mental actions, which control the flow of information among the working memory components through neural gating mechanisms. The proposed system is capable of learning to communicate through natural language starting from tabula rasa, without any a priori knowledge of the structure of phrases, meaning of words, or role of the different classes of words, only by interacting with a human through a text-based interface, using an open-ended incremental learning process. It is able to learn nouns, verbs, adjectives, pronouns and other word classes, and to use them in expressive language. The model was validated on a corpus of 1587 input sentences, based on literature on early language assessment, at the level of about a 4-year-old child, and produced 521 output sentences, expressing a broad range of language processing functionalities.
Within the scope of my investigation on language use and language attitudes of People of German Descent from the USSR, I regularly find different language contact phenomena, such as viel bliny habn=wir gbackt (engl.: 'we cooked lots of pancakes'; cf. Ries 2011). The aim of the analysis is to examine both language use with regard to different forms of language contact and the language attitudes of the observed speakers. To be able to analyse both of these aspects and synthesize them, different types of data are required. The research is based on the following two data types: everyday conversations and interviews. In addition, the individual speaker's biography is a key part of the analysis, because it allows one to draw conclusions about language attitudes and use. This qualitative research is based on morpho-syntactic and interactional linguistic analysis of authentic spoken data. The data come from a corpus compiled and edited by myself. My being a member of the examined group allowed me to build up an authentic corpus. The natural language use is analysed from the perspective of different language contact phenomena and potential functions of language alternations. One central issue is: how do speakers use the languages available to them, German and Russian? Structural characteristics such as code switching and discursive motives for these phenomena are discussed as results, together with the socio-cultural background of the individual speaker. Within the scope of this article I present, by way of example, the data and results of one speaker.
Rassinoux, Anne-Marie; Baud, Robert H; Rodrigues, Jean-Marie; Lovis, Christian; Geissbühler, Antoine
The importance of clinical communication between providers, consumers and others, as well as the requisite for computer interoperability, strengthens the need for sharing common accepted terminologies. Under the directives of the World Health Organization (WHO), an approach is currently being conducted in Australia to adopt a standardized terminology for medical procedures that is intended to become an international reference. In order to achieve such a standard, a collaborative approach is adopted, in line with the successful experiment conducted for the development of the new French coding system CCAM. Different coding centres are involved in setting up a semantic representation of each term using a formal ontological structure expressed through a logic-based representation language. From this language-independent representation, multilingual natural language generation (NLG) is performed to produce noun phrases in various languages that are further compared for consistency with the original terms. Outcomes are presented for the assessment of the International Classification of Health Interventions (ICHI) and its translation into Portuguese. The initial results clearly emphasize the feasibility and cost-effectiveness of the proposed method for handling both a different classification and an additional language. NLG tools, based on ontology driven semantic representation, facilitate the discovery of ambiguous and inconsistent terms, and, as such, should be promoted for establishing coherent international terminologies.
Sherwood, Bruce Arne
Explains that reading English among scientists is almost universal; however, there are enormous problems with spoken English. Advocates the use of Esperanto as a viable alternative and as a language requirement for graduate work. (GA)
Sharma, Vivekanand; Law, Wayne; Balick, Michael J; Sarkar, Indra Neil
The growing amount of data describing historical medicinal uses of plants from digitization efforts provides the opportunity to develop systematic approaches for identifying potential plant-based therapies. However, cataloguing plant use information from natural language text is a challenging task for ethnobotanists. To date, there has been only limited adoption of informatics approaches for supporting the identification of ethnobotanical information associated with medicinal uses. This study explored the feasibility of using biomedical terminologies and natural language processing approaches for extracting relevant plant-associated therapeutic use information from the historical biodiversity literature collection available from the Biodiversity Heritage Library. The results from this preliminary study suggest that informatics methods have potential utility for identifying medicinal plant knowledge from digitized resources, and highlight opportunities for improvement.
Molina, Martin; Sanchez-Soriano, Javier; Corcho, Oscar
Providing descriptions of isolated sensors and sensor networks in natural language, understandable by the general public, is useful to help users find relevant sensors and analyze sensor data. In this paper, we discuss the feasibility of using geographic knowledge from public databases available on the Web (such as OpenStreetMap, Geonames, or DBpedia) to automatically construct such descriptions. We present a general method that uses such information to generate sensor descriptions in natural language. The results of the evaluation of our method in a hydrologic national sensor network showed that this approach is feasible and capable of generating adequate sensor descriptions with a lower development effort compared to other approaches. In the paper we also analyze certain problems that we found in public databases (e.g., heterogeneity, non-standard use of labels, or rigid search methods) and their impact in the generation of sensor descriptions.
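The template-based flavour of such sensor description generation can be sketched in a few lines. This is an illustrative sketch only, not the authors' system: the `describe_sensor` function, the field names, and the example metadata are all invented here, standing in for attributes that would be looked up in public geographic databases such as OpenStreetMap or GeoNames.

```python
# Hypothetical sketch of template-based sensor description generation.
# Field names, templates, and example metadata are invented for illustration.

def describe_sensor(sensor):
    """Render a natural-language description from sensor plus place metadata."""
    template = ("{kind} sensor '{sid}' located on the {river} river, "
                "near {town} ({region}).")
    return template.format(
        kind=sensor["kind"].capitalize(),
        sid=sensor["id"],
        river=sensor["river"],
        town=sensor["nearest_town"],
        region=sensor["region"],
    )

sensor = {
    "id": "H-042",                 # hypothetical sensor identifier
    "kind": "water level",
    "river": "Ebro",               # place names as a gazetteer might return them
    "nearest_town": "Zaragoza",
    "region": "Aragon",
}

print(describe_sensor(sensor))
# -> Water level sensor 'H-042' located on the Ebro river, near Zaragoza (Aragon).
```

A production system would select and order such templates from the available geographic knowledge rather than hard-coding one pattern.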
Min, Yul Ha; Park, Hyeoun-Ae; Jeon, Eunjoo; Lee, Joo Yun; Jo, Soo Jung
The purpose of this study was to develop an ontology model to generate nursing narratives that are as natural as human language from the entity-attribute-value triplets of a detailed clinical model using natural language generation technology. The model was based on the types of information and the documentation time of the information along the nursing process. The types of information are data characterizing the patient status, inferences made by the nurse from the patient data, and nursing actions selected by the nurse to change the patient status. This information was linked to the nursing process based on the time of documentation. We describe a case study illustrating the application of this model in an acute-care setting. The proposed model provides a strategy for designing an electronic nursing record system.
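A minimal sketch of narrative generation from entity-attribute-value triplets might look as follows. The triplet categories, templates, and example values are invented for illustration and are not the ontology model described above.

```python
# Illustrative sketch: turning entity-attribute-value (EAV) triplets into a
# short nursing narrative. Categories, templates, and values are invented.

TEMPLATES = {
    "assessment": "Patient's {attribute} was {value}.",   # patient status data
    "action": "Nurse {value} for {attribute}.",           # nursing actions
}

def generate_narrative(triplets):
    """Map each (category, attribute, value) triplet onto a sentence template."""
    sentences = []
    for category, attribute, value in triplets:
        template = TEMPLATES.get(category)
        if template:
            sentences.append(template.format(attribute=attribute, value=value))
    return " ".join(sentences)

triplets = [
    ("assessment", "body temperature", "38.5 C"),
    ("assessment", "pain score", "4/10"),
    ("action", "pain", "administered acetaminophen"),
]

print(generate_narrative(triplets))
```

In the model above, the choice of template would additionally depend on where the triplet falls in the nursing process and when it was documented.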
Hunter, James; Freer, Yvonne; Gatt, Albert; Reiter, Ehud; Sripada, Somayajulu; Sykes, Cindy; Westwater, Dave
The BT-Nurse system uses data-to-text technology to automatically generate a natural language nursing shift summary in a neonatal intensive care unit (NICU). The summary is based solely on data held in an electronic patient record system; no additional data entry is required. BT-Nurse was tested for two months in the Royal Infirmary of Edinburgh NICU. Nurses were asked to rate the understandability, accuracy, and helpfulness of the computer-generated summaries; they were also asked for free-text comments about the summaries. The nurses found the majority of the summaries to be understandable, accurate, and helpful, and made many specific comments about the generated summaries. In conclusion, natural language NICU shift summaries can be automatically generated from an electronic patient record, but our proof-of-concept software needs considerable additional development work before it can be deployed.
Stickland, Michael G.; Conrad, Gregory N.; Eaton, Shelley M.
Natural language processing-based knowledge management software, traditionally developed for security organizations, is now becoming commercially available. An informal survey was conducted to discover and examine current NLP and related technologies and potential applications for information retrieval, information extraction, summarization, categorization, terminology management, link analysis, and visualization for possible implementation at Sandia National Laboratories. This report documents our current understanding of the technologies, lists software vendors and their products, and identifies potential applications of these technologies.
Weng, Wei-Hung; Wagholikar, Kavishwar B.; McCray, Alexa T.; Szolovits, Peter; Chueh, Henry C.
Background: The medical subdomain of a clinical note, such as cardiology or neurology, is useful content-derived metadata for developing machine learning downstream applications. To classify the medical subdomain of a note accurately, we have constructed a machine learning-based natural language processing (NLP) pipeline and developed medical subdomain classifiers based on the content of the note. Methods: We constructed the pipeline using the clinical ...
Preece, Alun; et al. (email: PreeceAD@cardiff.ac.uk; Emerging Technology Services, IBM United Kingdom Ltd, Hursley Park, Winchester, UK; US Army Research Laboratory)
Human Computer Collaboration at the Edge: Enhancing Collective Situation Understanding with Controlled Natural Language. …conversational agent with information exchange disabled until the end of the experiment run. The meaning of the indicator in the top-right of the agent…
Will, Herbert A.; Mackin, Michael A.
PC software is described which provides flexible natural language process control capability with an IBM PC or compatible machine. Hardware requirements include the PC, and suitable hardware interfaces to all controlled devices. Software required includes the Microsoft Disk Operating System (MS-DOS) operating system, a PC-based FORTRAN-77 compiler, and user-written device drivers. Instructions for use of the software are given as well as a description of an application of the system.
This study examined the effects of preceding contextual stimuli, either auditory or visual, on the identification of spoken target words. Fifty-one participants (29% males, 71% females; mean age = 24.5 years, SD = 8.5) were divided into three groups: no context, auditory context, and visual context. All target stimuli were spoken words masked with white noise. The relationships between the context and target stimuli were as follows: identical word, similar word, and unrelated word. Participants presented with context experienced a sequence of six context stimuli in the form of either spoken words or photographs. Auditory and visual context conditions produced similar results, but the auditory context aided word identification more than the visual context in the similar word relationship. We discuss these results in the light of top-down processing, motor theory, and the phonological system of language.
Compact closed categories and Frobenius algebras and bialgebras have been applied to model and reason about quantum protocols. The same constructions have also been applied to reason about natural language semantics under the name ``categorical distributional compositional'' semantics, or in short, the ``DisCoCat'' model. This model combines the statistical vector models of word meaning with the compositional models of grammatical structure. It has been applied to natural language tasks such as disambiguation, paraphrasing and entailment of phrases and sentences. The passage from the grammatical structure to vectors is provided by a functor, similar to the quantization functor of Quantum Field Theory. The original DisCoCat model only used compact closed categories. Later, Frobenius algebras were added to it to model long-distance dependencies such as relative pronouns. Recently, bialgebras have been added to the pack to reason about quantifiers. This paper reviews these constructions and their application to natural language semantics. We go over the theory and present some of the core experimental results.
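The core idea of composing statistical word vectors along grammatical structure can be illustrated with toy data. In this hedged sketch, grammar assigns each word a shape: nouns are vectors, and an intransitive verb is a linear map from a noun space into a sentence space. The numbers and the `mat_vec` helper are invented for illustration, not taken from the DisCoCat experiments.

```python
# Toy illustration of DisCoCat-style composition: meanings compose by
# applying the verb's linear map to the noun's vector. All numbers are
# invented toy data.

def mat_vec(matrix, vector):
    """Apply a linear map (given as a list of rows) to a vector."""
    return [sum(m * v for m, v in zip(row, vector)) for row in matrix]

# a 2-dimensional toy noun space
dog = [1.0, 0.0]
cat = [0.0, 1.0]

# an intransitive verb as a map from noun space to a 2-dimensional sentence space
runs = [[0.9, 0.1],
        [0.2, 0.8]]

print(mat_vec(runs, dog))  # sentence vector for "dog runs"
print(mat_vec(runs, cat))  # sentence vector for "cat runs"
```

Transitive verbs are higher-order tensors in the same spirit, and the Frobenius and bialgebra structure mentioned above supplies extra wiring for relative pronouns and quantifiers.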
Surveys developments in language revitalization and language death. Focusing on indigenous languages, discusses the role and nature of appropriate linguistic documentation, possibilities for bilingual education, and methods of promoting oral fluency and intergenerational transmission in affected languages. (Author/VWL)
Crane, Paul K; Gruhl, Jonathan C; Erosheva, Elena A; Gibbons, Laura E; McCurry, Susan M; Rhoads, Kristoffer; Nguyen, Viet; Arani, Keerthi; Masaki, Kamal; White, Lon
Spoken bilingualism may be associated with cognitive reserve. Mastering a complicated written language may be associated with additional reserve. We sought to determine if midlife use of spoken and written Japanese was associated with lower rates of late life cognitive decline. Participants were second-generation Japanese-American men from the Hawaiian island of Oahu, born 1900-1919, free of dementia in 1991, and categorized based on midlife self-reported use of spoken and written Japanese (total n included in primary analysis = 2,520). Cognitive functioning was measured with the Cognitive Abilities Screening Instrument scored using item response theory. We used mixed effects models, controlling for age, income, education, smoking status, apolipoprotein E e4 alleles, and number of study visits. Rates of cognitive decline were not related to use of spoken or written Japanese. This finding was consistent across numerous sensitivity analyses. We did not find evidence to support the hypothesis that multilingualism is associated with cognitive reserve.
Interference of the spoken language on children's writing: cancellation processes of the dental occlusive /d/ and the final vibrant /r/
Socorro Cláudia Tavares de Sousa
The present study aims to investigate the influence of the spoken language on children's writing in relation to the phenomena of cancellation of the dental /d/ and the final vibrant /r/. We elaborated and applied a research instrument with primary school students in Fortaleza, and used the software SPSS to analyze the data. The results showed that male sex and words of three or more syllables are factors that partially influence the realization of the dependent variable /no/, and that verbs and level of schooling are conditioning elements for the cancellation of the final vibrant /r/.
One dimension of early Canadian education is the attempt of the government to use the education system as an assimilative tool to integrate the First Nations and Métis people into Euro-Canadian society. Despite these attempts, many First Nations and Métis people retained their culture and their indigenous language. Few science educators have examined First Nations and Western scientific worldviews and the impact they may have on science learning. This study explored the views some First Nations (Cree) and Euro-Canadian Grade-7-level students in Manitoba had about the nature of science. Both qualitative (open-ended questions and interviews) and quantitative (a Likert-scale questionnaire) instruments were used to explore student views. A central hypothesis of this research programme is the possibility that the different worldviews of the two student populations, Cree and Euro-Canadian, are likely to influence their perceptions of science. This preliminary study explored a range of methodologies to probe the perceptions of the nature of science in these two student populations. It was found that the two cultural groups differed significantly on some of the tenets in a Nature of Scientific Knowledge Scale (NSKS). Cree students differed significantly from Euro-Canadian students on the developmental, testable and unified tenets of the nature of scientific knowledge scale. No significant differences were found in NSKS scores between language groups (Cree students who speak English in the home and those who speak English and Cree, or Cree only). The differences found between language groups were primarily in the open-ended questions, where preformulated responses were absent. Interviews about critical incidents provided more detailed accounts of the Cree students' perception of the nature of science. The implications of the findings of this study are discussed in relation to the challenges related to research methodology, further areas for investigation, science…
Woo, Chong Woo; Evens, Martha W; Freedman, Reva; Glass, Michael; Shim, Leem Seop; Zhang, Yuemei; Zhou, Yujian; Michael, Joel
The objective of this research was to build an intelligent tutoring system capable of carrying on a natural language dialogue with a student who is solving a problem in physiology. Previous experiments have shown that students need practice in qualitative causal reasoning to internalize new knowledge and to apply it effectively and that they learn by putting their ideas into words. Analysis of a corpus of 75 hour-long tutoring sessions carried on in keyboard-to-keyboard style by two professors of physiology at Rush Medical College tutoring first-year medical students provided the rules used in tutoring strategies and tactics, parsing, and text generation. The system presents the student with a perturbation to the blood pressure, asks for qualitative predictions of the changes produced in seven important cardiovascular variables, and then launches a dialogue to correct any errors and to probe for possible misconceptions. The natural language understanding component uses a cascade of finite-state machines. The generation is based on lexical functional grammar. Results of experiments with pretests and posttests have shown that using the system for an hour produces significant learning gains and also that even this brief use improves the student's ability to solve problems more than reading textual material on the topic. Student surveys tell us that students like the system and feel that they learn from it. The system is now in regular use in the first-year physiology course at Rush Medical College. We conclude that the CIRCSIM-Tutor system demonstrates that intelligent tutoring systems can implement effective natural language dialogue with current language technology.
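A cascaded finite-state understanding component of the broad kind mentioned above can be caricatured with a few regular-expression patterns (regular expressions are themselves finite-state recognizers). The prediction categories and patterns below are invented for illustration and are not CIRCSIM-Tutor's actual grammar.

```python
# A caricature of finite-state understanding of a student's short answer:
# map free text onto a qualitative prediction. Patterns are invented.

import re

PATTERNS = [
    (re.compile(r"\b(go(es)? up|increase[sd]?|rise[sn]?)\b"), "increase"),
    (re.compile(r"\b(go(es)? down|decrease[sd]?|fall(s|en)?|drops?)\b"), "decrease"),
    (re.compile(r"\b(same|unchanged|no change)\b"), "unchanged"),
]

def classify_prediction(answer):
    """Return the qualitative prediction expressed in a free-text answer."""
    text = answer.lower()
    for pattern, label in PATTERNS:
        if pattern.search(text):
            return label
    return "unrecognized"

print(classify_prediction("I think heart rate goes up"))      # increase
print(classify_prediction("MAP would decrease a little"))     # decrease
```

A real cascade would chain several such machines, with later stages consuming the labels produced by earlier ones.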
Bregman, Micah R; Creel, Sarah C
Traditional conceptions of spoken language assume that speech recognition and talker identification are computed separately. Neuropsychological and neuroimaging studies imply some separation between the two faculties, but recent perceptual studies suggest better talker recognition in familiar languages than unfamiliar languages. A familiar-language benefit in talker recognition potentially implies strong ties between the two domains. However, little is known about the nature of this language familiarity effect. The current study investigated the relationship between speech and talker processing by assessing bilingual and monolingual listeners' ability to learn voices as a function of language familiarity and age of acquisition. Two effects emerged. First, bilinguals learned to recognize talkers in their first language (Korean) more rapidly than they learned to recognize talkers in their second language (English), while English-speaking participants showed the opposite pattern (learning English talkers faster than Korean talkers). Second, bilinguals' learning rate for talkers in their second language (English) correlated with age of English acquisition. Taken together, these results suggest that language background materially affects talker encoding, implying a tight relationship between speech and talker representations. Copyright © 2013 Elsevier B.V. All rights reserved.
Pahisa-Solé, Joan; Herrera-Joancomartí, Jordi
In this article, we describe a compansion system that transforms the telegraphic language that comes from the use of pictogram-based augmentative and alternative communication (AAC) into natural language. The system was tested with four participants with severe cerebral palsy and varying degrees of linguistic competence and intellectual disabilities. Participants had each used pictogram-based AAC for at least the past 30 years and presented a stable linguistic profile. During tests, which consisted of a total of 40 sessions, participants were able to learn new linguistic skills, such as the use of basic verb tenses, while using the compansion system, which proved a source of motivation. The system can be adapted to the linguistic competence of each person and required no learning curve during tests when none of its special features, like gender, number, verb tense, or sentence type modifiers, were used. Furthermore, qualitative and quantitative results showed a mean communication rate increase of 41.59%, compared to the same communication device without the compansion system, and an overall improvement in the communication experience when the output is in natural language. Tests were conducted in Catalan and Spanish.
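The basic idea of compansion, expanding a telegraphic symbol sequence into a full sentence, can be sketched as follows. This toy version assumes a fixed subject-verb-object pictogram order and an invented mini-lexicon; the real system described above additionally handles Catalan and Spanish morphology, verb tenses, and sentence-type modifiers.

```python
# Very simplified compansion sketch: expand a telegraphic pictogram
# sequence into an English sentence. Lexicon and rules are invented.

PRONOUNS = {"i": "I", "you": "you"}

def compand(symbols):
    """Expand a (subject, verb, object) symbol triple into a sentence."""
    subject, verb, obj = symbols
    subject = PRONOUNS.get(subject, subject)
    sentence = f"{subject} {verb} {obj}."
    return sentence[0].upper() + sentence[1:]

print(compand(["i", "want", "water"]))      # -> I want water.
print(compand(["dog", "eat", "apple"]))     # -> Dog eat apple. (no agreement yet)
```

Even this stub shows why the full system needs morphology: without agreement and tense rules, the output stays telegraphic.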
This article explores the implications of Hegel's theories of language on second language (L2) teaching. Three among the various concepts in Hegel's theories of language are selected. They are the crucial role of intersubjectivity; the primacy of the spoken over the written form; and the importance of the training of form or grammar. Applying…
Schuit, J.; Baker, A.; Pfau, R.
Sign language typology is a fairly new research field and typological classifications have yet to be established. For spoken languages, these classifications are generally based on typological parameters; it would thus be desirable to establish these for sign languages. In this paper, different
Mazlack, L.J.; Paz, N.M.
Newspaper cartoons can graphically display the result of ambiguity in human speech; the result can be unexpected and funny. Likewise, computer analysis of natural language statements also needs to successfully resolve ambiguous situations. Computer techniques already developed use restricted world knowledge in resolving ambiguous language use. This paper illustrates how these techniques can be used in resolving ambiguous situations arising in cartoons. 8 references.
The semantic web extends the current World Wide Web by adding facilities for the machine-understood description of meaning. The ontology-based search model is used to enhance the efficiency and accuracy of information retrieval. Ontology is the core technology for the semantic web and the mechanism for representing formal and shared domain descriptions. In this paper, we propose ontology-based meaningful search using the semantic web and Natural Language Processing (NLP) techniques in the educational domain. First we build the educational ontology; then we present the semantic search system. The search model consists of three parts: embedded spell-checking, finding synonyms using the WordNet API, and querying the ontology using the SPARQL language. The results are sensitive to both spelling correction and synonymous context. This paper provides more accurate results and the complete details for the selected field in a single page.
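The three-part search model (spell-check, synonym expansion, ontology query) can be sketched with stand-ins for each stage. Here the vocabulary, synonym table, and "ontology" dictionary are invented; a real system would call the WordNet API for synonyms and issue SPARQL queries against the ontology instead.

```python
# Hedged sketch of a three-stage search pipeline: spell correction,
# synonym expansion, then an ontology lookup. All data are stand-ins.

import difflib

VOCAB = ["course", "teacher", "student", "exam"]
SYNONYMS = {"teacher": ["instructor", "lecturer"]}      # in place of WordNet
ONTOLOGY = {                                            # in place of SPARQL results
    "teacher": ["Prof. Rossi teaches Databases"],
    "instructor": ["Dr. Kim teaches Algorithms"],
}

def search(query):
    # 1. spell check: snap the query to the closest known vocabulary term
    matches = difflib.get_close_matches(query.lower(), VOCAB, n=1)
    term = matches[0] if matches else query.lower()
    # 2. expand with synonyms
    terms = [term] + SYNONYMS.get(term, [])
    # 3. look each term up in the "ontology"
    results = []
    for t in terms:
        results.extend(ONTOLOGY.get(t, []))
    return results

print(search("teachar"))   # misspelling is corrected, synonyms searched too
```

The ordering matters: correcting spelling before synonym expansion keeps the WordNet lookup anchored to a real vocabulary term.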
Keywords: domain adaptation, unsupervised learning, deep neural networks, bottleneck features. Spoken language identification (LID) is the process of identifying the language in a spoken speech utterance. In recent years, great improvements in LID system performance have been seen… be the case in practice. Lastly, we conduct an out-of-set experiment where VoA data from nine other languages (Amharic, Creole, Croatian, English…
Kalt, Susan E.
Spanish is one of the most widely spoken languages in the world. Quechua is the largest indigenous language family to constitute the first language (L1) of second language (L2) Spanish speakers. Despite sheer number of speakers and typologically interesting contrasts, Quechua-Spanish second language acquisition is a nearly untapped research area,…
González-Alvarez, Julio; Palomar-García, María-Angeles
Research has shown that syllables play a relevant role in lexical access in Spanish, a shallow language with a transparent syllabic structure. Syllable frequency has been shown to have an inhibitory effect on visual word recognition in Spanish. However, no study has examined the syllable frequency effect on spoken word recognition. The present study tested the effect of the frequency of the first syllable on recognition of spoken Spanish words. A sample of 45 young adults (33 women, 12 men; M = 20.4, SD = 2.8; college students) performed an auditory lexical decision on 128 Spanish disyllabic words and 128 disyllabic nonwords. Words were selected so that lexical and first syllable frequency were manipulated in a within-subject 2 × 2 design, and six additional independent variables were controlled: token positional frequency of the second syllable, number of phonemes, position of lexical stress, number of phonological neighbors, number of phonological neighbors that have higher frequencies than the word, and acoustical durations measured in milliseconds. Decision latencies and error rates were submitted to linear mixed models analysis. Results showed a typical facilitatory effect of the lexical frequency and, importantly, an inhibitory effect of the first syllable frequency on reaction times and error rates. © The Author(s) 2016.
Kandler, Anne; Unger, Roman; Steele, James
‘Language shift’ is the process whereby members of a community in which more than one language is spoken abandon their original vernacular language in favour of another. The historical shifts to English by Celtic language speakers of Britain and Ireland are particularly well-studied examples for which good census data exist for the most recent 100–120 years in many areas where Celtic languages were once the prevailing vernaculars. We model the dynamics of language shift as a competition process in which the numbers of speakers of each language (both monolingual and bilingual) vary as a function both of internal recruitment (as the net outcome of birth, death, immigration and emigration rates of native speakers), and of gains and losses owing to language shift. We examine two models: a basic model in which bilingualism is simply the transitional state for households moving between alternative monolingual states, and a diglossia model in which there is an additional demand for the endangered language as the preferred medium of communication in some restricted sociolinguistic domain, superimposed on the basic shift dynamics. Fitting our models to census data, we successfully reproduce the demographic trajectories of both languages over the past century. We estimate the rates of recruitment of new Scottish Gaelic speakers that would be required each year (for instance, through school education) to counteract the ‘natural wastage’ as households with one or more Gaelic speakers fail to transmit the language to the next generation informally, for different rates of loss during informal intergenerational transmission. PMID:21041210
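The basic competition model described in this abstract can be sketched numerically. The sketch below is illustrative only: the three-state structure (monolingual Gaelic, bilingual, monolingual English) follows the abstract, but the rate constants are invented, not the authors' census-fitted values.

```python
# Illustrative three-state language-shift model: fractions of monolingual
# Gaelic (g), bilingual (b) and monolingual English (e) households.
# Shift flows g -> b -> e, with optional recruitment of new Gaelic
# speakers (e.g., through schooling). Rates are invented for illustration.

def simulate_shift(g=0.30, b=0.30, e=0.40,
                   shift_gb=0.03,   # yearly rate of G households becoming bilingual
                   shift_be=0.05,   # yearly rate of B households becoming English-only
                   recruit=0.0,     # yearly recruitment of new Gaelic speakers
                   years=100, dt=1.0):
    """Forward-Euler integration of the shift dynamics; returns (g, b, e)."""
    for _ in range(int(years / dt)):
        dg = (recruit - shift_gb * g) * dt
        db = (shift_gb * g - shift_be * b) * dt
        de = (shift_be * b - recruit) * dt
        g, b, e = g + dg, b + db, e + de
    return g, b, e
```

Because every loss term reappears as a gain elsewhere, the three fractions always sum to 1; raising `recruit` is the lever the authors estimate for counteracting "natural wastage".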
Famed for his collection of drawings of naturalia and his thoughts on the relationship between painting and natural knowledge, it now appears that the Bolognese naturalist Ulisse Aldrovandi (1522-1605) also specifically pondered color and pigments, compiling not only lists and diagrams of color terms but also a full-length unpublished manuscript entitled De coloribus or Trattato dei colori. Introducing these writings for the first time, this article portrays a scholar not so much interested in the materiality of pigment production as in the cultural history of hues. It argues that these writings constituted an effort to build a language of color, in the sense both of a standard nomenclature of hues and of a lexicon, a dictionary of their denotations and connotations as documented in the literature of ancients and moderns. This language would serve the naturalist in his artistic patronage and his natural historical studies, where color was considered one of the most reliable signs for the correct identification of specimens, and a guarantee of accuracy in their illustration. Far from being an exception, Aldrovandi's 'color sensibility' spoke of that of his university-educated, nature-loving peers.
Kenyan Sign Language (KSL) is a visual gestural language used by members of the deaf community in Kenya. Kiswahili, on the other hand, is a Bantu language used as the national language of Kenya. The two are worlds apart, one being a spoken language and the other a signed language, and thus their “… basic ...
This volume deals with the computational application of systemic functional grammar (SFG) for natural language generation. In particular, it describes the implementation of a fragment of the grammar of German in the computational framework of KOMET-PENMAN for multilingual generation. The text also presents a specification of explicit well-formedness constraints on syntagmatic structure which are defined in the form of typed feature structures. It thus achieves a model of systemic functional grammar that unites both the strengths of systemics, such as stratification, functional diversification
Vilic, Adnan; Petersen, John Asger; Hoppe, Karsten
This paper presents a data-driven approach to graphically presenting text-based patient journals while still maintaining all textual information. The system first creates a timeline representation of a patient's physiological condition during an admission, which is assessed by electronically monitoring vital signs and then combining these into Early Warning Scores (EWS). Hereafter, techniques from Natural Language Processing (NLP) are applied to the existing patient journal to extract all entries. Finally, the two methods are combined into an interactive timeline featuring the ability to see drastic changes in the patient's health, thereby enabling staff to see where in the journal critical events have taken place.
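As a rough illustration of the EWS step, here is a minimal scoring sketch. The sub-score bands below are loosely modeled on published early-warning charts but are not the thresholds used in this system; treat every cutoff as a placeholder.

```python
# Minimal Early Warning Score (EWS) aggregation sketch: each vital sign
# maps to a 0-3 sub-score, and the sub-scores are summed.  Thresholds
# are illustrative placeholders, not clinically validated values.

def respiratory_score(rate):
    if rate <= 8 or rate >= 25:
        return 3
    if 21 <= rate <= 24:
        return 2
    if 9 <= rate <= 11:
        return 1
    return 0  # 12-20 breaths/min scores 0

def heart_rate_score(bpm):
    if bpm <= 40 or bpm >= 131:
        return 3
    if 111 <= bpm <= 130:
        return 2
    if 41 <= bpm <= 50 or 91 <= bpm <= 110:
        return 1
    return 0  # 51-90 bpm scores 0

def early_warning_score(resp_rate, heart_rate):
    """Sum of sub-scores; higher totals flag deteriorating patients."""
    return respiratory_score(resp_rate) + heart_rate_score(heart_rate)

assert early_warning_score(16, 70) == 0    # normal vitals
assert early_warning_score(26, 135) == 6   # both vitals critical
```

Plotting the summed score over time yields exactly the kind of timeline the paper combines with NLP-extracted journal entries.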
The Quran is a scripture that acts as the main reference for people whose religion is Islam. It covers information from politics to science, a vast amount of information that requires effort to uncover the knowledge behind it. Today, the emergence of smartphones has led to the development of a wide range of applications for enhancing knowledge-seeking activities. This project proposes a mobile application that takes a natural language approach to searching topics in the Quran based on keyword searching. The benefit of the application is two-fold: it is intuitive and it saves time.
The aim of this bachelor thesis is to explore this image label database coming from the ESP game from the natural language processing (NLP) point of view. ESP game is an online game, in which human players do useful work - they label images. The output of the ESP game is then a database of images and their labels. What interests us is whether the data collected in the process of labeling images will be of any use in NLP tasks. Specifically, we are interested in the tasks of automatic corefere...
It is shown how certain kinds of domain independent expert systems based on classification problem-solving methods can be constructed directly from natural language descriptions by a human expert. The expert knowledge is not translated into production rules. Rather, it is mapped into conceptual structures which are integrated into long-term memory (LTM). The resulting system is one in which problem-solving, retrieval and memory organization are integrated processes. In other words, the same algorithm and knowledge representation structures are shared by these processes. As a result of this, the system can answer questions, solve problems or reorganize LTM.
Background: Incident reporting is the most common method for detecting adverse events in a hospital. However, under-reporting or non-reporting and delay in submission of reports are problems that prevent early detection of serious adverse events. The aim of this study was to determine whether it is possible to promptly detect serious injuries after inpatient falls by using a natural language processing method and to determine which data source is the most suitable for this purpose. Methods: We tried to detect adverse events from narrative text data of electronic medical records by using a natural language processing method. We made syntactic category decision rules to detect inpatient falls from text data in electronic medical records. We compared how often the true fall events were recorded in various sources of data including progress notes, discharge summaries, image order entries and incident reports. We applied the rules to these data sources and compared F-measures to detect falls between these data sources with reference to the results of a manual chart review. The lag time between event occurrence and data submission and the degree of injury were compared. Results: We made 170 syntactic rules to detect inpatient falls by using a natural language processing method. Information on true fall events was most frequently recorded in progress notes (100%), incident reports (65.0%) and image order entries (12.5%). However, the F-measure for detecting falls using the rules was poor when using progress notes (0.12) and discharge summaries (0.24) compared with incident reports (1.00) and image order entries (0.91). Since the results suggested that incident reports and image order entries were possible data sources for prompt detection of serious falls, we focused on a comparison of falls found by incident reports and image order entries. Injury caused by falls found by image order entries was significantly more severe than falls detected by
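The F-measure comparison at the heart of the Results can be reproduced with a few lines. The counts in the usage example are invented; only the formula (harmonic mean of precision and recall) is standard.

```python
# F-measure (F1) from raw detection counts, scored against a manual
# chart review treated as ground truth.

def f_measure(true_positives, false_positives, false_negatives):
    precision = true_positives / (true_positives + false_positives)
    recall = true_positives / (true_positives + false_negatives)
    return 2 * precision * recall / (precision + recall)

# A source that records every true fall (perfect recall) can still score
# poorly if it also produces many false alarms -- the pattern reported
# for progress notes versus incident reports above.
assert f_measure(40, 0, 0) == 1.0
assert abs(f_measure(40, 560, 0) - 0.125) < 1e-9
```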
This book introduces basic supervised learning algorithms applicable to natural language processing (NLP) and shows how the performance of these algorithms can often be improved by exploiting the marginal distribution of large amounts of unlabeled data. One reason for that is data sparsity, i.e., the limited amounts of data we have available in NLP. However, in most real-world NLP applications our labeled data is also heavily biased. This book introduces extensions of supervised learning algorithms to cope with data sparsity and different kinds of sampling bias.This book is intended to be both
This paper investigates the interplay of constructed action and the clause in Finnish Sign Language (FinSL). Constructed action is a form of gestural enactment in which signers use their hands, face and other parts of the body to represent the actions, thoughts or feelings of someone they are referring to in the discourse. With the help of frequencies calculated from corpus data, this article shows firstly that when FinSL signers are narrating a story, there are differences in how they use constructed action. The paper then argues that there are also differences in the prototypical structure, linkage type and non-manual activity of clauses, depending on the presence or non-presence of constructed action. Finally, taking the view that gesturality is an integral part of language, the paper discusses the nature of syntax in sign languages and proposes a conceptualization in which syntax is seen as a set of norms distributed on a continuum between a categorial-conventional end and a gradient-unconventional end.
Rice, Mabel L
Future perspectives on children with language impairments are framed from what is known about children with specific language impairment (SLI). A summary of the current state of services is followed by discussion of how these children can be overlooked and misunderstood and consideration of why it is so hard for some children to acquire language when it is effortless for most children. Genetic influences are highlighted, with the suggestion that nature plus nurture should be considered in present as well as future intervention approaches. A nurture perspective highlights the family context of the likelihood of SLI for some of the children. Future models of the causal pathways may provide more specific information to guide gene-treatment decisions, in ways parallel to current personalized medicine approaches. Future treatment options can build on the potential of electronic technologies and social media to provide personalized treatment methods available at a time and place convenient for the person to use as often as desired. The speech-language pathologist could oversee a wide range of treatment options and monitor evidence provided electronically to evaluate progress and plan future treatment steps. Most importantly, future methods can provide lifelong language acquisition activities that maintain the privacy and dignity of persons with language impairment, and in so doing will in turn enhance the effectiveness of speech-language pathologists.
Malamud, B. D.; Rhodes, F. H. T.
This paper explores natural hazards teaching and communication through the use of a literary anthology of writings about the earth aimed at non-experts. Teaching natural hazards in high-school and university introductory Earth Science and Geography courses revolves mostly around lectures, examinations, and laboratory demonstrations/activities. Often the result of such a course is that a student 'memorizes' the answers and is penalized for missing a given fact [e.g., "You lost one point because you were off by 50 km/hr on the wind speed of an F5 tornado."]. Although facts and general methodologies are certainly important when teaching natural hazards, a student's assimilation of, and enthusiasm for, this knowledge is strongly motivated when it is supplemented by writings about the Earth. In this paper, we discuss a literary anthology we developed [Language of the Earth, Rhodes, Stone, Malamud, Wiley-Blackwell, 2008] which includes many descriptions of natural hazards. Using first- and second-hand accounts of landslides, earthquakes, tsunamis, floods and volcanic eruptions, through the writings of McPhee, Gaskill, Voltaire, Austin, Cloos, and many others, hazards become 'alive', and more than 'just' a compilation of facts and processes. With short excerpts from this or similar anthologies of remarkably written accounts of natural hazards, 'dry' facts become more than just facts. These often highly personal viewpoints of our catastrophic world provide a useful supplement to a student's understanding of the turbulent world in which we live.
Brøndsted, Tom; Larsen, Henrik Legind; Larsen, Lars Bo
window focused over the part which most likely contains an answer to the query. The two systems are integrated into a full spoken query answering system. The prototype can answer queries and questions within the chosen football (soccer) test domain, but the system has the flexibility to be ported...
PARKER, GARY J.; SOLA, DONALD F.
The essentials of Ayacucho grammar were presented in the first volume of this series, Spoken Ayacucho Quechua, Units 1-10. The 10 units in this volume (11-20) are intended for use in an intermediate or advanced course, and present the student with lengthier and more complex dialogs, conversations, "listening-ins," and dictations as well…
SOLA, DONALD F.; AND OTHERS
This second volume of an introductory course in spoken Cuzco Quechua also comprises enough material for one intensive summer session course or one semester of semi-intensive instruction (120 class hours). The method of presentation is essentially the same as in the first volume, with further contrastive linguistic analysis of English-Quechua…
LASTRA, YOLANDA; SOLA, DONALD F.
Units 13-24 of the spoken Cochabamba Quechua course follow the general format of the first volume (Units 1-12). This second volume is intended for use in an intermediate or advanced course and includes more complex dialogs, conversations, "listening-ins," and dictations, as well as grammar and exercise sections covering additional…
PARKER, GARY J.; SOLA, DONALD F.
This beginning course in Ayacucho Quechua, spoken by about a million people in south-central Peru, was prepared to introduce the phonology and grammar of this dialect to speakers of English. The first of two volumes, it serves as a text for a 6-week intensive course of 20 class hours a week. The authors compare and contrast significant features of…
Thomas, Earl W.
This is a first-year text of Portuguese grammar based on the Portuguese of moderately educated Brazilians from the area around Rio de Janeiro. Spoken idiomatic usage is emphasized. An important innovation is found in the presentation of verb tenses; they are presented in the order in which the native speaker learns them. The text is intended to…
Ordelman, Roeland J.F.; Heeren, W.F.L.; Huijbregts, M.A.H.; Hiemstra, Djoerd; de Jong, Franciska M.G.; Larson, M; Fernie, K; Oomen, J; Cigarran, J.
This paper presents and discusses ongoing work aiming at affordable disclosure of real-world spoken word archives in general, and in particular of a collection of recorded interviews with Dutch survivors of World War II concentration camp Buchenwald. Given such collections, the least we want to be
Larson, M; Ordelman, Roeland J.F.; Heeren, W.F.L.; Fernie, K; de Jong, Franciska M.G.; Huijbregts, M.A.H.; Oomen, J; Hiemstra, Djoerd
This paper presents and discusses ongoing work aiming at affordable disclosure of real-world spoken heritage archives in general, and in particular of a collection of recorded interviews with Dutch survivors of World War II concentration camp Buchenwald. Given such collections, we at least want to
This study expands contemporary theorising about students' conceptions of equality. A nationally representative sample of New Zealand students were asked to provide a spoken numerical response and an explanation as they solved an arithmetic additive missing number problem. Students' responses were conceptualised as acts of communication and…
Percy-Smith, Lone; Cayé-Thomasen, Per; Breinegaard, Nina
The present study demonstrates a very strong effect of the parental communication mode on the auditory capabilities and speech/language outcome for cochlear implanted children. The children exposed to spoken language had higher odds of scoring high in all tests applied, and the findings suggest a very clear benefit of spoken language communication with a cochlear implanted child.
This study addresses the issue of promoting effective Business Spoken English among enterprise staff in China. It aims to assess spoken English learning methods and identify the difficulties of learning oral English expression in the business area. It also provides strategies for enhancing enterprise staff's level of Business Spoken English.
Sridhar, Kamal K.
Language and literacy issues in India are reviewed in terms of background, steps taken to combat illiteracy, and some problems associated with literacy. The following facts are noted: India has 106 languages spoken by more than 685 million people, there are several minor script systems, a major language has different dialects, a language may use…
Pazos R, Rodolfo A; Aguirre L, Marco A; González B, Juan J; Martínez F, José A; Pérez O, Joaquín; Verástegui O, Andrés A
In recent decades the popularity of natural language interfaces to databases (NLIDBs) has increased, because in many cases information obtained from them is used for making important business decisions. Unfortunately, the complexity of their customization by database administrators makes them difficult to use. In order for an NLIDB to obtain a high percentage of correctly translated queries, it must be correctly customized for the database to be queried. In most cases the performance reported in the NLIDB literature is the highest possible, i.e., the performance obtained when the interfaces were customized by the implementers. However, what matters more to end users is the performance the interface can yield when the NLIDB is customized by someone other than the implementers. Unfortunately, very few articles report NLIDB performance when the NLIDBs are not customized by the implementers. This article presents a semantically-enriched data dictionary (which permits solving many of the problems that occur when translating from natural language to SQL) and an experiment in which two groups of undergraduate students customized our NLIDB and English Language Frontend (ELF), considered one of the best available commercial NLIDBs. The experimental results show that, when customized by the first group, our NLIDB obtained 44.69% of correctly answered queries and ELF 11.83% for the ATIS database, and when customized by the second group, our NLIDB attained 77.05% and ELF 13.48%. The performance attained by our NLIDB when customized by ourselves was 90%.
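A toy fragment can make the data-dictionary idea concrete. Everything below is hypothetical: the dictionary entries, the table and column names, and the deliberately naive matching strategy are not taken from the authors' NLIDB or from ELF.

```python
# Hypothetical data dictionary mapping natural language terms onto
# database tables/columns (names invented, loosely ATIS-flavored).
DATA_DICTIONARY = {
    "flights": ("flight", None),              # noun -> table
    "airline": ("flight", "airline_code"),    # noun -> table.column
    "departure city": ("flight", "from_city"),
}

def translate(question):
    """Naive NL-to-SQL: select everything from the first table whose
    dictionary term appears in the question; None if nothing matches."""
    q = question.lower()
    for term, (table, _column) in DATA_DICTIONARY.items():
        if term in q:
            return f"SELECT * FROM {table}"
    return None

assert translate("List all flights") == "SELECT * FROM flight"
assert translate("Show me hotels") is None
```

The customization problem the article measures is exactly the effort of filling a dictionary like this one, correctly, for a new database.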
Amaechi Uneke Enyi
The study, entitled "Language and Interactional Discourse: Deconstructing the Talk-Generating Machinery in Natural Conversation," is an analysis of spontaneous and informal conversation. The study, carried out in the theoretical and methodological tradition of Ethnomethodology, was aimed at explicating how ordinary talk is organized and produced, how people coordinate their talk-in-interaction, how meanings are determined, and the role of talk in the wider social processes. The study followed the basic assumption of conversation analysis, which is that talk is not just a product of two 'speakers-hearers' who attempt to exchange information or convey messages to each other. Rather, participants in conversation are seen to be mutually orienting to, and collaborating in order to achieve, orderly and meaningful communication. The analytic objective is therefore to make clear the procedures on which speakers rely to produce utterances and by which they make sense of other speakers' talk. The datum used for this study was a recorded informal conversation between two (and later three) middle-class civil servants who are friends. The recording was done in such a way that the participants were not aware that they were being recorded. The recording was later transcribed in a way that we believe is faithful to the spontaneity and informality of the talk. Our finding showed that conversation has its own features and is an ordered and structured day-to-day social event. Specifically, utterances are designed and informed by organized procedures, methods and resources which are tied to the contexts in which they are produced, and which participants are privy to by virtue of their membership of a culture or a natural language community. Keywords: Language, Discourse and Conversation
Lane, H. Chad; Vanlehn, Kurt
For beginning programmers, inadequate problem solving and planning skills are among the most salient of their weaknesses. In this paper, we test the efficacy of natural language tutoring to teach and scaffold acquisition of these skills. We describe ProPL (Pro-PELL), a dialogue-based intelligent tutoring system that elicits goal decompositions and program plans from students in natural language. The system uses a variety of tutoring tactics that leverage students' intuitive understandings of the problem, how it might be solved, and the underlying concepts of programming. We report the results of a small-scale evaluation comparing students who used ProPL with a control group who read the same content. Our primary findings are that students who received tutoring from ProPL seem to have developed an improved ability to solve the composition problem and displayed behaviors that suggest they were able to think at greater levels of abstraction than students in the read-only group.
Kreimeyer, Kory; Foster, Matthew; Pandey, Abhishek; Arya, Nina; Halford, Gwendolyn; Jones, Sandra F; Forshee, Richard; Walderhaug, Mark; Botsis, Taxiarchis
We followed a systematic approach based on the Preferred Reporting Items for Systematic Reviews and Meta-Analyses to identify existing clinical natural language processing (NLP) systems that generate structured information from unstructured free text. Seven literature databases were searched with a query combining the concepts of natural language processing and structured data capture. Two reviewers screened all records for relevance during two screening phases, and information about clinical NLP systems was collected from the final set of papers. A total of 7149 records (after removing duplicates) were retrieved and screened, and 86 were determined to fit the review criteria. These papers contained information about 71 different clinical NLP systems, which were then analyzed. The NLP systems address a wide variety of important clinical and research tasks. Certain tasks are well addressed by the existing systems, while others remain as open challenges that only a small number of systems attempt, such as extraction of temporal information or normalization of concepts to standard terminologies. This review has identified many NLP systems capable of processing clinical free text and generating structured output, and the information collected and evaluated here will be important for prioritizing development of new approaches for clinical NLP. Copyright © 2017 Elsevier Inc. All rights reserved.
Language is a systematic means of communication using sounds or symbols that carry meaning, uttered through the mouth. Language is also written following applicable rules. One language widely used around the world is English. However, there are obstacles when learning from a teacher or instructor: the time a teacher can give is limited to school or tutoring hours, and once students leave school or tutoring, they must study English independently. From this problem arose the idea of building an application that gives students the knowledge to learn English independently, covering the transformation of positive sentences into negative and interrogative sentences. In addition, the application can teach how to pronounce English sentences. In essence, the contribution of this research is that schools from the junior high school through senior/vocational high school levels can use a text-to-speech application based on natural language processing to learn English tenses. The application can read English sentences aloud and can construct interrogative and negative sentences from positive ones in several English tenses. Keywords: Natural language processing, Text to speech
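The positive-to-negative and positive-to-interrogative transformation described above can be sketched for one tense. The rules below cover only the simple present with a third-person-singular subject and a regular verb; they are a hedged illustration, not the application's actual rule set.

```python
# Toy sentence transformations for the English simple present,
# third-person singular only.  Morphology here is naive: it just strips
# a trailing "s" from the inflected verb (fails for e.g. "watches").

def to_negative(subject, verb_s, rest):
    # "She plays tennis" -> "She does not play tennis"
    base = verb_s[:-1] if verb_s.endswith("s") else verb_s
    return f"{subject} does not {base} {rest}".strip()

def to_question(subject, verb_s, rest):
    # "She plays tennis" -> "Does she play tennis?"
    base = verb_s[:-1] if verb_s.endswith("s") else verb_s
    return f"Does {subject.lower()} {base} {rest}?".strip()

assert to_negative("She", "plays", "tennis") == "She does not play tennis"
assert to_question("She", "plays", "tennis") == "Does she play tennis?"
```

Feeding the transformed strings to a text-to-speech engine would complete the pipeline the abstract describes.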
Alexandr I Krupnov
The article discusses the results of an empirical study of the association between variables of persistence and academic achievement in foreign languages. The sample includes students of the Faculty of Physics, Mathematics and Natural Science at the RUDN University (n = 115), divided into 5 subsamples, two of which are featured in the present study (the most and the least successful students subsamples). Persistence as a personality trait is studied within A.I. Krupnov's system-functional approach. A.I. Krupnov's paper-and-pencil test was used to measure persistence variables. Academic achievement was measured according to four parameters: Phonetics, Grammar, Speaking and Political Vocabulary, based on the grades students received during the academic year. The analysis revealed that persistence displays different associations with academic achievement variables in the more and less successful students subsamples; the general prominence of this trait is more important for unsuccessful students. Phonetics is the academic achievement variable most associated with persistence due to its nature, a skill one can acquire through hard work and practice, which is the definition of persistence. Grammar as an academic achievement variable is not associated with persistence and probably relates to other factors. Unsuccessful students may have difficulties in separating various aspects of language acquisition from each other, which should be taken into consideration by teachers.
Hunter, James; Freer, Yvonne; Gatt, Albert; Reiter, Ehud; Sripada, Somayajulu; Sykes, Cindy
Our objective was to determine whether and how a computer system could automatically generate helpful natural language nursing shift summaries solely from an electronic patient record system, in a neonatal intensive care unit (NICU). A system was developed which automatically generates partial NICU shift summaries (for the respiratory and cardiovascular systems), using data-to-text technology. It was evaluated for 2 months in the NICU at the Royal Infirmary of Edinburgh, under supervision. In an on-ward evaluation, a substantial majority of the summaries was found by outgoing and incoming nurses to be understandable (90%), and a majority was found to be accurate (70%), and helpful (59%). The evaluation also served to identify some outstanding issues, especially with regard to extra content the nurses wanted to see in the computer-generated summaries. It is technically possible automatically to generate limited natural language NICU shift summaries from an electronic patient record. However, it proved difficult to handle electronic data that was intended primarily for display to the medical staff, and considerable engineering effort would be required to create a deployable system from our proof-of-concept software. Copyright © 2012 Elsevier B.V. All rights reserved.
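A minimal data-to-text fragment can illustrate the kind of generation involved. The trend thresholds and the sentence template below are invented; the actual system's content selection and realization were far richer.

```python
# Data-to-text sketch: abstract a numeric time series into a trend
# symbol and realize it as one nursing-summary sentence.  The +/-5
# stability band and the wording are illustrative placeholders.

def describe_trend(name, values, unit):
    delta = values[-1] - values[0]
    if abs(delta) < 5:
        trend = "remained stable"
    elif delta > 0:
        trend = "rose"
    else:
        trend = "fell"
    return (f"{name} {trend} over the shift "
            f"(from {values[0]} to {values[-1]} {unit}).")

summary = describe_trend("Heart rate", [152, 158, 171], "bpm")
assert summary == "Heart rate rose over the shift (from 152 to 171 bpm)."
```

A deployable system layers many such micro-decisions (what to mention, in what order, with what hedging) on top of this template step, which is where most of the engineering effort noted in the abstract goes.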
Genuardi, Michael T.
One strategy for machine-aided indexing (MAI) is to provide a concept-level analysis of the textual elements of documents or document abstracts. In such systems, natural-language phrases are analyzed in order to identify and classify concepts related to a particular subject domain. The overall performance of these MAI systems is largely dependent on the quality and comprehensiveness of their knowledge bases. These knowledge bases function to (1) define the relations between a controlled indexing vocabulary and natural language expressions; (2) provide a simple mechanism for disambiguation and the determination of relevancy; and (3) allow the extension of concept-hierarchical structure to all elements of the knowledge file. After a brief description of the NASA Machine-Aided Indexing system, concerns related to the development and maintenance of MAI knowledge bases are discussed. Particular emphasis is given to statistically-based text analysis tools designed to aid the knowledge base developer. One such tool, the Knowledge Base Building (KBB) program, presents the domain expert with a well-filtered list of synonyms and conceptually-related phrases for each thesaurus concept. Another tool, the Knowledge Base Maintenance (KBM) program, functions to identify areas of the knowledge base affected by changes in the conceptual domain (for example, the addition of a new thesaurus term). An alternate use of the KBM as an aid in thesaurus construction is also discussed.
Paredes-Valverde, Mario Andrés
The semantic Web aims to provide Web information with a well-defined meaning and make it understandable not only by humans but also by computers, thus allowing the automation, integration and reuse of high-quality information across different applications. However, current information retrieval mechanisms for semantic knowledge bases are intended to be used only by expert users. In this work, we propose a natural language interface that allows non-expert users access to this kind of information through formulating queries in natural language. The present approach uses a domain-independent ontology model to represent the question's structure and context. Also, this model allows determination of the answer type expected by the user based on a proposed question classification. To prove the effectiveness of our approach, we have conducted an evaluation in the music domain using LinkedBrainz, an effort to provide the MusicBrainz information as structured data on the Web by means of Semantic Web technologies. Our proposal obtained encouraging results based on the F-measure metric, ranging from 0.74 to 0.82 for a corpus of questions generated by a group of real-world end users. © The Author(s) 2015.
Paredes-Valverde, Mario Andrés; Valencia-García, Rafael; Rodriguez-Garcia, Miguel Angel; Colomo-Palacios, Ricardo; Alor-Hernández, Giner
The semantic Web aims to provide Web information with a well-defined meaning and make it understandable not only by humans but also by computers, thus allowing the automation, integration and reuse of high-quality information across different applications. However, current information retrieval mechanisms for semantic knowledge bases are intended to be used only by expert users. In this work, we propose a natural language interface that allows non-expert users access to this kind of information through formulating queries in natural language. The present approach uses a domain-independent ontology model to represent the question's structure and context. Also, this model allows determination of the answer type expected by the user based on a proposed question classification. To prove the effectiveness of our approach, we have conducted an evaluation in the music domain using LinkedBrainz, an effort to provide the MusicBrainz information as structured data on the Web by means of Semantic Web technologies. Our proposal obtained encouraging results based on the F-measure metric, ranging from 0.74 to 0.82 for a corpus of questions generated by a group of real-world end users. © The Author(s) 2015.
Nugues, Pierre M
The areas of natural language processing and computational linguistics have continued to grow in recent years, driven by the demand to automatically process text and spoken data. With the processing power and techniques now available, research is scaling up from lab prototypes to real-world, proven applications. This book teaches the principles of natural language processing, first covering linguistic issues such as encoding, entropy, and annotation schemes; defining words, tokens and parts of speech; and morphology. It then details the language-processing functions involved, including part-of-speech tagging…
The seventh issue of Complex Systems Informatics and Modeling Quarterly presents five papers devoted to two distinct research topics: systems modeling and natural language processing (NLP). Both of these subjects are very important in computer science. Through modeling we can simplify the studied problem by concentrating on only one aspect at a time. Moreover, a properly constructed model allows the modeler to work at higher levels of abstraction without having to concentrate on details. Since the size and complexity of information systems grow rapidly, creating good models of such systems is crucial. The analysis of natural language is slowly becoming a widely used tool in commerce and day-to-day life. Opinion mining allows recommender systems to provide accurate recommendations based on user-generated reviews. Speech recognition and NLP are the basis for such widely used personal assistants as Apple's Siri, Microsoft's Cortana, and Google Now. While a lot of work has already been done on natural language processing, the research usually concerns widely used languages such as English. Consequently, natural language processing in languages other than English is a very relevant subject and is addressed in this issue.
The effects of word frequency and syllable frequency are well-established phenomena in domains such as spoken production in alphabetic languages. Chinese, as a non-alphabetic language, presents unique lexical and phonological properties in speech production. For example, the proximate unit of phonological encoding is the syllable in Chinese but the segment in Dutch, French or English. The present study investigated the effects of word frequency and syllable frequency, and their interaction, in Chinese written and spoken production. Significant facilitatory word frequency and syllable frequency effects were observed in spoken as well as in written production. The syllable frequency effect in writing indicated that phonological properties (i.e., syllable frequency) constrain orthographic output via a lexical route, at least in Chinese written production. However, the syllable frequency effect across repetitions diverged between the two modalities: it was significant in the first two repetitions in spoken production, whereas it was significant only in the second repetition in written production. Given the fragility of the syllable frequency effect in writing, we suggest that the phonological influence in handwritten production is not mandatory and universal, and that it is modulated by experimental manipulations. This provides evidence for the orthographic autonomy hypothesis rather than the phonological mediation hypothesis. The absence of an interaction between word frequency and syllable frequency showed that the syllable frequency effect is independent of the word frequency effect in both spoken and written output modalities. The implications of these results for written production models are discussed.
Ma, Cuixia; Dai, Guozhong
Natural user interfaces are one of the important next-generation interaction styles. Computers are no longer tools for a few specialists or domains but for most people, and ubiquitous computing makes the world more convenient and comfortable. In the design domain, current systems, which require detailed information, cannot conveniently support conceptual design in the early phase. Pen and paper are natural and simple tools in daily life, especially in design. Gestures are a useful and natural mode of pen-based interaction. In a natural UI, gestures can be introduced and used in a manner similar to existing interaction resources. However, gestures are usually defined beforehand, without regard to the users' intentions, and are recognized to represent something in one particular application without being transferable to others. We provide a gesture description language (GDL) that makes useful gestures conveniently reusable across applications. Gestures can then serve as application-independent control resources, much like menus or icons. We therefore present the idea from two perspectives: the application-dependent and the application-independent point of view.
Jones, A C; Toscano, E; Botting, N; Marshall, C R; Atkinson, J R; Denmark, T; Herman, R; Morgan, G
Previous research has highlighted that deaf children acquiring spoken English have difficulties in narrative development relative to their hearing peers, both in terms of macro-structure and in micro-structural devices. The majority of previous research focused on narrative tasks designed for hearing children that depend on good receptive language skills. The current study compared narratives of 6- to 11-year-old deaf children who use spoken English (N=59) with those of hearing peers matched for age and non-verbal intelligence. To examine the role of general language abilities, single-word vocabulary was also assessed. Narratives were elicited by the retelling of a story presented non-verbally in video format. Results showed that deaf and hearing children had equivalent macro-structure skills, but the deaf group showed poorer performance on micro-structural components. Furthermore, the deaf group gave less detailed responses to inferencing probe questions, indicating poorer understanding of the story's underlying message. For deaf children, micro-level devices correlated most strongly with the vocabulary measure. These findings suggest that deaf children, despite spoken language delays, are able to convey the main elements of content and structure in narrative but have greater difficulty in using grammatical devices more dependent on finer linguistic and pragmatic skills. Crown Copyright © 2016. Published by Elsevier Ltd. All rights reserved.
Wiseheart, Rebecca; Altmann, Lori J P
Individuals with dyslexia demonstrate syntactic difficulties on tasks of language comprehension, yet little is known about spoken language production in this population. To investigate whether spoken sentence production in college students with dyslexia is less proficient than in typical readers, and to determine whether group differences can be attributed to cognitive differences between groups. Fifty-one college students with and without dyslexia were asked to produce sentences from stimuli comprising a verb and two nouns. Verb types varied in argument structure and morphological form, and nouns varied in animacy. Outcome measures were precision (measured by fluency, grammaticality and completeness) and efficiency (measured by response times). Vocabulary and working memory tests were also administered and used as predictors of sentence production performance. Relative to non-dyslexic peers, students with dyslexia responded significantly more slowly and produced sentences that were significantly less precise in terms of fluency, grammaticality and completeness. The primary predictors of precision and efficiency were working memory, which differed between groups, and vocabulary, which did not. College students with dyslexia were significantly less facile and flexible on this spoken sentence-production task than typical readers, which is consistent with previous studies of school-age children with dyslexia. Group differences in performance were traced primarily to limited working memory, and were somewhat mitigated by strong vocabulary. © 2017 Royal College of Speech and Language Therapists.
Zhang, Xingyu; Kim, Joyce; Patzer, Rachel E; Pitts, Stephen R; Patzer, Aaron; Schrager, Justin D
To describe and compare logistic regression and neural network modeling strategies for predicting hospital admission or transfer following initial presentation to Emergency Department (ED) triage, with and without the addition of natural language processing elements. Using data from the National Hospital Ambulatory Medical Care Survey (NHAMCS), a cross-sectional probability sample of United States EDs from the 2012 and 2013 survey years, we developed several predictive models with the outcome being admission to the hospital or transfer vs. discharge home. We included patient characteristics immediately available after the patient has presented to the ED and undergone a triage process. We used this information to construct logistic regression (LR) and multilayer neural network (MLNN) models which included natural language processing (NLP) and principal component analysis applied to the patient's reason for visit. Ten-fold cross-validation was used to test the predictive capacity of each model, and the area under the receiver operating characteristic curve (AUC) was then calculated for each model. Of the 47,200 ED visits from 642 hospitals, 6,335 (13.42%) resulted in hospital admission (or transfer). A total of 48 principal components were extracted by NLP from the reason-for-visit fields, which explained 75% of the overall variance for hospitalization. In the model including only structured variables, the AUC was 0.824 (95% CI 0.818-0.830) for logistic regression and 0.823 (95% CI 0.817-0.829) for MLNN. Models including only free-text information generated an AUC of 0.742 (95% CI 0.731-0.753) for logistic regression and 0.753 (95% CI 0.742-0.764) for MLNN. When both structured variables and free-text variables were included, the AUC reached 0.846 (95% CI 0.839-0.853) for logistic regression and 0.844 (95% CI 0.836-0.852) for MLNN. The predictive accuracy of hospital admission or transfer for patients who presented to ED triage overall was good, and was improved with the inclusion of free-text data from a patient
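The modeling strategy described above can be sketched in miniature: bag-of-words features from the free-text reason for visit are concatenated with structured triage variables and fed to a logistic regression. The data, feature names, and hyperparameters below are invented for illustration and are far simpler than the NHAMCS setup (no PCA step, no neural network).

```python
import math
from collections import Counter

# Invented toy visits: (age/10, triage level, free-text reason, admitted?)
visits = [
    (7.2, 1, "chest pain shortness of breath", 1),
    (2.1, 4, "ankle sprain", 0),
    (6.5, 2, "severe abdominal pain vomiting", 1),
    (3.0, 5, "medication refill", 0),
    (8.0, 1, "confusion fever", 1),
    (2.5, 3, "minor laceration", 0),
]

# Vocabulary over the free-text field; each word becomes one count feature.
vocab = sorted({w for _, _, text, _ in visits for w in text.split()})

def featurize(age, triage, text):
    counts = Counter(text.split())
    return [age, triage] + [counts[w] for w in vocab]

X = [featurize(a, t, txt) for a, t, txt, _ in visits]
y = [label for *_, label in visits]

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

# Plain stochastic gradient descent on the logistic loss.
w = [0.0] * len(X[0])
b = 0.0
lr = 0.1
for _ in range(2000):
    for xi, yi in zip(X, y):
        p = sigmoid(sum(wj * xj for wj, xj in zip(w, xi)) + b)
        err = p - yi
        w = [wj - lr * err * xj for wj, xj in zip(w, xi)]
        b -= lr * err

preds = [int(sigmoid(sum(wj * xj for wj, xj in zip(w, xi)) + b) > 0.5)
         for xi in X]
accuracy = sum(p == yi for p, yi in zip(preds, y)) / len(y)
```

On this separable toy data the model fits the training set perfectly; the real study's value comes from cross-validated AUC on held-out visits, which this sketch omits.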
Lassiter, Daniel; Goodman, Noah D
The "new paradigm" unifying deductive and inductive reasoning in a Bayesian framework (Oaksford & Chater, 2007; Over, 2009) has been claimed to be falsified by results which show sharp differences between reasoning about necessity vs. plausibility (Heit & Rotello, 2010; Rips, 2001; Rotello & Heit, 2009). We provide a probabilistic model of reasoning with modal expressions such as "necessary" and "plausible" informed by recent work in formal semantics of natural language, and show that it predicts the possibility of non-linear response patterns which have been claimed to be problematic. Our model also makes a strong monotonicity prediction, while two-dimensional theories predict the possibility of reversals in argument strength depending on the modal word chosen. Predictions were tested using a novel experimental paradigm that replicates the previously-reported response patterns with a minimal manipulation, changing only one word of the stimulus between conditions. We found a spectrum of reasoning "modes" corresponding to different modal words, and strong support for our model's monotonicity prediction. This indicates that probabilistic approaches to reasoning can account in a clear and parsimonious way for data previously argued to falsify them, as well as new, more fine-grained, data. It also illustrates the importance of careful attention to the semantics of language employed in reasoning experiments. Copyright © 2014 Elsevier B.V. All rights reserved.
Soysal, Ergin; Wang, Jingqi; Jiang, Min; Wu, Yonghui; Pakhomov, Serguei; Liu, Hongfang; Xu, Hua
Existing general clinical natural language processing (NLP) systems such as MetaMap and Clinical Text Analysis and Knowledge Extraction System have been successfully applied to information extraction from clinical text. However, end users often have to customize existing systems for their individual tasks, which can require substantial NLP skills. Here we present CLAMP (Clinical Language Annotation, Modeling, and Processing), a newly developed clinical NLP toolkit that provides not only state-of-the-art NLP components, but also a user-friendly graphic user interface that can help users quickly build customized NLP pipelines for their individual applications. Our evaluation shows that the CLAMP default pipeline achieved good performance on named entity recognition and concept encoding. We also demonstrate the efficiency of the CLAMP graphic user interface in building customized, high-performance NLP pipelines with 2 use cases, extracting smoking status and lab test values. CLAMP is publicly available for research use, and we believe it is a unique asset for the clinical NLP community. © The Author 2017. Published by Oxford University Press on behalf of the American Medical Informatics Association. All rights reserved. For Permissions, please email: firstname.lastname@example.org.
Levitt, Ash; Schlauch, Robert C; Bartholow, Bruce D; Sher, Kenneth J
Examining the natural language college students use to describe various levels of intoxication can provide important insight into subjective perceptions of college alcohol use. Previous research (Levitt et al., Alcohol Clin Exp Res 2009; 33: 448) has shown that intoxication terms reflect moderate and heavy levels of intoxication and that self-use of these terms differs by gender among college students. However, it is still unknown whether these terms similarly apply to other individuals and, if so, whether similar gender differences exist. To address these issues, the current study examined the application of intoxication terms to characters in experimentally manipulated vignettes of naturalistic drinking situations within a sample of university undergraduates (n = 145). Findings supported and extended previous research by showing that other-directed applications of intoxication terms are similar to self-directed applications and depend on the gender of both the target and the user. Specifically, moderate intoxication terms were applied to and from women more than men, even when the character was heavily intoxicated, whereas heavy intoxication terms were applied to and from men more than women. The findings suggest that gender differences in the application of intoxication terms are other-directed as well as self-directed and that intoxication language can inform gender-specific prevention and intervention efforts targeting problematic alcohol use among college students. Copyright © 2013 by the Research Society on Alcoholism.
Verhoeven, Ludo; Steenge, Judit; van Leeuwe, Jan; van Balkom, Hans
In this study, we investigated which componential skills can be distinguished in the second language (L2) development of 140 bilingual children with specific language impairment in the Netherlands, aged 6-11 years, divided into 3 age groups. L2 development was assessed by means of spoken language tasks representing different language skills…
This article reports on a survey with 170 school-age children growing up with two or more languages in the Canadian province of Ontario where English is the majority language, French is a minority language, and numerous other minority languages may be spoken by immigrant or Indigenous residents. Within this context the study focuses on minority…
Payne, Philip R O; Kwok, Alan; Dhaval, Rakesh; Borlawsky, Tara B
The conduct of large-scale translational studies presents significant challenges related to the storage, management and analysis of integrative data sets. Ideally, the application of methodologies such as conceptual knowledge discovery in databases (CKDD) provides a means for moving beyond intuitive hypothesis discovery and testing in such data sets, and towards the high-throughput generation and evaluation of knowledge-anchored relationships between complex bio-molecular and phenotypic variables. However, the induction of such high-throughput hypotheses is non-trivial, and requires correspondingly high-throughput validation methodologies. In this manuscript, we describe an evaluation of the efficacy of a natural language processing-based approach to validating such hypotheses. As part of this evaluation, we will examine a phenomenon that we have labeled as "Conceptual Dissonance" in which conceptual knowledge derived from two or more sources of comparable scope and granularity cannot be readily integrated or compared using conventional methods and automated tools.
Juan Andres Laura
In recent studies, recurrent neural networks were used for generative processes, and their surprising performance can be explained by their ability to make good predictions. Data compression is likewise based on prediction. The problem thus comes down to whether a data compressor could perform as well as recurrent neural networks on the natural language processing tasks of sentiment analysis and automatic text generation; if so, the question becomes whether a compression algorithm is even more intelligent than a neural network at such tasks. In our journey, a fundamental difference between a data compression algorithm and recurrent neural networks has been discovered.
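The premise that compression and prediction are two sides of the same coin can be illustrated with a crude compression-based sentiment classifier: label a text by which class corpus it compresses best against, via a normalized compression distance. The tiny corpora below are invented; this sketches the general idea, not the authors' method.

```python
import zlib

# Invented single-string "corpora" for each sentiment class.
pos = "great movie wonderful acting loved it superb brilliant enjoyable"
neg = "terrible movie awful acting hated it dreadful boring painful"

def clen(s: str) -> int:
    """Compressed length in bytes, a proxy for information content."""
    return len(zlib.compress(s.encode()))

def ncd(a: str, b: str) -> float:
    """Normalized compression distance: small when a and b share structure."""
    ca, cb, cab = clen(a), clen(b), clen(a + " " + b)
    return (cab - min(ca, cb)) / max(ca, cb)

def classify(text: str) -> str:
    # The class whose corpus "absorbs" the text most cheaply wins.
    return "pos" if ncd(text, pos) < ncd(text, neg) else "neg"

pred_pos = classify("loved it superb brilliant")
pred_neg = classify("hated it dreadful boring")
```

With corpora this small the zlib estimate is noisy; the approach only becomes competitive with larger class corpora and stronger compressors.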
In this paper, I investigate the problem of finding the most similar music tracks using techniques popular in natural language processing, such as TF-IDF and LDA. I define a document as a music track. Each track is transformed into a spectrogram, so that well-known image techniques can be used to extract words. I used the SURF operator to detect characteristic points and a novel approach for their description. Standard k-means was used for clustering; clustering here is identical to dictionary building, after which spectrograms can be transformed into text documents and TF-IDF and LDA performed. Finally, a query can be made in the obtained vector space. The research was done on 16 music tracks for training and 336 for testing, split into four categories: Hiphop, Jazz, Metal and Pop. Although the technique used is completely unsupervised, the results are satisfactory and encourage further research.
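The retrieval step can be sketched as follows: once each track has been turned into a "document" of discretized audio words (k-means cluster IDs of SURF descriptors), similarity search is plain TF-IDF plus cosine similarity. The word sequences below are invented stand-ins for real cluster assignments.

```python
import math
from collections import Counter

# Each track is a document of invented cluster-ID "words".
tracks = {
    "jazz_1":  "w3 w7 w7 w9 w3 w3 w9",
    "jazz_2":  "w3 w9 w7 w3 w9 w9",
    "metal_1": "w1 w2 w2 w5 w1 w2",
    "pop_1":   "w4 w6 w4 w8 w6",
}

def tfidf(docs):
    """Map each document to a sparse {word: tf * idf} vector."""
    n = len(docs)
    df = Counter(w for words in docs.values() for w in set(words.split()))
    vecs = {}
    for name, words in docs.items():
        tf = Counter(words.split())
        vecs[name] = {w: c * math.log(n / df[w]) for w, c in tf.items()}
    return vecs

def cosine(u, v):
    dot = sum(u[w] * v[w] for w in u if w in v)
    nu = math.sqrt(sum(x * x for x in u.values()))
    nv = math.sqrt(sum(x * x for x in v.values()))
    return dot / (nu * nv) if nu and nv else 0.0

vecs = tfidf(tracks)
query = "jazz_1"
ranked = sorted((cosine(vecs[query], vecs[t]), t)
                for t in tracks if t != query)
best = ranked[-1][1]   # most similar track to the query
```

As intended, the other jazz track ranks first, since it shares the query's audio-word distribution while the metal and pop documents share no words with it.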
In recent decades, Natural Language Processing (NLP) has obtained a high level of success. Interactions between NLP and Serious Games have begun, and some Serious Games already include NLP techniques. The objectives of this paper are twofold: on the one hand, providing a simple framework to enable analysis of potential uses of NLP in Serious Games and, on the other hand, applying the NLP framework to existing Serious Games and giving an overview of the use of NLP in pedagogical Serious Games. In this paper we present 11 serious games exploiting NLP techniques. We present them systematically, according to the following structure: first, we highlight possible uses of NLP techniques in Serious Games; second, we describe the type of NLP implemented in each specific Serious Game; and third, we provide a link to possible purposes of use for the different actors interacting in the Serious Game.
Bosco, Cristina; Delmonte, Rodolfo; Moschitti, Alessandro; Simi, Maria
The papers collected in this volume are a selected sample of the progress in Natural Language Processing (NLP) achieved within the Italian NLP community, and especially attested by the PARLI project. PARLI (Portale per l'Accesso alle Risorse in Lingua Italiana) was a project partially funded by the Ministero Italiano per l'Università e la Ricerca (PRIN 2008) from 2008 to 2012 to monitor and foster the harmonious, coordinated growth of Italian NLP activities. It was proposed by various teams of researchers working in Italian universities and research institutions. In the spirit of the PARLI project, most of the resources and tools created within the project and described here are freely distributed; they did not end their life at the close of the project itself, in the hope that they will be a key factor in the future development of computational linguistics.
Pai, Vinay M; Rodgers, Mary; Conroy, Richard; Luo, James; Zhou, Ruixia; Seto, Belinda
In April 2012, the National Institutes of Health organized a two-day workshop entitled 'Natural Language Processing: State of the Art, Future Directions and Applications for Enhancing Clinical Decision-Making' (NLP-CDS). This report is a summary of the discussions during the second day of the workshop. Collectively, the workshop presenters and participants emphasized the need for unstructured clinical notes to be included in the decision making workflow and the need for individualized longitudinal data tracking. The workshop also discussed the need to: (1) combine evidence-based literature and patient records with machine-learning and prediction models; (2) provide trusted and reproducible clinical advice; (3) prioritize evidence and test results; and (4) engage healthcare professionals, caregivers, and patients. The overall consensus of the NLP-CDS workshop was that there are promising opportunities for NLP and CDS to deliver cognitive support for healthcare professionals, caregivers, and patients.
Redman, Joseph S; Natarajan, Yamini; Hou, Jason K; Wang, Jingqi; Hanif, Muzammil; Feng, Hua; Kramer, Jennifer R; Desiderio, Roxanne; Xu, Hua; El-Serag, Hashem B; Kanwal, Fasiha
Natural language processing is a powerful machine-learning technique capable of maximizing data extraction from complex electronic medical records. We utilized this technique to develop algorithms capable of "reading" full-text radiology reports to accurately identify the presence of fatty liver disease. Abdominal ultrasound, computerized tomography, and magnetic resonance imaging reports were retrieved from the Veterans Affairs Corporate Data Warehouse for a random national sample of 652 patients. Radiographic fatty liver disease was determined by manual review by two physicians and verified with an expert radiologist. A split-validation method was utilized for algorithm development. For all three imaging modalities, the algorithms could identify fatty liver disease with >90% recall and precision, with F-measures >90%. These algorithms could be used to rapidly screen patient records to establish a large cohort to facilitate epidemiological and clinical studies and examine the clinical course and outcomes of patients with radiographic hepatic steatosis.
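A minimal sketch of the kind of rule such an algorithm might encode (this is not the authors' actual algorithm, and the patterns are illustrative only) is a match for fatty-liver mentions combined with a simple sentence-level negation check:

```python
import re

# Hypothetical patterns: positive mentions of fatty liver disease, and a
# crude negation cue that must precede the finding within the sentence.
POSITIVE = re.compile(
    r"\b(hepatic steatosis|fatty (infiltration of the )?liver)\b", re.I)
NEGATION = re.compile(
    r"\b(no|without|negative for)\b[^.]*\b(steatosis|fatty)\b", re.I)

def has_fatty_liver(report: str) -> bool:
    """Flag a report positive if any sentence mentions fatty liver
    without a preceding negation cue in the same sentence."""
    for sentence in report.split("."):
        if POSITIVE.search(sentence) and not NEGATION.search(sentence):
            return True
    return False
```

Real clinical NLP systems handle far more: abbreviations, hedging ("cannot exclude steatosis"), section context, and report-specific templates, which is why the paper's algorithms were developed and validated against physician review.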
Li, Muqun; Carrell, David; Aberdeen, John; Hirschman, Lynette; Kirby, Jacqueline; Li, Bo; Vorobeychik, Yevgeniy; Malin, Bradley A
Electronic medical records (EMRs) are increasingly repurposed for activities beyond clinical care, such as to support translational research and public policy analysis. To mitigate privacy risks, healthcare organizations (HCOs) aim to remove potentially identifying patient information. A substantial quantity of EMR data is in natural language form and there are concerns that automated tools for detecting identifiers are imperfect and leak information that can be exploited by ill-intentioned data recipients. Thus, HCOs have been encouraged to invest as much effort as possible to find and detect potential identifiers, but such a strategy assumes the recipients are sufficiently incentivized and capable of exploiting leaked identifiers. In practice, such an assumption may not hold true and HCOs may overinvest in de-identification technology. The goal of this study is to design a natural language de-identification framework, rooted in game theory, which enables an HCO to optimize their investments given the expected capabilities of an adversarial recipient. We introduce a Stackelberg game to balance risk and utility in natural language de-identification. This game represents a cost-benefit model that enables an HCO with a fixed budget to minimize their investment in the de-identification process. We evaluate this model by assessing the overall payoff to the HCO and the adversary using 2100 clinical notes from Vanderbilt University Medical Center. We simulate several policy alternatives using a range of parameters, including the cost of training a de-identification model and the loss in data utility due to the removal of terms that are not identifiers. In addition, we compare policy options where, when an attacker is fined for misuse, a monetary penalty is paid to the publishing HCO as opposed to a third party (e.g., a federal regulator). Our results show that when an HCO is forced to exhaust a limited budget (set to $2000 in the study), the precision and recall of the
Goldstein, Ayelet; Shahar, Yuval
Physicians are required to interpret, abstract and present in free-text large amounts of clinical data in their daily tasks. This is especially true for chronic-disease domains, but holds also in other clinical domains. We have recently developed a prototype system, CliniText, which, given a time-oriented clinical database, and appropriate formal abstraction and summarization knowledge, combines the computational mechanisms of knowledge-based temporal data abstraction, textual summarization, abduction, and natural-language generation techniques, to generate an intelligent textual summary of longitudinal clinical data. We demonstrate our methodology, and the feasibility of providing a free-text summary of longitudinal electronic patient records, by generating summaries in two very different domains - Diabetes Management and Cardiothoracic surgery. In particular, we explain the process of generating a discharge summary of a patient who had undergone a Coronary Artery Bypass Graft operation, and a brief summary of the treatment of a diabetes patient for five years.
El Saadawi, Gilan M.; Tseytlin, Eugene; Legowski, Elizabeth; Jukic, Drazen; Castine, Melissa; Fine, Jeffrey; Gormley, Robert; Crowley, Rebecca S.
Introduction: We developed and evaluated a Natural Language Interface (NLI) for an Intelligent Tutoring System (ITS) in diagnostic pathology. The system teaches residents to examine pathology slides and write accurate pathology reports while providing immediate feedback on errors they make in their slide review and diagnostic reports. Residents can ask for help at any point in the case and will receive context-specific feedback. Research questions: We evaluated (1) the performance of our natural language system, (2) the effect of the system on learning, (3) the effect of feedback timing on learning gains, and (4) the effect of ReportTutor on performance-to-self-assessment correlations. Methods: The study uses a crossover 2×2 factorial design. We recruited 20 subjects from 4 academic programs. Subjects were randomly assigned to one of the four conditions: two conditions for the immediate interface, and two for the delayed interface. An expert dermatopathologist created a reference standard, and 2 board-certified AP/CP pathology fellows manually coded the residents' assessment reports. Subjects were given the opportunity to self-grade their performance, and we used a survey to determine student response to both interfaces. Results: Our results show a highly significant improvement in report writing after one tutoring session, with a 4-fold increase in learning gains with both interfaces but no effect of feedback timing on performance gains. Residents who used the immediate feedback interface first experienced a feature learning gain that is correlated with the number of cases they viewed. There was no correlation between performance and self-assessment in either condition. PMID:17934789
The complex environment of the typical research laboratory requires flexible process control. This program provides natural language process control from an IBM PC or compatible machine. Sometimes process control schedules require changes frequently, even several times per day. These changes may include adding, deleting, and rearranging steps in a process. This program sets up a process control system that can either run without an operator, or be run by workers with limited programming skills. The software system includes three programs. Two of the programs, written in FORTRAN77, record data and control research processes. The third program, written in Pascal, generates the FORTRAN subroutines used by the other two programs to identify the user commands with the user-written device drivers. The software system also includes an input data set which allows the user to define the user commands which are to be executed by the computer. To set the system up the operator writes device driver routines for all of the controlled devices. Once set up, this system requires only an input file containing natural language command lines which tell the system what to do and when to do it. The operator can make up custom commands for operating and taking data from external research equipment at any time of the day or night without the operator in attendance. This process control system requires a personal computer operating under MS-DOS with suitable hardware interfaces to all controlled devices. The program requires a FORTRAN77 compiler and user-written device drivers. This program was developed in 1989 and has a memory requirement of about 62 Kbytes.
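The command-dispatch idea can be sketched in a few lines (in Python rather than the original FORTRAN 77/Pascal; all command phrases and driver routines below are hypothetical): each user-defined natural-language phrase is bound to a user-written device-driver routine, and the input script is matched line by line.

```python
# Record of driver actions, standing in for real instrument I/O.
log = []

def set_temperature(value):            # stand-in device driver
    log.append(f"temperature -> {value}")

def read_sensor(name):                 # stand-in device driver
    log.append(f"read {name}")

# User-defined command table: natural-language phrase -> driver routine.
commands = {
    "set temperature to": lambda rest: set_temperature(float(rest)),
    "read sensor":        lambda rest: read_sensor(rest.strip()),
}

# The input data set: natural-language command lines, one per step.
script = """\
set temperature to 37.5
read sensor pressure_gauge
"""

for line in script.splitlines():
    line = line.strip()
    for phrase, action in commands.items():
        if line.startswith(phrase):
            # Hand the remainder of the line to the bound driver.
            action(line[len(phrase):])
            break
```

In the actual system this binding is generated as FORTRAN subroutines from the user's command definitions, and the scheduler also handles timing so experiments can run unattended.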
Baneyx, Audrey; Charlet, Jean; Jaulent, Marie-Christine
Pathologies and acts are classified in thesauri to help physicians code their activity. In practice, the use of thesauri is not sufficient to reduce variability in coding, and thesauri are not suitable for computer processing. We think the automation of the coding task requires a conceptual modeling of medical items: an ontology. Our task is to help lung specialists code acts and diagnoses with software that represents the medical knowledge of the specialty concerned as an ontology. The objective of the reported work was to build an ontology of pulmonary diseases dedicated to the coding process. To carry out this objective, we developed a precise methodological process for the knowledge engineer for building various types of medical ontologies. This process is based on the need to express precisely in natural language the meaning of each concept using differential semantics principles. A differential ontology is a hierarchy of concepts and relationships organized according to their similarities and differences. Our main research hypothesis is to apply natural language processing tools to corpora to develop the resources needed to build the ontology. We consider two corpora, one composed of patient discharge summaries and the other a teaching book. We propose to combine two approaches to enrich the ontology building: (i) a method which consists of building terminological resources through distributional analysis and (ii) a method based on the observation of corpus sequences in order to reveal semantic relationships. Our ontology currently includes 1550 concepts, and the software implementing the coding process is still under development. Results show that the proposed approach is operational and indicate that the combination of these methods and the comparison of the resulting terminological structures give interesting clues to a knowledge engineer for the building of an ontology.