WorldWideScience

Sample records for preparing speech texts

  1. Predicting Prosody from Text for Text-to-Speech Synthesis

    CERN Document Server

    Rao, K Sreenivasa

    2012-01-01

    Predicting Prosody from Text for Text-to-Speech Synthesis covers the specific aspects of prosody, mainly focusing on how to predict the prosodic information from linguistic text, and then how to exploit the predicted prosodic knowledge for various speech applications. Author K. Sreenivasa Rao discusses proposed methods along with state-of-the-art techniques for the acquisition and incorporation of prosodic knowledge for developing speech systems. Positional, contextual and phonological features are proposed for representing the linguistic and production constraints of the sound units present in the text. This book is intended for graduate students and researchers working in the area of speech processing.
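The positional, contextual and phonological features mentioned above can be illustrated with a minimal sketch. The feature names and layout below are hypothetical examples of this style of representation, not the actual feature set defined in the book.

```python
def prosody_features(phones, i, word_index, num_words):
    """Build a positional/contextual feature dict for phones[i].

    Positional features locate the sound unit within its word and
    the word within the sentence; contextual features record the
    neighbouring units ('#' marks a word boundary). All names here
    are illustrative.
    """
    return {
        "pos_in_word": i,                            # positional: index from word start
        "pos_from_end": len(phones) - 1 - i,         # positional: index from word end
        "word_pos_in_sentence": word_index / max(1, num_words - 1),
        "prev_phone": phones[i - 1] if i > 0 else "#",                  # contextual
        "next_phone": phones[i + 1] if i < len(phones) - 1 else "#",    # contextual
    }

feats = prosody_features(["h", "e", "l", "o"], 1, 0, 3)
```

A vector of such features per sound unit is the typical input to a model (e.g. a neural network) that predicts prosodic targets such as duration or F0.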

  2. Speech Act Classification of German Advertising Texts

    Directory of Open Access Journals (Sweden)

    Артур Нарманович Мамедов

    2015-12-01

    This paper uses the theory of speech acts and the underlying concepts of pragmalinguistics to determine the types of speech acts and their classification in printed German advertising texts. We ascertain that the advertising of cars and accessories, household appliances and computer equipment, watches, fancy goods, food, pharmaceuticals, and financial, insurance and legal services, as well as airline advertising, is dominated by a pragmatic principle based on demonstrating information about the benefits of a product or service. This influences the frequent usage of certain speech acts. The dominant form of exposure is to inform the recipient-user about the characteristics of the advertised product. This information is foregrounded by means of stylistic and syntactic constructions specific to advertisements (participial and appositional constructions), which serve to emphasize certain notional components within the framework of the advertising text. Stylistic and syntactic devices of reduction (parceling constructions) convey the author's idea. Other means, such as repetitions and enumerations, are used by the advertiser to strengthen his selling power. The advertiser focuses the attention of the consumer on the characteristics of the product, seeking to convince him of its utility and to influence his or her buying behavior.

  3. Texting while driving: is speech-based text entry less risky than handheld text entry?

    Science.gov (United States)

    He, J; Chaparro, A; Nguyen, B; Burge, R J; Crandall, J; Chaparro, B; Ni, R; Cao, S

    2014-11-01

    Research indicates that using a cell phone to talk or text while maneuvering a vehicle impairs driving performance. However, few published studies directly compare the distracting effects of texting using a hands-free (i.e., speech-based interface) versus handheld cell phone, which is an important issue for legislation, automotive interface design and driving safety training. This study compared the effect of speech-based versus handheld text entries on simulated driving performance by asking participants to perform a car following task while controlling the duration of a secondary text-entry task. Results showed that both speech-based and handheld text entries impaired driving performance relative to the drive-only condition by causing more variation in speed and lane position. Handheld text entry also increased the brake response time and increased variation in headway distance. Text entry using a speech-based cell phone was less detrimental to driving performance than handheld text entry. Nevertheless, the speech-based text entry task still significantly impaired driving compared to the drive-only condition. These results suggest that speech-based text entry disrupts driving, but reduces the level of performance interference compared to text entry with a handheld device. In addition, the difference in the distraction effect caused by speech-based and handheld text entry is not simply due to the difference in task duration. Copyright © 2014 Elsevier Ltd. All rights reserved.

  4. Speech to Text Translation for Malay Language

    Science.gov (United States)

    Al-khulaidi, Rami Ali; Akmeliawati, Rini

    2017-11-01

    A speech recognition system is a front-end and back-end process that receives an audio signal uttered by a speaker and converts it into a text transcription. Speech systems can be used in several fields, including therapeutic technology, education, social robotics and computer entertainment. Our system is proposed for control tasks, in which the speed of performance and response matters because the system should integrate with other control platforms, such as voice-controlled robots. This creates a need for flexible platforms that can easily be edited to match the functionality of their surroundings, unlike software such as MATLAB and Phoenix that requires recording audio and multiple training passes for every entry. In this paper, a speech recognition system for the Malay language is implemented using Microsoft Visual Studio C#. Ninety (90) Malay phrases were tested by ten (10) speakers of both genders in different contexts. The results show that the overall accuracy (calculated from a confusion matrix) is a satisfactory 92.69%.
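The overall accuracy reported above is conventionally computed from a confusion matrix as the sum of the diagonal (correct recognitions) over the total number of trials. A minimal sketch, with made-up counts for a three-phrase recognizer:

```python
def overall_accuracy(confusion):
    """Overall accuracy from a square confusion matrix:
    trace (correctly recognized items) divided by total items."""
    correct = sum(confusion[i][i] for i in range(len(confusion)))
    total = sum(sum(row) for row in confusion)
    return correct / total

# Hypothetical counts: rows = spoken phrase, columns = recognized phrase.
cm = [[45, 3, 2],
      [4, 44, 2],
      [1, 2, 47]]
acc = overall_accuracy(cm)  # 136/150, about 0.907
```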

  5. Phonetic recalibration of speech by text

    NARCIS (Netherlands)

    Keetels, M.N.; Schakel, L.; de Bonte, M.; Vroomen, J.

    2016-01-01

    Listeners adjust their phonetic categories to cope with variations in the speech signal (phonetic recalibration). Previous studies have shown that lipread speech (and word knowledge) can adjust the perception of ambiguous speech and can induce phonetic adjustments (Bertelson, Vroomen, & de Gelder in

  6. Speech to Text Software Evaluation Report

    CERN Document Server

    Martins Santo, Ana Luisa

    2017-01-01

    This document compares the out-of-box performance of three commercially available speech recognition software packages: Vocapia VoxSigma™, Google Cloud Speech, and Limecraft Transcriber. A set of evaluation criteria and test methods for speech recognition software is defined. The evaluation of the packages in noisy environments is also included for testing purposes. Recognition accuracy was compared across noise conditions and languages. Testing in an "ideal" non-noisy environment, a quiet room, was also performed for comparison.

  7. Building a Prototype Text to Speech for Sanskrit

    Science.gov (United States)

    Mahananda, Baiju; Raju, C. M. S.; Patil, Ramalinga Reddy; Jha, Narayana; Varakhedi, Shrinivasa; Kishore, Prahallad

    This paper describes the work done in building a prototype text-to-speech system for Sanskrit. A basic prototype text-to-speech system is built using a simplified Sanskrit phone set and employing a unit selection technique, in which prerecorded sub-word units are concatenated to synthesize a sentence. We also discuss the issues involved in building a full-fledged text-to-speech system for Sanskrit.
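The concatenation step of unit selection can be sketched in a few lines. The inventory below uses short lists of numbers as stand-ins for recorded waveforms, and the unit names are hypothetical; a real system would also select among candidate units and smooth the joins.

```python
# Hypothetical inventory mapping each sub-word unit to a recorded
# waveform (short sample lists standing in for audio).
inventory = {
    "ka": [0.1, 0.2, 0.1],
    "ma": [0.0, 0.3, 0.2],
    "la": [0.2, 0.1, 0.0],
}

def synthesize(units, inventory):
    """Concatenate prerecorded sub-word units in order (no join smoothing)."""
    wave = []
    for u in units:
        wave.extend(inventory[u])
    return wave

audio = synthesize(["ka", "ma", "la"], inventory)
```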

  8. Perception of synthetic speech produced automatically by rule: Intelligibility of eight text-to-speech systems.

    Science.gov (United States)

    Greene, Beth G; Logan, John S; Pisoni, David B

    1986-03-01

    We present the results of studies designed to measure the segmental intelligibility of eight text-to-speech systems and a natural speech control, using the Modified Rhyme Test (MRT). Results indicated that the voices tested could be grouped into four categories: natural speech, high-quality synthetic speech, moderate-quality synthetic speech, and low-quality synthetic speech. The overall performance of the best synthesis system, DECtalk-Paul, was equivalent to natural speech only in terms of performance on initial consonants. The findings are discussed in terms of recent work investigating the perception of synthetic speech under more severe conditions. Suggestions for future research on improving the quality of synthetic speech are also considered.

  9. Perception of synthetic speech produced automatically by rule: Intelligibility of eight text-to-speech systems

    Science.gov (United States)

    Greene, Beth G.; Logan, John S.; Pisoni, David B.

    2012-01-01

    We present the results of studies designed to measure the segmental intelligibility of eight text-to-speech systems and a natural speech control, using the Modified Rhyme Test (MRT). Results indicated that the voices tested could be grouped into four categories: natural speech, high-quality synthetic speech, moderate-quality synthetic speech, and low-quality synthetic speech. The overall performance of the best synthesis system, DECtalk-Paul, was equivalent to natural speech only in terms of performance on initial consonants. The findings are discussed in terms of recent work investigating the perception of synthetic speech under more severe conditions. Suggestions for future research on improving the quality of synthetic speech are also considered. PMID:23225916

  10. Speect: a multilingual text-to-speech system

    CSIR Research Space (South Africa)

    Louw, JA

    2008-11-01

    This paper introduces a new multilingual text-to-speech system, which we call Speect (Speech synthesis with extensible architecture), aiming to address the shortcomings of using Festival as a research system and Flite as a deployment system in a...

  11. Part-of-speech effects on text-to-speech synthesis

    CSIR Research Space (South Africa)

    Schlunz, GI

    2010-11-01

    One of the goals of text-to-speech (TTS) systems is to produce natural-sounding synthesised speech. Towards this end various natural language processing (NLP) tasks are performed to model the prosodic aspects of the TTS voice. One of the fundamental...

  12. Indonesian Text-To-Speech System Using Diphone Concatenative Synthesis

    Directory of Open Access Journals (Sweden)

    Sutarman

    2015-02-01

    In this paper, we describe the design and development of an Indonesian diphone synthesis database that uses speech segments of recorded voice to convert text to speech and save it as an audio file such as WAV or MP3. Designing and developing the Indonesian diphone database involves several steps. First, the diphone database is developed: a list of sample words is created, consisting of diphones organized by prioritizing diphones located in the middle of a word, then at the beginning or end; the sample words are recorded and segmented; and the diphones are created with the tool Diphone Studio 1.3. Second, the system is developed using Microsoft Visual Delphi 6.0, including conversion from input numbers, acronyms, words, and sentences into diphone representations. Two conversion processes are involved in the Indonesian text-to-speech system: one converts the text to be sounded into phonemes, and the other converts the phonemes into speech. The method used in this research is diphone concatenative synthesis, in which recorded sound segments are collected; every segment consists of a diphone (two phonemes). This synthesizer can produce voice with a high level of naturalness. The Indonesian text-to-speech system can differentiate special phonemes, as in 'Beda' and 'Bedak', but samples of other specific words must be put into the system. The Indonesian TTS system can handle texts with abbreviations, and there is a facility to add such words.
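The decomposition of a phoneme sequence into diphones can be sketched as follows. This is a simplified illustration (it treats each letter as one phoneme and uses '_' for boundary silence), not the actual segmentation used in the paper.

```python
def to_diphones(phonemes):
    """Split a phoneme sequence into overlapping diphones (2 phonemes
    each), padding with '_' for silence at the word boundaries."""
    seq = ["_"] + list(phonemes) + ["_"]
    return [seq[i] + seq[i + 1] for i in range(len(seq) - 1)]

units = to_diphones("beda")  # ['_b', 'be', 'ed', 'da', 'a_']
```

Because each diphone spans the transition between two phonemes, concatenating them joins units at the acoustically stable phoneme centres, which is what gives diphone synthesis its relatively natural sound.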

  13. Rule-Based Storytelling Text-to-Speech (TTS) Synthesis

    Directory of Open Access Journals (Sweden)

    Ramli Izzad

    2016-01-01

    In recent years, various real-life applications such as talking books, gadgets and humanoid robots have drawn attention to research in the area of expressive speech synthesis. Speech synthesis is widely used in various applications, but there is a growing need for expressive speech synthesis, especially for communication and robotics. In this paper, global and local rules are developed to convert neutral speech to storytelling-style speech for the Malay language. To generate the rules, modifications of prosodic parameters such as pitch, intensity, duration, tempo and pauses are considered. The modification of prosodic parameters is examined by performing prosodic analysis on a story collected from experienced female and male storytellers. The global and local rules are applied at the sentence level and synthesized using HNM. Subjective tests are conducted to evaluate the quality of the synthesized storytelling speech under both rule sets, based on naturalness, intelligibility, and similarity to the original storytelling speech. The results showed that the global rules give better results than the local rules.

  14. Indian accent text-to-speech system for web browsing

    Indian Academy of Sciences (India)

    This paper describes a 'web reader' which 'reads out' the textual contents of a selected web page in Hindi or in English with Indian accent. The content of the page is downloaded and parsed into suitable textual form. It is then passed on to an indigenously developed text-to-speech system for Hindi/Indian English, ...

  15. Speech-Language Pathology: Preparing Early Interventionists

    Science.gov (United States)

    Prelock, Patricia A.; Deppe, Janet

    2015-01-01

    The purpose of this article is to explain the role of speech-language pathology in early intervention. The expected credentials of professionals in the field are described, and the current numbers of practitioners serving young children are identified. Several resource documents available from the American Speech-Language-Hearing Association are…

  16. The role of speech prosody and text reading prosody in children's reading comprehension.

    Science.gov (United States)

    Veenendaal, Nathalie J; Groen, Margriet A; Verhoeven, Ludo

    2014-12-01

    Text reading prosody has been associated with reading comprehension. However, text reading prosody is a reading-dependent measure that relies heavily on decoding skills. Investigation of the contribution of speech prosody - which is independent from reading skills - in addition to text reading prosody, to reading comprehension could provide more insight into the general role of prosody in reading comprehension. The current study investigates how much variance in reading comprehension scores is explained by speech prosody and text reading prosody, after controlling for decoding, vocabulary, and syntactic awareness. A battery of reading and language assessments was performed by 106 Dutch fourth-grade primary school children. Speech prosody was assessed using a storytelling task and text reading prosody by oral text reading performance. Decoding skills, vocabulary, syntactic awareness, and reading comprehension were assessed using standardized tests. Hierarchical regression analyses showed that text reading prosody explained 6% of variance and that speech prosody explained 8% of variance in reading comprehension scores, after controlling for decoding, vocabulary, and syntactic awareness. Phrasing was the significant factor in both speech and text reading. When added in consecutive order, phrasing in speech added 5% variance to phrasing in reading. In contrast, phrasing in reading added only 3% variance to phrasing in speech. The variance that speech prosody explained in reading comprehension scores should not be neglected. Speech prosody seems to facilitate the construction of meaning in written language. © 2014 The British Psychological Society.
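The hierarchical regression used in this study, where predictors are entered in blocks and the increment in explained variance is attributed to the newly added block, can be sketched with simulated data. All variable names and the generated data below are illustrative, not the study's actual dataset.

```python
import numpy as np

def r_squared(X, y):
    """R-squared of an ordinary least-squares fit with an intercept."""
    X1 = np.column_stack([np.ones(len(y)), X])
    beta, *_ = np.linalg.lstsq(X1, y, rcond=None)
    resid = y - X1 @ beta
    return 1.0 - (resid @ resid) / ((y - y.mean()) @ (y - y.mean()))

rng = np.random.default_rng(0)
n = 106                                   # matches the sample size above
control = rng.normal(size=(n, 3))         # decoding, vocabulary, syntactic awareness
prosody = rng.normal(size=n)              # stand-in prosody score
y = control @ [0.5, 0.3, 0.2] + 0.4 * prosody + rng.normal(size=n)

base = r_squared(control, y)                                  # control block only
full = r_squared(np.column_stack([control, prosody]), y)      # control + prosody
added_variance = full - base   # unique variance attributed to prosody
```

The quantity `added_variance` corresponds to the 6-8% increments reported in the abstract.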

  17. Automatic speech recognition used for evaluation of text-to-speech systems

    Czech Academy of Sciences Publication Activity Database

    Vích, Robert; Nouza, J.; Vondra, Martin

    -, č. 5042 (2008), s. 136-148 ISSN 0302-9743 R&D Projects: GA AV ČR 1ET301710509; GA AV ČR 1QS108040569 Institutional research plan: CEZ:AV0Z20670512 Keywords : speech recognition * speech processing Subject RIV: JA - Electronics ; Optoelectronics, Electrical Engineering

  18. Top-down influences of written text on perceived clarity of degraded speech.

    Science.gov (United States)

    Sohoglu, Ediz; Peelle, Jonathan E; Carlyon, Robert P; Davis, Matthew H

    2014-02-01

    An unresolved question is how the reported clarity of degraded speech is enhanced when listeners have prior knowledge of speech content. One account of this phenomenon proposes top-down modulation of early acoustic processing by higher-level linguistic knowledge. Alternative, strictly bottom-up accounts argue that acoustic information and higher-level knowledge are combined at a late decision stage without modulating early acoustic processing. Here we tested top-down and bottom-up accounts using written text to manipulate listeners' knowledge of speech content. The effect of written text on the reported clarity of noise-vocoded speech was most pronounced when text was presented before (rather than after) speech (Experiment 1). Fine-grained manipulation of the onset asynchrony between text and speech revealed that this effect declined when text was presented more than 120 ms after speech onset (Experiment 2). Finally, the influence of written text was found to arise from phonological (rather than lexical) correspondence between text and speech (Experiment 3). These results suggest that prior knowledge effects are time-limited by the duration of auditory echoic memory for degraded speech, consistent with top-down modulation of early acoustic processing by linguistic knowledge. PsycINFO Database Record (c) 2014 APA, all rights reserved.

  19. Map Your Way to Speech Success! Employing Mind Mapping as a Speech Preparation Technique

    Science.gov (United States)

    Paxman, Christina G.

    2011-01-01

    Mind mapping has gained considerable credibility recently in corporations such as Boeing and Nabisco, as well as in the classroom in terms of preparing for examinations and preparing for speeches. A mind map is a graphic technique for organizing an individual's thoughts and other information. It harnesses the full range of cortical skills--word,…

  20. The first Malay language storytelling text-to-speech (TTS) corpus for ...

    African Journals Online (AJOL)

    speech annotations are described in detail in accordance with baseline work. The stories were recorded in two speaking styles: neutral and storytelling speaking style. The first Malay language storytelling corpus is not only necessary for the development of a storytelling text-to-speech (TTS) synthesis. It is also ...

  1. Speech-To-Text Conversion (STT) System Using Hidden Markov Model (HMM)

    Directory of Open Access Journals (Sweden)

    Su Myat Mon

    2015-06-01

    Speech is the easiest way to communicate with each other. Speech processing is widely used in many applications, such as security devices, household appliances, cellular phones, ATM machines and computers. Human-computer interfaces have been developed so that people with disabilities can communicate or interact conveniently. Speech-to-Text Conversion (STT) systems have many benefits for deaf or mute people and find applications in our daily lives. In the same way, the aim of this system is to convert input speech signals into text output for deaf or mute students in educational settings. This paper presents an approach that extracts features from the speech signals of isolated spoken words using Mel Frequency Cepstral Coefficients (MFCC), and applies the Hidden Markov Model (HMM) method to train and test the audio files and recognize the spoken word. The speech database is created using MATLAB. The original speech signals are preprocessed, and from these speech samples feature vectors are extracted that serve as the observation sequences of the HMM recognizer. The feature vectors are analyzed in the HMM depending on the number of states.
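The core of HMM-based isolated-word recognition is scoring an observation sequence under each word's model with the forward algorithm; the word whose model gives the highest likelihood wins. A minimal sketch for a discrete-observation HMM (the paper's system uses continuous MFCC vectors, so the toy parameters below are purely illustrative):

```python
def forward_likelihood(obs, pi, A, B):
    """P(obs | model) via the forward algorithm for a discrete HMM.
    pi: initial state probs, A: transition matrix, B: per-state
    emission probabilities over the discrete observation symbols."""
    n = len(pi)
    alpha = [pi[s] * B[s][obs[0]] for s in range(n)]       # initialization
    for t in range(1, len(obs)):                           # induction
        alpha = [
            B[s][obs[t]] * sum(alpha[r] * A[r][s] for r in range(n))
            for s in range(n)
        ]
    return sum(alpha)                                      # termination

# Toy 2-state left-to-right model with 2 observation symbols.
pi = [1.0, 0.0]
A = [[0.7, 0.3],
     [0.0, 1.0]]
B = [[0.9, 0.1],
     [0.2, 0.8]]
likelihood = forward_likelihood([0, 1], pi, A, B)  # 0.279
```

In practice the recursion is run in the log domain to avoid underflow on long observation sequences.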

  2. Text as a Supplement to Speech in Young and Older Adults.

    Science.gov (United States)

    Krull, Vidya; Humes, Larry E

    2016-01-01

    The purpose of this experiment was to quantify the contribution of visual text to auditory speech recognition in background noise. Specifically, the authors tested the hypothesis that partially accurate visual text from an automatic speech recognizer could be used successfully to supplement speech understanding in difficult listening conditions in older adults, with normal or impaired hearing. The working hypotheses were based on what is known regarding audiovisual speech perception in the elderly from speechreading literature. We hypothesized that (1) combining auditory and visual text information will result in improved recognition accuracy compared with auditory or visual text information alone, (2) benefit from supplementing speech with visual text (auditory and visual enhancement) in young adults will be greater than that in older adults, and (3) individual differences in performance on perceptual measures would be associated with cognitive abilities. Fifteen young adults with normal hearing, 15 older adults with normal hearing, and 15 older adults with hearing loss participated in this study. All participants completed sentence recognition tasks in auditory-only, text-only, and combined auditory-text conditions. The auditory sentence stimuli were spectrally shaped to restore audibility for the older participants with impaired hearing. All participants also completed various cognitive measures, including measures of working memory, processing speed, verbal comprehension, perceptual and cognitive speed, processing efficiency, inhibition, and the ability to form wholes from parts. Group effects were examined for each of the perceptual and cognitive measures. Audiovisual benefit was calculated relative to performance on auditory- and visual-text only conditions. Finally, the relationship between perceptual measures and other independent measures were examined using principal-component factor analyses, followed by regression analyses. 
Both young and older adults

  3. On advantage of seeing text and hearing speech

    Directory of Open Access Journals (Sweden)

    Živanović Jelena

    2011-01-01

    The aim of this study was to examine the effect of congruence between the sensory modality through which a concept can be experienced and the modality through which the word denoting that concept is perceived during word recognition. Words denoting concepts that can be experienced visually (e.g., “color”) and words denoting concepts that can be experienced auditorily (e.g., “noise”) were presented both visually and auditorily. We observed shorter processing latencies when the modality through which a concept could be experienced matched the modality through which the word denoting it was presented. In a visual lexical decision task, “color” was recognized faster than “noise”, whereas in an auditory lexical decision task, “noise” was recognized faster than “color”. The obtained pattern of results cannot be accounted for by exclusively amodal theories, whereas it can easily be integrated into theories based on perceptual representations.

  4. The Role of Speech Prosody and Text Reading Prosody in Children's Reading Comprehension

    Science.gov (United States)

    Veenendaal, Nathalie J.; Groen, Margriet A.; Verhoeven, Ludo

    2014-01-01

    Background: Text reading prosody has been associated with reading comprehension. However, text reading prosody is a reading-dependent measure that relies heavily on decoding skills. Investigation of the contribution of speech prosody--which is independent from reading skills--in addition to text reading prosody, to reading comprehension could…

  5. A discourse model of affect for text-to-speech synthesis

    CSIR Research Space (South Africa)

    Schlunz, GI

    2013-12-01

    This paper introduces a model of affect to improve prosody in text-to-speech synthesis. It operates on the discourse level of text to predict the underlying linguistic factors that contribute towards emotional appraisal, rather than any particular...

  6. Text-to-audiovisual speech synthesizer for children with learning disabilities.

    Science.gov (United States)

    Mendi, Engin; Bayrak, Coskun

    2013-01-01

    Learning disabilities affect the ability of children to learn, despite their having normal intelligence. Assistive tools can highly increase functional capabilities of children with learning disorders such as writing, reading, or listening. In this article, we describe a text-to-audiovisual synthesizer that can serve as an assistive tool for such children. The system automatically converts an input text to audiovisual speech, providing synchronization of the head, eye, and lip movements of the three-dimensional face model with appropriate facial expressions and word flow of the text. The proposed system can enhance speech perception and help children having learning deficits to improve their chances of success.

  7. THE UNDERLYING PRINCIPLES OF SUSILO BAMBANG YUDHOYONO'S THOUGHT PATTERNS IN HIS ENGLISH SPEECH TEXTS

    Directory of Open Access Journals (Sweden)

    Sulistya ningsih

    2014-10-01

    This study of the underlying principles of the thought patterns shown in SBY's English speech texts was motivated by the divided public response to him: part of the public praises SBY as a good president, while others criticize him as slow (Djalal, 2007: forward page). This topic has not previously been investigated. The research aimed to find out the underlying principles of SBY's thought patterns in his English speech texts in relation to Javanese philosophy. The research is qualitative. Data selected from SBY's speech texts were analyzed using semantic and pragmastylistic theory and then related to Javanese philosophy. The findings are that the underlying principles of SBY's thought patterns, based on Javanese philosophy and manifested in his English speech texts, are as follows. First, Memayu Hayuning Bawana, Ambrasta dur Hangkara: to reach the safety, peace, happiness and well-being of the world and its contents, and to keep the world maintained and in harmony. Second, Rukun agawe santosa crah agawe bubrah: to build a condition of harmony and avoid conflict, because conflict can be harmful to both parties. Third, tepa selira: to take care not to offend others, to lighten the burdens of others, and to be tolerant. Fourth, ana rembug becik dirembug: through negotiations one can avoid conflict and achieve cooperation, safety, peace and prosperity. In sum, world peace can be reached through discussion without war, through soft power.

  8. BILINGUAL MULTIMODAL SYSTEM FOR TEXT-TO-AUDIOVISUAL SPEECH AND SIGN LANGUAGE SYNTHESIS

    Directory of Open Access Journals (Sweden)

    A. A. Karpov

    2014-09-01

    We present a conceptual model, architecture and software of a multimodal system for audio-visual speech and sign language synthesis from input text. The main components of the developed multimodal synthesis system (signing avatar) are: an automatic text processor for input text analysis; a simulated 3D model of a human head; a computer text-to-speech synthesizer; a system for audio-visual speech synthesis; a simulated 3D model of human hands and upper body; and a multimodal user interface integrating all the components for generation of audio, visual and signed speech. The proposed system performs automatic translation of input textual information into speech (audio information) and gestures (video information), fuses the information and outputs it in the form of multimedia information. A user can input any grammatically correct text in Russian or Czech to the system; it is analyzed by the text processor to detect sentences, words and characters. This textual information is then converted into symbols of the sign language notation. We apply the international Hamburg Notation System (HamNoSys), which describes the main differential features of each manual sign: hand shape, hand orientation, place and type of movement. On this basis the 3D signing avatar displays the elements of the sign language. The virtual 3D model of the human head and upper body has been created using the VRML virtual reality modeling language, and it is controlled by software based on the OpenGL graphics library. The developed multimodal synthesis system is universal, since it is oriented both to regular users and to disabled people (in particular, the hard-of-hearing and visually impaired), and it serves for multimedia output (in audio and visual modalities) of input textual information.

  9. The Effect of Speech-to-Text Technology on Learning a Writing Strategy

    Science.gov (United States)

    Haug, Katrina N.; Klein, Perry D.

    2018-01-01

    Previous research has shown that speech-to-text (STT) software can support students in producing a given piece of writing. This is the 1st study to investigate the use of STT to teach a writing strategy. We pretested 45 Grade 5 students on argument writing and trained them to use STT. Students participated in 4 lessons on an argument writing…

  10. Using Text-to-Speech Reading Support for an Adult with Mild Aphasia and Cognitive Impairment

    Science.gov (United States)

    Harvey, Judy; Hux, Karen; Snell, Jeffry

    2013-01-01

    This single case study served to examine text-to-speech (TTS) effects on reading rate and comprehension in an individual with mild aphasia and cognitive impairment. Findings showed faster reading, given TTS presented at a normal speaking rate, but no significant comprehension changes. TTS may support reading in people with aphasia when time…

  11. The benefit obtained from visually displayed text from an automatic speech recognizer during listening to speech presented in noise

    NARCIS (Netherlands)

    Zekveld, A.A.; Kramer, S.E.; Kessens, J.M.; Vlaming, M.S.M.G.; Houtgast, T.

    2008-01-01

    OBJECTIVES: The aim of this study was to evaluate the benefit that listeners obtain from visually presented output from an automatic speech recognition (ASR) system during listening to speech in noise. DESIGN: Auditory-alone and audiovisual speech reception thresholds (SRTs) were measured. The SRT

  12. Applications in accessibility of text-to-speech synthesis for South African languages: Initial system integration and user engagement

    CSIR Research Space (South Africa)

    Schlünz, Georg I

    2017-09-01

    with little or no functional speech to speak out loud. Screen readers and accessible e-books allow a print-disabled (visually-impaired, partially-sighted or dyslexic) individual to read text material by listening to audio versions. Text-to-speech synthesis...

  13. SOCIOLINGUISTIC FACTORS OF THE WRITTEN SPEECH NORMS APPROXIMATION IN LABOR MIGRANTS’ TEXTS

    Directory of Open Access Journals (Sweden)

    Utesheva Altynay Pazylovna

    2015-06-01

    The article focuses on the features of the written Russian speech of labor migrants from different countries, considered against the norms of written speech. The empirical basis of the research is the handwritten CVs of unemployed migrants from Vietnam and Uzbekistan that were presented to the departments of the Federal Migration Service of the Russian Federation in the city of Volgograd. Written speech violations are classified according to the age groups which the migrants belong to. The following sociolinguistic characteristics of the migrants are also taken into account: nationality, period of school education, higher education, and document writing competence. Group 1 combined informants aged from 20 to 30, without higher education, who studied the Russian language at school under the new procedures that followed the collapse of the Soviet Union, or on their own, with no experience compiling official documents and no skills for communicating in Russian. Group 2 combined informants aged from 30 to 50, without higher education, who studied Russian at school by Soviet methods, with experience of drawing up official documents and basic skills for communicating in Russian. Group 3 combined informants aged 50 and older with secondary special education, who studied Russian at school by Soviet methods and actively developed communicative competence through everyday communication, reading books, listening to the radio and watching programs in Russian, with experience in drafting official documents. The features of the migrants' written speech are manifested in specific language and speech mistakes, particularly in violations of graphic, phonetic and genre rules. The general patterns of mistakes are registered. The mistakes are caused not only by language transfer and Russian language competence, but also by sociolinguistic factors. The particular cross-language differences of the migrants' writing are

  14. Orthographic learning and the role of text-to-speech software in Dutch disabled readers.

    Science.gov (United States)

    Staels, Eva; Van den Broeck, Wim

    2015-01-01

    In this study, we examined whether orthographic learning can be demonstrated in disabled readers learning to read in a transparent orthography (Dutch). In addition, we tested the effect of the use of text-to-speech software, a new form of direct instruction, on orthographic learning. Both research goals were investigated by replicating Share's self-teaching paradigm. A total of 65 disabled Dutch readers were asked to read eight stories containing embedded homophonic pseudoword targets (e.g., Blot/Blod), with or without the support of text-to-speech software. The amount of orthographic learning was assessed 3 or 7 days later by three measures of orthographic learning. First, the results supported the presence of orthographic learning during independent silent reading by demonstrating that target spellings were correctly identified more often, named more quickly, and spelled more accurately than their homophone foils. Our results support the hypothesis that all readers, even poor readers of transparent orthographies, are capable of developing word-specific knowledge. Second, a negative effect of text-to-speech software on orthographic learning was demonstrated in this study. This negative effect was interpreted as the consequence of passively listening to the auditory presentation of the text. We clarify how these results can be interpreted within current theoretical accounts of orthographic learning and briefly discuss implications for remedial interventions. © Hammill Institute on Disabilities 2013.

  15. Language and Text-to-Speech Technologies for Highly Accessible Language & Culture Learning

    Directory of Open Access Journals (Sweden)

    Anouk Gelan

    2011-06-01

    Full Text Available This contribution presents the results of the “Speech technology integrated learning modules for Intercultural Dialogue” project. The project objective was to increase the availability and quality of e-learning opportunities for less widely-used and less taught European languages using a user-friendly and highly accessible learning environment. The integration of new Text-to-Speech developments into web-based authoring software for tutorial CALL had a double goal: on the one hand, to increase the accessibility of e-learning packages, also for learners who have difficulty reading (e.g. dyslexic learners) or who prefer auditory learning; on the other hand, to exploit some of the didactic possibilities of this technology.

  16. Dialogue enabling speech-to-text user assistive agent system for hearing-impaired person.

    Science.gov (United States)

    Lee, Seongjae; Kang, Sunmee; Han, David K; Ko, Hanseok

    2016-06-01

    A novel approach for assisting bidirectional communication between people with normal hearing and hearing-impaired people is presented. While existing hearing-impaired assistive devices such as hearing aids and cochlear implants are vulnerable to extreme noise conditions or post-surgery side effects, the proposed concept is an alternative approach in which spoken dialogue is achieved by means of a robust speech recognition technique that takes noisy environmental factors into consideration, without any attachment to the human body. The proposed system is a portable device with an acoustic beamformer for directional noise reduction, capable of performing speech-to-text transcription using a keyword spotting method. It is also equipped with a user interface optimized for hearing-impaired people, rendering device usage intuitive and natural across diverse domain contexts. The relevant experimental results confirm that the proposed interface design is feasible for realizing an effective and efficient intelligent agent for the hearing-impaired.

  17. Use of speech-to-text technology for documentation by healthcare providers.

    Science.gov (United States)

    Ajami, Sima

    2016-01-01

    Medical records are a critical component of a patient's treatment. However, documentation of patient-related information is considered a secondary activity in the provision of healthcare services, often leading to incomplete medical records and patient data of low quality. Advances in information technology (IT) in the health system and registration of information in electronic health records (EHR) using speech-to-text conversion software have facilitated service delivery. This narrative review is based on a literature search using libraries, books, conference proceedings, the Science Direct, PubMed, ProQuest, Springer and SID (Scientific Information Database) databases, and search engines such as Yahoo and Google. I used the following keywords and their combinations: speech recognition, automatic report documentation, voice to text software, healthcare, information, and voice recognition. Due to lack of knowledge of other languages, I searched all texts in English or Persian with no time limits. Of a total of 70 articles, only 42 were selected. Speech-to-text conversion technology offers opportunities to improve the documentation process of medical records, reduce the cost and time of recording information, enhance the quality of documentation, improve the quality of services provided to patients, and support healthcare providers in legal matters. Healthcare providers should recognize the impact of this technology on service delivery.

  18. Does Use of Text-to-Speech and Related Read-Aloud Tools Improve Reading Comprehension for Students with Reading Disabilities? A Meta-Analysis

    Science.gov (United States)

    Wood, Sarah G.; Moxley, Jerad H.; Tighe, Elizabeth L.; Wagner, Richard K.

    2018-01-01

    Text-to-speech and related read-aloud tools are being widely implemented in an attempt to assist students' reading comprehension skills. Read-aloud software, including text-to-speech, is used to translate written text into spoken text, enabling one to listen to written text while reading along. It is not clear how effective text-to-speech is at…

  19. Text to Speech Berbasis Natural Language pada Aplikasi Pembelajaran Tenses Bahasa Inggris // Natural-Language-Based Text to Speech in an English Tenses Learning Application

    Directory of Open Access Journals (Sweden)

    Amak Yunus

    2014-09-01

    Full Text Available Language is a systematic way of communicating using sounds or symbols that carry meaning, spoken through the mouth. Language is also written following the applicable rules. One of the most widely used languages in the world is English. However, there are several obstacles when we learn from a teacher or instructor: the time a teacher can give is limited to school or tutoring hours, and when students come home from school or tutoring they have to study English on their own. From this problem arose the idea for research on building an application that gives students the knowledge to learn English independently, both in transforming positive sentences into negative and interrogative sentences, and in how to pronounce sentences in English. In essence, the contribution of this research is that the relevant institutions, from junior to senior high school level, can use a text to speech application based on natural language processing to learn English tenses. The application can read English sentences aloud and can construct interrogative and negative sentences from their positive forms in several English tenses. Keywords: Natural language processing, Text to speech

  20. L’unité intonative dans les textes oralisés // Intonation unit in read speech

    Directory of Open Access Journals (Sweden)

    Lea Tylečková

    2015-12-01

    Full Text Available Prosodic phrasing, i.e. the division of speech into intonation units, is a phenomenon central to language comprehension. Incorrect prosodic boundary marking may lead to serious misunderstandings and ambiguous interpretations of utterances. The present paper investigates the prosodic competencies of Czech students of French in the domain of prosodic phrasing in French read speech. Two texts of different length are examined through a perceptual method to observe how Czech speakers of French (B1–B2 level of CEFR) divide read speech into prosodic units, compared to French native speakers.

  1. College Students' Perceptions of the C-Print Speech-to-Text Transcription System.

    Science.gov (United States)

    Elliot, L B; Stinson, M S; McKee, B G; Everhart, V S; Francis, P J

    2001-01-01

    C-Print is a real-time speech-to-text transcription system used as a support service with deaf students in mainstreamed classes. Questionnaires were administered to 36 college students in 32 courses in which the C-Print system was used in addition to interpreting and note taking. Twenty-two of these students were also interviewed. Questionnaire items included student ratings of lecture comprehension. Student ratings indicated good comprehension with C-Print, and the mean rating was significantly higher than that for understanding of the interpreter. Students also rated the hard copy printout provided by C-Print as helpful, and they reported that they used these notes more frequently than the handwritten notes from a paid student note taker. Interview results were consistent with those for the questionnaire. Questionnaire and interview responses regarding use of C-Print as the only support service indicated that this arrangement would be acceptable to many students, but not to others. Communication characteristics were related to responses to the questionnaire. Students who were relatively proficient in reading and writing English, and in speech-reading, responded more favorably to C-Print.

  2. Usability Assessment of Text-to-Speech Synthesis for Additional Detail in an Automated Telephone Banking System

    OpenAIRE

    Morton , Hazel; Gunson , Nancie; Marshall , Diarmid; McInnes , Fergus; Ayres , Andrea; Jack , Mervyn

    2010-01-01

    This paper describes a comprehensive usability evaluation of an automated telephone banking system which employs text-to-speech (TTS) synthesis in offering additional detail on customers' account transactions. The paper describes a series of four experiments in which TTS was employed to offer an extra level of detail to recent transactions listings within an established banking service which otherwise uses recorded speech from a professional recording artist. Results from ...

  3. Text-to-speech enhanced eBooks for emerging literacy development

    CSIR Research Space (South Africa)

    Marais, L

    2015-10-01

    Full Text Available … with an isiXhosa version is under way. The studies measure the efficacy of the eBook application to improve the vocabulary and word recognition skills in an Afrikaans and an isiXhosa speaking group, respectively, of 6- to 7-year-old children of lower socio-economic status with poor vocabulary. An...

  4. Investigating an Application of Speech-to-Text Recognition: A Study on Visual Attention and Learning Behaviour

    Science.gov (United States)

    Huang, Y-M.; Liu, C-J.; Shadiev, Rustam; Shen, M-H.; Hwang, W-Y.

    2015-01-01

    One major drawback of previous research on speech-to-text recognition (STR) is that most findings showing the effectiveness of STR for learning were based upon subjective evidence. Very few studies have used eye-tracking techniques to investigate visual attention of students on STR-generated text. Furthermore, not much attention was paid to…

  5. Effects of Dictation, Speech to Text, and Handwriting on the Written Composition of Elementary School English Language Learners

    Science.gov (United States)

    Arcon, Nina; Klein, Perry D.; Dombroski, Jill D.

    2017-01-01

    Previous research has shown that both dictation and speech-to-text (STT) software can increase the quality of writing for native English speakers. The purpose of this study was to investigate the effect of these modalities on the written composition and cognitive load of elementary school English language learners (ELLs). In a within-subjects…

  6. Human factors engineering of interfaces for speech and text in the office

    NARCIS (Netherlands)

    Nes, van F.L.

    1986-01-01

    Current data-processing equipment almost exclusively uses one input medium: the keyboard, and one output medium: the visual display unit. An alternative to typing would be welcome in view of the effort needed to become proficient in typing; speech may provide this alternative if a proper spee

  7. Re-Presenting Subversive Songs: Applying Strategies for Invention and Arrangement to Nontraditional Speech Texts

    Science.gov (United States)

    Charlesworth, Dacia

    2010-01-01

    Invention deals with the content of a speech, arrangement involves placing the content in an order that is most strategic, style focuses on selecting linguistic devices, such as metaphor, to make the message more appealing, memory assists the speaker in delivering the message correctly, and delivery ideally enables great reception of the message.…

  8. Eliciting extra prominence in read-speech tasks: The effects of different text-highlighting methods on acoustic cues to perceived prominence

    DEFF Research Database (Denmark)

    Berger, Stephanie; Niebuhr, Oliver; Fischer, Kerstin

    2018-01-01

    The research initiative Innovating Speech EliCitation Techniques (INSPECT) aims to describe and quantify how recording methods, situations and materials influence speech production in lab-speech experiments. On this basis, INSPECT aims to develop methods that reliably stimulate specific patterns...... and styles of speech, like expressive or conversational speech or different types of emphatic accents. The present study investigates if and how different text highlighting methods (yellow background, bold, capital letters, italics, and underlining) make speakers reinforce the level of perceived prominence...

  9. Support vector machine and mel frequency Cepstral coefficient based algorithm for hand gestures and bidirectional speech to text device

    Science.gov (United States)

    Balbin, Jessie R.; Padilla, Dionis A.; Fausto, Janette C.; Vergara, Ernesto M.; Garcia, Ramon G.; Delos Angeles, Bethsedea Joy S.; Dizon, Neil John A.; Mardo, Mark Kevin N.

    2017-02-01

    This research is about translating a series of hand gestures to form a word and producing its equivalent sound as it is read and said with a Filipino accent, using Support Vector Machine and Mel Frequency Cepstral Coefficient analysis. The concept is to detect Filipino speech input and translate the spoken words into their text form in Filipino. This study aims to help the Filipino deaf community impart their thoughts through the use of hand gestures and communicate with people who do not know how to read hand gestures. It also helps literate deaf users simply read the spoken words relayed to them using the Filipino speech-to-text system.
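The cepstral-features-plus-SVM pipeline named in the abstract can be sketched in miniature. The code below is a toy illustration, not the authors' implementation: a naive DFT plus DCT stands in for a full MFCC front end (no pre-emphasis, windowing, or mel filterbank), and the SVM is a bare linear hinge-loss classifier trained by sub-gradient descent; all function names and parameters are invented for this sketch.

```python
import math

def dft_power(frame):
    """Power spectrum of one frame via a naive DFT (first N/2+1 bins)."""
    N = len(frame)
    power = []
    for k in range(N // 2 + 1):
        re = sum(frame[n] * math.cos(2 * math.pi * k * n / N) for n in range(N))
        im = -sum(frame[n] * math.sin(2 * math.pi * k * n / N) for n in range(N))
        power.append((re * re + im * im) / N)
    return power

def cepstral_features(signal, frame_len=64, n_coeffs=5):
    """Crude cepstrum-style features: log power spectrum -> DCT-II,
    averaged over frames (a real MFCC front end adds pre-emphasis,
    windowing and a mel filterbank)."""
    per_frame = []
    for start in range(0, len(signal) - frame_len + 1, frame_len):
        logp = [math.log(p + 1e-10)
                for p in dft_power(signal[start:start + frame_len])]
        per_frame.append([
            sum(lp * math.cos(math.pi * i * (m + 0.5) / len(logp))
                for m, lp in enumerate(logp))
            for i in range(n_coeffs)])
    return [sum(f[i] for f in per_frame) / len(per_frame)
            for i in range(n_coeffs)]

def train_linear_svm(X, y, epochs=200, lr=0.01, lam=0.01):
    """Linear SVM trained by sub-gradient descent on the hinge loss;
    labels y must be -1 or +1."""
    w, b = [0.0] * len(X[0]), 0.0
    for _ in range(epochs):
        for xi, yi in zip(X, y):
            margin = yi * (sum(wj * xj for wj, xj in zip(w, xi)) + b)
            if margin < 1:  # point violates the margin -> hinge gradient
                w = [wj + lr * (yi * xj - lam * wj) for wj, xj in zip(w, xi)]
                b += lr * yi
            else:           # only the regularizer acts
                w = [wj * (1 - lr * lam) for wj in w]
    return w, b
```

For instance, the features of a low-frequency and a high-frequency tone separate cleanly, so `train_linear_svm` finds a boundary between them after a few epochs.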

  10. Ordering Operations in Square Root Extractions, Analyzing Some Early Medieval Sanskrit Mathematical Texts with the Help of Speech Act Theory

    Science.gov (United States)

    Keller, Agathe

    Procedures for extracting square roots written in Sanskrit in two treatises and their commentaries from the fifth to the twelfth centuries are explored with the help of Textology and Speech Act Theory. An analysis of the number and order of the steps presented in these texts is used to show that their aims were not limited to describing how to carry out the algorithm. The intentions of the authors of these Sanskrit mathematical texts are questioned by taking into account the expressivity of the relationships established between the world and the text.

  11. Nazareth College: Specialty Preparation for Speech-Language Pathologists to Work with Children Who Are Deaf and Hard of Hearing

    Science.gov (United States)

    Brown, Paula M.; Quenin, Cathy

    2010-01-01

    The specialty preparation program within the speech-language pathology master's degree program at Nazareth College in Rochester, New York, was designed to train speech-language pathologists to work with children who are deaf and hard of hearing, ages 0 to 21. The program is offered in collaboration with the Rochester Institute of Technology,…

  12. When will a stuttering moment occur? The determining role of speech motor preparation.

    Science.gov (United States)

    Vanhoutte, Sarah; Cosyns, Marjan; van Mierlo, Pieter; Batens, Katja; Corthals, Paul; De Letter, Miet; Van Borsel, John; Santens, Patrick

    2016-06-01

    The present study aimed to evaluate whether increased activity related to speech motor preparation preceding fluently produced words reflects a successful compensation strategy in stuttering. For this purpose, a contingent negative variation (CNV) was evoked during a picture naming task and measured using electroencephalography. A CNV is a slow, negative event-related potential known to reflect motor preparation generated by the basal ganglia-thalamo-cortical (BGTC) loop. In a previous analysis, the CNV of 25 adults with developmental stuttering (AWS) was significantly increased, especially over the right hemisphere, compared to the CNV of 35 fluent speakers (FS) when both groups were speaking fluently (Vanhoutte et al., 2015, doi: 10.1016/j.neuropsychologia.2015.05.013). To elucidate whether this increase is a compensation strategy enabling fluent speech in AWS, the present analysis evaluated the CNV of 7 AWS who stuttered during this picture naming task. The CNV preceding AWS stuttered words was statistically compared to the CNV preceding AWS fluent words and FS fluent words. Though no difference emerged between the CNV of the AWS stuttered words and the FS fluent words, a significant reduction was observed when comparing the CNV preceding AWS stuttered words to the CNV preceding AWS fluent words. The latter seems to confirm the compensation hypothesis: the increased CNV prior to AWS fluent words is a successful compensation strategy, especially when it occurs over the right hemisphere. The words are produced fluently because of enlarged activity during speech motor preparation. The left CNV preceding AWS stuttered words correlated negatively with stuttering frequency and severity, suggestive of a link between the left BGTC network and the stuttering pathology. Overall, speech motor preparatory activity generated by the BGTC loop seems to have a determining role in stuttering. An important divergence between left and right hemisphere is

  13. A video, text, and speech-driven realistic 3-d virtual head for human-machine interface.

    Science.gov (United States)

    Yu, Jun; Wang, Zeng-Fu

    2015-05-01

    A multiple-input-driven realistic facial animation system based on a 3-D virtual head for human-machine interface is proposed. The system can be driven independently by video, text, and speech, and thus can interact with humans through diverse interfaces. A combination of a parameterized model and a muscular model is used to obtain a tradeoff between computational efficiency and high realism of the 3-D facial animation. An online appearance model is used to track 3-D facial motion from video in the framework of particle filtering, and multiple measurements, i.e., the pixel color values of the input image and the Gabor wavelet coefficients of the illumination ratio image, are fused to reduce the influence of lighting and person dependence in the construction of the online appearance model. A tri-phone model is used to reduce the computational consumption of visual co-articulation in speech-synchronized viseme synthesis without sacrificing performance. Objective and subjective experiments show that the system is suitable for human-machine interaction.

  14. Speaker-dependent Dictionary-based Speech Enhancement for Text-Dependent Speaker Verification

    DEFF Research Database (Denmark)

    Thomsen, Nicolai Bæk; Thomsen, Dennis Alexander Lehmann; Tan, Zheng-Hua

    2016-01-01

    The problem of text-dependent speaker verification under noisy conditions is becoming ever more relevant, due to increased usage for authentication in real-world applications. Classical methods for noise reduction such as spectral subtraction and Wiener filtering introduce distortion and do not perform well in this setting. In this work we compare the performance of different noise reduction methods under different noise conditions in terms of speaker verification when the text is known and the system is trained on clean data (mis-matched conditions). We furthermore propose a new approach based on dictionary-based noise reduction and compare it to the baseline methods.
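For context, the classical baseline the abstract mentions can be stated in a few lines. This is a generic magnitude-domain spectral subtraction sketch; the `alpha` over-subtraction factor and `floor` spectral floor are common textbook conventions, not parameters from the paper, and the authors' dictionary-based method is not reproduced here.

```python
def spectral_subtraction(noisy_mag, noise_mag, alpha=1.0, floor=0.02):
    """Magnitude-domain spectral subtraction: subtract an estimate of the
    noise magnitude spectrum from the noisy one, keeping a small spectral
    floor to avoid negative magnitudes (a source of 'musical noise')."""
    return [max(m - alpha * n, floor * m) for m, n in zip(noisy_mag, noise_mag)]
```

The flooring step is exactly where the distortion mentioned in the abstract enters: bins where the noise estimate exceeds the observed magnitude are clamped rather than reconstructed.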

  15. Text-to-speech enhanced eBooks for emerging literacy development

    CSIR Research Space (South Africa)

    De Wet, Febe

    2015-09-01

    Full Text Available were to investigate if exposure to an interactive eBook would result in the acquisition of new vocabulary and sight word reading in the study participants. A randomised pre-test/post-test between-subjects design was used. An experimental group...

  16. The Speect text-to-speech system entry for the Blizzard Challenge 2013

    CSIR Research Space (South Africa)

    Louw, JA

    2013-09-01

    Full Text Available and bad) according to the (subjective) belief system of the person. Informally, Semaffect appraises an emotion from how one reacts to a good/bad person doing a good/bad deed to another good/bad person. Formally, the model appraises a given event in terms... of the good (1) and bad (0) valences of its semantic AGENT (A), verb predicate (v) and PATIENT (P). It is important to note that Semaffect defines an emotion anonymously based on the composition of the underlying semantic variables A, v and P, and not from...

  17. Speech Compression

    Directory of Open Access Journals (Sweden)

    Jerry D. Gibson

    2016-06-01

    Full Text Available Speech compression is a key technology underlying digital cellular communications, VoIP, voicemail, and voice response systems. We trace the evolution of speech coding based on the linear prediction model, highlight the key milestones in speech coding, and outline the structures of the most important speech coding standards. Current challenges, future research directions, fundamental limits on performance, and the critical open problem of speech coding for emergency first responders are all discussed.
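The linear prediction model that this survey traces can be made concrete with the standard autocorrelation method: fit coefficients a[k] so that x[n] ≈ Σ a[k]·x[n−k], solving the normal equations with the Levinson-Durbin recursion. Below is a minimal pure-Python sketch of the textbook algorithm, not tied to any specific codec; the synthetic AR(2) signal at the end is invented for illustration.

```python
import random

def autocorr(x, max_lag):
    """Biased autocorrelation r[0..max_lag] of a signal."""
    return [sum(x[n] * x[n - k] for n in range(k, len(x)))
            for k in range(max_lag + 1)]

def levinson_durbin(r, order):
    """Solve the LPC normal equations for coefficients a[1..order] of the
    predictor x[n] ~ sum_k a[k] * x[n-k]; returns (a, residual_energy)."""
    a = [0.0] * (order + 1)
    e = r[0]
    for i in range(1, order + 1):
        k = (r[i] - sum(a[j] * r[i - j] for j in range(1, i))) / e
        new_a = a[:]
        new_a[i] = k
        for j in range(1, i):
            new_a[j] = a[j] - k * a[i - j]
        a, e = new_a, e * (1 - k * k)
    return a[1:], e

# Recover the coefficients of a synthetic 2nd-order autoregressive signal.
random.seed(7)
x = [0.0, 0.0]
for _ in range(5000):
    x.append(0.75 * x[-1] - 0.5 * x[-2] + random.gauss(0, 1))
coeffs, residual = levinson_durbin(autocorr(x, 2), 2)  # coeffs near [0.75, -0.5]
```

In a speech coder the residual energy `e` is what the excitation model must encode; the smaller it is, the better the short-term predictor has captured the spectral envelope.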

  18. Information of the public in an emergency and preparation of text blocks

    International Nuclear Information System (INIS)

    Miska, H.

    1997-01-01

    In addition to the advance information, the EU also requires and regulates information of the public in an emergency. Prompt dissemination of the required news is facilitated by text blocks which can be prepared in advance and harmonised with neighbouring administrations. Detailed texts can only be compiled once a press center has been established. (orig.) [de]

  19. Automatic detection of hate speech in text: an overview of the topic and dataset annotation with hierarchical classes

    OpenAIRE

    Paula Cristina Teixeira Fortuna

    2017-01-01

    Nowadays people are increasingly using social networks to communicate their opinions and share information and experiences. In social networks people have a feeling of being deindividuated and may engage in aggressive communication more frequently. In this context, it is important that governments and social network platforms have tools to detect hate speech, because it is harmful to its targets. In our work we investigate the problem of detecting hate speech online. Our first goal is to make...
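The thesis's own models are not described in this excerpt, so the sketch below shows only the generic starting point for hate speech detection framed as text classification: a bag-of-words multinomial Naive Bayes with add-one smoothing. The function names and the toy training examples are invented for illustration.

```python
import math
from collections import Counter

def train_nb(docs, labels):
    """Multinomial Naive Bayes over bag-of-words counts."""
    classes = sorted(set(labels))
    vocab = {w for d in docs for w in d.split()}
    model = {}
    for c in classes:
        class_docs = [d for d, l in zip(docs, labels) if l == c]
        counts = Counter(w for d in class_docs for w in d.split())
        model[c] = (math.log(len(class_docs) / len(docs)),  # log prior
                    counts, sum(counts.values()))
    return model, vocab

def predict_nb(model, vocab, doc):
    """Pick the class with the highest log-posterior (add-one smoothing)."""
    def score(c):
        prior, counts, total = model[c]
        return prior + sum(math.log((counts[w] + 1) / (total + len(vocab)))
                           for w in doc.split() if w in vocab)
    return max(model, key=score)
```

Real systems replace the bag-of-words features with richer representations, but the supervised-classification skeleton stays the same.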

  20. Applications of Speech-to-Text Recognition and Computer-Aided Translation for Facilitating Cross-Cultural Learning through a Learning Activity: Issues and Their Solutions

    Science.gov (United States)

    Shadiev, Rustam; Wu, Ting-Ting; Sun, Ai; Huang, Yueh-Min

    2018-01-01

    In this study, 21 university students, who represented thirteen nationalities, participated in an online cross-cultural learning activity. The participants were engaged in interactions and exchanges carried out on Facebook® and Skype® platforms, and their multilingual communications were supported by speech-to-text recognition (STR) and…

  1. Non-linear frequency scale mapping for voice conversion in text-to-speech system with cepstral description

    Czech Academy of Sciences Publication Activity Database

    Přibilová, Anna; Přibil, Jiří

    2006-01-01

    Roč. 48, č. 12 (2006), s. 1691-1703 ISSN 0167-6393 R&D Projects: GA MŠk(CZ) OC 277.001; GA AV ČR(CZ) 1QS108040569 Grant - others:MŠk(SK) 102/VTP/2000; MŠk(SK) 1/3107/06 Institutional research plan: CEZ:AV0Z20670512 Keywords : signal processing * speech processing * speech synthesis Subject RIV: JA - Electronics ; Optoelectronics, Electrical Engineering Impact factor: 0.678, year: 2006
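The record carries no abstract, but the "non-linear frequency scale mapping" of the title can be illustrated with the best-known such mapping, the mel scale. This is an assumption for illustration only; the paper's actual warping function for voice conversion may differ.

```python
import math

def hz_to_mel(f_hz):
    """Mel-scale warping: roughly linear below 1 kHz, logarithmic above."""
    return 2595.0 * math.log10(1.0 + f_hz / 700.0)

def mel_to_hz(mel):
    """Inverse mapping, so a warp applied in the mel domain can be
    mapped back to a frequency in Hz."""
    return 700.0 * (10.0 ** (mel / 2595.0) - 1.0)
```

The non-linearity is the point: equal steps on the warped scale correspond to progressively wider frequency intervals, matching perceptual frequency resolution.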

  2. Review of Design of Speech Recognition and Text Analytics based Digital Banking Customer Interface and Future Directions of Technology Adoption

    OpenAIRE

    Saha, Amal K

    2017-01-01

    Banking is one of the most significant adopters of cutting-edge information technologies. Since its modern beginnings in the form of paper-based accounting maintained in the branch, the adoption of computerized systems has made it possible to centralize processing in the data centre and improve the customer experience through a more available and efficient system. The latest twist in this evolution is the adoption of natural language processing and speech recognition in the user interface between the hum...

REPORTED SPEECH IN FICTIONAL NARRATIVE TEXTS IN TERMS OF SPEECH ACTS THEORY // SÖZ EDİMLERİ KURAMI AÇISINDAN KURGUSAL ANLATI METİNLERİNDE SÖZ AKTARIMI

    Directory of Open Access Journals (Sweden)

    Soner AKŞEHİRLİ

    2011-06-01

    Full Text Available Speech or discourse reporting (speech representation) is a linguistic phenomenon found both in ordinary communication and in fictional narrative texts. In linguistics, speech reporting is differentiated into direct, indirect and free indirect speech. On the other hand, speech act theory, proposed by J. L. Austin, can provide a new perspective on speech reporting. According to the theory, to say something or produce a statement (locutionary act) is to perform an act (illocutionary act); moreover, an act can also be performed through the effect of a locutionary act (perlocutionary act). In ordinary communication the reporter, and in fictional texts the narrator, may report one, two or all three of the locutionary, illocutionary and perlocutionary components of the reported statement. These processes must also be considered in determining the point of view that governs narrative texts. On this basis, a new typology of speech reporting for fictional texts, grounded in speech act theory, can be developed.

  4. Speech-Language Pathologists' Preparation, Practices, and Perspectives on Serving Culturally and Linguistically Diverse Children

    Science.gov (United States)

    Guiberson, Mark; Atkins, Jenny

    2012-01-01

    This study describes the backgrounds, diversity training, and professional perspectives reported by 154 Colorado speech-language pathologists in serving children from culturally and linguistically diverse (CLD) backgrounds. The authors compare the results of the current survey to those of a similar survey collected in 1996. Respondents reported…

  5. Development and Testing of an Automated 4-Day Text Messaging Guidance as an Aid for Improving Colonoscopy Preparation.

    Science.gov (United States)

    Walter, Benjamin Michael; Klare, Peter; Neu, Bruno; Schmid, Roland M; von Delius, Stefan

    2016-06-21

    In gastroenterology, sufficient colon cleansing improves the adenoma detection rate and prevents the need for preterm repeat colonoscopies due to invalid preparation. It has been shown that patient education is of major importance for the improvement of colon cleansing. The objective of this study was to assess the function of an automated text messaging (short message service, SMS)-supported colonoscopy preparation starting 4 days before the colonoscopy appointment. After a pre-evaluation to assess mobile phone usage in the patient population and thus the relevance of this approach, a Web-based, automated SMS text messaging system was developed, following which a single-center feasibility study at a tertiary care center was performed. Patients scheduled for outpatient colonoscopy were invited to participate. Patients enrolled in the study group received automated information about dietary recommendations and bowel cleansing during colonoscopy preparation. Data of outpatient colonoscopies with the regular preparation procedure were used for pair matching and served as control. The primary end point was the feasibility of SMS text messaging support in colonoscopy preparation, assessed as stable and satisfactory function of the system. Secondary end points were the quality of bowel preparation according to the Boston Bowel Preparation Scale (BBPS) and patient satisfaction with the SMS text messaging-provided information, assessed by a questionnaire. Web-based SMS text messaging-supported colonoscopy preparation was successful and feasible in 19 of 20 patients. The mean (standard error of the mean, SEM) total BBPS score was slightly higher in the SMS group than in the control group (7.3, SEM 0.3 vs 6.4, SEM 0.2), as well as for each colonic region (left, transverse, and right colon). Patient satisfaction regarding SMS text messaging-based information was high. Using SMS for colonoscopy preparation with 4 days' guidance including dietary recommendations is a new approach to improve colonoscopy preparation.
Quality of colonoscopy

  6. The software for automatic creation of the formal grammars used by speech recognition, computer vision, editable text conversion systems, and some new functions

    Science.gov (United States)

    Kardava, Irakli; Tadyszak, Krzysztof; Gulua, Nana; Jurga, Stefan

    2017-02-01

    For more flexible environmental perception by artificial intelligence, supporting software modules are needed that can automate the creation of language-specific syntax and perform further analysis for relevant decisions based on semantic functions. According to our proposed approach, it is possible to create pairs of formal rules from given sentences (in the case of natural languages) or statements (in the case of special languages) with the help of computer vision, speech recognition or an editable text conversion system, for further automatic improvement. In other words, we have developed an approach that can significantly improve the automation of the training process of artificial intelligence, which as a result will give it a higher level of self-development, independent of us (the users). Based on our approach we have developed a demo version of the software, which includes the algorithm and source code implementing all of the above-mentioned components (computer vision, speech recognition and an editable text conversion system). The program is able to work in multi-stream mode and simultaneously create a syntax based on information received from several sources.
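As a concrete, hypothetical instance of working with formal rules over given sentences, the sketch below shows CYK recognition against a tiny hand-written grammar in Chomsky normal form. The paper's automatic rule-creation component is not reproduced; the grammar and sentences here are invented for illustration.

```python
def cyk_recognize(words, grammar, start="S"):
    """CYK recognition for a grammar in Chomsky normal form.
    grammar is a list of (lhs, rhs) rules where rhs is either a 1-tuple
    holding a terminal word or a 2-tuple of nonterminals."""
    n = len(words)
    # table[i][span] = set of nonterminals deriving words[i:i+span]
    table = [[set() for _ in range(n + 1)] for _ in range(n)]
    for i, w in enumerate(words):
        for lhs, rhs in grammar:
            if rhs == (w,):
                table[i][1].add(lhs)
    for span in range(2, n + 1):
        for i in range(n - span + 1):
            for split in range(1, span):
                for lhs, rhs in grammar:
                    if (len(rhs) == 2 and rhs[0] in table[i][split]
                            and rhs[1] in table[i + split][span - split]):
                        table[i][span].add(lhs)
    return start in table[0][n]
```

A rule-creation module of the kind the abstract describes would emit `(lhs, rhs)` pairs in exactly this form, so its output can be checked immediately against held-out sentences.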

  7. Text-to-Speech and Reading While Listening: Reading Support for Individuals with Severe Traumatic Brain Injury

    Science.gov (United States)

    Harvey, Judy

    2013-01-01

    Individuals with severe traumatic brain injury (TBI) often have reading challenges. They maintain or reestablish basic decoding and word recognition skills following injury, but problems with reading comprehension often persist. Practitioners have the potential to accommodate struggling readers by changing the presentational mode of text in a…

  8. Distracted While Reading? Changing to A Hard-to-read Font Shields against the Effects of Environmental Noise and Speech on Text Memory

    Directory of Open Access Journals (Sweden)

    Niklas Halin

    2016-08-01

    Full Text Available The purpose of this study was to investigate the distractive effects of background speech, aircraft noise and road traffic noise on text memory, and particularly to examine whether displaying the texts in a hard-to-read font can shield against the detrimental effects of these types of background sounds. This issue was addressed in an experiment where 56 students read shorter texts about different classes of fictitious creatures (i.e., animals, fishes, birds, and dinosaurs) against a background of each of the aforementioned sounds and silence. For half of the participants the texts were displayed in an easy-to-read font (i.e., Times New Roman) and for the other half in a hard-to-read font (i.e., Haettenschweiler). The dependent measure was the proportion of correct answers on the multiple-choice tests that followed each sound condition. Participants' performance in the easy-to-read font condition was significantly impaired by all three background sound conditions compared to silence. In contrast, there were no effects of the three background sound conditions compared to silence in the hard-to-read font condition. These results suggest that an increase in task demand (by displaying the text in a hard-to-read font) shields against various types of distracting background sounds by promoting a more steadfast locus-of-attention and by reducing the processing of background sound.

  9. Negative constructions in memoir texts by A.Konchalovsky as indices of speech characteristics and author’s personality characteristics

    Directory of Open Access Journals (Sweden)

    Elena A. Bolotova

    2016-09-01

    Full Text Available The aim of the article is to examine the relevant problem of negation in linguistics, as well as the ways in which A. Konchalovsky uses negative constructions, and their pragmatic orientation, in his memoir texts («Low truths» and «Exalts deception»). It becomes clear how certain figurative and expressive means, based on the use of linguistic resources at various levels, make it possible to characterize the memoirist as an individual through sentences with negative semantics. We consider the specifics of the various methods by which the author of memoirs influences the reader through the means of representing the categories of negation. It is revealed that any information the author presents in a memoir text refers back to the personality of the author-memoirist, and that by entering into a dialogue with readers the author establishes contact with them. In the course of this dialogue there is an exchange of information about people, about oneself, and about various events and facts from different periods in the life of the author and the country.

  10. Distracted While Reading? Changing to a Hard-to-Read Font Shields against the Effects of Environmental Noise and Speech on Text Memory.

    Science.gov (United States)

    Halin, Niklas

    2016-01-01

    The purpose of this study was to investigate the distractive effects of background speech, aircraft noise and road traffic noise on text memory, and particularly to examine whether displaying the texts in a hard-to-read font can shield against the detrimental effects of these types of background sounds. This issue was addressed in an experiment where 56 students read shorter texts about different classes of fictitious creatures (i.e., animals, fishes, birds, and dinosaurs) against a background of each of the aforementioned sounds and silence. For half of the participants the texts were displayed in an easy-to-read font (i.e., Times New Roman) and for the other half in a hard-to-read font (i.e., Haettenschweiler). The dependent measure was the proportion of correct answers on the multiple-choice tests that followed each sound condition. Participants' performance in the easy-to-read font condition was significantly impaired by all three background sound conditions compared to silence. In contrast, there were no effects of the three background sound conditions compared to silence in the hard-to-read font condition. These results suggest that an increase in task demand (by displaying the text in a hard-to-read font) shields against various types of distracting background sounds by promoting a more steadfast locus-of-attention and by reducing the processing of background sound.

  11. PREPARING TEXTUAL ELEMENTS OF BYOD TECHNOLOGIES IN THE WORD-ONLINE ENVIRONMENT TO SUPPORT ELEMENTARY SKILLS OF RUSSIAN SPEECH OF FOREIGN STUDENTS

    Directory of Open Access Journals (Sweden)

    Х Э Исмаилова

    2016-12-01

    Full Text Available The article considers some pedagogical and information-technology aspects of the preparation and use of electronic manuals authored by teachers of Russian as a foreign language. The purpose of the development is to support the formation and development of foreign students' basic Russian speech skills in the form of extracurricular activities with elements of BYOD technologies, and to form basic elements of intercultural communication in a multi-ethnic environment, tolerance, and other components of communicative competence. The manual contains a text dedicated to the national holiday Navruz and a series of exercises. It is designed as a Word Online document hosted on the MS OneDrive cloud drive. The scheme presented allows foreign students to use their own mobile devices to access the materials via the Internet. The information product was used to prepare a study group for extracurricular activities. In addition, an electronic document hosted on the teacher's cloud drive can be linked in e-textbooks and on the teacher's websites, for example in MOODLE-type systems.

  12. A Comparison of Inter-Professional Education Programs in Preparing Prospective Teachers and Speech and Language Pathologists for Collaborative Language-Literacy Instruction

    Science.gov (United States)

    Wilson, Leanne; McNeill, Brigid; Gillon, Gail T.

    2016-01-01

    Ensuring teacher and speech and language pathology graduates are prepared to work collaboratively together to meet the diverse language literacy learning needs of children is an important goal. This study investigated the efficacy of a 3-h inter-professional education program focused on explicit instruction in the language skills that underpin…

  13. Individually-Personal Peculiarities of Younger Preschoolers’ Speech

    Directory of Open Access Journals (Sweden)

    M E Novikova

    2013-12-01

    Full Text Available Studying the speech of younger preschoolers is a major factor in designing educational methods and preparing children for school. There exist individual and gender differences in the way children acquire speech skills. Word comprehension and idea interpretation depend on the child's upbringing, his or her environment, and the interaction within the family. This article presents the research data obtained from the study of the individual peculiarities of younger preschool children's speech.

  14. Features Speech Signature Image Recognition on Mobile Devices

    Directory of Open Access Journals (Sweden)

    Alexander Mikhailovich Alyushin

    2015-12-01

    Full Text Available Algorithms for the recognition and processing of dynamic spectrogram images and sound speech signatures (SS) were developed. Software for mobile phones that can recognize speech signatures was prepared. The speed of SS recognition was investigated for different boundary types. Recommendations are given on the choice of boundary type for an optimal ratio of recognition speed to required space.

  15. Text Linguistics in Research Papers Prepared by University Students: Teaching through Lesson Plans and Textbooks

    Directory of Open Access Journals (Sweden)

    Manuel Albarrán-Santiago

    2015-01-01

    Full Text Available This research project revolves around the properties of text linguistics under a qualitative approach. The author analyzed drafts of a research paper by two university students, as well as lesson plans and textbooks of high school Spanish Language and Literature courses and lesson plans of courses from the Licentiate degree in Education. According to the information from the drafts, students struggle with coherence and cohesion in writing; however, they succeed in choosing the correct language for the type of writing. Difficulties are most likely due to the fact that this topic is not included in secondary education plans and is not commonly addressed in textbooks or university classes. In conclusion, teachers should include the properties of text linguistics in their lesson plans in order to help students overcome these difficulties.

  16. Preparing College Students To Search Full-Text Databases: Is Instruction Necessary?

    Science.gov (United States)

    Riley, Cheryl; Wales, Barbara

    Full-text databases allow Central Missouri State University's clients to access some of the serials that libraries have had to cancel due to escalating subscription costs; EbscoHost, the subject of this study, is one such database. The database is available free to all Missouri residents. A survey was designed consisting of 21 questions intended…

  17. Sensorimotor oscillations prior to speech onset reflect altered motor networks in adults who stutter

    Directory of Open Access Journals (Sweden)

    Anna-Maria Mersov

    2016-09-01

    Full Text Available Adults who stutter (AWS) have demonstrated atypical coordination of motor and sensory regions during speech production. Yet little is known of the speech-motor network in AWS in the brief time window preceding audible speech onset. The purpose of the current study was to characterize neural oscillations in the speech-motor network during preparation for and execution of overt speech production in AWS using magnetoencephalography (MEG). Twelve AWS and twelve age-matched controls were presented with 220 words, each word embedded in a carrier phrase. Controls were presented with the same word list as their matched AWS participant. Neural oscillatory activity was localized using minimum-variance beamforming during two time periods of interest: speech preparation (prior to speech onset) and speech execution (following speech onset). Compared to controls, AWS showed stronger beta (15-25 Hz) suppression in the speech preparation stage, followed by stronger beta synchronization in the bilateral mouth motor cortex. AWS also recruited the right mouth motor cortex significantly earlier in the speech preparation stage compared to controls. Exaggerated motor preparation is discussed in the context of reduced coordination in the speech-motor network of AWS. It is further proposed that exaggerated beta synchronization may reflect a more strongly inhibited motor system that requires a stronger beta suppression to disengage prior to speech initiation. These novel findings highlight critical differences in the speech-motor network of AWS that occur prior to speech onset and emphasize the need to investigate further the speech-motor assembly in the stuttering population.

  18. Preparing for reading comprehension: Fostering text comprehension skills in preschool and early elementary school children

    Directory of Open Access Journals (Sweden)

    Paul van den BROEK

    2011-11-01

    Full Text Available To understand what they read or hear, children and adults must create a coherent mental representation of presented information. Recent research suggests that the ability to do so starts to develop early (well before reading age) and that early individual differences are predictive of later reading-comprehension performance. In this paper, we review this research and discuss potential applications to early intervention. We then present two exploratory studies in which we examine whether it is feasible to design interventions with early readers (3rd grade) and even toddlers (2-3 years old). The interventions employed causal questioning techniques as children listen to orally presented, age-appropriate narratives. Afterwards, comprehension was tested through question answering and recall tasks. Results indicate that such interventions are indeed feasible. Moreover, they suggest that for both toddlers and early readers questions during comprehension are more effective than questions after comprehension. Finally, for both groups higher working memory capacity was related to better comprehension.

  19. Preparing for reading comprehension: Fostering text comprehension skills in preschool and early elementary school children

    Directory of Open Access Journals (Sweden)

    Paul van den Brook

    2011-07-01

    Full Text Available To understand what they read or hear, children and adults must create a coherent mental representation of presented information. Recent research suggests that the ability to do so starts to develop early (well before reading age) and that early individual differences are predictive of later reading-comprehension performance. In this paper, we review this research and discuss potential applications to early intervention. We then present two exploratory studies in which we examine whether it is feasible to design interventions with early readers (3rd grade) and even toddlers (2-3 years old). The interventions employed causal questioning techniques as children listen to orally presented, age-appropriate narratives. Afterwards, comprehension was tested through question answering and recall tasks. Results indicate that such interventions are indeed feasible. Moreover, they suggest that for both toddlers and early readers questions during comprehension are more effective than questions after comprehension. Finally, for both groups higher working memory capacity was related to better comprehension.

  20. Passion and Preparation in the Basic Course: The Influence of Students' Ego-Involvement with Speech Topics and Preparation Time on Public-Speaking Grades

    Science.gov (United States)

    Mazer, Joseph P.; Titsworth, Scott

    2012-01-01

    Authors of basic public-speaking course textbooks frequently encourage students to select speech topics in which they have vested interest, care deeply about, and hold strong opinions and beliefs. This study explores students' level of ego-involvement with informative and persuasive speech topics, examines possible ego-involvement predictors of…

  1. Blending the principles of Suggestopedia and the theory of Speech Acts in writing suggestopedic didactic texts, with reference to German and Zulu scripts

    Directory of Open Access Journals (Sweden)

    R.H. Bodenstein

    2013-02-01

    Full Text Available This paper suggests that language teachers who use the suggestopedic method should write their own texts that comply with suggestopedic principles. This is imperative because of the lack of material that can be acquired and used in such courses. Writing their own scripts also enables teachers to identify with their materials and brings much reward and personal growth. Guidelines for the writing and setting up of these texts are provided. The text should embody the philosophic and didactic framework of suggestopedia. It should also be presented as a didactic play wherein the language components to be learned are presented in the form of new scenes in a continuous drama text. Traditional beliefs about the level of complexity of the language suitable for beginners' courses are considered unfounded. Suggestopedic scripts therefore contain complex, 'real-life' language from the outset, starting with the language needed to make contact with native target-language speakers. The main guideline for the organisation and structuring of the text is that it should mirror authentic communicative situations. The paper therefore argues that suggestopedic scripts should be written according to the lists of language functions (or speech acts) and topic areas required for the so-called 'threshold level' of language competence. The paper concludes with examples from a German and a Zulu text to illustrate the didactic and structural principles and guidelines outlined in the article. The article argues that language teachers who use the suggestopedic method should write their own texts in keeping with suggestopedic principles. The lack of suitable material on the market obliges them to do so. When teachers write their own texts, however, it also means that they can identify with the teaching material. This can be professionally rewarding and bring about personal growth. 
Guidelines for the writing and design of such texts are provided. The…

  2. A prepared speech in front of a pre-recorded audience: subjective, physiological, and neuroendocrine responses to the Leiden Public Speaking Task.

    Science.gov (United States)

    Westenberg, P Michiel; Bokhorst, Caroline L; Miers, Anne C; Sumter, Sindy R; Kallen, Victor L; van Pelt, Johannes; Blöte, Anke W

    2009-10-01

    This study describes a new public speaking protocol for youth. The main question asked whether a speech prepared at home and given in front of a pre-recorded audience creates a condition of social-evaluative threat. Findings showed that, on average, this task elicits a moderate stress response in a community sample of 83 12- to 15-year-old adolescents. During the speech, participants reported feeling more nervous and having higher heart rate and sweatiness of the hands than at baseline or recovery. Likewise, physiological (heart rate and skin conductance) and neuroendocrine (cortisol) activity were higher during the speech than at baseline or recovery. Additionally, an anticipation effect was observed: baseline levels were higher than recovery levels for most variables. Taking the anticipation and speech response together, a substantial cortisol response was observed for 55% of participants. The findings indicate that the Leiden Public Speaking Task might be particularly suited to investigate individual differences in sensitivity to social-evaluative situations.

  3. Speech to Text: Today and Tomorrow. Proceedings of a Conference at Gallaudet University (Washington, D.C., September, 1988). GRI Monograph Series B, No. 2.

    Science.gov (United States)

    Harkins, Judith E., Ed.; Virvan, Barbara M., Ed.

    The conference proceedings contains 23 papers on telephone relay service, real-time captioning, and automatic speech recognition, and a glossary. The keynote address, by Representative Major R. Owens, examines current issues in federal legislation. Other papers have the following titles and authors: "Telephone Relay Service: Rationale and…

  4. Hate speech

    Directory of Open Access Journals (Sweden)

    Anne Birgitta Nilsen

    2014-12-01

    Full Text Available The manifesto of the Norwegian terrorist Anders Behring Breivik is based on the “Eurabia” conspiracy theory. This theory is a key starting point for hate speech amongst many right-wing extremists in Europe, but also has ramifications beyond these environments. In brief, proponents of the Eurabia theory claim that Muslims are occupying Europe and destroying Western culture, with the assistance of the EU and European governments. By contrast, members of Al-Qaeda and other extreme Islamists promote the conspiracy theory “the Crusade” in their hate speech directed against the West. Proponents of the latter theory argue that the West is leading a crusade to eradicate Islam and Muslims, a crusade that is similarly facilitated by their governments. This article presents analyses of texts written by right-wing extremists and Muslim extremists in an effort to shed light on how hate speech promulgates conspiracy theories in order to spread hatred and intolerance. The aim of the article is to contribute to a more thorough understanding of hate speech’s nature by applying rhetorical analysis. Rhetorical analysis is chosen because it offers a means of understanding the persuasive power of speech. It is thus a suitable tool to describe how hate speech works to convince and persuade. The concepts from rhetorical theory used in this article are ethos, logos and pathos. The concept of ethos is used to pinpoint factors that contributed to Osama bin Laden's impact, namely factors that lent credibility to his promotion of the conspiracy theory of the Crusade. In particular, Bin Laden projected common sense, good morals and good will towards his audience. He seemed to have coherent and relevant arguments; he appeared to possess moral credibility; and his use of language demonstrated that he wanted the best for his audience. The concept of pathos is used to define hate speech, since hate speech targets its audience's emotions. In hate speech it is the…

  5. Under-resourced speech recognition based on the speech manifold

    CSIR Research Space (South Africa)

    Sahraeian, R

    2015-09-01

    Full Text Available Conventional acoustic modeling involves estimating many parameters to effectively model feature distributions. The sparseness of speech and text data, however, degrades the reliability of the estimation process and makes speech recognition a...

  6. Preparation, Clinical Support, and Confidence of Speech-Language Therapists Managing Clients with a Tracheostomy in the UK

    Science.gov (United States)

    Ward, Elizabeth; Morgan, Tessa; McGowan, Sue; Spurgin, Ann-Louise; Solley, Maura

    2012-01-01

    Background: Literature regarding the education, training, clinical support and confidence of speech-language therapists (SLTs) working with patients with a tracheostomy is limited; however, it suggests that many clinicians have reduced clinical confidence when managing this complex population, many face role and team challenges practising in this…

  7. Priorities of Dialogic Speech Teaching Methodology at Higher Non-Linguistic School

    Directory of Open Access Journals (Sweden)

    Vida Asanavičienė

    2011-04-01

    Full Text Available The article deals with a number of relevant methodological issues. First of all, the author analyses the psychological peculiarities of dialogic speech and states that dialogue is the product of at least two persons. In this view, dialogic speech, unlike monologic speech, happens impromptu and is not prepared in advance. Dialogic speech is mainly situational in character. The linguistic nature of dialogic speech, in the author’s opinion, lies in the process of exchanging replies, which are coherent in structure and function. The author classifies dialogue groups by the number of replies and by communicative parameters. The basic goal of teaching dialogic speech is to develop the abilities and skills that enable learners to exchange replies. The author distinguishes two basic stages of teaching dialogic speech: 1. Training the ability to exchange replies during communicative exercises. 2. Developing skills by training the capability to perform creative exercises during a group dialogue, conversation or debate.

  8. Speech Problems

    Science.gov (United States)

    Speech Problems KidsHealth / For Teens / Speech Problems What's in ... a person's ability to speak clearly. Some Common Speech and Language Disorders Stuttering is a problem that ...

  9. Filled pause refinement based on the pronunciation probability for lecture speech.

    Directory of Open Access Journals (Sweden)

    Yan-Hua Long

    Full Text Available Nowadays, although automatic speech recognition has become quite proficient in recognizing or transcribing well-prepared fluent speech, the transcription of speech that contains many disfluencies remains problematic, such as spontaneous conversational and lecture speech. Filled pauses (FPs are the most frequently occurring disfluencies in this type of speech. Most recent studies have shown that FPs are widely believed to increase the error rates for state-of-the-art speech transcription, primarily because most FPs are not well annotated or provided in training data transcriptions and because of the similarities in acoustic characteristics between FPs and some common non-content words. To enhance the speech transcription system, we propose a new automatic refinement approach to detect FPs in British English lecture speech transcription. This approach combines the pronunciation probabilities for each word in the dictionary and acoustic language model scores for FP refinement through a modified speech recognition forced-alignment framework. We evaluate the proposed approach on the Reith Lectures speech transcription task, in which only imperfect training transcriptions are available. Successful results are achieved for both the development and evaluation datasets. Acoustic models trained on different styles of speech genres have been investigated with respect to FP refinement. To further validate the effectiveness of the proposed approach, speech transcription performance has also been examined using systems built on training data transcriptions with and without FP refinement.
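
    The score-combination idea in this record — fusing a dictionary pronunciation probability with an acoustic score to decide whether an aligned token is a filled pause — can be caricatured as a small decision rule. The weighting scheme, score scale, and function name below are illustrative assumptions, not the paper's actual forced-alignment framework:

```python
import math

def is_filled_pause(word_acoustic_score, fp_acoustic_score,
                    fp_pronunciation_prob, weight=0.5):
    """Hypothetical fusion rule: relabel a token as a filled pause (FP) when
    the FP hypothesis, weighted by its dictionary pronunciation probability,
    out-scores the word hypothesis from forced alignment. Scores are
    per-frame average log-likelihoods (illustrative scale)."""
    fp_combined = fp_acoustic_score + weight * math.log(fp_pronunciation_prob)
    return fp_combined > word_acoustic_score

# A token whose FP pronunciation aligns much better than the word it was labeled as:
print(is_filled_pause(word_acoustic_score=-9.0, fp_acoustic_score=-4.0,
                      fp_pronunciation_prob=0.3))  # True
```

    In the paper's setting such a rule would run inside a modified forced-alignment pass over imperfect training transcriptions, relabeling only the tokens where the FP hypothesis wins.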

  10. Speech disorder prevention

    Directory of Open Access Journals (Sweden)

    Miladis Fornaris-Méndez

    2017-04-01

    Full Text Available Language therapy has moved from a medical focus to a preventive focus. However, difficulties are evident in this preventive work, because more space is devoted to the correction of language disorders. Since speech disorders are the most frequently appearing dysfunction, the preventive work carried out to avoid their appearance acquires special importance. Speech education from an early age makes it easier to prevent the appearance of speech disorders in children. The present work aims to offer different activities for the prevention of speech disorders.

  11. Illustrated Speech Anatomy.

    Science.gov (United States)

    Shearer, William M.

    Written for students in the fields of speech correction and audiology, the text deals with the following: structures involved in respiration; the skeleton and the processes of inhalation and exhalation; phonation and pitch, the larynx, and esophageal speech; muscles involved in articulation; muscles involved in resonance; and the anatomy of the…

  12. Speech Recognition

    Directory of Open Access Journals (Sweden)

    Adrian Morariu

    2009-01-01

    Full Text Available This paper presents a method of speech recognition by pattern recognition techniques. Learning consists in determining the unique characteristics of a word (cepstral coefficients) by eliminating those characteristics that are different from one word to another. For learning and recognition, the system will build a dictionary of words by determining the characteristics of each word to be used in the recognition. Determining the characteristics of an audio signal consists in the following steps: noise removal, sampling, applying a Hamming window, switching to the frequency domain through the Fourier transform, calculating the magnitude spectrum, filtering the data, and determining the cepstral coefficients.
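
    The front-end steps listed in this record (windowing, Fourier transform, magnitude spectrum, cepstral coefficients) can be sketched with the standard library alone. This is a generic real-cepstrum computation for a single frame on a toy synthetic signal, not the paper's implementation:

```python
import cmath
import math

def dft(x):
    """Naive discrete Fourier transform (O(N^2), fine for a short frame)."""
    N = len(x)
    return [sum(x[n] * cmath.exp(-2j * math.pi * k * n / N) for n in range(N))
            for k in range(N)]

def cepstrum(frame):
    """Real cepstrum of one speech frame: window -> DFT -> log|X| -> inverse DFT."""
    N = len(frame)
    # Hamming window to reduce spectral leakage
    windowed = [s * (0.54 - 0.46 * math.cos(2 * math.pi * n / (N - 1)))
                for n, s in enumerate(frame)]
    spectrum = dft(windowed)
    # Log-compressed magnitude spectrum (epsilon guards against log(0))
    log_mag = [math.log(abs(X) + 1e-12) for X in spectrum]
    # log_mag is real and even-symmetric, so forward DFT / N equals the inverse DFT;
    # its real part gives the cepstral coefficients
    return [c.real / N for c in dft(log_mag)]

# A 64-sample synthetic "voiced" frame: fundamental plus one harmonic
frame = [math.sin(2 * math.pi * 4 * n / 64) + 0.5 * math.sin(2 * math.pi * 8 * n / 64)
         for n in range(64)]
coeffs = cepstrum(frame)
```

    A real recognizer would keep only the first dozen or so coefficients per frame and compare them against the dictionary entries.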

  13. Speech Matters

    DEFF Research Database (Denmark)

    Hasse Jørgensen, Stina

    2011-01-01

    About Speech Matters - Katarina Gregos, the Greek curator's exhibition at the Danish Pavilion, the Venice Biennale 2011.

  14. Speech-to-Speech Relay Service

    Science.gov (United States)

    Consumer Guide Speech to Speech Relay Service Speech-to-Speech (STS) is one form of Telecommunications Relay Service (TRS). TRS is a service that allows persons with hearing and speech disabilities ...

  15. Comprehension of synthetic speech and digitized natural speech by adults with aphasia.

    Science.gov (United States)

    Hux, Karen; Knollman-Porter, Kelly; Brown, Jessica; Wallace, Sarah E

    2017-09-01

    Using text-to-speech technology to provide simultaneous written and auditory content presentation may help compensate for chronic reading challenges if people with aphasia can understand synthetic speech output; however, inherent auditory comprehension challenges experienced by people with aphasia may make understanding synthetic speech difficult. This study's purpose was to compare the preferences and auditory comprehension accuracy of people with aphasia when listening to sentences generated with digitized natural speech, Alex synthetic speech (i.e., Macintosh platform), or David synthetic speech (i.e., Windows platform). The methodology required each of 20 participants with aphasia to select one of four images corresponding in meaning to each of 60 sentences comprising three stimulus sets. Results revealed significantly better accuracy given digitized natural speech than either synthetic speech option; however, individual participant performance analyses revealed three patterns: (a) comparable accuracy regardless of speech condition for 30% of participants, (b) comparable accuracy between digitized natural speech and one, but not both, synthetic speech option for 45% of participants, and (c) greater accuracy with digitized natural speech than with either synthetic speech option for remaining participants. Ranking and Likert-scale rating data revealed a preference for digitized natural speech and David synthetic speech over Alex synthetic speech. Results suggest many individuals with aphasia can comprehend synthetic speech options available on popular operating systems. Further examination of synthetic speech use to support reading comprehension through text-to-speech technology is thus warranted. Copyright © 2017 Elsevier Inc. All rights reserved.

  16. Apraxia of Speech

    Science.gov (United States)

    ... Health Info » Voice, Speech, and Language Apraxia of Speech On this page: What is apraxia of speech? ... about apraxia of speech? What is apraxia of speech? Apraxia of speech (AOS)—also known as acquired ...

  17. Oscillatory Brain Responses Reflect Anticipation during Comprehension of Speech Acts in Spoken Dialog

    Directory of Open Access Journals (Sweden)

    Rosa S. Gisladottir

    2018-02-01

    Full Text Available Everyday conversation requires listeners to quickly recognize verbal actions, so-called speech acts, from the underspecified linguistic code and prepare a relevant response within the tight time constraints of turn-taking. The goal of this study was to determine the time-course of speech act recognition by investigating oscillatory EEG activity during comprehension of spoken dialog. Participants listened to short, spoken dialogs with target utterances that delivered three distinct speech acts (Answers, Declinations, Pre-offers). The targets were identical across conditions at lexico-syntactic and phonetic/prosodic levels but differed in the pragmatic interpretation of the speech act performed. Speech act comprehension was associated with reduced power in the alpha/beta bands just prior to Declination speech acts, relative to Answers and Pre-offers. In addition, we observed reduced power in the theta band during the beginning of Declinations, relative to Answers. Based on the role of alpha and beta desynchronization in anticipatory processes, the results are taken to indicate that anticipation plays a role in speech act recognition. Anticipation of speech acts could be critical for efficient turn-taking, allowing interactants to quickly recognize speech acts and respond within the tight time frame characteristic of conversation. The results show that anticipatory processes can be triggered by the characteristics of the interaction, including the speech act type.

  18. INTEGRATING MACHINE TRANSLATION AND SPEECH SYNTHESIS COMPONENT FOR ENGLISH TO DRAVIDIAN LANGUAGE SPEECH TO SPEECH TRANSLATION SYSTEM

    Directory of Open Access Journals (Sweden)

    J. SANGEETHA

    2015-02-01

    Full Text Available This paper provides an interface between the machine translation and speech synthesis systems for converting English speech to Tamil text in an English-to-Tamil speech-to-speech translation system. The speech translation system consists of three modules: automatic speech recognition, machine translation and text-to-speech synthesis. Many procedures for the integration of speech recognition and machine translation have been proposed, but the speech synthesis component has not yet been considered. In this paper, we focus on the integration of machine translation and speech synthesis, and report a subjective evaluation to investigate the impact of the speech synthesis, machine translation, and integrated machine translation and speech synthesis components. Here we implement a hybrid machine translation approach (a combination of rule-based and statistical machine translation) and a concatenative syllable-based speech synthesis technique. In order to retain the naturalness and intelligibility of the synthesized speech, Auto Associative Neural Network (AANN) prosody prediction is used in this work. The results of this system investigation demonstrate that the naturalness and intelligibility of the synthesized speech are strongly influenced by the fluency and correctness of the translated text.

  19. Introductory speeches

    International Nuclear Information System (INIS)

    2001-01-01

    This CD is a multimedia presentation of the programme for safety upgrading of the Bohunice V1 NPP. This chapter consists of an introductory commentary and 4 introductory speeches (video records): (1) Introductory speech of Vincent Pillar, Board chairman and director general of Slovak electric, Plc. (SE); (2) Introductory speech of Stefan Schmidt, director of SE - Bohunice Nuclear power plants; (3) Introductory speech of Jan Korec, Board chairman and director general of VUJE Trnava, Inc. - Engineering, Design and Research Organisation, Trnava; (4) Introductory speech of Dietrich Kuschel, Senior vice-president of FRAMATOME ANP Project and Engineering.

  20. Interventions for Speech Sound Disorders in Children

    Science.gov (United States)

    Williams, A. Lynn, Ed.; McLeod, Sharynne, Ed.; McCauley, Rebecca J., Ed.

    2010-01-01

    With detailed discussion and invaluable video footage of 23 treatment interventions for speech sound disorders (SSDs) in children, this textbook and DVD set should be part of every speech-language pathologist's professional preparation. Focusing on children with functional or motor-based speech disorders from early childhood through the early…

  1. APPRECIATING SPEECH THROUGH GAMING

    Directory of Open Access Journals (Sweden)

    Mario T Carreon

    2014-06-01

    Full Text Available This paper discusses the Speech and Phoneme Recognition as an Educational Aid for the Deaf and Hearing Impaired (SPREAD) application and the ongoing research on its deployment as a tool for motivating deaf and hearing-impaired students to learn and appreciate speech. This application uses the Sphinx-4 voice recognition system to analyze the vocalization of the student and provide prompt feedback on their pronunciation. The packaging of the application as an interactive game aims to provide additional, visual motivation for deaf and hearing-impaired students to learn and appreciate speech.

  2. Speech Synthesis Applied to Language Teaching.

    Science.gov (United States)

    Sherwood, Bruce

    1981-01-01

    The experimental addition of speech output to computer-based Esperanto lessons using speech synthesized from text is described. Because of Esperanto's phonetic spelling and simple rhythm, it is particularly easy to describe the mechanisms of Esperanto synthesis. Attention is directed to how the text-to-speech conversion is performed and the ways…

  3. Speech coding

    Energy Technology Data Exchange (ETDEWEB)

    Ravishankar, C., Hughes Network Systems, Germantown, MD

    1998-05-08

    Speech is the predominant means of communication between human beings, and since the invention of the telephone by Alexander Graham Bell in 1876, speech services have remained the core service in almost all telecommunication systems. Original analog methods of telephony had the disadvantage of the speech signal getting corrupted by noise, cross-talk and distortion. Long-haul transmissions, which use repeaters to compensate for the loss in signal strength on transmission links, also increase the associated noise and distortion. On the other hand, digital transmission is relatively immune to noise, cross-talk and distortion, primarily because of the capability to faithfully regenerate the digital signal at each repeater purely based on a binary decision. Hence the end-to-end performance of the digital link essentially becomes independent of the length and operating frequency bands of the link, and from a transmission point of view digital transmission has been the preferred approach due to its higher immunity to noise. The need to carry digital speech became extremely important from a service provision point of view as well. Modern requirements have introduced the need for robust, flexible and secure services that can carry a multitude of signal types (such as voice, data and video) without a fundamental change in infrastructure. Such a requirement could not have been easily met without the advent of digital transmission systems, thereby requiring speech to be coded digitally. The term Speech Coding often refers to techniques that represent or code speech signals either directly as a waveform or as a set of parameters by analyzing the speech signal. In either case, the codes are transmitted to the distant end, where speech is reconstructed or synthesized using the received set of codes. A more generic term that is often used interchangeably with speech coding is voice coding. This term is more generic in the sense that the
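The waveform-coding idea described in this record can be illustrated with mu-law companding, the non-uniform quantization used in classic G.711 telephony. The sketch below is a minimal illustration, not a codec implementation; the mu = 255 value follows the North American convention, and the function names are invented for this example.

```python
import math

MU = 255.0  # companding parameter (North American G.711 convention)

def mu_law_encode(x):
    """Compress a sample in [-1, 1] with the mu-law characteristic."""
    sign = 1.0 if x >= 0 else -1.0
    return sign * math.log1p(MU * abs(x)) / math.log1p(MU)

def mu_law_decode(y):
    """Expand a companded value back to the linear domain."""
    sign = 1.0 if y >= 0 else -1.0
    return sign * math.expm1(abs(y) * math.log1p(MU)) / MU

def quantize(y, bits=8):
    """Uniformly quantize the companded value, as an 8-bit codec would."""
    levels = 2 ** (bits - 1) - 1
    return round(y * levels) / levels

# Round trip: a small sample survives 8-bit coding with little error,
# because companding spends more quantization levels on low amplitudes.
sample = 0.1
reconstructed = mu_law_decode(quantize(mu_law_encode(sample)))
```

Companding before quantization is what lets an 8-bit code approach the perceived quality of a much wider linear representation for speech-like signals.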

  4. E-text

    DEFF Research Database (Denmark)

    Finnemann, Niels Ole

    2018-01-01

    text can be defined by taking as point of departure the digital format in which everything is represented in the binary alphabet. While the notion of text, in most cases, lends itself to be independent of medium and embodiment, it is also often tacitly assumed that it is, in fact, modeled around ... the print medium, rather than written text or speech. In late 20th century, the notion of text was subject to increasing criticism as in the question raised within literary text theory: is there a text in this class? At the same time, the notion was expanded by including extra linguistic sign modalities ...

  5. DEVELOPMENT AND DISORDERS OF SPEECH IN CHILDHOOD.

    Science.gov (United States)

    KARLIN, ISAAC W.; AND OTHERS

    The growth, development, and abnormalities of speech in childhood are described in this text designed for pediatricians, psychologists, educators, medical students, therapists, pathologists, and parents. The normal development of speech and language is discussed, including theories on the origin of speech in man and factors influencing the normal…

  6. Machine Translation from Text

    Science.gov (United States)

    Habash, Nizar; Olive, Joseph; Christianson, Caitlin; McCary, John

    Machine translation (MT) from text, the topic of this chapter, is perhaps the heart of the GALE project. Beyond being a well defined application that stands on its own, MT from text is the link between the automatic speech recognition component and the distillation component. The focus of MT in GALE is on translating from Arabic or Chinese to English. The three languages represent a wide range of linguistic diversity and make the GALE MT task rather challenging and exciting.

  7. An analysis on the results of part-of-speech tagging on social text

    Institute of Scientific and Technical Information of China (English)

    罗程多; 初立民; 吴晓蕊; 赵耀

    2017-01-01

    The rapid development of social media generates a large amount of text data. The casualness and informality of social text cause notable performance degradation in traditional natural language processing (NLP) tools. The poor performance of part-of-speech (POS) tagging, the fundamental task in the NLP pipeline, on social text is critical for the other downstream NLP applications. This paper aims at finding out the exact reason for the performance drop of traditional POS tagging tools. We quantitatively analyze the tagging results of taggers with different adaptation degrees. The experimental results show that the major reason for the performance drop is the high error rate of POS inference for unknown words, and directions for improving POS tagging on social text are given.
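The error analysis this record describes, attributing the degradation mainly to unknown words, can be reproduced in miniature. The sketch below is illustrative only: the data format (parallel lists of word-tag pairs), the vocabulary, and the toy sentences are invented, not the paper's data.

```python
def error_rates(gold, pred, train_vocab):
    """Split tagging error rates into known- vs. unknown-word errors.

    gold, pred: parallel lists of (word, tag) pairs; train_vocab: the set
    of words seen during tagger training.
    Returns (known_word_error_rate, unknown_word_error_rate).
    """
    counts = {"known": [0, 0], "unknown": [0, 0]}  # bucket -> [errors, total]
    for (word, gold_tag), (_, pred_tag) in zip(gold, pred):
        bucket = "known" if word in train_vocab else "unknown"
        counts[bucket][0] += gold_tag != pred_tag
        counts[bucket][1] += 1
    return tuple(err / tot if tot else 0.0 for err, tot in counts.values())

# Invented toy example: the out-of-vocabulary token "lol" is mis-tagged,
# while all in-vocabulary words are tagged correctly.
train_vocab = {"the", "cat", "sat"}
gold = [("the", "DT"), ("cat", "NN"), ("lol", "UH"), ("sat", "VBD")]
pred = [("the", "DT"), ("cat", "NN"), ("lol", "NN"), ("sat", "VBD")]
known_rate, unknown_rate = error_rates(gold, pred, train_vocab)
```

Comparing the two rates across taggers with different adaptation degrees is exactly the kind of quantitative breakdown the abstract refers to.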

  8. Neural Entrainment to Speech Modulates Speech Intelligibility

    NARCIS (Netherlands)

    Riecke, Lars; Formisano, Elia; Sorger, Bettina; Baskent, Deniz; Gaudrain, Etienne

    2018-01-01

    Speech is crucial for communication in everyday life. Speech-brain entrainment, the alignment of neural activity to the slow temporal fluctuations (envelope) of acoustic speech input, is a ubiquitous element of current theories of speech processing. Associations between speech-brain entrainment and

  9. Neural pathways for visual speech perception

    Directory of Open Access Journals (Sweden)

    Lynne E Bernstein

    2014-12-01

    Full Text Available This paper examines the questions, what levels of speech can be perceived visually, and how is visual speech represented by the brain? Review of the literature leads to the conclusions that every level of psycholinguistic speech structure (i.e., phonetic features, phonemes, syllables, words, and prosody) can be perceived visually, although individuals differ in their abilities to do so; and that there are visual modality-specific representations of speech qua speech in higher-level vision brain areas. That is, the visual system represents the modal patterns of visual speech. The suggestion that the auditory speech pathway receives and represents visual speech is examined in light of neuroimaging evidence on the auditory speech pathways. We outline the generally agreed-upon organization of the visual ventral and dorsal pathways and examine several types of visual processing that might be related to speech through those pathways, specifically, face and body, orthography, and sign language processing. In this context, we examine the visual speech processing literature, which reveals widespread, diverse patterns of activity in posterior temporal cortices in response to visual speech stimuli. We outline a model of the visual and auditory speech pathways and make several suggestions: (1) The visual perception of speech relies on visual pathway representations of speech qua speech. (2) A proposed site of these representations, the temporal visual speech area (TVSA), has been demonstrated in posterior temporal cortex, ventral and posterior to the multisensory posterior superior temporal sulcus (pSTS). (3) Given that visual speech has dynamic and configural features, its representations in feedforward visual pathways are expected to integrate these features, possibly in TVSA.

  10. Effect of speech rate variation on acoustic phone stability in Afrikaans speech recognition

    CSIR Research Space (South Africa)

    Badenhorst, JAC

    2007-11-01

    Full Text Available The authors analyse the effect of speech rate variation on Afrikaans phone stability from an acoustic perspective. Specifically they introduce two techniques for the acoustic analysis of speech rate variation, apply these techniques to an Afrikaans...

  11. A NOVEL APPROACH TO STUTTERED SPEECH CORRECTION

    Directory of Open Access Journals (Sweden)

    Alim Sabur Ajibola

    2016-06-01

    Full Text Available Stuttered speech is a dysfluency-rich speech, more prevalent in males than females. It has been associated with insufficient air pressure or poor articulation, even though the root causes are more complex. The primary features include prolonged and repetitive speech, while some of its secondary features include anxiety, fear, and shame. This study used LPC analysis and synthesis algorithms to reconstruct the stuttered speech. The results were evaluated using cepstral distance, Itakura-Saito distance, mean square error, and likelihood ratio. These measures implied perfect speech reconstruction quality. ASR was used for further testing, and the results showed that all the reconstructed speech samples were perfectly recognized, while only three samples of the original speech were perfectly recognized.
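As a rough sketch of the LPC analysis-synthesis pipeline this record relies on (not the authors' implementation; the model order, the synthetic frame, and the function names are all choices made for this example), the code below estimates an all-pole model with the Levinson-Durbin recursion, inverse-filters the frame to obtain the prediction residual, and resynthesizes the frame from that residual.

```python
import numpy as np

def lpc_coefficients(x, order):
    """Levinson-Durbin recursion on the autocorrelation sequence."""
    r = np.correlate(x, x, mode="full")[len(x) - 1 : len(x) + order]
    a = np.zeros(order + 1)
    a[0] = 1.0
    err = r[0]
    for i in range(1, order + 1):
        acc = r[i] + sum(a[j] * r[i - j] for j in range(1, i))
        k = -acc / err                      # reflection coefficient
        prev = a.copy()
        for j in range(1, i):
            a[j] = prev[j] + k * prev[i - j]
        a[i] = k
        err *= 1.0 - k * k
    return a

def residual(x, a):
    """Analysis (inverse) filter: e[n] = sum_j a[j] * x[n-j]."""
    e = np.zeros(len(x))
    for n in range(len(x)):
        e[n] = sum(a[j] * x[n - j] for j in range(len(a)) if n - j >= 0)
    return e

def synthesize(e, a):
    """All-pole synthesis filter: x[n] = e[n] - sum_{j>0} a[j] * x[n-j]."""
    x = np.zeros(len(e))
    for n in range(len(e)):
        x[n] = e[n] - sum(a[j] * x[n - j] for j in range(1, len(a)) if n - j >= 0)
    return x

# A synthetic "voiced" frame: two sinusoids plus a little noise (seeded,
# so the run is deterministic). Analysis followed by synthesis with the
# same coefficients reproduces the frame, since the filters are inverses.
rng = np.random.default_rng(0)
t = np.arange(200)
frame = (np.sin(2 * np.pi * 0.05 * t) + 0.5 * np.sin(2 * np.pi * 0.12 * t)
         + 0.01 * rng.standard_normal(200))
a = lpc_coefficients(frame, order=10)
reconstructed = synthesize(residual(frame, a), a)
```

In a real stutter-correction system the residual would be modified between the two filtering stages (e.g., retimed to remove repetitions) before resynthesis; here the untouched round trip simply demonstrates the analysis-synthesis identity.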

  12. Speech Research

    Science.gov (United States)

    Several articles addressing topics in speech research are presented. The topics include: exploring the functional significance of physiological tremor: a biospectroscopic approach; differences between experienced and inexperienced listeners to deaf speech; a language-oriented view of reading and its disabilities; phonetic factors in letter detection; categorical perception; short-term recall by deaf signers of American Sign Language; a common basis for auditory sensory storage in perception and immediate memory; phonological awareness and verbal short-term memory; initiation versus execution time during manual and oral counting by stutterers; trading relations in the perception of speech by five-year-old children; the role of the strap muscles in pitch lowering; phonetic validation of distinctive features; consonants and syllable boundaries; and vowel information in postvocalic frictions.

  13. Speech enhancement

    CERN Document Server

    Benesty, Jacob; Chen, Jingdong

    2006-01-01

    We live in a noisy world! In all applications (telecommunications, hands-free communications, recording, human-machine interfaces, etc.) that require at least one microphone, the signal of interest is usually contaminated by noise and reverberation. As a result, the microphone signal has to be "cleaned" with digital signal processing tools before it is played out, transmitted, or stored. This book is about speech enhancement. Different well-known and state-of-the-art methods for noise reduction, with one or multiple microphones, are discussed. By speech enhancement, we mean not only noise red

  14. Informational Text and the CCSS

    Science.gov (United States)

    Aspen Institute, 2012

    2012-01-01

    What constitutes an informational text covers a broad swath of different types of texts. Biographies & memoirs, speeches, opinion pieces & argumentative essays, and historical, scientific or technical accounts of a non-narrative nature are all included in what the Common Core State Standards (CCSS) envisions as informational text. Also included…

  15. Speech Intelligibility

    Science.gov (United States)

    Brand, Thomas

    Speech intelligibility (SI) is important for different fields of research, engineering and diagnostics in order to quantify very different phenomena like the quality of recordings, communication and playback devices, the reverberation of auditoria, characteristics of hearing impairment, benefit using hearing aids or combinations of these things.

  16. Audiovisual Speech Synchrony Measure: Application to Biometrics

    Directory of Open Access Journals (Sweden)

    Gérard Chollet

    2007-01-01

    Full Text Available Speech is a means of communication which is intrinsically bimodal: the audio signal originates from the dynamics of the articulators. This paper reviews recent works in the field of audiovisual speech, and more specifically techniques developed to measure the level of correspondence between audio and visual speech. It overviews the most common audio and visual speech front-end processing, transformations performed on audio, visual, or joint audiovisual feature spaces, and the actual measure of correspondence between audio and visual speech. Finally, the use of synchrony measure for biometric identity verification based on talking faces is experimented on the BANCA database.
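The simplest instance of the audio-visual correspondence measures this record reviews is a correlation between a per-frame audio feature and a per-frame visual feature. The sketch below stands in for the surveyed techniques rather than reproducing any of them; the feature values (audio energy, mouth-opening height) are invented for illustration.

```python
import math

def pearson(x, y):
    """Pearson correlation: a minimal audio-visual synchrony measure."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)

# Hypothetical per-frame features for a short clip: the mouth-opening
# track follows the audio energy for a genuine talking face, while a
# two-frame circular shift simulates desynchronized (e.g., replayed) video.
audio_energy = [0.1, 0.5, 0.9, 0.6, 0.2, 0.1]
mouth_open   = [0.2, 0.6, 1.0, 0.7, 0.3, 0.2]
shifted      = mouth_open[2:] + mouth_open[:2]

sync_genuine = pearson(audio_energy, mouth_open)
sync_shifted = pearson(audio_energy, shifted)
```

Thresholding such a synchrony score is the basic mechanism behind the liveness checks used in talking-face biometric verification: a genuine recording scores high, a desynchronized one does not.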

  17. Perceived Speech Quality Estimation Using DTW Algorithm

    Directory of Open Access Journals (Sweden)

    S. Arsenovski

    2009-06-01

    Full Text Available In this paper a method for speech quality estimation is evaluated by simulating the transfer of speech over packet-switched and mobile networks. The proposed system uses the Dynamic Time Warping algorithm for comparison of the test and received speech. Several tests have been made on a test speech sample of a single speaker with simulated packet (frame) loss effects on the perceived speech. The achieved results have been compared with measured PESQ values on the used transmission channel and their correlation has been observed.
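The comparison step can be sketched as textbook DTW over one-dimensional feature tracks. This is a minimal illustration, not the paper's system: a real estimator would align multidimensional spectral features and use a perceptually motivated local distance.

```python
def dtw_distance(s, t):
    """Dynamic Time Warping distance between two feature sequences."""
    n, m = len(s), len(t)
    INF = float("inf")
    D = [[INF] * (m + 1) for _ in range(n + 1)]
    D[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            cost = abs(s[i - 1] - t[j - 1])    # local distance (1-D features)
            D[i][j] = cost + min(D[i - 1][j],      # insertion
                                 D[i][j - 1],      # deletion
                                 D[i - 1][j - 1])  # match
    return D[n][m]

# The degraded track has the same contour, stretched by one frame;
# DTW absorbs the timing change that a sample-by-sample comparison
# would count as error.
reference = [0, 1, 2, 3, 2, 1, 0]
degraded  = [0, 1, 1, 2, 3, 2, 1, 0]
```

`dtw_distance(reference, degraded)` is zero here despite the length mismatch, which is precisely why DTW suits comparing a received signal whose timing has been perturbed by packet loss against the original.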

  18. FUSING SPEECH SIGNAL AND PALMPRINT FEATURES FOR A SECURED AUTHENTICATION SYSTEM

    Directory of Open Access Journals (Sweden)

    P.K. Mahesh

    2011-11-01

    Full Text Available In the application of biometric authentication, personal identification is regarded as an effective method for automatically recognizing, with high confidence, a person's identity. Using multimodal biometric systems, we typically get better performance compared to a single biometric modality. This paper proposes a multimodal biometric system for identity verification using two traits, i.e., speech signal and palmprint. Integrating the palmprint and speech information increases the robustness of person authentication. The proposed system is designed for applications where the training data contains a speech signal and palmprint. It is well known that the performance of person authentication using only speech signal or palmprint is deteriorated by feature changes with time. The final decision is made by fusion at the matching score level, in which feature vectors are created independently for query measures and are then compared to the enrolment templates, which are stored during database preparation.

  19. 78 FR 49693 - Speech-to-Speech and Internet Protocol (IP) Speech-to-Speech Telecommunications Relay Services...

    Science.gov (United States)

    2013-08-15

    ...-Speech Services for Individuals with Hearing and Speech Disabilities, Report and Order (Order), document...] Speech-to-Speech and Internet Protocol (IP) Speech-to-Speech Telecommunications Relay Services; Telecommunications Relay Services and Speech-to-Speech Services for Individuals With Hearing and Speech Disabilities...

  20. Hearing speech in music

    Directory of Open Access Journals (Sweden)

    Seth-Reino Ekström

    2011-01-01

    Full Text Available The masking effect of a piano composition, played at different speeds and in different octaves, on speech-perception thresholds was investigated in 15 normal-hearing and 14 moderately-hearing-impaired subjects. Running speech (just follow conversation, JFC) testing and use of hearing aids increased the everyday validity of the findings. A comparison was made with standard audiometric noises [International Collegium of Rehabilitative Audiology (ICRA) noise and speech spectrum-filtered noise (SPN)]. All masking sounds, music or noise, were presented at the same equivalent sound level (50 dBA). The results showed a significant effect of piano performance speed and octave (P<.01). Low octave and fast tempo had the largest effect; and high octave and slow tempo, the smallest. Music had a lower masking effect than did ICRA noise with two or six speakers at normal vocal effort (P<.01) and SPN (P<.05). Subjects with hearing loss had higher masked thresholds than the normal-hearing subjects (P<.01), but there were smaller differences between masking conditions (P<.01). It is pointed out that music offers an interesting opportunity for studying masking under realistic conditions, where spectral and temporal features can be varied independently. The results have implications for composing music with vocal parts, designing acoustic environments and creating a balance between speech perception and privacy in social settings.

  1. Collecting and evaluating speech recognition corpora for nine Southern Bantu languages

    CSIR Research Space (South Africa)

    Badenhorst, JAC

    2009-03-01

    Full Text Available The authors describe the Lwazi corpus for automatic speech recognition (ASR), a new telephone speech corpus which includes data from nine Southern Bantu languages. Because of practical constraints, the amount of speech per language is relatively...

  2. Nobel peace speech

    Directory of Open Access Journals (Sweden)

    Joshua FRYE

    2017-07-01

    Full Text Available The Nobel Peace Prize has long been considered the premier peace prize in the world. According to Geir Lundestad, Secretary of the Nobel Committee, of the 300-some peace prizes awarded worldwide, “none is in any way as well known and as highly respected as the Nobel Peace Prize” (Lundestad, 2001). Nobel peace speech is a unique and significant international site of public discourse committed to articulating the universal grammar of peace. Spanning over 100 years of sociopolitical history on the world stage, Nobel Peace Laureates richly represent an important cross-section of domestic and international issues increasingly germane to many publics. Communication scholars’ interest in this rhetorical genre has increased in the past decade. Yet, the norm has been to analyze a single speech artifact from a prestigious or controversial winner rather than examine the collection of speeches for generic commonalities of import. In this essay, we analyze the discourse of Nobel peace speech inductively and argue that the organizing principle of the Nobel peace speech genre is the repetitive form of normative liberal principles and values that function as rhetorical topoi. These topoi include freedom and justice and appeal to the inviolable, inborn right of human beings to exercise certain political and civil liberties and the expectation of equality of protection from totalitarian and tyrannical abuses. The significance of this essay to contemporary communication theory is to expand our theoretical understanding of rhetoric’s role in the maintenance and development of an international and cross-cultural vocabulary for the grammar of peace.

  3. Contrast in concept-to-speech generation

    NARCIS (Netherlands)

    Theune, Mariet; Walker, M.; Rambow, O.

    2002-01-01

    In concept-to-speech systems, spoken output is generated on the basis of a text that has been produced by the system itself. In such systems, linguistic information from the text generation component may be exploited to achieve a higher prosodic quality of the speech output than can be obtained in a

  4. Text Maps: Helping Students Navigate Informational Texts.

    Science.gov (United States)

    Spencer, Brenda H.

    2003-01-01

    Notes that a text map is an instructional approach designed to help students gain fluency in reading content area materials. Discusses how the goal is to teach students about the important features of the material and how the maps can be used to build new understandings. Presents the procedures for preparing and using a text map. (SG)

  5. Commencement Speech as a Hybrid Polydiscursive Practice

    Directory of Open Access Journals (Sweden)

    Светлана Викторовна Иванова

    2017-12-01

    Full Text Available Discourse and media communication researchers pay attention to the fact that popular discursive and communicative practices have a tendency to hybridization and convergence. Discourse, understood as language in use, is flexible. Consequently, it turns out that one and the same text can represent several types of discourses. A vivid example of this tendency is revealed in the American commencement speech / commencement address / graduation speech. A commencement speech is an address to university graduates which, in compliance with the modern trend, is delivered by outstanding media personalities (politicians, athletes, actors, etc.). The objective of this study is to define the specificity of the realization of polydiscursive practices within commencement speech. The research involves discursive, contextual, stylistic and definitive analyses. Methodologically the study is based on discourse analysis theory; in particular, the notion of a discursive practice as a verbalized social practice makes up the conceptual basis of the research. This research draws upon a hundred commencement speeches delivered by prominent representatives of American society from the 1980s till now. In brief, commencement speech belongs to the institutional discourse that public speech embodies. Its institutional parameters are well represented in speeches delivered by people in power, like American and university presidents. Nevertheless, as the results of the research indicate, the institutional character of commencement speech is not its only feature. Conceptual information analysis enables us to refer commencement speech to didactic discourse, as it is aimed at teaching university graduates how to deal with the challenges life is rich in. Discursive practices of personal discourse are also actively integrated into the commencement speech discourse. More than that, existential discursive practices also find their way into the discourse under study. Commencement

  6. Speech disorders - children

    Science.gov (United States)

    ... disorder; Voice disorders; Vocal disorders; Disfluency; Communication disorder - speech disorder; Speech disorder - stuttering ... evaluation tools that can help identify and diagnose speech disorders: Denver Articulation Screening Examination Goldman-Fristoe Test of ...

  7. HMM adaptation for child speech synthesis using ASR data

    CSIR Research Space (South Africa)

    Govender, N

    2015-11-01

    Full Text Available . This paper reports on a feasibility study that was conducted to determine whether it is possible to synthesize good quality child voices using child speech data that was recorded for automatic speech recognition (ASR) purposes. A text-to-speech system...

  8. Speech Processing.

    Science.gov (United States)

    1983-05-01

    The VDE system developed had the capability of recognizing up to 248 separate words in syntactic structures. The two systems described are isolated... Contents include: ... AND SPEAKER RECOGNITION by M.J. Hunt; ASSESSMENT OF SPEECH SYSTEMS by R.K. Moore; A SURVEY OF CURRENT EQUIPMENT AND RESEARCH by J.S. Bridle; ... TECHNOLOGY IN NAVY TRAINING SYSTEMS by R. Breaux, M. Blind and R. Lynchard; GENERAL REVIEW OF MILITARY APPLICATIONS OF VOICE PROCESSING by Dr. Bruno ...

  9. The analysis of speech acts patterns in two Egyptian inaugural speeches

    Directory of Open Access Journals (Sweden)

    Imad Hayif Sameer

    2017-09-01

    Full Text Available The theory of speech acts, which clarifies what people do when they speak, is not about individual words or sentences that form the basic elements of human communication, but rather about particular speech acts that are performed when uttering words. A speech act is the attempt at doing something purely by speaking. Many things can be done by speaking. Speech acts are studied under what is called speech act theory, and belong to the domain of pragmatics. In this paper, two Egyptian inaugural speeches from El-Sadat and El-Sisi, belonging to different periods, were analyzed to find out whether there were differences within this genre in the same culture or not. The study showed that there was a very small difference between these two speeches, which were analyzed according to Searle’s theory of speech acts. In El-Sadat’s speech, commissives came to occupy the first place. Meanwhile, in El-Sisi’s speech, assertives occupied the first place. Within the speeches of one culture, we can find that the differences depended on the circumstances that surrounded the elections of the Presidents at the time. Speech acts were tools they used to convey what they wanted and to obtain support from their audiences.

  10. Exploring the role of brain oscillations in speech perception in noise: Intelligibility of isochronously retimed speech

    Directory of Open Access Journals (Sweden)

    Vincent Aubanel

    2016-08-01

    Full Text Available A growing body of evidence shows that brain oscillations track speech. This mechanism is thought to maximise processing efficiency by allocating resources to important speech information, effectively parsing speech into units of appropriate granularity for further decoding. However, some aspects of this mechanism remain unclear. First, while periodicity is an intrinsic property of this physiological mechanism, speech is only quasi-periodic, so it is not clear whether periodicity would present an advantage in processing. Second, it is still a matter of debate which aspect of speech triggers or maintains cortical entrainment, from bottom-up cues such as fluctuations of the amplitude envelope of speech to higher level linguistic cues such as syntactic structure. We present data from a behavioural experiment assessing the effect of isochronous retiming of speech on speech perception in noise. Two types of anchor points were defined for retiming speech, namely syllable onsets and amplitude envelope peaks. For each anchor point type, retiming was implemented at two hierarchical levels, a slow time scale around 2.5 Hz and a fast time scale around 4 Hz. Results show that while any temporal distortion resulted in reduced speech intelligibility, isochronous speech anchored to P-centers (approximated by stressed syllable vowel onsets) was significantly more intelligible than a matched anisochronous retiming, suggesting a facilitative role of periodicity defined on linguistically motivated units in processing speech in noise.

  11. Automatic speech recognition (ASR) based approach for speech therapy of aphasic patients: A review

    Science.gov (United States)

    Jamal, Norezmi; Shanta, Shahnoor; Mahmud, Farhanahani; Sha'abani, MNAH

    2017-09-01

    This paper reviews state-of-the-art automatic speech recognition (ASR) based approaches for speech therapy of aphasic patients. Aphasia is a condition in which the affected person suffers from a speech and language disorder resulting from a stroke or brain injury. Since there is a growing body of evidence indicating the possibility of improving the symptoms at an early stage, ASR based solutions are increasingly being researched for speech and language therapy. ASR is a technology that converts human speech into transcript text by matching it with the system's library. This is particularly useful in speech rehabilitation therapies as it provides accurate, real-time evaluation of speech input from an individual with a speech disorder. ASR based approaches for speech therapy recognize the speech input from the aphasic patient and provide real-time feedback on their mistakes. However, the accuracy of ASR is dependent on many factors, such as phoneme recognition, speech continuity, speaker and environmental differences, as well as our depth of knowledge of human language understanding. Hence, the review examines recent developments in ASR technologies and their performance for individuals with speech and language disorders.

  12. CAR2 - Czech Database of Car Speech

    Directory of Open Access Journals (Sweden)

    P. Sovka

    1999-12-01

    Full Text Available This paper presents a new Czech-language two-channel (stereo) speech database recorded in a car environment. The created database was designed for experiments with speech enhancement for communication purposes and for the study and design of robust speech recognition systems. Tools for automated phoneme labelling based on Baum-Welch re-estimation were realised. A noise analysis of the car background environment was done.

  13. Text Mining.

    Science.gov (United States)

    Trybula, Walter J.

    1999-01-01

    Reviews the state of research in text mining, focusing on newer developments. The intent is to describe the disparate investigations currently included under the term text mining and provide a cohesive structure for these efforts. A summary of research identifies key organizations responsible for pushing the development of text mining. A section…

  14. An analysis of machine translation and speech synthesis in speech-to-speech translation system

    OpenAIRE

    Hashimoto, K.; Yamagishi, J.; Byrne, W.; King, S.; Tokuda, K.

    2011-01-01

    This paper provides an analysis of the impacts of machine translation and speech synthesis on speech-to-speech translation systems. The speech-to-speech translation system consists of three components: speech recognition, machine translation and speech synthesis. Many techniques for integration of speech recognition and machine translation have been proposed. However, speech synthesis has not yet been considered. Therefore, in this paper, we focus on machine translation and speech synthesis, ...

  15. Multisensory integration of speech sounds with letters vs. visual speech: only visual speech induces the mismatch negativity

    NARCIS (Netherlands)

    Stekelenburg, J.J.; Keetels, M.N.; Vroomen, J.H.M.

    2018-01-01

    Numerous studies have demonstrated that the vision of lip movements can alter the perception of auditory speech syllables (McGurk effect). While there is ample evidence for integration of text and auditory speech, there are only a few studies on the orthographic equivalent of the McGurk effect.

  16. Speech and Language Delay

    Science.gov (United States)

    Patient-education overview of speech and language delay, with a table of contents covering treatment, everyday life, questions, and resources, beginning with the question: what is a speech and language delay?

  17. Speech emotion recognition methods: A literature review

    Science.gov (United States)

    Basharirad, Babak; Moradhaseli, Mohammadreza

    2017-10-01

    Recently, research attention to emotional speech signals has grown in human-machine interfaces, owing to the availability of high computational capability. Many systems have been proposed in the literature to identify emotional states through speech. Selection of suitable feature sets, design of proper classification methods, and preparation of an appropriate dataset are the key issues in speech emotion recognition systems. This paper critically analyzes the currently available approaches to speech emotion recognition based on three evaluation parameters (feature set, classification of features, and accuracy). In addition, this paper also evaluates the performance and limitations of available methods. Furthermore, it highlights promising directions for improvement of speech emotion recognition systems.
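    As a minimal illustration of the feature-set/classifier pipeline surveyed here, consider the following hedged Python sketch: the two features (mean pitch, mean energy) and the class statistics are invented for illustration, and a nearest-centroid rule stands in for the more elaborate classifiers (e.g. GMMs, SVMs, neural networks) used in practice.

```python
import numpy as np

rng = np.random.default_rng(0)
# Hypothetical 2-D feature vectors (mean pitch in Hz, mean energy) per utterance;
# real systems use richer sets such as MFCCs and prosodic statistics.
happy = rng.normal([220.0, 0.8], 0.05 * np.array([220.0, 0.8]), size=(50, 2))
sad = rng.normal([140.0, 0.3], 0.05 * np.array([140.0, 0.3]), size=(50, 2))

centroids = {"happy": happy.mean(axis=0), "sad": sad.mean(axis=0)}

def classify(x):
    # Nearest-centroid rule: assign the emotion whose mean feature
    # vector is closest in Euclidean distance.
    return min(centroids, key=lambda c: np.linalg.norm(x - centroids[c]))

print(classify(np.array([210.0, 0.75])))  # a point near the "happy" centroid
```

    The same structure generalizes: swap in a different feature extractor or classifier and the evaluation parameters discussed in the review (feature set, classifier, accuracy) map directly onto the code.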

  18. Building Searchable Collections of Enterprise Speech Data.

    Science.gov (United States)

    Cooper, James W.; Viswanathan, Mahesh; Byron, Donna; Chan, Margaret

    The study has applied speech recognition and text-mining technologies to a set of recorded outbound marketing calls and analyzed the results. Since speaker-independent speech recognition technology results in a significantly lower recognition rate than that found when the recognizer is trained for a particular speaker, a number of post-processing…

  19. HMM Adaptation for child speech synthesis

    CSIR Research Space (South Africa)

    Govender, Avashna

    2015-09-01

    Full Text Available Hidden Markov Model (HMM)-based synthesis in combination with speaker adaptation has proven to be an approach that is well-suited for child speech synthesis. This paper describes the development and evaluation of different HMM-based child speech...

  20. Utility of TMS to understand the neurobiology of speech

    Directory of Open Access Journals (Sweden)

    Takenobu Murakami

    2013-07-01

    Full Text Available According to a traditional view, speech perception and production are processed largely separately in sensory and motor brain areas. Recent psycholinguistic and neuroimaging studies provide novel evidence that the sensory and motor systems dynamically interact in speech processing, by demonstrating that speech perception and imitation share regional brain activations. However, the exact nature and mechanisms of these sensorimotor interactions are not yet completely understood. Transcranial magnetic stimulation (TMS) has often been used in the cognitive neurosciences, including speech research, as a complementary technique to behavioral and neuroimaging studies. Here we provide an up-to-date review focusing on TMS studies that explored speech perception and imitation. Single-pulse TMS of the primary motor cortex (M1) demonstrated a speech-specific and somatotopically specific increase of excitability of the M1 lip area during speech perception (listening to speech or lip reading). A paired-coil TMS approach showed increases in effective connectivity from brain regions that are involved in speech processing to the M1 lip area when listening to speech. TMS in virtual lesion mode applied to speech processing areas modulated performance of phonological recognition and imitation of perceived speech. In summary, TMS is an innovative tool to investigate processing of speech perception and imitation. TMS studies have provided strong evidence that the sensory system is critically involved in mapping sensory input onto motor output and that the motor system plays an important role in speech perception.

  1. Speech Timing Deficit of Stuttering: Evidence from Contingent Negative Variations.

    Directory of Open Access Journals (Sweden)

    Ning Ning

    Full Text Available The aim of the present study was to investigate the speech preparation processes of adults who stutter (AWS). Fifteen AWS and fifteen adults with fluent speech (AFS) participated in the experiment. Event-related potentials (ERPs) were recorded in a foreperiod paradigm. The warning signal (S1) was a color square, and the following imperative stimulus (S2) was either a white square (the Go signal, which required participants to name the color of S1) or a white dot (the NoGo signal, which prevented participants from speaking). Three differences were found between AWS and AFS. First, the mean amplitude of the ERP component parietal positivity elicited by S1 (S1-P3) was smaller in AWS than in AFS, which implies that AWS may have deficits in allocating working memory to phonological programming. Second, the topographic shift from the early phase to the late phase of the contingent negative variation occurred earlier for AWS than for AFS, suggesting that the motor preparation process is promoted in AWS. Third, the NoGo effect in the ERP component parietal positivity elicited by S2 (S2-P3) was larger for AFS than for AWS, indicating that AWS have difficulties in inhibiting a planned speech response. These results provide a full picture of the speech preparation and response inhibition processes of AWS. The relationship among these three findings is discussed. However, as stuttering was not manipulated in this study, it is still unclear whether the effects are the causes or the results of stuttering. Further studies are suggested to explore the relationship between stuttering and the effects found in the present study.

  2. Automatic Speech Recognition from Neural Signals: A Focused Review

    Directory of Open Access Journals (Sweden)

    Christian Herff

    2016-09-01

    Full Text Available Speech interfaces have become widely accepted and are nowadays integrated in various real-life applications and devices; they have become a part of our daily life. However, speech interfaces presume the ability to produce intelligible speech, which might be impossible in loud environments, out of concern for disturbing bystanders, or because of an inability to produce speech (i.e., patients suffering from locked-in syndrome). For these reasons it would be highly desirable not to speak but simply to envision oneself saying words or sentences. Interfaces based on imagined speech would enable fast and natural communication without the need for audible speech and would give a voice to otherwise mute people. This focused review analyzes the potential of different brain imaging techniques to recognize speech from neural signals by applying Automatic Speech Recognition technology. We argue that modalities based on metabolic processes, such as functional Near Infrared Spectroscopy and functional Magnetic Resonance Imaging, are less suited for Automatic Speech Recognition from neural signals due to their low temporal resolution, but are very useful for the investigation of the underlying neural mechanisms involved in speech processes. In contrast, electrophysiologic activity is fast enough to capture speech processes and is therefore better suited for ASR. Our experimental results indicate the potential of these signals for speech recognition from neural data, with a focus on invasively measured brain activity (electrocorticography). As a first example of Automatic Speech Recognition techniques applied to neural signals, we discuss the Brain-to-text system.

  3. Development of a System for Automatic Recognition of Speech

    Directory of Open Access Journals (Sweden)

    Roman Jarina

    2003-01-01

    Full Text Available The article gives a review of research on the processing and automatic recognition of speech signals (ASR) at the Department of Telecommunications of the Faculty of Electrical Engineering, University of Žilina. On-going research is oriented to speech parametrization using 2-dimensional cepstral analysis, and to the application of HMMs and neural networks for speech recognition in the Slovak language. The article summarizes achieved results and outlines the future orientation of our research in automatic speech recognition.

  4. Teaching Speech Acts

    Directory of Open Access Journals (Sweden)

    Teaching Speech Acts

    2007-01-01

    Full Text Available In this paper I argue that pragmatic ability must become part of what we teach in the classroom if we are to realize the goals of communicative competence for our students. I review the research on pragmatics, especially those articles that point to the effectiveness of teaching pragmatics in an explicit manner, and those that posit methods for teaching. I also note two areas of scholarship that address classroom needs—the use of authentic data and appropriate assessment tools. The essay concludes with a summary of my own experience teaching speech acts in an advanced-level Portuguese class.

  5. Automatic Speech Recognition Systems for the Evaluation of Voice and Speech Disorders in Head and Neck Cancer

    OpenAIRE

    Andreas Maier; Tino Haderlein; Florian Stelzle; Elmar Nöth; Emeka Nkenke; Frank Rosanowski; Anne Schützenberger; Maria Schuster

    2010-01-01

    In patients suffering from head and neck cancer, speech intelligibility is often restricted. For assessment and outcome measurements, automatic speech recognition systems have previously been shown to be appropriate for objective and quick evaluation of intelligibility. In this study we investigate the applicability of the method to speech disorders caused by head and neck cancer. Intelligibility was quantified by speech recognition on recordings of a standard text read by 41 German laryngect...

  6. Multimicrophone Speech Dereverberation: Experimental Validation

    Directory of Open Access Journals (Sweden)

    Marc Moonen

    2007-05-01

    Full Text Available Dereverberation is required in various speech processing applications, such as hands-free telephony and voice-controlled systems, especially when the signals have been recorded in a moderately or highly reverberant environment. In this paper, we compare a number of classical and more recently developed multimicrophone dereverberation algorithms, and validate the different algorithmic settings by means of two performance indices and a speech recognition system. It is found that some of the classical solutions achieve moderate signal enhancement. More advanced subspace-based dereverberation techniques, on the other hand, fail to enhance the signals despite their high computational load.
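    The multimicrophone principle behind many such algorithms, aligning channels so that the target signal adds coherently while interference does not, can be illustrated with classical delay-and-sum processing. The following is a toy sketch, not one of the algorithms evaluated in the paper; it targets additive noise rather than reverberation, and a white-noise sequence stands in for the speech source.

```python
import numpy as np

rng = np.random.default_rng(1)
n, delay = 4000, 7
s = rng.normal(0, 1, n)                           # broadband stand-in for the source
mic1 = s + rng.normal(0, 0.5, n)                  # microphone 1: source + noise
mic2 = np.roll(s, delay) + rng.normal(0, 0.5, n)  # microphone 2: delayed source + noise

# Estimate the inter-microphone delay from the cross-correlation peak.
xc = np.correlate(mic2, mic1, mode="full")
est = xc.argmax() - (n - 1)

aligned = np.roll(mic2, -est)
enhanced = 0.5 * (mic1 + aligned)  # delay-and-sum: signal adds coherently, noise does not

err_single = np.mean((mic1 - s) ** 2)
err_summed = np.mean((enhanced - s) ** 2)
print(est, err_summed < err_single)
```

    Averaging the two aligned channels roughly halves the noise power while leaving the source intact, which is the same coherence argument the more sophisticated multichannel methods build on.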

  7. The Effect of English Verbal Songs on Connected Speech Aspects of Adult English Learners’ Speech Production

    Directory of Open Access Journals (Sweden)

    Farshid Tayari Ashtiani

    2015-02-01

    Full Text Available The present study investigated the impact of English verbal songs on connected speech aspects of adult English learners’ speech production. 40 participants were selected based on their performance on a piloted and validated version of the NELSON test given to 60 intermediate English learners in a language institute in Tehran. They were then equally distributed into control and experimental groups and received a validated pretest of reading aloud and speaking in English. Afterward, the treatment was performed in 18 sessions by singing preselected songs chosen on criteria such as popularity, familiarity, and the amount and speed of speech delivery. In the end, the posttests of reading aloud and speaking in English were administered. The results revealed that the treatment had statistically positive effects on the connected speech aspects of English learners’ speech production at the .05 level of significance. Meanwhile, the results showed no significant difference between the experimental group’s mean scores on the posttests of reading aloud and speaking. It was thus concluded that providing EFL learners with English verbal songs could positively affect connected speech aspects of both modes of speech production, reading aloud and speaking. The findings of this study have pedagogical implications for language teachers, who can draw on the benefits of verbal songs to promote the naturalness and fluency of language learners’ speech production. Keywords: English Verbal Songs, Connected Speech, Speech Production, Reading Aloud, Speaking

  8. The Influence of Direct and Indirect Speech on Source Memory

    Directory of Open Access Journals (Sweden)

    Anita Eerland

    2018-02-01

    Full Text Available People perceive the same situation described in direct speech (e.g., John said, “I like the food at this restaurant”) as more vivid and perceptually engaging than one described in indirect speech (e.g., John said that he likes the food at the restaurant). So, if direct speech enhances the perception of vividness relative to indirect speech, what are the effects of using indirect speech? In four experiments, we examined whether the use of direct and indirect speech influences the comprehender’s memory for the identity of the speaker. Participants read a direct or an indirect speech version of a story and then addressed statements to one of the four protagonists of the story in a memory task. We found better source memory at the level of protagonist gender after indirect than direct speech (Exp. 1–3). When the story was rewritten to make the protagonists more distinctive, we also found an effect of speech type on source memory at the level of the individual, with better memory after indirect than direct speech (Exp. 3–4). Memory for the content of the story, however, was not influenced by speech type (Exp. 4). While previous research showed that direct speech may enhance memory for how something was said, we conclude that indirect speech enhances memory for who said what.

  9. Speech and Communication Disorders

    Science.gov (United States)

    ... to being completely unable to speak or understand speech. Causes include: hearing disorders and deafness; voice problems; ... or those caused by cleft lip or palate; speech problems like stuttering; developmental disabilities; learning disorders; autism ...

  10. Speech perception as an active cognitive process

    Directory of Open Access Journals (Sweden)

    Shannon Heald

    2014-03-01

    Full Text Available One view of speech perception is that acoustic signals are transformed into representations for pattern matching to determine linguistic structure. This process can be taken as a statistical pattern-matching problem, assuming relatively stable linguistic categories are characterized by neural representations related to auditory properties of speech that can be compared to speech input. This kind of pattern matching can be termed a passive process, which implies rigidity of processing with few demands on cognitive processing. An alternative view is that speech recognition, even in early stages, is an active process in which speech analysis is attentionally guided. Note that this does not mean consciously guided, but that information-contingent changes in early auditory encoding can occur as a function of context and experience. Active processing assumes that attention, plasticity, and listening goals are important in considering how listeners cope with adverse circumstances that impair hearing, whether masking noise in the environment or hearing loss. Although theories of speech perception have begun to incorporate some active processing, they seldom treat early speech encoding as plastic and attentionally guided. Recent research has suggested that speech perception is the product of both feedforward and feedback interactions between a number of brain regions, including descending projections perhaps as far downstream as the cochlea. It is important to understand how the ambiguity of the speech signal and constraints of context dynamically determine the cognitive resources recruited during perception, including focused attention, learning, and working memory. Theories of speech perception need to go beyond the current corticocentric approach in order to account for the intrinsic dynamics of the auditory encoding of speech. In doing so, this may provide new insights into ways in which hearing disorders and loss may be treated either through augmentation or

  11. Speech, language and swallowing in Huntington’s Disease

    Directory of Open Access Journals (Sweden)

    Maryluz Camargo-Mendoza

    2017-04-01

    Full Text Available Huntington’s disease (HD) is a genetic condition caused by a mutation in the CAG (cytosine-adenine-guanine) nucleotide sequence. Depending on the stage of the disease, people may have difficulties with speech, language and swallowing. The purpose of this paper is to describe these difficulties in detail, as well as to provide an account of the speech and language therapy approach to this condition. Regarding speech, it is worth noting that characteristics typical of hyperkinetic dysarthria can be found, owing to the underlying choreic movements. The speech of people with HD tends to show shorter sentences, with much simpler syntactic structures, and difficulties in tasks that require complex cognitive processing. Moreover, swallowing may be affected by dysphagia, which progresses as the disease develops. A timely, comprehensive and effective speech-language intervention is essential to improve people’s quality of life and contribute to their communicative welfare.

  12. Speech cues contribute to audiovisual spatial integration.

    Directory of Open Access Journals (Sweden)

    Christopher W Bishop

    Full Text Available Speech is the most important form of human communication but ambient sounds and competing talkers often degrade its acoustics. Fortunately the brain can use visual information, especially its highly precise spatial information, to improve speech comprehension in noisy environments. Previous studies have demonstrated that audiovisual integration depends strongly on spatiotemporal factors. However, some integrative phenomena such as McGurk interference persist even with gross spatial disparities, suggesting that spatial alignment is not necessary for robust integration of audiovisual place-of-articulation cues. It is therefore unclear how speech-cues interact with audiovisual spatial integration mechanisms. Here, we combine two well established psychophysical phenomena, the McGurk effect and the ventriloquist's illusion, to explore this dependency. Our results demonstrate that conflicting spatial cues may not interfere with audiovisual integration of speech, but conflicting speech-cues can impede integration in space. This suggests a direct but asymmetrical influence between ventral 'what' and dorsal 'where' pathways.

  13. Free Speech Yearbook 1978.

    Science.gov (United States)

    Phifer, Gregg, Ed.

    The 17 articles in this collection deal with theoretical and practical freedom of speech issues. The topics include: freedom of speech in Marquette Park, Illinois; Nazis in Skokie, Illinois; freedom of expression in the Confederate States of America; Robert M. LaFollette's arguments for free speech and the rights of Congress; the United States…

  14. Speech versus singing: Infants choose happier sounds

    Directory of Open Access Journals (Sweden)

    Marieve Corbeil

    2013-06-01

    Full Text Available Infants prefer speech to non-vocal sounds and to non-human vocalizations, and they prefer happy-sounding speech to neutral speech. They also exhibit an interest in singing, but there is little knowledge of their relative interest in speech and singing. The present study explored infants’ attention to unfamiliar audio samples of speech and singing. In Experiment 1, infants 4-13 months of age were exposed to happy-sounding infant-directed speech versus hummed lullabies by the same woman. They listened significantly longer to the speech, which had considerably greater acoustic variability and expressiveness, than to the lullabies. In Experiment 2, infants of comparable age who heard the lyrics of a Turkish children’s song spoken versus sung in a joyful/happy manner did not exhibit differential listening. Infants in Experiment 3 heard the happily sung lyrics of the Turkish children’s song versus a version that was spoken in an adult-directed or affectively neutral manner. They listened significantly longer to the sung version. Overall, happy voice quality rather than vocal mode (speech or singing was the principal contributor to infant attention, regardless of age.

  15. The speech signal segmentation algorithm using pitch synchronous analysis

    Directory of Open Access Journals (Sweden)

    Amirgaliyev Yedilkhan

    2017-03-01

    Full Text Available Parameterization of the speech signal using analysis algorithms synchronized with the pitch frequency is discussed. Speech parameterization is performed with the average zero-crossing count function and the signal energy function. The parameterization results are used to segment the speech signal and to isolate segments with stable spectral characteristics. The segmentation results can be used to generate a digital voice pattern of a person or be applied in automatic speech recognition. The stages needed for continuous speech segmentation are described.
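    The two parameterization functions described above can be sketched as follows. This is a hypothetical Python illustration, not the authors' implementation: it uses fixed-length frames rather than pitch-synchronous ones, and a simple threshold rule separates high-energy, low-zero-crossing (voiced-like) frames from the rest.

```python
import numpy as np

def frame_features(x, frame=160):
    """Average zero-crossing count and energy per frame, the two
    parameterization functions described in the abstract."""
    nf = len(x) // frame
    frames = x[: nf * frame].reshape(nf, frame)
    zcr = np.mean(np.abs(np.diff(np.sign(frames), axis=1)) > 0, axis=1)
    energy = np.mean(frames ** 2, axis=1)
    return zcr, energy

# Toy signal: a low-frequency "voiced" stretch followed by a noise-like "unvoiced" one.
sr = 8000
t = np.arange(sr) / sr
voiced = 0.8 * np.sin(2 * np.pi * 120 * t)
unvoiced = 0.1 * np.random.default_rng(0).normal(size=sr)
zcr, energy = frame_features(np.concatenate([voiced, unvoiced]))

# Simple segmentation rule: voiced-like frames have high energy and low ZCR.
is_voiced = (energy > 0.1) & (zcr < 0.2)
print(is_voiced[:5], is_voiced[-5:])
```

    On real speech, runs of frames with stable (ZCR, energy) values would be grouped into the segments with stable spectral characteristics that the paper uses.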

  16. THE ONTOGENESIS OF SPEECH DEVELOPMENT

    Directory of Open Access Journals (Sweden)

    T. E. Braudo

    2017-01-01

    Full Text Available The purpose of this article is to acquaint specialists working with children having developmental disorders with the age-related norms for speech development. Many well-known linguists and psychologists have studied speech ontogenesis (logogenesis). Speech is a higher mental function which integrates many functional systems. Speech development in infants during the first months after birth is ensured by innate hearing and the emerging ability to fix the gaze on the face of an adult. Innate emotional reactions also develop during this period, turning into nonverbal forms of communication. At about 6 months a baby starts to pronounce some syllables; at 7–9 months he or she repeats various sound combinations pronounced by adults. At 10–11 months a baby begins to react to words addressed to him or her. The first words usually appear at the age of 1 year; this is the start of the stage of active speech development. At this time it is acceptable if a child confuses or rearranges sounds, distorts or misses them. By the age of 1.5 years a child begins to understand abstract explanations from adults. Significant vocabulary enlargement occurs between 2 and 3 years; the grammatical structures of the language are formed during this period (a child starts to use phrases and sentences). Preschool age (3–7 y. o.) is characterized by incorrect, but steadily improving, pronunciation of sounds and phonemic perception. The vocabulary increases; abstract speech and retelling are being formed. Children over 7 y. o. continue to improve grammar, writing and reading skills. The described stages may not have strict age boundaries, since they depend not only on the environment, but also on the child’s mental constitution, heredity and character.

  17. From Gesture to Speech

    Directory of Open Access Journals (Sweden)

    Maurizio Gentilucci

    2012-11-01

    Full Text Available One of the major problems concerning the evolution of human language is to understand how sounds became associated with meaningful gestures. It has been proposed that the circuit controlling gestures and speech evolved from a circuit involved in the control of arm and mouth movements related to ingestion. This circuit contributed to the evolution of spoken language, moving from a system of communication based on arm gestures. The discovery of mirror neurons has provided strong support for the gestural theory of speech origin, because they offer a natural substrate for the embodiment of language and create a direct link between the sender and receiver of a message. Behavioural studies indicate that manual gestures are linked to mouth movements used for syllable emission. Grasping with the hand selectively affected movement of inner or outer parts of the mouth according to syllable pronunciation, and hand postures, in addition to hand actions, influenced the control of mouth grasp and vocalization. Gestures and words are also related to each other. It was found that when producing communicative gestures (emblems) the intention to interact directly with a conspecific was transferred from gestures to words, inducing modification in voice parameters. Transfer effects of the meaning of representational gestures were found on both vocalizations and meaningful words. It was concluded that the results of our studies suggest the existence of a system relating gesture to vocalization, which was a precursor of a more general system reciprocally relating gesture to word.

  18. Causes of Speech Disorders in Primary School Students of Zahedan

    Directory of Open Access Journals (Sweden)

    Saeed Fakhrerahimi

    2013-02-01

    Full Text Available Background: Since making communication with others is the most important function of speech, undoubtedly any type of speech disorder will affect a person's ability to communicate with others. The objective of the study was to investigate the reasons behind the [high] prevalence rate of stammering, production disorders and aglossia. Materials and Methods: This descriptive-analytical study was conducted on 118 male and female primary school students in Zahedan who had been referred to the Speech Therapy Centers of Zahedan University of Medical Sciences over a period of seven months. Speech therapist examinations, diagnostic tools common in speech therapy, the Spielberg Children Trait and also patients' case records were used to find the reasons behind the [high] prevalence rate of speech disorders. Results: Psychological causes had the highest correlation with the speech disorders among the factors affecting them. After psychological causes, family history and the age of the subjects were the other factors that may bring about speech disorders (P<0.05). Bilingualism and birth order had a negative relationship with the speech disorders. Likewise, another result of this study shows that only psychological causes, social causes, hereditary causes and the age of subjects can predict the speech disorders (P<0.05). Conclusion: The present study shows that the speech disorders have a strong and close relationship with psychological causes in the first place, and also with family history and the age of individuals.

  19. Speech in spinocerebellar ataxia.

    Science.gov (United States)

    Schalling, Ellika; Hartelius, Lena

    2013-12-01

    Spinocerebellar ataxias (SCAs) are a heterogeneous group of autosomal dominant cerebellar ataxias clinically characterized by progressive ataxia, dysarthria and a range of other concomitant neurological symptoms. Only a few studies include detailed characterization of speech symptoms in SCA. Speech symptoms in SCA resemble ataxic dysarthria but symptoms related to phonation may be more prominent. One study to date has shown an association between differences in speech and voice symptoms related to genotype. More studies of speech and voice phenotypes are motivated, to possibly aid in clinical diagnosis. In addition, instrumental speech analysis has been demonstrated to be a reliable measure that may be used to monitor disease progression or therapy outcomes in possible future pharmacological treatments. Intervention by speech and language pathologists should go beyond assessment. Clinical guidelines for management of speech, communication and swallowing need to be developed for individuals with progressive cerebellar ataxia. Copyright © 2013 Elsevier Inc. All rights reserved.

  20. Digital speech processing using Matlab

    CERN Document Server

    Gopi, E S

    2014-01-01

    Digital Speech Processing Using Matlab deals with digital speech pattern recognition, speech production model, speech feature extraction, and speech compression. The book is written in a manner that is suitable for beginners pursuing basic research in digital speech processing. Matlab illustrations are provided for most topics to enable better understanding of concepts. This book also deals with the basic pattern recognition techniques (illustrated with speech signals using Matlab) such as PCA, LDA, ICA, SVM, HMM, GMM, BPN, and KSOM.

  1. Study Guide for Teacher Certification Test in Speech and Language Pathology.

    Science.gov (United States)

    Umberger, Forrest G.

    This study guide is designed for individuals preparing to take the Georgia Teacher Certification Test (TCT) in speech and language pathology. The test covers five subareas: (1) fundamentals of speech and language; (2) speech and language disorders; (3) related handicapping conditions; (4) hearing impairment; and (5) program management and…

  2. The Functional Connectome of Speech Control.

    Directory of Open Access Journals (Sweden)

    Stefan Fuertinger

    2015-07-01

    Full Text Available In the past few years, several studies have been directed to understanding the complexity of functional interactions between different brain regions during various human behaviors. Among these, neuroimaging research established the notion that speech and language require an orchestration of brain regions for comprehension, planning, and integration of a heard sound with a spoken word. However, these studies have been largely limited to mapping the neural correlates of separate speech elements and examining distinct cortical or subcortical circuits involved in different aspects of speech control. As a result, the complexity of the brain network machinery controlling speech and language remained largely unknown. Using graph theoretical analysis of functional MRI (fMRI) data in healthy subjects, we quantified the large-scale speech network topology by constructing functional brain networks of increasing hierarchy, from the resting state, to motor output of meaningless syllables, to complex production of real-life speech, and compared these to non-speech-related sequential finger tapping and pure tone discrimination networks. We identified a segregated network of highly connected local neural communities (hubs) in the primary sensorimotor and parietal regions, which formed a commonly shared core hub network across the examined conditions, with the left area 4p playing an important role in speech network organization. These sensorimotor core hubs exhibited features of flexible hubs, based on their participation in several functional domains across different networks and their ability to adaptively switch long-range functional connectivity depending on task content, resulting in a distinct community structure of each examined network. Specifically, compared to other tasks, speech production was characterized by the formation of six distinct neural communities with specialized recruitment of the prefrontal cortex, insula, putamen, and thalamus, which collectively
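    The notion of a hub used in this kind of graph theoretical analysis can be made concrete with a small sketch. The connectivity matrix below is synthetic, and the one-standard-deviation degree threshold is just one common operational definition of a hub; this is not the authors' fMRI pipeline.

```python
import numpy as np

rng = np.random.default_rng(2)
n = 20
# Hypothetical functional connectivity matrix (e.g., thresholded fMRI correlations).
A = (rng.random((n, n)) < 0.15).astype(int)
A = np.triu(A, 1)        # keep the upper triangle only
A[0, 1:8] = 1            # wire node 0 densely so it behaves like a hub
A = A + A.T              # undirected graph: symmetrize

degree = A.sum(axis=0)
# Operational definition: hubs are nodes whose degree exceeds the
# network mean by at least one standard deviation.
hubs = np.flatnonzero(degree > degree.mean() + degree.std())
print(degree[0], hubs)
```

    Graph-theoretic studies of the kind described above typically combine such degree-based measures with participation coefficients across communities to distinguish flexible hubs from purely local ones.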

  3. Characterization of authorship speeches in classroom

    Directory of Open Access Journals (Sweden)

    Daniella de Almeida Santos

    2007-08-01

    Full Text Available Our paper discusses how the teacher's speech can interfere with the construction of arguments by students engaged in the task of solving an experimental problem in science classes. We wanted to understand how teacher and students relate to each other in a discursive movement to structure the meaning of the experimental data obtained. Accordingly, our focus is on the processes of authorship in speech, both the students' and the teacher's, in episodes in which the actors of the teaching and learning process organize their speech, mediated by the experimental activity.

  4. Didactic speech synthesizer – acoustic module, formants model

    OpenAIRE

    Teixeira, João Paulo; Fernandes, Anildo

    2013-01-01

    Text-to-speech synthesis is the main subject of this work. It presents the constitution of a generic text-to-speech conversion system, explains the functions of its various modules, and describes development techniques based on the formants model. The development of a didactic formant synthesiser in the Matlab environment is also described. This synthesiser is intended to support a didactic understanding of the formant model of speech production.

  5. Atypical speech versus non-speech detection and discrimination in 4- to 6-yr-old children with autism spectrum disorder: An ERP study.

    Directory of Open Access Journals (Sweden)

    Alena Galilee

    Full Text Available Previous event-related potential (ERP research utilizing oddball stimulus paradigms suggests diminished processing of speech versus non-speech sounds in children with an Autism Spectrum Disorder (ASD. However, brain mechanisms underlying these speech processing abnormalities, and to what extent they are related to poor language abilities in this population remain unknown. In the current study, we utilized a novel paired repetition paradigm in order to investigate ERP responses associated with the detection and discrimination of speech and non-speech sounds in 4- to 6-year old children with ASD, compared with gender and verbal age matched controls. ERPs were recorded while children passively listened to pairs of stimuli that were either both speech sounds, both non-speech sounds, speech followed by non-speech, or non-speech followed by speech. Control participants exhibited N330 match/mismatch responses measured from temporal electrodes, reflecting speech versus non-speech detection, bilaterally, whereas children with ASD exhibited this effect only over temporal electrodes in the left hemisphere. Furthermore, while the control groups exhibited match/mismatch effects at approximately 600 ms (central N600, temporal P600 when a non-speech sound was followed by a speech sound, these effects were absent in the ASD group. These findings suggest that children with ASD fail to activate right hemisphere mechanisms, likely associated with social or emotional aspects of speech detection, when distinguishing non-speech from speech stimuli. Together, these results demonstrate the presence of atypical speech versus non-speech processing in children with ASD when compared with typically developing children matched on verbal age.

  6. SPEECH TACTICS IN MASS MEDIA DISCOURSE

    Directory of Open Access Journals (Sweden)

    Olena Kaptiurova

    2014-06-01

    Full Text Available The article deals with the basic speech tactics used in mass media discourse. It is argued that tactics such as establishing contact, terminating speech interaction, and yielding or preserving the initiative are obligatory in the communicative situation of a talk show. The language personalities of television talk show anchors and the linguistic organisation of the interview are emphasised. The material is amply illustrated with relevant examples.

  7. Free Speech as a Cultural Value in the United States

    Directory of Open Access Journals (Sweden)

    Mauricio J. Alvarez

    2018-02-01

    Full Text Available Political orientation influences support for free speech, with liberals often reporting greater support for free speech than conservatives. We hypothesized that this effect should be moderated by cultural context: individualist cultures value individual self-expression and self-determination, and collectivist cultures value group harmony and conformity. These different foci should differently influence liberals and conservatives’ support for free speech within these cultures. Two studies evaluated the joint influence of political orientation and cultural context on support for free speech. Study 1, using a multilevel analysis of data from 37 U.S. states (n = 1,001, showed that conservatives report stronger support for free speech in collectivist states, whereas there were no differences between conservatives and liberals in support for free speech in individualist states. Study 2 (n = 90 confirmed this pattern by priming independent and interdependent self-construals in liberals and conservatives. Results demonstrate the importance of cultural context for free speech. Findings suggest that in the U.S. support for free speech might be embraced for different reasons: conservatives’ support for free speech appears to be motivated by a focus on collectively held values favoring free speech, while liberals’ support for free speech might be motivated by a focus on individualist self-expression.

  8. Speech Alarms Pilot Study

    Science.gov (United States)

    Sandor, Aniko; Moses, Haifa

    2016-01-01

    Speech alarms have been used extensively in aviation and included in International Building Codes (IBC) and National Fire Protection Association's (NFPA) Life Safety Code. However, they have not been implemented on space vehicles. Previous studies conducted at NASA JSC showed that speech alarms lead to faster identification and higher accuracy. This research evaluated updated speech and tone alerts in a laboratory environment and in the Human Exploration Research Analog (HERA) in a realistic setup.

  9. ACOUSTIC SPEECH RECOGNITION FOR MARATHI LANGUAGE USING SPHINX

    Directory of Open Access Journals (Sweden)

    Aman Ankit

    2016-09-01

    Full Text Available Speech recognition, or speech-to-text processing, is the process by which a computer recognizes human speech and converts it into text. In speech recognition, transcripts are created by taking recordings of speech as audio together with their text transcriptions. Speech-based applications that include Natural Language Processing (NLP) techniques are popular and an active area of research; input to such applications is in natural language, and output is obtained in natural language. Speech recognition mostly revolves around three approaches: the acoustic-phonetic approach, the pattern recognition approach, and the artificial intelligence approach. Creation of an acoustic model requires a large database of speech and training algorithms. The output of an ASR system is the recognition and translation of spoken language into text by computers and computerized devices. ASR today finds enormous application in tasks that require human-machine interfaces, such as voice dialing. Our key contribution in this paper is to create corpora for the Marathi language and explore the use of the Sphinx engine for automatic speech recognition.

  10. Speech Enhancement by MAP Spectral Amplitude Estimation Using a Super-Gaussian Speech Model

    Directory of Open Access Journals (Sweden)

    Lotter Thomas

    2005-01-01

    Full Text Available This contribution presents two spectral amplitude estimators for acoustical background noise suppression based on maximum a posteriori estimation and super-Gaussian statistical modelling of the speech DFT amplitudes. The probability density function of the speech spectral amplitude is modelled with a simple parametric function, which allows a high approximation accuracy for Laplace- or Gamma-distributed real and imaginary parts of the speech DFT coefficients. Also, the statistical model can be adapted to optimally fit the distribution of the speech spectral amplitudes for a specific noise reduction system. Based on the super-Gaussian statistical model, computationally efficient maximum a posteriori speech estimators are derived, which outperform the commonly applied Ephraim-Malah algorithm.
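The noise-suppression setting described above can be sketched with a much simpler estimator. The following is a hedged illustration of a basic power-spectral-subtraction gain (not the super-Gaussian MAP estimator derived in the paper), applied to a toy tone-plus-noise signal with a noise estimate taken from the first frames.

```python
import numpy as np

rng = np.random.default_rng(1)
fs = 8000
t = np.arange(fs) / fs                         # 1 s of signal
clean = np.sin(2 * np.pi * 440 * t)            # toy "speech": a pure tone
noisy = clean + 0.5 * rng.standard_normal(fs)  # additive white noise

frame, hop = 256, 128
win = np.hanning(frame)

def stft(x):
    """Short-time FFT: one windowed frame per row."""
    starts = range(0, len(x) - frame, hop)
    return np.array([np.fft.rfft(win * x[i:i + frame]) for i in starts])

X = stft(noisy)

# Noise PSD estimate from the first frames (in practice speech-free frames
# would be used; this choice is purely illustrative).
noise_psd = (np.abs(X[:5]) ** 2).mean(axis=0)

# Power-subtraction gain with a spectral floor to limit musical noise.
gain = np.maximum(1 - noise_psd / (np.abs(X) ** 2 + 1e-12), 0.05)

# Apply the amplitude gain, keeping the noisy phase.
S_hat = np.sqrt(gain) * X
```

MAP estimators such as those in the paper replace this heuristic gain with one derived from an explicit prior on the speech spectral amplitudes; the surrounding STFT machinery stays the same.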

  11. Texts, Transmissions, Receptions. Modern Approaches to Narratives

    NARCIS (Netherlands)

    Lardinois, A.P.M.H.; Levie, S.A.; Hoeken, H.; Lüthy, C.H.

    2015-01-01

    The papers collected in this volume study the function and meaning of narrative texts from a variety of perspectives. The word 'text' is used here in the broadest sense of the term: it denotes literary books, but also oral tales, speeches, newspaper articles and comics. One of the purposes of this

  12. Towards dynamic interorganizational business process management (text keynote speech)

    NARCIS (Netherlands)

    Grefen, P.W.P.J.; Reddy, S.M.

    2006-01-01

    In the modern day business world, we see more and more complex collaboration scenarios in which multiple autonomous organizations work together. From a business perspective, we distinguish between horizontal and vertical relationships. We argue that the concept of business process is central in both

  13. Indian accent text-to-speech system for web browsing

    Indian Academy of Sciences (India)


    In fact, this itself is a strong motivation for using 'Indian. English', rather ... Currently, we are making a statistical analysis of letter clusters present in ... This, coupled with grammatical validity check for noun, will improve this decision in future.

  14. Ear, Hearing and Speech

    DEFF Research Database (Denmark)

    Poulsen, Torben

    2000-01-01

    An introduction is given to the anatomy and function of the ear, basic psychoacoustic matters (hearing threshold, loudness, masking), the speech signal and speech intelligibility. The lecture note is written for the course: Fundamentals of Acoustics and Noise Control (51001).

  15. Principles of speech coding

    CERN Document Server

    Ogunfunmi, Tokunbo

    2010-01-01

    It is becoming increasingly apparent that all forms of communication-including voice-will be transmitted through packet-switched networks based on the Internet Protocol (IP). Therefore, the design of modern devices that rely on speech interfaces, such as cell phones and PDAs, requires a complete and up-to-date understanding of the basics of speech coding. Outlines key signal processing algorithms used to mitigate impairments to speech quality in VoIP networksOffering a detailed yet easily accessible introduction to the field, Principles of Speech Coding provides an in-depth examination of the

  16. Chinese legal texts – Quantitative Description

    Directory of Open Access Journals (Sweden)

    Ľuboš GAJDOŠ

    2017-06-01

    Full Text Available The aim of the paper is to provide a quantitative description of legal Chinese. The study adopts a corpus-based approach and reports basic statistical parameters of legal texts in Chinese, namely sentence length, the proportions of parts of speech, etc. The research is conducted on the Chinese monolingual corpus Hanku. The paper also discusses issues of statistical data processing from various corpora, e.g. tokenisation and part-of-speech tagging, and their relevance to the study of register variation.
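The statistics named above, sentence length and part-of-speech proportions, reduce to simple counting once a tagged corpus is available. A minimal sketch on an invented English toy sample (the actual study uses the Chinese corpus Hanku):

```python
from collections import Counter

# A toy POS-tagged corpus of (token, tag) pairs, a stand-in for real tagged legal text.
tagged = [("the", "DET"), ("court", "NOUN"), ("shall", "AUX"), ("decide", "VERB"),
          (".", "PUNCT"), ("appeals", "NOUN"), ("are", "AUX"), ("allowed", "VERB"),
          (".", "PUNCT")]

# Sentence lengths in tokens, using "." as the sentence terminator.
lengths, cur = [], 0
for tok, tag in tagged:
    if tok == ".":
        lengths.append(cur)
        cur = 0
    else:
        cur += 1

# Proportion of each part of speech over non-punctuation tokens.
counts = Counter(tag for tok, tag in tagged if tag != "PUNCT")
total = sum(counts.values())
pos_share = {tag: c / total for tag, c in counts.items()}
```

For Chinese the hard part is upstream, as the abstract notes: tokenisation (word segmentation) and tagging determine what gets counted, which is why the paper discusses them at length.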

  17. Robust digital processing of speech signals

    CERN Document Server

    Kovacevic, Branko; Veinović, Mladen; Marković, Milan

    2017-01-01

    This book focuses on speech signal phenomena, presenting a robustification of the usual speech generation models with regard to the presumed types of excitation signals, which is equivalent to the introduction of a class of nonlinear models and the corresponding criterion functions for parameter estimation. Compared to the general class of nonlinear models, such as various neural networks, these models possess good properties of controlled complexity, the option of working in “online” mode, as well as a low information volume for efficient speech encoding and transmission. Providing comprehensive insights, the book is based on the authors’ research, which has already been published, supplemented by additional texts discussing general considerations of speech modeling, linear predictive analysis and robust parameter estimation.

  18. Designing the Database of Speech Under Stress

    Directory of Open Access Journals (Sweden)

    Sabo Róbert

    2017-12-01

    Full Text Available This study describes the methodology used for designing a database of speech under real stress. Based on the limits of existing stress databases, we used a communication task via a computer game to collect speech data. To validate the presence of stress, known psychophysiological indicators such as heart rate and electrodermal activity, as well as subjective self-assessment, were used. This paper presents the data from the first 5 speakers (3 men, 2 women) who participated in initial tests of the proposed design. In 4 out of 5 speakers, increases in fundamental frequency and intensity of speech were registered. Similarly, in 4 out of 5 speakers, heart rate was significantly increased during the task compared with a reference measurement taken before the task. These first results show that the proposed design may be appropriate for building a speech-under-stress database. However, there are still considerations that need to be addressed.

  19. Perceptual evaluation of corpus-based speech synthesis techniques in under-resourced environments

    CSIR Research Space (South Africa)

    Van Niekerk, DR

    2009-11-01

    Full Text Available With the increasing prominence and maturity of corpus-based techniques for speech synthesis, the process of system development has in some ways been simplified considerably. However, the dependence on sufficient amounts of relevant speech data...

  20. Collective speech acts

    NARCIS (Netherlands)

    Meijers, A.W.M.; Tsohatzidis, S.L.

    2007-01-01

    From its early development in the 1960s, speech act theory always had an individualistic orientation. It focused exclusively on speech acts performed by individual agents. Paradigmatic examples are ‘I promise that p’, ‘I order that p’, and ‘I declare that p’. There is a single speaker and a single

  1. Private Speech in Ballet

    Science.gov (United States)

    Johnston, Dale

    2006-01-01

    Authoritarian teaching practices in ballet inhibit the use of private speech. This paper highlights the critical importance of private speech in the cognitive development of young ballet students, within what is largely a non-verbal art form. It draws upon research by Russian psychologist Lev Vygotsky and contemporary socioculturalists, to…

  2. Free Speech Yearbook 1980.

    Science.gov (United States)

    Kane, Peter E., Ed.

    The 11 articles in this collection deal with theoretical and practical freedom of speech issues. The topics covered are (1) the United States Supreme Court and communication theory; (2) truth, knowledge, and a democratic respect for diversity; (3) denial of freedom of speech in Jock Yablonski's campaign for the presidency of the United Mine…

  3. Free Speech. No. 38.

    Science.gov (United States)

    Kane, Peter E., Ed.

    This issue of "Free Speech" contains the following articles: "Daniel Schoor Relieved of Reporting Duties" by Laurence Stern, "The Sellout at CBS" by Michael Harrington, "Defending Dan Schorr" by Tome Wicker, "Speech to the Washington Press Club, February 25, 1976" by Daniel Schorr, "Funds…

  4. Speech parts as Poisson processes.

    Science.gov (United States)

    Badalamenti, A F

    2001-09-01

    This paper presents evidence that six of the seven parts of speech occur in written text as Poisson processes, simple or recurring. The six major parts are nouns, verbs, adjectives, adverbs, prepositions, and conjunctions, with the interjection occurring too infrequently to support a model. The data consist of more than the first 5000 words of works by four major authors coded to label the parts of speech, as well as periods (sentence terminators). Sentence length is measured via the period and found to be normally distributed with no stochastic model identified for its occurrence. The models for all six speech parts but the noun significantly distinguish some pairs of authors and likewise for the joint use of all words types. Any one author is significantly distinguished from any other by at least one word type and sentence length very significantly distinguishes each from all others. The variety of word type use, measured by Shannon entropy, builds to about 90% of its maximum possible value. The rate constants for nouns are close to the fractions of maximum entropy achieved. This finding together with the stochastic models and the relations among them suggest that the noun may be a primitive organizer of written text.
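The two ingredients of the analysis above, inter-arrival gaps for a given part of speech and the Shannon entropy of word-type use, can be computed directly from a coded text. The tag sequence below is invented for illustration; the study used works by four major authors.

```python
import math
from collections import Counter

# Toy part-of-speech sequence, a stand-in for a POS-coded text.
tags = ["NOUN", "VERB", "NOUN", "ADJ", "NOUN", "PREP", "NOUN", "VERB",
        "NOUN", "ADV", "CONJ", "NOUN"]

# Inter-arrival gaps (in tokens) between successive nouns: under a simple
# Poisson-process model these gaps would be exponentially/geometrically distributed.
positions = [i for i, t in enumerate(tags) if t == "NOUN"]
gaps = [b - a for a, b in zip(positions, positions[1:])]

# Shannon entropy of the tag distribution, in bits, versus its maximum.
counts = Counter(tags)
n = len(tags)
entropy = -sum((c / n) * math.log2(c / n) for c in counts.values())
max_entropy = math.log2(len(counts))
```

Fitting the Poisson model then amounts to checking the gap distribution against an exponential with the tag's empirical rate, e.g. with a goodness-of-fit test.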

  5. THE BASIS FOR SPEECH PREVENTION

    Directory of Open Access Journals (Sweden)

    Jordan JORDANOVSKI

    1997-06-01

    Full Text Available Speech is a tool for the accurate communication of ideas. When we talk about speech prevention as a practical realization of language, we mean that it should satisfy the criteria of the language standard. This criterion, in the broad sense of the word, presupposes an exact realization of the thought exchanged between the speaker and the recipient. The absence of this criterion is evident in the practical realization of language and brings consequences, often hidden deep in the human psyche, whose outer manifestation already represents a delayed reaction of the social environment. The foundation for overcoming and standardizing this phenomenon must be the anatomical-physiological patterns of the body, addressed through methods in concordance with the nature of the body.

  6. THE MEANING OF SPEECH THERAPY PREVENTION AS AN IMPORTANT FACTOR FOR THE PROPER DEVELOPMENT OF CHILDREN'S SPEECH

    Directory of Open Access Journals (Sweden)

    S. FILIPOVA

    1999-11-01

    Full Text Available The paper presents some considerations and results from completed research showing the importance of speech therapy prevention for the development of speech. The research was carried out in Negotino and identifies the most frequent speech deficiencies among children of preschool age.

  7. Eigennoise Speech Recovery in Adverse Environments with Joint Compensation of Additive and Convolutive Noise

    Directory of Open Access Journals (Sweden)

    Trung-Nghia Phung

    2015-01-01

    Full Text Available The learning-based speech recovery approach using statistical spectral conversion has been used for certain kinds of distorted speech, such as alaryngeal speech and body-conducted (bone-conducted) speech. This approach attempts to recover clean (undistorted) speech from noisy (distorted) speech by converting the statistical models of noisy speech into those of clean speech, without prior knowledge of the characteristics and distributions of the noise source. So far, this approach has not been widely applied to general noisy speech enhancement because of two major problems: the difficulty of noise adaptation and the lack of noise-robust synthesizable features in different noisy environments. In this paper, we adapted state-of-the-art methods of voice conversion and of speaker adaptation in speech recognition to the proposed speech recovery approach in different kinds of noisy environments, especially adverse environments with joint compensation of additive and convolutive noise. We proposed using decorrelated wavelet packet coefficients as a low-dimensional feature that remains robust and synthesizable under noisy conditions, together with a noise adaptation for speech recovery based on an eigennoise, analogous to the eigenvoice in voice conversion. The experimental results showed that the proposed approach substantially outperformed traditional non-learning-based approaches.

  8. Speech recognition: impact on workflow and report availability

    International Nuclear Information System (INIS)

    Glaser, C.; Trumm, C.; Nissen-Meyer, S.; Francke, M.; Kuettner, B.; Reiser, M.

    2005-01-01

    With ongoing technical refinements, speech recognition systems (SRS) are becoming an increasingly attractive alternative to traditional methods of preparing and transcribing medical reports. The two main components of any SRS are the acoustic model and the language model. Features of modern SRS with continuous speech recognition are macros with individually definable texts and report templates, as well as the option to navigate in a text or to control SRS or RIS functions by speech recognition. The greatest benefit from an SRS is obtained when it is integrated into a RIS/RIS-PACS installation. Report availability and time efficiency of the reporting process (related to recognition rate and the time spent editing and correcting a report) are the principal determinants of the clinical performance of any SRS. For practical purposes the recognition rate is estimated by the error rate (unit: "word"). Error rates range from 4 to 28%. Roughly 20% of them are errors in the vocabulary, which may result in clinically relevant misinterpretation. It is thus mandatory to thoroughly correct any transcribed text as well as to continuously train and adapt the SRS vocabulary. The implementation of SRS dramatically improves report availability. This is most pronounced for CT and CR. However, the individual time expenditure for (SRS-based) reporting increased by 20-25% (CR), and according to literature data there is an increase of 30% for CT and MRI. The extent to which the transcription staff profits from SRS depends largely on its qualification. Online dictation implies a workload shift from the transcription staff to the reporting radiologist.
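The word-level error rate mentioned above is conventionally computed as a word error rate (WER) via edit distance between the reference and the recognized text. A minimal sketch, with an invented report phrase:

```python
def word_error_rate(ref, hyp):
    """WER = (substitutions + deletions + insertions) / reference length,
    computed with a standard Levenshtein dynamic program over words."""
    r, h = ref.split(), hyp.split()
    d = [[0] * (len(h) + 1) for _ in range(len(r) + 1)]
    for i in range(len(r) + 1):
        d[i][0] = i
    for j in range(len(h) + 1):
        d[0][j] = j
    for i in range(1, len(r) + 1):
        for j in range(1, len(h) + 1):
            cost = 0 if r[i - 1] == h[j - 1] else 1
            d[i][j] = min(d[i - 1][j] + 1,          # deletion
                          d[i][j - 1] + 1,          # insertion
                          d[i - 1][j - 1] + cost)   # substitution or match
    return d[-1][-1] / len(r)

# Hypothetical dictated phrase vs. SRS output: one substitution, one insertion.
wer = word_error_rate("no focal lesion seen", "no vocal lesion is seen")
```

Note the "focal"/"vocal" confusion illustrates exactly the clinically relevant vocabulary errors the abstract warns about: the WER contribution is small, but the change in meaning is not.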

  9. Musician advantage for speech-on-speech perception

    NARCIS (Netherlands)

    Başkent, Deniz; Gaudrain, Etienne

    Evidence for transfer of musical training to better perception of speech in noise has been mixed. Unlike speech-in-noise, speech-on-speech perception utilizes many of the skills that musical training improves, such as better pitch perception and stream segregation, as well as use of higher-level

  10. Speech Production and Speech Discrimination by Hearing-Impaired Children.

    Science.gov (United States)

    Novelli-Olmstead, Tina; Ling, Daniel

    1984-01-01

    Seven hearing impaired children (five to seven years old) assigned to the Speakers group made highly significant gains in speech production and auditory discrimination of speech, while Listeners made only slight speech production gains and no gains in auditory discrimination. Combined speech and auditory training was more effective than auditory…

  11. Segmental intelligibility of synthetic speech produced by rule.

    Science.gov (United States)

    Logan, J S; Greene, B G; Pisoni, D B

    1989-08-01

    This paper reports the results of an investigation that employed the modified rhyme test (MRT) to measure the segmental intelligibility of synthetic speech generated automatically by rule. Synthetic speech produced by ten text-to-speech systems was studied and compared to natural speech. A variation of the standard MRT was also used to study the effects of response set size on perceptual confusions. Results indicated that the segmental intelligibility scores formed a continuum. Several systems displayed very high levels of performance that were close to or equal to scores obtained with natural speech; other systems displayed substantially worse performance compared to natural speech. The overall performance of the best system, DECtalk--Paul, was equivalent to the data obtained with natural speech for consonants in syllable-initial position. The findings from this study are discussed in terms of the use of a set of standardized procedures for measuring intelligibility of synthetic speech under controlled laboratory conditions. Recent work investigating the perception of synthetic speech under more severe conditions in which greater demands are made on the listener's processing resources is also considered. The wide range of intelligibility scores obtained in the present study demonstrates important differences in perception and suggests that not all synthetic speech is perceptually equivalent to the listener.

  12. Segmental intelligibility of synthetic speech produced by rule

    Science.gov (United States)

    Logan, John S.; Greene, Beth G.; Pisoni, David B.

    2012-01-01

    This paper reports the results of an investigation that employed the modified rhyme test (MRT) to measure the segmental intelligibility of synthetic speech generated automatically by rule. Synthetic speech produced by ten text-to-speech systems was studied and compared to natural speech. A variation of the standard MRT was also used to study the effects of response set size on perceptual confusions. Results indicated that the segmental intelligibility scores formed a continuum. Several systems displayed very high levels of performance that were close to or equal to scores obtained with natural speech; other systems displayed substantially worse performance compared to natural speech. The overall performance of the best system, DECtalk—Paul, was equivalent to the data obtained with natural speech for consonants in syllable-initial position. The findings from this study are discussed in terms of the use of a set of standardized procedures for measuring intelligibility of synthetic speech under controlled laboratory conditions. Recent work investigating the perception of synthetic speech under more severe conditions in which greater demands are made on the listener’s processing resources is also considered. The wide range of intelligibility scores obtained in the present study demonstrates important differences in perception and suggests that not all synthetic speech is perceptually equivalent to the listener. PMID:2527884

  13. Speech-in-speech perception and executive function involvement.

    Directory of Open Access Journals (Sweden)

    Marcela Perrone-Bertolotti

    Full Text Available The present study investigated the link between speech-in-speech perception capacities and four executive function components: response suppression, inhibitory control, switching, and working memory. We constructed a cross-modal semantic priming paradigm using a written target word and a spoken prime word, the latter embedded in one of two concurrent auditory sentences (cocktail party situation). The prime and target were semantically related or unrelated. Participants had to perform a lexical decision task on visual target words while listening to only one of the two pronounced sentences. The attention of the participant was manipulated: the prime was either in the sentence attended to by the participant or in the ignored one. In addition, we evaluated the executive function abilities of participants (switching cost, inhibitory-control cost and response-suppression cost) and their working memory span. Correlation analyses were performed between the executive and priming measurements. Our results showed a significant interaction between attention and semantic priming: we observed a significant priming effect in the attended but not in the ignored condition. Only priming effects obtained in the ignored condition were significantly correlated with some of the executive measurements; no correlation between priming effects and working memory capacity was found. Overall, these results confirm, first, the role of attention in the semantic priming effect and, second, the involvement of executive functions in speech-in-noise understanding capacities.

  14. Acoustic cues identifying phonetic transitions for speech segmentation

    CSIR Research Space (South Africa)

    Van Niekerk, DR

    2008-11-01

    Full Text Available The quality of corpus-based text-to-speech (TTS) systems depends strongly on the consistency of boundary placements during phonetic alignments. Expert human transcribers use visually represented acoustic cues in order to consistently place...

  15. Expression of future prospective in indirect speech

    Directory of Open Access Journals (Sweden)

    Bodnaruk Elena Vladimirovna

    2015-03-01

    Full Text Available The article analyzes the grammatical, lexical and lexico-grammatical means used to create a future perspective in indirect speech, and the characteristics of their use. The material for the study was epic fiction by contemporary German writers. The analysis of the empirical material shows that indirect discourse has a preterite basis and is the most frequent kind of inner speech of characters. The most widely used form with future semantics in preterite indirect speech is conditional I, which formally has a subjunctive basis but is mostly used with indicative semantics. A competitor to conditional I in indirect speech is the preterite indicative. A characteristic feature of indirect speech is the use of modal verbs which, thanks to their semantics, usually refer to an action at a later time, creating a future perspective in the utterance. The most frequent were the modal verbs wollen and sollen in the preterite form; the verbs müssen and können were rarer. German indirect speech is also distinguished by the use of forms on a subjunctive basis: the preterite and pluperfect subjunctive. Both forms express values similar to those of the indicative, but carry a slightly more pronounced seme of uncertainty accompanying the future uses of these forms in indirect speech. In addition, the pluperfect subjunctive differs from the others by the presence of a seme of completeness.

  16. SUSTAINABILITY IN THE BOWELS OF SPEECHES

    Directory of Open Access Journals (Sweden)

    Jadir Mauro Galvao

    2012-10-01

    Full Text Available The theme of sustainability has not yet become an integral part of the theoretical repertoire behind our most everyday actions, though it often visits our thoughts and permeates many of our speeches. The big event of 2012, the Rio+20 meeting, gathered glances from all corners of the planet around this burning theme, yet we still move forward timidly. Although it is not very clear what the term sustainability encompasses, it does not sound entirely strange: we associate it with things like ecology, the planet, waste emitted by factory smokestacks, deforestation, recycling and global warming. Our goal in this article, however, is less to clarify the term conceptually and more to observe how it appears in the speeches of that conference. When the competent authorities talk about sustainability, what do they relate it to? We intend to investigate, in the lines and between the lines of these speeches, the assumptions associated with the term. We will therefore analyze the speech of the People's Summit, the opening speech of President Dilma, and the emblematic speech of the President of Uruguay, José Pepe Mujica.

  17. THE USE OF EXPRESSIVE SPEECH ACTS IN HANNAH MONTANA SESSION 1

    Directory of Open Access Journals (Sweden)

    Nur Vita Handayani

    2015-07-01

    Full Text Available This study aims to describe the kinds and forms of expressive speech acts in Hannah Montana Session 1. It uses a descriptive qualitative method. The research object was expressive speech acts. The data source was utterances containing expressive speech acts in the film Hannah Montana Session 1. The researcher used the observation method and a noting technique in collecting the data. In analyzing the data, a descriptive qualitative method was used. The research findings show that there are ten kinds of expressive speech acts found in Hannah Montana Session 1, namely expressing apology, expressing thanks, expressing sympathy, expressing attitudes, expressing greeting, expressing wishes, expressing joy, expressing pain, expressing likes, and expressing dislikes. The forms of expressive speech acts are direct literal, direct non-literal, indirect literal, and indirect non-literal expressive speech acts.

  18. Rate and rhythm control strategies for apraxia of speech in nonfluent primary progressive aphasia

    Directory of Open Access Journals (Sweden)

    Bárbara Costa Beber

    Full Text Available ABSTRACT The nonfluent/agrammatic variant of primary progressive aphasia is characterized by apraxia of speech and agrammatism. Apraxia of speech limits patients' communication due to slow speaking rate, sound substitutions, articulatory groping, false starts and restarts, segmentation of syllables, and increased difficulty with increasing utterance length. Speech and language therapy is known to benefit individuals with apraxia of speech due to stroke, but little is known about its effects in primary progressive aphasia. This is a case report of a 72-year-old, illiterate housewife, who was diagnosed with nonfluent primary progressive aphasia and received speech and language therapy for apraxia of speech. Rate and rhythm control strategies for apraxia of speech were trained to improve initiation of speech. We discuss the importance of these strategies to alleviate apraxia of speech in this condition and the future perspectives in the area.

  19. Analysis of Feature Extraction Methods for Speaker Dependent Speech Recognition

    Directory of Open Access Journals (Sweden)

    Gurpreet Kaur

    2017-02-01

    Full Text Available Speech recognition is about what is being said, irrespective of who is saying it. Speech recognition is a growing field, and major progress is taking place in the technology of automatic speech recognition (ASR). Still, there are many barriers in this field in terms of recognition rate, background noise, speaker variability, speaking rate, accent, etc. The speech recognition rate mainly depends on the selection of features and feature extraction methods. This paper outlines feature extraction techniques for speaker-dependent speech recognition of isolated words. A brief survey of different feature extraction techniques, such as Mel-Frequency Cepstral Coefficients (MFCC), Linear Predictive Coding Coefficients (LPCC), Perceptual Linear Prediction (PLP), and Relative Spectral Perceptual Linear Prediction (RASTA-PLP) analysis, is presented and evaluated. Speech recognition has various applications from daily use to commercial use. We have made a speaker-dependent system that can be useful in many areas, such as controlling a patient's vehicle using simple commands.
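
    As a rough illustration of what one of these front-ends computes, here is a minimal single-frame MFCC-style sketch in numpy. It is illustrative only: filterbank size, frame length, and normalization are simplified relative to standard toolkit implementations.

    ```python
    import numpy as np

    def hz_to_mel(f):
        return 2595.0 * np.log10(1.0 + f / 700.0)

    def mel_to_hz(m):
        return 700.0 * (10.0 ** (m / 2595.0) - 1.0)

    def mfcc_frame(frame, sr, n_filters=26, n_ceps=13):
        """MFCC-style coefficients for a single frame (simplified)."""
        # Power spectrum of the Hamming-windowed frame
        spectrum = np.abs(np.fft.rfft(frame * np.hamming(len(frame)))) ** 2
        n_bins = len(spectrum)
        # Triangular mel filterbank between 0 Hz and Nyquist
        mel_pts = np.linspace(hz_to_mel(0), hz_to_mel(sr / 2), n_filters + 2)
        bin_pts = np.floor((len(frame) + 1) * mel_to_hz(mel_pts) / sr).astype(int)
        fbank = np.zeros((n_filters, n_bins))
        for i in range(n_filters):
            lo, mid, hi = bin_pts[i], bin_pts[i + 1], bin_pts[i + 2]
            for b in range(lo, mid):
                fbank[i, b] = (b - lo) / max(mid - lo, 1)
            for b in range(mid, hi):
                fbank[i, b] = (hi - b) / max(hi - mid, 1)
        # Log filterbank energies, then DCT-II to decorrelate (the cepstral step)
        energies = np.log(fbank @ spectrum + 1e-10)
        n = np.arange(n_filters)
        dct = np.cos(np.pi * np.outer(np.arange(n_ceps), 2 * n + 1) / (2 * n_filters))
        return dct @ energies

    sr = 16000
    t = np.arange(400) / sr                  # one 25 ms frame
    frame = np.sin(2 * np.pi * 440 * t)      # synthetic stand-in for speech
    feats = mfcc_frame(frame, sr)
    print(feats.shape)  # (13,)
    ```

    Real systems apply this per overlapping frame and usually append delta and delta-delta coefficients.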

  20. Multiple Transcoding Impact on Speech Quality in Ideal Network Conditions

    Directory of Open Access Journals (Sweden)

    Martin Mikulec

    2015-01-01

    Full Text Available This paper deals with the impact of transcoding on speech quality. We focus mainly on transcoding between codecs without the negative influence of network parameters such as packet loss and delay, which ensures objective and repeatable measurement results. The measurement was performed on a Transcoding Measuring System developed especially for this purpose. The system is based on open source projects and is useful as a design tool for VoIP system administrators. The paper compares the most used codecs from the transcoding perspective. Multiple transcodings between the G.711, GSM, and G.729 codecs were performed, and the speech quality of these calls was evaluated. Speech quality was measured by the Perceptual Evaluation of Speech Quality (PESQ) method, which provides results as a Mean Opinion Score describing speech quality on a scale from 1 to 5. The obtained results indicate speech quality degradation with every transcoding between two codecs.

  1. Multi-thread Parallel Speech Recognition for Mobile Applications

    Directory of Open Access Journals (Sweden)

    LOJKA Martin

    2014-05-01

    Full Text Available In this paper, a server-based solution for a multi-thread large-vocabulary automatic speech recognition engine is described, along with practical application examples for Android OS and HTML5. The basic idea was to make speech recognition available for a full variety of applications for computers and especially for mobile devices. The speech recognition engine should be independent of commercial products and services (where the dictionary cannot be modified). Using third-party services can also pose a security and privacy problem in specific applications, where unsecured audio data must not be sent to uncontrolled environments (voice data transferred to servers around the globe). Using our experience with speech recognition applications, we have constructed a multi-thread, server-based speech recognition solution with a simple application programming interface (API) to the speech recognition engine, adaptable to the specific needs of a particular application.
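
    The paper's engine itself is not public, but the basic server pattern it describes (a pool of recognition threads fed from a shared job queue) can be sketched with Python's standard library. Everything below is an assumption-laden stand-in: fake_recognize merely substitutes for a real decoder call.

    ```python
    import queue
    import threading

    def fake_recognize(audio_chunk):
        """Stand-in for a real decoder call; returns a dummy transcript."""
        return f"transcript-of-{audio_chunk}"

    def worker(jobs, results):
        while True:
            item = jobs.get()
            if item is None:          # poison pill shuts the worker down
                jobs.task_done()
                break
            job_id, audio = item
            results[job_id] = fake_recognize(audio)
            jobs.task_done()

    def serve(requests, n_threads=4):
        """Dispatch recognition requests across a pool of worker threads."""
        jobs, results = queue.Queue(), {}
        threads = [threading.Thread(target=worker, args=(jobs, results))
                   for _ in range(n_threads)]
        for t in threads:
            t.start()
        for job_id, audio in enumerate(requests):
            jobs.put((job_id, audio))
        for _ in threads:             # one pill per worker
            jobs.put(None)
        for t in threads:
            t.join()
        return [results[i] for i in range(len(requests))]

    print(serve(["utt1", "utt2", "utt3"]))
    # ['transcript-of-utt1', 'transcript-of-utt2', 'transcript-of-utt3']
    ```

    A production server would additionally stream audio, manage per-client sessions, and share the acoustic/language models across threads.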

  2. Inner Speech's Relationship With Overt Speech in Poststroke Aphasia.

    Science.gov (United States)

    Stark, Brielle C; Geva, Sharon; Warburton, Elizabeth A

    2017-09-18

    Relatively preserved inner speech alongside poor overt speech has been documented in some persons with aphasia (PWA), but the relationship of overt speech with inner speech is still largely unclear, as few studies have directly investigated these factors. The present study investigates the relationship of relatively preserved inner speech in aphasia with selected measures of language and cognition. Thirty-eight persons with chronic aphasia (27 men, 11 women; average age 64.53 ± 13.29 years, time since stroke 8-111 months) were classified as having relatively preserved inner and overt speech (n = 21), relatively preserved inner speech with poor overt speech (n = 8), or not classified due to insufficient measurements of inner and/or overt speech (n = 9). Inner speech scores (by group) were correlated with selected measures of language and cognition from the Comprehensive Aphasia Test (Swinburn, Porter, & Howard, 2004). The group with poor overt speech showed a significant correlation of inner speech with overt naming (r = .95), whereas correlations between inner speech and the language and cognition factors were not significant for the group with relatively good overt speech. As in previous research, we show that relatively preserved inner speech is found alongside otherwise severe production deficits in PWA. PWA with poor overt speech may rely more on preserved inner speech for overt picture naming (perhaps due to shared resources with verbal working memory) and for written picture description (perhaps due to reliance on inner speech given perceived task difficulty). Assessments of inner speech may be useful as a standard component of aphasia screening, and therapy focused on improving and using inner speech may prove clinically worthwhile. https://doi.org/10.23641/asha.5303542.

  3. A Research of Speech Emotion Recognition Based on Deep Belief Network and SVM

    Directory of Open Access Journals (Sweden)

    Chenchen Huang

    2014-01-01

    Full Text Available Feature extraction is a very important part of speech emotion recognition. To address feature extraction for speech emotion recognition, this paper proposes a new method that uses DBNs (deep belief networks) to extract emotional features from the speech signal automatically. A 5-layer-deep DBN was trained to extract speech emotion features, incorporating multiple consecutive frames to form a high-dimensional feature vector. The features produced by the trained DBN were the input of a nonlinear SVM classifier, yielding a multi-classifier speech emotion recognition system. The speech emotion recognition rate of the system reached 86.5%, which was 7% higher than that of the original method.
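
    The final classification stage (learned features fed to a nonlinear SVM) can be sketched with scikit-learn. This is not the paper's DBN pipeline: the two random clusters below merely stand in for DBN-extracted emotion features of two hypothetical classes.

    ```python
    import numpy as np
    from sklearn.svm import SVC

    rng = np.random.default_rng(0)

    # Stand-ins for DBN-extracted emotion features: two well-separated
    # clusters play the role of two emotion classes (e.g. neutral vs. angry).
    n, dim = 100, 40                  # stacked frames give a high-dim vector
    feats_a = rng.normal(0.0, 1.0, (n, dim))
    feats_b = rng.normal(4.0, 1.0, (n, dim))
    X = np.vstack([feats_a, feats_b])
    y = np.array([0] * n + [1] * n)

    # Nonlinear (RBF-kernel) SVM as the final classifier stage
    clf = SVC(kernel="rbf", gamma="scale").fit(X, y)
    acc = clf.score(X, y)
    print(f"training accuracy: {acc:.2f}")
    ```

    In the paper's setting, X would hold the activations of the top DBN layer per utterance, and evaluation would of course use held-out data rather than the training set.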

  4. Environmental Contamination of Normal Speech.

    Science.gov (United States)

    Harley, Trevor A.

    1990-01-01

    Environmentally contaminated speech errors (irrelevant words or phrases derived from the speaker's environment and erroneously incorporated into speech) are hypothesized to occur at a high level of speech processing, but with a relatively late insertion point. The data indicate that speech production processes are not independent of other…

  5. Global Freedom of Speech

    DEFF Research Database (Denmark)

    Binderup, Lars Grassme

    2007-01-01

    , as opposed to a legal norm, that curbs exercises of the right to free speech that offend the feelings or beliefs of members from other cultural groups. The paper rejects the suggestion that acceptance of such a norm is in line with liberal egalitarian thinking. Following a review of the classical liberal egalitarian reasons for free speech - reasons from overall welfare, from autonomy and from respect for the equality of citizens - it is argued that these reasons outweigh the proposed reasons for curbing culturally offensive speech. Currently controversial cases such as that of the Danish Cartoon Controversy...

  6. PRACTICING SPEECH THERAPY INTERVENTION FOR SOCIAL INTEGRATION OF CHILDREN WITH SPEECH DISORDERS

    Directory of Open Access Journals (Sweden)

    Martin Ofelia POPESCU

    2016-11-01

    Full Text Available The article presents a concise speech correction intervention program for dyslalia, in conjunction with the development of intrapersonal and interpersonal capacities and the social integration of children with speech disorders. The program's main objectives are: increasing the potential for individual social integration by correcting speech disorders in conjunction with intra- and interpersonal capacities, and increasing the potential of children and community groups for social integration by optimizing the socio-relational context of children with speech disorders. The program included 60 children/students with dyslalia speech disorders (monomorphic and polymorphic dyslalia) from 11 educational institutions - 6 kindergartens and 5 schools/secondary schools - associated with the inter-school logopedic centre (CLI) of Targu Jiu city and areas of Gorj district. The program was implemented under the assumption that therapeutic-formative intervention to correct speech disorders and facilitate social integration will lead, in combination with the correction of pronunciation disorders, to the optimization of the social integration of children with speech disorders. The results confirm the hypothesis and demonstrate the efficiency of the intervention program.

  7. Music expertise shapes audiovisual temporal integration windows for speech, sinewave speech and music

    Directory of Open Access Journals (Sweden)

    Hwee Ling Lee

    2014-08-01

    Full Text Available This psychophysics study used musicians as a model to investigate whether musical expertise shapes the temporal integration window for audiovisual speech, sinewave speech, or music. Musicians and non-musicians judged the audiovisual synchrony of speech, sinewave analogues of speech, and music stimuli at 13 audiovisual stimulus onset asynchronies (±360, ±300, ±240, ±180, ±120, ±60, and 0 ms). Further, we manipulated the duration of the stimuli by presenting sentences/melodies or syllables/tones. Critically, musicians relative to non-musicians exhibited significantly narrower temporal integration windows for both music and sinewave speech. Further, the temporal integration window for music decreased with the amount of music practice, but not with age of acquisition. In other words, the more musicians had practiced piano in the past three years, the more sensitive they became to the temporal misalignment of visual and auditory signals. Collectively, our findings demonstrate that music practice fine-tunes the audiovisual temporal integration window to various extents depending on the stimulus class. While the effect of piano practice was most pronounced for music, it also generalized to other stimulus classes such as sinewave speech and, to a marginally significant degree, to natural speech.

  8. Prediction and imitation in speech

    Directory of Open Access Journals (Sweden)

    Chiara Gambi

    2013-06-01

    Full Text Available It has been suggested that intra- and inter-speaker variability in speech are correlated. Interlocutors have been shown to converge on various phonetic dimensions. In addition, speakers imitate the phonetic properties of voices they are exposed to in shadowing, repetition, and even passive listening tasks. We review three theoretical accounts of speech imitation and convergence phenomena: (i) the Episodic Theory (ET) of speech perception and production (Goldinger, 1998); (ii) the Motor Theory (MT) of speech perception (Liberman and Whalen, 2000; Galantucci et al., 2006); (iii) Communication Accommodation Theory (CAT; Giles et al., 1991; Giles and Coupland, 1991). We argue that no account is able to explain all the available evidence. In particular, there is a need to integrate low-level, mechanistic accounts (like ET and MT) and higher-level accounts (like CAT). We propose that this is possible within the framework of an integrated theory of production and comprehension (Pickering & Garrod, in press). Similarly to both ET and MT, this theory assumes parity between production and perception. Uniquely, however, it posits that listeners simulate speakers’ utterances by computing forward-model predictions at many different levels, which are then compared to the incoming phonetic input. In our account phonetic imitation can be achieved via the same mechanism that is responsible for sensorimotor adaptation; i.e. the correction of prediction errors. In addition, the model assumes that the degree to which sensory prediction errors lead to motor adjustments is context-dependent. The notion of context subsumes both the preceding linguistic input and non-linguistic attributes of the situation (e.g., the speaker’s and listener’s social identities, their conversational roles, the listener’s intention to imitate).

  9. The NCHLT speech corpus of the South African languages

    CSIR Research Space (South Africa)

    Barnard, E

    2014-05-01

    Full Text Available The NCHLT speech corpus contains wide-band speech from approximately 200 speakers per language, in each of the eleven official languages of South Africa. We describe the design and development processes that were undertaken in order to develop...

  10. Speech act theory and New Testament exegesis

    Directory of Open Access Journals (Sweden)

    J. Botha

    1991-01-01

    Full Text Available Speech act theory offers New Testament exegesis some additional ways and means of approaching the text of the New Testament. This, the second in a series of two articles that make a plea for the continued utilisation and application of this theory to the text of the New Testament, deals with some of the possibilities and potential this theory holds for reading biblical texts. Advantages are pointed out and a few suggestions for the future proposed.

  11. Dysarthric Bengali speech: A neurolinguistic study

    Directory of Open Access Journals (Sweden)

    Chakraborty N

    2008-01-01

    Full Text Available Background and Aims: Dysarthria affects linguistic domains such as respiration, phonation, articulation, resonance and prosody due to upper motor neuron, lower motor neuron, cerebellar or extrapyramidal tract lesions. Although Bengali is one of the major languages globally, dysarthric Bengali speech has not been subjected to neurolinguistic analysis. We attempted such an analysis with the goal of identifying the speech defects in native Bengali speakers in various types of dysarthria encountered in neurological disorders. Settings and Design: A cross-sectional observational study was conducted with 66 dysarthric subjects, predominantly middle-aged males, attending the Neuromedicine OPD of a tertiary care teaching hospital in Kolkata. Materials and Methods: After neurological examination, an instrument comprising commonly used Bengali words and a text block covering all Bengali vowels and consonants was used to carry out perceptual analysis of dysarthric speech. From recorded speech, 24 parameters pertaining to five linguistic domains were assessed. The Kruskal-Wallis analysis of variance, Chi-square test and Fisher's exact test were used for analysis. Results: The dysarthria types were spastic (15 subjects), flaccid (10), mixed (12), hypokinetic (12), hyperkinetic (9) and ataxic (8). Of the 24 parameters assessed, 15 were found to occur in one or more types with a prevalence of at least 25%. Imprecise consonant was the most frequently occurring defect in most dysarthrias. The spectrum of defects in each type was identified. Some parameters were capable of distinguishing between types. Conclusions: This perceptual analysis has defined linguistic defects likely to be encountered in dysarthric Bengali speech in neurological disorders. The speech distortion can be described and distinguished by a limited number of parameters. This may be of importance to the speech therapist and neurologist in planning rehabilitation and further management.

  12. Prevalence of Speech Disorders in Arak Primary School Students, 2014-2015

    Directory of Open Access Journals (Sweden)

    Abdoreza Yavari

    2016-09-01

    Full Text Available Abstract Background: From a psychosocial viewpoint, speech disorders may produce irreparable damage to a child's speech and language development. Voice, speech sound production, and fluency disorders are speech disorders that may result from delay or impairment in the speech motor control mechanism, central nervous system disorders, improper language stimulation, or voice abuse. Materials and Methods: This study examined the prevalence of speech disorders in 1393 Arak primary school students in grades 1 to 6. After collecting continuous speech samples through picture description, passage reading, and a phonetic test, we recorded the pathological signs of stuttering, articulation disorder, and voice disorders on a special sheet. Results: The prevalence of articulation, voice, and stuttering disorders was 8%, 3.5%, and 1%, respectively, and the overall prevalence of speech disorders was 11.9%. The prevalence of speech disorders decreased with increasing grade. 12.2% of boy students and 11.7% of girl students of primary school in Arak had speech disorders. Conclusion: The prevalence of speech disorders among primary school students in Arak is similar to that in Kermanshah, but smaller than in many similar studies in Iran. It seems that racial and cultural diversity has some effect on the prevalence of speech disorders in Arak city.

  13. Charisma in business speeches

    DEFF Research Database (Denmark)

    Niebuhr, Oliver; Brem, Alexander; Novák-Tót, Eszter

    2016-01-01

    to business speeches. Consistent with the public opinion, our findings are indicative of Steve Jobs being a more charismatic speaker than Mark Zuckerberg. Beyond previous studies, our data suggest that rhythm and emphatic accentuation are also involved in conveying charisma. Furthermore, the differences between Steve Jobs and Mark Zuckerberg and the investor- and customer-related sections of their speeches support the modern understanding of charisma as a gradual, multiparametric, and context-sensitive concept.

  14. Speech spectrum envelope modeling

    Czech Academy of Sciences Publication Activity Database

    Vích, Robert; Vondra, Martin

    Vol. 4775, - (2007), s. 129-137 ISSN 0302-9743. [COST Action 2102 International Workshop. Vietri sul Mare, 29.03.2007-31.03.2007] R&D Projects: GA AV ČR(CZ) 1ET301710509 Institutional research plan: CEZ:AV0Z20670512 Keywords : speech * speech processing * cepstral analysis Subject RIV: JA - Electronics ; Optoelectronics, Electrical Engineering Impact factor: 0.302, year: 2005

  15. Memory for speech and speech for memory.

    Science.gov (United States)

    Locke, J L; Kutz, K J

    1975-03-01

    Thirty kindergarteners, 15 who substituted /w/ for /r/ and 15 with correct articulation, received two perception tests and a memory test that included /w/ and /r/ in minimally contrastive syllables. Although both groups had nearly perfect perception of the experimenter's productions of /w/ and /r/, misarticulating subjects perceived their own tape-recorded w/r productions as /w/. In the memory task these same misarticulating subjects committed significantly more /w/-/r/ confusions in unspoken recall. The discussion considers why people subvocally rehearse; a developmental period in which children do not rehearse; ways subvocalization may aid recall, including motor and acoustic encoding; an echoic store that provides additional recall support if subjects rehearse vocally; and perception of self- and other-produced phonemes by misarticulating children, including its relevance to a motor theory of perception. Evidence is presented that speech for memory can be sufficiently impaired to cause memory disorder. Conceptions that restrict speech disorder to an impairment of communication are challenged.

  16. Telephone based speech interfaces in the developing world, from the perspective of human-human communication

    CSIR Research Space (South Africa)

    Naidoo, S

    2005-07-01

    Full Text Available Until recently, before computer systems were able to synthesize or recognize speech, speech was a capability unique to humans. The human brain has developed to differentiate between human speech and other audio occurrences. Therefore, the slowly-evolving human brain reacts in certain ways to voice stimuli, and has certain expectations regarding communication by voice. Nass affirms that the human brain operates using the same mechanisms when interacting with speech interfaces as when conversing...

  17. Objective voice and speech analysis of persons with chronic hoarseness by prosodic analysis of speech samples.

    Science.gov (United States)

    Haderlein, Tino; Döllinger, Michael; Matoušek, Václav; Nöth, Elmar

    2016-10-01

    Automatic voice assessment is often performed using sustained vowels. In contrast, speech analysis of read-out texts can be applied to voice and speech assessment. Automatic speech recognition and prosodic analysis were used to find regression formulae between automatic and perceptual assessment of four voice and four speech criteria. The regression was trained with 21 men and 62 women (average age 49.2 years) and tested with another set of 24 men and 49 women (48.3 years), all suffering from chronic hoarseness. They read the text 'Der Nordwind und die Sonne' ('The North Wind and the Sun'). Five voice and speech therapists evaluated the data on 5-point Likert scales. Ten prosodic and recognition accuracy measures (features) were identified which describe all the examined criteria. Inter-rater correlation within the expert group was between r = 0.63 for the criterion 'match of breath and sense units' and r = 0.87 for the overall voice quality. Human-machine correlation was between r = 0.40 for the match of breath and sense units and r = 0.82 for intelligibility. The perceptual ratings of different criteria were highly correlated with each other. Likewise, the feature sets modeling the criteria were very similar. The automatic method is suitable for assessing chronic hoarseness in general and for subgroups of functional and organic dysphonia. In its current version, it is almost as reliable as a randomly picked rater from a group of voice and speech therapists.
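
    The human-machine comparison in studies like this boils down to correlating automatic, feature-based predictions with averaged expert ratings. Below is a minimal Pearson correlation sketch; the two 5-point-scale rating series are hypothetical, not the study's data.

    ```python
    import numpy as np

    def pearson_r(x, y):
        """Pearson correlation between two rating series."""
        x, y = np.asarray(x, float), np.asarray(y, float)
        xc, yc = x - x.mean(), y - y.mean()
        return float((xc @ yc) / np.sqrt((xc @ xc) * (yc @ yc)))

    # Hypothetical 5-point Likert ratings: mean expert score per speaker
    # vs. an automatic prediction from prosodic features.
    expert = [2.0, 3.5, 1.0, 4.5, 3.0, 2.5]
    machine = [2.2, 3.1, 1.4, 4.2, 3.3, 2.1]
    print(round(pearson_r(expert, machine), 3))
    ```

    The study's reported values (e.g. r = 0.82 for intelligibility) are exactly this kind of coefficient, computed between perceptual ratings and regression outputs over the test speakers.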

  18. Predicting speech intelligibility in conditions with nonlinearly processed noisy speech

    DEFF Research Database (Denmark)

    Jørgensen, Søren; Dau, Torsten

    2013-01-01

    The speech-based envelope power spectrum model (sEPSM; [1]) was proposed in order to overcome the limitations of the classical speech transmission index (STI) and speech intelligibility index (SII). The sEPSM applies the signal-to-noise ratio in the envelope domain (SNRenv), which was demonstrated to successfully predict speech intelligibility in conditions with nonlinearly processed noisy speech, such as processing with spectral subtraction. Moreover, a multiresolution version (mr-sEPSM) was demonstrated to account for speech intelligibility in various conditions with stationary and fluctuating...
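
    The core quantity, the envelope-domain SNR, can be sketched in a simplified single-band form. This is only a toy reduction of the model: the actual sEPSM uses a modulation filterbank across auditory channels, which the version below omits.

    ```python
    import numpy as np

    def envelope(x, win=160):
        """Crude temporal envelope: rectify, then moving-average low-pass."""
        kernel = np.ones(win) / win
        return np.convolve(np.abs(x), kernel, mode="same")

    def snr_env(speech, noise):
        """Envelope-domain SNR in dB: fluctuation power of the speech
        envelope over that of the noise envelope (single band)."""
        es, en = envelope(speech), envelope(noise)
        ps = np.var(es / es.mean())   # normalized envelope fluctuation power
        pn = np.var(en / en.mean())
        return 10.0 * np.log10(ps / pn)

    sr = 8000
    t = np.arange(sr) / sr
    # "Speech": a carrier with strong 4 Hz amplitude modulation, mimicking
    # the slow envelope fluctuations that carry intelligibility.
    speech = (1 + 0.8 * np.sin(2 * np.pi * 4 * t)) * np.sin(2 * np.pi * 500 * t)
    noise = np.random.default_rng(1).normal(size=sr)  # weakly modulated envelope
    print(snr_env(speech, noise) > 0)
    ```

    A positive SNRenv here reflects that the modulated carrier's envelope fluctuates far more than the smoothed noise envelope, which is the intuition behind the model's intelligibility predictions.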

  19. Bandwidth Extension of Telephone Speech Aided by Data Embedding

    Directory of Open Access Journals (Sweden)

    Sagi Ariel

    2007-01-01

    Full Text Available A system for bandwidth extension of telephone speech, aided by data embedding, is presented. The proposed system uses the transmitted analog narrowband speech signal as a carrier of the side information needed to carry out the bandwidth extension. The upper band of the wideband speech is reconstructed at the receiving end from two components: a synthetic wideband excitation signal, generated from the narrowband telephone speech, and a wideband spectral envelope, parametrically represented and transmitted as embedded data in the telephone speech. We propose a novel data embedding scheme, in which the scalar Costa scheme is combined with an auditory masking model, allowing high-rate transparent embedding while maintaining a low bit error rate. The signal is transformed to the frequency domain via the discrete Hartley transform (DHT) and is partitioned into subbands. Data is embedded in an adaptively chosen subset of subbands by modifying the DHT coefficients. In our simulations, high-quality wideband speech was obtained from speech transmitted over a telephone line (characterized by spectral magnitude distortion, dispersion, and noise), in which side information data is transparently embedded at the rate of 600 information bits/second and with a bit error rate of approximately 3×10⁻⁴. In a listening test, the reconstructed wideband speech was preferred (to varying degrees) over conventional telephone speech in 92.5% of the test utterances.

  20. Bandwidth Extension of Telephone Speech Aided by Data Embedding

    Directory of Open Access Journals (Sweden)

    David Malah

    2007-01-01

    Full Text Available A system for bandwidth extension of telephone speech, aided by data embedding, is presented. The proposed system uses the transmitted analog narrowband speech signal as a carrier of the side information needed to carry out the bandwidth extension. The upper band of the wideband speech is reconstructed at the receiving end from two components: a synthetic wideband excitation signal, generated from the narrowband telephone speech, and a wideband spectral envelope, parametrically represented and transmitted as embedded data in the telephone speech. We propose a novel data embedding scheme, in which the scalar Costa scheme is combined with an auditory masking model, allowing high-rate transparent embedding while maintaining a low bit error rate. The signal is transformed to the frequency domain via the discrete Hartley transform (DHT) and is partitioned into subbands. Data is embedded in an adaptively chosen subset of subbands by modifying the DHT coefficients. In our simulations, high-quality wideband speech was obtained from speech transmitted over a telephone line (characterized by spectral magnitude distortion, dispersion, and noise), in which side information data is transparently embedded at the rate of 600 information bits/second and with a bit error rate of approximately 3×10⁻⁴. In a listening test, the reconstructed wideband speech was preferred (to varying degrees) over conventional telephone speech in 92.5% of the test utterances.
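
    The frequency transform at the heart of the scheme, the discrete Hartley transform, is easy to compute from the FFT, and the embed-and-recover idea can be shown on toy data. The sketch below is illustrative only: it encodes bits in coefficient signs rather than implementing the paper's scalar Costa scheme or auditory masking model.

    ```python
    import numpy as np

    def dht(x):
        """Discrete Hartley transform via the FFT: H = Re(F) - Im(F)."""
        X = np.fft.fft(x)
        return X.real - X.imag

    def idht(h):
        """The DHT is (up to a factor 1/N) its own inverse."""
        return dht(h) / len(h)

    rng = np.random.default_rng(0)
    frame = rng.normal(size=64)            # one frame of "speech"

    coeffs = dht(frame)
    # Toy embedding step: force the sign of a few mid-band coefficients so
    # the receiver can read hidden bits back from their signs (NOT the
    # paper's scalar Costa scheme, just the modify-coefficients idea).
    bits = [1, 0, 1, 1]
    band = slice(10, 10 + len(bits))
    coeffs[band] = [abs(c) if b else -abs(c)
                    for c, b in zip(coeffs[band], bits)]

    modified = idht(coeffs)                # back to the time domain
    recovered = [int(c > 0) for c in dht(modified)[band]]
    print(recovered)  # [1, 0, 1, 1]
    ```

    Because the DHT is real-valued and self-inverse, the receiver recovers exactly the transmitted coefficients, which is what makes it a convenient domain for embedding.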

  1. Speech misperception: speaking and seeing interfere differently with hearing.

    Directory of Open Access Journals (Sweden)

    Takemi Mochida

    Full Text Available Speech perception is thought to be linked to speech motor production. This linkage is considered to mediate multimodal aspects of speech perception, such as audio-visual and audio-tactile integration. However, direct coupling between articulatory movement and auditory perception has been little studied. The present study reveals a clear dissociation between the effects of a listener's own speech action and the effects of viewing another's speech movements on the perception of auditory phonemes. We assessed the intelligibility of the syllables [pa], [ta], and [ka] when listeners silently and simultaneously articulated syllables that were congruent/incongruent with the syllables they heard. The intelligibility was compared with a condition where the listeners simultaneously watched another's mouth producing congruent/incongruent syllables, but did not articulate. The intelligibility of [ta] and [ka] was degraded by articulating [ka] and [ta] respectively, which are associated with the same primary articulator (tongue) as the heard syllables, but was not affected by articulating [pa], which is associated with a different primary articulator (lips) from the heard syllables. In contrast, the intelligibility of [ta] and [ka] was degraded by watching the production of [pa]. These results indicate that the articulatory-induced distortion of speech perception occurs in an articulator-specific manner while visually induced distortion does not. The articulator-specific nature of the auditory-motor interaction in speech perception suggests that speech motor processing directly contributes to our ability to hear speech.

  2. SPEECH ACT OF ILTIFAT AND ITS INDONESIAN TRANSLATION PROBLEMS

    Directory of Open Access Journals (Sweden)

    Zaka Al Farisi

    2015-01-01

    Full Text Available Abstract: Iltifat (shifting) is a distinctive speech act considered a unique style of Arabic, and it is prone to errors when translated into Indonesian. The translation of the iltifat speech act into another language is therefore an important issue. The objective of the study is to identify the translation procedures/techniques and ideology involved in dealing with the iltifat speech act. This research is directed at translation as a cognitive product of a translator. The data used in the present study were a corpus of Koranic verses that contain the iltifat speech act, along with their translations. Data analysis used a descriptive-evaluative method with a content analysis model. The data source of this research consisted of the Koran and its translation. A purposive sampling technique was employed, with the sample being the iltifat speech acts contained in the Koran. The results showed that more than 60% of iltifat speech acts were translated using the literal procedure. The significant number of literally translated verses indicates that the Ministry of Religious Affairs tended to use the literal method of translation; in other words, the Koran translation made by the Ministry of Religious Affairs tended to be oriented to the source language in dealing with the iltifat speech act. The predominance of the literal procedure shows a tendency toward foreignization ideology. Transitional pronouns contained in the iltifat speech act can be clearly translated when thick translation is used, in the form of descriptions in parentheses. In this case, explanation can be a choice in translating the iltifat speech act.

  3. Sound frequency affects speech emotion perception: Results from congenital amusia

    Directory of Open Access Journals (Sweden)

    Sydney Lolli

    2015-09-01

    Full Text Available Congenital amusics, or tone-deaf individuals, show difficulty in perceiving and producing small pitch differences. While amusia has marked effects on music perception, its impact on speech perception is less clear. Here we test the hypothesis that individual differences in pitch perception affect judgment of emotion in speech, by applying band-pass filters to spoken statements of emotional speech. A norming study was first conducted on Mechanical Turk to ensure that the intended emotions from the Macquarie Battery for Evaluation of Prosody (MBEP) were reliably identifiable by US English speakers. The most reliably identified emotional speech samples were used in Experiment 1, in which subjects performed a psychophysical pitch discrimination task and an emotion identification task under band-pass and unfiltered speech conditions. Results showed a significant correlation between pitch discrimination threshold and emotion identification accuracy for band-pass filtered speech, with amusics (defined here as those with a pitch discrimination threshold > 16 Hz) performing worse than controls. This relationship with pitch discrimination was not seen in unfiltered speech conditions. Given the dissociation between band-pass filtered and unfiltered speech conditions, we infer that amusics may compensate for poorer pitch perception by using speech cues that are filtered out in this manipulation.
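
The band-pass manipulation described above can be sketched with a simple FFT brick-wall filter; the cutoff frequencies below are illustrative assumptions, not the study's actual filter parameters:

```python
import numpy as np

def bandpass_fft(signal, sr, low_hz, high_hz):
    """Zero out FFT bins outside [low_hz, high_hz] (simple brick-wall band-pass)."""
    spectrum = np.fft.rfft(signal)
    freqs = np.fft.rfftfreq(len(signal), d=1.0 / sr)
    keep = (freqs >= low_hz) & (freqs <= high_hz)
    return np.fft.irfft(spectrum * keep, n=len(signal))

# Example: keep only the 300-2000 Hz band of a two-tone mixture
sr = 16000
t = np.arange(sr) / sr
x = np.sin(2 * np.pi * 150 * t) + np.sin(2 * np.pi * 1000 * t)
y = bandpass_fft(x, sr, 300, 2000)  # the 150 Hz component is removed
```

A production filter would use a smoother response (e.g. a Butterworth design) to avoid ringing, but the brick-wall version shows the essential idea of removing spectral cues outside the passband.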

  4. Source Separation via Spectral Masking for Speech Recognition Systems

    Directory of Open Access Journals (Sweden)

    Gustavo Fernandes Rodrigues

    2012-12-01

    Full Text Available In this paper we present an insight into the use of spectral masking techniques in the time-frequency domain as a preprocessing step for speech signal recognition. Speech recognition systems have their performance negatively affected in noisy environments or in the presence of other speech signals. The limits of these masking techniques for different levels of signal-to-noise ratio are discussed. We show the robustness of spectral masking techniques against four types of noise: white, pink, brown, and human speech (babble) noise. The main contribution of this work is to analyze the performance limits of recognition systems using spectral masking. We obtain an increase of 18% in the speech hit rate when the speech signals are corrupted by other speech signals or babble noise, at signal-to-noise ratios of approximately 1, 10, and 20 dB. On the other hand, applying ideal binary masks to mixtures corrupted by white, pink, and brown noise results in an average increase of 9% in the speech hit rate at the same signal-to-noise ratios. The experimental results suggest that spectral masking techniques are more suitable for babble noise, which is produced by human speech, than for white, pink, and brown noise.
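
The ideal binary mask mentioned above is commonly formulated as keeping only the time-frequency bins where the local SNR exceeds a criterion. A minimal numpy sketch (the paper's exact criterion and spectrogram settings are not specified here, so the 0 dB local criterion is an assumption):

```python
import numpy as np

def ideal_binary_mask(speech_spec, noise_spec, lc_db=0.0):
    """Return 1 for time-frequency bins where local SNR exceeds lc_db, else 0."""
    snr_db = 10 * np.log10((np.abs(speech_spec) ** 2 + 1e-12) /
                           (np.abs(noise_spec) ** 2 + 1e-12))
    return (snr_db > lc_db).astype(float)

# Toy 2x2 "spectrograms": rows = time frames, columns = frequency bins
speech = np.array([[10.0, 0.1], [5.0, 0.2]])
noise = np.array([[1.0, 1.0], [1.0, 1.0]])
mask = ideal_binary_mask(speech, noise)  # 1 where speech dominates the mixture
```

The mask is then applied elementwise to the noisy mixture spectrogram before recognition; "ideal" refers to the fact that the clean speech and noise must be known separately, which is why the technique defines an upper performance bound.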

  5. Polish Phoneme Statistics Obtained On Large Set Of Written Texts

    Directory of Open Access Journals (Sweden)

    Bartosz Ziółko

    2009-01-01

    Full Text Available Phonetic statistics were collected from several Polish corpora. The paper is a summary of the data, which are phoneme n-grams, and of some phenomena observed in the statistics. Triphone statistics apply to context-dependent speech units, which play an important role in speech recognition systems and had never been calculated for a large set of Polish written texts. The standard phonetic alphabet for Polish, SAMPA, and methods of providing phonetic transcriptions are described.
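
Collecting triphone (phoneme 3-gram) statistics of the kind described can be sketched as follows; the SAMPA-style transcriptions here are toy examples, not real corpus data:

```python
from collections import Counter

def triphone_counts(phoneme_seqs):
    """Count context-dependent triphones (overlapping phoneme 3-grams)."""
    counts = Counter()
    for seq in phoneme_seqs:
        for i in range(len(seq) - 2):
            counts[tuple(seq[i:i + 3])] += 1
    return counts

# Illustrative transcriptions in a SAMPA-like notation
corpus = [["t", "E", "k", "s", "t"], ["t", "E", "s", "t"]]
stats = triphone_counts(corpus)
# stats now maps each triphone, e.g. ("t", "E", "k"), to its corpus frequency
```

Normalizing the counts by their total yields the triphone probabilities that context-dependent acoustic models in speech recognizers rely on.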

  6. Music and Speech Perception in Children Using Sung Speech.

    Science.gov (United States)

    Nie, Yingjiu; Galvin, John J; Morikawa, Michael; André, Victoria; Wheeler, Harley; Fu, Qian-Jie

    2018-01-01

    This study examined music and speech perception in normal-hearing children with some or no musical training. Thirty children (mean age = 11.3 years), 15 with and 15 without formal music training, participated in the study. Music perception was measured using a melodic contour identification (MCI) task; stimuli were a piano sample or sung speech with a fixed timbre (same word for each note) or a mixed timbre (different words for each note). Speech perception was measured in quiet and in steady noise using a matrix-styled sentence recognition task; stimuli were naturally intonated speech or sung speech with a fixed pitch (same note for each word) or a mixed pitch (different notes for each word). Significant musician advantages were observed for MCI and speech in noise but not for speech in quiet. MCI performance was significantly poorer with the mixed timbre stimuli. Speech performance in noise was significantly poorer with the fixed or mixed pitch stimuli than with spoken speech. Across all subjects, age at testing and MCI performance were significantly correlated with speech performance in noise. MCI and speech performance in quiet were significantly poorer for children than for adults from a related study using the same stimuli and tasks; speech performance in noise was significantly poorer for young than for older children. Long-term music training appeared to benefit melodic pitch perception and speech understanding in noise in these pediatric listeners.

  7. LIBERDADE DE EXPRESSÃO E DISCURSO DO ÓDIO NO BRASIL / FREE SPEECH AND HATE SPEECH IN BRAZIL

    Directory of Open Access Journals (Sweden)

    Nevita Maria Pessoa de Aquino Franca Luna

    2014-12-01

    Full Text Available The purpose of this article is to analyze the restriction of free speech when it comes close to hate speech. From this perspective, the aim of this study is to answer the question: what understanding has the Brazilian Supreme Court adopted in cases involving the conflict between free speech and hate speech? The methodology combines a bibliographic review of the theoretical assumptions of the research (the concepts of free speech and hate speech, and the understanding of the rights of defense of traditionally discriminated minorities) with empirical research (documental and jurisprudential analysis of cases judged by the American, German, and Brazilian courts). Firstly, free speech is discussed, defining its meaning, content, and purpose. Then, hate speech is identified as an inhibitor of free speech, for it offends members of traditionally discriminated minorities, who are outnumbered or in a situation of cultural, socioeconomic, or political subordination. Subsequently, some aspects of the American (negative freedom) and German (positive freedom) models are discussed to demonstrate that different cultures adopt different legal solutions. Finally, it is concluded that the Brazilian understanding approximates the German doctrine, based on the analysis of landmark cases such as those of the publisher Siegfried Ellwanger (2003) and the samba school Unidos do Viradouro (2008). The Brazilian understanding, in a multicultural country made up of different ethnicities, leads to a new process of defending minorities which, despite involving the collision of fundamental rights (dignity, equality, and freedom), is still restrained by barriers incompatible with a contemporary pluralistic democracy.

  8. Syntactic error modeling and scoring normalization in speech recognition: Error modeling and scoring normalization in the speech recognition task for adult literacy training

    Science.gov (United States)

    Olorenshaw, Lex; Trawick, David

    1991-01-01

    The purpose was to develop a speech recognition system able to detect speech that is pronounced incorrectly, given that the text of the spoken speech is known to the recognizer. Better mechanisms are provided for using speech recognition in a literacy tutor application. Using a combination of scoring normalization techniques and cheater-mode decoding, a reasonable acceptance/rejection threshold was obtained. In continuous speech, the system correctly accepted over 80 percent of correctly pronounced words while correctly rejecting over 80 percent of incorrectly pronounced words.
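
The acceptance/rejection step described above can be illustrated by thresholding a duration-normalized acoustic score. The per-frame normalization and the threshold value below are illustrative assumptions, not the study's actual scheme:

```python
def accept_word(log_score, n_frames, threshold=-5.0):
    """Accept a word if its duration-normalized acoustic log-score clears a threshold.

    log_score: total acoustic log-likelihood the recognizer assigned to the word
    n_frames:  number of acoustic frames the word spans (used for normalization)
    """
    return (log_score / max(n_frames, 1)) >= threshold

# A well-pronounced word scores higher per frame than a mispronounced one
assert accept_word(-400, 100)       # -4.0 per frame: accepted
assert not accept_word(-700, 100)   # -7.0 per frame: rejected
```

Normalizing by duration keeps long words from being rejected merely for accumulating more log-probability mass; the threshold itself would be tuned to balance false acceptances against false rejections.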

  9. Practical speech user interface design

    CERN Document Server

    Lewis, James R

    2010-01-01

    Although speech is the most natural form of communication between humans, most people find using speech to communicate with machines anything but natural. Drawing from psychology, human-computer interaction, linguistics, and communication theory, Practical Speech User Interface Design provides a comprehensive yet concise survey of practical speech user interface (SUI) design. It offers practice-based and research-based guidance on how to design effective, efficient, and pleasant speech applications that people can really use. Focusing on the design of speech user interfaces for IVR application

  10. An Embedded Application for Degraded Text Recognition

    Directory of Open Access Journals (Sweden)

    Thillou Céline

    2005-01-01

    Full Text Available This paper describes a mobile device that aims to give blind or visually impaired users access to text information. Three key technologies are required for this system: text detection, optical character recognition, and speech synthesis. Blind users and the mobile environment impose two strong constraints. First, pictures are taken without control over camera settings or a priori information on the text (font, size) or background. The second issue is to link several techniques together with an optimal compromise between computational constraints and recognition efficiency. We present an overall description of the system, from text detection to OCR error correction.

  11. A distributed approach to speech resource collection

    CSIR Research Space (South Africa)

    Molapo, R

    2013-12-01

    Full Text Available The authors describe the integration of several tools to enable the end-to-end development of an Automatic Speech Recognition system in a typical under-resourced language. The authors analyse the data acquired by each of the tools and develop an ASR...

  12. Comparing two developmental applications of speech technology

    CSIR Research Space (South Africa)

    Sharma Grover, A

    2011-05-01

    Full Text Available Over the past decade applications of speech technologies for development (ST4D) have shown much potential for enabling information access and service delivery. In this paper the authors review two deployed ST4D services and posit a set of dimensions...

  13. Lope and the Battle-Speech

    Directory of Open Access Journals (Sweden)

    Juan Carlos Iglesias-Zoido

    2013-05-01

    Full Text Available This article analyzes the way in which Lope de Vega conceives the pre-battle harangue in his theater, the most characteristic speech in ancient and Renaissance historiography. With this aim in mind, I have analyzed the role played by this type of speech in a group of plays dealing with historical and military subjects. These plays were written in a period when Lope was particularly interested in historical issues: La Santa Liga (1598-1603), Arauco domado (1599), El asalto de Mastrique (1595-1606), and Los Guanches de Tenerife (1604-1606).

  14. Speech and Language Disturbances in Neurology Practice

    Directory of Open Access Journals (Sweden)

    Oğuz Tanrıdağ

    2009-12-01

    Full Text Available Despite the well-known facts discerned from interesting cases of speech and language disturbances over thousands of years, and despite the scientific background and limitless discussions of nearly 150 years, this field has been considered one of the least important subjects in the neurological sciences. In this review, we first analyze the possible causes of this “stepchild” attitude towards the subject, and we then summarize the practical aspects of speech and language disturbances. Our underlying expectation with this review is to explain the facts concerning these disturbances, which may offer us opportunities to better understand the nervous system and the affected patients.

  15. Speech Alarms Pilot Study

    Science.gov (United States)

    Sandor, A.; Moses, H. R.

    2016-01-01

    Currently on the International Space Station (ISS) and other space vehicles, Caution & Warning (C&W) alerts are represented by various auditory tones that correspond to the type of event. This system relies on the crew's ability to remember what each tone represents in a high-stress, high-workload environment when responding to the alert. Furthermore, crew receive training a year or more in advance of the mission, which makes remembering the semantic meaning of the alerts more difficult. The current system works for missions conducted close to Earth, where ground operators can assist as needed. On long-duration missions, however, crews will need to handle off-nominal events autonomously. There is evidence that speech alarms may be easier and faster to recognize, especially during an off-nominal event. The Information Presentation Directed Research Project (FY07-FY09), funded by the Human Research Program, included several studies investigating C&W alerts. The studies evaluated tone alerts currently in use with NASA flight deck displays along with candidate speech alerts. A follow-on study used four types of speech alerts to investigate how quickly various types of auditory alerts, with and without a speech component - either at the beginning or at the end of the tone - can be identified. Even though crew were familiar with the tone alert from training or direct mission experience, alerts starting with a speech component were identified faster than alerts starting with a tone. The current study replicated the results of the previous study in a more rigorous experimental design to determine whether the candidate speech alarms are ready for transition to operations or whether more research is needed. Four types of alarms (caution, warning, fire, and depressurization) were presented to participants in both tone and speech formats in laboratory settings and later in the Human Exploration Research Analog (HERA).
In the laboratory study, the alerts were presented by software and participants were

  16. Speech Evoked Auditory Brainstem Response in Stuttering

    Directory of Open Access Journals (Sweden)

    Ali Akbar Tahaei

    2014-01-01

    Full Text Available Auditory processing deficits have been hypothesized as an underlying mechanism for stuttering. Previous studies have demonstrated abnormal responses in subjects with persistent developmental stuttering (PDS) at higher levels of the central auditory system using speech stimuli. Recently, the potential usefulness of speech-evoked auditory brainstem responses in central auditory processing disorders has been emphasized. The current study used the speech-evoked ABR to investigate the hypothesis that subjects with PDS have a specific auditory perceptual dysfunction. Objectives. To determine whether brainstem responses to speech stimuli differ between PDS subjects and normally fluent speakers. Methods. Twenty-five subjects with PDS participated in this study. The speech-ABRs were elicited by the 5-formant synthesized syllable /da/, with a duration of 40 ms. Results. There were significant group differences for the onset and offset transient peaks: subjects with PDS had longer latencies for the onset and offset peaks relative to the control group. Conclusions. Subjects with PDS showed deficient neural timing in the early stages of the auditory pathway, consistent with temporal processing deficits, and this abnormal timing may underlie their disfluency.

  17. PERSON DEIXIS IN USA PRESIDENTIAL CAMPAIGN SPEECHES

    Directory of Open Access Journals (Sweden)

    Nanda Anggarani Putri

    2015-06-01

    Full Text Available This study investigates the use of person deixis in presidential campaign speeches. This study is important because the use of person deixis in political speeches has been proved by many studies to give significant effects to the audience. The study largely employs a descriptive qualitative method. However, it also employs a simple quantitative method in calculating the number of personal pronouns used in the speeches and their percentages. The data for the study were collected from the transcriptions of six presidential campaign speeches of Barack Obama and Mitt Romney during the campaign rally in various places across the United States of America in July, September, and November 2012. The results of this study show that the presidential candidates make the best use of pronouns as a way to promote themselves and to attack their opponents. The results also suggest that the use of pronouns in the speeches enables the candidates to construct positive identity and reality, which are favorable to them and make them appear more eligible for the position.
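
The simple quantitative step described above, counting personal pronouns and computing their percentages, can be sketched as follows; the pronoun list is a partial, illustrative subset, and the sample sentence is invented:

```python
import re
from collections import Counter

# Illustrative subset of English personal pronouns (not an exhaustive list)
PRONOUNS = {"i", "we", "you", "he", "she", "they", "me", "us", "them"}

def pronoun_percentages(speech_text):
    """Count personal pronouns in a transcript; report each as a share of all pronoun uses."""
    words = re.findall(r"[a-z']+", speech_text.lower())
    hits = Counter(w for w in words if w in PRONOUNS)
    total = sum(hits.values())
    return {p: 100.0 * n / total for p, n in hits.items()} if total else {}

sample = "We will fight for you, and I promise we will win."
shares = pronoun_percentages(sample)
# "we" accounts for half of the pronoun uses in this sample
```

A fuller analysis would also track possessives ("our", "their") and normalize against total word count, but the share of first-person plural versus singular pronouns is already the kind of measure such deixis studies report.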

  18. Gesture facilitates the syntactic analysis of speech

    Directory of Open Access Journals (Sweden)

    Henning Holle

    2012-03-01

    Full Text Available Recent research suggests that the brain routinely binds together information from gesture and speech. However, most of this research has focused on the integration of representational gestures with the semantic content of speech. Much less is known about how other aspects of gesture, such as emphasis, influence the interpretation of the syntactic relations in a spoken message. Here, we investigated whether beat gestures alter which syntactic structure is assigned to ambiguous spoken German sentences. The P600 component of the event-related brain potential indicated that the more complex syntactic structure is easier to process when the speaker emphasizes the subject of a sentence with a beat. Thus, a simple flick of the hand can change our interpretation of who has been doing what to whom in a spoken sentence. We conclude that gesture and speech form an integrated system. Unlike previous studies, which have shown that the brain effortlessly integrates semantic information from gesture and speech, our study is the first to demonstrate that this integration also occurs for syntactic information. Moreover, the effect appears to be gesture-specific and was not found for other stimuli that draw attention to certain parts of speech, including prosodic emphasis or a moving visual stimulus with the same trajectory as the gesture. This suggests that only visual emphasis produced with a communicative intention in mind (that is, beat gestures) influences language comprehension, and not a simple visual movement lacking such an intention.

  19. Music and speech prosody: A common rhythm

    Directory of Open Access Journals (Sweden)

    Maija Hausen

    2013-09-01

    Full Text Available Disorders of music and speech perception, known as amusia and aphasia, have traditionally been regarded as dissociated deficits based on studies of brain-damaged patients. This has been taken as evidence that music and speech are perceived by largely separate and independent networks in the brain. However, recent studies of congenital amusia have broadened this view by showing that the deficit is associated with problems in perceiving speech prosody, especially intonation and emotional prosody. In the present study the association between the perception of music and speech prosody was investigated with healthy Finnish adults (n = 61) using an on-line music perception test including the Scale subtest of the Montreal Battery of Evaluation of Amusia (MBEA) and Off-Beat and Out-of-key tasks, as well as a prosodic verbal task that measures the perception of word stress. Regression analyses showed a clear association between prosody perception and music perception, especially in the domain of rhythm perception. This association was evident after controlling for music education, age, pitch perception, visuospatial perception, and working memory. Pitch perception was significantly associated with music perception but not with prosody perception. The association between music perception and visuospatial perception (measured using analogous tasks) was less clear. Overall, the pattern of results indicates a robust link between music and speech perception, and that this link can be mediated by rhythmic cues (time and stress).

  20. Markers of Deception in Italian Speech

    Directory of Open Access Journals (Sweden)

    Katelyn Spence

    2012-10-01

    Full Text Available Lying is a universal activity and the detection of lying a universal concern. Presently, there is great interest in determining objective measures of deception. The examination of speech, in particular, holds promise in this regard; yet, most of what we know about the relationship between speech and lying is based on the assessment of English-speaking participants. Few studies have examined indicators of deception in languages other than English. The world’s languages differ in significant ways, and cross-linguistic studies of deceptive communications are a research imperative. Here we review some of these differences amongst the world’s languages, and provide an overview of a number of recent studies demonstrating that cross-linguistic research is a worthwhile endeavour. In addition, we report the results of an empirical investigation of pitch, response latency, and speech rate as cues to deception in Italian speech. True and false opinions were elicited in an audio-taped interview. A within subjects analysis revealed no significant difference between the average pitch of the two conditions; however, speech rate was significantly slower, while response latency was longer, during deception compared with truth-telling. We explore the implications of these findings and propose directions for future research, with the aim of expanding the cross-linguistic branch of research on markers of deception.

  1. Directed Activities Related to Text: Text Analysis and Text Reconstruction.

    Science.gov (United States)

    Davies, Florence; Greene, Terry

    This paper describes Directed Activities Related to Text (DART), procedures that were developed and are used in the Reading for Learning Project at the University of Nottingham (England) to enhance learning from texts and that fall into two broad categories: (1) text analysis procedures, which require students to engage in some form of analysis of…

  2. Speech Emotion Recognition Based on Power Normalized Cepstral Coefficients in Noisy Conditions

    Directory of Open Access Journals (Sweden)

    M. Bashirpour

    2016-09-01

    Full Text Available Automatic recognition of emotional states from speech in noisy conditions has become an important research topic in the emotional speech recognition area in recent years. This paper considers the recognition of emotional states via speech in real environments. For this task, we employ power normalized cepstral coefficients (PNCC) in a speech emotion recognition system. We investigate their performance in emotion recognition using clean and noisy speech materials and compare them with the well-known MFCC, LPCC, RASTA-PLP, and TEMFCC features. Speech samples are extracted from the Berlin emotional speech database (Emo DB) and the Persian emotional speech database (Persian ESD), which are corrupted with 4 different noise types at various SNR levels. The experiments are conducted in clean train/noisy test scenarios to simulate practical conditions with noise sources. Simulation results show that higher recognition rates are achieved for PNCC than for the conventional features under noisy conditions.

  3. A Pilot Investigation of Speech Sound Disorder Intervention Delivered by Telehealth to School-Age Children

    Directory of Open Access Journals (Sweden)

    Sue Grogan-Johnson

    2011-05-01

    Full Text Available This article describes a school-based telehealth service delivery model and reports outcomes for school-age students with speech sound disorders in a rural Ohio school district. Speech therapy using computer-based speech sound intervention materials was provided either by live interactive videoconferencing (telehealth) or by conventional side-by-side intervention. Progress was measured using pre- and post-intervention scores on the Goldman-Fristoe Test of Articulation-2 (Goldman & Fristoe, 2002). Students in both service delivery models made significant improvements in speech sound production, with students in the telehealth condition demonstrating greater mastery of their Individual Education Plan (IEP) goals. Live interactive videoconferencing thus appears to be a viable method for delivering intervention for speech sound disorders to children in a rural, public school setting. Keywords: telehealth, telerehabilitation, videoconferencing, speech sound disorder, speech therapy, speech-language pathology; E-Helper

  4. Mapping Speech Spectra from Throat Microphone to Close-Speaking Microphone: A Neural Network Approach

    Directory of Open Access Journals (Sweden)

    B. Yegnanarayana

    2007-01-01

    Full Text Available Speech recorded from a throat microphone is robust to surrounding noise but sounds unnatural, unlike speech recorded from a close-speaking microphone. This paper addresses the issue of improving the perceptual quality of throat microphone speech by mapping the speech spectra from the throat microphone to the close-speaking microphone. A neural network model is used to capture the speaker-dependent functional relationship between the feature vectors (cepstral coefficients) of the two speech signals. A method is proposed to ensure the stability of the all-pole synthesis filter. Objective evaluations indicate the effectiveness of the proposed mapping scheme. The advantage of this method is that the model gives a smooth estimate of the spectra of close-speaking microphone speech, and no distortions are perceived in the reconstructed speech. This mapping technique is also used for bandwidth extension of telephone speech.
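
The paper learns the feature mapping with a neural network; as a hedged stand-in on synthetic data, a frame-wise linear least-squares mapping illustrates the same idea of learning a transform between the two cepstral feature spaces (all data below is randomly generated, not real microphone recordings):

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic stand-ins: 200 frames of 13-dimensional cepstral features
throat = rng.normal(size=(200, 13))   # "throat microphone" features
W_true = rng.normal(size=(13, 13))
close = throat @ W_true               # pretend "close-speaking" features

# Learn the frame-wise mapping from paired training frames.
# A neural network would replace this linear model to capture nonlinearity.
W, *_ = np.linalg.lstsq(throat, close, rcond=None)
pred = throat @ W                     # estimated close-speaking features
```

At synthesis time, each throat-microphone frame would be mapped through the learned model and the resulting cepstral coefficients used to drive the (stabilized) all-pole synthesis filter.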

  5. SPEECH DELAY IN THE PRACTICE OF A PAEDIATRICIAN AND CHILD’S NEUROLOGIST

    Directory of Open Access Journals (Sweden)

    N. N. Zavadenko

    2015-01-01

    Full Text Available The article describes the main clinical forms and causes of speech delay in children. It presents modern data on the role of neurobiological factors in the pathogenesis of speech delay, including early organic damage to the central nervous system due to pathology of pregnancy and childbirth, as well as genetic mechanisms. For early and accurate diagnosis of speech disorders in children, the normal patterns of speech development must be considered. The article presents indicators of pre-speech and speech development in children and describes a screening method for detecting speech delay. The main areas of complex correction are speech therapy, psycho-pedagogical and psychotherapeutic assistance, and pharmaceutical treatment. The capabilities of drug therapy for dysphasia (alalia) are shown.

  6. Intelligibility of speech of children with speech and sound disorders

    OpenAIRE

    Ivetac, Tina

    2014-01-01

    The purpose of this study is to examine speech intelligibility of children with primary speech and sound disorders aged 3 to 6 years in everyday life. The research problem is based on the degree to which parents or guardians, immediate family members (sister, brother, grandparents), extended family members (aunt, uncle, cousin), child's friends, other acquaintances, child's teachers and strangers understand the speech of children with speech sound disorders. We examined whether the level ...

  7. Motivational Projections of Russian Spontaneous Speech

    Directory of Open Access Journals (Sweden)

    Galina M. Shipitsina

    2017-06-01

    Full Text Available The article deals with the semantic, pragmatic, and structural features of the motivation of words, phrases, and dialogues in contemporary Russian popular speech. These structural features are characterized by originality and unconventional use. The language material is the result of the authors' direct observation of spontaneous verbal communication between people of different social and age groups. The words and remarks were analyzed in relation to the communication system of the national Russian language and the cultural background of popular speech. The study found that spoken discourse has additional ways of increasing the expressiveness of an utterance. It is important to note that spontaneous speech identifies lacunae in the nominative system of the language and its vocabulary. Prefixation is also shown to be an effective and regular way of presenting the same action. The most typical forms, ways, and means of updating language resources as a result of the linguistic creativity of native speakers are identified.

  8. Resourcing speech-language pathologists to work with multilingual children.

    Science.gov (United States)

    McLeod, Sharynne

    2014-06-01

    Speech-language pathologists play important roles in supporting people to be competent communicators in the languages of their communities. However, with over 7000 languages spoken throughout the world and the majority of the global population being multilingual, there is often a mismatch between the languages spoken by children and families and those of their speech-language pathologists. This paper provides insights into service provision for multilingual children within an English-dominant country by viewing Australia's multilingual population as a microcosm of ethnolinguistic minorities. Recent population studies of Australian pre-school children show that their most common languages other than English are: Arabic, Cantonese, Vietnamese, Italian, Mandarin, Spanish, and Greek. Although 20.2% of services by Speech Pathology Australia members are offered in languages other than English, there is a mismatch between the languages of the services and the languages of children within similar geographical communities. Australian speech-language pathologists typically use informal or English-based assessment and intervention tools with multilingual children. Thus, there is a need for accessible, culturally and linguistically appropriate resources for working with multilingual children. Recent international collaborations have resulted in practical strategies to support speech-language pathologists during assessment, intervention, and collaboration with families, communities, and other professionals. The International Expert Panel on Multilingual Children's Speech was assembled to prepare a position paper addressing issues faced by speech-language pathologists when working with multilingual populations. The Multilingual Children's Speech website (http://www.csu.edu.au/research/multilingual-speech) addresses one of the aims of the position paper by providing free resources and information for speech-language pathologists about more than 45 languages.

  9. Speech and non-speech processing in children with phonological disorders: an electrophysiological study

    Directory of Open Access Journals (Sweden)

    Isabela Crivellaro Gonçalves

    2011-01-01

    Full Text Available OBJECTIVE: To determine whether neurophysiological auditory brainstem responses to clicks and repeated speech stimuli differ between typically developing children and children with phonological disorders. INTRODUCTION: Phonological disorders are language impairments resulting from inadequate use of adult phonological language rules and are among the most common speech and language disorders in children (prevalence: 8-9%). Our hypothesis is that children with phonological disorders have basic differences in the way that their brains encode acoustic signals at the brainstem level when compared to normal counterparts. METHODS: We recorded click- and speech-evoked auditory brainstem responses in 18 typically developing children (control group) and in 18 children who were clinically diagnosed with phonological disorders (research group). The age range of the children was 7-11 years. RESULTS: The research group exhibited significantly longer latency responses to click stimuli (waves I, III and V) and speech stimuli (waves V and A) when compared to the control group. DISCUSSION: These results suggest that abnormal encoding of speech sounds may be a biological marker of phonological disorders. However, these results cannot define the biological origins of phonological problems. We also observed that speech-evoked auditory brainstem responses had a higher specificity/sensitivity for identifying phonological disorders than click-evoked auditory brainstem responses. CONCLUSIONS: The early stages of auditory pathway processing of an acoustic stimulus are not similar in typically developing children and those with phonological disorders. These findings suggest that there are brainstem auditory pathway abnormalities in children with phonological disorders.

  10. Automatic Speech Recognition Systems for the Evaluation of Voice and Speech Disorders in Head and Neck Cancer

    Directory of Open Access Journals (Sweden)

    Andreas Maier

    2010-01-01

Full Text Available In patients suffering from head and neck cancer, speech intelligibility is often restricted. For assessment and outcome measurement, automatic speech recognition systems have previously been shown to be appropriate for objective and quick evaluation of intelligibility. In this study we investigate the applicability of the method to speech disorders caused by head and neck cancer. Intelligibility was quantified by speech recognition on recordings of a standard text read by 41 German laryngectomized patients with cancer of the larynx or hypopharynx and 49 German patients who had suffered from oral cancer. The speech recognition provides the percentage of correctly recognized words of a sequence, that is, the word recognition rate. Automatic evaluation was compared to perceptual ratings by a panel of experts and to an age-matched control group. Both patient groups showed significantly lower word recognition rates than the control group. Automatic speech recognition yielded word recognition rates that agreed with the experts' evaluation of intelligibility at a significant level. Automatic speech recognition thus provides a low-effort means of objectifying and quantifying the most important aspect of pathologic speech—its intelligibility. The system was successfully applied to voice and speech disorders.
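The word recognition rate described in this abstract, the percentage of the read text's words that the recognizer gets right, can be sketched as follows. This is a generic illustration, not the study's evaluation code; the alignment via difflib is our simplification.

```python
import difflib

def word_recognition_rate(reference, hypothesis):
    """Percentage of reference words correctly produced by the recognizer,
    using a longest-matching-blocks alignment (a simplification)."""
    ref = reference.lower().split()
    hyp = hypothesis.lower().split()
    matcher = difflib.SequenceMatcher(a=ref, b=hyp)
    correct = sum(block.size for block in matcher.get_matching_blocks())
    return 100.0 * correct / len(ref)
```

For example, if the recognizer drops one word of a four-word reference, word_recognition_rate("the quick brown fox", "the quick fox") yields 75.0.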

  11. Robust Speech/Non-Speech Classification in Heterogeneous Multimedia Content

    NARCIS (Netherlands)

    Huijbregts, M.A.H.; de Jong, Franciska M.G.

    In this paper we present a speech/non-speech classification method that allows high quality classification without the need to know in advance what kinds of audible non-speech events are present in an audio recording and that does not require a single parameter to be tuned on in-domain data. Because

  12. Speech Acquisition and Automatic Speech Recognition for Integrated Spacesuit Audio Systems

    Science.gov (United States)

    Huang, Yiteng; Chen, Jingdong; Chen, Shaoyan

    2010-01-01

A voice-command human-machine interface system has been developed for spacesuit extravehicular activity (EVA) missions. A multichannel acoustic signal processing method has been created for distant speech acquisition in noisy and reverberant environments. This technology reduces noise by exploiting differences in the statistical nature of signal (i.e., speech) and noise in the spatial and temporal domains. As a result, automatic speech recognition (ASR) accuracy can be improved to the level at which crewmembers would find the speech interface useful. The developed speech human/machine interface will enable both crewmember usability and operational efficiency: it allows fast data/text entry, has a small overall size, and is lightweight. In addition, this design frees the hands and eyes of a suited crewmember. The system components and steps include beamforming/multichannel noise reduction, single-channel noise reduction, speech feature extraction, feature transformation and normalization, feature compression, model adaptation, ASR HMM (Hidden Markov Model) training, and ASR decoding. A state-of-the-art phoneme recognizer can obtain an accuracy rate of 65 percent when the training and testing data are free of noise. When it is used in spacesuits, the rate drops to about 33 percent. With the developed microphone-array speech-processing technologies, the performance is improved and the phoneme recognition accuracy rate rises to 44 percent. The recognizer can be further improved by combining the microphone array and HMM model adaptation techniques and by using speech samples collected from inside spacesuits. In addition, arithmetic complexity models for the major HMM-based ASR components were developed. They can help real-time ASR system designers select proper tasks when facing constraints in computational resources.
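The multichannel front end above combines microphone signals that differ mainly in arrival time. A minimal delay-and-sum beamformer conveys the core idea; the actual spacesuit system is more sophisticated, and the integer sample delays and np.roll's circular wrap-around used here are simplifying assumptions.

```python
import numpy as np

def delay_and_sum(channels, delays):
    """channels: (n_mics, n_samples) array; delays: per-mic integer sample delays.
    Advancing each channel by its delay aligns the speech wavefronts, so speech
    adds coherently while spatially uncorrelated noise partially cancels."""
    aligned = [np.roll(sig, -d) for sig, d in zip(channels, delays)]
    return np.mean(aligned, axis=0)
```

With the true propagation delays, the speech component is reconstructed at full amplitude while the power of independent noise drops roughly in proportion to the number of microphones.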

  13. Song and speech: examining the link between singing talent and speech imitation ability

    Directory of Open Access Journals (Sweden)

Markus Christiner

    2013-11-01

Full Text Available In previous research on speech imitation, musicality and an ability to sing were isolated as the strongest indicators of good pronunciation skills in foreign languages. We therefore wanted to take a closer look at the nature of the ability to sing, which shares a common ground with the ability to imitate speech. This study focuses on whether good singing performance predicts good speech imitation. Forty-one singers of different levels of proficiency were selected for the study, and their ability to sing, their ability to imitate speech, their musical talent and their working memory were tested. Results indicated that singing performance is a better indicator of the ability to imitate speech than the playing of a musical instrument. A multiple regression revealed that 64% of the speech imitation score variance could be explained by working memory together with educational background and singing performance. A second multiple regression showed that 66% of the speech imitation variance for completely unintelligible and unfamiliar language stimuli (Hindi) could be explained by working memory together with a singer’s sense of rhythm and quality of voice. This supports the idea that both vocal behaviors have a common grounding in terms of vocal and motor flexibility, ontogenetic and phylogenetic development, neural orchestration and sound memory, with singing fitting better into the category of "speech" on the productive level and "music" on the acoustic level. As a result, good singers benefit from vocal and motor flexibility, productively and cognitively, in three ways: (1) motor flexibility and the ability to sing improve language and musical function; (2) good singers retain a certain plasticity and are open to new and unusual sound combinations during adulthood, both perceptually and productively; (3) the ability to sing improves the memory span of auditory short-term memory.

  14. The influence of age, hearing, and working memory on the speech comprehension benefit derived from an automatic speech recognition system.

    Science.gov (United States)

    Zekveld, Adriana A; Kramer, Sophia E; Kessens, Judith M; Vlaming, Marcel S M G; Houtgast, Tammo

    2009-04-01

    The aim of the current study was to examine whether partly incorrect subtitles that are automatically generated by an Automatic Speech Recognition (ASR) system, improve speech comprehension by listeners with hearing impairment. In an earlier study (Zekveld et al. 2008), we showed that speech comprehension in noise by young listeners with normal hearing improves when presenting partly incorrect, automatically generated subtitles. The current study focused on the effects of age, hearing loss, visual working memory capacity, and linguistic skills on the benefit obtained from automatically generated subtitles during listening to speech in noise. In order to investigate the effects of age and hearing loss, three groups of participants were included: 22 young persons with normal hearing (YNH, mean age = 21 years), 22 middle-aged adults with normal hearing (MA-NH, mean age = 55 years) and 30 middle-aged adults with hearing impairment (MA-HI, mean age = 57 years). The benefit from automatic subtitling was measured by Speech Reception Threshold (SRT) tests (Plomp & Mimpen, 1979). Both unimodal auditory and bimodal audiovisual SRT tests were performed. In the audiovisual tests, the subtitles were presented simultaneously with the speech, whereas in the auditory test, only speech was presented. The difference between the auditory and audiovisual SRT was defined as the audiovisual benefit. Participants additionally rated the listening effort. We examined the influences of ASR accuracy level and text delay on the audiovisual benefit and the listening effort using a repeated measures General Linear Model analysis. In a correlation analysis, we evaluated the relationships between age, auditory SRT, visual working memory capacity and the audiovisual benefit and listening effort. The automatically generated subtitles improved speech comprehension in noise for all ASR accuracies and delays covered by the current study. Higher ASR accuracy levels resulted in more benefit obtained
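The audiovisual benefit defined in this abstract is a simple difference of thresholds. Since an SRT is the signal-to-noise ratio needed for 50% correct perception, and lower is better, the computation reduces to:

```python
def audiovisual_benefit(srt_auditory_db, srt_audiovisual_db):
    """SRTs are in dB SNR; a lower SRT means better comprehension, so a
    positive difference means the automatically generated subtitles helped."""
    return srt_auditory_db - srt_audiovisual_db
```

For example, with illustrative values (not the study's data), audiovisual_benefit(-2.0, -5.5) gives a benefit of 3.5 dB.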

  15. Tackling the complexity in speech

    DEFF Research Database (Denmark)

section includes four carefully selected chapters. They deal with facets of speech production, speech acoustics, and/or speech perception or recognition, place them in an integrated phonetic-phonological perspective, and relate them in more or less explicit ways to aspects of speech technology. Therefore, we hope that this volume can help speech scientists with traditional training in phonetics and phonology to keep up with the latest developments in speech technology. In the opposite direction, speech researchers starting from a technological perspective will hopefully get inspired by reading about the questions, phenomena, and communicative functions that are currently addressed in phonetics and phonology. Either way, the future of speech research lies in international, interdisciplinary collaborations, and our volume is meant to reflect and facilitate such collaborations.

  16. Developing speech resources from parliamentary data for South African english

    CSIR Research Space (South Africa)

    De Wet, Febe

    2016-05-01

    Full Text Available Workshop on Spoken Language Technology for Under-resourced Languages, SLTU 2016, 9-12 May 2016, Yogyakarta, Indonesia Developing Speech Resources from Parliamentary Data for South African English Febe de Wet*, Jaco Badenhorst, Thipe Modipa Human...

  17. Neurological manifestations in speech after snake bite: A rare case ...

    African Journals Online (AJOL)

Neurological manifestations in speech after snake bite: A rare case. D Vir, D Gupta, M Modi, N Panda. Abstract: No Abstract. http://dx.doi.org/10.4314/pamj.v4i1.53597

  18. Innovative Speech Reconstructive Surgery

    OpenAIRE

    Hashem Shemshadi

    2003-01-01

Proper speech functioning in human beings depends on precise coordination and timing balances in a series of complex neuromuscular movements and actions: starting from the prime organ of energy, the source of expelled air from the respiratory system; delivering such air to trigger the vocal cords; swift changes of this phonatory episode into a comprehensible sound in resonance; and final coordination of all head and neck structures to elicit final speech in ...

  19. The chairman's speech

    International Nuclear Information System (INIS)

    Allen, A.M.

    1986-01-01

The paper contains a transcript of a speech by the chairman of the UKAEA, to mark the publication of the 1985/6 annual report. The topics discussed in the speech include: the Chernobyl accident and its effect on public attitudes to nuclear power, management and disposal of radioactive waste, the operation of UKAEA as a trading fund, and the UKAEA development programmes. The development programmes include work on the following: fast reactor technology, thermal reactors, reactor safety, health and safety aspects of water cooled reactors, the Joint European Torus, and underlying research. (U.K.)

  20. A Customizable Text Classifier for Text Mining

    Directory of Open Access Journals (Sweden)

    Yun-liang Zhang

    2007-12-01

Full Text Available Text mining deals with complex and unstructured texts, and usually requires a collection of texts specific to one or more domains. We have developed a customizable text classifier that lets users mine such a collection automatically. It derives from the sentence-category framework of the HNC theory and its corresponding techniques. It can start with only a few texts, and it can adjust itself automatically or be adjusted by the user. The user can also control the number of domains chosen and set the standard by which texts are selected, based on demand and the abundance of material. The performance of the classifier varies with the user's choices.

  1. Speech Outcomes after Tonsillectomy in Patients with Known Velopharyngeal Insufficiency

    Directory of Open Access Journals (Sweden)

    L. M. Paulson

    2012-01-01

Full Text Available Introduction. Controversy exists over whether tonsillectomy will affect speech in patients with known velopharyngeal insufficiency (VPI), particularly in those with cleft palate. Methods. All patients seen at the OHSU Doernbecher Children's Hospital VPI clinic between 1997 and 2010 with VPI who underwent tonsillectomy were reviewed. Speech parameters were assessed before and after tonsillectomy. Wilcoxon rank-sum testing was used to evaluate for significance. Results. A total of 46 patients with VPI underwent tonsillectomy during this period. Twenty-three had pre- and postoperative speech evaluation sufficient for analysis. The majority (87%) had a history of cleft palate. Indications for tonsillectomy included obstructive sleep apnea in 11 (48%) and staged tonsillectomy prior to pharyngoplasty in 10 (43%). There was no significant difference between pre- and postoperative speech intelligibility or velopharyngeal competency in this population. Conclusion. In this study, tonsillectomy in patients with VPI did not significantly alter speech intelligibility or velopharyngeal competence.

  2. Histogram Equalization to Model Adaptation for Robust Speech Recognition

    Directory of Open Access Journals (Sweden)

    Suh Youngjoo

    2010-01-01

Full Text Available We propose a new model adaptation method based on the histogram equalization technique for providing robustness in noisy environments. The trained acoustic mean models of a speech recognizer are adapted to environmentally matched conditions by using the histogram equalization algorithm on a single-utterance basis. For more robust speech recognition in heavily noisy conditions, the trained acoustic covariance models are efficiently adapted by signal-to-noise-ratio-dependent linear interpolation between the trained covariance models and utterance-level sample covariance models. Speech recognition experiments on both the digit-based Aurora2 task and a large-vocabulary task showed that the proposed model adaptation approach provides significant performance improvements compared to the baseline speech recognizer trained on clean speech data.
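As a rough illustration of the two ideas in this abstract, the sketch below equalizes an utterance's feature histogram against a reference distribution (CDF matching) and blends trained and sample covariances with an SNR-dependent weight. Note the paper applies equalization to the acoustic mean models rather than to raw features, and the linear SNR ramp here is a hypothetical choice of weighting function.

```python
import numpy as np

def histogram_equalize(feats, ref):
    """Map each feature dimension of `feats` (frames x dims) onto the
    distribution of `ref` (frames x dims) by matching empirical CDFs."""
    out = np.empty_like(feats, dtype=float)
    for d in range(feats.shape[1]):
        ranks = np.argsort(np.argsort(feats[:, d]))  # rank of each frame's value
        cdf = (ranks + 0.5) / feats.shape[0]         # empirical CDF value in (0, 1)
        out[:, d] = np.quantile(ref[:, d], cdf)      # inverse reference CDF
    return out

def interpolate_cov(cov_trained, cov_sample, snr_db, lo=0.0, hi=20.0):
    """SNR-dependent linear interpolation: trust the trained covariance in
    clean conditions, the utterance-level sample covariance in noisy ones."""
    w = np.clip((snr_db - lo) / (hi - lo), 0.0, 1.0)
    return w * cov_trained + (1.0 - w) * cov_sample
```

At high SNR the interpolation returns the trained covariance unchanged; at low SNR it falls back entirely on the utterance-level estimate.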

  3. The politeness prosody of the Javanese directive speech

    Directory of Open Access Journals (Sweden)

    F.X. Rahyono

    2009-10-01

Full Text Available This experimental phonetic research deals with the prosody of directive speech in Javanese. The research procedures were: (1) speech production, (2) acoustic analysis, and (3) perception testing. The data investigated are three directive utterances, in the form of statements, commands, and questions. The data were obtained by recording dialogues that present polite as well as impolite speech. Three acoustic experiments were conducted for statements, commands, and questions in directive speech: (1) modifications of duration, (2) modifications of contour, and (3) modifications of fundamental frequency. The results of the subsequent perception tests, with 90 stimuli and 24 subjects, were analysed statistically with ANOVA (Analysis of Variance). Based on this statistical analysis, the prosodic characteristics of polite and impolite speech were identified.

  4. Indonesian Automatic Speech Recognition For Command Speech Controller Multimedia Player

    Directory of Open Access Journals (Sweden)

    Vivien Arief Wardhany

    2014-12-01

Full Text Available The purpose of this multimedia-device development is control through voice. Nowadays, voice commands can be recognized only in English; to overcome this, recognition was implemented with an Indonesian language model, acoustic model, and dictionary. The automatic speech recognizer is built on the CMU Sphinx engine, with the English language database replaced by an Indonesian one, and XBMC is used as the multimedia player. The experiment used 10 volunteers (5 male, 5 female) testing items based on 7 commands; 10 samples were taken for each command, each volunteer performed 10 test commands, and every volunteer tried all 7 commands provided. Based on the classification table, the word “Kanan” was recognized most often (83%), while “Pilih” was recognized least often. The word most often misclassified was “Kembali” (67%), while “Kanan” was misclassified least often. For male voices, several commands, such as “Kembali”, “Utama”, “Atas” and “Bawah”, had low recognition rates (RR). In particular, “Kembali” could not be recognized at all in the female voices and reached only 4% RR in the male voices, because the command has no similar-sounding English word near “kembali”, so the system failed to recognize it. The command “Pilih” reached 80% RR with female voices but only 4% with male voices. This is mostly due to the different voice characteristics of adult males and females: males have lower voice frequencies (85 to 180 Hz) than females (165 to 255 Hz). The results of the experiment showed that each speaker had a different recognition rate, caused by differences in tone, pronunciation, and speed of speech. Further work is needed to improve the accuracy of the Indonesian automatic speech recognition system.
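The per-command recognition rates reported above are simple proportions over repeated trials. A minimal tally, shown here with made-up outcome data rather than the study's, could look like:

```python
def recognition_rate(trials):
    """trials: dict mapping a command to a list of booleans
    (True = utterance correctly recognized as that command).
    Returns the recognition rate per command, in percent."""
    return {cmd: 100.0 * sum(ok) / len(ok) for cmd, ok in trials.items()}
```

For example, recognition_rate({"kanan": [True] * 83 + [False] * 17, "pilih": [True] * 4 + [False] * 96}) returns {"kanan": 83.0, "pilih": 4.0}.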

  5. Visualizing structures of speech expressiveness

    DEFF Research Database (Denmark)

    Herbelin, Bruno; Jensen, Karl Kristoffer; Graugaard, Lars

    2008-01-01

Speech is both beautiful and informative. In this work, a conceptual study of speech, through investigation of the tower of Babel, the archetypal phonemes, and a study of the reasons for the use of language, is undertaken in order to create an artistic work investigating the nature of speech. The artwork is presented at the Re:New festival in May 2008.

  6. Capitalising on North American speech resources for the development of a South African English large vocabulary speech recognition system

    CSIR Research Space (South Africa)

    Kamper, H

    2014-11-01

Full Text Available -West University, Vanderbijlpark, South Africa; Human Language Technologies Research Group, Meraka Institute, CSIR, Pretoria, South Africa. Abstract: The NCHLT speech...

  7. Thinking soap But Speaking ‘oaps’. The Sound Preparation Period: Backward Calculation From Utterance to Muscle Innervation

    Directory of Open Access Journals (Sweden)

    Nora Wiedenmann

    2010-04-01

    Full Text Available

In this article’s model—on speech and on speech errors, dyscoordinations, and disorders—the time-course from the muscle innervation impetuses to the utterance of sounds as intended for canonical speech sound sequences is calculated backward. This time-course is shown as the sum of all the known physiological durations of the speech sounds and speech gestures that are necessary to produce an utterance. The model introduces two internal clocks, based on positive or negative factors, representing certain physiologically based time-courses during the sound preparation period (Lautvorspann). The use of these internal clocks shows that speech gestures—like other motor activities—work according to a simple serialization principle: under non-default conditions, alterations of the time-courses may cause speech errors of sound serialization, dyscoordinations of sounds as observed during first language acquisition, or speech disorders as pathological cases. These alterations of the time-course are modelled by varying the two internal-clock factors. The calculation of time-courses uses as default values the sound durations of the context-dependent Munich PHONDAT Database of Spoken German (see Appendix 4). As a new, human approach, this calculation agrees mathematically with the approach of Linear Programming / Operations Research. This work gives strong support to the fairly old suspicion (from 1908) of the famous Austrian speech error scientist Meringer [15], namely that one mostly thinks and articulates in a different serialization than is audible from one's uttered sound sequences.

  8. Digitized Ethnic Hate Speech: Understanding Effects of Digital Media Hate Speech on Citizen Journalism in Kenya

    Directory of Open Access Journals (Sweden)

    Stephen Gichuhi Kimotho

    2016-06-01

Full Text Available Ethnicity in Kenya permeates all spheres of life. However, it is in politics that ethnicity is most visible. Election time in Kenya often leads to ethnic competition and hatred, often expressed through various media. Ethnic hate speech characterized the 2007 general elections in party rallies and through text messages, emails, posters and leaflets. This resulted in widespread skirmishes that left over 1200 people dead and many displaced (KNHRC, 2008). In 2013, however, the new battle zone was the war of words on social media platforms. More than at any other time in Kenyan history, Kenyans poured vitriolic ethnic hate speech through digital media like Facebook, Twitter and blogs. Although scholars have studied the role and effects of mainstream media like television and radio in proliferating ethnic hate speech in Kenya (Michael Chege, 2008; Goldstein & Rotich, 2008a; Ismail & Deane, 2008; Jacqueline Klopp & Prisca Kamungi, 2007), little has been done in regard to social media. This paper investigated the nature of digitized hate speech by describing: the forms of ethnic hate speech on social media in Kenya; the effects of ethnic hate speech on Kenyans' perception of ethnic entities; and ethnic conflict and the ethics of citizen journalism. This study adopted a descriptive interpretive design and utilized Austin's Speech Act Theory, which explains the use of language to achieve desired purposes and direct behaviour (Tarhom & Miracle, 2013). Content published between January and April 2013 from six purposefully identified blogs was analysed. Questionnaires were used to collect data from university students, as they form a good sample of the Kenyan population, are most active on social media, and are drawn from all parts of the country. Qualitative data were analysed using NVIVO 10 software, while responses from the questionnaire were analysed using IBM SPSS version 21. The findings indicated that Facebook and Twitter were the main platforms used to

  9. Processing changes when listening to foreign-accented speech

    Directory of Open Access Journals (Sweden)

Carlos Romero-Rivas

    2015-03-01

Full Text Available This study investigates the mechanisms responsible for fast changes in processing foreign-accented speech. Event-Related brain Potentials (ERPs) were obtained while native speakers of Spanish listened to native and foreign-accented speakers of Spanish. We observed a less positive P200 component for foreign-accented speech relative to native speech comprehension. This suggests that the extraction of spectral information and other important acoustic features was hampered during foreign-accented speech comprehension. However, the amplitude of the N400 component for foreign-accented speech comprehension decreased across the experiment, suggesting the use of a higher-level, lexical mechanism. Furthermore, during native speech comprehension, semantic violations in the critical words elicited an N400 effect followed by a late positivity. During foreign-accented speech comprehension, semantic violations only elicited an N400 effect. Overall, our results suggest that, despite a lack of improvement in phonetic discrimination, native listeners experience changes at lexical-semantic levels of processing after brief exposure to foreign-accented speech. Moreover, these results suggest that lexical access, semantic integration and linguistic re-analysis processes are permeable to external factors, such as the accent of the speaker.

  10. Relationship between the stuttering severity index and speech rate

    Directory of Open Access Journals (Sweden)

    Claudia Regina Furquim de Andrade

    Full Text Available CONTEXT: The speech rate is one of the parameters considered when investigating speech fluency and is an important variable in the assessment of individuals with communication complaints. OBJECTIVE: To correlate the stuttering severity index with one of the indices used for assessing fluency/speech rate. DESIGN: Cross-sectional study. SETTING: Fluency and Fluency Disorders Investigation Laboratory, Faculdade de Medicina da Universidade de São Paulo. PARTICIPANTS: Seventy adults with stuttering diagnosis. MAIN MEASUREMENTS: A speech sample from each participant containing at least 200 fluent syllables was videotaped and analyzed according to a stuttering severity index test and speech rate parameters. RESULTS: The results obtained in this study indicate that the stuttering severity and the speech rate present significant variation, i.e., the more severe the stuttering is, the lower the speech rate in words and syllables per minute. DISCUSSION AND CONCLUSION: The results suggest that speech rate is an important indicator of fluency levels and should be incorporated in the assessment and treatment of stuttering. This study represents a first attempt to identify the possible subtypes of developmental stuttering. DEFINITION: Objective tests that quantify diseases are important in their diagnosis, treatment and prognosis.
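The speech-rate measures used in the study, words per minute and syllables per minute, reduce to simple ratios over the duration of the fluent sample. The helper below is a generic sketch, not the laboratory's assessment protocol:

```python
def speech_rate(n_words, n_syllables, duration_seconds):
    """Return (words per minute, syllables per minute) for a speech sample."""
    minutes = duration_seconds / 60.0
    return n_words / minutes, n_syllables / minutes
```

For instance, a 90-second sample containing 180 words and 270 fluent syllables gives speech_rate(180, 270, 90) == (120.0, 180.0); under the correlation reported above, more severe stuttering would push both numbers down.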

  11. Infants' preference for native audiovisual speech dissociated from congruency preference.

    Directory of Open Access Journals (Sweden)

    Kathleen Shaw

Full Text Available Although infant speech perception is often studied in isolated modalities, infants' experience with speech is largely multimodal (i.e., the speech sounds they hear are accompanied by articulating faces). Across two experiments, we tested infants' sensitivity to the relationship between the auditory and visual components of audiovisual speech in their native (English) and non-native (Spanish) language. In Experiment 1, infants' looking times were measured during a preferential looking task in which they saw two simultaneous visual speech streams articulating a story, one in English and the other in Spanish, while they heard either the English or the Spanish version of the story. In Experiment 2, looking times from another group of infants were measured as they watched single displays of congruent and incongruent combinations of English and Spanish audio and visual speech streams. Findings demonstrated an age-related increase in looking towards the native relative to the non-native visual speech stream when accompanied by the corresponding (native) auditory speech. This increase in native language preference did not appear to be driven by a difference in preference for native vs. non-native audiovisual congruence, as we observed no difference in looking times at the audiovisual streams in Experiment 2.

  12. Visual feedback of tongue movement for novel speech sound learning

    Directory of Open Access Journals (Sweden)

    William F Katz

    2015-11-01

Full Text Available Pronunciation training studies have yielded important information concerning the processing of audiovisual (AV) information. Second language (L2) learners show increased reliance on bottom-up, multimodal input for speech perception (compared to monolingual individuals). However, little is known about the role of viewing one's own speech articulation processes during speech training. The current study investigated whether real-time visual feedback for tongue movement can improve a speaker's learning of non-native speech sounds. An interactive 3D tongue visualization system based on electromagnetic articulography (EMA) was used in a speech training experiment. Native speakers of American English produced a novel speech sound (/ɖ̠/, a voiced, coronal, palatal stop) before, during, and after trials in which they viewed their own speech movements using the 3D model. Talkers' productions were evaluated using kinematic (tongue-tip spatial positioning) and acoustic (burst spectra) measures. The results indicated a rapid gain in accuracy associated with visual feedback training. The findings are discussed with respect to neural models for multimodal speech processing.

  13. Gender and Speech in a Disney Princess Movie

    Directory of Open Access Journals (Sweden)

    Azmi N.J.

    2016-11-01

Full Text Available One of the latest Disney princess movies is Frozen, which was released in 2013. Female characters in Frozen differ from the female characters in previous Disney movies, such as The Little Mermaid and Tangled. In comparison, female characters in Frozen are portrayed as having more heroic values and norms, which makes it interesting to examine their speech characteristics: do they use typical female speech despite having more heroic characteristics? This paper aims to provide insights into the female speech characteristics in this movie based on Lakoff's (1975) model of female speech. Data analysis shows that female and male characters in the movie used an almost equal number of female speech elements in their dialogues. Interestingly, although female characters in the movie do not behave stereotypically, their speech still contains the elements of female speech, such as the use of empty adjectives, questions, hedges and intensifiers. This paper argues that the blurring of boundaries between male and female speech characteristics in this movie is an attempt to break gender stereotyping by showing that female characters share similar characteristics with heroic male characters, and thus should not be seen as inferior to the male characters.

  14. Perception of words and pitch patterns in song and speech

    Directory of Open Access Journals (Sweden)

Julia Merrill

    2012-03-01

Full Text Available This fMRI study examines shared and distinct cortical areas involved in the auditory perception of song and speech at the level of their underlying constituents: words, pitch and rhythm. Univariate and multivariate analyses were performed on the brain activity patterns of six conditions, arranged in a subtractive hierarchy: sung sentences including words, pitch and rhythm; hummed speech prosody and song melody containing only pitch patterns and rhythm; as well as the pure musical or speech rhythm. Systematic contrasts between these balanced conditions, following their hierarchical organization, showed a great overlap between song and speech at all levels in the bilateral temporal lobe, but suggested a differential role of the inferior frontal gyrus (IFG) and intraparietal sulcus (IPS) in processing song and speech. The left IFG was involved in word- and pitch-related processing in speech, the right IFG in processing pitch in song. Furthermore, the IPS showed sensitivity to discrete pitch relations in song as opposed to the gliding pitch in speech. Finally, the superior temporal gyrus and premotor cortex coded for general differences between words and pitch patterns, irrespective of whether they were sung or spoken. Thus, song and speech share many features, which is reflected in a fundamental similarity of the brain areas involved in their perception. However, fine-grained acoustic differences at the word and pitch level are reflected in the activity of the IFG and IPS.

  15. Workshop: Welcoming speech

    International Nuclear Information System (INIS)

    Lummerzheim, D.

    1994-01-01

    The welcoming speech underlines the fact that any validation process, starting with calculation methods and ending with studies on the long-term behaviour of a repository system, can only be effected through laboratory, field and natural-analogue studies. Natural analogues (NA) are used to safeguard the biosphere and to verify whether this safety really exists. (HP) [de

  16. Hearing speech in music.

    Science.gov (United States)

    Ekström, Seth-Reino; Borg, Erik

    2011-01-01

    The masking effect of a piano composition, played at different speeds and in different octaves, on speech-perception thresholds was investigated in 15 normal-hearing and 14 moderately-hearing-impaired subjects. Running speech (just follow conversation, JFC) testing and use of hearing aids increased the everyday validity of the findings. A comparison was made with standard audiometric noises [International Collegium of Rehabilitative Audiology (ICRA) noise and speech-spectrum-filtered noise (SPN)]. All masking sounds, music or noise, were presented at the same equivalent sound level (50 dBA). The results showed a significant effect of piano performance speed and octave: low octave and fast tempo had the largest masking effect; high octave and slow tempo, the smallest. Music had a lower masking effect than did ICRA noise with two or six speakers at normal vocal effort. Music offers an interesting opportunity for studying masking under realistic conditions, where spectral and temporal features can be varied independently. The results have implications for composing music with vocal parts, designing acoustic environments and creating a balance between speech perception and privacy in social settings.

  17. Free Speech Yearbook 1979.

    Science.gov (United States)

    Kane, Peter E., Ed.

    The seven articles in this collection deal with theoretical and practical freedom of speech issues. Topics covered are: the United States Supreme Court, motion picture censorship, and the color line; judicial decision making; the established scientific community's suppression of the ideas of Immanuel Velikovsky; the problems of avant-garde jazz,…

  18. Speech Denoising in White Noise Based on Signal Subspace Low-rank Plus Sparse Decomposition

    Directory of Open Access Journals (Sweden)

    yuan Shuai

    2017-01-01

    Full Text Available In this paper, a new subspace speech enhancement method using low-rank and sparse decomposition is presented. In the proposed method, we first structure the corrupted data as a Toeplitz matrix and estimate its effective rank for the underlying human speech signal. The low-rank and sparse decomposition is then performed, guided by the estimated speech rank, to remove the noise. Extensive experiments have been carried out under white Gaussian noise conditions, and the results show that the proposed method performs better than conventional speech enhancement methods, yielding less residual noise and lower speech distortion.
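As a rough, illustrative sketch of the pipeline this abstract describes (structure the signal as a Toeplitz-type matrix, keep a low-rank approximation, reconstruct), the toy below embeds a 1-D signal in a Hankel matrix, keeps its dominant rank-1 component via power iteration, and averages anti-diagonals back into a signal. The fixed rank of 1, the test signal, and the noise are assumptions for illustration only; the paper estimates the effective rank and also extracts a sparse term.

```python
# Toy signal-subspace denoising: Hankel embedding -> rank-1 approximation
# -> anti-diagonal averaging. Not the authors' implementation.

def hankel(x, rows):
    cols = len(x) - rows + 1
    return [[x[i + j] for j in range(cols)] for i in range(rows)]

def rank1_approx(A, iters=200):
    # Power iteration on A^T A yields the dominant right singular vector.
    m, n = len(A), len(A[0])
    v = [1.0] * n
    for _ in range(iters):
        u = [sum(A[i][j] * v[j] for j in range(n)) for i in range(m)]
        w = [sum(A[i][j] * u[i] for i in range(m)) for j in range(n)]
        norm = sum(x * x for x in w) ** 0.5
        v = [x / norm for x in w]
    u = [sum(A[i][j] * v[j] for j in range(n)) for i in range(m)]
    return [[u[i] * v[j] for j in range(n)] for i in range(m)]

def reconstruct(A):
    # Average each anti-diagonal back into a 1-D signal.
    m, n = len(A), len(A[0])
    out = []
    for k in range(m + n - 1):
        vals = [A[i][k - i] for i in range(m) if 0 <= k - i < n]
        out.append(sum(vals) / len(vals))
    return out

clean = [0.8 ** n for n in range(16)]           # has rank-1 Hankel structure
noisy = [s + 0.01 * ((-1) ** n) for n, s in enumerate(clean)]
denoised = reconstruct(rank1_approx(hankel(noisy, 8)))
err_noisy = max(abs(a - b) for a, b in zip(noisy, clean))
err_denoised = max(abs(a - b) for a, b in zip(denoised, clean))
print(err_denoised < err_noisy)
```

Projecting onto the signal subspace suppresses the noise component that is not aligned with the clean signal's dominant direction, which is why the reconstruction error shrinks.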

  19. The natural statistics of audiovisual speech.

    Directory of Open Access Journals (Sweden)

    Chandramouli Chandrasekaran

    2009-07-01

    Full Text Available Humans, like other animals, are exposed to a continuous stream of signals, which are dynamic, multimodal, extended, and time varying in nature. This complex input space must be transduced and sampled by our sensory systems and transmitted to the brain where it can guide the selection of appropriate actions. To simplify this process, it has been suggested that the brain exploits statistical regularities in the stimulus space. Tests of this idea have largely been confined to unimodal signals and natural scenes. One important class of multisensory signals for which a quantitative input space characterization is unavailable is human speech. We do not understand what signals our brain has to actively piece together from an audiovisual speech stream to arrive at a percept versus what is already embedded in the signal structure of the stream itself. In essence, we do not have a clear understanding of the natural statistics of audiovisual speech. In the present study, we identified the following major statistical features of audiovisual speech. First, we observed robust correlations and close temporal correspondence between the area of the mouth opening and the acoustic envelope. Second, we found the strongest correlation between the area of the mouth opening and vocal tract resonances. Third, we observed that both area of the mouth opening and the voice envelope are temporally modulated in the 2-7 Hz frequency range. Finally, we show that the timing of mouth movements relative to the onset of the voice is consistently between 100 and 300 ms. We interpret these data in the context of recent neural theories of speech which suggest that speech communication is a reciprocally coupled, multisensory event, whereby the outputs of the signaler are matched to the neural processes of the receiver.
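The kind of lagged-correlation analysis this abstract describes (how strongly does the mouth-area series predict the acoustic envelope, and at what delay?) can be illustrated with a toy computation. The signals below are synthetic stand-ins, not the study's data, and the 3-sample delay is an invented example.

```python
# Toy lagged-correlation analysis: find the lag at which a "mouth area"
# series best correlates with an "acoustic envelope" series.
import math

def pearson(x, y):
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x) ** 0.5
    vy = sum((b - my) ** 2 for b in y) ** 0.5
    return cov / (vx * vy)

def best_lag(mouth, envelope, max_lag):
    # Positive lag means the envelope trails the mouth signal.
    scores = {}
    for lag in range(max_lag + 1):
        scores[lag] = pearson(mouth[: len(mouth) - lag], envelope[lag:])
    return max(scores, key=scores.get)

mouth = [math.sin(2 * math.pi * 4 * t / 100) for t in range(100)]  # ~4 Hz
envelope = mouth[-3:] + mouth[:-3]   # same signal, delayed by 3 samples
print(best_lag(mouth, envelope, 10))  # recovers the imposed 3-sample delay
```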

  20. Talker Variability in Audiovisual Speech Perception

    Directory of Open Access Journals (Sweden)

    Shannon eHeald

    2014-07-01

    Full Text Available A change in talker is a change in the context for the phonetic interpretation of acoustic patterns of speech. Different talkers have different mappings between acoustic patterns and phonetic categories, and listeners need to adapt to these differences. Despite this complexity, listeners are adept at comprehending speech in multiple-talker contexts, albeit at a slight but measurable performance cost (e.g., slower recognition). So far, this talker-variability cost has been demonstrated only in audio-only speech. Other research in single-talker contexts has shown, however, that when listeners are able to see a talker's face, speech recognition is improved under adverse listening conditions (e.g., noise or distortion) that can increase uncertainty in the mapping between acoustic patterns and phonetic categories. Does seeing a talker's face reduce the cost of word recognition in multiple-talker contexts? We used a speeded word-monitoring task in which listeners make quick judgments about target-word recognition in single- and multiple-talker contexts. Results show faster recognition performance in single-talker conditions compared to multiple-talker conditions for both audio-only and audio-visual speech. However, recognition time in a multiple-talker context was slower in the audio-visual condition than in the audio-only condition. These results suggest that seeing a talker's face during speech perception may slow recognition by increasing the importance of talker identification, signaling to the listener that a change in talker has occurred.

  1. TongueToSpeech (TTS): Wearable wireless assistive device for augmented speech.

    Science.gov (United States)

    Marjanovic, Nicholas; Piccinini, Giacomo; Kerr, Kevin; Esmailbeigi, Hananeh

    2017-07-01

    Speech is an important aspect of human communication; individuals with speech impairment are unable to communicate vocally in real time. Our team has developed the TongueToSpeech (TTS) device with the goal of augmenting speech communication for the vocally impaired. The proposed device is a wearable wireless assistive device that incorporates a capacitive touch keyboard interface embedded inside a discrete retainer. The device connects to a computer, tablet or smartphone via a Bluetooth connection. The developed TTS application converts text typed by the tongue into audible speech. Our studies have concluded that an 8-contact-point configuration between the tongue and the TTS device yields the best user precision and speed performance. On average, using the TTS device inside the oral cavity takes 2.5 times longer than using the pointer finger on a T9 (Text on 9 keys) keyboard to type the same phrase. In conclusion, we have developed a discrete noninvasive wearable device that allows vocally impaired individuals to communicate in real time.

  2. Speech rhythms and multiplexed oscillatory sensory coding in the human brain.

    Directory of Open Access Journals (Sweden)

    Joachim Gross

    2013-12-01

    Full Text Available Cortical oscillations are likely candidates for segmentation and coding of continuous speech. Here, we monitored continuous speech processing with magnetoencephalography (MEG) to unravel the principles of speech segmentation and coding. We demonstrate that speech entrains the phase of low-frequency (delta, theta) and the amplitude of high-frequency (gamma) oscillations in the auditory cortex. Phase entrainment is stronger in the right and amplitude entrainment is stronger in the left auditory cortex. Furthermore, edges in the speech envelope phase-reset auditory cortex oscillations, thereby enhancing their entrainment to speech. This mechanism adapts to the changing physical features of the speech envelope and enables efficient, stimulus-specific speech sampling. Finally, we show that within the auditory cortex, coupling between delta, theta, and gamma oscillations increases following speech edges. Importantly, all couplings (i.e., brain-speech and also within the cortex) attenuate for backward-presented speech, suggesting top-down control. We conclude that segmentation and coding of speech rely on a nested hierarchy of entrained cortical oscillations.

  3. Metaheuristic applications to speech enhancement

    CERN Document Server

    Kunche, Prajna

    2016-01-01

    This book serves as a basic reference for those interested in the application of metaheuristics to speech enhancement. The major goal of the book is to explain the basic concepts of optimization methods and their use in heuristic optimization in speech enhancement to scientists, practicing engineers, and academic researchers in speech processing. The authors discuss why it has been a challenging problem for researchers to develop new enhancement algorithms that aid the quality and intelligibility of degraded speech. They present powerful optimization methods for speech enhancement that can help solve noise reduction problems. Readers will be able to understand the fundamentals of speech processing as well as the optimization techniques, learn how speech enhancement algorithms are implemented by utilizing optimization methods, and will be given the tools to develop new algorithms. The authors also provide a comprehensive literature survey of the topic.

  4. Stemming Malay Text and Its Application in Automatic Text Categorization

    Science.gov (United States)

    Yasukawa, Michiko; Lim, Hui Tian; Yokoo, Hidetoshi

    In the Malay language, there are no conjugations or declensions, and affixes have important grammatical functions. In Malay, the same word may function as a noun, an adjective, an adverb, or a verb, depending on its position in the sentence. Although simple root words are used extensively in informal conversation, it is essential to use precise words in formal speech or written texts. In Malay, derivative words are used to make sentences clear, and derivation is achieved mainly by the use of affixes. There are approximately a hundred possible derivative forms of a root word in the written language of educated Malay speakers; the composition of Malay words may therefore be complicated. Although several types of stemming algorithms are available for text processing in English and some other languages, they cannot overcome the difficulties of Malay word stemming. Stemming is the process of reducing various words to their root forms in order to improve the effectiveness of text processing in information systems; it is essential to avoid both over-stemming and under-stemming errors. We have developed a new Malay stemmer (stemming algorithm) for removing inflectional and derivational affixes. Our stemmer uses a set of affix rules and two types of dictionaries: a root-word dictionary and a derivative-word dictionary. The rule set is aimed at reducing the occurrence of under-stemming errors, while the dictionaries are intended to reduce the occurrence of over-stemming errors. We performed an experiment to evaluate the application of our stemmer in text mining software. For the experiment, the text data used were actual web pages collected from the World Wide Web, to demonstrate the effectiveness of our Malay stemming algorithm. The experimental results showed that our stemmer can effectively increase the precision of the extracted Boolean expressions for text categorization.
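The rule-plus-dictionary design described in this abstract can be sketched as follows. The affixes shown are common Malay affixes and the tiny root-word dictionary is invented for the example; the authors' actual rule set and dictionaries are far larger, and this is not their implementation.

```python
# Toy dictionary-backed affix stripping in the spirit of the stemmer above:
# affix rules generate candidates, the root-word dictionary validates them.

ROOT_WORDS = {"ajar", "baca", "makan"}          # toy root-word dictionary
PREFIXES = ["mem", "men", "me", "ber", "di", "pel"]
SUFFIXES = ["kan", "an", "i"]

def stem(word):
    # Accept immediately if the word is already a known root
    # (guards against over-stemming).
    if word in ROOT_WORDS:
        return word
    candidates = [word]
    for p in PREFIXES:
        if word.startswith(p):
            candidates.append(word[len(p):])
    more = []
    for c in candidates:
        for s in SUFFIXES:
            if c.endswith(s):
                more.append(c[: -len(s)])
    for c in candidates + more:
        if c in ROOT_WORDS:
            return c
    return word  # fallback: leave unknown words unchanged

print(stem("pelajar"))    # -> "ajar"
print(stem("membaca"))    # -> "baca"
print(stem("makanan"))    # -> "makan"
```

Validating every candidate against the dictionary is what keeps a rule like "strip -an" from mangling words that merely end in those letters.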

  5. Two speeches that changed the world: from Fulton to Zurich

    Directory of Open Access Journals (Sweden)

    Alan John Watson

    2016-12-01

    Full Text Available In this extract from his new book Churchill's Legacy: Two Speeches to Save the World (Watson, 2016), Lord Watson of Richmond draws on his own experience of post-war British politics, as a television presenter and media commentator and then as a Liberal Peer and Chairman of the English-Speaking Union, to analyse the significance of Churchill's Zurich speech of 19 September 1946. He argues that, building on Churchill's earlier speech at Fulton, Missouri, it helped change the perceptions of the West and alter their response to the emerging Cold War and the future of Europe.

  6. Cognitive Spare Capacity and Speech Communication: A Narrative Overview

    Directory of Open Access Journals (Sweden)

    Mary Rudner

    2014-01-01

    Full Text Available Background noise can make speech communication tiring and cognitively taxing, especially for individuals with hearing impairment. It is now well established that better working memory capacity is associated with better ability to understand speech under adverse conditions as well as better ability to benefit from the advanced signal processing in modern hearing aids. Recent work has shown that although such processing cannot overcome hearing handicap, it can increase cognitive spare capacity, that is, the ability to engage in higher level processing of speech. This paper surveys recent work on cognitive spare capacity and suggests new avenues of investigation.

  7. Speech Motor Control in Fluent and Dysfluent Speech Production of an Individual with Apraxia of Speech and Broca's Aphasia

    Science.gov (United States)

    van Lieshout, Pascal H. H. M.; Bose, Arpita; Square, Paula A.; Steele, Catriona M.

    2007-01-01

    Apraxia of speech (AOS) is typically described as a motor-speech disorder with clinically well-defined symptoms, but without a clear understanding of the underlying problems in motor control. A number of studies have compared the speech of subjects with AOS to the fluent speech of controls, but only a few have included speech movement data and if…

  8. FTP: Full-Text Publishing?

    Science.gov (United States)

    Jul, Erik

    1992-01-01

    Describes the use of file transfer protocol (FTP) on the INTERNET computer network and considers its use as an electronic publishing system. The differing electronic formats of text files are discussed; the preparation and access of documents are described; and problems are addressed, including a lack of consistency. (LRW)

  9. Digitized Ethnic Hate Speech: Understanding Effects of Digital Media Hate Speech on Citizen Journalism in Kenya

    Science.gov (United States)

    Kimotho, Stephen Gichuhi; Nyaga, Rahab Njeri

    2016-01-01

    Ethnicity in Kenya permeates all spheres of life. However, it is in politics that ethnicity is most visible. Election time in Kenya often leads to ethnic competition and hatred, often expressed through various media. Ethnic hate speech characterized the 2007 general elections in party rallies and through text messages, emails, posters and…

  10. Predicting automatic speech recognition performance over communication channels from instrumental speech quality and intelligibility scores

    NARCIS (Netherlands)

    Gallardo, L.F.; Möller, S.; Beerends, J.

    2017-01-01

    The performance of automatic speech recognition based on coded-decoded speech heavily depends on the quality of the transmitted signals, determined by channel impairments. This paper examines relationships between speech recognition performance and measurements of speech quality and intelligibility

  11. Modern Tools in Patient-Centred Speech Therapy for Romanian Language

    Directory of Open Access Journals (Sweden)

    Mirela Danubianu

    2016-03-01

    Full Text Available The most common way to communicate with those around us is speech. Suffering from a speech disorder can have negative social effects: from leaving individuals with low confidence and morale to problems with social interaction and the ability to live independently as adults. The speech therapy intervention is a complex process with particular objectives such as: discovery and identification of the speech disorder, and directing the therapy toward correction, recovery, compensation, adaptation and social integration of patients. Computer-based speech therapy systems are a real help for therapists, creating a special learning environment. Romanian is a phonetic language with particular linguistic features. This paper aims to present a few computer-based speech therapy systems developed for the treatment of various speech disorders specific to the Romanian language.

  12. Russian Speech in Radio: Norm and Deviation

    Directory of Open Access Journals (Sweden)

    Igor V. Nefedov

    2017-06-01

    Full Text Available National radio, like television, is called upon to bring to the masses not only relevant information but also a high culture of language. Serious demands have always been placed on oral public speech with regard to correctness and uniformity of pronunciation. Today, however, analysis of broadcast language practice often reveals a discrepancy between the use of linguistic resources and existing literary norms. From the end of December 2016 to early April 2017, the author of this article listened to and analyzed, from the point of view of language correctness, the majority of programs on the radio station Komsomolskaya Pravda (KP). While recognizing the generally good speech competence of the workers of this radio station, as well as of their guests (political scientists, lawyers, historians, etc.), one cannot but note a significant number of errors in their speech. The material presented in the article supports the conclusion that broadcasting is currently losing its position in the field of speech culture. Neglect of the rules of the Russian language on the radio station Komsomolskaya Pravda negatively affects the image of the Russian language formed in the minds of listeners. The language of radio should strive to become a standard of purity and high culture for the population, since it has enormous power of mass impact and supports the unity of the cultural and linguistic space.

  13. Causal inference of asynchronous audiovisual speech

    Directory of Open Access Journals (Sweden)

    John F Magnotti

    2013-11-01

    Full Text Available During speech perception, humans integrate auditory information from the voice with visual information from the face. This multisensory integration increases perceptual precision, but only if the two cues come from the same talker; this requirement has been largely ignored by current models of speech perception. We describe a generative model of multisensory speech perception that includes this critical step of determining the likelihood that the voice and face information have a common cause. A key feature of the model is that it is based on a principled analysis of how an observer should solve this causal inference problem using the asynchrony between two cues and the reliability of the cues. This allows the model to make predictions about the behavior of subjects performing a synchrony judgment task, predictive power that does not exist in other approaches, such as post hoc fitting of Gaussian curves to behavioral data. We tested the model predictions against the performance of 37 subjects performing a synchrony judgment task viewing audiovisual speech under a variety of manipulations, including varying asynchronies, intelligibility, and visual cue reliability. The causal inference model outperformed the Gaussian model across two experiments, providing a better fit to the behavioral data with fewer parameters. Because the causal inference model is derived from a principled understanding of the task, model parameters are directly interpretable in terms of stimulus and subject properties.
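The core causal-inference step this abstract describes can be illustrated with a toy Bayesian computation: compare the likelihood that the measured audiovisual asynchrony arose from a common cause (asynchrony concentrated near zero) against independent causes (a much broader distribution). All numerical parameters below are invented for illustration; they are not the paper's fitted values.

```python
# Toy common-cause posterior for an observed audiovisual asynchrony.
import math

def gauss(x, mu, sigma):
    return math.exp(-0.5 * ((x - mu) / sigma) ** 2) / (sigma * math.sqrt(2 * math.pi))

def p_common_cause(asynchrony_ms, sigma_common=80.0, sigma_indep=400.0, prior=0.5):
    like_c1 = gauss(asynchrony_ms, 0.0, sigma_common)   # common cause: tight around 0
    like_c2 = gauss(asynchrony_ms, 0.0, sigma_indep)    # independent causes: broad
    num = like_c1 * prior
    return num / (num + like_c2 * (1.0 - prior))

# Small asynchronies favor a common cause; large ones favor independent causes.
print(p_common_cause(0.0) > p_common_cause(500.0))
```

With these assumed widths, a perfectly synchronous stimulus yields a common-cause posterior of sigma_indep / (sigma_indep + sigma_common) = 5/6, while a 500 ms asynchrony drives it toward zero.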

  14. Speech is Golden

    DEFF Research Database (Denmark)

    Juel Henrichsen, Peter

    2014-01-01

    Most of the Danish municipalities are ready to begin to adopt automatic speech recognition, but at the same time remain nervous following a long series of bad business cases in the recent past. Complaints are voiced over costly licences and low service levels, typical effects of a de facto monopoly on the supply side. The present article reports on a new public action strategy which has taken shape in the course of 2013-14. While Denmark is a small language area, our public sector is well organised and has considerable purchasing power. Across this past year, Danish local authorities have organised around the speech technology challenge: they have formulated a number of joint questions and new requirements to be met by suppliers and have deliberately worked towards formulating tendering material which will allow fair competition. Public researchers have contributed to this work, including the author...

  15. SPEECH VISUALIZATION SYSTEM AS A BASIS FOR SPEECH TRAINING AND COMMUNICATION AIDS

    Directory of Open Access Journals (Sweden)

    Oliana KRSTEVA

    1997-09-01

    Full Text Available One receives much more information through the visual sense than through the tactile one. However, most visual aids for hearing-impaired persons are not wearable, because it is difficult to make them compact and because permanently occupying the user's vision is undesirable. Generally, it is difficult to obtain integrated patterns by a single mathematical transform of signals, such as a Fourier transform. In order to obtain an integrated pattern, speech parameters should be carefully extracted by an analysis suited to each parameter, and a visual pattern which can be intuitively understood by anyone must be synthesized from them. Successful integration of speech parameters will never disturb understanding of individual features, so that the system can be used for speech training and communication.

  16. Multilevel Analysis in Analyzing Speech Data

    Science.gov (United States)

    Guddattu, Vasudeva; Krishna, Y.

    2011-01-01

    The speech produced by human vocal tract is a complex acoustic signal, with diverse applications in phonetics, speech synthesis, automatic speech recognition, speaker identification, communication aids, speech pathology, speech perception, machine translation, hearing research, rehabilitation and assessment of communication disorders and many…

  17. Speech-Language Therapy (For Parents)

    Science.gov (United States)

    KidsHealth / For Parents / Speech-Language Therapy … most kids with speech and/or language disorders. Speech Disorders, Language Disorders, and Feeding Disorders … A speech ...

  18. [Improving speech comprehension using a new cochlear implant speech processor].

    Science.gov (United States)

    Müller-Deile, J; Kortmann, T; Hoppe, U; Hessel, H; Morsnowski, A

    2009-06-01

    The aim of this multicenter clinical field study was to assess the benefits of the new Freedom 24 sound processor for cochlear implant (CI) users implanted with the Nucleus 24 cochlear implant system. The study included 48 postlingually profoundly deaf experienced CI users who demonstrated speech comprehension performance with their current speech processor on the Oldenburg sentence test (OLSA) in quiet conditions of at least 80% correct scores and who were able to perform adaptive speech threshold testing using the OLSA in noisy conditions. Following baseline measures of speech comprehension performance with their current speech processor, subjects were upgraded to the Freedom 24 speech processor. After a take-home trial period of at least 2 weeks, subject performance was evaluated by measuring the speech reception threshold with the Freiburg multisyllabic word test and speech intelligibility with the Freiburg monosyllabic word test at 50 dB and 70 dB in the sound field. The results demonstrated highly significant benefits for speech comprehension with the new speech processor. Significant benefits for speech comprehension were also demonstrated with the new speech processor when tested in competing background noise. In contrast, use of the Abbreviated Profile of Hearing Aid Benefit (APHAB) did not prove to be a suitably sensitive assessment tool for comparative subjective self-assessment of hearing benefits with each processor. Use of the preprocessing algorithm known as adaptive dynamic range optimization (ADRO) in the Freedom 24 led to additional improvements over the standard upgrade map for speech comprehension in quiet and showed equivalent performance in noise. Through use of the preprocessing beam-forming algorithm BEAM, subjects demonstrated a highly significant improved signal-to-noise ratio for speech comprehension thresholds (i.e., signal-to-noise ratio for 50% speech comprehension scores) when tested with an adaptive procedure using the Oldenburg

  19. Neurophysiology of speech differences in childhood apraxia of speech.

    Science.gov (United States)

    Preston, Jonathan L; Molfese, Peter J; Gumkowski, Nina; Sorcinelli, Andrea; Harwood, Vanessa; Irwin, Julia R; Landi, Nicole

    2014-01-01

    Event-related potentials (ERPs) were recorded during a picture naming task of simple and complex words in children with typical speech and with childhood apraxia of speech (CAS). Results reveal reduced amplitude prior to speaking complex (multisyllabic) words relative to simple (monosyllabic) words for the CAS group over the right hemisphere during a time window thought to reflect phonological encoding of word forms. Group differences were also observed prior to production of spoken tokens regardless of word complexity during a time window just prior to speech onset (thought to reflect motor planning/programming). Results suggest differences in pre-speech neurolinguistic processes.

  20. Recent advances in Automatic Speech Recognition for Vietnamese

    OpenAIRE

    Le , Viet-Bac; Besacier , Laurent; Seng , Sopheap; Bigi , Brigitte; Do , Thi-Ngoc-Diep

    2008-01-01

    International audience; This paper presents our recent activities for automatic speech recognition for Vietnamese. First, our text data collection and processing methods and tools are described. For language modeling, we investigate word, sub-word and also hybrid word/sub-word models. For acoustic modeling, when only limited speech data are available for Vietnamese, we propose some crosslingual acoustic modeling techniques. Furthermore, since the use of sub-word units can reduce the high out-...

  1. A case of crossed aphasia with apraxia of speech

    Directory of Open Access Journals (Sweden)

    Yogesh Patidar

    2013-01-01

    Full Text Available Apraxia of speech (AOS) is a rare but well-defined motor speech disorder. It is characterized by irregular articulatory errors, attempts at self-correction and persistent prosodic abnormalities. Like aphasia, AOS is localized to the dominant cerebral hemisphere. We report a case of crossed aphasia with AOS in a 48-year-old right-handed man due to an ischemic infarct in the right cerebral hemisphere.

  2. Auditory and Cognitive Factors Underlying Individual Differences in Aided Speech-Understanding among Older Adults

    Directory of Open Access Journals (Sweden)

    Larry E. Humes

    2013-10-01

    Full Text Available This study was designed to address individual differences in aided speech understanding among a relatively large group of older adults. The group consisted of 98 adults (50 female and 48 male) ranging in age from 60 to 86 (mean = 69.2). Hearing loss was typical for this age group and about 90% had not worn hearing aids. All subjects completed a battery of tests, including cognitive (6 measures), psychophysical (17 measures), and speech-understanding (9 measures) tests, as well as the Speech, Spatial and Qualities of Hearing (SSQ) self-report scale. Most of the speech-understanding measures made use of competing speech, and the non-speech psychophysical measures were designed to tap phenomena thought to be relevant for the perception of speech in competing speech (e.g., stream segregation, modulation-detection interference). All measures of speech understanding were administered with spectral shaping applied to the speech stimuli to fully restore audibility through at least 4000 Hz. The measures used were demonstrated to be reliable in older adults and, when compared to a reference group of 28 young normal-hearing adults, age-group differences were observed on many of the measures. Principal-components factor analysis was applied successfully to reduce the number of independent and dependent (speech-understanding) measures for a multiple-regression analysis. Doing so yielded one global cognitive-processing factor and five non-speech psychoacoustic factors (hearing loss, dichotic signal detection, multi-burst masking, stream segregation, and modulation detection) as potential predictors. To this set of six potential predictor variables were added subject age, Environmental Sound Identification (ESI), and performance on the text-recognition-threshold (TRT) task (a visual analog of interrupted speech recognition). These variables were used to successfully predict one global aided speech-understanding factor, accounting for about 60% of the variance.

  3. Speech endpoint detection with non-language speech sounds for generic speech processing applications

    Science.gov (United States)

    McClain, Matthew; Romanowski, Brian

    2009-05-01

    Non-language speech sounds (NLSS) are sounds produced by humans that do not carry linguistic information. Examples of these sounds are coughs, clicks, breaths, and filled pauses such as "uh" and "um" in English. NLSS are prominent in conversational speech, but can be a significant source of errors in speech processing applications. Traditionally, these sounds are ignored by speech endpoint detection algorithms, where speech regions are identified in the audio signal prior to processing. The ability to filter NLSS as a pre-processing step can significantly enhance the performance of many speech processing applications, such as speaker identification, language identification, and automatic speech recognition. In order to be used in all such applications, NLSS detection must be performed without the use of language models that provide knowledge of the phonology and lexical structure of speech. This is especially relevant to situations where the languages used in the audio are not known a priori. We present the results of preliminary experiments using data from American and British English speakers, in which segments of audio are classified as language speech sounds (LSS) or NLSS using a set of acoustic features designed for language-agnostic NLSS detection and a hidden Markov model (HMM) to model speech generation. The results of these experiments indicate that the features and model used are capable of detecting certain types of NLSS, such as breaths and clicks, while detection of other types of NLSS, such as filled pauses, will require future research.
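The HMM-based frame labeling this abstract describes can be sketched with a minimal two-state Viterbi decoder. Here each frame carries a single invented "voicing score" feature, and all probabilities are illustrative assumptions, not the paper's trained models or feature set.

```python
# Schematic two-state HMM segmentation: label each frame as language speech
# (LSS) or non-language speech (NLSS) via Viterbi decoding.
import math

STATES = ["LSS", "NLSS"]
TRANS = {"LSS": {"LSS": 0.9, "NLSS": 0.1},
         "NLSS": {"LSS": 0.1, "NLSS": 0.9}}
START = {"LSS": 0.5, "NLSS": 0.5}

def emission(state, voicing):
    # Assumed model: LSS frames tend to be voiced (score near 1),
    # NLSS frames (breaths, clicks) near 0.
    mu = 0.8 if state == "LSS" else 0.2
    return math.exp(-((voicing - mu) ** 2) / 0.08)

def viterbi(frames):
    # path[s] = (best state sequence ending in s, its log probability)
    path = {s: ([s], math.log(START[s]) + math.log(emission(s, frames[0])))
            for s in STATES}
    for f in frames[1:]:
        new = {}
        for s in STATES:
            prev, score = max(
                ((p, path[p][1] + math.log(TRANS[p][s])) for p in STATES),
                key=lambda t: t[1])
            new[s] = (path[prev][0] + [s], score + math.log(emission(s, f)))
        path = new
    return max(path.values(), key=lambda t: t[1])[0]

# A voiced stretch, a breath-like unvoiced stretch, then voiced again.
labels = viterbi([0.9, 0.8, 0.85, 0.1, 0.15, 0.2, 0.9, 0.8])
print(labels)
```

The sticky transition probabilities (0.9 self-loops) smooth over single noisy frames, which is the practical advantage of an HMM over classifying each frame independently.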

  4. A theory of lexical access in speech production [target paper

    NARCIS (Netherlands)

    Levelt, W.J.M.; Roelofs, A.P.A.; Meyer, A.S.

    1999-01-01

    Preparing words in speech production is normally a fast and accurate process. We generate them two or three per second in fluent conversation; and overtly naming a clear picture of an object can easily be initiated within 600 ms after picture onset. The underlying process, however, is exceedingly

  5. Speech therapy in peripheral facial palsy: an orofacial myofunctional approach

    Directory of Open Access Journals (Sweden)

    Hipólito Virgílio Magalhães Júnior

    2009-12-01

    Full Text Available Objective: To delineate the contributions of speech therapy in the rehabilitation of peripheral facial palsy, describing the role of the orofacial myofunctional approach in this process. Methods: A literature review of articles published since 1995, held from March to December 2008, based on the characterization of peripheral facial palsy and its relation to speech-language disorders involving orofacial impairments in mobility, speech and chewing, among others. The review prioritized scientific journal articles and specific chapters from the studied period. As inclusion criteria, the literature should contain data on peripheral facial palsy, references to changes in the stomatognathic system, and references to the orofacial myofunctional approach. We excluded studies that addressed central paralysis, congenital palsy and palsies of non-idiopathic causes. Results: The literature has addressed the contribution of speech therapy to the rehabilitation of facial symmetry, with improvement in the retention of liquids and soft foods during chewing and swallowing. The orofacial myofunctional approach contextualized the role of speech therapy in improving the coordination of speech articulation and in the gain of oral control during chewing and swallowing. Conclusion: Speech therapy in peripheral facial palsy contributed, and was outlined, by applying the orofacial myofunctional approach to the reestablishment of facial symmetry, through work directed at the functions of the stomatognathic system, including orofacial exercises and chewing training in association with articulation training. There is a need for a greater number of publications in this specific area for speech therapy professionals.

  6. Behavioural, computational, and neuroimaging studies of acquired apraxia of speech

    Directory of Open Access Journals (Sweden)

    Kirrie J Ballard

    2014-11-01

    Full Text Available A critical examination of speech motor control depends on an in-depth understanding of network connectivity associated with Brodmann areas 44 and 45 and surrounding cortices. Damage to these areas has been associated with two conditions - the speech motor programming disorder apraxia of speech (AOS) and the linguistic/grammatical disorder Broca's aphasia. Here we focus on AOS, which is most commonly associated with damage to posterior Broca's area and adjacent cortex. We provide an overview of our own studies into the nature of AOS, including behavioral and neuroimaging methods, to explore components of the speech motor network that are associated with normal and disordered speech motor programming in AOS. Behavioral, neuroimaging, and computational modeling studies indicate that AOS is associated with impairment in learning feedforward models and/or implementing feedback mechanisms, and with a functional contribution of BA6. While functional connectivity methods are not yet routinely applied to the study of AOS, we highlight the need to focus on the functional impact of localised lesions throughout the speech network, as well as on larger-scale comparative studies to distinguish the unique behavioral and neurological signature of AOS. By coupling these methods with neural network models, we have a powerful set of tools to improve our understanding of the neural mechanisms that underlie AOS, and speech production generally.

  7. Temporal factors affecting somatosensory-auditory interactions in speech processing

    Directory of Open Access Journals (Sweden)

    Takayuki eIto

    2014-11-01

    Full Text Available Speech perception is known to rely on both auditory and visual information. However, sound-specific somatosensory input has also been shown to influence speech perceptual processing (Ito et al., 2009). In the present study we addressed further the relationship between somatosensory information and speech perceptual processing by testing the hypothesis that the temporal relationship between orofacial movement and sound processing contributes to somatosensory-auditory interaction in speech perception. We examined the changes in event-related potentials in response to multisensory synchronous (simultaneous) and asynchronous (90 ms lag and lead) somatosensory and auditory stimulation compared to unisensory auditory and somatosensory stimulation alone. We used a robotic device to apply facial skin somatosensory deformations that were similar in timing and duration to those experienced in speech production. Following synchronous multisensory stimulation the amplitude of the event-related potential was reliably different from the two unisensory potentials. More importantly, the magnitude of the event-related potential difference varied as a function of the relative timing of the somatosensory-auditory stimulation. Event-related activity change due to stimulus timing was seen between 160 and 220 ms following somatosensory onset, mostly around the parietal area. The results demonstrate a dynamic modulation of somatosensory-auditory convergence and suggest that the contribution of somatosensory information to speech processing is dependent on the specific temporal order of sensory inputs in speech production.

  8. Speech intelligibility after gingivectomy of excess palatal tissue

    Directory of Open Access Journals (Sweden)

    Aruna Balasundaram

    2014-01-01

    Full Text Available To appreciate any enhancement in speech following gingivectomy of enlarged anterior palatal gingiva. The periodontal literature has documented various conditions, pathophysiology, and treatment modalities of gingival enlargement. The relationship between gingival maladies and speech alteration has received scant attention. This case report describes enhancement of an altered speech pattern secondary to a gingivectomy procedure. A systemically healthy 24-year-old female patient presented with bilateral anterior gingival enlargement, provisionally diagnosed as "gingival abscess with inflammatory enlargement" in relation to the palatal aspect of the right maxillary canine to the left maxillary canine. A bilateral gingivectomy was performed by external bevel incision on the anterior palatal gingiva, and a large wedge of epithelium and connective tissue was removed. The patient and her close acquaintances noticed a great improvement in her pronunciation and enunciation of sounds like "t", "d", "n", "l" and "th" following removal of the excess palatal gingival tissue; this improvement was also rated on a visual analog scale. Linguistic research has documented the significance of tongue-palate contact during speech. Any excess gingival tissue in the palatal region disrupts speech by altering tongue-palate contact. Periodontal surgery such as gingivectomy may improve the disrupted phonetics. Excess palatal gingival tissue impedes tongue-palate contact and interferes with speech. Pronunciation of consonants like "t", "d", "n", "l" and "th" is altered by enlarged anterior palatal gingiva. Excision of the enlarged palatal tissue results in improvement of speech.

  9. Random Deep Belief Networks for Recognizing Emotions from Speech Signals

    Directory of Open Access Journals (Sweden)

    Guihua Wen

    2017-01-01

    Full Text Available Human emotions can now be recognized from speech signals using machine learning methods; however, recognition accuracy in real applications remains low, owing to a lack of rich representational ability. Deep belief networks (DBN) can automatically discover multiple levels of representation in speech signals. To take full advantage of this, this paper presents an ensemble of random deep belief networks (RDBN) for speech emotion recognition. It first extracts low-level features of the input speech signal and uses them to construct many random subspaces. Each random subspace is then fed to a DBN, which yields higher-level features that a classifier maps to an emotion label. All output emotion labels are then fused through majority voting to decide the final emotion label for the input speech signal. Experimental results on benchmark speech emotion databases show that RDBN achieves better accuracy than the compared methods for speech emotion recognition.
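
The ensemble recipe (random feature subspaces, one learner per subspace, majority-vote fusion) can be sketched as follows, with a simple nearest-centroid classifier standing in for each DBN-plus-classifier pipeline. The class and function names are illustrative only:

```python
import numpy as np

rng = np.random.default_rng(1)

class SubspaceClassifier:
    """One ensemble member: a nearest-centroid classifier trained on a
    random feature subspace (a stand-in for a DBN + classifier)."""
    def __init__(self, n_features, subspace_size):
        self.idx = rng.choice(n_features, size=subspace_size, replace=False)
    def fit(self, X, y):
        self.labels = np.unique(y)
        self.centroids = np.stack([X[y == c][:, self.idx].mean(axis=0)
                                   for c in self.labels])
        return self
    def predict(self, X):
        Xs = X[:, self.idx]                               # project into subspace
        d = np.linalg.norm(Xs[:, None, :] - self.centroids[None, :, :], axis=2)
        return self.labels[d.argmin(axis=1)]

def majority_vote(members, X):
    """Fuse the members' emotion labels by majority voting."""
    votes = np.stack([m.predict(X) for m in members])     # (n_members, n_samples)
    return np.array([np.bincount(col).argmax() for col in votes.T])
```

Each member sees only its own random projection of the features, so the members make partly independent errors, which is exactly what the voting step exploits.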

  10. Speech recognition systems on the Cell Broadband Engine

    Energy Technology Data Exchange (ETDEWEB)

    Liu, Y; Jones, H; Vaidya, S; Perrone, M; Tydlitat, B; Nanda, A

    2007-04-20

    In this paper we describe our design, implementation, and first results of a prototype connected-phoneme-based speech recognition system on the Cell Broadband Engine™ (Cell/B.E.). Automatic speech recognition decodes speech samples into plain text (other representations are possible) and must process samples at real-time rates. Fortunately, the computational tasks involved in this pipeline are highly data-parallel and can receive significant hardware acceleration from vector-streaming architectures such as the Cell/B.E. Identifying and exploiting these parallelism opportunities is challenging, but also critical to improving system performance. We observed, from our initial performance timings, that a single Cell/B.E. processor can recognize speech from thousands of simultaneous voice channels in real time--a channel density that is orders-of-magnitude greater than the capacity of existing software speech recognizers based on CPUs (central processing units). This result emphasizes the potential for Cell/B.E.-based speech recognition and will likely lead to the future development of production speech systems using Cell/B.E. clusters.

  11. Multiengine Speech Processing Using SNR Estimator in Variable Noisy Environments

    Directory of Open Access Journals (Sweden)

    Ahmad R. Abu-El-Quran

    2012-01-01

    Full Text Available We introduce a multiengine speech processing system that can detect the location and the type of audio signal in variable noisy environments. The system detects the location of the audio source using a microphone array; it examines the audio first, determines whether it is speech or nonspeech, then estimates the signal-to-noise ratio (SNR) using a Discrete-Valued SNR Estimator. Using this SNR value, instead of trying to adapt the speech signal to the speech processing system, we adapt the speech processing system to the surrounding environment of the captured speech signal. In this paper, we introduce the Discrete-Valued SNR Estimator and a multiengine classifier, using either Multiengine Selection or Multiengine Weighted Fusion, with speaker identification (SI) as the example speech processing task. The Discrete-Valued SNR Estimator achieves an accuracy of 98.4% in characterizing the environment's SNR. Compared to a conventional single-engine SI system, the improvement in accuracy was as high as 9.0% and 10.0% for Multiengine Selection and Multiengine Weighted Fusion, respectively.
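
A minimal sketch of the core idea, snapping an SNR estimate to a discrete trained level and selecting the engine matched to it, might look like this (the bin levels and function names are assumptions, not the paper's):

```python
import numpy as np

SNR_BINS = [0, 5, 10, 15, 20]   # dB levels the engines were trained at (illustrative)

def estimate_snr_db(signal, noise):
    """Estimate SNR in dB from a speech segment and a noise-only segment."""
    p_sig = np.mean(np.asarray(signal, float) ** 2)
    p_noise = np.mean(np.asarray(noise, float) ** 2)
    return 10 * np.log10(p_sig / p_noise)

def discrete_snr(signal, noise):
    """Discrete-valued SNR estimate: snap to the nearest trained level."""
    snr = estimate_snr_db(signal, noise)
    return min(SNR_BINS, key=lambda b: abs(b - snr))

def select_engine(engines, signal, noise):
    """Multiengine selection: pick the engine matched to the environment's SNR."""
    return engines[discrete_snr(signal, noise)]
```

The key design point is that the system keeps one engine (or one set of models) per discrete SNR level, so adaptation happens by routing rather than by transforming the noisy signal itself.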

  12. Unvoiced Speech Recognition Using Tissue-Conductive Acoustic Sensor

    Directory of Open Access Journals (Sweden)

    Heracleous Panikos

    2007-01-01

    Full Text Available We present the use of stethoscope and silicon NAM (nonaudible murmur) microphones in automatic speech recognition. NAM microphones are special acoustic sensors, which are attached behind the talker's ear and can capture not only normal (audible) speech, but also very quietly uttered speech (nonaudible murmur). As a result, NAM microphones can be applied in automatic speech recognition systems when privacy is desired in human-machine communication. Moreover, NAM microphones show robustness against noise and they might be used in special systems (speech recognition, speech transform, etc.) for sound-impaired people. Using adaptation techniques and a small amount of training data, we achieved a 93.9% word accuracy on a 20 k dictation task for nonaudible murmur recognition in a clean environment. In this paper, we also investigate nonaudible murmur recognition in noisy environments and the effect of the Lombard reflex on nonaudible murmur recognition. We also propose three methods to integrate audible speech and nonaudible murmur recognition using a stethoscope NAM microphone, with very promising results.


  14. Attention mechanisms and the mosaic evolution of speech

    Directory of Open Access Journals (Sweden)

    Pedro Tiago Martins

    2014-12-01

    Full Text Available There is still no categorical answer for why humans, and no other species, have speech, or why speech is the way it is. Several purely anatomical arguments have been put forward, but they have been shown to be false, biologically implausible, or of limited scope. This perspective paper supports the idea that evolutionary theories of speech could benefit from a focus on the cognitive mechanisms that make speech possible, for which antecedents in evolutionary history and brain correlates can be found. This type of approach is part of a very recent, but rapidly growing, tradition, which has provided crucial insights into the nature of human speech by focusing on the biological bases of vocal learning. Here, we call attention to what might be an important ingredient for speech. We contend that a general mechanism of attention, which manifests itself not only in the visual but also in the auditory (and possibly other) modalities, might be one of the key pieces of human speech, in addition to the mechanisms underlying vocal learning and the pairing of facial gestures with vocalic units.

  15. Detecting self-produced speech errors before and after articulation: An ERP investigation

    Directory of Open Access Journals (Sweden)

    Kevin Michael Trewartha

    2013-11-01

    Full Text Available It has been argued that speech production errors are monitored by the same neural system involved in monitoring other types of action errors. Behavioral evidence has shown that speech errors can be detected and corrected prior to articulation, yet the neural basis for such pre-articulatory speech error monitoring is poorly understood. The current study investigated speech error monitoring using a phoneme-substitution task known to elicit speech errors. Stimulus-locked event-related potential (ERP) analyses comparing correct and incorrect utterances were used to assess pre-articulatory error monitoring, and response-locked ERP analyses were used to assess post-articulatory monitoring. Our novel finding in the stimulus-locked analysis revealed that words that ultimately led to a speech error were associated with a larger P2 component at midline sites (FCz, Cz, and CPz). This early positivity may reflect the detection of an error in speech formulation, or a predictive mechanism that signals the potential for an upcoming speech error. The data also revealed that general conflict monitoring mechanisms are involved during this task, as both correct and incorrect responses elicited an anterior N2 component typically associated with conflict monitoring. The response-locked analyses corroborated previous observations that self-produced speech errors led to a fronto-central error-related negativity (ERN). These results demonstrate that speech errors can be detected prior to articulation, and that speech error monitoring relies on a central error monitoring mechanism.

  16. Multiple Time-Instances Features of Degraded Speech for Single Ended Quality Measurement

    Directory of Open Access Journals (Sweden)

    Rajesh Kumar Dubey

    2017-01-01

    Full Text Available Single time-instance features, where the entire speech utterance is used for feature computation, are neither accurate nor adequate for capturing the time-localized information of short-time transient distortions and for distinguishing them from the plosive sounds of speech, particularly in speech degraded by impulsive noise. Hence, features estimated at multiple time-instances are sought. In this approach, only the active speech segments of the degraded speech are used for feature computation at multiple time-instances on a per-frame basis. Here, active speech means both voiced and unvoiced frames, excluding silence. Features of different combinations of multiple contiguous active speech segments are computed and called multiple time-instances features. A GMM is trained jointly on these features and the subjective MOS of the corresponding speech utterances to obtain the GMM parameters. These GMM parameters and the multiple time-instances features of the test speech are then used to compute objective MOS values for the different combinations of contiguous active speech segments. The overall objective MOS of the test utterance is obtained by assigning equal weight to these per-combination objective MOS values. This algorithm outperforms Recommendation ITU-T P.563 and recently published algorithms.
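
The combination-and-fusion step, enumerating combinations of contiguous active segments and averaging their objective MOS values with equal weight, reduces to something like the following schematic sketch (all names are illustrative):

```python
import numpy as np

def contiguous_combinations(frames, min_len=2):
    """All combinations of multiple contiguous active-speech frames,
    i.e., every window frames[i:j] of length >= min_len."""
    return [frames[i:j]
            for i in range(len(frames))
            for j in range(i + min_len, len(frames) + 1)]

def overall_mos(per_combination_mos):
    """Equal-weight fusion of the per-combination objective MOS values
    into one objective MOS for the whole utterance."""
    return float(np.mean(per_combination_mos))
```

In the full algorithm, each window would be mapped to an objective MOS through the trained GMM before the equal-weight average is taken.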

  17. How may the basal ganglia contribute to auditory categorization and speech perception?

    Directory of Open Access Journals (Sweden)

    Sung-Joo eLim

    2014-08-01

    Full Text Available Listeners must accomplish two complementary perceptual feats in extracting a message from speech. They must discriminate linguistically-relevant acoustic variability and generalize across irrelevant variability. Said another way, they must categorize speech. Since the mapping of acoustic variability is language-specific, these categories must be learned from experience. Thus, understanding how, in general, the auditory system acquires and represents categories can inform us about the toolbox of mechanisms available to speech perception. This perspective invites consideration of findings from cognitive neuroscience literatures outside of the speech domain as a means of constraining models of speech perception. Although neurobiological models of speech perception have mainly focused on cerebral cortex, research outside the speech domain is consistent with the possibility of significant subcortical contributions in category learning. Here, we review the functional role of one such structure, the basal ganglia. We examine research from animal electrophysiology, human neuroimaging, and behavior to consider characteristics of basal ganglia processing that may be advantageous for speech category learning. We also present emerging evidence for a direct role for basal ganglia in learning auditory categories in a complex, naturalistic task intended to model the incidental manner in which speech categories are acquired. To conclude, we highlight new research questions that arise in incorporating the broader neuroscience research literature in modeling speech perception, and suggest how understanding contributions of the basal ganglia can inform attempts to optimize training protocols for learning non-native speech categories in adulthood.

  18. Abortion and compelled physician speech.

    Science.gov (United States)

    Orentlicher, David

    2015-01-01

    Informed consent mandates for abortion providers may infringe the First Amendment's freedom of speech. On the other hand, they may reinforce the physician's duty to obtain informed consent. Courts can promote both doctrines by ensuring that compelled physician speech pertains to medical facts about abortion rather than abortion ideology and that compelled speech is truthful and not misleading. © 2015 American Society of Law, Medicine & Ethics, Inc.

  19. Dynamic Programming Algorithms in Speech Recognition

    Directory of Open Access Journals (Sweden)

    Titus Felix FURTUNA

    2008-01-01

    Full Text Available In an isolated-word speech recognition system, recognition requires comparing the input signal of a word against the various words of the dictionary. The problem can be solved efficiently by a dynamic comparison algorithm whose goal is to put the temporal scales of the two words into optimal correspondence. An algorithm of this type is Dynamic Time Warping (DTW). This paper presents two implementation alternatives of the algorithm, designed for recognition of isolated words.
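
A minimal DTW template matcher for isolated words might look like the following sketch (one-dimensional features for brevity; a real recognizer would compare MFCC frame vectors and constrain the warping path):

```python
import numpy as np

def dtw_distance(a, b):
    """Dynamic Time Warping distance between two 1-D feature sequences.

    Aligns the temporal scales of the two sequences by finding the
    minimum-cost monotonic warping path."""
    n, m = len(a), len(b)
    # cost[i][j] = cheapest alignment of a[:i] with b[:j]
    cost = np.full((n + 1, m + 1), np.inf)
    cost[0, 0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = abs(a[i - 1] - b[j - 1])              # local distance
            cost[i, j] = d + min(cost[i - 1, j],      # insertion
                                 cost[i, j - 1],      # deletion
                                 cost[i - 1, j - 1])  # match
    return cost[n, m]

def recognize(query, dictionary):
    """Return the dictionary word whose template is DTW-closest to the query."""
    return min(dictionary, key=lambda w: dtw_distance(query, dictionary[w]))
```

Because the warping path may repeat or skip frames, a query spoken slower or faster than its template still aligns with near-zero cost, which is exactly why DTW suits isolated-word matching.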

  20. Speech Recognition on Mobile Devices

    DEFF Research Database (Denmark)

    Tan, Zheng-Hua; Lindberg, Børge

    2010-01-01

    The enthusiasm of deploying automatic speech recognition (ASR) on mobile devices is driven both by remarkable advances in ASR technology and by the demand for efficient user interfaces on such devices as mobile phones and personal digital assistants (PDAs). This chapter presents an overview of ASR in the mobile context, covering motivations, challenges, fundamental techniques and applications. Three ASR architectures are introduced: embedded speech recognition, distributed speech recognition and network speech recognition. Their pros and cons and implementation issues are discussed. Applications within ...

  1. Internet video telephony allows speech reading by deaf individuals and improves speech perception by cochlear implant users.

    Directory of Open Access Journals (Sweden)

    Georgios Mantokoudis

    Full Text Available OBJECTIVE: To analyze speech reading through Internet video calls by profoundly hearing-impaired individuals and cochlear implant (CI) users. METHODS: Speech reading skills of 14 deaf adults and 21 CI users were assessed using the Hochmair Schulz Moser (HSM) sentence test. We presented video simulations using different video resolutions (1280 × 720, 640 × 480, 320 × 240, 160 × 120 px), frame rates (30, 20, 10, 7, 5 frames per second (fps)), speech velocities (three different speakers), webcameras (Logitech Pro9000, C600 and C500) and image/sound delays (0-500 ms). All video simulations were presented with and without sound and in two screen sizes. Additionally, scores for live Skype™ video connection and live face-to-face communication were assessed. RESULTS: Higher frame rate (>7 fps), higher camera resolution (>640 × 480 px) and shorter picture/sound delay (<100 ms) were associated with increased speech perception scores. Scores were strongly dependent on the speaker but were not influenced by physical properties of the camera optics or the full screen mode. There is a significant median gain of +8.5%pts (p = 0.009) in speech perception for all 21 CI users if visual cues are additionally shown. CI users with poor open-set speech perception scores (n = 11) showed the greatest benefit under combined audio-visual presentation (median speech perception +11.8%pts, p = 0.032). CONCLUSION: Webcameras have the potential to improve telecommunication of hearing-impaired individuals.

  2. EVOLUTION OF SPEECH: A NEW HYPOTHESIS

    Directory of Open Access Journals (Sweden)

    Shishir

    2016-03-01

    Full Text Available BACKGROUND The first and foremost characteristic of speech is that it is human. Speech is a characteristic feature that has evolved in humans and is by far the most powerful form of communication in the Kingdom Animalia. Today, humans have established themselves as an alpha species, and the evolution of speech and language has made this possible. But how is speech possible? What anatomical changes have made it possible for us to speak? A sincere effort has been made in this paper to establish a possible anatomical answer to the riddle. METHODS The prototypes of the cranial skeletons of all the major classes of phylum Vertebrata were studied. The materials were studied in museums of Wayanad, Karwar and the Museum of Natural History, Imphal. The skeleton of a mammal was studied in the Department of Anatomy, K. S. Hegde Medical Academy, Mangalore. RESULTS The curve formed in the base of the skull due to flexion of the splanchnocranium with the neurocranium holds the key to the answer of how humans were able to speak. CONCLUSION This may not be the only factor that participated in the evolution of speech; the brain also had to evolve, and in fact the occipital lobes are more prominent in humans than in lower mammals. Although not the only criterion, it is one of the most important changes that occurred in the course of evolution and enabled us to speak. This small space at the base of the brain is the difference that made us the dominant alpha species.

  3. Current trends in multilingual speech processing

    Indian Academy of Sciences (India)

    2016-08-26

    ; speech-to-speech translation; language identification. ... interest owing to two strong driving forces. Firstly, technical advances in speech recognition and synthesis are posing new challenges and opportunities to researchers.

  4. Text-based language identification of multilingual names

    CSIR Research Space (South Africa)

    Giwa, O

    2015-11-01

    Full Text Available Text-based language identification (T-LID) of isolated words has been shown to be useful for various speech processing tasks, including pronunciation modelling and data categorisation. When the words to be categorised are proper names, the task...

  5. Multiresolution analysis applied to text-independent phone segmentation

    International Nuclear Information System (INIS)

    Cherniz, Analía S; Torres, María E; Rufiner, Hugo L; Esposito, Anna

    2007-01-01

    Automatic speech segmentation is of fundamental importance in different speech applications. The most common implementations are based on hidden Markov models. They use a statistical modelling of the phonetic units to align the data along a known transcription. This is an expensive and time-consuming process, because of the huge amount of data needed to train the system. Text-independent speech segmentation procedures have been developed to overcome some of these problems. These methods detect transitions in the evolution of the time-varying features that represent the speech signal. Speech representation plays a central role in the segmentation task. In this work, two new speech parameterizations based on the continuous multiresolution entropy, using Shannon entropy, and the continuous multiresolution divergence, using Kullback-Leibler distance, are proposed. These approaches have been compared with the classical Melbank parameterization. The proposed encodings significantly increase segmentation performance. Parameterization based on the continuous multiresolution divergence shows the best results, increasing the number of correctly detected boundaries and decreasing the amount of erroneously inserted points. This suggests that parameterization based on multiresolution information measures provides information related to acoustic features that take into account phonemic transitions.
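
The boundary-detection idea, tracking an entropy measure across dyadic scales frame by frame and flagging jumps, can be sketched as follows. A crude Haar-like decomposition stands in for the paper's continuous multiresolution entropy, and the frame size and threshold are illustrative:

```python
import numpy as np

def shannon_entropy(p):
    """Shannon entropy (bits) of an unnormalized distribution."""
    p = np.asarray(p, float)
    p = p / p.sum()
    return -np.sum(p * np.log2(p + 1e-12))

def multiscale_entropy(frame, n_scales=4):
    """Entropy of the frame's energy distribution across dyadic scales
    (a rough stand-in for the continuous multiresolution entropy)."""
    x = np.asarray(frame, float)
    energies = []
    for _ in range(n_scales):
        half = len(x) // 2
        approx, detail = x[: 2 * half : 2], x[1 : 2 * half : 2]
        energies.append(np.sum((detail - approx) ** 2) + 1e-12)  # detail energy
        x = 0.5 * (approx + detail)                              # decimate
    return shannon_entropy(energies)

def boundaries(signal, frame_len=256, threshold=0.5):
    """Flag a segment-boundary candidate wherever the frame-to-frame
    entropy jump exceeds the threshold (text-independent: no transcription)."""
    frames = [signal[i : i + frame_len]
              for i in range(0, len(signal) - frame_len + 1, frame_len)]
    ent = np.array([multiscale_entropy(f) for f in frames])
    return np.flatnonzero(np.abs(np.diff(ent)) > threshold) + 1
```

A transition from one sound type to another redistributes energy across the scales, so the entropy trajectory jumps at the change point even though no transcription is available.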

  6. Adaptation to delayed auditory feedback induces the temporal recalibration effect in both speech perception and production.

    Science.gov (United States)

    Yamamoto, Kosuke; Kawabata, Hideaki

    2014-12-01

    We ordinarily speak fluently, even though our perceptions of our own voices are disrupted by various environmental acoustic properties. The underlying mechanism of speech is supposed to monitor the temporal relationship between speech production and the perception of auditory feedback, as suggested by a reduction in speech fluency when the speaker is exposed to delayed auditory feedback (DAF). While many studies have reported that DAF influences speech motor processing, its relationship to the temporal tuning effect on multimodal integration, or temporal recalibration, remains unclear. We investigated whether the temporal aspects of both speech perception and production change due to adaptation to the delay between the motor sensation and the auditory feedback. This is a well-used method of inducing temporal recalibration. Participants continually read texts with specific DAF times in order to adapt to the delay. Then, they judged the simultaneity between the motor sensation and the vocal feedback. We measured the rates of speech with which participants read the texts in both the exposure and re-exposure phases. We found that exposure to DAF changed both the rate of speech and the simultaneity judgment, that is, participants' speech gained fluency. Although we also found that a delay of 200 ms appeared to be most effective in decreasing the rates of speech and shifting the distribution on the simultaneity judgment, there was no correlation between these measurements. These findings suggest that both speech motor production and multimodal perception are adaptive to temporal lag but are processed in distinct ways.

  7. Multimodal Speech Capture System for Speech Rehabilitation and Learning.

    Science.gov (United States)

    Sebkhi, Nordine; Desai, Dhyey; Islam, Mohammad; Lu, Jun; Wilson, Kimberly; Ghovanloo, Maysam

    2017-11-01

    Speech-language pathologists (SLPs) are trained to correct articulation of people diagnosed with motor speech disorders by analyzing articulators' motion and assessing speech outcome while patients speak. To assist SLPs in this task, we are presenting the multimodal speech capture system (MSCS) that records and displays kinematics of key speech articulators, the tongue and lips, along with voice, using unobtrusive methods. Collected speech modalities, tongue motion, lips gestures, and voice are visualized not only in real-time to provide patients with instant feedback but also offline to allow SLPs to perform post-analysis of articulators' motion, particularly the tongue, with its prominent but hardly visible role in articulation. We describe the MSCS hardware and software components, and demonstrate its basic visualization capabilities by a healthy individual repeating the words "Hello World." A proof-of-concept prototype has been successfully developed for this purpose, and will be used in future clinical studies to evaluate its potential impact on accelerating speech rehabilitation by enabling patients to speak naturally. Pattern matching algorithms to be applied to the collected data can provide patients with quantitative and objective feedback on their speech performance, unlike current methods that are mostly subjective, and may vary from one SLP to another.

  8. Measurement of speech parameters in casual speech of dementia patients

    NARCIS (Netherlands)

    Ossewaarde, Roelant; Jonkers, Roel; Jalvingh, Fedor; Bastiaanse, Yvonne

    Measurement of speech parameters in casual speech of dementia patients. Roelant Adriaan Ossewaarde (1,2), Roel Jonkers (1), Fedor Jalvingh (1,3), Roelien Bastiaanse (1). 1: CLCG, University of Groningen (NL); 2: HU University of Applied Sciences Utrecht (NL); 3: St. Marienhospital - Vechta, Geriatric Clinic Vechta

  9. Alternative Speech Communication System for Persons with Severe Speech Disorders

    Science.gov (United States)

    Selouani, Sid-Ahmed; Sidi Yakoub, Mohammed; O'Shaughnessy, Douglas

    2009-12-01

    Assistive speech-enabled systems are proposed to help both French and English speaking persons with various speech disorders. The proposed assistive systems use automatic speech recognition (ASR) and speech synthesis in order to enhance the quality of communication. These systems aim at improving the intelligibility of pathologic speech making it as natural as possible and close to the original voice of the speaker. The resynthesized utterances use new basic units, a new concatenating algorithm and a grafting technique to correct the poorly pronounced phonemes. The ASR responses are uttered by the new speech synthesis system in order to convey an intelligible message to listeners. Experiments involving four American speakers with severe dysarthria and two Acadian French speakers with sound substitution disorders (SSDs) are carried out to demonstrate the efficiency of the proposed methods. An improvement of the Perceptual Evaluation of the Speech Quality (PESQ) value of 5% and more than 20% is achieved by the speech synthesis systems that deal with SSD and dysarthria, respectively.

  10. Speech Perception as a Multimodal Phenomenon

    OpenAIRE

    Rosenblum, Lawrence D.

    2008-01-01

    Speech perception is inherently multimodal. Visual speech (lip-reading) information is used by all perceivers and readily integrates with auditory speech. Imaging research suggests that the brain treats auditory and visual speech similarly. These findings have led some researchers to consider that speech perception works by extracting amodal information that takes the same form across modalities. From this perspective, speech integration is a property of the input information itself. Amodal s...

  11. Speech pathology in ancient India--a review of Sanskrit literature.

    Science.gov (United States)

    Savithri, S R

    1987-12-01

    This paper aims at highlighting the knowledge of the Sanskrit scholars of ancient times in the field of speech and language pathology. The information collected here is mainly from the Sanskrit texts written between 2000 B.C. and 1633 A.D. Some aspects of speech and language that have been dealt with in this review have been elaborately described in the original Sanskrit texts. The present paper, however, being limited in its scope, reviews only the essential facts, but not the details. The purpose is only to give a glimpse of the knowledge that the Sanskrit scholars of those times possessed. In brief, this paper is a review of Sanskrit literature for information on the origin and development of speech and language, speech production, normality of speech and language, and disorders of speech and language and their treatment.

  12. Automatic Emotion Recognition in Speech: Possibilities and Significance

    Directory of Open Access Journals (Sweden)

    Milana Bojanić

    2009-12-01

    Automatic speech recognition and spoken language understanding are crucial steps towards natural human-machine interaction. The main task of the speech communication process is the recognition of the word sequence, but the recognition of prosody, emotion and stress tags may be of particular importance as well. This paper discusses the possibilities of recognizing emotion from the speech signal in order to improve ASR, and also provides an analysis of acoustic features that can be used for the detection of a speaker's emotion and stress. The paper also provides a short overview of emotion and stress classification techniques. The importance and place of emotional speech recognition are shown in the domain of human-computer interactive systems and the transaction communication model. Directions for future work are given at the end of this work.
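As a generic illustration of the kind of acoustic feature such systems draw on (this is a sketch under standard assumptions, not the paper's method): a fundamental-frequency estimate, one of the prosodic cues commonly used for emotion and stress detection, can be obtained by picking the autocorrelation peak inside the plausible pitch-lag range.

```python
import math

# Sketch: F0 estimation by autocorrelation peak picking, a common
# prosodic feature for emotion/stress detection. Illustrative only.

def estimate_f0(samples, sample_rate, fmin=60.0, fmax=400.0):
    """Return an F0 estimate in Hz from the strongest autocorrelation
    lag within the [fmin, fmax] pitch range."""
    lo = int(sample_rate / fmax)  # shortest lag considered
    hi = int(sample_rate / fmin)  # longest lag considered
    n = len(samples)
    best_lag, best_r = lo, float("-inf")
    for lag in range(lo, min(hi, n - 1)):
        r = sum(samples[i] * samples[i + lag] for i in range(n - lag))
        if r > best_r:
            best_r, best_lag = r, lag
    return sample_rate / best_lag

# Sanity check on a synthetic 220 Hz tone sampled at 8 kHz.
sr = 8000
tone = [math.sin(2 * math.pi * 220 * t / sr) for t in range(800)]
print(round(estimate_f0(tone, sr)))  # close to 220
```

Real emotion-detection front ends combine many such features (pitch statistics, energy, duration, spectral measures) per utterance before classification.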

  13. TIMLOGORO - AN INTERACTIVE PLATFORM DESIGN FOR SPEECH THERAPY

    Directory of Open Access Journals (Sweden)

    Georgeta PÂNIȘOARĂ

    2016-12-01

    This article presents some technical and pedagogical features of an interactive platform used for language therapy. The Timlogoro project demonstrates that technology is an effective tool in learning and, in particular, a viable solution for improving speech disorders present at different ages. A digital platform for different categories of users with speech impairments (children and adults) is well grounded in pedagogical principles. In speech therapy, the computer was originally used to assess deficiencies; nowadays it has become a useful tool in language rehabilitation. A few Romanian speech therapists create digital applications that will be used in therapy for recovery. This work was supported by a grant of the Romanian National Authority for Scientific UEFISCDI.

  14. Speech and swallowing outcomes in buccal mucosa carcinoma

    Directory of Open Access Journals (Sweden)

    Sunila John

    2011-01-01

    Buccal carcinoma is one of the most common malignant neoplasms among all oral cancers in India. Understanding of the role of speech-language pathologists (SLPs) in the evaluation and management of this condition is limited, especially in the Indian context. This is a case report of a young adult with recurrent squamous cell carcinoma of the buccal mucosa, without the deleterious habits usually associated with buccal mucosa carcinoma. Following composite resection and pectoralis major myocutaneous flap reconstruction, he developed severe oral dysphagia and demonstrated unintelligible speech. This case report focuses on the swallowing and speech deficits in buccal mucosa carcinoma that need to be addressed by SLPs, and on the outcomes of speech and swallowing rehabilitation and prognostic issues.

  15. Hybrid methodological approach to context-dependent speech recognition

    Directory of Open Access Journals (Sweden)

    Dragiša Mišković

    2017-01-01

    Although the importance of contextual information in speech recognition has long been acknowledged, it remains clearly underutilized even in state-of-the-art speech recognition systems. This article introduces a novel, methodologically hybrid approach to context-dependent speech recognition in human-machine interaction. The approach is hybrid in that it integrates aspects of both the statistical and the representational paradigms. We extend the standard statistical pattern-matching approach with a cognitively inspired, analytically tractable model with explanatory power. This methodological extension makes it possible to account for contextual information that is otherwise unavailable to speech recognition systems, and to use it to improve the post-processing of recognition hypotheses. The article introduces an algorithm for the evaluation of recognition hypotheses, illustrates it for concrete interaction domains, and discusses its implementation within two prototype conversational agents.

  16. Repetition and Emotive Communication in Music Versus Speech

    Directory of Open Access Journals (Sweden)

    Elizabeth Hellmuth Margulis

    2013-04-01

    Music and speech are often placed alongside one another as comparative cases. Their relative overlaps and dissociations have been well explored (e.g., Patel, 2010). But one key attribute distinguishing these two domains has often been overlooked: the greater preponderance of repetition in music in comparison to speech. Recent fMRI studies have shown that familiarity, achieved through repetition, is a critical component of emotional engagement with music (Pereira et al., 2011). If repetition is fundamental to emotional responses to music, and repetition is a key distinguisher between the domains of music and speech, then close examination of the phenomenon of repetition might help clarify the ways that music elicits emotion differently than speech.

  17. Integrated Phoneme Subspace Method for Speech Feature Extraction

    Directory of Open Access Journals (Sweden)

    Park Hyunsin

    2009-01-01

    Speech feature extraction has been a key focus in robust speech recognition research. In this work, we discuss data-driven linear feature transformations applied to feature vectors in the logarithmic mel-frequency filter bank domain. The transformations are based on principal component analysis (PCA), independent component analysis (ICA), and linear discriminant analysis (LDA). Furthermore, this paper introduces a new feature extraction technique that collects the correlation information among phoneme subspaces and reconstructs the feature space to represent phonemic information efficiently. The proposed speech feature vector is generated by projecting an observed vector onto an integrated phoneme subspace (IPS) based on PCA or ICA. The performance of the new feature was evaluated on isolated-word speech recognition. The proposed method provided higher recognition accuracy than conventional methods in both clean and reverberant environments.
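The projection step at the heart of such subspace methods is simple to sketch. The basis and feature vector below are toy values, not the paper's IPS; the point is only that, given an orthonormal basis obtained from PCA or ICA, a feature vector reduces to its coefficients on that basis.

```python
# Sketch: projecting a feature vector onto a subspace spanned by
# orthonormal basis rows, as done after PCA/ICA. Toy values only.

def project(vector, basis):
    """Return the coefficients of `vector` on the orthonormal rows of `basis`."""
    return [sum(b_i * v_i for b_i, v_i in zip(b, vector)) for b in basis]

# Toy 4-d log-filterbank-like vector and a 2-d orthonormal basis.
x = [1.0, 2.0, 3.0, 4.0]
basis = [
    [0.5, 0.5, 0.5, 0.5],
    [0.5, -0.5, 0.5, -0.5],
]
print(project(x, basis))  # low-dimensional representation of x
```

In the actual method, the basis rows would be the leading PCA or ICA directions estimated from phoneme-labelled training data rather than hand-written vectors.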

  18. Speech/Nonspeech Detection Using Minimal Walsh Basis Functions

    Directory of Open Access Journals (Sweden)

    Pwint Moe

    2007-01-01

    This paper presents a new method to detect the speech/nonspeech components of a given noisy signal. Employing a combination of binary Walsh basis functions and an analysis-synthesis scheme, the original noisy speech signal is first modified. From the modified signals, the speech components are distinguished from the nonspeech components using a simple decision scheme. The minimal number of Walsh basis functions to be applied is determined using singular value decomposition (SVD). The main advantages of the proposed method are low computational complexity, few parameters to adjust, and simple implementation. The use of Walsh basis functions makes the proposed algorithm efficiently applicable in real-world situations where processing time is crucial. Simulation results indicate that the proposed algorithm achieves high speech and nonspeech detection rates while maintaining a low error rate under different noise conditions.
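For readers unfamiliar with Walsh basis functions: they are binary-valued (+1/-1) orthogonal sequences, which is what makes them cheap to apply. The sketch below generates them by the standard Sylvester construction and checks orthogonality; it is a generic illustration, not the paper's SVD-based selection procedure.

```python
# Sketch: Walsh-Hadamard basis via the Sylvester construction.
# Rows are +1/-1 basis functions; distinct rows are orthogonal.

def hadamard(order):
    """Return a 2**order x 2**order matrix whose rows are Walsh basis functions."""
    h = [[1]]
    for _ in range(order):
        h = [row + row for row in h] + [row + [-x for x in row] for row in h]
    return h

H = hadamard(3)  # 8 basis functions of length 8
# Orthogonality check: the dot product of two distinct rows is zero.
dot = sum(a * b for a, b in zip(H[1], H[2]))
print(len(H), dot)
```

Because the basis entries are just signs, "filtering" with them needs only additions and subtractions, which is the source of the low computational complexity the abstract mentions.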

  19. The Speech Act Theory between Linguistics and Language Philosophy

    Directory of Open Access Journals (Sweden)

    Liviu-Mihail MARINESCU

    2006-10-01

    Of all the issues in the general theory of language usage, speech act theory has probably aroused the widest interest. Psychologists, for example, have suggested that the acquisition of the concepts underlying speech acts may be a prerequisite for the acquisition of language in general; literary critics have looked to speech act theory for an illumination of textual subtleties or for an understanding of the nature of literary genres; anthropologists have hoped to find in the theory some account of the nature of magical incantations; philosophers have seen potential applications to, amongst other things, the status of ethical statements; while linguists have seen the notions of speech act theory as variously applicable to problems in syntax, semantics, second language learning, and elsewhere.

  20. Recognizing emotional speech in Persian: a validated database of Persian emotional speech (Persian ESD).

    Science.gov (United States)

    Keshtiari, Niloofar; Kuhlmann, Michael; Eslami, Moharram; Klann-Delius, Gisela

    2015-03-01

    Research on emotional speech often requires valid stimuli for assessing perceived emotion through prosody and lexical content. To date, no comprehensive emotional speech database for Persian is officially available. The present article reports the process of designing, compiling, and evaluating a comprehensive emotional speech database for colloquial Persian. The database contains a set of 90 validated novel Persian sentences classified in five basic emotional categories (anger, disgust, fear, happiness, and sadness), as well as a neutral category. These sentences were validated in two experiments by a group of 1,126 native Persian speakers. The sentences were articulated by two native Persian speakers (one male, one female) in three conditions: (1) congruent (emotional lexical content articulated in a congruent emotional voice), (2) incongruent (neutral sentences articulated in an emotional voice), and (3) baseline (all emotional and neutral sentences articulated in neutral voice). The speech materials comprise about 470 sentences. The validity of the database was evaluated by a group of 34 native speakers in a perception test. Utterances recognized better than five times chance performance (71.4 %) were regarded as valid portrayals of the target emotions. Acoustic analysis of the valid emotional utterances revealed differences in pitch, intensity, and duration, attributes that may help listeners to correctly classify the intended emotion. The database is designed to be used as a reliable material source (for both text and speech) in future cross-cultural or cross-linguistic studies of emotional speech, and it is available for academic research purposes free of charge. To access the database, please contact the first author.

  1. The relationship of speech intelligibility with hearing sensitivity, cognition, and perceived hearing difficulties varies for different speech perception tests

    Directory of Open Access Journals (Sweden)

    Antje eHeinrich

    2015-06-01

    Listeners vary in their ability to understand speech in noisy environments. Hearing sensitivity, as measured by pure-tone audiometry, can only partly explain these results, and cognition has emerged as another key concept. Although cognition relates to speech perception, the exact nature of the relationship remains to be fully understood. This study investigates how different aspects of cognition, particularly working memory and attention, relate to speech intelligibility for various tests. Perceptual accuracy of speech perception represents just one aspect of functioning in a listening environment. Activity and participation limits imposed by hearing loss, in addition to the demands of a listening environment, are also important and may be better captured by self-report questionnaires. Understanding how speech perception relates to self-reported aspects of listening forms the second focus of the study. Forty-four listeners aged between 50 and 74 years with mild sensorineural hearing loss (SNHL) were tested on speech perception tests differing in complexity from low (phoneme discrimination in quiet), to medium (digit triplet perception in speech-shaped noise), to high (sentence perception in modulated noise); on cognitive tests of attention, memory, and nonverbal IQ; and on self-report questionnaires of general health-related and hearing-specific quality of life. Hearing sensitivity and cognition related to intelligibility differently depending on the speech test: neither was important for phoneme discrimination, hearing sensitivity alone was important for digit triplet perception, and hearing and cognition together played a role in sentence perception. Self-reported aspects of auditory functioning were correlated with speech intelligibility to different degrees, with digit triplets in noise showing the richest pattern. The results suggest that intelligibility tests can vary in their auditory and cognitive demands and their sensitivity to the challenges that auditory environments pose on…

  2. DELVING INTO SPEECH ACT A Case Of Indonesian EFL Young Learners

    Directory of Open Access Journals (Sweden)

    Swastika Septiani, S.Pd

    2017-04-01

    This study attempts to describe the use of speech acts in a primary school. It is intended to identify the speech acts performed in primary school, to find the most dominant ones, to give a brief description of how speech acts are applied in primary school, and to show how the results can be applied in teaching English to young learners. The speech acts performed in primary school are classified based on Searle's theory of speech acts. The most dominant speech act performed is the Directive (41.17%), followed by the Declarative (33.33%), then the Representative and the Expressive (11.76% each), with the Commissive performed least (1.9%). The speech acts performed in elementary school are applied in the contexts of situation determined by the National Education Standards Agency (BSNP): speech acts in the fourth grade are to be applied in the context of the classroom, those in the fifth grade in the context of the school, and those in the sixth grade in the context of the students' surroundings. The results of this study are highly expected to make a significant contribution to teaching English to young learners. By acknowledging the characteristics of young learners and the way they learn English as a foreign language, teachers are expected to adopt inventive strategies and various techniques to create a fun and conducive atmosphere in English class.

  3. Automatic initial and final segmentation in cleft palate speech of Mandarin speakers.

    Directory of Open Access Journals (Sweden)

    Ling He

    Speech unit segmentation is an important pre-processing step in the analysis of cleft palate speech. In Mandarin, a syllable is composed of two parts: an initial and a final. In cleft palate speech, resonance disorders occur at the finals and the voiced initials, while articulation disorders occur at the unvoiced initials. Thus, initials and finals are the minimum speech units that can reflect the characteristics of cleft palate speech disorders. In this work, an automatic initial/final segmentation method is proposed as an important preprocessing step in cleft palate speech signal processing. The tested cleft palate speech utterances were collected from the Cleft Palate Speech Treatment Center in the Hospital of Stomatology, Sichuan University, which treats the largest number of cleft palate patients in China. The cleft palate speech data include 824 speech segments, and the control samples contain 228 speech segments. First, syllables are extracted from the speech utterances. The proposed syllable extraction method avoids a training stage and achieves good performance for both voiced and unvoiced speech. The syllables are then classified as having "quasi-unvoiced" or "quasi-voiced" initials, and a separate initial/final segmentation method is proposed for each of these two types. Moreover, a two-step segmentation method is proposed: the rough locations of the syllable and initial/final boundaries are refined in the second segmentation step, in order to improve the robustness of the segmentation accuracy. The experiments show that initial/final segmentation accuracies are higher for syllables with quasi-unvoiced initials than for those with quasi-voiced initials. For the cleft palate speech, the mean time error is 4.4 ms for syllables with quasi-unvoiced initials and 25.7 ms for syllables with quasi-voiced initials, and the correct segmentation accuracy P30 for all the syllables is 91.69%. For the control samples, P30 for all the…
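The voiced/unvoiced distinction that drives the "quasi-voiced" vs. "quasi-unvoiced" classification rests on classic frame-level cues: voiced segments show high energy and a low zero-crossing rate, unvoiced (frication-like) segments the opposite. The sketch below illustrates those two cues on synthetic signals; thresholds and signals are illustrative, not the paper's algorithm.

```python
import math

# Sketch: frame energy and zero-crossing rate (ZCR), the classic cues
# behind voiced/unvoiced decisions. Thresholds are illustrative.

def frame_features(frame):
    energy = sum(s * s for s in frame) / len(frame)
    zcr = sum(
        1 for a, b in zip(frame, frame[1:]) if (a >= 0) != (b >= 0)
    ) / len(frame)
    return energy, zcr

def is_voiced(frame, energy_thresh=0.1, zcr_thresh=0.25):
    energy, zcr = frame_features(frame)
    return energy > energy_thresh and zcr < zcr_thresh

sr = 8000
# A 150 Hz tone stands in for a voiced frame; a weak 3.5 kHz tone
# stands in for high-frequency, low-energy frication.
voiced_like = [math.sin(2 * math.pi * 150 * t / sr) for t in range(240)]
unvoiced_like = [0.3 * math.sin(2 * math.pi * 3500 * t / sr) for t in range(240)]

print(is_voiced(voiced_like), is_voiced(unvoiced_like))
```

A real segmenter would track these features over successive frames and place the initial/final boundary where they change, which is where the refinement step of the two-step method comes in.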

  4. Experience with speech sounds is not necessary for cue trading by budgerigars (Melopsittacus undulatus.

    Directory of Open Access Journals (Sweden)

    Mary Flaherty

    The influence of experience with human speech sounds on speech perception was measured in budgerigars, vocal mimics whose speech exposure can be tightly controlled in a laboratory setting. Budgerigars were divided into groups that differed in auditory exposure and then tested on a cue-trading identification paradigm with synthetic speech. Phonetic cue trading is a perceptual phenomenon observed when changes along one cue dimension are offset by changes along another cue dimension while the same phonetic percept is maintained. The current study examined whether budgerigars would trade the cues of voice onset time (VOT) and first formant onset frequency when identifying syllable-initial stop consonants, and whether this would be influenced by exposure to speech sounds. There were four exposure groups: no speech exposure (completely isolated), passive speech exposure (regular exposure to human speech), and two speech-trained groups. After the exposure period, all budgerigars were tested for phonetic cue trading using operant conditioning procedures. Birds were trained to peck keys in response to different synthetic speech sounds that began with "d" or "t" and varied in VOT and in the frequency of the first formant at voicing onset. Once training performance criteria were met, budgerigars were presented with the entire intermediate series, including ambiguous sounds. Responses on these trials were used to determine which speech cues were used, whether a trading relation between VOT and first formant onset frequency was present, and whether speech exposure had an influence on perception. Cue trading was found in all birds, and these results were largely similar to those of a group of humans. The results indicate that prior speech experience is not a requirement for cue trading by budgerigars, and are consistent with theories that explain phonetic cue trading in terms of a rich auditory encoding of the speech signal.

  5. Auditory Modeling for Noisy Speech Recognition

    National Research Council Canada - National Science Library

    2000-01-01

    ... digital filtering for noise cancellation which interfaces to speech recognition software. It uses auditory features in speech recognition training, and provides applications to multilingual spoken language translation...

  6. Speech-based emotion detection in a resource-scarce environment

    CSIR Research Space (South Africa)

    Martirosian, O

    2007-11-01

    … happiness and frustration; passive emotion encompasses sadness and disappointment, and neutral encompasses speech with a negligible amount of emotional content. Because a study on the expression of emotion in speech has not been done in the South... seconds long and the segments labelled with the dominant emotion of the speech contained in them. The fine emotional labels used were angry, frustrated, happy, friendly, neutral, sad and depressed. These fine labels were combined into three broad...

  7. Speech Training for Inmate Rehabilitation.

    Science.gov (United States)

    Parkinson, Michael G.; Dobkins, David H.

    1982-01-01

    Using a computerized content analysis, the authors demonstrate changes in speech behaviors of prison inmates. They conclude that two to four hours of public speaking training can have only limited effect on students who live in a culture in which "prison speech" is the expected and rewarded form of behavior. (PD)

  8. Separating Underdetermined Convolutive Speech Mixtures

    DEFF Research Database (Denmark)

    Pedersen, Michael Syskind; Wang, DeLiang; Larsen, Jan

    2006-01-01

    … a method for underdetermined blind source separation of convolutive mixtures. The proposed framework is applicable to the separation of instantaneous as well as convolutive speech mixtures. It is possible to iteratively extract each speech signal from the mixture by combining blind source separation...

  9. Speech recognition from spectral dynamics

    Indian Academy of Sciences (India)

    Some of the history of the gradual infusion of the modulation spectrum concept into automatic speech recognition (ASR) comes next, pointing to the relationship of modulation spectrum processing to well-accepted ASR techniques such as dynamic speech features or RelAtive SpecTrAl (RASTA) filtering. Next, the frequency ...

  10. Speech Prosody in Cerebellar Ataxia

    Science.gov (United States)

    Casper, Maureen A.; Raphael, Lawrence J.; Harris, Katherine S.; Geibel, Jennifer M.

    2007-01-01

    Persons with cerebellar ataxia exhibit changes in physical coordination and speech and voice production. Previously, these alterations of speech and voice production were described primarily via perceptual coordinates. In this study, the spatial-temporal properties of syllable production were examined in 12 speakers, six of whom were healthy…

  11. Speech comprehension difficulties in chronic tinnitus and its relation to hyperacusis

    Directory of Open Access Journals (Sweden)

    Veronika Vielsmeier

    2016-12-01

    Objective: Many tinnitus patients complain about difficulties with speech comprehension. In spite of the high clinical relevance, little is known about the underlying mechanisms and predisposing factors. Here, we performed an exploratory investigation in a large sample of tinnitus patients to (1) estimate the prevalence of speech comprehension difficulties among tinnitus patients, (2) compare subjective reports of speech comprehension difficulties with objective measurements in a standardized speech comprehension test, and (3) explore underlying mechanisms by analyzing the relationship between speech comprehension difficulties and peripheral hearing function (pure-tone audiogram), as well as co-morbid hyperacusis as a central auditory processing disorder. Subjects and Methods: Speech comprehension was assessed in 361 tinnitus patients presenting between 07/2012 and 08/2014 at the Interdisciplinary Tinnitus Clinic at the University of Regensburg. The assessment included a standard audiological assessment (pure-tone audiometry, tinnitus pitch and loudness matching), the Goettingen sentence test (in quiet) for speech audiometric evaluation, two questions about hyperacusis, and two questions about speech comprehension in quiet and noisy environments ("How would you rate your ability to understand speech?"; "How would you rate your ability to follow a conversation when multiple people are speaking simultaneously?"). Results: Subjectively reported speech comprehension deficits are frequent among tinnitus patients, especially in noisy environments (the cocktail party situation). 74.2% of all investigated patients showed disturbed speech comprehension (indicated by values above 21.5 dB SPL in the Goettingen sentence test). Subjective speech comprehension complaints (both in general and in noisy environments) were correlated with hearing level and with audiologically assessed speech comprehension ability. In contrast, co-morbid hyperacusis was only correlated…

  12. Das sprachliche Register (Speech Registers)

    Science.gov (United States)

    Hess-Luttich, Ernest W. B.

    1974-01-01

    The linguistic behavior of a given individual varies; he will on different occasions speak (or write) differently according to what may be roughly described as different social situations: he will use a number of different registers. The application of such registers both in the field of text analysis and in the preparation of teaching materials…

  13. Text-Fabric

    NARCIS (Netherlands)

    Roorda, Dirk

    2016-01-01

    Text-Fabric is a Python3 package for Text plus Annotations. It provides a data model, a text file format, and a binary format for (ancient) text plus (linguistic) annotations. The emphasis of this all is on: data processing; sharing data; and contributing modules. A defining characteristic is that

  14. Contextual Text Mining

    Science.gov (United States)

    Mei, Qiaozhu

    2009-01-01

    With the dramatic growth of text information, there is an increasing need for powerful text mining systems that can automatically discover useful knowledge from text. Text is generally associated with all kinds of contextual information. Those contexts can be explicit, such as the time and the location where a blog article is written, and the…

  15. XML and Free Text.

    Science.gov (United States)

    Riggs, Ken Roger

    2002-01-01

    Discusses problems with marking free text, text that is either natural language or semigrammatical but unstructured, that prevent well-formed XML from marking text for readily available meaning. Proposes a solution to mark meaning in free text that is consistent with the intended simplicity of XML versus SGML. (Author/LRW)

  16. Challenges in discriminating profanity from hate speech

    Science.gov (United States)

    Malmasi, Shervin; Zampieri, Marcos

    2018-03-01

    In this study, we approach the problem of distinguishing general profanity from hate speech in social media, something which has not been widely considered. Using a new dataset annotated specifically for this task, we employ supervised classification along with a set of features that includes n-grams, skip-grams and clustering-based word representations. We apply approaches based on single classifiers as well as more advanced ensemble classifiers and stacked generalisation, achieving the best result of ? accuracy for this 3-class classification task. Analysis of the results reveals that discriminating hate speech and profanity is not a simple task, which may require features that capture a deeper understanding of the text, not always possible with surface n-grams. The variability of the gold labels in the annotated data, due to differences in the subjective adjudications of the annotators, is also an issue. Other directions for future work are discussed.
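The surface features the study names are easy to sketch. Below is one common construction of word n-grams and windowed skip-grams from a short text; the tokenisation and example sentence are illustrative, and the classifier itself is omitted.

```python
from collections import Counter
from itertools import combinations

# Sketch: n-gram and skip-gram feature extraction for text classification.
# Windowed skip-grams: all n-token subsequences of each (n + k)-token
# window, so contiguous n-grams recur across overlapping windows.

def ngrams(tokens, n):
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]

def skipgrams(tokens, n, k):
    grams = []
    for i in range(len(tokens) - n - k + 1):
        window = tokens[i:i + n + k]
        grams.extend(combinations(window, n))
    return grams

tokens = "you are such a troll".split()
features = Counter(ngrams(tokens, 2)) + Counter(skipgrams(tokens, 2, 1))
print(len(ngrams(tokens, 2)), len(skipgrams(tokens, 2, 1)))
```

In practice such counts would be vectorised over a vocabulary and fed, possibly alongside cluster-based word representations, to the single or ensemble classifiers the abstract describes.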

  17. Speech Remediation of Long-Term Stuttering

    Directory of Open Access Journals (Sweden)

    Betty L. McMicken

    2012-09-01

    This research article describes the remediation of moderate stuttering in an adult client who had experienced speech dysfluency for more than 40 years. Treatment took place at an urban residential rehabilitation mission where the client was court-sentenced for a history of felonies and current narcotic sales and use. In conjunction with the operant conditioning instruction of the rehabilitation mission, the Ryan Fluency Program was implemented, along with the initial use of pause time, in response to the complex needs of the client. The article provides an overview of the assessment (Fluency Interviews, Criterion Tests) and the treatment program. At present, 2.5 years post-initiation of treatment, the client has reported and been observed to have achieved smooth, forward-flowing, natural-sounding speech throughout his work environment, family interaction, and daily life.

  18. On speech recognition during anaesthesia

    DEFF Research Database (Denmark)

    Alapetite, Alexandre

    2007-01-01

    This PhD thesis in human-computer interfaces (informatics) studies the case of the anaesthesia record used during medical operations and the possibility of supplementing it with speech recognition facilities. Problems and limitations have been identified with the traditional paper-based anaesthesia… and inaccuracies in the anaesthesia record. Supplementing the electronic anaesthesia record interface with speech input facilities is proposed as one possible solution to a part of the problem. The testing of the various hypotheses has involved the development of a prototype of an electronic anaesthesia record… interface with speech input facilities in Danish. The evaluation of the new interface was carried out in a full-scale anaesthesia simulator. This has been complemented by laboratory experiments on several aspects of speech recognition for this type of use, e.g. the effects of noise on speech recognition…

  19. Comprehending text in literature class

    Directory of Open Access Journals (Sweden)

    Purić Daliborka S.

    2016-01-01

    Full Text Available The paper discusses the problem of understanding a text and the contribution of the methodological apparatus in reader books to comprehension of texts read in junior classes of elementary school. Using the technique of content analysis on the methodological apparatuses of eight reader books for the fourth grade of elementary school, approved for use in the 2014/2015 academic year, and surveying 350 teachers in 33 elementary schools across 11 administrative districts in the Republic of Serbia, we examined: (a) to what extent the Serbian language textbook contents enable junior students to understand a literary text; and (b) to what extent teachers accept the suggestions offered in the textbook for preparing literature teaching. The results show that a large number of suggestions relate to reading comprehension, but some categories of understanding are unevenly distributed in the methodological apparatus. On the other hand, the majority of teachers use the methodological apparatus given in a textbook for preparing classes, drawing not only on the textbook they selected for teaching but also on other textbooks for the same grade.

  20. Methods for Mining and Summarizing Text Conversations

    CERN Document Server

    Carenini, Giuseppe; Murray, Gabriel

    2011-01-01

    Due to the Internet Revolution, human conversational data -- in written forms -- are accumulating at a phenomenal rate. At the same time, improvements in speech technology enable many spoken conversations to be transcribed. Individuals and organizations engage in email exchanges, face-to-face meetings, blogging, texting and other social media activities. The advances in natural language processing provide ample opportunities for these "informal documents" to be analyzed and mined, thus creating numerous new and valuable applications. This book presents a set of computational methods

  1. The potential of speech act theory for New Testament exegesis ...

    African Journals Online (AJOL)

    Speech act theory as well offers New Testament exegesis some additional ways and means of approaching the text of the New Testament. This first in a series of two articles making a plea for the continued utilisation and application of this theory to the text of the New Testament, offers a brief discussion of the basic ...

  2. Speech act theory and New Testament exegesis | Botha | HTS ...

    African Journals Online (AJOL)

    Speech act theory offers New Testament exegesis some additional ways and means of approaching the text of the New Testament. This, the second in a series of two articles that make a plea for the continued utilisation and application of this theory to the text of the New Testament, deals with some of the possibilities and ...

  3. Texting on the Move

    Science.gov (United States)

    ... text. What's the Big Deal? The problem is multitasking. No matter how young and agile we are, ... on something other than the road. In fact, driving while texting (DWT) can be more dangerous than ...

  4. Contributions of the textual analysis of speeches for the teaching in the virtual enviroments

    Directory of Open Access Journals (Sweden)

    Sueli Cristina Marquesi

    2013-12-01

    Full Text Available The present paper discusses theoretical aspects of the Textual Analysis of Speeches that guide an autonomous learning methodology for university students. Grounded mainly in the studies developed by Adam (2008), it examines a thematic unit of theoretical content in the area of the Exact Sciences. Within this unit, explicative and descriptive sequences are constructed so as to promote interaction between the text, presented in a virtual environment, and the student, thereby facilitating the learning of new content. To this end, activities prepared for engineering students of a Brazilian university, delivered entirely at a distance, are brought into the discussion. The methodology establishes a dialogue between an issue that is central to learning in virtual environments, namely interaction through language, and the role the student has to assume in these environments: that of a reader/author who makes meaning and transfers knowledge.

  5. Text Coherence in Translation

    Science.gov (United States)

    Zheng, Yanping

    2009-01-01

    In the thesis a coherent text is defined as a continuity of senses of the outcome of combining concepts and relations into a network composed of knowledge space centered around main topics. And the author maintains that in order to obtain the coherence of a target language text from a source text during the process of translation, a translator can…

  6. Stuttering Frequency, Speech Rate, Speech Naturalness, and Speech Effort During the Production of Voluntary Stuttering.

    Science.gov (United States)

    Davidow, Jason H; Grossman, Heather L; Edge, Robin L

    2018-05-01

    Voluntary stuttering techniques involve persons who stutter purposefully interjecting disfluencies into their speech. Little research has been conducted on the impact of these techniques on the speech pattern of persons who stutter. The present study examined whether changes in the frequency of voluntary stuttering accompanied changes in stuttering frequency, articulation rate, speech naturalness, and speech effort. In total, 12 persons who stutter aged 16-34 years participated. Participants read four 300-syllable passages during a control condition, and three voluntary stuttering conditions that involved attempting to produce purposeful, tension-free repetitions of initial sounds or syllables of a word for two or more repetitions (i.e., bouncing). The three voluntary stuttering conditions included bouncing on 5%, 10%, and 15% of syllables read. Friedman tests and follow-up Wilcoxon signed ranks tests were conducted for the statistical analyses. Stuttering frequency, articulation rate, and speech naturalness were significantly different between the voluntary stuttering conditions. Speech effort did not differ between the voluntary stuttering conditions. Stuttering frequency was significantly lower during the three voluntary stuttering conditions compared to the control condition, and speech effort was significantly lower during two of the three voluntary stuttering conditions compared to the control condition. Due to changes in articulation rate across the voluntary stuttering conditions, it is difficult to conclude, as has been suggested previously, that voluntary stuttering is the reason for stuttering reductions found when using voluntary stuttering techniques. Additionally, future investigations should examine different types of voluntary stuttering over an extended period of time to determine their impact on stuttering frequency, speech rate, speech naturalness, and speech effort.
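A hedged sketch of the statistical pipeline reported above: an omnibus Friedman test over the four repeated-measures conditions, followed by Wilcoxon signed-rank follow-ups against the control condition. The condition names mirror the study, but every number below is fabricated for illustration (and SciPy is assumed to be available):

```python
import numpy as np
from scipy.stats import friedmanchisquare, wilcoxon

rng = np.random.default_rng(0)
# Fabricated stuttering frequencies (% syllables stuttered) for 12
# participants under control and three voluntary-stuttering conditions.
control = rng.normal(8.0, 2.0, 12)
bounce5 = control - rng.normal(2.0, 0.5, 12)   # bouncing on 5% of syllables
bounce10 = control - rng.normal(2.5, 0.5, 12)  # bouncing on 10%
bounce15 = control - rng.normal(3.0, 0.5, 12)  # bouncing on 15%

# Omnibus test across the four repeated-measures conditions.
stat, p = friedmanchisquare(control, bounce5, bounce10, bounce15)
print(f"Friedman chi-square = {stat:.2f}, p = {p:.4f}")

# Pairwise follow-ups against control, Bonferroni-corrected for 3 tests.
for label, cond in {"5%": bounce5, "10%": bounce10, "15%": bounce15}.items():
    w, pw = wilcoxon(control, cond)
    print(f"control vs {label}: W = {w:.1f}, corrected p = {min(pw * 3, 1.0):.4f}")
```

The Bonferroni correction here is one common way to control for the three follow-up comparisons; the abstract does not state which correction, if any, the authors used.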

  7. Acoustic assessment of speech privacy curtains in two nursing units

    Directory of Open Access Journals (Sweden)

    Diana S Pope

    2016-01-01

    Full Text Available Hospitals have complex soundscapes that create challenges to patient care. Extraneous noise and high reverberation rates impair speech intelligibility, which leads to raised voices. In an unintended spiral, the increasing noise may result in diminished speech privacy, as people speak loudly to be heard over the din. The products available to improve hospital soundscapes include construction materials that absorb sound (acoustic ceiling tiles, carpet, wall insulation) and reduce reverberation rates. Enhanced privacy curtains are now available and offer potential for a relatively simple way to improve speech privacy and speech intelligibility by absorbing sound at the hospital patient's bedside. Acoustic assessments were performed over 2 days on two nursing units with a similar design in the same hospital. One unit was built to the 1970s' standard of hospital construction and the other was newly refurbished (2013) with sound-absorbing features. In addition, we determined the effect of an enhanced privacy curtain versus standard privacy curtains using acoustic measures of speech privacy and speech intelligibility indexes. Privacy curtains provided auditory protection for the patients. In general, that protection was increased by the use of enhanced privacy curtains. On average, the enhanced curtain improved sound absorption from 20% to 30%; however, there was considerable variability, depending on the configuration of the rooms tested. Enhanced privacy curtains provide measurable improvement to the acoustics of patient rooms but cannot overcome larger acoustic design issues. To shorten reverberation time, additional absorption and more compact, fragmented nursing unit floor plate shapes should be considered.

  8. Aging and Spectro-Temporal Integration of Speech

    Directory of Open Access Journals (Sweden)

    John H. Grose

    2016-10-01

    Full Text Available The purpose of this study was to determine the effects of age on the spectro-temporal integration of speech. The hypothesis was that the integration of speech fragments distributed over frequency, time, and ear of presentation is reduced in older listeners, even in those with good audiometric hearing. Younger, middle-aged, and older listeners (10 per group) with good audiometric hearing participated. They were each tested under seven conditions that encompassed combinations of spectral, temporal, and binaural integration. Sentences were filtered into two bands centered at 500 Hz and 2500 Hz, with criterion bandwidth tailored for each participant. In some conditions, the speech bands were individually square-wave interrupted at a rate of 10 Hz. Configurations of uninterrupted, synchronously interrupted, and asynchronously interrupted frequency bands were constructed that constituted speech fragments distributed across frequency, time, and ear of presentation. The overarching finding was that, for most configurations, performance was not differentially affected by listener age. Although speech intelligibility varied across conditions, there was no evidence of performance deficits in older listeners in any condition. This study indicates that age, per se, does not necessarily undermine the ability to integrate fragments of speech dispersed across frequency and time.

  9. An Investigation of effective factors on nurses' speech errors

    Directory of Open Access Journals (Sweden)

    Maryam Tafaroji yeganeh

    2017-03-01

    Full Text Available Background: Speech errors are a topic within psycholinguistics. A speech error, or slip of the tongue, is a natural process that happens to everyone. This research matters because of the sensitivity of nursing, where speech errors may interfere with the treatment of patients; unfortunately, no research had yet been done in this field. The present study examined the factors (personality, stress, fatigue and insomnia) that cause speech errors among nurses of Ilam province. Materials and Methods: The sample of this descriptive-correlational study consists of 50 nurses working in Mustafa Khomeini Hospital of Ilam province, selected randomly. Data were collected using the Minnesota Multiphasic Personality Inventory, the NEO Five-Factor Inventory and the Expanded Nursing Stress Scale, and were analyzed using SPSS version 20 with descriptive, inferential and multivariate linear regression or two-variable statistical methods (significance level: p ≤ 0.05). Results: 30 (60%) of the nurses participating in the study were female and 19 (38%) were male. All three factors examined (personality type, stress and fatigue) had significant effects on nurses' speech errors. Conclusion: Personality type, stress and fatigue all significantly affect nurses' speech errors.

  10. THE DIRECTIVE SPEECH ACTS USED IN ENGLISH SPEAKING CLASS

    Directory of Open Access Journals (Sweden)

    Muhammad Khatib Bayanuddin

    2016-12-01

    Full Text Available This research analyzes the directive speech acts used in the English speaking class of the third semester of the English study program of IAIN STS Jambi. The aims of this research are to describe the types of directive speech acts and the politeness strategies found in the English speaking class. This research used a descriptive qualitative method to describe the types and politeness strategies of directive speech acts based on data from the English speaking class. The results showed several types and politeness strategies of directive speech acts: requestives, questions, requirements, prohibitives, permissives, and advisories as types, and on-record indirect strategies (prediction statement, strong obligation statement, possibility statement, weaker obligation statement, volitional statement), direct strategies (imperative, performative), and nonsentential strategies as politeness strategies. The findings are intended to add to knowledge of linguistics, especially directive speech acts, and to inform future research. Key words: directive speech acts, types, politeness strategies.

  11. Duration and speed of speech events: A selection of methods

    Directory of Open Access Journals (Sweden)

    Gibbon Dafydd

    2015-07-01

    Full Text Available The study of speech timing, i.e. the duration and speed or tempo of speech events, has increased in importance over the past twenty years, in particular in connection with increased demands for accuracy, intelligibility and naturalness in speech technology, with applications in language teaching and testing, and with the study of speech timing patterns in language typology. However, the methods used in such studies are very diverse, and so far there is no accessible overview of these methods. Since the field is too broad for us to provide an exhaustive account, we have made two choices: first, to provide a framework of paradigmatic (classificatory), syntagmatic (compositional) and functional (discourse-oriented) dimensions for duration analysis; and second, to provide worked examples of a selection of methods associated primarily with these three dimensions. Some of the methods covered are established state-of-the-art approaches (e.g. the paradigmatic Classification and Regression Tree, CART, analysis), while others are discussed in a critical light (e.g. the so-called ‘rhythm metrics’). A set of syntagmatic approaches applies to the tokenisation and tree parsing of duration hierarchies, based on speech annotations, and a functional approach describes duration distributions with sociolinguistic variables. Several of the methods are supported by a new web-based software tool for analysing annotated speech data, the Time Group Analyser.
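The paradigmatic CART analysis mentioned above can be illustrated with a minimal, pure-Python regression-tree sketch: it recursively splits on categorical context features to minimise the variance of segment durations within each leaf. The features (phone class, stress, phrase finality) and durations below are invented toy data, and this illustrates the CART idea only, not the paper's actual tool:

```python
def sse(ys):
    """Sum of squared errors of durations around their mean."""
    if not ys:
        return 0.0
    m = sum(ys) / len(ys)
    return sum((y - m) ** 2 for y in ys)

def best_split(rows):
    """Find the (feature index, value) equality split that most reduces SSE."""
    best = None
    base = sse([r[-1] for r in rows])
    for i in range(len(rows[0]) - 1):
        for v in {r[i] for r in rows}:
            left = [r[-1] for r in rows if r[i] == v]
            right = [r[-1] for r in rows if r[i] != v]
            gain = base - sse(left) - sse(right)
            if left and right and (best is None or gain > best[0]):
                best = (gain, i, v)
    return best

def build(rows, min_leaf=2):
    """Grow the tree; a leaf predicts the mean duration of its rows."""
    split = best_split(rows)
    if split is None or len(rows) <= min_leaf:
        return sum(r[-1] for r in rows) / len(rows)
    _, i, v = split
    yes = [r for r in rows if r[i] == v]
    no = [r for r in rows if r[i] != v]
    return (i, v, build(yes, min_leaf), build(no, min_leaf))

def predict(tree, row):
    while isinstance(tree, tuple):
        i, v, yes, no = tree
        tree = yes if row[i] == v else no
    return tree

# Toy rows: (phone class, stressed?, phrase-final?) -> duration in ms.
data = [
    ("vowel", "yes", "yes", 180), ("vowel", "yes", "no", 120),
    ("vowel", "no", "yes", 140), ("vowel", "no", "no", 90),
    ("stop", "yes", "no", 70), ("stop", "no", "no", 55),
]
tree = build(data)
print(predict(tree, ("vowel", "yes", "yes")))  # → 160.0
```

With these toy rows the first split selected is phrase finality, so a stressed phrase-final vowel is predicted as the mean of the two final-vowel durations, 160 ms.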

  12. FREEDOM OF SPEECH IN INDONESIAN PRESS: INTERNATIONAL HUMAN RIGHTS PERSPECTIVE

    Directory of Open Access Journals (Sweden)

    Clara Staples

    2016-06-01

    Full Text Available This paper will firstly examine the international framework of human rights law and its guidelines for safeguarding the right to freedom of speech in the press. Secondly, it will describe the constitutional and other legal rights protecting freedom of speech in Indonesia and assess their compatibility with the right to freedom of speech under the international human rights law framework. Thirdly it will consider the impact of Indonesia’s constitutional law and criminal and civil law, including sedition and defamation laws, and finally media ownership, on the interpretation and scope of the right to freedom of speech in the press. Consideration of these laws will be integrated with a discussion of judicial processes. This discussion will be used to determine how and in what circumstances the constitutional right to freedom of speech in the press may be facilitated or enabled, or on the other hand, limited, overridden or curtailed in Indonesia. Conclusions will then be drawn regarding the strengths and weaknesses of Indonesian laws in safeguarding the right to freedom of speech in the press and the democratic implications from an international human rights perspective. This inquiry will be restricted to Indonesian laws in existence during the post-New Order period of 1998 to the present, and to the information and analysis provided by English-language sources.

  13. SynFace—Speech-Driven Facial Animation for Virtual Speech-Reading Support

    Directory of Open Access Journals (Sweden)

    Giampiero Salvi

    2009-01-01

    Full Text Available This paper describes SynFace, a supportive technology that aims at enhancing audio-based spoken communication in adverse acoustic conditions by providing the missing visual information in the form of an animated talking head. Firstly, we describe the system architecture, consisting of a 3D animated face model controlled from the speech input by a specifically optimised phonetic recogniser. Secondly, we report on speech intelligibility experiments with focus on multilinguality and robustness to audio quality. The system, already available for Swedish, English, and Flemish, was optimised for German and for the Swedish wide-band speech quality available in TV, radio, and Internet communication. Lastly, the paper covers experiments with nonverbal motions driven from the speech signal. It is shown that turn-taking gestures can be used to affect the flow of human-human dialogues. We have focused specifically on two categories of cues that may be extracted from the acoustic signal: prominence/emphasis and interactional cues (turn-taking/back-channelling).

  14. Speech Recognition for the iCub Platform

    Directory of Open Access Journals (Sweden)

    Bertrand Higy

    2018-02-01

    Full Text Available This paper describes open source software (available at https://github.com/robotology/natural-speech) to build automatic speech recognition (ASR) systems and run them within the YARP platform. The toolkit is designed (i) to allow non-ASR experts to easily create their own ASR system and run it on iCub and (ii) to build deep learning-based models specifically addressing the main challenges an ASR system faces in the context of verbal human–iCub interactions. The toolkit mostly consists of Python and C++ code and shell scripts integrated in YARP. As an additional contribution, a second codebase (written in Matlab) is provided for more expert ASR users who want to experiment with bio-inspired and developmental learning-inspired ASR systems. Specifically, we provide code for two distinct kinds of speech recognition: “articulatory” and “unsupervised” speech recognition. The first is largely inspired by influential neurobiological theories of speech perception which assume speech perception to be mediated by brain motor cortex activities. Our articulatory systems have been shown to outperform strong deep learning-based baselines. The second type of recognition systems, the “unsupervised” systems, do not use any supervised information (contrary to most ASR systems, including our articulatory systems). To some extent, they mimic an infant who has to discover the basic speech units of a language by herself. In addition, we provide resources consisting of pre-trained deep learning models for ASR, and a 2.5-h speech dataset of spoken commands, the VoCub dataset, which can be used to adapt an ASR system to the typical acoustic environments in which iCub operates.

  15. Segmentation cues in conversational speech: Robust semantics and fragile phonotactics

    Directory of Open Access Journals (Sweden)

    Laurence eWhite

    2012-10-01

    Full Text Available Multiple cues influence listeners’ segmentation of connected speech into words, but most previous studies have used stimuli elicited in careful readings rather than natural conversation. Discerning word boundaries in conversational speech may differ from the laboratory setting. In particular, a speaker’s articulatory effort – hyperarticulation vs hypoarticulation (H&H) – may vary according to communicative demands, suggesting a compensatory relationship whereby acoustic-phonetic cues are attenuated when other information sources strongly guide segmentation. We examined how listeners’ interpretation of segmentation cues is affected by speech style (spontaneous conversation vs read), using cross-modal identity priming. To elicit spontaneous stimuli, we used a map task in which speakers discussed routes around stylised landmarks. These landmarks were two-word phrases in which the strength of potential segmentation cues – semantic likelihood and cross-boundary diphone phonotactics – was systematically varied. Landmark-carrying utterances were transcribed and later re-recorded as read speech. Independent of speech style, we found an interaction between cue valence (favourable/unfavourable) and cue type (phonotactics/semantics). Thus, there was an effect of semantic plausibility, but no effect of cross-boundary phonotactics, indicating that the importance of phonotactic segmentation may have been overstated in studies where lexical information was artificially suppressed. These patterns were unaffected by whether the stimuli were elicited in a spontaneous or read context, even though the difference in speech styles was evident in a main effect. Durational analyses suggested speaker-driven cue trade-offs congruent with an H&H account, but these modulations did not impact on listener behaviour. We conclude that previous research exploiting read speech is reliable in indicating the primacy of lexically-based cues in the segmentation of natural

  16. Automatic speech signal segmentation based on the innovation adaptive filter

    Directory of Open Access Journals (Sweden)

    Makowski Ryszard

    2014-06-01

    Full Text Available Speech segmentation is an essential stage in designing automatic speech recognition systems and one can find several algorithms proposed in the literature. It is a difficult problem, as speech is immensely variable. The aim of the authors' studies was to design an algorithm that could be employed at the stage of automatic speech recognition. This would make it possible to avoid some problems related to speech signal parametrization. Posing the problem in such a way requires the algorithm to be capable of working in real time. The only such algorithm was proposed by Tyagi et al. (2006), and it is a modified version of Brandt's algorithm. The article presents a new algorithm for unsupervised automatic speech signal segmentation. It performs segmentation without access to information about the phonetic content of the utterances, relying exclusively on second-order statistics of a speech signal. The starting point for the proposed method is the time-varying Schur coefficients of an innovation adaptive filter. The Schur algorithm is known to be fast, precise, stable and capable of rapidly tracking changes in second-order signal statistics. A transition from one phoneme to another in the speech signal always indicates a change in signal statistics caused by vocal tract changes. In order to allow for the properties of human hearing, detection of inter-phoneme boundaries is performed based on statistics defined on the mel spectrum determined from the reflection coefficients. The paper presents the structure of the algorithm, defines its properties, lists parameter values, describes detection efficiency results, and compares them with those for another algorithm. The obtained segmentation results are satisfactory.
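The full method tracks time-varying Schur coefficients of an innovation adaptive filter; as a much simplified, hedged stand-in for the underlying idea (boundaries appear where second-order signal statistics change), the sketch below flags frames of a synthetic signal whose frame energy and lag-1 autocorrelation jump between adjacent frames. The frame length, threshold, and signal are all invented for illustration:

```python
import numpy as np

def frame_stats(x, frame=160):
    """Per-frame log energy and normalised lag-1 autocorrelation."""
    frames = x[: len(x) // frame * frame].reshape(-1, frame)
    energy = (frames ** 2).mean(axis=1)
    r1 = (frames[:, 1:] * frames[:, :-1]).mean(axis=1) / (energy + 1e-12)
    return np.stack([np.log(energy + 1e-12), r1], axis=1)

def boundaries(x, frame=160, threshold=1.0):
    """Frame indices where the statistics vector jumps between frames."""
    stats = frame_stats(x, frame)
    jump = np.linalg.norm(np.diff(stats, axis=0), axis=1)
    return np.nonzero(jump > threshold)[0] + 1

# Synthetic "speech": one second of a 200 Hz tone (voiced-like) followed
# by one second of noise (fricative-like) at 16 kHz; the statistics
# change abruptly at the segment boundary, frame 100.
rng = np.random.default_rng(1)
sr = 16000
t = np.arange(sr) / sr
signal = np.concatenate([np.sin(2 * np.pi * 200 * t),
                         0.3 * rng.standard_normal(sr)])
print(boundaries(signal))
```

This crude two-statistic detector stands in for the Schur-coefficient tracking only conceptually; the published algorithm also maps the reflection coefficients onto a mel spectrum before boundary detection.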

  17. Speech-activated Myoclonus Mimicking Stuttering in a Patient with Myoclonus–Dystonia Syndrome

    Directory of Open Access Journals (Sweden)

    Peter Hedera

    2016-07-01

    Full Text Available Background: Acquired neurogenic stuttering has been considered a fairly uncommon clinical occurrence; speech-activated myoclonus is a rare entity that can mimic stuttering and is caused by a wide array of etiologies. Case Report: Here we report a patient with myoclonus–dystonia syndrome (MDS), due to an identified disease-causing mutation, who displayed speech-activated myoclonus mimicking stuttering. Discussion: In MDS, myoclonus has only infrequently been reported to affect speech. This case further expands the spectrum of conditions causing the rare clinical phenomenon of speech-activated myoclonus.

  18. Methods and Application of Phonetic Label Alignment in Speech Processing Tasks

    Directory of Open Access Journals (Sweden)

    M. Myslivec

    2000-12-01

    Full Text Available The paper deals with the problem of automatic phonetic segmentation of speech signals, namely for speech analysis and recognition purposes. Several methods and approaches are described and evaluated from the point of view of their accuracy. A complete instruction for creating an annotated database for training a Czech speech recognition system is provided together with the authors' own experience. The results of the work have found practical applications, for example, in developing a tool for semi-automatic speech segmentation, building a large-vocabulary phoneme-based speech recognition system and designing an aid for learning and practicing pronunciation of words or phrases in the native or a foreign language.

  19. A novel speech prosthesis for mandibular guidance therapy in hemimandibulectomy patient: A clinical report

    Directory of Open Access Journals (Sweden)

    Raghavendra Adaki

    2016-01-01

    Full Text Available Treating diverse maxillofacial patients poses a challenge to the maxillofacial prosthodontist. Rehabilitation of hemimandibulectomy patients must aim at restoring mastication and other functions such as intelligible speech, swallowing, and esthetics. Prosthetic methods such as the palatal ramp and mandibular guiding flange reposition the deviated mandible. Such prostheses can also be used to restore speech in patients with debilitated speech following surgical resection. This clinical report details a hemimandibulectomy patient provided with an interim removable dental speech prosthesis with a composite resin flange for mandibular guidance therapy.

  20. Analysis of vocal signal in its amplitude - time representation. speech synthesis-by-rules

    International Nuclear Information System (INIS)

    Rodet, Xavier

    1977-01-01

    In the first part of this dissertation, natural speech production and the resulting acoustic waveform are examined under various aspects: communication, phonetics, frequency and temporal analysis. Our own study of the direct signal is compared to other research in these different fields, and fundamental features of vocal signals are described. The second part deals with the numerous methods already used for automatic text-to-speech synthesis. In the last part, we present the new speech synthesis-by-rule methods that we have worked out, and we describe in detail the structure of the real-time speech synthesiser that we have implemented on a minicomputer. (author) [fr

  1. The power of antiquity for modern purposes. A rhetorical analysis of a speech by Steve Jobs

    OpenAIRE

    Martín González, Ana Cristina

    2015-01-01

    The purpose of this paper is to analyze the blend of classic rhetorical techniques with linguistic devices to create an effective speech. To observe this, Steve Jobs’ Commencement Speech, given at Stanford University in 2005, is the text under study. The success of the speech is studied with a focus on the relationship between the speaker and his audience. Jobs’ speech is also related to the new communication media, having become a viral video on the web working in differe...

  2. Speech disorders did not correlate with age at onset of Parkinson’s disease

    Directory of Open Access Journals (Sweden)

    Alice Estevo Dias

    2016-02-01

    Full Text Available ABSTRACT Speech disorders are common manifestations of Parkinson's disease. Objective To compare speech articulation in patients according to age at onset of the disease. Methods Fifty patients were divided into two groups: Group I consisted of 30 patients with age at onset between 40 and 55 years; Group II consisted of 20 patients with age at onset after 65 years. All patients were evaluated based on the Unified Parkinson's Disease Rating Scale scores, the Hoehn and Yahr scale and speech evaluation by perceptual and acoustic analysis. Results There was no statistically significant difference between the two groups regarding neurological involvement and speech characteristics. Correlation analysis indicated differences in speech articulation in relation to staging and axial scores of rigidity and bradykinesia for middle- and late-onset patients. Conclusions Impairment of speech articulation did not correlate with age at onset of the disease, but was positively related to disease duration and higher scores in both groups.

  3. Speech-like rhythm in a voiced and voiceless orangutan call.

    Directory of Open Access Journals (Sweden)

    Adriano R Lameira

    Full Text Available The evolutionary origins of speech remain obscure. Recently, it was proposed that speech derived from monkey facial signals which exhibit a speech-like rhythm of ∼5 open-close lip cycles per second. In monkeys, these signals may also be vocalized, offering a plausible evolutionary stepping stone towards speech. Three essential predictions remain, however, to be tested to assess this hypothesis' validity: (i) great apes, our closest relatives, should likewise produce 5 Hz-rhythm signals; (ii) speech-like rhythm should involve calls articulatorily similar to consonants and vowels, given that speech rhythm is the direct product of stringing together these two basic elements; and (iii) speech-like rhythm should be experience-based. Via cinematic analyses we demonstrate that an ex-entertainment orangutan produces two calls at a speech-like rhythm, coined "clicks" and "faux-speech." Like voiceless consonants, clicks required no vocal fold action, but did involve independent manoeuvring over lips and tongue. In parallel to vowels, faux-speech showed harmonic and formant modulations, implying vocal fold and supralaryngeal action. This rhythm was several times faster than orangutan chewing rates, as observed in monkeys and humans. Critically, this rhythm was seven-fold faster, and contextually distinct, than any other known rhythmic calls described to date in the largest database of the orangutan repertoire ever assembled. The first two predictions advanced by this study are validated and, based on parsimony and exclusion of potential alternative explanations, initial support is given to the third prediction. Irrespective of the putative origins of these calls and underlying mechanisms, our findings demonstrate irrevocably that great apes are not respiratorily, articulatorily, or neurologically constrained for the production of consonant- and vowel-like calls at speech rhythm. Orangutan clicks and faux-speech confirm the importance of rhythmic speech
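The speech-like rhythm at issue can be quantified from video by estimating the dominant frequency of a mouth-aperture trace. A hedged sketch, in which an entirely synthetic 5 Hz aperture signal sampled at a typical video frame rate stands in for real cinematic measurements:

```python
import numpy as np

fps = 30.0                      # assumed video frame rate
t = np.arange(0, 4, 1 / fps)    # 4 s of frames
# Synthetic aperture trace: 5 Hz open-close cycles plus measurement noise.
aperture = (np.sin(2 * np.pi * 5.0 * t)
            + 0.2 * np.random.default_rng(2).standard_normal(t.size))

# Dominant rhythm = frequency of the largest spectral peak (DC removed).
spectrum = np.abs(np.fft.rfft(aperture - aperture.mean()))
freqs = np.fft.rfftfreq(t.size, d=1 / fps)
print("dominant rhythm:", freqs[spectrum.argmax()], "Hz")
```

With 4 s of frames the frequency resolution is 0.25 Hz, comfortably fine enough to distinguish a ∼5 Hz rhythm from much slower chewing rates.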

  4. Brain-to-text: Decoding spoken phrases from phone representations in the brain

    Directory of Open Access Journals (Sweden)

    Christian eHerff

    2015-06-01

    Full Text Available It has long been speculated whether communication between humans and machines based on natural speech related cortical activity is possible. Over the past decade, studies have suggested that it is feasible to recognize isolated aspects of speech from neural signals, such as auditory features, phones or one of a few isolated words. However, until now it remained an unsolved challenge to decode continuously spoken speech from the neural substrate associated with speech and language processing. Here, we show for the first time that continuously spoken speech can be decoded into the expressed words from intracranial electrocorticographic (ECoG) recordings. Specifically, we implemented a system, which we call Brain-To-Text, that models single phones, employs techniques from automatic speech recognition (ASR), and thereby transforms brain activity while speaking into the corresponding textual representation. Our results demonstrate that our system achieved word error rates as low as 25% and phone error rates below 50%. Additionally, our approach contributes to the current understanding of the neural basis of continuous speech production by identifying those cortical regions that hold substantial information about individual phones. In conclusion, the Brain-To-Text system described in this paper represents an important step towards human-machine communication based on imagined speech.
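The word error rates reported above are computed, in standard ASR practice, as the word-level edit distance between reference and hypothesis divided by the reference length. A minimal implementation (the example sentences are invented):

```python
def wer(reference, hypothesis):
    """Word error rate: word-level edit distance / reference length."""
    r, h = reference.split(), hypothesis.split()
    # Dynamic-programming (Levenshtein) edit distance over words.
    d = [[0] * (len(h) + 1) for _ in range(len(r) + 1)]
    for i in range(len(r) + 1):
        d[i][0] = i
    for j in range(len(h) + 1):
        d[0][j] = j
    for i in range(1, len(r) + 1):
        for j in range(1, len(h) + 1):
            cost = 0 if r[i - 1] == h[j - 1] else 1
            d[i][j] = min(d[i - 1][j] + 1,      # deletion
                          d[i][j - 1] + 1,      # insertion
                          d[i - 1][j - 1] + cost)  # substitution/match
    return d[len(r)][len(h)] / len(r)

# One substitution and one insertion against a 4-word reference: 2/4.
print(wer("the quick brown fox", "the quack brown fox jumps"))  # → 0.5
```

Note that WER can exceed 100% when the hypothesis contains many insertions, which is why the 25% figure above represents a strong result for decoding from neural signals.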

  5. Asymmetric dynamic attunement of speech and gestures in the construction of children’s understanding

    Directory of Open Access Journals (Sweden)

    Lisette eDe Jonge-Hoekstra

    2016-03-01

    Full Text Available As children learn, they use their speech to express words and their hands to gesture. This study investigates the interplay between real-time gestures and speech as children construct cognitive understanding during a hands-on science task. Twelve children (M = 6, F = 6) from Kindergarten (n = 5) and first grade (n = 7) participated in this study. Each verbal utterance and gesture during the task was coded on a complexity scale derived from dynamic skill theory. To explore the interplay between speech and gestures, we applied a cross recurrence quantification analysis (CRQA) to the two coupled time series of the skill levels of verbalizations and gestures. The analysis focused on (1) the temporal relation between gestures and speech, (2) the relative strength and direction of the interaction between gestures and speech, (3) the relative strength and direction between gestures and speech for different levels of understanding, and (4) relations between CRQA measures and other child characteristics. The results show that older and younger children differ in the (temporal) asymmetry in the gestures-speech interaction. For younger children, the balance leans more towards gestures leading speech in time, while for older children the balance leans more towards speech leading gestures. Secondly, at the group level, speech attracts gestures in a more dynamically stable fashion than vice versa, and this asymmetry between gestures and speech extends to lower and higher understanding levels. Yet, for older children, the mutual coupling between gestures and speech is more dynamically stable at the higher understanding levels. Gestures and speech are more synchronized in time as children grow older. A higher score on schools' language tests is related to speech attracting gestures more rigidly and to more asymmetry between gestures and speech, only for the less difficult understanding levels. A higher score on math or past science tasks is related to less asymmetry between
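
    The cross-recurrence idea behind the analysis above can be sketched for two coded time series: a point (i, j) is recurrent when the gesture level at time i equals the speech level at time j, and the matrix is then summarized (here by the recurrence rate). The data are invented, and real CRQA adds embedding and diagonal-line measures that this sketch omits.

```python
# Minimal cross-recurrence sketch for two categorical time series:
# a point (i, j) recurs when the two series take the same skill level.
# Data are invented; real CRQA uses embedding and further measures.

def cross_recurrence_matrix(a, b):
    return [[1 if x == y else 0 for y in b] for x in a]

def recurrence_rate(matrix):
    total = sum(sum(row) for row in matrix)
    cells = len(matrix) * len(matrix[0])
    return total / cells

gestures = [1, 1, 2, 2, 3, 3]   # coded gesture complexity per utterance
speech   = [1, 2, 2, 3, 3, 3]   # coded speech complexity per utterance

m = cross_recurrence_matrix(gestures, speech)
print(f"recurrence rate: {recurrence_rate(m):.2f}")
```

    Asymmetry measures of the kind reported in the study compare recurrence structure above versus below the main diagonal, which indicates whether one series tends to lead the other in time.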

  6. The expressive potential in a dramatic text: Brecht's A Respectable Wedding

    Directory of Open Access Journals (Sweden)

    Katarina Podbevšek

    2017-12-01

    Full Text Available The article discusses the linguistic shaping of a dramatic text and its influence on the text’s stage speech realisation, using the Slovenian translation of Brecht’s one-act play Malomeščanska svatba as an example. A dramatic text typically has a specific – and also graphically visible – textual and linguistic structure that indicates its speech intention. A linguistic analysis of Brecht’s text reveals a great speech potential, both in the stage directions (especially the stage directions for pauses, silence and spoken execution) and in the dialogue (characteristic linguistic elements of spontaneous speech). A short comparison of the text with the stage speech performance shows that the actor used not only prosody (especially pauses) to semantically enrich and rhythmically organise the written language, but also linguistic interventions into the lexical and syntactic structure (repetition, addition, omission, etc.). The great speech potential of the text thus stimulated the actor’s speech and interpretive creativity.

  7. Visual Speech Fills in Both Discrimination and Identification of Non-Intact Auditory Speech in Children

    Science.gov (United States)

    Jerger, Susan; Damian, Markus F.; McAlpine, Rachel P.; Abdi, Herve

    2018-01-01

    To communicate, children must discriminate and identify speech sounds. Because visual speech plays an important role in this process, we explored how visual speech influences phoneme discrimination and identification by children. Critical items had intact visual speech (e.g. baez) coupled to non-intact (excised onsets) auditory speech (signified…

  8. Speech enhancement using emotion dependent codebooks

    NARCIS (Netherlands)

    Naidu, D.H.R.; Srinivasan, S.

    2012-01-01

    Several speech enhancement approaches utilize trained models of clean speech data, such as codebooks, Gaussian mixtures, and hidden Markov models. These models are typically trained on neutral clean speech data, without any emotion. However, in practical scenarios, emotional speech is a common

  9. Automated Speech Rate Measurement in Dysarthria

    Science.gov (United States)

    Martens, Heidi; Dekens, Tomas; Van Nuffelen, Gwen; Latacz, Lukas; Verhelst, Werner; De Bodt, Marc

    2015-01-01

    Purpose: In this study, a new algorithm for automated determination of speech rate (SR) in dysarthric speech is evaluated. We investigated how reliably the algorithm calculates the SR of dysarthric speech samples when compared with calculation performed by speech-language pathologists. Method: The new algorithm was trained and tested using Dutch…

  10. Is Birdsong More Like Speech or Music?

    Science.gov (United States)

    Shannon, Robert V

    2016-04-01

    Music and speech share many acoustic cues, but not all are equally important. For example, harmonic pitch is essential for music but not for speech. When birds communicate, is their song more like speech or music? A new study contrasting pitch and spectral patterns shows that birds perceive their song more like humans perceive speech. Copyright © 2016 Elsevier Ltd. All rights reserved.

  11. Freedom of Speech Newsletter, September, 1975.

    Science.gov (United States)

    Allen, Winfred G., Jr., Ed.

    The Freedom of Speech Newsletter is the communication medium for the Freedom of Speech Interest Group of the Western Speech Communication Association. The newsletter contains such features as a statement of concern by the National Ad Hoc Committee Against Censorship; Reticence and Free Speech, an article by James F. Vickrey discussing the subtle…

  12. Speech recovery device

    Energy Technology Data Exchange (ETDEWEB)

    Frankle, Christen M.

    2004-04-20

    There is provided an apparatus and method for assisting speech recovery in people with inability to speak due to aphasia, apraxia or another condition with similar effect. A hollow, rigid, thin-walled tube with semi-circular or semi-elliptical cut out shapes at each open end is positioned such that one end mates with the throat/voice box area of the neck of the assistor and the other end mates with the throat/voice box area of the assisted. The speaking person (assistor) makes sounds that produce standing wave vibrations at the same frequency in the vocal cords of the assisted person. Driving the assisted person's vocal cords with the assisted person being able to hear the correct tone enables the assisted person to speak by simply amplifying the vibration of membranes in their throat.

  13. Speech recovery device

    Energy Technology Data Exchange (ETDEWEB)

    Frankle, Christen M.

    2000-10-19

    There is provided an apparatus and method for assisting speech recovery in people with inability to speak due to aphasia, apraxia or another condition with similar effect. A hollow, rigid, thin-walled tube with semi-circular or semi-elliptical cut out shapes at each open end is positioned such that one end mates with the throat/voice box area of the neck of the assistor and the other end mates with the throat/voice box area of the assisted. The speaking person (assistor) makes sounds that produce standing wave vibrations at the same frequency in the vocal cords of the assisted person. Driving the assisted person's vocal cords with the assisted person being able to hear the correct tone enables the assisted person to speak by simply amplifying the vibration of membranes in their throat.

  14. Steganalysis of recorded speech

    Science.gov (United States)

    Johnson, Micah K.; Lyu, Siwei; Farid, Hany

    2005-03-01

    Digital audio provides a suitable cover for high-throughput steganography. At 16 bits per sample and sampled at a rate of 44,100 Hz, digital audio has the bit-rate to support large messages. In addition, audio is often transient and unpredictable, facilitating the hiding of messages. Using an approach similar to our universal image steganalysis, we show that hidden messages alter the underlying statistics of audio signals. Our statistical model begins by building a linear basis that captures certain statistical properties of audio signals. A low-dimensional statistical feature vector is extracted from this basis representation and used by a non-linear support vector machine for classification. We show the efficacy of this approach on LSB embedding and Hide4PGP. While no explicit assumptions about the content of the audio are made, our technique has been developed and tested on high-quality recorded speech.

  15. Facilitated auditory detection for speech sounds

    Directory of Open Access Journals (Sweden)

    Carine eSignoret

    2011-07-01

    Full Text Available While it is well known that knowledge facilitates higher cognitive functions, such as visual and auditory word recognition, little is known about the influence of knowledge on detection, particularly in the auditory modality. Our study tested the influence of phonological and lexical knowledge on auditory detection. Words, pseudowords and complex non-phonological sounds, energetically matched as closely as possible, were presented at a range of presentation levels from subthreshold to clearly audible. The participants performed a detection task (Experiments 1 and 2) that was followed by a two-alternative forced-choice recognition task in Experiment 2. The results of this second task in Experiment 2 suggest correct recognition of words in the absence of detection under a subjective threshold approach. In the detection task of both experiments, phonological stimuli (words and pseudowords) were better detected than non-phonological stimuli (complex sounds) presented close to the auditory threshold. This finding suggests an advantage of speech for signal detection. An additional advantage of words over pseudowords was observed in Experiment 2, suggesting that lexical knowledge could also improve auditory detection when listeners had to recognize the stimulus in a subsequent task. Two simulations of detection performance performed on the sound signals confirmed that the advantage of speech over non-speech processing could not be attributed to energetic differences in the stimuli.

  16. Speech interaction strategies for a humanoid assistant

    Directory of Open Access Journals (Sweden)

    Stüker Sebastian

    2018-01-01

    Full Text Available The goal of SecondHands, a H2020 project, is to design a robot that can offer help to a maintenance technician in a proactive manner. The robot is to act as a second pair of hands that can assist the technician when he is in need of help. In order for the robot to be of real help to the technician, it needs to understand his needs and follow his commands. Interaction via speech is a crucial part of this. Because of the nature of the situations in which these interactions take place, where the technician often needs to speak to the robot while under stress and performing strenuous physical labor, classical turn-based interaction schemes need to be transformed into dialogue systems that perform stream processing, anticipating user intentions and correcting themselves as more information becomes available, in order to respond rapidly. To meet these demands, we are developing low-latency, streaming-based automatic speech recognition systems in combination with recurrent-neural-network-based natural language understanding systems that perform slot filling and intent recognition, so that the robot can provide assistance rapidly, based in part on speculative classifications that are then refined as more speech becomes available.

  17. Speech and language intervention in bilinguals

    Directory of Open Access Journals (Sweden)

    Eliane Ramos

    2011-12-01

    Full Text Available Increasingly, speech and language pathologists (SLPs) around the world are faced with the unique set of issues presented by their bilingual clients. Some professional associations in different countries have presented recommendations for assessing and treating bilingual populations. In children, most of the studies have focused on intervention for language and phonology/articulation impairments, and very few focus on stuttering. In general, studies of language intervention tend to agree that intervention in the first language (L1) either increases performance in the second language (L2) or does not hinder it. In bilingual adults, monolingual versus bilingual intervention is especially relevant in cases of aphasia; dysarthria in bilinguals has barely been approached. Most studies of cross-linguistic effects in bilingual aphasics have focused on lexical retrieval training. It has been noted that even though a majority of studies have disclosed a cross-linguistic generalization from one language to the other, some methodological weaknesses are evident. It is concluded that even though speech and language intervention in bilinguals represents a most important clinical area in speech-language pathology, much more research using larger samples and controlling for potentially confounding variables is evidently required.

  18. Self-Reported Speech Problems in Adolescents and Young Adults with 22q11.2 Deletion Syndrome: A Cross-Sectional Cohort Study

    Directory of Open Access Journals (Sweden)

    Nicole E Spruijt

    2014-09-01

    Full Text Available Background: Speech problems are a common clinical feature of the 22q11.2 deletion syndrome. The objectives of this study were to inventory the speech history and current self-reported speech rating of adolescents and young adults, and to examine the possible variables influencing the current speech ratings, including cleft palate, surgery, speech and language therapy, intelligence quotient, and age at assessment. Methods: In this cross-sectional cohort study, 50 adolescents and young adults with the 22q11.2 deletion syndrome (ages 12-26 years, 67% female) filled out questionnaires. A neuropsychologist administered an age-appropriate intelligence quotient test. The demographics, histories, and intelligence of patients with normal speech (speech rating = 1) were compared to those of patients with different speech (speech rating > 1). Results: Of the 50 patients, a minority (26%) had a cleft palate, nearly half (46%) underwent a pharyngoplasty, and all (100%) had speech and language therapy. Poorer speech ratings were correlated with more years of speech and language therapy (Spearman's correlation = 0.418, P = 0.004; 95% confidence interval, 0.145-0.632). Only 34% had normal speech ratings. The groups with normal and different speech were not significantly different with respect to the demographic variables; a history of cleft palate, surgery, or speech and language therapy; and the intelligence quotient. Conclusions: All adolescents and young adults with the 22q11.2 deletion syndrome had undergone speech and language therapy, and nearly half of them underwent pharyngoplasty. Only 34% attained normal speech ratings. Those with poorer speech ratings had speech and language therapy for more years.

  19. Speech enhancement theory and practice

    CERN Document Server

    Loizou, Philipos C

    2013-01-01

    With the proliferation of mobile devices and hearing devices, including hearing aids and cochlear implants, there is a growing and pressing need to design algorithms that can improve speech intelligibility without sacrificing quality. Responding to this need, Speech Enhancement: Theory and Practice, Second Edition introduces readers to the basic problems of speech enhancement and the various algorithms proposed to solve these problems. Updated and expanded, this second edition of the bestselling textbook broadens its scope to include evaluation measures and enhancement algorithms aimed at impr

  20. Vocabulary Constraint on Texts

    Directory of Open Access Journals (Sweden)

    C. Sutarsyah

    2008-01-01

    Full Text Available This case study was carried out in the English Education Department of the State University of Malang. The aim of the study was to identify and describe the vocabulary in the reading texts and to determine whether the texts are useful for reading skill development. A descriptive qualitative design was applied to obtain the data. For this purpose, some available computer programs were used to produce a description of the vocabulary in the texts. It was found that the 20 texts, containing 7,945 words, are dominated by low-frequency words, which account for 16.97% of the words in the texts. The high-frequency words occurring in the texts were dominated by function words. In the case of word levels, it was found that the texts have a very limited number of words from the GSL (General Service List of English Words; West, 1953). The first 1,000 words of the GSL account for only 44.6%. The data also show that the texts contain too large a proportion of words which are not in the three levels (the first 2,000 and the UWL). These words account for 26.44% of the running words in the texts. It is believed that the constraints are due to the selection of the texts, which are made up of a series of short, unrelated texts. This kind of text is subject to the accumulation of low-frequency words, especially content words, and a limited number of words from the GSL. It could also impede the development of students' reading skills and vocabulary enrichment.
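
    The kind of vocabulary profiling described above can be sketched in a few lines: tokenize the texts, count token frequencies, and measure what share of the running words falls within a given high-frequency list. The three-word list below is a stand-in for the real 2,000-word GSL, and the sample sentence is invented.

```python
# Sketch of a vocabulary-profile count: tokenize a text and measure
# what share of running words comes from a high-frequency word list.
# TINY_GSL is a stand-in for the real General Service List.

import re
from collections import Counter

def tokenize(text):
    return re.findall(r"[a-z']+", text.lower())

def coverage(tokens, wordlist):
    hits = sum(1 for t in tokens if t in wordlist)
    return hits / len(tokens)

TINY_GSL = {"the", "of", "and"}  # stand-in for the first 1,000 GSL words

text = "The analysis of the texts and of their vocabulary"
tokens = tokenize(text)
freqs = Counter(tokens)
print(freqs.most_common(3))
print(f"GSL coverage: {coverage(tokens, TINY_GSL):.1%}")
```

    With the full GSL and UWL loaded from files, the same coverage function yields the band percentages reported in studies like this one.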

  1. Speech of people with autism: Echolalia and echolalic speech

    OpenAIRE

    Błeszyński, Jacek Jarosław

    2013-01-01

    Speech of people with autism is recognised as one of the basic diagnostic, therapeutic and theoretical problems. One of the most common symptoms of autism in children is echolalia, described here as being of different types and severity. This paper presents the results of studies into different levels of echolalia, both in normally developing children and in children diagnosed with autism, discusses the differences between simple echolalia and echolalic speech - which can be considered to b...

  2. Advocate: A Distributed Architecture for Speech-to-Speech Translation

    Science.gov (United States)

    2009-01-01

    tecture, are either wrapped natural-language processing (NLP) components or objects developed from scratch using the architecture’s API. GATE is...framework, we put together a demonstration Arabic-to-English speech translation system using both internally developed (Arabic speech recognition and MT...conditions of our Arabic S2S demonstration system described earlier. Once again, the data size was varied and eighty identical requests were

  3. Dictionaries for text production

    DEFF Research Database (Denmark)

    Fuertes-Olivera, Pedro; Bergenholtz, Henning

    2018-01-01

    Dictionaries for Text Production are information tools that are designed and constructed for helping users to produce (i.e. encode) texts, both oral and written. These can be broadly divided into two groups: (a) specialized text production dictionaries, i.e., dictionaries that only offer a small amount of lexicographic data, most or all of which are typically used in a production situation, e.g. synonym dictionaries, grammar and spelling dictionaries, collocation dictionaries, and concept dictionaries such as the Longman Language Activator, which is advertised as the World’s First Production Dictionary; (b) general text production dictionaries, i.e., dictionaries that offer all or most of the lexicographic data that are typically used in a production situation. A review of existing production dictionaries reveals that there are many specialized text production dictionaries but only a few general...

  4. Research of Features of the Phonetic System of Speech and Identification of Announcers on the Voice

    Directory of Open Access Journals (Sweden)

    Roman Aleksandrovich Vasilyev

    2013-02-01

    Full Text Available In this work, a method for the phonetic analysis of speech is offered: the extraction of a list of elementary speech units, such as separate phonemes, from a continuous stream of informal conversation by a specific announcer. A practical algorithm for identification of the announcer, the process of determining which of a set of announcers is speaking, is described.

  5. Speech and other modalities in the office environment: Some research results

    NARCIS (Netherlands)

    Nes, van F.L.; Bullinger, H.-J.

    1991-01-01

    Research was carried out on the application of speech in three areas of man-computer communication: instruction, voice commands for system control and annotation of documents. As to instruction, learning was found to proceed equally fast with speech and written text; a number of subjects preferred

  6. Oral speech teaching to students of mathematic specialties: a grammatical aspect

    Directory of Open Access Journals (Sweden)

    Ibragimov I.I.

    2016-08-01

    Full Text Available The paper considers features of teaching the grammatical aspects of English speech. The case studies include undergraduates of mathematical specialties. The content of students’ educational activity at the final stage of language teaching is pointed out. In addition, the structure of the grammar section is described, together with a special didactic training unit within which the mastering of grammar phenomena used in oral speech takes place.

  7. Instant Sublime Text starter

    CERN Document Server

    Haughee, Eric

    2013-01-01

    A starter which teaches the basic tasks to be performed with Sublime Text with the necessary practical examples and screenshots. This book requires only basic knowledge of the Internet and basic familiarity with any one of the three major operating systems, Windows, Linux, or Mac OS X. However, as Sublime Text 2 is primarily a text editor for writing software, many of the topics discussed will be specifically relevant to software development. That being said, the Sublime Text 2 Starter is also suitable for someone without a programming background who may be looking to learn one of the tools of

  8. Digitized Ethnic Hate Speech: Understanding Effects of Digital Media Hate Speech on Citizen Journalism in Kenya

    OpenAIRE

    Stephen Gichuhi Kimotho; Rahab Njeri Nyaga

    2016-01-01

    Ethnicity in Kenya permeates all spheres of life. However, it is in politics that ethnicity is most visible. Election time in Kenya often leads to ethnic competition and hatred, often expressed through various media. Ethnic hate speech characterized the 2007 general elections in party rallies and through text messages, emails, posters and leaflets. This resulted in widespread skirmishes that left over 1200 people dead, and many displaced (KNHRC, 2008). In 2013, however, the new battle zone wa...

  9. Semi-Spontaneous Oral Text Production: Measurements in Clinical Practice

    Science.gov (United States)

    Lind, Marianne; Kristoffersen, Kristian Emil; Moen, Inger; Simonsen, Hanne Gram

    2009-01-01

    Functionally relevant assessment of the language production of speakers with aphasia should include assessment of connected speech production. Despite the ecological validity of everyday conversations, more controlled and monological types of texts may be easier to obtain and analyse in clinical practice. This article discusses some simple…

  10. Categorizing Children: Automated Text Classification of CHILDES files

    NARCIS (Netherlands)

    Opsomer, Rob; Knoth, Peter; Wiering, Marco; van Polen, Freek; Trapman, Jantine

    2008-01-01

    In this paper we present the application of machine learning text classification methods to two tasks: categorization of children’s speech in the CHILDES Database according to gender and age. Both tasks are binary. For age, we distinguish two age groups between the age of 1.9 and 3.0 years old. The

  11. Sharing Expository Texts with Preschool Children in Special Education

    Science.gov (United States)

    Breit-Smith, Allison; Busch, Jamie; Guo, Ying

    2015-01-01

    Although expository texts are currently in limited supply in preschool special education classrooms, they offer speech-language pathologists (SLPs) a rich context for addressing the language goals of the preschool children with language impairment on their caseloads. Thus, this article highlights the differences between…

  12. Seimo posėdžių stenogramų tekstynas autorystės nustatymo bei autoriaus profilio sudarymo tyrimams | Corpus of transcribed parliamentary speeches for authorship attribution and author profiling tasks

    Directory of Open Access Journals (Sweden)

    Jurgita Kapočiūtė-Dzikienė

    2014-12-01

    Full Text Available In our paper we present a corpus of transcribed Lithuanian parliamentary speeches. The corpus is prepared in a specific format appropriate for different authorship identification tasks. The corpus consists of approximately 111 thousand texts (24 million words). Each text matches one parliamentary speech produced during an ordinary session in the period of 7 parliamentary terms, starting on March 10, 1990 and ending on December 23, 2013. The texts are grouped into 147 categories corresponding to individual authors, so they can be used for authorship attribution tasks; the texts are also grouped according to age, gender and political views, making them suitable for author profiling tasks as well. Because short texts complicate recognition of an author’s speaking style and are ambiguous relative to the styles of other authors, we incorporated only texts containing not less than 100 words into the corpus. In order to make each category as comprehensive and representative as possible, we included only those authors who produced speeches at least 200 times. All the texts are lemmatized, morphologically and syntactically annotated, and tokenized into character n-grams. The statistical information of the corpus is also available. We have also demonstrated that the created corpus can be effectively used in authorship attribution and author profiling tasks with supervised machine learning methods. The corpus structure also allows using it with unsupervised machine learning methods, and it can be used for the creation of rule-based methods, as well as in different linguistic analyses.
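
    The character n-gram representation mentioned above is the classic workhorse of authorship attribution. A minimal sketch, with invented texts and a nearest-profile rule in place of the supervised learners used on the corpus: build a trigram frequency profile per author, then attribute a new text to the author whose profile is most similar by cosine similarity.

```python
# Character n-gram authorship sketch: trigram frequency profiles per
# author, attribution by cosine similarity to the nearest profile.
# Texts are invented; the corpus work used supervised ML methods.

import math
from collections import Counter

def char_ngrams(text, n=3):
    return Counter(text[i:i + n] for i in range(len(text) - n + 1))

def cosine(p, q):
    common = set(p) & set(q)
    dot = sum(p[g] * q[g] for g in common)
    norm = math.sqrt(sum(v * v for v in p.values())) * \
           math.sqrt(sum(v * v for v in q.values()))
    return dot / norm if norm else 0.0

profiles = {
    "A": char_ngrams("gerbiami kolegos, siulau balsuoti uz projekta"),
    "B": char_ngrams("we must consider the budget before voting"),
}

def attribute(text):
    probe = char_ngrams(text)
    return max(profiles, key=lambda a: cosine(probe, profiles[a]))

print(attribute("siulau balsuoti"))  # closer to author A's trigrams
```

    Character n-grams work well for morphologically rich languages like Lithuanian because they capture inflectional endings without requiring lemmatization.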

  13. Linguistics in Text Interpretation

    DEFF Research Database (Denmark)

    Togeby, Ole

    2011-01-01

    A model for how text interpretation proceeds from what is pronounced, through what is said, to what is communicated, and a definition of the concepts 'presupposition' and 'implicature'.

  14. LocText

    DEFF Research Database (Denmark)

    Cejuela, Juan Miguel; Vinchurkar, Shrikant; Goldberg, Tatyana

    2018-01-01

    trees and was trained and evaluated on a newly improved LocTextCorpus. Combined with an automatic named-entity recognizer, LocText achieved high precision (P = 86%±4). After completing development, we mined the latest research publications for three organisms: human (Homo sapiens), budding yeast...

  15. Systematic text condensation

    DEFF Research Database (Denmark)

    Malterud, Kirsti

    2012-01-01

    To present background, principles, and procedures for a strategy for qualitative analysis called systematic text condensation, and to discuss this approach compared with related strategies.

  16. The Perfect Text.

    Science.gov (United States)

    Russo, Ruth

    1998-01-01

    A chemistry teacher describes the elements of the ideal chemistry textbook. The perfect text is focused and helps students draw a coherent whole out of the myriad fragments of information and interpretation. The text would show chemistry as the central science necessary for understanding other sciences and would also root chemistry firmly in the…

  17. Text 2 Mind Map

    OpenAIRE

    Iona, John

    2017-01-01

    This is a review of the web resource 'Text 2 Mind Map' (www.Text2MindMap.com). It covers what the resource is and how it might be used in library and education contexts, in particular by school librarians.

  18. Text File Comparator

    Science.gov (United States)

    Kotler, R. S.

    1983-01-01

    File Comparator program IFCOMP is a text file comparator for IBM OS/VS-compatible systems. IFCOMP accepts as input two text files and produces a listing of differences in pseudo-update form. IFCOMP is very useful in monitoring changes made to software at the source code level.
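
    A modern stdlib analogue of such a comparator is Python's difflib, which likewise takes two text files and produces a listing of their differences, here in unified-diff rather than pseudo-update form. The sample lines are invented.

```python
# Compare two lists of lines and print a unified diff, the modern
# counterpart of IFCOMP's difference listing. Sample data invented.

import difflib

old = ["alpha\n", "beta\n", "gamma\n"]
new = ["alpha\n", "BETA\n", "gamma\n", "delta\n"]

diff = list(difflib.unified_diff(old, new, fromfile="old.txt", tofile="new.txt"))
print("".join(diff))
```

    For monitoring source-level changes, the same call can be wrapped over `open(path).readlines()` for each revision of a file.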

  19. Speech Mannerisms: Games Clients Play

    Science.gov (United States)

    Morgan, Lewis B.

    1978-01-01

    This article focuses on speech mannerisms often employed by clients in a helping relationship. Eight mannerisms are presented and discussed, as well as possible interpretations. Suggestions are given to help counselors respond to them. (Author)

  20. Speech recognition from spectral dynamics

    Indian Academy of Sciences (India)

    Carrier nature of speech; modulation spectrum; spectral dynamics ... the relationships between phonetic values of sounds and their short-term spectral envelopes .... the number of free parameters that need to be estimated from training data.

  1. Designing speech for a recipient

    DEFF Research Database (Denmark)

    Fischer, Kerstin

    This study asks how speakers adjust their speech to their addressees, focusing on the potential roles of cognitive representations such as partner models, automatic processes such as interactive alignment, and social processes such as interactional negotiation. The nature of addressee orientation…, psycholinguistics and conversation analysis, and offers both overviews of child-directed, foreigner-directed and robot-directed speech and in-depth analyses of the processes involved in adjusting to a communication partner.

  2. National features of speech etiquette

    OpenAIRE

    Nacafova S.

    2017-01-01

    The article shows the differences between the speech etiquette of different peoples. The most important thing is to find a common language with a given interlocutor. Knowledge of national etiquette and national character helps one learn the principles of speech of another nation. The article indicates in which cases certain forms of etiquette are considered acceptable. At the same time, the rules of etiquette are emphasized in the conduct of a dialogue in official meetings and, for example, in the ex...

  3. Censored: Whistleblowers and impossible speech

    OpenAIRE

    Kenny, Kate

    2017-01-01

    What happens to a person who speaks out about corruption in their organization, and finds themselves excluded from their profession? In this article, I argue that whistleblowers experience exclusions because they have engaged in ‘impossible speech’, that is, a speech act considered to be unacceptable or illegitimate. Drawing on Butler’s theories of recognition and censorship, I show how norms of acceptable speech working through recruitment practices, alongside the actions of colleagues, can ...

  4. Speech Enhancement of Mobile Devices Based on the Integration of a Dual Microphone Array and a Background Noise Elimination Algorithm

    Directory of Open Access Journals (Sweden)

    Yung-Yue Chen

    2018-05-01

    Full Text Available Mobile devices are often used in our daily lives for the purposes of speech and communication. The speech quality of mobile devices is always degraded by the environmental noises surrounding mobile device users. Regrettably, an effective background noise reduction solution cannot easily be developed for this speech enhancement problem. For these reasons, a methodology is systematically proposed to eliminate the effects of background noises on the speech communication of mobile devices. This methodology integrates a dual microphone array with a background noise elimination algorithm. The proposed background noise elimination algorithm includes a whitening process, a speech modelling method and an H2 estimator. Due to the adoption of the dual microphone array, a low-cost design can be obtained for the speech enhancement of mobile devices. Practical tests have proven that the proposed method is immune to random background noises, and noiseless speech can be obtained after executing this denoising process.
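
    The whitening step named in the algorithm above can be caricatured by a first-order pre-emphasis filter, which flattens the spectral tilt of a signal. This is only a stand-in sketch: the paper's pipeline also involves speech modelling and an H2 estimator, which this example omits entirely, and the filter coefficient is a conventional assumption rather than the paper's value.

```python
# Toy whitening step: first-order pre-emphasis, y[n] = x[n] - a*x[n-1].
# A stand-in for the whitening process only; the full algorithm also
# applies speech modelling and an H2 estimator, omitted here.

def pre_emphasis(samples, alpha=0.97):
    out = [samples[0]]
    for n in range(1, len(samples)):
        out.append(samples[n] - alpha * samples[n - 1])
    return out

x = [1.0, 1.0, 1.0, 1.0]   # a flat (DC-heavy) toy signal
print(pre_emphasis(x))      # DC content largely removed after sample 0
```

    Decorrelating the input this way simplifies the statistics that downstream estimators must model, which is the usual motivation for a whitening stage.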

  5. Zum Bildungspotenzial biblischer Texte

    Directory of Open Access Journals (Sweden)

    Theis, Joachim

    2017-11-01

    Full Text Available Biblical education as a holistic process goes far beyond biblical learning. It must be understood as a lifelong process in which both biblical texts and those who understand them operate, appropriating their counterpart in a dialogical way. Neither is the recipient's horizon of understanding an empty room that only has to be filled with the text, nor is the text dead material that could only be examined cognitively. The recipient discovers the meaning of the biblical text by recomposing it through existential appropriation. Thus the text is brought to life in each individual reality. Both scientific insights and subjective structures, as well as the community of understanders, must be included to avoid potential one-sidedness. Unfortunately, a particular negative association often obscures the approach to the Bible: biblical work as part of religious education still appears in a cognitively oriented habit which regards neither the vitality and sovereignty of the biblical texts nor the students' desire for meaning. Moreover, the Bible is misused for teaching moral terms or pontifications. Such downfalls can be disrupted by a biblical didactics that is a didactics of empowerment. Respecting the sovereignty of biblical texts, such didactics assist the understanders in their individuation by opening the texts with a focus on the understanders' otherness. Thus the text and the recipient each become subjects in a dialogue. The approach of the Biblical-Enabling-Didactics allows the Bible to become, ever anew, a book of life. Understood from within their hermeneutics, empowerment didactics could be raised to the principle of biblical didactics in general and grow into an essential element of holistic education.

  6. Speech Function and Speech Role in Carl Fredricksen's Dialogue on Up Movie

    OpenAIRE

    Rehana, Ridha; Silitonga, Sortha

    2013-01-01

    One aim of this article is to show, through a concrete example, how speech function and speech role are used in a movie. The illustrative example is taken from the dialogue of the movie Up. Central to the analysis is the form of dialogue in Up that contains speech functions and speech roles, i.e. statement, offer, question, command, giving, and demanding. 269 dialogue lines delivered by the actors were interpreted, and the uses of speech function and speech role were identified.

  7. Speech Act Theory and the Concept of Intention in Literary Criticism

    OpenAIRE

    García Landa, José Angel

    2011-01-01

    The aim of this paper is to trace the outline of a speech act theory of literature, taking into account the work of critics who react against the prevailing anti-intentionalist schools of criticism, such as the New Criticism, some versions of structuralism, and deconstruction. The intentionalist critics prepare the ground for a theory of literary discourse considered as a speech act, since it is known that the concept of intention is central to the analysis of speech acts. Such a theory of li...

  8. EST: Evading Scientific Text.

    Science.gov (United States)

    Ward, Jeremy

    2001-01-01

    Examines chemical engineering students' attitudes to text and other parts of English language textbooks. A questionnaire was administered to a group of undergraduates. Results reveal one way students get around the problem of textbook reading. (Author/VWL)

  9. nal Sesotho texts

    African Journals Online (AJOL)

    with literary texts written in indigenous South African languages. The project ... Homi Bhabha uses the words of Salman Rushdie to underline the fact that new .... I could not conceptualise an African-language-to-African-language dictionary. An.

  10. Plagiarism in Academic Texts

    Directory of Open Access Journals (Sweden)

    Marta Eugenia Rojas-Porras

    2012-08-01

    Full Text Available The ethical and social responsibility of citing the sources in a scientific or artistic work is undeniable. This paper explores, in a preliminary way, academic plagiarism in its various forms. It includes findings based on a forensic analysis. The purpose of this paper is to raise awareness on the importance of considering these details when writing and publishing a text. Hopefully, this analysis may put the issue under discussion.

  11. Exploring Australian speech-language pathologists' use and perceptions ofnon-speech oral motor exercises.

    Science.gov (United States)

    Rumbach, Anna F; Rose, Tanya A; Cheah, Mynn

    2018-01-29

    To explore Australian speech-language pathologists' use of non-speech oral motor exercises, and rationales for using/not using non-speech oral motor exercises in clinical practice. A total of 124 speech-language pathologists practising in Australia, working with paediatric and/or adult clients with speech sound difficulties, completed an online survey. The majority of speech-language pathologists reported that they did not use non-speech oral motor exercises when working with paediatric or adult clients with speech sound difficulties. However, more than half of the speech-language pathologists working with adult clients who have dysarthria reported using non-speech oral motor exercises with this population. The most frequently reported rationale for using non-speech oral motor exercises in speech sound difficulty management was to improve awareness/placement of articulators. The majority of speech-language pathologists agreed there is no clear clinical or research evidence base to support non-speech oral motor exercise use with clients who have speech sound difficulties. This study provides an overview of Australian speech-language pathologists' reported use and perceptions of non-speech oral motor exercises' applicability and efficacy in treating paediatric and adult clients who have speech sound difficulties. The research findings provide speech-language pathologists with insight into how and why non-speech oral motor exercises are currently used, and adds to the knowledge base regarding Australian speech-language pathology practice of non-speech oral motor exercises in the treatment of speech sound difficulties. Implications for Rehabilitation Non-speech oral motor exercises refer to oral motor activities which do not involve speech, but involve the manipulation or stimulation of oral structures including the lips, tongue, jaw, and soft palate. Non-speech oral motor exercises are intended to improve the function (e.g., movement, strength) of oral structures. 

  12. Audiovisual Integration of Speech in a Patient with Broca’s Aphasia

    Directory of Open Access Journals (Sweden)

    Tobias Søren Andersen

    2015-04-01

    Full Text Available Lesions to Broca’s area cause aphasia characterised by a severe impairment of the ability to speak, with comparatively intact speech perception. However, some studies have found effects on speech perception under adverse listening conditions, indicating that Broca’s area is also involved in speech perception. While these studies have focused on auditory speech perception other studies have shown that Broca’s area is activated by visual speech perception. Furthermore, one preliminary report found that a patient with Broca’s aphasia did not experience the McGurk illusion suggesting that an intact Broca’s area is necessary for audiovisual integration of speech. Here we describe a patient with Broca’s aphasia who experienced the McGurk illusion. This indicates that an intact Broca’s area is not necessary for audiovisual integration of speech. The McGurk illusions this patient experienced were atypical, which could be due to Broca’s area having a more subtle role in audiovisual integration of speech. The McGurk illusions of a control subject with Wernicke’s aphasia were, however, also atypical. This indicates that the atypical McGurk illusions were due to deficits in speech processing that are not specific to Broca’s aphasia.

  13. Studies of Speech Disorders in Schizophrenia. History and State-of-the-art

    Directory of Open Access Journals (Sweden)

    Shedovskiy E. F.

    2015-08-01

    Full Text Available The article reviews studies of speech disorders in schizophrenia. The authors trace the historical course and character of several areas of study: the properly psychopathological (speech disorders as psychopathological symptoms, their description and taxonomy), the psychological (in which neuro- and pathopsychological perspectives are distinguished), and, analyzed separately, some modern foreign works covering a variety of approaches to the study of speech disorders in endogenous mental disorders. Disorders and features of speech are among the most striking manifestations of schizophrenia, along with impaired thinking (Savitskaya A. V., Mikirtumov B. E.). With all the variety of symptoms, speech disorders in schizophrenia can be classified and organized. The few clinical psychological studies of speech activity in schizophrenia include work on the generation of standard speech utterances, features of the verbal associative process, and speed parameters of speech utterances. Special attention is given to integrated research in the mainstream of biological psychiatry and genetic trends. It is shown that, over more than a half-century of study, the originality of speech pathology in schizophrenia has received some coverage in the psychiatric and psychological literature and continues to generate interest within the modern integrated multidisciplinary approach.

  14. Beyond production: Brain responses during speech perception in adults who stutter

    Directory of Open Access Journals (Sweden)

    Tali Halag-Milo

    2016-01-01

    Full Text Available Developmental stuttering is a speech disorder that disrupts the ability to produce speech fluently. While stuttering is typically diagnosed based on one's behavior during speech production, some models suggest that it involves more central representations of language, and thus may affect language perception as well. Here we tested the hypothesis that developmental stuttering implicates neural systems involved in language perception, in a task that manipulates comprehensibility without an overt speech production component. We used functional magnetic resonance imaging to measure blood oxygenation level dependent (BOLD) signals in adults who do and do not stutter, while they were engaged in an incidental speech perception task. We found that speech perception evokes stronger activation in adults who stutter (AWS) compared to controls, specifically in the right inferior frontal gyrus (RIFG) and in left Heschl's gyrus (LHG). Significant differences were additionally found in the lateralization of response in the inferior frontal cortex: AWS showed bilateral inferior frontal activity, while controls showed a left-lateralized pattern of activation. These findings suggest that developmental stuttering is associated with an imbalanced neural network for speech processing, which is not limited to speech production, but also affects cortical responses during speech perception.

  15. The neural processing of foreign-accented speech and its relationship to listener bias

    Directory of Open Access Journals (Sweden)

    Han-Gyol eYi

    2014-10-01

    Full Text Available Foreign-accented speech often presents a challenging listening condition. In addition to deviations from the target speech norms related to the inexperience of the nonnative speaker, listener characteristics may play a role in determining intelligibility levels. We have previously shown that an implicit visual bias for associating East Asian faces and foreignness predicts the listeners’ perceptual ability to process Korean-accented English audiovisual speech (Yi et al., 2013). Here, we examine the neural mechanism underlying the influence of listener bias to foreign faces on speech perception. In a functional magnetic resonance imaging (fMRI) study, native English speakers listened to native- and Korean-accented English sentences, with or without faces. The participants’ Asian-foreign association was measured using an implicit association test (IAT), conducted outside the scanner. We found that foreign-accented speech evoked greater activity in the bilateral primary auditory cortices and the inferior frontal gyri, potentially reflecting greater computational demand. Higher IAT scores, indicating greater bias, were associated with increased BOLD response to foreign-accented speech with faces in the primary auditory cortex, the early node for spectrotemporal analysis. We conclude the following: (1) foreign-accented speech perception places greater demand on the neural systems underlying speech perception; (2) the face of the talker can exaggerate the perceived foreignness of foreign-accented speech; (3) implicit Asian-foreign association is associated with decreased neural efficiency in early spectrotemporal processing.

  16. Real-Time Control of an Articulatory-Based Speech Synthesizer for Brain Computer Interfaces.

    Directory of Open Access Journals (Sweden)

    Florent Bocquelet

    2016-11-01

    Full Text Available Restoring natural speech in paralyzed and aphasic people could be achieved using a Brain-Computer Interface (BCI) controlling a speech synthesizer in real-time. To reach this goal, a prerequisite is to develop a speech synthesizer producing intelligible speech in real-time with a reasonable number of control parameters. We present here an articulatory-based speech synthesizer that can be controlled in real-time for future BCI applications. This synthesizer converts movements of the main speech articulators (tongue, jaw, velum, and lips) into intelligible speech. The articulatory-to-acoustic mapping is performed using a deep neural network (DNN) trained on electromagnetic articulography (EMA) data recorded on a reference speaker synchronously with the produced speech signal. This DNN is then used in both offline and online modes to map the position of sensors glued on different speech articulators into acoustic parameters that are further converted into an audio signal using a vocoder. In offline mode, highly intelligible speech could be obtained, as assessed by perceptual evaluation performed by 12 listeners. Then, to anticipate future BCI applications, we further assessed the real-time control of the synthesizer by both the reference speaker and new speakers, in a closed-loop paradigm using EMA data recorded in real time. A short calibration period was used to compensate for differences in sensor positions and articulatory differences between new speakers and the reference speaker. We found that real-time synthesis of vowels and consonants was possible with good intelligibility. In conclusion, these results open to future speech BCI applications using such an articulatory-based speech synthesizer.
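    The articulatory-to-acoustic mapping in this record is a DNN regression. A toy NumPy sketch of such a regression network is below; the synthetic inputs stand in for EMA sensor coordinates and the targets for vocoder parameters, and the network size, learning rate, and data are all invented for illustration, not taken from the paper.

```python
import numpy as np

rng = np.random.default_rng(1)

def train_regression_mlp(X, Y, hidden=16, lr=0.05, epochs=300):
    """Train a one-hidden-layer tanh network by batch gradient descent.
    Returns the per-epoch mean-squared-error losses."""
    W1 = 0.1 * rng.standard_normal((X.shape[1], hidden)); b1 = np.zeros(hidden)
    W2 = 0.1 * rng.standard_normal((hidden, Y.shape[1])); b2 = np.zeros(Y.shape[1])
    losses = []
    for _ in range(epochs):
        H = np.tanh(X @ W1 + b1)          # hidden layer
        P = H @ W2 + b2                   # predicted "acoustic parameters"
        err = P - Y
        losses.append(float(np.mean(err ** 2)))
        # backpropagation of the mean-squared error
        gW2 = H.T @ err / len(X); gb2 = err.mean(axis=0)
        gH = (err @ W2.T) * (1.0 - H ** 2)
        gW1 = X.T @ gH / len(X); gb1 = gH.mean(axis=0)
        W1 -= lr * gW1; b1 -= lr * gb1
        W2 -= lr * gW2; b2 -= lr * gb2
    return losses

# Synthetic stand-in data: 4 "sensor coordinates" -> 2 "acoustic parameters".
X = rng.standard_normal((200, 4))
Y = np.column_stack([np.tanh(X[:, 0] + X[:, 1]), 0.5 * X[:, 2]])
losses = train_regression_mlp(X, Y)
```

    The training loss falls steadily on this synthetic mapping; a real system would add more layers, minibatching, and a vocoder on the output side.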

  17. Look Who’s Talking NOW! Parentese Speech, Social Context, and Language Development Across Time

    Directory of Open Access Journals (Sweden)

    Nairán Ramírez-Esparza

    2017-06-01

    Full Text Available In previous studies, we found that the social interactions infants experience in their everyday lives at 11 and 14 months of age affect language ability at 24 months of age. These studies investigated relationships between the speech style (i.e., parentese speech vs. standard speech) and social context [i.e., one-on-one (1:1) vs. group] of language input in infancy and later speech development (i.e., at 24 months of age), controlling for socioeconomic status (SES). Results showed that the amount of exposure to parentese speech-1:1 in infancy was related to productive vocabulary at 24 months. The general goal of the present study was to investigate changes in (1) the pattern of social interactions between caregivers and their children from infancy to childhood and (2) relationships among speech style, social context, and language learning across time. Our study sample consisted of 30 participants from the previously published infant studies, evaluated at 33 months of age. Social interactions were assessed at home using digital first-person perspective recordings of the auditory environment. We found that caregivers use less parentese speech-1:1, and more standard speech-1:1, as their children get older. Furthermore, we found that the effects of parentese speech-1:1 in infancy on later language development at 24 months persist at 33 months of age. Finally, we found that exposure to standard speech-1:1 in childhood was the only social interaction that related to concurrent word production/use. Mediation analyses showed that standard speech-1:1 in childhood fully mediated the effects of parentese speech-1:1 in infancy on language development in childhood, controlling for SES. This study demonstrates that engaging in one-on-one interactions in infancy and later in life has important implications for language development.
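    The mediation analysis reported in this record follows the usual logic: the indirect effect is the drop in the predictor's coefficient once the mediator enters the regression. Below is a generic OLS sketch of that decomposition on made-up data, not the authors' actual statistical pipeline (which also controlled for SES).

```python
import numpy as np

def mediation_effects(x, m, y):
    """Simple two-regression mediation decomposition.

    total    : coefficient of x in  y ~ 1 + x
    direct   : coefficient of x in  y ~ 1 + x + m
    indirect : total - direct (the part carried by the mediator m)
    """
    ones = np.ones_like(x)
    total = np.linalg.lstsq(np.column_stack([ones, x]), y, rcond=None)[0][1]
    coefs = np.linalg.lstsq(np.column_stack([ones, x, m]), y, rcond=None)[0]
    return {"total": total, "direct": coefs[1], "indirect": total - coefs[1]}

# Hypothetical fully mediated data: x influences y only through m.
rng = np.random.default_rng(2)
x = rng.standard_normal(1000)
m = 2.0 * x + 0.5 * rng.standard_normal(1000)
y = 3.0 * m + 0.5 * rng.standard_normal(1000)
effects = mediation_effects(x, m, y)
```

    Here the total effect (about 6) is almost entirely indirect and the direct effect shrinks toward zero, which is the signature of full mediation.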

  18. Novel Techniques for Dialectal Arabic Speech Recognition

    CERN Document Server

    Elmahdy, Mohamed; Minker, Wolfgang

    2012-01-01

    Novel Techniques for Dialectal Arabic Speech Recognition describes approaches to improve automatic speech recognition for dialectal Arabic. Since speech resources for dialectal Arabic speech recognition are very sparse, the authors describe how existing Modern Standard Arabic (MSA) speech data can be applied to dialectal Arabic speech recognition, while assuming that MSA is always a second language for all Arabic speakers. In this book, Egyptian Colloquial Arabic (ECA) has been chosen as a typical Arabic dialect. ECA is the first-ranked Arabic dialect in terms of number of speakers, and a high-quality ECA speech corpus with accurate phonetic transcription has been collected. MSA acoustic models were trained using news broadcast speech. In order to cross-lingually use MSA in dialectal Arabic speech recognition, the authors have normalized the phoneme sets for MSA and ECA. After this normalization, they have applied state-of-the-art acoustic model adaptation techniques like Maximum Likelihood Linear Regression (MLLR) and M...

  19. Experimental comparison between speech transmission index, rapid speech transmission index, and speech intelligibility index.

    Science.gov (United States)

    Larm, Petra; Hongisto, Valtteri

    2006-02-01

    During the acoustical design of, e.g., auditoria or open-plan offices, it is important to know how speech can be perceived in various parts of the room. Different objective methods have been developed to measure and predict speech intelligibility, and these have been extensively used in various spaces. In this study, two such methods were compared, the speech transmission index (STI) and the speech intelligibility index (SII). Also the simplification of the STI, the room acoustics speech transmission index (RASTI), was considered. These quantities are all based on determining an apparent speech-to-noise ratio on selected frequency bands and summing them using a specific weighting. For comparison, some data were needed on the possible differences of these methods resulting from the calculation scheme and also measuring equipment. Their prediction accuracy was also of interest. Measurements were made in a laboratory having adjustable noise level and absorption, and in a real auditorium. It was found that the measurement equipment, especially the selection of the loudspeaker, can greatly affect the accuracy of the results. The prediction accuracy of the RASTI was found acceptable, if the input values for the prediction are accurately known, even though the studied space was not ideally diffuse.
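    As the abstract above notes, STI, RASTI and SII differ in detail but share one recipe: determine an apparent speech-to-noise ratio per frequency band, clip it, normalise it, and combine the bands with a frequency weighting. A minimal sketch of that shared core follows. The ±15 dB clipping range is the convention these indices use, but the band levels and importance weights below are placeholders; real use would take both from the relevant standard.

```python
def weighted_speech_index(speech_db, noise_db, weights):
    """Core computation shared by STI/RASTI/SII-style indices:
    per-band apparent SNR in dB, clipped to [-15, +15] dB, mapped
    linearly to [0, 1], then averaged with band-importance weights."""
    assert len(speech_db) == len(noise_db) == len(weights)
    total = 0.0
    for s, n, w in zip(speech_db, noise_db, weights):
        snr = max(-15.0, min(15.0, s - n))   # clip the apparent SNR
        total += w * (snr + 15.0) / 30.0     # normalise each band to 0..1
    return total / sum(weights)

# Placeholder octave-band levels (dB) and illustrative weights.
speech = [62.0, 60.0, 55.0, 50.0]
noise = [50.0, 52.0, 53.0, 60.0]
index = weighted_speech_index(speech, noise, [1.0, 2.0, 2.0, 1.0])
```

    Equal speech and noise levels in every band give 0.5; a band 15 dB or more above the noise contributes fully, and one 15 dB or more below contributes nothing.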

  20. Upregulation of cognitive control networks in older adults’ speech comprehension

    Directory of Open Access Journals (Sweden)

    Julia eErb

    2013-12-01

    Full Text Available Speech comprehension abilities decline with age and with age-related hearing loss, but it is unclear how this decline expresses in terms of central neural mechanisms. The current study examined neural speech processing in a group of older adults (aged 56–77, n=16) with varying degrees of sensorineural hearing loss, and compared them to a cohort of young adults (aged 22–31, n=30) with self-reported normal hearing. In an fMRI experiment, listeners heard and repeated back degraded sentences (4-band vocoding), which preserves the temporal envelope of the acoustic signal while substantially degrading spectral information. Behaviourally, older adults adapted to degraded speech at the same rate as young listeners, although their overall comprehension of degraded speech was lower. Neurally, both older and young adults relied on the left anterior insula for degraded more than clear speech perception. However, anterior insula engagement in older adults was dependent on hearing acuity. Young adults additionally employed the anterior cingulate cortex (ACC). Interestingly, this age group × degradation interaction was driven by a reduced dynamic range in older adults, who displayed elevated levels of ACC activity in both conditions, consistent with a persistent upregulation in cognitive control irrespective of task difficulty. For correct speech comprehension, older adults recruited the middle frontal gyrus in addition to a core speech comprehension network on which young adults relied, suggestive of a compensatory mechanism. Taken together, the results indicate that older adults increasingly recruit cognitive control networks, even under optimal listening conditions, at the expense of these systems’ dynamic range.

  1. Delayed speech development in children: Introduction to terminology

    Directory of Open Access Journals (Sweden)

    M. Yu. Bobylova

    2017-01-01

    Full Text Available There has recently been an increase in the number of children diagnosed with delayed speech development. The delay is compensated with age, but mild deficiency often remains for life. Delayed speech development is more common in boys than in girls. Its etiology is unknown in most cases, so a child should be followed up to make an accurate diagnosis. Genetic predisposition or environmental factors frequently influence speech development. The course of these delays varies. In the history of a number of disorders (childhood disintegrative disorder, Landau–Kleffner syndrome), there is evidence of normal speech development up to a certain period, after which it stops or even regresses. By way of comparison, in autism speech development is generally altered even during the preverbal stage (the revival complex fails to form; babbling is poor, low in emotion, and gibberish-like); at the same time, the baby reproduces whole phrases without using them to communicate. These speech disorders are considered not only a delay, but also a developmental abnormality. Speech disorders in children should be diagnosed as early as possible in order to initiate corrective measures in time. In this case, a physician makes the diagnosis and a special education teacher does the corrective work. The successful collaboration and mutual understanding of the specialists in these areas will determine the child's future quality of life. This paper focuses on the terminology and classification of delays, which are necessary for physicians and teachers to speak the same language.

  2. TEXT Energy Storage System

    International Nuclear Information System (INIS)

    Weldon, W.F.; Rylander, H.G.; Woodson, H.H.

    1977-01-01

    The Texas Experimental Tokamak (TEXT) Energy Storage System, designed by the Center for Electromechanics (CEM), consists of four 50 MJ, 125 V homopolar generators and their auxiliaries and is designed to power the toroidal and poloidal field coils of TEXT on a two-minute duty cycle. The four 50 MJ generators connected in series were chosen because they represent the minimum-cost configuration and also a minimal scale-up from the successful 5.0 MJ homopolar generator designed, built, and operated by the CEM.

  3. Speech Telepractice: Installing a Speech Therapy Upgrade for the 21st Century

    Directory of Open Access Journals (Sweden)

    Michael P. Towey

    2012-12-01

    Full Text Available Much of speech therapy involves the clinician guiding the therapeutic process (e.g., presenting stimuli and eliciting client responses; however, this Brief Communication describes a different approach to speech therapy delivery. Clinicians at Waldo County General Hospital (WCGH use high definition audio and video to engage clients in telepractice using interactive web-based virtual environments. This technology enables clients and their clinicians to co-create salient treatment activities using authentic materials captured via digital cameras, video and/or curricular materials.  Both therapists and clients manipulate the materials and interact online in real-time. The web-based technology engenders highly personalized and engaging activities, such that clients’ interactions with these high interest tasks often continue well beyond the therapy sessions.

  4. New mathematical cuneiform texts

    CERN Document Server

    Friberg, Jöran

    2016-01-01

    This monograph presents in great detail a large number of both unpublished and previously published Babylonian mathematical texts in the cuneiform script. It is a continuation of the work A Remarkable Collection of Babylonian Mathematical Texts (Springer 2007) written by Jöran Friberg, the leading expert on Babylonian mathematics. Focussing on the big picture, Friberg explores in this book several Late Babylonian arithmetical and metro-mathematical table texts from the sites of Babylon, Uruk and Sippar, collections of mathematical exercises from four Old Babylonian sites, as well as a new text from Early Dynastic/Early Sargonic Umma, which is the oldest known collection of mathematical exercises. A table of reciprocals from the end of the third millennium BC, differing radically from well-documented but younger tables of reciprocals from the Neo-Sumerian and Old-Babylonian periods, as well as a fragment of a Neo-Sumerian clay tablet showing a new type of a labyrinth are also discussed. The material is presen...

  5. The Emar Lexical Texts

    NARCIS (Netherlands)

    Gantzert, Merijn

    2011-01-01

    This four-part work provides a philological analysis and a theoretical interpretation of the cuneiform lexical texts found in the Late Bronze Age city of Emar, in present-day Syria. These word and sign lists, commonly dated to around 1100 BC, were almost all found in the archive of a single school.

  6. Text Induced Spelling Correction

    NARCIS (Netherlands)

    Reynaert, M.W.C.

    2004-01-01

    We present TISC, a language-independent and context-sensitive spelling checking and correction system designed to facilitate the automatic removal of non-word spelling errors in large corpora. Its lexicon is derived from a very large corpus of raw text, without supervision, and contains word
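    The record describes a lexicon derived, without supervision, from a very large raw-text corpus. TISC's own candidate-generation machinery is not shown in this snippet, so the sketch below is a deliberately simplified stand-in: it pairs a corpus-derived frequency lexicon with a plain edit-distance-1 corrector.

```python
from collections import Counter

def build_lexicon(corpus_text):
    """Derive a word-frequency lexicon from raw text, unsupervised."""
    return Counter(corpus_text.lower().split())

def edits1(word):
    """All strings one edit away: deletions, transpositions,
    substitutions and insertions of a single letter."""
    letters = "abcdefghijklmnopqrstuvwxyz"
    splits = [(word[:i], word[i:]) for i in range(len(word) + 1)]
    deletes = [a + b[1:] for a, b in splits if b]
    transposes = [a + b[1] + b[0] + b[2:] for a, b in splits if len(b) > 1]
    replaces = [a + c + b[1:] for a, b in splits if b for c in letters]
    inserts = [a + c + b for a, b in splits for c in letters]
    return set(deletes + transposes + replaces + inserts)

def correct(word, lexicon):
    """Return the most frequent in-lexicon candidate within one edit,
    or the word itself if no candidate is found."""
    if word in lexicon:
        return word
    candidates = [w for w in edits1(word) if w in lexicon]
    return max(candidates, key=lexicon.get) if candidates else word

# Tiny hypothetical corpus standing in for the "very large corpus of raw text".
lexicon = build_lexicon("the corpus contains speech and text and more speech text")
```

    With a corpus of realistic size, word frequencies let the corrector rank competing candidates; in-lexicon words and unrepairable non-words pass through unchanged.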

  7. Texts and Readers.

    Science.gov (United States)

    Iser, Wolfgang

    1980-01-01

    Notes that, since fictional discourse need not reflect prevailing systems of meaning and norms or values, readers gain detachment from their own presuppositions; by constituting and formulating text-sense, readers are constituting and formulating their own cognition and becoming aware of the operations for doing so. (FL)

  8. Documents and legal texts

    International Nuclear Information System (INIS)

    2017-01-01

    This section treats of the following documents and legal texts: 1 - Belgium 29 June 2014 - Act amending the Act of 22 July 1985 on Third-Party Liability in the Field of Nuclear Energy; 2 - Belgium, 7 December 2016. - Act amending the Act of 22 July 1985 on Third-Party Liability in the Field of Nuclear Energy

  9. PUBLIC SERVICE ADVERTISING: AN ANALYSIS ON TEXT AND SEMIOTICS

    Directory of Open Access Journals (Sweden)

    Ni Wayan Sukarini

    2012-07-01

    Full Text Available This study concerns text and semiotic analysis of the use of language in public service advertising (PSA). The PSA in this study is text specifically on health. Three problems are analysed in this research, namely: (1) the grammatical structure and lexis of the text; (2) the relationship of the trichotomy (representamen, object, and interpretant) with the three components of sign in the nonverbal aspect; and (3) the ideologies and messages conveyed in the verbal and nonverbal signs. Three methods were applied in this research: descriptive, qualitative, and interpretative. The data were written texts taken from printed media in the forms of posters and brochures, collected through five procedures: clipping, numbering, coding, picturing, and documenting. As a scientific work, a number of theories had to be applied for the analysis. The relevant theories are semantics, semiotics, speech act, hermeneutics, language function, and text structure. These six theories were applied eclectically in analysing the grammatical structure, lexis, signs, and the structure of the texts in order to elaborate the meaning, ideology, and message conveyed through the PSA texts. The results of the analysis showed that the grammatical structures applied in the health PSA could be classified as simple structures in the forms of phrases, clauses, and sentences. Verbs dominated in initial position to express imperative meaning while still remaining persuasive. The kinds of lexis found were very close to disease, reproduction, and health, including both general terms, for example victims and medicine, and specific ones like HIV/AIDS, Odha, perinatal, nifas, jampersal, and sadari. In the nonverbal aspect, the relationship of the trichotomy with the three sign components is most realistic in the Object with its three subcomponents. 
Triadic relationship of three sub

  11. Functional lateralization of speech processing in adults and children who stutter

    Directory of Open Access Journals (Sweden)

    Yutaka eSato

    2011-04-01

    Full Text Available Developmental stuttering is a disorder of speech fluency characterized by repetitions, prolongations, and silent blocks, especially in the initial parts of utterances. Although their symptoms are motor related, people who stutter show abnormal patterns of cerebral hemispheric dominance in both anterior and posterior language areas. It is unknown whether the abnormal functional lateralization in the posterior language area starts during childhood or emerges as a consequence of many years of stuttering. In order to address this issue, we measured the lateralization of hemodynamic responses in the auditory cortex during auditory speech processing in adults and children who stutter, including preschoolers, with near-infrared spectroscopy (NIRS). We used the analysis-resynthesis technique to prepare two types of stimuli: (i) a phonemic contrast embedded in Japanese spoken words (/itta/ vs. /itte/) and (ii) a prosodic contrast (/itta/ vs. /itta?/). In the baseline blocks, only /itta/ tokens were presented. In phonemic contrast blocks, /itta/ and /itte/ tokens were presented pseudo-randomly, and /itta/ and /itta?/ tokens in prosodic contrast blocks. In adults and children who do not stutter, there was a clear left-hemispheric advantage for the phonemic contrast compared to the prosodic contrast. Adults and children who stutter, however, showed no significant difference between the two stimulus conditions. A subject-by-subject analysis revealed that not a single subject who stutters showed a left advantage in the phonemic contrast over the prosodic contrast condition. These results indicate that the functional lateralization for auditory speech processing is in disarray among those who stutter, even at preschool age. These results shed light on the neural pathophysiology of developmental stuttering.

  12. Recognition of Speech of Normal-hearing Individuals with Tinnitus and Hyperacusis

    Directory of Open Access Journals (Sweden)

    Hennig, Tais Regina

    2011-01-01

    Full Text Available Introduction: Tinnitus and hyperacusis are increasingly frequent audiological symptoms that may occur in the absence of hearing impairment, which does not make their impact on, or annoyance to, affected individuals any smaller. The medial olivocochlear system assists speech recognition in noise and may be connected to the presence of tinnitus and hyperacusis. Objective: To evaluate the speech recognition of normal-hearing individuals with and without complaints of tinnitus and hyperacusis, and to compare their results. Method: A descriptive, prospective, cross-sectional study in which 19 normal-hearing individuals with complaints of tinnitus and hyperacusis formed the Study Group (SG) and 23 normal-hearing individuals without audiological complaints formed the Control Group (CG). Individuals in both groups were given the List of Sentences in Portuguese test, developed by Costa (1998), to determine the Sentence Recognition Threshold in Silence (LRSS) and the signal-to-noise (S/N) ratio. The SG also answered the Tinnitus Handicap Inventory for tinnitus analysis, and discomfort thresholds were measured to characterize hyperacusis. Results: The CG and SG presented mean LRSS and S/N ratios of 7.34 dB HL and -6.77 dB, and of 7.20 dB HL and -4.89 dB, respectively. Conclusion: Normal-hearing individuals with and without audiological complaints of tinnitus and hyperacusis performed similarly in speech recognition in silence, but not in the presence of competing noise, where the SG showed poorer performance, with a statistically significant difference.

  13. Effects of hearing loss on speech recognition under distracting conditions and working memory in the elderly

    Directory of Open Access Journals (Sweden)

    Na W

    2017-08-01

    Full Text Available Wondo Na,1 Gibbeum Kim,1 Gungu Kim,1 Woojae Han,2 Jinsook Kim2 1Department of Speech Pathology and Audiology, Graduate School, 2Division of Speech Pathology and Audiology, Research Institute of Audiology and Speech Pathology, College of Natural Sciences, Hallym University, Chuncheon, Republic of Korea Purpose: The current study aimed to evaluate hearing-related changes in speech-in-noise processing, fast-rate speech processing, and working memory, and to identify which of these three factors is significantly affected by age-related hearing loss. Methods: One hundred subjects aged 65–84 years participated in the study. They were classified into four groups ranging from normal hearing to moderate-to-severe hearing loss. All participants were tested on speech perception in quiet and in noise, and on speech perception with time-compressed speech in quiet. Forward- and backward-digit span tests were also conducted to measure the participants' working memory. Results: (1) As the level of background noise increased, speech perception scores systematically decreased in all groups; this pattern was more pronounced in the three hearing-impaired groups than in the normal-hearing group. (2) As the speech rate increased, speech perception scores decreased, with a significant interaction between speech rate and hearing loss; in particular, sentences compressed by 30% clearly differentiated moderate from moderate-to-severe hearing loss. (3) Although all groups showed a longer span on the forward- than the backward-digit span test, there was no significant difference as a function of hearing loss. Conclusion: The degree of hearing loss strongly affects the recognition of babble-masked and time-compressed speech in the elderly but does not affect working memory. We expect these results to inform appropriate rehabilitation strategies for hearing

  14. Speech control interface for Eurocontrol’s LINK2000+ system

    Directory of Open Access Journals (Sweden)

    Dan-Cristian ION

    2012-06-01

    Full Text Available This paper continues the authors' recent research on the use of speech recognition in air traffic control. It proposes a voice control interface for Eurocontrol's LINK2000+ system, offering an alternative means to improve air transport safety and efficiency.

  15. A Neural Network Based Dutch Part of Speech Tagger

    NARCIS (Netherlands)

    Boschman, E.; op den Akker, Hendrikus J.A.; Nijholt, Antinus; Pantic, Maja; Poel, Mannes; Hondorp, G.H.W.

    2008-01-01

    In this paper a Neural Network is designed for Part-of-Speech Tagging of Dutch text. Our approach uses the Corpus Gesproken Nederlands (CGN) consisting of almost 9 million transcribed words of spoken Dutch, divided into 15 different categories. The outcome of the design is a Neural Network with an

  16. Channel normalization technique for speech recognition in mismatched conditions

    CSIR Research Space (South Africa)

    Kleynhans, N

    2008-11-01

    Full Text Available …where one wishes to use any available training data for a variety of purposes. Research into a new channel normalization (CN) technique for channel-mismatched speech recognition is presented. A process of inverse linear filtering is used in order...

  17. Appropriate baseline values for HMM-based speech recognition

    CSIR Research Space (South Africa)

    Barnard, E

    2004-11-01

    Full Text Available A number of issues related to the development of speech-recognition systems with hidden Markov models (HMMs) are discussed. A set of systematic experiments using the HTK toolkit and the TIMIT database is used to elucidate matters such as the number...

  18. Detection of target phonemes in spontaneous and read speech

    NARCIS (Netherlands)

    Mehta, G.; Cutler, A.

    1988-01-01

    Although spontaneous speech occurs more frequently in most listeners' experience than read speech, laboratory studies of human speech recognition typically use carefully controlled materials read from a script. The phonological and prosodic characteristics of spontaneous and read speech differ

  19. SOME EXAMPLES OF APPLIED SYSTEMS WITH SPEECH INTERFACE

    Directory of Open Access Journals (Sweden)

    V. A. Zhitko

    2017-01-01

    Full Text Available Three examples of applied systems with a speech interface are considered in the article. The first two allow the end user to ask a question aloud and hear the system's response, complementing traditional I/O via the keyboard and computer screen. The third example, the «IntonTrainer» system, provides voice interaction and is designed for in-depth self-study of the intonation of oral speech.

  20. Real time speech formant analyzer and display

    Science.gov (United States)

    Holland, George E.; Struve, Walter S.; Homer, John F.

    1987-01-01

    A speech analyzer for interpretation of sound includes a sound input which converts the sound into a signal representing it. The signal is passed through a plurality of frequency pass filters to derive a plurality of frequency formants. These formants are converted to voltage signals by frequency-to-voltage converters and then prepared for visual display in continuous real time. Parameters derived from the input sound are also displayed. The display may then be interpreted by the user. The preferred embodiment includes a microprocessor interfaced with a television set for display of the sound formants. The microprocessor software enables the sound analyzer to present a variety of display modes for interpretive and therapeutic use.
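    The signal path described above (band-pass filtering followed by per-band energy measurement) can be sketched in software. The following is a minimal illustration of the filter-bank idea, not the patented hardware design: it uses the Goertzel algorithm to measure energy at a few candidate formant frequencies of a synthetic two-formant signal, where the frequencies and band choices are invented for the example.

    ```python
    import math

    def goertzel_power(samples, fs, freq):
        """Energy of `samples` near `freq` Hz (Goertzel algorithm)."""
        n = len(samples)
        k = int(0.5 + n * freq / fs)                 # nearest DFT bin
        coeff = 2.0 * math.cos(2.0 * math.pi * k / n)
        s1 = s2 = 0.0
        for x in samples:
            s1, s2 = x + coeff * s1 - s2, s1
        return s2 * s2 + s1 * s1 - coeff * s1 * s2

    # Synthetic "vowel" with formant-like peaks at 700 Hz and 1200 Hz.
    fs, n = 8000, 800
    signal = [math.sin(2 * math.pi * 700 * t / fs)
              + 0.8 * math.sin(2 * math.pi * 1200 * t / fs)
              for t in range(n)]

    # Measure band energies and keep the two strongest as "formants".
    bands = [300, 700, 1200, 2500]
    powers = {f: goertzel_power(signal, fs, f) for f in bands}
    formants = sorted(bands, key=powers.get, reverse=True)[:2]
    print(sorted(formants))   # → [700, 1200]
    ```

    A real analyzer would run this continuously over short frames; the principle per frame is the same.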

  1. The Contribution of Cognitive Factors to Individual Differences in Understanding Noise-Vocoded Speech in Young and Older Adults

    Directory of Open Access Journals (Sweden)

    Stephanie Rosemann

    2017-06-01

    Full Text Available Noise-vocoded speech, which consists of spectrally degraded speech, is commonly used to simulate the sensation after cochlear implantation. High individual variability exists in learning to understand both noise-vocoded speech and speech perceived through a cochlear implant (CI). This variability is partly ascribed to differing cognitive abilities such as working memory, verbal skills, or attention. Although clinically highly relevant, no consensus has yet been reached on which cognitive factors predict the intelligibility of noise-vocoded speech in healthy subjects or in patients after cochlear implantation. We aimed to establish a test battery that can be used to predict speech understanding in patients prior to receiving a CI. Young and old healthy listeners completed a noise-vocoded speech test in addition to cognitive tests tapping verbal memory, working memory, lexicon and retrieval skills, as well as cognitive flexibility and attention. Partial least squares analysis revealed that six variables significantly predicted vocoded-speech performance: the ability to perceive visually degraded speech, tested by the Text Reception Threshold; vocabulary size, assessed with the Multiple Choice Word Test; working memory, gauged with the Operation Span Test; verbal learning and recall, from the Verbal Learning and Retention Test; and task-switching ability, tested by the Comprehensive Trail-Making Test. These cognitive abilities thus explain individual differences in noise-vocoded speech understanding and should be considered when aiming to predict hearing-aid outcome.
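    Noise vocoding itself follows a simple recipe: extract the slowly varying amplitude envelope of the speech and use it to modulate noise. A minimal single-band sketch is below; real vocoders first split the signal into several band-pass channels and vocode each separately, and the window length and test signal here are arbitrary choices for illustration.

    ```python
    import math
    import random

    def envelope(signal, win=80):
        """Rectify and smooth with a moving average (crude low-pass)."""
        rect = [abs(x) for x in signal]
        out, acc = [], 0.0
        for i, x in enumerate(rect):
            acc += x
            if i >= win:
                acc -= rect[i - win]
            out.append(acc / min(i + 1, win))
        return out

    def noise_vocode(signal, seed=0):
        """Discard fine structure, keep the envelope: env * noise."""
        rng = random.Random(seed)
        return [e * rng.uniform(-1.0, 1.0) for e in envelope(signal)]

    # 1 s of an amplitude-modulated tone standing in for speech.
    fs = 8000
    speech = [math.sin(2 * math.pi * 200 * t / fs)
              * (0.5 + 0.5 * math.sin(2 * math.pi * 3 * t / fs))
              for t in range(fs)]
    vocoded = noise_vocode(speech)
    assert len(vocoded) == len(speech)
    ```

    The result is unintelligible as a single band; with four to eight bands, listeners can typically learn to understand it, which is what makes it a useful CI simulation.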

  2. A dynamical model of hierarchical selection and coordination in speech planning.

    Directory of Open Access Journals (Sweden)

    Sam Tilsen

    Full Text Available Studies of the control of complex sequential movements have dissociated two aspects of movement planning: control over the sequential selection of movement plans, and control over the precise timing of movement execution. This distinction is particularly relevant in the production of speech: utterances contain sequentially ordered words and syllables, but articulatory movements are often executed in a non-sequential, overlapping manner with precisely coordinated relative timing. This study presents a hybrid dynamical model in which competitive activation controls selection of movement plans and coupled oscillatory systems govern coordination. The model departs from previous approaches by ascribing an important role to competitive selection of articulatory plans within a syllable. Numerical simulations show that the model reproduces a variety of speech production phenomena, such as effects of preparation and utterance composition on reaction time, and asymmetries in patterns of articulatory timing associated with onsets and codas. The model furthermore provides a unified understanding of a diverse group of phonetic and phonological phenomena which have not previously been related.

  3. Speech recognition in individuals with sensorineural hearing loss

    Directory of Open Access Journals (Sweden)

    Adriana Neves de Andrade

    Full Text Available ABSTRACT INTRODUCTION: Hearing loss can negatively influence the communication performance of individuals, who should be evaluated with suitable material and in situations of listening close to those found in everyday life. OBJECTIVE: To analyze and compare the performance of patients with mild-to-moderate sensorineural hearing loss in speech recognition tests carried out in silence and with noise, according to the variables ear (right and left and type of stimulus presentation. METHODS: The study included 19 right-handed individuals with mild-to-moderate symmetrical bilateral sensorineural hearing loss, submitted to the speech recognition test with words in different modalities and speech test with white noise and pictures. RESULTS: There was no significant difference between right and left ears in any of the tests. The mean number of correct responses in the speech recognition test with pictures, live voice, and recorded monosyllables was 97.1%, 85.9%, and 76.1%, respectively, whereas after the introduction of noise, the performance decreased to 72.6% accuracy. CONCLUSIONS: The best performances in the Speech Recognition Percentage Index were obtained using monosyllabic stimuli, represented by pictures presented in silence, with no significant differences between the right and left ears. After the introduction of competitive noise, there was a decrease in individuals' performance.

  4. Speech spectrum's correlation with speakers' Eysenck personality traits.

    Directory of Open Access Journals (Sweden)

    Chao Hu

    Full Text Available The current study explored the correlation between speakers' Eysenck personality traits and speech spectrum parameters. Forty-six subjects completed the Eysenck Personality Questionnaire. They were instructed to verbally answer the questions shown on a computer screen and their responses were recorded by the computer. Spectrum parameters of /sh/ and /i/ were analyzed by Praat voice software. Formant frequencies of the consonant /sh/ in lying responses were significantly lower than that in truthful responses, whereas no difference existed on the vowel /i/ speech spectrum. The second formant bandwidth of the consonant /sh/ speech spectrum was significantly correlated with the personality traits of Psychoticism, Extraversion, and Neuroticism, and the correlation differed between truthful and lying responses, whereas the first formant frequency of the vowel /i/ speech spectrum was negatively correlated with Neuroticism in both response types. The results suggest that personality characteristics may be conveyed through the human voice, although the extent to which these effects are due to physiological differences in the organs associated with speech or to a general Pygmalion effect is yet unknown.

  5. Dissociated Crossed Speech Areas in a Tumour Patient

    Directory of Open Access Journals (Sweden)

    Jörg Mauler

    2017-05-01

    Full Text Available In the past, the eloquent areas could only be localised by the invasive Wada test. The very rare cases of dissociated crossed speech areas were found accidentally, based on the clinical symptomatology. Today, functional magnetic resonance imaging (fMRI) can be employed to non-invasively localise the eloquent areas in brain tumour patients for therapy planning. A 41-year-old, left-handed man with a low-grade glioma in the left frontal operculum extending to the insular cortex, tension headaches, and anomic aphasia over 5 months underwent a pre-operative speech area localisation fMRI measurement, which revealed evidence of a transhemispheric disposition: the dominant Wernicke speech area is located on the left, while Broca's area is strongly lateralised to the right hemisphere. The outcome of the Wada test and the intraoperative cortico-subcortical stimulation mapping were congruent with this finding. After tumour removal, language function was fully preserved. When brain tumours carry a risk of impaired speech function, the rare disposition of dissociated crossed speech areas may gain clinically relevant meaning by allowing more extended tumour removal. Hence, for its identification, diagnostics which take both brain hemispheres into account, such as fMRI, are recommended.

  6. Hemispheric asymmetries in speech perception: sense, nonsense and modulations.

    Directory of Open Access Journals (Sweden)

    Stuart Rosen

    Full Text Available The well-established left hemisphere specialisation for language processing has long been claimed to be based on a low-level auditory specialization for specific acoustic features in speech, particularly regarding 'rapid temporal processing'. A novel analysis/synthesis technique was used to construct a variety of sounds based on simple sentences which could be manipulated in spectro-temporal complexity, and in whether they were intelligible or not. All sounds consisted of two noise-excited spectral prominences (based on the lower two formants in the original speech) which could be static or varying in frequency and/or amplitude independently. Dynamically varying both acoustic features based on the same sentence led to intelligible speech, but when either or both acoustic features were static, the stimuli were not intelligible. Using the frequency dynamics from one sentence with the amplitude dynamics of another led to unintelligible sounds of comparable spectro-temporal complexity to the intelligible ones. Positron emission tomography (PET) was used to compare which brain regions were active when participants listened to the different sounds. Neural activity to spectral and amplitude modulations sufficient to support speech intelligibility (without actually being intelligible) was seen bilaterally, with a right temporal lobe dominance. A left-dominant response was seen only to intelligible sounds. It thus appears that the left hemisphere specialisation for speech is based on the linguistic properties of utterances, not on particular acoustic features.

  7. Speech rehabilitation of maxillectomy patients with hollow bulb obturator

    Directory of Open Access Journals (Sweden)

    Pravesh Kumar

    2012-01-01

    Full Text Available Aim: To evaluate the effect of a hollow bulb obturator prosthesis on articulation and nasalance in maxillectomy patients. Materials and Methods: A total of 10 patients scheduled for maxillectomy, falling under Aramany classes I and II, with normal speech and hearing patterns, were selected for the study. They were provided with definitive maxillary obturators after complete healing of the defect. The patients were asked to wear the obturator for six weeks, and speech analysis was done to measure changes in articulation and nasalance at four stages of treatment: preoperative; postoperative (after complete healing, i.e., 3-4 months after surgery); 24 hours after providing the obturator; and six weeks after providing the obturator. Articulation was measured objectively for distortion, addition, substitution, and omission by a speech pathologist, and nasalance was measured with the Dr. Speech software. Results: Statistical comparison of preoperative and six-week post-rehabilitation levels showed no significant difference in articulation or nasalance. Comparison of post-surgery complete healing with six weeks after rehabilitation showed significant differences in both nasalance and articulation. Conclusion: Providing an obturator brings speech closer to presurgical levels of articulation, and there is improvement in nasality as well.

  8. Concurrent Speech Segregation Problems in Hearing Impaired Children

    Directory of Open Access Journals (Sweden)

    Hossein Talebi

    2014-04-01

    Full Text Available Objective: This study was a basic investigation of the ability of concurrent speech segregation in hearing-impaired children. Concurrent segregation is one of the fundamental components of auditory scene analysis and plays an important role in speech perception. In the present study, we compared auditory late responses (ALRs) between hearing-impaired and normal children. Materials & Methods: Auditory late potentials in response to 12 double vowels were recorded in 10 children with moderate to severe sensorineural hearing loss and 10 normal children. Double vowels (pairs of synthetic vowels) were presented concurrently and binaurally. The fundamental frequency (F0) of these vowels was 100 Hz, and the size of the F0 difference between vowels was 0.5 semitones. Results: Comparison of N1-P2 amplitude showed statistically significant differences for some stimuli between hearing-impaired and normal children (P<0.05). This complex, which indexes vowel change detection and reflects central auditory speech representation without active participation, was reduced in hearing-impaired children. Conclusion: This study showed problems in concurrent speech segregation in hearing-impaired children, as evidenced by ALRs. This indicates deficiencies in bottom-up processing of speech characteristics based on F0 and its differences in these children.
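    The 0.5-semitone separation quoted above converts to hertz via the standard logarithmic relation, in which an interval of s semitones multiplies frequency by 2^(s/12). A quick worked example:

    ```python
    import math

    def semitones_to_hz(f0, semitones):
        """Frequency `semitones` above `f0` on the equal-tempered scale."""
        return f0 * 2.0 ** (semitones / 12.0)

    def hz_to_semitones(f1, f2):
        """Interval between f1 and f2 in semitones."""
        return 12.0 * math.log2(f2 / f1)

    # 0.5 semitones above an F0 of 100 Hz:
    f_high = semitones_to_hz(100.0, 0.5)
    print(round(f_high, 1))   # → 102.9 (Hz)
    assert abs(hz_to_semitones(100.0, f_high) - 0.5) < 1e-9
    ```

    So the two concurrent vowels differed by only about 3 Hz in F0, which is what makes the segregation task demanding.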

  9. Experiments on Automatic Recognition of Nonnative Arabic Speech

    Directory of Open Access Journals (Sweden)

    Douglas O'Shaughnessy

    2008-05-01

    Full Text Available The automatic recognition of foreign-accented Arabic speech is a challenging task since it involves a large number of nonnative accents. As well, the nonnative speech data available for training are generally insufficient. Moreover, as compared to other languages, the Arabic language has sparked a relatively small number of research efforts. In this paper, we are concerned with the problem of nonnative speech in a speaker-independent, large-vocabulary speech recognition system for modern standard Arabic (MSA). We analyze some major differences at the phonetic level in order to determine which phonemes have a significant part in the recognition performance for both native and nonnative speakers. Special attention is given to specific Arabic phonemes. The performance of an HMM-based Arabic speech recognition system is analyzed with respect to speaker gender and native origin. The WestPoint modern standard Arabic database from the Linguistic Data Consortium (LDC) and the Hidden Markov Model Toolkit (HTK) are used throughout all experiments. Our study shows that the best performance in overall phoneme recognition is obtained when nonnative speakers are involved in both training and testing phases. This is not the case when a language model and phonetic lattice networks are incorporated in the system. At the phonetic level, the results show that female nonnative speakers perform better than male nonnative speakers, and that emphatic phonemes yield a significant decrease in performance when they are uttered by both male and female nonnative speakers.
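    At the core of such an HMM-based recogniser is Viterbi decoding of the most likely hidden-state sequence given the acoustic observations. A toy sketch follows; the two states, observation symbols, and probabilities are invented for illustration and are not taken from the WestPoint corpus or HTK.

    ```python
    def viterbi(obs, states, start_p, trans_p, emit_p):
        """Most likely hidden-state path for `obs` under a discrete HMM."""
        # v[t][s] = (best probability of reaching s at time t, best predecessor)
        v = [{s: (start_p[s] * emit_p[s][obs[0]], None) for s in states}]
        for t in range(1, len(obs)):
            v.append({})
            for s in states:
                prev = max(states, key=lambda p: v[t - 1][p][0] * trans_p[p][s])
                v[t][s] = (v[t - 1][prev][0] * trans_p[prev][s] * emit_p[s][obs[t]],
                           prev)
        # Backtrack from the best final state.
        last = max(states, key=lambda s: v[-1][s][0])
        path = [last]
        for t in range(len(obs) - 1, 0, -1):
            path.append(v[t][path[-1]][1])
        return path[::-1]

    # Toy model: a fricative-like state 's' and a vowel-like state 'a'.
    states = ('s', 'a')
    start_p = {'s': 0.6, 'a': 0.4}
    trans_p = {'s': {'s': 0.5, 'a': 0.5}, 'a': {'s': 0.3, 'a': 0.7}}
    emit_p = {'s': {'hiss': 0.9, 'voiced': 0.1},
              'a': {'hiss': 0.2, 'voiced': 0.8}}
    path = viterbi(('hiss', 'voiced', 'voiced'), states, start_p, trans_p, emit_p)
    print(path)   # → ['s', 'a', 'a']
    ```

    Production systems work in log probabilities over thousands of context-dependent states, but the dynamic program is the same.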

  10. Experiments on Automatic Recognition of Nonnative Arabic Speech

    Directory of Open Access Journals (Sweden)

    Selouani Sid-Ahmed

    2008-01-01


  11. Speech Respiratory Measures in Spastic Cerebral Palsied and Normal Children

    Directory of Open Access Journals (Sweden)

    Hashem Shemshadi

    2007-10-01

    Full Text Available Objective: This research was designed to determine speech respiratory measures in spastic cerebral palsied children versus normal ones, to be used as an applicable tool in speech therapy plans. Materials & Methods: In a comparative cross-sectional (case-control) study, using goal-oriented sampling for cases and convenience sampling for controls, twenty spastic cerebral palsied children and twenty controls were identified and matched for age (5-12 years) and sex (F=20, M=20). All inclusion and exclusion criteria were checked through thorough review of medical history and clinical and paraclinical findings, such as chest X-rays and complete blood counts, to rule out any pulmonary and/or systemic disorders. Speech respiratory indices were determined with a respirometer (ST 1-dysphonia), made and normalized by Glasgow University. The data were analyzed with the independent t-test. Results: There were significant differences between case and control groups in mean tidal volume, phonatory volume, and vital capacity at α=0.05; these values in patients were lower (by 34%) than in normal children (P<0.001). Conclusion: The measures obtained are highly valuable to speech therapists in any primary rehabilitative speech therapy plan for spastic cerebral palsied children.

  12. Features of Speech Reactions to Mental State Concepts

    Directory of Open Access Journals (Sweden)

    Ekaterina M. Alekseeva

    2017-11-01

    Full Text Available The article is devoted to the problem of associative speech representation of mental states. The study involved 31 Russian-speaking subjects (27 females and 4 males) aged 18-22 years. The experimental procedure, implemented in the DMDX program, measured the time of speech response to stimuli: the concepts of 25 mental states. The average reaction time to the concepts of mental states, shown on the computer monitor, was 2114.68 milliseconds. The fastest associative speech responses were to the stimuli "ecstasy" (1452.54 msec), "meditation" (1569.26 msec), and "tranquility" (1685.21 msec); the slowest were to "interest" (2517.5 msec) and "indecision" (2454.63 msec). In total, the subjects gave 448 associations (speech reactions) to the concepts of the 25 mental states, i.e. 17.9 associations per mental state on average. The greatest number of speech associations (24) was given to the concept of love; the smallest (11) to the concept of ecstasy. The associative fields of the states meditation, ecstasy, melancholy, tiredness, and loneliness have the most pronounced cores. The prospects of the study lie in running a similar associative experiment with representatives of another culture, as well as in studying the evaluative and situational associative representation of mental states.

  13. Automated Intelligibility Assessment of Pathological Speech Using Phonological Features

    Directory of Open Access Journals (Sweden)

    Catherine Middag

    2009-01-01

    Full Text Available It is commonly acknowledged that word or phoneme intelligibility is an important criterion in the assessment of the communication efficiency of a pathological speaker. People have therefore put a lot of effort into the design of perceptual intelligibility rating tests. These tests usually have the drawback that they employ unnatural speech material (e.g., nonsense words) and that they cannot fully exclude errors due to listener bias. Therefore, there is growing interest in applying objective automatic speech recognition technology to automate the intelligibility assessment. Current research is headed towards the design of automated methods which can be shown to produce ratings that correspond well with those emerging from a well-designed and well-performed perceptual test. In this paper, a novel methodology that builds on previous work (Middag et al., 2008) is presented. It utilizes phonological features, automatic speech alignment based on acoustic models that were trained on normal speech, context-dependent speaker feature extraction, and intelligibility prediction based on a small model that can be trained on pathological speech samples. The experimental evaluation of the new system reveals that the root mean squared error of the discrepancies between perceived and computed intelligibilities can be as low as 8 on a scale of 0 to 100.
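    The final step, predicting a 0-100 intelligibility rating from extracted speaker features and scoring it by root mean squared error against perceptual ratings, can be illustrated with a deliberately tiny regression. The single feature and the five speakers below are invented for the example; the real system uses many phonological features and a model trained on pathological speech.

    ```python
    def fit_line(xs, ys):
        """Ordinary least squares for y ≈ a*x + b."""
        n = len(xs)
        mx, my = sum(xs) / n, sum(ys) / n
        a = (sum((x - mx) * (y - my) for x, y in zip(xs, ys))
             / sum((x - mx) ** 2 for x in xs))
        return a, my - a * mx

    def rmse(pred, truth):
        """Root mean squared error between predictions and ratings."""
        return (sum((p - t) ** 2 for p, t in zip(pred, truth)) / len(truth)) ** 0.5

    # Hypothetical data: one phonological feature score per speaker (x)
    # against perceptual intelligibility on a 0-100 scale (y).
    x = [0.2, 0.4, 0.5, 0.7, 0.9]
    y = [35.0, 52.0, 58.0, 75.0, 90.0]
    a, b = fit_line(x, y)
    err = rmse([a * xi + b for xi in x], y)
    assert err < 8.0   # within the error band reported in the paper
    ```

    On held-out pathological speakers the error is of course larger than on training data, which is why the paper's figure of 8 points on a 100-point scale is the meaningful benchmark.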

  14. Strategy as Texts

    DEFF Research Database (Denmark)

    Obed Madsen, Søren

    This article shows empirically how managers translate a strategy plan at an individual level. By analysing how managers in three organizations translate strategies, it identifies that the translation happens in two steps: First, the managers decipher the strategy by coding the different parts of the strategy into four categories. Second, the managers produce new texts based on the original strategy document by using four different translation models. The study's findings contribute to three areas. Firstly, it shows that translation is more than a sociological process; it is also a craftsmanship that requires knowledge and skills, which unfortunately seems to be overlooked in both the literature and in practice. Secondly, it shows that even though a strategy text is singular, translation makes strategy plural. Thirdly, the article proposes a way to open up the black box of what…

  15. Noise-robust speech triage.

    Science.gov (United States)

    Bartos, Anthony L; Cipr, Tomas; Nelson, Douglas J; Schwarz, Petr; Banowetz, John; Jerabek, Ladislav

    2018-04-01

    A method is presented in which conventional speech algorithms are applied, with no modifications, to improve their performance in extremely noisy environments. It has been demonstrated that, for eigen-channel algorithms, pre-training multiple speaker identification (SID) models at a lattice of signal-to-noise-ratio (SNR) levels and then performing SID using the appropriate SNR-dependent model was successful in mitigating noise at all SNR levels. In those tests, it was found that SID performance was optimized when the SNRs of the testing and training data were close or identical. In the current effort, multiple i-vector algorithms were used, greatly improving both processing throughput and equal error rate classification accuracy. Using identical approaches in the same noisy environment, performance of SID, language identification, gender identification, and diarization were significantly improved. A critical factor in this improvement is speech activity detection (SAD) that performs reliably in extremely noisy environments, where the speech itself is barely audible. To optimize SAD operation at all SNR levels, two algorithms were employed. The first maximized detection probability at low levels (-10 dB ≤ SNR < +10 dB) using just the voiced speech envelope, and the second exploited features extracted from the original speech to improve overall accuracy at higher quality levels (SNR ≥ +10 dB).
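    The SNR-lattice idea, training one model per SNR level and then selecting the model nearest the estimated SNR of the test signal, can be sketched as follows. The lattice values and power estimates are invented for the example.

    ```python
    import math

    def estimate_snr_db(signal_power, noise_power):
        """SNR in dB from average signal and noise power estimates."""
        return 10.0 * math.log10(signal_power / noise_power)

    def pick_model(snr_db, lattice):
        """Choose the pretrained model whose training SNR is closest."""
        return min(lattice, key=lambda level: abs(level - snr_db))

    # Hypothetical lattice of SID models trained at fixed SNR levels (dB).
    lattice = [-10, -5, 0, 5, 10, 20]
    snr = estimate_snr_db(signal_power=2.0, noise_power=1.0)   # ≈ 3.0 dB
    print(pick_model(snr, lattice))   # → 5
    ```

    The reported finding, that performance peaks when test and training SNR match, is exactly why nearest-level selection works better than a single model trained at one SNR.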

  16. Linguistic Processing of Accented Speech Across the Lifespan

    Directory of Open Access Journals (Sweden)

    Alejandrina Cristia

    2012-11-01

    Full Text Available In most of the world, people have regular exposure to multiple accents. Therefore, learning to quickly process accented speech is a prerequisite to successful communication. In this paper, we examine work on the perception of accented speech across the lifespan, from early infancy to late adulthood. Unfamiliar accents initially impair linguistic processing by infants, children, younger adults, and older adults, but listeners of all ages come to adapt to accented speech. Emergent research also goes beyond these perceptual abilities by assessing links with production and the relative contributions of linguistic knowledge and general cognitive skills. We conclude by underlining points of convergence across ages and the gaps to be addressed in future work.

  17. Speech development delay in a child with foetal alcohol syndrome

    Directory of Open Access Journals (Sweden)

    Jacek Wilczyński

    2016-09-01

    Full Text Available A female foetus was exposed in her mother’s womb to high concentrations of alcohol at each stage of pregnancy on a long-term basis, which resulted in a permanent disability. In addition to a number of deficiencies in the overall functioning of the child’s body, there are serious problems pertaining to verbal communication. This thesis aims to describe foetal alcohol syndrome (FAS) and to present the basic problems with communication functions in a child, caused by damage to the brain structures responsible for speech development. The thesis includes a speech diagnosis and therapy program adapted to the presented case. The Discussion section presents the characteristics of communication disorders in children with FAS, together with a description of developmental malformations, neurobehavioral disorders, and environmental factors affecting the development of the child’s speech.

  18. Social Robotics in Therapy of Apraxia of Speech

    Directory of Open Access Journals (Sweden)

    José Carlos Castillo

    2018-01-01

    Full Text Available Apraxia of speech is a motor speech disorder in which messages from the brain to the mouth are disrupted, resulting in an inability to move the lips or tongue to the right place to pronounce sounds correctly. Current therapies for this condition involve a therapist who conducts the exercises in one-on-one sessions. Our aim is to work in the line of robotic therapies, in which a robot is able to perform a therapy session partially or fully autonomously, endowing a social robot with the ability to assist therapists in apraxia of speech rehabilitation exercises. Therefore, we integrate computer vision and machine learning techniques to detect the mouth pose of the user and, on top of that, our social robot autonomously performs the different steps of the therapy using multimodal interaction.

  19. Toward Speech and Nonverbal Behaviors Integration for Humanoid Robot

    Directory of Open Access Journals (Sweden)

    Wei Wang

    2012-09-01

    Full Text Available It is essential to integrate speech and nonverbal behaviors for a humanoid robot in human-robot interaction. This paper presents an approach that uses a multi-objective genetic algorithm to match speech and behaviors automatically. Firstly, based on the humanoid robot's emotion status, we construct a hierarchical structure to link voice characteristics and nonverbal behaviors. Secondly, the behaviors corresponding to the speech are matched and integrated into an action sequence based on the genetic algorithm, so the robot can consistently speak and perform emotional behaviors. Our approach draws on relevant knowledge described by psychologists and on research in nonverbal communication. The experimental results suggest that our ultimate goal, implementing an affective robot that acts and speaks with partners vividly and fluently, could be achieved.

  20. Marx As Journalist: Revisiting The Free Speech Debate

    Directory of Open Access Journals (Sweden)

    Padmaja Shaw

    2012-05-01

    Full Text Available Marx was a practicing journalist for most of his adult life. He was an editor, columnist and special correspondent at different times, and his journalistic work provided significant inputs for his later theoretical work. Through his engagement with the political revolutions of 19th-century Europe, Marx developed one of the finest arguments in defence of free speech and of the need to expand bourgeois democratic freedoms in the process of transition to socialism. This paper describes the role of Marxist parties and intellectuals in India in using and expanding the democratic freedoms available in India. The paper concludes that there is a gap between Marx’s ideological position on free speech and the praxis of Marxist parties. In contemporary India, there is an urgent need to protect free speech, fight censorship and strengthen independent constitutional authorities that are governed by democratic principles.

  1. Exploring expressivity and emotion with artificial voice and speech technologies.

    Science.gov (United States)

    Pauletto, Sandra; Balentine, Bruce; Pidcock, Chris; Jones, Kevin; Bottaci, Leonardo; Aretoulaki, Maria; Wells, Jez; Mundy, Darren P; Balentine, James

    2013-10-01

    Emotion in audio-voice signals, as synthesized by text-to-speech (TTS) technologies, was investigated to formulate a theory of expression for user interface design. Emotional parameters were specified with markup tags, and the resulting audio was further modulated with post-processing techniques. Software was then developed to link a selected TTS synthesizer with an automatic speech recognition (ASR) engine, producing a chatbot that could speak and listen. Using these two artificial voice subsystems, investigators explored both artistic and psychological implications of artificial speech emotion. Goals of the investigation were interdisciplinary, with interest in musical composition, augmentative and alternative communication (AAC), commercial voice announcement applications, human-computer interaction (HCI), and artificial intelligence (AI). The work-in-progress points towards an emerging interdisciplinary ontology for artificial voices. As one study output, HCI tools are proposed for future collaboration.
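As a minimal illustration of the markup-tag approach to specifying emotional parameters for TTS, an SSML-style prosody wrapper might look like the sketch below. The attribute names follow the W3C SSML prosody element; whether a particular TTS engine honors them, and which value ranges it accepts, is engine-dependent and assumed here.

```python
def emotive_markup(text, rate="medium", pitch="+0%", volume="medium"):
    """Wrap text in SSML-style prosody tags. The rate/pitch/volume
    attributes are the kind of knobs a TTS engine may expose for
    expressive speech; post-processing of the rendered audio would
    then further modulate the result, as in the study above."""
    return (f'<speak><prosody rate="{rate}" pitch="{pitch}" '
            f'volume="{volume}">{text}</prosody></speak>')

# A more "excited" rendering of a prompt:
excited = emotive_markup("Hello there!", rate="fast", pitch="+10%", volume="loud")
```

The same wrapper could feed either a synthesizer directly or the chatbot's speaking side, leaving the ASR listening side untouched.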

  2. Factors Affecting Delayed Referral for Speech Therapy in Iranian children with Speech and Language Disorders

    Directory of Open Access Journals (Sweden)

    Roshanak Vameghi

    2014-03-01

    Full Text Available Objective: Early detection of children who are at risk for speech and language impairment, and of those at early stages of delay, is crucial for the provision of early intervention services. Unfortunately, in Iran this disorder is often not identified or referred for proper treatment and rehabilitation at the early critical stages. Materials & Methods: This study was carried out in two phases. The first phase, qualitative in nature, was meant to identify all potential contributing factors through a literature review and by acquiring the viewpoints of experts and families. Twelve experts and 9 parents of children with speech and language disorders participated in semi-structured in-depth interviews, thereby completing the first draft of potential factors compiled through the literature review. The completed list of factors led to the design of a questionnaire for identifying “factors affecting late referral in childhood speech and language impairment”. The questionnaire was approved for face and content validity, and Cronbach’s alpha was determined to be 0.81. Two groups of parents were asked to complete the questionnaire: the parents of children who had attended speech and language clinics in the west and central regions of Tehran after their child was 3 years old (the case group) and those who had attended before their child was 3 years old (the control group). Results: Among the seven factors that showed a significant difference between the two groups before a definite diagnosis of speech and language disorders was reached, 3 factors were related to the guidance and consultation the family received from physicians, 2 factors were related to parents’ lack of awareness and knowledge, and 2 factors were related to the screening services received. All six factors showing significant difference between the two groups after

  3. Voice Activity Detection. Fundamentals and Speech Recognition System Robustness

    OpenAIRE

    Ramirez, J.; Gorriz, J. M.; Segura, J. C.

    2007-01-01

    This chapter has presented an overview of the main challenges in robust speech detection and a review of the state of the art and applications. VADs are frequently used in a number of applications including speech coding, speech enhancement and speech recognition. A precise VAD extracts a set of discriminative speech features from the noisy speech and formulates the decision in terms of a well-defined rule. The chapter has summarized three robust VAD methods that yield high speech/non-speech discri...
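A toy version of the "discriminative features plus well-defined decision rule" structure is sketched below, thresholding per-frame log energy. This is not one of the chapter's three VAD methods; a real VAD would use richer features and temporal smoothing (hangover), so the sketch only illustrates the decision-rule idea.

```python
import numpy as np

def frame_energy_vad(signal, frame_len=400, threshold_db=-35.0):
    """Toy energy-based VAD: mark a frame as speech when its log
    energy exceeds a fixed threshold. Frame length and threshold
    are illustrative values."""
    n = len(signal) // frame_len
    frames = np.reshape(signal[: n * frame_len], (n, frame_len))
    # Small constant avoids log(0) on silent frames.
    energy_db = 10.0 * np.log10(np.mean(frames ** 2, axis=1) + 1e-12)
    return energy_db > threshold_db
```

Silent frames fall far below the threshold, while frames carrying signal energy exceed it; robust VADs differ mainly in replacing the energy feature and fixed threshold with noise-adaptive statistics.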

  4. Speech Inconsistency in Children with Childhood Apraxia of Speech, Language Impairment, and Speech Delay: Depends on the Stimuli

    Science.gov (United States)

    Iuzzini-Seigel, Jenya; Hogan, Tiffany P.; Green, Jordan R.

    2017-01-01

    Purpose: The current research sought to determine (a) if speech inconsistency is a core feature of childhood apraxia of speech (CAS) or if it is driven by comorbid language impairment that affects a large subset of children with CAS and (b) if speech inconsistency is a sensitive and specific diagnostic marker that can differentiate between CAS and…

  5. Reading Authentic Texts

    DEFF Research Database (Denmark)

    Balling, Laura Winther

    2013-01-01

    Most research on cognates has focused on words presented in isolation that are easily defined as cognate between L1 and L2. In contrast, this study investigates what counts as cognate in authentic texts and how such cognates are read. Participants with L1 Danish read news articles in their highly proficient L2, English, while their eye movements were monitored. The experiment shows a cognate advantage for morphologically simple words, but only when cognateness is defined relative to translation equivalents that are appropriate in the context. For morphologically complex words, a cognate disadvantage … word predictability indexed by the conditional probability of each word.

  6. Documents and legal texts

    International Nuclear Information System (INIS)

    2016-01-01

    This section treats of the following documents and legal texts: 1 - Brazil: Law No. 13,260 of 16 March 2016 (To regulate the provisions of item XLIII of Article 5 of the Federal Constitution on terrorism, dealing with investigative and procedural provisions and redefining the concept of a terrorist organisation; and amends Laws No. 7,960 of 21 December 1989 and No. 12,850 of 2 August 2013); 2 - India: The Atomic Energy (Amendment) Act, 2015; Department Of Atomic Energy Notification (Civil Liability for Nuclear Damage); 3 - Japan: Act on Subsidisation, etc. for Nuclear Damage Compensation Funds following the implementation of the Convention on Supplementary Compensation for Nuclear Damage

  7. Journalistic Text Production

    DEFF Research Database (Denmark)

    Haugaard, Rikke Hartmann

    A multiple case study investigated three professional text producers’ practices as they unfolded in their natural setting at the Spanish newspaper El Mundo, in Madrid. The study applied a combination of quantitative and qualitative methods: keystroke logging, participant observation and retrospective interviews. Results indicate that journalists’ revisions are related to form markedly more often than to content (approx. three …), and suggest two writing phases serving …

  8. Variable Span Filters for Speech Enhancement

    DEFF Research Database (Denmark)

    Jensen, Jesper Rindom; Benesty, Jacob; Christensen, Mads Græsbøll

    2016-01-01

    In this work, we consider enhancement of multichannel speech recordings. Linear filtering and subspace approaches have been considered previously for solving the problem. The current linear filtering methods, although many variants exist, have limited control of noise reduction and speech...

  9. Represented Speech in Qualitative Health Research

    DEFF Research Database (Denmark)

    Musaeus, Peter

    2017-01-01

    Represented speech refers to speech where we reference somebody. Represented speech is an important phenomenon in everyday conversation, health care communication, and qualitative research. This case will draw first from a case study on physicians’ workplace learning and second from a case study on nurses’ apprenticeship learning. The aim of the case is to guide the qualitative researcher to use own and others’ voices in the interview and to be sensitive to represented speech in everyday conversation. Moreover, reported speech matters to health professionals who aim to represent the voice of their patients. Qualitative researchers and students might learn to encourage interviewees to elaborate different voices or perspectives. Qualitative researchers working with natural speech might pay attention to how people talk and use represented speech. Finally, represented speech might be relevant …

  10. Quick Statistics about Voice, Speech, and Language

    Science.gov (United States)

    Quick Statistics About Voice, Speech, Language. … no. 205. Hyattsville, MD: National Center for Health Statistics, 2015. Hoffman HJ, Li C-M, Losonczy K, …

  11. Developmental language and speech disability.

    Science.gov (United States)

    Spiel, G; Brunner, E; Allmayer, B; Pletz, A

    2001-09-01

    Speech disabilities (articulation deficits) and language disorders, both expressive (vocabulary) and receptive (language comprehension), are not uncommon in children. This article presents an overview of these disorders, along with a global description of the impairment of communication and the clinical characteristics of developmental language disorders. The diagnostic manuals applied in the European and Anglo-American speech areas, ICD-10 and DSM-IV, are explained and compared. Because of their strengths and weaknesses, an alternative classification of language and speech developmental disorders is proposed, which allows a differentiation between expressive and receptive language capabilities with regard to the semantic and the morphological/syntactic domains. Prevalence and comorbidity rates, psychosocial influences, biological factors and the interaction between biological and social factors are discussed. The necessity of using standardized examinations is emphasised. General logopaedic treatment paradigms, specific therapy concepts and an overview of prognosis are described.

  12. Motor Speech Phenotypes of Frontotemporal Dementia, Primary Progressive Aphasia, and Progressive Apraxia of Speech

    Science.gov (United States)

    Poole, Matthew L.; Brodtmann, Amy; Darby, David; Vogel, Adam P.

    2017-01-01

    Purpose: Our purpose was to create a comprehensive review of speech impairment in frontotemporal dementia (FTD), primary progressive aphasia (PPA), and progressive apraxia of speech in order to identify the most effective measures for diagnosis and monitoring, and to elucidate associations between speech and neuroimaging. Method: Speech and…

  13. Visual context enhanced. The joint contribution of iconic gestures and visible speech to degraded speech comprehension.

    NARCIS (Netherlands)

    Drijvers, L.; Özyürek, A.

    2017-01-01

    Purpose: This study investigated whether and to what extent iconic co-speech gestures contribute to information from visible speech to enhance degraded speech comprehension at different levels of noise-vocoding. Previous studies of the contributions of these 2 visual articulators to speech

  14. Listeners Experience Linguistic Masking Release in Noise-Vocoded Speech-in-Speech Recognition

    Science.gov (United States)

    Viswanathan, Navin; Kokkinakis, Kostas; Williams, Brittany T.

    2018-01-01

    Purpose: The purpose of this study was to evaluate whether listeners with normal hearing perceiving noise-vocoded speech-in-speech demonstrate better intelligibility of target speech when the background speech was mismatched in language (linguistic release from masking [LRM]) and/or location (spatial release from masking [SRM]) relative to the…

  15. Predicting Speech Intelligibility with a Multiple Speech Subsystems Approach in Children with Cerebral Palsy

    Science.gov (United States)

    Lee, Jimin; Hustad, Katherine C.; Weismer, Gary

    2014-01-01

    Purpose: Speech acoustic characteristics of children with cerebral palsy (CP) were examined with a multiple speech subsystems approach; speech intelligibility was evaluated using a prediction model in which acoustic measures were selected to represent three speech subsystems. Method: Nine acoustic variables reflecting different subsystems, and…

  16. Visual Context Enhanced: The Joint Contribution of Iconic Gestures and Visible Speech to Degraded Speech Comprehension

    Science.gov (United States)

    Drijvers, Linda; Ozyurek, Asli

    2017-01-01

    Purpose: This study investigated whether and to what extent iconic co-speech gestures contribute to information from visible speech to enhance degraded speech comprehension at different levels of noise-vocoding. Previous studies of the contributions of these 2 visual articulators to speech comprehension have only been performed separately. Method:…

  17. An experimental Dutch keyboard-to-speech system for the speech impaired

    NARCIS (Netherlands)

    Deliege, R.J.H.

    1989-01-01

    An experimental Dutch keyboard-to-speech system has been developed to explore the possibilities and limitations of Dutch speech synthesis in a communication aid for the speech impaired. The system uses diphones and a formant synthesizer chip for speech synthesis. Input to the system is in

  18. Perceived Liveliness and Speech Comprehensibility in Aphasia: The Effects of Direct Speech in Auditory Narratives

    Science.gov (United States)

    Groenewold, Rimke; Bastiaanse, Roelien; Nickels, Lyndsey; Huiskes, Mike

    2014-01-01

    Background: Previous studies have shown that in semi-spontaneous speech, individuals with Broca's and anomic aphasia produce relatively many direct speech constructions. It has been claimed that in "healthy" communication direct speech constructions contribute to the liveliness, and indirectly to the comprehensibility, of speech.…

  19. Poor Speech Perception Is Not a Core Deficit of Childhood Apraxia of Speech: Preliminary Findings

    Science.gov (United States)

    Zuk, Jennifer; Iuzzini-Seigel, Jenya; Cabbage, Kathryn; Green, Jordan R.; Hogan, Tiffany P.

    2018-01-01

    Purpose: Childhood apraxia of speech (CAS) is hypothesized to arise from deficits in speech motor planning and programming, but the influence of abnormal speech perception in CAS on these processes is debated. This study examined speech perception abilities among children with CAS with and without language impairment compared to those with…

  20. Common neural substrates support speech and non-speech vocal tract gestures.

    Science.gov (United States)

    Chang, Soo-Eun; Kenney, Mary Kay; Loucks, Torrey M J; Poletto, Christopher J; Ludlow, Christy L

    2009-08-01

    The issue of whether speech is supported by the same neural substrates as non-speech vocal tract gestures has been contentious. In this fMRI study we tested whether producing non-speech vocal tract gestures in humans shares the same functional neuroanatomy as nonsense speech syllables. Production of non-speech vocal tract gestures, devoid of phonological content but similar to speech in that they had familiar acoustic and somatosensory targets, was compared to the production of speech syllables without meaning. Brain activation related to overt production was captured with BOLD fMRI using a sparse sampling design for both conditions. Speech and non-speech were compared using voxel-wise whole-brain analyses, and ROI analyses focused on frontal and temporoparietal structures previously reported to support speech production. Results showed substantial overlap between speech and non-speech activation in these regions. Although non-speech gesture production showed greater extent and amplitude of activation in the regions examined, both speech and non-speech showed comparable left laterality in activation for both target perception and production. These findings suggest a more general role of the previously proposed "auditory dorsal stream" in the left hemisphere: to support the production of vocal tract gestures that are not limited to speech processing.