WorldWideScience

Sample records for speech sound processing

  1. Speech endpoint detection with non-language speech sounds for generic speech processing applications

    Science.gov (United States)

    McClain, Matthew; Romanowski, Brian

    2009-05-01

    Non-language speech sounds (NLSS) are sounds produced by humans that do not carry linguistic information. Examples of these sounds are coughs, clicks, breaths, and filled pauses such as "uh" and "um" in English. NLSS are prominent in conversational speech, but can be a significant source of errors in speech processing applications. Traditionally, these sounds are ignored by speech endpoint detection algorithms, where speech regions are identified in the audio signal prior to processing. The ability to filter NLSS as a pre-processing step can significantly enhance the performance of many speech processing applications, such as speaker identification, language identification, and automatic speech recognition. In order to be used in all such applications, NLSS detection must be performed without the use of language models that provide knowledge of the phonology and lexical structure of speech. This is especially relevant to situations where the languages used in the audio are not known a priori. We present the results of preliminary experiments using data from American and British English speakers, in which segments of audio are classified as language speech sounds (LSS) or NLSS using a set of acoustic features designed for language-agnostic NLSS detection and a hidden Markov model (HMM) to model speech generation. The results of these experiments indicate that the features and model used are capable of detecting certain types of NLSS, such as breaths and clicks, while detection of other types of NLSS, such as filled pauses, will require future research.
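
    The pipeline this abstract describes, frame-level acoustic features scored by an HMM, can be illustrated with a minimal sketch. Everything below is assumed for illustration: the two-feature front end (log energy and zero-crossing rate), the frame_features helper, and the unsupervised two-state fit are not the authors' actual features or training setup, which relied on labelled LSS/NLSS data.

    ```python
    import numpy as np
    from hmmlearn import hmm  # third-party: pip install hmmlearn

    def frame_features(signal, fs, frame_ms=25, hop_ms=10):
        """Per-frame log energy and zero-crossing rate (illustrative features)."""
        n, h = int(fs * frame_ms / 1000), int(fs * hop_ms / 1000)
        feats = []
        for start in range(0, len(signal) - n, h):
            frame = signal[start:start + n]
            log_e = np.log(np.sum(frame ** 2) + 1e-10)
            zcr = np.mean(np.abs(np.diff(np.sign(frame)))) / 2
            feats.append([log_e, zcr])
        return np.asarray(feats)

    fs = 16000
    audio = np.random.randn(fs * 5)  # placeholder for a real recording

    # Two hidden states standing in for LSS vs. NLSS; a real system would
    # train the state distributions on labelled data rather than fit blindly.
    X = frame_features(audio, fs)
    model = hmm.GaussianHMM(n_components=2, covariance_type="diag", n_iter=50)
    model.fit(X)
    frame_labels = model.predict(X)  # one LSS/NLSS hypothesis per frame
    ```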

  2. The influence of (central) auditory processing disorder in speech sound disorders.

    Science.gov (United States)

    Barrozo, Tatiane Faria; Pagan-Neves, Luciana de Oliveira; Vilela, Nadia; Carvallo, Renata Mota Mamede; Wertzner, Haydée Fiszbein

    2016-01-01

    Considering the importance of auditory information for the acquisition and organization of phonological rules, the assessment of (central) auditory processing contributes to both the diagnosis and targeting of speech therapy in children with speech sound disorders. To study phonological measures and (central) auditory processing of children with speech sound disorder. Clinical and experimental study, with 21 subjects with speech sound disorder aged between 7;0 and 9;11 years, divided into two groups according to the presence or absence of (central) auditory processing disorder. The assessment comprised tests of phonology, speech inconsistency, and metalinguistic abilities. The group with (central) auditory processing disorder demonstrated greater severity of speech sound disorder. The cutoff value obtained for the process density index was the one that best characterized the occurrence of phonological processes for children above 7 years of age. Comparison of the test results between the two groups showed differences in some phonological and metalinguistic abilities. Children with an index value above 0.54 demonstrated strong tendencies towards presenting a (central) auditory processing disorder, and this measure was effective in indicating the need for evaluation in children with speech sound disorder.

  3. The role of high-level processes for oscillatory phase entrainment to speech sound

    Directory of Open Access Journals (Sweden)

    Benedikt Zoefel

    2015-12-01

    Constantly bombarded with input, the brain needs to filter out relevant information while ignoring the irrelevant rest. A powerful tool may be represented by neural oscillations which entrain their high-excitability phase to important input while their low-excitability phase attenuates irrelevant information. Indeed, the alignment between brain oscillations and speech improves intelligibility and helps dissociate speakers during a cocktail party. Although well investigated, the contribution of low- and high-level processes to phase entrainment to speech sound has only recently begun to be understood. Here, we review those findings and concentrate on three main results: (1) Phase entrainment to speech sound is modulated by attention or predictions, likely supported by top-down signals and indicating higher-level processes involved in the brain's adjustment to speech. (2) As phase entrainment to speech can be observed without systematic fluctuations in sound amplitude or spectral content, it does not only reflect a passive steady-state ringing of the cochlea, but entails a higher-level process. (3) The role of intelligibility for phase entrainment is debated. Recent results suggest that intelligibility modulates the behavioral consequences of entrainment, rather than directly affecting the strength of entrainment in auditory regions. We conclude that phase entrainment to speech reflects a sophisticated mechanism: Several high-level processes interact to optimally align neural oscillations with predicted events of high relevance, even when they are hidden in a continuous stream of background noise.
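
    Phase entrainment of the kind reviewed here is typically quantified as phase locking between the speech amplitude envelope and a band-limited neural signal. The sketch below illustrates that generic measure rather than any reviewed study's pipeline; the toy 4 Hz envelope, the simulated EEG trace, and the 2-8 Hz analysis band are all assumptions.

    ```python
    import numpy as np
    from scipy.signal import butter, filtfilt, hilbert

    def bandpass(x, fs, lo, hi, order=4):
        b, a = butter(order, [lo / (fs / 2), hi / (fs / 2)], btype="band")
        return filtfilt(b, a, x)

    fs = 250                                    # typical EEG sampling rate
    t = np.arange(fs * 10) / fs
    speech_env = 1 + np.cos(2 * np.pi * 4 * t)  # toy 4 Hz speech envelope
    eeg = np.cos(2 * np.pi * 4 * t + 0.3) + 0.5 * np.random.randn(t.size)

    # Instantaneous phases in the delta/theta band via the analytic signal
    phi_env = np.angle(hilbert(bandpass(speech_env - speech_env.mean(), fs, 2, 8)))
    phi_eeg = np.angle(hilbert(bandpass(eeg, fs, 2, 8)))

    # Phase-locking value: 0 = no entrainment, 1 = perfect phase alignment
    plv = np.abs(np.mean(np.exp(1j * (phi_env - phi_eeg))))
    print(f"phase-locking value: {plv:.2f}")
    ```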

  4. Speech, Sound and Music Processing: Embracing Research in India

    DEFF Research Database (Denmark)

    The Computer Music Modeling and Retrieval (CMMR) 2011 conference was the 8th event of this international series, and the first that took place outside Europe. Since its beginnings in 2003, this conference has been co-organized by the Laboratoire de Mécanique et d'Acoustique in Marseille, France, and the Department of Architecture, Design and Media Technology (ad:mt), University of Aalborg, Esbjerg, Denmark, and has taken place in France, Italy, Spain, and Denmark. Historically, CMMR offers a cross-disciplinary overview of current music information retrieval and sound modeling activities and related topics... Classical music and its impact in cognitive science are the focus of discussion. Eminent scientists from the USA, Japan, Sweden, France, Poland, Taiwan, India and other European and Asian countries have delivered state-of-the-art lectures in these areas every year at different places, providing an opportunity...

  5. Neural Correlates of Phonological Processing in Speech Sound Disorder: A Functional Magnetic Resonance Imaging Study

    Science.gov (United States)

    Tkach, Jean A.; Chen, Xu; Freebairn, Lisa A.; Schmithorst, Vincent J.; Holland, Scott K.; Lewis, Barbara A.

    2011-01-01

    Speech sound disorders (SSD) are the largest group of communication disorders observed in children. One explanation for these disorders is that children with SSD fail to form stable phonological representations when acquiring the speech sound system of their language due to poor phonological memory (PM). The goal of this study was to examine PM in…

  6. Deficits in Letter-Speech Sound Associations but Intact Visual Conflict Processing in Dyslexia: Results from a Novel ERP-Paradigm

    OpenAIRE

    Bakos, Sarolta; Landerl, Karin; Bartling, Jürgen; Schulte-Körne, Gerd; Moll, Kristina

    2017-01-01

    The reading and spelling deficits characteristic of developmental dyslexia (dyslexia) have been related to problems in phonological processing and in learning associations between letters and speech-sounds. Even when children with dyslexia have learned the letters and their corresponding speech sounds, letter-speech sound associations might still be less automatized compared to children with age-adequate literacy skills. In order to examine automaticity in letter-speech sound associations and...

  7. Bilateral capacity for speech sound processing in auditory comprehension: evidence from Wada procedures.

    Science.gov (United States)

    Hickok, G; Okada, K; Barr, W; Pa, J; Rogalsky, C; Donnelly, K; Barde, L; Grant, A

    2008-12-01

    Data from lesion studies suggest that the ability to perceive speech sounds, as measured by auditory comprehension tasks, is supported by temporal lobe systems in both the left and right hemisphere. For example, patients with left temporal lobe damage and auditory comprehension deficits (i.e., Wernicke's aphasics) nonetheless comprehend isolated words better than one would expect if their speech perception system had been largely destroyed (70-80% accuracy). Further, when comprehension fails in such patients, their errors are more often semantically based than phonemically based. The question addressed by the present study is whether this ability of the right hemisphere to process speech sounds is a result of plastic reorganization following chronic left hemisphere damage, or whether the ability exists in undamaged language systems. We sought to test these possibilities by studying auditory comprehension in acute left versus right hemisphere deactivation during Wada procedures. A series of 20 patients undergoing clinically indicated Wada procedures were asked to listen to an auditorily presented stimulus word, and then point to its matching picture on a card that contained the target picture, a semantic foil, a phonemic foil, and an unrelated foil. This task was performed under three conditions: baseline, during left carotid injection of sodium amytal, and during right carotid injection of sodium amytal. Overall, left hemisphere injection led to a significantly higher error rate than right hemisphere injection. However, consistent with lesion work, the majority (75%) of these errors were semantic in nature. These findings suggest that auditory comprehension deficits are predominantly semantic in nature, even following acute left hemisphere disruption. This, in turn, supports the hypothesis that the right hemisphere is capable of speech sound processing in the intact brain.

  8. Listening to an audio drama activates two processing networks, one for all sounds, another exclusively for speech.

    Directory of Open Access Journals (Sweden)

    Robert Boldt

    Earlier studies have shown considerable intersubject synchronization of brain activity when subjects watch the same movie or listen to the same story. Here we investigated the across-subjects similarity of brain responses to speech and non-speech sounds in a continuous audio drama designed for blind people. Thirteen healthy adults listened for ∼19 min to the audio drama while their brain activity was measured with 3 T functional magnetic resonance imaging (fMRI). An intersubject-correlation (ISC) map, computed across the whole experiment to assess the stimulus-driven extrinsic brain network, indicated statistically significant ISC in temporal, frontal and parietal cortices, cingulate cortex, and amygdala. Group-level independent component (IC) analysis was used to parcel out the brain signals into functionally coupled networks, and the dependence of the ICs on external stimuli was tested by comparing them with the ISC map. This procedure revealed four extrinsic ICs, of which two, covering non-overlapping areas of the auditory cortex, were modulated by both speech and non-speech sounds. The two other extrinsic ICs, one left-hemisphere-lateralized and the other right-hemisphere-lateralized, were speech-related and comprised the superior and middle temporal gyri, temporal poles, and the left angular and inferior orbital gyri. In areas of low ISC, four ICs that were defined as intrinsic fluctuated similarly to the time courses of either the speech-sound-related or all-sounds-related extrinsic ICs. These ICs included the superior temporal gyrus, the anterior insula, and the frontal, parietal and midline occipital cortices. Taken together, substantial intersubject synchronization of cortical activity was observed in subjects listening to an audio drama, with results suggesting that speech is processed in two separate networks, one dedicated to the processing of speech sounds and the other to both speech and non-speech sounds.

  9. Transfer Effect of Speech-sound Learning on Auditory-motor Processing of Perceived Vocal Pitch Errors.

    Science.gov (United States)

    Chen, Zhaocong; Wong, Francis C K; Jones, Jeffery A; Li, Weifeng; Liu, Peng; Chen, Xi; Liu, Hanjun

    2015-08-17

    Speech perception and production are intimately linked. There is evidence that speech motor learning results in changes to auditory processing of speech. Whether speech motor control benefits from perceptual learning in speech, however, remains unclear. This event-related potential study investigated whether speech-sound learning can modulate the processing of feedback errors during vocal pitch regulation. Mandarin speakers were trained to perceive five Thai lexical tones while learning to associate pictures with spoken words over 5 days. Before and after training, participants produced sustained vowel sounds while they heard their vocal pitch feedback unexpectedly perturbed. As compared to the pre-training session, the magnitude of vocal compensation significantly decreased for the control group, but remained consistent for the trained group at the post-training session. However, the trained group had smaller and faster N1 responses to pitch perturbations and exhibited enhanced P2 responses that correlated significantly with their learning performance. These findings indicate that the cortical processing of vocal pitch regulation can be shaped by learning new speech-sound associations, suggesting that perceptual learning in speech can transfer to and facilitate the neural mechanisms underlying the online monitoring of auditory feedback during vocal production.

  10. The influence of (central) auditory processing disorder on the severity of speech-sound disorders in children.

    Science.gov (United States)

    Vilela, Nadia; Barrozo, Tatiane Faria; Pagan-Neves, Luciana de Oliveira; Sanches, Seisse Gabriela Gandolfi; Wertzner, Haydée Fiszbein; Carvallo, Renata Mota Mamede

    2016-02-01

    To identify a cutoff value based on the Percentage of Consonants Correct-Revised index that could indicate the likelihood of a child with a speech-sound disorder also having a (central) auditory processing disorder. Language, audiological and (central) auditory processing evaluations were administered. The participants were 27 subjects with speech-sound disorders aged 7 to 10 years and 11 months who were divided into two different groups according to their (central) auditory processing evaluation results. When a (central) auditory processing disorder was present in association with a speech disorder, the children tended to have lower scores on phonological assessments. A greater severity of speech disorder was related to a greater probability of the child having a (central) auditory processing disorder. The use of a cutoff value for the Percentage of Consonants Correct-Revised index successfully distinguished between children with and without a (central) auditory processing disorder. The severity of speech-sound disorder in children was influenced by the presence of (central) auditory processing disorder. The attempt to identify a cutoff value based on a severity index was successful.

  11. Predictive Brain Mechanisms in Sound-to-Meaning Mapping during Speech Processing.

    Science.gov (United States)

    Lyu, Bingjiang; Ge, Jianqiao; Niu, Zhendong; Tan, Li Hai; Gao, Jia-Hong

    2016-10-19

    Spoken language comprehension relies not only on the identification of individual words, but also on the expectations arising from contextual information. A distributed frontotemporal network is known to facilitate the mapping of speech sounds onto their corresponding meanings. However, how prior expectations influence this efficient mapping at the neuroanatomical level, especially in terms of individual words, remains unclear. Using fMRI, we addressed this question in the framework of the dual-stream model by scanning native speakers of Mandarin Chinese, a language highly dependent on context. We found that, within the ventral pathway, the violated expectations elicited stronger activations in the left anterior superior temporal gyrus and the ventral inferior frontal gyrus (IFG) for the phonological-semantic prediction of spoken words. Functional connectivity analysis showed that expectations were mediated by both top-down modulation from the left ventral IFG to the anterior temporal regions and enhanced cross-stream integration through strengthened connections between different subregions of the left IFG. By further investigating the dynamic causality within the dual-stream model, we elucidated how the human brain accomplishes sound-to-meaning mapping for words in a predictive manner. In daily communication via spoken language, one of the core processes is understanding the words being used. Effortless and efficient information exchange via speech relies not only on the identification of individual spoken words, but also on the contextual information giving rise to expected meanings. Despite the accumulating evidence for the bottom-up perception of auditory input, it is still not fully understood how the top-down modulation is achieved in the extensive frontotemporal cortical network. Here, we provide a comprehensive description of the neural substrates underlying sound-to-meaning mapping and demonstrate how the dual-stream model functions in the modulation of…

  12. Cortical processing of speech and non-speech sounds in autism and Asperger syndrome

    OpenAIRE

    Lepistö, Tuulia

    2008-01-01

    Autism and Asperger syndrome (AS) are neurodevelopmental disorders characterised by deficient social and communication skills, as well as restricted, repetitive patterns of behaviour. The language development in individuals with autism is significantly delayed and deficient, whereas in individuals with AS, the structural aspects of language develop quite normally. Both groups, however, have semantic-pragmatic language deficits. The present thesis investigated auditory processing in individual...

  13. Musical ability and non-native speech-sound processing are linked through sensitivity to pitch and spectral information.

    Science.gov (United States)

    Kempe, Vera; Bublitz, Dennis; Brooks, Patricia J

    2015-05-01

    Is the observed link between musical ability and non-native speech-sound processing due to enhanced sensitivity to acoustic features underlying both musical and linguistic processing? To address this question, native English speakers (N = 118) discriminated Norwegian tonal contrasts and Norwegian vowels. Short tones differing in temporal, pitch, and spectral characteristics were used to measure sensitivity to the various acoustic features implicated in musical and speech processing. Musical ability was measured using Gordon's Advanced Measures of Musical Audiation. Results showed that sensitivity to specific acoustic features played a role in non-native speech-sound processing: Controlling for non-verbal intelligence, prior foreign language-learning experience, and sex, sensitivity to pitch and spectral information partially mediated the link between musical ability and discrimination of non-native vowels and lexical tones. The findings suggest that while sensitivity to certain acoustic features partially mediates the relationship between musical ability and non-native speech-sound processing, complex tests of musical ability also tap into other shared mechanisms.

  14. Not all sounds sound the same: Parkinson's disease affects differently emotion processing in music and in speech prosody.

    Science.gov (United States)

    Lima, César F; Garrett, Carolina; Castro, São Luís

    2013-01-01

    Does emotion processing in music and speech prosody recruit common neurocognitive mechanisms? To examine this question, we implemented a cross-domain comparative design in Parkinson's disease (PD). Twenty-four patients and 25 controls performed emotion recognition tasks for music and spoken sentences. In music, patients had impaired recognition of happiness and peacefulness, and intact recognition of sadness and fear; this pattern was independent of general cognitive and perceptual abilities. In speech, patients had a small global impairment, which was significantly mediated by executive dysfunction. Hence, PD affected musical and prosodic emotions differently. This dissociation indicates that the mechanisms underlying the two domains are partly independent.

  15. Evaluation of Speech Recognition of Cochlear Implant Recipients Using Adaptive, Digital Remote Microphone Technology and a Speech Enhancement Sound Processing Algorithm.

    Science.gov (United States)

    Wolfe, Jace; Morais, Mila; Schafer, Erin; Agrawal, Smita; Koch, Dawn

    2015-05-01

    Cochlear implant recipients often experience difficulty with understanding speech in the presence of noise. Cochlear implant manufacturers have developed sound processing algorithms designed to improve speech recognition in noise, and research has shown these technologies to be effective. Remote microphone technology utilizing adaptive, digital wireless radio transmission has also been shown to provide significant improvement in speech recognition in noise. There are no studies examining the potential improvement in speech recognition in noise when these two technologies are used simultaneously. The goal of this study was to evaluate the potential benefits and limitations associated with the simultaneous use of a sound processing algorithm designed to improve performance in noise (Advanced Bionics ClearVoice) and a remote microphone system that incorporates adaptive, digital wireless radio transmission (Phonak Roger). A two-by-two repeated measures design was used to examine performance differences obtained without these technologies compared to the use of each technology separately as well as the simultaneous use of both technologies. Eleven Advanced Bionics (AB) cochlear implant recipients, ages 11 to 68 yr. AzBio sentence recognition was measured in quiet and in the presence of classroom noise ranging in level from 50 to 80 dBA in 5-dB steps. Performance was evaluated in four conditions: (1) No ClearVoice and no Roger, (2) ClearVoice enabled without the use of Roger, (3) ClearVoice disabled with Roger enabled, and (4) simultaneous use of ClearVoice and Roger. Speech recognition in quiet was better than speech recognition in noise for all conditions. Use of ClearVoice and Roger each provided significant improvement in speech recognition in noise. The best performance in noise was obtained with the simultaneous use of ClearVoice and Roger. ClearVoice and Roger technology each improves speech recognition in noise, particularly when used at the same time.

  16. Intelligibility of speech of children with speech and sound disorders

    OpenAIRE

    Ivetac, Tina

    2014-01-01

    The purpose of this study is to examine speech intelligibility of children with primary speech and sound disorders aged 3 to 6 years in everyday life. The research problem is based on the degree to which parents or guardians, immediate family members (sister, brother, grandparents), extended family members (aunt, uncle, cousin), child's friends, other acquaintances, child's teachers and strangers understand the speech of children with speech sound disorders. We examined whether the level ...

  17. Speech Processing.

    Science.gov (United States)

    1983-05-01

    The VDE system developed had the capability of recognizing up to 248 separate words in syntactic structures. The two systems described are isolated... [Recoverable table-of-contents entries: ...AND SPEAKER RECOGNITION, by M.J. Hunt; ASSESSMENT OF SPEECH SYSTEMS, by R.K. Moore; A SURVEY OF CURRENT EQUIPMENT AND RESEARCH, by J.S. Bridle; TECHNOLOGY IN NAVY TRAINING SYSTEMS, by R. Breaux, M. Blind and R. Lynchard; GENERAL REVIEW OF MILITARY APPLICATIONS OF VOICE PROCESSING, by Dr. Bruno ...]

  18. Spectral integration in speech and non-speech sounds

    Science.gov (United States)

    Jacewicz, Ewa

    2005-04-01

    Spectral integration (or formant averaging) was proposed in vowel perception research to account for the observation that a reduction of the intensity of one of two closely spaced formants (as in /u/) produced a predictable shift in vowel quality [Delattre et al., Word 8, 195-210 (1952)]. A related observation was reported in psychoacoustics, indicating that when the components of a two-tone periodic complex differ in amplitude and frequency, its perceived pitch is shifted toward that of the more intense tone [Helmholtz, App. XIV (1875/1948)]. Subsequent research in both fields focused on the frequency interval that separates these two spectral components, in an attempt to determine the size of the bandwidth for spectral integration to occur. This talk will review the accumulated evidence for and against spectral integration within the hypothesized limit of 3.5 Bark for static and dynamic signals in speech perception and psychoacoustics. Based on similarities in the processing of speech and non-speech sounds, it is suggested that spectral integration may reflect a general property of the auditory system. A larger frequency bandwidth, possibly close to 3.5 Bark, may be utilized in integrating acoustic information, including speech, complex signals, or sound quality of a violin.
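
    The 3.5 Bark criterion discussed above is easy to make concrete. The sketch below uses the Traunmüller (1990) Hz-to-Bark approximation, one of several published formulas and not necessarily the one used in this work, to test whether two formant frequencies fall within the hypothesized integration bandwidth.

    ```python
    def hz_to_bark(f_hz: float) -> float:
        """Traunmüller (1990) approximation of the Bark critical-band scale."""
        return 26.81 * f_hz / (1960.0 + f_hz) - 0.53

    def within_integration_band(f1_hz: float, f2_hz: float, limit: float = 3.5) -> bool:
        """True if two spectral components are close enough to integrate."""
        return abs(hz_to_bark(f1_hz) - hz_to_bark(f2_hz)) < limit

    print(within_integration_band(300, 650))   # True: ~3.1 Bark apart (closely spaced, back-vowel-like)
    print(within_integration_band(280, 2250))  # False: ~11 Bark apart (widely spaced, /i/-like)
    ```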

  19. Interventions for Speech Sound Disorders in Children

    Science.gov (United States)

    Williams, A. Lynn, Ed.; McLeod, Sharynne, Ed.; McCauley, Rebecca J., Ed.

    2010-01-01

    With detailed discussion and invaluable video footage of 23 treatment interventions for speech sound disorders (SSDs) in children, this textbook and DVD set should be part of every speech-language pathologist's professional preparation. Focusing on children with functional or motor-based speech disorders from early childhood through the early…

  20. Musical and linguistic expertise influence pre-attentive and attentive processing of non-speech sounds.

    Science.gov (United States)

    Marie, Céline; Kujala, Teija; Besson, Mireille

    2012-04-01

    The aim of this experiment was two-fold. Our first goal was to determine whether linguistic expertise influences the pre-attentive [as reflected by the mismatch negativity (MMN)] and attentive processing (as reflected by behavioural discrimination accuracy) of non-speech, harmonic sounds. The second was to directly compare the effects of linguistic and musical expertise. To this end, we compared non-musician native speakers of a quantity language, Finnish, in which duration is a phonemically contrastive cue, with French musicians and French non-musicians. Results revealed that pre-attentive and attentive processing of duration deviants was enhanced in Finnish non-musicians and French musicians compared to French non-musicians. By contrast, MMN in French musicians was larger than in both Finns and French non-musicians for frequency deviants, whereas no between-group differences were found for intensity deviants. By showing similar effects of linguistic and musical expertise, these results argue in favor of common processing of duration in music and speech.

  1. Facilitated auditory detection for speech sounds

    Directory of Open Access Journals (Sweden)

    Carine Signoret

    2011-07-01

    Although it is well known that knowledge facilitates higher cognitive functions, such as visual and auditory word recognition, little is known about the influence of knowledge on detection, particularly in the auditory modality. Our study tested the influence of phonological and lexical knowledge on auditory detection. Words, pseudowords and complex non-phonological sounds, energetically matched as closely as possible, were presented at a range of presentation levels from sub-threshold to clearly audible. The participants performed a detection task (Experiments 1 and 2) that was followed by a two-alternative forced-choice recognition task in Experiment 2. The results of this second task in Experiment 2 suggest correct recognition of words in the absence of detection under a subjective threshold approach. In the detection task of both experiments, phonological stimuli (words and pseudowords) were better detected than non-phonological stimuli (complex sounds) presented close to the auditory threshold. This finding suggests an advantage of speech for signal detection. An additional advantage of words over pseudowords was observed in Experiment 2, suggesting that lexical knowledge could also improve auditory detection when listeners had to recognize the stimulus in a subsequent task. Two simulations of detection performance performed on the sound signals confirmed that the advantage of speech over non-speech processing could not be attributed to energetic differences in the stimuli.

  2. The influence of environmental sound training on the perception of spectrally degraded speech and environmental sounds.

    Science.gov (United States)

    Shafiro, Valeriy; Sheft, Stanley; Gygi, Brian; Ho, Kim Thien N

    2012-06-01

    Perceptual training with spectrally degraded environmental sounds results in improved environmental sound identification, with benefits shown to extend to untrained speech perception as well. The present study extended those findings to examine longer-term training effects as well as effects of mere repeated exposure to sounds over time. Participants received two pretests (1 week apart) prior to a week-long environmental sound training regimen, which was followed by two posttest sessions, separated by another week without training. Spectrally degraded stimuli, processed with a four-channel vocoder, consisted of a 160-item environmental sound test, word and sentence tests, and a battery of basic auditory abilities and cognitive tests. Results indicated significant improvements in all speech and environmental sound scores between the initial pretest and the last posttest with performance increments following both exposure and training. For environmental sounds (the stimulus class that was trained), the magnitude of positive change that accompanied training was much greater than that due to exposure alone, with improvement for untrained sounds roughly comparable to the speech benefit from exposure. Additional tests of auditory and cognitive abilities showed that speech and environmental sound performance were differentially correlated with tests of spectral and temporal-fine-structure processing, whereas working memory and executive function were correlated with speech, but not environmental sound perception. These findings indicate generalizability of environmental sound training and provide a basis for implementing environmental sound training programs for cochlear implant (CI) patients.
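
    A noise-excited vocoder of the four-channel kind used to degrade these stimuli can be sketched as follows. The band edges, filter order, and Hilbert-envelope extraction are illustrative assumptions, not the study's exact processing parameters.

    ```python
    import numpy as np
    from scipy.signal import butter, filtfilt, hilbert

    def vocode(signal, fs, edges=(100, 500, 1500, 3500, 7000)):
        """Four-channel noise vocoder: per-band envelope modulates band noise."""
        out = np.zeros_like(signal, dtype=float)
        for lo, hi in zip(edges[:-1], edges[1:]):
            b, a = butter(3, [lo / (fs / 2), hi / (fs / 2)], btype="band")
            band = filtfilt(b, a, signal)
            env = np.abs(hilbert(band))                       # amplitude envelope
            noise = filtfilt(b, a, np.random.randn(len(signal)))
            out += env * noise                                # envelope-modulated noise
        return out / (np.max(np.abs(out)) + 1e-9)             # rough normalization

    fs = 16000
    speech = np.random.randn(fs)  # stand-in for a recorded sentence
    degraded = vocode(speech, fs)
    ```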

  3. Sound of mind : electrophysiological and behavioural evidence for the role of context, variation and informativity in human speech processing

    NARCIS (Netherlands)

    Nixon, Jessie Sophia

    2014-01-01

    Spoken communication involves transmission of a message which takes physical form in acoustic waves. Within any given language, acoustic cues pattern in language-specific ways along language-specific acoustic dimensions to create speech sound contrasts. These cues are utilized by listeners to…

  4. Speech versus singing: Infants choose happier sounds

    Directory of Open Access Journals (Sweden)

    Mariève Corbeil

    2013-06-01

    Infants prefer speech to non-vocal sounds and to non-human vocalizations, and they prefer happy-sounding speech to neutral speech. They also exhibit an interest in singing, but there is little knowledge of their relative interest in speech and singing. The present study explored infants' attention to unfamiliar audio samples of speech and singing. In Experiment 1, infants 4-13 months of age were exposed to happy-sounding infant-directed speech versus hummed lullabies by the same woman. They listened significantly longer to the speech, which had considerably greater acoustic variability and expressiveness, than to the lullabies. In Experiment 2, infants of comparable age who heard the lyrics of a Turkish children's song spoken versus sung in a joyful/happy manner did not exhibit differential listening. Infants in Experiment 3 heard the happily sung lyrics of the Turkish children's song versus a version that was spoken in an adult-directed or affectively neutral manner. They listened significantly longer to the sung version. Overall, happy voice quality rather than vocal mode (speech or singing) was the principal contributor to infant attention, regardless of age.

  5. Sound quality measures for speech in noise through a commercial hearing aid implementing digital noise reduction.

    Science.gov (United States)

    Ricketts, Todd A; Hornsby, Benjamin W Y

    2005-05-01

    This brief report discusses the effect of digital noise reduction (DNR) processing on aided speech recognition and sound quality measures in 14 adults fitted with a commercial hearing aid. Measures of speech recognition and sound quality were obtained in two different speech-in-noise conditions (71 dBA speech, +6 dB SNR and 75 dBA speech, +1 dB SNR). The results revealed that the presence or absence of DNR processing did not impact speech recognition in noise (either positively or negatively). Paired comparisons of sound quality for the same speech in noise signals, however, revealed a strong preference for DNR processing. These data suggest that at least one implementation of DNR processing is capable of providing improved sound quality, for speech in noise, in the absence of improved speech recognition.
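
    Conditions such as "71 dBA speech, +6 dB SNR" pin the noise level implicitly (here, 65 dBA). Digitally, such mixtures are usually built by scaling the noise against the speech RMS; the sketch below shows that standard construction, with the mix_at_snr helper an assumed name rather than anything from the study.

    ```python
    import numpy as np

    def mix_at_snr(speech, noise, snr_db):
        """Scale noise so 20*log10(rms_speech / rms_noise) == snr_db, then mix."""
        rms = lambda x: np.sqrt(np.mean(x ** 2))
        noise = noise[:len(speech)]
        noise = noise * rms(speech) / (rms(noise) * 10 ** (snr_db / 20))
        return speech + noise

    fs = 16000
    speech = np.random.randn(fs)  # placeholders for recorded speech and babble
    babble = np.random.randn(fs)
    mixed = mix_at_snr(speech, babble, snr_db=6.0)  # the "+6 dB SNR" condition
    ```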

  6. Visual feedback of tongue movement for novel speech sound learning

    Directory of Open Access Journals (Sweden)

    William F Katz

    2015-11-01

    Pronunciation training studies have yielded important information concerning the processing of audiovisual (AV) information. Second language (L2) learners show increased reliance on bottom-up, multimodal input for speech perception (compared to monolingual individuals). However, little is known about the role of viewing one's own speech articulation processes during speech training. The current study investigated whether real-time, visual feedback for tongue movement can improve a speaker's learning of non-native speech sounds. An interactive 3D tongue visualization system based on electromagnetic articulography (EMA) was used in a speech training experiment. Native speakers of American English produced a novel speech sound (/ɖ̠/, a voiced coronal palatal stop) before, during, and after trials in which they viewed their own speech movements using the 3D model. Talkers' productions were evaluated using kinematic (tongue-tip spatial positioning) and acoustic (burst spectra) measures. The results indicated a rapid gain in accuracy associated with visual feedback training. The findings are discussed with respect to neural models for multimodal speech processing.

  7. Toward a Model of Pediatric Speech Sound Disorders (SSD) for Differential Diagnosis and Therapy Planning

    NARCIS (Netherlands)

    Terband, Hayo; Maassen, Bernardus; Maas, Edwin; van Lieshout, Pascal; Maassen, Ben; Terband, Hayo

    2016-01-01

    The classification and differentiation of pediatric speech sound disorders (SSD) is one of the main questions in the field of speech and language pathology. Terms for classifying childhood SSD and motor speech disorders (MSD) refer to speech production processes, and a variety of methods of…

  8. Precision of working memory for speech sounds.

    Science.gov (United States)

    Joseph, Sabine; Iverson, Paul; Manohar, Sanjay; Fox, Zoe; Scott, Sophie K; Husain, Masud

    2015-01-01

    Memory for speech sounds is a key component of models of verbal working memory (WM). But how good is verbal WM? Most investigations assess this using binary report measures to derive a fixed number of items that can be stored. However, recent findings in visual WM have challenged such "quantized" views by employing measures of recall precision with an analogue response scale. WM for speech sounds might rely on both continuous and categorical storage mechanisms. Using a novel speech matching paradigm, we measured WM recall precision for phonemes. Vowel qualities were sampled from a formant space continuum. A probe vowel had to be adjusted to match the vowel quality of a target on a continuous, analogue response scale. Crucially, this provided an index of the variability of a memory representation around its true value and thus allowed us to estimate how memories were distorted from the original sounds. Memory load affected the quality of speech sound recall in two ways. First, there was a gradual decline in recall precision with increasing number of items, consistent with the view that WM representations of speech sounds become noisier with an increase in the number of items held in memory, just as for vision. Based on multidimensional scaling (MDS), the level of noise appeared to be reflected in distortions of the formant space. Second, as memory load increased, there was evidence of greater clustering of participants' responses around particular vowels. A mixture model captured both continuous and categorical responses, demonstrating a shift from continuous to categorical memory with increasing WM load. This suggests that direct acoustic storage can be used for single items, but when more items must be stored, categorical representations must be used.
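
    The mixture model described, continuous target-centred recall plus responses drawn to category prototypes, can be illustrated with a toy EM fit. The one-dimensional vowel axis, the fixed noise width, and the single prototype below are simplifying assumptions; the study's actual model operated over its measured formant space.

    ```python
    import numpy as np

    rng = np.random.default_rng(0)
    target, prototype, sigma = 0.3, 0.0, 0.1  # positions on a 1-D vowel axis
    # Simulate 500 responses: 40% snap to the category prototype, 60% are
    # continuous recall centred on the true target.
    resp = np.where(rng.random(500) < 0.4,
                    rng.normal(prototype, sigma, 500),
                    rng.normal(target, sigma, 500))

    def gauss(x, mu, s):
        return np.exp(-0.5 * ((x - mu) / s) ** 2) / (s * np.sqrt(2 * np.pi))

    w = 0.5                                   # initial categorical weight
    for _ in range(100):                      # EM on the mixing weight only
        p_cat = w * gauss(resp, prototype, sigma)
        p_con = (1 - w) * gauss(resp, target, sigma)
        r = p_cat / (p_cat + p_con)           # E-step: responsibilities
        w = r.mean()                          # M-step: update weight
    print(f"estimated categorical share: {w:.2f}")  # recovers ~0.40
    ```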

  9. When speech sounds like music.

    Science.gov (United States)

    Falk, Simone; Rathcke, Tamara; Dalla Bella, Simone

    2014-08-01

    Repetition can boost memory and perception. However, repeating the same stimulus several times in immediate succession also induces intriguing perceptual transformations and illusions. Here, we investigate the Speech to Song Transformation (S2ST), a massed repetition effect in the auditory modality, which crosses the boundaries between language and music. In the S2ST, a phrase repeated several times shifts to being heard as sung. To better understand this unique cross-domain transformation, we examined the perceptual determinants of the S2ST, in particular the role of acoustics. In 2 experiments, the effects of 2 pitch properties and 3 rhythmic properties on the probability and speed of occurrence of the transformation were examined. Results showed that both pitch and rhythmic properties are key features fostering the transformation. However, some properties proved to be more conducive to the S2ST than others. Stable tonal targets that allowed for the perception of a musical melody led to the S2ST more often and more quickly than scalar intervals. Recurring durational contrasts arising from segmental grouping favoring a metrical interpretation of the stimulus also facilitated the S2ST. This was, however, not the case for a regular beat structure within and across repetitions. In addition, individual perceptual abilities predicted the likelihood of the S2ST. Overall, the study demonstrated that repetition enables listeners to reinterpret specific prosodic features of spoken utterances in terms of musical structures. The findings underline a tight link between language and music, but they also reveal important differences in the communicative functions of prosodic structure in the 2 domains.

  10. Applications of Hilbert Spectral Analysis for Speech and Sound Signals

    Science.gov (United States)

    Huang, Norden E.

    2003-01-01

    A new method for analyzing nonlinear and nonstationary data has been developed, and the natural applications are to speech and sound signals. The key part of the method is the Empirical Mode Decomposition method, with which any complicated data set can be decomposed into a finite and often small number of Intrinsic Mode Functions (IMF). An IMF is defined as any function having the same numbers of zero-crossings and extrema, and also having symmetric envelopes defined by the local maxima and minima, respectively. The IMF also admits a well-behaved Hilbert transform. This decomposition method is adaptive and, therefore, highly efficient. Since the decomposition is based on the local characteristic time scale of the data, it is applicable to nonlinear and nonstationary processes. With the Hilbert transform, the Intrinsic Mode Functions yield instantaneous frequencies as functions of time, which give sharp identifications of embedded structures. This method can be used to process all acoustic signals. Specifically, it can process speech signals for speech synthesis, speaker identification and verification, speech recognition, and sound signal enhancement and filtering. Additionally, the acoustical signals from machinery are essentially the way the machines are talking to us: whether carried through the air or as vibration on the machines themselves, they can tell us the operating conditions of the machines. Thus, we can use the acoustic signal to diagnose the problems of machines.
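
    The Hilbert step of the method, turning a single IMF into an instantaneous-frequency track, is compact enough to sketch (the EMD sifting that produces the IMF is omitted; a chirp stands in for one here).

    ```python
    import numpy as np
    from scipy.signal import hilbert

    fs = 8000
    t = np.arange(fs) / fs
    imf = np.cos(2 * np.pi * (100 * t + 40 * t ** 2))  # chirp standing in for an IMF

    analytic = hilbert(imf)                        # analytic signal of the IMF
    phase = np.unwrap(np.angle(analytic))
    inst_freq = np.diff(phase) * fs / (2 * np.pi)  # instantaneous frequency in Hz

    # Sweeps from ~100 Hz at t=0 toward ~180 Hz at t=1, tracking 100 + 80t
    # (indices chosen away from the edges, where the Hilbert transform is biased)
    print(inst_freq[5:10].round(1), inst_freq[-10:-5].round(1))
    ```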

  11. Polysyllable Speech Accuracy and Predictors of Later Literacy Development in Preschool Children with Speech Sound Disorders

    Science.gov (United States)

    Masso, Sarah; Baker, Elise; McLeod, Sharynne; Wang, Cen

    2017-01-01

    Purpose: The aim of this study was to determine if polysyllable accuracy in preschoolers with speech sound disorders (SSD) was related to known predictors of later literacy development: phonological processing, receptive vocabulary, and print knowledge. Polysyllables--words of three or more syllables--are important to consider because unlike…

  12. Developmental changes in brain activation involved in the production of novel speech sounds in children.

    Science.gov (United States)

    Hashizume, Hiroshi; Taki, Yasuyuki; Sassa, Yuko; Thyreau, Benjamin; Asano, Michiko; Asano, Kohei; Takeuchi, Hikaru; Nouchi, Rui; Kotozaki, Yuka; Jeong, Hyeonjeong; Sugiura, Motoaki; Kawashima, Ryuta

    2014-08-01

    Older children are more successful at producing unfamiliar, non-native speech sounds than younger children during the initial stages of learning. To reveal the neuronal underpinning of the age-related increase in the accuracy of non-native speech production, we examined the developmental changes in activation involved in the production of novel speech sounds using functional magnetic resonance imaging. Healthy right-handed children (aged 6-18 years) were scanned while performing an overt repetition task and a perceptual task involving aurally presented non-native and native syllables. Productions of non-native speech sounds were recorded and evaluated by native speakers. The mouth regions in the bilateral primary sensorimotor areas were activated more significantly during the repetition task relative to the perceptual task. The hemodynamic response in the left inferior frontal gyrus pars opercularis (IFG pOp) specific to non-native speech sound production (defined by prior hypothesis) increased with age. Additionally, the accuracy of non-native speech sound production increased with age. These results provide the first evidence of developmental changes in the neural processes underlying the production of novel speech sounds. Our data further suggest that the recruitment of the left IFG pOp during the production of novel speech sounds was possibly enhanced due to the maturation of the neuronal circuits needed for speech motor planning. This, in turn, would lead to improvement in the ability to immediately imitate non-native speech.

  13. Implications of diadochokinesia in children with speech sound disorder.

    Science.gov (United States)

    Wertzner, Haydée Fiszbein; Pagan-Neves, Luciana de Oliveira; Alves, Renata Ramos; Barrozo, Tatiane Faria

    2013-01-01

    To verify the performance of children with and without speech sound disorder in oral motor skills measured by oral diadochokinesia, according to age and gender, and to compare the results obtained by two different methods of analysis. Participants were 72 subjects aged from 5 years to 7 years and 11 months, divided into four subgroups according to the presence of speech sound disorder (Study Group and Control Group) and age (above or below 6 years and 5 months). Diadochokinesia skills were assessed by the repetition of the sequences 'pa', 'ta', 'ka' and 'pataka', measured both manually and by the software Motor Speech Profile®. Gender was statistically different for both groups, but it did not influence the number of sequences per second produced. A correlation between the number of sequences per second and age was observed for all sequences (except for 'ka') only for the control group children. Comparison between groups did not indicate differences between the number of sequences per second and age. Results presented strong agreement between the values of oral diadochokinesia measured manually and by MSP. This research demonstrated the importance of using different methods of analysis in the functional evaluation of oro-motor processing aspects in children with speech sound disorder, and evidenced the oro-motor difficulties of children younger than eight years old.
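
    DDK performance is reported as sequences (or syllables) per second. A rough automatic estimate, loosely analogous to what the manual count and the Motor Speech Profile measure provide, can be obtained by counting onsets in the energy envelope; the frame size, peak threshold, and minimum peak spacing below are assumptions.

    ```python
    import numpy as np
    from scipy.signal import find_peaks

    def ddk_rate(x, fs, frame_ms=10):
        """Syllables per second estimated from peaks in the RMS energy envelope."""
        n = int(fs * frame_ms / 1000)
        env = np.array([np.sqrt(np.mean(x[i:i + n] ** 2))
                        for i in range(0, len(x) - n, n)])
        # Require peaks above 30% of the maximum and at least 60 ms apart
        peaks, _ = find_peaks(env, height=0.3 * env.max(),
                              distance=int(60 / frame_ms))
        return len(peaks) / (len(x) / fs)

    fs = 16000
    recording = np.random.randn(fs * 3)  # placeholder for a /pataka/ recording
    print(f"{ddk_rate(recording, fs):.1f} syllables per second")
    ```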

  14. Digital speech processing using Matlab

    CERN Document Server

    Gopi, E S

    2014-01-01

    Digital Speech Processing Using Matlab deals with digital speech pattern recognition, speech production model, speech feature extraction, and speech compression. The book is written in a manner that is suitable for beginners pursuing basic research in digital speech processing. Matlab illustrations are provided for most topics to enable better understanding of concepts. This book also deals with the basic pattern recognition techniques (illustrated with speech signals using Matlab) such as PCA, LDA, ICA, SVM, HMM, GMM, BPN, and KSOM.

  15. Subtyping Children with Speech Sound Disorders by Endophenotypes

    Science.gov (United States)

    Lewis, Barbara A.; Avrich, Allison A.; Freebairn, Lisa A.; Taylor, H. Gerry; Iyengar, Sudha K.; Stein, Catherine M.

    2011-01-01

    Purpose: The present study examined associations of 5 endophenotypes (i.e., measurable skills that are closely associated with speech sound disorders and are useful in detecting genetic influences on speech sound production), oral motor skills, phonological memory, phonological awareness, vocabulary, and speeded naming, with 3 clinical criteria…

  16. The effects of bilingualism on children's perception of speech sounds

    NARCIS (Netherlands)

    Brasileiro, I.

    2009-01-01

    The general topic addressed by this dissertation is that of bilingualism, and more specifically, the topic of bilingual acquisition of speech sounds. The central question in this study is the following: does bilingualism affect children's perceptual development of speech sounds? The term bilingual…

  17. Recent advances in nonlinear speech processing

    CERN Document Server

    Faundez-Zanuy, Marcos; Esposito, Antonietta; Cordasco, Gennaro; Drugman, Thomas; Solé-Casals, Jordi; Morabito, Francesco

    2016-01-01

    This book presents recent advances in nonlinear speech processing beyond nonlinear techniques. It shows how the field exploits heuristic and psychological models of human interaction in order to succeed in implementations of socially believable VUIs and applications for human health and psychological support. The book takes into account the multifunctional role of speech and what is "outside of the box" (see Björn Schuller's foreword). To this aim, the book is organized in 6 sections, each collecting a small number of short chapters reporting advances "inside" and "outside" themes related to nonlinear speech research. The themes emphasize theoretical and practical issues for modelling socially believable speech interfaces, ranging from efforts to capture the nature of sound changes in linguistic contexts and the timing nature of speech, to labors to identify and detect speech features that help in the diagnosis of psychological and neuronal disease, to attempts to improve the effectiveness and performa...

  18. Emergence of category-level sensitivities in non-native speech sound learning

    Directory of Open Access Journals (Sweden)

    Emily Myers

    2014-08-01

    Over the course of development, speech sounds that are contrastive in one's native language tend to become perceived categorically: that is, listeners are unaware of variation within phonetic categories while showing excellent sensitivity to speech sounds that span linguistically meaningful phonetic category boundaries. The end stage of this developmental process is that the perceptual systems that handle acoustic-phonetic information show special tuning to native language contrasts, and as such, category-level information appears to be present at even fairly low levels of the neural processing stream. Research on adults acquiring non-native speech categories offers an avenue for investigating the interplay of category-level information and perceptual sensitivities to these sounds as speech categories emerge. In particular, one can observe the neural changes that unfold as listeners learn not only to perceive acoustic distinctions that mark non-native speech sound contrasts, but also to map these distinctions onto category-level representations. An emergent literature on the neural basis of novel and non-native speech sound learning offers new insight into this question. In this review, I will examine this literature in order to answer two key questions. First, where in the neural pathway does sensitivity to category-level phonetic information first emerge over the trajectory of speech sound learning? Second, how do frontal and temporal brain areas work in concert over the course of non-native speech sound learning? Finally, in the context of this literature I will describe a model of speech sound learning in which rapidly-adapting access to categorical information in the frontal lobes modulates the sensitivity of stable, slowly-adapting responses in the temporal lobes.

  19. A homology sound-based algorithm for speech signal interference

    Science.gov (United States)

    Jiang, Yi-jiao; Chen, Hou-jin; Li, Ju-peng; Zhang, Zhan-song

    2015-12-01

    Aiming at secure analog speech communication, a homology sound-based algorithm for speech signal interference is proposed in this paper. We first split the speech signal into phonetic fragments by a short-term energy method and establish an interference noise cache library with the phonetic fragments. Then we implement the homology sound interference by mixing randomly selected interferential fragments with the original speech in real time. The computer simulation results indicated that the interference produced by this algorithm has the advantages of real-time operation, randomness, and high correlation with the original signal, compared with traditional noise interference methods such as white noise interference. After further studies, the proposed algorithm may be readily used in secure speech communication.
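
    The two stages described, energy-based fragment extraction into a cache and mixing of randomly drawn fragments back into the signal, can be sketched as below. The frame length, energy threshold, mixing level, and offline (rather than real-time) loop are illustrative assumptions.

    ```python
    import numpy as np

    def split_by_energy(x, fs, frame_ms=20, thresh_ratio=0.1):
        """Keep frames whose short-term energy clears a threshold (the cache)."""
        n = int(fs * frame_ms / 1000)
        frames = [x[i:i + n] for i in range(0, len(x) - n, n)]
        energies = np.array([np.sum(f ** 2) for f in frames])
        active = energies > thresh_ratio * energies.max()
        return [f for f, keep in zip(frames, active) if keep]

    def interfere(x, fragments, rng, level=0.8):
        """Overlay randomly drawn homologous fragments onto the signal."""
        out = x.copy()
        i = 0
        while i < len(out):
            frag = fragments[rng.integers(len(fragments))]
            j = min(i + len(frag), len(out))
            out[i:j] += level * frag[:j - i]
            i = j
        return out

    fs = 8000
    speech = np.random.randn(fs * 2)     # placeholder for real speech
    cache = split_by_energy(speech, fs)  # interference noise cache library
    jammed = interfere(speech, cache, np.random.default_rng(1))
    ```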

  20. Effects of sounds of locomotion on speech perception

    Directory of Open Access Journals (Sweden)

    Matz Larsson

    2015-01-01

    Human locomotion typically creates noise, a possible consequence of which is the masking of sound signals originating in the surroundings. When walking side by side, people often subconsciously synchronize their steps. The neurophysiological and evolutionary background of this behavior is unclear. The present study investigated the potential of sound created by walking to mask perception of speech and compared the masking produced by walking in step with that produced by unsynchronized walking. The masking sound (footsteps on gravel) and the target sound (speech) were presented through the same speaker to 15 normal-hearing subjects. The original recorded walking sound was modified to mimic the sound of two individuals walking in pace or walking out of synchrony. The participants were instructed to adjust the sound level of the target sound until they could just comprehend the speech signal (the "just follow conversation" or JFC level) when presented simultaneously with synchronized or unsynchronized walking sound at 40 dBA, 50 dBA, 60 dBA, or 70 dBA. Synchronized walking sounds produced slightly less masking of speech than did unsynchronized sound. The median JFC threshold in the synchronized condition was 38.5 dBA, while the corresponding value for the unsynchronized condition was 41.2 dBA. Combined results at all sound pressure levels showed an improvement in the signal-to-noise ratio (SNR) for synchronized footsteps; the median difference was 2.7 dB and the mean difference was 1.2 dB [P < 0.001, repeated-measures analysis of variance (RM-ANOVA)]. The difference was significant for masker levels of 50 dBA and 60 dBA, but not for 40 dBA or 70 dBA. This study provides evidence that synchronized walking may reduce the masking potential of footsteps.

  1. Comparing Feedback Types in Multimedia Learning of Speech by Young Children With Common Speech Sound Disorders: Research Protocol for a Pretest Posttest Independent Measures Control Trial.

    Science.gov (United States)

    Doubé, Wendy; Carding, Paul; Flanagan, Kieran; Kaufman, Jordy; Armitage, Hannah

    2018-01-01

    Children with speech sound disorders benefit from feedback about the accuracy of sounds they make. Home practice can reinforce feedback received from speech pathologists. Games in mobile device applications could encourage home practice, but those currently available are of limited value because they are unlikely to elaborate "Correct"/"Incorrect" feedback with information that can assist in improving the accuracy of the sound. This protocol proposes a "Wizard of Oz" experiment that aims to provide evidence for the provision of effective multimedia feedback for speech sound development. Children with two common speech sound disorders will play a game on a mobile device and make speech sounds when prompted by the game. A human "Wizard" will provide feedback on the accuracy of the sound but the children will perceive the feedback as coming from the game. Groups of 30 young children will be randomly allocated to one of five conditions: four types of feedback and a control which does not play the game. The results of this experiment will inform not only speech sound therapy, but also other types of language learning, both in general, and in multimedia applications. This experiment is a cost-effective precursor to the development of a mobile application that employs pedagogically and clinically sound processes for speech development in young children.

  2. Effect of gap detection threshold on consistency of speech in children with speech sound disorder.

    Science.gov (United States)

    Sayyahi, Fateme; Soleymani, Zahra; Akbari, Mohammad; Bijankhan, Mahmood; Dolatshahi, Behrooz

    2017-02-01

    The present study examined the relationship between gap detection threshold and speech error consistency in children with speech sound disorder. The participants were children five to six years of age who were categorized into three groups: typical speech, consistent speech disorder (CSD), and inconsistent speech disorder (ISD). The phonetic gap detection threshold test, a validated test comprising six syllables with inter-stimulus intervals between 20 and 300 ms, was used for this study. The participants were asked to listen to the recorded stimuli three times and indicate whether they heard one or two sounds. There was no significant difference between the typical and CSD groups (p=0.55), but there were significant differences in performance between the ISD and CSD groups and between the ISD and typical groups (p=0.00). The ISD group discriminated between speech sounds at a higher threshold. Children with inconsistent speech errors could not distinguish speech sounds during time-limited phonetic discrimination. It is suggested that inconsistency in speech reflects inconsistency in auditory perception, caused by a high gap detection threshold. Copyright © 2016 Elsevier Ltd. All rights reserved.
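
    A gap detection stimulus of this kind can be sketched as two short sound bursts separated by a variable silent interval. The sketch below is illustrative only: the study used syllable stimuli, whereas the burst material, duration, and sampling rate here are assumptions.

```python
import numpy as np

def gap_stimulus(isi_ms, burst_ms=50, fs=44100, rng=None):
    """Two noise bursts separated by a silent inter-stimulus interval."""
    rng = rng if rng is not None else np.random.default_rng(0)
    burst = 0.1 * rng.standard_normal(int(fs * burst_ms / 1000))
    gap = np.zeros(int(fs * isi_ms / 1000))
    return np.concatenate([burst, gap, burst])

# Intervals spanning the 20-300 ms range used in the test
stimuli = {isi: gap_stimulus(isi) for isi in (20, 50, 100, 200, 300)}
```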

  3. Auditory spatial attention to speech and complex non-speech sounds in children with autism spectrum disorder.

    Science.gov (United States)

    Soskey, Laura N; Allen, Paul D; Bennetto, Loisa

    2017-08-01

    One of the earliest observable impairments in autism spectrum disorder (ASD) is a failure to orient to speech and other social stimuli. Auditory spatial attention, a key component of orienting to sounds in the environment, has been shown to be impaired in adults with ASD. Additionally, specific deficits in orienting to social sounds could be related to increased acoustic complexity of speech. We aimed to characterize auditory spatial attention in children with ASD and neurotypical controls, and to determine the effect of auditory stimulus complexity on spatial attention. In a spatial attention task, target and distractor sounds were played randomly in rapid succession from speakers in a free-field array. Participants attended to a central or peripheral location, and were instructed to respond to target sounds at the attended location while ignoring nearby sounds. Stimulus-specific blocks evaluated spatial attention for simple non-speech tones, speech sounds (vowels), and complex non-speech sounds matched to vowels on key acoustic properties. Children with ASD had significantly more diffuse auditory spatial attention than neurotypical children when attending front, indicated by increased responding to sounds at adjacent non-target locations. No significant differences in spatial attention emerged based on stimulus complexity. Additionally, in the ASD group, more diffuse spatial attention was associated with more severe ASD symptoms but not with general inattention symptoms. Spatial attention deficits have important implications for understanding social orienting deficits and atypical attentional processes that contribute to core deficits of ASD. Autism Res 2017, 10: 1405-1416. © 2017 International Society for Autism Research, Wiley Periodicals, Inc.

  4. Polysyllable Speech Accuracy and Predictors of Later Literacy Development in Preschool Children With Speech Sound Disorders.

    Science.gov (United States)

    Masso, Sarah; Baker, Elise; McLeod, Sharynne; Wang, Cen

    2017-07-12

    The aim of this study was to determine if polysyllable accuracy in preschoolers with speech sound disorders (SSD) was related to known predictors of later literacy development: phonological processing, receptive vocabulary, and print knowledge. Polysyllables (words of three or more syllables) are important to consider because unlike monosyllables, polysyllables have been associated with phonological processing and literacy difficulties in school-aged children. They therefore have the potential to help identify preschoolers most at risk of future literacy difficulties. Participants were 93 preschool children with SSD from the Sound Start Study. Participants completed the Polysyllable Preschool Test (Baker, 2013) as well as phonological processing, receptive vocabulary, and print knowledge tasks. Cluster analysis was completed, and 2 clusters were identified: low polysyllable accuracy and moderate polysyllable accuracy. The clusters were significantly different based on 2 measures of phonological awareness and measures of receptive vocabulary, rapid naming, and digit span. The clusters were not significantly different on sound matching accuracy or letter, sound, or print concept knowledge. The participants' poor performance on print knowledge tasks suggested that as a group, they were at risk of literacy difficulties but that there was a cluster of participants at greater risk: those with both low polysyllable accuracy and poor phonological processing.
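
    The clustering step can be illustrated with a minimal sketch, assuming standardized scores and two-cluster k-means; the abstract does not name the algorithm used, and the score matrix below is synthetic.

```python
import numpy as np
from sklearn.preprocessing import StandardScaler
from sklearn.cluster import KMeans

# Synthetic stand-in: one row per child; columns could be polysyllable
# accuracy, phonological awareness, receptive vocabulary, rapid naming,
# and digit span (all hypothetical here).
scores = np.random.default_rng(1).normal(size=(93, 5))

z = StandardScaler().fit_transform(scores)  # put measures on a common scale
labels = KMeans(n_clusters=2, n_init=10, random_state=0).fit_predict(z)
for k in (0, 1):
    members = z[labels == k]
    print(f"cluster {k}: n={len(members)}, mean z={members.mean(axis=0).round(2)}")
```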

  5. Nonlinear frequency compression: effects on sound quality ratings of speech and music.

    Science.gov (United States)

    Parsa, Vijay; Scollie, Susan; Glista, Danielle; Seelisch, Andreas

    2013-03-01

    Frequency lowering technologies offer an alternative amplification solution for severe to profound high frequency hearing losses. While frequency lowering technologies may improve audibility of high frequency sounds, the very nature of this processing can affect the perceived sound quality. This article reports the results from two studies that investigated the impact of a nonlinear frequency compression (NFC) algorithm on perceived sound quality. In the first study, the cutoff frequency and compression ratio parameters of the NFC algorithm were varied, and their effect on the speech quality was measured subjectively with 12 normal hearing adults, 12 normal hearing children, 13 hearing impaired adults, and 9 hearing impaired children. In the second study, 12 normal hearing and 8 hearing impaired adult listeners rated the quality of speech in quiet, speech in noise, and music after processing with a different set of NFC parameters. Results showed that the cutoff frequency parameter had more impact on sound quality ratings than the compression ratio, and that the hearing impaired adults were more tolerant to increased frequency compression than normal hearing adults. No statistically significant differences were found in the sound quality ratings of speech-in-noise and music stimuli processed through various NFC settings by hearing impaired listeners. These findings suggest that there may be an acceptable range of NFC settings for hearing impaired individuals where sound quality is not adversely affected. These results may assist an Audiologist in clinical NFC hearing aid fittings for achieving a balance between high frequency audibility and sound quality.
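
    The two NFC parameters studied, cutoff frequency and compression ratio, define a piecewise frequency mapping. A common textbook formulation is sketched below; commercial implementations differ in detail, so this is an illustration rather than the algorithm evaluated in the article.

```python
def nfc_map(f_hz, cutoff_hz=2000.0, ratio=2.0):
    """Map an input frequency to its frequency-compressed output.

    Frequencies below the cutoff pass unchanged; those above it are
    compressed toward the cutoff by the compression ratio.
    """
    if f_hz <= cutoff_hz:
        return f_hz
    return cutoff_hz + (f_hz - cutoff_hz) / ratio

# With a 2 kHz cutoff and a 2:1 ratio, a 6 kHz component moves to 4 kHz
print(nfc_map(6000.0))  # -> 4000.0
```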

  6. Between-Word Simplification Patterns in the Continuous Speech of Children with Speech Sound Disorders

    Science.gov (United States)

    Klein, Harriet B.; Liu-Shea, May

    2009-01-01

    Purpose: This study was designed to identify and describe between-word simplification patterns in the continuous speech of children with speech sound disorders. It was hypothesized that word combinations would reveal phonological changes that were unobserved with single words, possibly accounting for discrepancies between the intelligibility of…

  7. The Clinical Practice of Speech and Language Therapists with Children with Phonologically Based Speech Sound Disorders

    Science.gov (United States)

    Oliveira, Carla; Lousada, Marisa; Jesus, Luis M. T.

    2015-01-01

    Children with speech sound disorders (SSD) represent a large number of speech and language therapists' caseloads. The intervention with children who have SSD can involve different therapy approaches, and these may be articulatory or phonologically based. Some international studies reveal a widespread application of articulatory based approaches in…

  8. Knockdown of Dyslexia-Gene Dcdc2 Interferes with Speech Sound Discrimination in Continuous Streams.

    Science.gov (United States)

    Centanni, Tracy Michelle; Booker, Anne B; Chen, Fuyi; Sloan, Andrew M; Carraway, Ryan S; Rennaker, Robert L; LoTurco, Joseph J; Kilgard, Michael P

    2016-04-27

    Dyslexia is the most common developmental language disorder and is marked by deficits in reading and phonological awareness. One theory of dyslexia suggests that the phonological awareness deficit is due to abnormal auditory processing of speech sounds. Variants in DCDC2 and several other neural migration genes are associated with dyslexia and may contribute to auditory processing deficits. In the current study, we tested the hypothesis that RNAi suppression of Dcdc2 in rats causes abnormal cortical responses to sound and impaired speech sound discrimination. In the current study, rats were subjected in utero to RNA interference targeting of the gene Dcdc2 or a scrambled sequence. Primary auditory cortex (A1) responses were acquired from 11 rats (5 with Dcdc2 RNAi; DC-) before any behavioral training. A separate group of 8 rats (3 DC-) were trained on a variety of speech sound discrimination tasks, and auditory cortex responses were acquired following training. Dcdc2 RNAi nearly eliminated the ability of rats to identify specific speech sounds from a continuous train of speech sounds but did not impair performance during discrimination of isolated speech sounds. The neural responses to speech sounds in A1 were not degraded as a function of presentation rate before training. These results suggest that A1 is not directly involved in the impaired speech discrimination caused by Dcdc2 RNAi. This result contrasts earlier results using Kiaa0319 RNAi and suggests that different dyslexia genes may cause different deficits in the speech processing circuitry, which may explain differential responses to therapy. Although dyslexia is diagnosed through reading difficulty, there is a great deal of variation in the phenotypes of these individuals. The underlying neural and genetic mechanisms causing these differences are still widely debated. In the current study, we demonstrate that suppression of a candidate-dyslexia gene causes deficits on tasks of rapid stimulus processing

  9. Severe Speech Sound Disorders: An Integrated Multimodal Intervention

    Science.gov (United States)

    King, Amie M.; Hengst, Julie A.; DeThorne, Laura S.

    2013-01-01

    Purpose: This study introduces an integrated multimodal intervention (IMI) and examines its effectiveness for the treatment of persistent and severe speech sound disorders (SSD) in young children. The IMI is an activity-based intervention that focuses simultaneously on increasing the "quantity" of a child's meaningful productions of target words…

  10. Atypical central auditory speech-sound discrimination in children who stutter as indexed by the mismatch negativity

    NARCIS (Netherlands)

    Jansson-Verkasalo, E.; Eggers, K.; Järvenpää, A.; Suominen, K.; Van Den Bergh, B.R.H.; de Nil, L.; Kujala, T.

    2014-01-01

    Purpose: Recent theoretical conceptualizations suggest that disfluencies in stuttering may arise from several factors, one of them being atypical auditory processing. The main purpose of the present study was to investigate whether speech sound encoding and central auditory discrimination are

  11. Evolution of non-speech sound memory in postlingual deafness: implications for cochlear implant rehabilitation.

    Science.gov (United States)

    Lazard, D S; Giraud, A L; Truy, E; Lee, H J

    2011-07-01

    Neurofunctional patterns assessed before or after cochlear implantation (CI) are informative markers of implantation outcome. Because phonological memory reorganization in post-lingual deafness is predictive of the outcome, we investigated, using a cross-sectional approach, whether memory of non-speech sounds (NSS) produced by animals or objects (i.e. non-human sounds) is also reorganized, and how this relates to speech perception after CI. We used an fMRI auditory imagery task in which sounds were evoked by pictures of noisy items for post-lingual deaf candidates for CI and for normal-hearing subjects. When deaf subjects imagined sounds, the left inferior frontal gyrus, the right posterior temporal gyrus and the right amygdala were less activated compared to controls. Activity levels in these regions decreased with duration of auditory deprivation, indicating declining NSS representations. Whole brain correlations with duration of auditory deprivation and with speech scores after CI showed an activity decline in dorsal, fronto-parietal, cortical regions, and an activity increase in ventral cortical regions, the right anterior temporal pole and the hippocampal gyrus. Both dorsal and ventral reorganizations predicted poor speech perception outcome after CI. These results suggest that post-CI speech perception relies, at least partially, on the integrity of a neural system used for processing NSS that is based on audio-visual and articulatory mapping processes. When this neural system is reorganized, post-lingual deaf subjects resort to inefficient semantic- and memory-based strategies. These results complement those of other studies on speech processing, suggesting that both speech and NSS representations need to be maintained during deafness to ensure the success of CI. Copyright © 2011 Elsevier Ltd. All rights reserved.

  12. Letter-speech sound learning in children with dyslexia: From behavioral research to clinical practice

    NARCIS (Netherlands)

    Aravena, S.

    2017-01-01

    In alphabetic languages, learning to associate speech-sounds with unfamiliar characters is a critical step in becoming a proficient reader. This dissertation aimed at expanding our knowledge of this learning process and its relation to dyslexia, with an emphasis on bridging the gap between

  13. The speech perception skills of children with and without speech sound disorder.

    Science.gov (United States)

    Hearnshaw, Stephanie; Baker, Elise; Munro, Natalie

    To investigate whether Australian-English speaking children with and without speech sound disorder (SSD) differ in their overall speech perception accuracy. Additionally, to investigate differences in the perception of specific phonemes and the association between speech perception and speech production skills. Twenty-five Australian-English speaking children aged 48-60 months participated in this study. The SSD group included 12 children and the typically developing (TD) group included 13 children. Children completed routine speech and language assessments in addition to an experimental Australian-English lexical and phonetic judgement task based on Rvachew's Speech Assessment and Interactive Learning System (SAILS) program (Rvachew, 2009). This task included eight words across four word-initial phonemes: /k, ɹ, ʃ, s/. Children with SSD showed significantly poorer perceptual accuracy on the lexical and phonetic judgement task compared with TD peers. The phonemes /ɹ/ and /s/ were most frequently perceived in error across both groups. Additionally, the phoneme /ɹ/ was most commonly produced in error. There was also a positive correlation between overall speech perception and speech production scores. Children with SSD perceived speech less accurately than their typically developing peers. The findings suggest that an Australian-English variation of a lexical and phonetic judgement task similar to the SAILS program is promising and worthy of a larger scale study. Copyright © 2017 Elsevier Inc. All rights reserved.

  14. Predicting speech intelligibility in conditions with nonlinearly processed noisy speech

    DEFF Research Database (Denmark)

    Jørgensen, Søren; Dau, Torsten

    2013-01-01

    The speech-based envelope power spectrum model (sEPSM; [1]) was proposed in order to overcome the limitations of the classical speech transmission index (STI) and speech intelligibility index (SII). The sEPSM applies the signal-to-noise ratio in the envelope domain (SNRenv), which was demonstrated to successfully predict speech intelligibility in conditions with nonlinearly processed noisy speech, such as processing with spectral subtraction. Moreover, a multiresolution version (mr-sEPSM) was demonstrated to account for speech intelligibility in various conditions with stationary and fluctuating...
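
    The SNRenv decision metric can be approximated in a single band by comparing the normalized envelope power of the noisy speech with that of the noise alone. The sketch below is a deliberately reduced version under that assumption; the published model operates per gammatone channel and per modulation filter.

```python
import numpy as np
from scipy.signal import hilbert

def snr_env_db(noisy_speech, noise):
    """Single-band, broadband-modulation approximation of SNRenv."""
    def env_power(x):
        env = np.abs(hilbert(x))                    # Hilbert envelope
        ac = env - env.mean()                       # remove the DC component
        return np.mean(ac ** 2) / env.mean() ** 2   # normalized envelope power

    p_noise = env_power(noise)
    p_mix = env_power(noisy_speech)
    p_speech = max(p_mix - p_noise, 1e-10)  # envelope power attributed to speech
    return 10 * np.log10(p_speech / p_noise)
```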

  15. A Pilot Investigation of Speech Sound Disorder Intervention Delivered by Telehealth to School-Age Children

    Directory of Open Access Journals (Sweden)

    Sue Grogan-Johnson

    2011-05-01

    Full Text Available This article describes a school-based telehealth service delivery model and reports outcomes achieved by school-age students with speech sound disorders in a rural Ohio school district. Speech therapy using computer-based speech sound intervention materials was provided either by live interactive videoconferencing (telehealth) or conventional side-by-side intervention. Progress was measured using pre- and post-intervention scores on the Goldman Fristoe Test of Articulation-2 (Goldman & Fristoe, 2002). Students in both service delivery models made significant improvements in speech sound production, with students in the telehealth condition demonstrating greater mastery of their Individual Education Plan (IEP) goals. Live interactive videoconferencing thus appears to be a viable method for delivering intervention for speech sound disorders to children in a rural, public school setting. Keywords: Telehealth, telerehabilitation, videoconferencing, speech sound disorder, speech therapy, speech-language pathology; E-Helper

  16. Aging affects hemispheric asymmetry in the neural representation of speech sounds.

    Science.gov (United States)

    Bellis, T J; Nicol, T; Kraus, N

    2000-01-15

    Hemispheric asymmetries in the processing of elemental speech sounds appear to be critical for normal speech perception. This study investigated the effects of age on hemispheric asymmetry observed in the neurophysiological responses to speech stimuli in three groups of normal hearing, right-handed subjects: children (ages, 8-11 years), young adults (ages, 20-25 years), and older adults (ages > 55 years). Peak-to-peak response amplitudes of the auditory cortical P1-N1 complex obtained over right and left temporal lobes were examined to determine the degree of left/right asymmetry in the neurophysiological responses elicited by synthetic speech syllables in each of the three subject groups. In addition, mismatch negativity (MMN) responses, which are elicited by acoustic change, were obtained. Whereas children and young adults demonstrated larger P1-N1-evoked response amplitudes over the left temporal lobe than over the right, responses from elderly subjects were symmetrical. In contrast, MMN responses, which reflect an echoic memory process, were symmetrical in all subject groups. The differences observed in the neurophysiological responses were accompanied by a finding of significantly poorer ability to discriminate speech syllables involving rapid spectrotemporal changes in the older adult group. This study demonstrates a biological, age-related change in the neural representation of basic speech sounds and suggests one possible underlying mechanism for the speech perception difficulties exhibited by aging adults. Furthermore, results of this study support previous findings suggesting a dissociation between neural mechanisms underlying those processes that reflect the basic representation of sound structure and those that represent auditory echoic memory and stimulus change.

  17. Speech Abilities in Preschool Children with Speech Sound Disorder with and without Co-Occurring Language Impairment

    Science.gov (United States)

    Macrae, Toby; Tyler, Ann A.

    2014-01-01

    Purpose: The authors compared preschool children with co-occurring speech sound disorder (SSD) and language impairment (LI) to children with SSD only in their numbers and types of speech sound errors. Method: In this post hoc quasi-experimental study, independent samples t tests were used to compare the groups in the standard score from different…

  18. Music and language expertise influence the categorization of speech and musical sounds: behavioral and electrophysiological measurements.

    Science.gov (United States)

    Elmer, Stefan; Klein, Carina; Kühnis, Jürg; Liem, Franziskus; Meyer, Martin; Jäncke, Lutz

    2014-10-01

    In this study, we used high-density EEG to evaluate whether speech and music expertise has an influence on the categorization of expertise-related and unrelated sounds. With this purpose in mind, we compared the categorization of speech, music, and neutral sounds between professional musicians, simultaneous interpreters (SIs), and controls in response to morphed speech-noise, music-noise, and speech-music continua. Our hypothesis was that music and language expertise will strengthen the memory representations of prototypical sounds, which act as a perceptual magnet for morphed variants. This means that the prototype would "attract" variants. This so-called magnet effect should be manifested by an increased assignment of morphed items to the trained category, by a reduced maximal slope of the psychometric function, as well as by differential event-related brain responses reflecting memory comparison processes (i.e., N400 and P600 responses). As a main result, we provide first evidence for a domain-specific behavioral bias of musicians and SIs toward the trained categories, namely music and speech. In addition, SIs showed a bias toward musical items, indicating that interpreting training has a generic influence on the cognitive representation of spectrotemporal signals with similar acoustic properties to speech sounds. Notably, EEG measurements revealed clear distinct N400 and P600 responses to both prototypical and ambiguous items between the three groups at anterior, central, and posterior scalp sites. These differential N400 and P600 responses represent synchronous activity occurring across widely distributed brain networks, and indicate a dynamical recruitment of memory processes that vary as a function of training and expertise.
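
    The maximal slope of the psychometric function used here as a behavioral index can be estimated by fitting a logistic to identification proportions along a morph continuum. A minimal sketch with made-up response data:

```python
import numpy as np
from scipy.optimize import curve_fit

def logistic(x, x0, k):
    """Probability of a trained-category response along the continuum."""
    return 1.0 / (1.0 + np.exp(-k * (x - x0)))

morph_step = np.linspace(0, 1, 9)  # e.g., a speech-noise continuum
p_trained = np.array([.02, .05, .10, .30, .55, .80, .92, .97, .99])

(x0, k), _ = curve_fit(logistic, morph_step, p_trained, p0=[0.5, 10.0])
max_slope = k / 4.0  # slope of the logistic at its inflection point
print(f"category boundary at {x0:.2f}, maximal slope {max_slope:.2f}")
```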

  1. Does seeing an Asian face make speech sound more accented?

    Science.gov (United States)

    Zheng, Yi; Samuel, Arthur G

    2017-08-01

    Prior studies have reported that seeing an Asian face makes American English sound more accented. The current study investigates whether this effect is perceptual, or if it instead occurs at a later decision stage. We first replicated the finding that showing static Asian and Caucasian faces can shift people's reports about the accentedness of speech accompanying the pictures. When we changed the static pictures to dubbed videos, reducing the demand characteristics, the shift in reported accentedness largely disappeared. By including unambiguous items along with the original ambiguous items, we introduced a contrast bias and actually reversed the shift, with the Asian-face videos yielding lower judgments of accentedness than the Caucasian-face videos. By changing to a mixed rather than blocked design, so that the ethnicity of the videos varied from trial to trial, we eliminated the difference in accentedness rating. Finally, we tested participants' perception of accented speech using the selective adaptation paradigm. After establishing that an auditory-only accented adaptor shifted the perception of how accented test words are, we found that no such adaptation effect occurred when the adapting sounds relied on visual information (Asian vs. Caucasian videos) to influence the accentedness of an ambiguous auditory adaptor. Collectively, the results demonstrate that visual information can affect the interpretation, but not the perception, of accented speech.

  2. Analysis of speech sounds is left-hemisphere predominant at 100-150ms after sound onset.

    Science.gov (United States)

    Rinne, T; Alho, K; Alku, P; Holi, M; Sinkkonen, J; Virtanen, J; Bertrand, O; Näätänen, R

    1999-04-06

    Hemispheric specialization of human speech processing has been found in brain imaging studies using fMRI and PET. Due to the restricted time resolution, these methods cannot, however, determine the stage of auditory processing at which this specialization first emerges. We used a dense electrode array covering the whole scalp to record the mismatch negativity (MMN), an event-related brain potential (ERP) automatically elicited by occasional changes in sounds, which ranged from non-phonetic (tones) to phonetic (vowels). MMN can be used to probe auditory central processing on a millisecond scale with no attention-dependent task requirements. Our results indicate that speech processing occurs predominantly in the left hemisphere at the early, pre-attentive level of auditory analysis.

  3. What does learner speech sound like? A case study on adult learners of isiXhosa

    CSIR Research Space (South Africa)

    Badenhorst, Jaco

    2016-12-01

    Full Text Available ...moved during recording or by a sound/beep that results from the press of a button and an obstruction of the device microphone.
    • Low volume: Speech is too soft to understand what is being said.
    • Whispering: Speaker whispers during recording.
    • Laughter...
    ...-processing categories. If any of these categories were marked for a particular utterance, the utterance was discarded. The event categories were combined as follows:
    • Option 1: Empty, Whispering, Laughter, Background speech, Transcription mismatch
    • Option 2: Empty...

  4. Temporal factors affecting somatosensory-auditory interactions in speech processing

    Directory of Open Access Journals (Sweden)

    Takayuki Ito

    2014-11-01

    Full Text Available Speech perception is known to rely on both auditory and visual information. However, sound-specific somatosensory input has also been shown to influence speech perceptual processing (Ito et al., 2009). In the present study we addressed further the relationship between somatosensory information and speech perceptual processing by testing the hypothesis that the temporal relationship between orofacial movement and sound processing contributes to somatosensory-auditory interaction in speech perception. We examined the changes in event-related potentials in response to multisensory synchronous (simultaneous) and asynchronous (90 ms lag and lead) somatosensory and auditory stimulation compared to individual unisensory auditory and somatosensory stimulation alone. We used a robotic device to apply facial skin somatosensory deformations that were similar in timing and duration to those experienced in speech production. Following synchronous multisensory stimulation the amplitude of the event-related potential was reliably different from the two unisensory potentials. More importantly, the magnitude of the event-related potential difference varied as a function of the relative timing of the somatosensory-auditory stimulation. Event-related activity change due to stimulus timing was seen between 160-220 ms following somatosensory onset, mostly around the parietal area. The results demonstrate a dynamic modulation of somatosensory-auditory convergence and suggest that the contribution of somatosensory information to speech processing is dependent on the specific temporal order of sensory inputs in speech production.

  5. Irrelevant sound disrupts speech production: exploring the relationship between short-term memory and experimentally induced slips of the tongue.

    Science.gov (United States)

    Saito, Satoru; Baddeley, Alan

    2004-10-01

    To explore the relationship between short-term memory and speech production, we developed a speech error induction technique. The technique, which was adapted from a Japanese word game, exposed participants to an auditory distractor word immediately before the utterance of a target word. In Experiment 1, the distractor words that were phonologically similar to the target word led to a greater number of errors in speaking the target than did the dissimilar distractor words. Furthermore, the speech error scores were significantly correlated with memory span scores. In Experiment 2, memory span scores were again correlated with the rate of the speech errors that were induced from the task-irrelevant speech sounds. Experiment 3 showed a strong irrelevant-sound effect in the serial recall of nonwords. The magnitude of the irrelevant-sound effects was not affected by phonological similarity between the to-be-remembered nonwords and the irrelevant-sound materials. Analysis of recall errors in Experiment 3 also suggested that there were no essential differences in recall error patterns between the dissimilar and similar irrelevant-sound conditions. We proposed two different underlying mechanisms in immediate memory, one operating via the phonological short-term memory store and the other via the processes underpinning speech production.

  6. What characterizes changing-state speech in affecting short-term memory? An EEG study on the irrelevant sound effect.

    Science.gov (United States)

    Schlittmeier, Sabine J; Weisz, Nathan; Bertrand, Olivier

    2011-12-01

    The irrelevant sound effect (ISE) describes reduced verbal short-term memory during irrelevant changing-state sounds which consist of different and distinct auditory tokens. Steady-state sounds lack such changing-state features and do not impair performance. An EEG experiment (N=16) explored the distinguishing neurophysiological aspects of detrimental changing-state speech (3-token sequence) compared to ineffective steady-state speech (1-token sequence) on serial recall performance. We analyzed evoked and induced activity related to the memory items as well as spectral activity during the retention phase. The main finding is that the behavioral sound effect was exclusively reflected by attenuated token-induced gamma activation most pronounced between 50-60 Hz and 50-100 ms post-stimulus onset. Changing-state speech seems to disrupt a behaviorally relevant ongoing process during target presentation (e.g., the serial binding of the items). Copyright © 2011 Society for Psychophysiological Research.

  7. [Non-speech oral motor treatment efficacy for children with developmental speech sound disorders].

    Science.gov (United States)

    Ygual-Fernandez, A; Cervera-Merida, J F

    2016-01-01

    In the treatment of speech disorders by means of speech therapy two antagonistic methodological approaches are applied: non-verbal ones, based on oral motor exercises (OME), and verbal ones, which are based on speech processing tasks with syllables, phonemes and words. In Spain, OME programmes are called 'programas de praxias', and are widely used and valued by speech therapists. To review the studies conducted on the effectiveness of OME-based treatments applied to children with speech disorders and the theoretical arguments that could justify, or not, their usefulness. Over the last few decades evidence has been gathered about the lack of efficacy of this approach to treat developmental speech disorders and pronunciation problems in populations without any neurological alteration of motor functioning. The American Speech-Language-Hearing Association has advised against its use taking into account the principles of evidence-based practice. The knowledge gathered to date on motor control shows that the pattern of mobility and its corresponding organisation in the brain are different in speech and other non-verbal functions linked to nutrition and breathing. Neither the studies on their effectiveness nor the arguments based on motor control studies recommend the use of OME-based programmes for the treatment of pronunciation problems in children with developmental language disorders.

  8. Introduction. The perception of speech: from sound to meaning.

    Science.gov (United States)

    Moore, Brian C J; Tyler, Lorraine K; Marslen-Wilson, William

    2008-03-12

    Spoken language communication is arguably the most important activity that distinguishes humans from non-human species. This paper provides an overview of the review papers that make up this theme issue on the processes underlying speech communication. The volume includes contributions from researchers who specialize in a wide range of topics within the general area of speech perception and language processing. It also includes contributions from key researchers in neuroanatomy and functional neuro-imaging, in an effort to cut across traditional disciplinary boundaries and foster cross-disciplinary interactions in this important and rapidly developing area of the biological and cognitive sciences.

  9. Application of wavelets in speech processing

    CERN Document Server

    Farouk, Mohamed Hesham

    2014-01-01

    This book provides a survey of the widespread use of wavelet analysis in different applications of speech processing. The author examines development and research in different applications of speech processing, and summarizes the state-of-the-art research on wavelets in speech processing.
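
    As a concrete entry point to the analyses the book surveys, the sketch below runs a multilevel discrete wavelet decomposition of a synthetic speech-like frame with the PyWavelets package; the wavelet family, decomposition level, and signal are illustrative choices, not recommendations from the book.

```python
import numpy as np
import pywt

fs = 16000
t = np.arange(fs) / fs
rng = np.random.default_rng(0)
frame = np.sin(2 * np.pi * 220 * t) + 0.1 * rng.standard_normal(fs)

# Four-level discrete wavelet decomposition with a Daubechies-4 wavelet
coeffs = pywt.wavedec(frame, "db4", level=4)
for i, c in enumerate(coeffs):
    band = "approximation" if i == 0 else f"detail level {len(coeffs) - i}"
    print(f"{band}: {len(c)} coefficients, energy {np.sum(c ** 2):.1f}")

reconstructed = pywt.waverec(coeffs, "db4")  # inverse transform
```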

  10. Mechanisms underlying speech sound discrimination and categorization in humans and zebra finches

    NARCIS (Netherlands)

    Burgering, Merel A.; ten Cate, Carel; Vroomen, Jean

    Speech sound categorization in birds seems in many ways comparable to that by humans, but it is unclear what mechanisms underlie such categorization. To examine this, we trained zebra finches and humans to discriminate two pairs of edited speech sounds that varied either along one dimension (vowel

  11. Multilingual Aspects of Speech Sound Disorders in Children. Communication Disorders across Languages

    Science.gov (United States)

    McLeod, Sharynne; Goldstein, Brian

    2012-01-01

    Multilingual Aspects of Speech Sound Disorders in Children explores both multilingual and multicultural aspects of children with speech sound disorders. The 30 chapters have been written by 44 authors from 16 different countries about 112 languages and dialects. The book is designed to translate research into clinical practice. It is divided into…

  12. Dynamic Assessment of Phonological Awareness for Children with Speech Sound Disorders

    Science.gov (United States)

    Gillam, Sandra Laing; Ford, Mikenzi Bentley

    2012-01-01

    The current study was designed to examine the relationships between performance on a nonverbal phoneme deletion task administered in a dynamic assessment format with performance on measures of phoneme deletion, word-level reading, and speech sound production that required verbal responses for school-age children with speech sound disorders (SSDs).…

  13. Phoneme Compression: processing of the speech signal and effects on speech intelligibility in hearing-Impaired listeners

    NARCIS (Netherlands)

    A. Goedegebure (Andre)

    2005-01-01

    Hearing-aid users often continue to have problems with poor speech understanding in difficult acoustical conditions. Another commonly reported problem is that certain sounds become too loud whereas other sounds are still not audible. Dynamic range compression is a signal processing
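
    The truncated final sentence introduces dynamic range compression. Its static input-output behavior can be sketched as below; the threshold and ratio are illustrative values, not parameters from the thesis.

```python
import numpy as np

def compress_db(level_db, threshold_db=65.0, ratio=3.0):
    """Static input-output curve of a simple dynamic range compressor.

    Below the threshold, output level equals input level; above it,
    each extra dB of input yields only 1/ratio dB of output.
    """
    level_db = np.asarray(level_db, dtype=float)
    over = np.maximum(level_db - threshold_db, 0.0)
    return level_db - over * (1.0 - 1.0 / ratio)

for lin in (50, 65, 80, 95):
    print(f"{lin} dB in -> {float(compress_db(lin)):.1f} dB out")
```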

  14. Auditory Brainstem Response to Complex Sounds Predicts Self-Reported Speech-in-Noise Performance

    Science.gov (United States)

    Anderson, Samira; Parbery-Clark, Alexandra; White-Schwoch, Travis; Kraus, Nina

    2013-01-01

    Purpose: To compare the ability of the auditory brainstem response to complex sounds (cABR) to predict subjective ratings of speech understanding in noise on the Speech, Spatial, and Qualities of Hearing Scale (SSQ; Gatehouse & Noble, 2004) relative to the predictive ability of the Quick Speech-in-Noise test (QuickSIN; Killion, Niquette,…

  15. Identifying Residual Speech Sound Disorders in Bilingual Children: A Japanese-English Case Study

    Science.gov (United States)

    Preston, Jonathan L.; Seki, Ayumi

    2011-01-01

    Purpose: To describe (a) the assessment of residual speech sound disorders (SSDs) in bilinguals by distinguishing speech patterns associated with second language acquisition from patterns associated with misarticulations and (b) how assessment of domains such as speech motor control and phonological awareness can provide a more complete…

  16. The Prevalence of Stuttering, Voice, and Speech-Sound Disorders in Primary School Students in Australia

    Science.gov (United States)

    McKinnon, David H.; McLeod, Sharynne; Reilly, Sheena

    2007-01-01

    Purpose: The aims of this study were threefold: to report teachers' estimates of the prevalence of speech disorders (specifically, stuttering, voice, and speech-sound disorders); to consider correspondence between the prevalence of speech disorders and gender, grade level, and socioeconomic status; and to describe the level of support provided to…

  17. Speech sound disorder at 4 years: prevalence, comorbidities, and predictors in a community cohort of children.

    Science.gov (United States)

    Eadie, Patricia; Morgan, Angela; Ukoumunne, Obioha C; Ttofari Eecen, Kyriaki; Wake, Melissa; Reilly, Sheena

    2015-06-01

    The epidemiology of preschool speech sound disorder is poorly understood. Our aims were to determine: the prevalence of idiopathic speech sound disorder; the comorbidity of speech sound disorder with language and pre-literacy difficulties; and the factors contributing to speech outcome at 4 years. One thousand four hundred and ninety-four participants from an Australian longitudinal cohort completed speech, language, and pre-literacy assessments at 4 years. Prevalence of speech sound disorder (SSD) was defined by standard score performance of ≤79 on a speech assessment. Logistic regression examined predictors of SSD within four domains: child and family; parent-reported speech; cognitive-linguistic; and parent-reported motor skills. At 4 years the prevalence of speech disorder in an Australian cohort was 3.4%. Comorbidity with SSD was 40.8% for language disorder and 20.8% for poor pre-literacy skills. Sex, maternal vocabulary, socio-economic status, and family history of speech and language difficulties predicted SSD, as did 2-year speech, language, and motor skills. Together these variables provided good discrimination of SSD (area under the curve=0.78). This is the first epidemiological study to demonstrate prevalence of SSD at 4 years of age that was consistent with previous clinical studies. Early detection of SSD at 4 years should focus on family variables and speech, language, and motor skills measured at 2 years. © 2014 Mac Keith Press.
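
    The analysis pattern reported (logistic regression over predictor domains, with discrimination summarized by the area under the ROC curve) can be sketched with synthetic data. Predictor names and values below are placeholders, so the printed AUC will not reproduce the study's 0.78.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import roc_auc_score

rng = np.random.default_rng(0)
n = 1494
# Hypothetical predictors: sex, family history, and a 2-year speech score
X = np.column_stack([rng.integers(0, 2, n),
                     rng.integers(0, 2, n),
                     rng.normal(100, 15, n)])
y = rng.binomial(1, 0.034, n)  # ~3.4% prevalence of speech sound disorder

model = LogisticRegression(max_iter=1000).fit(X, y)
auc = roc_auc_score(y, model.predict_proba(X)[:, 1])
print(f"area under the ROC curve: {auc:.2f}")
```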

  18. Hemispheric lateralization in an analysis of speech sounds. Left hemisphere dominance replicated in Japanese subjects.

    Science.gov (United States)

    Koyama, S; Gunji, A; Yabe, H; Oiwa, S; Akahane-Yamada, R; Kakigi, R; Näätänen, R

    2000-09-01

    Evoked magnetic responses to speech sounds [R. Näätänen, A. Lehtokoski, M. Lennes, M. Cheour, M. Huotilainen, A. Iivonen, M. Vainio, P. Alku, R.J. Ilmoniemi, A. Luuk, J. Allik, J. Sinkkonen and K. Alho, Language-specific phoneme representations revealed by electric and magnetic brain responses. Nature, 385 (1997) 432-434.] were recorded from 13 Japanese subjects (right-handed). Infrequently presented vowels ([o]) among repetitive vowels ([e]) elicited the magnetic counterpart of mismatch negativity, MMNm (Bilateral, nine subjects; Left hemisphere alone, three subjects; Right hemisphere alone, one subject). The estimated source of the MMNm was stronger in the left than in the right auditory cortex. The sources were located more posteriorly in the left than in the right auditory cortex. These findings are consistent with the results obtained in Finnish [R. Näätänen, A. Lehtokoski, M. Lennes, M. Cheour, M. Huotilainen, A. Iivonen, M. Vainio, P. Alku, R.J. Ilmoniemi, A. Luuk, J. Allik, J. Sinkkonen and K. Alho, Language-specific phoneme representations revealed by electric and magnetic brain responses. Nature, 385 (1997) 432-434.][T. Rinne, K. Alho, P. Alku, M. Holi, J. Sinkkonen, J. Virtanen, O. Bertrand and R. Näätänen, Analysis of speech sounds is left-hemisphere predominant at 100-150 ms after sound onset. Neuroreport, 10 (1999) 1113-1117.] and English [K. Alho, J.F. Connolly, M. Cheour, A. Lehtokoski, M. Huotilainen, J. Virtanen, R. Aulanko and R.J. Ilmoniemi, Hemispheric lateralization in preattentive processing of speech sounds. Neurosci. Lett., 258 (1998) 9-12.] subjects. Instead of the P1m observed in Finnish [M. Tervaniemi, A. Kujala, K. Alho, J. Virtanen, R.J. Ilmoniemi and R. Näätänen, Functional specialization of the human auditory cortex in processing phonetic and musical sounds: A magnetoencephalographic (MEG) study. Neuroimage, 9 (1999) 330-336.] and English [K. Alho, J. F. Connolly, M. Cheour, A. Lehtokoski, M. Huotilainen, J. Virtanen, R. Aulanko

  19. Differences between the production of [s] and [ʃ] in the speech of adults, typically developing children, and children with speech sound disorders: An ultrasound study.

    Science.gov (United States)

    Francisco, Danira Tavares; Wertzner, Haydée Fiszbein

    2017-01-01

    This study describes the criteria that are used in ultrasound to measure the differences between the tongue contours that produce [s] and [ʃ] sounds in the speech of adults, typically developing children (TDC), and children with speech sound disorder (SSD) with the phonological process of palatal fronting. Overlapping images of the tongue contours that resulted from 35 subjects producing the [s] and [ʃ] sounds were analysed to select 11 spokes on the radial grid that were spread over the tongue contour. The difference was calculated between the mean contour of the [s] and [ʃ] sounds for each spoke. A cluster analysis produced groups with some consistency in the pattern of articulation across subjects and differentiated adults and TDC to some extent and children with SSD with a high level of success. Children with SSD were less likely to show differentiation of the tongue contours between the articulation of [s] and [ʃ].

  20. Preschool Speech Error Patterns Predict Articulation and Phonological Awareness Outcomes in Children with Histories of Speech Sound Disorders

    Science.gov (United States)

    Preston, Jonathan L.; Hull, Margaret; Edwards, Mary Louise

    2013-01-01

    Purpose: To determine if speech error patterns in preschoolers with speech sound disorders (SSDs) predict articulation and phonological awareness (PA) outcomes almost 4 years later. Method: Twenty-five children with histories of preschool SSDs (and normal receptive language) were tested at an average age of 4;6 (years;months) and were followed up…

  1. Sound and speech detection and classification in a Health Smart Home.

    Science.gov (United States)

    Fleury, A; Noury, N; Vacher, M; Glasson, H; Seri, J F

    2008-01-01

    Improvements in medicine increase life expectancy in the world and create a new bottleneck at the entrance of specialized and equipped institutions. To allow elderly people to stay at home, researchers work on ways to monitor them in their own environment, with non-invasive sensors. To meet this goal, smart homes, equipped with lots of sensors, deliver information on the activities of the person and can help detect distress situations. In this paper, we present a global speech and sound recognition system that can be set up in a flat. We placed eight microphones in the Health Smart Home of Grenoble (a real living flat of 47 m²) and we automatically analyze and sort out the different sounds recorded in the flat and the speech uttered (to detect normal or distress French sentences). We introduce the methods for the sound and speech recognition, the post-processing of the data, and finally the experimental results obtained in real conditions in the flat.
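
    One common recipe for the sound/speech classification step in systems like this is a per-class Gaussian mixture model over MFCC features. The sketch below is an assumption-laden illustration rather than the authors' system: the file names are hypothetical, and librosa is assumed for feature extraction.

```python
import numpy as np
import librosa
from sklearn.mixture import GaussianMixture

def mfcc_vector(path, sr=16000, n_mfcc=13):
    """Mean MFCC vector summarizing one recorded audio event."""
    y, _ = librosa.load(path, sr=sr, mono=True)
    return librosa.feature.mfcc(y=y, sr=sr, n_mfcc=n_mfcc).mean(axis=1)

# Hypothetical labeled recordings of everyday sounds from the flat
training_files = {
    "speech": ["speech_01.wav", "speech_02.wav", "speech_03.wav"],
    "door_slam": ["door_01.wav", "door_02.wav", "door_03.wav"],
    "dishes": ["dishes_01.wav", "dishes_02.wav", "dishes_03.wav"],
}

models = {}
for label, paths in training_files.items():
    feats = np.vstack([mfcc_vector(p) for p in paths])
    models[label] = GaussianMixture(n_components=2, random_state=0).fit(feats)

def classify(path):
    """Assign a new event to the class whose model gives the highest likelihood."""
    f = mfcc_vector(path).reshape(1, -1)
    return max(models, key=lambda label: models[label].score(f))
```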

  2. Speech-language pathologists' practices regarding assessment, analysis, target selection, intervention, and service delivery for children with speech sound disorders.

    Science.gov (United States)

    McLeod, Sharynne; Baker, Elise

    2014-01-01

    A survey of 231 Australian speech-language pathologists (SLPs) was undertaken to describe practices regarding assessment, analysis, target selection, intervention, and service delivery for children with speech sound disorders (SSD). The participants typically worked in private practice, education, or community health settings and 67.6% had a waiting list for services. For each child, most of the SLPs spent 10-40 min in pre-assessment activities, 30-60 min undertaking face-to-face assessments, and 30-60 min completing paperwork after assessments. During an assessment SLPs typically conducted a parent interview, undertook single-word speech sampling, collected a connected speech sample, and used informal tests. They also determined children's stimulability and estimated intelligibility. With multilingual children, informal assessment procedures and English-only tests were commonly used and SLPs relied on family members or interpreters to assist. Common analysis techniques included determination of phonological processes, substitutions-omissions-distortions-additions (SODA), and phonetic inventory. Participants placed high priority on selecting target sounds that were stimulable, early developing, and in error across all word positions, and 60.3% felt very confident or confident selecting an appropriate intervention approach. Eight intervention approaches were frequently used: auditory discrimination, minimal pairs, cued articulation, phonological awareness, traditional articulation therapy, auditory bombardment, Nuffield Centre Dyspraxia Programme, and core vocabulary. Children typically received individual therapy with an SLP in a clinic setting. Parents often observed and participated in sessions and SLPs typically included siblings and grandparents in intervention sessions. Parent training and home programs were more frequently used than group therapy. Two-thirds kept up-to-date by reading journal articles monthly or every 6 months. There were many similarities with

  3. Perceptual sensitivity to spectral properties of earlier sounds during speech categorization.

    Science.gov (United States)

    Stilp, Christian E; Assgari, Ashley A

    2018-02-28

    Speech perception is heavily influenced by surrounding sounds. When spectral properties differ between earlier (context) and later (target) sounds, this can produce spectral contrast effects (SCEs) that bias perception of later sounds. For example, when context sounds have more energy in low-F1 frequency regions, listeners report more high-F1 responses to a target vowel, and vice versa. SCEs have been reported using various approaches for a wide range of stimuli, but most often, large spectral peaks were added to the context to bias speech categorization. This obscures the lower limit of perceptual sensitivity to spectral properties of earlier sounds, i.e., when SCEs begin to bias speech categorization. Listeners categorized vowels (/ɪ/-/ɛ/, Experiment 1) or consonants (/d/-/g/, Experiment 2) following a context sentence with little spectral amplification (+1 to +4 dB) in frequency regions known to produce SCEs. In both experiments, +3 and +4 dB amplification in key frequency regions of the context produced SCEs, but lesser amplification was insufficient to bias performance. This establishes a lower limit of perceptual sensitivity where spectral differences across sounds can bias subsequent speech categorization. These results are consistent with proposed adaptation-based mechanisms that potentially underlie SCEs in auditory perception. Recent sounds can change what speech sounds we hear later. This can occur when the average frequency composition of earlier sounds differs from that of later sounds, biasing how they are perceived. These "spectral contrast effects" are widely observed when sounds' frequency compositions differ substantially. We reveal the lower limit of these effects, as +3 dB amplification of key frequency regions in earlier sounds was enough to bias categorization of the following vowel or consonant sound. Speech categorization being biased by very small spectral differences across sounds suggests that spectral contrast effects occur

  4. Predicting the perceived sound quality of frequency-compressed speech.

    Directory of Open Access Journals (Sweden)

    Rainer Huber

    Full Text Available The performance of objective speech and audio quality measures for the prediction of the perceived quality of frequency-compressed speech in hearing aids is investigated in this paper. A number of existing quality measures have been applied to speech signals processed by a hearing aid, which compresses speech spectra along frequency in order to make information contained in higher frequencies audible for listeners with severe high-frequency hearing loss. Quality measures were compared with subjective ratings obtained from normal hearing and hearing impaired children and adults in an earlier study. High correlations were achieved with quality measures computed by quality models that are based on the auditory model of Dau et al., namely, the measure PSM, computed by the quality model PEMO-Q; the measure qc, computed by the quality model proposed by Hansen and Kollmeier; and the linear subcomponent of the HASQI. For the prediction of quality ratings by hearing impaired listeners, extensions of some models incorporating hearing loss were implemented and shown to achieve improved prediction accuracy. Results indicate that these objective quality measures can potentially serve as tools for assisting in initial setting of frequency compression parameters.

  5. International aspirations for speech-language pathologists' practice with multilingual children with speech sound disorders: development of a position paper.

    Science.gov (United States)

    McLeod, Sharynne; Verdon, Sarah; Bowen, Caroline

    2013-01-01

    A major challenge for the speech-language pathology profession in many cultures is to address the mismatch between the "linguistic homogeneity of the speech-language pathology profession and the linguistic diversity of its clientele" (Caesar & Kohler, 2007, p. 198). This paper outlines the development of the Multilingual Children with Speech Sound Disorders: Position Paper created to guide speech-language pathologists' (SLPs') facilitation of multilingual children's speech. An international expert panel was assembled comprising 57 researchers (SLPs, linguists, phoneticians, and speech scientists) with knowledge about multilingual children's speech, or children with speech sound disorders. Combined, they had worked in 33 countries and used 26 languages in professional practice. Fourteen panel members met for a one-day workshop to identify key points for inclusion in the position paper. Subsequently, 42 additional panel members participated online to contribute to drafts of the position paper. A thematic analysis was undertaken of the major areas of discussion using two data sources: (a) face-to-face workshop transcript (133 pages) and (b) online discussion artifacts (104 pages). Finally, a moderator with international expertise in working with children with speech sound disorders facilitated the incorporation of the panel's recommendations. The following themes were identified: definitions, scope, framework, evidence, challenges, practices, and consideration of a multilingual audience. The resulting position paper contains guidelines for providing services to multilingual children with speech sound disorders (http://www.csu.edu.au/research/multilingual-speech/position-paper). The paper is structured using the International Classification of Functioning, Disability and Health: Children and Youth Version (World Health Organization, 2007) and incorporates recommendations for (a) children and families, (b) SLPs' assessment and intervention, (c) SLPs' professional

  6. Speech parts as Poisson processes.

    Science.gov (United States)

    Badalamenti, A F

    2001-09-01

    This paper presents evidence that six of the seven parts of speech occur in written text as Poisson processes, simple or recurring. The six major parts are nouns, verbs, adjectives, adverbs, prepositions, and conjunctions, with the interjection occurring too infrequently to support a model. The data consist of more than the first 5000 words of works by four major authors coded to label the parts of speech, as well as periods (sentence terminators). Sentence length is measured via the period and found to be normally distributed, with no stochastic model identified for its occurrence. The models for all six speech parts but the noun significantly distinguish some pairs of authors, and likewise for the joint use of all word types. Any one author is significantly distinguished from any other by at least one word type, and sentence length very significantly distinguishes each from all others. The variety of word-type use, measured by Shannon entropy, builds to about 90% of its maximum possible value. The rate constants for nouns are close to the fractions of maximum entropy achieved. This finding, together with the stochastic models and the relations among them, suggests that the noun may be a primitive organizer of written text.
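
    Two quantities the paper leans on, the Poisson rate for a part of speech and the Shannon entropy of word-type use as a fraction of its maximum, can be computed as in this sketch; the tag sequence is synthetic, whereas real use would start from a part-of-speech-tagged text.

```python
import numpy as np
from collections import Counter

# Synthetic tag sequence standing in for the first 5000 tagged words
rng = np.random.default_rng(0)
tags = rng.choice(["noun", "verb", "adj", "adv", "prep", "conj"],
                  size=5000, p=[.30, .20, .15, .10, .15, .10])

# Maximum-likelihood Poisson rate for nouns: mean count per 100-word block
noun_counts = (tags.reshape(50, 100) == "noun").sum(axis=1)
lam = noun_counts.mean()

# Shannon entropy of word-type use as a fraction of its maximum (log2 of 6 types)
p = np.array(list(Counter(tags).values())) / len(tags)
h_frac = -(p * np.log2(p)).sum() / np.log2(6)
print(f"noun rate per 100 words: {lam:.1f}; entropy fraction: {h_frac:.2f}")
```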

  7. The Hierarchical Cortical Organization of Human Speech Processing.

    Science.gov (United States)

    de Heer, Wendy A; Huth, Alexander G; Griffiths, Thomas L; Gallant, Jack L; Theunissen, Frédéric E

    2017-07-05

    Speech comprehension requires that the brain extract semantic meaning from the spectral features represented at the cochlea. To investigate this process, we performed an fMRI experiment in which five men and two women passively listened to several hours of natural narrative speech. We then used voxelwise modeling to predict BOLD responses based on three different feature spaces that represent the spectral, articulatory, and semantic properties of speech. The amount of variance explained by each feature space was then assessed using a separate validation dataset. Because some responses might be explained equally well by more than one feature space, we used a variance partitioning analysis to determine the fraction of the variance that was uniquely explained by each feature space. Consistent with previous studies, we found that speech comprehension involves hierarchical representations starting in primary auditory areas and moving laterally on the temporal lobe: spectral features are found in the core of A1, mixtures of spectral and articulatory in STG, mixtures of articulatory and semantic in STS, and semantic in STS and beyond. Our data also show that both hemispheres are equally and actively involved in speech perception and interpretation. Further, responses as early in the auditory hierarchy as in STS are more correlated with semantic than spectral representations. These results illustrate the importance of using natural speech in neurolinguistic research. Our methodology also provides an efficient way to simultaneously test multiple specific hypotheses about the representations of speech without using block designs and segmented or synthetic speech. SIGNIFICANCE STATEMENT To investigate the processing steps performed by the human brain to transform natural speech sound into meaningful language, we used models based on a hierarchical set of speech features to predict BOLD responses of individual voxels recorded in an fMRI experiment while subjects listened to
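
    The variance-partitioning logic can be sketched with nested regressions: fit each feature space alone and jointly, then take the unique contribution of one space as the joint R² minus the R² of the other space alone. The features below are synthetic and the R² is in-sample, whereas the study used voxelwise models scored on a held-out validation set.

```python
import numpy as np
from sklearn.linear_model import Ridge
from sklearn.metrics import r2_score

rng = np.random.default_rng(0)
n = 1000
spectral = rng.normal(size=(n, 20))  # stand-ins for two feature spaces
semantic = rng.normal(size=(n, 30))
bold = spectral[:, 0] + 0.5 * semantic[:, 0] + rng.normal(size=n)

def r2(X, y):
    return r2_score(y, Ridge(alpha=1.0).fit(X, y).predict(X))

r2_joint = r2(np.hstack([spectral, semantic]), bold)
unique_spectral = r2_joint - r2(semantic, bold)  # variance only spectral explains
unique_semantic = r2_joint - r2(spectral, bold)
shared = r2_joint - unique_spectral - unique_semantic
print(f"unique spectral {unique_spectral:.2f}, "
      f"unique semantic {unique_semantic:.2f}, shared {shared:.2f}")
```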

  8. Is the Speech Transmission Index (STI) a robust measure of sound system speech intelligibility performance?

    Science.gov (United States)

    Mapp, Peter

    2002-11-01

    Although RaSTI is a good indicator of the speech intelligibility capability of auditoria and similar spaces, during the past 2-3 years it has been shown that RaSTI is not a robust predictor of sound system intelligibility performance. Instead, it is now recommended, within both national and international codes and standards, that full STI measurement and analysis be employed. However, new research is reported indicating that STI is neither as flawless nor as robust as many believe. The paper highlights a number of potential error mechanisms. It is shown that the measurement technique and signal excitation stimulus can have a significant effect on the overall result and accuracy, particularly where DSP-based equipment is employed. It is also shown that in its current state of development, STI is not capable of appropriately accounting for a number of fundamental speech and system attributes, including typical sound system frequency response variations and anomalies. This is particularly shown to be the case when a system is operating under reverberant conditions. Comparisons between actual system measurements and corresponding word score data are reported, where errors of up to 50% occur. The implications for VA and PA system performance verification will be discussed.
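
    For orientation, the core arithmetic behind an STI value can be sketched as follows: each measured modulation-transfer value m is converted to an apparent signal-to-noise ratio, clipped to ±15 dB, rescaled to a 0-1 transmission index, and averaged with octave-band weights. This simplified sketch of the IEC 60268-16 scheme omits the standard's masking and level corrections, and the weights shown are illustrative rather than the standard's exact values.

```python
import numpy as np

def sti_from_mtf(m, band_weights):
    """Simplified STI from a (bands x modulation-frequencies) MTF matrix.

    Omits the auditory-masking and absolute-level corrections of the
    full IEC 60268-16 procedure.
    """
    m = np.clip(np.asarray(m, dtype=float), 1e-6, 1 - 1e-6)
    snr_apparent = 10 * np.log10(m / (1 - m))      # effective SNR per m-value
    snr_apparent = np.clip(snr_apparent, -15, 15)  # clip to +/-15 dB
    ti = (snr_apparent + 15) / 30                  # transmission index, 0..1
    band_ti = ti.mean(axis=1)                      # average over mod. freqs
    return float(np.dot(band_weights, band_ti))

# Hypothetical MTF: 7 octave bands (125 Hz..8 kHz) x 14 modulation frequencies.
mtf = np.full((7, 14), 0.7)
weights = np.array([0.13, 0.14, 0.11, 0.12, 0.19, 0.17, 0.14])  # illustrative
print(f"STI = {sti_from_mtf(mtf, weights):.2f}")
```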

  9. Cognitive Bias for Learning Speech Sounds From a Continuous Signal Space Seems Nonlinguistic

    Directory of Open Access Journals (Sweden)

    Sabine van der Ham

    2015-10-01

    Full Text Available When learning language, humans have a tendency to produce more extreme distributions of speech sounds than those observed most frequently: In rapid, casual speech, vowel sounds are centralized, yet cross-linguistically, peripheral vowels occur almost universally. We investigate whether adults’ generalization behavior reveals selective pressure for communication when they learn skewed distributions of speech-like sounds from a continuous signal space. The domain-specific hypothesis predicts that the emergence of sound categories is driven by a cognitive bias to make these categories maximally distinct, resulting in more skewed distributions in participants’ reproductions. However, our participants showed more centered distributions, which goes against this hypothesis, indicating that there are no strong innate linguistic biases that affect learning these speech-like sounds. The centralization behavior can be explained by a lack of communicative pressure to maintain categories.

  10. The neural basis of sublexical speech and corresponding nonspeech processing: a combined EEG-MEG study.

    Science.gov (United States)

    Kuuluvainen, Soila; Nevalainen, Päivi; Sorokin, Alexander; Mittag, Maria; Partanen, Eino; Putkinen, Vesa; Seppänen, Miia; Kähkönen, Seppo; Kujala, Teija

    2014-03-01

    We addressed the neural organization of speech versus nonspeech sound processing by investigating preattentive cortical auditory processing of changes in five features of a consonant-vowel syllable (consonant, vowel, sound duration, frequency, and intensity) and their acoustically matched nonspeech counterparts in a simultaneous EEG-MEG recording of mismatch negativity (MMN/MMNm). Overall, speech-sound processing was enhanced compared to nonspeech sound processing. This effect was strongest for changes that affect word meaning (consonant, vowel, and vowel duration) in the left hemisphere and, for the vowel identity change, in the right hemisphere as well. Furthermore, in the right hemisphere, speech-sound frequency and intensity changes were processed faster than their nonspeech counterparts, and there was a trend toward speech enhancement in frequency processing. In summary, the results support the proposed existence of long-term memory traces for speech sounds in the auditory cortices, and indicate at least partly distinct neural substrates for speech and nonspeech sound processing. Copyright © 2014 Elsevier Inc. All rights reserved.

  11. Auditory cortex processes variation in our own speech.

    Directory of Open Access Journals (Sweden)

    Kevin R Sitek

    Full Text Available As we talk, we unconsciously adjust our speech to ensure it sounds the way we intend it to sound. However, because speech production involves complex motor planning and execution, no two utterances of the same sound will be exactly the same. Here, we show that auditory cortex is sensitive to natural variations in self-produced speech from utterance to utterance. We recorded event-related potentials (ERPs) from ninety-nine subjects while they uttered "ah" and while they listened to those speech sounds played back. Subjects' utterances were sorted based on their formant deviations from the previous utterance. Typically, the N1 ERP component is suppressed during talking compared to listening. By comparing ERPs to the least and most variable utterances, we found that N1 was less suppressed to utterances that differed greatly from their preceding neighbors. In contrast, an utterance's difference from the median formant values did not affect N1. Trial-to-trial pitch (f0) deviation and pitch difference from the median similarly did not affect N1. We discuss mechanisms that may underlie the change in N1 suppression resulting from trial-to-trial formant change. Deviant utterances require additional auditory cortical processing, suggesting that speaking-induced suppression mechanisms are optimally tuned for a specific production.

  12. Auditory Cortex Processes Variation in Our Own Speech

    Science.gov (United States)

    Sitek, Kevin R.; Mathalon, Daniel H.; Roach, Brian J.; Houde, John F.; Niziolek, Caroline A.; Ford, Judith M.

    2013-01-01

    As we talk, we unconsciously adjust our speech to ensure it sounds the way we intend it to sound. However, because speech production involves complex motor planning and execution, no two utterances of the same sound will be exactly the same. Here, we show that auditory cortex is sensitive to natural variations in self-produced speech from utterance to utterance. We recorded event-related potentials (ERPs) from ninety-nine subjects while they uttered “ah” and while they listened to those speech sounds played back. Subjects' utterances were sorted based on their formant deviations from the previous utterance. Typically, the N1 ERP component is suppressed during talking compared to listening. By comparing ERPs to the least and most variable utterances, we found that N1 was less suppressed to utterances that differed greatly from their preceding neighbors. In contrast, an utterance's difference from the median formant values did not affect N1. Trial-to-trial pitch (f0) deviation and pitch difference from the median similarly did not affect N1. We discuss mechanisms that may underlie the change in N1 suppression resulting from trial-to-trial formant change. Deviant utterances require additional auditory cortical processing, suggesting that speaking-induced suppression mechanisms are optimally tuned for a specific production. PMID:24349399
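
    The trial-sorting analysis in these two reports reduces to a simple computation once formant tracks and single-trial N1 amplitudes are extracted: measure each utterance's formant distance from its predecessor, median-split the trials, and compare N1. A hypothetical sketch with simulated values:

```python
import numpy as np

rng = np.random.default_rng(1)

# Hypothetical per-trial data: first two formants (Hz) and N1 amplitude (uV).
f1f2 = rng.normal(loc=[700, 1200], scale=[40, 60], size=(200, 2))
n1 = rng.normal(loc=-4.0, scale=1.0, size=200)

# Distance of each utterance from the preceding one (first trial has none).
prev_dev = np.linalg.norm(f1f2[1:] - f1f2[:-1], axis=1)
n1_trials = n1[1:]

# Median split: least vs. most deviant from the previous utterance.
median = np.median(prev_dev)
least, most = n1_trials[prev_dev <= median], n1_trials[prev_dev > median]
print("mean N1, least deviant:", least.mean())
print("mean N1, most deviant:", most.mean())

# Control analysis (reported null): distance from the median formant values.
med_dev = np.linalg.norm(f1f2[1:] - np.median(f1f2, axis=0), axis=1)
```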

  13. Experience with speech sounds is not necessary for cue trading by budgerigars (Melopsittacus undulatus).

    Directory of Open Access Journals (Sweden)

    Mary Flaherty

    Full Text Available The influence of experience with human speech sounds on speech perception in budgerigars, vocal mimics whose speech exposure can be tightly controlled in a laboratory setting, was measured. Budgerigars were divided into groups that differed in auditory exposure and then tested on a cue-trading identification paradigm with synthetic speech. Phonetic cue trading is a perceptual phenomenon observed when changes on one cue dimension are offset by changes in another cue dimension while still maintaining the same phonetic percept. The current study examined whether budgerigars would trade the cues of voice onset time (VOT) and the first formant onset frequency when identifying syllable-initial stop consonants, and whether this would be influenced by exposure to speech sounds. There were a total of four different exposure groups: No speech exposure (completely isolated), Passive speech exposure (regular exposure to human speech), and two Speech-trained groups. After the exposure period, all budgerigars were tested for phonetic cue trading using operant conditioning procedures. Birds were trained to peck keys in response to different synthetic speech sounds that began with "d" or "t" and varied in VOT and frequency of the first formant at voicing onset. Once training performance criteria were met, budgerigars were presented with the entire intermediate series, including ambiguous sounds. Responses on these trials were used to determine which speech cues were used, whether a trading relation between VOT and the onset frequency of the first formant was present, and whether speech exposure had an influence on perception. Cue trading was found in all birds, and these results were largely similar to those of a group of humans. Results indicated that prior speech experience was not a requirement for cue trading by budgerigars. The results are consistent with theories that explain phonetic cue trading in terms of a rich auditory encoding of the speech signal.

  14. Relationship between individual differences in speech processing and cognitive functions.

    Science.gov (United States)

    Ou, Jinghua; Law, Sam-Po; Fung, Roxana

    2015-12-01

    A growing body of research has suggested that cognitive abilities may play a role in individual differences in speech processing. The present study took advantage of a widespread linguistic phenomenon of sound change to systematically assess the relationships between speech processing and various components of attention and working memory in the auditory and visual modalities among typically developed Cantonese-speaking individuals. The individual variations in speech processing are captured in an ongoing sound change (tone merging) in Hong Kong Cantonese, in which typically developed native speakers are reported to lose the distinctions between some tonal contrasts in perception and/or production. Three groups of participants were recruited, with a first group of good perception and production, a second group of good perception but poor production, and a third group of good production but poor perception. Our findings revealed that modality-independent abilities of attentional switching/control and working memory might contribute to individual differences in patterns of speech perception and production as well as discrimination latencies among typically developed speakers. The findings not only have the potential to generalize to speech processing in other languages, but also broaden our understanding of the omnipresent phenomenon of language change in all languages.

  15. Varying acoustic-phonemic ambiguity reveals that talker normalization is obligatory in speech processing.

    Science.gov (United States)

    Choi, Ja Young; Hu, Elly R; Perrachione, Tyler K

    2018-04-01

    The nondeterministic relationship between speech acoustics and abstract phonemic representations imposes a challenge for listeners to maintain perceptual constancy despite the highly variable acoustic realization of speech. Talker normalization facilitates speech processing by reducing the degrees of freedom for mapping between encountered speech and phonemic representations. While this process has been proposed to facilitate the perception of ambiguous speech sounds, it is currently unknown whether talker normalization is affected by the degree of potential ambiguity in acoustic-phonemic mapping. We explored the effects of talker normalization on speech processing in a series of speeded classification paradigms, parametrically manipulating the potential for inconsistent acoustic-phonemic relationships across talkers for both consonants and vowels. Listeners identified words with varying potential acoustic-phonemic ambiguity across talkers (e.g., beet/boat vs. boot/boat) spoken by single or mixed talkers. Auditory categorization of words was always slower when listening to mixed talkers compared to a single talker, even when there was no potential acoustic ambiguity between target sounds. Moreover, the processing cost imposed by mixed talkers was greatest when words had the most potential acoustic-phonemic overlap across talkers. Models of acoustic dissimilarity between target speech sounds did not account for the pattern of results. These results suggest (a) that talker normalization incurs the greatest processing cost when disambiguating highly confusable sounds and (b) that talker normalization appears to be an obligatory component of speech perception, taking place even when the acoustic-phonemic relationships across sounds are unambiguous.

  16. Differential Diagnosis of Speech Sound Disorder (Phonological Disorder): Audiological Assessment beyond the Pure-tone Audiogram.

    Science.gov (United States)

    Iliadou, Vasiliki Vivian; Chermak, Gail D; Bamiou, Doris-Eva

    2015-04-01

    According to the Diagnostic and Statistical Manual of Mental Disorders, Fifth Edition, diagnosis of speech sound disorder (SSD) requires a determination that it is not the result of other congenital or acquired conditions, including hearing loss or neurological conditions that may present with similar symptomatology. To examine peripheral and central auditory function for the purpose of determining whether a peripheral or central auditory disorder was an underlying factor or contributed to the child's SSD. Central auditory processing disorder clinic pediatric case reports. Three clinical cases are reviewed of children with diagnosed SSD who were referred for audiological evaluation by their speech-language pathologists as a result of slower than expected progress in therapy. Audiological testing revealed auditory deficits involving peripheral auditory function or the central auditory nervous system. These cases demonstrate the importance of increasing awareness among professionals of the need to fully evaluate the auditory system to identify auditory deficits that could contribute to a patient's speech sound (phonological) disorder. Audiological assessment in cases of suspected SSD should not be limited to pure-tone audiometry given its limitations in revealing the full range of peripheral and central auditory deficits, deficits which can compromise treatment of SSD. American Academy of Audiology.

  17. Investigating the neural correlates of voice versus speech-sound directed information in pre-school children.

    Directory of Open Access Journals (Sweden)

    Nora Maria Raschle

    Full Text Available Studies in sleeping newborns and infants propose that the superior temporal sulcus is involved in speech processing soon after birth. Speech processing also implicitly requires the analysis of the human voice, which conveys both linguistic and extra-linguistic information. However, due to technical and practical challenges when neuroimaging young children, evidence of neural correlates of speech and/or voice processing in toddlers and young children remains scarce. In the current study, we used functional magnetic resonance imaging (fMRI) in 20 typically developing preschool children (average age = 5.8 y; range 5.2-6.8 y) to investigate brain activation during judgments about vocal identity versus the initial speech sound of spoken object words. fMRI results reveal common brain regions responsible for voice-specific and speech-sound-specific processing of spoken object words, including bilateral primary and secondary language areas of the brain. Contrasting voice-specific with speech-sound-specific processing predominantly activates the anterior part of the right-hemispheric superior temporal sulcus (STS). Furthermore, the right STS is functionally correlated with left-hemispheric temporal and right-hemispheric prefrontal regions. This finding underlines the importance of the right superior temporal sulcus as a temporal voice area and indicates that this brain region is specialized, and functions similarly to adults, by the age of five. We thus extend previous knowledge of voice-specific regions and their functional connections to the young brain, which may further our understanding of the neuronal mechanisms of speech-specific processing in children with developmental disorders, such as autism or specific language impairments.

  18. Sound localization and speech identification in the frontal median plane with a hear-through headset

    DEFF Research Database (Denmark)

    Hoffmann, Pablo F.; Møller, Anders Kalsgaard; Christensen, Flemming

    2014-01-01

    signals can be superimposed via earphone reproduction. An important aspect of the hear-through headset is its transparency, i.e. how close to real life the electronically amplified sounds are perceived. Here we report experiments conducted to evaluate the auditory transparency of a hear-through headset...... prototype by comparing human performance in natural, hear-through, and fully occluded conditions for two spatial tasks: frontal vertical-plane sound localization and speech-on-speech spatial release from masking. Results showed that localization performance was impaired by the hear-through headset relative...... to the natural condition, though not as much as in the fully occluded condition. Localization was affected the least when the sound source was in front of the listeners. Different from the vertical localization performance, results from the speech task suggest that normal speech-on-speech spatial release from...

  19. Evaluation of speech reception threshold in noise in young Cochlear™ Nucleus® system 6 implant recipients using two different digital remote microphone technologies and a speech enhancement sound processing algorithm.

    Science.gov (United States)

    Razza, Sergio; Zaccone, Monica; Meli, Annalisa; Cristofari, Eliana

    2017-12-01

    Children affected by hearing loss can experience difficulties in challenging and noisy environments even when deafness is corrected by cochlear implant (CI) devices. These patients have a selective attention deficit in multiple listening conditions. At present, the most effective ways to improve the performance of speech recognition in noise consist of providing CI processors with noise reduction algorithms and providing patients with bilateral CIs. The aim of this study was to compare speech performance in noise, across increasing noise levels, in CI recipients using two kinds of wireless remote-microphone radio systems that use digital radio frequency transmission: the Roger Inspiro accessory and the Cochlear Wireless Mini Microphone accessory. Eleven young users of the Cochlear Nucleus CP910 CI were studied. The signal-to-noise ratio at a speech reception threshold (SRT) value of 50% was measured in different conditions for each patient: with the CI only, with the Roger accessory, or with the Mini Mic accessory. The effect of applying the SNR noise reduction algorithm in each of these conditions was also assessed. The tests were performed with the subject positioned in front of the main speaker, at a distance of 2.5 m; another two speakers were positioned at 3.50 m. The main speaker presented disyllabic words at 65 dB. Babble noise was delivered through the other speakers at variable intensity. The use of both wireless remote microphones improved the SRT results, with a larger gain for the Mini Mic system (SRT = -4.76) than for the Roger system (SRT = -3.01). The addition of the NR algorithm did not produce a further statistically significant improvement. There is significant improvement in speech recognition results with both wireless digital remote microphone accessories, in particular with the Mini Mic system when used with the CP910 processor. The use of a remote microphone accessory surpasses the benefit of…
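
    An SRT at 50% of the kind reported here is typically obtained adaptively: the SNR is lowered after each correct response and raised after each error, so a 1-up/1-down staircase converges on the 50%-correct point. A minimal sketch with a simulated listener; the exact procedure and step sizes used in the study are assumptions here.

```python
import math
import random

def measure_srt(listener, start_snr=10.0, step=2.0, trials=30):
    """1-up/1-down adaptive staircase converging on 50% intelligibility.

    `listener(snr)` returns True if the word was repeated correctly.
    The SRT estimate is the mean SNR over the last 10 trials.
    """
    snr, history = start_snr, []
    for _ in range(trials):
        history.append(snr)
        snr += -step if listener(snr) else step  # down if correct, up if wrong
    return sum(history[-10:]) / 10

def simulated_listener(snr, true_srt=-3.0, slope=0.5):
    """Logistic psychometric function around a hypothetical true SRT."""
    p_correct = 1.0 / (1.0 + math.exp(-slope * (snr - true_srt)))
    return random.random() < p_correct

print(f"estimated SRT: {measure_srt(simulated_listener):.1f} dB SNR")
```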

  20. Intervention efficacy and intensity for children with speech sound disorder.

    Science.gov (United States)

    Allen, Melissa M

    2013-06-01

    Clinicians do not have an evidence base they can use to recommend optimum intervention intensity for preschool children who present with speech sound disorder (SSD). This study examined the effect of dose frequency on phonological performance and the efficacy of the multiple oppositions approach. Fifty-four preschool children with SSD were randomly assigned to one of three intervention conditions. Two intervention conditions received the multiple oppositions approach either 3 times per week for 8 weeks (P3) or once weekly for 24 weeks (P1). A control (C) condition received a storybook intervention. Percentage of consonants correct (PCC) was evaluated at 8 weeks and after 24 sessions. PCC gain was examined after a 6-week maintenance period. The P3 condition had a significantly better phonological outcome than the P1 and C conditions at 8 weeks and than the P1 condition after 24 weeks. There were no significant differences between the P1 and C conditions. There was no significant difference between the P1 and P3 conditions in PCC gain during the maintenance period. Preschool children with SSD who received the multiple oppositions approach made significantly greater gains when they were provided with a more intensive dose frequency and when cumulative intervention intensity was held constant.
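
    The outcome measure used here, percentage of consonants correct (PCC), is simply the proportion of intended consonants produced correctly, expressed as a percentage. A minimal sketch with hypothetical counts:

```python
def pcc(correct_consonants, total_consonants):
    """Percentage of consonants correct in a scored speech sample."""
    return 100.0 * correct_consonants / total_consonants

# Hypothetical sample: 87 of 112 intended consonants produced correctly.
print(f"PCC = {pcc(87, 112):.1f}%")  # -> PCC = 77.7%
```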

  1. Speech and non-speech processing in children with phonological disorders: an electrophysiological study

    Directory of Open Access Journals (Sweden)

    Isabela Crivellaro Gonçalves

    2011-01-01

    Full Text Available OBJECTIVE: To determine whether neurophysiological auditory brainstem responses to clicks and repeated speech stimuli differ between typically developing children and children with phonological disorders. INTRODUCTION: Phonological disorders are language impairments resulting from inadequate use of adult phonological language rules and are among the most common speech and language disorders in children (prevalence: 8-9%). Our hypothesis is that children with phonological disorders have basic differences in the way that their brains encode acoustic signals at the brainstem level when compared to normal counterparts. METHODS: We recorded click- and speech-evoked auditory brainstem responses in 18 typically developing children (control group) and in 18 children who were clinically diagnosed with phonological disorders (research group). The age range of the children was 7-11 years. RESULTS: The research group exhibited significantly longer latency responses to click stimuli (waves I, III and V) and speech stimuli (waves V and A) when compared to the control group. DISCUSSION: These results suggest that the abnormal encoding of speech sounds may be a biological marker of phonological disorders. However, these results cannot define the biological origins of phonological problems. We also observed that speech-evoked auditory brainstem responses had a higher specificity/sensitivity for identifying phonological disorders than click-evoked auditory brainstem responses. CONCLUSIONS: Early stages of the auditory pathway processing of an acoustic stimulus are not similar in typically developing children and those with phonological disorders. These findings suggest that there are brainstem auditory pathway abnormalities in children with phonological disorders.

  2. Phonological Encoding in Speech-Sound Disorder: Evidence from a Cross-Modal Priming Experiment

    Science.gov (United States)

    Munson, Benjamin; Krause, Miriam O. P.

    2017-01-01

    Background: Psycholinguistic models of language production provide a framework for determining the locus of language breakdown that leads to speech-sound disorder (SSD) in children. Aims: To examine whether children with SSD differ from their age-matched peers with typical speech and language development (TD) in the ability phonologically to…

  3. Parental Beliefs and Experiences Regarding Involvement in Intervention for Their Child with Speech Sound Disorder

    Science.gov (United States)

    Watts Pappas, Nicole; McAllister, Lindy; McLeod, Sharynne

    2016-01-01

    Parental beliefs and experiences regarding involvement in speech intervention for their child with mild to moderate speech sound disorder (SSD) were explored using multiple, sequential interviews conducted during a course of treatment. Twenty-one interviews were conducted with seven parents of six children with SSD: (1) after their child's initial…

  4. Do Irrelevant Sounds Impair the Maintenance of All Characteristics of Speech in Memory?

    Science.gov (United States)

    Gabriel, D.; Gaudrain, E.; Lebrun-Guillaud, G.; Sheppard, F.; Tomescu, I. M.; Schnider, A.

    2012-01-01

    Several studies have shown that maintaining in memory some attributes of speech, such as the content or pitch of an interlocutor's message, is markedly reduced in the presence of background sounds made of spectrotemporal variations. However, experimental paradigms showing this interference have only focused on one attribute of speech at a time,…

  5. Balancing speech intelligibility versus sound exposure in selection of personal hearing protection equipment for Chinook aircrews

    NARCIS (Netherlands)

    Wijngaarden, S.J. van; Rots, G.

    2001-01-01

    Background: Aircrews are often exposed to high ambient sound levels, especially in military aviation. Since long-term exposure to such noise may cause hearing damage, selection of adequate hearing protective devices is crucial. Such devices also affect speech intelligibility. When speech

  6. Speech Sound Disorders in Preschool Children: Correspondence between Clinical Diagnosis and Teacher and Parent Report

    Science.gov (United States)

    Harrison, Linda J.; McLeod, Sharynne; McAllister, Lindy; McCormack, Jane

    2017-01-01

    This study sought to assess the level of correspondence between parent and teacher report of concern about young children's speech and specialist assessment of speech sound disorders (SSD). A sample of 157 children aged 4-5 years was recruited in preschools and long day care centres in Victoria and New South Wales (NSW). SSD was assessed…

  7. Profile of Australian Preschool Children with Speech Sound Disorders at Risk for Literacy Difficulties

    Science.gov (United States)

    McLeod, Sharynne; Crowe, Kathryn; Masso, Sarah; Baker, Elise; McCormack, Jane; Wren, Yvonne; Roulstone, Susan; Howland, Charlotte

    2017-01-01

    Speech sound disorders are a common communication difficulty in preschool children. Teachers indicate difficulty identifying and supporting these children. The aim of this research was to describe speech and language characteristics of children identified by their parents and/or teachers as having possible communication concerns. 275 Australian 4-…

  8. Profile of Australian preschool children with speech sound disorders at risk for literacy difficulties

    OpenAIRE

    McLeod, S.; Crowe, K.; Masso, S.; Baker, E.; McCormack, J.; Wren, Y.; Roulstone, S.; Howland, C.

    2017-01-01

    Background: Speech sound disorders are a common communication difficulty in preschool children. Teachers indicate difficulty identifying and supporting these children. Aim: To describe speech and language characteristics of children identified by their parents and/or teachers as having possible communication concerns. Method: 275 Australian 4- to 5-year-old children from 45 preschools whose parents and teachers were concerned about their talking participated in speech-language p...

  9. Intensive treatment with ultrasound visual feedback for speech sound errors in childhood apraxia

    Directory of Open Access Journals (Sweden)

    Jonathan L Preston

    2016-08-01

    Full Text Available Ultrasound imaging is an adjunct to traditional speech therapy that has been shown to be beneficial in the remediation of speech sound errors. Ultrasound biofeedback can be utilized during therapy to provide clients additional knowledge about their tongue shapes when attempting to produce sounds that are in error. The additional feedback may assist children with childhood apraxia of speech in stabilizing motor patterns, thereby facilitating more consistent and accurate productions of sounds and syllables. However, due to its specialized nature, ultrasound visual feedback is a technology that is not widely available to clients. Short-term intensive treatment programs are one option that can be utilized to expand access to ultrasound biofeedback. Schema-based motor learning theory suggests that short-term intensive treatment programs (massed practice) may assist children in acquiring more accurate motor patterns. In this case series, three participants aged 10-14 years diagnosed with childhood apraxia of speech attended 16 hours of speech therapy over a two-week period to address residual speech sound errors. Two participants had distortions on rhotic sounds, while the third participant demonstrated lateralization of sibilant sounds. During therapy, cues were provided to assist participants in obtaining a tongue shape that facilitated a correct production of the erred sound. Additional practice without ultrasound was also included. Results suggested that all participants showed signs of acquisition of sounds in error. Generalization and retention results were mixed: one participant showed generalization and retention of sounds that were treated; one showed generalization but limited retention; and the third showed no evidence of generalization or retention. Individual characteristics that may facilitate generalization are discussed. Short-term intensive treatment programs using ultrasound biofeedback may result in the acquisition of more accurate motor...

  10. Peripheral auditory processing and speech reception in impaired hearing

    DEFF Research Database (Denmark)

    Strelcyk, Olaf

    One of the most common complaints of people with impaired hearing concerns their difficulty with understanding speech. Particularly in the presence of background noise, hearing-impaired people often encounter great difficulties with speech communication. In most cases, the problem persists even...... if reduced audibility has been compensated for by hearing aids. It has been hypothesized that part of the difficulty arises from changes in the perception of sounds that are well above hearing threshold, such as reduced frequency selectivity and deficits in the processing of temporal fine structure (TFS......) at the output of the inner-ear (cochlear) filters. The purpose of this work was to investigate these aspects in detail. One chapter studies relations between frequency selectivity, TFS processing, and speech reception in listeners with normal and impaired hearing, using behavioral listening experiments. While...

  11. Use of Authentic-Speech Technique for Teaching Sound Recognition to EFL Students

    Science.gov (United States)

    Sersen, William J.

    2011-01-01

    The main objective of this research was to test an authentic-speech technique for improving the sound-recognition skills of EFL (English as a foreign language) students at Roi-Et Rajabhat University. The secondary objective was to determine the correlation, if any, between students' self-evaluation of sound-recognition progress and the actual…

  12. Speech perception as an active cognitive process

    Directory of Open Access Journals (Sweden)

    Shannon eHeald

    2014-03-01

    Full Text Available One view of speech perception is that acoustic signals are transformed into representations for pattern matching to determine linguistic structure. This process can be taken as a statistical pattern-matching problem, assuming relatively stable linguistic categories are characterized by neural representations related to auditory properties of speech that can be compared to speech input. This kind of pattern matching can be termed a passive process, which implies rigidity of processing with few demands on cognitive processing. An alternative view is that speech recognition, even in early stages, is an active process in which speech analysis is attentionally guided. Note that this does not mean consciously guided, but that information-contingent changes in early auditory encoding can occur as a function of context and experience. Active processing assumes that attention, plasticity, and listening goals are important in considering how listeners cope with adverse circumstances that impair hearing, such as masking noise in the environment or hearing loss. Although theories of speech perception have begun to incorporate some active processing, they seldom treat early speech encoding as plastic and attentionally guided. Recent research has suggested that speech perception is the product of both feedforward and feedback interactions between a number of brain regions, including descending projections perhaps as far downstream as the cochlea. It is important to understand how the ambiguity of the speech signal and the constraints of context dynamically determine the cognitive resources recruited during perception, including focused attention, learning, and working memory. Theories of speech perception need to go beyond the current corticocentric approach in order to account for the intrinsic dynamics of the auditory encoding of speech. In doing so, this may provide new insights into ways in which hearing disorders and loss may be treated, either through augmentation or...

  13. Patterns and risk factors associated with speech sounds and language disorders in pakistan

    International Nuclear Information System (INIS)

    Arshad, H.; Ghayas, M.S.; Madiha, A.

    2013-01-01

    The objectives were to observe the patterns of speech sound and language disorders and to identify their associated risk factors. Background: Communication is the very essence of modern society, and communication disorders impact quality of life. Patterns of speech sound and language impairments, and the factors associated with them, were explored; associations with different environmental factors were also examined. Methodology: The study included 200 patients, aged between two and sixteen years, who presented at the speech therapy clinic OPD, Mayo Hospital. A cross-sectional survey questionnaire assessed each patient's biodata, socioeconomic background, family history of communication disorders, and bilingualism. It was a descriptive study conducted through a cross-sectional survey. Data were analysed with SPSS version 16. Results: Language disorders were relatively more prevalent in males than speech sound disorders. Bilingualism was found to have no significant effect on these disorders, whereas socioeconomic status and family history were significant risk factors. Conclusion: Gender, socioeconomic status, and family history can act as risk factors for developing speech sound and language disorders. There is a grave need to understand patterns of communication disorders in the light of Pakistani society and culture. Further studies are recommended to determine the risk factors and patterns of these impairments. (author)

  14. Hearing aid processing of loud speech and noise signals: Consequences for loudness perception and listening comfort

    DEFF Research Database (Denmark)

    Schmidt, Erik

    2007-01-01

    Sound processing in hearing aids is determined by the fitting rule. The fitting rule describes how the hearing aid should amplify speech and sounds in the surroundings......, such that they become audible again for the hearing-impaired person. The general goal is to place all sounds within the hearing aid users’ audible range, such that speech intelligibility and listening comfort become as good as possible. Amplification strategies in hearing aids are in many cases based on empirical...... Research on loud sounds has found that both normal-hearing and hearing-impaired listeners prefer loud sounds to be closer to the most comfortable loudness level than suggested by common non-linear fitting rules. During this project, two listening experiments were carried out. In the first experiment, hearing aid users......

  15. Cross-Modal Correspondence between Brightness and Chinese Speech Sound with Aspiration

    Directory of Open Access Journals (Sweden)

    Sachiko Hirata

    2011-10-01

    Full Text Available Phonetic symbolism is the phenomenon of speech sounds evoking images based on sensory experiences; it is often discussed together with cross-modal correspondence. Using Garner's task, Hirata, Kita, and Ukita (2009) showed a cross-modal congruence between brightness and voiced/voiceless consonants in Japanese speech sounds, which is known as phonetic symbolism. In the present study, we examined the effect of the meaning of mimetics (lexical words whose sounds reflect their meanings, like “ding-dong”) in the Japanese language on the cross-modal correspondence. We conducted an experiment with Chinese speech sounds, with or without aspiration, using Chinese participants. Chinese vocabulary also contains mimetics, but the existence of aspiration does not relate to the meaning of Chinese mimetics. As a result, Chinese speech sounds with aspiration, which resemble voiceless consonants, were matched with white, whereas those without aspiration were matched with black. This result is identical to the pattern found in Japanese people and consequently suggests that cross-modal correspondence occurs without the effect of the meaning of mimetics. Whether these cross-modal correspondences are based purely on the physical properties of speech sounds or are affected by phonetic properties remains a question for further study.

  16. Sound frequency affects speech emotion perception: results from congenital amusia.

    Science.gov (United States)

    Lolli, Sydney L; Lewenstein, Ari D; Basurto, Julian; Winnik, Sean; Loui, Psyche

    2015-01-01

    Congenital amusics, or "tone-deaf" individuals, show difficulty in perceiving and producing small pitch differences. While amusia has marked effects on music perception, its impact on speech perception is less clear. Here we test the hypothesis that individual differences in pitch perception affect judgment of emotion in speech, by applying low-pass filters to spoken statements of emotional speech. A norming study was first conducted on Mechanical Turk to ensure that the intended emotions from the Macquarie Battery for Evaluation of Prosody were reliably identifiable by US English speakers. The most reliably identified emotional speech samples were used in Experiment 1, in which subjects performed a psychophysical pitch discrimination task, and an emotion identification task under low-pass and unfiltered speech conditions. Results showed a significant correlation between pitch-discrimination threshold and emotion identification accuracy for low-pass filtered speech, with amusics (defined here as those with a pitch discrimination threshold >16 Hz) performing worse than controls. This relationship with pitch discrimination was not seen in unfiltered speech conditions. Given the dissociation between low-pass filtered and unfiltered speech conditions, we inferred that amusics may be compensating for poorer pitch perception by using speech cues that are filtered out in this manipulation. To assess this potential compensation, Experiment 2 was conducted using high-pass filtered speech samples intended to isolate non-pitch cues. No significant correlation was found between pitch discrimination and emotion identification accuracy for high-pass filtered speech. Results from these experiments suggest an influence of low frequency information in identifying emotional content of speech.

  17. Sound frequency affects speech emotion perception: Results from congenital amusia

    Directory of Open Access Journals (Sweden)

    Sydney eLolli

    2015-09-01

    Full Text Available Congenital amusics, or tone-deaf individuals, show difficulty in perceiving and producing small pitch differences. While amusia has marked effects on music perception, its impact on speech perception is less clear. Here we test the hypothesis that individual differences in pitch perception affect judgment of emotion in speech, by applying band-pass filters to spoken statements of emotional speech. A norming study was first conducted on Mechanical Turk to ensure that the intended emotions from the Macquarie Battery for Evaluation of Prosody (MBEP) were reliably identifiable by US English speakers. The most reliably identified emotional speech samples were used in Experiment 1, in which subjects performed a psychophysical pitch discrimination task, and an emotion identification task under band-pass and unfiltered speech conditions. Results showed a significant correlation between pitch discrimination threshold and emotion identification accuracy for band-pass filtered speech, with amusics (defined here as those with a pitch discrimination threshold > 16 Hz) performing worse than controls. This relationship with pitch discrimination was not seen in unfiltered speech conditions. Given the dissociation between band-pass filtered and unfiltered speech conditions, we inferred that amusics may be compensating for poorer pitch perception by using speech cues that are filtered out in this manipulation.
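
    The filtering manipulation in these two reports is straightforward to reproduce: a low-pass filter preserves the f0 region (pitch cues) while removing higher-frequency cues, and a high-pass filter does the converse. A sketch with scipy; the 500 Hz cutoff and the toy signal are assumptions for illustration, not the study's parameters.

```python
import numpy as np
from scipy.signal import butter, sosfiltfilt

def filter_speech(signal, fs, cutoff_hz, kind="lowpass"):
    """Zero-phase Butterworth filter isolating low- or high-frequency cues."""
    sos = butter(4, cutoff_hz, btype=kind, fs=fs, output="sos")
    return sosfiltfilt(sos, signal)

fs = 16000
t = np.arange(fs) / fs
# Toy "speech": a 120 Hz fundamental plus a 2.5 kHz high-frequency component.
speech = np.sin(2 * np.pi * 120 * t) + 0.3 * np.sin(2 * np.pi * 2500 * t)

low = filter_speech(speech, fs, 500, "lowpass")    # keeps pitch (f0) cues
high = filter_speech(speech, fs, 500, "highpass")  # keeps non-pitch cues
```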

  18. Crossmodal deficit in dyslexic children: practice affects the neural timing of letter-speech sound integration

    Directory of Open Access Journals (Sweden)

    Gojko eŽarić

    2015-06-01

    Full Text Available A failure to build solid letter-speech sound associations may contribute to reading impairments in developmental dyslexia. Whether this reduced neural integration of letters and speech sounds changes over time within individual children, and how this relates to behavioral gains in reading skills, remains unknown. In this research, we examined changes in event-related potential (ERP) measures of letter-speech sound integration over a 6-month period during which 9-year-old dyslexic readers (n=17) followed a training in letter-speech sound coupling alongside their regular reading curriculum. We presented the Dutch spoken vowels /a/ and /o/ as standard and deviant stimuli in one auditory and two audiovisual oddball conditions. In one audiovisual condition (AV0), the letter ‘a’ was presented simultaneously with the vowels, while in the other (AV200) it preceded vowel onset by 200 ms. Prior to the training (T1), dyslexic readers showed the expected pattern of typical auditory mismatch responses, together with the absence of letter-speech sound effects in a late negativity (LN) window. After the training (T2), our results showed earlier (and enhanced) crossmodal effects in the LN window. Most interestingly, earlier LN latency at T2 was significantly related to higher behavioral accuracy in letter-speech sound coupling. On a more general level, the timing of the earlier mismatch negativity (MMN) in the simultaneous condition (AV0) measured at T1 related significantly to reading fluency at both T1 and T2, as well as to reading gains. Our findings suggest that the reduced neural integration of letters and speech sounds in dyslexic children may show moderate improvement with reading instruction and training, and that behavioral improvements relate especially to individual differences in the timing of this neural integration.
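
    The ERP quantities reported here (MMN/LN amplitude and latency) come from deviant-minus-standard difference waves. A minimal sketch assuming already-averaged epochs; the waveforms and the 100-250 ms search window are illustrative assumptions.

```python
import numpy as np

# Hypothetical averaged ERPs: time in ms, amplitude in microvolts.
times = np.arange(-100, 500)          # epoch from -100 to 500 ms
standard = np.zeros_like(times, dtype=float)
deviant = -2.0 * np.exp(-((times - 180) ** 2) / (2 * 40**2))  # toy MMN ~180 ms

difference = deviant - standard       # deviant-minus-standard difference wave

# MMN peak: most negative point within a typical 100-250 ms search window.
window = (times >= 100) & (times <= 250)
peak_idx = np.argmin(difference[window])
print("MMN latency (ms):", times[window][peak_idx])
print("MMN amplitude (uV):", difference[window][peak_idx])
```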

  19. Musicians' Enhanced Neural Differentiation of Speech Sounds Arises Early in Life: Developmental Evidence from Ages 3 to 30

    Science.gov (United States)

    Strait, Dana L.; O'Connell, Samantha; Parbery-Clark, Alexandra; Kraus, Nina

    2014-01-01

    The perception and neural representation of acoustically similar speech sounds underlie language development. Music training hones the perception of minute acoustic differences that distinguish sounds; this training may generalize to speech processing given that adult musicians have enhanced neural differentiation of similar speech syllables compared with nonmusicians. Here, we asked whether this neural advantage in musicians is present early in life by assessing musically trained and untrained children as young as age 3. We assessed auditory brainstem responses to the speech syllables /ba/ and /ga/ as well as auditory and visual cognitive abilities in musicians and nonmusicians across 3 developmental time-points: preschoolers, school-aged children, and adults. Cross-phase analyses objectively measured the degree to which subcortical responses differed to these speech syllables in musicians and nonmusicians for each age group. Results reveal that musicians exhibit enhanced neural differentiation of stop consonants early in life and with as little as a few years of training. Furthermore, the extent of subcortical stop consonant distinction correlates with auditory-specific cognitive abilities (i.e., auditory working memory and attention). Results are interpreted according to a corticofugal framework for auditory learning in which subcortical processing enhancements are engendered by strengthened cognitive control over auditory function in musicians. PMID:23599166

  20. The influence of meaning on the perception of speech sounds.

    Science.gov (United States)

    Kazanina, Nina; Phillips, Colin; Idsardi, William

    2006-07-25

    As part of knowledge of language, an adult speaker possesses information on which sounds are used in the language and on the distribution of these sounds in a multidimensional acoustic space. However, a speaker must know not only the sound categories of his language but also the functional significance of these categories, in particular, which sound contrasts are relevant for storing words in memory and which sound contrasts are not. Using magnetoencephalographic brain recordings with speakers of Russian and Korean, we demonstrate that a speaker's perceptual space, as reflected in early auditory brain responses, is shaped not only by bottom-up analysis of the distribution of sounds in his language but also by more abstract analysis of the functional significance of those sounds.

  1. Hemispheric processing of vocal emblem sounds.

    Science.gov (United States)

    Neumann-Werth, Yael; Levy, Erika S; Obler, Loraine K

    2013-01-01

    Vocal emblems, such as shh and brr, are speech sounds that have linguistic and nonlinguistic features; thus, it is unclear how they are processed in the brain. Five adult dextral individuals with left-brain damage and moderate-severe Wernicke's aphasia, five adult dextral individuals with right-brain damage, and five Controls participated in two tasks: (1) matching vocal emblems to photographs ('picture task') and (2) matching vocal emblems to verbal translations ('phrase task'). Cross-group statistical analyses on items on which the Controls performed at ceiling revealed lower accuracy by the group with left-brain damage (than by Controls) on both tasks, and lower accuracy by the group with right-brain damage (than by Controls) on the picture task. Additionally, the group with left-brain damage performed significantly less accurately than the group with right-brain damage on the phrase task only. Findings suggest that comprehension of vocal emblems recruits more left- than right-hemisphere processing.

  2. Use of a Deep Recurrent Neural Network to Reduce Wind Noise: Effects on Judged Speech Intelligibility and Sound Quality

    Science.gov (United States)

    Keshavarzi, Mahmoud; Goehring, Tobias; Zakis, Justin; Turner, Richard E.; Moore, Brian C. J.

    2018-01-01

    Despite great advances in hearing-aid technology, users still experience problems with noise in windy environments. The potential benefits of using a deep recurrent neural network (RNN) for reducing wind noise were assessed. The RNN was trained using recordings of the output of the two microphones of a behind-the-ear hearing aid in response to male and female speech at various azimuths in the presence of noise produced by wind from various azimuths with a velocity of 3 m/s, using the “clean” speech as a reference. A paired-comparison procedure was used to compare all possible combinations of three conditions for subjective intelligibility and for sound quality or comfort. The conditions were unprocessed noisy speech, noisy speech processed using the RNN, and noisy speech that was high-pass filtered (which also reduced wind noise). Eighteen native English-speaking participants were tested, nine with normal hearing and nine with mild-to-moderate hearing impairment. Frequency-dependent linear amplification was provided for the latter. Processing using the RNN was significantly preferred over no processing by both subject groups for both subjective intelligibility and sound quality, although the magnitude of the preferences was small. High-pass filtering (HPF) was not significantly preferred over no processing. Although RNN was significantly preferred over HPF only for sound quality for the hearing-impaired participants, for the results as a whole, there was a preference for RNN over HPF. Overall, the results suggest that reduction of wind noise using an RNN is possible and might have beneficial effects when used in hearing aids. PMID:29708061

  3. Use of a Deep Recurrent Neural Network to Reduce Wind Noise: Effects on Judged Speech Intelligibility and Sound Quality.

    Science.gov (United States)

    Keshavarzi, Mahmoud; Goehring, Tobias; Zakis, Justin; Turner, Richard E; Moore, Brian C J

    2018-01-01

    Despite great advances in hearing-aid technology, users still experience problems with noise in windy environments. The potential benefits of using a deep recurrent neural network (RNN) for reducing wind noise were assessed. The RNN was trained using recordings of the output of the two microphones of a behind-the-ear hearing aid in response to male and female speech at various azimuths in the presence of noise produced by wind from various azimuths with a velocity of 3 m/s, using the "clean" speech as a reference. A paired-comparison procedure was used to compare all possible combinations of three conditions for subjective intelligibility and for sound quality or comfort. The conditions were unprocessed noisy speech, noisy speech processed using the RNN, and noisy speech that was high-pass filtered (which also reduced wind noise). Eighteen native English-speaking participants were tested, nine with normal hearing and nine with mild-to-moderate hearing impairment. Frequency-dependent linear amplification was provided for the latter. Processing using the RNN was significantly preferred over no processing by both subject groups for both subjective intelligibility and sound quality, although the magnitude of the preferences was small. High-pass filtering (HPF) was not significantly preferred over no processing. Although RNN was significantly preferred over HPF only for sound quality for the hearing-impaired participants, for the results as a whole, there was a preference for RNN over HPF. Overall, the results suggest that reduction of wind noise using an RNN is possible and might have beneficial effects when used in hearing aids.
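
    A heavily simplified sketch of the general approach the two records above describe: a recurrent network maps noisy two-microphone spectral features to a per-bin gain applied to the noisy magnitude spectrum, trained against clean speech. The architecture, sizes, feature choice, and loss below are assumptions for illustration; the paper's actual model and training data are not reproduced here.

```python
import torch
import torch.nn as nn

class WindNoiseRNN(nn.Module):
    """GRU that maps noisy log-spectra (2 mics) to a per-bin gain mask."""

    def __init__(self, n_bins=129):
        super().__init__()
        self.rnn = nn.GRU(input_size=2 * n_bins, hidden_size=256,
                          num_layers=2, batch_first=True)
        self.out = nn.Linear(256, n_bins)

    def forward(self, noisy_feats):
        h, _ = self.rnn(noisy_feats)        # (batch, frames, 256)
        return torch.sigmoid(self.out(h))   # gain in [0, 1] per frequency bin

model = WindNoiseRNN()
opt = torch.optim.Adam(model.parameters(), lr=1e-3)

# Hypothetical training batch: stacked 2-mic features, noisy and clean magnitudes.
feats = torch.randn(8, 100, 2 * 129)
noisy_mag = torch.rand(8, 100, 129)
clean_mag = torch.rand(8, 100, 129)

opt.zero_grad()
mask = model(feats)
loss = nn.functional.mse_loss(mask * noisy_mag, clean_mag)  # spectral MSE
loss.backward()
opt.step()
```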

  4. Lexical and phonological variability in preschool children with speech sound disorder.

    Science.gov (United States)

    Macrae, Toby; Tyler, Ann A; Lewis, Kerry E

    2014-02-01

    The authors of this study examined relationships between measures of word and speech error variability and between these and other speech and language measures in preschool children with speech sound disorder (SSD). In this correlational study, 18 preschool children with SSD, age-appropriate receptive vocabulary, and normal oral motor functioning and hearing were assessed across 2 sessions. Experimental measures included word and speech error variability, receptive vocabulary, nonword repetition (NWR), and expressive language. Pearson product–moment correlation coefficients were calculated among the experimental measures. The correlation between word and speech error variability was slight and nonsignificant. The correlation between word variability and receptive vocabulary was moderate and negative, although nonsignificant. High word variability was associated with small receptive vocabularies. The correlations between speech error variability and NWR and between speech error variability and the mean length of children's utterances were moderate and negative, although both were nonsignificant. High speech error variability was associated with poor NWR and language scores. High word variability may reflect unstable lexical representations, whereas high speech error variability may reflect indistinct phonological representations. Preschool children with SSD who show abnormally high levels of different types of speech variability may require slightly different approaches to intervention.

  5. Focal versus distributed temporal cortex activity for speech sound category assignment

    Science.gov (United States)

    Bouton, Sophie; Chambon, Valérian; Tyrand, Rémi; Seeck, Margitta; Karkar, Sami; van de Ville, Dimitri; Giraud, Anne-Lise

    2018-01-01

    Percepts and words can be decoded from distributed neural activity measures. However, the existence of widespread representations might conflict with the more classical notions of hierarchical processing and efficient coding, which are especially relevant in speech processing. Using fMRI and magnetoencephalography during syllable identification, we show that sensory and decisional activity colocalize to a restricted part of the posterior superior temporal gyrus (pSTG). Next, using intracortical recordings, we demonstrate that early and focal neural activity in this region distinguishes correct from incorrect decisions and can be machine-decoded to classify syllables. Crucially, significant machine decoding was possible from neuronal activity sampled across different regions of the temporal and frontal lobes, despite weak or absent sensory or decision-related responses. These findings show that speech-sound categorization relies on an efficient readout of focal pSTG neural activity, while more distributed activity patterns, although classifiable by machine learning, instead reflect collateral processes of sensory perception and decision. PMID:29363598
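
    The machine decoding referred to can be sketched as a cross-validated linear classifier over trial-wise neural activity. A hypothetical example with scikit-learn; the data, the /ba/-/ga/ labels, and the injected effect are all simulated.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score

rng = np.random.default_rng(2)

# Hypothetical data: 200 trials x 50 recording channels, 2 syllable labels.
X = rng.normal(size=(200, 50))
y = rng.integers(0, 2, size=200)   # 0 = /ba/, 1 = /ga/ (illustrative labels)
X[y == 1, :5] += 0.8               # inject a weak class difference

clf = LogisticRegression(max_iter=1000)
scores = cross_val_score(clf, X, y, cv=5)  # 5-fold cross-validated accuracy
print(f"decoding accuracy: {scores.mean():.2f} +/- {scores.std():.2f}")
```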

  6. Working memory span in Persian-speaking children with speech sound disorders and normal speech development.

    Science.gov (United States)

    Afshar, Mohamad Reza; Ghorbani, Ali; Rashedi, Vahid; Jalilevand, Nahid; Kamali, Mohamad

    2017-10-01

    The aim of this study was to compare working memory span in Persian-speaking preschool children with speech sound disorder (SSD) and their typically speaking peers. Additionally, the study aimed to examine Non-Word Repetition (NWR), Forward Digit Span (FDS) and Backward Digit Span (BDS) in four groups of children with varying severity levels of SSD. The participants in this study comprised 35 children with SSD and 35 typically developing (TD) children, matched for age and sex, as a control group. The participants were between the age range of 48 and 72 months. Two components of working memory, the phonological loop and the central executive, were compared between the two groups. We used two tasks (NWR and FDS) to assess the phonological loop component, and one task (BDS) to assess the central executive component. Percentage of consonants correct (PCC) was used to calculate the severity of SSD. Significant differences were observed between the two groups in all tasks that assess working memory (p < 0.05). The comparison of working memory between the various severity groups indicated significant differences between different severities in both the NWR and FDS tasks among the SSD children (p < 0.05), but not in BDS (p > 0.05). The results also showed that PCC scores in TD children were associated with NWR (p < 0.05). The working memory skills were weaker in SSD children, in comparison to TD children. In addition, children with varying levels of severity of SSD differed in terms of NWR and FDS, but not BDS. Copyright © 2017 Elsevier B.V. All rights reserved.

  7. Office noise: Can headphones and masking sound attenuate distraction by background speech?

    Science.gov (United States)

    Jahncke, Helena; Björkeholm, Patrik; Marsh, John E; Odelius, Johan; Sörqvist, Patrik

    2016-11-22

    Background speech is one of the most disturbing noise sources at shared workplaces in terms of both annoyance and performance-related disruption. Therefore, it is important to identify techniques that can efficiently protect performance against distraction. It is also important that the techniques are perceived as satisfactory and are subjectively evaluated as effective in their capacity to reduce distraction. The aim of the current study was to compare three methods of attenuating distraction from background speech: masking a background voice with nature sound through headphones, masking a background voice with other voices through headphones, and merely wearing headphones (without masking) as a way to attenuate the background sound. Quiet was deployed as a baseline condition. Thirty students participated in an experiment employing a repeated measures design. Performance (serial short-term memory) was impaired by background speech (1 voice), but this impairment was attenuated when the speech was masked, in particular when it was masked by nature sound. Furthermore, perceived workload was lowest in the quiet condition and significantly higher in all other sound conditions. Notably, the headphones tested as a sound-attenuating device (i.e. without masking) did not protect against the effects of background speech on performance and subjective workload. Nature sound was the only masking condition that worked as a protector of performance, at least in the context of the serial recall task. However, despite the attenuation of distraction by nature sound, perceived workload was still high, suggesting that it is difficult to find a masker that is both effective and perceived as satisfactory.

  8. Working memory in school-age children with and without a persistent speech sound disorder.

    Science.gov (United States)

    Farquharson, Kelly; Hogan, Tiffany P; Bernthal, John E

    2017-03-17

    The aim of this study was to explore the role of working memory processes as a possible cognitive underpinning of persistent speech sound disorders (SSD). Forty school-aged children were enrolled: 20 children with persistent SSD (P-SSD) and 20 typically developing children. Children participated in three working memory tasks, one targeting each of the components in Baddeley's working memory model: phonological loop, visuospatial sketchpad and central executive. Children with P-SSD performed poorly only on the phonological loop tasks compared to their typically developing age-matched peers. However, mediation analyses revealed that the relation between working memory and P-SSD was reliant upon nonverbal intelligence. These results suggest that co-morbid low-average nonverbal intelligence is linked to poor working memory in children with P-SSD. Theoretical and clinical implications are discussed.

  9. Conditioned sounds enhance visual processing.

    Directory of Open Access Journals (Sweden)

    Fabrizio Leo

    Full Text Available This psychophysics study investigated whether prior auditory conditioning influences how a sound interacts with visual perception. In the conditioning phase, subjects were presented with three pure tones (conditioned stimuli, CS) that were paired with positive, negative or neutral unconditioned stimuli. As unconditioned reinforcers we employed pictures (highly pleasant, unpleasant and neutral) or monetary outcomes (+50 euro cents, -50 cents, 0 cents). In the subsequent visual selective attention paradigm, subjects were presented with near-threshold Gabors displayed in their left or right hemifield. Critically, the Gabors were presented in synchrony with one of the conditioned sounds. Subjects discriminated whether the Gabors were presented in their left or right hemifields. Participants determined the location more accurately when the Gabors were presented in synchrony with positive relative to neutral sounds, irrespective of reinforcer type. Thus, previously rewarded relative to neutral sounds increased the bottom-up salience of the visual Gabors. Our results are the first demonstration that prior auditory conditioning is a potent mechanism to modulate the effect of sounds on visual perception.

  10. Transfer of Training between Music and Speech: Common Processing, Attention, and Memory

    OpenAIRE

    Besson, Mireille; Chobert, Julie; Marie, Céline

    2011-01-01

    After a brief historical perspective of the relationship between language and music, we review our work on transfer of training from music to speech that aimed at testing the general hypothesis that musicians should be more sensitive than non-musicians to speech sounds. In light of recent results in the literature, we argue that when long-term experience in one domain influences acoustic processing in the other domain, results can be interpreted as common acoustic processing. But when long-te...

  11. Tutorial and Guidelines on Measurement of Sound Pressure Level in Voice and Speech

    Science.gov (United States)

    Švec, Jan G.; Granqvist, Svante

    2018-01-01

    Purpose: Sound pressure level (SPL) measurement of voice and speech is often considered a trivial matter, but the measured levels are often reported incorrectly or incompletely, making them difficult to compare among various studies. This article aims at explaining the fundamental principles behind these measurements and providing guidelines to…
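
    The principle behind such measurements can be made concrete. SPL in decibels is 20·log10(p_rms/p_ref), with the reference pressure p_ref = 20 µPa in air. The sketch below assumes a waveform already calibrated to pascals, the kind of detail the tutorial argues must be reported:

        import numpy as np

        P_REF = 20e-6  # reference pressure in air: 20 micropascals

        def spl_db(pressure_pa: np.ndarray) -> float:
            """SPL = 20*log10(p_rms / p_ref) for a waveform in pascals."""
            p_rms = np.sqrt(np.mean(pressure_pa ** 2))
            return 20.0 * np.log10(p_rms / P_REF)

        # A 1 kHz tone with an RMS pressure of 1 Pa should give ~94 dB SPL.
        t = np.linspace(0.0, 1.0, 48000, endpoint=False)
        tone = np.sqrt(2.0) * np.sin(2.0 * np.pi * 1000.0 * t)  # RMS = 1.0 Pa
        print(round(spl_db(tone), 1))  # ~94.0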

  12. Inferior Frontal Sensitivity to Common Speech Sounds Is Amplified by Increasing Word Intelligibility

    Science.gov (United States)

    Vaden, Kenneth I., Jr.; Kuchinsky, Stefanie E.; Keren, Noam I.; Harris, Kelly C.; Ahlstrom, Jayne B.; Dubno, Judy R.; Eckert, Mark A.

    2011-01-01

    The left inferior frontal gyrus (LIFG) exhibits increased responsiveness when people listen to words composed of speech sounds that frequently co-occur in the English language (Vaden, Piquado, & Hickok, 2011), termed high phonotactic frequency (Vitevitch & Luce, 1998). The current experiment aimed to further characterize the relation of…

  13. A Survey of University Professors Teaching Speech Sound Disorders: Nonspeech Oral Motor Exercises and Other Topics

    Science.gov (United States)

    Watson, Maggie M.; Lof, Gregory L.

    2009-01-01

    Purpose: The purpose of this article was to obtain and organize information from instructors who teach course work on the subject of children's speech sound disorders (SSD) regarding their use of teaching resources, involvement in students' clinical practica, and intervention approaches presented to students. Instructors also reported if they…

  14. Estimation of sound pressure levels of voiced speech from skin vibration of the neck

    NARCIS (Netherlands)

    Svec, JG; Titze, IR; Popolo, PS

    How accurately can sound pressure levels (SPLs) of speech be estimated from skin vibration of the neck? Measurements using a small accelerometer were carried out in 27 subjects (10 males and 17 females) who read Rainbow and Marvin Williams passages in soft, comfortable, and loud voice, while skin

  15. Reading Skills of Students with Speech Sound Disorders at Three Stages of Literacy Development

    Science.gov (United States)

    Skebo, Crysten M.; Lewis, Barbara A.; Freebairn, Lisa A.; Tag, Jessica; Ciesla, Allison Avrich; Stein, Catherine M.

    2013-01-01

    Purpose: The relationship between phonological awareness, overall language, vocabulary, and nonlinguistic cognitive skills to decoding and reading comprehension was examined for students at 3 stages of literacy development (i.e., early elementary school, middle school, and high school). Students with histories of speech sound disorders (SSD) with…

  16. Speech-Sound Disorders and Attention-Deficit/Hyperactivity Disorder Symptoms

    Science.gov (United States)

    Lewis, Barbara A.; Short, Elizabeth J.; Iyengar, Sudha K.; Taylor, H. Gerry; Freebairn, Lisa; Tag, Jessica; Avrich, Allison A.; Stein, Catherine M.

    2012-01-01

    Purpose: The purpose of this study was to examine the association of speech-sound disorders (SSD) with symptoms of attention-deficit/hyperactivity disorder (ADHD) by the severity of the SSD and the mode of transmission of SSD within the pedigrees of children with SSD. Participants and Methods: The participants were 412 children who were enrolled…

  17. Early Intervening for Students with Speech Sound Disorders: Lessons from a School District

    Science.gov (United States)

    Mire, Stephen P.; Montgomery, Judy K.

    2009-01-01

    The concept of early intervening services was introduced into public school systems with the implementation of the Individuals With Disabilities Education Improvement Act (IDEA) of 2004. This article describes a program developed for students with speech sound disorders that incorporated concepts of early intervening services, response to…

  18. Nonspeech Oral Motor Treatment Issues Related to Children with Developmental Speech Sound Disorders

    Science.gov (United States)

    Ruscello, Dennis M.

    2008-01-01

    Purpose: This article examines nonspeech oral motor treatments (NSOMTs) in the population of clients with developmental speech sound disorders. NSOMTs are a collection of nonspeech methods and procedures that claim to influence tongue, lip, and jaw resting postures; increase strength; improve muscle tone; facilitate range of motion; and develop…

  19. Children with Speech Sound Disorders at School: Challenges for Children, Parents and Teachers

    Science.gov (United States)

    Daniel, Graham R.; McLeod, Sharynne

    2017-01-01

    Teachers play a major role in supporting children's educational, social, and emotional development, although they may be unprepared for supporting children with speech sound disorders. Interviews with 34 participants, including six focus children, their parents, siblings, friends, teachers and other significant adults in their lives, highlighted…

  20. Multisensory integration of speech sounds with letters vs. visual speech : only visual speech induces the mismatch negativity

    NARCIS (Netherlands)

    Stekelenburg, J.J.; Keetels, M.N.; Vroomen, J.H.M.

    2018-01-01

    Numerous studies have demonstrated that the vision of lip movements can alter the perception of auditory speech syllables (McGurk effect). While there is ample evidence for integration of text and auditory speech, there are only a few studies on the orthographic equivalent of the McGurk effect.

  1. Neural overlap in processing music and speech.

    Science.gov (United States)

    Peretz, Isabelle; Vuvan, Dominique; Lagrois, Marie-Élaine; Armony, Jorge L

    2015-03-19

    Neural overlap in processing music and speech, as measured by the co-activation of brain regions in neuroimaging studies, may suggest that parts of the neural circuitries established for language may have been recycled during evolution for musicality, or vice versa that musicality served as a springboard for language emergence. Such a perspective has important implications for several topics of general interest besides evolutionary origins. For instance, neural overlap is an important premise for the possibility of music training to influence language acquisition and literacy. However, neural overlap in processing music and speech does not entail sharing neural circuitries. Neural separability between music and speech may occur in overlapping brain regions. In this paper, we review the evidence and outline the issues faced in interpreting such neural data, and argue that converging evidence from several methodologies is needed before neural overlap is taken as evidence of sharing. © 2015 The Author(s) Published by the Royal Society. All rights reserved.

  2. Neural overlap in processing music and speech

    Science.gov (United States)

    Peretz, Isabelle; Vuvan, Dominique; Lagrois, Marie-Élaine; Armony, Jorge L.

    2015-01-01

    Neural overlap in processing music and speech, as measured by the co-activation of brain regions in neuroimaging studies, may suggest that parts of the neural circuitries established for language may have been recycled during evolution for musicality, or vice versa that musicality served as a springboard for language emergence. Such a perspective has important implications for several topics of general interest besides evolutionary origins. For instance, neural overlap is an important premise for the possibility of music training to influence language acquisition and literacy. However, neural overlap in processing music and speech does not entail sharing neural circuitries. Neural separability between music and speech may occur in overlapping brain regions. In this paper, we review the evidence and outline the issues faced in interpreting such neural data, and argue that converging evidence from several methodologies is needed before neural overlap is taken as evidence of sharing. PMID:25646513

  3. Binaural Processing of Multiple Sound Sources

    Science.gov (United States)

    2016-08-18

    AFRL-AFOSR-VA-TR-2016-0298: Binaural Processing of Multiple Sound Sources. Final performance report, William Yost, Arizona State University, 660 S Mill Ave Ste 312, Tempe, AZ 85281. Dates covered: 15 Jul 2012 to 14 Jul 2016. The report states that the three topics it covers are entirely within the scope of the AFOSR grant. Subject terms: binaural hearing, sound localization, interaural signal…

  4. Speech abilities in preschool children with speech sound disorder with and without co-occurring language impairment.

    Science.gov (United States)

    Macrae, Toby; Tyler, Ann A

    2014-10-01

    The authors compared preschool children with co-occurring speech sound disorder (SSD) and language impairment (LI) to children with SSD only in their numbers and types of speech sound errors. In this post hoc quasi-experimental study, independent-samples t tests were used to compare the groups on the standard scores from different tests of articulation/phonology, percent consonants correct, and the number of omission, substitution, distortion, typical, and atypical error patterns used in the production of different wordlists that had similar levels of phonetic and structural complexity. In comparison with children with SSD only, children with SSD and LI used similar numbers but different types of errors, including more omission patterns (p < .001, d = 1.55) and fewer distortion patterns (p = .022, d = 1.03). There were no significant differences in substitution, typical, and atypical error pattern use. Frequent omission error pattern use may reflect a more compromised linguistic system characterized by absent phonological representations for target sounds (see Shriberg et al., 2005). Research is required to examine the diagnostic potential of early frequent omission error pattern use in predicting later diagnoses of co-occurring SSD and LI and/or reading problems.
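
    The group comparisons above are independent-samples t tests with Cohen's d effect sizes. A sketch of that computation on invented error counts (scipy's ttest_ind; the pooled-SD form of d is assumed):

        import numpy as np
        from scipy import stats

        def cohens_d(a: np.ndarray, b: np.ndarray) -> float:
            """Cohen's d using the pooled standard deviation."""
            na, nb = len(a), len(b)
            pooled = ((na - 1) * a.var(ddof=1) + (nb - 1) * b.var(ddof=1)) / (na + nb - 2)
            return (a.mean() - b.mean()) / np.sqrt(pooled)

        # Invented omission-pattern counts, for illustration only.
        ssd_li = np.array([9.0, 11, 8, 12, 10, 9, 13, 11])
        ssd_only = np.array([4.0, 6, 5, 3, 5, 7, 4, 5])
        t_stat, p_val = stats.ttest_ind(ssd_li, ssd_only)
        print(f"t = {t_stat:.2f}, p = {p_val:.4f}, d = {cohens_d(ssd_li, ssd_only):.2f}")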

  5. When Does Speech Sound Disorder Matter for Literacy? The Role of Disordered Speech Errors, Co-Occurring Language Impairment and Family Risk of Dyslexia

    Science.gov (United States)

    Hayiou-Thomas, Marianna E.; Carroll, Julia M.; Leavett, Ruth; Hulme, Charles; Snowling, Margaret J.

    2017-01-01

    Background: This study considers the role of early speech difficulties in literacy development, in the context of additional risk factors. Method: Children were identified with speech sound disorder (SSD) at the age of 3½ years, on the basis of performance on the Diagnostic Evaluation of Articulation and Phonology. Their literacy skills were…

  6. CNTNAP2 Is Significantly Associated With Speech Sound Disorder in the Chinese Han Population.

    Science.gov (United States)

    Zhao, Yun-Jing; Wang, Yue-Ping; Yang, Wen-Zhu; Sun, Hong-Wei; Ma, Hong-Wei; Zhao, Ya-Ru

    2015-11-01

    Speech sound disorder is the most common communication disorder. Some investigations support the possibility that the CNTNAP2 gene might be involved in the pathogenesis of speech-related diseases. To investigate single-nucleotide polymorphisms in the CNTNAP2 gene, 300 unrelated speech sound disorder patients and 200 normal controls were included in the study. Five single-nucleotide polymorphisms were amplified and directly sequenced. Significant differences were found in the genotype (P = .0003) and allele (P = .0056) frequencies of rs2538976 between patients and controls. The excess frequency of the A allele in the patient group remained significant after Bonferroni correction (P = .0280). A significant haplotype association with rs2710102T/+rs17236239A/+rs2538976A/+rs2710117A (P = 4.10e-006) was identified. A neighboring single-nucleotide polymorphism, rs10608123, was found in complete linkage disequilibrium with rs2538976, and the genotypes exactly corresponded to each other. The authors propose that these CNTNAP2 variants increase the susceptibility to speech sound disorder. The single-nucleotide polymorphisms rs10608123 and rs2538976 may merge into one single-nucleotide polymorphism. © The Author(s) 2015.
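
    The reported allele-frequency tests reduce to contingency-table statistics with a Bonferroni correction across the SNPs tested. A sketch with invented counts (the study's actual counts are not given in this record):

        from scipy.stats import chi2_contingency

        # Invented A/G allele counts for one SNP (two alleles per subject).
        cases = [380, 220]      # 300 patients -> 600 alleles
        controls = [210, 190]   # 200 controls -> 400 alleles
        chi2, p, dof, _ = chi2_contingency([cases, controls])

        n_tests = 5  # five SNPs were sequenced
        p_bonf = min(1.0, p * n_tests)
        print(f"chi2 = {chi2:.2f}, raw p = {p:.4g}, Bonferroni-corrected p = {p_bonf:.4g}")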

  7. Language related differences of the sustained response evoked by natural speech sounds.

    Directory of Open Access Journals (Sweden)

    Christina Siu-Dschu Fan

    Full Text Available In tonal languages, such as Mandarin Chinese, the pitch contour of vowels discriminates lexical meaning, which is not the case in non-tonal languages such as German. Recent data provide evidence that pitch processing is influenced by language experience. However, there are still many open questions concerning the representation of such phonological and language-related differences at the level of the auditory cortex (AC). Using magnetoencephalography (MEG), we recorded transient and sustained auditory evoked fields (AEF) in native Chinese and German speakers to investigate language related phonological and semantic aspects in the processing of acoustic stimuli. AEF were elicited by spoken meaningful and meaningless syllables, by vowels, and by a French horn tone. Speech sounds were recorded from a native speaker and showed frequency-modulations according to the pitch-contours of Mandarin. The sustained field (SF) evoked by natural speech signals was significantly larger for Chinese than for German listeners. In contrast, the SF elicited by a horn tone was not significantly different between groups. Furthermore, the SF of Chinese subjects was larger when evoked by meaningful syllables compared to meaningless ones, but there was no significant difference regarding whether vowels were part of the Chinese phonological system or not. Moreover, the N100m gave subtle but clear evidence that for Chinese listeners other factors than purely physical properties play a role in processing meaningful signals. These findings show that the N100 and the SF generated in Heschl's gyrus are influenced by language experience, which suggests that AC activity related to specific pitch contours of vowels is influenced in a top-down fashion by higher, language related areas. Such interactions are in line with anatomical findings and neuroimaging data, as well as with the dual-stream model of language of Hickok and Poeppel that highlights the close and reciprocal interaction

  8. Language related differences of the sustained response evoked by natural speech sounds.

    Science.gov (United States)

    Fan, Christina Siu-Dschu; Zhu, Xingyu; Dosch, Hans Günter; von Stutterheim, Christiane; Rupp, André

    2017-01-01

    In tonal languages, such as Mandarin Chinese, the pitch contour of vowels discriminates lexical meaning, which is not the case in non-tonal languages such as German. Recent data provide evidence that pitch processing is influenced by language experience. However, there are still many open questions concerning the representation of such phonological and language-related differences at the level of the auditory cortex (AC). Using magnetoencephalography (MEG), we recorded transient and sustained auditory evoked fields (AEF) in native Chinese and German speakers to investigate language related phonological and semantic aspects in the processing of acoustic stimuli. AEF were elicited by spoken meaningful and meaningless syllables, by vowels, and by a French horn tone. Speech sounds were recorded from a native speaker and showed frequency-modulations according to the pitch-contours of Mandarin. The sustained field (SF) evoked by natural speech signals was significantly larger for Chinese than for German listeners. In contrast, the SF elicited by a horn tone was not significantly different between groups. Furthermore, the SF of Chinese subjects was larger when evoked by meaningful syllables compared to meaningless ones, but there was no significant difference regarding whether vowels were part of the Chinese phonological system or not. Moreover, the N100m gave subtle but clear evidence that for Chinese listeners other factors than purely physical properties play a role in processing meaningful signals. These findings show that the N100 and the SF generated in Heschl's gyrus are influenced by language experience, which suggests that AC activity related to specific pitch contours of vowels is influenced in a top-down fashion by higher, language related areas. Such interactions are in line with anatomical findings and neuroimaging data, as well as with the dual-stream model of language of Hickok and Poeppel that highlights the close and reciprocal interaction between

  9. Visual and Auditory Input in Second-Language Speech Processing

    Science.gov (United States)

    Hardison, Debra M.

    2010-01-01

    The majority of studies in second-language (L2) speech processing have involved unimodal (i.e., auditory) input; however, in many instances, speech communication involves both visual and auditory sources of information. Some researchers have argued that multimodal speech is the primary mode of speech perception (e.g., Rosenblum 2005). Research on…

  10. The Comorbidity between Attention-Deficit/Hyperactivity Disorder (ADHD) in Children and Arabic Speech Sound Disorder

    Directory of Open Access Journals (Sweden)

    Ruaa Osama Hariri

    2016-04-01

    Full Text Available Children with Attention-Deficit/Hyperactivity Disorder (ADHD) often have co-existing learning disabilities and developmental weaknesses or delays in some areas, including speech (Rief, 2005). Seeing that phonological disorders include articulation errors and other forms of speech disorders, studies pertaining to children with ADHD symptoms who demonstrate signs of phonological disorders in their native Arabic language are lacking. The purpose of this study is to provide a description of Arabic language deficits and to present a theoretical model of potential associations between phonological language deficits and ADHD. Dodd and McCormack's (1995) four-subgroup classification of speech disorder and the phonological disorders pertaining to the Arabic language provided by a Saudi Institute for Speech and Hearing are examined within the theoretical framework. Since intervention may improve articulation and focuses a child's attention on the sound structure of words, findings in this study are based on the assumption that children with ADHD may acquire phonology for their Arabic language in the same way, and following the same developmental stages, as intelligible children. Both quantitative and qualitative analyses have proven that the ADHD group analyzed in this study had indeed failed to acquire most of their Arabic consonants as they should have. Keywords: speech sound disorder, attention-deficiency/hyperactive, developmental disorder, phonological disorder, language disorder/delay, language impairment

  11. Current trends in multilingual speech processing

    Indian Academy of Sciences (India)

    In this paper, we describe recent work at Idiap Research Institute in the domain of multilingual speech processing and provide some insights into emerging ... and industry for technologies to help break down domestic and international language barriers, these also being barriers to the expansion of policy and commerce.

  12. The functional anatomy of speech perception: Dorsal and ventral processing pathways

    Science.gov (United States)

    Hickok, Gregory

    2003-04-01

    Drawing on recent developments in the cortical organization of vision, and on data from a variety of sources, Hickok and Poeppel (2000) have proposed a new model of the functional anatomy of speech perception. The model posits that early cortical stages of speech perception involve auditory fields in the superior temporal gyrus bilaterally (although asymmetrically). This cortical processing system then diverges into two broad processing streams, a ventral stream, involved in mapping sound onto meaning, and a dorsal stream, involved in mapping sound onto articulatory-based representations. The ventral stream projects ventrolaterally toward inferior posterior temporal cortex, which serves as an interface between sound and meaning. The dorsal stream projects dorsoposteriorly toward the parietal lobe and ultimately to frontal regions. This network provides a mechanism for the development and maintenance of "parity" between auditory and motor representations of speech. Although the dorsal stream represents a tight connection between speech perception and speech production, it is not a critical component of the speech perception process under ecologically natural listening conditions. Some degree of bi-directionality in both the dorsal and ventral pathways is also proposed. A variety of recent empirical tests of this model have provided further support for the proposal.

  13. Deficits in Sequential Processing Manifest in Motor and Linguistic Tasks in a Multigenerational Family with Childhood Apraxia of Speech

    Science.gov (United States)

    Peter, Beate; Button, Le; Stoel-Gammon, Carol; Chapman, Kathy; Raskind, Wendy H.

    2013-01-01

    The purpose of this study was to evaluate a global deficit in sequential processing as a candidate endophenotype in a family with familial childhood apraxia of speech (CAS). Of 10 adults and 13 children in a three-generational family with speech sound disorder (SSD) consistent with CAS, 3 adults and 6 children had past or present SSD diagnoses. Two…

  14. Neural Correlates of Early Sound Encoding and their Relationship to Speech-in-Noise Perception

    Directory of Open Access Journals (Sweden)

    Emily B. J. Coffey

    2017-08-01

    Full Text Available Speech-in-noise (SIN) perception is a complex cognitive skill that affects social, vocational, and educational activities. Poor SIN ability particularly affects young and elderly populations, yet varies considerably even among healthy young adults with normal hearing. Although SIN skills are known to be influenced by top-down processes that can selectively enhance lower-level sound representations, the complementary role of feed-forward mechanisms and their relationship to musical training is poorly understood. Using a paradigm that minimizes the main top-down factors that have been implicated in SIN performance, such as working memory, we aimed to better understand how robust encoding of periodicity in the auditory system (as measured by the frequency-following response, FFR) contributes to SIN perception. Using magnetoencephalography, we found that the strength of encoding at the fundamental frequency in the brainstem, thalamus, and cortex is correlated with SIN accuracy. The amplitude of the slower cortical P2 wave was previously also shown to be related to SIN accuracy and FFR strength; we use MEG source localization to show that the P2 wave originates in a temporal region anterior to that of the cortical FFR. We also confirm that the observed enhancements were related to the extent and timing of musicianship. These results are consistent with the hypothesis that basic feed-forward sound encoding affects SIN perception by providing better information to later processing stages, and that modifying this process may be one mechanism through which musical training might enhance the auditory networks that subserve both musical and language functions.
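
    "Strength of encoding at the fundamental frequency" is commonly quantified as the spectral amplitude of the averaged evoked response at the stimulus F0. A minimal sketch on a synthetic response (the 100 Hz F0 and sampling rate are nominal assumptions, not the study's values):

        import numpy as np

        def f0_encoding_strength(avg_response: np.ndarray, fs: float,
                                 f0: float, half_bw: float = 5.0) -> float:
            """Peak spectral amplitude within +/- half_bw Hz of the stimulus F0."""
            spectrum = np.abs(np.fft.rfft(avg_response)) / len(avg_response)
            freqs = np.fft.rfftfreq(len(avg_response), d=1.0 / fs)
            band = (freqs >= f0 - half_bw) & (freqs <= f0 + half_bw)
            return float(spectrum[band].max())

        fs, f0 = 2000.0, 100.0
        t = np.arange(0.0, 0.5, 1.0 / fs)
        response = 0.8 * np.sin(2 * np.pi * f0 * t) + 0.3 * np.random.randn(len(t))
        print(f0_encoding_strength(response, fs, f0))  # large relative to the noise floor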

  15. Neural Correlates of Indicators of Sound Change in Cantonese: Evidence from Cortical and Subcortical Processes.

    Science.gov (United States)

    Maggu, Akshay R; Liu, Fang; Antoniou, Mark; Wong, Patrick C M

    2016-01-01

    Across time, languages undergo changes in phonetic, syntactic, and semantic dimensions. Social, cognitive, and cultural factors contribute to sound change, a phenomenon in which the phonetics of a language undergo changes over time. Individuals who misperceive and produce speech in a slightly divergent manner (called innovators) contribute to variability in the society, eventually leading to sound change. However, the cause of variability in these individuals is still unknown. In this study, we examined whether such misperceptions are represented in neural processes of the auditory system. We investigated behavioral, subcortical (via FFR), and cortical (via P300) manifestations of sound change processing in Cantonese, a Chinese language in which several lexical tones are merging. Across the merging categories, we observed a similar gradation of speech perception abilities in both behavior and the brain (subcortical and cortical processes). Further, we also found that behavioral evidence of tone merging correlated with subjects' encoding at the subcortical and cortical levels. These findings indicate that tone-merger categories, which are indicators of sound change in Cantonese, are represented neurophysiologically with high fidelity. Using our results, we speculate that innovators encode speech in a slightly deviant neurophysiological manner, and thus produce speech divergently, which eventually spreads across the community and contributes to sound change.

  16. Audiovisual integration in speech perception: a multi-stage process

    DEFF Research Database (Denmark)

    Eskelund, Kasper; Tuomainen, Jyrki; Andersen, Tobias

    2011-01-01

    We investigate whether the integration of auditory and visual speech observed in these two audiovisual integration effects is specific to speech perception. We further ask whether audiovisual integration is undertaken in a single processing stage or in multiple processing stages…

  17. Multilingual Speech and Language Processing

    Science.gov (United States)

    2003-04-01

    Client software handles the user end of the transaction. Historically, four clients were provided: e-mail, web, FrameMaker, and command line. … a command-line client and an API. The API allows integration of CyberTrans into a number of processes, including word-processing packages (FrameMaker), … preservation and logging, and others. The available clients remain e-mail, Web and FrameMaker. Platforms include both Unix and PC for clients, with…

  18. Perceptions of The Seriousness of Mispronunciations of English Speech Sounds

    Directory of Open Access Journals (Sweden)

    Moedjito Moedjito

    2006-01-01

    Full Text Available The present study attempts to investigate Indonesian EFL teachers' and native English speakers' perceptions of mispronunciations of English sounds by Indonesian EFL learners. For this purpose, a paper-form questionnaire consisting of 32 target mispronunciations was distributed to Indonesian secondary school teachers of English and to native English speakers. An analysis of the respondents' perceptions revealed that 14 of the 32 target mispronunciations are pedagogically significant in pronunciation instruction. A further analysis of the reasons for these major mispronunciations reconfirmed the prevalence of interference from learners' native language in their English pronunciation as a major cause of mispronunciations. It also revealed Indonesian EFL teachers' tendency to overestimate the seriousness of their learners' mispronunciations. Based on these findings, the study makes suggestions for better English pronunciation teaching in Indonesia or other EFL countries.

  19. Cortical processes of speech illusions in the general population.

    Science.gov (United States)

    Schepers, E; Bodar, L; van Os, J; Lousberg, R

    2016-10-18

    There is evidence that experimentally elicited auditory illusions in the general population index risk for psychotic symptoms. As little is known about the underlying cortical mechanisms of auditory illusions, an experiment was conducted to analyze the processing of auditory illusions in a general population sample. In a follow-up design with two measurement moments (baseline and 6 months), participants (n = 83) underwent the White Noise task with simultaneous recording from a 14-lead EEG. An auditory illusion was defined as hearing any speech in a sound fragment containing white noise. A total of 256 speech illusions (SIs) were observed over the two measurements, with a high degree of stability of SIs over time. There were 7 main effects of speech illusion on the EEG alpha band, the most significant indicating a decrease in activity at T3 (t = -4.05). Other EEG frequency bands (slow beta, fast beta, gamma, delta, theta) showed no significant associations with SIs. SIs are characterized by reduced alpha activity in non-clinical populations. Given the association of SIs with psychosis, follow-up research is required to examine the possibility of reduced alpha activity mediating SIs in high-risk and symptomatic populations.
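
    The alpha-band findings rest on band-limited power estimates per electrode. A sketch with a Butterworth band-pass (the 8-12 Hz band and filter order are conventional choices, not taken from the paper):

        import numpy as np
        from scipy.signal import butter, filtfilt

        def alpha_power(eeg: np.ndarray, fs: float, band=(8.0, 12.0)) -> float:
            """Mean power of an EEG trace after band-passing to the alpha band."""
            b, a = butter(4, [band[0] / (fs / 2), band[1] / (fs / 2)], btype="band")
            return float(np.mean(filtfilt(b, a, eeg) ** 2))

        fs = 250.0  # assumed sampling rate
        t = np.arange(0.0, 2.0, 1.0 / fs)
        trace = np.sin(2 * np.pi * 10.0 * t) + 0.5 * np.random.randn(len(t))
        print(alpha_power(trace, fs))  # dominated by the 10 Hz component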

  20. Current trends in multilingual speech processing

    Indian Academy of Sciences (India)

    2016-08-26

    Keywords: speech-to-speech translation; language identification. … The field is currently receiving renewed interest owing to two strong driving forces. Firstly, technical advances in speech recognition and synthesis are posing new challenges and opportunities to researchers.

  1. Oral and Hand Movement Speeds Are Associated with Expressive Language Ability in Children with Speech Sound Disorder

    Science.gov (United States)

    Peter, Beate

    2012-01-01

    This study tested the hypothesis that children with speech sound disorder have generalized slowed motor speeds. It evaluated associations among oral and hand motor speeds and measures of speech (articulation and phonology) and language (receptive vocabulary, sentence comprehension, sentence imitation), in 11 children with moderate to severe SSD…

  2. How Should Children with Speech Sound Disorders be Classified? A Review and Critical Evaluation of Current Classification Systems

    Science.gov (United States)

    Waring, R.; Knight, R.

    2013-01-01

    Background: Children with speech sound disorders (SSD) form a heterogeneous group who differ in terms of the severity of their condition, underlying cause, speech errors, involvement of other aspects of the linguistic system and treatment response. To date there is no universal and agreed-upon classification system. Instead, a number of…

  3. The phonological memory profile of preschool children who make atypical speech sound errors.

    Science.gov (United States)

    Waring, Rebecca; Eadie, Patricia; Rickard Liow, Susan; Dodd, Barbara

    2018-01-01

    Previous research indicates that children with speech sound disorders (SSD) have underlying phonological memory deficits. The SSD population, however, is diverse. While children who make consistent atypical speech errors (phonological disorder/PhDis) are known to have executive function deficits in rule abstraction and cognitive flexibility, little is known about their memory profile. Sixteen monolingual preschool children with atypical speech errors (PhDis) were individually matched for age and gender to peers with typically developing speech (TDS). The two groups were compared on forward recall of familiar words (pointing response), reverse recall of familiar words (pointing response), reverse recall of digits (spoken response), and a receptive vocabulary task. There were no differences between children with TDS and children with PhDis on the forward recall or vocabulary tasks. However, children with TDS significantly outperformed children with PhDis on the two reverse recall tasks. Findings suggest that atypical speech errors are associated with impaired phonological working memory, implicating executive function impairment in specific subtypes of SSD.

  4. Auditory and visual sustained attention in children with speech sound disorder.

    Directory of Open Access Journals (Sweden)

    Cristina F B Murphy

    Full Text Available Although research has demonstrated that children with specific language impairment (SLI) and reading disorder (RD) exhibit sustained attention deficits, no study has investigated sustained attention in children with speech sound disorder (SSD). Given the overlap of symptoms, such as phonological memory deficits, between these different language disorders (i.e., SLI, SSD and RD) and the relationships between working memory, attention and language processing, it is worthwhile to investigate whether deficits in sustained attention also occur in children with SSD. A total of 55 children, 18 diagnosed with SSD (8.11 ± 1.231 years) and 37 typically developing children (8.76 ± 1.461 years), were invited to participate in this study. Auditory and visual sustained-attention tasks were applied. Children with SSD performed worse on these tasks; they committed a greater number of auditory false alarms and exhibited a significant decline in performance over the course of the auditory detection task. The extent to which performance is related to auditory perceptual difficulties and probable working memory deficits is discussed. Further studies are needed to better understand the specific nature of these deficits and their clinical implications.
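
    Sustained-attention data of this kind (hits on targets, false alarms on non-targets) are conventionally summarized with signal-detection measures. A sketch of d' with the log-linear correction for extreme rates; the paper's exact analysis is not specified in this record, and the counts below are invented:

        from scipy.stats import norm

        def d_prime(hits: int, misses: int, fas: int, crs: int) -> float:
            """d' = z(hit rate) - z(false-alarm rate), log-linear corrected."""
            hit_rate = (hits + 0.5) / (hits + misses + 1.0)
            fa_rate = (fas + 0.5) / (fas + crs + 1.0)
            return norm.ppf(hit_rate) - norm.ppf(fa_rate)

        print(round(d_prime(hits=38, misses=12, fas=9, crs=141), 2))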

  5. Children with dyslexia show a reduced processing benefit from bimodal speech information compared to their typically developing peers.

    Science.gov (United States)

    Schaadt, Gesa; van der Meer, Elke; Pannekamp, Ann; Oberecker, Regine; Männel, Claudia

    2018-01-17

    During information processing, individuals benefit from bimodally presented input, as has been demonstrated for speech perception (i.e., printed letters and speech sounds) or the perception of emotional expressions (i.e., facial expression and voice tuning). While typically developing individuals show this bimodal benefit, school children with dyslexia do not. Currently, it is unknown whether the bimodal processing deficit in dyslexia also occurs for visual-auditory speech processing that is independent of reading and spelling acquisition (i.e., no letter-sound knowledge is required). Here, we tested school children with and without spelling problems on their bimodal perception of video-recorded mouth movements pronouncing syllables. We analyzed the event-related potential Mismatch Response (MMR) to visual-auditory speech information and compared this response to the MMR to monomodal speech information (i.e., auditory-only, visual-only). We found a reduced MMR with later onset to visual-auditory speech information in children with spelling problems compared to children without spelling problems. Moreover, when comparing bimodal and monomodal speech perception, we found that children without spelling problems showed significantly larger responses in the visual-auditory experiment compared to the visual-only response, whereas children with spelling problems did not. Our results suggest that children with dyslexia exhibit general difficulties in bimodal speech perception independently of letter-speech sound knowledge, as apparent in altered bimodal speech perception and lacking benefit from bimodal information. This general deficit in children with dyslexia may underlie the previously reported reduced bimodal benefit for letter-speech sound combinations and similar findings in emotion perception. Copyright © 2018 Elsevier Ltd. All rights reserved.

  6. IEP goals for school-age children with speech sound disorders.

    Science.gov (United States)

    Farquharson, Kelly; Tambyraja, Sherine R; Justice, Laura M; Redle, Erin E

    2014-01-01

    The purpose of the current study was to describe the current state of practice for writing Individualized Education Program (IEP) goals for children with speech sound disorders (SSDs). IEP goals for 146 children receiving services for SSDs within public school systems across two states were coded for their dominant theoretical framework and overall quality. A dichotomous scheme was used for theoretical framework coding: cognitive-linguistic or sensory-motor. Goal quality was determined by examining 7 specific indicators outlined by an empirically tested rating tool. In total, 147 long-term and 490 short-term goals were coded. The results revealed no dominant theoretical framework for long-term goals, whereas short-term goals largely reflected a sensory-motor framework. In terms of quality, the majority of speech production goals were functional and generalizable in nature, but were not able to be easily targeted during common daily tasks or by other members of the IEP team. Short-term goals were consistently rated higher in quality domains when compared to long-term goals. The current state of practice for writing IEP goals for children with SSDs indicates that theoretical framework may be eclectic in nature and likely written to support the individual needs of children with speech sound disorders. Further investigation is warranted to determine the relations between goal quality and child outcomes. (1) Identify two predominant theoretical frameworks and discuss how they apply to IEP goal writing. (2) Discuss quality indicators as they relate to IEP goals for children with speech sound disorders. (3) Discuss the relationship between long-term goals level of quality and related theoretical frameworks. (4) Identify the areas in which business-as-usual IEP goals exhibit strong quality.

  7. The abstract representations in speech processing.

    Science.gov (United States)

    Cutler, Anne

    2008-11-01

    Speech processing by human listeners derives meaning from acoustic input via intermediate steps involving abstract representations of what has been heard. Recent results from several lines of research are here brought together to shed light on the nature and role of these representations. In spoken-word recognition, representations of phonological form and of conceptual content are dissociable. This follows from the independence of patterns of priming for a word's form and its meaning. The nature of the phonological-form representations is determined not only by acoustic-phonetic input but also by other sources of information, including metalinguistic knowledge. This follows from evidence that listeners can store two forms as different without showing any evidence of being able to detect the difference in question when they listen to speech. The lexical representations are in turn separate from prelexical representations, which are also abstract in nature. This follows from evidence that perceptual learning about speaker-specific phoneme realization, induced on the basis of a few words, generalizes across the whole lexicon to inform the recognition of all words containing the same phoneme. The efficiency of human speech processing has its basis in the rapid execution of operations over abstract representations.

  8. Modeling auditory processing and speech perception in hearing-impaired listeners

    DEFF Research Database (Denmark)

    Jepsen, Morten Løve

    A better understanding of how the human auditory system represents and analyzes sounds, and how hearing impairment affects such processing, is of great interest for researchers in the fields of auditory neuroscience, audiology, and speech communication, as well as for applications in hearing-instrument and speech technology. In this thesis, the primary focus was on the development and evaluation of a computational model of human auditory signal-processing and perception. The model was initially designed to simulate the normal-hearing auditory system, with particular focus on the nonlinear processing… The framework was constructed such that discrimination errors originating from the front-end and the back-end were separated. The front-end was fitted to individual listeners with cochlear hearing loss according to non-speech data, and speech data were obtained in the same listeners in a diagnostic rhyme test.
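
    To give a flavour of what such a front-end involves, the sketch below builds a single gammatone channel with the Glasberg and Moore ERB bandwidth; this is a generic textbook component, not necessarily the specific front-end developed in the thesis:

        import numpy as np

        def erb(fc: float) -> float:
            """Equivalent rectangular bandwidth in Hz (Glasberg & Moore, 1990)."""
            return 24.7 * (4.37 * fc / 1000.0 + 1.0)

        def gammatone_ir(fc: float, fs: float, dur: float = 0.05, order: int = 4) -> np.ndarray:
            """Impulse response of an order-4 gammatone filter centred at fc."""
            t = np.arange(0.0, dur, 1.0 / fs)
            g = (t ** (order - 1) * np.exp(-2 * np.pi * 1.019 * erb(fc) * t)
                 * np.cos(2 * np.pi * fc * t))
            return g / np.max(np.abs(g))

        fs = 16000.0
        x = np.random.randn(int(0.2 * fs))                   # stand-in input signal
        channel = np.convolve(x, gammatone_ir(1000.0, fs), mode="same")
        print(channel.shape)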

  9. An exploratory study of the influence of load and practice on segmental and articulatory variability in children with speech sound disorders.

    Science.gov (United States)

    Vuolo, Janet; Goffman, Lisa

    2017-01-01

    This exploratory treatment study used phonetic transcription and speech kinematics to examine changes in segmental and articulatory variability. Nine children, ages 4 to 8 years old, served as participants, including two with childhood apraxia of speech (CAS), five with speech sound disorder (SSD) and two who were typically developing. Children practised producing agent + action phrases in an imitation task (low linguistic load) and a retrieval task (high linguistic load) over five sessions. In the imitation task in session one, both participants with CAS showed high degrees of segmental and articulatory variability. After five sessions, imitation practice resulted in increased articulatory variability for five participants. Retrieval practice resulted in decreased articulatory variability in three participants with SSD. These results suggest that short-term speech production practice in rote imitation disrupts articulatory control in children with and without CAS. In contrast, tasks that require linguistic processing may scaffold learning for children with SSD but not CAS.

  10. A sparse neural code for some speech sounds but not for others.

    Directory of Open Access Journals (Sweden)

    Mathias Scharinger

    Full Text Available The precise neural mechanisms underlying speech sound representations are still a matter of debate. Proponents of 'sparse representations' assume that on the level of speech sounds, only contrastive or otherwise not predictable information is stored in long-term memory. Here, in a passive oddball paradigm, we challenge the neural foundations of such a 'sparse' representation; we use words that differ only in their penultimate consonant ("coronal" [t] vs. "dorsal" [k] place of articulation) and, for example, distinguish between the German nouns Latz ([lats]; bib) and Lachs ([laks]; salmon). Changes from standard [t] to deviant [k] and vice versa elicited a discernible Mismatch Negativity (MMN) response. Crucially, however, the MMN for the deviant [lats] was stronger than the MMN for the deviant [laks]. Source localization showed this difference to be due to enhanced brain activity in right superior temporal cortex. These findings reflect a difference in phonological 'sparsity': coronal [t] segments, but not dorsal [k] segments, are based on more sparse representations and elicit less specific neural predictions; sensory deviations from this prediction are more readily 'tolerated' and accordingly trigger weaker MMNs. The results foster the neurocomputational reality of 'representationally sparse' models of speech perception that are compatible with more general predictive mechanisms in auditory perception.

  11. Intervention for bilingual speech sound disorders: A case study of an isiXhosa-English-speaking child.

    Science.gov (United States)

    Rossouw, Kate; Pascoe, Michelle

    2018-03-19

    Bilingualism is common in South Africa, with many children acquiring isiXhosa as a home language and learning English from a young age in nursery or crèche. IsiXhosa is a local language, part of the Bantu language family, widely spoken in the country. Aims: To describe changes in a bilingual child's speech following intervention based on a theoretically motivated and tailored intervention plan. Methods and procedures: This study describes a female isiXhosa-English bilingual child, named Gcobisa (pseudonym) (chronological age 4 years and 2 months), with a speech sound disorder. Gcobisa's speech was assessed and her difficulties categorised according to Dodd's (2005) diagnostic framework. From this, intervention was planned and the language of intervention was selected. Following intervention, Gcobisa's speech was reassessed. Outcomes and results: Gcobisa's speech was categorised as a consistent phonological delay, as she presented with gliding of /l/ in both English and isiXhosa, cluster reduction in English and several other age appropriate phonological processes. She was provided with 16 sessions of intervention using a minimal pairs approach, targeting the phonological process of gliding of /l/, which was not considered age appropriate for Gcobisa in isiXhosa when compared to the small set of normative data regarding monolingual isiXhosa development. As a result, the targets and stimuli were in isiXhosa while the main language of instruction was English. This reflects the language mismatch often faced by speech-language therapists in South Africa. Gcobisa showed evidence of generalising the target phoneme to English words. Conclusions and implications: The data have theoretical implications regarding bilingual development of isiXhosa-English, as it highlights the ways bilingual development may differ from the monolingual development of this language pair. It adds to the small set of intervention studies investigating the changes in the speech of bilingual

  12. Intervention for bilingual speech sound disorders: A case study of an isiXhosa–English-speaking child

    Directory of Open Access Journals (Sweden)

    Kate Rossouw

    2018-03-01

    Full Text Available Background: Bilingualism is common in South Africa, with many children acquiring isiXhosa as a home language and learning English from a young age in nursery or crèche. IsiXhosa is a local language, part of the Bantu language family, widely spoken in the country. Aims: To describe changes in a bilingual child's speech following intervention based on a theoretically motivated and tailored intervention plan. Methods and procedures: This study describes a female isiXhosa–English bilingual child, named Gcobisa (pseudonym) (chronological age 4 years and 2 months), with a speech sound disorder. Gcobisa's speech was assessed and her difficulties categorised according to Dodd's (2005) diagnostic framework. From this, intervention was planned and the language of intervention was selected. Following intervention, Gcobisa's speech was reassessed. Outcomes and results: Gcobisa's speech was categorised as a consistent phonological delay, as she presented with gliding of /l/ in both English and isiXhosa, cluster reduction in English and several other age appropriate phonological processes. She was provided with 16 sessions of intervention using a minimal pairs approach, targeting the phonological process of gliding of /l/, which was not considered age appropriate for Gcobisa in isiXhosa when compared to the small set of normative data regarding monolingual isiXhosa development. As a result, the targets and stimuli were in isiXhosa while the main language of instruction was English. This reflects the language mismatch often faced by speech-language therapists in South Africa. Gcobisa showed evidence of generalising the target phoneme to English words. Conclusions and implications: The data have theoretical implications regarding bilingual development of isiXhosa–English, as it highlights the ways bilingual development may differ from the monolingual development of this language pair. It adds to the small set of intervention studies

  13. Reverberation impairs brainstem temporal representations of voiced vowel sounds: challenging periodicity-tagged segregation of competing speech in rooms

    Directory of Open Access Journals (Sweden)

    Mark eSayles

    2015-01-01

    Full Text Available The auditory system typically processes information from concurrently active sound sources (e.g., two voices speaking at once) in the presence of multiple delayed, attenuated and distorted sound-wave reflections (reverberation). Brainstem circuits help segregate these complex acoustic mixtures into auditory objects. Psychophysical studies demonstrate a strong interaction between reverberation and fundamental-frequency (F0) modulation, leading to impaired segregation of competing vowels when segregation is on the basis of F0 differences. Neurophysiological studies of complex-sound segregation have concentrated on sounds with steady F0s, in anechoic environments. However, F0 modulation and reverberation are quasi-ubiquitous. We examine the ability of 129 single units in the ventral cochlear nucleus of the anesthetized guinea pig to segregate the concurrent synthetic vowel sounds /a/ and /i/, based on temporal discharge patterns under closed-field conditions. We address the effects of added real-room reverberation, F0 modulation, and the interaction of these two factors, on brainstem neural segregation of voiced speech sounds. A firing-rate representation of single vowels' spectral envelopes is robust to the combination of F0 modulation and reverberation: local firing-rate maxima and minima across the tonotopic array code vowel-formant structure. However, single-vowel F0-related periodicity information in shuffled inter-spike interval distributions is significantly degraded in the combined presence of reverberation and F0 modulation. Hence, segregation of double vowels' spectral energy into two streams (corresponding to the two vowels), on the basis of temporal discharge patterns, is impaired by reverberation, specifically when F0 is modulated. All unit types (primary-like, chopper, onset) are similarly affected. These results offer neurophysiological insights into the perceptual organization of complex acoustic scenes under realistically challenging
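
    The shuffled inter-spike interval analysis can be sketched as a histogram of cross-trial spike-time differences, whose peak at the F0 period indexes periodicity coding. Everything below (spike times, jitter, the 125 Hz F0) is synthetic:

        import numpy as np

        def shuffled_intervals(trains, max_interval=0.02):
            """All-order intervals between spikes from different trials (SAC-style)."""
            out = []
            for i, a in enumerate(trains):
                for j, b in enumerate(trains):
                    if i == j:
                        continue  # cross-trial pairs only
                    d = np.abs(a[:, None] - b[None, :])
                    out.append(d[d <= max_interval])
            return np.concatenate(out)

        rng = np.random.default_rng(0)
        period = 0.008  # 125 Hz F0
        trains = [np.sort(np.arange(0.0, 0.5, period)
                          + 0.0005 * rng.standard_normal(63)) for _ in range(10)]
        hist, edges = np.histogram(shuffled_intervals(trains), bins=100)
        centers = (edges[:-1] + edges[1:]) / 2
        mask = centers > 0.002  # skip the zero-lag peak
        print(centers[mask][np.argmax(hist[mask])])  # ~0.008 s, the F0 period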

  14. Learning foreign sounds in an alien world: videogame training improves non-native speech categorization.

    Science.gov (United States)

    Lim, Sung-joo; Holt, Lori L

    2011-01-01

    Although speech categories are defined by multiple acoustic dimensions, some are perceptually weighted more than others, and there are residual effects of native-language weightings in non-native speech perception. Recent research on nonlinguistic sound category learning suggests that the distribution characteristics of experienced sounds influence perceptual cue weights: increasing variability across a dimension leads listeners to rely upon it less in subsequent category learning (Holt & Lotto, 2006). The present experiment investigated the implications of this among native Japanese speakers learning English /r/-/l/ categories. Training was accomplished using a videogame paradigm that emphasizes associations among sound categories, visual information, and players' responses to videogame characters rather than overt categorization or explicit feedback. Subjects who played the game for 2.5 h across 5 days exhibited improvements in /r/-/l/ perception on par with 2-4 weeks of explicit categorization training in previous research, and exhibited a shift toward more native-like perceptual cue weights. Copyright © 2011 Cognitive Science Society, Inc.

  15. Audiovisual speech perception development at varying levels of perceptual processing.

    Science.gov (United States)

    Lalonde, Kaylah; Holt, Rachael Frush

    2016-04-01

    This study used the auditory evaluation framework [Erber (1982). Auditory Training (Alexander Graham Bell Association, Washington, DC)] to characterize the influence of visual speech on audiovisual (AV) speech perception in adults and children at multiple levels of perceptual processing. Six- to eight-year-old children and adults completed auditory and AV speech perception tasks at three levels of perceptual processing (detection, discrimination, and recognition). The tasks differed in the level of perceptual processing required to complete them. Adults and children demonstrated visual speech influence at all levels of perceptual processing. Whereas children demonstrated the same visual speech influence at each level of perceptual processing, adults demonstrated greater visual speech influence on tasks requiring higher levels of perceptual processing. These results support previous research demonstrating multiple mechanisms of AV speech processing (general perceptual and speech-specific mechanisms) with independent maturational time courses. The results suggest that adults rely on both general perceptual mechanisms that apply to all levels of perceptual processing and speech-specific mechanisms that apply when making phonetic decisions and/or accessing the lexicon. Six- to eight-year-old children seem to rely only on general perceptual mechanisms across levels. As expected, developmental differences in AV benefit on this and other recognition tasks likely reflect immature speech-specific mechanisms and phonetic processing in children.

  16. Interdependent processing and encoding of speech and concurrent background noise.

    Science.gov (United States)

    Cooper, Angela; Brouwer, Susanne; Bradlow, Ann R

    2015-05-01

    Speech processing can often take place in adverse listening conditions that involve the mixing of speech and background noise. In this study, we investigated processing dependencies between background noise and indexical speech features, using a speeded classification paradigm (Garner, 1974; Exp. 1), and whether background noise is encoded and represented in memory for spoken words in a continuous recognition memory paradigm (Exp. 2). Whether or not the noise spectrally overlapped with the speech signal was also manipulated. The results of Experiment 1 indicated that background noise and indexical features of speech (gender, talker identity) cannot be completely segregated during processing, even when the two auditory streams are spectrally nonoverlapping. Perceptual interference was asymmetric, whereby irrelevant indexical feature variation in the speech signal slowed noise classification to a greater extent than irrelevant noise variation slowed speech classification. This asymmetry may stem from the fact that speech features have greater functional relevance to listeners, and are thus more difficult to selectively ignore than background noise. Experiment 2 revealed that a recognition cost for words embedded in different types of background noise on the first and second occurrences only emerged when the noise and the speech signal were spectrally overlapping. Together, these data suggest integral processing of speech and background noise, modulated by the level of processing and the spectral separation of the speech and noise.

  17. Visual Speech Fills in Both Discrimination and Identification of Non-Intact Auditory Speech in Children

    Science.gov (United States)

    Jerger, Susan; Damian, Markus F.; McAlpine, Rachel P.; Abdi, Herve

    2018-01-01

    To communicate, children must discriminate and identify speech sounds. Because visual speech plays an important role in this process, we explored how visual speech influences phoneme discrimination and identification by children. Critical items had intact visual speech (e.g. baez) coupled to non-intact (excised onsets) auditory speech (signified…

  18. Persistent Thalamic Sound Processing Despite Profound Cochlear Denervation

    Directory of Open Access Journals (Sweden)

    Anna R. Chambers

    2016-08-01

    Neurons at higher stages of sensory processing can partially compensate for a sudden drop in input from the periphery through a homeostatic plasticity process that increases the gain on weak afferent inputs. Even after a profound unilateral auditory neuropathy, in which > 95% of the synapses between auditory nerve fibers and inner hair cells have been eliminated with ouabain, central gain can restore the cortical processing and perceptual detection of basic sounds delivered to the denervated ear. In this model of profound auditory neuropathy, cortical processing and perception recover despite the absence of an auditory brainstem response (ABR) or brainstem acoustic reflexes, and despite only a partial recovery of sound processing at the level of the inferior colliculus (IC), an auditory midbrain nucleus. In this study, we induced a profound cochlear neuropathy with ouabain and asked whether central gain enabled a compensatory plasticity in the auditory thalamus comparable to the full recovery of function previously observed in the auditory cortex (ACtx), the partial recovery observed in the IC, or something different entirely. Unilateral ouabain treatment in adult mice effectively eliminated the ABR, yet robust sound-evoked activity persisted in a minority of units recorded from the contralateral medial geniculate body (MGB) of awake mice. Sound-driven MGB units could decode moderate- and high-intensity sounds with accuracies comparable to those of sham-treated control mice, but low-intensity classification was near chance. Pure-tone receptive fields and synchronization to broadband pulse trains also persisted, albeit with significantly reduced quality and precision, respectively. MGB decoding of temporally modulated pulse trains and speech tokens was greatly impaired in ouabain-treated mice. Taken together, the absence of an ABR belied persistent auditory processing at the level of the MGB that was likely enabled through increased central gain. Compensatory …
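    As an aside for readers unfamiliar with this kind of decoding analysis, the sketch below shows one common approach: nearest-centroid classification of trial spike counts with leave-one-out cross-validation. It is a hedged illustration on synthetic data with hypothetical variable names, not the study's actual pipeline.

```python
import numpy as np

def decode_level_accuracy(spike_counts, labels):
    """Leave-one-out nearest-centroid decoding of sound level.

    spike_counts : (n_trials, n_units) array of spike counts
    labels       : (n_trials,) array of sound-level labels
    Returns the fraction of held-out trials assigned the correct label.
    """
    n_trials = len(labels)
    correct = 0
    for i in range(n_trials):
        train = np.ones(n_trials, dtype=bool)
        train[i] = False
        classes = np.unique(labels[train])
        # Mean population response template per sound level (test trial excluded)
        centroids = np.array([spike_counts[train & (labels == c)].mean(axis=0)
                              for c in classes])
        dists = np.linalg.norm(centroids - spike_counts[i], axis=1)
        if classes[np.argmin(dists)] == labels[i]:
            correct += 1
    return correct / n_trials

# Synthetic demo: 3 hypothetical sound levels, 20 trials each, 12 units
rng = np.random.default_rng(0)
labels = np.repeat([30, 60, 90], 20)        # dB SPL, made up for illustration
rates = np.array([2.0, 5.0, 9.0])           # mean counts grow with level
spikes = rng.poisson(np.outer(np.repeat(rates, 20), np.ones(12)))
print(decode_level_accuracy(spikes, labels))
```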

  19. The influence of masker type on early reflection processing and speech intelligibility (L)

    DEFF Research Database (Denmark)

    Arweiler, Iris; Buchholz, Jörg M.; Dau, Torsten

    2013-01-01

    Arweiler and Buchholz [J. Acoust. Soc. Am. 130, 996-1005 (2011)] showed that, while the energy of early reflections (ERs) in a room improves speech intelligibility, the benefit is smaller than that provided by the energy of the direct sound (DS). In terms of integration of ERs and DS, binaural … listening did not provide a benefit from ERs apart from a binaural energy summation, such that monaural auditory processing could account for the data. However, a diffuse speech-shaped noise (SSN) was used in the speech intelligibility experiments, which does not provide distinct binaural cues … to the auditory system. In the present study, the monaural and binaural benefit from ERs for speech intelligibility was investigated using three directional maskers presented from 90° azimuth: a SSN, a multi-talker babble, and a reversed two-talker masker. For normal-hearing as well as hearing-impaired listeners …

  20. A systematic review and classification of interventions for speech-sound disorder in preschool children.

    Science.gov (United States)

    Wren, Yvonne; Harding, Sam; Goldbart, Juliet; Roulstone, Sue

    2018-05-01

    Multiple interventions have been developed to address speech sound disorder (SSD) in children. Many of these have been evaluated but the evidence for these has not been considered within a model which categorizes types of intervention. The opportunity to carry out a systematic review of interventions for SSD arose as part of a larger scale study of interventions for primary speech and language impairment in preschool children. To review systematically the evidence for interventions for SSD in preschool children and to categorize them within a classification of interventions for SSD. Relevant search terms were used to identify intervention studies published up to 2012, with the following inclusion criteria: participants were aged between 2 years and 5 years, 11 months; they exhibited speech, language and communication needs; and a primary outcome measure of speech was used. Studies that met inclusion criteria were quality appraised using the single case experimental design (SCED) or PEDro-P, depending on their methodology. Those judged to be high quality were classified according to the primary focus of intervention. The final review included 26 studies. Case series was the most common research design. Categorization to the classification system for interventions showed that cognitive-linguistic and production approaches to intervention were the most frequently reported. The highest graded evidence was for three studies within the auditory-perceptual and integrated categories. The evidence for intervention for preschool children with SSD is focused on seven out of 11 subcategories of interventions. Although all the studies included in the review were good quality as defined by quality appraisal checklists, they mostly represented lower-graded evidence. Higher-graded studies are needed to understand clearly the strength of evidence for different interventions. © 2018 Royal College of Speech and Language Therapists.

  1. Speech and audio processing for coding, enhancement and recognition

    CERN Document Server

    Togneri, Roberto; Narasimha, Madihally

    2015-01-01

    This book describes the basic principles underlying the generation, coding, transmission and enhancement of speech and audio signals, including advanced statistical and machine learning techniques for speech and speaker recognition, with an overview of the key innovations in these areas. Key research undertaken in speech coding, speech enhancement, speech recognition, emotion recognition and speaker diarization is also presented, along with recent advances and new paradigms in these areas. · Offers readers a single-source reference on the significant applications of speech and audio processing to speech coding, speech enhancement and speech/speaker recognition; · Enables readers involved in algorithm development and implementation issues for speech coding to understand the historical development and future challenges in speech coding research; · Discusses speech coding methods yielding bit-streams that are multi-rate and scalable for Voice-over-IP (VoIP) networks; …

  2. Modular and Adaptive Control of Sound Processing

    Science.gov (United States)

    van Nort, Douglas

    This dissertation presents research into the creation of systems for the control of sound synthesis and processing. The focus differs from much of the work related to digital musical instrument design, which has rightly concentrated on the physicality of the instrument and interface: sensor design, choice of controller, feedback to performer, and so on. Oftentimes a particular choice of sound processing is made, and the resultant parameters from the physical interface are conditioned and mapped to the available sound parameters in an exploratory fashion. The main goal of the work presented here is to demonstrate the importance of the space that lies between physical interface design and the choice of sound manipulation algorithm, and to present a new framework for instrument design that strongly considers this essential part of the design process. In particular, this research takes the viewpoint that instrument designs should be considered in a musical control context, and that both control and sound dynamics must be considered in tandem. In order to achieve this holistic approach, the work presented in this dissertation assumes complementary points of view. Instrument design is first seen as a function of musical context, focusing on electroacoustic music and leading to a view of gesture that relates perceived musical intent to the dynamics of an instrumental system. The important design concept of mapping is then discussed from a theoretical and conceptual point of view, relating perceptual, systems- and mathematically-oriented ways of examining the subject. This theoretical framework gives rise to a mapping design space, functional analysis of pertinent existing literature, implementations of mapping tools, instrumental control designs, and several perceptual studies that explore the influence of mapping structure. Each of these reflects a high-level approach in which control structures are imposed on top of a high-dimensional space of control and sound synthesis …

  3. When speaker identity is unavoidable: Neural processing of speaker identity cues in natural speech.

    Science.gov (United States)

    Tuninetti, Alba; Chládková, Kateřina; Peter, Varghese; Schiller, Niels O; Escudero, Paola

    2017-11-01

    Speech sound acoustic properties vary widely across speakers and accents. When perceiving speech, adult listeners normally disregard non-linguistic variation caused by speaker or accent differences in order to comprehend the linguistic message, e.g. to correctly identify a speech sound or a word. Here we tested whether the process of normalizing speaker and accent differences, facilitating the recognition of linguistic information, is found at the level of neural processing, and whether it is modulated by the listeners' native language. In a multi-deviant oddball paradigm, native and nonnative speakers of Dutch were exposed to naturally-produced Dutch vowels varying in speaker, sex, accent, and phoneme identity. Unexpectedly, the analysis of mismatch negativity (MMN) amplitudes elicited by each type of change shows a large degree of early perceptual sensitivity to non-linguistic cues. This finding on perception of naturally-produced stimuli contrasts with previous studies examining the perception of synthetic stimuli, wherein adult listeners automatically disregard acoustic cues to speaker identity. The present finding bears relevance to speech normalization theories, suggesting that at an unattended level of processing, listeners are indeed sensitive to changes in fundamental frequency in natural speech tokens. Copyright © 2017 Elsevier Inc. All rights reserved.

  4. Learning-induced neural plasticity of speech processing before birth.

    Science.gov (United States)

    Partanen, Eino; Kujala, Teija; Näätänen, Risto; Liitola, Auli; Sambeth, Anke; Huotilainen, Minna

    2013-09-10

    Learning, the foundation of adaptive and intelligent behavior, is based on plastic changes in neural assemblies, reflected by the modulation of electric brain responses. In infancy, auditory learning implicates the formation and strengthening of neural long-term memory traces, improving discrimination skills, in particular those forming the prerequisites for speech perception and understanding. Although previous behavioral observations show that newborns react differentially to unfamiliar sounds vs. familiar sound material that they were exposed to as fetuses, the neural basis of fetal learning has not thus far been investigated. Here we demonstrate direct neural correlates of human fetal learning of speech-like auditory stimuli. We presented variants of words to fetuses; unlike infants with no exposure to these stimuli, the exposed fetuses showed enhanced brain activity (mismatch responses) in response to pitch changes for the trained variants after birth. Furthermore, a significant correlation existed between the amount of prenatal exposure and brain activity, with greater activity being associated with a higher amount of prenatal speech exposure. Moreover, the learning effect was generalized to other types of similar speech sounds not included in the training material. Consequently, our results indicate neural commitment specifically tuned to the speech features heard before birth and their memory representations.

  5. A comparison of sound quality judgments for monaural and binaural hearing aid processed stimuli.

    Science.gov (United States)

    Balfour, P B; Hawkins, D B

    1992-10-01

    Fifteen adults with bilaterally symmetrical mild and/or moderate sensorineural hearing loss completed a paired-comparison task designed to elicit sound quality preference judgments for monaural/binaural hearing aid processed signals. Three stimuli (speech-in-quiet, speech-in-noise, and music) were recorded separately in three listening environments (audiometric test booth, living room, and a music/lecture hall) through hearing aids placed on a Knowles Electronics Manikin for Acoustics Research. Judgments were made on eight separate sound quality dimensions (brightness, clarity, fullness, loudness, nearness, overall impression, smoothness, and spaciousness) for each of the three stimuli in three listening environments. Results revealed a distinct binaural preference for all eight sound quality dimensions independent of listening environment. Binaural preferences were strongest for overall impression, fullness, and spaciousness. Stimulus type effect was significant only for fullness and spaciousness, where binaural preferences were strongest for speech-in-quiet. After binaural preference data were obtained, subjects ranked each sound quality dimension with respect to its importance for binaural listening relative to monaural. Clarity was ranked highest in importance and brightness was ranked least important. The key to demonstration of improved binaural hearing aid sound quality may be the use of a paired-comparison format.

  6. Cortical processing of dynamic sound envelope transitions.

    Science.gov (United States)

    Zhou, Yi; Wang, Xiaoqin

    2010-12-08

    Slow envelope fluctuations in the range of 2-20 Hz provide important segmental cues for processing communication sounds. For a successful segmentation, a neural processor must capture envelope features associated with the rise and fall of signal energy, a process that is often challenged by the interference of background noise. This study investigated the neural representations of slowly varying envelopes in quiet and in background noise in the primary auditory cortex (A1) of awake marmoset monkeys. We characterized envelope features based on the local average and rate of change of sound level in envelope waveforms and identified envelope features to which neurons were selective by reverse correlation. Our results showed that envelope feature selectivity of A1 neurons was correlated with the degree of nonmonotonicity in their static rate-level functions. Nonmonotonic neurons exhibited greater feature selectivity than monotonic neurons in quiet and in background noise. The diverse envelope feature selectivity decreased spike-timing correlation among A1 neurons in response to the same envelope waveforms. As a result, the variability, but not the average, of the ensemble responses of A1 neurons represented more faithfully the dynamic transitions in low-frequency sound envelopes both in quiet and in background noise.
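    The envelope features described here (the local average of sound level and its rate of change) and the reverse-correlation step can be illustrated with a short sketch. Everything below is an assumed, minimal implementation on synthetic data: spike-triggered averaging of the feature preceding each spike is a standard reverse-correlation technique, but the window lengths, signals, and spike times are hypothetical.

```python
import numpy as np

def envelope_features(env_db, fs, win_s=0.05):
    """Local average and rate of change of sound level in an envelope.

    env_db : envelope waveform expressed in dB
    fs     : envelope sampling rate (Hz)
    """
    win = int(win_s * fs)
    kernel = np.ones(win) / win
    local_mean = np.convolve(env_db, kernel, mode="same")   # average level
    rate = np.gradient(local_mean, 1.0 / fs)                # dB per second
    return local_mean, rate

def spike_triggered_average(feature, spike_times, fs, lag_s=0.1):
    """Reverse correlation: average the feature segment preceding each spike."""
    lag = int(lag_s * fs)
    segs = [feature[t - lag:t] for t in spike_times if t >= lag]
    return np.mean(segs, axis=0)

# Hypothetical 2-Hz modulated envelope sampled at 1 kHz
fs = 1000
t = np.arange(0, 5, 1 / fs)
env_db = 60 + 10 * np.sin(2 * np.pi * 2 * t)
level, rate = envelope_features(env_db, fs)

# Fake spikes that prefer rising envelope portions
spikes = np.where(rate > 40)[0][::50]
print(spike_triggered_average(level, spikes, fs).shape)   # (100,)
```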

  7. Quality of Mobile Phone and Tablet Mobile Apps for Speech Sound Disorders: Protocol for an Evidence-Based Appraisal.

    Science.gov (United States)

    Furlong, Lisa M; Morris, Meg E; Erickson, Shane; Serry, Tanya A

    2016-11-29

    Although mobile apps are readily available for speech sound disorders (SSD), their validity has not been systematically evaluated. This evidence-based appraisal will critically review and synthesize current evidence on available therapy apps for use by children with SSD. The main aims are (1) to identify the types of apps currently available for Android and iOS mobile phones and tablets, and (2) to critique their design features and content using a structured quality appraisal tool. This protocol paper presents and justifies the methods used for a systematic review of mobile apps that provide intervention for use by children with SSD. The primary outcomes of interest are (1) engagement, (2) functionality, (3) aesthetics, (4) information quality, (5) subjective quality, and (6) perceived impact. Quality will be assessed by 2 certified practicing speech-language pathologists using a structured quality appraisal tool. The app stores of the 2 largest operating platforms, Android and iOS, will be searched. Systematic methods of knowledge synthesis shall include searching the app stores using a defined procedure, data extraction, and quality analysis. This search strategy shall enable us to determine how many SSD apps are available for Android- and iOS-compatible mobile phones and tablets. It shall also identify the regions of the world responsible for the apps' development, their content, and the quality of the offerings. Recommendations will be made for speech-language pathologists seeking to use mobile apps in their clinical practice. This protocol provides a structured process for locating apps and appraising their quality, as the basis for evaluating their use in speech pathology for children in English-speaking nations. ©Lisa M Furlong, Meg E Morris, Shane Erickson, Tanya A Serry. Originally published in JMIR Research Protocols (http://www.researchprotocols.org), 29.11.2016.

  8. Psychometric characteristics of single-word tests of children's speech sound production.

    Science.gov (United States)

    Flipsen, Peter; Ogiela, Diane A

    2015-04-01

    Our understanding of test construction has improved since the now-classic review by McCauley and Swisher (1984). The current review article examines the psychometric characteristics of current single-word tests of speech sound production in an attempt to determine whether our tests have improved since then. It also provides a resource that clinicians may use to help them make test selection decisions for their particular client populations. Ten tests published since 1990 were reviewed to determine whether they met the 10 criteria set out by McCauley and Swisher (1984), as well as 7 additional criteria. All of the tests reviewed met at least 3 of McCauley and Swisher's (1984) original criteria, and 9 of 10 tests met at least 5 of them. Most of the tests met some of the additional criteria as well. The state of the art for single-word tests of speech sound production in children appears to have improved in the last 30 years. There remains, however, room for improvement.

  9. Acoustic analyses of speech sounds and rhythms in Japanese- and English-learning infants

    Directory of Open Access Journals (Sweden)

    Yuko eYamashita

    2013-02-01

    The purpose of this study was to explore developmental changes, in terms of spectral fluctuations and temporal periodicity, in Japanese- and English-learning infants. Three age groups (15, 20, and 24 months) were selected, because infants diversify their phonetic inventories with age. Natural speech of the infants was recorded. We utilized a critical-band-filter bank, which simulated the frequency resolution of the adult auditory periphery. First, the correlations between the critical-band outputs, represented by factor analysis, were examined in order to see how the critical bands should be connected to each other if a listener is to differentiate sounds in infants' speech. In the following analysis, we analyzed the temporal fluctuations of the factor scores by calculating autocorrelations. The present analysis identified, at 24 months of age in both linguistic environments, the three factors observed in adult speech. These three factors were shifted to a higher frequency range, corresponding to the smaller vocal tract size of the infants. The results suggest that the vocal tract structures of the infants had developed to an adult-like configuration by 24 months of age in both language environments. The amount of utterances with a periodic nature over shorter time scales increased with age in both environments. This trend was clearer in the Japanese environment.
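    A rough sketch of this analysis chain follows, assuming Butterworth band-pass filters as the critical-band bank and PCA as a stand-in for factor analysis (the study's exact filter shapes, band edges, and factor method are not specified here, so all of those choices are illustrative).

```python
import numpy as np
from scipy.signal import butter, sosfiltfilt, hilbert

def critical_band_envelopes(x, fs, edges):
    """Band envelopes from a bank of band-pass filters (coarse Bark-like edges)."""
    envs = []
    for lo, hi in zip(edges[:-1], edges[1:]):
        sos = butter(4, [lo, hi], btype="bandpass", fs=fs, output="sos")
        envs.append(np.abs(hilbert(sosfiltfilt(sos, x))))
    return np.array(envs)                      # (n_bands, n_samples)

def factor_scores(envs, n_factors=3):
    """PCA of the band-correlation structure, as a stand-in for factor analysis."""
    z = (envs - envs.mean(1, keepdims=True)) / envs.std(1, keepdims=True)
    corr = np.corrcoef(z)
    vals, vecs = np.linalg.eigh(corr)
    top = vecs[:, np.argsort(vals)[::-1][:n_factors]]
    return top.T @ z                           # (n_factors, n_samples)

def autocorr(x, max_lag):
    """Normalized autocorrelation of a factor-score time series."""
    x = x - x.mean()
    full = np.correlate(x, x, mode="full")
    mid = len(full) // 2
    return full[mid:mid + max_lag] / full[mid]

# Hypothetical demo signal: a modulated tone plus a second tone
fs = 16000
t = np.arange(0, 1.0, 1 / fs)
x = np.sin(2*np.pi*500*t) * (1 + np.sin(2*np.pi*3*t)) + 0.5*np.sin(2*np.pi*2000*t)
edges = [100, 400, 800, 1600, 3200, 6400]      # made-up band edges (Hz)
scores = factor_scores(critical_band_envelopes(x, fs, edges))
print(autocorr(scores[0], 200)[:5])
```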

  10. Audiovisual speech perception development at varying levels of perceptual processing

    OpenAIRE

    Lalonde, Kaylah; Holt, Rachael Frush

    2016-01-01

    This study used the auditory evaluation framework [Erber (1982). Auditory Training (Alexander Graham Bell Association, Washington, DC)] to characterize the influence of visual speech on audiovisual (AV) speech perception in adults and children at multiple levels of perceptual processing. Six- to eight-year-old children and adults completed auditory and AV speech perception tasks at three levels of perceptual processing (detection, discrimination, and recognition). The tasks differed in the le...

  11. Engaged listeners: shared neural processing of powerful political speeches.

    Science.gov (United States)

    Schmälzle, Ralf; Häcker, Frank E K; Honey, Christopher J; Hasson, Uri

    2015-08-01

    Powerful speeches can captivate audiences, whereas weaker speeches fail to engage their listeners. What is happening in the brains of a captivated audience? Here, we assess audience-wide functional brain dynamics during listening to speeches of varying rhetorical quality. The speeches were given by German politicians and evaluated as rhetorically powerful or weak. Listening to each of the speeches induced similar neural response time courses, as measured by inter-subject correlation analysis, in widespread brain regions involved in spoken language processing. Crucially, alignment of the time course across listeners was stronger for rhetorically powerful speeches, especially for bilateral regions of the superior temporal gyri and medial prefrontal cortex. Thus, during powerful speeches, listeners as a group are more coupled to each other, suggesting that powerful speeches are more potent in taking control of the listeners' brain responses. Weaker speeches were processed more heterogeneously, although they still prompted substantially correlated responses. These patterns of coupled neural responses bear resemblance to metaphors of resonance, which are often invoked in discussions of speech impact, and contribute to the literature on auditory attention under natural circumstances. Overall, this approach opens up possibilities for research on the neural mechanisms mediating the reception of entertaining or persuasive messages. © The Author (2015). Published by Oxford University Press. For Permissions, please email: journals.permissions@oup.com.
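    Inter-subject correlation analysis itself is simple to express: for a given brain region, correlate every pair of listeners' response time courses and average over pairs. A minimal sketch on synthetic data follows; the coupling strengths and time-course lengths are hypothetical, chosen only to mimic the contrast between strongly and weakly coupled audiences.

```python
import numpy as np

def inter_subject_correlation(responses):
    """Mean pairwise Pearson correlation of listeners' response time courses.

    responses : (n_subjects, n_timepoints) array, one row per listener,
                e.g. the BOLD time course of one brain region.
    """
    n = responses.shape[0]
    r = np.corrcoef(responses)                 # (n_subjects, n_subjects)
    upper = r[np.triu_indices(n, k=1)]         # unique subject pairs
    return upper.mean()

# Hypothetical demo: a shared "speech-driven" component plus subject noise
rng = np.random.default_rng(1)
shared = rng.standard_normal(300)
powerful = shared + 0.5 * rng.standard_normal((20, 300))   # strongly coupled
weak = shared + 2.0 * rng.standard_normal((20, 300))       # weakly coupled
print(inter_subject_correlation(powerful))   # higher ISC
print(inter_subject_correlation(weak))       # lower ISC
```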

  12. A randomized controlled trial on the beneficial effects of training letter-speech sound integration on reading fluency in children with dyslexia

    NARCIS (Netherlands)

    Fraga González, G.; Žarić, G.; Tijms, J.; Bonte, M.; Blomert, L.; van der Molen, M.W.

    2015-01-01

    A recent account of dyslexia assumes that a failure to develop automated letter-speech sound integration might be responsible for the observed lack of reading fluency. This study uses a pre-test-training-post-test design to evaluate the effects of a training program based on letter-speech sound …

  13. Facial Speech Gestures: The Relation between Visual Speech Processing, Phonological Awareness, and Developmental Dyslexia in 10-Year-Olds

    Science.gov (United States)

    Schaadt, Gesa; Männel, Claudia; van der Meer, Elke; Pannekamp, Ann; Friederici, Angela D.

    2016-01-01

    Successful communication in everyday life crucially involves the processing of auditory and visual components of speech. Viewing our interlocutor and processing visual components of speech facilitates speech processing by triggering auditory processing. Auditory phoneme processing, analyzed by event-related brain potentials (ERP), has been shown…

  14. A Diagnostic Marker to Discriminate Childhood Apraxia of Speech from Speech Delay: III. Theoretical Coherence of the Pause Marker with Speech Processing Deficits in Childhood Apraxia of Speech

    Science.gov (United States)

    Shriberg, Lawrence D.; Strand, Edythe A.; Fourakis, Marios; Jakielski, Kathy J.; Hall, Sheryl D.; Karlsson, Heather B.; Mabie, Heather L.; McSweeny, Jane L.; Tilkens, Christie M.; Wilson, David L.

    2017-01-01

    Purpose: Previous articles in this supplement described rationale for and development of the pause marker (PM), a diagnostic marker of childhood apraxia of speech (CAS), and studies supporting its validity and reliability. The present article assesses the theoretical coherence of the PM with speech processing deficits in CAS. Method: PM and other…

  15. Associations among Measures of Sequential Processing in Motor and Linguistics Tasks in Adults with and without a Family History of Childhood Apraxia of Speech: A Replication Study

    Science.gov (United States)

    Button, Le; Peter, Beate; Stoel-Gammon, Carol; Raskind, Wendy H.

    2013-01-01

    The purpose of this study was to address the hypothesis that childhood apraxia of speech (CAS) is influenced by an underlying deficit in sequential processing that is also expressed in other modalities. In a sample of 21 adults from five multigenerational families, 11 with histories of various familial speech sound disorders, 3 biologically…

  16. Time course of the influence of musical expertise on the processing of vocal and musical sounds.

    Science.gov (United States)

    Rigoulot, S; Pell, M D; Armony, J L

    2015-04-02

    Previous functional magnetic resonance imaging (fMRI) studies have suggested that different cerebral regions preferentially process human voice and music. Yet, little is known about the temporal course of the brain processes that decode the category of sounds and how expertise in one sound category can impact these processes. To address this question, we recorded the electroencephalogram (EEG) of 15 musicians and 18 non-musicians while they listened to short musical excerpts (piano and violin) and vocal stimuli (speech and non-linguistic vocalizations). The task of the participants was to detect noise targets embedded within the stream of sounds. Event-related potentials revealed an early differentiation of sound category, within the first 100 ms after the onset of the sound, with mostly increased responses to musical sounds. Importantly, this effect was modulated by the musical background of the participants, as musicians were more responsive to music sounds than non-musicians, consistent with the notion that musical training increases sensitivity to music. In late temporal windows, brain responses were enhanced in response to vocal stimuli, but musicians were still more responsive to music. These results shed new light on the temporal course of the neural dynamics of auditory processing and reveal how it is impacted by the stimulus category and the expertise of participants. Copyright © 2015 IBRO. Published by Elsevier Ltd. All rights reserved.

  17. Evidence for the treatment of co-occurring stuttering and speech sound disorder: A clinical case series.

    Science.gov (United States)

    Unicomb, Rachael; Hewat, Sally; Spencer, Elizabeth; Harrison, Elisabeth

    2017-06-01

    There is a paucity of evidence to guide treatment for children with co-occurring stuttering and speech sound disorder. Some guidelines suggest treating the two disorders simultaneously using indirect treatment approaches; however, the research supporting these recommendations is over 20 years old. In this clinical case series, we investigated whether these co-occurring disorders could be treated concurrently using direct treatment approaches supported by up-to-date, high-level evidence, and whether this could be done in an efficacious, safe and efficient manner. Five pre-school-aged participants received individual, concurrent, direct intervention for both stuttering and speech sound disorder. All participants used the Lidcombe Program, as manualised. Direct treatment for speech sound disorder was individualised based on analysis of each child's sound system. At 12 months post commencement of treatment, all except one participant had completed the Lidcombe Program and exhibited less than 1.0% syllables stuttered on samples gathered within and beyond the clinic. These four participants completed Stage 1 of the Lidcombe Program in between 14 and 22 clinic visits, consistent with current benchmark data for this programme. At the same assessment point, all five participants exhibited significant increases in percentage of consonants correct and were in alignment with age-expected estimates of this measure. Further, they were treated in an average number of clinic visits that compares favourably with other research on treatment for speech sound disorder. These preliminary results indicate that young children with co-occurring stuttering and speech sound disorder may be treated concurrently using direct treatment approaches. This method of service delivery may have implications for cost and time efficiency and may also address the crucial need for early intervention in both disorders. These positive findings highlight the need for further research in the area and contribute to …

  18. The sound of feelings: electrophysiological responses to emotional speech in alexithymia.

    Directory of Open Access Journals (Sweden)

    Katharina Sophia Goerlich

    Alexithymia is a personality trait characterized by difficulties in the cognitive processing of emotions (cognitive dimension) and in the experience of emotions (affective dimension). Previous research focused mainly on visual emotional processing in the cognitive alexithymia dimension. We investigated the impact of both alexithymia dimensions on electrophysiological responses to emotional speech in 60 female subjects. During unattended processing, subjects watched a movie while an emotional prosody oddball paradigm was presented in the background. During attended processing, subjects detected deviants in emotional prosody. The cognitive alexithymia dimension was associated with a left-hemisphere bias during early stages of unattended emotional speech processing, and with generally reduced amplitudes of the late P3 component during attended processing. In contrast, the affective dimension did not modulate unattended emotional prosody perception, but was associated with reduced P3 amplitudes during attended processing, particularly for emotional prosody spoken with high intensity. Our results provide evidence for a dissociable impact of the two alexithymia dimensions on electrophysiological responses during the attended and unattended processing of emotional prosody. The observed electrophysiological modulations are indicative of a reduced sensitivity to the emotional qualities of speech, which may be a contributing factor to problems in interpersonal communication associated with alexithymia.

  19. Learning to Produce Syllabic Speech Sounds via Reward-Modulated Neural Plasticity

    Science.gov (United States)

    Warlaumont, Anne S.; Finnegan, Megan K.

    2016-01-01

    At around 7 months of age, human infants begin to reliably produce well-formed syllables containing both consonants and vowels, a behavior called canonical babbling. Over subsequent months, the frequency of canonical babbling continues to increase. How the infant’s nervous system supports the acquisition of this ability is unknown. Here we present a computational model that combines a spiking neural network, reinforcement-modulated spike-timing-dependent plasticity, and a human-like vocal tract to simulate the acquisition of canonical babbling. Like human infants, the model’s frequency of canonical babbling gradually increases. The model is rewarded when it produces a sound that is more auditorily salient than sounds it has previously produced. This is consistent with data from human infants indicating that contingent adult responses shape infant behavior and with data from deaf and tracheostomized infants indicating that hearing, including hearing one’s own vocalizations, is critical for canonical babbling development. Reward receipt increases the level of dopamine in the neural network. The neural network contains a reservoir with recurrent connections and two motor neuron groups, one agonist and one antagonist, which control the masseter and orbicularis oris muscles, promoting or inhibiting mouth closure. The model learns to increase the number of salient, syllabic sounds it produces by adjusting the base level of muscle activation and increasing their range of activity. Our results support the possibility that through dopamine-modulated spike-timing-dependent plasticity, the motor cortex learns to harness its natural oscillations in activity in order to produce syllabic sounds. It thus suggests that learning to produce rhythmic mouth movements for speech production may be supported by general cortical learning mechanisms. The model makes several testable predictions and has implications for our understanding not only of how syllabic vocalizations develop …
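    The core learning rule, reward-modulated spike-timing-dependent plasticity, can be sketched compactly: STDP pairings accumulate on a decaying eligibility trace per synapse, and a reward (dopamine) signal gates when that trace is committed to the weights. The sketch below is a generic illustration with made-up spike rates and time constants, not the paper's spiking-reservoir model.

```python
import numpy as np

# Minimal reward-modulated STDP sketch (illustrative constants throughout):
# pre/post spike pairings build an eligibility trace on each synapse, and
# the weight change is applied only when a reward signal arrives.
rng = np.random.default_rng(2)
n_syn, n_steps, dt = 50, 1000, 0.001
tau_elig, a_plus, a_minus = 0.5, 0.01, 0.012
w = np.full(n_syn, 0.5)          # synaptic weights
elig = np.zeros(n_syn)           # eligibility traces
pre_trace = np.zeros(n_syn)      # recent presynaptic activity
post_trace = 0.0                 # recent postsynaptic activity

for step in range(n_steps):
    pre = rng.random(n_syn) < 0.02            # presynaptic spikes
    post = rng.random() < 0.05                # postsynaptic spike
    pre_trace = pre_trace * np.exp(-dt / 0.02) + pre
    post_trace = post_trace * np.exp(-dt / 0.02) + post
    # STDP: pre-before-post potentiates, post-before-pre depresses
    stdp = a_plus * pre_trace * post - a_minus * pre * post_trace
    elig = elig * np.exp(-dt / tau_elig) + stdp
    reward = 1.0 if step % 200 == 199 else 0.0   # e.g. a salient vocalization
    w = np.clip(w + reward * elig, 0.0, 1.0)     # dopamine gates the update

print(w.mean())
```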

  20. Visual speech information: a help or hindrance in perceptual processing of dysarthric speech.

    Science.gov (United States)

    Borrie, Stephanie A

    2015-03-01

    This study investigated the influence of visual speech information on perceptual processing of neurologically degraded speech. Fifty listeners identified spastic dysarthric speech under both audio (A) and audiovisual (AV) conditions. Condition comparisons revealed that the addition of visual speech information enhanced processing of the neurologically degraded input in terms of (a) acuity (percent phonemes correct) of vowels and consonants and (b) recognition (percent words correct) of predictive and nonpredictive phrases. Listeners exploited stress-based segmentation strategies more readily in AV conditions, suggesting that the perceptual benefit associated with adding visual speech information to the auditory signal (the AV advantage) has both segmental and suprasegmental origins. Results also revealed that the magnitude of the AV advantage can be predicted, to some degree, by the extent to which an individual utilizes syllabic stress cues to inform word recognition in AV conditions. Findings inform the development of a listener-specific model of speech perception that applies to processing of dysarthric speech in everyday communication contexts.

  1. Acquired Apraxia of Speech: The Effects of Repeated Practice and Rate/Rhythm Control Treatments on Sound Production Accuracy

    Science.gov (United States)

    Wambaugh, Julie L.; Nessler, Christina; Cameron, Rosalea; Mauszycki, Shannon C.

    2012-01-01

    Purpose: This investigation was designed to elucidate the effects of repeated practice treatment on sound production accuracy in individuals with apraxia of speech (AOS) and aphasia. A secondary purpose was to determine if the addition of rate/rhythm control to treatment provided further benefits beyond those achieved with repeated practice.…

  2. Prevalence and Predictors of Persistent Speech Sound Disorder at Eight Years Old: Findings from a Population Cohort Study

    Science.gov (United States)

    Wren, Yvonne; Miller, Laura L.; Peters, Tim J.; Emond, Alan; Roulstone, Sue

    2016-01-01

    Purpose: The purpose of this study was to determine prevalence and predictors of persistent speech sound disorder (SSD) in children aged 8 years after disregarding children presenting solely with common clinical distortions (i.e., residual errors). Method: Data from the Avon Longitudinal Study of Parents and Children (Boyd et al., 2012) were used.…

  3. Auditory, Visual and Audiovisual Speech Processing Streams in Superior Temporal Sulcus.

    Science.gov (United States)

    Venezia, Jonathan H; Vaden, Kenneth I; Rong, Feng; Maddox, Dale; Saberi, Kourosh; Hickok, Gregory

    2017-01-01

    The human superior temporal sulcus (STS) is responsive to visual and auditory information, including sounds and facial cues during speech recognition. We investigated the functional organization of STS with respect to modality-specific and multimodal speech representations. Twenty younger adult participants were instructed to perform an oddball detection task and were presented with auditory, visual, and audiovisual speech stimuli, as well as auditory and visual nonspeech control stimuli in a block fMRI design. Consistent with a hypothesized anterior-posterior processing gradient in STS, auditory, visual and audiovisual stimuli produced the largest BOLD effects in anterior, posterior and middle STS (mSTS), respectively, based on whole-brain, linear mixed effects and principal component analyses. Notably, the mSTS exhibited preferential responses to multisensory stimulation, as well as speech compared to nonspeech. Within the mid-posterior and mSTS regions, response preferences changed gradually from visual, to multisensory, to auditory moving posterior to anterior. Post hoc analysis of visual regions in the posterior STS revealed that a single subregion bordering the mSTS was insensitive to differences in low-level motion kinematics yet distinguished between visual speech and nonspeech based on multi-voxel activation patterns. These results suggest that auditory and visual speech representations are elaborated gradually within anterior and posterior processing streams, respectively, and may be integrated within the mSTS, which is sensitive to more abstract speech information within and across presentation modalities. The spatial organization of STS is consistent with processing streams that are hypothesized to synthesize perceptual speech representations from sensory signals that provide convergent information from visual and auditory modalities.

  4. Subcortical processing of speech regularities underlies reading and music aptitude in children

    Science.gov (United States)

    2011-01-01

    Background: Neural sensitivity to acoustic regularities supports fundamental human behaviors such as hearing in noise and reading. Although the failure to encode acoustic regularities in ongoing speech has been associated with language and literacy deficits, how auditory expertise, such as the expertise that is associated with musical skill, relates to the brainstem processing of speech regularities is unknown. An association between musical skill and neural sensitivity to acoustic regularities would not be surprising given the importance of repetition and regularity in music. Here, we aimed to define relationships between the subcortical processing of speech regularities, music aptitude, and reading abilities in children with and without reading impairment. We hypothesized that, in combination with auditory cognitive abilities, neural sensitivity to regularities in ongoing speech provides a common biological mechanism underlying the development of music and reading abilities. Methods: We assessed auditory working memory and attention, music aptitude, reading ability, and neural sensitivity to acoustic regularities in 42 school-aged children with a wide range of reading ability. Neural sensitivity to acoustic regularities was assessed by recording brainstem responses to the same speech sound presented in predictable and variable speech streams. Results: Through correlation analyses and structural equation modeling, we reveal that music aptitude and literacy both relate to the extent of subcortical adaptation to regularities in ongoing speech as well as with auditory working memory and attention. Relationships between music and speech processing are specifically driven by performance on a musical rhythm task, underscoring the importance of rhythmic regularity for both language and music. Conclusions: These data indicate common brain mechanisms underlying reading and music abilities that relate to how the nervous system responds to regularities in auditory input …

  5. Subcortical processing of speech regularities underlies reading and music aptitude in children

    Directory of Open Access Journals (Sweden)

    Strait Dana L

    2011-10-01

    Background: Neural sensitivity to acoustic regularities supports fundamental human behaviors such as hearing in noise and reading. Although the failure to encode acoustic regularities in ongoing speech has been associated with language and literacy deficits, how auditory expertise, such as the expertise that is associated with musical skill, relates to the brainstem processing of speech regularities is unknown. An association between musical skill and neural sensitivity to acoustic regularities would not be surprising given the importance of repetition and regularity in music. Here, we aimed to define relationships between the subcortical processing of speech regularities, music aptitude, and reading abilities in children with and without reading impairment. We hypothesized that, in combination with auditory cognitive abilities, neural sensitivity to regularities in ongoing speech provides a common biological mechanism underlying the development of music and reading abilities. Methods: We assessed auditory working memory and attention, music aptitude, reading ability, and neural sensitivity to acoustic regularities in 42 school-aged children with a wide range of reading ability. Neural sensitivity to acoustic regularities was assessed by recording brainstem responses to the same speech sound presented in predictable and variable speech streams. Results: Through correlation analyses and structural equation modeling, we reveal that music aptitude and literacy both relate to the extent of subcortical adaptation to regularities in ongoing speech as well as with auditory working memory and attention. Relationships between music and speech processing are specifically driven by performance on a musical rhythm task, underscoring the importance of rhythmic regularity for both language and music. Conclusions: These data indicate common brain mechanisms underlying reading and music abilities that relate to how the nervous system responds to …

  6. Subcortical processing of speech regularities underlies reading and music aptitude in children.

    Science.gov (United States)

    Strait, Dana L; Hornickel, Jane; Kraus, Nina

    2011-10-17

    Neural sensitivity to acoustic regularities supports fundamental human behaviors such as hearing in noise and reading. Although the failure to encode acoustic regularities in ongoing speech has been associated with language and literacy deficits, how auditory expertise, such as the expertise that is associated with musical skill, relates to the brainstem processing of speech regularities is unknown. An association between musical skill and neural sensitivity to acoustic regularities would not be surprising given the importance of repetition and regularity in music. Here, we aimed to define relationships between the subcortical processing of speech regularities, music aptitude, and reading abilities in children with and without reading impairment. We hypothesized that, in combination with auditory cognitive abilities, neural sensitivity to regularities in ongoing speech provides a common biological mechanism underlying the development of music and reading abilities. We assessed auditory working memory and attention, music aptitude, reading ability, and neural sensitivity to acoustic regularities in 42 school-aged children with a wide range of reading ability. Neural sensitivity to acoustic regularities was assessed by recording brainstem responses to the same speech sound presented in predictable and variable speech streams. Through correlation analyses and structural equation modeling, we reveal that music aptitude and literacy both relate to the extent of subcortical adaptation to regularities in ongoing speech as well as with auditory working memory and attention. Relationships between music and speech processing are specifically driven by performance on a musical rhythm task, underscoring the importance of rhythmic regularity for both language and music. These data indicate common brain mechanisms underlying reading and music abilities that relate to how the nervous system responds to regularities in auditory input. Definition of common biological underpinnings …

  7. Dissociating Cortical Activity during Processing of Native and Non-Native Audiovisual Speech from Early to Late Infancy

    Directory of Open Access Journals (Sweden)

    Eswen Fava

    2014-08-01

    Initially, infants are capable of discriminating phonetic contrasts across the world’s languages. Starting between seven and ten months of age, they gradually lose this ability through a process of perceptual narrowing. Although traditionally investigated with isolated speech sounds, such narrowing occurs in a variety of perceptual domains (e.g., faces, visual speech). Thus far, tracking the developmental trajectory of this tuning process has focused primarily on auditory speech alone, and generally using isolated sounds. But infants learn from speech produced by people talking to them, meaning they learn from a complex audiovisual signal. Here, we use near-infrared spectroscopy to measure blood concentration changes in the bilateral temporal cortices of infants in three different age groups: 3-to-6 months, 7-to-10 months, and 11-to-14 months. Critically, all three groups of infants were tested with continuous audiovisual speech in both their native and another, unfamiliar language. We found that at each age range, infants showed different patterns of cortical activity in response to the native and non-native stimuli. Infants in the youngest group showed bilateral cortical activity that was greater overall in response to non-native relative to native speech; the oldest group showed left-lateralized activity in response to native relative to non-native speech. These results highlight perceptual tuning as a dynamic process that happens across modalities and at different levels of stimulus complexity.

  8. Kinematic Analysis of Speech Sound Sequencing Errors Induced by Delayed Auditory Feedback.

    Science.gov (United States)

    Cler, Gabriel J; Lee, Jackson C; Mittelman, Talia; Stepp, Cara E; Bohland, Jason W

    2017-06-22

    Delayed auditory feedback (DAF) causes speakers to become disfluent and make phonological errors. Methods for assessing the kinematics of speech errors are lacking, with most DAF studies relying on auditory perceptual analyses, which may be problematic, as errors judged to be categorical may actually represent blends of sounds or articulatory errors. Eight typical speakers produced nonsense syllable sequences under normal and DAF (200 ms). Lip and tongue kinematics were captured with electromagnetic articulography. Time-locked acoustic recordings were transcribed, and the kinematics of utterances with and without perceived errors were analyzed with existing and novel quantitative methods. New multivariate measures showed that for 5 participants, kinematic variability for productions perceived to be error free was significantly increased under delay; these results were validated by using the spatiotemporal index measure. Analysis of error trials revealed both typical productions of a nontarget syllable and productions with articulatory kinematics that incorporated aspects of both the target and the perceived utterance. This study is among the first to characterize articulatory changes under DAF and provides evidence for different classes of speech errors, which may not be perceptually salient. New methods were developed that may aid visualization and analysis of large kinematic data sets. https://doi.org/10.23641/asha.5103067.
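    The spatiotemporal index (STI) mentioned above summarizes kinematic variability across repeated productions: each trajectory is amplitude-normalized and linearly time-normalized, and the across-trial standard deviations are summed over a fixed grid of relative time points. A minimal sketch follows, with synthetic trajectories standing in for articulograph traces; the grid size and jitter parameters are illustrative.

```python
import numpy as np

def spatiotemporal_index(trials, n_points=50):
    """Spatiotemporal index (STI) of movement variability.

    Each trial is a 1-D kinematic trajectory (e.g. lip aperture over time).
    Trials are amplitude-normalized (z-scored) and linearly time-normalized,
    and the STI is the sum of across-trial standard deviations evaluated at
    n_points equally spaced relative time points.
    """
    grid = np.linspace(0.0, 1.0, n_points)
    norm = []
    for tr in trials:
        tr = (tr - tr.mean()) / tr.std()                    # amplitude norm
        src = np.linspace(0.0, 1.0, len(tr))                # original time
        norm.append(np.interp(grid, src, tr))               # time norm
    return np.std(np.array(norm), axis=0, ddof=1).sum()

# Hypothetical demo: repeated productions with jitter in timing and amplitude
rng = np.random.default_rng(3)
trials = [np.sin(np.linspace(0, 2 * np.pi, rng.integers(180, 220)))
          * rng.normal(1.0, 0.05) + rng.normal(0, 0.05, 1)
          for _ in range(10)]
print(spatiotemporal_index(trials))
```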

  9. Sleep Disrupts High-Level Speech Parsing Despite Significant Basic Auditory Processing.

    Science.gov (United States)

    Makov, Shiri; Sharon, Omer; Ding, Nai; Ben-Shachar, Michal; Nir, Yuval; Zion Golumbic, Elana

    2017-08-09

    … parsing are also preserved. We used a novel approach for studying the depth of speech processing across wakefulness and sleep while tracking neuronal activity with EEG. We found that responses to the auditory sound stream remained intact; however, the sleeping brain did not show signs of hierarchical parsing of the continuous stream of syllables into words, phrases, and sentences. The results suggest that sleep imposes a functional barrier between basic sensory processing and high-level cognitive processing. This paradigm also holds promise for studying residual cognitive abilities in a wide array of unresponsive states. Copyright © 2017 the authors 0270-6474/17/377772-10$15.00/0.

  10. Temporal Context in Speech Processing and Attentional Stream Selection: A Behavioral and Neural perspective

    Science.gov (United States)

    Zion Golumbic, Elana M.; Poeppel, David; Schroeder, Charles E.

    2012-01-01

    The human capacity for processing speech is remarkable, especially given that information in speech unfolds over multiple time scales concurrently. Similarly notable is our ability to filter out extraneous sounds and focus our attention on one conversation, epitomized by the ‘Cocktail Party’ effect. Yet, the neural mechanisms underlying on-line speech decoding and attentional stream selection are not well understood. We review findings from behavioral and neurophysiological investigations that underscore the importance of the temporal structure of speech for achieving these perceptual feats. We discuss the hypothesis that entrainment of ambient neuronal oscillations to speech’s temporal structure, across multiple time scales, serves to facilitate its decoding and underlies the selection of an attended speech stream over other competing input. In this regard, speech decoding and attentional stream selection are examples of ‘active sensing’, emphasizing an interaction between proactive and predictive top-down modulation of neuronal dynamics and bottom-up sensory input. PMID:22285024
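    One simple way to quantify the entrainment idea reviewed here is spectral coherence between the slow amplitude envelope of speech and a neural signal. The sketch below is illustrative only: a synthetic signal with a 4-Hz "syllable rate" stands in for speech, and a noise-corrupted copy of its envelope stands in for neural data.

```python
import numpy as np
from scipy.signal import coherence, hilbert, butter, sosfiltfilt

def envelope(x, fs, cutoff=10.0):
    """Low-pass amplitude envelope (the slow temporal structure, < ~10 Hz)."""
    env = np.abs(hilbert(x))
    sos = butter(4, cutoff, btype="lowpass", fs=fs, output="sos")
    return sosfiltfilt(sos, env)

# Hypothetical demo: a "neural" signal that partially tracks a speech envelope
fs = 500
t = np.arange(0, 20, 1 / fs)
speech = (1 + 0.5 * np.sin(2 * np.pi * 4 * t)) * np.sin(2 * np.pi * 150 * t)
env = envelope(speech, fs)
rng = np.random.default_rng(4)
neural = 0.6 * env + rng.standard_normal(len(env))

f, cxy = coherence(env, neural, fs=fs, nperseg=1024)
print(cxy[np.argmin(np.abs(f - 4.0))])   # coherence is elevated near 4 Hz
```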

  11. Discrimination and preference of speech and non-speech sounds in autism patients

    Institute of Scientific and Technical Information of China (English)

    王崇颖; 江鸣山; 徐旸; 马斐然; 石锋

    2011-01-01

    Objective: To explore the discrimination and preference of speech and non-speech sounds in autism patients. Methods: Ten people (5 children and 5 adults) diagnosed with autism according to the criteria of the Diagnostic and Statistical Manual of Mental Disorders, Fourth Edition (DSM-IV) were selected from the database of the Nankai University Center for Behavioural Science. Together with 10 age-matched healthy controls, the people with autism were tested in three experiments, on speech sounds, pure tones, and intonation, which were recorded and modified with Praat, a voice analysis software. Their discrimination and preference responses were collected orally. Exact probability values were calculated. Results: There were no significant differences in the discrimination of speech sounds, pure tones, and intonation between autism patients and controls (P > 0.05), while controls preferred speech and non-speech sounds with higher pitch than the autism group (e.g., -100 Hz/+50 Hz: 2 vs. 7, P < 0.05; 50 Hz/250 Hz: 4 vs. 10, P < 0.05) and the autism group preferred non-speech sounds with lower pitch (100 Hz/250 Hz: 6 vs. 3, P < 0.05). No significant difference in the preference for intonation between autism patients and controls (P > 0.05) was found. Conclusion: People with autism show impaired auditory processing of speech and non-speech sounds.

  12. Robust digital processing of speech signals

    CERN Document Server

    Kovacevic, Branko; Veinović, Mladen; Marković, Milan

    2017-01-01

    This book focuses on speech signal phenomena, presenting a robustification of the usual speech generation models with regard to the presumed types of excitation signals, which is equivalent to the introduction of a class of nonlinear models and the corresponding criterion functions for parameter estimation. Compared to the general class of nonlinear models, such as various neural networks, these models possess good properties of controlled complexity, the option of working in “online” mode, as well as a low information volume for efficient speech encoding and transmission. Providing comprehensive insights, the book is based on the authors’ research, which has already been published, supplemented by additional texts discussing general considerations of speech modeling, linear predictive analysis and robust parameter estimation.

  13. Auditory Peripheral Processing of Degraded Speech

    National Research Council Canada - National Science Library

    Ghitza, Oded

    2003-01-01

    ...". The underlying thesis is that the auditory periphery contributes to the robust performance of humans in speech reception in noise through a concerted contribution of the efferent feedback system...

  14. Current trends in multilingual speech processing

    Indian Academy of Sciences (India)

    (i) Feature extractor that extracts the relevant information from the speech … extraction we are able to approximately account for this variability across listeners. … scores (similar to Zissman (1996)) before making a decision about the word.

  15. Processing changes when listening to foreign-accented speech

    Directory of Open Access Journals (Sweden)

    Carlos eRomero-Rivas

    2015-03-01

    This study investigates the mechanisms responsible for fast changes in the processing of foreign-accented speech. Event-related brain potentials (ERPs) were obtained while native speakers of Spanish listened to native and foreign-accented speakers of Spanish. We observed a less positive P200 component for foreign-accented speech relative to native speech comprehension. This suggests that the extraction of spectral information and other important acoustic features was hampered during foreign-accented speech comprehension. However, the amplitude of the N400 component for foreign-accented speech comprehension decreased across the experiment, suggesting the use of a higher-level, lexical mechanism. Furthermore, during native speech comprehension, semantic violations in the critical words elicited an N400 effect followed by a late positivity. During foreign-accented speech comprehension, semantic violations elicited only an N400 effect. Overall, our results suggest that, despite a lack of improvement in phonetic discrimination, native listeners experience changes at lexical-semantic levels of processing after brief exposure to foreign-accented speech. Moreover, these results suggest that lexical access, semantic integration and linguistic re-analysis processes are permeable to external factors, such as the accent of the speaker.
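    The ERP measures used in studies like this one (e.g., P200 and N400 amplitudes) boil down to epoching the EEG around stimulus onsets, baseline-correcting, and averaging within a latency window. A minimal single-channel sketch on simulated data follows; the sampling rate, window choices, and simulated component are illustrative, not taken from the study.

```python
import numpy as np

def erp_mean_amplitude(eeg, events, fs, window, baseline=(-0.2, 0.0)):
    """Baseline-corrected mean ERP amplitude in a latency window.

    eeg    : 1-D single-channel recording
    events : sample indices of stimulus onsets
    window : (start_s, end_s) latency window, e.g. (0.15, 0.25) for P200
    """
    pre = int(-baseline[0] * fs)
    post = int(window[1] * fs)
    epochs = np.array([eeg[e - pre:e + post] for e in events])
    base = epochs[:, :pre].mean(axis=1, keepdims=True)
    epochs = epochs - base                       # baseline correction
    i0, i1 = int(pre + window[0] * fs), int(pre + window[1] * fs)
    return epochs[:, i0:i1].mean()               # grand mean amplitude

# Hypothetical demo: noisy EEG with a positivity ~200 ms after each event
fs = 250
rng = np.random.default_rng(5)
eeg = rng.standard_normal(fs * 60) * 2.0
events = np.arange(fs, fs * 58, fs)              # one event per second
for e in events:
    idx = e + int(0.2 * fs)
    eeg[idx - 5:idx + 5] += 4.0                  # simulated P200-like bump
print(erp_mean_amplitude(eeg, events, fs, window=(0.15, 0.25)))
```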

  16. Processing Complex Sounds Passing through the Rostral Brainstem: The New Early Filter Model

    Science.gov (United States)

    Marsh, John E.; Campbell, Tom A.

    2016-01-01

    The rostral brainstem receives both “bottom-up” input from the ascending auditory system and “top-down” descending corticofugal connections. Speech information passing through the inferior colliculus of elderly listeners reflects the periodicity envelope of a speech syllable. This information arguably also reflects a composite of temporal-fine-structure (TFS) information from the higher-frequency vowel harmonics of that repeated syllable. The amplitude of those higher-frequency harmonics, bearing even higher-frequency TFS information, correlates positively with the word recognition ability of elderly listeners under reverberatory conditions. Also relevant is that working memory capacity (WMC), which is subject to age-related decline, constrains the processing of sounds at the level of the brainstem. Turning to the effects of a visually presented sensory or memory load on auditory processes, there is a load-dependent reduction of that processing, as manifest in the auditory brainstem responses (ABR) evoked by to-be-ignored clicks: wave V decreases in amplitude with increases in the visually presented memory load. A visually presented sensory load also produces a load-dependent reduction of a slightly different sort: the sensory load of visually presented information limits the disruptive effects of background sound upon working memory performance. A new early filter model is thus advanced whereby systems within the frontal lobe (affected by sensory or memory load) cholinergically influence top-down corticofugal connections. Those corticofugal connections constrain the processing of complex sounds such as speech at the level of the brainstem. Selective attention thereby limits the distracting effects of background sound entering the higher auditory system via the inferior colliculus. Processing TFS in the brainstem relates to perception of speech under adverse conditions. Attentional selectivity is crucial when the signal heard is degraded or masked …

  17. The neural basis of speech sound discrimination from infancy to adulthood

    OpenAIRE

    Partanen, Eino

    2013-01-01

    Rapid processing of speech is facilitated by neural representations of native language phonemes. However, some disorders and developmental conditions, such as developmental dyslexia, can hamper the development of these neural memory traces, leading to language delays and poor academic achievement. While the early identification of such deficits is paramount so that interventions can be started as early as possible, there is currently no systematically used ecologically valid paradigm for the ...

  18. Processing of Phonological Variation in Children with Hearing Loss: Compensation for English Place Assimilation in Connected Speech

    Science.gov (United States)

    Skoruppa, Katrin; Rosen, Stuart

    2014-01-01

    Purpose: In this study, the authors explored phonological processing in connected speech in children with hearing loss. Specifically, the authors investigated these children's sensitivity to English place assimilation, by which alveolar consonants like t and n can adapt to following sounds (e.g., the word ten can be realized as tem in the…

  19. The effect of F0 contour on the intelligibility of speech in the presence of interfering sounds for Mandarin Chinese.

    Science.gov (United States)

    Chen, Jing; Yang, Hongying; Wu, Xihong; Moore, Brian C J

    2018-02-01

    In Mandarin Chinese, the fundamental frequency (F0) contour defines lexical "Tones" that differ in meaning despite being phonetically identical. Flattening the F0 contour impairs the intelligibility of Mandarin Chinese in background sounds. This might occur because the flattening introduces misleading lexical information. To avoid this effect, two types of speech were used: single-Tone speech contained Tones 1 and 0 only, which have a flat F0 contour; multi-Tone speech contained all Tones and had a varying F0 contour. The intelligibility of speech in steady noise was slightly better for single-Tone speech than for multi-Tone speech. The intelligibility of speech in a two-talker masker, with the difference in mean F0 between the target and masker matched across conditions, was worse for the multi-Tone target in the multi-Tone masker than for any other combination of target and masker, probably because informational masking was maximal for this combination. The introduction of a perceived spatial separation between the target and masker, via the precedence effect, led to better performance for all target-masker combinations, especially the multi-Tone target in the multi-Tone masker. In summary, a flat F0 contour does not reduce the intelligibility of Mandarin Chinese when the introduction of misleading lexical cues is avoided.
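
    The flattening manipulation referred to at the start of this abstract can be illustrated with a minimal numpy sketch, assuming a frame-wise F0 track in Hz with 0.0 marking unvoiced frames; the function name and toy contour are illustrative, not taken from the paper.

        import numpy as np

        def flatten_f0(f0_track):
            """Replace a varying F0 contour with its mean over voiced frames.

            Flattening removes the pitch movements that cue lexical Tone in
            Mandarin, which is the manipulation said to impair intelligibility.
            """
            f0 = np.asarray(f0_track, dtype=float)
            voiced = f0 > 0           # 0.0 marks unvoiced frames by convention
            flat = f0.copy()
            if voiced.any():
                flat[voiced] = f0[voiced].mean()
            return flat

        # A rising contour (Tone-2-like) becomes flat at its mean of 128.75 Hz.
        contour = np.array([0.0, 110.0, 120.0, 135.0, 150.0, 0.0])
        print(flatten_f0(contour))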

  20. Multiengine Speech Processing Using SNR Estimator in Variable Noisy Environments

    Directory of Open Access Journals (Sweden)

    Ahmad R. Abu-El-Quran

    2012-01-01

    Full Text Available We introduce a multiengine speech processing system that can detect the location and the type of audio signal in variable noisy environments. This system detects the location of the audio source using a microphone array; the system examines the audio first, determines if it is speech/nonspeech, then estimates the value of the signal-to-noise ratio (SNR) using a Discrete-Valued SNR Estimator. Using this SNR value, instead of trying to adapt the speech signal to the speech processing system, we adapt the speech processing system to the surrounding environment of the captured speech signal. In this paper, we introduce the Discrete-Valued SNR Estimator and a multiengine classifier, using Multiengine Selection or Multiengine Weighted Fusion, and we use speaker identification (SI) as the example speech processing task. The Discrete-Valued SNR Estimator achieves an accuracy of 98.4% in characterizing the environment's SNR. Compared to a conventional single engine SI system, the improvement in accuracy was as high as 9.0% and 10.0% for Multiengine Selection and Multiengine Weighted Fusion, respectively.
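
    The SNR-conditioned routing idea can be sketched in a few lines; this is a hedged stand-in, not the paper's design: the Discrete-Valued SNR Estimator is replaced by quantization to a fixed SNR grid, and the Gaussian proximity weighting in the fusion variant is an assumption.

        import numpy as np

        SNR_GRID_DB = np.array([0, 5, 10, 15, 20, 30])   # assumed discrete levels

        def quantize_snr(snr_db):
            """Map a continuous SNR estimate to the nearest discrete level."""
            return int(SNR_GRID_DB[np.argmin(np.abs(SNR_GRID_DB - snr_db))])

        def multiengine_selection(scores_by_engine, snr_db):
            """Use only the engine trained nearest the estimated SNR."""
            return scores_by_engine[quantize_snr(snr_db)]

        def multiengine_weighted_fusion(scores_by_engine, snr_db, sigma=5.0):
            """Blend all engines, weighting each by its SNR proximity."""
            weights = {lvl: np.exp(-0.5 * ((snr_db - lvl) / sigma) ** 2)
                       for lvl in scores_by_engine}
            total = sum(weights.values())
            return sum(w * scores_by_engine[lvl] for lvl, w in weights.items()) / total

        # Toy usage: per-speaker SI scores from engines trained per SNR level.
        rng = np.random.default_rng(0)
        engines = {int(lvl): rng.random(4) for lvl in SNR_GRID_DB}
        print(multiengine_selection(engines, snr_db=12.3))        # 10 dB engine
        print(multiengine_weighted_fusion(engines, snr_db=12.3))  # blended scores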

  1. Children with speech sound disorder: Comparing a non-linguistic auditory approach with a phonological intervention approach to improve phonological skills

    Directory of Open Access Journals (Sweden)

    Cristina eMurphy

    2015-02-01

    Full Text Available This study aimed to compare the effects of a non-linguistic auditory intervention approach with a phonological intervention approach on the phonological skills of children with speech sound disorder. A total of 17 children, aged 7-12 years, with speech sound disorder were randomly allocated to either the non-linguistic auditory temporal intervention group (n = 10, average age 7.7 ± 1.2) or the phonological intervention group (n = 7, average age 8.6 ± 1.2). The intervention outcomes included auditory-sensory measures (auditory temporal processing skills) and cognitive measures (attention, short-term memory, speech production, and phonological awareness skills). The auditory approach focused on non-linguistic auditory training (e.g., backward masking and frequency discrimination), whereas the phonological approach focused on speech sound training (e.g., phonological organisation and awareness). Both interventions consisted of twelve 45-minute sessions delivered twice per week, for a total of nine hours. Intra-group analysis demonstrated that the auditory intervention group showed significant gains in both auditory and cognitive measures, whereas no significant gain was observed in the phonological intervention group. No significant improvement on phonological skills was observed in any of the groups. Inter-group analysis demonstrated significant differences in the improvement following training between the two groups, with a more pronounced gain for the non-linguistic auditory temporal intervention in one of the visual attention measures and both auditory measures. Therefore, both analyses suggest that although the non-linguistic auditory intervention approach appeared to be the most effective intervention approach, it was not sufficient to promote the enhancement of phonological skills.

  2. Differential Diagnosis of Speech Sound Disorders in Danish-speaking Children

    DEFF Research Database (Denmark)

    Clausen, Marit Carolin; Fox-Boyer, Anette

    … in selecting the right intervention approach to resolve the SSD. Different quantitative and qualitative measurements are currently used to subgroup children with SSD. A quantitative method of classifying children is by the accuracy of their productions: according to this approach, the severity of children’s SSD is classified by calculating the percentage of correctly produced consonants (i.e., percentage consonants correct, PCC-A) (Shriberg et al., 1997). Alternatively, a qualitative approach seeks to ascertain which types of phonological processes are present in children’s speech, i.e., developmental or idiosyncratic. … Clinical decision-making about the need for intervention should not be based on the quantitative approach only; a qualitative classification approach is needed for a distinct subgrouping of children with SSD, whereas PCC-A can be used as additional information about the severity of the SSD.
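
    The PCC-style severity index mentioned above reduces to simple arithmetic; a minimal sketch assuming pre-aligned lists of target and produced consonants (the published PCC-A metric of Shriberg et al. defines scorable consonants and correctness in much more detail):

        def percent_consonants_correct(target, produced):
            """Percentage of consonants produced correctly (a PCC-style index)."""
            if not target:
                raise ValueError("no consonants to score")
            correct = sum(t == p for t, p in zip(target, produced))
            return 100.0 * correct / len(target)

        # Consonants of 'ten', 'cat', 'some'; the velar /k/ is fronted to [t],
        # so 5 of 6 targets are correct.
        targets  = ["t", "n", "k", "t", "s", "m"]
        produced = ["t", "n", "t", "t", "s", "m"]
        print(percent_consonants_correct(targets, produced))   # 83.33...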

  3. Sound Is Sound: Film Sound Techniques and Infrasound Data Array Processing

    Science.gov (United States)

    Perttu, A. B.; Williams, R.; Taisne, B.; Tailpied, D.

    2017-12-01

    A multidisciplinary collaboration between earth scientists and a sound designer/composer was established to explore the possibilities of audification analysis of infrasound array data. Through the process of audification of the infrasound we began to experiment with techniques and processes borrowed from cinema to manipulate the noise content of the signal. The results posed the question: "Would the accuracy of infrasound data array processing be enhanced by employing these techniques?" A new area of research was thus born from this collaboration, highlighting the value of these interactions and the unintended paths that can arise from them. Using a reference event database, infrasound data were processed using these new techniques and the results were compared with existing techniques to assess whether there was any improvement to the detection capability of the array. With just under one thousand volcanoes, and a high probability of eruption, Southeast Asia offers a unique opportunity to develop and test techniques for regional monitoring of volcanoes with different technologies. While these volcanoes are monitored locally (e.g. seismometer, infrasound, geodetic and geochemistry networks) and remotely (e.g. satellite and infrasound), there are challenges and limitations to the current monitoring capability. Not only is there a high fraction of cloud cover in the region, making plume observation more difficult via satellite, there have been examples of local monitoring networks and telemetry being destroyed early in the eruptive sequence. The success of local infrasound studies to identify explosions at volcanoes, and calculate plume heights from these signals, has led to an interest in retrieving source parameters for the purpose of ash modeling with a regional network independent of cloud cover.
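
    Audification of infrasound typically amounts to reinterpreting the recording's sample rate so that sub-audible frequencies shift into the audible band; a minimal numpy sketch, with the speedup factor and function names chosen for illustration rather than taken from this work:

        import numpy as np

        def audify(trace, recorded_rate_hz, speedup=200):
            """Audify a low-frequency trace by reinterpreting its sample rate.

            Playing samples recorded at `recorded_rate_hz` back at
            `recorded_rate_hz * speedup` shifts, e.g., 0.1 Hz up to 20 Hz,
            moving infrasound into the audible band.
            """
            samples = np.asarray(trace, dtype=np.float32)
            samples = samples / max(np.abs(samples).max(), 1e-12)  # normalize
            return samples, int(recorded_rate_hz * speedup)

        # One hour of 20 Hz infrasound data becomes an 18-second audible clip.
        rate = 20.0
        t = np.arange(0, 3600, 1.0 / rate)
        trace = np.sin(2 * np.pi * 0.5 * t)        # a 0.5 Hz test "signal"
        audio, playback_rate = audify(trace, rate, speedup=200)
        print(len(audio) / playback_rate, "seconds at", playback_rate, "Hz")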

  4. Sensitivity of cortical auditory evoked potential detection for hearing-impaired infants in response to short speech sounds

    Directory of Open Access Journals (Sweden)

    Bram Van Dun

    2012-01-01

    Full Text Available

    Background: Cortical auditory evoked potentials (CAEPs are an emerging tool for hearing aid fitting evaluation in young children who cannot provide reliable behavioral feedback. It is therefore useful to determine the relationship between the sensation level of speech sounds and the detection sensitivity of CAEPs.

    Design and methods: Twenty-five sensorineurally hearing impaired infants with an age range of 8 to 30 months were tested once, 18 aided and 7 unaided. First, behavioral thresholds of speech stimuli /m/, /g/, and /t/ were determined using visual reinforcement orientation audiometry (VROA. Afterwards, the same speech stimuli were presented at 55, 65, and 75 dB SPL, and CAEP recordings were made. An automatic statistical detection paradigm was used for CAEP detection.

    Results: For sensation levels above 0, 10, and 20 dB respectively, detection sensitivities were equal to 72 ± 10, 75 ± 10, and 78 ± 12%. In 79% of the cases, automatic detection p-values became smaller when the sensation level was increased by 10 dB.

    Conclusions: The results of this study suggest that the presence or absence of CAEPs can provide some indication of the audibility of a speech sound for infants with sensorineural hearing loss. The detection of a CAEP provides confidence, to a degree commensurate with the detection probability, that the infant is detecting that sound at the level presented. When testing infants where the audibility of speech sounds has not been established behaviorally, the lack of a cortical response indicates the possibility, but by no means a certainty, that the sensation level is 10 dB or less.
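
    Sensation level in these results is just the presentation level minus the behaviorally measured threshold; a one-line illustration (the threshold value is invented for the example):

        def sensation_level(presentation_db_spl, threshold_db_spl):
            """Sensation level (dB SL) = presentation level - behavioral threshold."""
            return presentation_db_spl - threshold_db_spl

        # A speech sound at 65 dB SPL, heard by an infant whose measured
        # threshold is 50 dB SPL, sits at 15 dB SL.
        print(sensation_level(65, 50))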

  5. Musical intervention enhances infants' neural processing of temporal structure in music and speech.

    Science.gov (United States)

    Zhao, T Christina; Kuhl, Patricia K

    2016-05-10

    Individuals with music training in early childhood show enhanced processing of musical sounds, an effect that generalizes to speech processing. However, the conclusions drawn from previous studies are limited due to the possible confounds of predisposition and other factors affecting musicians and nonmusicians. We used a randomized design to test the effects of a laboratory-controlled music intervention on young infants' neural processing of music and speech. Nine-month-old infants were randomly assigned to music (intervention) or play (control) activities for 12 sessions. The intervention targeted temporal structure learning using triple meter in music (e.g., waltz), which is difficult for infants, and it incorporated key characteristics of typical infant music classes to maximize learning (e.g., multimodal, social, and repetitive experiences). Controls had similar multimodal, social, repetitive play, but without music. Upon completion, infants' neural processing of temporal structure was tested in both music (tones in triple meter) and speech (foreign syllable structure). Infants' neural processing was quantified by the mismatch response (MMR) measured with a traditional oddball paradigm using magnetoencephalography (MEG). The intervention group exhibited significantly larger MMRs in response to music temporal structure violations in both auditory and prefrontal cortical regions. Identical results were obtained for temporal structure changes in speech. The intervention thus enhanced temporal structure processing not only in music, but also in speech, at 9 mo of age. We argue that the intervention enhanced infants' ability to extract temporal structure information and to predict future events in time, a skill affecting both music and speech processing.
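
    The oddball logic behind the mismatch response is a deviant-minus-standard difference wave; a minimal numpy sketch with assumed array shapes and names (a real MEG/EEG pipeline would add filtering, artifact rejection, and baseline correction first):

        import numpy as np

        def mismatch_response(epochs, labels):
            """Deviant-minus-standard difference wave from an oddball recording.

            epochs: array of shape (n_trials, n_channels, n_samples)
            labels: one 'standard' or 'deviant' string per trial
            """
            labels = np.asarray(labels)
            standard = epochs[labels == "standard"].mean(axis=0)
            deviant = epochs[labels == "deviant"].mean(axis=0)
            return deviant - standard   # the mismatch response (MMR/MMN)

        # Toy data: 100 trials, 2 channels, 50 samples, roughly 15% deviants.
        rng = np.random.default_rng(0)
        epochs = rng.normal(size=(100, 2, 50))
        labels = ["deviant" if rng.random() < 0.15 else "standard" for _ in range(100)]
        print(mismatch_response(epochs, labels).shape)   # (2, 50)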

  6. The Process of Optimizing Mechanical Sound Quality in Product Design

    DEFF Research Database (Denmark)

    Eriksen, Kaare; Holst, Thomas

    2011-01-01

    The research field concerning optimizing product sound quality is a relatively unexplored area, and may become difficult for designers to operate in. To some degree, sound is a highly subjective parameter, which is normally targeted at sound specialists. This paper describes the theoretical and practical background for managing a process of optimizing the mechanical sound quality in a product design by using simple tools and workshops systematically. The procedure is illustrated by a case study of a computer navigation tool (a computer mouse). The process is divided into 4 phases, which clarify the importance of product sound, define the perceptive demands identified by users, and, finally, suggest mechanical principles for modification of an existing sound design. The optimized mechanical sound design is followed by tests on users of the product in its use context. The result…

  7. Transfer of training between music and speech: common processing, attention and memory

    Directory of Open Access Journals (Sweden)

    Mireille eBesson

    2011-05-01

    Full Text Available After a brief historical perspective of the relationship between language and music, we review our work on transfer of training from music to speech that aimed at testing the general hypothesis that musicians should be more sensitive than nonmusicians to speech sounds. In light of recent results in the literature, we argue that when long-term experience in one domain influences acoustic processing in the other domain, results can be interpreted as common acoustic processing. But when long-term experience in one domain influences the building-up of abstract and specific percepts in another domain, results are taken as evidence for transfer of training effects. Moreover, we also discuss the influence of attention and working memory on transfer effects and we highlight the usefulness of the Event-Related Potentials method to disentangle the different processes that unfold in the course of music and speech perception. Finally, we give an overview of an on-going longitudinal project with children aimed at testing transfer effects from music to different levels and aspects of speech processing.

  8. Transfer of Training between Music and Speech: Common Processing, Attention, and Memory.

    Science.gov (United States)

    Besson, Mireille; Chobert, Julie; Marie, Céline

    2011-01-01

    After a brief historical perspective of the relationship between language and music, we review our work on transfer of training from music to speech that aimed at testing the general hypothesis that musicians should be more sensitive than non-musicians to speech sounds. In light of recent results in the literature, we argue that when long-term experience in one domain influences acoustic processing in the other domain, results can be interpreted as common acoustic processing. But when long-term experience in one domain influences the building-up of abstract and specific percepts in another domain, results are taken as evidence for transfer of training effects. Moreover, we also discuss the influence of attention and working memory on transfer effects and we highlight the usefulness of the event-related potentials method to disentangle the different processes that unfold in the course of music and speech perception. Finally, we give an overview of an on-going longitudinal project with children aimed at testing transfer effects from music to different levels and aspects of speech processing.

  9. Statistical Learning, Syllable Processing, and Speech Production in Healthy Hearing and Hearing-Impaired Preschool Children: A Mismatch Negativity Study.

    Science.gov (United States)

    Studer-Eichenberger, Esther; Studer-Eichenberger, Felix; Koenig, Thomas

    2016-01-01

    The objectives of the present study were to investigate temporal/spectral sound-feature processing in preschool children (4 to 7 years old) with peripheral hearing loss compared with age-matched controls. The results verified the presence of statistical learning, which was diminished in children with hearing impairments (HIs), and elucidated possible perceptual mediators of speech production. Perception and production of the syllables /ba/, /da/, /ta/, and /na/ were recorded in 13 children with normal hearing and 13 children with HI. Perception was assessed physiologically through event-related potentials (ERPs) recorded by EEG in a multifeature mismatch negativity paradigm and behaviorally through a discrimination task. Temporal and spectral features of the ERPs during speech perception were analyzed, and speech production was quantitatively evaluated using speech motor maximum performance tasks. Proximal to stimulus onset, children with HI displayed a difference in map topography, indicating diminished statistical learning. In later ERP components, children with HI exhibited reduced amplitudes specifically in the N2 and early parts of the late discriminative negativity components, which are associated with temporal and spectral control mechanisms. Abnormalities of speech perception were only subtly reflected in speech production, as the only difference found in speech production was a mild delay in regulating speech intensity. In addition to previously reported deficits of sound-feature discrimination, the present study results reflect diminished statistical learning in children with HI, which plays an early and important, but so far neglected, role in phonological processing. Furthermore, the lack of corresponding behavioral abnormalities in speech production implies that impaired perceptual capacities do not necessarily translate into productive deficits.

  10. Processing of spatial sounds in the impaired auditory system

    DEFF Research Database (Denmark)

    Arweiler, Iris

    Understanding speech in complex acoustic environments presents a challenge for most hearing-impaired listeners. In conditions where normal-hearing listeners effortlessly utilize spatial cues to improve speech intelligibility, hearing-impaired listeners often struggle. In this thesis, the influence… … with an intelligibility-weighted “efficiency factor”, which revealed that the spectral characteristics of the ERs caused the reduced benefit. Hearing-impaired listeners were able to utilize the ER energy as effectively as normal-hearing listeners, most likely because binaural processing was not required… … implications for speech perception models and the development of compensation strategies in future generations of hearing instruments.

  11. Narrative Ability of Children With Speech Sound Disorders and the Prediction of Later Literacy Skills

    Science.gov (United States)

    Wellman, Rachel L.; Lewis, Barbara A.; Freebairn, Lisa A.; Avrich, Allison A.; Hansen, Amy J.; Stein, Catherine M.

    2012-01-01

    Purpose The main purpose of this study was to examine how children with isolated speech sound disorders (SSDs; n = 20), children with combined SSDs and language impairment (LI; n = 20), and typically developing children (n = 20), ages 3;3 (years;months) to 6;6, differ in narrative ability. The second purpose was to determine if early narrative ability predicts school-age (8–12 years) literacy skills. Method This study employed a longitudinal cohort design. The children completed a narrative retelling task before their formal literacy instruction began. The narratives were analyzed and compared for group differences. Performance on these early narratives was then used to predict the children’s reading decoding, reading comprehension, and written language ability at school age. Results Significant group differences were found in children’s (a) ability to answer questions about the story, (b) use of story grammars, and (c) number of correct and irrelevant utterances. Regression analysis demonstrated that measures of story structure and accuracy were the best predictors of the decoding of real words, reading comprehension, and written language. Measures of syntax and lexical diversity were the best predictors of the decoding of nonsense words. Conclusion Combined SSDs and LI, and not isolated SSDs, impact a child’s narrative abilities. Narrative retelling is a useful task for predicting which children may be at risk for later literacy problems. PMID:21969531

  12. Reading Skills of Students With Speech Sound Disorders at Three Stages of Literacy Development

    Science.gov (United States)

    Skebo, Crysten M.; Lewis, Barbara A.; Freebairn, Lisa A.; Tag, Jessica; Ciesla, Allison Avrich; Stein, Catherine M.

    2015-01-01

    Purpose The relationship between phonological awareness, overall language, vocabulary, and nonlinguistic cognitive skills to decoding and reading comprehension was examined for students at 3 stages of literacy development (i.e., early elementary school, middle school, and high school). Students with histories of speech sound disorders (SSD) with and without language impairment (LI) were compared to students without histories of SSD or LI (typical language; TL). Method In a cross-sectional design, students ages 7;0 (years; months) to 17;9 completed tests that measured reading, language, and nonlinguistic cognitive skills. Results For the TL group, phonological awareness predicted decoding at early elementary school, and overall language predicted reading comprehension at early elementary school and both decoding and reading comprehension at middle school and high school. For the SSD-only group, vocabulary predicted both decoding and reading comprehension at early elementary school, and overall language predicted both decoding and reading comprehension at middle school and decoding at high school. For the SSD and LI group, overall language predicted decoding at all 3 literacy stages and reading comprehension at early elementary school and middle school, and vocabulary predicted reading comprehension at high school. Conclusion Although similar skills contribute to reading across the age span, the relative importance of these skills changes with children’s literacy stages. PMID:23833280

  13. Influences on infant speech processing: toward a new synthesis.

    Science.gov (United States)

    Werker, J F; Tees, R C

    1999-01-01

    To comprehend and produce language, we must be able to recognize the sound patterns of our language and the rules for how these sounds "map on" to meaning. Human infants are born with a remarkable array of perceptual sensitivities that allow them to detect the basic properties that are common to the world's languages. During the first year of life, these sensitivities undergo modification reflecting an exquisite tuning to just that phonological information that is needed to map sound to meaning in the native language. We review this transition from language-general to language-specific perceptual sensitivity that occurs during the first year of life and consider whether the changes propel the child into word learning. To account for the broad-based initial sensitivities and subsequent reorganizations, we offer an integrated transactional framework based on the notion of a specialized perceptual-motor system that has evolved to serve human speech, but which functions in concert with other developing abilities. In so doing, we highlight the links between infant speech perception, babbling, and word learning.

  14. Individual differences in speech-in-noise perception parallel neural speech processing and attention in preschoolers

    Science.gov (United States)

    Thompson, Elaine C.; Carr, Kali Woodruff; White-Schwoch, Travis; Otto-Meyer, Sebastian; Kraus, Nina

    2016-01-01

    From bustling classrooms to unruly lunchrooms, school settings are noisy. To learn effectively in the unwelcome company of numerous distractions, children must clearly perceive speech in noise. In older children and adults, speech-in-noise perception is supported by sensory and cognitive processes, but the correlates underlying this critical listening skill in young children (3–5 year olds) remain undetermined. Employing a longitudinal design (two evaluations separated by ~12 months), we followed a cohort of 59 preschoolers, ages 3.0–4.9, assessing word-in-noise perception, cognitive abilities (intelligence, short-term memory, attention), and neural responses to speech. Results reveal changes in word-in-noise perception parallel changes in processing of the fundamental frequency (F0), an acoustic cue known for playing a role central to speaker identification and auditory scene analysis. Four unique developmental trajectories (speech-in-noise perception groups) confirm this relationship, in that improvements and declines in word-in-noise perception couple with enhancements and diminishments of F0 encoding, respectively. Improvements in word-in-noise perception also pair with gains in attention. Word-in-noise perception does not relate to strength of neural harmonic representation or short-term memory. These findings reinforce previously-reported roles of F0 and attention in hearing speech in noise in older children and adults, and extend this relationship to preschool children. PMID:27864051

  15. Social eye gaze modulates processing of speech and co-speech gesture.

    Science.gov (United States)

    Holler, Judith; Schubotz, Louise; Kelly, Spencer; Hagoort, Peter; Schuetze, Manuela; Özyürek, Aslı

    2014-12-01

    In human face-to-face communication, language comprehension is a multi-modal, situated activity. However, little is known about how we combine information from different modalities during comprehension, and how perceived communicative intentions, often signaled through visual signals, influence this process. We explored this question by simulating a multi-party communication context in which a speaker alternated her gaze between two recipients. Participants viewed speech-only or speech+gesture object-related messages when being addressed (direct gaze) or unaddressed (gaze averted to other participant). They were then asked to choose which of two object images matched the speaker's preceding message. Unaddressed recipients responded significantly more slowly than addressees for speech-only utterances. However, perceiving the same speech accompanied by gestures sped unaddressed recipients up to a level identical to that of addressees. That is, when unaddressed recipients' speech processing suffers, gestures can enhance the comprehension of a speaker's message. We discuss our findings with respect to two hypotheses attempting to account for how social eye gaze may modulate multi-modal language comprehension. Copyright © 2014 Elsevier B.V. All rights reserved.

  16. Mathematical modeling and signal processing in speech and hearing sciences

    CERN Document Server

    Xin, Jack

    2014-01-01

    The aim of the book is to give an accessible introduction to mathematical models and signal processing methods in speech and hearing sciences for senior undergraduate and beginning graduate students with basic knowledge of linear algebra, differential equations, numerical analysis, and probability. Speech and hearing sciences are fundamental to numerous technological advances of the digital world in the past decade, from music compression in MP3 to digital hearing aids, from network based voice enabled services to speech interaction with mobile phones. Mathematics and computation are intimately related to these leaps and bounds. On the other hand, speech and hearing are strongly interdisciplinary areas where dissimilar scientific and engineering publications and approaches often coexist and make it difficult for newcomers to enter.

  17. Effects of Active and Passive Hearing Protection Devices on Sound Source Localization, Speech Recognition, and Tone Detection.

    Directory of Open Access Journals (Sweden)

    Andrew D Brown

    Full Text Available Hearing protection devices (HPDs) such as earplugs offer to mitigate noise exposure and reduce the incidence of hearing loss among persons frequently exposed to intense sound. However, distortions of spatial acoustic information and reduced audibility of low-intensity sounds caused by many existing HPDs can make their use untenable in high-risk (e.g., military or law enforcement) environments where auditory situational awareness is imperative. Here we assessed (1) sound source localization accuracy using a head-turning paradigm, (2) speech-in-noise recognition using a modified version of the QuickSIN test, and (3) tone detection thresholds using a two-alternative forced-choice task. Subjects were 10 young normal-hearing males. Four different HPDs were tested (two active, two passive), including two new and previously untested devices. Relative to unoccluded (control) performance, all tested HPDs significantly degraded performance across tasks, although one active HPD slightly improved high-frequency tone detection thresholds and did not degrade speech recognition. Behavioral data were examined with respect to head-related transfer functions measured using a binaural manikin with and without tested HPDs in place. Data reinforce previous reports that HPDs significantly compromise a variety of auditory perceptual facilities, particularly sound localization due to distortions of high-frequency spectral cues that are important for the avoidance of front-back confusions.
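
    Two-alternative forced-choice detection thresholds like those above are commonly estimated with an adaptive staircase; a generic 2-down/1-up sketch that converges near 70.7% correct, not necessarily the authors' exact procedure:

        import random

        def two_down_one_up(respond, start_db=40.0, step_db=2.0, n_reversals=8):
            """Estimate a 2AFC detection threshold with a 2-down/1-up staircase.

            `respond(level_db)` runs one two-alternative forced-choice trial
            and returns True when the listener answers correctly.
            """
            level, run, reversals, last_dir = start_db, 0, [], 0
            while len(reversals) < n_reversals:
                if respond(level):
                    run += 1
                    if run < 2:
                        continue              # one correct: hold the level
                    run, direction = 0, -1    # two in a row: step down
                else:
                    run, direction = 0, +1    # any miss: step up
                if last_dir and direction != last_dir:
                    reversals.append(level)   # track direction reversals
                last_dir = direction
                level += direction * step_db
            return sum(reversals[2:]) / len(reversals[2:])  # mean of late reversals

        # Simulated listener: perfect above 20 dB, at chance (50%) below it.
        simulated = lambda level_db: random.random() < (1.0 if level_db > 20 else 0.5)
        print(two_down_one_up(simulated))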

  18. Effects of Audio-Visual Integration on the Detection of Masked Speech and Non-Speech Sounds

    Science.gov (United States)

    Eramudugolla, Ranmalee; Henderson, Rachel; Mattingley, Jason B.

    2011-01-01

    Integration of simultaneous auditory and visual information about an event can enhance our ability to detect that event. This is particularly evident in the perception of speech, where the articulatory gestures of the speaker's lips and face can significantly improve the listener's detection and identification of the message, especially when that…

  19. Cluster-Randomized Controlled Trial Evaluating the Effectiveness of Computer-Assisted Intervention Delivered by Educators for Children with Speech Sound Disorders

    Science.gov (United States)

    McLeod, Sharynne; Baker, Elise; McCormack, Jane; Wren, Yvonne; Roulstone, Sue; Crowe, Kathryn; Masso, Sarah; White, Paul; Howland, Charlotte

    2017-01-01

    Purpose: The aim was to evaluate the effectiveness of computer-assisted input-based intervention for children with speech sound disorders (SSD). Method: The Sound Start Study was a cluster-randomized controlled trial. Seventy-nine early childhood centers were invited to participate, 45 were recruited, and 1,205 parents and educators of 4- and…

  20. Reduced neural integration of letters and speech sounds in dyslexic children scales with individual differences in reading fluency.

    Directory of Open Access Journals (Sweden)

    Gojko Žarić

    Full Text Available The acquisition of letter-speech sound associations is one of the basic requirements for fluent reading acquisition and its failure may contribute to reading difficulties in developmental dyslexia. Here we investigated event-related potential (ERP) measures of letter-speech sound integration in 9-year-old typical and dyslexic readers and specifically tested their relation to individual differences in reading fluency. We employed an audiovisual oddball paradigm in typical readers (n = 20), dysfluent (n = 18), and severely dysfluent (n = 18) dyslexic children. In one auditory and two audiovisual conditions the Dutch spoken vowels /a/ and /o/ were presented as standard and deviant stimuli. In audiovisual blocks, the letter 'a' was presented either simultaneously (AV0) or 200 ms before (AV200) vowel sound onset. Across the three groups of children, vowel deviancy in auditory blocks elicited comparable mismatch negativity (MMN) and late negativity (LN) responses. In typical readers, both audiovisual conditions (AV0 and AV200) led to enhanced MMN and LN amplitudes. In both dyslexic groups, the audiovisual LN effects were mildly reduced. Most interestingly, individual differences in reading fluency were correlated with MMN latency in the AV0 condition. A further analysis revealed that this effect was driven by a short-lived MMN effect encompassing only the N1 window in severely dysfluent dyslexics versus a longer MMN effect encompassing both the N1 and P2 windows in the other two groups. Our results confirm and extend previous findings in dyslexic children by demonstrating a deficient pattern of letter-speech sound integration depending on the level of reading dysfluency. These findings underscore the importance of considering individual differences across the entire spectrum of reading skills in addition to group differences between typical and dyslexic readers.

  1. Relationships between music training, speech processing, and word learning: a network perspective.

    Science.gov (United States)

    Elmer, Stefan; Jäncke, Lutz

    2018-03-15

    Numerous studies have documented the behavioral advantages conferred on professional musicians and children undergoing music training in processing speech sounds varying in the spectral and temporal dimensions. These beneficial effects have previously often been associated with local functional and structural changes in the auditory cortex (AC). However, this perspective is oversimplified, in that it does not take into account the intrinsic organization of the human brain, namely, neural networks and oscillatory dynamics. Therefore, we propose a new framework for extending these previous findings to a network perspective by integrating multimodal imaging, electrophysiology, and neural oscillations. In particular, we provide concrete examples of how functional and structural connectivity can be used to model simple neural circuits exerting a modulatory influence on AC activity. In addition, we describe how such a network approach can be used for better comprehending the beneficial effects of music training on more complex speech functions, such as word learning. © 2018 New York Academy of Sciences.

  2. A speech processing study using an acoustic model of a multiple-channel cochlear implant

    Science.gov (United States)

    Xu, Ying

    1998-10-01

    A cochlear implant is an electronic device designed to provide sound information for adults and children who have bilateral profound hearing loss. The task of representing speech signals as electrical stimuli is central to the design and performance of cochlear implants. Studies have shown that the current speech-processing strategies provide significant benefits to cochlear implant users. However, the evaluation and development of speech-processing strategies have been complicated by hardware limitations and large variability in user performance. To alleviate these problems, an acoustic model of a cochlear implant with the SPEAK strategy is implemented in this study, in which a set of acoustic stimuli whose psychophysical characteristics are as close as possible to those produced by a cochlear implant are presented to normal-hearing subjects. To test the effectiveness and feasibility of this acoustic model, a psychophysical experiment was conducted to match the performance of a normal-hearing listener using model-processed signals to that of a cochlear implant user. Good agreement was found between an implanted patient and an age-matched normal-hearing subject in a dynamic signal discrimination experiment, indicating that this acoustic model is a reasonably good approximation of a cochlear implant with the SPEAK strategy. The acoustic model was then used to examine the potential of the SPEAK strategy in terms of its temporal and frequency encoding of speech. It was hypothesized that better temporal and frequency encoding of speech can be accomplished by higher stimulation rates and a larger number of activated channels. Vowel and consonant recognition tests were conducted on normal-hearing subjects using speech tokens processed by the acoustic model, with different combinations of stimulation rate and number of activated channels. The results showed that vowel recognition was best at 600 pps and 8 activated channels, but further increases in stimulation rate and…
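
    Acoustic models of implant processing are commonly built as noise vocoders; a minimal scipy sketch of the band-envelope idea, with channel count and filter choices as assumptions (the SPEAK strategy's spectral-maxima selection and other details are not modeled here):

        import numpy as np
        from scipy.signal import butter, filtfilt, hilbert

        def noise_vocode(speech, fs, n_channels=8, lo=100.0, hi=7000.0):
            """Crude acoustic simulation of multichannel cochlear-implant hearing.

            Splits speech into log-spaced bands, extracts each band's amplitude
            envelope, and uses it to modulate band-limited noise.
            """
            edges = np.geomspace(lo, hi, n_channels + 1)
            noise = np.random.randn(len(speech))
            out = np.zeros(len(speech))
            for f_lo, f_hi in zip(edges[:-1], edges[1:]):
                b, a = butter(4, [f_lo, f_hi], btype="bandpass", fs=fs)
                band = filtfilt(b, a, speech)
                envelope = np.abs(hilbert(band))   # band amplitude envelope
                carrier = filtfilt(b, a, noise)    # noise limited to same band
                out += envelope * carrier
            return out / max(np.abs(out).max(), 1e-12)

        fs = 16000
        t = np.arange(fs) / fs                     # one second of audio
        vowel_like = np.sin(2 * np.pi * 220 * t) + 0.5 * np.sin(2 * np.pi * 660 * t)
        print(noise_vocode(vowel_like, fs, n_channels=8).shape)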

  3. Accommodating variation: dialects, idiolects, and speech processing.

    Science.gov (United States)

    Kraljic, Tanya; Brennan, Susan E; Samuel, Arthur G

    2008-04-01

    Listeners are faced with enormous variation in pronunciation, yet they rarely have difficulty understanding speech. Although much research has been devoted to figuring out how listeners deal with variability, virtually none (outside of sociolinguistics) has focused on the source of the variation itself. The current experiments explore whether different kinds of variation lead to different cognitive and behavioral adjustments. Specifically, we compare adjustments to the same acoustic consequence when it is due to context-independent variation (resulting from articulatory properties unique to a speaker) versus context-conditioned variation (resulting from common articulatory properties of speakers who share a dialect). The contrasting results for these two cases show that the source of a particular acoustic-phonetic variation affects how that variation is handled by the perceptual system. We also show that changes in perceptual representations do not necessarily lead to changes in production.

  4. Long-term exposure to noise impairs cortical sound processing and attention control.

    Science.gov (United States)

    Kujala, Teija; Shtyrov, Yury; Winkler, Istvan; Saher, Marieke; Tervaniemi, Mari; Sallinen, Mikael; Teder-Sälejärvi, Wolfgang; Alho, Kimmo; Reinikainen, Kalevi; Näätänen, Risto

    2004-11-01

    Long-term exposure to noise impairs human health, causing pathological changes in the inner ear as well as other anatomical and physiological deficits. Numerous individuals are daily exposed to excessive noise. However, there is a lack of systematic research on the effects of noise on cortical function. Here we report data showing that long-term exposure to noise has a persistent effect on central auditory processing and leads to concurrent behavioral deficits. We found that speech-sound discrimination was impaired in noise-exposed individuals, as indicated by behavioral responses and the mismatch negativity brain response. Furthermore, irrelevant sounds increased the distractibility of the noise-exposed subjects, which was shown by increased interference in task performance and aberrant brain responses. These results demonstrate that long-term exposure to noise has long-lasting detrimental effects on central auditory processing and attention control.

  5. Processing of prosodic changes in natural speech stimuli in school-age children.

    Science.gov (United States)

    Lindström, R; Lepistö, T; Makkonen, T; Kujala, T

    2012-12-01

    Speech prosody conveys information about important aspects of communication: the meaning of the sentence and the emotional state or intention of the speaker. The present study addressed processing of emotional prosodic changes in natural speech stimuli in school-age children (mean age 10 years) by recording the electroencephalogram, facial electromyography, and behavioral responses. The stimulus was a semantically neutral Finnish word uttered with four different emotional connotations: neutral, commanding, sad, and scornful. In the behavioral sound-discrimination task the reaction times were fastest for the commanding stimulus and longest for the scornful stimulus, and faster for the neutral than for the sad stimulus. EEG and EMG responses were measured during non-attentive oddball paradigm. Prosodic changes elicited a negative-going, fronto-centrally distributed neural response peaking at about 500 ms from the onset of the stimulus, followed by a fronto-central positive deflection, peaking at about 740 ms. For the commanding stimulus also a rapid negative deflection peaking at about 290 ms from stimulus onset was elicited. No reliable stimulus type specific rapid facial reactions were found. The results show that prosodic changes in natural speech stimuli activate pre-attentive neural change-detection mechanisms in school-age children. However, the results do not support the suggestion of automaticity of emotion specific facial muscle responses to non-attended emotional speech stimuli in children. Copyright © 2012 Elsevier B.V. All rights reserved.

  6. High school music classes enhance the neural processing of speech.

    Science.gov (United States)

    Tierney, Adam; Krizman, Jennifer; Skoe, Erika; Johnston, Kathleen; Kraus, Nina

    2013-01-01

    Should music be a priority in public education? One argument for teaching music in school is that private music instruction relates to enhanced language abilities and neural function. However, the directionality of this relationship is unclear and it is unknown whether school-based music training can produce these enhancements. Here we show that 2 years of group music classes in high school enhance the neural encoding of speech. To tease apart the relationships between music and neural function, we tested high school students participating in either music or fitness-based training. These groups were matched at the onset of training on neural timing, reading ability, and IQ. Auditory brainstem responses were collected to a synthesized speech sound presented in background noise. After 2 years of training, the neural responses of the music training group were earlier than at pre-training, while the neural timing of students in the fitness training group was unchanged. These results represent the strongest evidence to date that in-school music education can cause enhanced speech encoding. The neural benefits of musical training are, therefore, not limited to expensive private instruction early in childhood but can be elicited by cost-effective group instruction during adolescence.

  7. High school music classes enhance the neural processing of speech

    Directory of Open Access Journals (Sweden)

    Adam eTierney

    2013-12-01

    Full Text Available Should music be a priority in public education? One argument for teaching music in school is that private music instruction relates to enhanced language abilities and neural function. However, the directionality of this relationship is unclear and it is unknown whether school-based music training can produce these enhancements. Here we show that two years of group music classes in high school enhance the subcortical encoding of speech. To tease apart the relationships between music and neural function, we tested high school students participating in either music or fitness-based training. These groups were matched at the onset of training on neural timing, reading ability, and IQ. Auditory brainstem responses were collected to a synthesized speech sound presented in background noise. After 2 years of training, the subcortical responses of the music training group were earlier than at pretraining, while the neural timing of students in the fitness training group was unchanged. These results represent the strongest evidence to date that in-school music education can cause enhanced speech encoding. The neural benefits of musical training are, therefore, not limited to expensive private instruction early in childhood but can be elicited by cost-effective group instruction during adolescence.

  8. Linguistic Processing of Accented Speech Across the Lifespan

    Directory of Open Access Journals (Sweden)

    Alejandrina eCristia

    2012-11-01

    Full Text Available In most of the world, people have regular exposure to multiple accents. Therefore, learning to quickly process accented speech is a prerequisite to successful communication. In this paper, we examine work on the perception of accented speech across the lifespan, from early infancy to late adulthood. Unfamiliar accents initially impair linguistic processing by infants, children, younger adults, and older adults, but listeners of all ages come to adapt to accented speech. Emergent research also goes beyond these perceptual abilities, by assessing links with production and the relative contributions of linguistic knowledge and general cognitive skills. We conclude by underlining points of convergence across ages, and the gaps that remain to be addressed in future work.

  9. On the Perception of Speech Sounds as Biologically Significant Signals

    Science.gov (United States)

    Pisoni, David B.

    2012-01-01

    This paper reviews some of the major evidence and arguments currently available to support the view that human speech perception may require the use of specialized neural mechanisms for perceptual analysis. Experiments using synthetically produced speech signals with adults are briefly summarized and extensions of these results to infants and other organisms are reviewed with an emphasis towards detailing those aspects of speech perception that may require some need for specialized species-specific processors. Finally, some comments on the role of early experience in perceptual development are provided as an attempt to identify promising areas of new research in speech perception. PMID:399200

  10. Normal Aspects of Speech, Hearing, and Language.

    Science.gov (United States)

    Minifie, Fred. D., Ed.; And Others

    This book is written as a guide to the understanding of the processes involved in human speech communication. Ten authorities contributed material to provide an introduction to the physiological aspects of speech production and reception, the acoustical aspects of speech production and transmission, the psychophysics of sound reception, the nature…

  11. Context effects on processing widely deviant sounds in newborn infants

    Directory of Open Access Journals (Sweden)

    Gábor Péter Háden

    2013-09-01

    Full Text Available Detecting and orienting towards sounds carrying new information is a crucial feature of the human brain that supports adaptation to the environment. Rare, acoustically widely deviant sounds presented amongst frequent tones elicit large event-related brain potentials (ERPs) in neonates. Here we tested whether these discriminative ERP responses reflect only the activation of fresh afferent neuronal populations (i.e., neuronal circuits not affected by the tones) or whether they also index the processing of contextual mismatch between the rare and the frequent sounds. In two separate experiments, we presented sleeping newborns with 150 different environmental sounds and the same number of white noise bursts. Both sounds served either as deviants in an oddball paradigm with a tone as the frequent standard stimulus (Novel/Noise deviant), or as the standard stimulus with the tone as deviant (Novel/Noise standard), or they were delivered alone with the same timing as the deviants in the oddball condition (Novel/Noise alone). Whereas the noise deviants elicited responses similar to the same sound presented alone, the responses elicited by environmental sounds in the corresponding conditions morphologically differed from each other. Thus, whereas the ERP response to the noise sounds can be explained by the different refractory state of stimulus-specific neuronal populations, the ERP response to environmental sounds indicated context-sensitive processing. These results provide evidence for an innate tendency towards context-dependent auditory processing as well as a basis for the different developmental trajectories of processing acoustical deviance and contextual novelty.

  12. Emotional sounds modulate early neural processing of emotional pictures

    Directory of Open Access Journals (Sweden)

    Antje B M Gerdes

    2013-10-01

    Full Text Available In our natural environment, emotional information is conveyed by converging visual and auditory information; multimodal integration is of utmost importance. In the laboratory, however, emotion researchers have mostly focused on the examination of unimodal stimuli. Few existing studies on multimodal emotion processing have focused on human communication such as the integration of facial and vocal expressions. Extending the concept of multimodality, the current study examines how the neural processing of emotional pictures is influenced by simultaneously presented sounds. Twenty pleasant, unpleasant, and neutral pictures of complex scenes were presented to 22 healthy participants. On the critical trials these pictures were paired with pleasant, unpleasant, and neutral sounds. Sound presentation started 500 ms before picture onset and each stimulus presentation lasted for 2 s. EEG was recorded from 64 channels and ERP analyses focused on the picture onset. In addition, valence and arousal ratings were obtained. Previous findings for the neural processing of emotional pictures were replicated. Specifically, unpleasant compared to neutral pictures were associated with an increased parietal P200 and a more pronounced centroparietal late positive potential (LPP), independent of the accompanying sound valence. For audiovisual stimulation, increased parietal P100 and P200 were found in response to all pictures which were accompanied by unpleasant or pleasant sounds compared to pictures with neutral sounds. Most importantly, incongruent audiovisual pairs of unpleasant pictures and pleasant sounds enhanced parietal P100 and P200 compared to pairings with congruent sounds. Taken together, the present findings indicate that emotional sounds modulate early stages of visual processing and, therefore, provide an avenue by which multimodal experience may enhance perception.

  13. The Comorbidity between Attention-Deficit/Hyperactivity Disorder (ADHD) in Children and Arabic Speech Sound Disorder

    Science.gov (United States)

    Hariri, Ruaa Osama

    2016-01-01

    Children with Attention-Deficit/Hyperactivity Disorder (ADHD) often have co-existing learning disabilities and developmental weaknesses or delays in some areas, including speech (Rief, 2005). Seeing that phonological disorders include articulation errors and other forms of speech disorders, studies pertaining to children with ADHD symptoms who…

  14. Reconceptualizing Practice with Multilingual Children with Speech Sound Disorders: People, Practicalities and Policy

    Science.gov (United States)

    Verdon, Sarah; McLeod, Sharynne; Wong, Sandie

    2015-01-01

    Background: The speech and language therapy profession is required to provide services to increasingly multilingual caseloads. Much international research has focused on the challenges of speech and language therapists' (SLTs) practice with multilingual children. Aims: To draw on the experience and knowledge of experts in the field to: (1)…

  15. Application of the wavelet transform for speech processing

    Science.gov (United States)

    Maes, Stephane

    1994-01-01

    Speaker identification and word spotting will shortly play a key role in space applications. An approach based on the wavelet transform is presented that, in the context of the 'modulation model,' enables extraction of speech features which are used as input for the classification process.
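
    A minimal sketch of how a wavelet decomposition can yield speech features for such classifiers, using PyWavelets; the 'modulation model' front end described in the paper is more elaborate, so treat this as illustrative only:

        import numpy as np
        import pywt

        def wavelet_features(frame, wavelet="db4", level=5):
            """Log subband energies from a discrete wavelet decomposition."""
            coeffs = pywt.wavedec(frame, wavelet, level=level)
            return np.array([np.log(np.sum(c ** 2) + 1e-12) for c in coeffs])

        # A 32 ms frame at 16 kHz (512 samples) yields one 6-dimensional vector.
        rng = np.random.default_rng(1)
        frame = rng.normal(size=512)
        print(wavelet_features(frame))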

  16. Acoustic richness modulates the neural networks supporting intelligible speech processing.

    Science.gov (United States)

    Lee, Yune-Sang; Min, Nam Eun; Wingfield, Arthur; Grossman, Murray; Peelle, Jonathan E

    2016-03-01

    The information contained in a sensory signal plays a critical role in determining what neural processes are engaged. Here we used interleaved silent steady-state (ISSS) functional magnetic resonance imaging (fMRI) to explore how human listeners cope with different degrees of acoustic richness during auditory sentence comprehension. Twenty-six healthy young adults underwent scanning while hearing sentences that varied in acoustic richness (high vs. low spectral detail) and syntactic complexity (subject-relative vs. object-relative center-embedded clause structures). We manipulated acoustic richness by presenting the stimuli as unprocessed full-spectrum speech, or noise-vocoded with 24 channels. Importantly, although the vocoded sentences were spectrally impoverished, all sentences were highly intelligible. These manipulations allowed us to test how intelligible speech processing was affected by orthogonal linguistic and acoustic demands. Acoustically rich speech showed stronger activation than acoustically less-detailed speech in a bilateral temporoparietal network with more pronounced activity in the right hemisphere. By contrast, listening to sentences with greater syntactic complexity resulted in increased activation of a left-lateralized network including left posterior lateral temporal cortex, left inferior frontal gyrus, and left dorsolateral prefrontal cortex. Significant interactions between acoustic richness and syntactic complexity occurred in left supramarginal gyrus, right superior temporal gyrus, and right inferior frontal gyrus, indicating that the regions recruited for syntactic challenge differed as a function of acoustic properties of the speech. Our findings suggest that the neural systems involved in speech perception are finely tuned to the type of information available, and that reducing the richness of the acoustic signal dramatically alters the brain's response to spoken language, even when intelligibility is high.

  17. What and Where in auditory sensory processing: A high-density electrical mapping study of distinct neural processes underlying sound object recognition and sound localization

    Directory of Open Access Journals (Sweden)

    Victoria M Leavitt

    2011-06-01

    Full Text Available Functionally distinct dorsal and ventral auditory pathways for sound localization (where and sound object recognition (what have been described in non-human primates. A handful of studies have explored differential processing within these streams in humans, with highly inconsistent findings. Stimuli employed have included simple tones, noise bursts and speech sounds, with simulated left-right spatial manipulations, and in some cases participants were not required to actively discriminate the stimuli. Our contention is that these paradigms were not well suited to dissociating processing within the two streams. Our aim here was to determine how early in processing we could find evidence for dissociable pathways using better titrated what and where task conditions. The use of more compelling tasks should allow us to amplify differential processing within the dorsal and ventral pathways. We employed high-density electrical mapping using a relatively large and environmentally realistic stimulus set (seven animal calls delivered from seven free-field spatial locations; with stimulus configuration identical across the where and what tasks. Topographic analysis revealed distinct dorsal and ventral auditory processing networks during the where and what tasks with the earliest point of divergence seen during the N1 component of the auditory evoked response, beginning at approximately 100 ms. While this difference occurred during the N1 timeframe, it was not a simple modulation of N1 amplitude as it displayed a wholly different topographic distribution to that of the N1. Global dissimilarity measures using topographic modulation analysis confirmed that this difference between tasks was driven by a shift in the underlying generator configuration. Minimum norm source reconstruction revealed distinct activations that corresponded well with activity within putative dorsal and ventral auditory structures.

  18. Sound

    CERN Document Server

    Robertson, William C

    2003-01-01

    Muddled about what makes music? Stuck on the study of harmonics? Dumbfounded by how sound gets around? Now you no longer have to struggle to teach concepts you really don't grasp yourself. Sound takes an intentionally light touch to help out all those adults: science teachers, parents wanting to help with homework, and home-schoolers seeking the necessary scientific background to teach middle school physics with confidence. The book introduces sound waves and uses that model to explain sound-related occurrences. Starting with the basics of what causes sound and how it travels, you'll learn how musical instruments work, how sound waves add and subtract, how the human ear works, and even why you can sound like a Munchkin when you inhale helium. Sound is the fourth book in the award-winning Stop Faking It! Series, published by NSTA Press. Like the other popular volumes, it is written by irreverent educator Bill Robertson, who offers this Sound recommendation: One of the coolest activities is whacking a spinning metal rod...

  19. A Change of a Consonant Status: the Bedouinisation of the [j] Sound in the Speech of Kuwaitis: A Case Study

    Directory of Open Access Journals (Sweden)

    Abdulmohsen A. Dashti

    2015-09-01

    Full Text Available In light of sociolinguistic phonological change, the following study investigates the [j] sound in the speech of Kuwaitis: [j] is the predominant form and characterizes the sedentary population, which is made up of both indigenous and non-indigenous groups, while [ʤ] is the realisation of the Bedouins, who are also a part of the indigenous population. Although [ʤ] is the classical variant, it has, for some time, been regarded by Kuwaitis as the stigmatized form and [j] as the one that carries prestige. This study examines the change of status of [j] and [ʤ] in the speech of Kuwaitis. The main hypothesis is that [j] no longer carries prestige. To test this hypothesis, 40 Kuwaitis of different genders, ages, educational backgrounds, and social networks were chosen to be interviewed spontaneously. Their speech was phonetically transcribed and accordingly analyzed quantitatively and qualitatively. Results indicate that the [j] variant is undergoing a change of status and that social parameters, together with the significant political and social changes that Kuwait has undergone recently, have triggered this linguistic shift.

  20. A study on nonlinear characteristics of speech sound with reference to some languages of North East region

    Science.gov (United States)

    Dutta, Rashmi

    INTRODUCTION: Speech science is, in fact, a sub-discipline of nonlinear dynamical systems [2, 104]. There are two different types of dynamical system. A continuous dynamical system may be defined, for the continuous-time case, by the equation ẋ = F(x), where x is a vector of length d defining a point in a d-dimensional space, F is some function (linear or nonlinear) operating on x, and ẋ is the time derivative of x. This system is deterministic, in that it is possible to completely specify its evolution or flow of trajectories in the d-dimensional space, given the initial starting conditions. A discrete dynamical system can be defined as a map, by the process of iteration: x_{n+1} = G(x_n), where x_n is again a vector of length d at time step n, and G is an operator function. Given an initial state x_0, it is possible to calculate the value of x_n for any n > 0. Speech has evolved as a primary form of communication between humans, i.e. speech and hearing are our most used means of communication [104, 114]. Analysis of human speech has been a goal of research during the last few decades [105, 108]. With the rapid development of information technology (IT), human-machine communication using natural speech has received wide attention from both academic and business communities. One highly quantitative approach to characterizing the communications potential of speech is in terms of information theory as introduced by Shannon [C.E. Shannon, "A Mathematical Theory of Communication," Bell System Technical Journal, Vol. 27, pp. 623-656, October 1948]. According to information theory, speech can be represented in terms of its message content, or information. An alternative way of characterizing speech is in terms of the signal carrying the message information, i.e., the acoustic waveform. Although information-theoretic ideas have played a major role in sophisticated communications systems, it is the speech representation based on the waveform, or some
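
    The discrete map x_{n+1} = G(x_n) above can be made concrete with a few lines of code. The sketch below iterates an arbitrary map from an initial state; the logistic map used in the example is a standard illustrative choice of nonlinear G, not one taken from this record.

```python
import numpy as np

def iterate_map(G, x0, n_steps):
    """Iterate the discrete dynamical system x_{n+1} = G(x_n) from state x0."""
    xs = [x0]
    for _ in range(n_steps):
        xs.append(G(xs[-1]))
    return np.array(xs)

# Logistic map, a classic nonlinear G; r = 3.9 yields chaotic trajectories.
trajectory = iterate_map(lambda x: 3.9 * x * (1.0 - x), x0=0.2, n_steps=100)
```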

  1. Near-infrared-spectroscopic study on processing of sounds in the brain; a comparison between native and non-native speakers of Japanese.

    Science.gov (United States)

    Tsunoda, Koichi; Sekimoto, Sotaro; Itoh, Kenji

    2016-06-01

    Conclusions The results suggested that mother-tongue Japanese (MJ) and non-mother-tongue Japanese (non-MJ) speakers differ in their pattern of brain dominance when listening to sounds from the natural world, in particular insect sounds. These results provide significant support for previous findings by Tsunoda (1970). Objectives This study concentrates on listeners who show clear evidence of a 'speech' brain vs a 'music' brain and determines which side is most active in the processing of insect sounds, using near-infrared spectroscopy. Methods The present study uses 2-channel near-infrared spectroscopy (NIRS) to provide a more direct measure of left- and right-brain activity while participants listen to each of three types of sounds: Japanese speech, Western violin music, or insect sounds. Data were obtained from 33 participants who showed laterality on opposite sides for Japanese speech and Western music. Results Results showed that a majority (80%) of the MJ participants exhibited dominance for insect sounds on the side that was dominant for language, while a majority (62%) of the non-MJ participants exhibited dominance for insect sounds on the side that was dominant for music.

  2. A general auditory bias for handling speaker variability in speech? Evidence in humans and songbirds

    NARCIS (Netherlands)

    Kriengwatana, B.; Escudero, P.; Kerkhoven, A.H.; ten Cate, C.

    2015-01-01

    Different speakers produce the same speech sound differently, yet listeners are still able to reliably identify the speech sound. How listeners can adjust their perception to compensate for speaker differences in speech, and whether these compensatory processes are unique only to humans, is still unknown.

  3. Cortical oscillations and entrainment in speech processing during working memory load

    DEFF Research Database (Denmark)

    Hjortkjær, Jens; Märcher-Rørsted, Jonatan; Fuglsang, Søren A

    2018-01-01

    Neuronal oscillations are thought to play an important role in working memory (WM) and speech processing. Listening to speech in real-life situations is often cognitively demanding but it is unknown whether WM load influences how auditory cortical activity synchronizes to speech features. Here, we developed an auditory n-back paradigm to investigate cortical entrainment to speech envelope fluctuations under different degrees of WM load. We measured the electroencephalogram, pupil dilations and behavioural performance from 22 subjects listening to continuous speech with an embedded n-back task. The speech stimuli consisted of long spoken number sequences created to match natural speech in terms of sentence intonation, syllabic rate and phonetic content. To burden different WM functions during speech processing, listeners performed an n-back task on the speech sequences in different levels of background noise.

  4. Neural correlates of audiovisual speech processing in a second language.

    Science.gov (United States)

    Barrós-Loscertales, Alfonso; Ventura-Campos, Noelia; Visser, Maya; Alsius, Agnès; Pallier, Christophe; Avila Rivera, César; Soto-Faraco, Salvador

    2013-09-01

    Neuroimaging studies of audiovisual speech processing have exclusively addressed listeners' native language (L1). Yet, several behavioural studies now show that AV processing plays an important role in non-native (L2) speech perception. The current fMRI study measured brain activity during auditory, visual, audiovisual congruent and audiovisual incongruent utterances in L1 and L2. BOLD responses to congruent AV speech in the pSTS were stronger than in either unimodal condition in both L1 and L2. Yet no differences in AV processing were expressed according to the language background in this area. Instead, the regions in the bilateral occipital lobe had a stronger congruency effect on the BOLD response (congruent higher than incongruent) in L2 as compared to L1. According to these results, language background differences are predominantly expressed in these unimodal regions, whereas the pSTS is similarly involved in AV integration regardless of language dominance. Copyright © 2013 Elsevier Inc. All rights reserved.

  5. Finding the music of speech: Musical knowledge influences pitch processing in speech.

    Science.gov (United States)

    Vanden Bosch der Nederlanden, Christina M; Hannon, Erin E; Snyder, Joel S

    2015-10-01

    Few studies comparing music and language processing have adequately controlled for low-level acoustical differences, making it unclear whether differences in music and language processing arise from domain-specific knowledge, acoustic characteristics, or both. We controlled acoustic characteristics by using the speech-to-song illusion, which often results in a perceptual transformation to song after several repetitions of an utterance. Participants performed a same-different pitch discrimination task for the initial repetition (heard as speech) and the final repetition (heard as song). Better detection was observed for pitch changes that violated rather than conformed to Western musical scale structure, but only when utterances transformed to song, indicating that music-specific pitch representations were activated and influenced perception. This shows that music-specific processes can be activated when an utterance is heard as song, suggesting that the high-level status of a stimulus as either language or music can be behaviorally dissociated from low-level acoustic factors. Copyright © 2015 Elsevier B.V. All rights reserved.

  6. The role of across-frequency envelope processing for speech intelligibility

    DEFF Research Database (Denmark)

    Chabot-Leclerc, Alexandre; Jørgensen, Søren; Dau, Torsten

    2013-01-01

    Speech intelligibility models consist of a preprocessing part that transforms the stimuli into some internal (auditory) representation, and a decision metric that quantifies effects of transmission channel, speech interferers, and auditory processing on the speech intelligibility. Here, two recent speech intelligibility models, the spectro-temporal modulation index [STMI; Elhilali et al. (2003)] and the speech-based envelope power spectrum model [sEPSM; Jørgensen and Dau (2011)], were evaluated in conditions of noisy speech subjected to reverberation, and to nonlinear distortions through either...

  7. The role of across-frequency envelope processing for speech intelligibility

    DEFF Research Database (Denmark)

    Chabot-Leclerc, Alexandre; Jørgensen, Søren; Dau, Torsten

    2013-01-01

    Speech intelligibility models consist of a preprocessing part that transforms the stimuli into some internal (auditory) representation, and a decision metric that quantifies effects of transmission channel, speech interferers, and auditory processing on the speech intelligibility. Here, two recent speech intelligibility models, the spectro-temporal modulation index (STMI; Elhilali et al., 2003) and the speech-based envelope power spectrum model (sEPSM; Jørgensen and Dau, 2011), were evaluated in conditions of noisy speech subjected to reverberation, and to nonlinear distortions through either...

  8. Atypical speech versus non-speech detection and discrimination in 4- to 6-yr-old children with autism spectrum disorder: An ERP study.

    Directory of Open Access Journals (Sweden)

    Alena Galilee

    Full Text Available Previous event-related potential (ERP) research utilizing oddball stimulus paradigms suggests diminished processing of speech versus non-speech sounds in children with an Autism Spectrum Disorder (ASD). However, brain mechanisms underlying these speech processing abnormalities, and to what extent they are related to poor language abilities in this population, remain unknown. In the current study, we utilized a novel paired repetition paradigm in order to investigate ERP responses associated with the detection and discrimination of speech and non-speech sounds in 4- to 6-year-old children with ASD, compared with gender and verbal age matched controls. ERPs were recorded while children passively listened to pairs of stimuli that were either both speech sounds, both non-speech sounds, speech followed by non-speech, or non-speech followed by speech. Control participants exhibited N330 match/mismatch responses measured from temporal electrodes, reflecting speech versus non-speech detection, bilaterally, whereas children with ASD exhibited this effect only over temporal electrodes in the left hemisphere. Furthermore, while the control groups exhibited match/mismatch effects at approximately 600 ms (central N600, temporal P600) when a non-speech sound was followed by a speech sound, these effects were absent in the ASD group. These findings suggest that children with ASD fail to activate right hemisphere mechanisms, likely associated with social or emotional aspects of speech detection, when distinguishing non-speech from speech stimuli. Together, these results demonstrate the presence of atypical speech versus non-speech processing in children with ASD when compared with typically developing children matched on verbal age.

  9. Atypical speech versus non-speech detection and discrimination in 4- to 6-yr-old children with autism spectrum disorder: An ERP study.

    Science.gov (United States)

    Galilee, Alena; Stefanidou, Chrysi; McCleery, Joseph P

    2017-01-01

    Previous event-related potential (ERP) research utilizing oddball stimulus paradigms suggests diminished processing of speech versus non-speech sounds in children with an Autism Spectrum Disorder (ASD). However, brain mechanisms underlying these speech processing abnormalities, and to what extent they are related to poor language abilities in this population remain unknown. In the current study, we utilized a novel paired repetition paradigm in order to investigate ERP responses associated with the detection and discrimination of speech and non-speech sounds in 4- to 6-year old children with ASD, compared with gender and verbal age matched controls. ERPs were recorded while children passively listened to pairs of stimuli that were either both speech sounds, both non-speech sounds, speech followed by non-speech, or non-speech followed by speech. Control participants exhibited N330 match/mismatch responses measured from temporal electrodes, reflecting speech versus non-speech detection, bilaterally, whereas children with ASD exhibited this effect only over temporal electrodes in the left hemisphere. Furthermore, while the control groups exhibited match/mismatch effects at approximately 600 ms (central N600, temporal P600) when a non-speech sound was followed by a speech sound, these effects were absent in the ASD group. These findings suggest that children with ASD fail to activate right hemisphere mechanisms, likely associated with social or emotional aspects of speech detection, when distinguishing non-speech from speech stimuli. Together, these results demonstrate the presence of atypical speech versus non-speech processing in children with ASD when compared with typically developing children matched on verbal age.

  10. Anterior paracingulate and cingulate cortex mediates the effects of cognitive load on speech sound discrimination.

    Science.gov (United States)

    Gennari, Silvia P; Millman, Rebecca E; Hymers, Mark; Mattys, Sven L

    2018-06-11

    Perceiving speech while performing another task is a common challenge in everyday life. How the brain controls resource allocation during speech perception remains poorly understood. Using functional magnetic resonance imaging (fMRI), we investigated the effect of cognitive load on speech perception by examining brain responses of participants performing a phoneme discrimination task and a visual working memory task simultaneously. The visual task involved holding either a single meaningless image in working memory (low cognitive load) or four different images (high cognitive load). Performing the speech task under high load, compared to low load, resulted in decreased activity in posterior superior temporal gyrus and posterior middle temporal gyrus (pSTG/pMTG) and increased activity in visual occipital cortex and two regions known to contribute to visual attention regulation: the superior parietal lobule (SPL) and the paracingulate and anterior cingulate gyrus (PaCG, ACG). Critically, activity in PaCG/ACG was correlated with performance in the visual task and with activity in pSTG/pMTG: increased activity in PaCG/ACG was observed for individuals with poorer visual performance and with decreased activity in pSTG/pMTG. Moreover, activity in a pSTG/pMTG seed region showed psychophysiological interactions with areas of the PaCG/ACG, with stronger interaction in the high-load than the low-load condition. These findings show that the acoustic analysis of speech is affected by the demands of a concurrent visual task and that the PaCG/ACG plays a role in allocating cognitive resources to concurrent auditory and visual information. Copyright © 2018. Published by Elsevier Inc.

  11. Using ILD or ITD Cues for Sound Source Localization and Speech Understanding in a Complex Listening Environment by Listeners with Bilateral and with Hearing-Preservation Cochlear Implants

    Science.gov (United States)

    Loiselle, Louise H.; Dorman, Michael F.; Yost, William A.; Cook, Sarah J.; Gifford, Rene H.

    2016-01-01

    Purpose: To assess the role of interaural time differences and interaural level differences in (a) sound-source localization, and (b) speech understanding in a cocktail party listening environment for listeners with bilateral cochlear implants (CIs) and for listeners with hearing-preservation CIs. Methods: Eleven bilateral listeners with MED-EL…

  12. "When He's around His Brothers ... He's Not so Quiet": The Private and Public Worlds of School-Aged Children with Speech Sound Disorder

    Science.gov (United States)

    McLeod, Sharynne; Daniel, Graham; Barr, Jacqueline

    2013-01-01

    Children interact with people in context: including home, school, and in the community. Understanding children's relationships within context is important for supporting children's development. Using child-friendly methodologies, the purpose of this research was to understand the lives of children with speech sound disorder (SSD) in context.…

  13. Contributions of Letter-Speech Sound Learning and Visual Print Tuning to Reading Improvement: Evidence from Brain Potential and Dyslexia Training Studies

    NARCIS (Netherlands)

    Fraga González, G.; Žarić, G.; Tijms, J.; Bonte, M.; van der Molen, M.W.

    We use a neurocognitive perspective to discuss the contribution of learning letter-speech sound (L-SS) associations and visual specialization in the initial phases of reading in dyslexic children. We review findings from associative learning studies on related cognitive skills important for

  14. AGGLOMERATIVE CLUSTERING OF SOUND RECORD SPEECH SEGMENTS BASED ON BAYESIAN INFORMATION CRITERION

    Directory of Open Access Journals (Sweden)

    O. Yu. Kydashev

    2013-01-01

    Full Text Available This paper presents a detailed description of the implementation of an agglomerative clustering system for speech segments based on the Bayesian information criterion. Numerical experiment results are given for different acoustic features and for the use of both full and diagonal covariance matrices. A diarization error rate (DER) of 6.4% was achieved by the designed system on audio recordings of radio «Svoboda».
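
    The merge decision at the core of BIC-based agglomerative clustering compares one shared Gaussian against two separate Gaussians for a pair of segments; the pair is merged when the resulting ΔBIC is negative. The sketch below follows the textbook ΔBIC formulation (Chen and Gopalakrishnan), not the authors' implementation, and the penalty weight λ is an assumption.

```python
import numpy as np

def delta_bic(x1, x2, lam=1.0):
    """Delta-BIC between 'one Gaussian' and 'two Gaussians' models for two
    feature segments x1, x2 of shape (frames, dims). Merge when result < 0.
    Segments need enough frames for stable covariance estimates."""
    n1, n2 = len(x1), len(x2)
    n, d = n1 + n2, x1.shape[1]
    x = np.vstack([x1, x2])
    logdet = lambda s: np.linalg.slogdet(np.cov(s, rowvar=False))[1]
    # Model-complexity penalty: mean (d) plus covariance (d*(d+1)/2) parameters.
    penalty = 0.5 * lam * (d + 0.5 * d * (d + 1)) * np.log(n)
    return (0.5 * n * logdet(x)
            - 0.5 * n1 * logdet(x1)
            - 0.5 * n2 * logdet(x2)
            - penalty)
```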

  15. A multigenerational family study of oral and hand motor sequencing ability provides evidence for a familial speech sound disorder subtype

    Science.gov (United States)

    Peter, Beate; Raskind, Wendy H.

    2011-01-01

    Purpose To evaluate phenotypic expressions of speech sound disorder (SSD) in multigenerational families with evidence of familial forms of SSD. Method Members of five multigenerational families (N = 36) produced rapid sequences of monosyllables and disyllables and tapped computer keys with repetitive and alternating movements. Results Measures of repetitive and alternating motor speed were correlated within and between the two motor systems. Repetitive and alternating motor speeds increased in children and decreased in adults as a function of age. In two families with children who had severe speech deficits consistent with disrupted praxis, slowed alternating, but not repetitive, oral movements characterized most of the affected children and adults with a history of SSD, and slowed alternating hand movements were seen in some of the biologically related participants as well. Conclusion Results are consistent with a familial motor-based SSD subtype with incomplete penetrance, motivating new clinical questions about motor-based intervention not only in the oral but also the limb system. PMID:21909176

  16. Cognitive and linguistic sources of variance in 2-year-olds’ speech-sound discrimination: a preliminary investigation.

    Science.gov (United States)

    Lalonde, Kaylah; Holt, Rachael Frush

    2014-02-01

    This preliminary investigation explored potential cognitive and linguistic sources of variance in 2-year-olds' speech-sound discrimination by using the toddler change/no-change procedure and examined whether modifications would result in a procedure that can be used consistently with younger 2-year-olds. Twenty typically developing 2-year-olds completed the newly modified toddler change/no-change procedure. Behavioral tests and parent-report questionnaires were used to measure several cognitive and linguistic constructs. Stepwise linear regression was used to relate discrimination sensitivity to the cognitive and linguistic measures. In addition, discrimination results from the current experiment were compared with those from 2-year-old children tested in a previous experiment. Receptive vocabulary and working memory explained 56.6% of the variance in discrimination performance. Performance on the modified toddler change/no-change procedure used in the current experiment did not differ from that in a previous investigation, which used the original version of the procedure. The relationship between speech discrimination and receptive vocabulary and working memory provides further evidence that the procedure is sensitive to the strength of perceptual representations. The role of working memory might also suggest that there are specific subject-related, nonsensory factors limiting the applicability of the procedure to children who have not reached the necessary levels of cognitive and linguistic development.

  17. Understanding speech when wearing communication headsets and hearing protectors with subband processing.

    Science.gov (United States)

    Brammer, Anthony J; Yu, Gongqiang; Bernstein, Eric R; Cherniack, Martin G; Peterson, Donald R; Tufts, Jennifer B

    2014-08-01

    An adaptive, delayless, subband feed-forward control structure is employed to improve the speech signal-to-noise ratio (SNR) in the communication channel of a circumaural headset/hearing protector (HPD) from 90 Hz to 11.3 kHz, and to provide active noise control (ANC) from 50 to 800 Hz to complement the passive attenuation of the HPD. The task involves optimizing the speech SNR for each communication channel subband, subject to limiting the maximum sound level at the ear, maintaining a speech SNR preferred by users, and reducing large inter-band gain differences to improve speech quality. The performance of a proof-of-concept device has been evaluated in a pseudo-diffuse sound field when worn by human subjects under conditions of environmental noise and speech that do not pose a risk to hearing, and by simulation for other conditions. For the environmental noises employed in this study, subband speech SNR control combined with subband ANC produced greater improvement in word scores than subband ANC alone, and improved the consistency of word scores across subjects. The simulation employed a subject-specific linear model, and predicted that word scores are maintained in excess of 90% for sound levels outside the HPD of up to ∼115 dBA.
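
    The per-subband optimization described above can be illustrated with a simplified gain-allocation rule: bands where the speech SNR falls short of a preferred target are boosted, inter-band gain differences are limited, and the overall output is scaled back if a maximum ear-level limit would be exceeded. This is a hypothetical sketch of the constraints named in the abstract, not the authors' adaptive control structure, and all numeric limits are assumptions.

```python
import numpy as np

def subband_gains(speech_pwr, noise_pwr, target_snr_db=15.0, max_total_db=85.0):
    """Per-band gains that push each band toward a target speech SNR, limit
    inter-band gain spread, and cap the total output level. Inputs are
    per-subband speech and noise power estimates (illustrative interface)."""
    snr_db = 10 * np.log10(speech_pwr / noise_pwr)
    gain_db = np.clip(target_snr_db - snr_db, 0.0, 20.0)  # boost only, 20 dB max
    # Reduce large inter-band gain differences (assumed 6 dB spread limit).
    gain_db = np.minimum(gain_db, gain_db.min() + 6.0)
    gains = 10 ** (gain_db / 20)
    out_level_db = 10 * np.log10(np.sum(gains**2 * (speech_pwr + noise_pwr)))
    if out_level_db > max_total_db:  # enforce maximum sound level at the ear
        gains *= 10 ** ((max_total_db - out_level_db) / 20)
    return gains
```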

  18. Acoustic processing of temporally modulated sounds in infants: evidence from a combined near-infrared spectroscopy and EEG study

    Directory of Open Access Journals (Sweden)

    Silke eTelkemeyer

    2011-04-01

    Full Text Available Speech perception requires rapid extraction of the linguistic content from the acoustic signal. The ability to efficiently process rapid changes in auditory information is important for decoding speech and thereby crucial during language acquisition. Investigating functional networks of speech perception in infancy might elucidate neuronal ensembles supporting perceptual abilities that gate language acquisition. Interhemispheric specializations for language have been demonstrated in infants. How these asymmetries are shaped by basic temporal acoustic properties is under debate. We recently provided evidence that newborns process non-linguistic sounds sharing temporal features with language in a differential and lateralized fashion. The present study used the same material while measuring brain responses of 6- and 3-month-old infants using simultaneous recordings of electroencephalography (EEG) and near-infrared spectroscopy (NIRS). NIRS reveals that the lateralization observed in newborns remains constant over the first months of life. While fast acoustic modulations elicit bilateral neuronal activations, slow modulations lead to right-lateralized responses. Additionally, auditory evoked potentials and oscillatory EEG responses show differential responses for fast and slow modulations, indicating a sensitivity to temporal acoustic variations. Oscillatory responses reveal an effect of development, that is, 6- but not 3-month-old infants show stronger theta-band desynchronization for slowly modulated sounds. Whether this developmental effect is due to increasing fine-grained perception for spectrotemporal sounds in general remains speculative. Our findings support the notion that a more general specialization for acoustic properties can be considered the basis for lateralization of speech perception. The results show that concurrent assessment of vascular-based imaging and electrophysiological responses has great potential in the research on language

  19. By the sound of it. An ERP investigation of human action sound processing in 7-month-old infants

    Directory of Open Access Journals (Sweden)

    Elena Geangu

    2015-04-01

    Full Text Available Recent evidence suggests that human adults perceive human action sounds as a distinct category from human vocalizations, environmental, and mechanical sounds, activating different neural networks (Engel et al., 2009; Lewis et al., 2011). Yet, little is known about the development of such specialization. Using event-related potentials (ERP), this study investigated neural correlates of 7-month-olds' processing of human action (HA) sounds in comparison to human vocalizations (HV), environmental (ENV), and mechanical (MEC) sounds. Relative to the other categories, HA sounds led to increased positive amplitudes between 470 and 570 ms post-stimulus onset at left anterior temporal locations, while HV led to increased negative amplitudes at the more posterior temporal locations in both hemispheres. Collectively, human-produced sounds (HA + HV) led to significantly different response profiles compared to non-living sound sources (ENV + MEC) at parietal and frontal locations in both hemispheres. Overall, by 7 months of age human action sounds are being differentially processed in the brain, consistent with a dichotomy for processing living versus non-living things. This provides novel evidence regarding the typical categorical processing of socially relevant sounds.

  20. Sound production treatment for acquired apraxia of speech: Effects of blocked and random practice on multisyllabic word production.

    Science.gov (United States)

    Wambaugh, Julie; Nessler, Christina; Wright, Sandra; Mauszycki, Shannon; DeLong, Catharine

    2016-10-01

    This study was designed to examine the effects of practice schedule, blocked vs random, on outcomes of a behavioural treatment for acquired apraxia of speech (AOS), Sound Production Treatment (SPT). SPT was administered to four speakers with chronic AOS and aphasia in the context of multiple baseline designs across behaviours and participants. Treatment was applied to multiple sound errors within three-to-five syllable words. All participants received both practice schedules: SPT-Random (SPT-R) and SPT-Blocked (SPT-B). Improvements in accuracy of word production for trained items were found for both treatment conditions for all participants. One participant demonstrated better maintenance effects associated with SPT-R. Response generalisation to untreated words varied across participants, but was generally modest and unstable. Stimulus generalisation to production of words in sentence completion was positive for three of the participants. Stimulus generalisation to production of phrases was positive for two of the participants. Findings provide additional efficacy data regarding SPT's effects on articulation of treated items and extend knowledge of the treatment's effects when applied to multiple targets within multisyllabic words.

  1. Working memory and intelligibility of hearing-aid processed speech

    Science.gov (United States)

    Souza, Pamela E.; Arehart, Kathryn H.; Shen, Jing; Anderson, Melinda; Kates, James M.

    2015-01-01

    Previous work suggested that individuals with low working memory capacity may be at a disadvantage in adverse listening environments, including situations with background noise or substantial modification of the acoustic signal. This study explored the relationship between patient factors (including working memory capacity) and intelligibility and quality of modified speech for older individuals with sensorineural hearing loss. The modification was created using a combination of hearing aid processing [wide-dynamic range compression (WDRC) and frequency compression (FC)] applied to sentences in multitalker babble. The extent of signal modification was quantified via an envelope fidelity index. We also explored the contribution of components of working memory by including measures of processing speed and executive function. We hypothesized that listeners with low working memory capacity would perform more poorly than those with high working memory capacity across all situations, and would also be differentially affected by high amounts of signal modification. Results showed a significant effect of working memory capacity for speech intelligibility, and an interaction between working memory, amount of hearing loss and signal modification. Signal modification was the major predictor of quality ratings. These data add to the literature on hearing-aid processing and working memory by suggesting that the working memory-intelligibility effects may be related to aggregate signal fidelity, rather than to the specific signal manipulation. They also suggest that for individuals with low working memory capacity, sensorineural loss may be most appropriately addressed with WDRC and/or FC parameters that maintain the fidelity of the signal envelope. PMID:25999874
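
    The envelope fidelity index used here to quantify signal modification is not defined in this record. A minimal stand-in, assuming fidelity can be summarized as the correlation between the low-pass amplitude envelopes of the unprocessed and processed signals, could look like the following; it is not the index from the study.

```python
import numpy as np
from scipy.signal import butter, sosfiltfilt, hilbert

def envelope_fidelity(reference, processed, fs, env_cutoff_hz=50.0):
    """Correlation between the low-pass Hilbert envelopes of a reference and
    a processed signal (same length, same fs). A stand-in metric only."""
    sos = butter(4, env_cutoff_hz, btype="lowpass", fs=fs, output="sos")
    env = lambda x: sosfiltfilt(sos, np.abs(hilbert(x)))
    e_ref, e_proc = env(reference), env(processed)
    return np.corrcoef(e_ref, e_proc)[0, 1]  # 1.0 = envelope fully preserved
```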

  2. Neural Correlates of Indicators of Sound Change in Cantonese: Evidence from Cortical and Subcortical Processes

    OpenAIRE

    Maggu, Akshay R.; Liu, Fang; Antoniou, Mark; Wong, Patrick C. M.

    2016-01-01

    Across time, languages undergo changes in phonetic, syntactic, and semantic dimensions. Social, cognitive, and cultural factors contribute to sound change, a phenomenon in which the phonetics of a language undergo changes over time. Individuals who misperceive and produce speech in a slightly divergent manner (called innovators) contribute to variability in the society, eventually leading to sound change. However, the cause of variability in these individuals is still unknown. In this study, ...

  3. The effects of noise exposure and musical training on suprathreshold auditory processing and speech perception in noise.

    Science.gov (United States)

    Yeend, Ingrid; Beach, Elizabeth Francis; Sharma, Mridula; Dillon, Harvey

    2017-09-01

    Recent animal research has shown that exposure to single episodes of intense noise causes cochlear synaptopathy without affecting hearing thresholds. It has been suggested that the same may occur in humans. If so, it is hypothesized that this would result in impaired encoding of sound and lead to difficulties hearing at suprathreshold levels, particularly in challenging listening environments. The primary aim of this study was to investigate the effect of noise exposure on auditory processing, including the perception of speech in noise, in adult humans. A secondary aim was to explore whether musical training might improve some aspects of auditory processing and thus counteract or ameliorate any negative impacts of noise exposure. In a sample of 122 participants (63 female) aged 30-57 years with normal or near-normal hearing thresholds, we conducted audiometric tests, including tympanometry, audiometry, acoustic reflexes, otoacoustic emissions and medial olivocochlear responses. We also assessed temporal and spectral processing, by determining thresholds for detection of amplitude modulation and temporal fine structure. We assessed speech-in-noise perception, and conducted tests of attention, memory and sentence closure. We also calculated participants' accumulated lifetime noise exposure and administered questionnaires to assess self-reported listening difficulty and musical training. The results showed no clear link between participants' lifetime noise exposure and performance on any of the auditory processing or speech-in-noise tasks. Musical training was associated with better performance on the auditory processing tasks, but not on the speech-in-noise perception tasks. The results indicate that sentence closure skills, working memory, attention, extended high frequency hearing thresholds and medial olivocochlear suppression strength are important factors that are related to the ability to process speech in noise. Crown Copyright © 2017. Published by

  4. A causal test of the motor theory of speech perception: a case of impaired speech production and spared speech perception.

    Science.gov (United States)

    Stasenko, Alena; Bonn, Cory; Teghipco, Alex; Garcea, Frank E; Sweet, Catherine; Dombovy, Mary; McDonough, Joyce; Mahon, Bradford Z

    2015-01-01

    The debate about the causal role of the motor system in speech perception has been reignited by demonstrations that motor processes are engaged during the processing of speech sounds. Here, we evaluate which aspects of auditory speech processing are affected, and which are not, in a stroke patient with dysfunction of the speech motor system. We found that the patient showed a normal phonemic categorical boundary when discriminating two non-words that differ by a minimal pair (e.g., ADA-AGA). However, using the same stimuli, the patient was unable to identify or label the non-word stimuli (using a button-press response). A control task showed that he could identify speech sounds by speaker gender, ruling out a general labelling impairment. These data suggest that while the motor system is not causally involved in perception of the speech signal, it may be used when other cues (e.g., meaning, context) are not available.

  5. Independence of early speech processing from word meaning.

    Science.gov (United States)

    Travis, Katherine E; Leonard, Matthew K; Chan, Alexander M; Torres, Christina; Sizemore, Marisa L; Qu, Zhe; Eskandar, Emad; Dale, Anders M; Elman, Jeffrey L; Cash, Sydney S; Halgren, Eric

    2013-10-01

    We combined magnetoencephalography (MEG) with magnetic resonance imaging and electrocorticography to separate in anatomy and latency 2 fundamental stages underlying speech comprehension. The first acoustic-phonetic stage is selective for words relative to control stimuli individually matched on acoustic properties. It begins ∼60 ms after stimulus onset and is localized to middle superior temporal cortex. It was replicated in another experiment, but is strongly dissociated from the response to tones in the same subjects. Within the same task, semantic priming of the same words by a related picture modulates cortical processing in a broader network, but this does not begin until ∼217 ms. The earlier onset of acoustic-phonetic processing compared with lexico-semantic modulation was significant in each individual subject. The MEG source estimates were confirmed with intracranial local field potential and high gamma power responses acquired in 2 additional subjects performing the same task. These recordings further identified sites within superior temporal cortex that responded only to the acoustic-phonetic contrast at short latencies, or the lexico-semantic at long. The independence of the early acoustic-phonetic response from semantic context suggests a limited role for lexical feedback in early speech perception.

  6. The impact of orthographic knowledge on speech processing

    Directory of Open Access Journals (Sweden)

    Régine Kolinsky

    2012-12-01

    Full Text Available http://dx.doi.org/10.5007/2175-8026.2012n63p161 The levels-of-processing approach to speech processing (cf. Kolinsky, 1998) distinguishes three levels, from bottom to top: perception, recognition (which involves activation of stored knowledge), and formal explicit analysis or comparison (which belongs to metalinguistic ability), and assumes that only the first is immune to literacy-dependent knowledge. In this contribution, we first briefly review the main ideas and evidence supporting the role of learning to read in the alphabetic system in the development of conscious representations of phonemes, and we contrast conscious and unconscious representations of phonemes. Then, we examine in detail recent compelling behavioral and neuroscientific evidence for the involvement of orthographic representations in the recognition of spoken words. We conclude by arguing that there is a strong need for theoretical re-elaboration of models of speech recognition, which typically have ignored the influence of reading acquisition.

  7. Music and speech listening enhance the recovery of early sensory processing after stroke.

    Science.gov (United States)

    Särkämö, Teppo; Pihko, Elina; Laitinen, Sari; Forsblom, Anita; Soinila, Seppo; Mikkonen, Mikko; Autti, Taina; Silvennoinen, Heli M; Erkkilä, Jaakko; Laine, Matti; Peretz, Isabelle; Hietanen, Marja; Tervaniemi, Mari

    2010-12-01

    Our surrounding auditory environment has a dramatic influence on the development of basic auditory and cognitive skills, but little is known about how it influences the recovery of these skills after neural damage. Here, we studied the long-term effects of daily music and speech listening on auditory sensory memory after middle cerebral artery (MCA) stroke. In the acute recovery phase, 60 patients who had middle cerebral artery stroke were randomly assigned to a music listening group, an audio book listening group, or a control group. Auditory sensory memory, as indexed by the magnetic MMN (MMNm) response to changes in sound frequency and duration, was measured 1 week (baseline), 3 months, and 6 months after the stroke with whole-head magnetoencephalography recordings. Fifty-four patients completed the study. Results showed that the amplitude of the frequency MMNm increased significantly more in both music and audio book groups than in the control group during the 6-month poststroke period. In contrast, the duration MMNm amplitude increased more in the audio book group than in the other groups. Moreover, changes in the frequency MMNm amplitude correlated significantly with the behavioral improvement of verbal memory and focused attention induced by music listening. These findings demonstrate that merely listening to music and speech after neural damage can induce long-term plastic changes in early sensory processing, which, in turn, may facilitate the recovery of higher cognitive functions. The neural mechanisms potentially underlying this effect are discussed.

  8. Speech processing system demonstrated by positron emission tomography (PET). A review of the literature

    International Nuclear Information System (INIS)

    Hirano, Shigeru; Naito, Yasushi; Kojima, Hisayoshi

    1996-01-01

    We review the literature on speech processing in the central nervous system as demonstrated by positron emission tomography (PET). Activation studies using PET have proved to be a useful and non-invasive method of investigating the speech processing system in normal subjects. In speech recognition, the auditory association areas and the lexico-semantic areas called Wernicke's area play important roles. Broca's area, motor areas, supplementary motor cortices and the prefrontal area have been shown to be related to speech output. Visual speech stimulation activates not only the visual association areas but also the temporal region and prefrontal area, especially during lexico-semantic processing. Higher-level speech processing, such as conversation, which includes auditory processing, vocalization and thinking, activates broad areas in both hemispheres. This paper also discusses problems to be resolved in the future. (author) 42 refs

  9. Adult-like processing of time-compressed speech by newborns: A NIRS study.

    Science.gov (United States)

    Issard, Cécile; Gervain, Judit

    2017-06-01

    Humans can adapt to a wide range of variations in the speech signal, maintaining an invariant representation of the linguistic information it contains. Among them, adaptation to rapid or time-compressed speech has been well studied in adults, but the developmental origin of this capacity remains unknown. Does this ability depend on experience with speech (if yes, as heard in utero or as heard postnatally), with sounds in general, or is it experience-independent? Using near-infrared spectroscopy, we show that the newborn brain can discriminate between three different compression rates: normal (100% of the original duration), moderately compressed (60% of the original duration) and highly compressed (30% of the original duration). Even more interestingly, responses to normal and moderately compressed speech are similar, showing a canonical hemodynamic response in the left temporoparietal, right frontal and right temporal cortex, while responses to highly compressed speech are inverted, showing a decrease in oxyhemoglobin concentration. These results mirror those found in adults, who readily adapt to moderately compressed, but not to highly compressed speech, showing that adaptation to time-compressed speech requires little or no experience with speech, and happens at an auditory, and not at a more abstract linguistic, level. Copyright © 2016 The Authors. Published by Elsevier Ltd. All rights reserved.
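
    Stimuli of this kind are typically produced by time-compressing a recording without altering its pitch. With the librosa library, for example, a version at 60% of the original duration corresponds to a stretch rate of 1/0.6; the record does not state which tool the authors used, so treat this as an illustrative recipe (the input filename is hypothetical).

```python
import librosa

# Load a sentence recording at its native sampling rate (hypothetical file).
y, fs = librosa.load("sentence.wav", sr=None)

# rate > 1 shortens the signal: duration_out = duration_in / rate.
moderate = librosa.effects.time_stretch(y, rate=1 / 0.60)  # 60% of original duration
high = librosa.effects.time_stretch(y, rate=1 / 0.30)      # 30% of original duration
```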

  10. Sounds Exaggerate Visual Shape

    Science.gov (United States)

    Sweeny, Timothy D.; Guzman-Martinez, Emmanuel; Ortega, Laura; Grabowecky, Marcia; Suzuki, Satoru

    2012-01-01

    While perceiving speech, people see mouth shapes that are systematically associated with sounds. In particular, a vertically stretched mouth produces a /woo/ sound, whereas a horizontally stretched mouth produces a /wee/ sound. We demonstrate that hearing these speech sounds alters how we see aspect ratio, a basic visual feature that contributes…

  11. Music and Sound in Time Processing of Children with ADHD.

    Science.gov (United States)

    Carrer, Luiz Rogério Jorgensen

    2015-01-01

    ADHD involves cognitive and behavioral aspects with impairments in many environments of children's and their families' lives. Music, with its playful, spontaneous, affective, motivational, temporal, and rhythmic dimensions, can be of great help for studying the aspects of time processing in ADHD. In this article, we studied time processing with simple sounds and music in children with ADHD, with the hypothesis that children with ADHD perform differently from children with typical development in tasks of time estimation and production. The main objective was to develop sound and musical tasks to evaluate and correlate the performance of children with ADHD, with and without methylphenidate, compared to a control group with typical development. The study involved 36 participants aged 6-14 years, recruited at NANI-UNIFESP/SP, subdivided into three groups of 12 children each. Data were collected through a musical keyboard using Logic Audio Software 9.0 on a computer that recorded the participants' performance in the tasks. Tasks were divided into sections: spontaneous time production, time estimation with simple sounds, and time estimation with music. Results showed that: (1) the performance of the ADHD groups in temporal estimation of simple sounds at short time intervals (30 ms) was statistically lower than that of the control group (p < 0.05); (2) in the task comparing musical excerpts of the same duration (7 s), the ADHD groups judged the excerpts as longer when the musical notes had longer durations, while in the control group perceived duration was related to the density of musical notes in the excerpt. The positive average performance observed in the three groups in most tasks perhaps indicates the possibility that music can, in some way, positively modulate the symptoms of inattention in ADHD.

  12. Sound of Silence: Comparison of ICT and speech deprivation among students

    Directory of Open Access Journals (Sweden)

    Tihana Brkljačić

    2017-12-01

    Full Text Available The aim of the study was twofold: to describe self-reported habits of ICT use in everyday life and to analyze feelings and behavior triggered by ICT and speech deprivation. The study was conducted on three randomly selected groups of students with different tasks: the Without Speaking (W/S) group (n=10) spent a day without talking to anyone; the Without Technology (W/T) group (n=13) spent a day without using any kind of ICT; while the third group was a control group (n=10) and had no restrictions. The participants' task in all groups was to write a diary detailing their feelings, thoughts and behaviors related to their group's conditions. Before the experiment, students reported their ICT-related habits. Right after groups were assigned, they reported their task-related impressions. During the experiment, participants wrote diary records at three time-points. All participants used ICT on a daily basis, and most were online all the time. Dominant ICT activities were communication with friends and family and studying, followed by listening to music and watching films. Speech deprivation was a more difficult task compared to ICT deprivation, resulting in more drop-outs and more negative emotions. However, participants in W/S expected the task to be difficult, and some of them actually reported positive experiences, but for others it was a very difficult, lonesome and terrifying experience. About half of the students in W/T claimed that the task was more difficult than they had expected, and some of them realized that they are dysfunctional without technology, and probably addicted to it.

  13. The Cortical Organization of Speech Processing: Feedback Control and Predictive Coding in the Context of a Dual-Stream Model

    Science.gov (United States)

    Hickok, Gregory

    2012-01-01

    Speech recognition is an active process that involves some form of predictive coding. This statement is relatively uncontroversial. What is less clear is the source of the prediction. The dual-stream model of speech processing suggests that there are two possible sources of predictive coding in speech perception: the motor speech system and the…

  14. Auditory processing disorders: an update for speech-language pathologists.

    Science.gov (United States)

    DeBonis, David A; Moncrieff, Deborah

    2008-02-01

    Unanswered questions regarding the nature of auditory processing disorders (APDs), how best to identify at-risk students, how best to diagnose and differentiate APDs from other disorders, and concerns about the lack of valid treatments have resulted in ongoing confusion and skepticism about the diagnostic validity of this label. This poses challenges for speech-language pathologists (SLPs) who are working with school-age children and whose scope of practice includes APD screening and intervention. The purpose of this article is to address some of the questions commonly asked by SLPs regarding APDs in school-age children. This article is also intended to serve as a resource for SLPs to be used in deciding what role they will or will not play with respect to APDs in school-age children. The methodology used in this article included a computerized database review of the latest published information on APD, with an emphasis on the work of established researchers and expert panels, including articles from the American Speech-Language-Hearing Association and the American Academy of Audiology. The article concludes with the authors' recommendations for continued research and their views on the appropriate role of the SLP in performing careful screening, making referrals, and supporting intervention.

  15. Visemic Processing in Audiovisual Discrimination of Natural Speech: A Simultaneous fMRI-EEG Study

    Science.gov (United States)

    Dubois, Cyril; Otzenberger, Helene; Gounot, Daniel; Sock, Rudolph; Metz-Lutz, Marie-Noelle

    2012-01-01

    In a noisy environment, visual perception of articulatory movements improves natural speech intelligibility. Parallel to phonemic processing based on auditory signal, visemic processing constitutes a counterpart based on "visemes", the distinctive visual units of speech. Aiming at investigating the neural substrates of visemic processing in a…

  16. Phonological simplifications, apraxia of speech and the interaction between phonological and phonetic processing.

    Science.gov (United States)

    Galluzzi, Claudia; Bureca, Ivana; Guariglia, Cecilia; Romani, Cristina

    2015-05-01

    Research on aphasia has struggled to identify apraxia of speech (AoS) as an independent deficit affecting a processing level separate from phonological assembly and motor implementation. This is because AoS is characterized by both phonological and phonetic errors and, therefore, can be interpreted as a combination of deficits at the phonological and the motoric level rather than as an independent impairment. We apply novel psycholinguistic analyses to the perceptually phonological errors made by 24 Italian aphasic patients. We show that only patients with a relatively high rate (>10%) of phonetic errors make sound errors which simplify the phonology of the target. Moreover, simplifications are strongly associated with other variables indicative of articulatory difficulties, such as a predominance of errors on consonants rather than vowels, but not with other measures, such as the rate of words reproduced correctly or the rate of lexical errors. These results indicate that sound errors cannot arise at a single phonological level because they differ across patients. Instead, the different patterns: (1) provide evidence for separate impairments and the existence of a level of articulatory planning/programming intermediate between phonological selection and motor implementation; (2) validate AoS as an independent impairment at this level, characterized by phonetic errors and phonological simplifications; (3) support the claim that linguistic principles of complexity have an articulatory basis, since they only apply in patients with associated articulatory difficulties. Copyright © 2015 The Authors. Published by Elsevier Ltd. All rights reserved.

  17. Barista: A Framework for Concurrent Speech Processing by USC-SAIL

    OpenAIRE

    Can, Doğan; Gibson, James; Vaz, Colin; Georgiou, Panayiotis G.; Narayanan, Shrikanth S.

    2014-01-01

    We present Barista, an open-source framework for concurrent speech processing based on the Kaldi speech recognition toolkit and the libcppa actor library. With Barista, we aim to provide an easy-to-use, extensible framework for constructing highly customizable concurrent (and/or distributed) networks for a variety of speech processing tasks. Each Barista network specifies a flow of data between simple actors, concurrent entities communicating by message passing, modeled after Kaldi tools. Lev...
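
    The actor pattern described here (simple concurrent entities communicating only by message passing) can be mimicked in a few lines with threads and queues. The sketch below illustrates the pattern with a toy two-stage pipeline; it is not Barista's C++ API, and the stage functions are placeholders.

```python
import queue
import threading

def actor(inbox, outbox, work):
    """A minimal actor: consume messages, transform them, pass them on.
    A None message signals shutdown and is propagated downstream."""
    while (msg := inbox.get()) is not None:
        outbox.put(work(msg))
    outbox.put(None)

q1, q2, q3 = queue.Queue(), queue.Queue(), queue.Queue()
# Toy two-stage pipeline, e.g. a feature-extraction actor feeding a decoder actor.
threading.Thread(target=actor, args=(q1, q2, str.upper)).start()
threading.Thread(target=actor, args=(q2, q3, lambda s: s + "!")).start()

q1.put("frame")   # send one message through the pipeline
q1.put(None)      # then shut it down
while (out := q3.get()) is not None:
    print(out)    # -> FRAME!
```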

  18. Working memory and intelligibility of hearing-aid processed speech

    Directory of Open Access Journals (Sweden)

    Pamela eSouza

    2015-05-01

    Full Text Available Previous work suggested that individuals with low working memory capacity may be at a disadvantage in adverse listening environments, including situations with background noise or substantial modification of the acoustic signal. This study explored the relationship between patient factors (including working memory capacity) and the intelligibility and quality of modified speech for older individuals with sensorineural hearing loss. The modification was created using a combination of hearing aid processing (wide-dynamic range compression and frequency compression) applied to sentences in multitalker babble. The extent of signal modification was quantified via an envelope fidelity index. We also explored the contribution of components of working memory by including measures of processing speed and executive function. We hypothesized that listeners with low working memory capacity would perform more poorly than those with high working memory capacity across all situations, and would also be differentially affected by high amounts of signal modification. Results showed a significant effect of working memory capacity for speech intelligibility, and an interaction between working memory, amount of hearing loss and signal modification. Signal modification was the major predictor of quality ratings. These data add to the literature on hearing-aid processing and working memory by suggesting that the working memory-intelligibility effects may be related to aggregate signal fidelity, rather than to the specific signal manipulation. They also suggest that for individuals with low working memory capacity, sensorineural loss may be most appropriately addressed with wide-dynamic range compression and/or frequency compression parameters that maintain the fidelity of the signal envelope.

  19. Cortical oscillations and entrainment in speech processing during working memory load.

    Science.gov (United States)

    Hjortkjaer, Jens; Märcher-Rørsted, Jonatan; Fuglsang, Søren A; Dau, Torsten

    2018-02-02

    Neuronal oscillations are thought to play an important role in working memory (WM) and speech processing. Listening to speech in real-life situations is often cognitively demanding but it is unknown whether WM load influences how auditory cortical activity synchronizes to speech features. Here, we developed an auditory n-back paradigm to investigate cortical entrainment to speech envelope fluctuations under different degrees of WM load. We measured the electroencephalogram, pupil dilations and behavioural performance from 22 subjects listening to continuous speech with an embedded n-back task. The speech stimuli consisted of long spoken number sequences created to match natural speech in terms of sentence intonation, syllabic rate and phonetic content. To burden different WM functions during speech processing, listeners performed an n-back task on the speech sequences in different levels of background noise. Increasing WM load at higher n-back levels was associated with a decrease in posterior alpha power as well as increased pupil dilations. Frontal theta power increased at the start of the trial and increased additionally with higher n-back level. The observed alpha-theta power changes are consistent with visual n-back paradigms suggesting general oscillatory correlates of WM processing load. Speech entrainment was measured as a linear mapping between the envelope of the speech signal and low-frequency cortical activity. Increasing WM load (higher n-back level) decreased cortical speech envelope entrainment. Although entrainment persisted under high load, our results suggest a top-down influence of WM processing on cortical speech entrainment. © 2018 The Authors. European Journal of Neuroscience published by Federation of European Neuroscience Societies and John Wiley & Sons Ltd.
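
    A linear mapping from the speech envelope to low-frequency cortical activity of the kind measured here is commonly estimated as a temporal response function via lagged ridge regression. The sketch below shows that generic estimator; the lag window and regularization strength are illustrative assumptions, not the authors' settings.

```python
import numpy as np

def fit_trf(envelope, eeg, fs, t_min=0.0, t_max=0.4, alpha=1.0):
    """Ridge-regression temporal response function from a speech envelope to
    one EEG channel (both 1-D, same length and sampling rate fs).
    Generic estimator; lags and alpha are illustrative choices."""
    lags = np.arange(int(t_min * fs), int(t_max * fs))
    X = np.column_stack([np.roll(envelope, lag) for lag in lags])
    X[: lags.max()] = 0.0  # discard samples wrapped around by np.roll
    w = np.linalg.solve(X.T @ X + alpha * np.eye(len(lags)), X.T @ eeg)
    prediction = X @ w
    r = np.corrcoef(prediction, eeg)[0, 1]  # entrainment strength
    return w, r
```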

  20. Effects of spectral complexity and sound duration on automatic complex-sound pitch processing in humans - a mismatch negativity study.

    Science.gov (United States)

    Tervaniemi, M; Schröger, E; Saher, M; Näätänen, R

    2000-08-18

    The pitch of a spectrally rich sound is known to be more easily perceived than that of a sinusoidal tone. The present study compared the importance of spectral complexity and sound duration in facilitated pitch discrimination. The mismatch negativity (MMN), which reflects automatic neural discrimination, was recorded to a 2.5% pitch change in pure tones with only one sinusoidal frequency component (500 Hz) and in spectrally rich tones with three (500-1500 Hz) and five (500-2500 Hz) harmonic partials. During the recordings, subjects concentrated on watching a silent movie. In separate blocks, stimuli were 100 and 250 ms in duration. The MMN amplitude was enhanced with both spectrally rich sounds when compared with pure tones. The prolonged sound duration did not significantly enhance the MMN. This suggests that increased spectral rather than temporal information facilitates pitch processing of spectrally rich sounds.

  1. De novo microdeletion of BCL11A is associated with severe speech sound disorder.

    Science.gov (United States)

    Peter, Beate; Matsushita, Mark; Oda, Kaori; Raskind, Wendy

    2014-08-01

    In 10 cases of 2p15p16.1 microdeletions reported worldwide to date, shared phenotypes included growth retardation, craniofacial and skeletal dysmorphic traits, internal organ defects, intellectual disability, nonverbal or low verbal status, abnormal muscle tone, and gross motor delays. The size of the deletions ranged from 0.3 to 5.7 Mb, where the smallest deletion involved the BCL11A, PAPOLG, and REL genes. Here we report on an 11-year-old male with a heterozygous de novo 0.2 Mb deletion containing a single gene, BCL11A, and a phenotype characterized by childhood apraxia of speech and dysarthria in the presence of general oral and gross motor dyspraxia and hypotonia as well as expressive language and mild intellectual delays. BCL11A is situated within the dyslexia susceptibility candidate region 3 (DYX3) on chromosome 2. The present case is the first to involve a single gene within the microdeletion region and a phenotype restricted to a subset of the traits observed in other cases with more extensive deletions. © 2014 Wiley Periodicals, Inc.

  2. Associations among measures of sequential processing in motor and linguistic tasks in adults with and without a family history of childhood apraxia of speech: a replication study.

    Science.gov (United States)

    Button, Le; Peter, Beate; Stoel-Gammon, Carol; Raskind, Wendy H

    2013-03-01

    The purpose of this study was to address the hypothesis that childhood apraxia of speech (CAS) is influenced by an underlying deficit in sequential processing that is also expressed in other modalities. In a sample of 21 adults from five multigenerational families, 11 of whom had histories of various familial speech sound disorders, 3 biologically related adults from a family with familial CAS showed motor sequencing deficits in an alternating motor speech task. Compared with the other adults, these three participants showed deficits in tasks requiring high loads of sequential processing, including nonword imitation, nonword reading and spelling. Qualitative error analyses in real word and nonword imitations revealed group differences in phoneme sequencing errors. Motor sequencing ability was correlated with phoneme sequencing errors during real word and nonword imitation, reading and spelling. Correlations were characterized by extremely high scores in one family and extremely low scores in another. Results are consistent with a central deficit in sequential processing in CAS of familial origin.

  3. A comparison of orthogonal transformations for digital speech processing.

    Science.gov (United States)

    Campanella, S. J.; Robinson, G. S.

    1971-01-01

    Discrete forms of the Fourier, Hadamard, and Karhunen-Loeve transforms are examined for their capacity to reduce the bit rate necessary to transmit speech signals. To rate their effectiveness in accomplishing this goal, the quantizing error (or noise) resulting from each transformation method at various bit rates is computed and compared with that for conventional companded PCM processing. Based on this comparison, it is found that Karhunen-Loeve provides a reduction in bit rate of 13.5 kbits/s, Fourier 10 kbits/s, and Hadamard 7.5 kbits/s as compared with the bit rate required for companded PCM. These bit-rate reductions are shown to be somewhat independent of the transmission bit rate.
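
    As a worked illustration of this comparison, the sketch below quantizes transform coefficients and measures the resulting error for the discrete cosine transform (a close relative of the Fourier approach) and the Hadamard transform; the block size, bit depth, and uniform quantizer are illustrative assumptions, not the study's conditions:

```python
# Minimal sketch: mean squared quantization error after two orthogonal
# transforms. Block size, bit depth, and quantizer are illustrative.
import numpy as np
from scipy.fft import dct, idct
from scipy.linalg import hadamard

def quantize(x, n_bits):
    """Uniform quantizer over the coefficient range."""
    step = max((x.max() - x.min()) / 2 ** n_bits, 1e-12)
    return np.round(x / step) * step

def coding_error(signal, n_bits=4, block=64):
    H = hadamard(block) / np.sqrt(block)          # orthonormal and symmetric
    frames = signal[:len(signal) // block * block].reshape(-1, block)
    c = dct(frames, axis=1, norm='ortho')
    err_dct = np.mean((idct(quantize(c, n_bits), axis=1, norm='ortho') - frames) ** 2)
    c = frames @ H                                # forward Hadamard transform
    err_had = np.mean((quantize(c, n_bits) @ H - frames) ** 2)  # H is its own inverse
    return {'dct': err_dct, 'hadamard': err_had}
```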

  4. Progressive apraxia of speech as a window into the study of speech planning processes.

    Science.gov (United States)

    Laganaro, Marina; Croisier, Michèle; Bagou, Odile; Assal, Frédéric

    2012-09-01

    We present a 3-year follow-up study of a patient with progressive apraxia of speech (PAoS), aimed at investigating whether the theoretical organization of phonetic encoding is reflected in the progressive disruption of speech. As decreased speech rate was the most striking pattern of disruption during the first 2 years, durational analyses were carried out longitudinally on syllables excised from spontaneous, repetition and reading speech samples. The crucial result of the present study is the demonstration of an effect of syllable frequency on duration: the progressive disruption of articulation rate did not affect all syllables in the same way, but followed a gradient that was a function of the frequency of use of syllable-sized motor programs. The combination of data from this case of PAoS with previous psycholinguistic and neurolinguistic data points to a frequency organization of syllable-sized speech-motor plans. In this study we also illustrate how studying PAoS can be exploited in theoretical and clinical investigations of phonetic encoding, as it represents a unique opportunity to investigate speech as it progressively deteriorates. Copyright © 2011 Elsevier Srl. All rights reserved.

  5. Perception of environmental sounds by experienced cochlear implant patients

    Science.gov (United States)

    Shafiro, Valeriy; Gygi, Brian; Cheng, Min-Yu; Vachhani, Jay; Mulvey, Megan

    2011-01-01

    Objectives: Environmental sound perception serves an important ecological function by providing listeners with information about objects and events in their immediate environment. Environmental sounds such as car horns, baby cries or chirping birds can alert listeners to imminent dangers as well as contribute to one's sense of awareness and well-being. Perception of environmental sounds, as acoustically and semantically complex stimuli, may also involve some factors common to the processing of speech. However, very limited research has investigated the abilities of cochlear implant (CI) patients to identify common environmental sounds, despite patients' general enthusiasm about them. This project (1) investigated the ability of patients with modern-day CIs to perceive environmental sounds, (2) explored associations among speech, environmental sounds and basic auditory abilities, and (3) examined acoustic factors that might be involved in environmental sound perception. Design: Seventeen experienced postlingually-deafened CI patients participated in the study. Environmental sound perception was assessed with a large-item test composed of 40 sound sources, each represented by four different tokens. The relationship between speech and environmental sound perception, and the role of working memory and some basic auditory abilities, were examined based on patient performance on a battery of speech tests (HINT, CNC, and individual consonant and vowel tests), tests of basic auditory abilities (audiometric thresholds, gap detection, temporal pattern and temporal order for tones tests) and a backward digit recall test. Results: The results indicated substantially reduced ability to identify common environmental sounds in CI patients (45.3%). Except for vowels, all speech test scores significantly correlated with the environmental sound test scores: r = 0.73 for HINT in quiet, r = 0.69 for HINT in noise, r = 0.70 for CNC, r = 0.64 for consonants and r = 0.48 for vowels. HINT and

  6. Noise control, sound, and the vehicle design process

    Science.gov (United States)

    Donavan, Paul

    2005-09-01

    For many products, noise and sound are viewed as necessary evils that need to be dealt with in order to bring the product successfully to market. They are generally not product ``exciters'' although some vehicle manufacturers do tune and advertise specific sounds to enhance the perception of their products. In this paper, influencing the design process for the ``evils,'' such as wind noise and road noise, is considered in more detail. There are three ingredients to successfully dealing with the evils in the design process. The first of these is knowing how excess noise affects the end customer in a tangible manner and how that affects customer satisfaction and, ultimately, sales. The second is having and delivering the knowledge of what is required of the design to achieve a satisfactory or even better level of noise performance. The third ingredient is having the commitment of the designers to incorporate the knowledge into their part, subsystem or system. In this paper, the elements of each of these ingredients are discussed in some detail and the attributes of a successful design process are enumerated.

  7. Infant speech-sound discrimination testing: effects of stimulus intensity and procedural model on measures of performance.

    Science.gov (United States)

    Nozza, R J

    1987-06-01

    Performance of infants in a speech-sound discrimination task (/ba/ vs /da/) was measured at three stimulus intensity levels (50, 60, and 70 dB SPL) using the operant head-turn procedure. The procedure was modified so that data could be treated as though from a single-interval (yes-no) procedure, as is commonly done, as well as if from a sustained attention (vigilance) task. Discrimination performance changed significantly with increase in intensity, suggesting caution in the interpretation of results from infant discrimination studies in which only single stimulus intensity levels within this range are used. The assumptions made about the underlying methodological model did not change the performance-intensity relationships. However, infants demonstrated response decrement, typical of vigilance tasks, which supports the notion that the head-turn procedure is represented best by the vigilance model. Analysis then was done according to a method designed for tasks with undefined observation intervals [C. S. Watson and T. L. Nichols, J. Acoust. Soc. Am. 59, 655-668 (1976)]. Results reveal that, while group data are reasonably well represented across levels of difficulty by the fixed-interval model, there is a variation in performance as a function of time following trial onset that could lead to underestimation of performance in some cases.
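
    The single-interval (yes-no) treatment mentioned above reduces, in signal detection terms, to computing sensitivity from hit and false-alarm rates. A minimal sketch follows; the correction for extreme rates is a common convention, not taken from the study:

```python
# d' for a yes-no discrimination task: z(hit rate) - z(false-alarm rate).
from scipy.stats import norm

def d_prime(hits, misses, false_alarms, correct_rejections):
    # log-linear correction keeps rates of 0 or 1 from producing infinite z
    hit_rate = (hits + 0.5) / (hits + misses + 1)
    fa_rate = (false_alarms + 0.5) / (false_alarms + correct_rejections + 1)
    return norm.ppf(hit_rate) - norm.ppf(fa_rate)

print(d_prime(hits=18, misses=6, false_alarms=7, correct_rejections=17))
```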

  8. Evaluation of the effects of nonlinear frequency compression on speech recognition and sound quality for adults with mild to moderate hearing loss.

    Science.gov (United States)

    Picou, Erin M; Marcrum, Steven C; Ricketts, Todd A

    2015-03-01

    Although nonlinear frequency compression (NFC) can improve audibility for listeners with considerable high frequency hearing loss, its effects for listeners with moderate high frequency hearing loss are unclear. The purpose of this study was to investigate the effects of activating NFC for listeners who are not traditionally considered candidates for this technology. Participants wore study hearing aids with NFC activated for a 3-4 week trial period. After the trial period, they were tested with NFC and with conventional processing on measures of consonant discrimination threshold in quiet, consonant recognition in quiet, sentence recognition in noise, and acceptability of sound quality of speech and music. Seventeen adult listeners with symmetrical, mild to moderate sensorineural hearing loss participated. Better ear, high frequency pure-tone averages (4, 6, and 8 kHz) were 60 dB HL or better. Activating NFC resulted in lower (better) thresholds for discrimination of /s/, whose spectral center was 9 kHz. There were no other significant effects of NFC compared to conventional processing. These data suggest that the benefits, and detriments, of activating NFC may be limited for this population.
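
    Nonlinear frequency compression maps energy above a cutoff frequency into a lower range by a fixed ratio, while frequencies below the cutoff pass unchanged. A minimal sketch of the mapping; the cutoff and ratio are illustrative, as the study's fitted parameters are not reported here:

```python
def nfc_map(f_hz, cutoff_hz=2000.0, ratio=2.0):
    """Output frequency under nonlinear frequency compression."""
    if f_hz <= cutoff_hz:
        return f_hz                     # below the cutoff: unchanged
    return cutoff_hz + (f_hz - cutoff_hz) / ratio

# e.g., the 9 kHz spectral centre of /s/ maps to 2000 + 7000 / 2 = 5500 Hz,
# relocating it into a region with better residual hearing.
print(nfc_map(9000.0))
```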

  9. A Randomized Controlled Trial on The Beneficial Effects of Training Letter-Speech Sound Integration on Reading Fluency in Children with Dyslexia.

    Directory of Open Access Journals (Sweden)

    Gorka Fraga González

    Full Text Available A recent account of dyslexia assumes that a failure to develop automated letter-speech sound integration might be responsible for the observed lack of reading fluency. This study uses a pre-test-training-post-test design to evaluate the effects of a training program based on letter-speech sound associations with a special focus on gains in reading fluency. A sample of 44 children with dyslexia and 23 typical readers, aged 8 to 9, was recruited. Children with dyslexia were randomly allocated to either the training program group (n = 23) or a waiting-list control group (n = 21). The training intensively focused on letter-speech sound mapping and consisted of 34 individual sessions of 45 minutes over a five-month period. The children with dyslexia showed substantial reading gains for the main word reading and spelling measures after training, improving at a faster rate than typical readers and waiting-list controls. The results are interpreted within the conceptual framework assuming a multisensory integration deficit as the most proximal cause of dysfluent reading in dyslexia. ISRCTN register: ISRCTN12783279.

  10. Discrimination and streaming of speech sounds based on differences in interaural and spectral cues.

    Science.gov (United States)

    David, Marion; Lavandier, Mathieu; Grimault, Nicolas; Oxenham, Andrew J

    2017-09-01

    Differences in spatial cues, including interaural time differences (ITDs), interaural level differences (ILDs) and spectral cues, can lead to stream segregation of alternating noise bursts. It is unknown how effective such cues are for streaming sounds with realistic spectro-temporal variations. In particular, it is not known whether the high-frequency spectral cues associated with elevation remain sufficiently robust under such conditions. To answer these questions, sequences of consonant-vowel tokens were generated and filtered by non-individualized head-related transfer functions to simulate the cues associated with different positions in the horizontal and median planes. A discrimination task showed that listeners could discriminate changes in interaural cues both when the stimulus remained constant and when it varied between presentations. However, discrimination of changes in spectral cues was much poorer in the presence of stimulus variability. A streaming task, based on the detection of repeated syllables in the presence of interfering syllables, revealed that listeners can use both interaural and spectral cues to segregate alternating syllable sequences, despite the large spectro-temporal differences between stimuli. However, only the full complement of spatial cues (ILDs, ITDs, and spectral cues) resulted in obligatory streaming in a task that encouraged listeners to integrate the tokens into a single stream.

  11. Comparison between bilateral cochlear implants and Neurelec Digisonic(®) SP Binaural cochlear implant: speech perception, sound localization and patient self-assessment.

    Science.gov (United States)

    Bonnard, Damien; Lautissier, Sylvie; Bosset-Audoit, Amélie; Coriat, Géraldine; Beraha, Max; Maunoury, Antoine; Martel, Jacques; Darrouzet, Vincent; Bébéar, Jean-Pierre; Dauman, René

    2013-01-01

    An alternative to bilateral cochlear implantation is offered by the Neurelec Digisonic(®) SP Binaural cochlear implant, which allows stimulation of both cochleae within a single device. The purpose of this prospective study was to compare a group of Neurelec Digisonic(®) SP Binaural implant users (denoted BINAURAL group, n = 7) with a group of bilateral adult cochlear implant users (denoted BILATERAL group, n = 6) in terms of speech perception, sound localization, and self-assessment of health status and hearing disability. Speech perception was assessed using word recognition at 60 dB SPL in quiet and in a 'cocktail party' noise delivered through five loudspeakers in the hemi-sound field facing the patient (signal-to-noise ratio = +10 dB). The sound localization task was to determine the source of a sound stimulus among five speakers positioned between -90° and +90° from midline. Change in health status was assessed using the Glasgow Benefit Inventory and hearing disability was evaluated with the Abbreviated Profile of Hearing Aid Benefit. Speech perception was not statistically different between the two groups, even though there was a trend in favor of the BINAURAL group (mean percent word recognition in the BINAURAL and BILATERAL groups: 70 vs. 56.7% in quiet, 55.7 vs. 43.3% in noise). There was also no significant difference with regard to performance in sound localization and self-assessment of health status and hearing disability. On the basis of the BINAURAL group's performance in hearing tasks involving the detection of interaural differences, implantation with the Neurelec Digisonic(®) SP Binaural implant may be considered to restore effective binaural hearing. Based on these first comparative results, this device seems to provide benefits similar to those of traditional bilateral cochlear implantation, with a new approach to stimulate both auditory nerves. Copyright © 2013 S. Karger AG, Basel.

  12. "Non-Vocalization": A Phonological Error Process in the Speech of Severely and Profoundly Hearing Impaired Adults, from the Point of View of the Theory of Phonology as Human Behaviour

    Science.gov (United States)

    Halpern, Orly; Tobin, Yishai

    2008-01-01

    "Non-vocalization" (N-V) is a newly described phonological error process in hearing impaired speakers. In N-V the hearing impaired person actually articulates the phoneme but without producing a voice. The result is an error process looking as if it is produced but sounding as if it is omitted. N-V was discovered by video recording the speech of…

  13. Sadness is unique: Neural processing of emotions in speech prosody in musicians and non-musicians

    Directory of Open Access Journals (Sweden)

    Mona ePark

    2015-01-01

    Full Text Available Musical training has been shown to have positive effects on several aspects of speech processing; however, the effects of musical training on the neural processing of speech prosody conveying distinct emotions have yet to be fully understood. We used functional magnetic resonance imaging (fMRI) to investigate whether the neural responses to speech prosody conveying happiness, sadness, and fear differ between musicians and non-musicians. Differences in processing of emotional speech prosody between the two groups were only observed when sadness was expressed. Musicians showed increased activation in the middle frontal gyrus, the anterior medial prefrontal cortex, the posterior cingulate cortex and the retrosplenial cortex. Our results suggest an increased sensitivity of emotional processing in musicians with respect to sadness expressed in speech, possibly reflecting empathic processes.

  14. Letter-Sound Reading: Teaching Preschool Children Print-to-Sound Processing

    Science.gov (United States)

    Wolf, Gail Marie

    2016-01-01

    This intervention study investigated the growth of letter sound reading and growth of consonant-vowel-consonant (CVC) word decoding abilities for a representative sample of 41 US children in preschool settings. Specifically, the study evaluated the effectiveness of a 3-step letter-sound teaching intervention in teaching preschool children to…

  15. Initial uncertainty impacts statistical learning in sound sequence processing.

    Science.gov (United States)

    Todd, Juanita; Provost, Alexander; Whitson, Lisa; Mullens, Daniel

    2016-11-01

    This paper features two studies confirming a lasting impact of first learning on how subsequent experience is weighted in early relevance-filtering processes. In both studies participants were exposed to sequences of sound that contained a regular pattern on two different timescales. Regular patterning in sound is readily detected by the auditory system and used to form "prediction models" that define the most likely properties of sound to be encountered in a given context. The presence and strength of these prediction models is inferred from changes in automatically elicited components of auditory evoked potentials. Both studies employed sound sequences that contained both a local and longer-term pattern. The local pattern was defined by a regular repeating pure tone occasionally interrupted by a rare deviating tone (p = 0.125) that was physically different (a 30 ms vs. 60 ms duration difference in one condition and a 1000 Hz vs. 1500 Hz frequency difference in the other). The longer-term pattern was defined by the rate at which the two tones alternated probabilities (i.e., the tone that was first rare became common and the tone that was first common became rare). There was no task related to the tones and participants were asked to ignore them while focussing attention on a movie with subtitles. Auditory-evoked potentials revealed long-lasting modulatory influences based on whether the tone was initially encountered as rare and unpredictable or common and predictable. The results are interpreted as evidence that probability (or indeed predictability) assigns a differential information-value to the two tones that in turn affects the extent to which prediction models are updated and imposed. These effects are exposed for both common and rare occurrences of the tones. The studies contribute to a body of work that reveals that probabilistic information is not faithfully represented in these early evoked potentials and instead exposes that predictability (or conversely

  16. Contributions of Letter-Speech Sound Learning and Visual Print Tuning to Reading Improvement: Evidence from Brain Potential and Dyslexia Training Studies

    Directory of Open Access Journals (Sweden)

    Gorka Fraga González

    2017-01-01

    Full Text Available We use a neurocognitive perspective to discuss the contribution of learning letter-speech sound (L-SS) associations and visual specialization in the initial phases of reading in dyslexic children. We review findings from associative learning studies on related cognitive skills important for establishing and consolidating L-SS associations. Then we review brain potential studies, including our own, that yielded two markers associated with reading fluency. Here we show that the marker related to visual specialization (N170) predicts word and pseudoword reading fluency in children who received additional practice in the processing of morphological word structure. Conversely, L-SS integration (indexed by mismatch negativity, MMN) may only remain important when direct orthography to semantic conversion is not possible, such as in pseudoword reading. In addition, the correlation between these two markers supports the notion that multisensory integration facilitates visual specialization. Finally, we review the role of implicit learning and executive functions in audiovisual learning in dyslexia. Implications for remedial research are discussed and suggestions for future studies are presented.

  17. The neural processing of foreign-accented speech and its relationship to listener bias

    Directory of Open Access Journals (Sweden)

    Han-Gyol eYi

    2014-10-01

    Full Text Available Foreign-accented speech often presents a challenging listening condition. In addition to deviations from the target speech norms related to the inexperience of the nonnative speaker, listener characteristics may play a role in determining intelligibility levels. We have previously shown that an implicit visual bias for associating East Asian faces and foreignness predicts the listeners' perceptual ability to process Korean-accented English audiovisual speech (Yi et al., 2013). Here, we examine the neural mechanism underlying the influence of listener bias to foreign faces on speech perception. In a functional magnetic resonance imaging (fMRI) study, native English speakers listened to native- and Korean-accented English sentences, with or without faces. The participants' Asian-foreign association was measured using an implicit association test (IAT), conducted outside the scanner. We found that foreign-accented speech evoked greater activity in the bilateral primary auditory cortices and the inferior frontal gyri, potentially reflecting greater computational demand. Higher IAT scores, indicating greater bias, were associated with increased BOLD response to foreign-accented speech with faces in the primary auditory cortex, the early node for spectrotemporal analysis. We conclude the following: (1) foreign-accented speech perception places greater demand on the neural systems underlying speech perception; (2) the face of the talker can exaggerate the perceived foreignness of foreign-accented speech; (3) implicit Asian-foreign association is associated with decreased neural efficiency in early spectrotemporal processing.

  18. Part-of-speech effects on text-to-speech synthesis

    CSIR Research Space (South Africa)

    Schlunz, GI

    2010-11-01

    Full Text Available One of the goals of text-to-speech (TTS) systems is to produce natural-sounding synthesised speech. Towards this end various natural language processing (NLP) tasks are performed to model the prosodic aspects of the TTS voice. One of the fundamental...

  19. Speech processing asymmetry revealed by dichotic listening and functional brain imaging.

    Science.gov (United States)

    Hugdahl, Kenneth; Westerhausen, René

    2016-12-01

    In this article, we review research in our laboratory from the last 25 to 30 years on the neuronal basis for laterality of speech perception, focusing on the upper, posterior parts of the temporal lobes and their functional and structural connections to other brain regions. We review both behavioral and brain imaging data, with a focus on dichotic listening experiments, using a variety of imaging modalities. The data have come for the most part from healthy individuals and from studies of the normally functioning brain, although we also review a few selected clinical examples. We first review and discuss the structural model for the explanation of the right-ear advantage (REA) and left hemisphere asymmetry for auditory language processing. A common theme across many studies has been our interest in the interaction between bottom-up, stimulus-driven, and top-down, instruction-driven, aspects of hemispheric asymmetry, and how perceptual factors interact with cognitive factors to shape asymmetry of auditory language information processing. In summary, our research has shown laterality for the initial processing of consonant-vowel syllables, first observed as a behavioral REA when subjects are required to report which syllable of a dichotic syllable-pair they perceive. In subsequent work we have corroborated the REA with brain imaging, and have shown that the REA is modulated through both bottom-up manipulations of stimulus properties, like sound intensity, and top-down manipulations of cognitive properties, like attention focus. Copyright © 2015 Elsevier Ltd. All rights reserved.

  20. Barista: A Framework for Concurrent Speech Processing by USC-SAIL.

    Science.gov (United States)

    Can, Doğan; Gibson, James; Vaz, Colin; Georgiou, Panayiotis G; Narayanan, Shrikanth S

    2014-05-01

    We present Barista, an open-source framework for concurrent speech processing based on the Kaldi speech recognition toolkit and the libcppa actor library. With Barista, we aim to provide an easy-to-use, extensible framework for constructing highly customizable concurrent (and/or distributed) networks for a variety of speech processing tasks. Each Barista network specifies a flow of data between simple actors, concurrent entities communicating by message passing, modeled after Kaldi tools. Leveraging the fast and reliable concurrency and distribution mechanisms provided by libcppa, Barista lets demanding speech processing tasks, such as real-time speech recognizers and complex training workflows, be scheduled and executed on parallel (and/or distributed) hardware. Barista is released under the Apache License v2.0.
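
    The actor pattern the framework is built on (concurrent workers that share nothing and communicate only by message passing) can be illustrated in a few lines. The sketch below uses Python threads and queues purely for illustration; Barista itself is C++ built on libcppa and Kaldi:

```python
# Minimal actor-style pipeline: each stage consumes messages from an inbox,
# processes them, and forwards the result; a None message shuts the stage down.
import queue, threading

def actor(inbox, outbox, transform):
    while True:
        msg = inbox.get()
        if msg is None:                 # poison pill: propagate and stop
            outbox.put(None)
            return
        outbox.put(transform(msg))

q_in, q_mid, q_out = queue.Queue(), queue.Queue(), queue.Queue()
threading.Thread(target=actor, args=(q_in, q_mid, str.lower), daemon=True).start()
threading.Thread(target=actor, args=(q_mid, q_out, str.split), daemon=True).start()

q_in.put("FEATURE EXTRACTION THEN DECODING")
q_in.put(None)
while (msg := q_out.get()) is not None:
    print(msg)                          # ['feature', 'extraction', ...]
```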

  1. Predicting speech intelligibility based on the signal-to-noise envelope power ratio after modulation-frequency selective processing

    DEFF Research Database (Denmark)

    Jørgensen, Søren; Dau, Torsten

    2011-01-01

    A model for predicting the intelligibility of processed noisy speech is proposed. The speech-based envelope power spectrum model has a similar structure to the model of Ewert and Dau [(2000). J. Acoust. Soc. Am. 108, 1181-1196], developed to account for modulation detection and masking data. The model estimates the speech-to-noise envelope power ratio, SNRenv, at the output of a modulation filterbank and relates this metric to speech intelligibility using the concept of an ideal observer. Predictions were compared to data on the intelligibility of speech presented in stationary speech-shaped noise. The results suggest that the SNRenv provides a key measure of speech intelligibility. © 2011 Acoustical Society of America.
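
    The core metric can be sketched as follows: the envelope power of the noisy speech and of the noise alone is compared within modulation bands, and the per-band ratios are combined. The octave-spaced bands and simple Butterworth modulation filters below are simplifications of the published model, not its exact implementation (in practice the envelope would typically be downsampled first):

```python
# Simplified SNRenv sketch: per-band envelope power of noisy speech vs.
# noise alone, combined across modulation bands (root-sum-of-squares).
import numpy as np
from scipy.signal import butter, filtfilt, hilbert

def band_env_power(x, fs, lo, hi):
    """AC envelope power in one modulation band, normalized by DC power."""
    env = np.abs(hilbert(x))
    b, a = butter(2, [lo / (fs / 2), hi / (fs / 2)], btype='band')
    e = filtfilt(b, a, env - env.mean())
    return np.mean(e ** 2) / env.mean() ** 2

def snr_env_db(noisy_speech, noise, fs, bands=((1, 2), (2, 4), (4, 8), (8, 16))):
    ratios = []
    for lo, hi in bands:
        p_sn = band_env_power(noisy_speech, fs, lo, hi)
        p_n = band_env_power(noise, fs, lo, hi)
        ratios.append(max(p_sn - p_n, 1e-4) / max(p_n, 1e-4))
    return 10 * np.log10(np.sqrt(np.sum(np.square(ratios))))
```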

  2. Speech production, dual-process theory, and the attentive addressee

    OpenAIRE

    Pollard, A. J.

    2012-01-01

    This thesis outlines a model of Speaker-Addressee interaction that suggests some answers to two linked problems current in speech production. The first concerns an under-researched issue in psycholinguistics: how are decisions about speech content – conceptualization – carried out? The second, a pragmatics problem, asks how Speakers, working under the heavy time pressures of normal dialogue, achieve optimal relevance often enough for successful communication to take place.

  3. Language/Culture Modulates Brain and Gaze Processes in Audiovisual Speech Perception.

    Science.gov (United States)

    Hisanaga, Satoko; Sekiyama, Kaoru; Igasaki, Tomohiko; Murayama, Nobuki

    2016-10-13

    Several behavioural studies have shown that the interplay between voice and face information in audiovisual speech perception is not universal. Native English speakers (ESs) are influenced by visual mouth movement to a greater degree than native Japanese speakers (JSs) when listening to speech. However, the biological basis of these group differences is unknown. Here, we demonstrate the time-varying processes of group differences in terms of event-related brain potentials (ERP) and eye gaze for audiovisual and audio-only speech perception. On a behavioural level, while congruent mouth movement shortened the ESs' response time for speech perception, the opposite effect was observed in JSs. Eye-tracking data revealed a gaze bias to the mouth for the ESs but not the JSs, especially before the audio onset. Additionally, the ERP P2 amplitude indicated that ESs processed multisensory speech more efficiently than auditory-only speech; however, the JSs exhibited the opposite pattern. Taken together, the ESs' early visual attention to the mouth was likely to promote phonetic anticipation, which was not the case for the JSs. These results clearly indicate the impact of language and/or culture on multisensory speech processing, suggesting that linguistic/cultural experiences lead to the development of unique neural systems for audiovisual speech perception.

  4. Personal Names in Children's Speech: The Process of Acquisition

    Directory of Open Access Journals (Sweden)

    Galina R. Dobrova

    2014-12-01

    Full Text Available The article discusses the process of children's understanding of the differences between proper and common names. The author emphasizes the role of the anthropocentric approach to personal names, especially when it is based on the study of the ontogenetic development of linguistic capacity focusing on the mechanisms of the formation of mental patterns of proper names, in particular — of personal names, and of their special linguistic status as (relatively) "strict" designators. Analyzing recordings of children's spontaneous speech and experimental data, the author argues that the study of the early stages of personal names acquisition, in comparison with the acquisition of common nouns, highlights such significant features of a child's developing mind as the ability to distinguish between identifying and generalizing linguistic signs, to construct hyponym/hyperonym relations going from individual to the most generalized designations (from personal names to common nouns of different kinds, including relative ones, which depend completely on the reference point, and reciprocal ones, e.g. kinship terms). Additionally, the author shows that the anthropocentric approach emphasizes such properties of personal names as their coreferentiality, relativity and their capacity to act as semiotic shifters, insofar as the choice of the form of a name depends on the social relations between the speaker and his addressee and their respective positions in the social hierarchy.

  5. Neural processing of speech in children is influenced by bilingual experience

    Science.gov (United States)

    Krizman, Jennifer; Slater, Jessica; Skoe, Erika; Marian, Viorica; Kraus, Nina

    2014-01-01

    Language experience fine-tunes how the auditory system processes sound. For example, bilinguals, relative to monolinguals, have more robust evoked responses to speech that manifest as stronger neural encoding of the fundamental frequency (F0) and greater across-trial consistency. However, it is unknown whether such enhancements increase with increasing second language experience. We predict that F0 amplitude and neural consistency scale with dual-language experience during childhood, such that more years of bilingual experience leads to more robust F0 encoding and greater neural consistency. To test this hypothesis, we recorded auditory brainstem responses to the synthesized syllables ‘ba’ and ‘ga’ in two groups of bilingual children who were matched for age at test (8.4 ± 0.67 years) but differed in their age of second language acquisition. One group learned English and Spanish simultaneously from birth (n=13), while the second group learned the two languages sequentially (n=15), spending on average their first four years as monolingual Spanish speakers. We find that simultaneous bilinguals have a larger F0 response to ‘ba’ and ‘ga’ and a more consistent response to ‘ba’ compared to sequential bilinguals. We also demonstrate that these neural enhancements positively relate with years of bilingual experience. These findings support the notion that bilingualism enhances subcortical auditory processing. PMID:25445377

  6. An oscillopathic approach to developmental dyslexia: From genes to speech processing.

    Science.gov (United States)

    Jiménez-Bravo, Miguel; Marrero, Victoria; Benítez-Burraco, Antonio

    2017-06-30

    Developmental dyslexia is a heterogeneous condition entailing problems with reading and spelling. Several genes have been linked to or associated with the disease, many of which contribute to the development and function of brain areas important for auditory and phonological processing. Nonetheless, a clear link between genes, the brain, and the symptoms of dyslexia is still pending. The goal of this paper is to help bridge this gap. With this aim, we have focused on how the dyslexic brain fails to process speech sounds and reading cues. We have adopted an oscillatory perspective, according to which dyslexia may result from a deficient integration of different brain rhythms during reading/spelling tasks. Moreover, we show that some candidate genes for this condition are related to brain rhythms. This fresh approach is expected to provide a better understanding of the aetiology and the clinical presentation of developmental dyslexia, but also to achieve an earlier and more accurate diagnosis of the disease. Copyright © 2017 Elsevier B.V. All rights reserved.

  7. Reducing the Effects of Background Noise during Auditory Functional Magnetic Resonance Imaging of Speech Processing: Qualitative and Quantitative Comparisons between Two Image Acquisition Schemes and Noise Cancellation

    Science.gov (United States)

    Blackman, Graham A.; Hall, Deborah A.

    2011-01-01

    Purpose: The intense sound generated during functional magnetic resonance imaging (fMRI) complicates studies of speech and hearing. This experiment evaluated the benefits of using active noise cancellation (ANC), which attenuates the level of the scanner sound at the participant's ear by up to 35 dB around the peak at 600 Hz. Method: Speech and…

  8. The effect of viewing speech on auditory speech processing is different in the left and right hemispheres.

    Science.gov (United States)

    Davis, Chris; Kislyuk, Daniel; Kim, Jeesun; Sams, Mikko

    2008-11-25

    We used whole-head magnetoencephalography (MEG) to record changes in neuromagnetic N100m responses generated in the left and right auditory cortex as a function of the match between visual and auditory speech signals. Stimuli were auditory-only (AO) and auditory-visual (AV) presentations of /pi/, /ti/ and /vi/. Three types of intensity matched auditory stimuli were used: intact speech (Normal), frequency band filtered speech (Band) and speech-shaped white noise (Noise). The behavioural task was to detect the /vi/ syllables which comprised 12% of stimuli. N100m responses were measured to averaged /pi/ and /ti/ stimuli. Behavioural data showed that identification of the stimuli was faster and more accurate for Normal than for Band stimuli, and for Band than for Noise stimuli. Reaction times were faster for AV than AO stimuli. MEG data showed that in the left hemisphere, N100m to both AO and AV stimuli was largest for the Normal, smaller for Band and smallest for Noise stimuli. In the right hemisphere, Normal and Band AO stimuli elicited N100m responses of quite similar amplitudes, but N100m amplitude to Noise was about half of that. There was a reduction in N100m for the AV compared to the AO conditions. The size of this reduction for each stimulus type was the same in the left hemisphere but graded in the right (being largest to the Normal, smaller to the Band and smallest to the Noise stimuli). The N100m decrease for the Normal stimuli was significantly larger in the right than in the left hemisphere. We suggest that the effect of processing visual speech seen in the right hemisphere likely reflects suppression of the auditory response based on AV cues for place of articulation.

  9. [Investigating phonological planning processes in speech production through a speech-error induction technique].

    Science.gov (United States)

    Nakayama, Masataka; Saito, Satoru

    2015-08-01

    The present study investigated principles of phonological planning, a common serial ordering mechanism for speech production and phonological short-term memory. Nakayama and Saito (2014) have investigated the principles by using a speech-error induction technique, in which participants were exposed to an auditory distractor word immediately before an utterance of a target word. They demonstrated within-word adjacent mora exchanges and serial position effects on error rates. These findings support, respectively, the temporal distance and the edge principles at a within-word level. As this previous study induced errors using word distractors created by exchanging adjacent morae in the target words, it is possible that the speech errors are expressions of lexical intrusions reflecting interactive activation of phonological and lexical/semantic representations. To eliminate this possibility, the present study used nonword distractors that had no lexical or semantic representations. This approach successfully replicated the error patterns identified in the abovementioned study, further confirming that the temporal distance and edge principles are organizing precepts in phonological planning.

  10. The Mechanism of Speech Processing in Congenital Amusia: Evidence from Mandarin Speakers

    OpenAIRE

    Liu, Fang; Jiang, Cunmei; Thompson, William Forde; Xu, Yi; Yang, Yufang; Stewart, Lauren

    2012-01-01

    Congenital amusia is a neuro-developmental disorder of pitch perception that causes severe problems with music processing but only subtle difficulties in speech processing. This study investigated speech processing in a group of Mandarin speakers with congenital amusia. Thirteen Mandarin amusics and thirteen matched controls participated in a set of tone and intonation perception tasks and two pitch threshold tasks. Compared with controls, amusics showed impaired performance on word discrimination...

  11. Effects of lips and hands on auditory learning of second-language speech sounds.

    Science.gov (United States)

    Hirata, Yukari; Kelly, Spencer D

    2010-04-01

    Previous research has found that auditory training helps native English speakers to perceive phonemic vowel length contrasts in Japanese, but their performance did not reach native levels after training. Given that multimodal information, such as lip movement and hand gesture, influences many aspects of native language processing, the authors examined whether multimodal input helps to improve native English speakers' ability to perceive Japanese vowel length contrasts. Sixty native English speakers participated in 1 of 4 types of training: (a) audio-only; (b) audio-mouth; (c) audio-hands; and (d) audio-mouth-hands. Before and after training, participants were given phoneme perception tests that measured their ability to identify short and long vowels in Japanese (e.g., short /kato/ vs. long /katoː/). Although all 4 groups improved from pre- to posttest (replicating previous research), the participants in the audio-mouth condition improved more than those in the audio-only condition, whereas the 2 conditions involving hand gestures did not. Seeing lip movements during training significantly helps learners to perceive difficult second-language phonemic contrasts, but seeing hand gestures does not. The authors discuss possible benefits and limitations of using multimodal information in second-language phoneme learning.

  12. Speech reception with different bilateral directional processing schemes: Influence of binaural hearing, audiometric asymmetry, and acoustic scenario.

    Science.gov (United States)

    Neher, Tobias; Wagener, Kirsten C; Latzel, Matthias

    2017-09-01

    Hearing aid (HA) users can differ markedly in their benefit from directional processing (or beamforming) algorithms. The current study therefore investigated candidacy for different bilateral directional processing schemes. Groups of elderly listeners with symmetric (N = 20) or asymmetric (N = 19) hearing thresholds for frequencies below 2 kHz, a large spread in the binaural intelligibility level difference (BILD), and no difference in age, overall degree of hearing loss, or performance on a measure of selective attention took part. Aided speech reception was measured using virtual acoustics together with a simulation of a linked pair of completely occluding behind-the-ear HAs. Five processing schemes and three acoustic scenarios were used. The processing schemes differed in the tradeoff between signal-to-noise ratio (SNR) improvement and binaural cue preservation. The acoustic scenarios consisted of a frontal target talker presented against two speech maskers from ±60° azimuth or spatially diffuse cafeteria noise. For both groups, a significant interaction between BILD, processing scheme, and acoustic scenario was found. This interaction implied that, in situations with lateral speech maskers, HA users with BILDs larger than about 2 dB profited more from preserved low-frequency binaural cues than from greater SNR improvement, whereas for smaller BILDs the opposite was true. Audiometric asymmetry reduced the influence of binaural hearing. In spatially diffuse noise, the maximal SNR improvement was generally beneficial. N0Sπ detection performance at 500 Hz predicted the benefit from low-frequency binaural cues. Together, these findings provide a basis for adapting bilateral directional processing to individual and situational influences. Further research is needed to investigate their generalizability to more realistic HA conditions (e.g., with low-frequency vent-transmitted sound). Copyright © 2017 Elsevier B.V. All rights reserved.

  13. Co-speech gestures influence neural activity in brain regions associated with processing semantic information.

    Science.gov (United States)

    Dick, Anthony Steven; Goldin-Meadow, Susan; Hasson, Uri; Skipper, Jeremy I; Small, Steven L

    2009-11-01

    Everyday communication is accompanied by visual information from several sources, including co-speech gestures, which provide semantic information listeners use to help disambiguate the speaker's message. Using fMRI, we examined how gestures influence neural activity in brain regions associated with processing semantic information. The BOLD response was recorded while participants listened to stories under three audiovisual conditions and one auditory-only (speech alone) condition. In the first audiovisual condition, the storyteller produced gestures that naturally accompany speech. In the second, the storyteller made semantically unrelated hand movements. In the third, the storyteller kept her hands still. In addition to inferior parietal and posterior superior and middle temporal regions, bilateral posterior superior temporal sulcus and left anterior inferior frontal gyrus responded more strongly to speech when it was further accompanied by gesture, regardless of the semantic relation to speech. However, the right inferior frontal gyrus was sensitive to the semantic import of the hand movements, demonstrating more activity when hand movements were semantically unrelated to the accompanying speech. These findings show that perceiving hand movements during speech modulates the distributed pattern of neural activation involved in both biological motion perception and discourse comprehension, suggesting listeners attempt to find meaning, not only in the words speakers produce, but also in the hand movements that accompany speech.

  14. Measurement of sound velocity profiles in fluids for process monitoring

    International Nuclear Information System (INIS)

    Wolf, M; Kühnicke, E; Lenz, M; Bock, M

    2012-01-01

    In ultrasonic measurements, the time of flight to the object interface is often the only information that is analysed. Conventionally it is only possible to determine distances or sound velocities if the other value is known. The current paper deals with a novel method to measure the sound propagation path length and the sound velocity in media with moving scattering particles simultaneously. Since the focal position also depends on sound velocity, it can be used as a second parameter. Via calibration curves it is possible to determine the focal position and sound velocity from the measured time of flight to the focus, which is correlated with the maximum of the averaged echo signal amplitude. To move the focal position along the acoustic axis, an annular array is used. This allows locally resolved measurement of sound velocity without any previous knowledge of the acoustic medium and without a reference reflector. In previous publications the functional efficiency of this method was shown for media with constant velocities. In this work the accuracy of these measurements is improved. Furthermore, first measurements and simulations are introduced for non-homogeneous media. For this purpose an experimental set-up was created to generate a linear temperature gradient, which also causes a gradient of sound velocity.
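
    The simultaneous estimate can be illustrated with a calibration-table lookup: the measured round-trip time to the focus selects a calibrated focal distance, and velocity follows from distance over time. All numbers below are made-up placeholders, not values from the paper:

```python
# Sketch: local sound velocity from the time of flight to the focus,
# via a hypothetical calibration curve for a fixed annular-array setting.
import numpy as np

calib_tof_us = np.array([30.0, 35.0, 40.0, 45.0])   # round-trip time to focus
calib_dist_mm = np.array([22.5, 26.8, 31.2, 35.9])  # calibrated focal distance

def sound_velocity(tof_us):
    """Interpolate the focal distance, then c = 2 d / t (round trip), in m/s."""
    d_m = np.interp(tof_us, calib_tof_us, calib_dist_mm) * 1e-3
    return 2.0 * d_m / (tof_us * 1e-6)

print(sound_velocity(37.0))   # ≈ 1544 m/s for these placeholder values
```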

  15. Low-Frequency Cortical Entrainment to Speech Reflects Phoneme-Level Processing.

    Science.gov (United States)

    Di Liberto, Giovanni M; O'Sullivan, James A; Lalor, Edmund C

    2015-10-05

    The human ability to understand speech is underpinned by a hierarchical auditory system whose successive stages process increasingly complex attributes of the acoustic input. It has been suggested that to produce categorical speech perception, this system must elicit consistent neural responses to speech tokens (e.g., phonemes) despite variations in their acoustics. Here, using electroencephalography (EEG), we provide evidence for this categorical phoneme-level speech processing by showing that the relationship between continuous speech and neural activity is best described when that speech is represented using both low-level spectrotemporal information and categorical labeling of phonetic features. Furthermore, the mapping between phonemes and EEG becomes more discriminative for phonetic features at longer latencies, in line with what one might expect from a hierarchical system. Importantly, these effects are not seen for time-reversed speech. These findings may form the basis for future research on natural language processing in specific cohorts of interest and for broader insights into how brains transform acoustic input into meaning. Copyright © 2015 Elsevier Ltd. All rights reserved.

  16. Musical intervention enhances infants’ neural processing of temporal structure in music and speech

    OpenAIRE

    Zhao, T. Christina; Kuhl, Patricia K.

    2016-01-01

    Musicians show enhanced musical pitch and meter processing, effects that generalize to speech. Yet potential differences between musicians and nonmusicians limit conclusions. We examined the effects of a randomized laboratory-controlled music intervention on music and speech processing in 9-mo-old infants. The Intervention exposed infants to music in triple meter (the waltz) in a social environment. Controls engaged in similar social play without music. After 12 sessions, infants’ temporal in...

  17. Neural Processing of What and Who Information in Speech

    Science.gov (United States)

    Chandrasekaran, Bharath; Chan, Alice H. D.; Wong, Patrick C. M.

    2011-01-01

    Human speech is composed of two types of information, related to content (lexical information, i.e., "what" is being said [e.g., words]) and to the speaker (indexical information, i.e., "who" is talking [e.g., voices]). The extent to which lexical versus indexical information is represented separately or integrally in the brain is unresolved. In…

  18. Visual Information Can Hinder Working Memory Processing of Speech

    Science.gov (United States)

    Mishra, Sushmit; Lunner, Thomas; Stenfelt, Stefan; Ronnberg, Jerker; Rudner, Mary

    2013-01-01

    Purpose: The purpose of the present study was to evaluate the new Cognitive Spare Capacity Test (CSCT), which measures aspects of working memory capacity for heard speech in the audiovisual and auditory-only modalities of presentation. Method: In Experiment 1, 20 young adults with normal hearing performed the CSCT and an independent battery of…

  19. Residual Neural Processing of Musical Sound Features in Adult Cochlear Implant Users

    Science.gov (United States)

    Timm, Lydia; Vuust, Peter; Brattico, Elvira; Agrawal, Deepashri; Debener, Stefan; Büchner, Andreas; Dengler, Reinhard; Wittfoth, Matthias

    2014-01-01

    Auditory processing in general and music perception in particular are hampered in adult cochlear implant (CI) users. To examine the residual music perception skills and their underlying neural correlates in CI users implanted in adolescence or adulthood, we conducted an electrophysiological and behavioral study comparing adult CI users with normal-hearing age-matched controls (NH controls). We used a newly developed musical multi-feature paradigm, which makes it possible to test automatic auditory discrimination of six different types of sound feature changes inserted within a musically enriched setting lasting only 20 min. The presentation of stimuli did not require the participants' attention, allowing the study of the early automatic stage of feature processing in the auditory cortex. For the CI users, we obtained mismatch negativity (MMN) brain responses to five feature changes but not to changes of rhythm, whereas we obtained MMNs for all the feature changes in the NH controls. Furthermore, the MMNs of CI users were reduced in amplitude and delayed relative to those of NH controls for changes of pitch and guitar timbre. No other group differences in MMN parameters were found for changes in intensity and saxophone timbre. Furthermore, the MMNs in CI users reflected the behavioral scores from a respective discrimination task and were correlated with patients' age and speech intelligibility. Our results suggest that even though CI users are not performing at the same level as NH controls in neural discrimination of pitch-based features, they do possess potential neural abilities for music processing. However, CI users showed a disrupted ability to automatically discriminate rhythmic changes compared with controls. The current behavioral and MMN findings highlight the residual neural skills for music processing even in CI users who have been implanted in adolescence or adulthood. Highlights: -Automatic brain responses to musical feature changes

  20. Competing sound sources reveal spatial effects in cortical processing.

    Directory of Open Access Journals (Sweden)

    Ross K Maddox

    Full Text Available Why is spatial tuning in auditory cortex weak, even though location is important to object recognition in natural settings? This question continues to vex neuroscientists focused on linking physiological results to auditory perception. Here we show that the spatial locations of simultaneous, competing sound sources dramatically influence how well neural spike trains recorded from the zebra finch field L (an analog of mammalian primary auditory cortex) encode source identity. We find that the location of a birdsong played in quiet has little effect on the fidelity of the neural encoding of the song. However, when the song is presented along with a masker, spatial effects are pronounced. For each spatial configuration, a subset of neurons encodes song identity more robustly than others. As a result, competing sources from different locations dominate responses of different neural subpopulations, helping to separate neural responses into independent representations. These results help elucidate how cortical processing exploits spatial information to provide a substrate for selective spatial auditory attention.

  1. Using personal response systems to assess speech perception within the classroom: an approach to determine the efficacy of sound field amplification in primary school classrooms.

    Science.gov (United States)

    Vickers, Deborah A; Backus, Bradford C; Macdonald, Nora K; Rostamzadeh, Niloofar K; Mason, Nisha K; Pandya, Roshni; Marriage, Josephine E; Mahon, Merle H

    2013-01-01

    The assessment of the combined effect of classroom acoustics and sound field amplification (SFA) on children's speech perception within the "live" classroom poses a challenge to researchers. The goals of this study were to determine: (1) Whether personal response system (PRS) hand-held voting cards, together with a closed-set speech perception test (Chear Auditory Perception Test [CAPT]), provide an appropriate method for evaluating speech perception in the classroom; (2) Whether SFA provides better access to the teacher's speech than without SFA for children, taking into account vocabulary age, middle ear dysfunction or ear-canal wax, and home language. Forty-four children from two school-year groups, year 2 (aged 6 years 11 months to 7 years 10 months) and year 3 (aged 7 years 11 months to 8 years 10 months), were tested in two classrooms, using a shortened version of the four-alternative consonant discrimination section of the CAPT. All children used a PRS to register their chosen response, which they selected from four options displayed on the interactive whiteboard. The classrooms were located in a 19th-century school in central London, United Kingdom. Each child sat at their usual position in the room while target speech stimuli were presented either in quiet or in noise. The target speech was presented from the front of the classroom at 65 dBA (calibrated at 1 m) and the presented noise level was 46 dBA measured at the center of the classroom. The older children had an additional noise condition with a noise level of 52 dBA. All conditions were presented twice, once with SFA and once without SFA, and the order of testing was randomized. White noise from the teacher's right-hand side of the classroom and International Speech Test Signal from the teacher's left-hand side were used, and the noises were matched at the center point of the classroom (10-s averaging, A-weighted). Each child's expressive vocabulary age and middle ear status were measured

  2. Piano training enhances the neural processing of pitch and improves speech perception in Mandarin-speaking children.

    Science.gov (United States)

    Nan, Yun; Liu, Li; Geiser, Eveline; Shu, Hua; Gong, Chen Chen; Dong, Qi; Gabrieli, John D E; Desimone, Robert

    2018-06-25

    Musical training confers advantages in speech-sound processing, which could play an important role in early childhood education. To understand the mechanisms of this effect, we used event-related potential and behavioral measures in a longitudinal design. Seventy-four Mandarin-speaking children aged 4-5 y old were pseudorandomly assigned to piano training, reading training, or a no-contact control group. Six months of piano training improved behavioral auditory word discrimination in general as well as word discrimination based on vowels compared with the controls. The reading group yielded similar trends. However, the piano group demonstrated unique advantages over the reading and control groups in consonant-based word discrimination and in enhanced positive mismatch responses (pMMRs) to lexical tone and musical pitch changes. The improved word discrimination based on consonants correlated with the enhancements in musical pitch pMMRs among the children in the piano group. In contrast, all three groups improved equally on general cognitive measures, including tests of IQ, working memory, and attention. The results suggest strengthened common sound processing across domains as an important mechanism underlying the benefits of musical training on language processing. In addition, although we failed to find far-transfer effects of musical training to general cognition, the near-transfer effects to speech perception establish the potential for musical training to help children improve their language skills. Piano training was not inferior to reading training on direct tests of language function, and it even seemed superior to reading training in enhancing consonant discrimination.

  3. High school music classes enhance the neural processing of speech

    OpenAIRE

    Tierney, Adam; Krizman, Jennifer; Skoe, Erika; Johnston, Kathleen; Kraus, Nina

    2013-01-01

    Should music be a priority in public education? One argument for teaching music in school is that private music instruction relates to enhanced language abilities and neural function. However, the directionality of this relationship is unclear and it is unknown whether school-based music training can produce these enhancements. Here we show that two years of group music classes in high school enhance the subcortical encoding of speech. To tease apart the relationships between music and neural...

  4. Seeing the talker’s face supports executive processing of speech in steady state noise

    OpenAIRE

    Sushmit eMishra; Thomas eLunner; Thomas eLunner; Thomas eLunner; Stefan eStenfelt; Stefan eStenfelt; Jerker eRönnberg; Mary eRudner

    2013-01-01

    Listening to speech in noise depletes cognitive resources, affecting speech processing. The present study investigated how remaining resources or cognitive spare capacity (CSC) can be deployed by young adults with normal hearing. We administered a test of CSC (CSCT, Mishra et al., 2013) along with a battery of established cognitive tests to 20 participants with normal hearing. In the CSCT, lists of two-digit numbers were presented with and without visual cues in quiet, as well as in steady-st...

  5. Application of adaptive digital signal processing to speech enhancement for the hearing impaired.

    Science.gov (United States)

    Chabries, D M; Christiansen, R W; Brey, R H; Robinette, M S; Harris, R W

    1987-01-01

    A major complaint of individuals with normal hearing and hearing impairments is a reduced ability to understand speech in a noisy environment. This paper describes the concept of adaptive noise cancelling for removing noise from corrupted speech signals. Application of adaptive digital signal processing has long been known and is described from a historical as well as technical perspective. The Widrow-Hoff LMS (least mean square) algorithm developed in 1959 forms the introduction to modern adaptive signal processing. This method uses a "primary" input which consists of the desired speech signal corrupted with noise and a second "reference" signal which is used to estimate the primary noise signal. By subtracting the adaptively filtered estimate of the noise, the desired speech signal is obtained. Recent developments in the field as they relate to noise cancellation are described. These developments include more computationally efficient algorithms as well as algorithms that exhibit improved learning performance. A second method for removing noise from speech, for use when no independent reference for the noise exists, is referred to as single channel noise suppression. Both adaptive and spectral subtraction techniques have been applied to this problem--often with the result of decreased speech intelligibility. Current techniques applied to this problem are described, including signal processing techniques that offer promise in the noise suppression application.
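    The Widrow-Hoff LMS update described above is compact enough to sketch directly. The following Python sketch illustrates two-input adaptive noise cancelling in the generic form the abstract describes; it is not the authors' implementation, and the filter length, step size, and signal names are illustrative assumptions.

    ```python
    import numpy as np

    def lms_noise_canceller(primary, reference, n_taps=32, mu=0.01):
        """Two-input adaptive noise canceller using the Widrow-Hoff LMS rule.

        primary   -- desired speech corrupted by noise
        reference -- noise correlated with the noise in `primary`
        Returns the error signal, which serves as the speech estimate.
        """
        w = np.zeros(n_taps)                   # adaptive filter weights
        out = np.zeros(len(primary))
        for n in range(n_taps, len(primary)):
            x = reference[n - n_taps:n][::-1]  # most recent reference samples
            noise_est = w @ x                  # adaptively filtered noise estimate
            out[n] = primary[n] - noise_est    # subtract the noise estimate
            # LMS weight update; mu must be small relative to the
            # reference signal power for the filter to converge.
            w += 2.0 * mu * out[n] * x
        return out
    ```

    The error signal converges toward the clean speech because the adaptive filter can only predict, and therefore remove, the component of the primary input that is correlated with the reference noise.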

  6. The pupil response is sensitive to divided attention during speech processing.

    Science.gov (United States)

    Koelewijn, Thomas; Shinn-Cunningham, Barbara G; Zekveld, Adriana A; Kramer, Sophia E

    2014-06-01

Dividing attention over two streams of speech strongly decreases performance compared to focusing on only one. How divided attention affects cognitive processing load as indexed with pupillometry during speech recognition has so far not been investigated. In 12 young adults the pupil response was recorded while they focused on either one or both of two sentences that were presented dichotically and masked by fluctuating noise across a range of signal-to-noise ratios. In line with previous studies, performance decreased when processing two target sentences instead of one. Additionally, dividing attention to process two sentences caused larger pupil dilation and later peak pupil latency than processing only one. This suggests an effect of attention on cognitive processing load (pupil dilation) during speech processing in noise. Copyright © 2014 The Authors. Published by Elsevier B.V. All rights reserved.

  7. Neural Entrainment to Speech Modulates Speech Intelligibility

    NARCIS (Netherlands)

    Riecke, Lars; Formisano, Elia; Sorger, Bettina; Baskent, Deniz; Gaudrain, Etienne

    2018-01-01

    Speech is crucial for communication in everyday life. Speech-brain entrainment, the alignment of neural activity to the slow temporal fluctuations (envelope) of acoustic speech input, is a ubiquitous element of current theories of speech processing. Associations between speech-brain entrainment and

  8. The sound manifesto

    Science.gov (United States)

    O'Donnell, Michael J.; Bisnovatyi, Ilia

    2000-11-01

    Computing practice today depends on visual output to drive almost all user interaction. Other senses, such as audition, may be totally neglected, or used tangentially, or used in highly restricted specialized ways. We have excellent audio rendering through D-A conversion, but we lack rich general facilities for modeling and manipulating sound comparable in quality and flexibility to graphics. We need coordinated research in several disciplines to improve the use of sound as an interactive information channel. Incremental and separate improvements in synthesis, analysis, speech processing, audiology, acoustics, music, etc. will not alone produce the radical progress that we seek in sonic practice. We also need to create a new central topic of study in digital audio research. The new topic will assimilate the contributions of different disciplines on a common foundation. The key central concept that we lack is sound as a general-purpose information channel. We must investigate the structure of this information channel, which is driven by the cooperative development of auditory perception and physical sound production. Particular audible encodings, such as speech and music, illuminate sonic information by example, but they are no more sufficient for a characterization than typography is sufficient for characterization of visual information. To develop this new conceptual topic of sonic information structure, we need to integrate insights from a number of different disciplines that deal with sound. In particular, we need to coordinate central and foundational studies of the representational models of sound with specific applications that illuminate the good and bad qualities of these models. Each natural or artificial process that generates informative sound, and each perceptual mechanism that derives information from sound, will teach us something about the right structure to attribute to the sound itself. The new Sound topic will combine the work of computer

  9. Speech Processing to Improve the Perception of Speech in Background Noise for Children With Auditory Processing Disorder and Typically Developing Peers.

    Science.gov (United States)

    Flanagan, Sheila; Zorilă, Tudor-Cătălin; Stylianou, Yannis; Moore, Brian C J

    2018-01-01

    Auditory processing disorder (APD) may be diagnosed when a child has listening difficulties but has normal audiometric thresholds. For adults with normal hearing and with mild-to-moderate hearing impairment, an algorithm called spectral shaping with dynamic range compression (SSDRC) has been shown to increase the intelligibility of speech when background noise is added after the processing. Here, we assessed the effect of such processing using 8 children with APD and 10 age-matched control children. The loudness of the processed and unprocessed sentences was matched using a loudness model. The task was to repeat back sentences produced by a female speaker when presented with either speech-shaped noise (SSN) or a male competing speaker (CS) at two signal-to-background ratios (SBRs). Speech identification was significantly better with SSDRC processing than without, for both groups. The benefit of SSDRC processing was greater for the SSN than for the CS background. For the SSN, scores were similar for the two groups at both SBRs. For the CS, the APD group performed significantly more poorly than the control group. The overall improvement produced by SSDRC processing could be useful for enhancing communication in a classroom where the teacher's voice is broadcast using a wireless system.
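    The abstract does not give the SSDRC equations, but the two-stage idea (spectral shaping followed by dynamic range compression, applied to the speech before noise is added) can be sketched. The Python sketch below stands in for the published algorithm with deliberately simple choices: pre-emphasis as the "spectral shaping" stage and an attack/release envelope compressor as the "dynamic range compression" stage. All parameter values are assumptions, not those of the actual SSDRC algorithm.

    ```python
    import numpy as np

    def ssdrc_like(x, fs, alpha=0.97, attack_ms=2.0, release_ms=20.0, ratio=3.0):
        """Crude two-stage sketch: spectral tilt flattening (pre-emphasis)
        followed by envelope-based dynamic range compression."""
        # Stage 1: spectral shaping, here reduced to simple pre-emphasis.
        shaped = np.append(x[0], x[1:] - alpha * x[:-1])

        # Stage 2: track the temporal envelope with attack/release smoothing.
        env = np.abs(shaped)
        a_att = np.exp(-1.0 / (fs * attack_ms / 1000.0))
        a_rel = np.exp(-1.0 / (fs * release_ms / 1000.0))
        smooth = np.zeros_like(env)
        level = 1e-6
        for n, e in enumerate(env):
            a = a_att if e > level else a_rel   # fast attack, slow release
            level = a * level + (1.0 - a) * e
            smooth[n] = level

        # Gain that compresses envelope fluctuations by `ratio` in the log domain.
        ref = np.mean(smooth) + 1e-12
        gain = (ref / (smooth + 1e-12)) ** (1.0 - 1.0 / ratio)
        y = shaped * gain
        # Roughly match overall loudness by equating RMS with the input.
        return y * (np.sqrt(np.mean(x ** 2)) / (np.sqrt(np.mean(y ** 2)) + 1e-12))
    ```

    The compression stage boosts low-energy stretches (such as consonants) and attenuates high-energy ones, which is one plausible reason such processing helps most against steady speech-shaped noise.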

  10. Biological impact of preschool music classes on processing speech in noise.

    Science.gov (United States)

    Strait, Dana L; Parbery-Clark, Alexandra; O'Connell, Samantha; Kraus, Nina

    2013-10-01

    Musicians have increased resilience to the effects of noise on speech perception and its neural underpinnings. We do not know, however, how early in life these enhancements arise. We compared auditory brainstem responses to speech in noise in 32 preschool children, half of whom were engaged in music training. Thirteen children returned for testing one year later, permitting the first longitudinal assessment of subcortical auditory function with music training. Results indicate emerging neural enhancements in musically trained preschoolers for processing speech in noise. Longitudinal outcomes reveal that children enrolled in music classes experience further increased neural resilience to background noise following one year of continued training compared to nonmusician peers. Together, these data reveal enhanced development of neural mechanisms undergirding speech-in-noise perception in preschoolers undergoing music training and may indicate a biological impact of music training on auditory function during early childhood. Copyright © 2013 Elsevier Ltd. All rights reserved.

  11. Recommendations for elaboration, transcultural adaptation and validation process of tests in Speech, Hearing and Language Pathology.

    Science.gov (United States)

    Pernambuco, Leandro; Espelt, Albert; Magalhães, Hipólito Virgílio; Lima, Kenio Costa de

    2017-06-08

    to present a guide with recommendations for translation, adaptation, elaboration and process of validation of tests in Speech and Language Pathology. the recommendations were based on international guidelines with a focus on the elaboration, translation, cross-cultural adaptation and validation process of tests. the recommendations were grouped into two Charts, one of them with procedures for translation and transcultural adaptation and the other for obtaining evidence of validity, reliability and measures of accuracy of the tests. a guide with norms for the organization and systematization of the process of elaboration, translation, cross-cultural adaptation and validation process of tests in Speech and Language Pathology was created.

  12. Self-Repair and Language Selection in Bilingual Speech Processing

    Directory of Open Access Journals (Sweden)

    Inga Hennecke

    2013-07-01

Full Text Available In psycholinguistic research the exact level of language selection in bilingual lexical access is still controversial and current models of bilingual speech production offer conflicting statements about the mechanisms and location of language selection. This paper aims to provide a corpus analysis of self-repair mechanisms in code-switching contexts of highly fluent bilingual speakers in order to gain further insights into bilingual speech production. The present paper follows the assumptions of the Selection by Proficiency model, which claims that language proficiency and lexical robustness determine the mechanism and level of language selection. In accordance with this hypothesis, highly fluent bilinguals select languages at a prelexical level, which should influence the occurrence of self-repairs in bilingual speech. A corpus of natural speech data of highly fluent and balanced bilingual French-English speakers of the Canadian French variety Franco-Manitoban serves as the basis for a detailed analysis of different self-repair mechanisms in code-switching environments. Although the speech data contain a large amount of code-switching, results reveal that only a few speech errors and self-repairs occur in direct code-switching environments. A detailed analysis of the respective starting point of code-switching and the different repair mechanisms supports the hypothesis that highly proficient bilinguals do not select languages at the lexical level.

  13. Suppression of the µ rhythm during speech and non-speech discrimination revealed by independent component analysis: implications for sensorimotor integration in speech processing.

    Science.gov (United States)

    Bowers, Andrew; Saltuklaroglu, Tim; Harkrider, Ashley; Cuellar, Megan

    2013-01-01

Constructivist theories propose that articulatory hypotheses about incoming phonetic targets may function to enhance perception by limiting the possibilities for sensory analysis. To provide evidence for this proposal, it is necessary to map ongoing, high-temporal resolution changes in sensorimotor activity (i.e., the sensorimotor μ rhythm) to accurate speech and non-speech discrimination performance (i.e., correct trials). Sixteen participants (15 female and 1 male) were asked to passively listen to or actively identify speech and tone-sweeps in a two-alternative forced-choice discrimination task while the electroencephalograph (EEG) was recorded from 32 channels. The stimuli were presented at signal-to-noise ratios (SNRs) in which discrimination accuracy was high (i.e., 80-100%) and low SNRs producing discrimination performance at chance. EEG data were decomposed using independent component analysis and clustered across participants using principal component methods in EEGLAB. ICA revealed left and right sensorimotor µ components for 14/16 and 13/16 participants, respectively, which were identified on the basis of scalp topography, spectral peaks, and localization to the precentral and postcentral gyri. Time-frequency analysis of left and right lateralized µ component clusters revealed significant (pFDR < .05) µ suppression for correct speech discrimination trials relative to chance trials following stimulus offset. Findings are consistent with constructivist, internal model theories proposing that early forward motor models generate predictions about likely phonemic units that are then synthesized with incoming sensory cues during active as opposed to passive processing. Future directions and possible translational value for clinical populations in which sensorimotor integration may play a functional role are discussed.
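    For readers unfamiliar with the analysis pipeline, the decomposition step can be illustrated outside EEGLAB. Here is a minimal Python sketch using scikit-learn's FastICA as a generic stand-in for the ICA used in the study (EEGLAB typically applies Infomax ICA); the array shapes and the random placeholder data are assumptions.

    ```python
    import numpy as np
    from sklearn.decomposition import FastICA

    # Placeholder for real data: n_samples x 32 channels of preprocessed EEG.
    rng = np.random.default_rng(0)
    eeg = rng.standard_normal((10000, 32))

    ica = FastICA(n_components=32, random_state=0, max_iter=1000)
    sources = ica.fit_transform(eeg)  # component activations over time
    mixing = ica.mixing_              # columns are scalp projections (topographies)

    # A sensorimotor mu component would be selected by its scalp topography
    # (a column of `mixing`), spectral peaks near ~10 and ~20 Hz, and source
    # localization; e.g., inspect the power spectrum of component k:
    k = 0
    spectrum = np.abs(np.fft.rfft(sources[:, k])) ** 2
    ```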

  14. Conflict monitoring in speech processing : An fMRI study of error detection in speech production and perception

    NARCIS (Netherlands)

    Gauvin, Hanna; De Baene, W.; Brass, Marcel; Hartsuiker, Robert

    2016-01-01

    To minimize the number of errors in speech, and thereby facilitate communication, speech is monitored before articulation. It is, however, unclear at which level during speech production monitoring takes place, and what mechanisms are used to detect and correct errors. The present study investigated

  15. Phonological processes in the speech of school-age children with hearing loss: Comparisons with children with normal hearing.

    Science.gov (United States)

    Asad, Areej Nimer; Purdy, Suzanne C; Ballard, Elaine; Fairgray, Liz; Bowen, Caroline

    2018-04-27

    In this descriptive study, phonological processes were examined in the speech of children aged 5;0-7;6 (years; months) with mild to profound hearing loss using hearing aids (HAs) and cochlear implants (CIs), in comparison to their peers. A second aim was to compare phonological processes of HA and CI users. Children with hearing loss (CWHL, N = 25) were compared to children with normal hearing (CWNH, N = 30) with similar age, gender, linguistic, and socioeconomic backgrounds. Speech samples obtained from a list of 88 words, derived from three standardized speech tests, were analyzed using the CASALA (Computer Aided Speech and Language Analysis) program to evaluate participants' phonological systems, based on lax (a process appeared at least twice in the speech of at least two children) and strict (a process appeared at least five times in the speech of at least two children) counting criteria. Developmental phonological processes were eliminated in the speech of younger and older CWNH while eleven developmental phonological processes persisted in the speech of both age groups of CWHL. CWHL showed a similar trend of age of elimination to CWNH, but at a slower rate. Children with HAs and CIs produced similar phonological processes. Final consonant deletion, weak syllable deletion, backing, and glottal replacement were present in the speech of HA users, affecting their overall speech intelligibility. Developmental and non-developmental phonological processes persist in the speech of children with mild to profound hearing loss compared to their peers with typical hearing. The findings indicate that it is important for clinicians to consider phonological assessment in pre-school CWHL and the use of evidence-based speech therapy in order to reduce non-developmental and non-age-appropriate developmental processes, thereby enhancing their speech intelligibility. Copyright © 2018 Elsevier Inc. All rights reserved.

  16. Viewing speech in action: speech articulation videos in the public domain that demonstrate the sounds of the International Phonetic Alphabet (IPA)

    OpenAIRE

    Nakai, S.; Beavan, D.; Lawson, E.; Leplâtre, G.; Scobbie, J. M.; Stuart-Smith, J.

    2016-01-01

    In this article, we introduce recently released, publicly available resources, which allow users to watch videos of hidden articulators (e.g. the tongue) during the production of various types of sounds found in the world’s languages. The articulation videos on these resources are linked to a clickable International Phonetic Alphabet chart ([International Phonetic Association. 1999. Handbook of the International Phonetic Association: A Guide to the Use of the International Phonetic Alphabet. ...

  17. Speech processing: from peripheral to hemispheric asymmetry of the auditory system.

    Science.gov (United States)

    Lazard, Diane S; Collette, Jean-Louis; Perrot, Xavier

    2012-01-01

Language processing from the cochlea to auditory association cortices shows side-dependent specificities with an apparent left hemispheric dominance. The aim of this article was to propose to nonspeech specialists a didactic review of two complementary theories about hemispheric asymmetry in speech processing. Starting from anatomico-physiological and clinical observations of auditory asymmetry and interhemispheric connections, this review then presents behavioral (dichotic listening paradigm) as well as functional (functional magnetic resonance imaging and positron emission tomography) experiments that assessed hemispheric specialization for speech processing. Even though speech at an early phonological level is regarded as being processed bilaterally, a left-hemispheric dominance exists for higher-level processing. This asymmetry may arise from a segregation of the speech signal, broken apart within nonprimary auditory areas in two distinct temporal integration windows--a fast one on the left and a slower one on the right--modeled through the asymmetric sampling in time theory, or from a spectro-temporal trade-off, with a higher temporal resolution in the left hemisphere and a higher spectral resolution in the right hemisphere, modeled through the spectral/temporal resolution trade-off theory. Both theories deal with the concept that lower-order tuning principles for the acoustic signal might drive higher-order organization for speech processing. However, the precise nature, mechanisms, and origin of speech processing asymmetry are still being debated. Finally, an example of hemispheric asymmetry alteration, which has direct clinical implications, is given through the case of auditory aging that mixes peripheral disorder and modifications of central processing. Copyright © 2011 The American Laryngological, Rhinological, and Otological Society, Inc.

  18. Electrophysiological evidence for a self-processing advantage during audiovisual speech integration.

    Science.gov (United States)

    Treille, Avril; Vilain, Coriandre; Kandel, Sonia; Sato, Marc

    2017-09-01

    Previous electrophysiological studies have provided strong evidence for early multisensory integrative mechanisms during audiovisual speech perception. From these studies, one unanswered issue is whether hearing our own voice and seeing our own articulatory gestures facilitate speech perception, possibly through a better processing and integration of sensory inputs with our own sensory-motor knowledge. The present EEG study examined the impact of self-knowledge during the perception of auditory (A), visual (V) and audiovisual (AV) speech stimuli that were previously recorded from the participant or from a speaker he/she had never met. Audiovisual interactions were estimated by comparing N1 and P2 auditory evoked potentials during the bimodal condition (AV) with the sum of those observed in the unimodal conditions (A + V). In line with previous EEG studies, our results revealed an amplitude decrease of P2 auditory evoked potentials in AV compared to A + V conditions. Crucially, a temporal facilitation of N1 responses was observed during the visual perception of self speech movements compared to those of another speaker. This facilitation was negatively correlated with the saliency of visual stimuli. These results provide evidence for a temporal facilitation of the integration of auditory and visual speech signals when the visual situation involves our own speech gestures.

  19. A New Method for Rating Hazard from Intense Sounds: Implications for Hearing Protection, Speech Intelligibility, and Situation Awareness

    National Research Council Canada - National Science Library

    Price, G. R

    2005-01-01

    The auditory hazard assessment algorithm for the human (AHAAH), developed by the U.S. Army Research Laboratory, is theoretically based and has been demonstrated to rate hazard from intense sounds much more accurately than existing methods...

  20. The mechanism of speech processing in congenital amusia: evidence from Mandarin speakers.

    Directory of Open Access Journals (Sweden)

    Fang Liu

    Full Text Available Congenital amusia is a neuro-developmental disorder of pitch perception that causes severe problems with music processing but only subtle difficulties in speech processing. This study investigated speech processing in a group of Mandarin speakers with congenital amusia. Thirteen Mandarin amusics and thirteen matched controls participated in a set of tone and intonation perception tasks and two pitch threshold tasks. Compared with controls, amusics showed impaired performance on word discrimination in natural speech and their gliding tone analogs. They also performed worse than controls on discriminating gliding tone sequences derived from statements and questions, and showed elevated thresholds for pitch change detection and pitch direction discrimination. However, they performed as well as controls on word identification, and on statement-question identification and discrimination in natural speech. Overall, tasks that involved multiple acoustic cues to communicative meaning were not impacted by amusia. Only when the tasks relied mainly on pitch sensitivity did amusics show impaired performance compared to controls. These findings help explain why amusia only affects speech processing in subtle ways. Further studies on a larger sample of Mandarin amusics and on amusics of other language backgrounds are needed to consolidate these results.

  1. The mechanism of speech processing in congenital amusia: evidence from Mandarin speakers.

    Science.gov (United States)

    Liu, Fang; Jiang, Cunmei; Thompson, William Forde; Xu, Yi; Yang, Yufang; Stewart, Lauren

    2012-01-01

    Congenital amusia is a neuro-developmental disorder of pitch perception that causes severe problems with music processing but only subtle difficulties in speech processing. This study investigated speech processing in a group of Mandarin speakers with congenital amusia. Thirteen Mandarin amusics and thirteen matched controls participated in a set of tone and intonation perception tasks and two pitch threshold tasks. Compared with controls, amusics showed impaired performance on word discrimination in natural speech and their gliding tone analogs. They also performed worse than controls on discriminating gliding tone sequences derived from statements and questions, and showed elevated thresholds for pitch change detection and pitch direction discrimination. However, they performed as well as controls on word identification, and on statement-question identification and discrimination in natural speech. Overall, tasks that involved multiple acoustic cues to communicative meaning were not impacted by amusia. Only when the tasks relied mainly on pitch sensitivity did amusics show impaired performance compared to controls. These findings help explain why amusia only affects speech processing in subtle ways. Further studies on a larger sample of Mandarin amusics and on amusics of other language backgrounds are needed to consolidate these results.

  2. Effects of Sound, Vocabulary, and Grammar Learning Aptitude on Adult Second Language Speech Attainment in Foreign Language Classrooms

    Science.gov (United States)

    Saito, Kazuya

    2017-01-01

    This study examines the relationship between different types of language learning aptitude (measured via the LLAMA test) and adult second language (L2) learners' attainment in speech production in English-as-a-foreign-language (EFL) classrooms. Picture descriptions elicited from 50 Japanese EFL learners from varied proficiency levels were analyzed…

  3. CASRA+: A Colloquial Arabic Speech Recognition Application

    OpenAIRE

    Ramzi A. Haraty; Omar El Ariss

    2007-01-01

The research proposed here was for an Arabic speech recognition application, concentrating on the Lebanese dialect. The system starts by sampling the speech, that is, transforming the sound from analog to digital, and then extracts features using Mel-Frequency Cepstral Coefficients (MFCC). The extracted features are then compared with the system's stored model; in this case the stored model chosen was a phoneme-based model. This reference model differs from the direc...
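    Since the abstract names MFCCs as the front end, a compact illustration may help. This is a generic textbook MFCC computation in Python, not the CASRA+ code; the frame size, hop, filter count, and sampling rate are assumptions.

    ```python
    import numpy as np

    def mfcc(signal, fs=16000, n_fft=512, hop=160, n_mels=26, n_ceps=13):
        """Minimal MFCC front end: frame -> window -> |FFT|^2 -> mel filterbank
        -> log -> DCT. Production systems add pre-emphasis, liftering, deltas."""
        # Frame and window the signal.
        frames = np.lib.stride_tricks.sliding_window_view(signal, n_fft)[::hop]
        frames = frames * np.hamming(n_fft)
        power = np.abs(np.fft.rfft(frames, axis=1)) ** 2

        # Triangular mel filterbank between 0 Hz and the Nyquist frequency.
        mel = lambda f: 2595.0 * np.log10(1.0 + f / 700.0)
        imel = lambda m: 700.0 * (10.0 ** (m / 2595.0) - 1.0)
        pts = imel(np.linspace(mel(0.0), mel(fs / 2.0), n_mels + 2))
        bins = np.floor((n_fft + 1) * pts / fs).astype(int)
        fbank = np.zeros((n_mels, n_fft // 2 + 1))
        for i in range(n_mels):
            l, c, r = bins[i], bins[i + 1], bins[i + 2]
            fbank[i, l:c] = (np.arange(l, c) - l) / max(c - l, 1)
            fbank[i, c:r] = (r - np.arange(c, r)) / max(r - c, 1)
        logmel = np.log(power @ fbank.T + 1e-10)

        # Type-II DCT; keep the first n_ceps cepstral coefficients.
        n = np.arange(n_mels)
        dct = np.cos(np.pi * np.outer(np.arange(n_ceps), 2 * n + 1) / (2 * n_mels))
        return logmel @ dct.T
    ```

    Each row of the returned matrix is the feature vector for one 10-ms frame (at the assumed 16-kHz rate), which is what a phoneme-based reference model would be trained on.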

  4. Pitch and Time Processing in Speech and Tones: The Effects of Musical Training and Attention

    Science.gov (United States)

    Sares, Anastasia G.; Foster, Nicholas E. V.; Allen, Kachina; Hyde, Krista L.

    2018-01-01

    Purpose: Musical training is often linked to enhanced auditory discrimination, but the relative roles of pitch and time in music and speech are unclear. Moreover, it is unclear whether pitch and time processing are correlated across individuals and how they may be affected by attention. This study aimed to examine pitch and time processing in…

  5. Temporal and speech processing skills in normal hearing individuals exposed to occupational noise.

    Science.gov (United States)

    Kumar, U Ajith; Ameenudin, Syed; Sangamanatha, A V

    2012-01-01

Prolonged exposure to high levels of occupational noise can cause damage to hair cells in the cochlea and result in permanent noise-induced cochlear hearing loss. Consequences of cochlear hearing loss on speech perception and psychophysical abilities have been well documented. The primary goal of this research was to explore temporal processing and speech perception skills in individuals who are exposed to occupational noise of more than 80 dBA and have not yet incurred clinically significant threshold shifts. The contribution of temporal processing skills to speech perception in adverse listening situations was also evaluated. A total of 118 participants took part in this research. Participants comprised three groups of train drivers in the age ranges of 30-40 (n = 13), 41-50 (n = 9), and 51-60 (n = 6) years and their non-noise-exposed counterparts (n = 30 in each age group). Participants of all the groups, including the train drivers, had hearing sensitivity within 25 dB HL at the octave frequencies between 250 Hz and 8 kHz. Temporal processing was evaluated using gap detection, modulation detection, and duration pattern tests. Speech recognition was tested in the presence of multi-talker babble at -5 dB SNR. Differences between experimental and control groups were analyzed using ANOVA and independent-sample t-tests. Results showed a trend of reduced temporal processing skills in individuals with noise exposure. These deficits were observed despite normal peripheral hearing sensitivity. Speech recognition scores in the presence of noise were also significantly poorer in the noise-exposed group. Furthermore, poor temporal processing skills partially accounted for the speech recognition difficulties exhibited by the noise-exposed individuals. These results suggest that noise can cause significant distortions in the processing of suprathreshold temporal cues, which may add to difficulties in hearing in adverse listening conditions.

  6. Temporal and speech processing skills in normal hearing individuals exposed to occupational noise

    Directory of Open Access Journals (Sweden)

    U Ajith Kumar

    2012-01-01

Full Text Available Prolonged exposure to high levels of occupational noise can cause damage to hair cells in the cochlea and result in permanent noise-induced cochlear hearing loss. Consequences of cochlear hearing loss on speech perception and psychophysical abilities have been well documented. The primary goal of this research was to explore temporal processing and speech perception skills in individuals who are exposed to occupational noise of more than 80 dBA and have not yet incurred clinically significant threshold shifts. The contribution of temporal processing skills to speech perception in adverse listening situations was also evaluated. A total of 118 participants took part in this research. Participants comprised three groups of train drivers in the age ranges of 30-40 (n = 13), 41-50 (n = 9), and 51-60 (n = 6) years and their non-noise-exposed counterparts (n = 30 in each age group). Participants of all the groups, including the train drivers, had hearing sensitivity within 25 dB HL at the octave frequencies between 250 Hz and 8 kHz. Temporal processing was evaluated using gap detection, modulation detection, and duration pattern tests. Speech recognition was tested in the presence of multi-talker babble at -5 dB SNR. Differences between experimental and control groups were analyzed using ANOVA and independent-sample t-tests. Results showed a trend of reduced temporal processing skills in individuals with noise exposure. These deficits were observed despite normal peripheral hearing sensitivity. Speech recognition scores in the presence of noise were also significantly poorer in the noise-exposed group. Furthermore, poor temporal processing skills partially accounted for the speech recognition difficulties exhibited by the noise-exposed individuals. These results suggest that noise can cause significant distortions in the processing of suprathreshold temporal cues, which may add to difficulties in hearing in adverse listening conditions.

  7. Electrophysiological evidence for speech-specific audiovisual integration.

    Science.gov (United States)

    Baart, Martijn; Stekelenburg, Jeroen J; Vroomen, Jean

    2014-01-01

Lip-read speech is integrated with heard speech at various neural levels. Here, we investigated the extent to which lip-read induced modulations of the auditory N1 and P2 (measured with EEG) are indicative of speech-specific audiovisual integration, and we explored to what extent the ERPs were modulated by phonetic audiovisual congruency. In order to disentangle speech-specific (phonetic) integration from non-speech integration, we used Sine-Wave Speech (SWS) that was perceived as speech by half of the participants (they were in speech mode), while the other half was in non-speech mode. Results showed that the N1 obtained with audiovisual stimuli peaked earlier than the N1 evoked by auditory-only stimuli. This lip-read induced speeding up of the N1 occurred for listeners in speech and non-speech mode. In contrast, if listeners were in speech mode, lip-read speech also modulated the auditory P2, but not if listeners were in non-speech mode, thus revealing speech-specific audiovisual binding. Comparing ERPs for phonetically congruent audiovisual stimuli with ERPs for incongruent stimuli revealed an effect of phonetic stimulus congruency that started at ~200 ms after (in)congruence became apparent. Critically, akin to the P2 suppression, congruency effects were only observed if listeners were in speech mode, and not if they were in non-speech mode. Using identical stimuli, we thus confirm that audiovisual binding involves (partially) different neural mechanisms for sound processing in speech and non-speech mode. © 2013 Published by Elsevier Ltd.

  8. Functional neuroanatomy of gesture-speech integration in children varies with individual differences in gesture processing.

    Science.gov (United States)

    Demir-Lira, Özlem Ece; Asaridou, Salomi S; Raja Beharelle, Anjali; Holt, Anna E; Goldin-Meadow, Susan; Small, Steven L

    2018-03-08

    Gesture is an integral part of children's communicative repertoire. However, little is known about the neurobiology of speech and gesture integration in the developing brain. We investigated how 8- to 10-year-old children processed gesture that was essential to understanding a set of narratives. We asked whether the functional neuroanatomy of gesture-speech integration varies as a function of (1) the content of speech, and/or (2) individual differences in how gesture is processed. When gestures provided missing information not present in the speech (i.e., disambiguating gesture; e.g., "pet" + flapping palms = bird), the presence of gesture led to increased activity in inferior frontal gyri, the right middle temporal gyrus, and the left superior temporal gyrus, compared to when gesture provided redundant information (i.e., reinforcing gesture; e.g., "bird" + flapping palms = bird). This pattern of activation was found only in children who were able to successfully integrate gesture and speech behaviorally, as indicated by their performance on post-test story comprehension questions. Children who did not glean meaning from gesture did not show differential activation across the two conditions. Our results suggest that the brain activation pattern for gesture-speech integration in children overlaps with-but is broader than-the pattern in adults performing the same task. Overall, our results provide a possible neurobiological mechanism that could underlie children's increasing ability to integrate gesture and speech over childhood, and account for individual differences in that integration. © 2018 John Wiley & Sons Ltd.

  9. Asymmetry in infants’ selective attention to facial features during visual processing of infant-directed speech

    Directory of Open Access Journals (Sweden)

    Nicholas A Smith

    2013-09-01

    Full Text Available Two experiments used eye tracking to examine how infant and adult observers distribute their eye gaze on videos of a mother producing infant- and adult-directed speech. Both groups showed greater attention to the eyes than to the nose and mouth, as well as an asymmetrical focus on the talker’s right eye for infant-directed speech stimuli. Observers continued to look more at the talker’s apparent right eye when the video stimuli were mirror flipped, suggesting that the asymmetry reflects a perceptual processing bias rather than a stimulus artifact, which may be related to cerebral lateralization of emotion processing.

  10. A comparison between the first-fit settings of two multichannel digital signal-processing strategies: music quality ratings and speech-in-noise scores.

    Science.gov (United States)

    Higgins, Paul; Searchfield, Grant; Coad, Gavin

    2012-06-01

    The aim of this study was to determine which level-dependent hearing aid digital signal-processing strategy (DSP) participants preferred when listening to music and/or performing a speech-in-noise task. Two receiver-in-the-ear hearing aids were compared: one using 32-channel adaptive dynamic range optimization (ADRO) and the other wide dynamic range compression (WDRC) incorporating dual fast (4 channel) and slow (15 channel) processing. The manufacturers' first-fit settings based on participants' audiograms were used in both cases. Results were obtained from 18 participants on a quick speech-in-noise (QuickSIN; Killion, Niquette, Gudmundsen, Revit, & Banerjee, 2004) task and for 3 music listening conditions (classical, jazz, and rock). Participants preferred the quality of music and performed better at the QuickSIN task using the hearing aids with ADRO processing. A potential reason for the better performance of the ADRO hearing aids was less fluctuation in output with change in sound dynamics. ADRO processing has advantages for both music quality and speech recognition in noise over the multichannel WDRC processing that was used in the study. Further evaluations of which DSP aspects contribute to listener preference are required.
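    Published descriptions characterize ADRO as a set of slowly acting, percentile-based gain rules per frequency channel, in contrast to WDRC's fast compressive gain. A toy Python sketch of one such rule update follows; the target levels, step size, and rule structure are illustrative assumptions, not the commercial algorithm.

    ```python
    import numpy as np

    def adro_like_gains(band_levels_db, gains_db, audib_target=55.0,
                        comfort_limit=85.0, step_db=0.05):
        """One update of an ADRO-style slow gain rule per frequency channel.

        band_levels_db -- recent output-level percentile estimate per channel
                          (e.g., the 90th percentile over the last second)
        gains_db       -- current channel gains in dB, modified in place
        """
        for ch, level in enumerate(band_levels_db):
            if level > comfort_limit:      # comfort rule: too loud, step gain down
                gains_db[ch] -= step_db
            elif level < audib_target:     # audibility rule: too soft, step gain up
                gains_db[ch] += step_db
        return gains_db
    ```

    Because the gain moves in small steps between frames, the output fluctuates less with rapid changes in sound dynamics, which is consistent with the explanation the authors offer for the ADRO preference.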

  11. The sound symbolism bootstrapping hypothesis for language acquisition and language evolution.

    Science.gov (United States)

    Imai, Mutsumi; Kita, Sotaro

    2014-09-19

    Sound symbolism is a non-arbitrary relationship between speech sounds and meaning. We review evidence that, contrary to the traditional view in linguistics, sound symbolism is an important design feature of language, which affects online processing of language, and most importantly, language acquisition. We propose the sound symbolism bootstrapping hypothesis, claiming that (i) pre-verbal infants are sensitive to sound symbolism, due to a biologically endowed ability to map and integrate multi-modal input, (ii) sound symbolism helps infants gain referential insight for speech sounds, (iii) sound symbolism helps infants and toddlers associate speech sounds with their referents to establish a lexical representation and (iv) sound symbolism helps toddlers learn words by allowing them to focus on referents embedded in a complex scene, alleviating Quine's problem. We further explore the possibility that sound symbolism is deeply related to language evolution, drawing the parallel between historical development of language across generations and ontogenetic development within individuals. Finally, we suggest that sound symbolism bootstrapping is a part of a more general phenomenon of bootstrapping by means of iconic representations, drawing on similarities and close behavioural links between sound symbolism and speech-accompanying iconic gesture. © 2014 The Author(s) Published by the Royal Society. All rights reserved.

  12. Sounds in context

    DEFF Research Database (Denmark)

    Weed, Ethan

    A sound is never just a sound. It is becoming increasingly clear that auditory processing is best thought of not as a one-way afferent stream, but rather as an ongoing interaction between interior processes and the environment. Even the earliest stages of auditory processing in the nervous system...... time-course of contextual influence on auditory processing in three different paradigms: a simple mismatch negativity paradigm with tones of differing pitch, a multi-feature mismatch negativity paradigm in which tones were embedded in a complex musical context, and a cross-modal paradigm, in which...... auditory processing of emotional speech was modulated by an accompanying visual context. I then discuss these results in terms of their implication for how we conceive of the auditory processing stream....

  13. Fine-grained pitch processing of music and speech in congenital amusia.

    Science.gov (United States)

    Tillmann, Barbara; Rusconi, Elena; Traube, Caroline; Butterworth, Brian; Umiltà, Carlo; Peretz, Isabelle

    2011-12-01

    Congenital amusia is a lifelong disorder of music processing that has been ascribed to impaired pitch perception and memory. The present study tested a large group of amusics (n=17) and provided evidence that their pitch deficit affects pitch processing in speech to a lesser extent: Fine-grained pitch discrimination was better in spoken syllables than in acoustically matched tones. Unlike amusics, control participants performed fine-grained pitch discrimination better for musical material than for verbal material. These findings suggest that pitch extraction can be influenced by the nature of the material (music vs speech), and that amusics' pitch deficit is not restricted to musical material, but extends to segmented speech events. © 2011 Acoustical Society of America

  14. [The application of cybernetic modeling methods for the forensic medical personality identification based on the voice and sounding speech characteristics].

    Science.gov (United States)

    Kaganov, A Sh; Kir'yanov, P A

    2015-01-01

The objective of the present publication was to discuss the possibility of applying cybernetic modeling methods to overcome the apparent discrepancy between two kinds of speech records, viz. the initial ones (e.g., obtained in the course of special investigation activities) and the voice prints obtained from the persons subjected to criminalistic examination. The paper is based on literature sources and on the materials of original criminalistic expert examinations performed by the authors.

  15. A Comparison of the Metalinguistic Performance and Spelling Development of Children With Inconsistent Speech Sound Disorder and Their Age-Matched and Reading-Matched Peers.

    Science.gov (United States)

    McNeill, Brigid C; Wolter, Julie; Gillon, Gail T

    2017-05-17

    This study explored the specific nature of a spelling impairment in children with speech sound disorder (SSD) in relation to metalinguistic predictors of spelling development. The metalinguistic (phoneme, morphological, and orthographic awareness) and spelling development of 28 children ages 6-8 years with a history of inconsistent SSD were compared to those of their age-matched (n = 28) and reading-matched (n = 28) peers. Analysis of the literacy outcomes of children within the cohort with persistent (n = 18) versus resolved (n = 10) SSD was also conducted. The age-matched peers outperformed the SSD group on all measures. Children with SSD performed comparably to their reading-matched peers on metalinguistic measures but exhibited lower spelling scores. Children with persistent SSD generally had less favorable outcomes than children with resolved SSD; however, even children with resolved SSD performed poorly on normative spelling measures. Children with SSD have a specific difficulty with spelling that is not commensurate with their metalinguistic and reading ability. Although low metalinguistic awareness appears to inhibit these children's spelling development, other factors should be considered, such as nonverbal rehearsal during spelling attempts and motoric ability. Integration of speech-production and spelling-intervention goals is important to enhance literacy outcomes for this group.

  16. "When he's around his brothers … he's not so quiet": the private and public worlds of school-aged children with speech sound disorder.

    Science.gov (United States)

    McLeod, Sharynne; Daniel, Graham; Barr, Jacqueline

    2013-01-01

    Children interact with people in context: including home, school, and in the community. Understanding children's relationships within context is important for supporting children's development. Using child-friendly methodologies, the purpose of this research was to understand the lives of children with speech sound disorder (SSD) in context. Thirty-four interviews were undertaken with six school-aged children identified with SSD, and their siblings, friends, parents, grandparents, and teachers. Interview transcripts, questionnaires, and children's drawings were analyzed to reveal that these children experienced the world in context dependent ways (private vs. public worlds). Family and close friends typically provided a safe, supportive environment where children could be themselves and participate in typical childhoods. In contrast, when out of these familiar contexts, the children often were frustrated, embarrassed, and withdrawn, their relationships changed, and they were unable to get their message across in public contexts. Speech-language pathology assessment and intervention could be enhanced by interweaving the valuable insights of children, siblings, friends, parents, teachers, and other adults within children's worlds to more effectively support these children in context. 1. Recognize that children with SSD experience the world in different ways, depending on whether they are in private or public contexts. 2. Describe the changes in the roles of family and friends when children with SSD are in public contexts. 3. Discover the position of the child as central in Bronfenbrenner’s bioecological model. 4. Identify principles of child-friendly research. 5. Recognize the importance of considering the child in context during speech-language pathology assessment and intervention. Crown Copyright © 2012. Published by Elsevier Inc. All rights reserved.

  17. Multisensory and Modality Specific Processing of Visual Speech in Different Regions of the Premotor Cortex

    Directory of Open Access Journals (Sweden)

    Daniel eCallan

    2014-05-01

Full Text Available Behavioral and neuroimaging studies have demonstrated that brain regions involved with speech production also support speech perception, especially under degraded conditions. The premotor cortex has been shown to be active during both observation and execution of action (‘Mirror System’ properties), and may facilitate speech perception by mapping unimodal and multimodal sensory features onto articulatory speech gestures. For this functional magnetic resonance imaging (fMRI) study, participants identified vowels produced by a speaker in audio-visual (saw the speaker’s articulating face and heard her voice), visual only (only saw the speaker’s articulating face), and audio only (only heard the speaker’s voice) conditions with varying audio signal-to-noise ratios in order to determine the regions of the premotor cortex involved with multisensory and modality specific processing of visual speech gestures. The task was designed so that identification could be made with a high level of accuracy from visual only stimuli to control for task difficulty and differences in intelligibility. The results of the fMRI analysis for visual only and audio-visual conditions showed overlapping activity in inferior frontal gyrus and premotor cortex. The left ventral inferior premotor cortex showed properties of multimodal (audio-visual) enhancement with a degraded auditory signal. The left inferior parietal lobule and right cerebellum also showed these properties. The left ventral superior and dorsal premotor cortex did not show this multisensory enhancement effect, but there was greater activity for the visual only over audio-visual conditions in these areas. The results suggest that the inferior regions of the ventral premotor cortex are involved with integrating multisensory information, whereas, more superior and dorsal regions of the premotor cortex are involved with mapping unimodal (in this case visual) sensory features of the speech signal with

  18. Letter-sound processing deficits in children with developmental dyslexia: An ERP study.

    Science.gov (United States)

    Moll, Kristina; Hasko, Sandra; Groth, Katharina; Bartling, Jürgen; Schulte-Körne, Gerd

    2016-04-01

    The time course during letter-sound processing was investigated in children with developmental dyslexia (DD) and typically developing (TD) children using electroencephalography. Thirty-eight children with DD and 25 TD children participated in a visual-auditory oddball paradigm. Event-related potentials (ERPs) elicited by standard and deviant stimuli in an early (100-190 ms) and late (560-750 ms) time window were analysed. In the early time window, ERPs elicited by the deviant stimulus were delayed and less left lateralized over fronto-temporal electrodes for children with DD compared to TD children. In the late time window, children with DD showed higher amplitudes extending more over right frontal electrodes. Longer latencies in the early time window and stronger right hemispheric activation in the late time window were associated with slower reading and naming speed. Additionally, stronger right hemispheric activation in the late time window correlated with poorer phonological awareness skills. Deficits in early stages of letter-sound processing influence later more explicit cognitive processes during letter-sound processing. Identifying the neurophysiological correlates of letter-sound processing and their relation to reading related skills provides insight into the degree of automaticity during letter-sound processing beyond behavioural measures of letter-sound-knowledge. Copyright © 2016 International Federation of Clinical Neurophysiology. Published by Elsevier Ireland Ltd. All rights reserved.

  19. SYNAPTIC DEPRESSION IN DEEP NEURAL NETWORKS FOR SPEECH PROCESSING.

    Science.gov (United States)

    Zhang, Wenhao; Li, Hanyu; Yang, Minda; Mesgarani, Nima

    2016-03-01

    A characteristic property of biological neurons is their ability to dynamically change the synaptic efficacy in response to variable input conditions. This mechanism, known as synaptic depression, significantly contributes to the formation of normalized representation of speech features. Synaptic depression also contributes to the robust performance of biological systems. In this paper, we describe how synaptic depression can be modeled and incorporated into deep neural network architectures to improve their generalization ability. We observed that when synaptic depression is added to the hidden layers of a neural network, it reduces the effect of changing background activity in the node activations. In addition, we show that when synaptic depression is included in a deep neural network trained for phoneme classification, the performance of the network improves under noisy conditions not included in the training phase. Our results suggest that more complete neuron models may further reduce the gap between the biological performance and artificial computing, resulting in networks that better generalize to novel signal conditions.
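    The depression mechanism is easy to state concretely. The Python sketch below applies Tsodyks-Markram-style resource dynamics to the activations of one hidden layer across time frames; it illustrates the normalizing effect described in the abstract, but the parameter values and the exact placement within the network are assumptions rather than the paper's specification.

    ```python
    import numpy as np

    def depressing_layer(activations, tau=20.0, U=0.5, dt=1.0):
        """Apply a simple synaptic-depression model to a sequence of
        hidden-layer activations (shape: time x units).

        Each unit has a resource variable s that recovers toward 1 with time
        constant tau and is consumed in proportion to the unit's output, so
        sustained background activity is progressively scaled down. Assumes
        non-negative activations (e.g., ReLU outputs).
        """
        T, N = activations.shape
        s = np.ones(N)                   # synaptic resources, fully recovered
        out = np.empty_like(activations)
        for t in range(T):
            out[t] = activations[t] * s  # depressed (scaled) activation
            # Resource recovery toward 1; consumption proportional to output.
            s += dt * ((1.0 - s) / tau - U * out[t])
            s = np.clip(s, 0.0, 1.0)
        return out
    ```

    A unit that fires steadily sees its effective gain shrink, while transient onsets pass through at full strength, which is the normalization of changing background activity the authors describe.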

  20. The way you say it, the way I feel it: emotional word processing in accented speech

    Science.gov (United States)

    Hatzidaki, Anna; Baus, Cristina; Costa, Albert

    2015-01-01

    The present study examined whether processing words with affective connotations in a listener's native language may be modulated by accented speech. To address this question, we used the Event Related Potential (ERP) technique and recorded the cerebral activity of Spanish native listeners, who performed a semantic categorization task, while listening to positive, negative and neutral words produced in standard Spanish or in four foreign accents. The behavioral results yielded longer latencies for emotional than for neutral words in both native and foreign-accented speech, with no difference between positive and negative words. The electrophysiological results replicated previous findings from the emotional language literature, with the amplitude of the Late Positive Complex (LPC), associated with emotional language processing, being larger (more positive) for emotional than for neutral words at posterior scalp sites. Interestingly, foreign-accented speech was found to interfere with the processing of positive valence and go along with a negativity bias, possibly suggesting heightened attention to negative words. The manipulation employed in the present study provides an interesting perspective on the effects of accented speech on processing affective-laden information. It shows that higher order semantic processes that involve emotion-related aspects are sensitive to a speaker's accent. PMID:25870577

  1. Investigation of Speech Impairments in A Child with HIV: The Study of Phonological Processes: Case Report

    Directory of Open Access Journals (Sweden)

    Zahra Ilkhani

    2017-03-01

Full Text Available Human immunodeficiency virus (HIV) is a viral disease that causes immunodeficiency in humans and can affect different areas of development, such as language, speech, motor skills, and memory. The present research is a case report describing the characteristics of the phonological processes of a child with AIDS living in a nursery, who was referred for professional assessment at a speech therapy clinic. The child was a 4-year-old boy whose HIV status had been confirmed by blood test. Speech skills were assessed with the DEAP and language was analyzed according to the TOLD-P3. He spoke in single words and rarely used two-word sentences. In the language assessment (TOLD-P3), semantic, syntactic, and phonological features were tested, placing him at the emerging language stage; his expressive language was also poorer than his receptive language. In addition, based on the DEAP-P test, phonological processes of the substitution type were recognized most often, and most of the substitution processes that occurred were velar fronting. This study showed that the most frequent phonological process in this child with HIV was substitution, which may be a risk factor for decreased speech intelligibility. Given these results and the limited research in this area,

  2. Fast Algorithms for High-Order Sparse Linear Prediction with Applications to Speech Processing

    DEFF Research Database (Denmark)

    Jensen, Tobias Lindstrøm; Giacobello, Daniele; van Waterschoot, Toon

    2016-01-01

    In speech processing applications, imposing sparsity constraints on high-order linear prediction coefficients and prediction residuals has proven successful in overcoming some of the limitations of conventional linear predictive modeling. However, this modeling scheme, named sparse linear prediction...... problem with lower accuracy than in previous work. In the experimental analysis, we clearly show that a solution with lower accuracy can achieve approximately the same performance as a high-accuracy solution both objectively, in terms of prediction gain, as well as with perceptually relevant measures, when...... evaluated in a speech reconstruction application....
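
    The core optimization behind this modeling scheme can be written as a 1-norm minimization over both the prediction residual and the coefficient vector. The sketch below states that problem with the general-purpose solver cvxpy; the order K, the regularization weight gamma, and the toy frame are illustrative, and the paper's actual contribution is fast custom solvers for exactly this kind of problem, which are not shown here.

    ```python
    import numpy as np
    import cvxpy as cp

    def sparse_lp(x, K=100, gamma=0.1):
        """Solve min_a ||x_K - X a||_1 + gamma ||a||_1 for a K-tap sparse
        linear predictor, where column i of X holds the signal delayed by
        i + 1 samples."""
        N = len(x)
        X = np.zeros((N - K, K))
        for i in range(K):
            X[:, i] = x[K - 1 - i : N - 1 - i]
        a = cp.Variable(K)
        residual = x[K:] - X @ a
        cp.Problem(cp.Minimize(cp.norm1(residual) + gamma * cp.norm1(a))).solve()
        return a.value

    rng = np.random.default_rng(1)
    frame = rng.normal(size=400).cumsum() * 0.01   # stand-in for a speech frame
    a = sparse_lp(frame)
    print(np.sum(np.abs(a) > 1e-3), "active taps out of", len(a))
    ```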

  3. Neurophysiological Evidence That Musical Training Influences the Recruitment of Right Hemispheric Homologues for Speech Perception

    Directory of Open Access Journals (Sweden)

    McNeel Gordon Jantzen

    2014-03-01

    Full Text Available Musicians have a more accurate temporal and tonal representation of auditory stimuli than their non-musician counterparts (Kraus & Chandrasekaran, 2010; Parbery-Clark, Skoe, & Kraus, 2009; Zendel & Alain, 2008; Musacchia, Sams, Skoe, & Kraus, 2007). Musicians who are adept at the production and perception of music are also more sensitive to key acoustic features of speech such as voice onset timing and pitch. Together, these data suggest that musical training may enhance the processing of acoustic information for speech sounds. In the current study, we sought to provide neural evidence that musicians process speech and music in a similar way. We hypothesized that for musicians, right hemisphere areas traditionally associated with music are also engaged for the processing of speech sounds. In contrast, we predicted that in non-musicians processing of speech sounds would be localized to traditional left hemisphere language areas. Speech stimuli differing in voice onset time were presented using a dichotic listening paradigm. Subjects either indicated aural location for a specified speech sound or identified a specific speech sound from a directed aural location. Musical training effects and organization of acoustic features were reflected by activity in source generators of the P50. This included greater activation of the right middle temporal gyrus (MTG) and superior temporal gyrus (STG) in musicians. The findings demonstrate recruitment of the right hemisphere in musicians for discriminating speech sounds and a putative broadening of their language network. Musicians appear to have an increased sensitivity to acoustic features and enhanced selective attention to temporal features of speech that is facilitated by musical training and supported, in part, by right hemisphere homologues of established speech processing regions of the brain.

  4. Fine-structure processing, frequency selectivity and speech perception in hearing-impaired listeners

    DEFF Research Database (Denmark)

    Strelcyk, Olaf; Dau, Torsten

    2008-01-01

    Hearing-impaired people often experience great difficulty with speech communication when background noise is present, even if reduced audibility has been compensated for. Other impairment factors must be involved. In order to minimize confounding effects, the subjects participating in this study...... consisted of groups with homogeneous, symmetric audiograms. The perceptual listening experiments assessed the intelligibility of full-spectrum as well as low-pass filtered speech in the presence of stationary and fluctuating interferers, the individual's frequency selectivity and the integrity of temporal...... modulation were obtained. In addition, these binaural and monaural thresholds were measured in a stationary background noise in order to assess the persistence of the fine-structure processing to interfering noise. Apart from elevated speech reception thresholds, the hearing-impaired listeners showed poorer...

  5. Impact of Noise and Working Memory on Speech Processing in Adults with and without ADHD

    Science.gov (United States)

    Michalek, Anne M. P.

    2012-01-01

    Auditory processing of speech is influenced by internal (i.e., attention, working memory) and external factors (i.e., background noise, visual information). This study examined the interplay among these factors in individuals with and without ADHD. All participants completed a listening in noise task, two working memory capacity tasks, and two…

  6. Temporally selective attention supports speech processing in 3- to 5-year-old children.

    Science.gov (United States)

    Astheimer, Lori B; Sanders, Lisa D

    2012-01-01

    Recent event-related potential (ERP) evidence demonstrates that adults employ temporally selective attention to preferentially process the initial portions of words in continuous speech. Doing so is an effective listening strategy since word-initial segments are highly informative. Although the development of this process remains unexplored, directing attention to word onsets may be important for speech processing in young children who would otherwise be overwhelmed by the rapidly changing acoustic signals that constitute speech. We examined the use of temporally selective attention in 3- to 5-year-old children listening to stories by comparing ERPs elicited by attention probes presented at four acoustically matched times relative to word onsets: concurrently with a word onset, 100 ms before, 100 ms after, and at random control times. By 80 ms, probes presented at and after word onsets elicited a larger negativity than probes presented before word onsets or at control times. The latency and distribution of this effect is similar to temporally and spatially selective attention effects measured in adults and, despite differences in polarity, spatially selective attention effects measured in children. These results indicate that, like adults, preschool aged children modulate temporally selective attention to preferentially process the initial portions of words in continuous speech. Copyright © 2011 Elsevier Ltd. All rights reserved.

  7. Behavioral Signal Processing: Deriving Human Behavioral Informatics From Speech and Language

    Science.gov (United States)

    Narayanan, Shrikanth; Georgiou, Panayiotis G.

    2013-01-01

    The expression and experience of human behavior are complex and multimodal and characterized by individual and contextual heterogeneity and variability. Speech and spoken language communication cues offer an important means for measuring and modeling human behavior. Observational research and practice across a variety of domains from commerce to healthcare rely on speech- and language-based informatics for crucial assessment and diagnostic information and for planning and tracking response to an intervention. In this paper, we describe some of the opportunities as well as emerging methodologies and applications of human behavioral signal processing (BSP) technology and algorithms for quantitatively understanding and modeling typical, atypical, and distressed human behavior with a specific focus on speech- and language-based communicative, affective, and social behavior. We describe the three important BSP components of acquiring behavioral data in an ecologically valid manner across laboratory to real-world settings, extracting and analyzing behavioral cues from measured data, and developing models offering predictive and decision-making support. We highlight both the foundational speech and language processing building blocks as well as the novel processing and modeling opportunities. Using examples drawn from specific real-world applications ranging from literacy assessment and autism diagnostics to psychotherapy for addiction and marital well-being, we illustrate behavioral informatics applications of these signal processing techniques that contribute to quantifying higher level, often subjectively described, human behavior in a domain-sensitive fashion. PMID:24039277

  8. Fish protection at water intakes using a new signal development process and sound system

    International Nuclear Information System (INIS)

    Loeffelman, P.H.; Klinect, D.A.; Van Hassel, J.H.

    1991-01-01

    American Electric Power Company, Inc., is exploring the feasibility of using a patented signal development process and sound system to guide aquatic animals with underwater sound. Sounds from animals such as chinook salmon, steelhead trout, striped bass, freshwater drum, largemouth bass, and gizzard shad can be used to synthesize a new signal to stimulate the animal in the most sensitive portion of its hearing range. AEP's field tests during its research demonstrate that adult chinook salmon, steelhead trout and warmwater fish, and steelhead trout and chinook salmon smolts can be repelled with a properly-tuned system. The signal development process and sound system is designed to be transportable and use animals at the site to incorporate site-specific factors known to affect underwater sound, e.g., bottom shape and type, water current, and temperature. This paper reports that, because the overall goal of this research was to determine the feasibility of using sound to divert fish, it was essential that the approach use a signal development process which could be customized to animals and site conditions at any hydropower plant site

  9. Predicting Prosody from Text for Text-to-Speech Synthesis

    CERN Document Server

    Rao, K Sreenivasa

    2012-01-01

    Predicting Prosody from Text for Text-to-Speech Synthesis covers the specific aspects of prosody, mainly focusing on how to predict the prosodic information from linguistic text, and then how to exploit the predicted prosodic knowledge for various speech applications. Author K. Sreenivasa Rao discusses proposed methods along with state-of-the-art techniques for the acquisition and incorporation of prosodic knowledge for developing speech systems. Positional, contextual and phonological features are proposed for representing the linguistic and production constraints of the sound units present in the text. This book is intended for graduate students and researchers working in the area of speech processing.

  10. Early and parallel processing of pragmatic and semantic information in speech acts: neurophysiological evidence

    Directory of Open Access Journals (Sweden)

    Natalia eEgorova

    2013-03-01

    Full Text Available Although language is a tool for communication, most research in the neuroscience of language has focused on studying words and sentences, while little is known about the brain mechanisms of speech acts, or communicative functions, for which words and sentences are used as tools. Here the neural processing of two types of speech acts, Naming and Requesting, was addressed using the time-resolved event-related potential (ERP) technique. The brain responses for Naming and Request diverged as early as ~120 ms after the onset of the critical words, at the same time as, or even before, the earliest brain manifestations of semantic word properties could be detected. Request-evoked potentials were generally larger in amplitude than those for Naming. The use of identical words in closely matched settings for both speech acts rules out an explanation of the difference in terms of phonological, lexical, semantic properties or word expectancy. The cortical sources underlying the ERP enhancement for Requests were found in the fronto-central cortex, consistent with the activation of action knowledge, as well as in the right temporo-parietal junction, possibly reflecting additional implications of speech acts for social interaction and theory of mind. These results provide the first evidence for surprisingly early access to pragmatic and social interactive knowledge, which possibly occurs in parallel with other types of linguistic processing, and thus supports near-simultaneous access to different subtypes of psycholinguistic information.

  11. ACTION OF UNIFORM SEARCH ALGORITHM WHEN SELECTING LANGUAGE UNITS IN THE PROCESS OF SPEECH

    Directory of Open Access Journals (Sweden)

    Ирина Михайловна Некипелова

    2013-05-01

    Full Text Available The article investigates how a uniform search algorithm operates when a speaker selects language units during speech production. This process is connected with the phenomenon of speech optimization: it shortens the time needed to formulate what the speaker wants to say and maximizes the precision with which thoughts are expressed. The uniform search algorithm works at both the conscious and subconscious levels, favoring the formation of automatic speech production and perception. Realizing a person's cognitive potential in communication sets in motion a complex mechanism of linguistic self-organization and self-regulation, which in turn optimizes the language system to serve not only the speaker's self-actualization but also communication in society. The method of problem-oriented search is used to study the optimization mechanisms characteristic of speech production and language stabilization. DOI: http://dx.doi.org/10.12731/2218-7405-2013-4-50

  12. Acoustic analysis of trill sounds.

    Science.gov (United States)

    Dhananjaya, N; Yegnanarayana, B; Bhaskararao, Peri

    2012-04-01

    In this paper, the acoustic-phonetic characteristics of steady apical trills--trill sounds produced by the periodic vibration of the apex of the tongue--are studied. Signal processing methods, namely, zero-frequency filtering and zero-time liftering of speech signals, are used to analyze the excitation source and the resonance characteristics of the vocal tract system, respectively. Although it is natural to expect the effect of trilling on the resonances of the vocal tract system, it is interesting to note that trilling influences the glottal source of excitation as well. The excitation characteristics derived using zero-frequency filtering of speech signals are glottal epochs, strength of impulses at the glottal epochs, and instantaneous fundamental frequency of the glottal vibration. Analysis based on zero-time liftering of speech signals is used to study the dynamic resonance characteristics of vocal tract system during the production of trill sounds. Qualitative analysis of trill sounds in different vowel contexts, and the acoustic cues that may help spotting trills in continuous speech are discussed.
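
    Zero-frequency filtering itself is compact enough to sketch: the signal is differenced, passed through a cascade of two ideal zero-frequency resonators (each amounting to two integrations), and the resulting polynomial trend is removed by repeated local-mean subtraction; positive-going zero crossings then mark the glottal epochs, and their spacing gives the instantaneous fundamental frequency. The window length and the number of trend-removal passes below are typical choices, not the exact settings of the study.

    ```python
    import numpy as np

    def zff_epochs(s, fs, win_ms=5.0):
        """Zero-frequency filtering sketch: difference the signal, integrate
        four times (two ideal 0-Hz resonators), then remove the resulting
        polynomial trend by repeated local-mean subtraction. The window
        should span roughly one to two average pitch periods."""
        s = np.asarray(s, dtype=float)
        y = np.diff(s, prepend=s[:1])             # remove DC / low-freq bias
        for _ in range(4):
            y = np.cumsum(y)
        w = max(3, int(fs * win_ms / 1000) | 1)   # odd window length
        kernel = np.ones(w) / w
        for _ in range(3):                        # repeated trend removal
            y = y - np.convolve(y, kernel, mode="same")
        epochs = np.where((y[:-1] < 0) & (y[1:] >= 0))[0]  # +ve zero crossings
        return y, epochs

    # instantaneous f0, one value per glottal cycle: f0 = fs / np.diff(epochs)
    ```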

  13. A new signal development process and sound system for diverting fish from water intakes

    International Nuclear Information System (INIS)

    Klinet, D.A.; Loeffelman, P.H.; van Hassel, J.H.

    1992-01-01

    This paper reports that American Electric Power Service Corporation has explored the feasibility of using a patented signal development process and underwater sound system to divert fish away from water intake areas. The effect of water intakes on fish is being closely scrutinized as hydropower projects are re-licensed. The overall goal of this four-year research project was to develop an underwater guidance system which is biologically effective, reliable and cost-effective compared to other proposed methods of diversion, such as physical screens. Because different fish species have various listening ranges, it was essential to the success of this experiment that the sound system have a great amount of flexibility. Assuming a fish's sounds are heard by the same kind of fish, it was necessary to develop a procedure and acquire instrumentation to properly analyze the sounds that the target fish species create to communicate and any artificial signals being generated for diversion

  14. Don't speak too fast! Processing of fast rate speech in children with specific language impairment.

    Directory of Open Access Journals (Sweden)

    Hélène Guiraud

    Full Text Available Perception of speech rhythm requires the auditory system to track temporal envelope fluctuations, which carry syllabic and stress information. Reduced sensitivity to rhythmic acoustic cues has been evidenced in children with Specific Language Impairment (SLI), impeding syllabic parsing and speech decoding. Our study investigated whether these children experience specific difficulties processing fast rate speech as compared with typically developing (TD) children. Sixteen French children with SLI (8-13 years old) with mainly expressive phonological disorders and with preserved comprehension and 16 age-matched TD children performed a judgment task on sentences produced 1) at normal rate, 2) at fast rate or 3) time-compressed. Sensitivity index (d') to semantically incongruent sentence-final words was measured. Overall children with SLI perform significantly worse than TD children. Importantly, as revealed by the significant Group × Speech Rate interaction, children with SLI find it more challenging than TD children to process both naturally or artificially accelerated speech. The two groups do not significantly differ in normal rate speech processing. In agreement with rhythm-processing deficits in atypical language development, our results suggest that children with SLI face difficulties adjusting to rapid speech rate. These findings are interpreted in light of temporal sampling and prosodic phrasing frameworks and of oscillatory mechanisms underlying speech perception.

  15. Don't speak too fast! Processing of fast rate speech in children with specific language impairment.

    Science.gov (United States)

    Guiraud, Hélène; Bedoin, Nathalie; Krifi-Papoz, Sonia; Herbillon, Vania; Caillot-Bascoul, Aurélia; Gonzalez-Monge, Sibylle; Boulenger, Véronique

    2018-01-01

    Perception of speech rhythm requires the auditory system to track temporal envelope fluctuations, which carry syllabic and stress information. Reduced sensitivity to rhythmic acoustic cues has been evidenced in children with Specific Language Impairment (SLI), impeding syllabic parsing and speech decoding. Our study investigated whether these children experience specific difficulties processing fast rate speech as compared with typically developing (TD) children. Sixteen French children with SLI (8-13 years old) with mainly expressive phonological disorders and with preserved comprehension and 16 age-matched TD children performed a judgment task on sentences produced 1) at normal rate, 2) at fast rate or 3) time-compressed. Sensitivity index (d') to semantically incongruent sentence-final words was measured. Overall children with SLI perform significantly worse than TD children. Importantly, as revealed by the significant Group × Speech Rate interaction, children with SLI find it more challenging than TD children to process both naturally or artificially accelerated speech. The two groups do not significantly differ in normal rate speech processing. In agreement with rhythm-processing deficits in atypical language development, our results suggest that children with SLI face difficulties adjusting to rapid speech rate. These findings are interpreted in light of temporal sampling and prosodic phrasing frameworks and of oscillatory mechanisms underlying speech perception.
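
    The sensitivity index used in both versions of this study is computed from hit and false-alarm rates as d' = z(H) - z(FA). A minimal version, with a standard correction for perfect rates, might look as follows; the trial counts in the example are invented for illustration.

    ```python
    from scipy.stats import norm

    def d_prime(hits, misses, false_alarms, correct_rejections):
        """d' = z(hit rate) - z(false-alarm rate), with a standard 1/(2N)
        correction so that rates of exactly 0 or 1 stay finite."""
        n_sig = hits + misses
        n_noise = false_alarms + correct_rejections
        h = min(max(hits / n_sig, 1 / (2 * n_sig)), 1 - 1 / (2 * n_sig))
        fa = min(max(false_alarms / n_noise, 1 / (2 * n_noise)),
                 1 - 1 / (2 * n_noise))
        return norm.ppf(h) - norm.ppf(fa)

    # e.g., 18 of 20 incongruent endings detected, 4 false alarms on 20 congruent
    print(round(d_prime(18, 2, 4, 16), 2))
    ```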

  16. Multivoxel Patterns Reveal Functionally Differentiated Networks Underlying Auditory Feedback Processing of Speech

    DEFF Research Database (Denmark)

    Zheng, Zane Z.; Vicente-Grabovetsky, Alejandro; MacDonald, Ewen N.

    2013-01-01

    The everyday act of speaking involves the complex processes of speech motor control. An important component of control is monitoring, detection, and processing of errors when auditory feedback does not correspond to the intended motor gesture. Here we show, using fMRI and converging operations...... within a multivoxel pattern analysis framework, that this sensorimotor process is supported by functionally differentiated brain networks. During scanning, a real-time speech-tracking system was used to deliver two acoustically different types of distorted auditory feedback or unaltered feedback while...... human participants were vocalizing monosyllabic words, and to present the same auditory stimuli while participants were passively listening. Whole-brain analysis of neural-pattern similarity revealed three functional networks that were differentially sensitive to distorted auditory feedback during...

  17. Identifying Cortical Lateralization of Speech Processing in Infants Using Near-Infrared Spectroscopy

    Science.gov (United States)

    Bortfeld, Heather; Fava, Eswen; Boas, David A.

    2010-01-01

    We investigate the utility of near-infrared spectroscopy (NIRS) as an alternative technique for studying infant speech processing. NIRS is an optical imaging technology that uses relative changes in total hemoglobin concentration and oxygenation as an indicator of neural activation. Procedurally, NIRS has the advantage over more common methods (e.g., fMRI) in that it can be used to study the neural responses of behaviorally active infants. Older infants (aged 6–9 months) were allowed to sit on their caretakers’ laps during stimulus presentation to determine relative differences in focal activity in the temporal region of the brain during speech processing. Results revealed a dissociation of sensory-specific processing in two cortical regions, the left and right temporal lobes. These findings are consistent with those obtained using other neurophysiological methods and point to the utility of NIRS as a means of establishing neural correlates of language development in older (and more active) infants. PMID:19142766

  18. Hidden Markov models in automatic speech recognition

    Science.gov (United States)

    Wrzoskowicz, Adam

    1993-11-01

    This article describes a method for constructing an automatic speech recognition system based on hidden Markov models (HMMs). The author discusses the basic concepts of HMM theory and the application of these models to the analysis and recognition of speech signals. The author provides algorithms which make it possible to train the ASR system and recognize signals on the basis of distinct stochastic models of selected speech sound classes. The author describes the specific components of the system and the procedures used to model and recognize speech. The author discusses problems associated with the choice of optimal signal detection and parameterization characteristics and their effect on the performance of the system. The author presents different options for the choice of speech signal segments and their consequences for the ASR process. The author gives special attention to the use of lexical, syntactic, and semantic information for the purpose of improving the quality and efficiency of the system. The author also describes an ASR system developed by the Speech Acoustics Laboratory of the IBPT PAS. The author discusses the results of experiments on the effect of noise on the performance of the ASR system and describes methods of constructing HMMs designed to operate in a noisy environment. The author also describes a language for human-robot communications which was defined as a complex multilevel network from an HMM model of speech sounds geared towards Polish inflections. The author also added mandatory lexical and syntactic rules to the system for its communications vocabulary.
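
    Recognition in such a system ultimately reduces to finding the most likely state sequence through the trained models. The sketch below implements the standard Viterbi recursion in log space for a generic HMM; the two-state toy example (silence vs. speech frame likelihoods) is invented for illustration and is unrelated to the IBPT PAS system described above.

    ```python
    import numpy as np

    def viterbi(log_A, log_B, log_pi):
        """Most likely HMM state path for an observation sequence.
        log_A: (S, S) log transition matrix; log_pi: (S,) log initial probs;
        log_B: (T, S) log likelihood of each frame under each state."""
        T, S = log_B.shape
        delta = log_pi + log_B[0]
        back = np.zeros((T, S), dtype=int)
        for t in range(1, T):
            scores = delta[:, None] + log_A   # scores[i, j]: best path i -> j
            back[t] = np.argmax(scores, axis=0)
            delta = scores[back[t], np.arange(S)] + log_B[t]
        path = np.zeros(T, dtype=int)
        path[-1] = int(np.argmax(delta))
        for t in range(T - 2, -1, -1):
            path[t] = back[t + 1][path[t + 1]]
        return path

    # toy 2-state example: frame likelihoods switch from state 0 to state 1
    log_pi = np.log([0.6, 0.4])
    log_A = np.log([[0.9, 0.1], [0.2, 0.8]])
    log_B = np.log([[0.8, 0.2], [0.7, 0.3], [0.1, 0.9], [0.2, 0.8]])
    print(viterbi(log_A, log_B, log_pi))   # -> [0 0 1 1]
    ```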

  19. Seeing the talker’s face supports executive processing of speech in steady state noise

    Directory of Open Access Journals (Sweden)

    Sushmit eMishra

    2013-11-01

    Full Text Available Listening to speech in noise depletes cognitive resources, affecting speech processing. The present study investigated how remaining resources or cognitive spare capacity (CSC) can be deployed by young adults with normal hearing. We administered a test of CSC (CSCT; Mishra et al., 2013) along with a battery of established cognitive tests to 20 participants with normal hearing. In the CSCT, lists of two-digit numbers were presented with and without visual cues in quiet, as well as in steady-state and speech-like noise at a high intelligibility level. In low load conditions, two numbers were recalled according to instructions inducing executive processing (updating, inhibition) and in high load conditions the participants were additionally instructed to recall one extra number, which was always the first item in the list. In line with previous findings, results showed that CSC was sensitive to memory load and executive function but generally not related to working memory capacity. Furthermore, CSCT scores in quiet were lowered by visual cues, probably due to distraction. In steady-state noise, the presence of visual cues improved CSCT scores, probably by enabling better encoding. Contrary to our expectation, CSCT performance was disrupted more in steady-state than speech-like noise, although only without visual cues, possibly because selective attention could be used to ignore the speech-like background and provide an enriched representation of target items in working memory similar to that obtained in quiet. This interpretation is supported by a consistent association between CSCT scores and updating skills.

  1. Seeing the talker’s face supports executive processing of speech in steady state noise

    Science.gov (United States)

    Mishra, Sushmit; Lunner, Thomas; Stenfelt, Stefan; Rönnberg, Jerker; Rudner, Mary

    2013-01-01

    Listening to speech in noise depletes cognitive resources, affecting speech processing. The present study investigated how remaining resources or cognitive spare capacity (CSC) can be deployed by young adults with normal hearing. We administered a test of CSC (CSCT; Mishra et al., 2013) along with a battery of established cognitive tests to 20 participants with normal hearing. In the CSCT, lists of two-digit numbers were presented with and without visual cues in quiet, as well as in steady-state and speech-like noise at a high intelligibility level. In low load conditions, two numbers were recalled according to instructions inducing executive processing (updating, inhibition) and in high load conditions the participants were additionally instructed to recall one extra number, which was always the first item in the list. In line with previous findings, results showed that CSC was sensitive to memory load and executive function but generally not related to working memory capacity (WMC). Furthermore, CSCT scores in quiet were lowered by visual cues, probably due to distraction. In steady-state noise, the presence of visual cues improved CSCT scores, probably by enabling better encoding. Contrary to our expectation, CSCT performance was disrupted more in steady-state than speech-like noise, although only without visual cues, possibly because selective attention could be used to ignore the speech-like background and provide an enriched representation of target items in working memory similar to that obtained in quiet. This interpretation is supported by a consistent association between CSCT scores and updating skills. PMID:24324411

  2. Significance of Joint Features Derived from the Modified Group Delay Function in Speech Processing

    Directory of Open Access Journals (Sweden)

    Murthy Hema A

    2007-01-01

    Full Text Available This paper investigates the significance of combining cepstral features derived from the modified group delay function and from the short-time spectral magnitude, such as the MFCC. The conventional group delay function fails to capture the resonant structure and the dynamic range of the speech spectrum primarily due to pitch periodicity effects. The group delay function is modified to suppress these spikes and to restore the dynamic range of the speech spectrum. Cepstral features are derived from the modified group delay function, which are called the modified group delay feature (MODGDF). The complementarity and robustness of the MODGDF when compared to the MFCC are also analyzed using spectral reconstruction techniques. Combination of several spectral magnitude-based features and the MODGDF using feature fusion and likelihood combination is described. These features are then used for three speech processing tasks, namely, syllable, speaker, and language recognition. Results indicate that combining MODGDF with MFCC at the feature level gives significant improvements for speech recognition tasks in noise. Combining the MODGDF and the spectral magnitude-based features gives a significant increase in recognition performance of 11% at best, while combining any two features derived from the spectral magnitude does not give any significant improvement.
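
    A minimal computation of the modified group delay function follows the standard formulation from this literature: tau(w) = (X_R*Y_R + X_I*Y_I) / |S(w)|^(2*gamma), compressed as sign(tau)*|tau|^alpha, where X is the DFT of the frame, Y the DFT of n*x[n], and S a cepstrally smoothed magnitude spectrum. The values of alpha, gamma, and the lifter length below are typical rather than the paper's tuned settings; the MODGDF cepstral features would then be obtained by taking a DCT of this curve.

    ```python
    import numpy as np

    def modgdf(frame, n_fft=512, alpha=0.4, gamma=0.9, lifter=30):
        """Modified group delay function of one windowed frame:
            tau(w)    = (Xr*Yr + Xi*Yi) / |S(w)|^(2*gamma)
            MODGDF(w) = sign(tau) * |tau|^alpha
        X = DFT of x[n], Y = DFT of n*x[n]; S is a cepstrally smoothed
        magnitude spectrum that suppresses pitch-related spikes in the raw
        group delay."""
        n = np.arange(len(frame))
        X = np.fft.rfft(frame, n_fft)
        Y = np.fft.rfft(n * frame, n_fft)
        log_mag = np.log(np.abs(X) + 1e-10)
        cep = np.fft.irfft(log_mag, n_fft)
        cep[lifter:-lifter] = 0.0                 # keep only low quefrencies
        S = np.exp(np.fft.rfft(cep, n_fft).real)
        tau = (X.real * Y.real + X.imag * Y.imag) / (S ** (2 * gamma) + 1e-10)
        return np.sign(tau) * np.abs(tau) ** alpha

    # demo on a synthetic vowel-like frame
    fs = 16000
    t = np.arange(400) / fs
    print(modgdf(np.hanning(400) * np.sin(2 * np.pi * 120 * t)).shape)  # (257,)
    ```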

  3. Fidelity of Automatic Speech Processing for Adult and Child Talker Classifications.

    Directory of Open Access Journals (Sweden)

    Mark VanDam

    Full Text Available Automatic speech processing (ASP) has recently been applied to very large datasets of naturalistically collected, daylong recordings of child speech via an audio recorder worn by young children. The system developed by the LENA Research Foundation analyzes children's speech for research and clinical purposes, with special focus on identifying and tagging family speech dynamics and the at-home acoustic environment from the auditory perspective of the child. A primary issue for researchers, clinicians, and families using the Language ENvironment Analysis (LENA) system is to what degree the segment labels are valid. This classification study evaluates the performance of the computer ASP output against 23 trained human judges who made about 53,000 judgements of classification of segments tagged by the LENA ASP. Results indicate performance consistent with modern ASP systems such as those using HMM methods, with acoustic characteristics of fundamental frequency and segment duration most important for both human and machine classifications. Results are likely to be important for interpreting and improving ASP output.

  4. Comparing Binaural Pre-processing Strategies II: Speech Intelligibility of Bilateral Cochlear Implant Users.

    Science.gov (United States)

    Baumgärtel, Regina M; Hu, Hongmei; Krawczyk-Becker, Martin; Marquardt, Daniel; Herzke, Tobias; Coleman, Graham; Adiloğlu, Kamil; Bomke, Katrin; Plotz, Karsten; Gerkmann, Timo; Doclo, Simon; Kollmeier, Birger; Hohmann, Volker; Dietz, Mathias

    2015-12-30

    Several binaural audio signal enhancement algorithms were evaluated with respect to their potential to improve speech intelligibility in noise for users of bilateral cochlear implants (CIs). 50% speech reception thresholds (SRT50) were assessed using an adaptive procedure in three distinct, realistic noise scenarios. All scenarios were highly nonstationary, complex, and included a significant amount of reverberation. Other aspects, such as the perfectly frontal target position, were idealized laboratory settings, allowing the algorithms to perform better than in corresponding real-world conditions. Eight bilaterally implanted CI users, wearing devices from three manufacturers, participated in the study. In all noise conditions, a substantial improvement in SRT50 compared to the unprocessed signal was observed for most of the algorithms tested, with the largest improvements generally provided by binaural minimum variance distortionless response (MVDR) beamforming algorithms. The largest overall improvement in speech intelligibility was achieved by an adaptive binaural MVDR in a spatially separated, single competing talker noise scenario. A no-pre-processing condition and adaptive differential microphones without a binaural link served as the two baseline conditions. SRT50 improvements provided by the binaural MVDR beamformers surpassed the performance of the adaptive differential microphones in most cases. Speech intelligibility improvements predicted by instrumental measures were shown to account for some but not all aspects of the perceptually obtained SRT50 improvements measured in bilaterally implanted CI users. © The Author(s) 2015.
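
    The MVDR beamformers that performed best here all share the same core: per frequency bin, the weights w = R^-1 d / (d^H R^-1 d) minimize output noise power while passing the target direction undistorted. The sketch below shows only that weight computation for one bin; estimating the noise covariance R, choosing the steering vector d, and preserving binaural cues (as the evaluated binaural variants do) are separate problems not addressed here, and the numbers in the example are invented.

    ```python
    import numpy as np

    def mvdr_weights(R_noise, d):
        """MVDR beamformer for one frequency bin: w = R^-1 d / (d^H R^-1 d),
        minimizing output noise power subject to the distortionless
        constraint w^H d = 1 in the target direction."""
        Rinv_d = np.linalg.solve(R_noise, d)
        return Rinv_d / (d.conj() @ Rinv_d)

    # toy example: two microphones, frontal target (equal-phase steering
    # vector), noise covariance R assumed estimated elsewhere for this bin
    d = np.array([1.0 + 0j, 1.0 + 0j])
    R = np.array([[1.0, 0.6], [0.6, 1.0]], dtype=complex)
    w = mvdr_weights(R, d)
    print(w, abs(w.conj() @ d))        # the constraint gives w^H d = 1
    ```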

  5. Towards dynamic interorganizational business process management (text keynote speech)

    NARCIS (Netherlands)

    Grefen, P.W.P.J.; Reddy, S.M.

    2006-01-01

    In the modern day business world, we see more and more complex collaboration scenarios in which multiple autonomous organizations work together. From a business perspective, we distinguish between horizontal and vertical relationships. We argue that the concept of business process is central in both

  6. The Matrix Pencil and its Applications to Speech Processing

    Science.gov (United States)

    2007-03-01

    example in ascending order, to form the ordered frequency list vector, F_v. The n×1 vector F_v is then input to the Column Duplicator, which forms the... already is. With regard to the pre-processing that has been described, one could also pre-condition the input frequency list based on phase and decay
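
    The matrix pencil method itself can be sketched compactly: shifted Hankel matrices are built from the samples, and the eigenvalues of pinv(Y0) @ Y1 estimate the signal poles, whose angles and radii give the frequencies and damping factors that would populate a frequency list such as F_v. The pencil parameter and the naive dominant-pole selection below are common simple choices, not the report's exact pre-processing chain.

    ```python
    import numpy as np

    def matrix_pencil(x, fs, M, L=None):
        """Matrix pencil estimate of M damped sinusoids: eigenvalues of
        pinv(Y0) @ Y1 built from shifted Hankel matrices approximate the
        signal poles; each pole's angle gives frequency, its radius damping."""
        N = len(x)
        L = L or N // 3                                # common pencil parameter
        Y0 = np.array([x[i : i + L] for i in range(N - L)])
        Y1 = np.array([x[i + 1 : i + L + 1] for i in range(N - L)])
        poles = np.linalg.eigvals(np.linalg.pinv(Y0) @ Y1)
        poles = poles[np.argsort(-np.abs(poles))][:M]  # keep M dominant poles
        freqs = np.angle(poles) * fs / (2 * np.pi)     # Hz
        damping = np.log(np.abs(poles)) * fs           # negative = decaying
        order = np.argsort(freqs)                      # ascending frequency list
        return freqs[order], damping[order]

    # toy check: one decaying 200 Hz complex exponential sampled at 8 kHz
    fs, n = 8000, np.arange(120)
    x = np.exp((-30 + 2j * np.pi * 200) * n / fs)
    print(matrix_pencil(x, fs, M=1))   # approx (array([200.]), array([-30.]))
    ```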

  7. Mapping symbols to sounds: electrophysiological correlates of the impaired reading process in dyslexia

    Directory of Open Access Journals (Sweden)

    Andreas eWidmann

    2012-03-01

    Full Text Available Dyslexic and control first grade school children were compared in a Symbol-to-Sound matching test based on a nonlinguistic audiovisual training which is known to have a remediating effect on dyslexia. Visual symbol patterns had to be matched with predicted sound patterns. Sounds incongruent with the corresponding visual symbol (thus not matching the prediction) elicited the N2b and P3a event-related potential (ERP) components relative to congruent sounds in control children. Their ERPs resembled the ERP effects previously reported for healthy adults with this paradigm. In dyslexic children, N2b onset latency was delayed and its amplitude significantly reduced over the left hemisphere whereas P3a was absent. Moreover, N2b amplitudes significantly correlated with reading skills. ERPs to sound changes in a control condition were unaffected. In addition, correctly predicted sounds, that is, sounds that are congruent with the visual symbol, elicited an early induced auditory gamma band response (GBR) reflecting synchronization of brain activity in normal-reading children, as previously observed in healthy adults. However, dyslexic children showed no GBR. This indicates that visual symbolic and auditory sensory information are not integrated into a unitary audiovisual object representation in these children. Finally, incongruent sounds were followed by a later desynchronization of brain activity in the gamma band in both groups. This desynchronization was significantly larger in dyslexic children. Although both groups accomplished the task successfully, remarkable group differences in brain responses suggest that normal-reading children and dyslexic children recruit (partly) different brain mechanisms when solving the task. We propose that abnormal ERPs and GBRs in dyslexic readers indicate a deficit resulting in a widespread impairment in processing and integrating auditory and visual information and contributing to the reading impairment in dyslexia.
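
    Induced gamma-band responses of the kind analyzed here are conventionally obtained by band-pass filtering each trial and averaging the amplitude envelopes, so that activity which is not phase-locked to stimulus onset survives the average. A minimal version is sketched below; the 30-60 Hz band edges and the filter order are typical values, not the study's exact analysis parameters.

    ```python
    import numpy as np
    from scipy.signal import butter, filtfilt, hilbert

    def gamma_envelope(trials, fs, band=(30.0, 60.0), order=4):
        """Single-trial gamma-band amplitude envelopes: zero-phase band-pass
        filtering followed by the magnitude of the analytic (Hilbert) signal.
        trials: array of shape (n_trials, n_samples)."""
        b, a = butter(order, [band[0] / (fs / 2), band[1] / (fs / 2)],
                      btype="band")
        return np.abs(hilbert(filtfilt(b, a, trials, axis=-1), axis=-1))

    # induced GBR: averaging envelopes (not raw trials) preserves gamma
    # activity that is not phase-locked to stimulus onset, e.g.
    # induced = gamma_envelope(trials, fs=500).mean(axis=0)
    ```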

  8. Phonological Processes in the Speech of Jordanian Arabic Children with Cleft Lip and/or Palate

    Science.gov (United States)

    Al-Tamimi, Feda Y.; Owais, Arwa I.; Khabour, Omar F.; Khamaiseh, Zaidan A.

    2011-01-01

    The controlled and free speech of 15 Jordanian male and female children with cleft lip and/or palate was analyzed to account for the different phonological processes exhibited. Study participants were divided into three main age groups, 4 years 2 months to 4 years 7 months, 5 years 3 months to 5 years 6 months, and 6 years 4 months to 6 years 6…

  9. "Slight" of hand: the processing of visually degraded gestures with speech.

    Science.gov (United States)

    Kelly, Spencer D; Hansen, Bruce C; Clark, David T

    2012-01-01

    Co-speech hand gestures influence language comprehension. The present experiment explored what part of the visual processing system is optimized for processing these gestures. Participants viewed short video clips of speech and gestures (e.g., a person saying "chop" or "twist" while making a chopping gesture) and had to determine whether the two modalities were congruent or incongruent. Gesture videos were designed to stimulate the parvocellular or magnocellular visual pathways by filtering out low or high spatial frequencies (HSF versus LSF) at two levels of degradation severity (moderate and severe). Participants were less accurate and slower at processing gesture and speech at severe versus moderate levels of degradation. In addition, they were slower for LSF versus HSF stimuli, and this difference was most pronounced in the severely degraded condition. However, exploratory item analyses showed that the HSF advantage was modulated by the range of motion and amount of motion energy in each video. The results suggest that hand gestures exploit a wide range of spatial frequencies, and depending on what frequencies carry the most motion energy, parvocellular or magnocellular visual pathways are maximized to quickly and optimally extract meaning.

  10. The applicability of normalisation process theory to speech and language therapy: a review of qualitative research on a speech and language intervention.

    Science.gov (United States)

    James, Deborah M

    2011-08-12

    The Bercow review found a high level of public dissatisfaction with speech and language services for children. Children with speech, language, and communication needs (SLCN) often have chronic complex conditions that require provision from health, education, and community services. Speech and language therapists are a small group of Allied Health Professionals with a specialist skill-set that equips them to work with children with SLCN. They work within and across the diverse range of public service providers. The aim of this review was to explore the applicability of Normalisation Process Theory (NPT) to the case of speech and language therapy. A review of qualitative research on a successfully embedded speech and language therapy intervention was undertaken to test the applicability of NPT. The review focused on two of the collective action elements of NPT (relational integration and interaction workability) using all previously published qualitative data from both parents and practitioners' perspectives on the intervention. The synthesis of the data based on the Normalisation Process Model (NPM) uncovered strengths in the interpersonal processes between the practitioners and parents, and weaknesses in how the accountability of the intervention is distributed in the health system. The analysis based on the NPM uncovered interpersonal processes between the practitioners and parents that were likely to have given rise to successful implementation of the intervention. In previous qualitative research on this intervention where the Medical Research Council's guidance on developing a design for a complex intervention had been used as a framework, the interpersonal work within the intervention had emerged as a barrier to implementation of the intervention. It is suggested that the design of services for children and families needs to extend beyond the consideration of benefits and barriers to embrace the social processes that appear to afford success in embedding

  11. Effects of Blocked and Random Practice Schedule on Outcomes of Sound Production Treatment for Acquired Apraxia of Speech: Results of a Group Investigation.

    Science.gov (United States)

    Wambaugh, Julie L; Nessler, Christina; Wright, Sandra; Mauszycki, Shannon C; DeLong, Catharine; Berggren, Kiera; Bailey, Dallin J

    2017-06-22

    The purpose of this investigation was to compare the effects of schedule of practice (i.e., blocked vs. random) on outcomes of Sound Production Treatment (SPT; Wambaugh, Kalinyak-Fliszar, West, & Doyle, 1998) for speakers with chronic acquired apraxia of speech and aphasia. A combination of group and single-case experimental designs was used. Twenty participants each received SPT administered with randomized stimuli presentation (SPT-R) and SPT applied with blocked stimuli presentation (SPT-B). Treatment effects were examined with respect to accuracy of articulation as measured in treated and untreated experimental words produced during probes. All participants demonstrated improved articulation of treated items with both practice schedules. Effect sizes were calculated to estimate magnitude of change for treated and untreated items by treatment condition. No significant differences were found for SPT-R and SPT-B relative to effect size. Percent change over the highest baseline performance was also calculated to provide a clinically relevant indication of improvement. Change scores associated with SPT-R were significantly higher than those for SPT-B for treated items but not untreated items. SPT can result in improved articulation regardless of schedule of practice. However, SPT-R may result in greater gains for treated items. https://doi.org/10.23641/asha.5116831.
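
    As a clinically oriented complement to effect sizes, the percent-change metric reported here compares treatment-phase performance against the single highest baseline probe. A hypothetical implementation (the probe accuracies in the example are invented) might be:

    ```python
    def percent_change_over_baseline(baseline, treatment):
        """Percent change of mean treatment-phase accuracy relative to the
        single highest baseline probe (a conservative change metric)."""
        best_baseline = max(baseline)
        mean_treatment = sum(treatment) / len(treatment)
        return 100.0 * (mean_treatment - best_baseline) / best_baseline

    # hypothetical probe accuracies (%): baseline phase, then treatment phase
    print(round(percent_change_over_baseline([20, 30, 25], [70, 80, 85]), 1))
    # -> 161.1
    ```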

  12. The influence of spectral characteristics of early reflections on speech intelligibility

    DEFF Research Database (Denmark)

    Arweiler, Iris; Buchholz, Jörg

    2011-01-01

    The auditory system takes advantage of early reflections (ERs) in a room by integrating them with the direct sound (DS) and thereby increasing the effective speech level. In the present paper the benefit from realistic ERs on speech intelligibility in diffuse speech-shaped noise was investigated...... ascribed to their altered spectrum compared to the DS and to the filtering by the torso, head, and pinna. No binaural processing other than a binaural summation effect could be observed....

  13. Interaction matters: A perceived social partner alters the neural processing of human speech.

    Science.gov (United States)

    Rice, Katherine; Redcay, Elizabeth

    2016-04-01

    Mounting evidence suggests that social interaction changes how communicative behaviors (e.g., spoken language, gaze) are processed, but the precise neural bases by which social-interactive context may alter communication remain unknown. Various perspectives suggest that live interactions are more rewarding, more attention-grabbing, or require increased mentalizing, that is, thinking about the thoughts of others. Dissociating between these possibilities is difficult because most extant neuroimaging paradigms examining social interaction have not directly compared live paradigms to conventional "offline" (or recorded) paradigms. We developed a novel fMRI paradigm to assess whether and how an interactive context changes the processing of speech matched in content and vocal characteristics. Participants listened to short vignettes--which contained no reference to people or mental states--believing that some vignettes were prerecorded and that others were presented over a real-time audio-feed by a live social partner. In actuality, all speech was prerecorded. Simply believing that speech was live increased activation in each participant's own mentalizing regions, defined using a functional localizer. Contrasting live to recorded speech did not reveal significant differences in attention or reward regions. Further, higher levels of autistic-like traits were associated with altered neural specialization for live interaction. These results suggest that humans engage in ongoing mentalizing about social partners, even when such mentalizing is not explicitly required, illustrating how social context shapes social cognition. Understanding communication in social context has important implications for typical and atypical social processing, especially for disorders like autism where social difficulties are more acute in live interaction. Copyright © 2015 Elsevier Inc. All rights reserved.

  14. Speech intelligibility for normal hearing and hearing-impaired listeners in simulated room acoustic conditions

    DEFF Research Database (Denmark)

    Arweiler, Iris; Dau, Torsten; Poulsen, Torben

    Speech intelligibility depends on many factors such as room acoustics, the acoustical properties and location of the signal and the interferers, and the ability of the (normal and impaired) auditory system to process monaural and binaural sounds. In the present study, the effect of reverberation...... on spatial release from masking was investigated in normal-hearing and hearing-impaired listeners using three types of interferers: speech-shaped noise, an interfering female talker and speech-modulated noise. Speech reception thresholds (SRT) were obtained in three simulated environments: a listening room......, a classroom and a church. The data from the study provide constraints for existing models of speech intelligibility prediction (based on the speech intelligibility index, SII, or the speech transmission index, STI) which have shortcomings when reverberation and/or fluctuating noise affect speech...
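
    The SII-based predictors mentioned here reduce, at their core, to a band-importance-weighted audibility sum: each band's long-term SNR is clipped to ±15 dB and mapped linearly onto 0..1. The sketch below shows that stationary-noise core with invented importance weights; its blindness to fluctuating maskers and reverberant smearing is precisely the shortcoming the study's data expose.

    ```python
    import numpy as np

    def sii_like_index(snr_db, band_importance):
        """Simplified SII-style intelligibility index: per-band SNR is
        clipped to [-15, +15] dB, mapped linearly to an audibility of 0..1,
        and summed with band-importance weights (which should sum to 1)."""
        audibility = (np.clip(snr_db, -15.0, 15.0) + 15.0) / 30.0
        return float(np.sum(band_importance * audibility))

    snr = np.array([10.0, 5.0, 0.0, -5.0, -12.0])  # per-band long-term SNR, dB
    w = np.array([0.15, 0.25, 0.30, 0.20, 0.10])   # invented importance weights
    print(round(sii_like_index(snr, w), 3))        # 0.518
    ```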

  15. Hemispheric asymmetries in speech perception: sense, nonsense and modulations.

    Directory of Open Access Journals (Sweden)

    Stuart Rosen

    Full Text Available The well-established left hemisphere specialisation for language processing has long been claimed to be based on a low-level auditory specialization for specific acoustic features in speech, particularly regarding 'rapid temporal processing'. A novel analysis/synthesis technique was used to construct a variety of sounds based on simple sentences which could be manipulated in spectro-temporal complexity, and whether they were intelligible or not. All sounds consisted of two noise-excited spectral prominences (based on the lower two formants in the original speech) which could be static or varying in frequency and/or amplitude independently. Dynamically varying both acoustic features based on the same sentence led to intelligible speech, but when either or both acoustic features were static, the stimuli were not intelligible. Using the frequency dynamics from one sentence with the amplitude dynamics of another led to unintelligible sounds of comparable spectro-temporal complexity to the intelligible ones. Positron emission tomography (PET) was used to compare which brain regions were active when participants listened to the different sounds. Neural activity to spectral and amplitude modulations sufficient to support speech intelligibility (without actually being intelligible) was seen bilaterally, with a right temporal lobe dominance. A left dominant response was seen only to intelligible sounds. It thus appears that the left hemisphere specialisation for speech is based on the linguistic properties of utterances, not on particular acoustic features.
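
    Stimuli of this family can be approximated with frame-wise filtered noise: for each short frame, white noise is band-pass filtered around the prominence's centre frequency, scaled by its amplitude, and overlap-added. The sketch below is a crude stand-in for the paper's analysis/synthesis technique (frame length, bandwidth, and the F1/F2-like trajectories are invented); summing two such tracks with dynamic frequency and amplitude mimics the intelligible condition, while freezing either dimension yields the unintelligible controls.

    ```python
    import numpy as np
    from scipy.signal import butter, lfilter

    def noise_prominence(freqs, amps, fs, hop=160, bw=200.0):
        """One noise-excited spectral prominence: per frame, white noise is
        band-pass filtered around the frame's centre frequency, scaled by
        its amplitude, and overlap-added with 50% Hann crossfades."""
        rng = np.random.default_rng(0)
        n = len(freqs)
        out = np.zeros((n + 1) * hop)
        win = np.hanning(2 * hop)
        for k in range(n):
            lo, hi = freqs[k] - bw / 2, freqs[k] + bw / 2
            b, a = butter(2, [lo / (fs / 2), hi / (fs / 2)], btype="band")
            seg = lfilter(b, a, rng.normal(size=2 * hop))
            out[k * hop : (k + 2) * hop] += amps[k] * win * seg
        return out

    # two prominences moving like F1/F2 of a vowel glide at 16 kHz; freezing
    # freqs and/or amps would mimic the static (unintelligible) conditions
    fs = 16000
    f1, f2 = np.linspace(300, 700, 50), np.linspace(2200, 1200, 50)
    stim = (noise_prominence(f1, np.ones(50), fs)
            + noise_prominence(f2, np.ones(50), fs))
    ```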

  16. Effects of age on electrophysiological correlates of speech processing in a dynamic cocktail-party situation

    Directory of Open Access Journals (Sweden)

    Stephan eGetzmann

    2015-09-01

    Full Text Available Successful speech perception in multi-speaker environments depends on auditory scene analysis, comprising auditory object segregation and grouping, and on focusing attention toward the speaker of interest. Changes in speaker settings (e.g., in speaker position) require object re-selection and attention re-focusing. Here, we tested the processing of changes in a realistic multi-speaker scenario in younger and older adults, employing a speech-perception task and event-related potential (ERP) measures. Sequences of short words (combinations of company names and values) were simultaneously presented via four loudspeakers at different locations, and the participants responded to the value of a target company. Voice and position of the speaker of the target information were kept constant for a variable number of trials and then changed. Relative to the pre-change level, changes caused higher error rates, and more so in older than younger adults. The ERP analysis revealed stronger fronto-central N2 and N400 components in younger adults, suggesting a more effective inhibition of concurrent speech stimuli and enhanced language processing. The difference ERPs (post-change minus pre-change) indicated a change-related N400 and late positive complex (LPC) over parietal areas in both groups. Only the older adults showed an additional frontal LPC, suggesting increased allocation of attentional resources after changes in speaker settings. In sum, changes in speaker settings are critical events for speech perception in multi-speaker environments. Especially older persons show deficits that could be based on less flexible inhibitory control and increased distraction.

  17. Effects of musical training on sound pattern processing in high-school students.

    Science.gov (United States)

    Wang, Wenjung; Staffaroni, Laura; Reid, Errold; Steinschneider, Mitchell; Sussman, Elyse

    2009-05-01

    Recognizing melody in music involves detection of both the pitch intervals and the silence between sequentially presented sounds. This study tested the hypothesis that active musical training in adolescents facilitates the ability to passively detect sequential sound patterns compared to musically non-trained age-matched peers. Twenty adolescents, aged 15-18 years, were divided into groups according to their musical training and current experience. A fixed order tone pattern was presented at various stimulus rates while the electroencephalogram was recorded. The influence of musical training on passive auditory processing of the sound patterns was assessed using components of event-related brain potentials (ERPs). The mismatch negativity (MMN) ERP component was elicited under different stimulus onset asynchrony (SOA) conditions in musicians than in non-musicians, indicating that musically active adolescents were able to detect sound patterns across longer time intervals than their age-matched peers. Musical training facilitates detection of auditory patterns, allowing musically trained adolescents to automatically recognize sequential sound patterns over longer time periods than their non-musical counterparts.

  18. The selective role of premotor cortex in speech perception: a contribution to phoneme judgements but not speech comprehension.

    Science.gov (United States)

    Krieger-Redwood, Katya; Gaskell, M Gareth; Lindsay, Shane; Jefferies, Elizabeth

    2013-12-01

    Several accounts of speech perception propose that the areas involved in producing language are also involved in perceiving it. In line with this view, neuroimaging studies show activation of premotor cortex (PMC) during phoneme judgment tasks; however, there is debate about whether speech perception necessarily involves motor processes, across all task contexts, or whether the contribution of PMC is restricted to tasks requiring explicit phoneme awareness. Some aspects of speech processing, such as mapping sounds onto meaning, may proceed without the involvement of motor speech areas if PMC specifically contributes to the manipulation and categorical perception of phonemes. We applied TMS to three sites (PMC, posterior superior temporal gyrus, and occipital pole) and, for the first time within the TMS literature, directly contrasted two speech perception tasks that required explicit phoneme decisions and mapping of speech sounds onto semantic categories, respectively. TMS to PMC disrupted explicit phonological judgments but not access to meaning for the same speech stimuli. TMS to two further sites confirmed that this pattern was site specific and did not reflect a generic difference in the susceptibility of our experimental tasks to TMS: stimulation of pSTG, a site involved in auditory processing, disrupted performance in both language tasks, whereas stimulation of occipital pole had no effect on performance in either task. These findings demonstrate that, although PMC is important for explicit phonological judgments, crucially, PMC is not necessary for mapping speech onto meanings.

  19. Convection measurement package for space processing sounding rocket flights. [low gravity manufacturing - fluid dynamics

    Science.gov (United States)

    Spradley, L. W.

    1975-01-01

    The effects of nonconstant accelerations, rocket vibrations, and spin rates on heated fluids were studied. A system is discussed which can determine the influence of the convective effects on fluid experiments. The general suitability of sounding rockets for performing these experiments is treated. An analytical investigation of convection in an enclosure heated in low gravity is presented. The gravitational body force was taken as a time-varying function using anticipated sounding rocket accelerations, since accelerometer flight data were not available. A computer program was used to calculate the flow rates and heat transfer in fluids with geometries and boundary conditions typical of space processing configurations. Results of the analytical investigation identify the configurations, fluids and boundary values which are most suitable for measuring the convective environment of sounding rockets. A short description of fabricated fluid cells and the convection measurement package is given. Photographs are included.

  20. Exploring Australian speech-language pathologists' use and perceptions of non-speech oral motor exercises.

    Science.gov (United States)

    Rumbach, Anna F; Rose, Tanya A; Cheah, Mynn

    2018-01-29

    To explore Australian speech-language pathologists' use of non-speech oral motor exercises, and rationales for using/not using non-speech oral motor exercises in clinical practice. A total of 124 speech-language pathologists practising in Australia, working with paediatric and/or adult clients with speech sound difficulties, completed an online survey. The majority of speech-language pathologists reported that they did not use non-speech oral motor exercises when working with paediatric or adult clients with speech sound difficulties. However, more than half of the speech-language pathologists working with adult clients who have dysarthria reported using non-speech oral motor exercises with this population. The most frequently reported rationale for using non-speech oral motor exercises in speech sound difficulty management was to improve awareness/placement of articulators. The majority of speech-language pathologists agreed there is no clear clinical or research evidence base to support non-speech oral motor exercise use with clients who have speech sound difficulties. This study provides an overview of Australian speech-language pathologists' reported use and perceptions of non-speech oral motor exercises' applicability and efficacy in treating paediatric and adult clients who have speech sound difficulties. The research findings provide speech-language pathologists with insight into how and why non-speech oral motor exercises are currently used, and add to the knowledge base regarding Australian speech-language pathology practice of non-speech oral motor exercises in the treatment of speech sound difficulties. Implications for Rehabilitation Non-speech oral motor exercises refer to oral motor activities which do not involve speech, but involve the manipulation or stimulation of oral structures including the lips, tongue, jaw, and soft palate. Non-speech oral motor exercises are intended to improve the function (e.g., movement, strength) of oral structures. The…

  1. Human Superior Temporal Gyrus Organization of Spectrotemporal Modulation Tuning Derived from Speech Stimuli.

    Science.gov (United States)

    Hullett, Patrick W; Hamilton, Liberty S; Mesgarani, Nima; Schreiner, Christoph E; Chang, Edward F

    2016-02-10

    The human superior temporal gyrus (STG) is critical for speech perception, yet the organization of spectrotemporal processing of speech within the STG is not well understood. Here, to characterize the spatial organization of spectrotemporal processing of speech across human STG, we use high-density cortical surface field potential recordings while participants listened to natural continuous speech. While synthetic broad-band stimuli did not yield sustained activation of the STG, spectrotemporal receptive fields could be reconstructed from vigorous responses to speech stimuli. We find that the human STG displays a robust anterior-posterior spatial distribution of spectrotemporal tuning in which the posterior STG is tuned for temporally fast varying speech sounds that have relatively constant energy across the frequency axis (low spectral modulation), while the anterior STG is tuned for temporally slow varying speech sounds that have a high degree of spectral variation across the frequency axis (high spectral modulation). This work illustrates the organization of spectrotemporal processing in the human STG, and illuminates the processing of ethologically relevant speech signals in a region of the brain specialized for speech perception. Considerable evidence has implicated the human superior temporal gyrus (STG) in speech processing. However, the gross organization of spectrotemporal processing of speech within the STG is not well characterized. Here we use natural speech stimuli and advanced receptive field characterization methods to show that spectrotemporal features within speech are well organized along the posterior-to-anterior axis of the human STG. These findings demonstrate robust functional organization based on spectrotemporal modulation content, and illustrate that much of the encoded information in the STG represents the physical acoustic properties of speech stimuli. Copyright © 2016 the authors.

  2. IRI-2012 MODEL ADAPTABILITY ESTIMATION FOR AUTOMATED PROCESSING OF VERTICAL SOUNDING IONOGRAMS

    Directory of Open Access Journals (Sweden)

    V. D. Nikolaeva

    2014-01-01

    Full Text Available The paper deals with the possibility of applying the IRI-2012 global empirical model to semiautomatic processing of vertical ionosphere sounding data. Main ionosphere characteristics derived from vertical sounding data at the IZMIRAN Voeikovo station in February 2013 were compared with IRI-2012 model calculation results. 2688 model values and 1866 real values of f0F2, f0E, hmF2, and hmE were processed. The E- and F2-layer critical frequencies (f0E, f0F2) and peak heights (hmE, hmF2) were determined from the ionograms. Vertical profiles of the electron concentration were reconstructed with the IRI-2012 model from the measured frequencies and heights. The model calculation was also made without the inclusion of the real vertical sounding data. Monthly averages and standard deviations (σ) of the parameters f0F2, f0E, hmF2, and hmE for each hour of the day were calculated from both the vertical sounding data and the model values. Conditions under which the model is applicable to automated processing of ionograms for the subauroral ionosphere were determined. The initial IRI-2012 model can be applied to subauroral ionogram processing in the daytime under undisturbed conditions in the absence of sporadic ionization. In this case model calculations can be adjusted with near-real-time vertical sounding data. IRI-2012 model values for f0E (in the daytime) and hmF2 can be applied to reduce computational costs in systems of automatic parameter search and for preliminary determination of the search range for the main parameters. The IRI-2012 model can be used for a more accurate approximation of the real data series in the absence of real values. To account for sporadic ionization, high-latitude ionosphere models with a corpuscular ionization unit must be applied.
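
    The hourly monthly statistics described above are simple to reproduce; below is a minimal sketch in Python/NumPy, where the array names and values are hypothetical placeholders for the vertical sounding measurements:

        import numpy as np

        # Hypothetical inputs: one f0F2 reading per processed ionogram, tagged
        # with the hour of day at which it was taken (values are placeholders).
        f0f2 = np.array([5.1, 5.3, 5.0, 6.8, 7.0, 6.9])   # MHz, vertical sounding
        hour = np.array([0, 0, 0, 12, 12, 12])            # hour of day per reading

        for h in np.unique(hour):
            vals = f0f2[hour == h]
            print(f"hour {h:02d}: mean = {vals.mean():.2f} MHz, "
                  f"sigma = {vals.std(ddof=1):.2f} MHz")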

  3. Cognitive Processing Speed, Working Memory, and the Intelligibility of Hearing Aid-Processed Speech in Persons with Hearing Impairment

    Directory of Open Access Journals (Sweden)

    Wycliffe Kabaywe Yumba

    2017-08-01

    Full Text Available Previous studies have demonstrated that successful listening with advanced signal processing in digital hearing aids is associated with individual cognitive capacity, particularly working memory capacity (WMC). This study aimed to examine the relationship between cognitive abilities (cognitive processing speed and WMC) and individual listeners' responses to digital signal processing settings in adverse listening conditions. A total of 194 native Swedish speakers (83 women and 111 men), aged 33-80 years (mean = 60.75 years, SD = 8.89), with bilateral, symmetrical mild to moderate sensorineural hearing loss, who had completed a lexical decision speed test (measuring cognitive processing speed) and a semantic word-pair span test (SWPST, capturing WMC), participated in this study. The Hagerman test (capturing speech recognition in noise) was conducted using an experimental hearing aid with three digital signal processing settings: (1) linear amplification without noise reduction (NoP), (2) linear amplification with noise reduction (NR), and (3) non-linear amplification without NR (“fast-acting compression”). The results showed that cognitive processing speed was a better predictor of speech intelligibility in noise, regardless of the type of signal processing algorithm used. That is, there was a stronger association between cognitive processing speed and NR outcomes and fast-acting compression outcomes (in steady-state noise). We observed a weaker relationship between working memory and NR, and WMC did not relate to fast-acting compression. WMC was a relatively weaker predictor of speech intelligibility in noise. These findings might have been different if the participants had been provided with training and/or allowed to acclimatize to binary masking noise reduction or fast-acting compression.

  4. Childhood apraxia of speech and multiple phonological disorders in Cairo-Egyptian Arabic speaking children: language, speech, and oro-motor differences.

    Science.gov (United States)

    Aziz, Azza Adel; Shohdi, Sahar; Osman, Dalia Mostafa; Habib, Emad Iskander

    2010-06-01

    Childhood apraxia of speech is a neurological childhood speech-sound disorder in which the precision and consistency of movements underlying speech are impaired in the absence of neuromuscular deficits. Children with childhood apraxia of speech and those with multiple phonological disorder share some common phonological errors that can be misleading in diagnosis. This study posed the question of whether there are significant differences in language, speech, and non-speech oral performances between children with childhood apraxia of speech, children with multiple phonological disorder, and normal children that could be used for differential diagnosis. 30 pre-school children between the ages of 4 and 6 years served as participants. Each of these children represented one of 3 possible subject groups: Group 1: multiple phonological disorder; Group 2: suspected cases of childhood apraxia of speech; Group 3: control group with no communication disorder. Assessment procedures included parent interviews, testing of non-speech oral motor skills, and testing of speech skills. Data showed that children with suspected childhood apraxia of speech had significantly lower language scores only in their expressive abilities. Non-speech tasks did not identify significant differences between the childhood apraxia of speech and multiple phonological disorder groups except for those which required two sequential motor performances. In speech tasks, both consonant and vowel accuracy were significantly lower and more inconsistent in the childhood apraxia of speech group than in the multiple phonological disorder group. Syllable number, shape, and sequence accuracy differed significantly between the childhood apraxia of speech group and the other two groups. In addition, children with childhood apraxia of speech showed greater difficulty in processing prosodic features, indicating a clear need to address these variables for differential diagnosis and treatment of children with childhood apraxia of speech. Copyright (c) ...

  5. Cognitive processing load during listening is reduced more by decreasing voice similarity than by increasing spatial separation between target and masker speech

    NARCIS (Netherlands)

    Zekveld, A.A.; Rudner, M.; Kramer, S.E.; Lyzenga, J.; Ronnberg, J.

    2014-01-01

    We investigated changes in speech recognition and cognitive processing load due to the masking release attributable to decreasing similarity between target and masker speech. This was achieved by using masker voices with either the same gender (female) as the target speech or a different gender (male) ...

  6. Brain regions for sound processing and song release in a small grasshopper.

    Science.gov (United States)

    Balvantray Bhavsar, Mit; Stumpner, Andreas; Heinrich, Ralf

    2017-05-01

    We investigated brain regions (mostly neuropils) that process auditory information relevant for the initiation of response songs of female grasshoppers Chorthippus biguttulus during bidirectional intraspecific acoustic communication. Male-female acoustic duets in the species Ch. biguttulus require the perception of sounds, their recognition as a species- and gender-specific signal, and the initiation of commands that activate thoracic pattern-generating circuits to drive the sound-producing stridulatory movements of the hind legs. To study sensory-to-motor processing during acoustic communication we used multielectrodes that allowed simultaneous recordings of acoustically stimulated electrical activity from several ascending auditory interneurons or local brain neurons and subsequent electrical stimulation of the recording site. Auditory activity was detected in the lateral protocerebrum (where most of the described ascending auditory interneurons terminate), in the superior medial protocerebrum, and in the central complex, which has previously been implicated in the control of sound production. Neural responses to behaviorally attractive sound stimuli showed no or only poor correlation with behavioral responses. Current injections into the lateral protocerebrum, the central complex, and the deuto-/tritocerebrum (close to the cerebro-cervical fascicles), but not into the superior medial protocerebrum, elicited species-typical stridulation with a high success rate. Latencies and numbers of phrases produced by electrical stimulation differed between these brain regions. Our results indicate three brain regions (likely neuropils) where auditory activity can be detected, with two of these regions potentially involved in song initiation. Copyright © 2017 Elsevier Ltd. All rights reserved.

  7. Does dynamic information about the speaker's face contribute to semantic speech processing? ERP evidence.

    Science.gov (United States)

    Hernández-Gutiérrez, David; Abdel Rahman, Rasha; Martín-Loeches, Manuel; Muñoz, Francisco; Schacht, Annekathrin; Sommer, Werner

    2018-07-01

    Face-to-face interactions characterize communication in social contexts. These situations are typically multimodal, requiring the integration of linguistic auditory input with facial information from the speaker. In particular, eye gaze and visual speech provide the listener with social and linguistic information, respectively. Despite the importance of this context for an ecological study of language, research on audiovisual integration has mainly focused on the phonological level, leaving aside effects on semantic comprehension. Here we used event-related potentials (ERPs) to investigate the influence of dynamic facial information on semantic processing of connected speech. Participants were presented with either a video or a still picture of the speaker, concomitant to auditory sentences. Across three experiments, we manipulated the presence or absence of the speaker's dynamic facial features (mouth and eyes) and compared the amplitudes of the semantic N400 elicited by unexpected words. Contrary to our predictions, the N400 was not modulated by dynamic facial information; semantic processing therefore seems to be unaffected by the speaker's gaze and visual speech. However, during the processing of expected words, dynamic faces elicited a long-lasting late posterior positivity compared to the static condition. This effect was significantly reduced when the mouth of the speaker was covered. Our findings may indicate an increase of attentional processing in richer communicative contexts. The present findings also demonstrate that in natural communicative face-to-face encounters, perceiving the face of a speaker in motion provides supplementary information that is taken into account by the listener, especially when auditory comprehension is non-demanding. Copyright © 2018 Elsevier Ltd. All rights reserved.

  8. Automatic analysis of slips of the tongue: Insights into the cognitive architecture of speech production.

    Science.gov (United States)

    Goldrick, Matthew; Keshet, Joseph; Gustafson, Erin; Heller, Jordana; Needle, Jeremy

    2016-04-01

    Traces of the cognitive mechanisms underlying speaking can be found within subtle variations in how we pronounce sounds. While speech errors have traditionally been seen as categorical substitutions of one sound for another, acoustic/articulatory analyses show they partially reflect the intended sound. When "pig" is mispronounced as "big," the resulting /b/ sound differs from correct productions of "big," moving towards the intended "pig", revealing the role of graded sound representations in speech production. Investigating the origins of such phenomena requires detailed estimation of speech sound distributions; this has been hampered by reliance on subjective, labor-intensive manual annotation. Computational methods can address these issues by providing objective, automatic measurements. We develop a novel high-precision computational approach, based on a set of machine learning algorithms, for the measurement of elicited speech. The algorithms are trained on existing manually labeled data to detect and locate linguistically relevant acoustic properties with high accuracy. Our approach is robust, is designed to handle mis-productions, and overall matches the performance of expert coders. It allows us to analyze a very large dataset of speech errors (containing far more errors than the total in the existing literature), illuminating properties of speech sound distributions previously impossible to observe reliably. We argue that this provides novel evidence that two sources both contribute to deviations in speech errors: planning processes specifying the targets of articulation and articulatory processes specifying the motor movements that execute this plan. These findings illustrate how a much richer picture of speech provides an opportunity to gain novel insights into language processing. Copyright © 2016 Elsevier B.V. All rights reserved.
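
    The general train-on-labels, measure-graded-properties idea can be illustrated with a minimal sketch; the feature (voice onset time), the data, and the use of logistic regression are hypothetical stand-ins, not the authors' actual algorithms:

        import numpy as np
        from sklearn.linear_model import LogisticRegression

        # Hypothetical training data: voice onset time (ms) of tokens hand-labeled
        # as /b/ (0) or /p/ (1); real systems use richer acoustic feature sets.
        vot = np.array([[8.0], [12.0], [15.0], [55.0], [65.0], [80.0]])
        label = np.array([0, 0, 0, 1, 1, 1])

        clf = LogisticRegression().fit(vot, label)

        # For a new, possibly mis-produced token, the class posterior acts as a
        # graded (rather than categorical) measure of the produced sound.
        print(clf.predict_proba([[30.0]])[0])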

  9. Enhanced Excitatory Connectivity and Disturbed Sound Processing in the Auditory Brainstem of Fragile X Mice.

    Science.gov (United States)

    Garcia-Pino, Elisabet; Gessele, Nikodemus; Koch, Ursula

    2017-08-02

    Hypersensitivity to sounds is one of the prevalent symptoms in individuals with Fragile X syndrome (FXS). It manifests behaviorally early during development and is often used as a landmark for treatment efficacy. However, the physiological mechanisms and circuit-level alterations underlying this aberrant behavior remain poorly understood. Using the mouse model of FXS (Fmr1 KO), we demonstrate that functional maturation of auditory brainstem synapses is impaired in FXS. Fmr1 KO mice showed a greatly enhanced excitatory synaptic input strength in neurons of the lateral superior olive (LSO), a prominent auditory brainstem nucleus, which integrates ipsilateral excitation and contralateral inhibition to compute interaural level differences. Conversely, the glycinergic, inhibitory input properties remained unaffected. The enhanced excitation was the result of an increased number of cochlear nucleus fibers converging onto one LSO neuron, without changing individual synapse properties. Concomitantly, immunolabeling of excitatory ending markers revealed an increase in the immunolabeled area, supporting abnormally elevated excitatory input numbers. Intrinsic firing properties were only slightly enhanced. In line with the disturbed development of LSO circuitry, auditory processing was also affected in adult Fmr1 KO mice, as shown with single-unit recordings of LSO neurons. These processing deficits manifested as an increase in firing rate, a broadening of the frequency response area, and a shift in the interaural level difference function of LSO neurons. Our results suggest that this aberrant synaptic development of auditory brainstem circuits might be a major underlying cause of the auditory processing deficits in FXS. SIGNIFICANCE STATEMENT Fragile X Syndrome (FXS) is the most common inheritable form of intellectual impairment, including autism. A core symptom of FXS is extreme sensitivity to loud sounds. This is one reason why individuals with FXS tend to avoid social ...

  10. Temporal integration: intentional sound discrimination does not modulate stimulus-driven processes in auditory event synthesis.

    Science.gov (United States)

    Sussman, Elyse; Winkler, István; Kreuzer, Judith; Saher, Marieke; Näätänen, Risto; Ritter, Walter

    2002-12-01

    Our previous study showed that the auditory context could influence whether two successive acoustic changes occurring within the temporal integration window (approximately 200 ms) were pre-attentively encoded as a single auditory event or as two discrete events (Cogn Brain Res 12 (2001) 431). The aim of the current study was to assess whether top-down processes could influence the stimulus-driven processes in determining what constitutes an auditory event. The electroencephalogram (EEG) was recorded from 11 scalp electrodes to frequently occurring standard and infrequently occurring deviant sounds. Within the stimulus blocks, deviants either occurred only in pairs (successive feature changes) or both singly and in pairs. Event-related potential indices of change and target detection, the mismatch negativity (MMN) and the N2b component, respectively, were compared with the simultaneously measured performance in discriminating the deviants. Even though subjects could voluntarily distinguish the two successive auditory feature changes from each other, which was also indicated by the elicitation of the N2b target-detection response, top-down processes did not modify the event organization reflected by the MMN response. Top-down processes can extract elemental auditory information from a single integrated acoustic event, but the extraction occurs at a later processing stage than the one whose outcome is indexed by MMN. Initial processes of auditory event-formation are fully governed by the context within which the sounds occur. Perception of the deviants as two separate sound events (the top-down effect) did not change the initial neural representation of the same deviants as one event (indexed by the MMN), absent a corresponding change in the stimulus-driven sound organization.

  11. Demodulation Processes in Auditory Perception

    National Research Council Canada - National Science Library

    Feth, Lawrence

    1997-01-01

    The long range goal of this project was the understanding of human auditory processing of information conveyed by complex, time varying signals such as speech, music or important environmental sounds...

  12. Functional heterogeneity within the default network during semantic processing and speech production.

    Directory of Open Access Journals (Sweden)

    Mohamed L Seghier

    2012-08-01

    Full Text Available This fMRI study investigated the functional heterogeneity of the core nodes of the default mode network (DMN) during language processing. The core nodes of the DMN were defined as task-induced deactivations over multiple tasks in 94 healthy subjects. We used a factorial design that manipulated different tasks (semantic matching or speech production) and stimuli (familiar words and objects or unfamiliar stimuli), alternating with periods of fixation/rest. Our findings revealed several consistent effects in the DMN, namely less deactivation in the left inferior parietal lobule during semantic than perceptual matching, in parallel with greater deactivation during semantic matching in anterior subdivisions of the posterior cingulate cortex and the ventromedial prefrontal cortex. This suggests that, when the brain is engaged in effortful semantic tasks, a part of the DMN in the left angular gyrus was less deactivated while five other nodes of the DMN were more deactivated. These five DMN areas, where deactivation was greater for semantic than perceptual matching, were further differentiated because deactivation was greater in (i) the posterior ventromedial prefrontal cortex for speech production relative to semantic matching, (ii) the posterior precuneus and posterior cingulate cortex for perceptual processing relative to speech production, and (iii) the right inferior parietal cortex for pictures of objects relative to written words during both naming and semantic decisions. Our results thus highlight that task difficulty alone cannot fully explain the functional variability in task-induced deactivations. Together these results emphasize that core nodes within the DMN are functionally heterogeneous and differentially sensitive to the type of language processing.

  13. Categorical speech processing in Broca's area: an fMRI study using multivariate pattern-based analysis.

    Science.gov (United States)

    Lee, Yune-Sang; Turkeltaub, Peter; Granger, Richard; Raizada, Rajeev D S

    2012-03-14

    Although much effort has been directed toward understanding the neural basis of speech processing, the neural processes involved in the categorical perception of speech have been relatively less studied, and many questions remain open. In this functional magnetic resonance imaging (fMRI) study, we probed the cortical regions mediating categorical speech perception using an advanced brain-mapping technique, whole-brain multivariate pattern-based analysis (MVPA). Normal healthy human subjects (native English speakers) were scanned while they listened to 10 consonant-vowel syllables along the /ba/-/da/ continuum. Outside of the scanner, individuals' own category boundaries were measured to divide the fMRI data into /ba/ and /da/ conditions per subject. The whole-brain MVPA revealed that Broca's area and the left pre-supplementary motor area evoked distinct neural activity patterns between the two perceptual categories (/ba/ vs /da/). Broca's area was also found when the same analysis was applied to another dataset (Raizada and Poldrack, 2007), which had previously yielded the supramarginal gyrus using a univariate adaptation-fMRI paradigm. The consistent MVPA findings from two independent datasets strongly indicate that Broca's area participates in categorical speech perception, with a possible role in translating speech signals into articulatory codes. The difference in results between univariate and multivariate pattern-based analyses of the same data suggests that processes in different cortical areas along the dorsal speech perception stream are distributed on different spatial scales.

  14. Hemispheric association and dissociation of voice and speech information processing in stroke.

    Science.gov (United States)

    Jones, Anna B; Farrall, Andrew J; Belin, Pascal; Pernet, Cyril R

    2015-10-01

    As we listen to someone speaking, we extract both linguistic and non-linguistic information. Knowing how these two sets of information are processed in the brain is fundamental for the general understanding of social communication, speech recognition, and therapy of language impairments. We investigated the pattern of performance in phoneme versus gender categorization in left and right hemisphere stroke patients, and found an anatomo-functional dissociation in the right frontal cortex, establishing a new syndrome in voice discrimination abilities. In addition, phoneme and gender performances were more often associated than dissociated in the left hemisphere patients, suggesting common neural underpinnings. Copyright © 2015 Elsevier Ltd. All rights reserved.

  15. Stochastic Signal Processing for Sound Environment System with Decibel Evaluation and Energy Observation

    Directory of Open Access Journals (Sweden)

    Akira Ikuta

    2014-01-01

    Full Text Available In a real sound environment system, a specific signal shows various types of probability distribution, and the observation data are usually contaminated by external noise (e.g., background noise) of a non-Gaussian distribution type. Furthermore, various nonlinear correlations potentially exist in addition to the linear correlation between input and output time series. Consequently, the system input-output relationship in the real phenomenon often cannot be represented by a simple model using only the linear correlation and lower-order statistics. In this study, complex sound environment systems that are difficult to analyze by the usual structural methods are considered. By introducing an estimation method for the system parameters reflecting the correlation information of the conditional probability distribution under existence of external noise, a method for predicting the output response probability of sound environment systems is theoretically proposed in a form suited to the additive property of the energy variable and to evaluation on the decibel scale. The effectiveness of the proposed stochastic signal processing method is experimentally confirmed by applying it to observed data from sound environment systems.
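
    The additive property of the energy variable mentioned above is the reason for working in the energy domain rather than directly in decibels; a minimal sketch of the conversions involved (reference energy and level values are illustrative):

        import numpy as np

        def db_to_energy(level_db, e0=1.0):
            # Invert L = 10 * log10(E / E0); energies add linearly, decibels do not.
            return e0 * 10.0 ** (level_db / 10.0)

        def energy_to_db(energy, e0=1.0):
            return 10.0 * np.log10(energy / e0)

        # Combining a 60 dB signal with 55 dB background noise:
        total = db_to_energy(60.0) + db_to_energy(55.0)
        print(f"{energy_to_db(total):.1f} dB")   # about 61.2 dB, not 115 dB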

  16. [Modeling developmental aspects of sensorimotor control of speech production].

    Science.gov (United States)

    Kröger, B J; Birkholz, P; Neuschaefer-Rube, C

    2007-05-01

    Detailed knowledge of the neurophysiology of speech acquisition is important for understanding the developmental aspects of speech perception and production and for understanding developmental disorders of speech perception and production. A computer-implemented neural model of sensorimotor control of speech production was developed. The model is capable of demonstrating in detail the neural functions of different cortical areas during speech production. (i) Two sensory and two motor maps or neural representations, and the appertaining neural mappings or projections, establish the sensorimotor feedback control system. These maps and mappings are already formed and trained during the prelinguistic phase of speech acquisition. (ii) The feedforward sensorimotor control system comprises the lexical map (representations of sounds, syllables, and words of the first language) and the mappings from the lexical to the sensory and motor maps. The training of the appertaining mappings forms the linguistic phase of speech acquisition. (iii) Three prelinguistic learning phases, i.e. silent mouthing, quasi-stationary vocalic articulation, and realisation of articulatory protogestures, can be defined on the basis of our simulation studies using the computational neural model. These learning phases can be associated with temporal phases of prelinguistic speech acquisition obtained from natural data. The neural model illuminates the detailed function of specific cortical areas during speech production. In particular, it can be shown that developmental disorders of speech production may result from a delayed or incorrect process within one of the prelinguistic learning phases defined by the neural model.

  17. Susceptibility to a multisensory speech illusion in older persons is driven by perceptual processes

    Directory of Open Access Journals (Sweden)

    Annalisa eSetti

    2013-09-01

    Full Text Available Recent studies suggest that multisensory integration is enhanced in older adults, but it is not known whether this enhancement is solely driven by perceptual processes or is also affected by cognitive processes. Using the 'McGurk illusion', in Experiment 1 we found that audio-visual integration of incongruent audio-visual words was higher in older adults than in younger adults, although the recognition of either audio- or visual-only presented words was the same across groups. In Experiment 2 we tested recall of sentences within which an incongruent audio-visual speech word was embedded. The overall semantic meaning of the sentence was compatible with either one of the unisensory components of the target word and/or with the illusory percept. Older participants recalled more illusory audio-visual words in sentences than younger adults; however, there was no differential effect of word compatibility on recall for the two groups. Our findings suggest that the relatively high susceptibility to the audio-visual speech illusion in older participants is due more to perceptual than cognitive processing.

  18. Speech-language therapists' process of including significant others in aphasia rehabilitation.

    Science.gov (United States)

    Hallé, Marie-Christine; Le Dorze, Guylaine; Mingant, Anne

    2014-11-01

    Although aphasia rehabilitation should include significant others, it is currently unknown how this recommendation is adopted in speech-language therapy practice. Speech-language therapists' (SLTs') experience of including significant others in aphasia rehabilitation is also understudied, yet a better understanding of clinical reality is necessary to facilitate implementation of best evidence pertaining to family interventions. To explore the process through which SLTs work with significant others of people with aphasia in rehabilitation settings, individual semi-structured interviews were conducted with eight SLTs who had been working with persons with aphasia in rehabilitation centres for at least 1 year. Grounded theory principles were applied in analysing the interview transcripts. A theoretical model was developed representing SLTs' process of working with significant others of persons with aphasia in rehabilitation. Including significant others was perceived as challenging, yet a bonus to their fundamental patient-centred approach. Basic interventions with significant others, when they were available, included information sharing. If necessary, significant others were referred to social workers or psychologists, or the participants collaborated with those professionals. Participants rarely, and only under specific conditions, provided significant others with language exercises or trained them to communicate better with the aphasic person. As a result, even if participants felt satisfied with their efforts to offer family and friends interventions, they also had unachieved ideals, such as having more frequent contacts with significant others. If SLTs perceived work with significant others as a feasible necessity, rather than as a challenging bonus, they could be more inclined to include family and friends within therapy with the aim of improving their communication with the person with aphasia. SLTs could also be more satisfied with their practice. In order to ...

  19. Functional lateralization of speech processing in adults and children who stutter

    Directory of Open Access Journals (Sweden)

    Yutaka eSato

    2011-04-01

    Full Text Available Developmental stuttering is a speech disorder of fluency characterized by repetitions, prolongations, and silent blocks, especially in the initial parts of utterances. Although their symptoms are motor related, people who stutter show abnormal patterns of cerebral hemispheric dominance in both anterior and posterior language areas. It is unknown whether the abnormal functional lateralization in the posterior language area starts during childhood or emerges as a consequence of many years of stuttering. In order to address this issue, we measured the lateralization of hemodynamic responses in the auditory cortex during auditory speech processing in adults and children who stutter, including preschoolers, with near-infrared spectroscopy (NIRS). We used the analysis-resynthesis technique to prepare two types of stimuli: (i) a phonemic contrast embedded in Japanese spoken words (/itta/ vs. /itte/) and (ii) a prosodic contrast (/itta/ vs. /itta?/). In the baseline blocks, only /itta/ tokens were presented. In phonemic contrast blocks, /itta/ and /itte/ tokens were presented pseudo-randomly, and /itta/ and /itta?/ tokens in prosodic contrast blocks. In adults and children who do not stutter, there was a clear left-hemispheric advantage for the phonemic contrast compared to the prosodic contrast. Adults and children who stutter, however, showed no significant difference between the two stimulus conditions. A subject-by-subject analysis revealed that not a single subject who stutters showed a left advantage in the phonemic contrast over the prosodic contrast condition. These results indicate that the functional lateralization for auditory speech processing is in disarray among those who stutter, even at preschool age. These results shed light on the neural pathophysiology of developmental stuttering.

  20. Application of Business Process Management to drive the deployment of a speech recognition system in a healthcare organization.

    Science.gov (United States)

    González Sánchez, María José; Framiñán Torres, José Manuel; Parra Calderón, Carlos Luis; Del Río Ortega, Juan Antonio; Vigil Martín, Eduardo; Nieto Cervera, Jaime

    2008-01-01

    We present a methodology based on Business Process Management to guide the development of a speech recognition system in a hospital in Spain. The methodology eases the deployment of the system by 1) involving the clinical staff in the process, 2) providing the IT professionals with a description of the process and its requirements, 3) assessing the advantages and disadvantages of the speech recognition system, as well as its impact on the organisation, and 4) helping to reorganise the healthcare process before implementing the new technology, in order to identify how it can best contribute to the overall objective of the organisation.

  1. Corticofugal modulation of initial neural processing of sound information from the ipsilateral ear in the mouse.

    Directory of Open Access Journals (Sweden)

    Xiuping Liu

    2010-11-01

    Full Text Available Cortical neurons implement a highly frequency-specific modulation of subcortical nuclei that includes the cochlear nucleus. Anatomical studies show that corticofugal fibers terminating in the auditory thalamus and midbrain are mostly ipsilateral. In contrast, corticofugal fibers terminating in the cochlear nucleus are bilateral, which fits the needs of binaural hearing, which improves hearing quality. This leads to our hypothesis that corticofugal modulation of the initial neural processing of sound information from the contralateral and ipsilateral ears could be equivalent or coordinated at the first sound processing level. With focal electrical stimulation of the auditory cortex and single-unit recording, this study examined corticofugal modulation of the ipsilateral cochlear nucleus. The same methods and procedures as described in our previous study of corticofugal modulation of the contralateral cochlear nucleus were employed for comparison. We found that focal electrical stimulation of cortical neurons induced substantial changes in the response magnitude, response latency, and receptive field of ipsilateral cochlear nucleus neurons. Cortical stimulation facilitated the auditory response and shortened the response latency of physiologically matched neurons, whereas it inhibited the auditory response and lengthened the response latency of unmatched neurons. Finally, cortical stimulation shifted the best frequencies of cochlear neurons towards those of the stimulated cortical neurons. Our data suggest that cortical neurons enable a highly frequency-specific remodelling of sound information processing in the ipsilateral cochlear nucleus in the same manner as in the contralateral cochlear nucleus.

  2. Sound stream segregation: a neuromorphic approach to solve the "cocktail party problem" in real-time.

    Science.gov (United States)

    Thakur, Chetan Singh; Wang, Runchun M; Afshar, Saeed; Hamilton, Tara J; Tapson, Jonathan C; Shamma, Shihab A; van Schaik, André

    2015-01-01

    The human auditory system has the ability to segregate complex auditory scenes into a foreground component and a background, allowing us to listen to specific speech sounds in a mixture of sounds. Selective attention plays a crucial role in this process, colloquially known as the "cocktail party effect." It has not been possible to build a machine that can emulate this human ability in real time. Here, we have developed a framework for the implementation of a neuromorphic sound segregation algorithm on a Field Programmable Gate Array (FPGA). This algorithm is based on the principles of temporal coherence and uses an attention signal to separate a target sound stream from background noise. Temporal coherence implies that auditory features belonging to the same sound source are coherently modulated and evoke highly correlated neural response patterns. The basis for this form of sound segregation is that responses from pairs of channels that are strongly positively correlated belong to the same stream, while channels that are uncorrelated or anti-correlated belong to different streams. In our framework, we have used a neuromorphic cochlea as a front-end sound analyser to extract spectral information from the sound input, which then passes through band-pass filters that extract the sound envelope at various modulation rates. Further stages include feature extraction and mask generation, which is finally used to reconstruct the targeted sound. Using sample tonal and speech mixtures, we show that our FPGA architecture is able to segregate sound sources in real time. The accuracy of segregation is indicated by the high signal-to-noise ratio (SNR) of the segregated stream (90, 77, and 55 dB for a simple tone, a complex tone, and speech, respectively) as compared to the SNR of the mixture waveform (0 dB). This system may easily be extended to the segregation of complex speech signals, and may thus find various applications in electronic devices, such as sound segregation and ...
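
    The correlation principle behind the temporal-coherence stage can be illustrated with a toy sketch (NumPy, not the FPGA implementation; the envelopes, channel count, and 0.5 threshold are invented for illustration):

        import numpy as np

        rng = np.random.default_rng(0)
        t = np.linspace(0, 1, 2000)
        target_env = 0.5 * (1 + np.sin(2 * np.pi * 4 * t))        # 4 Hz modulation
        masker_env = 0.5 * (1 + np.sin(2 * np.pi * 7 * t + 1.0))  # 7 Hz modulation

        # Hypothetical envelopes of four cochlear channels: two carry the target
        # source, two carry the masker (plus a little noise).
        channels = np.stack([
            target_env + 0.05 * rng.standard_normal(t.size),
            target_env + 0.05 * rng.standard_normal(t.size),
            masker_env + 0.05 * rng.standard_normal(t.size),
            masker_env + 0.05 * rng.standard_normal(t.size),
        ])

        attended = 0   # the attention signal selects one channel of the target
        corr = np.array([np.corrcoef(channels[attended], ch)[0, 1] for ch in channels])
        mask = corr > 0.5   # coherent channels are grouped into the target stream
        print(corr.round(2), mask)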

  3. Techniques and applications for binaural sound manipulation in human-machine interfaces

    Science.gov (United States)

    Begault, Durand R.; Wenzel, Elizabeth M.

    1992-01-01

    The application of binaural sound to speech and auditory sound cues (auditory icons) is addressed from both an applications and a technical standpoint. Techniques overviewed include processing by means of filtering with head-related transfer functions. Application to advanced cockpit human interface systems is discussed, although the techniques are extendable to any human-machine interface. Research issues pertaining to three-dimensional sound displays under investigation at the Aerospace Human Factors Division at NASA Ames Research Center are described.
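
    A minimal sketch of the HRTF-filtering technique mentioned above: a mono signal convolved with a left/right pair of head-related impulse responses yields a two-channel binaural signal. The impulse responses below are invented placeholders; a real system would use measured HRTFs.

        import numpy as np

        fs = 44100
        mono = np.sin(2 * np.pi * 440 * np.arange(fs) / fs)   # 1 s test signal

        # Toy HRIRs for a source off to the left: the right ear receives a
        # delayed, attenuated version of what the left ear receives.
        hrir_left = np.array([0.9, 0.3, 0.1])
        hrir_right = np.array([0.0, 0.0, 0.5, 0.2, 0.1])

        binaural = np.stack([np.convolve(mono, hrir_left)[:fs],
                             np.convolve(mono, hrir_right)[:fs]], axis=1)
        print(binaural.shape)   # (44100, 2); interaural time/level cues encode direction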

  4. Auditory Perception, Suprasegmental Speech Processing, and Vocabulary Development in Chinese Preschoolers.

    Science.gov (United States)

    Wang, Hsiao-Lan S; Chen, I-Chen; Chiang, Chun-Han; Lai, Ying-Hui; Tsao, Yu

    2016-10-01

    The current study examined the associations between basic auditory perception, speech prosodic processing, and vocabulary development in Chinese kindergartners; specifically, whether early basic auditory perception may be related to linguistic prosodic processing in Mandarin Chinese vocabulary acquisition. A series of language, auditory, and linguistic prosodic tests were given to 100 preschool children who had not yet learned how to read Chinese characters. The results suggested that lexical tone sensitivity and intonation production were significantly correlated with children's general vocabulary abilities. In particular, tone awareness was associated with comprehensive language development, whereas intonation production was associated with both comprehensive and expressive language development. Regression analyses revealed that tone sensitivity accounted for 36% of the unique variance in vocabulary development, whereas intonation production accounted for 6% of the variance in vocabulary development. Moreover, auditory frequency discrimination was significantly correlated with lexical tone sensitivity, syllable duration discrimination, and intonation production in Mandarin Chinese, and it contributed significantly to tone sensitivity and intonation production. Auditory frequency discrimination may indirectly affect early vocabulary development through Chinese speech prosody. © The Author(s) 2016.
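
    The unique-variance figures above come from hierarchical regression: the unique contribution of a predictor is the drop in R² when it is removed from the full model. A minimal sketch with simulated, hypothetical scores (not the study's data):

        import numpy as np
        from sklearn.linear_model import LinearRegression

        rng = np.random.default_rng(0)
        n = 100
        tone = rng.standard_normal(n)          # hypothetical tone-sensitivity scores
        intonation = rng.standard_normal(n)    # hypothetical intonation scores
        vocab = 0.7 * tone + 0.3 * intonation + rng.standard_normal(n)

        def r2(X, y):
            return LinearRegression().fit(X, y).score(X, y)

        full = r2(np.column_stack([tone, intonation]), vocab)
        reduced = r2(intonation.reshape(-1, 1), vocab)
        print(f"unique variance explained by tone sensitivity: {full - reduced:.2f}")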

  5. Effects of noise and audiovisual cues on speech processing in adults with and without ADHD.

    Science.gov (United States)

    Michalek, Anne M P; Watson, Silvana M; Ash, Ivan; Ringleb, Stacie; Raymer, Anastasia

    2014-03-01

    This study examined the interplay among internal (e.g., attention, working memory abilities) and external (e.g., background noise, visual information) factors in individuals with and without ADHD. A 2 × 2 × 6 mixed design with correlational analyses was used to compare participant results on a standardized listening-in-noise sentence repetition task (QuickSIN; Killion et al., 2004), presented in an auditory and an audiovisual condition as the signal-to-noise ratio (SNR) varied from 25 to 0 dB, and to determine individual differences in working memory capacity and short-term recall. Participants were thirty-eight young adults without ADHD and twenty-five young adults with ADHD. Diagnosis, modality, and signal-to-noise ratio all affected the ability to process speech in noise. The interaction between the diagnosis of ADHD, the presence of visual cues, and the level of noise had an effect on a person's ability to process speech in noise. Conclusion: Young adults with ADHD benefited less from visual information during noise than young adults without ADHD, an effect influenced by working memory abilities.

  6. Temporal modulations in speech and music.

    Science.gov (United States)

    Ding, Nai; Patel, Aniruddh D; Chen, Lin; Butler, Henry; Luo, Cheng; Poeppel, David

    2017-10-01

    Speech and music have structured rhythms. Here we discuss a major acoustic correlate of spoken and musical rhythms, the slow (0.25-32 Hz) temporal modulations in sound intensity, and compare the modulation properties of speech and music. We analyze these modulations using over 25 h of speech and over 39 h of recordings of Western music. We show that the speech modulation spectrum is highly consistent across 9 languages (including languages with typologically different rhythmic characteristics). A different, but similarly consistent, modulation spectrum is observed for music, including classical music played by single instruments of different types, symphonic, jazz, and rock. The temporal modulations of speech and music show broad but well-separated peaks around 5 and 2 Hz, respectively. These acoustically dominant time scales may be intrinsic features of speech and music, a possibility which should be investigated using more culturally diverse samples in each domain. Distinct modulation timescales for speech and music could facilitate their perceptual analysis and neural processing. Copyright © 2017 Elsevier Ltd. All rights reserved.
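
    The modulation spectrum discussed above is obtained by extracting the temporal envelope of the signal and examining its spectrum in the 0.25-32 Hz range; a minimal sketch, where the synthetic test signal stands in for a real speech or music recording:

        import numpy as np
        from scipy.signal import hilbert

        fs = 16000
        t = np.arange(4 * fs) / fs
        # Stand-in for a recording: a 1 kHz carrier modulated at about 5 Hz.
        x = (1 + 0.8 * np.sin(2 * np.pi * 5 * t)) * np.sin(2 * np.pi * 1000 * t)

        env = np.abs(hilbert(x))                 # temporal envelope
        env -= env.mean()
        spec = np.abs(np.fft.rfft(env))
        freqs = np.fft.rfftfreq(env.size, d=1 / fs)
        band = (freqs >= 0.25) & (freqs <= 32)   # the range discussed above
        print(f"dominant modulation rate: {freqs[band][np.argmax(spec[band])]:.2f} Hz")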

  7. Emotional prosody of task-irrelevant speech interferes with the retention of serial order.

    Science.gov (United States)

    Kattner, Florian; Ellermeier, Wolfgang

    2018-04-09

    Task-irrelevant speech and other temporally changing sounds are known to interfere with the short-term memorization of ordered verbal materials, as compared to silence or stationary sounds. It has been argued that this disruption of short-term memory (STM) may be due to (a) interference of automatically encoded acoustical fluctuations with the process of serial rehearsal or (b) attentional capture by salient task-irrelevant information. To disentangle the contributions of these two processes, the authors investigated whether the disruption of serial recall is due to the semantic or acoustical properties of task-irrelevant speech (Experiment 1). They found that performance was affected by the prosody (emotional intonation), but not by the semantics (word meaning), of irrelevant speech, suggesting that the disruption of serial recall is due to interference from precategorically encoded changing-state sound (with the higher fluctuation strength of emotionally intonated speech). The authors further demonstrated a functional distinction between this form of distraction and attentional capture by contrasting the effect of (a) speech prosody and (b) sudden prosody deviations on both serial and nonserial STM tasks (Experiment 2). Although serial recall was again sensitive to the emotional prosody of irrelevant speech, performance on a nonserial missing-item task was unaffected by the presence of neutral or emotionally intonated speech sounds. In contrast, sudden prosody changes tended to impair performance on both tasks, suggesting an independent effect of attentional capture. (PsycINFO Database Record (c) 2018 APA, all rights reserved).

  8. Electrophysiological evidence for a defect in the processing of temporal sound patterns in multiple sclerosis.

    Science.gov (United States)

    Jones, S J; Sprague, L; Vaz Pato, M

    2002-11-01

    To assess the processing of spectrotemporal sound patterns in multiple sclerosis by using auditory evoked potentials (AEPs) to complex harmonic tones, 22 patients with definite multiple sclerosis but mild disability and no auditory complaints were compared with 15 normal controls. Short-latency AEPs were recorded using standard methods. Long-latency AEPs were recorded to synthesised musical instrument tones, at onset every two seconds, at abrupt frequency changes every two seconds, and at the end of a two-second period of 16/s frequency changes. The subjects were inattentive but awake, reading irrelevant material. Short-latency AEPs were abnormal in only 4 of 22 patients, whereas long-latency AEPs were abnormal to one or more stimuli in 17 of 22. No significant latency prolongation was seen in the responses to onset and infrequent frequency changes (P1, N1, P2), but the potentials at the end of 16/s frequency modulations, particularly the P2 peaking approximately 200 ms after the next expected change, were significantly delayed. The delayed responses appear to reflect a mild disorder in the processing of change in temporal sound patterns. The delay may be conceived of as extra time taken to compare the incoming sound with the contents of a temporally ordered sensory memory store (the long auditory store or echoic memory), which generates a response when the next expected frequency change fails to occur. The defect cannot be ascribed to lesions of the afferent pathways and so may be due to disseminated brain lesions, visible or invisible on magnetic resonance imaging.

  10. A Signal Processing Module for the Analysis of Heart Sounds and Heart Murmurs

    Energy Technology Data Exchange (ETDEWEB)

    Javed, Faizan; Venkatachalam, P A; H, Ahmad Fadzil M [Signal and Imaging Processing and Tele-Medicine Technology Research Group, Department of Electrical and Electronics Engineering, Universiti Teknologi PETRONAS, 31750 Tronoh, Perak (Malaysia)

    2006-04-01

    In this paper a Signal Processing Module (SPM) for the computer-aided analysis of heart sounds is developed. The module reveals important information about cardiovascular disorders and can assist general physicians in reaching more accurate and reliable diagnoses at early stages. It can compensate for the shortage of expert doctors in rural as well as urban clinics and hospitals. The module has five main blocks: Data Acquisition and Pre-processing, Segmentation, Feature Extraction, Murmur Detection, and Murmur Classification. The heart sounds are first acquired using an electronic stethoscope, which has the capability of transferring these signals to a nearby workstation over a wireless medium. The signals are then segmented into individual cycles as well as individual components using spectral analysis of the heart sound, without using any reference signal such as the ECG. Features are then extracted from the individual components using the spectrogram and are used as input to an MLP (Multi-Layer Perceptron) neural network that is trained to detect the presence of heart murmurs. Once a murmur is detected, it is classified into one of seven classes depending on its timing within the cardiac cycle, using the smoothed pseudo Wigner-Ville distribution. The module has been tested with real heart sounds from 40 patients and has proved to be quite efficient and robust in dealing with a large variety of pathological conditions.
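
    The spectrogram-features-into-MLP stage of such a module can be sketched as follows; the data and feature choices are placeholders, not the module's actual segmentation, features, or network:

        import numpy as np
        from scipy.signal import spectrogram
        from sklearn.neural_network import MLPClassifier

        fs = 4000
        rng = np.random.default_rng(0)

        def features(cycle):
            # Time-averaged spectrogram magnitudes as a simple per-cycle feature vector.
            _, _, S = spectrogram(cycle, fs=fs, nperseg=256)
            return S.mean(axis=1)

        # Placeholder data: 40 one-second "cycles" with random murmur labels; in the
        # described module, cycles come from segmentation and labels from clinicians.
        X = np.stack([features(rng.standard_normal(fs)) for _ in range(40)])
        y = rng.integers(0, 2, size=40)

        clf = MLPClassifier(hidden_layer_sizes=(32,), max_iter=300).fit(X, y)
        print(clf.predict(X[:5]))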

  11. Uranium-series radionuclides as tracers of geochemical processes in Long Island Sound

    International Nuclear Information System (INIS)

    Benninger, L.K.

    1976-05-01

    An estuary can be visualized as a membrane between land and the deep ocean, and an understanding of the estuarine processes which determine the permeability of this membrane to terrigenous materials is necessary for estimating the fluxes of these materials to the oceans. Natural radionuclides are useful probes of estuarine geochemistry because of the time-dependent relationships among them and because, as analogs of stable elements, they are much less subject to contamination during sampling and analysis. In this study the flux of heavy metals through Long Island Sound is considered in light of the material balance for excess ²¹⁰Pb, and analyses of concurrent seston and water samples from central Long Island Sound are used to probe the internal workings of the estuary.

  12. Improving Robustness against Environmental Sounds for Directing Attention of Social Robots

    DEFF Research Database (Denmark)

    Thomsen, Nicolai Bæk; Tan, Zheng-Hua; Lindberg, Børge

    2015-01-01

    This paper presents a multi-modal system for determining where to direct the attention of a social robot in a dialog scenario, which is robust against environmental sounds (door slamming, phone ringing, etc.) and short speech segments. The method is based on combining voice activity detection (VAD) and sound source localization (SSL), and furthermore applies post-processing to the SSL output to filter out short sounds. The system is tested against a baseline system in four different real-world experiments, where different sounds are used as interfering sounds. The results are promising and show a clear improvement.
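
    The post-processing idea (suppressing short, non-speech events before directing attention) can be sketched as a simple run-length filter; vad() and ssl() are assumed components, and the threshold is an invented example value:

        # vad(frame) -> bool and ssl(frame) -> azimuth are assumed to exist;
        # only the post-processing that rejects short events is sketched here.
        MIN_SPEECH_FRAMES = 30   # e.g. 300 ms at a 10 ms frame rate (assumed)

        def attention_azimuth(frames, vad, ssl):
            run, angles = 0, []
            for frame in frames:
                if vad(frame):
                    run += 1
                    angles.append(ssl(frame))
                    if run >= MIN_SPEECH_FRAMES:
                        # Sustained speech: return the median estimate, which is
                        # robust to occasional localization outliers.
                        return sorted(angles)[len(angles) // 2]
                else:
                    run, angles = 0, []   # short bursts never reach the threshold
            return None                   # no sufficiently long speech segment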

  13. Amusia Results in Abnormal Brain Activity following Inappropriate Intonation during Speech Comprehension

    OpenAIRE

    Jiang, Cunmei; Hamm, Jeff P.; Lim, Vanessa K.; Kirk, Ian J.; Chen, Xuhai; Yang, Yufang

    2012-01-01

    Pitch processing is a critical ability on which humans' tonal musical experience depends, and which is also of paramount importance for decoding prosody in speech. Congenital amusia refers to deficits in the ability to properly process musical pitch, and recent evidence has suggested that this musical pitch disorder may impact upon the processing of speech sounds. Here we present the first electrophysiological evidence demonstrating that individuals with amusia who speak Mandarin Chinese are ...

  14. Vocal Noise Cancellation From Respiratory Sounds

    National Research Council Canada - National Science Library

    Moussavi, Zahra

    2001-01-01

    Although background noise cancellation for speech or electrocardiographic recordings is well established, when the background noise contains vocal noises and the main signal is a breath sound ...

  15. Speech Signal and Facial Image Processing for Obstructive Sleep Apnea Assessment

    Directory of Open Access Journals (Sweden)

    Fernando Espinoza-Cuadros

    2015-01-01

    Full Text Available Obstructive sleep apnea (OSA) is a common sleep disorder characterized by recurring breathing pauses during sleep caused by a blockage of the upper airway (UA). OSA is generally diagnosed through a costly procedure requiring an overnight stay of the patient at the hospital. This has led to proposals for less costly procedures based on the analysis of patients' facial images and voice recordings to help in OSA detection and severity assessment. In this paper we investigate the use of both image and speech processing to estimate the apnea-hypopnea index, AHI (which describes the severity of the condition), over a population of 285 male Spanish subjects suspected to suffer from OSA and referred to a Sleep Disorders Unit. Photographs and voice recordings were collected in a supervised but not highly controlled way, trying to test a scenario close to an OSA assessment application running on a mobile device (i.e., smartphones or tablets). Spectral information in speech utterances is modeled by a state-of-the-art low-dimensional acoustic representation, called an i-vector. A set of local craniofacial features related to OSA are extracted from images after detecting facial landmarks using Active Appearance Models (AAMs). Support vector regression (SVR) is applied to the facial features and i-vectors to estimate the AHI.
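
    The fusion-and-regression step can be sketched as follows; the random placeholder features and labels, the dimensionalities, and the SVR hyperparameters are assumptions, not the paper's settings:

        import numpy as np
        from sklearn.pipeline import make_pipeline
        from sklearn.preprocessing import StandardScaler
        from sklearn.svm import SVR

        rng = np.random.default_rng(0)
        n = 285
        ivectors = rng.standard_normal((n, 400))   # speech i-vectors (dimension assumed)
        facial = rng.standard_normal((n, 20))      # craniofacial features (assumed)
        ahi = rng.uniform(0, 60, size=n)           # placeholder AHI labels

        X = np.hstack([ivectors, facial])          # fuse speech and image features
        model = make_pipeline(StandardScaler(), SVR(kernel="rbf", C=10.0))
        model.fit(X, ahi)
        print(model.predict(X[:3]))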

  16. How Hearing Impairment Affects Sentence Comprehension: Using Eye Fixations to Investigate the Duration of Speech Processing

    DEFF Research Database (Denmark)

    Wendt, Dorothea; Kollmeier, Birger; Brand, Thomas

    2015-01-01

    ... this measure uses eye fixations recorded while the participant listens to a sentence. Eye fixations toward a target picture (which matches the aurally presented sentence) were measured in the presence of a competitor picture. Based on the recorded eye fixations, the single target detection amplitude, which reflects the tendency of the participant to fixate the target picture, was used as a metric to estimate the duration of sentence processing. The single target detection amplitude was calculated for sentence structures with different levels of linguistic complexity and for different listening conditions: in quiet and in two different noise conditions. Participants with hearing impairment spent more time processing sentences, even at high levels of speech intelligibility. In addition, the relationship between the proposed online measure and listener-specific factors, such as hearing aid use and cognitive ...

  17. PET studies of phonetic processing of speech: review, replication, and reanalysis

    DEFF Research Database (Denmark)

    Zatorre, R J; Meyer, E; Gjedde, A

    1996-01-01

    Positron emission tomography was used to investigate cerebral blood flow (CBF) changes associated with the processing of speech. In a first experiment, normal right-handed volunteers were scanned under two conditions that required phonetic processing (discrimination of final consonants and phoneme monitoring), and one baseline condition of passive listening. Analysis was carried out by paired-image subtraction, with MRI overlay for anatomical localization. Comparison of each phonetic condition with the baseline condition revealed increased CBF in the left frontal lobe, close to the border between Broca's area and the motor cortex, and in a left parietal region. A second experiment showed that this area was not activated by a semantic judgement task. Reanalysis of data from an earlier study, in which various baseline conditions were used, confirmed that this region of left frontal cortex...

  18. Extensions to the Speech Disorders Classification System (SDCS)

    Science.gov (United States)

    Shriberg, Lawrence D.; Fourakis, Marios; Hall, Sheryl D.; Karlsson, Heather B.; Lohmeier, Heather L.; McSweeny, Jane L.; Potter, Nancy L.; Scheer-Cohen, Alison R.; Strand, Edythe A.; Tilkens, Christie M.; Wilson, David L.

    2010-01-01

    This report describes three extensions to a classification system for paediatric speech sound disorders termed the Speech Disorders Classification System (SDCS). Part I describes a classification extension to the SDCS to differentiate motor speech disorders from speech delay and to differentiate among three sub-types of motor speech disorders.…

  19. The influence of phonetic dimensions on aphasic speech perception

    NARCIS (Netherlands)

    de Kok, D.A.; Jonkers, R.; Bastiaanse, Y.R.M.

    2010-01-01

    Individuals with aphasia have more problems detecting small differences between speech sounds than larger ones. This paper reports how phonemic processing is impaired and how this is influenced by speechreading. A non-word discrimination task was carried out with 'audiovisual', 'auditory only' and

  20. Monaural and binaural benefit from early reflections for speech intelligibility

    DEFF Research Database (Denmark)

    Arweiler, Iris; Buchholz, Jörg; Dau, Torsten

    2010-01-01

    The auditory system takes advantage of early reflections (ER's) in a room by integrating them with the direct sound (DS) and thereby increasing the effective speech level. The energy, spectral content and direction of the ER's are dependent on wall absorptions and room geometries and therefore differ ... to investigate if ER processing is a monaural or binaural effect.

  1. The pathways for intelligible speech: multivariate and univariate perspectives.

    Science.gov (United States)

    Evans, S; Kyong, J S; Rosen, S; Golestani, N; Warren, J E; McGettigan, C; Mourão-Miranda, J; Wise, R J S; Scott, S K

    2014-09-01

    An anterior pathway, concerned with extracting meaning from sound, has been identified in nonhuman primates. An analogous pathway has been suggested in humans, but controversy exists concerning the degree of lateralization and the precise location where responses to intelligible speech emerge. We have demonstrated that the left anterior superior temporal sulcus (STS) responds preferentially to intelligible speech (Scott SK, Blank CC, Rosen S, Wise RJS. 2000. Identification of a pathway for intelligible speech in the left temporal lobe. Brain. 123:2400-2406.). A functional magnetic resonance imaging study in Cerebral Cortex used equivalent stimuli and univariate and multivariate analyses to argue for the greater importance of the bilateral posterior STS when compared with the left anterior STS in responding to intelligible speech (Okada K, Rong F, Venezia J, Matchin W, Hsieh IH, Saberi K, Serences JT, Hickok G. 2010. Hierarchical organization of human auditory cortex: evidence from acoustic invariance in the response to intelligible speech. Cereb Cortex. 20:2486-2495.). Here, we also replicate our original study, demonstrating that the left anterior STS exhibits the strongest univariate response and, in decoding using the bilateral temporal cortex, contains the most informative voxels showing an increased response to intelligible speech. In contrast, in classifications using local "searchlights" and a whole-brain analysis, we find greater classification accuracy in posterior rather than anterior temporal regions. Thus, we show that the precise nature of the multivariate analysis used will emphasize different response profiles associated with complex sound-to-speech processing. © The Author 2013. Published by Oxford University Press.
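
    To make the univariate/multivariate contrast concrete, a minimal searchlight classification loop is sketched below on synthetic data. It is a generic illustration of the technique, not the analysis pipeline of either study; the grid size, radius and classifier are arbitrary choices.

        import numpy as np
        from sklearn.model_selection import cross_val_score
        from sklearn.svm import SVC

        rng = np.random.default_rng(0)
        n_trials, nx, ny, nz = 60, 8, 8, 8
        data = rng.normal(size=(n_trials, nx, ny, nz))  # trials x voxel grid
        labels = rng.integers(0, 2, n_trials)           # intelligible vs. unintelligible

        radius = 1  # half-width of the cubic "searchlight" neighbourhood, in voxels
        accuracy = np.zeros((nx, ny, nz))
        for x in range(nx):
            for y in range(ny):
                for z in range(nz):
                    sl = data[:,
                              max(x - radius, 0):x + radius + 1,
                              max(y - radius, 0):y + radius + 1,
                              max(z - radius, 0):z + radius + 1]
                    feats = sl.reshape(n_trials, -1)
                    # Decode the condition from the local multivoxel pattern.
                    accuracy[x, y, z] = cross_val_score(
                        SVC(kernel="linear"), feats, labels, cv=5).mean()

        print(accuracy.max())  # chance level is 0.5 for this random data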

  2. The influence of spectral and spatial characteristics of early reflections on speech intelligibility

    DEFF Research Database (Denmark)

    Arweiler, Iris; Buchholz, Jörg; Dau, Torsten

    The auditory system employs different strategies to facilitate speech intelligibility in complex listening conditions. One of them is the integration of early reflections (ER’s) with the direct sound (DS) to increase the effective speech level. So far the underlying mechanisms of ER processing have...... of listeners that speech intelligibility improved with added ER energy, but less than with added DS energy. An efficiency factor was introduced to quantify this effect. The difference in speech intelligibility could be mainly ascribed to the differences in the spectrum between the speech signals....... binaural). The direction-dependency could be explained by the spectral changes introduced by the pinna, head, and torso. The results will be important with regard to the influence of signal processing strategies in modern hearing aids on speech intelligibility, because they might alter the spectral...

  3. Perceptual chunking and its effect on memory in speech processing: ERP and behavioral evidence.

    Science.gov (United States)

    Gilbert, Annie C; Boucher, Victor J; Jemel, Boutheina

    2014-01-01

    We examined how perceptual chunks of varying size in utterances can influence immediate memory of heard items (monosyllabic words). Using behavioral measures and event-related potentials (N400) we evaluated the quality of the memory trace for targets taken from perceived temporal groups (TGs) of three and four items. Variations in the amplitude of the N400 showed a better memory trace for items presented in TGs of three compared to those in groups of four. Analyses of behavioral responses along with P300 components also revealed effects of chunk position in the utterance. This is the first study to measure the online effects of perceptual chunks on the memory trace of spoken items. Taken together, the N400 and P300 responses demonstrate that the perceptual chunking of speech facilitates information buffering and processing on a chunk-by-chunk basis.

  4. Within-subjects comparison of the HiRes and Fidelity120 speech processing strategies: speech perception and its relation to place-pitch sensitivity.

    Science.gov (United States)

    Donaldson, Gail S; Dawson, Patricia K; Borden, Lamar Z

    2011-01-01

    Previous studies have confirmed that current steering can increase the number of discriminable pitches available to many cochlear implant (CI) users; however, the ability to perceive additional pitches has not been linked to improved speech perception. The primary goals of this study were to determine (1) whether adult CI users can achieve higher levels of spectral cue transmission with a speech processing strategy that implements current steering (Fidelity120) than with a predecessor strategy (HiRes) and, if so, (2) whether the magnitude of improvement can be predicted from individual differences in place-pitch sensitivity. A secondary goal was to determine whether Fidelity120 supports higher levels of speech recognition in noise than HiRes. A within-subjects repeated measures design evaluated speech perception performance with Fidelity120 relative to HiRes in 10 adult CI users. Subjects used the novel strategy (either HiRes or Fidelity120) for 8 wks during the main study; a subset of five subjects used Fidelity120 for three additional months after the main study. Speech perception was assessed for the spectral cues related to vowel F1 frequency, vowel F2 frequency, and consonant place of articulation; overall transmitted information for vowels and consonants; and sentence recognition in noise. Place-pitch sensitivity was measured for electrode pairs in the apical, middle, and basal regions of the implanted array using a psychophysical pitch-ranking task. With one exception, there was no effect of strategy (HiRes versus Fidelity120) on the speech measures tested, either during the main study (N = 10) or after extended use of Fidelity120 (N = 5). The exception was a small but significant advantage for HiRes over Fidelity120 for consonant perception during the main study. Examination of individual subjects' data revealed that 3 of 10 subjects demonstrated improved perception of one or more spectral cues with Fidelity120 relative to HiRes after 8 wks or longer

  5. Methods and Application of Phonetic Label Alignment in Speech Processing Tasks

    Directory of Open Access Journals (Sweden)

    M. Myslivec

    2000-12-01

    Full Text Available The paper deals with the problem of automatic phonetic segmentation of speech signals, namely for speech analysis and recognition purposes. Several methods and approaches are described and evaluated from the point of view of their accuracy. A complete instruction for creating an annotated database for training a Czech speech recognition system is provided together with the authors' own experience. The results of the work have found practical applications, for example, in developing a tool for semi-automatic speech segmentation, building a large-vocabulary phoneme-based speech recognition system and designing an aid for learning and practicing pronunciation of words or phrases in the native or a foreign language.
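
    The idea behind alignment-based segmentation can be illustrated with a toy dynamic-time-warping routine: frames of an utterance are aligned against a reference sequence whose phone labels are known, and label boundaries are read off the warping path. This is a generic stand-in on synthetic data, not the HMM-based procedure the paper evaluates.

        import numpy as np

        def dtw_path(a, b):
            """Optimal monotonic alignment between frame sequences a and b."""
            n, m = len(a), len(b)
            cost = np.full((n + 1, m + 1), np.inf)
            cost[0, 0] = 0.0
            for i in range(1, n + 1):
                for j in range(1, m + 1):
                    d = np.linalg.norm(a[i - 1] - b[j - 1])
                    cost[i, j] = d + min(cost[i - 1, j], cost[i, j - 1],
                                         cost[i - 1, j - 1])
            # Backtrack from the end to recover the alignment path.
            path, i, j = [(n - 1, m - 1)], n, m
            while (i, j) != (1, 1):
                steps = [(i - 1, j - 1), (i - 1, j), (i, j - 1)]
                i, j = min((s for s in steps if s[0] >= 1 and s[1] >= 1),
                           key=lambda s: cost[s])
                path.append((i - 1, j - 1))
            return path[::-1]

        rng = np.random.default_rng(0)
        ref = rng.normal(size=(20, 13))   # reference frames with known labels
        utt = np.repeat(ref, 2, axis=0)   # same content, spoken twice as slowly
        print(dtw_path(utt, ref)[:5])     # each ref frame maps to ~2 utterance frames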

  6. Environmental Contamination of Normal Speech.

    Science.gov (United States)

    Harley, Trevor A.

    1990-01-01

    Environmentally contaminated speech errors (irrelevant words or phrases derived from the speaker's environment and erroneously incorporated into speech) are hypothesized to occur at a high level of speech processing, but with a relatively late insertion point. The data indicate that speech production processes are not independent of other…

  7. Effects of irrelevant speech and traffic noise on speech perception and cognitive performance in elementary school children.

    Science.gov (United States)

    Klatte, Maria; Meis, Markus; Sukowski, Helga; Schick, August

    2007-01-01

    The effects of background noise of moderate intensity on short-term storage and processing of verbal information were analyzed in 6- to 8-year-old children. In line with adult studies on the "irrelevant sound effect" (ISE), serial recall of visually presented digits was severely disrupted by background speech that the children did not understand. Train noises of equal intensity, however, had no effect. Similar results were demonstrated with tasks requiring storage and processing of heard information. Memory for nonwords, execution of oral instructions and categorizing speech sounds were significantly disrupted by irrelevant speech. The affected functions play a fundamental role in the acquisition of spoken and written language. Implications concerning current models of the ISE and the acoustic conditions in schools and kindergartens are discussed.

  8. The role of the speech-language pathologist in identifying and treating children with auditory processing disorder.

    Science.gov (United States)

    Richard, Gail J

    2011-07-01

    A summary of issues regarding auditory processing disorder (APD) is presented, including some of the remaining questions and challenges raised by the articles included in the clinical forum. Evolution of APD as a diagnostic entity within audiology and speech-language pathology is reviewed. A summary of treatment efficacy results and issues is provided, as well as the continuing dilemma for speech-language pathologists (SLPs) charged with providing treatment for referred APD clients. The role of the SLP in diagnosing and treating APD remains under discussion, despite lack of efficacy data supporting auditory intervention and questions regarding the clinical relevance and validity of APD.

  9. Transport processes and sound velocity in vibrationally non-equilibrium gas of anharmonic oscillators

    Science.gov (United States)

    Rydalevskaya, Maria A.; Voroshilova, Yulia N.

    2018-05-01

    Vibrationally non-equilibrium flows of chemically homogeneous diatomic gases are considered under conditions in which the distribution of the molecules over vibrational levels differs significantly from the Boltzmann distribution. In such flows, molecular collisions can be divided into two groups: the first group corresponds to "rapid" microscopic processes, whereas the second corresponds to "slow" microscopic processes, whose characteristic times are comparable to or larger than the timescale on which the gasdynamic parameters vary. The collisions of the first group form quasi-stationary vibrationally non-equilibrium distribution functions. Model kinetic equations are used to study the transport processes under these conditions. In these equations, the BGK-type approximation is used to model only the collision operators of the first group. This simplifies the derivation of the transport fluxes and the calculation of the kinetic coefficients. Special attention is given to the connection between the formulae for the bulk viscosity coefficient and the square of the sound velocity.
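
    For orientation only, a standard textbook relation (not the paper's derived formulae) locates the effect: the sound velocity square depends on which internal modes can follow the pressure oscillation, so vibrational non-equilibrium shifts it between the equilibrium and frozen limits. For an ideal diatomic gas,

        c^2 = \left( \frac{\partial p}{\partial \rho} \right)_s = \frac{\gamma k T}{m},
        \qquad
        \gamma_{\mathrm{frozen}} = \tfrac{7}{5} > \gamma_{\mathrm{eq}}
        \quad \Rightarrow \quad
        c_{\mathrm{frozen}} > c_{\mathrm{eq}},

    since the frozen (vibration-excluded) ratio of specific heats is 7/5 and vibrational participation lowers it.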

  10. Cortical activity patterns predict robust speech discrimination ability in noise

    Science.gov (United States)

    Shetake, Jai A.; Wolf, Jordan T.; Cheung, Ryan J.; Engineer, Crystal T.; Ram, Satyananda K.; Kilgard, Michael P.

    2012-01-01

    The neural mechanisms that support speech discrimination in noisy conditions are poorly understood. In quiet conditions, spike timing information appears to be used in the discrimination of speech sounds. In this study, we evaluated the hypothesis that spike timing is also used to distinguish between speech sounds in noisy conditions that significantly degrade neural responses to speech sounds. We tested speech sound discrimination in rats and recorded primary auditory cortex (A1) responses to speech sounds in background noise of different intensities and spectral compositions. Our behavioral results indicate that rats, like humans, are able to accurately discriminate consonant sounds even in the presence of background noise that is as loud as the speech signal. Our neural recordings confirm that speech sounds evoke degraded but detectable responses in noise. Finally, we developed a novel neural classifier that mimics behavioral discrimination. The classifier discriminates between speech sounds by comparing the A1 spatiotemporal activity patterns evoked on single trials with the average spatiotemporal patterns evoked by known sounds. Unlike classifiers in most previous studies, this classifier is not provided with the stimulus onset time. Neural activity analyzed with the use of relative spike timing was well correlated with behavioral speech discrimination in quiet and in noise. Spike timing information integrated over longer intervals was required to accurately predict rat behavioral speech discrimination in noisy conditions. The similarity of neural and behavioral discrimination of speech in noise suggests that humans and rats may employ similar brain mechanisms to solve this problem. PMID:22098331
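
    The classification idea can be sketched in a few lines: correlate a single-trial spatiotemporal response with the average pattern ("template") evoked by each candidate sound and pick the best match. The toy arrays below are stand-ins; the published classifier's exact distance measure and preprocessing are not reproduced here.

        import numpy as np

        def classify(trial, templates):
            """Assign a single-trial (sites x time bins) pattern to the candidate
            sound whose average template it correlates with most strongly."""
            scores = {name: np.corrcoef(trial.ravel(), tmpl.ravel())[0, 1]
                      for name, tmpl in templates.items()}
            return max(scores, key=scores.get)

        rng = np.random.default_rng(1)
        templates = {"dad": rng.normal(size=(50, 40)),   # mean response per sound
                     "sad": rng.normal(size=(50, 40))}
        trial = templates["dad"] + 0.8 * rng.normal(size=(50, 40))  # noisy single trial
        print(classify(trial, templates))  # usually "dad"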

  11. Expressive vocabulary and auditory processing in children with deviant speech acquisition.

    Science.gov (United States)

    Quintas, Victor Gandra; Mezzomo, Carolina Lisbôa; Keske-Soares, Márcia; Dias, Roberta Freitas

    2010-01-01

    This study investigated expressive vocabulary and auditory processing in children with phonological disorder. The aim was to compare the performance of children with phonological disorder in a vocabulary test with the parameters indicated by the same test, and to verify a possible relationship between this performance and auditory processing deficits. Participants were 12 children diagnosed with phonological disorders, aged 5 to 7 years, of both genders. Vocabulary was assessed using the ABFW language test, and auditory processing was assessed using the simplified auditory processing evaluation (sorting), the Alternate Dichotic Dissyllable - Staggered Spondaic Word (SSW) test, the Pitch Pattern Sequence (PPS) test and the Binaural Fusion Test (BF). In the vocabulary test, all children obtained results with no statistically significant deviation. In the auditory processing assessment, all children presented better results than expected; the only exception was the sorting process test, where the mean accuracy score was 8.25. In the other auditory processing tests, the mean accuracy scores were 6.50 in the SSW, 10.74 in the PPS and 7.10 in the BF. When the performances in the two assessments were correlated (considering p>0.05), the results indicated that, despite the overall normality, the lower the value obtained in the auditory processing assessment, the lower the accuracy in the vocabulary test. A trend was observed for the semantic fields of means of transportation and professions. Considering the classification categories of the vocabulary test, substitution processes (SP) were the category with the largest significant increase across all semantic fields. There is thus a correlation between auditory processing and the lexicon, and vocabulary can be influenced in children with deviant speech acquisition.

  12. Speech recognition from spectral dynamics

    Indian Academy of Sciences (India)

    Carrier nature of speech; modulation spectrum; spectral dynamics ... the relationships between phonetic values of sounds and their short-term spectral envelopes .... the number of free parameters that need to be estimated from training data.

  13. Early enhanced processing and delayed habituation to deviance sounds in autism spectrum disorder.

    Science.gov (United States)

    Hudac, Caitlin M; DesChamps, Trent D; Arnett, Anne B; Cairney, Brianna E; Ma, Ruqian; Webb, Sara Jane; Bernier, Raphael A

    2018-06-01

    Children with autism spectrum disorder (ASD) exhibit difficulties processing and encoding sensory information in daily life. Cognitive response to environmental change in control individuals is naturally dynamic, meaning it habituates or reduces over time as one becomes accustomed to the deviance. The origin of atypical response to deviance in ASD may relate to differences in this dynamic habituation. The current study of 133 children and young adults with and without ASD examined classic electrophysiological responses (MMN and P3a), as well as temporal patterns of habituation (i.e., N1 and P3a change over time) in response to a passive auditory oddball task. Individuals with ASD showed an overall heightened sensitivity to change as exhibited by greater P3a amplitude to novel sounds. Moreover, youth with ASD showed dynamic ERP differences, including slower attenuation of the N1 response to infrequent tones and the P3a response to novel sounds. Dynamic ERP responses were related to parent ratings of auditory sensory-seeking behaviors, but not general cognition. As the first large-scale study to characterize temporal dynamics of auditory ERPs in ASD, our results provide compelling evidence that heightened response to auditory deviance in ASD is largely driven by early sensitivity and prolonged processing of auditory deviance. Copyright © 2018 Elsevier Inc. All rights reserved.

  14. Speech-Language Therapists' Process of Including Significant Others in Aphasia Rehabilitation

    Science.gov (United States)

    Hallé, Marie-Christine; Le Dorze, Guylaine; Mingant, Anne

    2014-01-01

    Background: Although aphasia rehabilitation should include significant others, it is currently unknown how this recommendation is adopted in speech-language therapy practice. Speech-language therapists' (SLTs) experience of including significant others in aphasia rehabilitation is also understudied, yet a better understanding of clinical…

  15. Co-Thought and Co-Speech Gestures Are Generated by the Same Action Generation Process

    Science.gov (United States)

    Chu, Mingyuan; Kita, Sotaro

    2016-01-01

    People spontaneously gesture when they speak (co-speech gestures) and when they solve problems silently (co-thought gestures). In this study, we first explored the relationship between these 2 types of gestures and found that individuals who produced co-thought gestures more frequently also produced co-speech gestures more frequently (Experiments…

  16. Processing and Comprehension of Accented Speech by Monolingual and Bilingual Children

    Science.gov (United States)

    McDonald, Margarethe; Gross, Megan; Buac, Milijana; Batko, Michelle; Kaushanskaya, Margarita

    2018-01-01

    This study tested the effect of Spanish-accented speech on sentence comprehension in children with different degrees of Spanish experience. The hypothesis was that earlier acquisition of Spanish would be associated with enhanced comprehension of Spanish-accented speech. Three groups of 5-6-year-old children were tested: monolingual…

  17. Sound stream segregation: a neuromorphic approach to solve the “cocktail party problem” in real-time

    Science.gov (United States)

    Thakur, Chetan Singh; Wang, Runchun M.; Afshar, Saeed; Hamilton, Tara J.; Tapson, Jonathan C.; Shamma, Shihab A.; van Schaik, André

    2015-01-01

    The human auditory system has the ability to segregate complex auditory scenes into a foreground component and a background, allowing us to listen to specific speech sounds from a mixture of sounds. Selective attention plays a crucial role in this process, colloquially known as the “cocktail party effect.” It has not been possible to build a machine that can emulate this human ability in real-time. Here, we have developed a framework for the implementation of a neuromorphic sound segregation algorithm in a Field Programmable Gate Array (FPGA). This algorithm is based on the principles of temporal coherence and uses an attention signal to separate a target sound stream from background noise. Temporal coherence implies that auditory features belonging to the same sound source are coherently modulated and evoke highly correlated neural response patterns. The basis for this form of sound segregation is that responses from pairs of channels that are strongly positively correlated belong to the same stream, while channels that are uncorrelated or anti-correlated belong to different streams. In our framework, we have used a neuromorphic cochlea as a frontend sound analyser to extract spatial information of the sound input, which then passes through band pass filters that extract the sound envelope at various modulation rates. Further stages include feature extraction and mask generation, which is finally used to reconstruct the targeted sound. Using sample tonal and speech mixtures, we show that our FPGA architecture is able to segregate sound sources in real-time. The accuracy of segregation is indicated by the high signal-to-noise ratio (SNR) of the segregated stream (90, 77, and 55 dB for simple tone, complex tone, and speech, respectively) as compared to the SNR of the mixture waveform (0 dB). This system may be easily extended for the segregation of complex speech signals, and may thus find various applications in electronic devices such as for sound segregation
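
    The correlation rule at the core of the algorithm can be illustrated with synthetic envelopes (a toy sketch, not the FPGA implementation): channels whose envelopes co-modulate with an attended anchor channel are kept, while uncorrelated channels are masked out.

        import numpy as np

        rng = np.random.default_rng(0)
        t = np.linspace(0.0, 1.0, 8000)
        src_a = 0.5 * (1.0 + np.sin(2.0 * np.pi * 4.0 * t))  # source A: 4 Hz modulation
        src_b = 0.5 * (1.0 + np.sin(2.0 * np.pi * 7.0 * t))  # source B: 7 Hz modulation

        # Four frequency channels: two carry source A, two carry source B.
        channels = np.stack([src_a, src_a, src_b, src_b])
        channels += 0.05 * rng.normal(size=channels.shape)

        anchor = channels[0]  # the attention signal selects this channel
        corr = np.array([np.corrcoef(anchor, ch)[0, 1] for ch in channels])
        mask = (corr > 0.5).astype(float)   # coherent channels form the foreground
        foreground = mask[:, None] * channels

        print(np.round(corr, 2))  # ~[1.0, 1.0, 0.0, 0.0]: channels 2-3 are masked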

  19. The Beginnings of Danish Speech Perception

    DEFF Research Database (Denmark)

    Østerbye, Torkil

    Little is known about the perception of speech sounds by native Danish listeners. However, the Danish sound system differs in several interesting ways from the sound systems of other languages. For instance, Danish is characterized, among other features, by a rich vowel inventory and by different reductions of speech sounds evident in the pronunciation of the language. This book (originally a PhD thesis) consists of three studies based on the results of two experiments. The experiments were designed to provide knowledge of the perception of Danish speech sounds by Danish adults and infants, in the light of the rich and complex Danish sound system. The first two studies report on native adults' perception of Danish speech sounds in quiet and noise. The third study examined the development of language-specific perception in native Danish infants at 6, 9 and 12 months of age. The book points...

  20. Sound and sound sources

    DEFF Research Database (Denmark)

    Larsen, Ole Næsbye; Wahlberg, Magnus

    2017-01-01

    There is no difference in principle between the infrasonic and ultrasonic sounds, which are inaudible to humans (or other animals), and the sounds that we can hear. In all cases, sound is a wave of pressure and particle oscillations propagating through an elastic medium, such as air. This chapter is about the physical laws that govern how animals produce sound signals and how physical principles determine the signals' frequency content and sound level, the nature of the sound field (sound pressure versus particle vibrations) as well as directional properties of the emitted signal. Many of these properties are dictated by simple physical relationships between the size of the sound emitter and the wavelength of emitted sound. The wavelengths of the signals need to be sufficiently short in relation to the size of the emitter to allow for the efficient production of propagating sound pressure waves...
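
    A worked example of that size-wavelength relationship (assuming a sound speed of about 343 m/s in air at room temperature):

        \lambda = \frac{c}{f} = \frac{343\ \mathrm{m/s}}{1000\ \mathrm{Hz}} \approx 0.34\ \mathrm{m}

    An emitter much smaller than this wavelength therefore radiates a 1 kHz tone inefficiently, which is one reason small sound sources tend to radiate efficiently only at higher frequencies.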

  1. Signal Processing Methods for Removing the Effects of Whole Body Vibration upon Speech

    Science.gov (United States)

    Bitner, Rachel M.; Begault, Durand R.

    2014-01-01

    Humans may be exposed to whole-body vibration in environments where clear speech communications are crucial, particularly during the launch phases of space flight and in high-performance aircraft. Prior research has shown that high levels of vibration cause a decrease in speech intelligibility. However, the effects of whole-body vibration upon speech are not well understood, and no attempt has been made to restore speech distorted by whole-body vibration. In this paper, a model for speech under whole-body vibration is proposed and a method to remove its effect is described. The method described reduces the perceptual effects of vibration, yields higher automatic speech recognition (ASR) accuracy scores, and may significantly improve intelligibility. Possible applications include incorporation within radio-communication systems in environments such as spaceflight, aviation, or off-road vehicle operations.

  2. Assessment and improvement of sound quality in cochlear implant users.

    Science.gov (United States)

    Caldwell, Meredith T; Jiam, Nicole T; Limb, Charles J

    2017-06-01

    Cochlear implants (CIs) have successfully provided speech perception to individuals with sensorineural hearing loss. Recent research has focused on more challenging acoustic stimuli such as music and voice emotion. The purpose of this review is to evaluate and describe sound quality in CI users, summarizing novel findings and crucial information about how CI users experience complex sounds. Here we review the existing literature on PubMed and Scopus to present what is known about perceptual sound quality in CI users, discuss existing measures of sound quality, explore how sound quality may be effectively studied, and examine potential strategies of improving sound quality in the CI population. Sound quality, defined here as the perceived richness of an auditory stimulus, is an attribute of implant-mediated listening that remains poorly studied. Sound quality is distinct from appraisal, which is generally defined as the subjective likability or pleasantness of a sound. Existing studies suggest that sound quality perception in the CI population is limited by a range of factors, most notably pitch distortion and dynamic range compression. Although there are currently very few objective measures of sound quality, the CI-MUSHRA has been used as a means of evaluating sound quality. There exist a number of promising strategies to improve sound quality perception in the CI population including apical cochlear stimulation, pitch tuning, and noise reduction processing strategies. In the published literature, sound quality perception is severely limited among CI users. Future research should focus on developing systematic, objective, and quantitative sound quality metrics and designing therapies to mitigate poor sound quality perception in CI users.

  3. Speaking with a mirror: engagement of mirror neurons via choral speech and its derivatives induces stuttering inhibition.

    Science.gov (United States)

    Kalinowski, Joseph; Saltuklaroglu, Tim

    2003-04-01

    'Choral speech', 'unison speech', or 'imitation speech' has long been known to immediately induce reflexive, spontaneous, and natural-sounding fluency, even in the most severe cases of stuttering. Unlike typical post-therapeutic speech, a hallmark characteristic of choral speech is the sense of 'invulnerability' to stuttering, regardless of phonetic context, situational environment, or audience size. We suggest that choral speech immediately inhibits stuttering by engaging mirror systems of neurons, innate primitive neuronal substrates that dominate the initial phases of language development due to their predisposition to reflexively imitate gestural action sequences in a fluent manner. Since mirror systems are primordial in nature, they take precedence over the much later developing stuttering pathology. We suggest that stuttering may best be ameliorated by reengaging mirror neurons via choral speech or one of its derivatives (using digital signal processing technology) to provide gestural mirrors, which are nature's way of immediately overriding the central stuttering block. Copyright 2003 Elsevier Science Ltd.
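
    One family of such DSP derivatives is altered auditory feedback, for example delayed auditory feedback, in which the talker hears a slightly delayed copy of his or her own voice as a second speech signal. A minimal offline sketch follows; the delay and gain are assumed values, and a real device would process audio block-wise in real time.

        import numpy as np

        def delayed_feedback(voice, fs, delay_s=0.05, gain=1.0):
            """Return the talker's own voice delayed by delay_s seconds."""
            d = int(round(delay_s * fs))
            out = np.zeros(voice.size + d)
            out[d:] = gain * voice  # delayed copy presented back to the talker
            return out

        fs = 16000
        voice = np.sin(2 * np.pi * 150 * np.arange(fs) / fs)  # stand-in for speech
        feedback = delayed_feedback(voice, fs)  # mixed into the headphone output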

  4. Speech enhancement

    CERN Document Server

    Benesty, Jacob; Chen, Jingdong

    2006-01-01

    We live in a noisy world! In all applications (telecommunications, hands-free communications, recording, human-machine interfaces, etc.) that require at least one microphone, the signal of interest is usually contaminated by noise and reverberation. As a result, the microphone signal has to be "cleaned" with digital signal processing tools before it is played out, transmitted, or stored. This book is about speech enhancement. Different well-known and state-of-the-art methods for noise reduction, with one or multiple microphones, are discussed. By speech enhancement, we mean not only noise reduction...
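
    As a taste of the classical single-microphone methods such a book covers, here is a bare-bones magnitude spectral subtraction sketch. It is illustrative only; the frame size, overlap and spectral floor are arbitrary choices, not parameters from the book.

        import numpy as np

        def spectral_subtract(noisy, noise_only, frame=512, hop=256):
            """Subtract an average noise magnitude spectrum, frame by frame."""
            win = np.hanning(frame)
            # Estimate the noise magnitude spectrum from a noise-only excerpt.
            noise_mag = np.mean([np.abs(np.fft.rfft(noise_only[i:i + frame] * win))
                                 for i in range(0, noise_only.size - frame, hop)],
                                axis=0)
            out = np.zeros(noisy.size)
            for i in range(0, noisy.size - frame, hop):
                spec = np.fft.rfft(noisy[i:i + frame] * win)
                # Spectral floor avoids negative magnitudes ("musical noise" guard).
                mag = np.maximum(np.abs(spec) - noise_mag, 0.05 * np.abs(spec))
                out[i:i + frame] += np.fft.irfft(mag * np.exp(1j * np.angle(spec)),
                                                 frame)
            return out

        rng = np.random.default_rng(0)
        fs = 16000
        clean = np.sin(2 * np.pi * 220 * np.arange(fs) / fs)
        noise = 0.3 * rng.normal(size=2 * fs)          # stationary noise
        enhanced = spectral_subtract(clean + noise[:fs], noise[fs:])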

  5. Central load reduces peripheral processing: Evidence from incidental memory of background speech.

    Science.gov (United States)

    Halin, Niklas; Marsh, John E; Sörqvist, Patrik

    2015-12-01

    Is there a trade-off between central (working memory) load and peripheral (perceptual) processing? To address this question, participants were requested to undertake an n-back task in one of two levels of central/cognitive load (i.e., 1-back or 2-back) in the presence of a to-be-ignored story presented via headphones. Participants were told to ignore the background story, but they were given a surprise memory test of what had been said in the background story immediately after the n-back task was completed. Memory was poorer in the high central load (2-back) condition in comparison with the low central load (1-back) condition. Hence, when people compensate for higher central load by increasing attentional engagement, peripheral processing is constrained. Moreover, participants with high working memory capacity (WMC) - with a superior ability for attentional engagement - remembered less of the background story, but only in the low central load condition. Taken together, peripheral processing - as indexed by incidental memory of background speech - is constrained when task engagement is high. © 2015 The Authors. Scandinavian Journal of Psychology published by Scandinavian Psychological Associations and John Wiley & Sons Ltd.

  6. A virtual auditory environment for investigating the auditory signal processing of realistic sounds

    DEFF Research Database (Denmark)

    Favrot, Sylvain Emmanuel; Buchholz, Jörg

    2008-01-01

    In the present study, a novel multichannel loudspeaker-based virtual auditory environment (VAE) is introduced. The VAE aims at providing a versatile research environment for investigating the auditory signal processing in real environments, i.e., considering multiple sound sources and room reverberation. The environment is based on the ODEON room acoustic simulation software to render the acoustical scene. ODEON outputs are processed using a combination of different order Ambisonic techniques to calculate multichannel room impulse responses (mRIR). Auralization is then obtained by the convolution ... During the VAE development, special care was taken in order to achieve a realistic auditory percept and to avoid "artifacts" such as unnatural coloration. The performance of the VAE has been evaluated and optimized on a 29 loudspeaker setup using both objective and subjective measurement techniques.

  7. Tipos de erros de fala em crianças com transtorno fonológico em função do histórico de otite média (Speech errors in children with speech sound disorders according to otitis media history)

    Directory of Open Access Journals (Sweden)

    Haydée Fiszbein Wertzner

    2012-12-01

    Full Text Available PURPOSE: To describe articulatory indexes for the different types of speech errors and to verify the existence of a preferred error type in children with speech sound disorder, according to the presence or absence of a history of otitis media. METHODS: Participants in this prospective, cross-sectional study were 21 subjects of both genders, aged between 5 years 2 months and 7 years 9 months, diagnosed with speech sound disorder. Subjects were grouped according to otitis media history: experimental group 1 (GE1) comprised 14 subjects with a history of otitis media, and experimental group 2 (GE2) comprised seven subjects without such a history. The number of speech errors (distortions, omissions and substitutions) and the articulatory indexes were calculated, and the data were submitted to statistical analysis. RESULTS: GE1 and GE2 differed in index performance when the two phonology tasks applied were compared. In all analyses, the indexes assessing substitutions identified the error type most often produced by children with speech sound disorder. CONCLUSION: The indexes were effective in indicating substitution as the most frequent error in children with speech sound disorder. The higher occurrence of speech errors observed during picture naming in children with a history of otitis media indicates that these errors are possibly associated with difficulty in phonological representation caused by the transitory hearing loss these children experienced.

  8. Design, development and test of the gearbox condition monitoring system using sound signal processing

    Directory of Open Access Journals (Sweden)

    M Zamani

    2016-09-01

    A gearbox transmits power from a generating source to a consumer, providing the torque and rotating speed that the consumer requires. In effect, a gearbox is an interface between the power source and the power consumer that provides a flexible coupling between the two. Because the torque and rotating speed of the power source rarely match those needed by the consumer, such a harmonizing interface is unavoidable. The calculations necessary to obtain the technical characteristics of the gearwheels, bearings, shaft dimensions and other gearbox components were therefore carried out. The gearbox studied is of the simple spur-gear type, with parallel input and output shafts. Its main components are: 1. housing, 2. shafts, 3. gearwheels, 4. keys, 5. bearings, 6. cover. All design parameters were calculated and taken into account in the design of each component. Electromotor rotation calibration: for this purpose, a Lutron light/contact tachometer was used in contact mode. Acoustic module of the electromotor: a module was constructed to prevent the sound waves produced by the running electromotor from interacting with those produced by the gearbox. Three layers of sound-absorbing...