speech sound discrimination: Topics by WorldWideScience.org

Sample records for speech sound discrimination

Mechanisms underlying speech sound discrimination and categorization in humans and zebra finches

NARCIS (Netherlands)

Burgering, Merel A.; ten Cate, Carel; Vroomen, Jean

Speech sound categorization in birds seems in many ways comparable to that by humans, but it is unclear what mechanisms underlie such categorization. To examine this, we trained zebra finches and humans to discriminate two pairs of edited speech sounds that varied either along one dimension (vowel
Knockdown of Dyslexia-Gene Dcdc2 Interferes with Speech Sound Discrimination in Continuous Streams.

Science.gov (United States)

Centanni, Tracy Michelle; Booker, Anne B; Chen, Fuyi; Sloan, Andrew M; Carraway, Ryan S; Rennaker, Robert L; LoTurco, Joseph J; Kilgard, Michael P

2016-04-27

Dyslexia is the most common developmental language disorder and is marked by deficits in reading and phonological awareness. One theory of dyslexia suggests that the phonological awareness deficit is due to abnormal auditory processing of speech sounds. Variants in DCDC2 and several other neural migration genes are associated with dyslexia and may contribute to auditory processing deficits. In the current study, we tested the hypothesis that RNAi suppression of Dcdc2 in rats causes abnormal cortical responses to sound and impaired speech sound discrimination. Rats were subjected in utero to RNA interference targeting of the gene Dcdc2 or a scrambled sequence. Primary auditory cortex (A1) responses were acquired from 11 rats (5 with Dcdc2 RNAi; DC-) before any behavioral training. A separate group of 8 rats (3 DC-) was trained on a variety of speech sound discrimination tasks, and auditory cortex responses were acquired following training. Dcdc2 RNAi nearly eliminated the ability of rats to identify specific speech sounds from a continuous train of speech sounds but did not impair performance during discrimination of isolated speech sounds. The neural responses to speech sounds in A1 were not degraded as a function of presentation rate before training. These results suggest that A1 is not directly involved in the impaired speech discrimination caused by Dcdc2 RNAi. This result contrasts with earlier results using Kiaa0319 RNAi and suggests that different dyslexia genes may cause different deficits in the speech processing circuitry, which may explain differential responses to therapy. Although dyslexia is diagnosed through reading difficulty, there is a great deal of variation in the phenotypes of these individuals. The underlying neural and genetic mechanisms causing these differences are still widely debated. In the current study, we demonstrate that suppression of a candidate-dyslexia gene causes deficits on tasks of rapid stimulus processing
Cortical activity patterns predict robust speech discrimination ability in noise

Science.gov (United States)

Shetake, Jai A.; Wolf, Jordan T.; Cheung, Ryan J.; Engineer, Crystal T.; Ram, Satyananda K.; Kilgard, Michael P.

2012-01-01

The neural mechanisms that support speech discrimination in noisy conditions are poorly understood. In quiet conditions, spike timing information appears to be used in the discrimination of speech sounds. In this study, we evaluated the hypothesis that spike timing is also used to distinguish between speech sounds in noisy conditions that significantly degrade neural responses to speech sounds. We tested speech sound discrimination in rats and recorded primary auditory cortex (A1) responses to speech sounds in background noise of different intensities and spectral compositions. Our behavioral results indicate that rats, like humans, are able to accurately discriminate consonant sounds even in the presence of background noise that is as loud as the speech signal. Our neural recordings confirm that speech sounds evoke degraded but detectable responses in noise. Finally, we developed a novel neural classifier that mimics behavioral discrimination. The classifier discriminates between speech sounds by comparing the A1 spatiotemporal activity patterns evoked on single trials with the average spatiotemporal patterns evoked by known sounds. Unlike classifiers in most previous studies, this classifier is not provided with the stimulus onset time. Neural activity analyzed with the use of relative spike timing was well correlated with behavioral speech discrimination in quiet and in noise. Spike timing information integrated over longer intervals was required to accurately predict rat behavioral speech discrimination in noisy conditions. The similarity of neural and behavioral discrimination of speech in noise suggests that humans and rats may employ similar brain mechanisms to solve this problem. PMID:22098331
Visual Speech Fills in Both Discrimination and Identification of Non-Intact Auditory Speech in Children

Science.gov (United States)

Jerger, Susan; Damian, Markus F.; McAlpine, Rachel P.; Abdi, Herve

2018-01-01

To communicate, children must discriminate and identify speech sounds. Because visual speech plays an important role in this process, we explored how visual speech influences phoneme discrimination and identification by children. Critical items had intact visual speech (e.g. baez) coupled to non-intact (excised onsets) auditory speech (signified…
Atypical central auditory speech-sound discrimination in children who stutter as indexed by the mismatch negativity

NARCIS (Netherlands)

Jansson-Verkasalo, E.; Eggers, K.; Järvenpää, A.; Suominen, K.; Van Den Bergh, B.R.H.; de Nil, L.; Kujala, T.

2014-01-01

Purpose: Recent theoretical conceptualizations suggest that disfluencies in stuttering may arise from several factors, one of them being atypical auditory processing. The main purpose of the present study was to investigate whether speech sound encoding and central auditory discrimination are
Effect of gap detection threshold on consistency of speech in children with speech sound disorder.

Science.gov (United States)

Sayyahi, Fateme; Soleymani, Zahra; Akbari, Mohammad; Bijankhan, Mahmood; Dolatshahi, Behrooz

2017-02-01

The present study examined the relationship between gap detection threshold and speech error consistency in children with speech sound disorder. The participants were children five to six years of age who were categorized into three groups: typical speech, consistent speech disorder (CSD) and inconsistent speech disorder (ISD). The phonetic gap detection threshold test was used for this study; it is a valid test comprising six syllables with inter-stimulus intervals between 20 and 300 ms. The participants were asked to listen to the recorded stimuli three times and indicate whether they heard one or two sounds. There was no significant difference between the typical and CSD groups (p=0.55), but there were significant differences in performance between the ISD and CSD groups and the ISD and typical groups (p=0.00). The ISD group discriminated between speech sounds at a higher threshold. Children with inconsistent speech errors could not distinguish speech sounds during time-limited phonetic discrimination. It is suggested that inconsistency in speech is a representation of inconsistency in auditory perception, which is caused by a high gap detection threshold. Copyright © 2016 Elsevier Ltd. All rights reserved.
Cognitive and linguistic sources of variance in 2-year-olds’ speech-sound discrimination: a preliminary investigation.

Science.gov (United States)

Lalonde, Kaylah; Holt, Rachael Frush

2014-02-01

This preliminary investigation explored potential cognitive and linguistic sources of variance in 2-year-olds’ speech-sound discrimination by using the toddler change/no-change procedure and examined whether modifications would result in a procedure that can be used consistently with younger 2-year-olds. Twenty typically developing 2-year-olds completed the newly modified toddler change/no-change procedure. Behavioral tests and parent report questionnaires were used to measure several cognitive and linguistic constructs. Stepwise linear regression was used to relate discrimination sensitivity to the cognitive and linguistic measures. In addition, discrimination results from the current experiment were compared with those from 2-year-old children tested in a previous experiment. Receptive vocabulary and working memory explained 56.6% of variance in discrimination performance. Performance on the modified toddler change/no-change procedure used in the current experiment did not differ from performance in a previous investigation, which used the original version of the procedure. The relationship between speech discrimination and receptive vocabulary and working memory provides further evidence that the procedure is sensitive to the strength of perceptual representations. The role for working memory might also suggest that there are specific subject-related, nonsensory factors limiting the applicability of the procedure to children who have not reached the necessary levels of cognitive and linguistic development.
Discrimination and preference of speech and non-speech sounds in autism patients

Institute of Scientific and Technical Information of China (English)

王崇颖; 江鸣山; 徐旸; 马斐然; 石锋

2011-01-01

Objective: To explore the discrimination and preference of speech and non-speech sounds in autism patients. Methods: Ten people (5 children and 5 adults) diagnosed with autism according to the criteria of the Diagnostic and Statistical Manual of Mental Disorders, Fourth Edition (DSM-IV) were selected from the database of the Nankai University Center for Behavioural Science. Together with 10 age-matched healthy controls, they were tested in three experiments on speech sounds, pure tones and intonation, which were recorded and modified with Praat, a voice analysis software package. Their discrimination and preference responses were collected orally. Exact probability values were calculated. Results: There were no significant differences in the discrimination of speech sounds, pure tones and intonation between autism patients and controls (P > 0.05), while controls preferred speech and non-speech sounds with higher pitch than autism patients did (e.g., -100 Hz/+50 Hz, 2 vs. 7, P < 0.05; 50 Hz/250 Hz, 4 vs. 10, P < 0.05) and autism patients preferred non-speech sounds with lower pitch (100 Hz/250 Hz, 6 vs. 3, P < 0.05). No significant difference in the preference for intonation was found between autism patients and controls (P > 0.05). Conclusion: The results show that people with autism have impaired auditory processing of speech and non-speech sounds.
Atypical speech versus non-speech detection and discrimination in 4- to 6- yr old children with autism spectrum disorder: An ERP study.

Directory of Open Access Journals (Sweden)

Alena Galilee

Previous event-related potential (ERP) research utilizing oddball stimulus paradigms suggests diminished processing of speech versus non-speech sounds in children with an Autism Spectrum Disorder (ASD). However, brain mechanisms underlying these speech processing abnormalities, and to what extent they are related to poor language abilities in this population remain unknown. In the current study, we utilized a novel paired repetition paradigm in order to investigate ERP responses associated with the detection and discrimination of speech and non-speech sounds in 4- to 6-year old children with ASD, compared with gender and verbal age matched controls. ERPs were recorded while children passively listened to pairs of stimuli that were either both speech sounds, both non-speech sounds, speech followed by non-speech, or non-speech followed by speech. Control participants exhibited N330 match/mismatch responses measured from temporal electrodes, reflecting speech versus non-speech detection, bilaterally, whereas children with ASD exhibited this effect only over temporal electrodes in the left hemisphere. Furthermore, while the control groups exhibited match/mismatch effects at approximately 600 ms (central N600, temporal P600) when a non-speech sound was followed by a speech sound, these effects were absent in the ASD group. These findings suggest that children with ASD fail to activate right hemisphere mechanisms, likely associated with social or emotional aspects of speech detection, when distinguishing non-speech from speech stimuli. Together, these results demonstrate the presence of atypical speech versus non-speech processing in children with ASD when compared with typically developing children matched on verbal age.
Atypical speech versus non-speech detection and discrimination in 4- to 6- yr old children with autism spectrum disorder: An ERP study.

Science.gov (United States)

Galilee, Alena; Stefanidou, Chrysi; McCleery, Joseph P

2017-01-01

Previous event-related potential (ERP) research utilizing oddball stimulus paradigms suggests diminished processing of speech versus non-speech sounds in children with an Autism Spectrum Disorder (ASD). However, brain mechanisms underlying these speech processing abnormalities, and to what extent they are related to poor language abilities in this population remain unknown. In the current study, we utilized a novel paired repetition paradigm in order to investigate ERP responses associated with the detection and discrimination of speech and non-speech sounds in 4- to 6-year old children with ASD, compared with gender and verbal age matched controls. ERPs were recorded while children passively listened to pairs of stimuli that were either both speech sounds, both non-speech sounds, speech followed by non-speech, or non-speech followed by speech. Control participants exhibited N330 match/mismatch responses measured from temporal electrodes, reflecting speech versus non-speech detection, bilaterally, whereas children with ASD exhibited this effect only over temporal electrodes in the left hemisphere. Furthermore, while the control groups exhibited match/mismatch effects at approximately 600 ms (central N600, temporal P600) when a non-speech sound was followed by a speech sound, these effects were absent in the ASD group. These findings suggest that children with ASD fail to activate right hemisphere mechanisms, likely associated with social or emotional aspects of speech detection, when distinguishing non-speech from speech stimuli. Together, these results demonstrate the presence of atypical speech versus non-speech processing in children with ASD when compared with typically developing children matched on verbal age.
Speech discrimination difficulties in High-Functioning Autism Spectrum Disorder are likely independent of auditory hypersensitivity.

Directory of Open Access Journals (Sweden)

William Andrew Dunlop

2016-08-01

Autism Spectrum Disorder (ASD), characterised by impaired communication skills and repetitive behaviours, can also result in differences in sensory perception. Individuals with ASD often perform normally in simple auditory tasks but poorly compared to typically developed (TD) individuals on complex auditory tasks like discriminating speech from complex background noise. A common trait of individuals with ASD is hypersensitivity to auditory stimulation. No studies to our knowledge consider whether hypersensitivity to sounds is related to differences in speech-in-noise discrimination. We provide novel evidence that individuals with high-functioning ASD show poor performance compared to TD individuals in a speech-in-noise discrimination task with an attentionally demanding background noise, but not in a purely energetic noise. Further, we demonstrate in our small sample that speech-hypersensitivity does not appear to predict performance in the speech-in-noise task. The findings support the argument that an attentional deficit, rather than a perceptual deficit, affects the ability of individuals with ASD to discriminate speech from background noise. Finally, we piloted a novel questionnaire that measures difficulty hearing in noisy environments, and sensitivity to non-verbal and verbal sounds. Psychometric analysis using 128 TD participants provided novel evidence for a difference in sensitivity to non-verbal and verbal sounds, and these findings were reinforced by participants with ASD who also completed the questionnaire. The study was limited by a small and high-functioning sample of participants with ASD. Future work could test larger sample sizes and include lower-functioning ASD participants.
Speech Production and Speech Discrimination by Hearing-Impaired Children.

Science.gov (United States)

Novelli-Olmstead, Tina; Ling, Daniel

1984-01-01

Seven hearing impaired children (five to seven years old) assigned to the Speakers group made highly significant gains in speech production and auditory discrimination of speech, while Listeners made only slight speech production gains and no gains in auditory discrimination. Combined speech and auditory training was more effective than auditory…
Speech endpoint detection with non-language speech sounds for generic speech processing applications

Science.gov (United States)

McClain, Matthew; Romanowski, Brian

2009-05-01

Non-language speech sounds (NLSS) are sounds produced by humans that do not carry linguistic information. Examples of these sounds are coughs, clicks, breaths, and filled pauses such as "uh" and "um" in English. NLSS are prominent in conversational speech, but can be a significant source of errors in speech processing applications. Traditionally, these sounds are ignored by speech endpoint detection algorithms, where speech regions are identified in the audio signal prior to processing. The ability to filter NLSS as a pre-processing step can significantly enhance the performance of many speech processing applications, such as speaker identification, language identification, and automatic speech recognition. In order to be used in all such applications, NLSS detection must be performed without the use of language models that provide knowledge of the phonology and lexical structure of speech. This is especially relevant to situations where the languages used in the audio are not known a priori. We present the results of preliminary experiments using data from American and British English speakers, in which segments of audio are classified as language speech sounds (LSS) or NLSS using a set of acoustic features designed for language-agnostic NLSS detection and a hidden-Markov model (HMM) to model speech generation. The results of these experiments indicate that the features and model used are capable of detecting certain types of NLSS, such as breaths and clicks, while detection of other types of NLSS such as filled pauses will require future research.
Intelligibility of speech of children with speech and sound disorders

OpenAIRE

Ivetac, Tina

2014-01-01

The purpose of this study is to examine speech intelligibility of children with primary speech and sound disorders aged 3 to 6 years in everyday life. The research problem is based on the degree to which parents or guardians, immediate family members (sister, brother, grandparents), extended family members (aunt, uncle, cousin), child's friends, other acquaintances, child's teachers and strangers understand the speech of children with speech sound disorders. We examined whether the level ...
Musical ability and non-native speech-sound processing are linked through sensitivity to pitch and spectral information.

Science.gov (United States)

Kempe, Vera; Bublitz, Dennis; Brooks, Patricia J

2015-05-01

Is the observed link between musical ability and non-native speech-sound processing due to enhanced sensitivity to acoustic features underlying both musical and linguistic processing? To address this question, native English speakers (N = 118) discriminated Norwegian tonal contrasts and Norwegian vowels. Short tones differing in temporal, pitch, and spectral characteristics were used to measure sensitivity to the various acoustic features implicated in musical and speech processing. Musical ability was measured using Gordon's Advanced Measures of Musical Audiation. Results showed that sensitivity to specific acoustic features played a role in non-native speech-sound processing: Controlling for non-verbal intelligence, prior foreign language-learning experience, and sex, sensitivity to pitch and spectral information partially mediated the link between musical ability and discrimination of non-native vowels and lexical tones. The findings suggest that while sensitivity to certain acoustic features partially mediates the relationship between musical ability and non-native speech-sound processing, complex tests of musical ability also tap into other shared mechanisms. © 2014 The British Psychological Society.
Infant speech-sound discrimination testing: effects of stimulus intensity and procedural model on measures of performance.

Science.gov (United States)

Nozza, R J

1987-06-01

Performance of infants in a speech-sound discrimination task (/ba/ vs /da/) was measured at three stimulus intensity levels (50, 60, and 70 dB SPL) using the operant head-turn procedure. The procedure was modified so that data could be treated as though from a single-interval (yes-no) procedure, as is commonly done, as well as if from a sustained attention (vigilance) task. Discrimination performance changed significantly with increase in intensity, suggesting caution in the interpretation of results from infant discrimination studies in which only single stimulus intensity levels within this range are used. The assumptions made about the underlying methodological model did not change the performance-intensity relationships. However, infants demonstrated response decrement, typical of vigilance tasks, which supports the notion that the head-turn procedure is represented best by the vigilance model. Analysis then was done according to a method designed for tasks with undefined observation intervals [C. S. Watson and T. L. Nichols, J. Acoust. Soc. Am. 59, 655-668 (1976)]. Results reveal that, while group data are reasonably well represented across levels of difficulty by the fixed-interval model, there is a variation in performance as a function of time following trial onset that could lead to underestimation of performance in some cases.
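When head-turn data are treated as coming from a single-interval (yes-no) procedure, one common summary is the signal-detection sensitivity index d', the difference between the z-transformed hit and false-alarm rates. The abstract does not say which statistic was used, so the sketch below is only an illustration of that standard index. The function name and the example rates are assumptions.

```python
from statistics import NormalDist

def d_prime(hit_rate, false_alarm_rate):
    """Yes-no sensitivity index: z(hit rate) - z(false-alarm rate).

    Both rates must lie strictly between 0 and 1; rates of exactly 0 or 1
    make the inverse normal CDF undefined and raise an error here.
    """
    z = NormalDist().inv_cdf  # inverse of the standard normal CDF
    return z(hit_rate) - z(false_alarm_rate)

# Hypothetical example: head turns on 84% of change trials and on 16% of
# no-change (control) trials give a d' close to 2.
sensitivity = d_prime(0.84, 0.16)
```

Computing the index separately at each presentation level (50, 60, and 70 dB SPL in the study) would give a performance-intensity function of the kind the abstract describes.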
Aging affects hemispheric asymmetry in the neural representation of speech sounds.

Science.gov (United States)

Bellis, T J; Nicol, T; Kraus, N

2000-01-15

Hemispheric asymmetries in the processing of elemental speech sounds appear to be critical for normal speech perception. This study investigated the effects of age on hemispheric asymmetry observed in the neurophysiological responses to speech stimuli in three groups of normal hearing, right-handed subjects: children (ages, 8-11 years), young adults (ages, 20-25 years), and older adults (ages > 55 years). Peak-to-peak response amplitudes of the auditory cortical P1-N1 complex obtained over right and left temporal lobes were examined to determine the degree of left/right asymmetry in the neurophysiological responses elicited by synthetic speech syllables in each of the three subject groups. In addition, mismatch negativity (MMN) responses, which are elicited by acoustic change, were obtained. Whereas children and young adults demonstrated larger P1-N1-evoked response amplitudes over the left temporal lobe than over the right, responses from elderly subjects were symmetrical. In contrast, MMN responses, which reflect an echoic memory process, were symmetrical in all subject groups. The differences observed in the neurophysiological responses were accompanied by a finding of significantly poorer ability to discriminate speech syllables involving rapid spectrotemporal changes in the older adult group. This study demonstrates a biological, age-related change in the neural representation of basic speech sounds and suggests one possible underlying mechanism for the speech perception difficulties exhibited by aging adults. Furthermore, results of this study support previous findings suggesting a dissociation between neural mechanisms underlying those processes that reflect the basic representation of sound structure and those that represent auditory echoic memory and stimulus change.
Speech sound disorder at 4 years: prevalence, comorbidities, and predictors in a community cohort of children.

Science.gov (United States)

Eadie, Patricia; Morgan, Angela; Ukoumunne, Obioha C; Ttofari Eecen, Kyriaki; Wake, Melissa; Reilly, Sheena

2015-06-01

The epidemiology of preschool speech sound disorder is poorly understood. Our aims were to determine: the prevalence of idiopathic speech sound disorder; the comorbidity of speech sound disorder with language and pre-literacy difficulties; and the factors contributing to speech outcome at 4 years. One thousand four hundred and ninety-four participants from an Australian longitudinal cohort completed speech, language, and pre-literacy assessments at 4 years. Prevalence of speech sound disorder (SSD) was defined by standard score performance of ≤79 on a speech assessment. Logistic regression examined predictors of SSD within four domains: child and family; parent-reported speech; cognitive-linguistic; and parent-reported motor skills. At 4 years the prevalence of speech disorder in an Australian cohort was 3.4%. Comorbidity with SSD was 40.8% for language disorder and 20.8% for poor pre-literacy skills. Sex, maternal vocabulary, socio-economic status, and family history of speech and language difficulties predicted SSD, as did 2-year speech, language, and motor skills. Together these variables provided good discrimination of SSD (area under the curve=0.78). This is the first epidemiological study to demonstrate prevalence of SSD at 4 years of age that was consistent with previous clinical studies. Early detection of SSD at 4 years should focus on family variables and speech, language, and motor skills measured at 2 years. © 2014 Mac Keith Press.
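The area under the curve reported above (0.78) can be read as the probability that a randomly chosen child with SSD receives a higher predicted risk than a randomly chosen child without SSD. A minimal sketch of that pairwise calculation follows. The function name and the toy scores are illustrative assumptions and are unrelated to the study's data.

```python
def auc(scores, labels):
    """Area under the ROC curve via pairwise comparison.

    scores : predicted risk for each child (higher = more likely SSD)
    labels : 1 for SSD, 0 for no SSD
    Counts the fraction of (case, non-case) pairs in which the case has the
    higher score; tied pairs count as half.
    """
    cases = [s for s, y in zip(scores, labels) if y == 1]
    controls = [s for s, y in zip(scores, labels) if y == 0]
    if not cases or not controls:
        raise ValueError("need at least one case and one non-case")
    wins = 0.0
    for c in cases:
        for n in controls:
            if c > n:
                wins += 1.0
            elif c == n:
                wins += 0.5
    return wins / (len(cases) * len(controls))

# Hypothetical toy scores: three of the four case/non-case pairs are ordered
# correctly.
example = auc([0.9, 0.8, 0.3, 0.1], [1, 0, 1, 0])
```

A value of 0.5 corresponds to chance-level discrimination and 1.0 to perfect separation.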
Speech versus singing: Infants choose happier sounds

Directory of Open Access Journals (Sweden)

Marieve Corbeil

2013-06-01

Infants prefer speech to non-vocal sounds and to non-human vocalizations, and they prefer happy-sounding speech to neutral speech. They also exhibit an interest in singing, but there is little knowledge of their relative interest in speech and singing. The present study explored infants’ attention to unfamiliar audio samples of speech and singing. In Experiment 1, infants 4-13 months of age were exposed to happy-sounding infant-directed speech versus hummed lullabies by the same woman. They listened significantly longer to the speech, which had considerably greater acoustic variability and expressiveness, than to the lullabies. In Experiment 2, infants of comparable age who heard the lyrics of a Turkish children’s song spoken versus sung in a joyful/happy manner did not exhibit differential listening. Infants in Experiment 3 heard the happily sung lyrics of the Turkish children’s song versus a version that was spoken in an adult-directed or affectively neutral manner. They listened significantly longer to the sung version. Overall, happy voice quality rather than vocal mode (speech or singing) was the principal contributor to infant attention, regardless of age.
Discriminative learning for speech recognition

CERN Document Server

He, Xiaodong

2008-01-01

In this book, we introduce the background and mainstream methods of probabilistic modeling and discriminative parameter optimization for speech recognition. The specific models treated in depth include the widely used exponential-family distributions and the hidden Markov model. A detailed study is presented on unifying the common objective functions for discriminative learning in speech recognition, namely maximum mutual information (MMI), minimum classification error, and minimum phone/word error. The unification is presented, with rigorous mathematical analysis, in a common rational-functio

Refining Stimulus Parameters in Assessing Infant Speech Perception Using Visual Reinforcement Infant Speech Discrimination: Sensation Level.

Science.gov (United States)

Uhler, Kristin M; Baca, Rosalinda; Dudas, Emily; Fredrickson, Tammy

2015-01-01

Speech perception measures have long been considered an integral piece of the audiological assessment battery. Currently, a prelinguistic, standardized measure of speech perception is missing in the clinical assessment battery for infants and young toddlers. Such a measure would allow systematic assessment of speech perception abilities of infants as well as the potential to investigate the impact early identification of hearing loss and early fitting of amplification have on the auditory pathways. The purpose of this study was to investigate the impact of sensation level (SL) on the ability of infants with normal hearing (NH) to discriminate /a-i/ and /ba-da/ and to determine whether performance on the two contrasts differs significantly in predicting the discrimination criterion. The design was based on a survival analysis model for event occurrence and a repeated measures logistic model for binary outcomes. The outcome for survival analysis was the minimum SL for criterion and the outcome for the logistic regression model was the presence/absence of achieving the criterion. Criterion achievement was designated when an infant's proportion correct score was >0.75 on the discrimination performance task. Twenty-two infants with NH sensitivity participated in this study. There were 9 males and 13 females, aged 6-14 mo. Testing took place over two to three sessions. The first session consisted of a hearing test, threshold assessment of the two speech sounds (/a/ and /i/), and if time and attention allowed, visual reinforcement infant speech discrimination (VRISD). The second session consisted of VRISD assessment for the two test contrasts (/a-i/ and /ba-da/). The presentation level started at 50 dBA. If the infant was unable to successfully achieve criterion (>0.75) at 50 dBA, the presentation level was increased to 70 dBA followed by 60 dBA. Data examination included an event analysis, which provided the probability of criterion distribution across SL. The second stage of the analysis was a
Subtyping Children with Speech Sound Disorders by Endophenotypes

Science.gov (United States)

Lewis, Barbara A.; Avrich, Allison A.; Freebairn, Lisa A.; Taylor, H. Gerry; Iyengar, Sudha K.; Stein, Catherine M.

2011-01-01

Purpose: The present study examined associations of 5 endophenotypes (i.e., measurable skills that are closely associated with speech sound disorders and are useful in detecting genetic influences on speech sound production), oral motor skills, phonological memory, phonological awareness, vocabulary, and speeded naming, with 3 clinical criteria…
Spectral integration in speech and non-speech sounds

Science.gov (United States)

Jacewicz, Ewa

2005-04-01

Spectral integration (or formant averaging) was proposed in vowel perception research to account for the observation that a reduction of the intensity of one of two closely spaced formants (as in /u/) produced a predictable shift in vowel quality [Delattre et al., Word 8, 195-210 (1952)]. A related observation was reported in psychoacoustics, indicating that when the components of a two-tone periodic complex differ in amplitude and frequency, its perceived pitch is shifted toward that of the more intense tone [Helmholtz, App. XIV (1875/1948)]. Subsequent research in both fields focused on the frequency interval that separates these two spectral components, in an attempt to determine the size of the bandwidth for spectral integration to occur. This talk will review the accumulated evidence for and against spectral integration within the hypothesized limit of 3.5 Bark for static and dynamic signals in speech perception and psychoacoustics. Based on similarities in the processing of speech and non-speech sounds, it is suggested that spectral integration may reflect a general property of the auditory system. A larger frequency bandwidth, possibly close to 3.5 Bark, may be utilized in integrating acoustic information, including speech, complex signals, or sound quality of a violin.
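The 3.5 Bark limit mentioned above is stated on the Bark critical-band scale, so checking whether two spectral components fall within it requires converting their frequencies from hertz. The sketch below uses Traunmüller's (1990) approximation, which is an assumption here since the abstract does not name a conversion formula. The function names and the example frequencies are also illustrative.

```python
def hz_to_bark(freq_hz):
    """Approximate critical-band rate in Bark (Traunmueller, 1990).

    The low- and high-frequency end corrections of the original formula are
    omitted, so values below about 2 Bark or above about 20 Bark are rough.
    """
    return 26.81 * freq_hz / (1960.0 + freq_hz) - 0.53

def within_integration_band(f1_hz, f2_hz, limit_bark=3.5):
    """True if two components lie closer together than `limit_bark`."""
    return abs(hz_to_bark(f1_hz) - hz_to_bark(f2_hz)) < limit_bark

# Hypothetical formant pairs: 300 and 600 Hz are about 2.7 Bark apart,
# whereas 300 and 800 Hz are about 4.2 Bark apart.
close_pair = within_integration_band(300.0, 600.0)
far_pair = within_integration_band(300.0, 800.0)
```

Under this approximation the first pair would fall inside the hypothesized integration bandwidth and the second would not.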
The influence of environmental sound training on the perception of spectrally degraded speech and environmental sounds.

Science.gov (United States)

Shafiro, Valeriy; Sheft, Stanley; Gygi, Brian; Ho, Kim Thien N

2012-06-01

Perceptual training with spectrally degraded environmental sounds results in improved environmental sound identification, with benefits shown to extend to untrained speech perception as well. The present study extended those findings to examine longer-term training effects as well as effects of mere repeated exposure to sounds over time. Participants received two pretests (1 week apart) prior to a week-long environmental sound training regimen, which was followed by two posttest sessions, separated by another week without training. Spectrally degraded stimuli, processed with a four-channel vocoder, consisted of a 160-item environmental sound test, word and sentence tests, and a battery of basic auditory abilities and cognitive tests. Results indicated significant improvements in all speech and environmental sound scores between the initial pretest and the last posttest with performance increments following both exposure and training. For environmental sounds (the stimulus class that was trained), the magnitude of positive change that accompanied training was much greater than that due to exposure alone, with improvement for untrained sounds roughly comparable to the speech benefit from exposure. Additional tests of auditory and cognitive abilities showed that speech and environmental sound performance were differentially correlated with tests of spectral and temporal-fine-structure processing, whereas working memory and executive function were correlated with speech, but not environmental sound perception. These findings indicate generalizability of environmental sound training and provide a basis for implementing environmental sound training programs for cochlear implant (CI) patients.
Effects of sounds of locomotion on speech perception

Directory of Open Access Journals (Sweden)

Matz Larsson

2015-01-01

Human locomotion typically creates noise, a possible consequence of which is the masking of sound signals originating in the surroundings. When walking side by side, people often subconsciously synchronize their steps. The neurophysiological and evolutionary background of this behavior is unclear. The present study investigated the potential of sound created by walking to mask perception of speech and compared the masking produced by walking in step with that produced by unsynchronized walking. The masking sound (footsteps on gravel) and the target sound (speech) were presented through the same speaker to 15 normal-hearing subjects. The original recorded walking sound was modified to mimic the sound of two individuals walking in pace or walking out of synchrony. The participants were instructed to adjust the sound level of the target sound until they could just comprehend the speech signal ("just follow conversation" or JFC level) when presented simultaneously with synchronized or unsynchronized walking sound at 40 dBA, 50 dBA, 60 dBA, or 70 dBA. Synchronized walking sounds produced slightly less masking of speech than did unsynchronized sound. The median JFC threshold in the synchronized condition was 38.5 dBA, while the corresponding value for the unsynchronized condition was 41.2 dBA. Combined results at all sound pressure levels showed an improvement in the signal-to-noise ratio (SNR) for synchronized footsteps; the median difference was 2.7 dB and the mean difference was 1.2 dB [P < 0.001, repeated-measures analysis of variance (RM-ANOVA)]. The difference was significant for masker levels of 50 dBA and 60 dBA, but not for 40 dBA or 70 dBA. This study provides evidence that synchronized walking may reduce the masking potential of footsteps.
Interventions for Speech Sound Disorders in Children

Science.gov (United States)

Williams, A. Lynn, Ed.; McLeod, Sharynne, Ed.; McCauley, Rebecca J., Ed.

2010-01-01

With detailed discussion and invaluable video footage of 23 treatment interventions for speech sound disorders (SSDs) in children, this textbook and DVD set should be part of every speech-language pathologist's professional preparation. Focusing on children with functional or motor-based speech disorders from early childhood through the early…
Perceptual sensitivity to spectral properties of earlier sounds during speech categorization.

Science.gov (United States)

Stilp, Christian E; Assgari, Ashley A

2018-02-28

Speech perception is heavily influenced by surrounding sounds. When spectral properties differ between earlier (context) and later (target) sounds, this can produce spectral contrast effects (SCEs) that bias perception of later sounds. For example, when context sounds have more energy in low-F1 frequency regions, listeners report more high-F1 responses to a target vowel, and vice versa. SCEs have been reported using various approaches for a wide range of stimuli, but most often, large spectral peaks were added to the context to bias speech categorization. This obscures the lower limit of perceptual sensitivity to spectral properties of earlier sounds, i.e., when SCEs begin to bias speech categorization. Listeners categorized vowels (/ɪ/-/ɛ/, Experiment 1) or consonants (/d/-/g/, Experiment 2) following a context sentence with little spectral amplification (+1 to +4 dB) in frequency regions known to produce SCEs. In both experiments, +3 and +4 dB amplification in key frequency regions of the context produced SCEs, but lesser amplification was insufficient to bias performance. This establishes a lower limit of perceptual sensitivity where spectral differences across sounds can bias subsequent speech categorization. These results are consistent with proposed adaptation-based mechanisms that potentially underlie SCEs in auditory perception. Recent sounds can change what speech sounds we hear later. This can occur when the average frequency composition of earlier sounds differs from that of later sounds, biasing how they are perceived. These "spectral contrast effects" are widely observed when sounds' frequency compositions differ substantially. We reveal the lower limit of these effects, as +3 dB amplification of key frequency regions in earlier sounds was enough to bias categorization of the following vowel or consonant sound. Speech categorization being biased by very small spectral differences across sounds suggests that spectral contrast effects occur
Emergence of category-level sensitivities in non-native speech sound learning

Directory of Open Access Journals (Sweden)

Emily Myers

2014-08-01

Over the course of development, speech sounds that are contrastive in one’s native language tend to become perceived categorically: that is, listeners are unaware of variation within phonetic categories while showing excellent sensitivity to speech sounds that span linguistically meaningful phonetic category boundaries. The end stage of this developmental process is that the perceptual systems that handle acoustic-phonetic information show special tuning to native language contrasts, and as such, category-level information appears to be present at even fairly low levels of the neural processing stream. Research on adults acquiring non-native speech categories offers an avenue for investigating the interplay of category-level information and perceptual sensitivities to these sounds as speech categories emerge. In particular, one can observe the neural changes that unfold as listeners learn not only to perceive acoustic distinctions that mark non-native speech sound contrasts, but also to map these distinctions onto category-level representations. An emergent literature on the neural basis of novel and non-native speech sound learning offers new insight into this question. In this review, I will examine this literature in order to answer two key questions. First, where in the neural pathway does sensitivity to category-level phonetic information first emerge over the trajectory of speech sound learning? Second, how do frontal and temporal brain areas work in concert over the course of non-native speech sound learning? Finally, in the context of this literature I will describe a model of speech sound learning in which rapidly-adapting access to categorical information in the frontal lobes modulates the sensitivity of stable, slowly-adapting responses in the temporal lobes.
Precision of working memory for speech sounds.

Science.gov (United States)

Joseph, Sabine; Iverson, Paul; Manohar, Sanjay; Fox, Zoe; Scott, Sophie K; Husain, Masud

2015-01-01

Memory for speech sounds is a key component of models of verbal working memory (WM). But how good is verbal WM? Most investigations assess this using binary report measures to derive a fixed number of items that can be stored. However, recent findings in visual WM have challenged such "quantized" views by employing measures of recall precision with an analogue response scale. WM for speech sounds might rely on both continuous and categorical storage mechanisms. Using a novel speech matching paradigm, we measured WM recall precision for phonemes. Vowel qualities were sampled from a formant space continuum. A probe vowel had to be adjusted to match the vowel quality of a target on a continuous, analogue response scale. Crucially, this provided an index of the variability of a memory representation around its true value and thus allowed us to estimate how memories were distorted from the original sounds. Memory load affected the quality of speech sound recall in two ways. First, there was a gradual decline in recall precision with increasing number of items, consistent with the view that WM representations of speech sounds become noisier with an increase in the number of items held in memory, just as for vision. Based on multidimensional scaling (MDS), the level of noise appeared to be reflected in distortions of the formant space. Second, as memory load increased, there was evidence of greater clustering of participants' responses around particular vowels. A mixture model captured both continuous and categorical responses, demonstrating a shift from continuous to categorical memory with increasing WM load. This suggests that direct acoustic storage can be used for single items, but when more items must be stored, categorical representations must be used.
Sensorimotor influences on speech perception in infancy.

Science.gov (United States)

Bruderer, Alison G; Danielson, D Kyle; Kandhadai, Padmapriya; Werker, Janet F

2015-11-03

The influence of speech production on speech perception is well established in adults. However, because adults have a long history of both perceiving and producing speech, the extent to which the perception-production linkage is due to experience is unknown. We addressed this issue by asking whether articulatory configurations can influence infants' speech perception performance. To eliminate influences from specific linguistic experience, we studied preverbal, 6-mo-old infants and tested the discrimination of a nonnative, and hence never-before-experienced, speech sound distinction. In three experimental studies, we used teething toys to control the position and movement of the tongue tip while the infants listened to the speech sounds. Using ultrasound imaging technology, we verified that the teething toys consistently and effectively constrained the movement and positioning of infants' tongues. With a looking-time procedure, we found that temporarily restraining infants' articulators impeded their discrimination of a nonnative consonant contrast but only when the relevant articulator was selectively restrained to prevent the movements associated with producing those sounds. Our results provide striking evidence that even before infants speak their first words and without specific listening experience, sensorimotor information from the articulators influences speech perception. These results transform theories of speech perception by suggesting that even at the initial stages of development, oral-motor movements influence speech sound discrimination. Moreover, an experimentally induced "impairment" in articulator movement can compromise speech perception performance, raising the question of whether long-term oral-motor impairments may impact perceptual development.
Sound frequency affects speech emotion perception: results from congenital amusia.

Science.gov (United States)

Lolli, Sydney L; Lewenstein, Ari D; Basurto, Julian; Winnik, Sean; Loui, Psyche

2015-01-01

Congenital amusics, or "tone-deaf" individuals, show difficulty in perceiving and producing small pitch differences. While amusia has marked effects on music perception, its impact on speech perception is less clear. Here we test the hypothesis that individual differences in pitch perception affect judgment of emotion in speech, by applying low-pass filters to spoken statements of emotional speech. A norming study was first conducted on Mechanical Turk to ensure that the intended emotions from the Macquarie Battery for Evaluation of Prosody were reliably identifiable by US English speakers. The most reliably identified emotional speech samples were used in Experiment 1, in which subjects performed a psychophysical pitch discrimination task, and an emotion identification task under low-pass and unfiltered speech conditions. Results showed a significant correlation between pitch-discrimination threshold and emotion identification accuracy for low-pass filtered speech, with amusics (defined here as those with a pitch discrimination threshold >16 Hz) performing worse than controls. This relationship with pitch discrimination was not seen in unfiltered speech conditions. Given the dissociation between low-pass filtered and unfiltered speech conditions, we inferred that amusics may be compensating for poorer pitch perception by using speech cues that are filtered out in this manipulation. To assess this potential compensation, Experiment 2 was conducted using high-pass filtered speech samples intended to isolate non-pitch cues. No significant correlation was found between pitch discrimination and emotion identification accuracy for high-pass filtered speech. Results from these experiments suggest an influence of low frequency information in identifying emotional content of speech.
A homology sound-based algorithm for speech signal interference

Science.gov (United States)

Jiang, Yi-jiao; Chen, Hou-jin; Li, Ju-peng; Zhang, Zhan-song

2015-12-01

Aiming at secure analog speech communication, this paper proposes a homology sound-based algorithm for speech signal interference. We first split the speech signal into phonetic fragments using a short-term energy method and build an interference noise cache library from those fragments. We then implement the homology sound interference by mixing randomly selected interferential fragments with the original speech in real time. Computer simulation results indicated that, compared with traditional noise interference methods such as white noise interference, the interference produced by this algorithm has the advantages of real-time operation, randomness, and high correlation with the original signal. After further study, the proposed algorithm may be readily used in secure speech communication.
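The two steps described above (energy-based fragmentation, then mixing randomly chosen fragments back into the speech) can be sketched roughly as follows. This is only an illustration under stated assumptions, not the authors' implementation: the abstract gives no frame length, threshold, or mixing gain, so `frame_len`, `threshold`, `gain`, and all function names here are hypothetical, and non-overlapping frames are used for simplicity.

```python
import random


def short_term_energy(signal, frame_len):
    """Sum of squared samples for each non-overlapping frame."""
    return [sum(x * x for x in signal[i:i + frame_len])
            for i in range(0, len(signal) - frame_len + 1, frame_len)]


def split_fragments(signal, frame_len, threshold):
    """Group consecutive frames whose energy exceeds threshold.

    Assumption: a fixed energy threshold separates speech from pauses.
    """
    fragments, current = [], []
    for k, energy in enumerate(short_term_energy(signal, frame_len)):
        frame = signal[k * frame_len:(k + 1) * frame_len]
        if energy > threshold:
            current.extend(frame)
        elif current:
            fragments.append(current)
            current = []
    if current:
        fragments.append(current)
    return fragments


def mix_interference(signal, library, gain, rng):
    """Add randomly chosen library fragments, end to end, onto signal."""
    out = list(signal)
    pos = 0
    while pos < len(out):
        fragment = rng.choice(library)
        for j, x in enumerate(fragment):
            if pos + j >= len(out):
                break
            out[pos + j] += gain * x
        pos += len(fragment)
    return out


if __name__ == "__main__":
    # Toy signal: two bursts separated by silence (not real speech).
    speech = [0] * 4 + [1, -1, 1, -1] + [0] * 4 + [0.5] * 4
    library = split_fragments(speech, frame_len=4, threshold=0.5)
    jammed = mix_interference(speech, library, gain=1.0, rng=random.Random(0))
    print(len(library), len(jammed))
```

Because the interfering fragments are cut from the same speaker's own signal, they share its spectral character, which is presumably what the abstract means by high correlation with the original signal.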
The influence of (central) auditory processing disorder in speech sound disorders.

Science.gov (United States)

Barrozo, Tatiane Faria; Pagan-Neves, Luciana de Oliveira; Vilela, Nadia; Carvallo, Renata Mota Mamede; Wertzner, Haydée Fiszbein

2016-01-01

Considering the importance of auditory information for the acquisition and organization of phonological rules, the assessment of (central) auditory processing contributes to both the diagnosis and targeting of speech therapy in children with speech sound disorders. The aim was to study phonological measures and (central) auditory processing of children with speech sound disorder. This was a clinical and experimental study of 21 subjects with speech sound disorder aged between 7.0 and 9.11 years, divided into two groups according to their (central) auditory processing disorder. The assessment comprised tests of phonology, speech inconsistency, and metalinguistic abilities. The group with (central) auditory processing disorder demonstrated greater severity of speech sound disorder. The cutoff value obtained for the process density index was the one that best characterized the occurrence of phonological processes for children above 7 years of age. The comparison of test results between the two groups showed differences in some phonological and metalinguistic abilities. Children with an index value above 0.54 demonstrated strong tendencies towards presenting a (central) auditory processing disorder, and this measure was effective in indicating the need for evaluation in children with speech sound disorder. Copyright © 2015 Associação Brasileira de Otorrinolaringologia e Cirurgia Cérvico-Facial. Published by Elsevier Editora Ltda. All rights reserved.
Sound frequency affects speech emotion perception: Results from congenital amusia

Directory of Open Access Journals (Sweden)

Sydney Lolli

2015-09-01

Congenital amusics, or tone-deaf individuals, show difficulty in perceiving and producing small pitch differences. While amusia has marked effects on music perception, its impact on speech perception is less clear. Here we test the hypothesis that individual differences in pitch perception affect judgment of emotion in speech, by applying band-pass filters to spoken statements of emotional speech. A norming study was first conducted on Mechanical Turk to ensure that the intended emotions from the Macquarie Battery for Evaluation of Prosody (MBEP) were reliably identifiable by US English speakers. The most reliably identified emotional speech samples were used in Experiment 1, in which subjects performed a psychophysical pitch discrimination task, and an emotion identification task under band-pass and unfiltered speech conditions. Results showed a significant correlation between pitch discrimination threshold and emotion identification accuracy for band-pass filtered speech, with amusics (defined here as those with a pitch discrimination threshold > 16 Hz) performing worse than controls. This relationship with pitch discrimination was not seen in unfiltered speech conditions. Given the dissociation between band-pass filtered and unfiltered speech conditions, we inferred that amusics may be compensating for poorer pitch perception by using speech cues that are filtered out in this manipulation.
Cross-Modal Correspondence between Brightness and Chinese Speech Sound with Aspiration

Directory of Open Access Journals (Sweden)

Sachiko Hirata

2011-10-01

Phonetic symbolism is the phenomenon of speech sounds evoking images based on sensory experiences; it is often discussed with cross-modal correspondence. By using Garner's task, Hirata, Kita, and Ukita (2009) showed the cross-modal congruence between brightness and voiced/voiceless consonants in Japanese speech sound, which is known as phonetic symbolism. In the present study, we examined the effect of the meaning of mimetics (lexical words whose sound reflects their meaning, like “ding-dong”) in the Japanese language on the cross-modal correspondence. We conducted an experiment with Chinese speech sounds with or without aspiration, using Chinese participants. Chinese vocabulary also contains mimetics, but the presence of aspiration does not relate to the meaning of Chinese mimetics. As a result, Chinese speech sounds with aspiration, which resemble voiceless consonants, were matched with white color, whereas those without aspiration were matched with black. This result is identical to the pattern found in Japanese people and consequently suggests that cross-modal correspondence occurs without the effect of the meaning of mimetics. Whether these cross-modal correspondences are based purely on physical properties of speech sound or are affected by phonetic properties remains a question for further study.
The effects of bilingualism on children's perception of speech sounds

NARCIS (Netherlands)

Brasileiro, I.

2009-01-01

The general topic addressed by this dissertation is that of bilingualism, and more specifically, the topic of bilingual acquisition of speech sounds. The central question in this study is the following: does bilingualism affect children’s perceptual development of speech sounds? The term bilingual
Auditory spatial attention to speech and complex non-speech sounds in children with autism spectrum disorder.

Science.gov (United States)

Soskey, Laura N; Allen, Paul D; Bennetto, Loisa

2017-08-01

One of the earliest observable impairments in autism spectrum disorder (ASD) is a failure to orient to speech and other social stimuli. Auditory spatial attention, a key component of orienting to sounds in the environment, has been shown to be impaired in adults with ASD. Additionally, specific deficits in orienting to social sounds could be related to increased acoustic complexity of speech. We aimed to characterize auditory spatial attention in children with ASD and neurotypical controls, and to determine the effect of auditory stimulus complexity on spatial attention. In a spatial attention task, target and distractor sounds were played randomly in rapid succession from speakers in a free-field array. Participants attended to a central or peripheral location, and were instructed to respond to target sounds at the attended location while ignoring nearby sounds. Stimulus-specific blocks evaluated spatial attention for simple non-speech tones, speech sounds (vowels), and complex non-speech sounds matched to vowels on key acoustic properties. Children with ASD had significantly more diffuse auditory spatial attention than neurotypical children when attending front, indicated by increased responding to sounds at adjacent non-target locations. No significant differences in spatial attention emerged based on stimulus complexity. Additionally, in the ASD group, more diffuse spatial attention was associated with more severe ASD symptoms but not with general inattention symptoms. Spatial attention deficits have important implications for understanding social orienting deficits and atypical attentional processes that contribute to core deficits of ASD. Autism Res 2017, 10: 1405-1416. © 2017 International Society for Autism Research, Wiley Periodicals, Inc.
Facilitated auditory detection for speech sounds

Directory of Open Access Journals (Sweden)

Carine Signoret

2011-07-01

Although it is well known that knowledge facilitates higher cognitive functions, such as visual and auditory word recognition, little is known about the influence of knowledge on detection, particularly in the auditory modality. Our study tested the influence of phonological and lexical knowledge on auditory detection. Words, pseudo-words and complex non-phonological sounds, energetically matched as closely as possible, were presented at a range of presentation levels from sub-threshold to clearly audible. The participants performed a detection task (Experiments 1 and 2) that was followed by a two-alternative forced-choice recognition task in Experiment 2. The results of this second task in Experiment 2 suggest a correct recognition of words in the absence of detection with a subjective threshold approach. In the detection task of both experiments, phonological stimuli (words and pseudo-words) were better detected than non-phonological stimuli (complex sounds), presented close to the auditory threshold. This finding suggests an advantage of speech for signal detection. An additional advantage of words over pseudo-words was observed in Experiment 2, suggesting that lexical knowledge could also improve auditory detection when listeners had to recognize the stimulus in a subsequent task. Two simulations of detection performance performed on the sound signals confirmed that the advantage of speech over non-speech processing could not be attributed to energetic differences in the stimuli.
Speech Abilities in Preschool Children with Speech Sound Disorder with and without Co-Occurring Language Impairment

Science.gov (United States)

Macrae, Toby; Tyler, Ann A.

2014-01-01

Purpose: The authors compared preschool children with co-occurring speech sound disorder (SSD) and language impairment (LI) to children with SSD only in their numbers and types of speech sound errors. Method: In this post hoc quasi-experimental study, independent samples t tests were used to compare the groups in the standard score from different…
Speech-discrimination scores modeled as a binomial variable.

Science.gov (United States)

Thornton, A R; Raffin, M J

1978-09-01

Many studies have reported variability data for tests of speech discrimination, and the disparate results of these studies have not been given a simple explanation. Arguments over the relative merits of 25- vs 50-word tests have ignored the basic mathematical properties inherent in the use of percentage scores. The present study models performance on clinical tests of speech discrimination as a binomial variable. A binomial model was developed, and some of its characteristics were tested against data from 4120 scores obtained on the CID Auditory Test W-22. A table for determining significant deviations between scores was generated and compared to observed differences in half-list scores for the W-22 tests. Good agreement was found between predicted and observed values. Implications of the binomial characteristics of speech-discrimination scores are discussed.
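The core argument above, that a percentage score from an n-word list behaves like a binomial proportion, can be illustrated with a short sketch. It shows only the textbook binomial standard deviation and an exact central range of scores; it does not reproduce the authors' published table of significant differences, and the function names are hypothetical.

```python
from math import comb, sqrt


def score_sd(p, n_words):
    """Standard deviation (in percentage points) of a percent-correct
    score when each of n_words items is correct with probability p."""
    return 100.0 * sqrt(p * (1.0 - p) / n_words)


def score_interval(p, n_words, coverage=0.95):
    """Central range of percent scores under Binomial(n_words, p).

    Each tail outside the returned range holds at most (1 - coverage) / 2
    of the probability mass.
    """
    tail = (1.0 - coverage) / 2.0
    pmf = [comb(n_words, k) * p ** k * (1.0 - p) ** (n_words - k)
           for k in range(n_words + 1)]
    cum, low = 0.0, 0
    for k in range(n_words + 1):
        cum += pmf[k]
        if cum > tail:
            low = k
            break
    cum, high = 0.0, n_words
    for k in range(n_words, -1, -1):
        cum += pmf[k]
        if cum > tail:
            high = k
            break
    return 100.0 * low / n_words, 100.0 * high / n_words


if __name__ == "__main__":
    # Compare a 25-word and a 50-word list for a listener at 50% true ability.
    for n in (25, 50):
        print(n, round(score_sd(0.5, n), 2), score_interval(0.5, n))
```

Under this model, doubling the list from 25 to 50 words shrinks the standard deviation only by a factor of √2, which is why percentage scores from short lists vary so widely on retest.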

Assessment of Spectral and Temporal Resolution in Cochlear Implant Users Using Psychoacoustic Discrimination and Speech Cue Categorization.

Science.gov (United States)

Winn, Matthew B; Won, Jong Ho; Moon, Il Joon

This study was conducted to measure auditory perception by cochlear implant users in the spectral and temporal domains, using tests of either categorization (using speech-based cues) or discrimination (using conventional psychoacoustic tests). The authors hypothesized that traditional nonlinguistic tests assessing spectral and temporal auditory resolution would correspond to speech-based measures assessing specific aspects of phonetic categorization assumed to depend on spectral and temporal auditory resolution. The authors further hypothesized that speech-based categorization performance would ultimately be a superior predictor of speech recognition performance, because of the fundamental nature of speech recognition as categorization. Nineteen cochlear implant listeners and 10 listeners with normal hearing participated in a suite of tasks that included spectral ripple discrimination, temporal modulation detection, and syllable categorization, which was split into a spectral cue-based task (targeting the /ba/-/da/ contrast) and a timing cue-based task (targeting the /b/-/p/ and /d/-/t/ contrasts). Speech sounds were manipulated to contain specific spectral or temporal modulations (formant transitions or voice onset time, respectively) that could be categorized. Categorization responses were quantified using logistic regression to assess perceptual sensitivity to acoustic phonetic cues. Word recognition testing was also conducted for cochlear implant listeners. Cochlear implant users were generally less successful at utilizing both spectral and temporal cues for categorization compared with listeners with normal hearing. For the cochlear implant listener group, spectral ripple discrimination was significantly correlated with the categorization of formant transitions; both were correlated with better word recognition. Temporal modulation detection using 100- and 10-Hz-modulated noise was not correlated either with the cochlear implant subjects' categorization of
A Diagnostic Marker to Discriminate Childhood Apraxia of Speech from Speech Delay: Introduction

Science.gov (United States)

Shriberg, Lawrence D.; Strand, Edythe A.; Fourakis, Marios; Jakielski, Kathy J.; Hall, Sheryl D.; Karlsson, Heather B.; Mabie, Heather L.; McSweeny, Jane L.; Tilkens, Christie M.; Wilson, David L.

2017-01-01

Purpose: The goal of this article is to introduce the pause marker (PM), a single-sign diagnostic marker proposed to discriminate early or persistent childhood apraxia of speech (CAS) from speech delay.
A Pilot Investigation of Speech Sound Disorder Intervention Delivered by Telehealth to School-Age Children

Directory of Open Access Journals (Sweden)

Sue Grogan-Johnson

2011-05-01

This article describes a school-based telehealth service delivery model and reports outcomes made by school-age students with speech sound disorders in a rural Ohio school district. Speech therapy using computer-based speech sound intervention materials was provided either by live interactive videoconferencing (telehealth) or conventional side-by-side intervention. Progress was measured using pre- and post-intervention scores on the Goldman Fristoe Test of Articulation-2 (Goldman & Fristoe, 2002). Students in both service delivery models made significant improvements in speech sound production, with students in the telehealth condition demonstrating greater mastery of their Individual Education Plan (IEP) goals. Live interactive videoconferencing thus appears to be a viable method for delivering intervention for speech sound disorders to children in a rural, public school setting. Keywords: Telehealth, telerehabilitation, videoconferencing, speech sound disorder, speech therapy, speech-language pathology; E-Helper
Visual speech alters the discrimination and identification of non-intact auditory speech in children with hearing loss.

Science.gov (United States)

Jerger, Susan; Damian, Markus F; McAlpine, Rachel P; Abdi, Hervé

2017-03-01

Understanding spoken language is an audiovisual event that depends critically on the ability to discriminate and identify phonemes yet we have little evidence about the role of early auditory experience and visual speech on the development of these fundamental perceptual skills. Objectives of this research were to determine 1) how visual speech influences phoneme discrimination and identification; 2) whether visual speech influences these two processes in a like manner, such that discrimination predicts identification; and 3) how the degree of hearing loss affects this relationship. Such evidence is crucial for developing effective intervention strategies to mitigate the effects of hearing loss on language development. Participants were 58 children with early-onset sensorineural hearing loss (CHL, 53% girls, M = 9;4 yrs) and 58 children with normal hearing (CNH, 53% girls, M = 9;4 yrs). Test items were consonant-vowel (CV) syllables and nonwords with intact visual speech coupled to non-intact auditory speech (excised onsets) as, for example, an intact consonant/rhyme in the visual track (Baa or Baz) coupled to non-intact onset/rhyme in the auditory track (/-B/aa or /-B/az). The items started with an easy-to-speechread /B/ or difficult-to-speechread /G/ onset and were presented in the auditory (static face) vs. audiovisual (dynamic face) modes. We assessed discrimination for intact vs. non-intact different pairs (e.g., Baa:/-B/aa). We predicted that visual speech would cause the non-intact onset to be perceived as intact and would therefore generate more same (as opposed to different) responses in the audiovisual than auditory mode. We assessed identification by repetition of nonwords with non-intact onsets (e.g., /-B/az). We predicted that visual speech would cause the non-intact onset to be perceived as intact and would therefore generate more Baz (as opposed to az) responses in the audiovisual than auditory mode. Performance in the audiovisual mode showed more same
Visual Speech Alters the Discrimination and Identification of Non-Intact Auditory Speech in Children with Hearing Loss

Science.gov (United States)

Jerger, Susan; Damian, Markus F.; McAlpine, Rachel P.; Abdi, Hervé

2017-01-01

Objectives: Understanding spoken language is an audiovisual event that depends critically on the ability to discriminate and identify phonemes, yet we have little evidence about the role of early auditory experience and visual speech on the development of these fundamental perceptual skills. Objectives of this research were to determine 1) how visual speech influences phoneme discrimination and identification; 2) whether visual speech influences these two processes in a like manner, such that discrimination predicts identification; and 3) how the degree of hearing loss affects this relationship. Such evidence is crucial for developing effective intervention strategies to mitigate the effects of hearing loss on language development. Methods: Participants were 58 children with early-onset sensorineural hearing loss (CHL, 53% girls, M = 9;4 yrs) and 58 children with normal hearing (CNH, 53% girls, M = 9;4 yrs). Test items were consonant-vowel (CV) syllables and nonwords with intact visual speech coupled to non-intact auditory speech (excised onsets) as, for example, an intact consonant/rhyme in the visual track (Baa or Baz) coupled to non-intact onset/rhyme in the auditory track (/–B/aa or /–B/az). The items started with an easy-to-speechread /B/ or difficult-to-speechread /G/ onset and were presented in the auditory (static face) vs. audiovisual (dynamic face) modes. We assessed discrimination for intact vs. non-intact different pairs (e.g., Baa:/–B/aa). We predicted that visual speech would cause the non-intact onset to be perceived as intact and would therefore generate more same (as opposed to different) responses in the audiovisual than auditory mode. We assessed identification by repetition of nonwords with non-intact onsets (e.g., /–B/az). We predicted that visual speech would cause the non-intact onset to be perceived as intact and would therefore generate more Baz (as opposed to az) responses in the audiovisual than auditory mode. Results
Dynamic Assessment of Phonological Awareness for Children with Speech Sound Disorders

Science.gov (United States)

Gillam, Sandra Laing; Ford, Mikenzi Bentley

2012-01-01

The current study was designed to examine the relationships between performance on a nonverbal phoneme deletion task administered in a dynamic assessment format with performance on measures of phoneme deletion, word-level reading, and speech sound production that required verbal responses for school-age children with speech sound disorders (SSDs).…
Visual feedback of tongue movement for novel speech sound learning

Directory of Open Access Journals (Sweden)

William F Katz

2015-11-01

Pronunciation training studies have yielded important information concerning the processing of audiovisual (AV) information. Second language (L2) learners show increased reliance on bottom-up, multimodal input for speech perception (compared to monolingual individuals). However, little is known about the role of viewing one’s own speech articulation processes during speech training. The current study investigated whether real-time, visual feedback for tongue movement can improve a speaker’s learning of non-native speech sounds. An interactive 3D tongue visualization system based on electromagnetic articulography (EMA) was used in a speech training experiment. Native speakers of American English produced a novel speech sound (/ɖ̠/; a voiced, coronal, palatal stop) before, during, and after trials in which they viewed their own speech movements using the 3D model. Talkers’ productions were evaluated using kinematic (tongue-tip spatial positioning) and acoustic (burst spectra) measures. The results indicated a rapid gain in accuracy associated with visual feedback training. The findings are discussed with respect to neural models for multimodal speech processing.
International aspirations for speech-language pathologists' practice with multilingual children with speech sound disorders: development of a position paper.

Science.gov (United States)

McLeod, Sharynne; Verdon, Sarah; Bowen, Caroline

2013-01-01

A major challenge for the speech-language pathology profession in many cultures is to address the mismatch between the "linguistic homogeneity of the speech-language pathology profession and the linguistic diversity of its clientele" (Caesar & Kohler, 2007, p. 198). This paper outlines the development of the Multilingual Children with Speech Sound Disorders: Position Paper created to guide speech-language pathologists' (SLPs') facilitation of multilingual children's speech. An international expert panel was assembled comprising 57 researchers (SLPs, linguists, phoneticians, and speech scientists) with knowledge about multilingual children's speech, or children with speech sound disorders. Combined, they had worked in 33 countries and used 26 languages in professional practice. Fourteen panel members met for a one-day workshop to identify key points for inclusion in the position paper. Subsequently, 42 additional panel members participated online to contribute to drafts of the position paper. A thematic analysis was undertaken of the major areas of discussion using two data sources: (a) face-to-face workshop transcript (133 pages) and (b) online discussion artifacts (104 pages). Finally, a moderator with international expertise in working with children with speech sound disorders facilitated the incorporation of the panel's recommendations. The following themes were identified: definitions, scope, framework, evidence, challenges, practices, and consideration of a multilingual audience. The resulting position paper contains guidelines for providing services to multilingual children with speech sound disorders (http://www.csu.edu.au/research/multilingual-speech/position-paper). The paper is structured using the International Classification of Functioning, Disability and Health: Children and Youth Version (World Health Organization, 2007) and incorporates recommendations for (a) children and families, (b) SLPs' assessment and intervention, (c) SLPs' professional
Anterior paracingulate and cingulate cortex mediates the effects of cognitive load on speech sound discrimination.

Science.gov (United States)

Gennari, Silvia P; Millman, Rebecca E; Hymers, Mark; Mattys, Sven L

2018-06-11

Perceiving speech while performing another task is a common challenge in everyday life. How the brain controls resource allocation during speech perception remains poorly understood. Using functional magnetic resonance imaging (fMRI), we investigated the effect of cognitive load on speech perception by examining brain responses of participants performing a phoneme discrimination task and a visual working memory task simultaneously. The visual task involved holding either a single meaningless image in working memory (low cognitive load) or four different images (high cognitive load). Performing the speech task under high load, compared to low load, resulted in decreased activity in pSTG/pMTG and increased activity in visual occipital cortex and two regions known to contribute to visual attention regulation: the superior parietal lobule (SPL) and the paracingulate and anterior cingulate gyrus (PaCG, ACG). Critically, activity in PaCG/ACG was correlated with performance in the visual task and with activity in pSTG/pMTG: increased activity in PaCG/ACG was observed for individuals with poorer visual performance and with decreased activity in pSTG/pMTG. Moreover, activity in a pSTG/pMTG seed region showed psychophysiological interactions with areas of the PaCG/ACG, with stronger interaction in the high-load than the low-load condition. These findings show that the acoustic analysis of speech is affected by the demands of a concurrent visual task and that the PaCG/ACG plays a role in allocating cognitive resources to concurrent auditory and visual information. Copyright © 2018. Published by Elsevier Inc.
Cognitive Bias for Learning Speech Sounds From a Continuous Signal Space Seems Nonlinguistic

Directory of Open Access Journals (Sweden)

Sabine van der Ham

2015-10-01

When learning language, humans have a tendency to produce more extreme distributions of speech sounds than those observed most frequently: In rapid, casual speech, vowel sounds are centralized, yet cross-linguistically, peripheral vowels occur almost universally. We investigate whether adults’ generalization behavior reveals selective pressure for communication when they learn skewed distributions of speech-like sounds from a continuous signal space. The domain-specific hypothesis predicts that the emergence of sound categories is driven by a cognitive bias to make these categories maximally distinct, resulting in more skewed distributions in participants’ reproductions. However, our participants showed more centered distributions, which goes against this hypothesis, indicating that there are no strong innate linguistic biases that affect learning these speech-like sounds. The centralization behavior can be explained by a lack of communicative pressure to maintain categories.
Binaural speech discrimination under noise in hearing-impaired listeners

Science.gov (United States)

Kumar, K. V.; Rao, A. B.

1988-01-01

This paper presents the results of an assessment of speech discrimination by hearing-impaired listeners (sensori-neural, conductive, and mixed groups) under binaural free-field listening in the presence of background noise. Subjects with pure-tone thresholds greater than 20 dB in 0.5, 1.0 and 2.0 kHz were presented with a version of the W-22 list of phonetically balanced words under three conditions: (1) 'quiet', with the chamber noise below 28 dB and speech at 60 dB; (2) at a constant S/N ratio of +10 dB, and with a background white noise at 70 dB; and (3) same as condition (2), but with the background noise at 80 dB. The mean speech discrimination scores decreased significantly with noise in all groups. However, the decrease in binaural speech discrimination scores with an increase in hearing impairment was less for material presented under the noise conditions than for the material presented in quiet.
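Note on the noise conditions above: fixing both the S/N ratio and the noise level also fixes the speech presentation level. The following is a minimal sketch of that arithmetic, assuming the S/N ratio in dB is defined as speech level minus noise level; the function and variable names are illustrative and do not come from the paper.

```python
def speech_level_db(noise_level_db: float, snr_db: float) -> float:
    """Speech level implied by a noise level and an S/N ratio (all in dB).

    Assumes S/N (dB) = speech level (dB) - noise level (dB).
    """
    return noise_level_db + snr_db


SNR_DB = 10.0  # constant S/N ratio stated for conditions (2) and (3)
noise_levels_db = {"condition 2": 70.0, "condition 3": 80.0}

# Speech level implied by each noise condition under the assumption above
implied_speech_db = {
    name: speech_level_db(noise, SNR_DB) for name, noise in noise_levels_db.items()
}

for name, level in implied_speech_db.items():
    print(f"{name}: implied speech level {level:.0f} dB")
```

Under this assumption the speech level rises with the noise level, so the two noise conditions differ in overall presentation level as well as in noise level.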
Developmental changes in brain activation involved in the production of novel speech sounds in children.

Science.gov (United States)

Hashizume, Hiroshi; Taki, Yasuyuki; Sassa, Yuko; Thyreau, Benjamin; Asano, Michiko; Asano, Kohei; Takeuchi, Hikaru; Nouchi, Rui; Kotozaki, Yuka; Jeong, Hyeonjeong; Sugiura, Motoaki; Kawashima, Ryuta

2014-08-01

Older children are more successful at producing unfamiliar, non-native speech sounds than younger children during the initial stages of learning. To reveal the neuronal underpinning of the age-related increase in the accuracy of non-native speech production, we examined the developmental changes in activation involved in the production of novel speech sounds using functional magnetic resonance imaging. Healthy right-handed children (aged 6-18 years) were scanned while performing an overt repetition task and a perceptual task involving aurally presented non-native and native syllables. Productions of non-native speech sounds were recorded and evaluated by native speakers. The mouth regions in the bilateral primary sensorimotor areas were activated more significantly during the repetition task relative to the perceptual task. The hemodynamic response in the left inferior frontal gyrus pars opercularis (IFG pOp) specific to non-native speech sound production (defined by prior hypothesis) increased with age. Additionally, the accuracy of non-native speech sound production increased with age. These results provide the first evidence of developmental changes in the neural processes underlying the production of novel speech sounds. Our data further suggest that the recruitment of the left IFG pOp during the production of novel speech sounds was possibly enhanced due to the maturation of the neuronal circuits needed for speech motor planning. This, in turn, would lead to improvement in the ability to immediately imitate non-native speech. Copyright © 2014 Wiley Periodicals, Inc.
Between-Word Simplification Patterns in the Continuous Speech of Children with Speech Sound Disorders

Science.gov (United States)

Klein, Harriet B.; Liu-Shea, May

2009-01-01

Purpose: This study was designed to identify and describe between-word simplification patterns in the continuous speech of children with speech sound disorders. It was hypothesized that word combinations would reveal phonological changes that were unobserved with single words, possibly accounting for discrepancies between the intelligibility of…
Intensive treatment with ultrasound visual feedback for speech sound errors in childhood apraxia

Directory of Open Access Journals (Sweden)

Jonathan L Preston

2016-08-01

Ultrasound imaging is an adjunct to traditional speech therapy that has been shown to be beneficial in the remediation of speech sound errors. Ultrasound biofeedback can be utilized during therapy to provide clients additional knowledge about their tongue shapes when attempting to produce sounds that are in error. The additional feedback may assist children with childhood apraxia of speech in stabilizing motor patterns, thereby facilitating more consistent and accurate productions of sounds and syllables. However, due to its specialized nature, ultrasound visual feedback is a technology that is not widely available to clients. Short-term intensive treatment programs are one option that can be utilized to expand access to ultrasound biofeedback. Schema-based motor learning theory suggests that short-term intensive treatment programs (massed practice) may assist children in acquiring more accurate motor patterns. In this case series, three participants aged 10-14 diagnosed with childhood apraxia of speech attended 16 hours of speech therapy over a two-week period to address residual speech sound errors. Two participants had distortions on rhotic sounds, while the third participant demonstrated lateralization of sibilant sounds. During therapy, cues were provided to assist participants in obtaining a tongue shape that facilitated a correct production of the sound in error. Additional practice without ultrasound was also included. Results suggested that all participants showed signs of acquisition of sounds in error. Generalization and retention results were mixed. One participant showed generalization and retention of sounds that were treated; one showed generalization but limited retention; and the third showed no evidence of generalization or retention. Individual characteristics that may facilitate generalization are discussed. Short-term intensive treatment programs using ultrasound biofeedback may result in the acquisition of more accurate motor
Sound quality measures for speech in noise through a commercial hearing aid implementing digital noise reduction.

Science.gov (United States)

Ricketts, Todd A; Hornsby, Benjamin W Y

2005-05-01

This brief report discusses the effect of digital noise reduction (DNR) processing on aided speech recognition and sound quality measures in 14 adults fitted with a commercial hearing aid. Measures of speech recognition and sound quality were obtained in two different speech-in-noise conditions (71 dBA speech, +6 dB SNR and 75 dBA speech, +1 dB SNR). The results revealed that the presence or absence of DNR processing did not impact speech recognition in noise (either positively or negatively). Paired comparisons of sound quality for the same speech-in-noise signals, however, revealed a strong preference for DNR processing. These data suggest that at least one implementation of DNR processing is capable of providing improved sound quality, for speech in noise, in the absence of improved speech recognition.
Office noise: Can headphones and masking sound attenuate distraction by background speech?

Science.gov (United States)

Jahncke, Helena; Björkeholm, Patrik; Marsh, John E; Odelius, Johan; Sörqvist, Patrik

2016-11-22

Background speech is one of the most disturbing noise sources at shared workplaces in terms of both annoyance and performance-related disruption. Therefore, it is important to identify techniques that can efficiently protect performance against distraction. It is also important that the techniques are perceived as satisfactory and are subjectively evaluated as effective in their capacity to reduce distraction. The aim of the current study was to compare three methods of attenuating distraction from background speech: masking a background voice with nature sound through headphones, masking a background voice with other voices through headphones, and merely wearing headphones (without masking) as a way to attenuate the background sound. Quiet was deployed as a baseline condition. Thirty students participated in an experiment employing a repeated measures design. Performance (serial short-term memory) was impaired by background speech (1 voice), but this impairment was attenuated when the speech was masked, particularly when it was masked by nature sound. Furthermore, perceived workload was lowest in the quiet condition and significantly higher in all other sound conditions. Notably, the headphones tested as a sound-attenuating device (i.e. without masking) did not protect against the effects of background speech on performance and subjective workload. Nature sound was the only masking condition that worked as a protector of performance, at least in the context of the serial recall task. However, despite the attenuation of distraction by nature sound, perceived workload was still high, suggesting that it is difficult to find a masker that is both effective and perceived as satisfactory.
Implications of diadochokinesia in children with speech sound disorder.

Science.gov (United States)

Wertzner, Haydée Fiszbein; Pagan-Neves, Luciana de Oliveira; Alves, Renata Ramos; Barrozo, Tatiane Faria

2013-01-01

To verify the performance of children with and without speech sound disorder in oral motor skills, measured by oral diadochokinesia, according to age and gender, and to compare the results obtained by two different methods of analysis. Participants were 72 subjects aged from 5 years to 7 years and 11 months divided into four subgroups according to the presence of speech sound disorder (Study Group and Control Group) and age (6 years and 5 months). Diadochokinesia skills were assessed by the repetition of the sequences 'pa', 'ta', 'ka' and 'pataka', measured both manually and by the software Motor Speech Profile® (MSP). Gender was statistically different for both groups, but it did not influence the number of sequences per second produced. A correlation between the number of sequences per second and age was observed for all sequences (except 'ka'), but only in the control group. Comparison between groups did not indicate differences between the number of sequences per second and age. The results showed strong agreement between the values of oral diadochokinesia measured manually and by MSP. This research demonstrated the importance of using different methods of analysis in the functional evaluation of oro-motor processing in children with speech sound disorder and provided evidence of oro-motor difficulties in children under eight years of age.
Role of the middle ear muscle apparatus in mechanisms of speech signal discrimination

Science.gov (United States)

Moroz, B. S.; Bazarov, V. G.; Sachenko, S. V.

1980-01-01

A method of impedance reflexometry was used to examine 101 students with hearing impairment in order to clarify the interrelation between speech discrimination and the state of the middle ear muscles. Ability to discriminate speech signals depends to some extent on the functional state of intraaural muscles. Speech discrimination was greatly impaired in the absence of stapedial muscle acoustic reflex, in the presence of low thresholds of stimulation and in very small values of reflex amplitude increase. Discrimination was not impeded in positive AR, high values of relative thresholds and normal increase of reflex amplitude in response to speech signals with augmenting intensity.
Speech-language pathologists' practices regarding assessment, analysis, target selection, intervention, and service delivery for children with speech sound disorders.

Science.gov (United States)

McLeod, Sharynne; Baker, Elise

2014-01-01

A survey of 231 Australian speech-language pathologists (SLPs) was undertaken to describe practices regarding assessment, analysis, target selection, intervention, and service delivery for children with speech sound disorders (SSD). The participants typically worked in private practice, education, or community health settings and 67.6% had a waiting list for services. For each child, most of the SLPs spent 10-40 min in pre-assessment activities, 30-60 min undertaking face-to-face assessments, and 30-60 min completing paperwork after assessments. During an assessment SLPs typically conducted a parent interview, single-word speech sampling, collected a connected speech sample, and used informal tests. They also determined children's stimulability and estimated intelligibility. With multilingual children, informal assessment procedures and English-only tests were commonly used and SLPs relied on family members or interpreters to assist. Common analysis techniques included determination of phonological processes, substitutions-omissions-distortions-additions (SODA), and phonetic inventory. Participants placed high priority on selecting target sounds that were stimulable, early developing, and in error across all word positions and 60.3% felt very confident or confident selecting an appropriate intervention approach. Eight intervention approaches were frequently used: auditory discrimination, minimal pairs, cued articulation, phonological awareness, traditional articulation therapy, auditory bombardment, Nuffield Centre Dyspraxia Programme, and core vocabulary. Children typically received individual therapy with an SLP in a clinic setting. Parents often observed and participated in sessions and SLPs typically included siblings and grandparents in intervention sessions. Parent training and home programs were more frequently used than the group therapy. Two-thirds kept up-to-date by reading journal articles monthly or every 6 months. There were many similarities with
Speech feature discrimination in deaf children following cochlear implantation

Science.gov (United States)

Bergeson, Tonya R.; Pisoni, David B.; Kirk, Karen Iler

2002-05-01

Speech feature discrimination is a fundamental perceptual skill that is often assumed to underlie word recognition and sentence comprehension performance. To investigate the development of speech feature discrimination in deaf children with cochlear implants, we conducted a retrospective analysis of results from the Minimal Pairs Test (Robbins et al., 1988) selected from patients enrolled in a longitudinal study of speech perception and language development. The MP test uses a 2AFC procedure in which children hear a word and select one of two pictures (bat-pat). All 43 children were prelingually deafened, received a cochlear implant before 6 years of age or between ages 6 and 9, and used either oral or total communication. Children were tested once every 6 months to 1 year for 7 years; not all children were tested at each interval. By 2 years postimplant, the majority of these children achieved near-ceiling levels of discrimination performance for vowel height, vowel place, and consonant manner. Most of the children also achieved plateaus but did not reach ceiling performance for consonant place and voicing. The relationship between speech feature discrimination, spoken word recognition, and sentence comprehension will be discussed. [Work supported by NIH/NIDCD Research Grant No. R01DC00064 and NIH/NIDCD Training Grant No. T32DC00012.]
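Note on the 2AFC procedure above: with two response alternatives, guessing alone yields 50% correct on average, so scores are usually read against that chance level. The sketch below shows one way to do this with an exact binomial tail probability. The trial count and score are hypothetical and are not taken from the study, and the helper name is illustrative.

```python
from math import comb


def p_at_least(k: int, n: int, p: float = 0.5) -> float:
    """Probability of k or more correct responses in n trials by guessing alone."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k, n + 1))


# Hypothetical child: 18 of 20 two-alternative trials correct (not study data)
n_trials, n_correct = 20, 18
percent_correct = 100 * n_correct / n_trials
p_guess = p_at_least(n_correct, n_trials)

print(f"{percent_correct:.0f}% correct; probability under guessing = {p_guess:.6f}")
```

A very small tail probability indicates that the score is unlikely to have arisen from guessing between the two pictures.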

Multilingual Aspects of Speech Sound Disorders in Children. Communication Disorders across Languages

Science.gov (United States)

McLeod, Sharynne; Goldstein, Brian

2012-01-01

Multilingual Aspects of Speech Sound Disorders in Children explores both multilingual and multicultural aspects of children with speech sound disorders. The 30 chapters have been written by 44 authors from 16 different countries about 112 languages and dialects. The book is designed to translate research into clinical practice. It is divided into…
Patterns and risk factors associated with speech sounds and language disorders in Pakistan

International Nuclear Information System (INIS)

Arshad, H.; Ghayas, M.S.; Madiha, A.

2013-01-01

Objectives: To observe the patterns of speech sounds and language disorders and to identify the associated risk factors. Background: Communication is the very essence of modern society. Communication disorders impact quality of life. Patterns and factors associated with speech sounds and language impairments were explored, and their association with different environmental factors was examined. Methodology: The study included 200 patients aged between two and sixteen years who presented at the speech therapy clinic, OPD, Mayo Hospital. A cross-sectional survey questionnaire assessed each patient's bio data, socioeconomic background, family history of communication disorders and bilingualism. It was a descriptive study conducted through a cross-sectional survey. Data were analysed with SPSS version 16. Results: Language disorders were relatively more prevalent in males than speech sound disorders. Bilingualism was found to have an insignificant effect on these disorders. Socioeconomic status and family history were significant risk factors. Conclusion: Gender, socioeconomic status and family history can act as risk factors for developing speech sounds and language disorders. There is a grave need to understand patterns of communication disorders in the light of Pakistani society and culture. Further studies are recommended to determine the risk factors and patterns of these impairments. (author)
Comparing Feedback Types in Multimedia Learning of Speech by Young Children With Common Speech Sound Disorders: Research Protocol for a Pretest Posttest Independent Measures Control Trial.

Science.gov (United States)

Doubé, Wendy; Carding, Paul; Flanagan, Kieran; Kaufman, Jordy; Armitage, Hannah

2018-01-01

Children with speech sound disorders benefit from feedback about the accuracy of sounds they make. Home practice can reinforce feedback received from speech pathologists. Games in mobile device applications could encourage home practice, but those currently available are of limited value because they are unlikely to elaborate "Correct"/"Incorrect" feedback with information that can assist in improving the accuracy of the sound. This protocol proposes a "Wizard of Oz" experiment that aims to provide evidence for the provision of effective multimedia feedback for speech sound development. Children with two common speech sound disorders will play a game on a mobile device and make speech sounds when prompted by the game. A human "Wizard" will provide feedback on the accuracy of the sound but the children will perceive the feedback as coming from the game. Groups of 30 young children will be randomly allocated to one of five conditions: four types of feedback and a control which does not play the game. The results of this experiment will inform not only speech sound therapy, but also other types of language learning, both in general, and in multimedia applications. This experiment is a cost-effective precursor to the development of a mobile application that employs pedagogically and clinically sound processes for speech development in young children.
A causal test of the motor theory of speech perception: a case of impaired speech production and spared speech perception.

Science.gov (United States)

Stasenko, Alena; Bonn, Cory; Teghipco, Alex; Garcea, Frank E; Sweet, Catherine; Dombovy, Mary; McDonough, Joyce; Mahon, Bradford Z

2015-01-01

The debate about the causal role of the motor system in speech perception has been reignited by demonstrations that motor processes are engaged during the processing of speech sounds. Here, we evaluate which aspects of auditory speech processing are affected, and which are not, in a stroke patient with dysfunction of the speech motor system. We found that the patient showed a normal phonemic categorical boundary when discriminating two non-words that differ by a minimal pair (e.g., ADA-AGA). However, using the same stimuli, the patient was unable to identify or label the non-word stimuli (using a button-press response). A control task showed that he could identify speech sounds by speaker gender, ruling out a general labelling impairment. These data suggest that while the motor system is not causally involved in perception of the speech signal, it may be used when other cues (e.g., meaning, context) are not available.
Comparing Feedback Types in Multimedia Learning of Speech by Young Children With Common Speech Sound Disorders: Research Protocol for a Pretest Posttest Independent Measures Control Trial

Science.gov (United States)

Doubé, Wendy; Carding, Paul; Flanagan, Kieran; Kaufman, Jordy; Armitage, Hannah

2018-01-01

Children with speech sound disorders benefit from feedback about the accuracy of sounds they make. Home practice can reinforce feedback received from speech pathologists. Games in mobile device applications could encourage home practice, but those currently available are of limited value because they are unlikely to elaborate “Correct”/”Incorrect” feedback with information that can assist in improving the accuracy of the sound. This protocol proposes a “Wizard of Oz” experiment that aims to provide evidence for the provision of effective multimedia feedback for speech sound development. Children with two common speech sound disorders will play a game on a mobile device and make speech sounds when prompted by the game. A human “Wizard” will provide feedback on the accuracy of the sound but the children will perceive the feedback as coming from the game. Groups of 30 young children will be randomly allocated to one of five conditions: four types of feedback and a control which does not play the game. The results of this experiment will inform not only speech sound therapy, but also other types of language learning, both in general, and in multimedia applications. This experiment is a cost-effective precursor to the development of a mobile application that employs pedagogically and clinically sound processes for speech development in young children. PMID:29674986
Comparing Feedback Types in Multimedia Learning of Speech by Young Children With Common Speech Sound Disorders: Research Protocol for a Pretest Posttest Independent Measures Control Trial

Directory of Open Access Journals (Sweden)

Wendy Doubé

2018-04-01

Children with speech sound disorders benefit from feedback about the accuracy of sounds they make. Home practice can reinforce feedback received from speech pathologists. Games in mobile device applications could encourage home practice, but those currently available are of limited value because they are unlikely to elaborate “Correct”/“Incorrect” feedback with information that can assist in improving the accuracy of the sound. This protocol proposes a “Wizard of Oz” experiment that aims to provide evidence for the provision of effective multimedia feedback for speech sound development. Children with two common speech sound disorders will play a game on a mobile device and make speech sounds when prompted by the game. A human “Wizard” will provide feedback on the accuracy of the sound but the children will perceive the feedback as coming from the game. Groups of 30 young children will be randomly allocated to one of five conditions: four types of feedback and a control which does not play the game. The results of this experiment will inform not only speech sound therapy, but also other types of language learning, both in general, and in multimedia applications. This experiment is a cost-effective precursor to the development of a mobile application that employs pedagogically and clinically sound processes for speech development in young children.
Nonlinear frequency compression: effects on sound quality ratings of speech and music.

Science.gov (United States)

Parsa, Vijay; Scollie, Susan; Glista, Danielle; Seelisch, Andreas

2013-03-01

Frequency lowering technologies offer an alternative amplification solution for severe to profound high frequency hearing losses. While frequency lowering technologies may improve audibility of high frequency sounds, the very nature of this processing can affect the perceived sound quality. This article reports the results from two studies that investigated the impact of a nonlinear frequency compression (NFC) algorithm on perceived sound quality. In the first study, the cutoff frequency and compression ratio parameters of the NFC algorithm were varied, and their effect on the speech quality was measured subjectively with 12 normal hearing adults, 12 normal hearing children, 13 hearing impaired adults, and 9 hearing impaired children. In the second study, 12 normal hearing and 8 hearing impaired adult listeners rated the quality of speech in quiet, speech in noise, and music after processing with a different set of NFC parameters. Results showed that the cutoff frequency parameter had more impact on sound quality ratings than the compression ratio, and that the hearing impaired adults were more tolerant of increased frequency compression than normal hearing adults. No statistically significant differences were found in the sound quality ratings of speech-in-noise and music stimuli processed through various NFC settings by hearing impaired listeners. These findings suggest that there may be an acceptable range of NFC settings for hearing impaired individuals where sound quality is not adversely affected. These results may assist an audiologist in clinical NFC hearing aid fittings for achieving a balance between high frequency audibility and sound quality.
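To make the roles of the two NFC parameters concrete, the following is a minimal sketch of one possible input-to-output frequency mapping. The abstract does not state the mapping used in the study; the log-domain rule below, and the names `nfc_map`, `cutoff_hz` and `ratio`, are assumptions chosen for illustration only.

```python
def nfc_map(f_in_hz: float, cutoff_hz: float, ratio: float) -> float:
    """Map an input frequency to an output frequency under an assumed
    log-domain nonlinear frequency compression rule (illustrative only)."""
    if cutoff_hz <= 0 or ratio < 1:
        raise ValueError("cutoff_hz must be positive and ratio must be >= 1")
    if f_in_hz <= cutoff_hz:
        # Frequencies at or below the cutoff are left unchanged.
        return f_in_hz
    # Above the cutoff, the distance from the cutoff on a log-frequency
    # axis is divided by the compression ratio.
    return cutoff_hz * (f_in_hz / cutoff_hz) ** (1.0 / ratio)


if __name__ == "__main__":
    for f in (1000.0, 2000.0, 4000.0, 8000.0):
        print(f, "->", round(nfc_map(f, cutoff_hz=2000.0, ratio=2.0), 1))
```

Under this assumed rule, lowering the cutoff widens the band that is altered, while raising the ratio squeezes that band more strongly, which is one way to read the finding that the cutoff had the larger effect on quality ratings.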
Experience with speech sounds is not necessary for cue trading by budgerigars (Melopsittacus undulatus).

Directory of Open Access Journals (Sweden)

Mary Flaherty

The influence of experience with human speech sounds on speech perception in budgerigars, vocal mimics whose speech exposure can be tightly controlled in a laboratory setting, was measured. Budgerigars were divided into groups that differed in auditory exposure and then tested on a cue-trading identification paradigm with synthetic speech. Phonetic cue trading is a perceptual phenomenon observed when changes on one cue dimension are offset by changes in another cue dimension while still maintaining the same phonetic percept. The current study examined whether budgerigars would trade the cues of voice onset time (VOT) and the first formant onset frequency when identifying syllable initial stop consonants and if this would be influenced by exposure to speech sounds. There were a total of four different exposure groups: No speech exposure (completely isolated), Passive speech exposure (regular exposure to human speech), and two Speech-trained groups. After the exposure period, all budgerigars were tested for phonetic cue trading using operant conditioning procedures. Birds were trained to peck keys in response to different synthetic speech sounds that began with "d" or "t" and varied in VOT and frequency of the first formant at voicing onset. Once training performance criteria were met, budgerigars were presented with the entire intermediate series, including ambiguous sounds. Responses on these trials were used to determine which speech cues were used, if a trading relation between VOT and the onset frequency of the first formant was present, and whether speech exposure had an influence on perception. Cue trading was found in all birds and these results were largely similar to those of a group of humans. Results indicated that prior speech experience was not a requirement for cue trading by budgerigars. The results are consistent with theories that explain phonetic cue trading in terms of a rich auditory encoding of the speech signal.
Differences in phonetic discrimination stem from differences in psychoacoustic abilities in learning the sounds of a second language: Evidence from ERP research.

Science.gov (United States)

Lin, Yi; Fan, Ruolin; Mo, Lei

2017-01-01

The scientific community has been divided as to the origin of individual differences in perceiving the sounds of a second language (L2). There are two alternative explanations: a general psychoacoustic origin vs. a speech-specific one. A previous study showed that such individual variability is linked to the perceivers' speech-specific capabilities, rather than the perceivers' psychoacoustic abilities. However, we assume that the selection of participants and the parameters of the sound stimuli might not have been appropriate. Therefore, we adjusted the sound stimuli and recorded event-related potentials (ERPs) from two groups of early, proficient Cantonese (L1)-Mandarin (L2) bilinguals who differed in their mastery of the Mandarin (L2) phonetic contrast /in-ing/, to explore whether the individual differences in perceiving L2 stem from participants' ability to discriminate various pure tones (frequency, duration and pattern). To precisely measure the participants' acoustic discrimination, mismatch negativity (MMN) elicited by the oddball paradigm was recorded in the experiment. The results showed that significant differences between good perceivers (GPs) and poor perceivers (PPs) were found in the three general acoustic conditions (frequency, duration and pattern), and the MMN amplitude for GPs was significantly larger than for PPs. Therefore, our results support a general psychoacoustic origin of individual variability in L2 phonetic mastery.
The Clinical Practice of Speech and Language Therapists with Children with Phonologically Based Speech Sound Disorders

Science.gov (United States)

Oliveira, Carla; Lousada, Marisa; Jesus, Luis M. T.

2015-01-01

Children with speech sound disorders (SSD) represent a large number of speech and language therapists' caseloads. The intervention with children who have SSD can involve different therapy approaches, and these may be articulatory or phonologically based. Some international studies reveal a widespread application of articulatory based approaches in…
The speech perception skills of children with and without speech sound disorder.

Science.gov (United States)

Hearnshaw, Stephanie; Baker, Elise; Munro, Natalie

To investigate whether Australian-English speaking children with and without speech sound disorder (SSD) differ in their overall speech perception accuracy. Additionally, to investigate differences in the perception of specific phonemes and the association between speech perception and speech production skills. Twenty-five Australian-English speaking children aged 48-60 months participated in this study. The SSD group included 12 children and the typically developing (TD) group included 13 children. Children completed routine speech and language assessments in addition to an experimental Australian-English lexical and phonetic judgement task based on Rvachew's Speech Assessment and Interactive Learning System (SAILS) program (Rvachew, 2009). This task included eight words across four word-initial phonemes: /k, ɹ, ʃ, s/. Children with SSD showed significantly poorer perceptual accuracy on the lexical and phonetic judgement task compared with TD peers. The phonemes /ɹ/ and /s/ were most frequently perceived in error across both groups. Additionally, the phoneme /ɹ/ was most commonly produced in error. There was also a positive correlation between overall speech perception and speech production scores. Children with SSD perceived speech less accurately than their typically developing peers. The findings suggest that an Australian-English variation of a lexical and phonetic judgement task similar to the SAILS program is promising and worthy of a larger scale study. Copyright © 2017 Elsevier Inc. All rights reserved.
Discrimination and streaming of speech sounds based on differences in interaural and spectral cues.

Science.gov (United States)

David, Marion; Lavandier, Mathieu; Grimault, Nicolas; Oxenham, Andrew J

2017-09-01

Differences in spatial cues, including interaural time differences (ITDs), interaural level differences (ILDs) and spectral cues, can lead to stream segregation of alternating noise bursts. It is unknown how effective such cues are for streaming sounds with realistic spectro-temporal variations. In particular, it is not known whether the high-frequency spectral cues associated with elevation remain sufficiently robust under such conditions. To answer these questions, sequences of consonant-vowel tokens were generated and filtered by non-individualized head-related transfer functions to simulate the cues associated with different positions in the horizontal and median planes. A discrimination task showed that listeners could discriminate changes in interaural cues both when the stimulus remained constant and when it varied between presentations. However, discrimination of changes in spectral cues was much poorer in the presence of stimulus variability. A streaming task, based on the detection of repeated syllables in the presence of interfering syllables, revealed that listeners can use both interaural and spectral cues to segregate alternating syllable sequences, despite the large spectro-temporal differences between stimuli. However, only the full complement of spatial cues (ILDs, ITDs, and spectral cues) resulted in obligatory streaming in a task that encouraged listeners to integrate the tokens into a single stream.
Sound and speech detection and classification in a Health Smart Home.

Science.gov (United States)

Fleury, A; Noury, N; Vacher, M; Glasson, H; Seri, J F

2008-01-01

Improvements in medicine increase life expectancy in the world and create a new bottleneck at the entrance of specialized and equipped institutions. To allow elderly people to stay at home, researchers work on ways to monitor them in their own environment, with non-invasive sensors. To meet this goal, smart homes, equipped with many sensors, deliver information on the activities of the person and can help detect distress situations. In this paper, we present a global speech and sound recognition system that can be set up in a flat. We placed eight microphones in the Health Smart Home of Grenoble (a real living flat of 47 m²) and we automatically analyze and sort out the different sounds recorded in the flat and the speech uttered (to detect normal or distress French sentences). We introduce the methods for the sound and speech recognition, the post-processing of the data and finally the experimental results obtained in real conditions in the flat.
Deficits in Letter-Speech Sound Associations but Intact Visual Conflict Processing in Dyslexia: Results from a Novel ERP-Paradigm

OpenAIRE

Bakos, Sarolta; Landerl, Karin; Bartling, Jürgen; Schulte-Körne, Gerd; Moll, Kristina

2017-01-01

The reading and spelling deficits characteristic of developmental dyslexia (dyslexia) have been related to problems in phonological processing and in learning associations between letters and speech-sounds. Even when children with dyslexia have learned the letters and their corresponding speech sounds, letter-speech sound associations might still be less automatized compared to children with age-adequate literacy skills. In order to examine automaticity in letter-speech sound associations and...
Crossmodal deficit in dyslexic children: practice affects the neural timing of letter-speech sound integration

Directory of Open Access Journals (Sweden)

Gojko Žarić

2015-06-01

A failure to build solid letter-speech sound associations may contribute to reading impairments in developmental dyslexia. Whether this reduced neural integration of letters and speech sounds changes over time within individual children and how this relates to behavioral gains in reading skills remains unknown. In this research, we examined changes in event-related potential (ERP) measures of letter-speech sound integration over a 6-month period during which 9-year-old dyslexic readers (n=17) followed a training in letter-speech sound coupling next to their regular reading curriculum. We presented the Dutch spoken vowels /a/ and /o/ as standard and deviant stimuli in one auditory and two audiovisual oddball conditions. In one audiovisual condition (AV0), the letter ‘a’ was presented simultaneously with the vowels, while in the other (AV200) it preceded vowel onset by 200 ms. Prior to the training (T1), dyslexic readers showed the expected pattern of typical auditory mismatch responses, together with the absence of letter-speech sound effects in a late negativity (LN) window. After the training (T2), our results showed earlier (and enhanced) crossmodal effects in the LN window. Most interestingly, earlier LN latency at T2 was significantly related to higher behavioral accuracy in letter-speech sound coupling. On a more general level, the timing of the earlier mismatch negativity (MMN) in the simultaneous condition (AV0) measured at T1 was significantly related to reading fluency at both T1 and T2 as well as to reading gains. Our findings suggest that the reduced neural integration of letters and speech sounds in dyslexic children may show moderate improvement with reading instruction and training and that behavioral improvements relate especially to individual differences in the timing of this neural integration.
Discrimination of musical instrument sounds resynthesized with simplified spectrotemporal parameters.

Science.gov (United States)

McAdams, S; Beauchamp, J W; Meneguzzi, S

1999-02-01

The perceptual salience of several outstanding features of quasiharmonic, time-variant spectra was investigated in musical instrument sounds. Spectral analyses of sounds from seven musical instruments (clarinet, flute, oboe, trumpet, violin, harpsichord, and marimba) produced time-varying harmonic amplitude and frequency data. Six basic data simplifications and five combinations of them were applied to the reference tones: amplitude-variation smoothing, coherent variation of amplitudes over time, spectral-envelope smoothing, forced harmonic-frequency variation, frequency-variation smoothing, and harmonic-frequency flattening. Listeners were asked to discriminate sounds resynthesized with simplified data from reference sounds resynthesized with the full data. Averaged over the seven instruments, the discrimination was very good for spectral envelope smoothing and amplitude envelope coherence, but was moderate to poor in decreasing order for forced harmonic frequency variation, frequency variation smoothing, frequency flattening, and amplitude variation smoothing. Discrimination of combinations of simplifications was equivalent to that of the most potent constituent simplification. Objective measurements were made on the spectral data for harmonic amplitude, harmonic frequency, and spectral centroid changes resulting from simplifications. These measures were found to correlate well with discrimination results, indicating that listeners have access to a relatively fine-grained sensory representation of musical instrument sounds.
Sound localization and speech identification in the frontal median plane with a hear-through headset

DEFF Research Database (Denmark)

Hoffmann, Pablo F.; Møller, Anders Kalsgaard; Christensen, Flemming

2014-01-01

signals can be superimposed via earphone reproduction. An important aspect of the hear-through headset is its transparency, i.e. how close to real life can the electronically amplified sounds be perceived. Here we report experiments conducted to evaluate the auditory transparency of a hear-through headset...... prototype by comparing human performance in natural, hear-through, and fully occluded conditions for two spatial tasks: frontal vertical-plane sound localization and speech-on-speech spatial release from masking. Results showed that localization performance was impaired by the hear-through headset relative...... to the natural condition though not as much as in the fully occluded condition. Localization was affected the least when the sound source was in front of the listeners. Different from the vertical localization performance, results from the speech task suggest that normal speech-on-speech spatial release from...
Children with speech sound disorder: Comparing a non-linguistic auditory approach with a phonological intervention approach to improve phonological skills

Directory of Open Access Journals (Sweden)

Cristina Murphy

2015-02-01

This study aimed to compare the effects of a non-linguistic auditory intervention approach with a phonological intervention approach on the phonological skills of children with speech sound disorder. A total of 17 children, aged 7-12 years, with speech sound disorder were randomly allocated to either the non-linguistic auditory temporal intervention group (n = 10, average age 7.7 ± 1.2) or phonological intervention group (n = 7, average age 8.6 ± 1.2). The intervention outcomes included auditory-sensory measures (auditory temporal processing skills) and cognitive measures (attention, short-term memory, speech production and phonological awareness skills). The auditory approach focused on non-linguistic auditory training (e.g., backward masking and frequency discrimination), whereas the phonological approach focused on speech sound training (e.g., phonological organisation and awareness). Both interventions consisted of twelve 45-minute sessions delivered twice per week, for a total of nine hours. Intra-group analysis demonstrated that the auditory intervention group showed significant gains in both auditory and cognitive measures, whereas no significant gain was observed in the phonological intervention group. No significant improvement on phonological skills was observed in any of the groups. Inter-group analysis demonstrated significant differences between the improvement following training for both groups, with a more pronounced gain for the non-linguistic auditory temporal intervention in one of the visual attention measures and both auditory measures. Therefore, both analyses suggest that although the non-linguistic auditory intervention approach appeared to be the most effective intervention approach, it was not sufficient to promote the enhancement of phonological skills.
Applications of Hilbert Spectral Analysis for Speech and Sound Signals

Science.gov (United States)

Huang, Norden E.

2003-01-01

A new method for analyzing nonlinear and nonstationary data has been developed, and the natural applications are to speech and sound signals. The key part of the method is the Empirical Mode Decomposition method with which any complicated data set can be decomposed into a finite and often small number of Intrinsic Mode Functions (IMF). An IMF is defined as any function having the same numbers of zero-crossings and extrema, and also having symmetric envelopes defined by the local maxima and minima respectively. The IMF also admits a well-behaved Hilbert transform. This decomposition method is adaptive, and, therefore, highly efficient. Since the decomposition is based on the local characteristic time scale of the data, it is applicable to nonlinear and nonstationary processes. With the Hilbert transform, the Intrinsic Mode Functions yield instantaneous frequencies as functions of time, which give sharp identifications of imbedded structures. This method can be used to process all acoustic signals. Specifically, it can process speech signals for speech synthesis, speaker identification and verification, speech recognition, and sound signal enhancement and filtering. Additionally, the acoustical signals from machinery are essentially the way the machines are talking to us. Therefore, the acoustical signals from the machines, either from sound through air or vibration on the machines, can tell us the operating conditions of the machines. Thus, we can use the acoustic signal to diagnose the problems of machines.
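The counting part of the IMF definition can be illustrated with a short sketch. It checks only the condition on zero-crossings and extrema (using the commonly stated tolerance that the two counts are equal or differ by at most one) and does not test envelope symmetry or perform the decomposition itself. The function names are hypothetical and are not taken from the record above.

```python
import math


def _sign_changes(values):
    """Count sign changes in a sequence, ignoring exact zeros."""
    signs = [1 if v > 0 else -1 for v in values if v != 0]
    return sum(1 for a, b in zip(signs, signs[1:]) if a != b)


def count_zero_crossings(x):
    """Number of times the sampled signal changes sign."""
    return _sign_changes(x)


def count_extrema(x):
    """Number of local maxima and minima, found as sign changes
    in the first difference of the sampled signal."""
    diffs = [b - a for a, b in zip(x, x[1:])]
    return _sign_changes(diffs)


def meets_imf_count_condition(x):
    """True if zero-crossing and extrema counts differ by at most one.
    Envelope symmetry, the second IMF condition, is not checked here."""
    return abs(count_zero_crossings(x) - count_extrema(x)) <= 1


n = 1000
# Three cycles of a sine wave with a small phase offset to avoid exact zeros.
sine = [math.sin(2 * math.pi * 3 * i / n + 0.3) for i in range(n)]
# The same wave riding on a constant offset never crosses zero.
shifted = [v + 2.0 for v in sine]

if __name__ == "__main__":
    print(count_zero_crossings(sine), count_extrema(sine))
    print(meets_imf_count_condition(sine), meets_imf_count_condition(shifted))
```

The offset signal fails the check because it oscillates without ever crossing zero, which is the kind of riding wave that the sifting step of Empirical Mode Decomposition is meant to remove.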
Transfer Effect of Speech-sound Learning on Auditory-motor Processing of Perceived Vocal Pitch Errors.

Science.gov (United States)

Chen, Zhaocong; Wong, Francis C K; Jones, Jeffery A; Li, Weifeng; Liu, Peng; Chen, Xi; Liu, Hanjun

2015-08-17

Speech perception and production are intimately linked. There is evidence that speech motor learning results in changes to auditory processing of speech. Whether speech motor control benefits from perceptual learning in speech, however, remains unclear. This event-related potential study investigated whether speech-sound learning can modulate the processing of feedback errors during vocal pitch regulation. Mandarin speakers were trained to perceive five Thai lexical tones while learning to associate pictures with spoken words over 5 days. Before and after training, participants produced sustained vowel sounds while they heard their vocal pitch feedback unexpectedly perturbed. As compared to the pre-training session, the magnitude of vocal compensation significantly decreased for the control group, but remained consistent for the trained group at the post-training session. However, the trained group had smaller and faster N1 responses to pitch perturbations and exhibited enhanced P2 responses that correlated significantly with their learning performance. These findings indicate that the cortical processing of vocal pitch regulation can be shaped by learning new speech-sound associations, suggesting that perceptual learning in speech can produce transfer effects that facilitate the neural mechanisms underlying the online monitoring of auditory feedback regarding vocal production.

Speech recognition using articulatory and excitation source features

CERN Document Server

Rao, K Sreenivasa

2017-01-01

This book discusses the contribution of articulatory and excitation source information in discriminating sound units. The authors focus on excitation source component of speech -- and the dynamics of various articulators during speech production -- for enhancement of speech recognition (SR) performance. Speech recognition is analyzed for read, extempore, and conversation modes of speech. Five groups of articulatory features (AFs) are explored for speech recognition, in addition to conventional spectral features. Each chapter provides the motivation for exploring the specific feature for SR task, discusses the methods to extract those features, and finally suggests appropriate models to capture the sound unit specific knowledge from the proposed features. The authors close by discussing various combinations of spectral, articulatory and source features, and the desired models to enhance the performance of SR systems.
Identifying Residual Speech Sound Disorders in Bilingual Children: A Japanese-English Case Study

Science.gov (United States)

Preston, Jonathan L.; Seki, Ayumi

2011-01-01

Purpose: To describe (a) the assessment of residual speech sound disorders (SSDs) in bilinguals by distinguishing speech patterns associated with second language acquisition from patterns associated with misarticulations and (b) how assessment of domains such as speech motor control and phonological awareness can provide a more complete…
Listening to an audio drama activates two processing networks, one for all sounds, another exclusively for speech.

Directory of Open Access Journals (Sweden)

Robert Boldt

Earlier studies have shown considerable intersubject synchronization of brain activity when subjects watch the same movie or listen to the same story. Here we investigated the across-subjects similarity of brain responses to speech and non-speech sounds in a continuous audio drama designed for blind people. Thirteen healthy adults listened for ∼19 min to the audio drama while their brain activity was measured with 3 T functional magnetic resonance imaging (fMRI). An intersubject-correlation (ISC) map, computed across the whole experiment to assess the stimulus-driven extrinsic brain network, indicated statistically significant ISC in temporal, frontal and parietal cortices, cingulate cortex, and amygdala. Group-level independent component (IC) analysis was used to parcel out the brain signals into functionally coupled networks, and the dependence of the ICs on external stimuli was tested by comparing them with the ISC map. This procedure revealed four extrinsic ICs, of which two, covering non-overlapping areas of the auditory cortex, were modulated by both speech and non-speech sounds. The two other extrinsic ICs, one left-hemisphere-lateralized and the other right-hemisphere-lateralized, were speech-related and comprised the superior and middle temporal gyri, temporal poles, and the left angular and inferior orbital gyri. In areas of low ISC, four ICs that were defined as intrinsic fluctuated similarly to the time-courses of either the speech-sound-related or all-sounds-related extrinsic ICs. These ICs included the superior temporal gyrus, the anterior insula, and the frontal, parietal and midline occipital cortices. Taken together, substantial intersubject synchronization of cortical activity was observed in subjects listening to an audio drama, with results suggesting that speech is processed in two separate networks, one dedicated to the processing of speech sounds and the other to both speech and non-speech sounds.
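As a rough illustration of what an intersubject correlation measures, the sketch below averages the Pearson correlation over all pairs of subjects for a single time course (for example, one voxel). The abstract does not describe how the ISC map was actually computed, so the pairwise-average form and the names `pearson` and `pairwise_isc` are assumptions for illustration.

```python
import math
from itertools import combinations


def pearson(a, b):
    """Pearson correlation between two equal-length time courses."""
    n = len(a)
    mean_a = sum(a) / n
    mean_b = sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    return cov / math.sqrt(var_a * var_b)


def pairwise_isc(time_courses):
    """Mean Pearson correlation over all subject pairs (assumed ISC form)."""
    pairs = list(combinations(time_courses, 2))
    return sum(pearson(a, b) for a, b in pairs) / len(pairs)


if __name__ == "__main__":
    # Toy data: three "subjects" whose responses share one stimulus-driven shape.
    shared = [0.0, 1.0, 0.0, -1.0, 0.0, 1.0, 0.0, -1.0]
    subjects = [shared, [2 * v + 1 for v in shared], [0.5 * v for v in shared]]
    print(round(pairwise_isc(subjects), 3))
```

Because correlation ignores scale and offset, subjects whose responses follow the same stimulus-driven shape score high even when their signal amplitudes differ.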
Severe Speech Sound Disorders: An Integrated Multimodal Intervention

Science.gov (United States)

King, Amie M.; Hengst, Julie A.; DeThorne, Laura S.

2013-01-01

Purpose: This study introduces an integrated multimodal intervention (IMI) and examines its effectiveness for the treatment of persistent and severe speech sound disorders (SSD) in young children. The IMI is an activity-based intervention that focuses simultaneously on increasing the "quantity" of a child's meaningful productions of target words…
Perception of acoustic scale and size in musical instrument sounds.

Science.gov (United States)

van Dinther, Ralph; Patterson, Roy D

2006-10-01

There is size information in natural sounds. For example, as humans grow in height, their vocal tracts increase in length, producing a predictable decrease in the formant frequencies of speech sounds. Recent studies have shown that listeners can make fine discriminations about which of two speakers has the longer vocal tract, supporting the view that the auditory system discriminates changes on the acoustic-scale dimension. Listeners can also recognize vowels scaled well beyond the range of vocal tracts normally experienced, indicating that perception is robust to changes in acoustic scale. This paper reports two perceptual experiments designed to extend research on acoustic scale and size perception to the domain of musical sounds: The first study shows that listeners can discriminate the scale of musical instrument sounds reliably, although not quite as well as for voices. The second experiment shows that listeners can recognize the family of an instrument sound which has been modified in pitch and scale beyond the range of normal experience. We conclude that processing of acoustic scale in music perception is very similar to processing of acoustic scale in speech perception.
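The link between vocal tract length and formant frequencies can be illustrated with a textbook idealization: a uniform tube closed at one end, whose resonances fall at odd multiples of a quarter wavelength. This is a sketch under that assumption and is not the model used by the authors; the function name and default speed of sound are illustrative.

```python
def tube_formants(tract_length_m, n_formants=3, speed_of_sound=343.0):
    """Resonances (Hz) of a uniform tube closed at one end:
    F_k = (2k - 1) * c / (4 * L), for k = 1, 2, ..."""
    if tract_length_m <= 0:
        raise ValueError("tract_length_m must be positive")
    return [
        (2 * k - 1) * speed_of_sound / (4.0 * tract_length_m)
        for k in range(1, n_formants + 1)
    ]


if __name__ == "__main__":
    # A longer tract gives proportionally lower resonances.
    for length in (0.14, 0.17):
        print(length, [round(f) for f in tube_formants(length)])
```

In this idealization every resonance scales by the same factor when the tube length changes, which matches the idea of acoustic scale as a single dimension along which spectra shift.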
Evolution of non-speech sound memory in postlingual deafness: implications for cochlear implant rehabilitation.

Science.gov (United States)

Lazard, D S; Giraud, A L; Truy, E; Lee, H J

2011-07-01

Neurofunctional patterns assessed before or after cochlear implantation (CI) are informative markers of implantation outcome. Because phonological memory reorganization in post-lingual deafness is predictive of the outcome, we investigated, using a cross-sectional approach, whether memory of non-speech sounds (NSS) produced by animals or objects (i.e. non-human sounds) is also reorganized, and how this relates to speech perception after CI. We used an fMRI auditory imagery task in which sounds were evoked by pictures of noisy items for post-lingual deaf candidates for CI and for normal-hearing subjects. When deaf subjects imagined sounds, the left inferior frontal gyrus, the right posterior temporal gyrus and the right amygdala were less activated compared to controls. Activity levels in these regions decreased with duration of auditory deprivation, indicating declining NSS representations. Whole brain correlations with duration of auditory deprivation and with speech scores after CI showed an activity decline in dorsal, fronto-parietal, cortical regions, and an activity increase in ventral cortical regions, the right anterior temporal pole and the hippocampal gyrus. Both dorsal and ventral reorganizations predicted poor speech perception outcome after CI. These results suggest that post-CI speech perception relies, at least partially, on the integrity of a neural system used for processing NSS that is based on audio-visual and articulatory mapping processes. When this neural system is reorganized, post-lingual deaf subjects resort to inefficient semantic- and memory-based strategies. These results complement those of other studies on speech processing, suggesting that both speech and NSS representations need to be maintained during deafness to ensure the success of CI. Copyright © 2011 Elsevier Ltd. All rights reserved.
Preschool Speech Error Patterns Predict Articulation and Phonological Awareness Outcomes in Children with Histories of Speech Sound Disorders

Science.gov (United States)

Preston, Jonathan L.; Hull, Margaret; Edwards, Mary Louise

2013-01-01

Purpose: To determine if speech error patterns in preschoolers with speech sound disorders (SSDs) predict articulation and phonological awareness (PA) outcomes almost 4 years later. Method: Twenty-five children with histories of preschool SSDs (and normal receptive language) were tested at an average age of 4;6 (years;months) and were followed up…
Auditory Brainstem Response to Complex Sounds Predicts Self-Reported Speech-in-Noise Performance

Science.gov (United States)

Anderson, Samira; Parbery-Clark, Alexandra; White-Schwoch, Travis; Kraus, Nina

2013-01-01

Purpose: To compare the ability of the auditory brainstem response to complex sounds (cABR) to predict subjective ratings of speech understanding in noise on the Speech, Spatial, and Qualities of Hearing Scale (SSQ; Gatehouse & Noble, 2004) relative to the predictive ability of the Quick Speech-in-Noise test (QuickSIN; Killion, Niquette,…
Neural Correlates of Phonological Processing in Speech Sound Disorder: A Functional Magnetic Resonance Imaging Study

Science.gov (United States)

Tkach, Jean A.; Chen, Xu; Freebairn, Lisa A.; Schmithorst, Vincent J.; Holland, Scott K.; Lewis, Barbara A.

2011-01-01

Speech sound disorders (SSD) are the largest group of communication disorders observed in children. One explanation for these disorders is that children with SSD fail to form stable phonological representations when acquiring the speech sound system of their language due to poor phonological memory (PM). The goal of this study was to examine PM in…
Toward a Model of Pediatric Speech Sound Disorders (SSD) for Differential Diagnosis and Therapy Planning

NARCIS (Netherlands)

Terband, Hayo; Maassen, Bernardus; Maas, Edwin; van Lieshout, Pascal; Maassen, Ben; Terband, Hayo

2016-01-01

The classification and differentiation of pediatric speech sound disorders (SSD) is one of the main questions in the field of speech- and language pathology. Terms for classifying childhood and SSD and motor speech disorders (MSD) refer to speech production processes, and a variety of methods of
The role of high-level processes for oscillatory phase entrainment to speech sound

Directory of Open Access Journals (Sweden)

Benedikt Zoefel

2015-12-01

Constantly bombarded with input, the brain has the need to filter out relevant information while ignoring the irrelevant rest. A powerful tool may be represented by neural oscillations which entrain their high-excitability phase to important input while their low-excitability phase attenuates irrelevant information. Indeed, the alignment between brain oscillations and speech improves intelligibility and helps dissociate speakers during a cocktail party. Although well-investigated, the contribution of low- and high-level processes to phase entrainment to speech sound has only recently begun to be understood. Here, we review those findings, and concentrate on three main results: (1) Phase entrainment to speech sound is modulated by attention or predictions, likely supported by top-down signals and indicating higher-level processes involved in the brain’s adjustment to speech. (2) As phase entrainment to speech can be observed without systematic fluctuations in sound amplitude or spectral content, it does not only reflect a passive steady-state ringing of the cochlea, but entails a higher-level process. (3) The role of intelligibility for phase entrainment is debated. Recent results suggest that intelligibility modulates the behavioral consequences of entrainment, rather than directly affecting the strength of entrainment in auditory regions. We conclude that phase entrainment to speech reflects a sophisticated mechanism: Several high-level processes interact to optimally align neural oscillations with predicted events of high relevance, even when they are hidden in a continuous stream of background noise.
IEP goals for school-age children with speech sound disorders.

Science.gov (United States)

Farquharson, Kelly; Tambyraja, Sherine R; Justice, Laura M; Redle, Erin E

2014-01-01

The purpose of the current study was to describe the current state of practice for writing Individualized Education Program (IEP) goals for children with speech sound disorders (SSDs). IEP goals for 146 children receiving services for SSDs within public school systems across two states were coded for their dominant theoretical framework and overall quality. A dichotomous scheme was used for theoretical framework coding: cognitive-linguistic or sensory-motor. Goal quality was determined by examining 7 specific indicators outlined by an empirically tested rating tool. In total, 147 long-term and 490 short-term goals were coded. The results revealed no dominant theoretical framework for long-term goals, whereas short-term goals largely reflected a sensory-motor framework. In terms of quality, the majority of speech production goals were functional and generalizable in nature, but were not able to be easily targeted during common daily tasks or by other members of the IEP team. Short-term goals were consistently rated higher in quality domains when compared to long-term goals. The current state of practice for writing IEP goals for children with SSDs indicates that theoretical framework may be eclectic in nature and likely written to support the individual needs of children with speech sound disorders. Further investigation is warranted to determine the relations between goal quality and child outcomes. (1) Identify two predominant theoretical frameworks and discuss how they apply to IEP goal writing. (2) Discuss quality indicators as they relate to IEP goals for children with speech sound disorders. (3) Discuss the relationship between long-term goals level of quality and related theoretical frameworks. (4) Identify the areas in which business-as-usual IEP goals exhibit strong quality.
Modeling phoneme perception. II: A model of stop consonant discrimination.

Science.gov (United States)

van Hessen, A J; Schouten, M E

1992-10-01

Combining elements from two existing theories of speech sound discrimination, dual process theory (DPT) and trace context theory (TCT), a new theory, called phoneme perception theory, is proposed, consisting of a long-term phoneme memory, a context-coding memory, and a trace memory, each with its own time constants. This theory is tested by means of stop-consonant discrimination data in which interstimulus interval (ISI; values of 100, 300, and 2000 ms) is an important variable. It is shown that discrimination in which labeling plays an important part (2IFC and AX between category) benefits from increased ISI, whereas discrimination in which only sensory traces are compared (AX within category), decreases with increasing ISI. The theory is also tested on speech discrimination data from the literature in which ISI is a variable [Pisoni, J. Acoust. Soc. Am. 36, 277-282 (1964); Cowan and Morse, J. Acoust. Soc. Am. 79, 500-507 (1986)]. It is concluded that the number of parameters in trace context theory is not sufficient to account for most speech-sound discrimination data and that a few additional assumptions are needed, such as a form of sublabeling, in which subjects encode the quality of a stimulus as a member of a category, and which requires processing time.
[Correlation of diffusion tensor imaging between the cerebral cortex and speech discrimination in presbycusis].

Science.gov (United States)

Peng, Lu; Yu, Shuilian; Chen, Ruichun; Jing, Yan; Liang, Jianping

2015-09-01

To investigate the relationship between pure-tone average (PTA), the fractional anisotropy (FA) of the auditory pathway, cognitive cortex and auditory cortex in presbycusis. Twenty-five elderly subjects with presbycusis participated in the study. PTA and speech discrimination abilities were evaluated in each subject. Diffusion tensor imaging (DTI) was applied to assess the FA of the inferior colliculus (IC), the superior frontal gyrus and Heschl's gyrus. FA values in the three areas were compared between the left and right sides. Bivariate correlation analysis was performed to evaluate the effects of PTA and the FA of the IC, the superior frontal gyrus and Heschl's gyrus on speech discrimination abilities. There were no significant differences between the left and right sides of the IC (P > 0.05). Higher FA values were recorded on the left side of Heschl's gyrus and the superior frontal gyrus (P < 0.05). Both PTA and the FA of the superior frontal gyrus had a negative association with speech discrimination abilities (P < 0.01, P < 0.05), while the FA of Heschl's gyrus had a positive association with speech discrimination abilities (P < 0.05). Our findings indicate that the speech discrimination abilities of the elderly are related not only to peripheral auditory function but also to central auditory and cognitive function.
Simultaneous Assessment of Speech Identification and Spatial Discrimination

Directory of Open Access Journals (Sweden)

Jennifer K. Bizley

2015-12-01

With increasing numbers of children and adults receiving bilateral cochlear implants, there is an urgent need for assessment tools that enable testing of binaural hearing abilities. Current test batteries are either limited in scope or are of an impractical duration for routine testing. Here, we report a behavioral test that enables combined testing of speech identification and spatial discrimination in noise. In this task, multitalker babble was presented from all speakers, and pairs of speech tokens were sequentially presented from two adjacent speakers. Listeners were required to identify both words from a closed set of four possibilities and to determine whether the second token was presented to the left or right of the first. In Experiment 1, normal-hearing adult listeners were tested at 15° intervals throughout the frontal hemifield. Listeners showed highest spatial discrimination performance in and around the frontal midline, with a decline at more eccentric locations. In contrast, speech identification abilities were least accurate near the midline and showed an improvement in performance at more lateral locations. In Experiment 2, normal-hearing listeners were assessed using a restricted range of speaker locations designed to match those found in clinical testing environments. Here, speakers were separated by 15° around the midline and 30° at more lateral locations. This resulted in a similar pattern of behavioral results as in Experiment 1. We conclude that this test offers the potential to assess both spatial discrimination and the ability to use spatial information for unmasking in clinical populations.
SPEECH EMOTION RECOGNITION USING MODIFIED QUADRATIC DISCRIMINATION FUNCTION

Institute of Scientific and Technical Information of China (English)

(no author listed)

2008-01-01

Quadratic Discrimination Function (QDF) is commonly used in speech emotion recognition, which proceeds on the premise that the input data are normally distributed. In this paper, we propose a transformation to normalize the emotional features, then derive a Modified QDF (MQDF) for speech emotion recognition. Features based on prosody and voice quality are extracted, and a Principal Component Analysis Neural Network (PCANN) is used to reduce the dimension of the feature vectors. The results show that voice quality features are an effective supplement for recognition, and that the method in this paper can improve the recognition ratio effectively.
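For readers unfamiliar with the baseline, a quadratic discriminant function scores an input against a Gaussian fitted to each class and picks the highest-scoring class. The following is a minimal one-dimensional sketch of the plain QDF only, not the paper's modified MQDF or its PCANN step; the function names, the class labels, and the pitch-like training values are all hypothetical and made up for illustration.

```python
import math
from statistics import fmean, pvariance


def fit_qdf(samples_by_class):
    """Estimate (mean, variance, prior) for each class from 1-D samples."""
    total = sum(len(values) for values in samples_by_class.values())
    return {
        label: (fmean(values), pvariance(values), len(values) / total)
        for label, values in samples_by_class.items()
    }


def qdf_score(x, mean, var, prior):
    """Gaussian log-posterior up to an additive constant (higher is better)."""
    return -0.5 * math.log(var) - 0.5 * (x - mean) ** 2 / var + math.log(prior)


def classify(x, params):
    """Return the class label whose discriminant score is largest."""
    return max(params, key=lambda label: qdf_score(x, *params[label]))


# Hypothetical pitch-like feature values (Hz) for two illustrative classes.
training = {
    "neutral": [118.0, 121.0, 119.5, 122.0, 120.5],
    "angry": [150.0, 175.0, 162.0, 190.0, 168.0],
}
params = fit_qdf(training)
print(classify(120.0, params), classify(170.0, params))
```

Because each class keeps its own variance, the decision boundary is quadratic in the feature, which is why the method leans on the normality premise that the abstract mentions.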
Musical and linguistic expertise influence pre-attentive and attentive processing of non-speech sounds.

Science.gov (United States)

Marie, Céline; Kujala, Teija; Besson, Mireille

2012-04-01

The aim of this experiment was two-fold. Our first goal was to determine whether linguistic expertise influences the pre-attentive [as reflected by the Mismatch Negativity - (MMN)] and the attentive processing (as reflected by behavioural discrimination accuracy) of non-speech, harmonic sounds. The second was to directly compare the effects of linguistic and musical expertise. To this end, we compared non-musician native speakers of a quantity language, Finnish, in which duration is a phonemically contrastive cue, with French musicians and French non-musicians. Results revealed that pre-attentive and attentive processing of duration deviants was enhanced in Finn non-musicians and French musicians compared to French non-musicians. By contrast, MMN in French musicians was larger than in both Finns and French non-musicians for frequency deviants, whereas no between-group differences were found for intensity deviants. By showing similar effects of linguistic and musical expertise, these results argue in favor of common processing of duration in music and speech. Copyright © 2010 Elsevier Srl. All rights reserved.
Use of Authentic-Speech Technique for Teaching Sound Recognition to EFL Students

Science.gov (United States)

Sersen, William J.

2011-01-01

The main objective of this research was to test an authentic-speech technique for improving the sound-recognition skills of EFL (English as a foreign language) students at Roi-Et Rajabhat University. The secondary objective was to determine the correlation, if any, between students' self-evaluation of sound-recognition progress and the actual…
The influence of (central) auditory processing disorder on the severity of speech-sound disorders in children.

Science.gov (United States)

Vilela, Nadia; Barrozo, Tatiane Faria; Pagan-Neves, Luciana de Oliveira; Sanches, Seisse Gabriela Gandolfi; Wertzner, Haydée Fiszbein; Carvallo, Renata Mota Mamede

2016-02-01

To identify a cutoff value based on the Percentage of Consonants Correct-Revised index that could indicate the likelihood of a child with a speech-sound disorder also having a (central) auditory processing disorder. Language, audiological and (central) auditory processing evaluations were administered. The participants were 27 subjects with speech-sound disorders aged 7 to 10 years and 11 months who were divided into two different groups according to their (central) auditory processing evaluation results. When a (central) auditory processing disorder was present in association with a speech disorder, the children tended to have lower scores on phonological assessments. A greater severity of speech disorder was related to a greater probability of the child having a (central) auditory processing disorder. The use of a cutoff value for the Percentage of Consonants Correct-Revised index successfully distinguished between children with and without a (central) auditory processing disorder. The severity of speech-sound disorder in children was influenced by the presence of (central) auditory processing disorder. The attempt to identify a cutoff value based on a severity index was successful.
CNTNAP2 Is Significantly Associated With Speech Sound Disorder in the Chinese Han Population.

Science.gov (United States)

Zhao, Yun-Jing; Wang, Yue-Ping; Yang, Wen-Zhu; Sun, Hong-Wei; Ma, Hong-Wei; Zhao, Ya-Ru

2015-11-01

Speech sound disorder is the most common communication disorder. Some investigations support the possibility that the CNTNAP2 gene might be involved in the pathogenesis of speech-related diseases. To investigate single-nucleotide polymorphisms in the CNTNAP2 gene, 300 unrelated speech sound disorder patients and 200 normal controls were included in the study. Five single-nucleotide polymorphisms were amplified and directly sequenced. Significant differences were found in the genotype (P = .0003) and allele (P = .0056) frequencies of rs2538976 between patients and controls. The excess frequency of the A allele in the patient group remained significant after Bonferroni correction (P = .0280). A significant haplotype association with rs2710102T/+rs17236239A/+2538976A/+2710117A (P = 4.10e-006) was identified. A neighboring single-nucleotide polymorphism, rs10608123, was found in complete linkage disequilibrium with rs2538976, and the genotypes exactly corresponded to each other. The authors propose that these CNTNAP2 variants increase the susceptibility to speech sound disorder. The single-nucleotide polymorphisms rs10608123 and rs2538976 may merge into one single-nucleotide polymorphism. © The Author(s) 2015.
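The corrected allele P value reported above is consistent with a Bonferroni adjustment across the five SNPs tested, since 0.0056 × 5 = 0.0280. The sketch below reproduces that arithmetic; treating the number of comparisons as 5 is an assumption, because the abstract does not state it explicitly, and the function name is hypothetical.

```python
def bonferroni(p_value, n_tests):
    """Bonferroni-adjusted P value: multiply by the number of tests, cap at 1."""
    return min(1.0, p_value * n_tests)


# Allele-frequency P value for rs2538976, adjusted for an assumed 5 SNPs.
corrected = bonferroni(0.0056, 5)
print(f"{corrected:.4f}")
```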

Speech abilities in preschool children with speech sound disorder with and without co-occurring language impairment.

Science.gov (United States)

Macrae, Toby; Tyler, Ann A

2014-10-01

The authors compared preschool children with co-occurring speech sound disorder (SSD) and language impairment (LI) to children with SSD only in their numbers and types of speech sound errors. In this post hoc quasi-experimental study, independent samples t tests were used to compare the groups in the standard score from different tests of articulation/phonology, percent consonants correct, and the number of omission, substitution, distortion, typical, and atypical error patterns used in the production of different wordlists that had similar levels of phonetic and structural complexity. In comparison with children with SSD only, children with SSD and LI used similar numbers but different types of errors, including more omission patterns (p < .001, d = 1.55) and fewer distortion patterns (p = .022, d = 1.03). There were no significant differences in substitution, typical, and atypical error pattern use. Frequent omission error pattern use may reflect a more compromised linguistic system characterized by absent phonological representations for target sounds (see Shriberg et al., 2005). Research is required to examine the diagnostic potential of early frequent omission error pattern use in predicting later diagnoses of co-occurring SSD and LI and/or reading problems.
Lexical and phonological variability in preschool children with speech sound disorder.

Science.gov (United States)

Macrae, Toby; Tyler, Ann A; Lewis, Kerry E

2014-02-01

The authors of this study examined relationships between measures of word and speech error variability and between these and other speech and language measures in preschool children with speech sound disorder (SSD). In this correlational study, 18 preschool children with SSD, age-appropriate receptive vocabulary, and normal oral motor functioning and hearing were assessed across 2 sessions. Experimental measures included word and speech error variability, receptive vocabulary, nonword repetition (NWR), and expressive language. Pearson product–moment correlation coefficients were calculated among the experimental measures. The correlation between word and speech error variability was slight and nonsignificant. The correlation between word variability and receptive vocabulary was moderate and negative, although nonsignificant. High word variability was associated with small receptive vocabularies. The correlations between speech error variability and NWR and between speech error variability and the mean length of children's utterances were moderate and negative, although both were nonsignificant. High speech error variability was associated with poor NWR and language scores. High word variability may reflect unstable lexical representations, whereas high speech error variability may reflect indistinct phonological representations. Preschool children with SSD who show abnormally high levels of different types of speech variability may require slightly different approaches to intervention.
Phonological Encoding in Speech-Sound Disorder: Evidence from a Cross-Modal Priming Experiment

Science.gov (United States)

Munson, Benjamin; Krause, Miriam O. P.

2017-01-01

Background: Psycholinguistic models of language production provide a framework for determining the locus of language breakdown that leads to speech-sound disorder (SSD) in children. Aims: To examine whether children with SSD differ from their age-matched peers with typical speech and language development (TD) in the ability phonologically to…
Speech-Sound Disorders and Attention-Deficit/Hyperactivity Disorder Symptoms

Science.gov (United States)

Lewis, Barbara A.; Short, Elizabeth J.; Iyengar, Sudha K.; Taylor, H. Gerry; Freebairn, Lisa; Tag, Jessica; Avrich, Allison A.; Stein, Catherine M.

2012-01-01

Purpose: The purpose of this study was to examine the association of speech-sound disorders (SSD) with symptoms of attention-deficit/hyperactivity disorder (ADHD) by the severity of the SSD and the mode of transmission of SSD within the pedigrees of children with SSD. Participants and Methods: The participants were 412 children who were enrolled…
Irrelevant sound disrupts speech production: exploring the relationship between short-term memory and experimentally induced slips of the tongue.

Science.gov (United States)

Saito, Satoru; Baddeley, Alan

2004-10-01

To explore the relationship between short-term memory and speech production, we developed a speech error induction technique. The technique, which was adapted from a Japanese word game, exposed participants to an auditory distractor word immediately before the utterance of a target word. In Experiment 1, the distractor words that were phonologically similar to the target word led to a greater number of errors in speaking the target than did the dissimilar distractor words. Furthermore, the speech error scores were significantly correlated with memory span scores. In Experiment 2, memory span scores were again correlated with the rate of the speech errors that were induced from the task-irrelevant speech sounds. Experiment 3 showed a strong irrelevant-sound effect in the serial recall of nonwords. The magnitude of the irrelevant-sound effects was not affected by phonological similarity between the to-be-remembered nonwords and the irrelevant-sound materials. Analysis of recall errors in Experiment 3 also suggested that there were no essential differences in recall error patterns between the dissimilar and similar irrelevant-sound conditions. We proposed two different underlying mechanisms in immediate memory, one operating via the phonological short-term memory store and the other via the processes underpinning speech production.
The Prevalence of Stuttering, Voice, and Speech-Sound Disorders in Primary School Students in Australia

Science.gov (United States)

McKinnon, David H.; McLeod, Sharynne; Reilly, Sheena

2007-01-01

Purpose: The aims of this study were threefold: to report teachers' estimates of the prevalence of speech disorders (specifically, stuttering, voice, and speech-sound disorders); to consider correspondence between the prevalence of speech disorders and gender, grade level, and socioeconomic status; and to describe the level of support provided to…
Suppression of the µ rhythm during speech and non-speech discrimination revealed by independent component analysis: implications for sensorimotor integration in speech processing.

Science.gov (United States)

Bowers, Andrew; Saltuklaroglu, Tim; Harkrider, Ashley; Cuellar, Megan

2013-01-01

Constructivist theories propose that articulatory hypotheses about incoming phonetic targets may function to enhance perception by limiting the possibilities for sensory analysis. To provide evidence for this proposal, it is necessary to map ongoing, high-temporal resolution changes in sensorimotor activity (i.e., the sensorimotor µ rhythm) to accurate speech and non-speech discrimination performance (i.e., correct trials). Sixteen participants (15 female and 1 male) were asked to passively listen to or actively identify speech and tone-sweeps in a two-alternative forced-choice discrimination task while the electroencephalogram (EEG) was recorded from 32 channels. The stimuli were presented at high signal-to-noise ratios (SNRs), at which discrimination accuracy was high (i.e., 80-100%), and at low SNRs producing discrimination performance at chance. EEG data were decomposed using independent component analysis and clustered across participants using principal component methods in EEGLAB. ICA revealed left and right sensorimotor µ components for 14/16 and 13/16 participants respectively that were identified on the basis of scalp topography, spectral peaks, and localization to the precentral and postcentral gyri. Time-frequency analysis of left and right lateralized µ component clusters revealed significant (pFDR […] speech discrimination trials relative to chance trials following stimulus offset. Findings are consistent with constructivist, internal model theories proposing that early forward motor models generate predictions about likely phonemic units that are then synthesized with incoming sensory cues during active as opposed to passive processing. Future directions and possible translational value for clinical populations in which sensorimotor integration may play a functional role are discussed.
Do Irrelevant Sounds Impair the Maintenance of All Characteristics of Speech in Memory?

Science.gov (United States)

Gabriel, D.; Gaudrain, E.; Lebrun-Guillaud, G.; Sheppard, F.; Tomescu, I. M.; Schnider, A.

2012-01-01

Several studies have shown that maintaining in memory some attributes of speech, such as the content or pitch of an interlocutor's message, is markedly reduced in the presence of background sounds made of spectrotemporal variations. However, experimental paradigms showing this interference have only focused on one attribute of speech at a time,…
Cognitive flexibility modulates maturation and music-training-related changes in neural sound discrimination.

Science.gov (United States)

Saarikivi, Katri; Putkinen, Vesa; Tervaniemi, Mari; Huotilainen, Minna

2016-07-01

Previous research has demonstrated that musicians show superior neural sound discrimination when compared to non-musicians, and that these changes emerge with accumulation of training. Our aim was to investigate whether individual differences in executive functions predict training-related changes in neural sound discrimination. We measured event-related potentials induced by sound changes coupled with tests for executive functions in musically trained and non-trained children aged 9-11 years and 13-15 years. High performance in a set-shifting task, indexing cognitive flexibility, was linked to enhanced maturation of neural sound discrimination in both musically trained and non-trained children. Specifically, well-performing musically trained children already showed large mismatch negativity (MMN) responses at a young age as well as at an older age, indicating accurate sound discrimination. In contrast, the musically trained low-performing children still showed an increase in MMN amplitude with age, suggesting that they were behind their high-performing peers in the development of sound discrimination. In the non-trained group, in turn, only the high-performing children showed evidence of an age-related increase in MMN amplitude, and the low-performing children showed a small MMN with no age-related change. These latter results suggest an advantage in MMN development also for high-performing non-trained individuals. For the P3a amplitude, there was an age-related increase only in the children who performed well in the set-shifting task, irrespective of music training, indicating enhanced attention-related processes in these children. Thus, the current study provides the first evidence that, in children, cognitive flexibility may influence age-related and training-related plasticity of neural sound discrimination. © 2016 Federation of European Neuroscience Societies and John Wiley & Sons Ltd.
Evidence for the treatment of co-occurring stuttering and speech sound disorder: A clinical case series.

Science.gov (United States)

Unicomb, Rachael; Hewat, Sally; Spencer, Elizabeth; Harrison, Elisabeth

2017-06-01

There is a paucity of evidence to guide treatment for children with co-occurring stuttering and speech sound disorder. Some guidelines suggest treating the two disorders simultaneously using indirect treatment approaches; however, the research supporting these recommendations is over 20 years old. In this clinical case series, we investigate whether these co-occurring disorders could be treated concurrently using direct treatment approaches supported by up-to-date, high-level evidence, and whether this could be done in an efficacious, safe and efficient manner. Five pre-school-aged participants received individual concurrent, direct intervention for both stuttering and speech sound disorder. All participants used the Lidcombe Program, as manualised. Direct treatment for speech sound disorder was individualised based on analysis of each child's sound system. At 12 months post commencement of treatment, all except one participant had completed the Lidcombe Program, and were less than 1.0% syllables stuttered on samples gathered within and beyond the clinic. These four participants completed Stage 1 of the Lidcombe Program in between 14 and 22 clinic visits, consistent with current benchmark data for this programme. At the same assessment point, all five participants exhibited significant increases in percentage of consonants correct and were in alignment with age-expected estimates of this measure. Further, they were treated in an average number of clinic visits that compares favourably with other research on treatment for speech sound disorder. These preliminary results indicate that young children with co-occurring stuttering and speech sound disorder may be treated concurrently using direct treatment approaches. This method of service delivery may have implications for cost and time efficiency and may also address the crucial need for early intervention in both disorders. These positive findings highlight the need for further research in the area and contribute to
Investigating the neural correlates of voice versus speech-sound directed information in pre-school children.

Directory of Open Access Journals (Sweden)

Nora Maria Raschle

Studies in sleeping newborns and infants propose that the superior temporal sulcus is involved in speech processing soon after birth. Speech processing also implicitly requires the analysis of the human voice, which conveys both linguistic and extra-linguistic information. However, due to technical and practical challenges when neuroimaging young children, evidence of neural correlates of speech and/or voice processing in toddlers and young children remains scarce. In the current study, we used functional magnetic resonance imaging (fMRI) in 20 typically developing preschool children (average age = 5.8 y; range 5.2-6.8 y) to investigate brain activation during judgments about vocal identity versus the initial speech sound of spoken object words. fMRI results reveal common brain regions responsible for voice-specific and speech-sound specific processing of spoken object words including bilateral primary and secondary language areas of the brain. Contrasting voice-specific with speech-sound specific processing predominantly activates the anterior part of the right-hemispheric superior temporal sulcus (STS). Furthermore, the right STS is functionally correlated with left-hemispheric temporal and right-hemispheric prefrontal regions. This finding underlines the importance of the right superior temporal sulcus as a temporal voice area and indicates that this brain region is specialized, and functions similarly to adults by the age of five. We thus extend previous knowledge of voice-specific regions and their functional connections to the young brain which may further our understanding of the neuronal mechanism of speech-specific processing in children with developmental disorders, such as autism or specific language impairments.
Polysyllable Speech Accuracy and Predictors of Later Literacy Development in Preschool Children with Speech Sound Disorders

Science.gov (United States)

Masso, Sarah; Baker, Elise; McLeod, Sharynne; Wang, Cen

2017-01-01

Purpose: The aim of this study was to determine if polysyllable accuracy in preschoolers with speech sound disorders (SSD) was related to known predictors of later literacy development: phonological processing, receptive vocabulary, and print knowledge. Polysyllables--words of three or more syllables--are important to consider because unlike…
Discrimination and identification of long vowels in children with typical language development and specific language impairment

Science.gov (United States)

Datta, Hia; Shafer, Valerie; Kurtzberg, Diane

2004-05-01

Researchers have claimed that children with specific language impairment (SLI) have particular difficulties in discriminating and identifying phonetically similar and brief speech sounds (Stark and Heinz, 1966; Studdert-Kennedy and Bradley, 1997; Sussman, 1993). In a recent study (Shafer et al., 2004), children with SLI were reported to have difficulty in processing brief (50 ms), phonetically similar vowels (/I-E/). The current study investigated perception of long (250 ms), phonetically similar vowels (/I-E/) in 8- to 10-year-old children with SLI and typical language development (TLD). The purpose was to examine whether phonetic similarity in vowels leads to poorer speech perception in the SLI group. Behavioral and electrophysiological methods were employed to examine discrimination and identification of a nine-step vowel continuum from /I/ to /E/. Similar performances in discrimination were found for both groups, indicating that lengthening vowel duration indeed improves discrimination of phonetically similar vowels. However, these children with SLI showed poor behavioral identification, demonstrating that phonetic similarity of speech sounds, irrespective of their duration, contributes to the speech perception difficulty observed in the SLI population. These findings suggest that the deficit in these children with SLI is at the level of working memory or long-term memory representation of speech.
Profile of Australian preschool children with speech sound disorders at risk for literacy difficulties

OpenAIRE

McLeod, S.; Crowe, K.; Masso, S.; Baker, E.; McCormack, J.; Wren, Y.; Roulstone, S.; Howland, C.

2017-01-01

Background: Speech sound disorders are a common communication difficulty in preschool children. Teachers indicate difficulty identifying and supporting these children. Aim: To describe speech and language characteristics of children identified by their parents and/or teachers as having possible communication concerns. Method: 275 Australian 4- to 5-year-old children from 45 preschools whose parents and teachers were concerned about their talking participated in speech-language p...
Profile of Australian Preschool Children with Speech Sound Disorders at Risk for Literacy Difficulties

Science.gov (United States)

McLeod, Sharynne; Crowe, Kathryn; Masso, Sarah; Baker, Elise; McCormack, Jane; Wren, Yvonne; Roulstone, Susan; Howland, Charlotte

2017-01-01

Speech sound disorders are a common communication difficulty in preschool children. Teachers indicate difficulty identifying and supporting these children. The aim of this research was to describe speech and language characteristics of children identified by their parents and/or teachers as having possible communication concerns. 275 Australian 4-…
Music and language expertise influence the categorization of speech and musical sounds: behavioral and electrophysiological measurements.

Science.gov (United States)

Elmer, Stefan; Klein, Carina; Kühnis, Jürg; Liem, Franziskus; Meyer, Martin; Jäncke, Lutz

2014-10-01

In this study, we used high-density EEG to evaluate whether speech and music expertise has an influence on the categorization of expertise-related and unrelated sounds. With this purpose in mind, we compared the categorization of speech, music, and neutral sounds between professional musicians, simultaneous interpreters (SIs), and controls in response to morphed speech-noise, music-noise, and speech-music continua. Our hypothesis was that music and language expertise will strengthen the memory representations of prototypical sounds, which act as a perceptual magnet for morphed variants. This means that the prototype would "attract" variants. This so-called magnet effect should be manifested by an increased assignment of morphed items to the trained category, by a reduced maximal slope of the psychometric function, as well as by differential event-related brain responses reflecting memory comparison processes (i.e., N400 and P600 responses). As a main result, we provide first evidence for a domain-specific behavioral bias of musicians and SIs toward the trained categories, namely music and speech. In addition, SIs showed a bias toward musical items, indicating that interpreting training has a generic influence on the cognitive representation of spectrotemporal signals with similar acoustic properties to speech sounds. Notably, EEG measurements revealed clear distinct N400 and P600 responses to both prototypical and ambiguous items between the three groups at anterior, central, and posterior scalp sites. These differential N400 and P600 responses represent synchronous activity occurring across widely distributed brain networks, and indicate a dynamical recruitment of memory processes that vary as a function of training and expertise.
Hate speech and ethnic discrimination with special focus on social media

OpenAIRE

Ananiev, Jovan

2012-01-01

Old freedoms (namely, the freedom of religion, of speech, of the press, of petition, and of assembly), are at times incompatible with newer forms of freedom. Freedom of speech conflicts with the “right not to be discriminated against.” The great problem modern society faces is not a lack of freedom, per se. It is a question of how to resolve the conflict of many different incompatible freedoms
Songbirds and humans apply different strategies in a sound sequence discrimination task

Directory of Open Access Journals (Sweden)

Yoshimasa Seki

2013-07-01

The abilities of animals and humans to extract rules from sound sequences have previously been compared using observation of spontaneous responses and conditioning techniques. However, the results were inconsistently interpreted across studies, possibly due to methodological and/or species differences. Therefore, we examined the strategies for discrimination of sound sequences in Bengalese finches and humans using the same protocol. Birds were trained on a GO/NOGO task to discriminate between two categories of sound stimulus generated based on an AAB or ABB rule. The sound elements used were taken from a variety of male (M) and female (F) calls, such that the sequences could be represented as MMF and MFF. In test sessions, FFM and FMM sequences, which were never presented in the training sessions but conformed to the rule, were presented as probe stimuli. The results suggested two discriminative strategies were being applied: (1) memorizing sound patterns of either GO or NOGO stimuli and generating the appropriate responses for only those sounds; and (2) using the repeated element as a cue. There was no evidence that the birds successfully extracted the abstract rule (i.e., AAB and ABB); MMF-GO subjects did not produce a GO response for FFM and vice versa. Next we examined whether those strategies were also applicable for human participants on the same task. The results and questionnaires revealed that participants extracted the abstract rule, and most of them employed it to discriminate the sequences. This strategy was never observed in bird subjects, although some participants used strategies similar to the birds when responding to the probe stimuli. Our results showed that the human participants applied the abstract rule in the task even without instruction but Bengalese finches did not, thereby reconfirming that humans have an ability to extract abstract rules from sound sequences that is distinct from that of non-human animals.
Songbirds and humans apply different strategies in a sound sequence discrimination task.

Science.gov (United States)

Seki, Yoshimasa; Suzuki, Kenta; Osawa, Ayumi M; Okanoya, Kazuo

2013-01-01

The abilities of animals and humans to extract rules from sound sequences have previously been compared using observation of spontaneous responses and conditioning techniques. However, the results were inconsistently interpreted across studies, possibly due to methodological and/or species differences. Therefore, we examined the strategies for discrimination of sound sequences in Bengalese finches and humans using the same protocol. Birds were trained on a GO/NOGO task to discriminate between two categories of sound stimulus generated based on an "AAB" or "ABB" rule. The sound elements used were taken from a variety of male (M) and female (F) calls, such that the sequences could be represented as MMF and MFF. In test sessions, FFM and FMM sequences, which were never presented in the training sessions but conformed to the rule, were presented as probe stimuli. The results suggested two discriminative strategies were being applied: (1) memorizing sound patterns of either GO or NOGO stimuli and generating the appropriate responses for only those sounds; and (2) using the repeated element as a cue. There was no evidence that the birds successfully extracted the abstract rule (i.e., AAB and ABB); MMF-GO subjects did not produce a GO response for FFM and vice versa. Next we examined whether those strategies were also applicable for human participants on the same task. The results and questionnaires revealed that participants extracted the abstract rule, and most of them employed it to discriminate the sequences. This strategy was never observed in bird subjects, although some participants used strategies similar to the birds when responding to the probe stimuli. Our results showed that the human participants applied the abstract rule in the task even without instruction but Bengalese finches did not, thereby reconfirming that humans have an ability to extract abstract rules from sound sequences that is distinct from that of non-human animals.
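The difference between memorizing trained patterns and applying the abstract rule can be sketched in a few lines. This is an illustrative toy model only, not the authors' analysis; the function names `abstract_rule`, `rule_learner`, and `memorizer` are hypothetical.

```python
def abstract_rule(seq):
    """Classify a three-element sequence as 'AAB' or 'ABB' by where the repeat falls."""
    a, b, c = seq
    if a == b != c:   # first two elements match, third differs
        return "AAB"
    if a != b == c:   # last two elements match, first differs
        return "ABB"
    return None

def rule_learner(trained_go_rule):
    """Subject that responds GO to any sequence following the trained abstract rule."""
    return lambda seq: "GO" if abstract_rule(seq) == trained_go_rule else "NOGO"

def memorizer(trained_go_seq):
    """Subject that responds GO only to the exact sound pattern it was trained on."""
    return lambda seq: "GO" if seq == trained_go_seq else "NOGO"

# An MMF-GO subject meets the probe FFM, which follows the same AAB rule.
human_like = rule_learner("AAB")
bird_like = memorizer("MMF")
for probe in ("MMF", "FFM", "FMM"):
    print(probe, human_like(probe), bird_like(probe))
```

In this toy model the two strategies agree on the training stimuli and diverge only on the probes, which is why the FFM and FMM probe sequences are what distinguish them in the experiment.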
Does seeing an Asian face make speech sound more accented?

Science.gov (United States)

Zheng, Yi; Samuel, Arthur G

2017-08-01

Prior studies have reported that seeing an Asian face makes American English sound more accented. The current study investigates whether this effect is perceptual, or if it instead occurs at a later decision stage. We first replicated the finding that showing static Asian and Caucasian faces can shift people's reports about the accentedness of speech accompanying the pictures. When we changed the static pictures to dubbed videos, reducing the demand characteristics, the shift in reported accentedness largely disappeared. By including unambiguous items along with the original ambiguous items, we introduced a contrast bias and actually reversed the shift, with the Asian-face videos yielding lower judgments of accentedness than the Caucasian-face videos. By changing to a mixed rather than blocked design, so that the ethnicity of the videos varied from trial to trial, we eliminated the difference in accentedness rating. Finally, we tested participants' perception of accented speech using the selective adaptation paradigm. After establishing that an auditory-only accented adaptor shifted the perception of how accented test words are, we found that no such adaptation effect occurred when the adapting sounds relied on visual information (Asian vs. Caucasian videos) to influence the accentedness of an ambiguous auditory adaptor. Collectively, the results demonstrate that visual information can affect the interpretation, but not the perception, of accented speech.

The influence of phonetic dimensions on aphasic speech perception

NARCIS (Netherlands)

de Kok, D.A.; Jonkers, R.; Bastiaanse, Y.R.M.

2010-01-01

Individuals with aphasia have more problems detecting small differences between speech sounds than larger ones. This paper reports how phonemic processing is impaired and how this is influenced by speechreading. A non-word discrimination task was carried out with 'audiovisual', 'auditory only' and
Analysis of speech sounds is left-hemisphere predominant at 100-150ms after sound onset.

Science.gov (United States)

Rinne, T; Alho, K; Alku, P; Holi, M; Sinkkonen, J; Virtanen, J; Bertrand, O; Näätänen, R

1999-04-06

Hemispheric specialization of human speech processing has been found in brain imaging studies using fMRI and PET. Due to the restricted time resolution, these methods cannot, however, determine the stage of auditory processing at which this specialization first emerges. We used a dense electrode array covering the whole scalp to record the mismatch negativity (MMN), an event-related brain potential (ERP) automatically elicited by occasional changes in sounds, which ranged from non-phonetic (tones) to phonetic (vowels). MMN can be used to probe auditory central processing on a millisecond scale with no attention-dependent task requirements. Our results indicate that speech processing occurs predominantly in the left hemisphere at the early, pre-attentive level of auditory analysis.
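The MMN is conventionally obtained as a difference wave: the averaged response to the occasional (deviant) sound minus the averaged response to the repeated (standard) sound. The sketch below shows that subtraction and a peak search in a latency window; the sample values are made up for illustration, and the function names are hypothetical.

```python
def difference_wave(standard, deviant):
    """Deviant-minus-standard amplitude at each time point."""
    return [d - s for s, d in zip(standard, deviant)]

def most_negative_peak(wave, times_ms, start_ms, end_ms):
    """Return (amplitude, latency_ms) of the most negative point inside the window."""
    window = [(v, t) for v, t in zip(wave, times_ms) if start_ms <= t <= end_ms]
    return min(window)

# Toy averaged responses in microvolts (illustrative values, not recorded data).
times_ms = [0, 50, 100, 150, 200]
standard = [0.0, 1.0, 2.0, 1.0, 0.0]
deviant = [0.0, 1.0, -1.0, 0.0, 0.0]

mmn = difference_wave(standard, deviant)
print(most_negative_peak(mmn, times_ms, 100, 150))
```

Comparing such a peak between left- and right-hemisphere electrodes is one simple way a lateralization claim like the one in the abstract could be examined.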
Is the Speech Transmission Index (STI) a robust measure of sound system speech intelligibility performance?

Science.gov (United States)

Mapp, Peter

2002-11-01

Although RaSTI is a good indicator of the speech intelligibility capability of auditoria and similar spaces, during the past 2-3 years it has been shown that RaSTI is not a robust predictor of sound system intelligibility performance. Instead, it is now recommended, within both national and international codes and standards, that full STI measurement and analysis be employed. However, new research is reported that indicates that STI is neither as flawless nor as robust as many believe. The paper highlights a number of potential error mechanisms. It is shown that the measurement technique and signal excitation stimulus can have a significant effect on the overall result and accuracy, particularly where DSP-based equipment is employed. It is also shown that in its current state of development, STI is not capable of appropriately accounting for a number of fundamental speech and system attributes, including typical sound system frequency response variations and anomalies. This is particularly shown to be the case when a system is operating under reverberant conditions. Comparisons between actual system measurements and corresponding word score data are reported where errors of up to 50% were found. The implications for VA and PA system performance verification will be discussed.
A Diagnostic Marker to Discriminate Childhood Apraxia of Speech from Speech Delay: I. Development and Description of the Pause Marker

Science.gov (United States)

Shriberg, Lawrence D.; Strand, Edythe A.; Fourakis, Marios; Jakielski, Kathy J.; Hall, Sheryl D.; Karlsson, Heather B.; Mabie, Heather L.; McSweeny, Jane L.; Tilkens, Christie M.; Wilson, David L.

2017-01-01

Purpose: The goal of this article (PM I) is to describe the rationale for and development of the Pause Marker (PM), a single-sign diagnostic marker proposed to discriminate early or persistent childhood apraxia of speech from speech delay. Method: The authors describe and prioritize 7 criteria with which to evaluate the research and clinical…
What characterizes changing-state speech in affecting short-term memory? An EEG study on the irrelevant sound effect.

Science.gov (United States)

Schlittmeier, Sabine J; Weisz, Nathan; Bertrand, Olivier

2011-12-01

The irrelevant sound effect (ISE) describes reduced verbal short-term memory during irrelevant changing-state sounds which consist of different and distinct auditory tokens. Steady-state sounds lack such changing-state features and do not impair performance. An EEG experiment (N=16) explored the distinguishing neurophysiological aspects of detrimental changing-state speech (3-token sequence) compared to ineffective steady-state speech (1-token sequence) on serial recall performance. We analyzed evoked and induced activity related to the memory items as well as spectral activity during the retention phase. The main finding is that the behavioral sound effect was exclusively reflected by attenuated token-induced gamma activation most pronounced between 50-60 Hz and 50-100 ms post-stimulus onset. Changing-state speech seems to disrupt a behaviorally relevant ongoing process during target presentation (e.g., the serial binding of the items). Copyright © 2011 Society for Psychophysiological Research.
Psychometric characteristics of single-word tests of children's speech sound production.

Science.gov (United States)

Flipsen, Peter; Ogiela, Diane A

2015-04-01

Our understanding of test construction has improved since the now-classic review by McCauley and Swisher (1984). The current review article examines the psychometric characteristics of current single-word tests of speech sound production in an attempt to determine whether our tests have improved since then. It also provides a resource that clinicians may use to help them make test selection decisions for their particular client populations. Ten tests published since 1990 were reviewed to determine whether they met the 10 criteria set out by McCauley and Swisher (1984), as well as 7 additional criteria. All of the tests reviewed met at least 3 of McCauley and Swisher's (1984) original criteria, and 9 of 10 tests met at least 5 of them. Most of the tests met some of the additional criteria as well. The state of the art for single-word tests of speech sound production in children appears to have improved in the last 30 years. There remains, however, room for improvement.
Challenges in discriminating profanity from hate speech

Science.gov (United States)

Malmasi, Shervin; Zampieri, Marcos

2018-03-01

In this study, we approach the problem of distinguishing general profanity from hate speech in social media, something which has not been widely considered. Using a new dataset annotated specifically for this task, we employ supervised classification along with a set of features that includes n-grams, skip-grams and clustering-based word representations. We apply approaches based on single classifiers as well as more advanced ensemble classifiers and stacked generalisation, achieving the best result of ? accuracy for this 3-class classification task. Analysis of the results reveals that discriminating hate speech and profanity is not a simple task, which may require features that capture a deeper understanding of the text not always possible with surface n-grams. The variability of gold labels in the annotated data, due to differences in the subjective adjudications of the annotators, is also an issue. Other directions for future work are discussed.
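For readers unfamiliar with the surface features named above, here is a minimal sketch of word n-gram and skip-gram (skip-bigram) extraction. It is not the authors' feature pipeline; the function names and the `max_skip` parameter are hypothetical, and the abstract does not say whether the features were built over words or characters.

```python
def word_ngrams(tokens, n):
    """Contiguous runs of n tokens."""
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]

def skip_bigrams(tokens, max_skip):
    """Ordered token pairs with at most max_skip tokens between them."""
    pairs = []
    for i, left in enumerate(tokens):
        # j ranges over the next (max_skip + 1) positions, clipped to the sequence end
        for j in range(i + 1, min(i + 2 + max_skip, len(tokens))):
            pairs.append((left, tokens[j]))
    return pairs

tokens = "this is a short example".split()
print(word_ngrams(tokens, 2))
print(skip_bigrams(tokens, 1))
```

With `max_skip=0` the skip-bigrams reduce to ordinary bigrams; larger values let the feature set capture word pairs that are separated by intervening tokens.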
Balancing speech intelligibility versus sound exposure in selection of personal hearing protection equipment for Chinook aircrews

NARCIS (Netherlands)

Wijngaarden, S.J. van; Rots, G.

2001-01-01

Background: Aircrews are often exposed to high ambient sound levels, especially in military aviation. Since long-term exposure to such noise may cause hearing damage, selection of adequate hearing protective devices is crucial. Such devices also affect speech intelligibility. When speech
Influence of musical training on perception of L2 speech

NARCIS (Netherlands)

Sadakata, M.; Zanden, L.D.T. van der; Sekiyama, K.

2010-01-01

The current study reports specific cases in which a positive transfer of perceptual ability from the music domain to the language domain occurs. We tested whether musical training enhances discrimination and identification performance of L2 speech sounds (timing features, nasal consonants and
Food approach conditioning and discrimination learning using sound cues in benthic sharks.

Science.gov (United States)

Vila Pouca, Catarina; Brown, Culum

2018-07-01

The marine environment is filled with biotic and abiotic sounds. Some of these sounds predict important events that influence fitness while others are unimportant. Individuals can learn specific sound cues and 'soundscapes' and use them for vital activities such as foraging, predator avoidance, communication and orientation. Most research with sounds in elasmobranchs has focused on hearing thresholds and attractiveness to sound sources, but very little is known about their abilities to learn about sounds, especially in benthic species. Here we investigated if juvenile Port Jackson sharks could learn to associate a musical stimulus with a food reward, discriminate between two distinct musical stimuli, and whether individual personality traits were linked to cognitive performance. Five out of eight sharks were successfully conditioned to associate a jazz song with a food reward delivered in a specific corner of the tank. We observed repeatable individual differences in activity and boldness in all eight sharks, but these personality traits were not linked to the learning performance assays we examined. These sharks were later trained in a discrimination task, where they had to distinguish between the same jazz and a novel classical music song, and swim to opposite corners of the tank according to the stimulus played. The sharks' performance to the jazz stimulus declined to chance levels in the discrimination task. Interestingly, some sharks developed a strong side bias to the right, which in some cases was not the correct side for the jazz stimulus.
Participation of the classical speech areas in auditory long-term memory.

Directory of Open Access Journals (Sweden)

Anke Ninija Karabanov

Accumulating evidence suggests that storing speech sounds requires transposing rapidly fluctuating sound waves into more easily encoded oromotor sequences. If so, then the classical speech areas in the caudalmost portion of the temporal gyrus (pSTG) and in the inferior frontal gyrus (IFG) may be critical for performing this acoustic-oromotor transposition. We tested this proposal by applying repetitive transcranial magnetic stimulation (rTMS) to each of these left-hemisphere loci, as well as to a nonspeech locus, while participants listened to pseudowords. After 5 minutes these stimuli were re-presented together with new ones in a recognition test. Compared to control-site stimulation, pSTG stimulation produced a highly significant increase in recognition error rate, without affecting reaction time. By contrast, IFG stimulation led only to a weak, non-significant, trend toward recognition memory impairment. Importantly, the impairment after pSTG stimulation was not due to interference with perception, since the same stimulation failed to affect pseudoword discrimination examined with short interstimulus intervals. Our findings suggest that pSTG is essential for transforming speech sounds into stored motor plans for reproducing the sound. Whether or not the IFG also plays a role in speech-sound recognition could not be determined from the present results.
Participation of the classical speech areas in auditory long-term memory.

Science.gov (United States)

Karabanov, Anke Ninija; Paine, Rainer; Chao, Chi Chao; Schulze, Katrin; Scott, Brian; Hallett, Mark; Mishkin, Mortimer

2015-01-01

Accumulating evidence suggests that storing speech sounds requires transposing rapidly fluctuating sound waves into more easily encoded oromotor sequences. If so, then the classical speech areas in the caudalmost portion of the temporal gyrus (pSTG) and in the inferior frontal gyrus (IFG) may be critical for performing this acoustic-oromotor transposition. We tested this proposal by applying repetitive transcranial magnetic stimulation (rTMS) to each of these left-hemisphere loci, as well as to a nonspeech locus, while participants listened to pseudowords. After 5 minutes these stimuli were re-presented together with new ones in a recognition test. Compared to control-site stimulation, pSTG stimulation produced a highly significant increase in recognition error rate, without affecting reaction time. By contrast, IFG stimulation led only to a weak, non-significant, trend toward recognition memory impairment. Importantly, the impairment after pSTG stimulation was not due to interference with perception, since the same stimulation failed to affect pseudoword discrimination examined with short interstimulus intervals. Our findings suggest that pSTG is essential for transforming speech sounds into stored motor plans for reproducing the sound. Whether or not the IFG also plays a role in speech-sound recognition could not be determined from the present results.
Parental Beliefs and Experiences Regarding Involvement in Intervention for Their Child with Speech Sound Disorder

Science.gov (United States)

Watts Pappas, Nicole; McAllister, Lindy; McLeod, Sharynne

2016-01-01

Parental beliefs and experiences regarding involvement in speech intervention for their child with mild to moderate speech sound disorder (SSD) were explored using multiple, sequential interviews conducted during a course of treatment. Twenty-one interviews were conducted with seven parents of six children with SSD: (1) after their child's initial…
Neurophysiological Evidence That Musical Training Influences the Recruitment of Right Hemispheric Homologues for Speech Perception

Directory of Open Access Journals (Sweden)

McNeel Gordon Jantzen

2014-03-01

Musicians have a more accurate temporal and tonal representation of auditory stimuli than their non-musician counterparts (Kraus & Chandrasekaran, 2010; Parbery-Clark, Skoe, & Kraus, 2009; Zendel & Alain, 2008; Musacchia, Sams, Skoe, & Kraus, 2007). Musicians who are adept at the production and perception of music are also more sensitive to key acoustic features of speech such as voice onset timing and pitch. Together, these data suggest that musical training may enhance the processing of acoustic information for speech sounds. In the current study, we sought to provide neural evidence that musicians process speech and music in a similar way. We hypothesized that for musicians, right hemisphere areas traditionally associated with music are also engaged for the processing of speech sounds. In contrast, we predicted that in non-musicians processing of speech sounds would be localized to traditional left hemisphere language areas. Speech stimuli differing in voice onset time were presented using a dichotic listening paradigm. Subjects either indicated aural location for a specified speech sound or identified a specific speech sound from a directed aural location. Musical training effects and organization of acoustic features were reflected by activity in source generators of the P50. This included greater activation of right middle temporal gyrus (MTG) and superior temporal gyrus (STG) in musicians. The findings demonstrate recruitment of right hemisphere in musicians for discriminating speech sounds and a putative broadening of their language network. Musicians appear to have an increased sensitivity to acoustic features and enhanced selective attention to temporal features of speech that is facilitated by musical training and supported, in part, by right hemisphere homologues of established speech processing regions of the brain.
Relative Contributions of the Dorsal vs. Ventral Speech Streams to Speech Perception are Context Dependent: a lesion study

Directory of Open Access Journals (Sweden)

Corianne Rogalsky

2014-04-01

The neural basis of speech perception has been debated for over a century. While it is generally agreed that the superior temporal lobes are critical for the perceptual analysis of speech, a major current topic is whether the motor system contributes to speech perception, with several conflicting findings attested. In a dorsal-ventral speech stream framework (Hickok & Poeppel 2007), this debate is essentially about the roles of the dorsal versus ventral speech processing streams. A major roadblock in characterizing the neuroanatomy of speech perception is task-specific effects. For example, much of the evidence for dorsal stream involvement comes from syllable discrimination type tasks, which have been found to behaviorally doubly dissociate from auditory comprehension tasks (Baker et al. 1981). Discrimination task deficits could be a result of difficulty perceiving the sounds themselves, which is the typical assumption, or it could be a result of failures in temporary maintenance of the sensory traces, or the comparison and/or the decision process. Similar complications arise in perceiving sentences: the extent of inferior frontal (i.e., dorsal stream) activation during listening to sentences increases as a function of increased task demands (Love et al. 2006). Another complication is the stimulus: much evidence for dorsal stream involvement uses speech samples lacking semantic context (CVs, non-words). The present study addresses these issues in a large-scale lesion-symptom mapping study. 158 patients with focal cerebral lesions from the Multi-site Aphasia Research Consortium underwent a structural MRI or CT scan, as well as an extensive psycholinguistic battery. Voxel-based lesion symptom mapping was used to compare the neuroanatomy involved in the following speech perception tasks with varying phonological, semantic, and task loads: (i) two discrimination tasks of syllables (non-words and words, respectively), (ii) two auditory comprehension tasks
Exploring Australian speech-language pathologists' use and perceptions of non-speech oral motor exercises.

Science.gov (United States)

Rumbach, Anna F; Rose, Tanya A; Cheah, Mynn

2018-01-29

To explore Australian speech-language pathologists' use of non-speech oral motor exercises, and rationales for using/not using non-speech oral motor exercises in clinical practice. A total of 124 speech-language pathologists practising in Australia, working with paediatric and/or adult clients with speech sound difficulties, completed an online survey. The majority of speech-language pathologists reported that they did not use non-speech oral motor exercises when working with paediatric or adult clients with speech sound difficulties. However, more than half of the speech-language pathologists working with adult clients who have dysarthria reported using non-speech oral motor exercises with this population. The most frequently reported rationale for using non-speech oral motor exercises in speech sound difficulty management was to improve awareness/placement of articulators. The majority of speech-language pathologists agreed there is no clear clinical or research evidence base to support non-speech oral motor exercise use with clients who have speech sound difficulties. This study provides an overview of Australian speech-language pathologists' reported use and perceptions of non-speech oral motor exercises' applicability and efficacy in treating paediatric and adult clients who have speech sound difficulties. The research findings provide speech-language pathologists with insight into how and why non-speech oral motor exercises are currently used, and adds to the knowledge base regarding Australian speech-language pathology practice of non-speech oral motor exercises in the treatment of speech sound difficulties. Implications for Rehabilitation Non-speech oral motor exercises refer to oral motor activities which do not involve speech, but involve the manipulation or stimulation of oral structures including the lips, tongue, jaw, and soft palate. Non-speech oral motor exercises are intended to improve the function (e.g., movement, strength) of oral structures. The
When Does Speech Sound Disorder Matter for Literacy? The Role of Disordered Speech Errors, Co-Occurring Language Impairment and Family Risk of Dyslexia

Science.gov (United States)

Hayiou-Thomas, Marianna E.; Carroll, Julia M.; Leavett, Ruth; Hulme, Charles; Snowling, Margaret J.

2017-01-01

Background: This study considers the role of early speech difficulties in literacy development, in the context of additional risk factors. Method: Children were identified with speech sound disorder (SSD) at the age of 3½ years, on the basis of performance on the Diagnostic Evaluation of Articulation and Phonology. Their literacy skills were…
Hemispheric lateralization in an analysis of speech sounds. Left hemisphere dominance replicated in Japanese subjects.

Science.gov (United States)

Koyama, S; Gunji, A; Yabe, H; Oiwa, S; Akahane-Yamada, R; Kakigi, R; Näätänen, R

2000-09-01

Evoked magnetic responses to speech sounds [R. Näätänen, A. Lehtokoski, M. Lennes, M. Cheour, M. Huotilainen, A. Iivonen, M. Vainio, P. Alku, R.J. Ilmoniemi, A. Luuk, J. Allik, J. Sinkkonen and K. Alho, Language-specific phoneme representations revealed by electric and magnetic brain responses. Nature, 385 (1997) 432-434.] were recorded from 13 Japanese subjects (right-handed). Infrequently presented vowels ([o]) among repetitive vowels ([e]) elicited the magnetic counterpart of mismatch negativity, MMNm (Bilateral, nine subjects; Left hemisphere alone, three subjects; Right hemisphere alone, one subject). The estimated source of the MMNm was stronger in the left than in the right auditory cortex. The sources were located more posteriorly in the left than in the right auditory cortex. These findings are consistent with the results obtained in Finnish [R. Näätänen, A. Lehtokoski, M. Lennes, M. Cheour, M. Huotilainen, A. Iivonen, M. Vainio, P. Alku, R.J. Ilmoniemi, A. Luuk, J. Allik, J. Sinkkonen and K. Alho, Language-specific phoneme representations revealed by electric and magnetic brain responses. Nature, 385 (1997) 432-434.][T. Rinne, K. Alho, P. Alku, M. Holi, J. Sinkkonen, J. Virtanen, O. Bertrand and R. Näätänen, Analysis of speech sounds is left-hemisphere predominant at 100-150 ms after sound onset. Neuroreport, 10 (1999) 1113-1117.] and English [K. Alho, J.F. Connolly, M. Cheour, A. Lehtokoski, M. Huotilainen, J. Virtanen, R. Aulanko and R.J. Ilmoniemi, Hemispheric lateralization in preattentive processing of speech sounds. Neurosci. Lett., 258 (1998) 9-12.] subjects. Instead of the P1m observed in Finnish [M. Tervaniemi, A. Kujala, K. Alho, J. Virtanen, R.J. Ilmoniemi and R. Näätänen, Functional specialization of the human auditory cortex in processing phonetic and musical sounds: A magnetoencephalographic (MEG) study. Neuroimage, 9 (1999) 330-336.] and English [K. Alho, J. F. Connolly, M. Cheour, A. Lehtokoski, M. Huotilainen, J. Virtanen, R. Aulanko
Learning-induced neural plasticity of speech processing before birth.

Science.gov (United States)

Partanen, Eino; Kujala, Teija; Näätänen, Risto; Liitola, Auli; Sambeth, Anke; Huotilainen, Minna

2013-09-10

Learning, the foundation of adaptive and intelligent behavior, is based on plastic changes in neural assemblies, reflected by the modulation of electric brain responses. In infancy, auditory learning implicates the formation and strengthening of neural long-term memory traces, improving discrimination skills, in particular those forming the prerequisites for speech perception and understanding. Although previous behavioral observations show that newborns react differentially to unfamiliar sounds vs. familiar sound material that they were exposed to as fetuses, the neural basis of fetal learning has not thus far been investigated. Here we demonstrate direct neural correlates of human fetal learning of speech-like auditory stimuli. We presented variants of words to fetuses; unlike infants with no exposure to these stimuli, the exposed fetuses showed enhanced brain activity (mismatch responses) in response to pitch changes for the trained variants after birth. Furthermore, a significant correlation existed between the amount of prenatal exposure and brain activity, with greater activity being associated with a higher amount of prenatal speech exposure. Moreover, the learning effect was generalized to other types of similar speech sounds not included in the training material. Consequently, our results indicate neural commitment specifically tuned to the speech features heard before birth and their memory representations.
Sound localization and word discrimination in reverberant environment in children with developmental dyslexia

Directory of Open Access Journals (Sweden)

Wendy Castro-Camacho

2015-04-01

Objective: To compare whether sound localization and word discrimination in a reverberant environment differ between children with dyslexia and controls. Method: We studied 30 children with dyslexia and 30 controls. Sound and word localization and discrimination were studied at five angles spanning the left to right auditory fields (-90°, -45°, 0°, +45°, +90°), under reverberant and non-reverberant conditions; correct answers were compared. Results: Spatial location of words in the non-reverberant test was deficient in children with dyslexia at 0° and +90°. Spatial location in the reverberant test was altered in children with dyslexia at all angles except -90°. Word discrimination in the non-reverberant test was poor in children with dyslexia at left angles. In the reverberant test, children with dyslexia exhibited deficiencies at -45°, -90°, and +45°. Conclusion: Children with dyslexia may have problems locating sounds and discriminating words at extreme locations of the horizontal plane in classrooms with reverberation.

Learning foreign sounds in an alien world: videogame training improves non-native speech categorization.

Science.gov (United States)

Lim, Sung-joo; Holt, Lori L

2011-01-01

Although speech categories are defined by multiple acoustic dimensions, some are perceptually weighted more than others and there are residual effects of native-language weightings in non-native speech perception. Recent research on nonlinguistic sound category learning suggests that the distribution characteristics of experienced sounds influence perceptual cue weights: Increasing variability across a dimension leads listeners to rely upon it less in subsequent category learning (Holt & Lotto, 2006). The present experiment investigated the implications of this among native Japanese learning English /r/-/l/ categories. Training was accomplished using a videogame paradigm that emphasizes associations among sound categories, visual information, and players' responses to videogame characters rather than overt categorization or explicit feedback. Subjects who played the game for 2.5h across 5 days exhibited improvements in /r/-/l/ perception on par with 2-4 weeks of explicit categorization training in previous research and exhibited a shift toward more native-like perceptual cue weights. Copyright © 2011 Cognitive Science Society, Inc.
Speech Sound Disorders in Preschool Children: Correspondence between Clinical Diagnosis and Teacher and Parent Report

Science.gov (United States)

Harrison, Linda J.; McLeod, Sharynne; McAllister, Lindy; McCormack, Jane

2017-01-01

This study sought to assess the level of correspondence between parent and teacher report of concern about young children's speech and specialist assessment of speech sound disorders (SSD). A sample of 157 children aged 4-5 years was recruited in preschools and long day care centres in Victoria and New South Wales (NSW). SSD was assessed…
Differences between the production of [s] and [ʃ] in the speech of adults, typically developing children, and children with speech sound disorders: An ultrasound study.

Science.gov (United States)

Francisco, Danira Tavares; Wertzner, Haydée Fiszbein

2017-01-01

This study describes the criteria that are used in ultrasound to measure the differences between the tongue contours that produce [s] and [ʃ] sounds in the speech of adults, typically developing children (TDC), and children with speech sound disorder (SSD) with the phonological process of palatal fronting. Overlapping images of the tongue contours that resulted from 35 subjects producing the [s] and [ʃ] sounds were analysed to select 11 spokes on the radial grid that were spread over the tongue contour. The difference was calculated between the mean contour of the [s] and [ʃ] sounds for each spoke. A cluster analysis produced groups with some consistency in the pattern of articulation across subjects and differentiated adults and TDC to some extent and children with SSD with a high level of success. Children with SSD were less likely to show differentiation of the tongue contours between the articulation of [s] and [ʃ].
Movement goals and feedback and feedforward control mechanisms in speech production.

Science.gov (United States)

Perkell, Joseph S

2012-09-01

Studies of speech motor control are described that support a theoretical framework in which fundamental control variables for phonemic movements are multi-dimensional regions in auditory and somatosensory spaces. Auditory feedback is used to acquire and maintain auditory goals and in the development and function of feedback and feedforward control mechanisms. Several lines of evidence support the idea that speakers with more acute sensory discrimination acquire more distinct goal regions and therefore produce speech sounds with greater contrast. Feedback modification findings indicate that fluently produced sound sequences are encoded as feedforward commands, and feedback control serves to correct mismatches between expected and produced sensory consequences.
A randomized controlled trial on the beneficial effects of training letter-speech sound integration on reading fluency in children with dyslexia

NARCIS (Netherlands)

Fraga González, G.; Žarić, G.; Tijms, J.; Bonte, M.; Blomert, L.; van der Molen, M.W.

2015-01-01

A recent account of dyslexia assumes that a failure to develop automated letter-speech sound integration might be responsible for the observed lack of reading fluency. This study uses a pre-test-training-post-test design to evaluate the effects of a training program based on letter-speech sound
A sparse neural code for some speech sounds but not for others.

Directory of Open Access Journals (Sweden)

Mathias Scharinger

Full Text Available The precise neural mechanisms underlying speech sound representations are still a matter of debate. Proponents of 'sparse representations' assume that on the level of speech sounds, only contrastive or otherwise not predictable information is stored in long-term memory. Here, in a passive oddball paradigm, we challenge the neural foundations of such a 'sparse' representation; we use words that differ only in their penultimate consonant ("coronal" [t] vs. "dorsal" [k] place of articulation) and, for example, distinguish between the German nouns Latz ([lats]; bib) and Lachs ([laks]; salmon). Changes from standard [t] to deviant [k] and vice versa elicited a discernible Mismatch Negativity (MMN) response. Crucially, however, the MMN for the deviant [lats] was stronger than the MMN for the deviant [laks]. Source localization showed this difference to be due to enhanced brain activity in right superior temporal cortex. These findings reflect a difference in phonological 'sparsity': Coronal [t] segments, but not dorsal [k] segments, are based on more sparse representations and elicit less specific neural predictions; sensory deviations from this prediction are more readily 'tolerated' and accordingly trigger weaker MMNs. The results foster the neurocomputational reality of 'representationally sparse' models of speech perception that are compatible with more general predictive mechanisms in auditory perception.
Bilateral capacity for speech sound processing in auditory comprehension: evidence from Wada procedures.

Science.gov (United States)

Hickok, G; Okada, K; Barr, W; Pa, J; Rogalsky, C; Donnelly, K; Barde, L; Grant, A

2008-12-01

Data from lesion studies suggest that the ability to perceive speech sounds, as measured by auditory comprehension tasks, is supported by temporal lobe systems in both the left and right hemisphere. For example, patients with left temporal lobe damage and auditory comprehension deficits (i.e., Wernicke's aphasics) nonetheless comprehend isolated words better than one would expect if their speech perception system had been largely destroyed (70-80% accuracy). Further, when comprehension fails in such patients, their errors are more often semantically based than phonemically based. The question addressed by the present study is whether this ability of the right hemisphere to process speech sounds is a result of plastic reorganization following chronic left hemisphere damage, or whether the ability exists in undamaged language systems. We sought to test these possibilities by studying auditory comprehension in acute left versus right hemisphere deactivation during Wada procedures. A series of 20 patients undergoing clinically indicated Wada procedures were asked to listen to an auditorily presented stimulus word, and then point to its matching picture on a card that contained the target picture, a semantic foil, a phonemic foil, and an unrelated foil. This task was performed under three conditions: baseline, during left carotid injection of sodium amytal, and during right carotid injection of sodium amytal. Overall, left hemisphere injection led to a significantly higher error rate than right hemisphere injection. However, consistent with lesion work, the majority (75%) of these errors were semantic in nature. These findings suggest that auditory comprehension deficits are predominantly semantic in nature, even following acute left hemisphere disruption. This, in turn, supports the hypothesis that the right hemisphere is capable of speech sound processing in the intact brain.
The neural basis of speech sound discrimination from infancy to adulthood

OpenAIRE

Partanen, Eino

2013-01-01

Rapid processing of speech is facilitated by neural representations of native language phonemes. However, some disorders and developmental conditions, such as developmental dyslexia, can hamper the development of these neural memory traces, leading to language delays and poor academic achievement. While the early identification of such deficits is paramount so that interventions can be started as early as possible, there is currently no systematically used ecologically valid paradigm for the ...
The phonological memory profile of preschool children who make atypical speech sound errors.

Science.gov (United States)

Waring, Rebecca; Eadie, Patricia; Rickard Liow, Susan; Dodd, Barbara

2018-01-01

Previous research indicates that children with speech sound disorders (SSD) have underlying phonological memory deficits. The SSD population, however, is diverse. While children who make consistent atypical speech errors (phonological disorder/PhDis) are known to have executive function deficits in rule abstraction and cognitive flexibility, little is known about their memory profile. Sixteen monolingual preschool children with atypical speech errors (PhDis) were matched individually to age-and-gender peers with typically developing speech (TDS). The two groups were compared on forward recall of familiar words (pointing response), reverse recall of familiar words (pointing response), and reverse recall of digits (spoken response) and a receptive vocabulary task. There were no differences between children with TDS and children with PhDis on forward recall or vocabulary tasks. However, children with TDS significantly outperformed children with PhDis on the two reverse recall tasks. Findings suggest that atypical speech errors are associated with impaired phonological working memory, implicating executive function impairment in specific subtypes of SSD.
Discrimination of stress in speech and music: a mismatch negativity (MMN) study.

Science.gov (United States)

Peter, Varghese; McArthur, Genevieve; Thompson, William Forde

2012-12-01

The aim of this study was to determine if duration-related stress in speech and music is processed in a similar way in the brain. To this end, we tested 20 adults for their abstract mismatch negativity (MMN) event-related potentials to two duration-related stress patterns: stress on the first syllable or note (long-short), and stress on the second syllable or note (short-long). A significant MMN was elicited for both speech and music except for the short-long speech stimulus. The long-short stimuli elicited larger MMN amplitudes for speech and music compared to short-long stimuli. An extra negativity, the late discriminative negativity (LDN), was observed only for music. The larger MMN amplitude for long-short stimuli might be due to the familiarity of the stress pattern in speech and music. The presence of LDN for music may reflect greater long-term memory transfer for music stimuli. Copyright © 2012 Society for Psychophysiological Research.
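As an illustration of how an MMN such as the one reported above is commonly quantified, the sketch below subtracts the averaged response to the standard stimulus from the averaged response to the deviant and looks for the most negative point in a latency window. This is a minimal sketch, not the authors' analysis pipeline: the amplitude values, the sampling times, and the 100-250 ms search window are all hypothetical assumptions chosen for illustration.

```python
def difference_wave(deviant, standard):
    """Return the deviant-minus-standard difference wave, sample by sample."""
    if len(deviant) != len(standard):
        raise ValueError("deviant and standard must have the same length")
    return [d - s for d, s in zip(deviant, standard)]


def peak_negativity(wave, times_ms, window=(100, 250)):
    """Return (amplitude, latency_ms) of the most negative sample in the window.

    The default window is an assumption for illustration only.
    """
    candidates = [
        (value, t) for value, t in zip(wave, times_ms)
        if window[0] <= t <= window[1]
    ]
    if not candidates:
        raise ValueError("no samples fall inside the window")
    return min(candidates)


# Hypothetical averaged ERPs in microvolts, sampled every 50 ms.
times_ms = [0, 50, 100, 150, 200, 250, 300]
standard = [0.0, 0.5, -1.0, -0.5, 0.2, 0.3, 0.1]
deviant = [0.0, 0.4, -1.5, -2.5, -1.0, 0.2, 0.1]

mmn = difference_wave(deviant, standard)
amplitude, latency = peak_negativity(mmn, times_ms)
print(f"MMN peak: {amplitude:.1f} uV at {latency} ms")
```

A larger (more negative) peak in the difference wave corresponds to what the abstract calls a larger MMN amplitude; real analyses would average over many trials and test significance statistically.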
Sensitivity of cortical auditory evoked potential detection for hearing-impaired infants in response to short speech sounds

Directory of Open Access Journals (Sweden)

Bram Van Dun

2012-01-01

Full Text Available
Background: Cortical auditory evoked potentials (CAEPs) are an emerging tool for hearing aid fitting evaluation in young children who cannot provide reliable behavioral feedback. It is therefore useful to determine the relationship between the sensation level of speech sounds and the detection sensitivity of CAEPs.

Design and methods: Twenty-five sensorineurally hearing impaired infants with an age range of 8 to 30 months were tested once, 18 aided and 7 unaided. First, behavioral thresholds of speech stimuli /m/, /g/, and /t/ were determined using visual reinforcement orientation audiometry (VROA). Afterwards, the same speech stimuli were presented at 55, 65, and 75 dB SPL, and CAEP recordings were made. An automatic statistical detection paradigm was used for CAEP detection.

Results: For sensation levels above 0, 10, and 20 dB respectively, detection sensitivities were equal to 72 ± 10, 75 ± 10, and 78 ± 12%. In 79% of the cases, automatic detection p-values became smaller when the sensation level was increased by 10 dB.

Conclusions: The results of this study suggest that the presence or absence of CAEPs can provide some indication of the audibility of a speech sound for infants with sensorineural hearing loss. The detection of a CAEP provides confidence, to a degree commensurate with the detection probability, that the infant is detecting that sound at the level presented. When testing infants where the audibility of speech sounds has not been established behaviorally, the lack of a cortical response indicates the possibility, but by no means a certainty, that the sensation level is 10 dB or less.
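The sensation levels discussed in this record are the difference between the level at which a stimulus is presented and the listener's behavioral threshold for that stimulus. The sketch below works through that arithmetic for the three presentation levels named in the abstract (55, 65, and 75 dB SPL). The thresholds are hypothetical values, and the sketch assumes that thresholds and presentation levels are expressed on the same dB scale; neither is taken from the study's data.

```python
def sensation_level(presentation_db, threshold_db):
    """Sensation level in dB: presentation level minus behavioral threshold."""
    return presentation_db - threshold_db


# Presentation levels from the abstract (dB SPL).
presentation_levels = [55.0, 65.0, 75.0]

# Hypothetical behavioral thresholds for one infant (same dB scale assumed).
thresholds = {"m": 50.0, "g": 58.0, "t": 66.0}

sensation_table = {
    stimulus: [sensation_level(level, threshold) for level in presentation_levels]
    for stimulus, threshold in thresholds.items()
}

for stimulus, levels in sensation_table.items():
    print(f"/{stimulus}/: {levels}")
```

A negative value means the stimulus was presented below the behavioral threshold, which is the situation in which a missing cortical response would be expected.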
Evaluation of Speech Recognition of Cochlear Implant Recipients Using Adaptive, Digital Remote Microphone Technology and a Speech Enhancement Sound Processing Algorithm.

Science.gov (United States)

Wolfe, Jace; Morais, Mila; Schafer, Erin; Agrawal, Smita; Koch, Dawn

2015-05-01

Cochlear implant recipients often experience difficulty with understanding speech in the presence of noise. Cochlear implant manufacturers have developed sound processing algorithms designed to improve speech recognition in noise, and research has shown these technologies to be effective. Remote microphone technology utilizing adaptive, digital wireless radio transmission has also been shown to provide significant improvement in speech recognition in noise. There are no studies examining the potential improvement in speech recognition in noise when these two technologies are used simultaneously. The goal of this study was to evaluate the potential benefits and limitations associated with the simultaneous use of a sound processing algorithm designed to improve performance in noise (Advanced Bionics ClearVoice) and a remote microphone system that incorporates adaptive, digital wireless radio transmission (Phonak Roger). A two-by-two way repeated measures design was used to examine performance differences obtained without these technologies compared to the use of each technology separately as well as the simultaneous use of both technologies. Participants were eleven Advanced Bionics (AB) cochlear implant recipients, ages 11 to 68 yr. AzBio sentence recognition was measured in quiet and in the presence of classroom noise ranging in level from 50 to 80 dBA in 5-dB steps. Performance was evaluated in four conditions: (1) No ClearVoice and no Roger, (2) ClearVoice enabled without the use of Roger, (3) ClearVoice disabled with Roger enabled, and (4) simultaneous use of ClearVoice and Roger. Speech recognition in quiet was better than speech recognition in noise for all conditions. Use of ClearVoice and Roger each provided significant improvement in speech recognition in noise. The best performance in noise was obtained with the simultaneous use of ClearVoice and Roger. ClearVoice and Roger technology each improves speech recognition in noise, particularly when used at the same time
Oral and Hand Movement Speeds Are Associated with Expressive Language Ability in Children with Speech Sound Disorder

Science.gov (United States)

Peter, Beate

2012-01-01

This study tested the hypothesis that children with speech sound disorder have generalized slowed motor speeds. It evaluated associations among oral and hand motor speeds and measures of speech (articulation and phonology) and language (receptive vocabulary, sentence comprehension, sentence imitation), in 11 children with moderate to severe SSD…
The Comorbidity between Attention-Deficit/Hyperactivity Disorder (ADHD) in Children and Arabic Speech Sound Disorder

Directory of Open Access Journals (Sweden)

Ruaa Osama Hariri

2016-04-01

Full Text Available Children with Attention-Deficiency/Hyperactive Disorder (ADHD) often have co-existing learning disabilities and developmental weaknesses or delays in some areas including speech (Rief, 2005). Seeing that phonological disorders include articulation errors and other forms of speech disorders, studies pertaining to children with ADHD symptoms who demonstrate signs of phonological disorders in their native Arabic language are lacking. The purpose of this study is to provide a description of Arabic language deficits and to present a theoretical model of potential associations between phonological language deficits and ADHD. Dodd and McCormack’s (1995) four subgroups classification of speech disorder and the phonological disorders pertaining to the Arabic language provided by a Saudi Institute for Speech and Hearing are examined within the theoretical framework. Since intervention may improve articulation and focuses a child’s attention on the sound structure of words, findings in this study are based on the assumption that children with ADHD may acquire phonology for their Arabic language in the same way, and following the same developmental stages as intelligible children. Both quantitative and qualitative analyses have proven that the ADHD group analyzed in this study had indeed failed to acquire most of their Arabic consonants as they should have. Keywords: speech sound disorder, attention-deficiency/hyperactive, developmental disorder, phonological disorder, language disorder/delay, language impairment
Children with Speech Sound Disorders at School: Challenges for Children, Parents and Teachers

Science.gov (United States)

Daniel, Graham R.; McLeod, Sharynne

2017-01-01

Teachers play a major role in supporting children's educational, social, and emotional development although may be unprepared for supporting children with speech sound disorders. Interviews with 34 participants including six focus children, their parents, siblings, friends, teachers and other significant adults in their lives highlighted…
Cross-modal distraction by background speech: what role for meaning?

Science.gov (United States)

Marsh, John E; Jones, Dylan M

2010-01-01

Mental tasks are susceptible to disruption by concurrent to-be-ignored speech. The goal of the present paper is to examine whether a theoretical framework successfully applied to irrelevant speech effects in serial recall, interference-by-process, can be extended to verbal tasks in which meaning is the basis of retrieval and to which the irrelevant sound is related to different degrees by meaning. That the semantic characteristics of the to-be-ignored sound interact with the predominance of semantic retrieval in the focal task to determine the degree of disruption is demonstrated in three settings: free recall, category-clustering and fluency. Source monitoring, the difficulty in discriminating episodic information on the basis of the sense modality (visual or auditory) in which it was presented, contributes in part to the disruption by speech. The power of alternative accounts, interference-by-content and attentional capture, to predict these outcomes is also discussed.
Inferior Frontal Sensitivity to Common Speech Sounds Is Amplified by Increasing Word Intelligibility

Science.gov (United States)

Vaden, Kenneth I., Jr.; Kuchinsky, Stefanie E.; Keren, Noam I.; Harris, Kelly C.; Ahlstrom, Jayne B.; Dubno, Judy R.; Eckert, Mark A.

2011-01-01

The left inferior frontal gyrus (LIFG) exhibits increased responsiveness when people listen to words composed of speech sounds that frequently co-occur in the English language (Vaden, Piquado, & Hickok, 2011), termed high phonotactic frequency (Vitevitch & Luce, 1998). The current experiment aimed to further characterize the relation of…
Early Intervening for Students with Speech Sound Disorders: Lessons from a School District

Science.gov (United States)

Mire, Stephen P.; Montgomery, Judy K.

2009-01-01

The concept of early intervening services was introduced into public school systems with the implementation of the Individuals With Disabilities Education Improvement Act (IDEA) of 2004. This article describes a program developed for students with speech sound disorders that incorporated concepts of early intervening services, response to…
Knockdown of the dyslexia-associated gene Kiaa0319 impairs temporal responses to speech stimuli in rat primary auditory cortex.

Science.gov (United States)

Centanni, T M; Booker, A B; Sloan, A M; Chen, F; Maher, B J; Carraway, R S; Khodaparast, N; Rennaker, R; LoTurco, J J; Kilgard, M P

2014-07-01

One in 15 school age children have dyslexia, which is characterized by phoneme-processing problems and difficulty learning to read. Dyslexia is associated with mutations in the gene KIAA0319. It is not known whether reduced expression of KIAA0319 can degrade the brain's ability to process phonemes. In the current study, we used RNA interference (RNAi) to reduce expression of Kiaa0319 (the rat homolog of the human gene KIAA0319) and evaluate the effect in a rat model of phoneme discrimination. Speech discrimination thresholds in normal rats are nearly identical to human thresholds. We recorded multiunit neural responses to isolated speech sounds in primary auditory cortex (A1) of rats that received in utero RNAi of Kiaa0319. Reduced expression of Kiaa0319 increased the trial-by-trial variability of speech responses and reduced the neural discrimination ability of speech sounds. Intracellular recordings from affected neurons revealed that reduced expression of Kiaa0319 increased neural excitability and input resistance. These results provide the first evidence that decreased expression of the dyslexia-associated gene Kiaa0319 can alter cortical responses and impair phoneme processing in auditory cortex. © The Author 2013. Published by Oxford University Press. All rights reserved. For Permissions, please e-mail: journals.permissions@oup.com.
Tutorial and Guidelines on Measurement of Sound Pressure Level in Voice and Speech

Science.gov (United States)

Švec, Jan G.; Granqvist, Svante

2018-01-01

Purpose: Sound pressure level (SPL) measurement of voice and speech is often considered a trivial matter, but the measured levels are often reported incorrectly or incompletely, making them difficult to compare among various studies. This article aims at explaining the fundamental principles behind these measurements and providing guidelines to…
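For readers unfamiliar with the quantity this tutorial covers, sound pressure level is conventionally defined as 20·log10(p_rms / p_ref) with a reference pressure of 20 µPa in air. The sketch below applies that standard definition to a short list of pressure samples. It is a minimal illustration of the definition only; the sample values are hypothetical, and it does not reproduce the calibration, frequency weighting, or microphone-distance guidelines the article itself addresses.

```python
import math

P_REF_PA = 20e-6  # standard reference pressure in air: 20 micropascals


def rms(samples):
    """Root-mean-square of a sequence of pressure samples (pascals)."""
    return math.sqrt(sum(s * s for s in samples) / len(samples))


def spl_db(p_rms_pa):
    """Sound pressure level in dB re 20 uPa for a given RMS pressure."""
    return 20.0 * math.log10(p_rms_pa / P_REF_PA)


# Hypothetical pressure samples in pascals (a square wave with 1 Pa amplitude).
samples = [1.0, -1.0, 1.0, -1.0]
print(f"RMS pressure: {rms(samples):.2f} Pa")
print(f"SPL: {spl_db(rms(samples)):.2f} dB")
```

Doubling the RMS pressure raises the level by about 6 dB, which is one reason reported levels are hard to compare when the measurement conditions are not stated.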

Nonspeech Oral Motor Treatment Issues Related to Children with Developmental Speech Sound Disorders

Science.gov (United States)

Ruscello, Dennis M.

2008-01-01

Purpose: This article examines nonspeech oral motor treatments (NSOMTs) in the population of clients with developmental speech sound disorders. NSOMTs are a collection of nonspeech methods and procedures that claim to influence tongue, lip, and jaw resting postures; increase strength; improve muscle tone; facilitate range of motion; and develop…
The dispersion-focalization theory of sound systems

Science.gov (United States)

Schwartz, Jean-Luc; Abry, Christian; Boë, Louis-Jean; Vallée, Nathalie; Ménard, Lucie

2005-04-01

The Dispersion-Focalization Theory states that sound systems in human languages are shaped by two major perceptual constraints: dispersion driving auditory contrast towards maximal or sufficient values [B. Lindblom, J. Phonetics 18, 135-152 (1990)] and focalization driving auditory spectra towards patterns with close neighboring formants. Dispersion is computed from the sum of the inverse squared inter-spectra distances in the (F1, F2, F3, F4) space, using a non-linear process based on the 3.5 Bark critical distance to estimate F2'. Focalization is based on the idea that close neighboring formants produce vowel spectra with marked peaks, easier to process and memorize in the auditory system. Evidence for increased stability of focal vowels in short-term memory was provided in a discrimination experiment on adult French subjects [J. L. Schwartz and P. Escudier, Speech Comm. 8, 235-259 (1989)]. A reanalysis of infant discrimination data shows that focalization could well be responsible for recurrent discrimination asymmetries [J. L. Schwartz et al., Speech Comm. (in press)]. Recent data about children's vowel production indicate that focalization seems to be part of the perceptual templates driving speech development. The Dispersion-Focalization Theory produces valid predictions for both vowel and consonant systems, in relation with available databases of human language inventories.
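The dispersion term described above, a sum of inverse squared distances between the spectra of every pair of sounds in a system, can be sketched in a few lines. The version below is a deliberately simplified illustration and not the authors' model: it uses only F1 and F2 rather than the (F1, F2, F3, F4) space with an estimated F2', it converts Hz to Bark with the Traunmüller approximation (an assumption), and the formant values for both vowel systems are hypothetical.

```python
from itertools import combinations


def hz_to_bark(f_hz):
    """Approximate Hz-to-Bark conversion (Traunmueller formula; an assumption here)."""
    return 26.81 * f_hz / (1960.0 + f_hz) - 0.53


def dispersion_energy(vowels_bark):
    """Sum of inverse squared Euclidean distances over all vowel pairs.

    Lower values mean the vowels are more widely dispersed.
    """
    total = 0.0
    for a, b in combinations(vowels_bark, 2):
        squared_distance = sum((x - y) ** 2 for x, y in zip(a, b))
        total += 1.0 / squared_distance
    return total


def to_bark(system_hz):
    """Convert a list of (F1, F2) pairs from Hz to Bark."""
    return [tuple(hz_to_bark(f) for f in vowel) for vowel in system_hz]


# Hypothetical (F1, F2) values in Hz for two three-vowel systems.
spread_system = [(280.0, 2250.0), (750.0, 1300.0), (310.0, 870.0)]
crowded_system = [(400.0, 1600.0), (450.0, 1500.0), (500.0, 1400.0)]

spread_energy = dispersion_energy(to_bark(spread_system))
crowded_energy = dispersion_energy(to_bark(crowded_system))
print(f"spread: {spread_energy:.3f}, crowded: {crowded_energy:.3f}")
```

Under this simplified cost, the widely spaced system yields the lower energy, which matches the idea that dispersion favors systems with large auditory contrast.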
Auditory short-term memory trace formation for nonspeech and speech in SLI and dyslexia as indexed by the N100 and mismatch negativity electrophysiological responses.

Science.gov (United States)

Tuomainen, Outi T

2015-04-15

This study investigates nonspeech and speech processing in specific language impairment (SLI) and dyslexia. We used a passive mismatch negativity (MMN) task to tap automatic brain responses and an active behavioural task to tap attended discrimination of nonspeech and speech sounds. Using the roving standard MMN paradigm, we varied the number of standards ('few' vs. 'many') to investigate the effect of sound repetition on N100 and MMN responses. The results revealed that the SLI group needed more repetitions than dyslexics and controls to create a strong enough sensory trace to elicit MMN. In contrast, in the behavioural task, we observed good discrimination of speech and nonspeech in all groups. The findings indicate that auditory processing deficits in SLI and dyslexia are dissociable and that memory trace formation may be implicated in SLI.
Polysyllable Speech Accuracy and Predictors of Later Literacy Development in Preschool Children With Speech Sound Disorders.

Science.gov (United States)

Masso, Sarah; Baker, Elise; McLeod, Sharynne; Wang, Cen

2017-07-12

The aim of this study was to determine if polysyllable accuracy in preschoolers with speech sound disorders (SSD) was related to known predictors of later literacy development: phonological processing, receptive vocabulary, and print knowledge. Polysyllables, words of three or more syllables, are important to consider because unlike monosyllables, polysyllables have been associated with phonological processing and literacy difficulties in school-aged children. They therefore have the potential to help identify preschoolers most at risk of future literacy difficulties. Participants were 93 preschool children with SSD from the Sound Start Study. Participants completed the Polysyllable Preschool Test (Baker, 2013) as well as phonological processing, receptive vocabulary, and print knowledge tasks. Cluster analysis was completed, and 2 clusters were identified: low polysyllable accuracy and moderate polysyllable accuracy. The clusters were significantly different based on 2 measures of phonological awareness and measures of receptive vocabulary, rapid naming, and digit span. The clusters were not significantly different on sound matching accuracy or letter, sound, or print concept knowledge. The participants' poor performance on print knowledge tasks suggested that as a group, they were at risk of literacy difficulties but that there was a cluster of participants at greater risk: those with both low polysyllable accuracy and poor phonological processing.
Discriminating individually considerate and authoritarian leaders by speech activity cues

OpenAIRE

Feese, Sebastian; Muaremi, Amir; Arnrich, Bert; Tröster, Gerhard; Meyer, Bertolt; Jonas, Klaus

2011-01-01

Effective leadership can increase team performance, however up to now the influence of specific micro-level behavioral patterns on team performance is unclear. At the same time, current behavior observation methods in social psychology mostly rely on manual video annotations that impede research. In our work, we follow a sensor-based approach to automatically extract speech activity cues to discriminate individualized considerate from authoritarian leadership. On a subset of 35 selected...
How Should Children with Speech Sound Disorders be Classified? A Review and Critical Evaluation of Current Classification Systems

Science.gov (United States)

Waring, R.; Knight, R.

2013-01-01

Background: Children with speech sound disorders (SSD) form a heterogeneous group who differ in terms of the severity of their condition, underlying cause, speech errors, involvement of other aspects of the linguistic system and treatment response. To date there is no universal and agreed-upon classification system. Instead, a number of…
Language related differences of the sustained response evoked by natural speech sounds.

Directory of Open Access Journals (Sweden)

Christina Siu-Dschu Fan

Full Text Available In tonal languages, such as Mandarin Chinese, the pitch contour of vowels discriminates lexical meaning, which is not the case in non-tonal languages such as German. Recent data provide evidence that pitch processing is influenced by language experience. However, there are still many open questions concerning the representation of such phonological and language-related differences at the level of the auditory cortex (AC). Using magnetoencephalography (MEG), we recorded transient and sustained auditory evoked fields (AEF) in native Chinese and German speakers to investigate language related phonological and semantic aspects in the processing of acoustic stimuli. AEF were elicited by spoken meaningful and meaningless syllables, by vowels, and by a French horn tone. Speech sounds were recorded from a native speaker and showed frequency-modulations according to the pitch-contours of Mandarin. The sustained field (SF) evoked by natural speech signals was significantly larger for Chinese than for German listeners. In contrast, the SF elicited by a horn tone was not significantly different between groups. Furthermore, the SF of Chinese subjects was larger when evoked by meaningful syllables compared to meaningless ones, but there was no significant difference regarding whether vowels were part of the Chinese phonological system or not. Moreover, the N100m gave subtle but clear evidence that for Chinese listeners other factors than purely physical properties play a role in processing meaningful signals. These findings show that the N100 and the SF generated in Heschl's gyrus are influenced by language experience, which suggests that AC activity related to specific pitch contours of vowels is influenced in a top-down fashion by higher, language related areas. Such interactions are in line with anatomical findings and neuroimaging data, as well as with the dual-stream model of language of Hickok and Poeppel that highlights the close and reciprocal interaction
Language related differences of the sustained response evoked by natural speech sounds.

Science.gov (United States)

Fan, Christina Siu-Dschu; Zhu, Xingyu; Dosch, Hans Günter; von Stutterheim, Christiane; Rupp, André

2017-01-01

In tonal languages, such as Mandarin Chinese, the pitch contour of vowels discriminates lexical meaning, which is not the case in non-tonal languages such as German. Recent data provide evidence that pitch processing is influenced by language experience. However, there are still many open questions concerning the representation of such phonological and language-related differences at the level of the auditory cortex (AC). Using magnetoencephalography (MEG), we recorded transient and sustained auditory evoked fields (AEF) in native Chinese and German speakers to investigate language related phonological and semantic aspects in the processing of acoustic stimuli. AEF were elicited by spoken meaningful and meaningless syllables, by vowels, and by a French horn tone. Speech sounds were recorded from a native speaker and showed frequency-modulations according to the pitch-contours of Mandarin. The sustained field (SF) evoked by natural speech signals was significantly larger for Chinese than for German listeners. In contrast, the SF elicited by a horn tone was not significantly different between groups. Furthermore, the SF of Chinese subjects was larger when evoked by meaningful syllables compared to meaningless ones, but there was no significant difference regarding whether vowels were part of the Chinese phonological system or not. Moreover, the N100m gave subtle but clear evidence that for Chinese listeners other factors than purely physical properties play a role in processing meaningful signals. These findings show that the N100 and the SF generated in Heschl's gyrus are influenced by language experience, which suggests that AC activity related to specific pitch contours of vowels is influenced in a top-down fashion by higher, language related areas. Such interactions are in line with anatomical findings and neuroimaging data, as well as with the dual-stream model of language of Hickok and Poeppel that highlights the close and reciprocal interaction between
Letter-speech sound learning in children with dyslexia : From behavioral research to clinical practice

NARCIS (Netherlands)

Aravena, S.

2017-01-01

In alphabetic languages, learning to associate speech-sounds with unfamiliar characters is a critical step in becoming a proficient reader. This dissertation aimed at expanding our knowledge of this learning process and its relation to dyslexia, with an emphasis on bridging the gap between
Sounds Exaggerate Visual Shape

Science.gov (United States)

Sweeny, Timothy D.; Guzman-Martinez, Emmanuel; Ortega, Laura; Grabowecky, Marcia; Suzuki, Satoru

2012-01-01

While perceiving speech, people see mouth shapes that are systematically associated with sounds. In particular, a vertically stretched mouth produces a /woo/ sound, whereas a horizontally stretched mouth produces a /wee/ sound. We demonstrate that hearing these speech sounds alters how we see aspect ratio, a basic visual feature that contributes…
Reading Skills of Students with Speech Sound Disorders at Three Stages of Literacy Development

Science.gov (United States)

Skebo, Crysten M.; Lewis, Barbara A.; Freebairn, Lisa A.; Tag, Jessica; Ciesla, Allison Avrich; Stein, Catherine M.

2013-01-01

Purpose: The relationship between phonological awareness, overall language, vocabulary, and nonlinguistic cognitive skills to decoding and reading comprehension was examined for students at 3 stages of literacy development (i.e., early elementary school, middle school, and high school). Students with histories of speech sound disorders (SSD) with…
Atypical pattern of discriminating sound features in adults with Asperger syndrome as reflected by the mismatch negativity.

Science.gov (United States)

Kujala, T; Aho, E; Lepistö, T; Jansson-Verkasalo, E; Nieminen-von Wendt, T; von Wendt, L; Näätänen, R

2007-04-01

Asperger syndrome, which belongs to the autistic spectrum of disorders, is characterized by deficits of social interaction and abnormal perception, like hypo- or hypersensitivity in reacting to sounds and discriminating certain sound features. We determined auditory feature discrimination in adults with Asperger syndrome with the mismatch negativity (MMN), a neural response which is an index of cortical change detection. We recorded MMN for five different sound features (duration, frequency, intensity, location, and gap). Our results suggest hypersensitive auditory change detection in Asperger syndrome, as reflected in the enhanced MMN for deviant sounds with a gap or shorter duration, and speeded MMN elicitation for frequency changes.
Predictive Brain Mechanisms in Sound-to-Meaning Mapping during Speech Processing.

Science.gov (United States)

Lyu, Bingjiang; Ge, Jianqiao; Niu, Zhendong; Tan, Li Hai; Gao, Jia-Hong

2016-10-19

Spoken language comprehension relies not only on the identification of individual words, but also on the expectations arising from contextual information. A distributed frontotemporal network is known to facilitate the mapping of speech sounds onto their corresponding meanings. However, how prior expectations influence this efficient mapping at the neuroanatomical level, especially in terms of individual words, remains unclear. Using fMRI, we addressed this question in the framework of the dual-stream model by scanning native speakers of Mandarin Chinese, a language highly dependent on context. We found that, within the ventral pathway, the violated expectations elicited stronger activations in the left anterior superior temporal gyrus and the ventral inferior frontal gyrus (IFG) for the phonological-semantic prediction of spoken words. Functional connectivity analysis showed that expectations were mediated by both top-down modulation from the left ventral IFG to the anterior temporal regions and enhanced cross-stream integration through strengthened connections between different subregions of the left IFG. By further investigating the dynamic causality within the dual-stream model, we elucidated how the human brain accomplishes sound-to-meaning mapping for words in a predictive manner. In daily communication via spoken language, one of the core processes is understanding the words being used. Effortless and efficient information exchange via speech relies not only on the identification of individual spoken words, but also on the contextual information giving rise to expected meanings. Despite the accumulating evidence for the bottom-up perception of auditory input, it is still not fully understood how the top-down modulation is achieved in the extensive frontotemporal cortical network. Here, we provide a comprehensive description of the neural substrates underlying sound-to-meaning mapping and demonstrate how the dual-stream model functions in the modulation of
The interaction between acoustic salience and language experience in developmental speech perception: evidence from nasal place discrimination.

Science.gov (United States)

Narayan, Chandan R; Werker, Janet F; Beddor, Patrice Speeter

2010-05-01

Previous research suggests that infant speech perception reorganizes in the first year: young infants discriminate both native and non-native phonetic contrasts, but by 10-12 months difficult non-native contrasts are less discriminable whereas performance improves on native contrasts. In the current study, four experiments tested the hypothesis that, in addition to the influence of native language experience, acoustic salience also affects the perceptual reorganization that takes place in infancy. Using a visual habituation paradigm, two nasal place distinctions that differ in relative acoustic salience, acoustically robust labial-alveolar [ma]-[na] and acoustically less salient alveolar-velar [na]-[ŋa], were presented to infants in a cross-language design. English-learning infants at 6-8 and 10-12 months showed discrimination of the native and acoustically robust [ma]-[na] (Experiment 1), but not the non-native (in initial position) and acoustically less salient [na]-[ŋa] (Experiment 2). Very young (4-5-month-old) English-learning infants tested on the same native and non-native contrasts also showed discrimination of only the [ma]-[na] distinction (Experiment 3). Filipino-learning infants, whose ambient language includes the syllable-initial alveolar (/n/)-velar (/ŋ/) contrast, showed discrimination of native [na]-[ŋa] at 10-12 months, but not at 6-8 months (Experiment 4). These results support the hypothesis that acoustic salience affects speech perception in infancy, with native language experience facilitating discrimination of an acoustically similar phonetic distinction [na]-[ŋa]. We discuss the implications of this developmental profile for a comprehensive theory of speech perception in infancy.
Estimation of sound pressure levels of voiced speech from skin vibration of the neck

NARCIS (Netherlands)

Svec, JG; Titze, IR; Popolo, PS

How accurately can sound pressure levels (SPLs) of speech be estimated from skin vibration of the neck? Measurements using a small accelerometer were carried out in 27 subjects (10 males and 17 females) who read Rainbow and Marvin Williams passages in soft, comfortable, and loud voice, while skin
What does learner speech sound like? A case study on adult learners of isiXhosa

CSIR Research Space (South Africa)

Badenhorst, Jaco

2016-12-01

moved during recording or by a sound/beep that results from the press of a button and an obstruction of the device microphone. • Low volume: Speech is too soft to understand what is being said. • Whispering: Speaker whispers during recording. • Laughter...-processing categories. If any of these categories were marked for a particular utterance, the utterance was discarded. The event categories were combined as follows: • Option 1: Empty, Whispering, Laughter, Background speech, Transcription mismatch • Option 2: Empty...
Relationship between individual differences in speech processing and cognitive functions.

Science.gov (United States)

Ou, Jinghua; Law, Sam-Po; Fung, Roxana

2015-12-01

A growing body of research has suggested that cognitive abilities may play a role in individual differences in speech processing. The present study took advantage of a widespread linguistic phenomenon of sound change to systematically assess the relationships between speech processing and various components of attention and working memory in the auditory and visual modalities among typically developed Cantonese-speaking individuals. The individual variations in speech processing are captured in an ongoing sound change-tone merging in Hong Kong Cantonese, in which typically developed native speakers are reported to lose the distinctions between some tonal contrasts in perception and/or production. Three groups of participants were recruited, with a first group of good perception and production, a second group of good perception but poor production, and a third group of good production but poor perception. Our findings revealed that modality-independent abilities of attentional switching/control and working memory might contribute to individual differences in patterns of speech perception and production as well as discrimination latencies among typically developed speakers. The findings not only have the potential to generalize to speech processing in other languages, but also broaden our understanding of the omnipresent phenomenon of language change in all languages.
Musicians' Enhanced Neural Differentiation of Speech Sounds Arises Early in Life: Developmental Evidence from Ages 3 to 30

Science.gov (United States)

Strait, Dana L.; O'Connell, Samantha; Parbery-Clark, Alexandra; Kraus, Nina

2014-01-01

The perception and neural representation of acoustically similar speech sounds underlie language development. Music training hones the perception of minute acoustic differences that distinguish sounds; this training may generalize to speech processing given that adult musicians have enhanced neural differentiation of similar speech syllables compared with nonmusicians. Here, we asked whether this neural advantage in musicians is present early in life by assessing musically trained and untrained children as young as age 3. We assessed auditory brainstem responses to the speech syllables /ba/ and /ga/ as well as auditory and visual cognitive abilities in musicians and nonmusicians across 3 developmental time-points: preschoolers, school-aged children, and adults. Cross-phase analyses objectively measured the degree to which subcortical responses differed to these speech syllables in musicians and nonmusicians for each age group. Results reveal that musicians exhibit enhanced neural differentiation of stop consonants early in life and with as little as a few years of training. Furthermore, the extent of subcortical stop consonant distinction correlates with auditory-specific cognitive abilities (i.e., auditory working memory and attention). Results are interpreted according to a corticofugal framework for auditory learning in which subcortical processing enhancements are engendered by strengthened cognitive control over auditory function in musicians. PMID:23599166
Gender and vocal production mode discrimination using the high frequencies for speech and singing

Science.gov (United States)

Monson, Brian B.; Lotto, Andrew J.; Story, Brad H.

2014-01-01

Humans routinely produce acoustical energy at frequencies above 6 kHz during vocalization, but this frequency range is often not represented in communication devices and speech perception research. Recent advancements toward high-definition (HD) voice and extended bandwidth hearing aids have increased the interest in the high frequencies. The potential perceptual information provided by high-frequency energy (HFE) is not well characterized. We found that humans can accomplish tasks of gender discrimination and vocal production mode discrimination (speech vs. singing) when presented with acoustic stimuli containing only HFE at both amplified and normal levels. Performance in these tasks was robust in the presence of low-frequency masking noise. No substantial learning effect was observed. Listeners also were able to identify the sung and spoken text (excerpts from “The Star-Spangled Banner”) with very few exposures. These results add to the increasing evidence that the high frequencies provide at least redundant information about the vocal signal, suggesting that its representation in communication devices (e.g., cell phones, hearing aids, and cochlear implants) and speech/voice synthesizers could improve these devices and benefit normal-hearing and hearing-impaired listeners. PMID:25400613
Use of a Deep Recurrent Neural Network to Reduce Wind Noise: Effects on Judged Speech Intelligibility and Sound Quality

Science.gov (United States)

Keshavarzi, Mahmoud; Goehring, Tobias; Zakis, Justin; Turner, Richard E.; Moore, Brian C. J.

2018-01-01

Despite great advances in hearing-aid technology, users still experience problems with noise in windy environments. The potential benefits of using a deep recurrent neural network (RNN) for reducing wind noise were assessed. The RNN was trained using recordings of the output of the two microphones of a behind-the-ear hearing aid in response to male and female speech at various azimuths in the presence of noise produced by wind from various azimuths with a velocity of 3 m/s, using the “clean” speech as a reference. A paired-comparison procedure was used to compare all possible combinations of three conditions for subjective intelligibility and for sound quality or comfort. The conditions were unprocessed noisy speech, noisy speech processed using the RNN, and noisy speech that was high-pass filtered (which also reduced wind noise). Eighteen native English-speaking participants were tested, nine with normal hearing and nine with mild-to-moderate hearing impairment. Frequency-dependent linear amplification was provided for the latter. Processing using the RNN was significantly preferred over no processing by both subject groups for both subjective intelligibility and sound quality, although the magnitude of the preferences was small. High-pass filtering (HPF) was not significantly preferred over no processing. Although RNN was significantly preferred over HPF only for sound quality for the hearing-impaired participants, for the results as a whole, there was a preference for RNN over HPF. Overall, the results suggest that reduction of wind noise using an RNN is possible and might have beneficial effects when used in hearing aids. PMID:29708061
Differential Diagnosis of Speech Sound Disorder (Phonological Disorder): Audiological Assessment beyond the Pure-tone Audiogram.

Science.gov (United States)

Iliadou, Vasiliki Vivian; Chermak, Gail D; Bamiou, Doris-Eva

2015-04-01

According to the Diagnostic and Statistical Manual of Mental Disorders, Fifth Edition, diagnosis of speech sound disorder (SSD) requires a determination that it is not the result of other congenital or acquired conditions, including hearing loss or neurological conditions that may present with similar symptomatology. The purpose was to examine peripheral and central auditory function in order to determine whether a peripheral or central auditory disorder was an underlying factor in, or contributed to, the child's SSD. The design comprised pediatric case reports from a central auditory processing disorder clinic. Three clinical cases are reviewed of children with diagnosed SSD who were referred for audiological evaluation by their speech-language pathologists as a result of slower than expected progress in therapy. Audiological testing revealed auditory deficits involving peripheral auditory function or the central auditory nervous system. These cases demonstrate the importance of increasing awareness among professionals of the need to fully evaluate the auditory system to identify auditory deficits that could contribute to a patient's speech sound (phonological) disorder. Audiological assessment in cases of suspected SSD should not be limited to pure-tone audiometry, given its limitations in revealing the full range of peripheral and central auditory deficits that can compromise treatment of SSD. American Academy of Audiology.
Modeling auditory processing and speech perception in hearing-impaired listeners

DEFF Research Database (Denmark)

Jepsen, Morten Løve

A better understanding of how the human auditory system represents and analyzes sounds and how hearing impairment affects such processing is of great interest for researchers in the fields of auditory neuroscience, audiology, and speech communication as well as for applications in hearing-instrument and speech technology. In this thesis, the primary focus was on the development and evaluation of a computational model of human auditory signal-processing and perception. The model was initially designed to simulate the normal-hearing auditory system with particular focus on the nonlinear processing ... in a diagnostic rhyme test. The framework was constructed such that discrimination errors originating from the front-end and the back-end were separated. The front-end was fitted to individual listeners with cochlear hearing loss according to non-speech data, and speech data were obtained in the same listeners ...
Prediction of IOI-HA Scores Using Speech Reception Thresholds and Speech Discrimination Scores in Quiet

DEFF Research Database (Denmark)

Brännström, K Jonas; Lantz, Johannes; Nielsen, Lars Holme

2014-01-01

... ), and speech discrimination scores (SDSs) in quiet or in noise are common assessments made prior to hearing aid (HA) fittings. It is not known whether SRT and SDS in quiet relate to HA outcome measured with the International Outcome Inventory for Hearing Aids (IOI-HA). PURPOSE: The aim of the present study ... COLLECTION AND ANALYSIS: The psychometric properties were evaluated and compared to previous studies using the IOI-HA. The associations and differences between the outcome scores and a number of descriptive variables (age, gender, fitted monaurally/binaurally with HA, first-time/experienced HA users, years ...
The Beginnings of Danish Speech Perception

DEFF Research Database (Denmark)

Østerbye, Torkil

Little is known about the perception of speech sounds by native Danish listeners. However, the Danish sound system differs in several interesting ways from the sound systems of other languages. For instance, Danish is characterized, among other features, by a rich vowel inventory and by different reductions of speech sounds evident in the pronunciation of the language. This book (originally a PhD thesis) consists of three studies based on the results of two experiments. The experiments were designed to provide knowledge of the perception of Danish speech sounds by Danish adults and infants ... in the light of the rich and complex Danish sound system. The first two studies report on native adults’ perception of Danish speech sounds in quiet and noise. The third study examined the development of language-specific perception in native Danish infants at 6, 9 and 12 months of age. The book points ...
A qualitative analysis of hate speech reported to the Romanian National Council for Combating Discrimination (2003‑2015)

OpenAIRE

Adriana Iordache

2015-01-01

The article analyzes the specificities of Romanian hate speech over a period of twelve years through a qualitative analysis of 384 Decisions of the National Council for Combating Discrimination. The study employs a coding methodology which allows one to separate decisions according to the group that was the victim of hate speech. The article finds that stereotypes employed are similar to those encountered in the international literature. The main target of hate speech is the Roma, who are ...
Focal versus distributed temporal cortex activity for speech sound category assignment

Science.gov (United States)

Bouton, Sophie; Chambon, Valérian; Tyrand, Rémi; Seeck, Margitta; Karkar, Sami; van de Ville, Dimitri; Giraud, Anne-Lise

2018-01-01

Percepts and words can be decoded from distributed neural activity measures. However, the existence of widespread representations might conflict with the more classical notions of hierarchical processing and efficient coding, which are especially relevant in speech processing. Using fMRI and magnetoencephalography during syllable identification, we show that sensory and decisional activity colocalize to a restricted part of the posterior superior temporal gyrus (pSTG). Next, using intracortical recordings, we demonstrate that early and focal neural activity in this region distinguishes correct from incorrect decisions and can be machine-decoded to classify syllables. Crucially, significant machine decoding was possible from neuronal activity sampled across different regions of the temporal and frontal lobes, despite weak or absent sensory or decision-related responses. These findings show that speech-sound categorization relies on an efficient readout of focal pSTG neural activity, while more distributed activity patterns, although classifiable by machine learning, instead reflect collateral processes of sensory perception and decision. PMID:29363598
Corollary discharge provides the sensory content of inner speech.

Science.gov (United States)

Scott, Mark

2013-09-01

Inner speech is one of the most common, but least investigated, mental activities humans perform. It is an internal copy of one's external voice and so is similar to a well-established component of motor control: corollary discharge. Corollary discharge is a prediction of the sound of one's voice generated by the motor system. This prediction is normally used to filter self-caused sounds from perception, which segregates them from externally caused sounds and prevents the sensory confusion that would otherwise result. The similarity between inner speech and corollary discharge motivates the theory, tested here, that corollary discharge provides the sensory content of inner speech. The results reported here show that inner speech attenuates the impact of external sounds. This attenuation was measured using a context effect (an influence of contextual speech sounds on the perception of subsequent speech sounds), which weakens in the presence of speech imagery that matches the context sound. Results from a control experiment demonstrated this weakening in external speech as well. Such sensory attenuation is a hallmark of corollary discharge.
A Randomized Controlled Trial on The Beneficial Effects of Training Letter-Speech Sound Integration on Reading Fluency in Children with Dyslexia.

Directory of Open Access Journals (Sweden)

Gorka Fraga González

A recent account of dyslexia assumes that a failure to develop automated letter-speech sound integration might be responsible for the observed lack of reading fluency. This study uses a pre-test-training-post-test design to evaluate the effects of a training program based on letter-speech sound associations with a special focus on gains in reading fluency. A sample of 44 children with dyslexia and 23 typical readers, aged 8 to 9, was recruited. Children with dyslexia were randomly allocated to either the training program group (n = 23) or a waiting-list control group (n = 21). The training intensively focused on letter-speech sound mapping and consisted of 34 individual sessions of 45 minutes over a five-month period. The children with dyslexia showed substantial reading gains for the main word reading and spelling measures after training, improving at a faster rate than typical readers and waiting-list controls. The results are interpreted within the conceptual framework assuming a multisensory integration deficit as the most proximal cause of dysfluent reading in dyslexia. ISRCTN register: ISRCTN12783279.
Statistical Learning, Syllable Processing, and Speech Production in Healthy Hearing and Hearing-Impaired Preschool Children: A Mismatch Negativity Study.

Science.gov (United States)

Studer-Eichenberger, Esther; Studer-Eichenberger, Felix; Koenig, Thomas

2016-01-01

The objectives of the present study were to investigate temporal/spectral sound-feature processing in preschool children (4 to 7 years old) with peripheral hearing loss compared with age-matched controls. The results verified the presence of statistical learning, which was diminished in children with hearing impairments (HIs), and elucidated possible perceptual mediators of speech production. Perception and production of the syllables /ba/, /da/, /ta/, and /na/ were recorded in 13 children with normal hearing and 13 children with HI. Perception was assessed physiologically through event-related potentials (ERPs) recorded by EEG in a multifeature mismatch negativity paradigm and behaviorally through a discrimination task. Temporal and spectral features of the ERPs during speech perception were analyzed, and speech production was quantitatively evaluated using speech motor maximum performance tasks. Proximal to stimulus onset, children with HI displayed a difference in map topography, indicating diminished statistical learning. In later ERP components, children with HI exhibited reduced amplitudes in the N2 and early parts of the late discriminative negativity components specifically, which are associated with temporal and spectral control mechanisms. Abnormalities of speech perception were only subtly reflected in speech production, as the lone difference found in speech production studies was a mild delay in regulating speech intensity. In addition to previously reported deficits of sound-feature discriminations, the present study results reflect diminished statistical learning in children with HI, which plays an early and important, but so far neglected, role in phonological processing. Furthermore, the lack of corresponding behavioral abnormalities in speech production implies that impaired perceptual capacities do not necessarily translate into productive deficits.
Cluster-Randomized Controlled Trial Evaluating the Effectiveness of Computer-Assisted Intervention Delivered by Educators for Children with Speech Sound Disorders

Science.gov (United States)

McLeod, Sharynne; Baker, Elise; McCormack, Jane; Wren, Yvonne; Roulstone, Sue; Crowe, Kathryn; Masso, Sarah; White, Paul; Howland, Charlotte

2017-01-01

Purpose: The aim was to evaluate the effectiveness of computer-assisted input-based intervention for children with speech sound disorders (SSD). Method: The Sound Start Study was a cluster-randomized controlled trial. Seventy-nine early childhood centers were invited to participate, 45 were recruited, and 1,205 parents and educators of 4- and…
Neural indices of phonemic discrimination and sentence-level speech intelligibility in quiet and noise: A P3 study.

Science.gov (United States)

Koerner, Tess K; Zhang, Yang; Nelson, Peggy B; Wang, Boxiang; Zou, Hui

2017-07-01

This study examined how speech babble noise differentially affected the auditory P3 responses and the associated neural oscillatory activities for consonant and vowel discrimination in relation to segmental- and sentence-level speech perception in noise. The data were collected from 16 normal-hearing participants in a double-oddball paradigm that contained a consonant (/ba/ to /da/) and vowel (/ba/ to /bu/) change in quiet and noise (speech-babble background at a -3 dB signal-to-noise ratio) conditions. Time-frequency analysis was applied to obtain inter-trial phase coherence (ITPC) and event-related spectral perturbation (ERSP) measures in delta, theta, and alpha frequency bands for the P3 response. Behavioral measures included percent correct phoneme detection and reaction time as well as percent correct IEEE sentence recognition in quiet and in noise. Linear mixed-effects models were applied to determine possible brain-behavior correlates. A significant noise-induced reduction in P3 amplitude was found, accompanied by significantly longer P3 latency and decreases in ITPC across all frequency bands of interest. There was a differential effect of noise on consonant discrimination and vowel discrimination in both ERP and behavioral measures, such that noise impacted the detection of the consonant change more than the vowel change. The P3 amplitude and some of the ITPC and ERSP measures were significant predictors of speech perception at segmental- and sentence-levels across listening conditions and stimuli. These data demonstrate that the P3 response with its associated cortical oscillations represents a potential neurophysiological marker for speech perception in noise. Copyright © 2017 Elsevier B.V. All rights reserved.
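As an illustration of the inter-trial phase coherence (ITPC) measure named in the abstract above, the following is a minimal sketch using the common textbook definition: the length of the mean unit phase vector across trials at one time-frequency point. The function name `itpc` and its input format (a list of phase angles in radians, one per trial) are assumptions made for this example; the abstract does not describe the authors' actual analysis pipeline.

```python
import cmath
import math


def itpc(phases):
    """Inter-trial phase coherence at a single time-frequency point.

    `phases` holds one phase angle (radians) per trial. This is the
    standard textbook definition, not necessarily the study's method.
    Returns a value between 0 (no phase alignment) and 1 (identical phase).
    """
    if not phases:
        raise ValueError("need at least one trial")
    # Represent each trial as a unit vector on the complex plane,
    # average the vectors, and take the length of the mean vector.
    mean_vector = sum(cmath.exp(1j * p) for p in phases) / len(phases)
    return abs(mean_vector)


if __name__ == "__main__":
    aligned = [0.3] * 5                       # same phase on every trial
    opposed = [0.0, math.pi]                  # phases cancel out
    print(round(itpc(aligned), 6), round(itpc(opposed), 6))
```

Under this definition, a noise-induced decrease in ITPC such as the one reported would correspond to trial phases becoming less consistently aligned to the stimulus.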
A Survey of University Professors Teaching Speech Sound Disorders: Nonspeech Oral Motor Exercises and Other Topics

Science.gov (United States)

Watson, Maggie M.; Lof, Gregory L.

2009-01-01

Purpose: The purpose of this article was to obtain and organize information from instructors who teach course work on the subject of children's speech sound disorders (SSD) regarding their use of teaching resources, involvement in students' clinical practica, and intervention approaches presented to students. Instructors also reported if they…
Rainforests as concert halls for birds: Are reverberations improving sound transmission of long song elements?

DEFF Research Database (Denmark)

Nemeth, Erwin; Dabelsteen, Torben; Pedersen, Simon Boel

2006-01-01

In forests, reverberations probably have both detrimental and beneficial effects on avian communication. They constrain signal discrimination by masking fast repetitive sounds, and they improve signal detection by elongating sounds. This ambivalence of reflections for animal signals in forests is similar to the influence of reverberations on speech or music in indoor sound transmission. Since comparisons of sound fields of forests and concert halls have demonstrated that reflections can contribute a considerable part of the energy of a received sound in both environments, it is here assumed that reverberations ... that longer sounds are less attenuated. The results indicate that higher sound pressure level is caused by superimposing reflections. It is suggested that this beneficial effect of reverberations explains interspecific birdsong differences in element length. Transmission paths with stronger reverberations ...
Effects of Active and Passive Hearing Protection Devices on Sound Source Localization, Speech Recognition, and Tone Detection.

Directory of Open Access Journals (Sweden)

Andrew D Brown

Full Text Available Hearing protection devices (HPDs) such as earplugs offer to mitigate noise exposure and reduce the incidence of hearing loss among persons frequently exposed to intense sound. However, distortions of spatial acoustic information and reduced audibility of low-intensity sounds caused by many existing HPDs can make their use untenable in high-risk (e.g., military or law enforcement) environments where auditory situational awareness is imperative. Here we assessed (1) sound source localization accuracy using a head-turning paradigm, (2) speech-in-noise recognition using a modified version of the QuickSIN test, and (3) tone detection thresholds using a two-alternative forced-choice task. Subjects were 10 young normal-hearing males. Four different HPDs were tested (two active, two passive), including two new and previously untested devices. Relative to unoccluded (control) performance, all tested HPDs significantly degraded performance across tasks, although one active HPD slightly improved high-frequency tone detection thresholds and did not degrade speech recognition. Behavioral data were examined with respect to head-related transfer functions measured using a binaural manikin with and without tested HPDs in place. Data reinforce previous reports that HPDs significantly compromise a variety of auditory perceptual facilities, particularly sound localization due to distortions of high-frequency spectral cues that are important for the avoidance of front-back confusions.
Evaluation of the effects of nonlinear frequency compression on speech recognition and sound quality for adults with mild to moderate hearing loss.

Science.gov (United States)

Picou, Erin M; Marcrum, Steven C; Ricketts, Todd A

2015-03-01

While potentially improving audibility for listeners with considerable high frequency hearing loss, the effects of implementing nonlinear frequency compression (NFC) for listeners with moderate high frequency hearing loss are unclear. The purpose of this study was to investigate the effects of activating NFC for listeners who are not traditionally considered candidates for this technology. Participants wore study hearing aids with NFC activated for a 3-4 week trial period. After the trial period, they were tested with NFC and with conventional processing on measures of consonant discrimination threshold in quiet, consonant recognition in quiet, sentence recognition in noise, and acceptableness of sound quality of speech and music. Seventeen adult listeners with symmetrical, mild to moderate sensorineural hearing loss participated. Better ear, high frequency pure-tone averages (4, 6, and 8 kHz) were 60 dB HL or better. Activating NFC resulted in lower (better) thresholds for discrimination of /s/, whose spectral center was 9 kHz. There were no other significant effects of NFC compared to conventional processing. These data suggest that the benefits, and detriments, of activating NFC may be limited for this population.
Statistical learning of recurring sound patterns encodes auditory objects in songbird forebrain.

Science.gov (United States)

Lu, Kai; Vicario, David S

2014-10-07

Auditory neurophysiology has demonstrated how basic acoustic features are mapped in the brain, but it is still not clear how multiple sound components are integrated over time and recognized as an object. We investigated the role of statistical learning in encoding the sequential features of complex sounds by recording neuronal responses bilaterally in the auditory forebrain of awake songbirds that were passively exposed to long sound streams. These streams contained sequential regularities, and were similar to streams used in human infants to demonstrate statistical learning for speech sounds. For stimulus patterns with contiguous transitions and with nonadjacent elements, single and multiunit responses reflected neuronal discrimination of the familiar patterns from novel patterns. In addition, discrimination of nonadjacent patterns was stronger in the right hemisphere than in the left, and may reflect an effect of top-down modulation that is lateralized. Responses to recurring patterns showed stimulus-specific adaptation, a sparsening of neural activity that may contribute to encoding invariants in the sound stream and that appears to increase coding efficiency for the familiar stimuli across the population of neurons recorded. As auditory information about the world must be received serially over time, recognition of complex auditory objects may depend on this type of mnemonic process to create and differentiate representations of recently heard sounds.
Audio-visual speech perception in infants and toddlers with Down syndrome, fragile X syndrome, and Williams syndrome.

Science.gov (United States)

D'Souza, Dean; D'Souza, Hana; Johnson, Mark H; Karmiloff-Smith, Annette

2016-08-01

Typically-developing (TD) infants can construct unified cross-modal percepts, such as a speaking face, by integrating auditory-visual (AV) information. This skill is a key building block upon which higher-level skills, such as word learning, are built. Because word learning is seriously delayed in most children with neurodevelopmental disorders, we assessed the hypothesis that this delay partly results from a deficit in integrating AV speech cues. AV speech integration has rarely been investigated in neurodevelopmental disorders, and never previously in infants. We probed for the McGurk effect, which occurs when the auditory component of one sound (/ba/) is paired with the visual component of another sound (/ga/), leading to the perception of an illusory third sound (/da/ or /tha/). We measured AV integration in 95 infants/toddlers with Down, fragile X, or Williams syndrome, whom we matched on Chronological and Mental Age to 25 TD infants. We also assessed a more basic AV perceptual ability: sensitivity to matching vs. mismatching AV speech stimuli. Infants with Williams syndrome failed to demonstrate a McGurk effect, indicating poor AV speech integration. Moreover, while the TD children discriminated between matching and mismatching AV stimuli, none of the other groups did, hinting at a basic deficit or delay in AV speech processing, which is likely to constrain subsequent language development. Copyright © 2016 Elsevier Inc. All rights reserved.
Low-level neural auditory discrimination dysfunctions in specific language impairment: A review on mismatch negativity findings

Directory of Open Access Journals (Sweden)

Teija Kujala

2017-12-01

Full Text Available In specific language impairment (SLI), there is a delay in the child's oral language skills when compared with nonverbal cognitive abilities. The problems typically relate to phonological and morphological processing and word learning. This article reviews studies which have used mismatch negativity (MMN) in investigating low-level neural auditory dysfunctions in this disorder. With MMN, it is possible to tap the accuracy of neural sound discrimination and sensory memory functions. These studies have found smaller response amplitudes and longer latencies for speech and non-speech sound changes in children with SLI than in typically developing children, suggesting impaired and slow auditory discrimination in SLI. Furthermore, they suggest shortened sensory memory duration and vulnerability of the sensory memory to masking effects. Importantly, some studies reported associations between MMN parameters and language test measures. In addition, it was found that language intervention can influence the abnormal MMN in children with SLI, enhancing its amplitude. These results suggest that the MMN can shed light on the neural basis of various auditory and memory impairments in SLI, which are likely to influence speech perception. Keywords: Specific language impairment, Auditory processing, Mismatch negativity (MMN)
Reduced neural integration of letters and speech sounds in dyslexic children scales with individual differences in reading fluency.

Directory of Open Access Journals (Sweden)

Gojko Žarić

Full Text Available The acquisition of letter-speech sound associations is one of the basic requirements for fluent reading acquisition and its failure may contribute to reading difficulties in developmental dyslexia. Here we investigated event-related potential (ERP) measures of letter-speech sound integration in 9-year-old typical and dyslexic readers and specifically tested their relation to individual differences in reading fluency. We employed an audiovisual oddball paradigm in typical readers (n = 20), dysfluent (n = 18) and severely dysfluent (n = 18) dyslexic children. In one auditory and two audiovisual conditions the Dutch spoken vowels /a/ and /o/ were presented as standard and deviant stimuli. In audiovisual blocks, the letter 'a' was presented either simultaneously (AV0), or 200 ms before (AV200) vowel sound onset. Across the three children groups, vowel deviancy in auditory blocks elicited comparable mismatch negativity (MMN) and late negativity (LN) responses. In typical readers, both audiovisual conditions (AV0 and AV200) led to enhanced MMN and LN amplitudes. In both dyslexic groups, the audiovisual LN effects were mildly reduced. Most interestingly, individual differences in reading fluency were correlated with MMN latency in the AV0 condition. A further analysis revealed that this effect was driven by a short-lived MMN effect encompassing only the N1 window in severely dysfluent dyslexics versus a longer MMN effect encompassing both the N1 and P2 windows in the other two groups. Our results confirm and extend previous findings in dyslexic children by demonstrating a deficient pattern of letter-speech sound integration depending on the level of reading dysfluency. These findings underscore the importance of considering individual differences across the entire spectrum of reading skills in addition to group differences between typical and dyslexic readers.

Perception of environmental sounds by experienced cochlear implant patients

Science.gov (United States)

Shafiro, Valeriy; Gygi, Brian; Cheng, Min-Yu; Vachhani, Jay; Mulvey, Megan

2011-01-01

Objectives: Environmental sound perception serves an important ecological function by providing listeners with information about objects and events in their immediate environment. Environmental sounds such as car horns, baby cries or chirping birds can alert listeners to imminent dangers as well as contribute to one's sense of awareness and well being. Perception of environmental sounds as acoustically and semantically complex stimuli may also involve some factors common to the processing of speech. However, very limited research has investigated the abilities of cochlear implant (CI) patients to identify common environmental sounds, despite patients' general enthusiasm about them. This project (1) investigated the ability of patients with modern-day CIs to perceive environmental sounds, (2) explored associations among speech, environmental sounds and basic auditory abilities, and (3) examined acoustic factors that might be involved in environmental sound perception. Design: Seventeen experienced postlingually-deafened CI patients participated in the study. Environmental sound perception was assessed with a large-item test composed of 40 sound sources, each represented by four different tokens. The relationship between speech and environmental sound perception, and the role of working memory and some basic auditory abilities were examined based on patient performance on a battery of speech tests (HINT, CNC, and individual consonant and vowel tests), tests of basic auditory abilities (audiometric thresholds, gap detection, temporal pattern and temporal order for tones tests) and a backward digit recall test. Results: The results indicated substantially reduced ability to identify common environmental sounds in CI patients (45.3%). Except for vowels, all speech test scores significantly correlated with the environmental sound test scores: r = 0.73 for HINT in quiet, r = 0.69 for HINT in noise, r = 0.70 for CNC, r = 0.64 for consonants and r = 0.48 for vowels. HINT and
[Improving speech comprehension using a new cochlear implant speech processor].

Science.gov (United States)

Müller-Deile, J; Kortmann, T; Hoppe, U; Hessel, H; Morsnowski, A

2009-06-01

The aim of this multicenter clinical field study was to assess the benefits of the new Freedom 24 sound processor for cochlear implant (CI) users implanted with the Nucleus 24 cochlear implant system. The study included 48 postlingually profoundly deaf experienced CI users who demonstrated speech comprehension performance with their current speech processor on the Oldenburg sentence test (OLSA) in quiet conditions of at least 80% correct scores and who were able to perform adaptive speech threshold testing using the OLSA in noisy conditions. Following baseline measures of speech comprehension performance with their current speech processor, subjects were upgraded to the Freedom 24 speech processor. After a take-home trial period of at least 2 weeks, subject performance was evaluated by measuring the speech reception threshold with the Freiburg multisyllabic word test and speech intelligibility with the Freiburg monosyllabic word test at 50 dB and 70 dB in the sound field. The results demonstrated highly significant benefits for speech comprehension with the new speech processor. Significant benefits for speech comprehension were also demonstrated with the new speech processor when tested in competing background noise. In contrast, use of the Abbreviated Profile of Hearing Aid Benefit (APHAB) did not prove to be a suitably sensitive assessment tool for comparative subjective self-assessment of hearing benefits with each processor. Use of the preprocessing algorithm known as adaptive dynamic range optimization (ADRO) in the Freedom 24 led to additional improvements over the standard upgrade map for speech comprehension in quiet and showed equivalent performance in noise. Through use of the preprocessing beam-forming algorithm BEAM, subjects demonstrated a highly significant improved signal-to-noise ratio for speech comprehension thresholds (i.e., signal-to-noise ratio for 50% speech comprehension scores) when tested with an adaptive procedure using the Oldenburg
Spanish is better than English for discriminating Portuguese vowels: acoustic similarity versus vowel inventory size

Science.gov (United States)

Elvin, Jaydene; Escudero, Paola; Vasiliev, Polina

2014-01-01

Second language (L2) learners often struggle to distinguish sound contrasts that are not present in their native language (L1). Models of non-native and L2 sound perception claim that perceptual similarity between L1 and L2 sound contrasts correctly predicts discrimination by naïve listeners and L2 learners. The present study tested the explanatory power of vowel inventory size versus acoustic properties as predictors of discrimination accuracy when naïve Australian English (AusE) and Iberian Spanish (IS) listeners are presented with six Brazilian Portuguese (BP) vowel contrasts. Our results show that IS listeners outperformed AusE listeners, confirming that cross-linguistic acoustic properties, rather than cross-linguistic vowel inventory sizes, successfully predict non-native discrimination difficulty. Furthermore, acoustic distance between BP vowels and closest L1 vowels successfully predicted differential levels of difficulty among the six BP contrasts, with BP /e-i/ and /o-u/ being the most difficult for both listener groups. We discuss the importance of our findings for the adequacy of models of L2 speech perception. PMID:25400599
Piano training enhances the neural processing of pitch and improves speech perception in Mandarin-speaking children.

Science.gov (United States)

Nan, Yun; Liu, Li; Geiser, Eveline; Shu, Hua; Gong, Chen Chen; Dong, Qi; Gabrieli, John D E; Desimone, Robert

2018-06-25

Musical training confers advantages in speech-sound processing, which could play an important role in early childhood education. To understand the mechanisms of this effect, we used event-related potential and behavioral measures in a longitudinal design. Seventy-four Mandarin-speaking children aged 4-5 y old were pseudorandomly assigned to piano training, reading training, or a no-contact control group. Six months of piano training improved behavioral auditory word discrimination in general as well as word discrimination based on vowels compared with the controls. The reading group yielded similar trends. However, the piano group demonstrated unique advantages over the reading and control groups in consonant-based word discrimination and in enhanced positive mismatch responses (pMMRs) to lexical tone and musical pitch changes. The improved word discrimination based on consonants correlated with the enhancements in musical pitch pMMRs among the children in the piano group. In contrast, all three groups improved equally on general cognitive measures, including tests of IQ, working memory, and attention. The results suggest strengthened common sound processing across domains as an important mechanism underlying the benefits of musical training on language processing. In addition, although we failed to find far-transfer effects of musical training to general cognition, the near-transfer effects to speech perception establish the potential for musical training to help children improve their language skills. Piano training was not inferior to reading training on direct tests of language function, and it even seemed superior to reading training in enhancing consonant discrimination.
Gay- and Lesbian-Sounding Auditory Cues Elicit Stereotyping and Discrimination.

Science.gov (United States)

Fasoli, Fabio; Maass, Anne; Paladino, Maria Paola; Sulpizio, Simone

2017-07-01

The growing body of literature on the recognition of sexual orientation from voice ("auditory gaydar") is silent on the cognitive and social consequences of having a gay-/lesbian- versus heterosexual-sounding voice. We investigated this issue in four studies (overall N = 276), conducted in Italian, in which heterosexual listeners were exposed to single-sentence voice samples of gay/lesbian and heterosexual speakers. In all four studies, listeners were found to make gender-typical inferences about traits and preferences of heterosexual speakers, but gender-atypical inferences about those of gay or lesbian speakers. Behavioral intention measures showed that listeners considered lesbian and gay speakers as less suitable for a leadership position, and male (but not female) listeners took distance from gay speakers. Together, this research demonstrates that having a gay/lesbian rather than heterosexual-sounding voice has tangible consequences for stereotyping and discrimination.
Part-of-speech effects on text-to-speech synthesis

CSIR Research Space (South Africa)

Schlunz, GI

2010-11-01

Full Text Available One of the goals of text-to-speech (TTS) systems is to produce natural-sounding synthesised speech. Towards this end various natural language processing (NLP) tasks are performed to model the prosodic aspects of the TTS voice. One of the fundamental...
Comparison between bilateral cochlear implants and Neurelec Digisonic(®) SP Binaural cochlear implant: speech perception, sound localization and patient self-assessment.

Science.gov (United States)

Bonnard, Damien; Lautissier, Sylvie; Bosset-Audoit, Amélie; Coriat, Géraldine; Beraha, Max; Maunoury, Antoine; Martel, Jacques; Darrouzet, Vincent; Bébéar, Jean-Pierre; Dauman, René

2013-01-01

An alternative to bilateral cochlear implantation is offered by the Neurelec Digisonic(®) SP Binaural cochlear implant, which allows stimulation of both cochleae within a single device. The purpose of this prospective study was to compare a group of Neurelec Digisonic(®) SP Binaural implant users (denoted BINAURAL group, n = 7) with a group of bilateral adult cochlear implant users (denoted BILATERAL group, n = 6) in terms of speech perception, sound localization, and self-assessment of health status and hearing disability. Speech perception was assessed using word recognition at 60 dB SPL in quiet and in a 'cocktail party' noise delivered through five loudspeakers in the hemi-sound field facing the patient (signal-to-noise ratio = +10 dB). The sound localization task was to determine the source of a sound stimulus among five speakers positioned between -90° and +90° from midline. Change in health status was assessed using the Glasgow Benefit Inventory and hearing disability was evaluated with the Abbreviated Profile of Hearing Aid Benefit. Speech perception was not statistically different between the two groups, even though there was a trend in favor of the BINAURAL group (mean percent word recognition in the BINAURAL and BILATERAL groups: 70 vs. 56.7% in quiet, 55.7 vs. 43.3% in noise). There was also no significant difference with regard to performance in sound localization and self-assessment of health status and hearing disability. On the basis of the BINAURAL group's performance in hearing tasks involving the detection of interaural differences, implantation with the Neurelec Digisonic(®) SP Binaural implant may be considered to restore effective binaural hearing. Based on these first comparative results, this device seems to provide benefits similar to those of traditional bilateral cochlear implantation, with a new approach to stimulate both auditory nerves. Copyright © 2013 S. Karger AG, Basel.
Acoustic analyses of speech sounds and rhythms in Japanese- and English-learning infants

Directory of Open Access Journals (Sweden)

Yuko Yamashita

2013-02-01

Full Text Available The purpose of this study was to explore developmental changes, in terms of spectral fluctuations and temporal periodicity, in Japanese- and English-learning infants. Three age groups (15, 20, and 24 months) were selected, because infants diversify phonetic inventories with age. Natural speech of the infants was recorded. We utilized a critical-band-filter bank, which simulated the frequency resolution in adults' auditory periphery. First, the correlations between the critical-band outputs represented by factor analysis were observed in order to see how the critical bands should be connected to each other, if a listener is to differentiate sounds in infants' speech. In the following analysis, we analyzed the temporal fluctuations of factor scores by calculating autocorrelations. The present analysis identified three factors observed in adult speech at 24 months of age in both linguistic environments. These three factors were shifted to a higher frequency range corresponding to the smaller vocal tract size of the infants. The results suggest that the vocal tract structures of the infants had developed an adult-like configuration by 24 months of age in both language environments. The number of utterances with shorter-period periodicity increased with age in both environments. This trend was clearer in the Japanese environment.
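The abstract above mentions calculating autocorrelations of factor scores to examine temporal periodicity. As a rough illustration of that generic technique, and not the authors' procedure, the sketch below computes a normalized autocorrelation at a given lag for a short hypothetical series; the function name and the toy signal are assumptions.

```python
def autocorrelation(series, lag):
    """Normalized autocorrelation of a 1-D series at a non-negative lag.

    The series is mean-centered and the result is scaled by the lag-0
    sum of squares, so lag 0 always gives 1.0.
    """
    n = len(series)
    if not 0 <= lag < n:
        raise ValueError("lag must satisfy 0 <= lag < len(series)")
    mean = sum(series) / n
    centered = [v - mean for v in series]
    denom = sum(c * c for c in centered)
    if denom == 0:
        raise ValueError("series is constant; autocorrelation undefined")
    num = sum(centered[i] * centered[i + lag] for i in range(n - lag))
    return num / denom


# Hypothetical factor-score track that repeats every 4 samples.
scores = [1.0, 0.0, -1.0, 0.0] * 8
for lag in (0, 2, 4):
    print(lag, autocorrelation(scores, lag))
```

A peak at a non-zero lag indicates periodicity with that period, so a shift of such peaks toward shorter lags would correspond to the shorter-period utterances the abstract reports increasing with age.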
Sound of mind : electrophysiological and behavioural evidence for the role of context, variation and informativity in human speech processing

NARCIS (Netherlands)

Nixon, Jessie Sophia

2014-01-01

Spoken communication involves transmission of a message which takes physical form in acoustic waves. Within any given language, acoustic cues pattern in language-specific ways along language-specific acoustic dimensions to create speech sound contrasts. These cues are utilized by listeners to
Comparison of speech perception performance between Sprint/Esprit 3G and Freedom processors in children implanted with nucleus cochlear implants.

Science.gov (United States)

Santarelli, Rosamaria; Magnavita, Vincenzo; De Filippi, Roberta; Ventura, Laura; Genovese, Elisabetta; Arslan, Edoardo

2009-04-01

To compare speech perception performance in children fitted with previous generation Nucleus sound processor, Sprint or Esprit 3G, and the Freedom, the most recently released system from the Cochlear Corporation that features a larger input dynamic range. Prospective intrasubject comparative study. University Medical Center. Seventeen prelingually deafened children who had received the Nucleus 24 cochlear implant and used the Sprint or Esprit 3G sound processor. Cochlear implantation with Cochlear device. Speech perception was evaluated at baseline (Sprint, n = 11; Esprit 3G, n = 6) and after 1 month's experience with the Freedom sound processor. Identification and recognition of disyllabic words and identification of vowels were performed via recorded voice in quiet (70 dB [A]), in the presence of background noise at various levels of signal-to-noise ratio (+10, +5, 0, -5) and at a soft presentation level (60 dB [A]). Consonant identification and recognition of disyllabic words, trisyllabic words, and sentences were evaluated in live voice. Frequency discrimination was measured in a subset of subjects (n = 5) by using an adaptive, 3-interval, 3-alternative, forced-choice procedure. Identification of disyllabic words administered at a soft presentation level showed a significant increase when switching to the Freedom compared with the previously worn processor in children using the Sprint or Esprit 3G. Identification and recognition of disyllabic words in the presence of background noise as well as consonant identification and sentence recognition increased significantly for the Freedom compared with the previously worn device only in children fitted with the Sprint. Frequency discrimination was significantly better when switching to the Freedom compared with the previously worn processor. Serial comparisons revealed that speech perception performance evaluated in children aged 5 to 15 years was superior with the Freedom than previous generations of Nucleus
Effects of Familiarity and Feeding on Newborn Speech-Voice Recognition

Science.gov (United States)

Valiante, A. Grace; Barr, Ronald G.; Zelazo, Philip R.; Brant, Rollin; Young, Simon N.

2013-01-01

Newborn infants preferentially orient to familiar over unfamiliar speech sounds. They are also better at remembering unfamiliar speech sounds for short periods of time if learning and retention occur after a feed than before. It is unknown whether short-term memory for speech is enhanced when the sound is familiar (versus unfamiliar) and, if so,…
Acquired Apraxia of Speech: The Effects of Repeated Practice and Rate/Rhythm Control Treatments on Sound Production Accuracy

Science.gov (United States)

Wambaugh, Julie L.; Nessler, Christina; Cameron, Rosalea; Mauszycki, Shannon C.

2012-01-01

Purpose: This investigation was designed to elucidate the effects of repeated practice treatment on sound production accuracy in individuals with apraxia of speech (AOS) and aphasia. A secondary purpose was to determine if the addition of rate/rhythm control to treatment provided further benefits beyond those achieved with repeated practice.…
[Non-speech oral motor treatment efficacy for children with developmental speech sound disorders].

Science.gov (United States)

Ygual-Fernandez, A; Cervera-Merida, J F

2016-01-01

In the treatment of speech disorders by means of speech therapy two antagonistic methodological approaches are applied: non-verbal ones, based on oral motor exercises (OME), and verbal ones, which are based on speech processing tasks with syllables, phonemes and words. In Spain, OME programmes are called 'programas de praxias', and are widely used and valued by speech therapists. To review the studies conducted on the effectiveness of OME-based treatments applied to children with speech disorders and the theoretical arguments that could justify, or not, their usefulness. Over the last few decades evidence has been gathered about the lack of efficacy of this approach to treat developmental speech disorders and pronunciation problems in populations without any neurological alteration of motor functioning. The American Speech-Language-Hearing Association has advised against its use taking into account the principles of evidence-based practice. The knowledge gathered to date on motor control shows that the pattern of mobility and its corresponding organisation in the brain are different in speech and other non-verbal functions linked to nutrition and breathing. Neither the studies on their effectiveness nor the arguments based on motor control studies recommend the use of OME-based programmes for the treatment of pronunciation problems in children with developmental language disorders.
Language Experience Affects Grouping of Musical Instrument Sounds

Science.gov (United States)

Bhatara, Anjali; Boll-Avetisyan, Natalie; Agus, Trevor; Höhle, Barbara; Nazzi, Thierry

2016-01-01

Language experience clearly affects the perception of speech, but little is known about whether these differences in perception extend to non-speech sounds. In this study, we investigated rhythmic perception of non-linguistic sounds in speakers of French and German using a grouping task, in which complexity (variability in sounds, presence of…
Efficient Coding and Statistically Optimal Weighting of Covariance among Acoustic Attributes in Novel Sounds

Science.gov (United States)

Stilp, Christian E.; Kluender, Keith R.

2012-01-01

To the extent that sensorineural systems are efficient, redundancy should be extracted to optimize transmission of information, but perceptual evidence for this has been limited. Stilp and colleagues recently reported efficient coding of robust correlation (r = .97) among complex acoustic attributes (attack/decay, spectral shape) in novel sounds. Discrimination of sounds orthogonal to the correlation was initially inferior but later comparable to that of sounds obeying the correlation. These effects were attenuated for less-correlated stimuli (r = .54) for reasons that are unclear. Here, statistical properties of correlation among acoustic attributes essential for perceptual organization are investigated. Overall, simple strength of the principal correlation is inadequate to predict listener performance. Initial superiority of discrimination for statistically consistent sound pairs was relatively insensitive to decreased physical acoustic/psychoacoustic range of evidence supporting the correlation, and to more frequent presentations of the same orthogonal test pairs. However, increased range supporting an orthogonal dimension has substantial effects upon perceptual organization. Connectionist simulations and Eigenvalues from closed-form calculations of principal components analysis (PCA) reveal that perceptual organization is near-optimally weighted to shared versus unshared covariance in experienced sound distributions. Implications of reduced perceptual dimensionality for speech perception and plausible neural substrates are discussed. PMID:22292057
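The abstract above refers to eigenvalues from closed-form PCA calculations. For the simplest case of two standardized attributes with correlation r, the correlation matrix [[1, r], [r, 1]] has eigenvalues 1 + |r| and 1 - |r|. The sketch below works through that textbook case for the two correlation values quoted in the abstract. It is an illustration under that two-attribute assumption and does not reproduce the paper's own calculation, which the abstract does not spell out; the function names are hypothetical.

```python
def correlation_eigenvalues(r):
    """Eigenvalues (largest first) of the 2x2 correlation matrix [[1, r], [r, 1]]."""
    if not -1.0 <= r <= 1.0:
        raise ValueError("r must lie in [-1, 1]")
    return 1.0 + abs(r), 1.0 - abs(r)


def shared_variance_fraction(r):
    """Fraction of total variance carried by the first principal component."""
    largest, smallest = correlation_eigenvalues(r)
    return largest / (largest + smallest)


# Correlation strengths quoted in the abstract.
for r in (0.97, 0.54):
    print(r, round(shared_variance_fraction(r), 3))
```

In this simplified case the first component carries (1 + r)/2 of the variance: about 98.5% at r = .97 and 77% at r = .54.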
Learning to Produce Syllabic Speech Sounds via Reward-Modulated Neural Plasticity

Science.gov (United States)

Warlaumont, Anne S.; Finnegan, Megan K.

2016-01-01

At around 7 months of age, human infants begin to reliably produce well-formed syllables containing both consonants and vowels, a behavior called canonical babbling. Over subsequent months, the frequency of canonical babbling continues to increase. How the infant’s nervous system supports the acquisition of this ability is unknown. Here we present a computational model that combines a spiking neural network, reinforcement-modulated spike-timing-dependent plasticity, and a human-like vocal tract to simulate the acquisition of canonical babbling. Like human infants, the model’s frequency of canonical babbling gradually increases. The model is rewarded when it produces a sound that is more auditorily salient than sounds it has previously produced. This is consistent with data from human infants indicating that contingent adult responses shape infant behavior and with data from deaf and tracheostomized infants indicating that hearing, including hearing one’s own vocalizations, is critical for canonical babbling development. Reward receipt increases the level of dopamine in the neural network. The neural network contains a reservoir with recurrent connections and two motor neuron groups, one agonist and one antagonist, which control the masseter and orbicularis oris muscles, promoting or inhibiting mouth closure. The model learns to increase the number of salient, syllabic sounds it produces by adjusting the base level of muscle activation and increasing their range of activity. Our results support the possibility that through dopamine-modulated spike-timing-dependent plasticity, the motor cortex learns to harness its natural oscillations in activity in order to produce syllabic sounds. It thus suggests that learning to produce rhythmic mouth movements for speech production may be supported by general cortical learning mechanisms. The model makes several testable predictions and has implications for our understanding not only of how syllabic vocalizations develop
Adult-like processing of time-compressed speech by newborns: A NIRS study.

Science.gov (United States)

Issard, Cécile; Gervain, Judit

2017-06-01

Humans can adapt to a wide range of variations in the speech signal, maintaining an invariant representation of the linguistic information it contains. Among them, adaptation to rapid or time-compressed speech has been well studied in adults, but the developmental origin of this capacity remains unknown. Does this ability depend on experience with speech (if yes, as heard in utero or as heard postnatally), with sounds in general or is it experience-independent? Using near-infrared spectroscopy, we show that the newborn brain can discriminate between three different compression rates: normal, i.e. 100% of the original duration, moderately compressed, i.e. 60% of original duration and highly compressed, i.e. 30% of original duration. Even more interestingly, responses to normal and moderately compressed speech are similar, showing a canonical hemodynamic response in the left temporoparietal, right frontal and right temporal cortex, while responses to highly compressed speech are inverted, showing a decrease in oxyhemoglobin concentration. These results mirror those found in adults, who readily adapt to moderately compressed, but not to highly compressed speech, showing that adaptation to time-compressed speech requires little or no experience with speech, and happens at an auditory, and not at a more abstract linguistic level. Copyright © 2016 The Authors. Published by Elsevier Ltd. All rights reserved.
Identification of speech transients using variable frame rate analysis and wavelet packets.

Science.gov (United States)

Rasetshwane, Daniel M; Boston, J Robert; Li, Ching-Chung

2006-01-01

Speech transients are important cues for identifying and discriminating speech sounds. Yoo et al. and Tantibundhit et al. were successful in identifying speech transients and, emphasizing them, improving the intelligibility of speech in noise. However, their methods are computationally intensive and unsuitable for real-time applications. This paper presents a method to identify and emphasize speech transients that combines subband decomposition by the wavelet packet transform with variable frame rate (VFR) analysis and unvoiced consonant detection. The VFR analysis is applied to each wavelet packet to define a transitivity function that describes the extent to which the wavelet coefficients of that packet are changing. Unvoiced consonant detection is used to identify unvoiced consonant intervals and the transitivity function is amplified during these intervals. The wavelet coefficients are multiplied by the transitivity function for that packet, amplifying the coefficients localized at times when they are changing and attenuating coefficients at times when they are steady. Inverse transform of the modified wavelet packet coefficients produces a signal corresponding to speech transients similar to the transients identified by Yoo et al. and Tantibundhit et al. A preliminary implementation of the algorithm runs more efficiently.
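The weighting step described in this abstract can be illustrated with a minimal sketch. It assumes the subband coefficients are already available as NumPy arrays, and it uses a simple stand-in for the transitivity function (the normalized rate of change of a smoothed magnitude envelope). The function names, the `frame` parameter, and this particular definition are assumptions made for illustration; the authors' VFR-based definition and the unvoiced consonant boost are not reproduced here.

```python
import numpy as np

def transitivity(coeffs, frame=8):
    """Hypothetical transitivity function: values near 1 where the
    coefficient envelope is changing quickly, near 0 where it is steady."""
    mag = np.abs(np.asarray(coeffs, dtype=float))
    # Pad with edge values so the smoothing does not create fake
    # transients at the start and end of the subband.
    padded = np.pad(mag, (frame // 2, frame - 1 - frame // 2), mode="edge")
    env = np.convolve(padded, np.ones(frame) / frame, mode="valid")
    change = np.abs(np.gradient(env))
    peak = change.max()
    return change / peak if peak > 0 else np.zeros_like(change)

def emphasize_transients(subbands, frame=8):
    """Multiply each subband's coefficients by its own transitivity function."""
    return [np.asarray(c, dtype=float) * transitivity(c, frame) for c in subbands]

# Usage: a subband that is steady, then jumps in level halfway through.
band = np.concatenate([np.ones(64), 5.0 * np.ones(64)])
weighted = emphasize_transients([band])[0]
```

In this sketch the steady stretches of the subband are attenuated toward zero and only the coefficients around the level change survive, which is the qualitative behaviour the abstract describes.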
Dimensional feature weighting utilizing multiple kernel learning for single-channel talker location discrimination using the acoustic transfer function.

Science.gov (United States)

Takashima, Ryoichi; Takiguchi, Tetsuya; Ariki, Yasuo

2013-02-01

This paper presents a method for discriminating the location of the sound source (talker) using only a single microphone. In a previous work, the single-channel approach for discriminating the location of the sound source was discussed, where the acoustic transfer function from a user's position is estimated by using a hidden Markov model of clean speech in the cepstral domain. In this paper, each cepstral dimension of the acoustic transfer function is newly weighted, in order to obtain the cepstral dimensions having information that is useful for classifying the user's position. Then, this paper proposes a feature-weighting method for the cepstral parameter using multiple kernel learning, defining the base kernels for each cepstral dimension of the acoustic transfer function. The user's position is trained and classified by support vector machine. The effectiveness of this method has been confirmed by sound source (talker) localization experiments performed in different room environments.
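The idea of one base kernel per cepstral dimension can be sketched as follows. This is only an illustration under stated assumptions: the base kernels are taken to be one-dimensional RBF kernels, the weights are supplied by hand rather than learned by multiple kernel learning, and the names `combined_kernel`, `weights`, and `gamma` are hypothetical.

```python
import numpy as np

def combined_kernel(X, Y, weights, gamma=1.0):
    """Weighted sum of per-dimension RBF base kernels.

    X: (n, D) feature vectors, Y: (m, D) feature vectors,
    weights: (D,) non-negative weight for each dimension.
    Returns the (n, m) combined kernel matrix.
    """
    X = np.asarray(X, dtype=float)
    Y = np.asarray(Y, dtype=float)
    weights = np.asarray(weights, dtype=float)
    # Squared difference in each dimension separately: shape (n, m, D).
    diff2 = (X[:, None, :] - Y[None, :, :]) ** 2
    base = np.exp(-gamma * diff2)   # one RBF base kernel per dimension
    return base @ weights           # weighted sum over dimensions

# Usage: with weight 0 on the second dimension, that dimension is ignored.
X = np.array([[0.0, 0.0], [1.0, 5.0]])
K = combined_kernel(X, X, weights=[1.0, 0.0])
```

A matrix built this way could be handed to a support vector machine that accepts precomputed kernels. A dimension whose weight is driven to zero no longer affects classification, which is how the weighting would select informative cepstral dimensions.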
Prevalence and Predictors of Persistent Speech Sound Disorder at Eight Years Old: Findings from a Population Cohort Study

Science.gov (United States)

Wren, Yvonne; Miller, Laura L.; Peters, Tim J.; Emond, Alan; Roulstone, Sue

2016-01-01

Purpose: The purpose of this study was to determine prevalence and predictors of persistent speech sound disorder (SSD) in children aged 8 years after disregarding children presenting solely with common clinical distortions (i.e., residual errors). Method: Data from the Avon Longitudinal Study of Parents and Children (Boyd et al., 2012) were used.…

Computer-based auditory phoneme discrimination training improves speech recognition in noise in experienced adult cochlear implant listeners.

Science.gov (United States)

Schumann, Annette; Serman, Maja; Gefeller, Olaf; Hoppe, Ulrich

2015-03-01

Specific computer-based auditory training may be a useful complement in the rehabilitation process for cochlear implant (CI) listeners to achieve sufficient speech intelligibility. This study evaluated the effectiveness of a computerized, phoneme-discrimination training programme. The study employed a pretest-post-test design; participants were randomly assigned to the training or control group. Over a period of three weeks, the training group was instructed to train in phoneme discrimination via computer, twice a week. Sentence recognition in different noise conditions (moderate to difficult) was tested pre- and post-training, and six months after the training was completed. The control group was tested and retested within one month. Twenty-seven adult CI listeners who had been using cochlear implants for more than two years participated in the programme; 15 adults in the training group, 12 adults in the control group. Besides significant improvements for the trained phoneme-identification task, a generalized training effect was noted via significantly improved sentence recognition in moderate noise. No significant changes were noted in the difficult noise conditions. Improved performance was maintained over an extended period. Phoneme-discrimination training improves experienced CI listeners' speech perception in noise. Additional research is needed to optimize auditory training for individual benefit.
Dissociating Cortical Activity during Processing of Native and Non-Native Audiovisual Speech from Early to Late Infancy

Directory of Open Access Journals (Sweden)

Eswen Fava

2014-08-01

Initially, infants are capable of discriminating phonetic contrasts across the world’s languages. Starting between seven and ten months of age, they gradually lose this ability through a process of perceptual narrowing. Although traditionally investigated with isolated speech sounds, such narrowing occurs in a variety of perceptual domains (e.g., faces, visual speech). Thus far, tracking the developmental trajectory of this tuning process has been focused primarily on auditory speech alone, and generally using isolated sounds. But infants learn from speech produced by people talking to them, meaning they learn from a complex audiovisual signal. Here, we use near-infrared spectroscopy to measure blood concentration changes in the bilateral temporal cortices of infants in three different age groups: 3-to-6 months, 7-to-10 months, and 11-to-14 months. Critically, all three groups of infants were tested with continuous audiovisual speech in both their native and another, unfamiliar language. We found that at each age range, infants showed different patterns of cortical activity in response to the native and non-native stimuli. Infants in the youngest group showed bilateral cortical activity that was greater overall in response to non-native relative to native speech; the oldest group showed left lateralized activity in response to native relative to non-native speech. These results highlight perceptual tuning as a dynamic process that happens across modalities and at different levels of stimulus complexity.
Comparing auditory filter bandwidths, spectral ripple modulation detection, spectral ripple discrimination, and speech recognition: Normal and impaired hearing.

Science.gov (United States)

Davies-Venn, Evelyn; Nelson, Peggy; Souza, Pamela

2015-07-01

Some listeners with hearing loss show poor speech recognition scores in spite of using amplification that optimizes audibility. Beyond audibility, studies have suggested that suprathreshold abilities such as spectral and temporal processing may explain differences in amplified speech recognition scores. A variety of different methods has been used to measure spectral processing. However, the relationship between spectral processing and speech recognition is still inconclusive. This study evaluated the relationship between spectral processing and speech recognition in listeners with normal hearing and with hearing loss. Narrowband spectral resolution was assessed using auditory filter bandwidths estimated from simultaneous notched-noise masking. Broadband spectral processing was measured using the spectral ripple discrimination (SRD) task and the spectral ripple depth detection (SMD) task. Three different measures were used to assess unamplified and amplified speech recognition in quiet and noise. Stepwise multiple linear regression revealed that SMD at 2.0 cycles per octave (cpo) significantly predicted speech scores for amplified and unamplified speech in quiet and noise. Commonality analyses revealed that SMD at 2.0 cpo combined with SRD and equivalent rectangular bandwidth measures to explain most of the variance captured by the regression model. Results suggest that SMD and SRD may be promising clinical tools for diagnostic evaluation and predicting amplification outcomes.
Comparing auditory filter bandwidths, spectral ripple modulation detection, spectral ripple discrimination, and speech recognition: Normal and impaired hearing

Science.gov (United States)

Davies-Venn, Evelyn; Nelson, Peggy; Souza, Pamela

2015-01-01

Some listeners with hearing loss show poor speech recognition scores in spite of using amplification that optimizes audibility. Beyond audibility, studies have suggested that suprathreshold abilities such as spectral and temporal processing may explain differences in amplified speech recognition scores. A variety of different methods has been used to measure spectral processing. However, the relationship between spectral processing and speech recognition is still inconclusive. This study evaluated the relationship between spectral processing and speech recognition in listeners with normal hearing and with hearing loss. Narrowband spectral resolution was assessed using auditory filter bandwidths estimated from simultaneous notched-noise masking. Broadband spectral processing was measured using the spectral ripple discrimination (SRD) task and the spectral ripple depth detection (SMD) task. Three different measures were used to assess unamplified and amplified speech recognition in quiet and noise. Stepwise multiple linear regression revealed that SMD at 2.0 cycles per octave (cpo) significantly predicted speech scores for amplified and unamplified speech in quiet and noise. Commonality analyses revealed that SMD at 2.0 cpo combined with SRD and equivalent rectangular bandwidth measures to explain most of the variance captured by the regression model. Results suggest that SMD and SRD may be promising clinical tools for diagnostic evaluation and predicting amplification outcomes. PMID:26233047
[Music therapy in adults with cochlear implants : Effects on music perception and subjective sound quality].

Science.gov (United States)

Hutter, E; Grapp, M; Argstatter, H

2016-12-01

People with severe hearing impairments and deafness can achieve good speech comprehension using a cochlear implant (CI), although music perception often remains impaired. A novel concept of music therapy for adults with CI was developed and evaluated in this study. This study included 30 adults with a unilateral CI following postlingual deafness. The subjective sound quality of the CI was rated using the hearing implant sound quality index (HISQUI), and musical tests for pitch discrimination, melody recognition and timbre identification were applied. As a control, 55 normally hearing persons also completed the musical tests. In comparison to normally hearing subjects, CI users showed deficits in the perception of pitch, melody and timbre. Specific effects of therapy were observed in the subjective sound quality of the CI, in pitch discrimination in the high and low pitch ranges and in timbre identification, while general learning effects were found in melody recognition. Music perception shows deficits in CI users compared to normally hearing persons. After individual music therapy in the rehabilitation process, improvements in this delicate area could be achieved.
A systematic review and classification of interventions for speech-sound disorder in preschool children.

Science.gov (United States)

Wren, Yvonne; Harding, Sam; Goldbart, Juliet; Roulstone, Sue

2018-05-01

Multiple interventions have been developed to address speech sound disorder (SSD) in children. Many of these have been evaluated but the evidence for these has not been considered within a model which categorizes types of intervention. The opportunity to carry out a systematic review of interventions for SSD arose as part of a larger scale study of interventions for primary speech and language impairment in preschool children. To review systematically the evidence for interventions for SSD in preschool children and to categorize them within a classification of interventions for SSD. Relevant search terms were used to identify intervention studies published up to 2012, with the following inclusion criteria: participants were aged between 2 years and 5 years, 11 months; they exhibited speech, language and communication needs; and a primary outcome measure of speech was used. Studies that met inclusion criteria were quality appraised using the single case experimental design (SCED) or PEDro-P, depending on their methodology. Those judged to be high quality were classified according to the primary focus of intervention. The final review included 26 studies. Case series was the most common research design. Categorization to the classification system for interventions showed that cognitive-linguistic and production approaches to intervention were the most frequently reported. The highest graded evidence was for three studies within the auditory-perceptual and integrated categories. The evidence for intervention for preschool children with SSD is focused on seven out of 11 subcategories of interventions. Although all the studies included in the review were good quality as defined by quality appraisal checklists, they mostly represented lower-graded evidence. Higher-graded studies are needed to understand clearly the strength of evidence for different interventions. © 2018 Royal College of Speech and Language Therapists.
Participation of the classical speech areas in auditory long-term memory

DEFF Research Database (Denmark)

Karabanov, Anke Ninija; Paine, Rainer; Chao, Chi Chao

2015-01-01

-presented together with new ones in a recognition test. Compared to control-site stimulation, pSTG stimulation produced a highly significant increase in recognition error rate, without affecting reaction time. By contrast, IFG stimulation led only to a weak, non-significant, trend toward recognition memory...... impairment. Importantly, the impairment after pSTG stimulation was not due to interference with perception, since the same stimulation failed to affect pseudoword discrimination examined with short interstimulus intervals. Our findings suggest that pSTG is essential for transforming speech sounds into stored...
The effect of F0 contour on the intelligibility of speech in the presence of interfering sounds for Mandarin Chinese.

Science.gov (United States)

Chen, Jing; Yang, Hongying; Wu, Xihong; Moore, Brian C J

2018-02-01

In Mandarin Chinese, the fundamental frequency (F0) contour defines lexical "Tones" that differ in meaning despite being phonetically identical. Flattening the F0 contour impairs the intelligibility of Mandarin Chinese in background sounds. This might occur because the flattening introduces misleading lexical information. To avoid this effect, two types of speech were used: single-Tone speech contained Tones 1 and 0 only, which have a flat F0 contour; multi-Tone speech contained all Tones and had a varying F0 contour. The intelligibility of speech in steady noise was slightly better for single-Tone speech than for multi-Tone speech. The intelligibility of speech in a two-talker masker, with the difference in mean F0 between the target and masker matched across conditions, was worse for the multi-Tone target in the multi-Tone masker than for any other combination of target and masker, probably because informational masking was maximal for this combination. The introduction of a perceived spatial separation between the target and masker, via the precedence effect, led to better performance for all target-masker combinations, especially the multi-Tone target in the multi-Tone masker. In summary, a flat F0 contour does not reduce the intelligibility of Mandarin Chinese when the introduction of misleading lexical cues is avoided.
Contributions of Letter-Speech Sound Learning and Visual Print Tuning to Reading Improvement: Evidence from Brain Potential and Dyslexia Training Studies

NARCIS (Netherlands)

Fraga González, G.; Žarić, G.; Tijms, J.; Bonte, M.; van der Molen, M.W.

We use a neurocognitive perspective to discuss the contribution of learning letter-speech sound (L-SS) associations and visual specialization in the initial phases of reading in dyslexic children. We review findings from associative learning studies on related cognitive skills important for
The sound symbolism bootstrapping hypothesis for language acquisition and language evolution.

Science.gov (United States)

Imai, Mutsumi; Kita, Sotaro

2014-09-19

Sound symbolism is a non-arbitrary relationship between speech sounds and meaning. We review evidence that, contrary to the traditional view in linguistics, sound symbolism is an important design feature of language, which affects online processing of language, and most importantly, language acquisition. We propose the sound symbolism bootstrapping hypothesis, claiming that (i) pre-verbal infants are sensitive to sound symbolism, due to a biologically endowed ability to map and integrate multi-modal input, (ii) sound symbolism helps infants gain referential insight for speech sounds, (iii) sound symbolism helps infants and toddlers associate speech sounds with their referents to establish a lexical representation and (iv) sound symbolism helps toddlers learn words by allowing them to focus on referents embedded in a complex scene, alleviating Quine's problem. We further explore the possibility that sound symbolism is deeply related to language evolution, drawing the parallel between historical development of language across generations and ontogenetic development within individuals. Finally, we suggest that sound symbolism bootstrapping is a part of a more general phenomenon of bootstrapping by means of iconic representations, drawing on similarities and close behavioural links between sound symbolism and speech-accompanying iconic gesture. © 2014 The Author(s) Published by the Royal Society. All rights reserved.
Acoustic analysis of trill sounds.

Science.gov (United States)

Dhananjaya, N; Yegnanarayana, B; Bhaskararao, Peri

2012-04-01

In this paper, the acoustic-phonetic characteristics of steady apical trills--trill sounds produced by the periodic vibration of the apex of the tongue--are studied. Signal processing methods, namely, zero-frequency filtering and zero-time liftering of speech signals, are used to analyze the excitation source and the resonance characteristics of the vocal tract system, respectively. Although it is natural to expect the effect of trilling on the resonances of the vocal tract system, it is interesting to note that trilling influences the glottal source of excitation as well. The excitation characteristics derived using zero-frequency filtering of speech signals are glottal epochs, strength of impulses at the glottal epochs, and instantaneous fundamental frequency of the glottal vibration. Analysis based on zero-time liftering of speech signals is used to study the dynamic resonance characteristics of vocal tract system during the production of trill sounds. Qualitative analysis of trill sounds in different vowel contexts, and the acoustic cues that may help spotting trills in continuous speech are discussed.
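Zero-frequency filtering, which the abstract uses to extract glottal epochs, is commonly described as passing the differenced signal through a cascade of two zero-frequency resonators and then removing the resulting trend by subtracting a local mean. The sketch below follows that general description; the window length (1.5 times an assumed mean pitch period), the number of trend-removal passes, and the function names are assumptions, not the authors' settings.

```python
import numpy as np

def zero_frequency_filter(signal, fs, mean_pitch_period_s=0.01, passes=3):
    """Return a zero-frequency filtered signal (sketch under stated assumptions)."""
    x = np.asarray(signal, dtype=float)
    x = np.diff(x, prepend=x[0])            # difference to remove any DC offset
    y = x
    for _ in range(4):                      # two resonators, each a double pole at z = 1
        y = np.cumsum(y)
    half = int(round(0.75 * mean_pitch_period_s * fs))
    kernel = np.ones(2 * half + 1) / (2 * half + 1)
    for _ in range(passes):                 # remove the polynomial trend
        y = y - np.convolve(y, kernel, mode="same")
    # The edges are corrupted by the zero padding of the moving average.
    y[: passes * half] = 0.0
    y[-passes * half:] = 0.0
    return y

def glottal_epochs(zff):
    """Indices of negative-to-positive zero crossings, taken as epoch estimates."""
    return np.flatnonzero((zff[:-1] < 0) & (zff[1:] >= 0)) + 1

# Usage: a synthetic 100 Hz impulse train sampled at 8 kHz.
fs = 8000
train = np.zeros(fs)
train[::80] = 1.0
epochs = glottal_epochs(zero_frequency_filter(train, fs))
```

The spacing between successive epochs gives an estimate of the instantaneous fundamental period, and the slope of the filtered signal at each crossing is often used as a measure of excitation strength.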
The Functional Connectome of Speech Control.

Directory of Open Access Journals (Sweden)

Stefan Fuertinger

2015-07-01

In the past few years, several studies have been directed to understanding the complexity of functional interactions between different brain regions during various human behaviors. Among these, neuroimaging research established the notion that speech and language require an orchestration of brain regions for comprehension, planning, and integration of a heard sound with a spoken word. However, these studies have been largely limited to mapping the neural correlates of separate speech elements and examining distinct cortical or subcortical circuits involved in different aspects of speech control. As a result, the complexity of the brain network machinery controlling speech and language remained largely unknown. Using graph theoretical analysis of functional MRI (fMRI) data in healthy subjects, we quantified the large-scale speech network topology by constructing functional brain networks of increasing hierarchy from the resting state to motor output of meaningless syllables to complex production of real-life speech as well as compared to non-speech-related sequential finger tapping and pure tone discrimination networks. We identified a segregated network of highly connected local neural communities (hubs) in the primary sensorimotor and parietal regions, which formed a commonly shared core hub network across the examined conditions, with the left area 4p playing an important role in speech network organization. These sensorimotor core hubs exhibited features of flexible hubs based on their participation in several functional domains across different networks and ability to adaptively switch long-range functional connectivity depending on task content, resulting in a distinct community structure of each examined network. Specifically, compared to other tasks, speech production was characterized by the formation of six distinct neural communities with specialized recruitment of the prefrontal cortex, insula, putamen, and thalamus, which collectively
A general auditory bias for handling speaker variability in speech? Evidence in humans and songbirds

Directory of Open Access Journals (Sweden)

Buddhamas eKriengwatana

2015-08-01

Different speakers produce the same speech sound differently, yet listeners are still able to reliably identify the speech sound. How listeners can adjust their perception to compensate for speaker differences in speech, and whether these compensatory processes are unique only to humans, is still not fully understood. In this study we compare the ability of humans and zebra finches to categorize vowels despite speaker variation in speech in order to test the hypothesis that accommodating speaker and gender differences in isolated vowels can be achieved without prior experience with speaker-related variability. Using a behavioural Go/No-go task and identical stimuli, we compared Australian English adults’ (naïve to Dutch) and zebra finches’ (naïve to human speech) ability to categorize /ɪ/ and /ɛ/ vowels of a novel Dutch speaker after learning to discriminate those vowels from only one other speaker. Experiments 1 and 2 presented vowels of two speakers interspersed or blocked, respectively. Results demonstrate that categorization of vowels is possible without prior exposure to speaker-related variability in speech for zebra finches, and in non-native vowel categories for humans. Therefore, this study is the first to provide evidence for what might be a species-shared auditory bias that may supersede speaker-related information during vowel categorization. It additionally provides behavioural evidence contradicting a prior hypothesis that accommodation of speaker differences is achieved via the use of formant ratios. Therefore, investigations of alternative accounts of vowel normalization that incorporate the possibility of an auditory bias for disregarding inter-speaker variability are warranted.
The neural basis of sublexical speech and corresponding nonspeech processing: a combined EEG-MEG study.

Science.gov (United States)

Kuuluvainen, Soila; Nevalainen, Päivi; Sorokin, Alexander; Mittag, Maria; Partanen, Eino; Putkinen, Vesa; Seppänen, Miia; Kähkönen, Seppo; Kujala, Teija

2014-03-01

We addressed the neural organization of speech versus nonspeech sound processing by investigating preattentive cortical auditory processing of changes in five features of a consonant-vowel syllable (consonant, vowel, sound duration, frequency, and intensity) and their acoustically matched nonspeech counterparts in a simultaneous EEG-MEG recording of mismatch negativity (MMN/MMNm). Overall, speech-sound processing was enhanced compared to nonspeech sound processing. This effect was strongest for changes which affect word meaning (consonant, vowel, and vowel duration) in the left and for the vowel identity change in the right hemisphere also. Furthermore, in the right hemisphere, speech-sound frequency and intensity changes were processed faster than their nonspeech counterparts, and there was a trend for speech-enhancement in frequency processing. In summary, the results support the proposed existence of long-term memory traces for speech sounds in the auditory cortices, and indicate at least partly distinct neural substrates for speech and nonspeech sound processing. Copyright © 2014 Elsevier Inc. All rights reserved.
Kinematic Analysis of Speech Sound Sequencing Errors Induced by Delayed Auditory Feedback.

Science.gov (United States)

Cler, Gabriel J; Lee, Jackson C; Mittelman, Talia; Stepp, Cara E; Bohland, Jason W

2017-06-22

Delayed auditory feedback (DAF) causes speakers to become disfluent and make phonological errors. Methods for assessing the kinematics of speech errors are lacking, with most DAF studies relying on auditory perceptual analyses, which may be problematic, as errors judged to be categorical may actually represent blends of sounds or articulatory errors. Eight typical speakers produced nonsense syllable sequences under normal and DAF (200 ms). Lip and tongue kinematics were captured with electromagnetic articulography. Time-locked acoustic recordings were transcribed, and the kinematics of utterances with and without perceived errors were analyzed with existing and novel quantitative methods. New multivariate measures showed that for 5 participants, kinematic variability for productions perceived to be error free was significantly increased under delay; these results were validated by using the spatiotemporal index measure. Analysis of error trials revealed both typical productions of a nontarget syllable and productions with articulatory kinematics that incorporated aspects of both the target and the perceived utterance. This study is among the first to characterize articulatory changes under DAF and provides evidence for different classes of speech errors, which may not be perceptually salient. New methods were developed that may aid visualization and analysis of large kinematic data sets. https://doi.org/10.23641/asha.5103067.
Evaluating signal-to-noise ratios, loudness, and related measures as indicators of airborne sound insulation.

Science.gov (United States)

Park, H K; Bradley, J S

2009-09-01

Subjective ratings of the audibility, annoyance, and loudness of music and speech sounds transmitted through 20 different simulated walls were used to identify better single number ratings of airborne sound insulation. The first part of this research considered standard measures such as the sound transmission class, the weighted sound reduction index (R(w)), and variations of these measures [H. K. Park and J. S. Bradley, J. Acoust. Soc. Am. 126, 208-219 (2009)]. This paper considers a number of other measures including signal-to-noise ratios related to the intelligibility of speech and measures related to the loudness of sounds. An exploration of the importance of the included frequencies showed that the optimum ranges of included frequencies were different for speech and music sounds. Measures related to speech intelligibility were useful indicators of responses to speech sounds but were not as successful for music sounds. A-weighted level differences, signal-to-noise ratios and an A-weighted sound transmission loss measure were good predictors of responses when the included frequencies were optimized for each type of sound. The addition of new spectrum adaptation terms to R(w) values was found to be the most practical approach for achieving more accurate predictions of subjective ratings of transmitted speech and music sounds.
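As an illustration of one of the measures named above, an A-weighted level difference can be computed from band levels on the source and receiving sides of a wall. The sketch below uses the standard analytic form of the A-weighting curve; the band centre frequencies, the example levels, and the function names are assumptions for illustration, and the abstract does not say which frequency range the authors found optimal.

```python
import numpy as np

def a_weighting_db(freq_hz):
    """Standard A-weighting correction in dB (about 0 dB at 1 kHz)."""
    f2 = np.asarray(freq_hz, dtype=float) ** 2
    ra = (12194.0**2 * f2**2) / (
        (f2 + 20.6**2)
        * np.sqrt((f2 + 107.7**2) * (f2 + 737.9**2))
        * (f2 + 12194.0**2)
    )
    return 20.0 * np.log10(ra) + 2.00

def a_weighted_level(band_levels_db, band_freqs_hz):
    """Energy sum of the band levels after applying the A-weighting correction."""
    weighted = np.asarray(band_levels_db, dtype=float) + a_weighting_db(band_freqs_hz)
    return 10.0 * np.log10(np.sum(10.0 ** (weighted / 10.0)))

def a_weighted_level_difference(source_db, received_db, band_freqs_hz):
    """Source-room minus receiving-room A-weighted level, in dB."""
    return (a_weighted_level(source_db, band_freqs_hz)
            - a_weighted_level(received_db, band_freqs_hz))

# Usage: hypothetical octave-band levels and a wall that removes 30 dB in every band.
freqs = np.array([125.0, 250.0, 500.0, 1000.0, 2000.0, 4000.0])
source = np.full(6, 80.0)
received = source - 30.0
difference = a_weighted_level_difference(source, received, freqs)
```

Restricting `freqs` to a narrower set of bands is how the "included frequencies" mentioned in the abstract would enter such a calculation.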
Behavioral and electrophysiological evidence for early and automatic detection of phonological equivalence in variable speech inputs.

Science.gov (United States)

Kharlamov, Viktor; Campbell, Kenneth; Kazanina, Nina

2011-11-01

Speech sounds are not always perceived in accordance with their acoustic-phonetic content. For example, an early and automatic process of perceptual repair, which ensures conformity of speech inputs to the listener's native language phonology, applies to individual input segments that do not exist in the native inventory or to sound sequences that are illicit according to the native phonotactic restrictions on sound co-occurrences. The present study with Russian and Canadian English speakers shows that listeners may perceive phonetically distinct and licit sound sequences as equivalent when the native language system provides robust evidence for mapping multiple phonetic forms onto a single phonological representation. In Russian, due to an optional but productive t-deletion process that affects /stn/ clusters, the surface forms [sn] and [stn] may be phonologically equivalent and map to a single phonological form /stn/. In contrast, [sn] and [stn] clusters are usually phonologically distinct in (Canadian) English. Behavioral data from identification and discrimination tasks indicated that [sn] and [stn] clusters were more confusable for Russian than for English speakers. The EEG experiment employed an oddball paradigm with nonwords [asna] and [astna] used as the standard and deviant stimuli. A reliable mismatch negativity response was elicited approximately 100 msec postchange in the English group but not in the Russian group. These findings point to a perceptual repair mechanism that is engaged automatically at a prelexical level to ensure immediate encoding of speech inputs in phonological terms, which in turn enables efficient access to the meaning of a spoken utterance.
A qualitative analysis of hate speech reported to the Romanian National Council for Combating Discrimination (2003‑2015)

Directory of Open Access Journals (Sweden)

Adriana Iordache

2015-12-01

The article analyzes the specificities of Romanian hate speech over a period of twelve years through a qualitative analysis of 384 Decisions of the National Council for Combating Discrimination. The study employs a coding methodology which allows one to separate decisions according to the group that was the victim of hate speech. The article finds that stereotypes employed are similar to those encountered in the international literature. The main target of hate speech is the Roma, who are seen as „dirty“, „uncivilized“ and a threat to Romania’s image abroad. Other stereotypes encountered were that of the „disloyal“ Hungarian and of the sexually promiscuous woman. Moreover, women are seen as unfit for management positions. The article also discusses stereotypes about homosexuals, who are seen as „sick“ and about non-orthodox religions, portrayed as „sectarian“.
Extensions to the Speech Disorders Classification System (SDCS)

Science.gov (United States)

Shriberg, Lawrence D.; Fourakis, Marios; Hall, Sheryl D.; Karlsson, Heather B.; Lohmeier, Heather L.; McSweeny, Jane L.; Potter, Nancy L.; Scheer-Cohen, Alison R.; Strand, Edythe A.; Tilkens, Christie M.; Wilson, David L.

2010-01-01

This report describes three extensions to a classification system for paediatric speech sound disorders termed the Speech Disorders Classification System (SDCS). Part I describes a classification extension to the SDCS to differentiate motor speech disorders from speech delay and to differentiate among three sub-types of motor speech disorders.…
Relationship between channel interaction and spectral-ripple discrimination in cochlear implant users.

Science.gov (United States)

Jones, Gary L; Won, Jong Ho; Drennan, Ward R; Rubinstein, Jay T

2013-01-01

Cochlear implant (CI) users can achieve remarkable speech understanding, but there is great variability in outcomes that is only partially accounted for by age, residual hearing, and duration of deafness. Results might be improved with the use of psychophysical tests to predict which sound processing strategies offer the best potential outcomes. In particular, the spectral-ripple discrimination test offers a time-efficient, nonlinguistic measure that is correlated with perception of both speech and music by CI users. Features that make this "one-point" test time-efficient, and thus potentially clinically useful, are also connected to controversy within the CI field about what the test measures. The current work examined the relationship between thresholds in the one-point spectral-ripple test, in which stimuli are presented acoustically, and interaction indices measured under the controlled conditions afforded by direct stimulation with a research processor. Results of these studies include the following: (1) within individual subjects there were large variations in the interaction index along the electrode array, (2) interaction indices generally decreased with increasing electrode separation, and (3) spectral-ripple discrimination improved with decreasing mean interaction index at electrode separations of one, three, and five electrodes. These results indicate that spectral-ripple discrimination thresholds can provide a useful metric of the spectral resolution of CI users.

Validating a perceptual distraction model using a personal two-zone sound system

DEFF Research Database (Denmark)

Rämö, Jussi; Christensen, Lasse; Bech, Søren

2017-01-01

This paper focuses on validating a perceptual distraction model, which aims to predict user's perceived distraction caused by audio-on-audio interference. Originally, the distraction model was trained with music targets and interferers using a simple loudspeaker setup, consisting of only two...... sound zones within the sound-zone system. Thus, validating the model using a different sound-zone system with both speech-on-music and music-on-speech stimuli sets. The results show that the model performance is equally good in both zones, i.e., with both speech-on-music and music-on-speech stimuli...
Sound Classification in Hearing Aids Inspired by Auditory Scene Analysis

Science.gov (United States)

Büchler, Michael; Allegro, Silvia; Launer, Stefan; Dillier, Norbert

2005-12-01

A sound classification system for the automatic recognition of the acoustic environment in a hearing aid is discussed. The system distinguishes the four sound classes "clean speech," "speech in noise," "noise," and "music." A number of features that are inspired by auditory scene analysis are extracted from the sound signal. These features describe amplitude modulations, spectral profile, harmonicity, amplitude onsets, and rhythm. They are evaluated together with different pattern classifiers. Simple classifiers, such as rule-based and minimum-distance classifiers, are compared with more complex approaches, such as Bayes classifier, neural network, and hidden Markov model. Sounds from a large database are employed for both training and testing of the system. The achieved recognition rates are very high except for the class "speech in noise." Problems arise in the classification of compressed pop music, strongly reverberated speech, and tonal or fluctuating noises.
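The minimum-distance classifier mentioned in this abstract can be illustrated with a minimal sketch. The feature names, the numeric values and the three-class setup below are made up for illustration and are not taken from the study; the abstract does not specify which distance measure or feature scaling the authors used, so Euclidean distance to class centroids is an assumption.

```python
import math

def centroid(vectors):
    """Return the component-wise mean of a list of equal-length feature vectors."""
    n = len(vectors)
    return [sum(column) / n for column in zip(*vectors)]

def train(samples):
    """Map each class label to the centroid of its training vectors."""
    return {label: centroid(vectors) for label, vectors in samples.items()}

def classify(centroids, x):
    """Assign x to the class whose centroid is nearest (Euclidean distance)."""
    return min(centroids, key=lambda label: math.dist(centroids[label], x))

# Hypothetical two-feature training data: (modulation depth, harmonicity).
# The values are invented for this sketch, not measured.
training = {
    "clean speech": [[0.9, 0.8], [0.8, 0.7]],
    "noise": [[0.1, 0.1], [0.2, 0.0]],
    "music": [[0.4, 0.9], [0.5, 1.0]],
}
centroids = train(training)
label = classify(centroids, [0.85, 0.75])
```

A classifier of this kind needs only one stored vector per class, which is one reason such simple schemes are attractive as a baseline next to Bayes, neural-network or hidden Markov model approaches.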
Intervention for bilingual speech sound disorders: A case study of an isiXhosa-English-speaking child.

Science.gov (United States)

Rossouw, Kate; Pascoe, Michelle

2018-03-19

Bilingualism is common in South Africa, with many children acquiring isiXhosa as a home language and learning English from a young age in nursery or crèche. IsiXhosa is a local language, part of the Bantu language family, widely spoken in the country. Aims: To describe changes in a bilingual child's speech following intervention based on a theoretically motivated and tailored intervention plan. Methods and procedures: This study describes a female isiXhosa-English bilingual child, named Gcobisa (pseudonym) (chronological age 4 years and 2 months) with a speech sound disorder. Gcobisa's speech was assessed and her difficulties categorised according to Dodd's (2005) diagnostic framework. From this, intervention was planned and the language of intervention was selected. Following intervention, Gcobisa's speech was reassessed. Outcomes and results: Gcobisa's speech was categorised as a consistent phonological delay as she presented with gliding of /l/ in both English and isiXhosa, cluster reduction in English and several other age appropriate phonological processes. She was provided with 16 sessions of intervention using a minimal pairs approach, targeting the phonological process of gliding of /l/, which was not considered age appropriate for Gcobisa in isiXhosa when compared to the small set of normative data regarding monolingual isiXhosa development. As a result, the targets and stimuli were in isiXhosa while the main language of instruction was English. This reflects the language mismatch often faced by speech language therapists in South Africa. Gcobisa showed evidence of generalising the target phoneme to English words. Conclusions and implications: The data have theoretical implications regarding bilingual development of isiXhosa-English, as they highlight the ways bilingual development may differ from the monolingual development of this language pair. It adds to the small set of intervention studies investigating the changes in the speech of bilingual
Intervention for bilingual speech sound disorders: A case study of an isiXhosa–English-speaking child

Directory of Open Access Journals (Sweden)

Kate Rossouw

2018-03-01

Background: Bilingualism is common in South Africa, with many children acquiring isiXhosa as a home language and learning English from a young age in nursery or crèche. IsiXhosa is a local language, part of the Bantu language family, widely spoken in the country. Aims: To describe changes in a bilingual child’s speech following intervention based on a theoretically motivated and tailored intervention plan. Methods and procedures: This study describes a female isiXhosa–English bilingual child, named Gcobisa (pseudonym) (chronological age 4 years and 2 months) with a speech sound disorder. Gcobisa’s speech was assessed and her difficulties categorised according to Dodd’s (2005) diagnostic framework. From this, intervention was planned and the language of intervention was selected. Following intervention, Gcobisa’s speech was reassessed. Outcomes and results: Gcobisa’s speech was categorised as a consistent phonological delay as she presented with gliding of /l/ in both English and isiXhosa, cluster reduction in English and several other age appropriate phonological processes. She was provided with 16 sessions of intervention using a minimal pairs approach, targeting the phonological process of gliding of /l/, which was not considered age appropriate for Gcobisa in isiXhosa when compared to the small set of normative data regarding monolingual isiXhosa development. As a result, the targets and stimuli were in isiXhosa while the main language of instruction was English. This reflects the language mismatch often faced by speech language therapists in South Africa. Gcobisa showed evidence of generalising the target phoneme to English words. Conclusions and implications: The data have theoretical implications regarding bilingual development of isiXhosa–English, as they highlight the ways bilingual development may differ from the monolingual development of this language pair. It adds to the small set of intervention studies
Internet Video Telephony Allows Speech Reading by Deaf Individuals and Improves Speech Perception by Cochlear Implant Users

Science.gov (United States)

Mantokoudis, Georgios; Dähler, Claudia; Dubach, Patrick; Kompis, Martin; Caversaccio, Marco D.; Senn, Pascal

2013-01-01

Objective: To analyze speech reading through Internet video calls by profoundly hearing-impaired individuals and cochlear implant (CI) users. Methods: Speech reading skills of 14 deaf adults and 21 CI users were assessed using the Hochmair Schulz Moser (HSM) sentence test. We presented video simulations using different video resolutions (1280×720, 640×480, 320×240, 160×120 px), frame rates (30, 20, 10, 7, 5 frames per second (fps)), speech velocities (three different speakers), webcameras (Logitech Pro9000, C600 and C500) and image/sound delays (0–500 ms). All video simulations were presented with and without sound and in two screen sizes. Additionally, scores for live Skype™ video connection and live face-to-face communication were assessed. Results: Higher frame rate (>7 fps), higher camera resolution (>640×480 px) and shorter picture/sound delay (<100 ms) were associated with increased speech perception scores. Scores were strongly dependent on the speaker but were not influenced by physical properties of the camera optics or the full screen mode. There is a significant median gain of +8.5%pts (p = 0.009) in speech perception for all 21 CI-users if visual cues are additionally shown. CI users with poor open set speech perception scores (n = 11) showed the greatest benefit under combined audio-visual presentation (median speech perception +11.8%pts, p = 0.032). Conclusion: Webcameras have the potential to improve telecommunication of hearing-impaired individuals. PMID:23359119
Internet video telephony allows speech reading by deaf individuals and improves speech perception by cochlear implant users.

Directory of Open Access Journals (Sweden)

Georgios Mantokoudis

OBJECTIVE: To analyze speech reading through Internet video calls by profoundly hearing-impaired individuals and cochlear implant (CI) users. METHODS: Speech reading skills of 14 deaf adults and 21 CI users were assessed using the Hochmair Schulz Moser (HSM) sentence test. We presented video simulations using different video resolutions (1280 × 720, 640 × 480, 320 × 240, 160 × 120 px), frame rates (30, 20, 10, 7, 5 frames per second (fps)), speech velocities (three different speakers), webcameras (Logitech Pro9000, C600 and C500) and image/sound delays (0-500 ms). All video simulations were presented with and without sound and in two screen sizes. Additionally, scores for live Skype™ video connection and live face-to-face communication were assessed. RESULTS: Higher frame rate (>7 fps), higher camera resolution (>640 × 480 px) and shorter picture/sound delay (<100 ms) were associated with increased speech perception scores. Scores were strongly dependent on the speaker but were not influenced by physical properties of the camera optics or the full screen mode. There is a significant median gain of +8.5%pts (p = 0.009) in speech perception for all 21 CI-users if visual cues are additionally shown. CI users with poor open set speech perception scores (n = 11) showed the greatest benefit under combined audio-visual presentation (median speech perception +11.8%pts, p = 0.032). CONCLUSION: Webcameras have the potential to improve telecommunication of hearing-impaired individuals.
Alternative Speech Communication System for Persons with Severe Speech Disorders

Science.gov (United States)

Selouani, Sid-Ahmed; Sidi Yakoub, Mohammed; O'Shaughnessy, Douglas

2009-12-01

Assistive speech-enabled systems are proposed to help both French and English speaking persons with various speech disorders. The proposed assistive systems use automatic speech recognition (ASR) and speech synthesis in order to enhance the quality of communication. These systems aim at improving the intelligibility of pathologic speech making it as natural as possible and close to the original voice of the speaker. The resynthesized utterances use new basic units, a new concatenating algorithm and a grafting technique to correct the poorly pronounced phonemes. The ASR responses are uttered by the new speech synthesis system in order to convey an intelligible message to listeners. Experiments involving four American speakers with severe dysarthria and two Acadian French speakers with sound substitution disorders (SSDs) are carried out to demonstrate the efficiency of the proposed methods. An improvement of the Perceptual Evaluation of the Speech Quality (PESQ) value of 5% and more than 20% is achieved by the speech synthesis systems that deal with SSD and dysarthria, respectively.
An exploratory study of the influence of load and practice on segmental and articulatory variability in children with speech sound disorders.

Science.gov (United States)

Vuolo, Janet; Goffman, Lisa

2017-01-01

This exploratory treatment study used phonetic transcription and speech kinematics to examine changes in segmental and articulatory variability. Nine children, ages 4 to 8 years old, served as participants, including two with childhood apraxia of speech (CAS), five with speech sound disorder (SSD) and two who were typically developing. Children practised producing agent + action phrases in an imitation task (low linguistic load) and a retrieval task (high linguistic load) over five sessions. In the imitation task in session one, both participants with CAS showed high degrees of segmental and articulatory variability. After five sessions, imitation practice resulted in increased articulatory variability for five participants. Retrieval practice resulted in decreased articulatory variability in three participants with SSD. These results suggest that short-term speech production practice in rote imitation disrupts articulatory control in children with and without CAS. In contrast, tasks that require linguistic processing may scaffold learning for children with SSD but not CAS.
Educators’ perspectives on facilitating computer-assisted speech intervention in early childhood settings

OpenAIRE

Crowe, K.; Cumming, T.; McCormack, J.; McLeod, S.; Baker, E.; Wren, Y.; Roulstone, S.; Masso, S.

2017-01-01

Early childhood educators are frequently called on to support preschool-aged children with speech sound disorders and to engage these children in activities that target their speech production. This study explored factors that acted as facilitators and/or barriers to the provision of computer-based support for children with speech sound disorders (SSD) in early childhood centres. Participants were 23 early childhood educators at 13 centres who participated in the Sound Start Study, a randomiz...
The Use of Electropalatography in the Treatment of Acquired Apraxia of Speech.

Science.gov (United States)

Mauszycki, Shannon C; Wright, Sandra; Dingus, Nicole; Wambaugh, Julie L

2016-12-01

This investigation was designed to examine the effects of an articulatory-kinematic treatment in conjunction with visual biofeedback (VBFB) via electropalatography (EPG) on the accuracy of articulation for acquired apraxia of speech (AOS). A multiple-baseline design across participants and behaviors was used with 4 individuals with chronic AOS and aphasia. Accuracy of target speech sounds in treated and untreated phrases in probe sessions served as the dependent variable. Participants received an articulatory-kinematic treatment in combination with VBFB, which was sequentially applied to 3 stimulus sets composed of 2-word phrases with a target speech sound for each set. Positive changes in articulatory accuracy were observed for participants for the majority of treated speech sounds. Also, there was generalization to untreated phrases for most trained speech sounds. Two participants had better long-term maintenance of treated speech sounds in both trained and untrained stimuli. Findings indicate EPG may be a potential treatment tool for AOS. It appears that individuals with AOS can benefit from VBFB via EPG in improving articulatory accuracy. However, further research is needed to determine if VBFB is more advantageous than behavioral treatments that have been proven effective in improving speech production for speakers with AOS.
Stuttering children and the probability of remission--the role of cerebral dominance and speech production.

Science.gov (United States)

Brosch, S; Haege, A; Kalehne, P; Johannsen, H S

1999-01-25

The identification of critical characteristics which might predict whether childhood stuttering will become chronic. Part of the study investigates the relationship between hearing and central processing of acoustic stimuli, cerebral dominance and the clinical course of the stuttering. A prospective study of 79 stuttering children aged 3-9 years. The subjects were examined with regard to their cerebral dominance in various tests of laterality, their peripheral hearing and their ability to discriminate sound using the dichotic discrimination test according to Uttenweiler (V. Uttenweiler, Dichotischer Diskriminationstest für Kinder, Sprache Stimme Gehör 4 (1980) 107-111). Results were correlated with the probability of remission of stuttering. Comparisons were made with a control group of 18 children of kindergarten age with normal speech. The period of investigation was 18 months. Seventy-two children underwent follow-up examinations. Of these, 36 achieved fluency of speech. The results of the dichotic discrimination test showed no relation to the rate of remission. When the relationship between handedness and stuttering was investigated, it was found that left-handed children had a significantly poorer chance of attaining speech fluency. The Uttenweiler test allowed no prognostic evaluation of the future course of stuttering in the age group studied, though auditory dominance was not completely developed in a majority of the 3-6 year-old children. Handedness, however, appears to be related to the probability that stuttering will become chronic.
Optimizing acoustical conditions for speech intelligibility in classrooms

Science.gov (United States)

Yang, Wonyoung

High speech intelligibility is imperative in classrooms where verbal communication is critical. However, the optimal acoustical conditions to achieve a high degree of speech intelligibility have previously been investigated with inconsistent results, and practical room-acoustical solutions to optimize the acoustical conditions for speech intelligibility have not been developed. This experimental study validated auralization for speech-intelligibility testing, investigated the optimal reverberation for speech intelligibility for both normal and hearing-impaired listeners using more realistic room-acoustical models, and proposed an optimal sound-control design for speech intelligibility based on the findings. The auralization technique was used to perform subjective speech-intelligibility tests. The validation study, comparing auralization results with those of real classroom speech-intelligibility tests, found that if the room to be auralized is not very absorptive or noisy, speech-intelligibility tests using auralization are valid. The speech-intelligibility tests were done in two different auralized sound fields (approximately diffuse and non-diffuse) using the Modified Rhyme Test and both normal and hearing-impaired listeners. A hybrid room-acoustical prediction program was used throughout the work, and it and a 1/8 scale-model classroom were used to evaluate the effects of ceiling barriers and reflectors. For both subject groups, in approximately diffuse sound fields, when the speech source was closer to the listener than the noise source, the optimal reverberation time was zero. When the noise source was closer to the listener than the speech source, the optimal reverberation time was 0.4 s (with another peak at 0.0 s) with relative output power levels of the speech and noise sources SNS = 5 dB, and 0.8 s with SNS = 0 dB. In non-diffuse sound fields, when the noise source was between the speaker and the listener, the optimal reverberation time was 0.6 s with
Bridging the Gap Between Speech and Language: Using Multimodal Treatment in a Child With Apraxia.

Science.gov (United States)

Tierney, Cheryl D; Pitterle, Kathleen; Kurtz, Marie; Nakhla, Mark; Todorow, Carlyn

2016-09-01

Childhood apraxia of speech is a neurologic speech sound disorder in which children have difficulty constructing words and sounds due to poor motor planning and coordination of the articulators required for speech sound production. We report the case of a 3-year-old boy strongly suspected to have childhood apraxia of speech at 18 months of age who used multimodal communication to facilitate language development throughout his work with a speech language pathologist. In 18 months of an intensive structured program, he exhibited atypical rapid improvement, progressing from having no intelligible speech to achieving age-appropriate articulation. We suspect that early introduction of sign language by family proved to be a highly effective form of language development which, when coupled with intensive oro-motor and speech sound therapy, resulted in rapid resolution of symptoms. Copyright © 2016 by the American Academy of Pediatrics.
Hemispheric asymmetries in speech perception: sense, nonsense and modulations.

Directory of Open Access Journals (Sweden)

Stuart Rosen

The well-established left hemisphere specialisation for language processing has long been claimed to be based on a low-level auditory specialization for specific acoustic features in speech, particularly regarding 'rapid temporal processing'. A novel analysis/synthesis technique was used to construct a variety of sounds based on simple sentences which could be manipulated in spectro-temporal complexity, and whether they were intelligible or not. All sounds consisted of two noise-excited spectral prominences (based on the lower two formants in the original speech) which could be static or varying in frequency and/or amplitude independently. Dynamically varying both acoustic features based on the same sentence led to intelligible speech but when either or both acoustic features were static, the stimuli were not intelligible. Using the frequency dynamics from one sentence with the amplitude dynamics of another led to unintelligible sounds of comparable spectro-temporal complexity to the intelligible ones. Positron emission tomography (PET) was used to compare which brain regions were active when participants listened to the different sounds. Neural activity to spectral and amplitude modulations sufficient to support speech intelligibility (without actually being intelligible) was seen bilaterally, with a right temporal lobe dominance. A left dominant response was seen only to intelligible sounds. It thus appears that the left hemisphere specialisation for speech is based on the linguistic properties of utterances, not on particular acoustic features.
Parent-child interaction in motor speech therapy.

Science.gov (United States)

Namasivayam, Aravind Kumar; Jethava, Vibhuti; Pukonen, Margit; Huynh, Anna; Goshulak, Debra; Kroll, Robert; van Lieshout, Pascal

2018-01-01

This study measures the reliability and sensitivity of a modified Parent-Child Interaction Observation scale (PCIOs) used to monitor the quality of parent-child interaction. The scale is part of a home-training program employed with direct motor speech intervention for children with speech sound disorders. Eighty-four preschool age children with speech sound disorders were provided either high- (2×/week/10 weeks) or low-intensity (1×/week/10 weeks) motor speech intervention. Clinicians completed the PCIOs at the beginning, middle, and end of treatment. Inter-rater reliability (Kappa scores) was determined by an independent speech-language pathologist who assessed videotaped sessions at the midpoint of the treatment block. Intervention sensitivity of the scale was evaluated using a Friedman test for each item and then followed up with Wilcoxon pairwise comparisons where appropriate. We obtained fair-to-good inter-rater reliability (Kappa = 0.33-0.64) for the PCIOs using only video-based scoring. Child-related items were more strongly influenced by differences in treatment intensity than parent-related items, where a greater number of sessions positively influenced parent learning of treatment skills and child behaviors. The adapted PCIOs is reliable and sensitive to monitor the quality of parent-child interactions in a 10-week block of motor speech intervention with adjunct home therapy. Implications for rehabilitation: Parent-centered therapy is considered a cost-effective method of speech and language service delivery. However, parent-centered models may be difficult to implement for treatments such as developmental motor speech interventions that require a high degree of skill and training. For children with speech sound disorders and motor speech difficulties, a translated and adapted version of the parent-child observation scale was found to be sufficiently reliable and sensitive to assess changes in the quality of the parent-child interactions during
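As a rough illustration of how an inter-rater Kappa score of the kind reported here is obtained, the sketch below computes Cohen's kappa for two raters. The abstract does not say which kappa variant or weighting was used, so unweighted Cohen's kappa is an assumption, and the two rating lists are invented example data, not values from the study.

```python
from collections import Counter

def cohens_kappa(rater_a, rater_b):
    """Unweighted Cohen's kappa for two equal-length lists of category labels."""
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("both raters must score the same non-empty set of items")
    n = len(rater_a)
    # Proportion of items on which the two raters agree.
    observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    # Agreement expected by chance, from each rater's marginal label frequencies.
    count_a = Counter(rater_a)
    count_b = Counter(rater_b)
    expected = sum(count_a[c] * count_b[c] for c in count_a) / (n * n)
    if expected == 1.0:
        raise ValueError("kappa is undefined when chance agreement equals 1")
    return (observed - expected) / (1 - expected)

# Hypothetical item scores (1 = behaviour observed, 0 = not observed).
clinician = [1, 1, 0, 0, 1, 0, 1, 1, 0, 0]
independent_rater = [1, 0, 0, 0, 1, 0, 1, 1, 1, 0]
kappa = cohens_kappa(clinician, independent_rater)
```

In this made-up example the raters agree on 8 of 10 items and chance agreement is 0.5, so kappa comes out at about 0.6, which would sit near the upper end of the 0.33-0.64 range the abstract reports.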
The Persian version of auditory word discrimination test (P-AWDT) for children: Development, validity, and reliability.

Science.gov (United States)

Hashemi, Nassim; Ghorbani, Ali; Soleymani, Zahra; Kamali, Mohmmad; Ahmadi, Zohreh Ziatabar; Mahmoudian, Saeid

2018-07-01

Auditory discrimination of speech sounds is an important perceptual ability and a precursor to the acquisition of language. Auditory information is at least partially necessary for the acquisition and organization of phonological rules. There are few standardized behavioral tests to evaluate phonemic distinctive features in children with or without speech and language disorders. The main objective of the present study was the development, validity, and reliability of the Persian version of auditory word discrimination test (P-AWDT) for 4-8-year-old children. A total of 120 typical children and 40 children with speech sound disorder (SSD) participated in the present study. The test comprised 160 monosyllabic paired-words distributed in the Forms A-1 and the Form A-2 for the initial consonants (80 words) and the Forms B-1 and the Form B-2 for the final consonants (80 words). Moreover, the discrimination of vowels was randomly included in all forms. Content validity was calculated and 50 children repeated the test twice with an interval of two weeks (test-retest reliability). Further analysis was also implemented including validity, intraclass correlation coefficient (ICC), Cronbach's alpha (internal consistency), age groups, and gender. The content validity index (CVI) and the test-retest reliability of the P-AWDT were 63%-86% and 81%-96%, respectively. Moreover, the total Cronbach's alpha for the internal consistency was estimated to be relatively high (0.93). Comparison of the mean scores of the P-AWDT in the typical children and the children with SSD revealed a significant difference. The results revealed that the group with SSD had greater severity of deficit than the typical group in auditory word discrimination. In addition, the difference between the age groups was statistically significant, especially in 4-4.11-year-old children. The performance of the two gender groups was relatively similar. The comparison of the P-AWDT scores between the typical children
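The internal-consistency figure quoted above is a Cronbach's alpha. A minimal sketch of the standard formula, alpha = k/(k-1) × (1 − Σ item variances / variance of total scores), is given below. The score matrix is invented for illustration and has nothing to do with the P-AWDT data; using sample variance throughout is also an assumption, since the abstract does not describe the computation.

```python
from statistics import variance

def cronbach_alpha(scores):
    """Cronbach's alpha for a list of rows (one row of item scores per respondent)."""
    if len(scores) < 2 or len(scores[0]) < 2:
        raise ValueError("need at least two respondents and two items")
    k = len(scores[0])
    # Sample variance of each item (column) across respondents.
    item_variances = [variance(column) for column in zip(*scores)]
    # Sample variance of each respondent's total score.
    total_variance = variance([sum(row) for row in scores])
    if total_variance == 0:
        raise ValueError("alpha is undefined when all total scores are equal")
    return k / (k - 1) * (1 - sum(item_variances) / total_variance)

# Hypothetical scores: 4 respondents x 3 items, values made up for this sketch.
example_scores = [
    [2, 3, 3],
    [1, 1, 2],
    [3, 3, 3],
    [0, 1, 1],
]
alpha = cronbach_alpha(example_scores)
```

Values close to 1 indicate that the items vary together across respondents, which is the sense in which an alpha of 0.93 is described as relatively high.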
"When He's around His Brothers ... He's Not so Quiet": The Private and Public Worlds of School-Aged Children with Speech Sound Disorder

Science.gov (United States)

McLeod, Sharynne; Daniel, Graham; Barr, Jacqueline

2013-01-01

Children interact with people in context: including home, school, and in the community. Understanding children's relationships within context is important for supporting children's development. Using child-friendly methodologies, the purpose of this research was to understand the lives of children with speech sound disorder (SSD) in context.…
Hearing speech in music.

Science.gov (United States)

Ekström, Seth-Reino; Borg, Erik

2011-01-01

The masking effect of a piano composition, played at different speeds and in different octaves, on speech-perception thresholds was investigated in 15 normal-hearing and 14 moderately-hearing-impaired subjects. Running speech (just follow conversation, JFC) testing and use of hearing aids increased the everyday validity of the findings. A comparison was made with standard audiometric noises [International Collegium of Rehabilitative Audiology (ICRA) noise and speech spectrum-filtered noise (SPN)]. All masking sounds, music or noise, were presented at the same equivalent sound level (50 dBA). The results showed a significant effect of piano performance speed and octave (P [...]) [...] tempo had the largest effect; and high octave and slow tempo, the smallest. Music had a lower masking effect than did ICRA noise with two or six speakers at normal vocal effort (P [...]) [...] music offers an interesting opportunity for studying masking under realistic conditions, where spectral and temporal features can be varied independently. The results have implications for composing music with vocal parts, designing acoustic environments and creating a balance between speech perception and privacy in social settings.
Thinking soap But Speaking ‘oaps’. The Sound Preparation Period: Backward Calculation From Utterance to Muscle Innervation

Directory of Open Access Journals (Sweden)

Nora Wiedenmann

2010-04-01

In this article’s model of speech and of speech errors, dyscoordinations, and disorders, the time-course from the muscle innervation impetuses to the utterance of sounds as intended for canonical speech sound sequences is calculated backward. This time-course is shown as the sum of all the known physiological durations of speech sounds and speech gestures that are necessary to produce an utterance. The model introduces two internal clocks, based on positive or negative factors, representing certain physiologically-based time-courses during the sound preparation period (Lautvorspann). The use of these internal clocks shows that speech gestures, like other motor activities, work according to a simple serialization principle: under non-default conditions, alterations of the time-courses may cause speech errors of sound serialization, dyscoordinations of sounds as observed during first language acquisition, or speech disorders as pathological cases. These alterations of the time-course are modelled by varying the two internal-clock factors. The calculation of time-courses uses as default values the sound durations of the context-dependent Munich PHONDAT Database of Spoken German (see Appendix 4). As a new, human approach, this calculation agrees mathematically with the approach of Linear Programming / Operations Research. This work gives strong support to the fairly old suspicion (of 1908) of the famous Austrian speech error scientist Meringer [15], namely that one mostly thinks and articulates in a different serialization than is audible from one’s uttered sound sequences.
Multisensory integration of speech sounds with letters vs. visual speech: only visual speech induces the mismatch negativity

NARCIS (Netherlands)

Stekelenburg, J.J.; Keetels, M.N.; Vroomen, J.H.M.

2018-01-01

Numerous studies have demonstrated that the vision of lip movements can alter the perception of auditory speech syllables (McGurk effect). While there is ample evidence for integration of text and auditory speech, there are only a few studies on the orthographic equivalent of the McGurk effect.

Validating a perceptual distraction model in a personal two-zone sound system

DEFF Research Database (Denmark)

Rämö, Jussi; Christensen, Lasse; Bech, Søren

2017-01-01

This paper focuses on validating a perceptual distraction model, which aims to predict users' perceived distraction caused by audio-on-audio interference, e.g., two competing audio sources within the same listening space. Originally, the distraction model was trained with music-on-music stimuli using a simple loudspeaker setup, consisting of only two loudspeakers, one for the target sound source and the other for the interfering sound source. Recently, the model was successfully validated in a complex personal sound-zone system with speech-on-music stimuli. A second round of validations was conducted by physically altering the sound-zone system and running a set of new listening experiments utilizing two sound zones within the sound-zone system, thus validating the model using a different sound-zone system with both speech-on-music and music-on-speech stimuli sets. Preliminary results show...
Constraints on decay of environmental sound memory in adult rats.

Science.gov (United States)

Sakai, Masashi

2006-11-27

When adult rats are pretreated with a 48-h-long 'repetitive nonreinforced sound exposure', performance in two-sound discriminative operant conditioning transiently improves. We have already shown that this 'sound exposure-enhanced discrimination' depends on enhancement of the perceptual capacity of the auditory cortex. This study investigated the principles governing the decay of sound exposure-enhanced discrimination. Sound exposure-enhanced discrimination disappeared within approximately 72 h if animals were deprived of environmental sounds after sound exposure, and within less than approximately 60 h if they were exposed to environmental sounds in the animal room. Sound deprivation itself exerted no clear effects. These findings suggest that the memory of a passively heard, behaviorally irrelevant sound signal does not merely fade over its intrinsic lifetime but is also degraded by other incoming signals.
Normal Aspects of Speech, Hearing, and Language.

Science.gov (United States)

Minifie, Fred D., Ed.; and others

This book is written as a guide to the understanding of the processes involved in human speech communication. Ten authorities contributed material to provide an introduction to the physiological aspects of speech production and reception, the acoustical aspects of speech production and transmission, the psychophysics of sound reception, the nature…
Ultrasound biofeedback treatment for persisting childhood apraxia of speech.

Science.gov (United States)

Preston, Jonathan L; Brick, Nickole; Landi, Nicole

2013-11-01

The purpose of this study was to evaluate the efficacy of a treatment program that includes ultrasound biofeedback for children with persisting speech sound errors associated with childhood apraxia of speech (CAS). Six children ages 9-15 years participated in a multiple baseline experiment for 18 treatment sessions during which treatment focused on producing sequences involving lingual sounds. Children were cued to modify their tongue movements using visual feedback from real-time ultrasound images. Probe data were collected before, during, and after treatment to assess word-level accuracy for treated and untreated sound sequences. As participants reached preestablished performance criteria, new sequences were introduced into treatment. All participants met the performance criterion (80% accuracy for 2 consecutive sessions) on at least 2 treated sound sequences. Across the 6 participants, performance criterion was met for 23 of 31 treated sequences in an average of 5 sessions. Some participants showed no improvement in untreated sequences, whereas others showed generalization to untreated sequences that were phonetically similar to the treated sequences. Most gains were maintained 2 months after the end of treatment. The percentage of phonemes correct increased significantly from pretreatment to the 2-month follow-up. A treatment program including ultrasound biofeedback is a viable option for improving speech sound accuracy in children with persisting speech sound errors associated with CAS.
Long-term exposure to noise impairs cortical sound processing and attention control.

Science.gov (United States)

Kujala, Teija; Shtyrov, Yury; Winkler, Istvan; Saher, Marieke; Tervaniemi, Mari; Sallinen, Mikael; Teder-Sälejärvi, Wolfgang; Alho, Kimmo; Reinikainen, Kalevi; Näätänen, Risto

2004-11-01

Long-term exposure to noise impairs human health, causing pathological changes in the inner ear as well as other anatomical and physiological deficits. Numerous individuals are daily exposed to excessive noise. However, there is a lack of systematic research on the effects of noise on cortical function. Here we report data showing that long-term exposure to noise has a persistent effect on central auditory processing and leads to concurrent behavioral deficits. We found that speech-sound discrimination was impaired in noise-exposed individuals, as indicated by behavioral responses and the mismatch negativity brain response. Furthermore, irrelevant sounds increased the distractibility of the noise-exposed subjects, which was shown by increased interference in task performance and aberrant brain responses. These results demonstrate that long-term exposure to noise has long-lasting detrimental effects on central auditory processing and attention control.
Varying acoustic-phonemic ambiguity reveals that talker normalization is obligatory in speech processing.

Science.gov (United States)

Choi, Ja Young; Hu, Elly R; Perrachione, Tyler K

2018-04-01

The nondeterministic relationship between speech acoustics and abstract phonemic representations imposes a challenge for listeners to maintain perceptual constancy despite the highly variable acoustic realization of speech. Talker normalization facilitates speech processing by reducing the degrees of freedom for mapping between encountered speech and phonemic representations. While this process has been proposed to facilitate the perception of ambiguous speech sounds, it is currently unknown whether talker normalization is affected by the degree of potential ambiguity in acoustic-phonemic mapping. We explored the effects of talker normalization on speech processing in a series of speeded classification paradigms, parametrically manipulating the potential for inconsistent acoustic-phonemic relationships across talkers for both consonants and vowels. Listeners identified words with varying potential acoustic-phonemic ambiguity across talkers (e.g., beet/boat vs. boot/boat) spoken by single or mixed talkers. Auditory categorization of words was always slower when listening to mixed talkers compared to a single talker, even when there was no potential acoustic ambiguity between target sounds. Moreover, the processing cost imposed by mixed talkers was greatest when words had the most potential acoustic-phonemic overlap across talkers. Models of acoustic dissimilarity between target speech sounds did not account for the pattern of results. These results suggest (a) that talker normalization incurs the greatest processing cost when disambiguating highly confusable sounds and (b) that talker normalization appears to be an obligatory component of speech perception, taking place even when the acoustic-phonemic relationships across sounds are unambiguous.
High-frequency energy in singing and speech

Science.gov (United States)

Monson, Brian Bruce

While human speech and the human voice generate acoustical energy up to (and beyond) 20 kHz, the energy above approximately 5 kHz has been largely neglected. Evidence is accruing that this high-frequency energy contains perceptual information relevant to speech and voice, including percepts of quality, localization, and intelligibility. The present research was an initial step in the long-range goal of characterizing high-frequency energy in singing voice and speech, with particular regard for its perceptual role and its potential for modification during voice and speech production. In this study, a database of high-fidelity recordings of talkers was created and used for a broad acoustical analysis and general characterization of high-frequency energy, as well as specific characterization of phoneme category, voice and speech intensity level, and mode of production (speech versus singing) by high-frequency energy content. Directionality of radiation of high-frequency energy from the mouth was also examined. The recordings were used for perceptual experiments wherein listeners were asked to discriminate between speech and voice samples that differed only in high-frequency energy content. Listeners were also subjected to gender discrimination tasks, mode-of-production discrimination tasks, and transcription tasks with samples of speech and singing that contained only high-frequency content. The combination of these experiments has revealed that (1) human listeners are able to detect very subtle level changes in high-frequency energy, and (2) human listeners are able to extract significant perceptual information from high-frequency energy.
Abnormal sound detection device

International Nuclear Information System (INIS)

Yamada, Izumi; Matsui, Yuji.

1995-01-01

Only components synchronized with the rotation of pumps are sampled from the detected acoustic sounds, and the presence or absence of an abnormality is judged from the magnitude of the synchronized components. A synchronized-component sampling means can remove resonance sounds and other acoustic sounds generated asynchronously with the rotation, based on the knowledge that the acoustic components generated in a normal state are a kind of resonance sound and are not precisely synchronized with the rotation speed. Abnormal sounds of a rotating body, on the other hand, are often caused by a forcing that accompanies the rotation, so they can be detected by extracting only the rotation-synchronized components. Since the components of the normal acoustic sounds actually being generated are discriminated from the detected sounds, attenuation of the abnormal sounds by the signal processing can be avoided and, as a result, abnormal sound detection sensitivity can be improved. Further, since the occurrence of the abnormal sound is discriminated from the actually detected sounds, other frequency components that are predicted but not actually generated are not removed, which further improves detection sensitivity. (N.H.)
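The abstract does not say how the synchronized components are sampled. As an illustration only, the sketch below uses time-synchronous averaging, a standard technique that is swapped in here and is not claimed to be the device's method. The synthetic signal, the function names, the choice of the 3rd rotation order, and the 0.1 threshold are all hypothetical.

```python
import math


def make_signal(sync_amp, resonance_amp, samples_per_rev=64, revolutions=100):
    """Synthetic pump sound (hypothetical): a tone locked to the 3rd rotation
    order plus a resonance at 7.37 cycles per revolution, i.e. deliberately
    NOT an integer multiple of the rotation."""
    signal = []
    for n in range(samples_per_rev * revolutions):
        shaft_angle = 2.0 * math.pi * n / samples_per_rev  # radians of rotation
        signal.append(sync_amp * math.sin(3.0 * shaft_angle)
                      + resonance_amp * math.sin(7.37 * shaft_angle))
    return signal


def synchronous_average(signal, samples_per_rev):
    """Average the signal revolution by revolution. Components locked to the
    rotation add coherently; unsynchronized components tend to cancel."""
    revolutions = len(signal) // samples_per_rev
    average = [0.0] * samples_per_rev
    for r in range(revolutions):
        for k in range(samples_per_rev):
            average[k] += signal[r * samples_per_rev + k]
    return [value / revolutions for value in average]


def order_magnitude(one_revolution, order):
    """Amplitude of a given rotation order in a one-revolution waveform,
    computed as a single DFT bin."""
    n = len(one_revolution)
    re = sum(x * math.cos(2.0 * math.pi * order * k / n)
             for k, x in enumerate(one_revolution))
    im = sum(x * math.sin(2.0 * math.pi * order * k / n)
             for k, x in enumerate(one_revolution))
    return 2.0 * math.hypot(re, im) / n


def is_abnormal(signal, samples_per_rev, order=3, threshold=0.1):
    """Flag an abnormality when the rotation-synchronized magnitude exceeds
    an assumed threshold."""
    averaged = synchronous_average(signal, samples_per_rev)
    return order_magnitude(averaged, order) > threshold


if __name__ == "__main__":
    normal = make_signal(sync_amp=0.0, resonance_amp=0.8)
    faulty = make_signal(sync_amp=0.5, resonance_amp=0.8)
    print("normal flagged:", is_abnormal(normal, 64))
    print("faulty flagged:", is_abnormal(faulty, 64))
```

The design point mirrors the abstract: the resonance is never modeled or subtracted, it simply fails to survive averaging over revolutions, so the decision rests only on what is locked to the rotation.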
Phoneme Compression: processing of the speech signal and effects on speech intelligibility in hearing-impaired listeners

NARCIS (Netherlands)

A. Goedegebure (Andre)

2005-01-01

Hearing-aid users often continue to have problems with poor speech understanding in difficult acoustical conditions. Another commonly reported problem is that certain sounds become too loud whereas other sounds are still not audible. Dynamic range compression is a signal processing
Results of the Sensory Profile in Children with Suspected Childhood Apraxia of Speech

Science.gov (United States)

Newmeyer, Amy J.; Grether, Sandra; Aylward, Christa; deGrauw, Ton; Akers, Rachel; Grasha, Carol; Ishikawa, Keiko; White, Jaye

2009-01-01

Speech-sound disorders are common in preschool-age children, and are characterized by difficulty in the planning and production of speech sounds and their combination into words and sentences. The objective of this study was to review and compare the results of the "Sensory Profile" ([Dunn, 1999]) in children with a specific type of speech-sound…
Effects of irrelevant speech and traffic noise on speech perception and cognitive performance in elementary school children.

Science.gov (United States)

Klatte, Maria; Meis, Markus; Sukowski, Helga; Schick, August

2007-01-01

The effects of background noise of moderate intensity on short-term storage and processing of verbal information were analyzed in 6- to 8-year-old children. In line with adult studies on the "irrelevant sound effect" (ISE), serial recall of visually presented digits was severely disrupted by background speech that the children did not understand. Train noise of equal intensity, however, had no effect. Similar results were demonstrated with tasks requiring storage and processing of heard information. Memory for nonwords, execution of oral instructions and categorizing speech sounds were significantly disrupted by irrelevant speech. The affected functions play a fundamental role in the acquisition of spoken and written language. Implications concerning current models of the ISE and the acoustic conditions in schools and kindergartens are discussed.
A general auditory bias for handling speaker variability in speech? Evidence in humans and songbirds

NARCIS (Netherlands)

Kriengwatana, B.; Escudero, P.; Kerkhoven, A.H.; ten Cate, C.

2015-01-01

Different speakers produce the same speech sound differently, yet listeners are still able to reliably identify the speech sound. How listeners can adjust their perception to compensate for speaker differences in speech, and whether these compensatory processes are unique only to humans, is still
Auditory perception and attention as reflected by the brain event-related potentials in children with Asperger syndrome.

Science.gov (United States)

Lepistö, T; Silokallio, S; Nieminen-von Wendt, T; Alku, P; Näätänen, R; Kujala, T

2006-10-01

Language development is delayed and deviant in individuals with autism, but proceeds quite normally in those with Asperger syndrome (AS). We investigated auditory-discrimination and orienting in children with AS using an event-related potential (ERP) paradigm that was previously applied to children with autism. ERPs were measured to pitch, duration, and phonetic changes in vowels and to corresponding changes in non-speech sounds. Active sound discrimination was evaluated with a sound-identification task. The mismatch negativity (MMN), indexing sound-discrimination accuracy, showed right-hemisphere dominance in the AS group, but not in the controls. Furthermore, the children with AS had diminished MMN-amplitudes and decreased hit rates for duration changes. In contrast, their MMN to speech pitch changes was parietally enhanced. The P3a, reflecting involuntary orienting to changes, was diminished in the children with AS for speech pitch and phoneme changes, but not for the corresponding non-speech changes. The children with AS differ from controls with respect to their sound-discrimination and orienting abilities. The results of the children with AS are relatively similar to those earlier obtained from children with autism using the same paradigm, although these clinical groups differ markedly in their language development.
Age-related sensitive periods influence visual language discrimination in adults.

Science.gov (United States)

Weikum, Whitney M; Vouloumanos, Athena; Navarra, Jordi; Soto-Faraco, Salvador; Sebastián-Gallés, Núria; Werker, Janet F

2013-01-01

Adults as well as infants have the capacity to discriminate languages based on visual speech alone. Here, we investigated whether adults' ability to discriminate languages based on visual speech cues is influenced by the age of language acquisition. Adult participants who had all learned English (as a first or second language) but did not speak French were shown faces of bilingual (French/English) speakers silently reciting sentences in either language. Using only visual speech information, adults who had learned English from birth or as a second language before the age of 6 could discriminate between French and English significantly better than chance. However, adults who had learned English as a second language after age 6 failed to discriminate these two languages, suggesting that early childhood exposure is crucial for using relevant visual speech information to separate languages visually. These findings raise the possibility that lowered sensitivity to non-native visual speech cues may contribute to the difficulties encountered when learning a new language in adulthood.
Using personal response systems to assess speech perception within the classroom: an approach to determine the efficacy of sound field amplification in primary school classrooms.

Science.gov (United States)

Vickers, Deborah A; Backus, Bradford C; Macdonald, Nora K; Rostamzadeh, Niloofar K; Mason, Nisha K; Pandya, Roshni; Marriage, Josephine E; Mahon, Merle H

2013-01-01

The assessment of the combined effect of classroom acoustics and sound field amplification (SFA) on children's speech perception within the "live" classroom poses a challenge to researchers. The goals of this study were to determine: (1) Whether personal response system (PRS) hand-held voting cards, together with a closed-set speech perception test (Chear Auditory Perception Test [CAPT]), provide an appropriate method for evaluating speech perception in the classroom; (2) Whether SFA provides better access to the teacher's speech than without SFA for children, taking into account vocabulary age, middle ear dysfunction or ear-canal wax, and home language. Forty-four children from two school-year groups, year 2 (aged 6 years 11 months to 7 years 10 months) and year 3 (aged 7 years 11 months to 8 years 10 months) were tested in two classrooms, using a shortened version of the four-alternative consonant discrimination section of the CAPT. All children used a PRS to register their chosen response, which they selected from four options displayed on the interactive whiteboard. The classrooms were located in a 19th-century school in central London, United Kingdom. Each child sat at their usual position in the room while target speech stimuli were presented either in quiet or in noise. The target speech was presented from the front of the classroom at 65 dBA (calibrated at 1 m) and the presented noise level was 46 dBA measured at the center of the classroom. The older children had an additional noise condition with a noise level of 52 dBA. All conditions were presented twice, once with SFA and once without SFA and the order of testing was randomized. White noise from the teacher's right-hand side of the classroom and International Speech Test Signal from the teacher's left-hand side were used, and the noises were matched at the center point of the classroom (10sec averaging [A-weighted]). Each child's expressive vocabulary age and middle ear status were measured
Auditory cortex processes variation in our own speech.

Directory of Open Access Journals (Sweden)

Kevin R Sitek

As we talk, we unconsciously adjust our speech to ensure it sounds the way we intend it to sound. However, because speech production involves complex motor planning and execution, no two utterances of the same sound will be exactly the same. Here, we show that auditory cortex is sensitive to natural variations in self-produced speech from utterance to utterance. We recorded event-related potentials (ERPs) from ninety-nine subjects while they uttered "ah" and while they listened to those speech sounds played back. Subjects' utterances were sorted based on their formant deviations from the previous utterance. Typically, the N1 ERP component is suppressed during talking compared to listening. By comparing ERPs to the least and most variable utterances, we found that N1 was less suppressed to utterances that differed greatly from their preceding neighbors. In contrast, an utterance's difference from the median formant values did not affect N1. Trial-to-trial pitch (f0) deviation and pitch difference from the median similarly did not affect N1. We discuss mechanisms that may underlie the change in N1 suppression resulting from trial-to-trial formant change. Deviant utterances require additional auditory cortical processing, suggesting that speaking-induced suppression mechanisms are optimally tuned for a specific production.
Auditory Cortex Processes Variation in Our Own Speech

Science.gov (United States)

Sitek, Kevin R.; Mathalon, Daniel H.; Roach, Brian J.; Houde, John F.; Niziolek, Caroline A.; Ford, Judith M.

2013-01-01

As we talk, we unconsciously adjust our speech to ensure it sounds the way we intend it to sound. However, because speech production involves complex motor planning and execution, no two utterances of the same sound will be exactly the same. Here, we show that auditory cortex is sensitive to natural variations in self-produced speech from utterance to utterance. We recorded event-related potentials (ERPs) from ninety-nine subjects while they uttered “ah” and while they listened to those speech sounds played back. Subjects' utterances were sorted based on their formant deviations from the previous utterance. Typically, the N1 ERP component is suppressed during talking compared to listening. By comparing ERPs to the least and most variable utterances, we found that N1 was less suppressed to utterances that differed greatly from their preceding neighbors. In contrast, an utterance's difference from the median formant values did not affect N1. Trial-to-trial pitch (f0) deviation and pitch difference from the median similarly did not affect N1. We discuss mechanisms that may underlie the change in N1 suppression resulting from trial-to-trial formant change. Deviant utterances require additional auditory cortical processing, suggesting that speaking-induced suppression mechanisms are optimally tuned for a specific production. PMID:24349399
THE ONTOGENESIS OF SPEECH DEVELOPMENT

Directory of Open Access Journals (Sweden)

T. E. Braudo

2017-01-01

The purpose of this article is to acquaint specialists working with children who have developmental disorders with age-related norms for speech development. Many well-known linguists and psychologists have studied speech ontogenesis (logogenesis). Speech is a higher mental function that integrates many functional systems. Speech development in infants during the first months after birth is ensured by innate hearing and the emerging ability to fix the gaze on the face of an adult. Innate emotional reactions also develop during this period, turning into nonverbal forms of communication. At about 6 months a baby starts to pronounce some syllables; at 7–9 months the baby repeats various sound combinations pronounced by adults. At 10–11 months a baby begins to react to words addressed to him or her. The first words usually appear at the age of 1 year; this is the start of the stage of active speech development. At this time it is acceptable if a child confuses or rearranges sounds, distorts them, or omits them. By the age of 1.5 years a child begins to understand abstract explanations from adults. Significant vocabulary enlargement occurs between 2 and 3 years; grammatical structures of the language are formed during this period (a child starts to use phrases and sentences). Preschool age (3–7 y. o.) is characterized by incorrect but steadily improving pronunciation of sounds and phonemic perception. The vocabulary increases; abstract speech and retelling are formed. Children over 7 y. o. continue to improve grammar, writing and reading skills. The described stages may not have strict age boundaries, since they depend not only on environment, but also on the child's mental constitution, heredity and character.
Automatic analysis of slips of the tongue: Insights into the cognitive architecture of speech production.

Science.gov (United States)

Goldrick, Matthew; Keshet, Joseph; Gustafson, Erin; Heller, Jordana; Needle, Jeremy

2016-04-01

Traces of the cognitive mechanisms underlying speaking can be found within subtle variations in how we pronounce sounds. While speech errors have traditionally been seen as categorical substitutions of one sound for another, acoustic/articulatory analyses show they partially reflect the intended sound. When "pig" is mispronounced as "big," the resulting /b/ sound differs from correct productions of "big," moving towards intended "pig" and revealing the role of graded sound representations in speech production. Investigating the origins of such phenomena requires detailed estimation of speech sound distributions; this has been hampered by reliance on subjective, labor-intensive manual annotation. Computational methods can address these issues by providing for objective, automatic measurements. We develop a novel high-precision computational approach, based on a set of machine learning algorithms, for measurement of elicited speech. The algorithms are trained on existing manually labeled data to detect and locate linguistically relevant acoustic properties with high accuracy. Our approach is robust, is designed to handle mis-productions, and overall matches the performance of expert coders. It allows us to analyze a very large dataset of speech errors (containing far more errors than the total in the existing literature), illuminating properties of speech sound distributions previously impossible to reliably observe. We argue that this provides novel evidence that two sources both contribute to deviations in speech errors: planning processes specifying the targets of articulation and articulatory processes specifying the motor movements that execute this plan. These findings illustrate how a much richer picture of speech provides an opportunity to gain novel insights into language processing. Copyright © 2016 Elsevier B.V. All rights reserved.
Speech Intelligibility and Hearing Protector Selection

Science.gov (United States)

2016-08-29

not only affect the listener of speech communication in a noisy environment, HPDs can also affect the speaker. Tufts and Frank (2003) found that...of hearing protection on speech intelligibility in noise. Sound and Vibration. 20(10): 12-14. Berger, E. H. 1980. EARLog #4 – The

The Relationship Between Speech, Language, and Phonological Awareness in Preschool-Age Children With Developmental Disabilities.

Science.gov (United States)

Barton-Hulsey, Andrea; Sevcik, Rose A; Romski, MaryAnn

2018-05-03

A number of intrinsic factors, including expressive speech skills, have been suggested to place children with developmental disabilities at risk for limited development of reading skills. This study examines the relationship between these factors, speech ability, and children's phonological awareness skills. A nonexperimental study design was used to examine the relationship between intrinsic skills of speech, language, print, and letter-sound knowledge to phonological awareness in 42 children with developmental disabilities between the ages of 48 and 69 months. Hierarchical multiple regression was done to determine if speech ability accounted for a unique amount of variance in phonological awareness skill beyond what would be expected by developmental skills inclusive of receptive language and print and letter-sound knowledge. A range of skill in all areas of direct assessment was found. Children with limited speech were found to have emerging skills in print knowledge, letter-sound knowledge, and phonological awareness. Speech ability did not predict a significant amount of variance in phonological awareness beyond what would be expected by developmental skills of receptive language and print and letter-sound knowledge. Children with limited speech ability were found to have receptive language and letter-sound knowledge that supported the development of phonological awareness skills. This study provides implications for practitioners and researchers concerning the factors related to early reading development in children with limited speech ability and developmental disabilities.
Tinnitus (Phantom Sound): Risk coming for future

Directory of Open Access Journals (Sweden)

Suresh Rewar

2015-01-01

The word 'tinnitus' comes from the Latin word tinnire, meaning “to ring” or “a ringing.” Tinnitus is the perception of sound in the absence of any corresponding external sound. Tinnitus can take the form of continuous buzzing, hissing, or ringing, or a combination of these or other characteristics. Tinnitus affects 10% to 25% of the adult population. Tinnitus is classified into objective and subjective categories. Subjective tinnitus consists of meaningless sounds that are not associated with a physical sound and that only the person who has the tinnitus can hear. Objective tinnitus is the result of a sound that can be heard by the physician. Tinnitus is not a disease in itself but a common symptom, and because it involves the perception of sound or sounds, it is commonly associated with the hearing system. In fact, various parts of the hearing system, including the inner ear, are often responsible for this symptom. In patients, tinnitus can lead to sleep disturbances, concentration problems, fatigue, depression, anxiety disorders, and sometimes even to suicide. The evaluation of tinnitus always begins with a thorough history and physical examination, with further testing performed when indicated. Diagnostic testing should include audiography, speech discrimination testing, and computed tomography angiography or magnetic resonance angiography. All patients with tinnitus can benefit from patient education and preventive measures, and oftentimes the physician's reassurance and assistance with the psychologic aftereffects of tinnitus can be the therapy most valuable to the patient. There are no specific medications for the treatment of tinnitus. Sedatives and some other medications may prove helpful in the early stages. The ultimate goal of neuro-imaging is to identify subtypes of tinnitus in order to better inform treatment strategies.
The Voice of the Heart: Vowel-Like Sound in Pulmonary Artery Hypertension

Directory of Open Access Journals (Sweden)

Mohamed Elgendi

2018-04-01

Increased blood pressure in the pulmonary artery is referred to as pulmonary hypertension and often is linked to loud pulmonic valve closures. For the purpose of this paper, it was hypothesized that pulmonary circulation vibrations will create sounds similar to sounds created by vocal cords during speech and that subjects with pulmonary artery hypertension (PAH) could have unique sound signatures across four auscultatory sites. Using a digital stethoscope, heart sounds were recorded at the cardiac apex, 2nd left intercostal space (2LICS), 2nd right intercostal space (2RICS), and 4th left intercostal space (4LICS) in subjects undergoing simultaneous cardiac catheterization. From the collected heart sounds, relative power of the frequency band, energy of the sinusoid formants, and entropy were extracted. PAH subjects were differentiated by applying linear discriminant analysis with leave-one-out cross-validation. The entropy of the first sinusoid formant decreased significantly in subjects with a mean pulmonary artery pressure (mPAp) ≥ 25 mmHg versus subjects with a mPAp < 25 mmHg, with a sensitivity of 84% and specificity of 88.57%, within a 10-s optimized window length for heart sounds recorded at the 2LICS. First sinusoid formant entropy reduction of heart sounds in PAH subjects suggests the existence of a vowel-like pattern. Pattern analysis revealed a unique sound signature, which could be used in non-invasive screening tools.
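To make the evaluation scheme concrete, here is a minimal sketch of linear discriminant analysis with leave-one-out cross-validation, reporting sensitivity and specificity. It assumes a single feature per subject and equal class variances; the entropy values are invented placeholders, not data from the study, and the function names are hypothetical. The study itself extracted several features, so this is a simplification and not the authors' pipeline.

```python
import math


def fit_lda_1d(values, labels):
    """Fit a two-class, one-feature LDA: per-class mean and prior, plus a
    pooled (shared) variance estimate."""
    stats = {}
    pooled_ss = 0.0
    for c in (0, 1):
        xs = [v for v, y in zip(values, labels) if y == c]
        mean = sum(xs) / len(xs)
        pooled_ss += sum((x - mean) ** 2 for x in xs)
        stats[c] = (mean, len(xs) / len(values))
    variance = pooled_ss / (len(values) - 2)
    return stats, variance


def predict_lda_1d(model, x):
    """Return the class with the larger linear discriminant score."""
    stats, variance = model

    def score(c):
        mean, prior = stats[c]
        return x * mean / variance - mean ** 2 / (2.0 * variance) + math.log(prior)

    return 1 if score(1) > score(0) else 0


def leave_one_out(values, labels):
    """Hold out each subject in turn, train on the rest, and tally the
    confusion matrix. Returns (sensitivity, specificity) for class 1."""
    tp = fn = tn = fp = 0
    for i in range(len(values)):
        train_x = values[:i] + values[i + 1:]
        train_y = labels[:i] + labels[i + 1:]
        predicted = predict_lda_1d(fit_lda_1d(train_x, train_y), values[i])
        if labels[i] == 1:
            tp += predicted == 1
            fn += predicted == 0
        else:
            tn += predicted == 0
            fp += predicted == 1
    return tp / (tp + fn), tn / (tn + fp)


if __name__ == "__main__":
    # Placeholder entropy values (label 1 = mPAp >= 25 mmHg, 0 = below).
    entropy = [0.42, 0.45, 0.47, 0.50, 0.55, 0.71,
               0.62, 0.66, 0.69, 0.72, 0.78, 0.49]
    label = [1, 1, 1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
    sensitivity, specificity = leave_one_out(entropy, label)
    print(f"sensitivity={sensitivity:.2f} specificity={specificity:.2f}")
```

Leave-one-out is a natural fit for small clinical cohorts because every subject serves as a test case exactly once while the classifier is always trained on all remaining subjects.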
Correlational Analysis of Speech Intelligibility Tests and Metrics for Speech Transmission

Science.gov (United States)

2017-12-04

sounds, are more prone to masking than the high-energy, wide-spectrum vowels. Such contaminated speech is still audible but not clear. Thus, speech... Approved for public release; distribution is unlimited.
Plasticity in the Human Speech Motor System Drives Changes in Speech Perception

Science.gov (United States)

Lametti, Daniel R.; Rochet-Capellan, Amélie; Neufeld, Emily; Shiller, Douglas M.

2014-01-01

Recent studies of human speech motor learning suggest that learning is accompanied by changes in auditory perception. But what drives the perceptual change? Is it a consequence of changes in the motor system? Or is it a result of sensory inflow during learning? Here, subjects participated in a speech motor-learning task involving adaptation to altered auditory feedback and they were subsequently tested for perceptual change. In two separate experiments, involving two different auditory perceptual continua, we show that changes in the speech motor system that accompany learning drive changes in auditory speech perception. Specifically, we obtained changes in speech perception when adaptation to altered auditory feedback led to speech production that fell into the phonetic range of the speech perceptual tests. However, a similar change in perception was not observed when the auditory feedback that subjects received during learning fell into the phonetic range of the perceptual tests. This indicates that the central motor outflow associated with vocal sensorimotor adaptation drives changes to the perceptual classification of speech sounds. PMID:25080594
Filtering the Unknown: Speech Activity Detection in Heterogeneous Video Collections

NARCIS (Netherlands)

Huijbregts, M.A.H.; Wooters, Chuck; Ordelman, Roeland J.F.

2007-01-01

In this paper we discuss the speech activity detection system that we used for detecting speech regions in the Dutch TRECVID video collection. The system is designed to filter non-speech like music or sound effects out of the signal without the use of predefined non-speech models. Because the system
Song and speech: examining the link between singing talent and speech imitation ability

Directory of Open Access Journals (Sweden)

Markus Christiner

2013-11-01

In previous research on speech imitation, musicality and an ability to sing were isolated as the strongest indicators of good pronunciation skills in foreign languages. We, therefore, wanted to take a closer look at the nature of the ability to sing, which shares a common ground with the ability to imitate speech. This study focuses on whether good singing performance predicts good speech imitation. Forty-one singers of different levels of proficiency were selected for the study and their ability to sing, their ability to imitate speech, their musical talent and their working memory were tested. Results indicated that singing performance is a better indicator of the ability to imitate speech than the playing of a musical instrument. A multiple regression revealed that 64% of the speech imitation score variance could be explained by working memory together with educational background and singing performance. A second multiple regression showed that 66% of the speech imitation variance of completely unintelligible and unfamiliar language stimuli (Hindi) could be explained by working memory together with a singer’s sense of rhythm and quality of voice. This supports the idea that both vocal behaviors have a common grounding in terms of vocal and motor flexibility, ontogenetic and phylogenetic development, neural orchestration and sound memory, with singing fitting better into the category of "speech" on the productive level and "music" on the acoustic level. As a result, good singers benefit from vocal and motor flexibility, productively and cognitively, in three ways. 1. Motor flexibility and the ability to sing improve language and musical function. 2. Good singers retain a certain plasticity and are open to new and unusual sound combinations during adulthood both perceptually and productively. 3. The ability to sing improves the memory span of the auditory short term memory.
A Change of a Consonant Status: the Bedouinisation of the [j] Sound in the Speech of Kuwaitis: A Case Study

Directory of Open Access Journals (Sweden)

Abdulmohsen A. Dashti

2015-09-01

In light of sociolinguistic phonological change, this study investigates the [j] sound in the speech of Kuwaitis. [j] is the predominant form and characterizes the sedentary population, which is made up of both indigenous and non-indigenous groups, while [ʤ] is the realisation used by the Bedouins, who are also part of the indigenous population. Although [ʤ] is the classical variant, it has, for some time, been regarded by Kuwaitis as the stigmatized form and [j] as the one that carries prestige. This study examines the change of status of [j] and [ʤ] in the speech of Kuwaitis. The main hypothesis is that [j] no longer carries prestige. To test this hypothesis, 40 Kuwaitis of different genders, ages, educational backgrounds, and social networks were spontaneously chosen to be interviewed. Their speech was phonetically transcribed and then analyzed quantitatively and qualitatively. Results indicate that the [j] variant is undergoing a change of status and that the social parameters and the significant political and social changes that Kuwait has undergone recently have triggered this linguistic shift.
The Tuning of Human Neonates' Preference for Speech

Science.gov (United States)

Vouloumanos, Athena; Hauser, Marc D.; Werker, Janet F.; Martin, Alia

2010-01-01

Human neonates prefer listening to speech compared to many nonspeech sounds, suggesting that humans are born with a bias for speech. However, neonates' preference may derive from properties of speech that are not unique but instead are shared with the vocalizations of other species. To test this, thirty neonates and sixteen 3-month-olds were…
Discriminative training of self-structuring hidden control neural models

DEFF Research Database (Denmark)

Sørensen, Helge Bjarup Dissing; Hartmann, Uwe; Hunnerup, Preben

1995-01-01

This paper presents a new training algorithm for self-structuring hidden control neural (SHC) models. The SHC models were trained non-discriminatively for speech recognition applications. Better recognition performance can generally be achieved, if discriminative training is applied instead. Thus...... we developed a discriminative training algorithm for SHC models, where each SHC model for a specific speech pattern is trained with utterances of the pattern to be recognized and with other utterances. The discriminative training of SHC neural models has been tested on the TIDIGITS database...
The sound manifesto

Science.gov (United States)

O'Donnell, Michael J.; Bisnovatyi, Ilia

2000-11-01

Computing practice today depends on visual output to drive almost all user interaction. Other senses, such as audition, may be totally neglected, or used tangentially, or used in highly restricted specialized ways. We have excellent audio rendering through D-A conversion, but we lack rich general facilities for modeling and manipulating sound comparable in quality and flexibility to graphics. We need coordinated research in several disciplines to improve the use of sound as an interactive information channel. Incremental and separate improvements in synthesis, analysis, speech processing, audiology, acoustics, music, etc. will not alone produce the radical progress that we seek in sonic practice. We also need to create a new central topic of study in digital audio research. The new topic will assimilate the contributions of different disciplines on a common foundation. The key central concept that we lack is sound as a general-purpose information channel. We must investigate the structure of this information channel, which is driven by the cooperative development of auditory perception and physical sound production. Particular audible encodings, such as speech and music, illuminate sonic information by example, but they are no more sufficient for a characterization than typography is sufficient for characterization of visual information. To develop this new conceptual topic of sonic information structure, we need to integrate insights from a number of different disciplines that deal with sound. In particular, we need to coordinate central and foundational studies of the representational models of sound with specific applications that illuminate the good and bad qualities of these models. Each natural or artificial process that generates informative sound, and each perceptual mechanism that derives information from sound, will teach us something about the right structure to attribute to the sound itself. The new Sound topic will combine the work of computer
Audiovisual Discrimination between Laughter and Speech

NARCIS (Netherlands)

Petridis, Stavros; Pantic, Maja

Past research on automatic laughter detection has focused mainly on audio-based detection. Here we present an audiovisual approach to distinguishing laughter from speech and we show that integrating the information from audio and video leads to an improved reliability of audiovisual approach in
Sound-based assistive technology support to hearing, speaking and seeing

CERN Document Server

Ifukube, Tohru

2017-01-01

This book "Sound-based Assistive Technology" explains a technology to help speech-, hearing- and sight-impaired people. They might benefit in some way from an enhancement in their ability to recognize and produce speech or to detect sounds in their surroundings. Additionally, it is considered how sound-based assistive technology might be applied to the areas of speech recognition, speech synthesis, environmental recognition, virtual reality and robots. It is the primary focus of this book to provide an understanding of both the methodology and basic concepts of assistive technology rather than listing the variety of assistive devices developed in Japan or other countries. Although this book presents a number of different topics, they are sufficiently independent from one another that the reader may begin at any chapter without experiencing confusion. It should be acknowledged that much of the research quoted in this book was conducted in the author's laboratories both at Hokkaido University and the University...
Stuttering Frequency, Speech Rate, Speech Naturalness, and Speech Effort During the Production of Voluntary Stuttering.

Science.gov (United States)

Davidow, Jason H; Grossman, Heather L; Edge, Robin L

2018-05-01

Voluntary stuttering techniques involve persons who stutter purposefully interjecting disfluencies into their speech. Little research has been conducted on the impact of these techniques on the speech pattern of persons who stutter. The present study examined whether changes in the frequency of voluntary stuttering accompanied changes in stuttering frequency, articulation rate, speech naturalness, and speech effort. In total, 12 persons who stutter aged 16-34 years participated. Participants read four 300-syllable passages during a control condition, and three voluntary stuttering conditions that involved attempting to produce purposeful, tension-free repetitions of initial sounds or syllables of a word for two or more repetitions (i.e., bouncing). The three voluntary stuttering conditions included bouncing on 5%, 10%, and 15% of syllables read. Friedman tests and follow-up Wilcoxon signed ranks tests were conducted for the statistical analyses. Stuttering frequency, articulation rate, and speech naturalness were significantly different between the voluntary stuttering conditions. Speech effort did not differ between the voluntary stuttering conditions. Stuttering frequency was significantly lower during the three voluntary stuttering conditions compared to the control condition, and speech effort was significantly lower during two of the three voluntary stuttering conditions compared to the control condition. Due to changes in articulation rate across the voluntary stuttering conditions, it is difficult to conclude, as has been suggested previously, that voluntary stuttering is the reason for stuttering reductions found when using voluntary stuttering techniques. Additionally, future investigations should examine different types of voluntary stuttering over an extended period of time to determine their impact on stuttering frequency, speech rate, speech naturalness, and speech effort.
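The abstract names Friedman tests as the omnibus comparison across the repeated conditions. A minimal sketch of how the Friedman chi-square statistic is formed from within-participant ranks follows. The scores are invented, the function names are hypothetical, and the sketch assumes no tied scores within a participant (so no tie correction is applied) and does not compute a p-value.

```python
# Illustrative sketch only: invented scores, no ties assumed, no p-value.

def rank_row(row):
    """Rank one participant's scores from 1 (lowest) to k (highest)."""
    order = sorted(range(len(row)), key=lambda j: row[j])
    ranks = [0] * len(row)
    for rank, j in enumerate(order, start=1):
        ranks[j] = rank
    return ranks

def friedman_statistic(table):
    """Friedman chi-square for n participants (rows) by k conditions (columns):
    12 / (n k (k + 1)) * sum(R_j^2) - 3 n (k + 1), with R_j the rank sum of column j."""
    n = len(table)
    k = len(table[0])
    rank_sums = [0] * k
    for row in table:
        for j, r in enumerate(rank_row(row)):
            rank_sums[j] += r
    return 12.0 / (n * k * (k + 1)) * sum(s * s for s in rank_sums) - 3.0 * n * (k + 1)

# Hypothetical percent-syllables-stuttered for 4 participants in 3 conditions.
scores = [
    [5.0, 3.0, 1.0],
    [6.0, 4.0, 2.0],
    [7.0, 2.5, 3.0],
    [4.0, 3.5, 1.5],
]
statistic = friedman_statistic(scores)
```

A significant omnibus result would then be followed by pairwise Wilcoxon signed-rank tests, as the abstract describes.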
Automatic discrimination between laughter and speech

NARCIS (Netherlands)

Truong, K.; Leeuwen, D. van

2007-01-01

Emotions can be recognized by audible paralinguistic cues in speech. By detecting these paralinguistic cues that can consist of laughter, a trembling voice, coughs, changes in the intonation contour etc., information about the speaker's state and emotion can be revealed. This paper describes the
Task-Modulated Cortical Representations of Natural Sound Source Categories

DEFF Research Database (Denmark)

Hjortkjær, Jens; Kassuba, Tanja; Madsen, Kristoffer Hougaard

2018-01-01

In everyday sound environments, we recognize sound sources and events by attending to relevant aspects of an acoustic input. Evidence about the cortical mechanisms involved in extracting relevant category information from natural sounds is, however, limited to speech. Here, we used functional MRI...
Perception of Intersensory Synchrony in Audiovisual Speech: Not that Special

Science.gov (United States)

Vroomen, Jean; Stekelenburg, Jeroen J.

2011-01-01

Perception of intersensory temporal order is particularly difficult for (continuous) audiovisual speech, as perceivers may find it difficult to notice substantial timing differences between speech sounds and lip movements. Here we tested whether this occurs because audiovisual speech is strongly paired ("unity assumption"). Participants made…
Intervention efficacy and intensity for children with speech sound disorder.

Science.gov (United States)

Allen, Melissa M

2013-06-01

Clinicians do not have an evidence base they can use to recommend optimum intervention intensity for preschool children who present with speech sound disorder (SSD). This study examined the effect of dose frequency on phonological performance and the efficacy of the multiple oppositions approach. Fifty-four preschool children with SSD were randomly assigned to one of three intervention conditions. Two intervention conditions received the multiple oppositions approach either 3 times per week for 8 weeks (P3) or once weekly for 24 weeks (P1). A control (C) condition received a storybook intervention. Percentage of consonants correct (PCC) was evaluated at 8 weeks and after 24 sessions. PCC gain was examined after a 6-week maintenance period. The P3 condition had a significantly better phonological outcome than the P1 and C conditions at 8 weeks and than the P1 condition after 24 weeks. There were no significant differences between the P1 and C conditions. There was no significant difference between the P1 and P3 conditions in PCC gain during the maintenance period. Preschool children with SSD who received the multiple oppositions approach made significantly greater gains when they were provided with a more intensive dose frequency and when cumulative intervention intensity was held constant.
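Two quantities in this abstract are simple arithmetic and can be made concrete with a short sketch: the cumulative number of sessions in each dosing condition, and percentage of consonants correct (PCC), taken here in its usual sense of correct consonants divided by consonants attempted, times 100. The function names and the example counts are hypothetical.

```python
# Illustrative sketch only: function names and example counts are hypothetical.

def total_sessions(sessions_per_week, weeks):
    """Cumulative number of sessions delivered in a dosing condition."""
    return sessions_per_week * weeks

def percent_consonants_correct(correct, attempted):
    """PCC = 100 * correct consonants / consonants attempted."""
    if attempted <= 0:
        raise ValueError("attempted must be positive")
    return 100.0 * correct / attempted

p3_sessions = total_sessions(3, 8)    # 3 times per week for 8 weeks
p1_sessions = total_sessions(1, 24)   # once weekly for 24 weeks

# Made-up speech sample: 45 of 60 consonants produced correctly.
example_pcc = percent_consonants_correct(45, 60)
```

Both conditions reach the same number of sessions, which is why the comparison isolates dose frequency rather than total amount of intervention.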
Acoustic assessment of speech privacy curtains in two nursing units

Science.gov (United States)

Pope, Diana S.; Miller-Klein, Erik T.

2016-01-01

Hospitals have complex soundscapes that create challenges to patient care. Extraneous noise and high reverberation rates impair speech intelligibility, which leads to raised voices. In an unintended spiral, the increasing noise may result in diminished speech privacy, as people speak loudly to be heard over the din. The products available to improve hospital soundscapes include construction materials that absorb sound (acoustic ceiling tiles, carpet, wall insulation) and reduce reverberation rates. Enhanced privacy curtains are now available and offer potential for a relatively simple way to improve speech privacy and speech intelligibility by absorbing sound at the hospital patient's bedside. Acoustic assessments were performed over 2 days on two nursing units with a similar design in the same hospital. One unit was built with the 1970s’ standard hospital construction and the other was newly refurbished (2013) with sound-absorbing features. In addition, we determined the effect of an enhanced privacy curtain versus standard privacy curtains using acoustic measures of speech privacy and speech intelligibility indexes. Privacy curtains provided auditory protection for the patients. In general, that protection was increased by the use of enhanced privacy curtains. On average, the enhanced curtain improved sound absorption from 20% to 30%; however, there was considerable variability, depending on the configuration of the rooms tested. Enhanced privacy curtains provide measurable improvement to the acoustics of patient rooms but cannot overcome larger acoustic design issues. To shorten reverberation time, additional absorption, and compact and more fragmented nursing unit floor plate shapes should be considered. PMID:26780959
Acoustic assessment of speech privacy curtains in two nursing units

Directory of Open Access Journals (Sweden)

Diana S Pope

2016-01-01

Hospitals have complex soundscapes that create challenges to patient care. Extraneous noise and high reverberation rates impair speech intelligibility, which leads to raised voices. In an unintended spiral, the increasing noise may result in diminished speech privacy, as people speak loudly to be heard over the din. The products available to improve hospital soundscapes include construction materials that absorb sound (acoustic ceiling tiles, carpet, wall insulation) and reduce reverberation rates. Enhanced privacy curtains are now available and offer potential for a relatively simple way to improve speech privacy and speech intelligibility by absorbing sound at the hospital patient's bedside. Acoustic assessments were performed over 2 days on two nursing units with a similar design in the same hospital. One unit was built with the 1970s' standard hospital construction and the other was newly refurbished (2013) with sound-absorbing features. In addition, we determined the effect of an enhanced privacy curtain versus standard privacy curtains using acoustic measures of speech privacy and speech intelligibility indexes. Privacy curtains provided auditory protection for the patients. In general, that protection was increased by the use of enhanced privacy curtains. On average, the enhanced curtain improved sound absorption from 20% to 30%; however, there was considerable variability, depending on the configuration of the rooms tested. Enhanced privacy curtains provide measurable improvement to the acoustics of patient rooms but cannot overcome larger acoustic design issues. To shorten reverberation time, additional absorption, and compact and more fragmented nursing unit floor plate shapes should be considered.
The selective role of premotor cortex in speech perception: a contribution to phoneme judgements but not speech comprehension.

Science.gov (United States)

Krieger-Redwood, Katya; Gaskell, M Gareth; Lindsay, Shane; Jefferies, Elizabeth

2013-12-01

Several accounts of speech perception propose that the areas involved in producing language are also involved in perceiving it. In line with this view, neuroimaging studies show activation of premotor cortex (PMC) during phoneme judgment tasks; however, there is debate about whether speech perception necessarily involves motor processes, across all task contexts, or whether the contribution of PMC is restricted to tasks requiring explicit phoneme awareness. Some aspects of speech processing, such as mapping sounds onto meaning, may proceed without the involvement of motor speech areas if PMC specifically contributes to the manipulation and categorical perception of phonemes. We applied TMS to three sites (PMC, posterior superior temporal gyrus, and occipital pole) and, for the first time within the TMS literature, directly contrasted two speech perception tasks that required explicit phoneme decisions and mapping of speech sounds onto semantic categories, respectively. TMS to PMC disrupted explicit phonological judgments but not access to meaning for the same speech stimuli. TMS to two further sites confirmed that this pattern was site specific and did not reflect a generic difference in the susceptibility of our experimental tasks to TMS: stimulation of pSTG, a site involved in auditory processing, disrupted performance in both language tasks, whereas stimulation of occipital pole had no effect on performance in either task. These findings demonstrate that, although PMC is important for explicit phonological judgments, crucially, PMC is not necessary for mapping speech onto meanings.
Voice Activity Detection. Fundamentals and Speech Recognition System Robustness

OpenAIRE

Ramirez, J.; Gorriz, J. M.; Segura, J. C.

2007-01-01

This chapter has given an overview of the main challenges in robust speech detection and a review of the state of the art and applications. VADs are frequently used in a number of applications including speech coding, speech enhancement and speech recognition. A precise VAD extracts a set of discriminative speech features from the noisy speech and formulates the decision in terms of a well-defined rule. The chapter has summarized three robust VAD methods that yield high speech/non-speech discri...
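The two-stage structure described here (extract a feature from the noisy signal, then apply a decision rule) can be illustrated with the simplest possible case. This sketch is not one of the chapter's three robust methods: it uses short-time frame energy as the feature and a fixed threshold as the rule, and the signal, frame length, and threshold are arbitrary illustrative values.

```python
# Illustrative sketch only: a plain energy-threshold detector, not one of the
# robust VAD methods the chapter summarizes. All values are made up.

def frame_energies(samples, frame_len):
    """Feature extraction: mean squared amplitude of each full, non-overlapping frame."""
    return [
        sum(x * x for x in samples[i:i + frame_len]) / frame_len
        for i in range(0, len(samples) - frame_len + 1, frame_len)
    ]

def energy_vad(samples, frame_len, threshold):
    """Decision rule: a frame is marked as speech when its energy exceeds the threshold."""
    return [energy > threshold for energy in frame_energies(samples, frame_len)]

# Toy signal: a quiet frame, a loud frame, then another quiet frame.
signal = [0.01, -0.02, 0.01, 0.00,
          0.50, -0.60, 0.55, -0.45,
          0.02, -0.01, 0.00, 0.01]
decisions = energy_vad(signal, frame_len=4, threshold=0.01)
```

A fixed threshold of this kind degrades quickly as noise rises, which is the robustness problem the chapter addresses.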
Auditory and visual sustained attention in children with speech sound disorder.

Directory of Open Access Journals (Sweden)

Cristina F B Murphy

Although research has demonstrated that children with specific language impairment (SLI) and reading disorder (RD) exhibit sustained attention deficits, no study has investigated sustained attention in children with speech sound disorder (SSD). Given the overlap of symptoms, such as phonological memory deficits, between these different language disorders (i.e., SLI, SSD and RD) and the relationships between working memory, attention and language processing, it is worthwhile to investigate whether deficits in sustained attention also occur in children with SSD. A total of 55 children (18 diagnosed with SSD (8.11 ± 1.231) and 37 typically developing children (8.76 ± 1.461)) were invited to participate in this study. Auditory and visual sustained-attention tasks were applied. Children with SSD performed worse on these tasks; they committed a greater number of auditory false alarms and exhibited a significant decline in performance over the course of the auditory detection task. The extent to which performance is related to auditory perceptual difficulties and probable working memory deficits is discussed. Further studies are needed to better understand the specific nature of these deficits and their clinical implications.
Predicting the perceived sound quality of frequency-compressed speech.

Directory of Open Access Journals (Sweden)

Rainer Huber

The performance of objective speech and audio quality measures for the prediction of the perceived quality of frequency-compressed speech in hearing aids is investigated in this paper. A number of existing quality measures have been applied to speech signals processed by a hearing aid, which compresses speech spectra along frequency in order to make information contained in higher frequencies audible for listeners with severe high-frequency hearing loss. Quality measures were compared with subjective ratings obtained from normal hearing and hearing impaired children and adults in an earlier study. High correlations were achieved with quality measures computed by quality models that are based on the auditory model of Dau et al., namely, the measure PSM, computed by the quality model PEMO-Q; the measure qc, computed by the quality model proposed by Hansen and Kollmeier; and the linear subcomponent of the HASQI. For the prediction of quality ratings by hearing impaired listeners, extensions of some models incorporating hearing loss were implemented and shown to achieve improved prediction accuracy. Results indicate that these objective quality measures can potentially serve as tools for assisting in initial setting of frequency compression parameters.
Processing of prosodic changes in natural speech stimuli in school-age children.

Science.gov (United States)

Lindström, R; Lepistö, T; Makkonen, T; Kujala, T

2012-12-01

Speech prosody conveys information about important aspects of communication: the meaning of the sentence and the emotional state or intention of the speaker. The present study addressed processing of emotional prosodic changes in natural speech stimuli in school-age children (mean age 10 years) by recording the electroencephalogram, facial electromyography, and behavioral responses. The stimulus was a semantically neutral Finnish word uttered with four different emotional connotations: neutral, commanding, sad, and scornful. In the behavioral sound-discrimination task the reaction times were fastest for the commanding stimulus and longest for the scornful stimulus, and faster for the neutral than for the sad stimulus. EEG and EMG responses were measured during a non-attentive oddball paradigm. Prosodic changes elicited a negative-going, fronto-centrally distributed neural response peaking at about 500 ms from the onset of the stimulus, followed by a fronto-central positive deflection, peaking at about 740 ms. For the commanding stimulus, a rapid negative deflection peaking at about 290 ms from stimulus onset was also elicited. No reliable stimulus type specific rapid facial reactions were found. The results show that prosodic changes in natural speech stimuli activate pre-attentive neural change-detection mechanisms in school-age children. However, the results do not support the suggestion of automaticity of emotion specific facial muscle responses to non-attended emotional speech stimuli in children. Copyright © 2012 Elsevier B.V. All rights reserved.
Combined Aphasia and Apraxia of Speech Treatment (CAAST): effects of a novel therapy.

Science.gov (United States)

Wambaugh, Julie L; Wright, Sandra; Nessler, Christina; Mauszycki, Shannon C

2014-12-01

This investigation was designed to examine the effects of a newly developed treatment for aphasia and acquired apraxia of speech (AOS). Combined Aphasia and Apraxia of Speech Treatment (CAAST) targets language and speech production simultaneously, with treatment techniques derived from Response Elaboration Training (Kearns, 1985) and Sound Production Treatment (Wambaugh, Kalinyak-Fliszar, West, & Doyle, 1998). The purpose of this study was to determine whether CAAST was associated with positive changes in verbal language and speech production with speakers with aphasia and AOS. Four participants with chronic aphasia and AOS received CAAST applied sequentially to sets of pictures in the context of multiple baseline designs. CAAST entailed elaboration of participant-initiated utterances, with sound production training applied as needed to the elaborated productions. The dependent variables were (a) production of correct information units (CIUs; Nicholas & Brookshire, 1993) in response to experimental picture stimuli, (b) percentage of consonants correct in sentence repetition, and (c) speech intelligibility. CAAST was associated with increased CIU production in trained and untrained picture sets for all participants. Gains in sound production accuracy and speech intelligibility varied across participants; a modification of CAAST to provide additional speech production treatment may be desirable.
Working memory in school-age children with and without a persistent speech sound disorder.

Science.gov (United States)

Farquharson, Kelly; Hogan, Tiffany P; Bernthal, John E

2017-03-17

The aim of this study was to explore the role of working memory processes as a possible cognitive underpinning of persistent speech sound disorders (SSD). Forty school-aged children were enrolled: 20 children with persistent SSD (P-SSD) and 20 typically developing children. Children participated in three working memory tasks, one to target each of the components in Baddeley's working memory model: phonological loop, visual spatial sketchpad and central executive. Children with P-SSD performed poorly only on the phonological loop tasks compared to their typically developing age-matched peers. However, mediation analyses revealed that the relation between working memory and a P-SSD was reliant upon nonverbal intelligence. These results suggest that co-morbid low-average nonverbal intelligence is linked to poor working memory in children with P-SSD. Theoretical and clinical implications are discussed.
Speech and Hearing Science in Ancient India--A Review of Sanskrit Literature.

Science.gov (United States)

Savithri, S. R.

1988-01-01

The study reviewed Sanskrit books written between 1500 BC and 1904 AD concerning diseases, speech pathology, and audiology. Details are provided of the ancient Indian system of disease classification, the classification of speech sounds, causes of speech disorders, and treatment of speech and language disorders. (DB)
Cross-language and second language speech perception

DEFF Research Database (Denmark)

Bohn, Ocke-Schwen

2017-01-01

in cross-language and second language speech perception research: The mapping issue (the perceptual relationship of sounds of the native and the nonnative language in the mind of the native listener and the L2 learner), the perceptual and learning difficulty/ease issue (how this relationship may or may...... not cause perceptual and learning difficulty), and the plasticity issue (whether and how experience with the nonnative language affects the perceptual organization of speech sounds in the mind of L2 learners). One important general conclusion from this research is that perceptual learning is possible at all...
Dyslexia risk gene relates to representation of sound in the auditory brainstem.

Science.gov (United States)

Neef, Nicole E; Müller, Bent; Liebig, Johanna; Schaadt, Gesa; Grigutsch, Maren; Gunter, Thomas C; Wilcke, Arndt; Kirsten, Holger; Skeide, Michael A; Kraft, Indra; Kraus, Nina; Emmrich, Frank; Brauer, Jens; Boltze, Johannes; Friederici, Angela D

2017-04-01

Dyslexia is a reading disorder with strong associations with KIAA0319 and DCDC2. Both genes play a functional role in spike time precision of neurons. Strikingly, poor readers show an imprecise encoding of fast transients of speech in the auditory brainstem. Whether dyslexia risk genes are related to the quality of sound encoding in the auditory brainstem remains to be investigated. Here, we quantified the response consistency of speech-evoked brainstem responses to the acoustically presented syllable [da] in 159 genotyped, literate and preliterate children. When controlling for age, sex, familial risk and intelligence, partial correlation analyses associated a higher dyslexia risk loading with KIAA0319 with noisier responses. In contrast, a higher risk loading with DCDC2 was associated with a trend towards more stable responses. These results suggest that unstable representation of sound, and thus, reduced neural discrimination ability of stop consonants, occurred in genotypes carrying a higher amount of KIAA0319 risk alleles. Current data provide the first evidence that the dyslexia-associated gene KIAA0319 can alter brainstem responses and impair phoneme processing in the auditory brainstem. This brain-gene relationship provides insight into the complex relationships between phenotype and genotype, thereby improving the understanding of the dyslexia-inherent complex multifactorial condition. Copyright © 2017 The Authors. Published by Elsevier Ltd. All rights reserved.
Response properties of neurons in the cat's putamen during auditory discrimination.

Science.gov (United States)

Zhao, Zhenling; Sato, Yu; Qin, Ling

2015-10-01

The striatum integrates diverse convergent input and plays a critical role in goal-directed behaviors. To date, the auditory functions of the striatum have received little study. Recently, it was demonstrated that auditory cortico-striatal projections influence behavioral performance during a frequency discrimination task. To reveal the functions of striatal neurons in auditory discrimination, we recorded single-unit spike activity in the putamen (dorsal striatum) of freely moving cats while they performed a Go/No-go task to discriminate sounds with different modulation rates (12.5 Hz vs. 50 Hz) or envelopes (damped vs. ramped). We found that the putamen neurons can be broadly divided into four groups according to their contributions to sound discrimination. First, 40% of neurons showed vigorous responses synchronized to the sound envelope, and could precisely discriminate different sounds. Second, 18% of neurons showed a strong preference for ramped over damped sounds, but no preference for modulation rate. They could only discriminate changes in the sound envelope. Third, 27% of neurons rapidly adapted to the sound stimuli and had no ability to discriminate sounds. Fourth, 15% of neurons discriminated the sounds depending on the reward prediction. Compared to the passive listening condition, the activities of putamen neurons were significantly enhanced by engagement in the auditory tasks, but not modulated by the cat's behavioral choice. The coexistence of multiple types of neurons suggests that the putamen is involved in the transformation from auditory representation to stimulus-reward association. Copyright © 2015 Elsevier B.V. All rights reserved.
The role of reverberation-related binaural cues in the externalization of speech

DEFF Research Database (Denmark)

Catic, Jasmina; Santurette, Sébastien; Dau, Torsten

2015-01-01

The perception of externalization of speech sounds was investigated with respect to the monaural and binaural cues available at the listeners’ ears in a reverberant environment. Individualized binaural room impulse responses (BRIRs) were used to simulate externalized sound sources via headphones....... The measured BRIRs were subsequently modified such that the proportion of the response containing binaural vs monaural information was varied. Normal-hearing listeners were presented with speech sounds convolved with such modified BRIRs. Monaural reverberation cues were found to be sufficient...
Reverberation impairs brainstem temporal representations of voiced vowel sounds: challenging periodicity-tagged segregation of competing speech in rooms

Directory of Open Access Journals (Sweden)

Mark Sayles

2015-01-01

The auditory system typically processes information from concurrently active sound sources (e.g., two voices speaking at once), in the presence of multiple delayed, attenuated and distorted sound-wave reflections (reverberation). Brainstem circuits help segregate these complex acoustic mixtures into auditory objects. Psychophysical studies demonstrate a strong interaction between reverberation and fundamental-frequency (F0) modulation, leading to impaired segregation of competing vowels when segregation is on the basis of F0 differences. Neurophysiological studies of complex-sound segregation have concentrated on sounds with steady F0s, in anechoic environments. However, F0 modulation and reverberation are quasi-ubiquitous. We examine the ability of 129 single units in the ventral cochlear nucleus of the anesthetized guinea pig to segregate the concurrent synthetic vowel sounds /a/ and /i/, based on temporal discharge patterns under closed-field conditions. We address the effects of added real-room reverberation, F0 modulation, and the interaction of these two factors, on brainstem neural segregation of voiced speech sounds. A firing-rate representation of single-vowels’ spectral envelopes is robust to the combination of F0 modulation and reverberation: local firing-rate maxima and minima across the tonotopic array code vowel-formant structure. However, single-vowel F0-related periodicity information in shuffled inter-spike interval distributions is significantly degraded in the combined presence of reverberation and F0 modulation. Hence, segregation of double-vowels’ spectral energy into two streams (corresponding to the two vowels), on the basis of temporal discharge patterns, is impaired by reverberation, specifically when F0 is modulated. All unit types (primary-like, chopper, onset) are similarly affected. These results offer neurophysiological insights to perceptual organization of complex acoustic scenes under realistically challenging
Emotional prosody of task-irrelevant speech interferes with the retention of serial order.

Science.gov (United States)

Kattner, Florian; Ellermeier, Wolfgang

2018-04-09

Task-irrelevant speech and other temporally changing sounds are known to interfere with the short-term memorization of ordered verbal materials, as compared to silence or stationary sounds. It has been argued that this disruption of short-term memory (STM) may be due to (a) interference of automatically encoded acoustical fluctuations with the process of serial rehearsal or (b) attentional capture by salient task-irrelevant information. To disentangle the contributions of these 2 processes, the authors investigated whether the disruption of serial recall is due to the semantic or acoustical properties of task-irrelevant speech (Experiment 1). They found that performance was affected by the prosody (emotional intonation), but not by the semantics (word meaning), of irrelevant speech, suggesting that the disruption of serial recall is due to interference of precategorically encoded changing-state sound (with higher fluctuation strength of emotionally intonated speech). The authors further demonstrated a functional distinction between this form of distraction and attentional capture by contrasting the effect of (a) speech prosody and (b) sudden prosody deviations on both serial and nonserial STM tasks (Experiment 2). Although serial recall was again sensitive to the emotional prosody of irrelevant speech, performance on a nonserial missing-item task was unaffected by the presence of neutral or emotionally intonated speech sounds. In contrast, sudden prosody changes tended to impair performance on both tasks, suggesting an independent effect of attentional capture. (PsycINFO Database Record (c) 2018 APA, all rights reserved).
Training the Brain to Weight Speech Cues Differently: A Study of Finnish Second-language Users of English

Science.gov (United States)

Ylinen, Sari; Uther, Maria; Latvala, Antti; Vepsalainen, Sara; Iverson, Paul; Akahane-Yamada, Reiko; Naatanen, Risto

2010-01-01

Foreign-language learning is a prime example of a task that entails perceptual learning. The correct comprehension of foreign-language speech requires the correct recognition of speech sounds. The most difficult speech-sound contrasts for foreign-language learners often are the ones that have multiple phonetic cues, especially if the cues are…
The functional anatomy of speech perception: Dorsal and ventral processing pathways

Science.gov (United States)

Hickok, Gregory

2003-04-01

Drawing on recent developments in the cortical organization of vision, and on data from a variety of sources, Hickok and Poeppel (2000) have proposed a new model of the functional anatomy of speech perception. The model posits that early cortical stages of speech perception involve auditory fields in the superior temporal gyrus bilaterally (although asymmetrically). This cortical processing system then diverges into two broad processing streams, a ventral stream, involved in mapping sound onto meaning, and a dorsal stream, involved in mapping sound onto articulatory-based representations. The ventral stream projects ventrolaterally toward inferior posterior temporal cortex which serves as an interface between sound and meaning. The dorsal stream projects dorsoposteriorly toward the parietal lobe and ultimately to frontal regions. This network provides a mechanism for the development and maintenance of "parity" between auditory and motor representations of speech. Although the dorsal stream represents a tight connection between speech perception and speech production, it is not a critical component of the speech perception process under ecologically natural listening conditions. Some degree of bi-directionality in both the dorsal and ventral pathways is also proposed. A variety of recent empirical tests of this model have provided further support for the proposal.
Online Speech/Music Segmentation Based on the Variance Mean of Filter Bank Energy

Directory of Open Access Journals (Sweden)

Zdravko Kačič

2009-01-01

This paper presents a novel feature for online speech/music segmentation based on the variance mean of filter bank energy (VMFBE). The idea that encouraged the feature's construction is energy variation in a narrow frequency sub-band. The energy varies more rapidly, and to a greater extent for speech than for music. Therefore, an energy variance in such a sub-band is greater for speech than for music. The radio broadcast database and the BNSI broadcast news database were used for feature discrimination and segmentation ability evaluation. The calculation procedure of the VMFBE feature has 4 out of 6 steps in common with the MFCC feature calculation procedure. Therefore, it is a very convenient speech/music discriminator for use in real-time automatic speech recognition systems based on MFCC features, because valuable processing time can be saved, and computation load is only slightly increased. Analysis of the feature's speech/music discriminative ability shows an average error rate below 10% for radio broadcast material and it outperforms other features used for comparison, by more than 8%. The proposed feature as a stand-alone speech/music discriminator in a segmentation system achieves an overall accuracy of over 94% on radio broadcast material.
Online Speech/Music Segmentation Based on the Variance Mean of Filter Bank Energy

Science.gov (United States)

Kos, Marko; Grašič, Matej; Kačič, Zdravko

2009-12-01

This paper presents a novel feature for online speech/music segmentation based on the variance mean of filter bank energy (VMFBE). The idea that encouraged the feature's construction is energy variation in a narrow frequency sub-band. The energy varies more rapidly, and to a greater extent for speech than for music. Therefore, an energy variance in such a sub-band is greater for speech than for music. The radio broadcast database and the BNSI broadcast news database were used for feature discrimination and segmentation ability evaluation. The calculation procedure of the VMFBE feature has 4 out of 6 steps in common with the MFCC feature calculation procedure. Therefore, it is a very convenient speech/music discriminator for use in real-time automatic speech recognition systems based on MFCC features, because valuable processing time can be saved, and computation load is only slightly increased. Analysis of the feature's speech/music discriminative ability shows an average error rate below 10% for radio broadcast material and it outperforms other features used for comparison, by more than 8%. The proposed feature as a stand-alone speech/music discriminator in a segmentation system achieves an overall accuracy of over 94% on radio broadcast material.
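The abstract does not spell out the VMFBE computation, so the following is only a minimal sketch of one plausible reading of "variance mean of filter bank energy": take the variance of each sub-band's energy across the frames of an analysis window, then average those variances over the bands. The function name `vmfbe`, the frames-by-bands input layout, and the use of population variance are all assumptions made for illustration, not the authors' procedure.

```python
from statistics import fmean, pvariance


def vmfbe(frames):
    """Mean over sub-bands of the per-band energy variance across frames.

    `frames` is a hypothetical layout: a list of frames, each a list of
    filter bank energies (one value per sub-band). This is an assumed
    interpretation of the feature, not the published definition.
    """
    if len(frames) < 2:
        raise ValueError("need at least two frames to measure variation")
    # Transpose frames x bands into bands x frames.
    bands = list(zip(*frames))
    # Variance of each band's energy over time, then the mean over bands.
    return fmean([pvariance(band) for band in bands])


# Toy data: a steady signal versus one whose sub-band energies fluctuate.
steady = [[1.0, 2.0], [1.0, 2.0], [1.0, 2.0], [1.0, 2.0]]
fluctuating = [[0.0, 2.0], [2.0, 0.0], [0.0, 2.0], [2.0, 0.0]]
```

Under this reading, the fluctuating window yields a larger value than the steady one, which matches the abstract's claim that sub-band energy varies more for speech than for music. Any decision threshold would have to be tuned on real data.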
Hearing speech in music

Directory of Open Access Journals (Sweden)

Seth-Reino Ekström

2011-01-01

The masking effect of a piano composition, played at different speeds and in different octaves, on speech-perception thresholds was investigated in 15 normal-hearing and 14 moderately-hearing-impaired subjects. Running speech (just follow conversation, JFC) testing and use of hearing aids increased the everyday validity of the findings. A comparison was made with standard audiometric noises [International Collegium of Rehabilitative Audiology (ICRA) noise and speech spectrum-filtered noise (SPN)]. All masking sounds, music or noise, were presented at the same equivalent sound level (50 dBA). The results showed a significant effect of piano performance speed and octave (P<.01). Low octave and fast tempo had the largest effect; and high octave and slow tempo, the smallest. Music had a lower masking effect than did ICRA noise with two or six speakers at normal vocal effort (P<.01) and SPN (P<.05). Subjects with hearing loss had higher masked thresholds than the normal-hearing subjects (P<.01), but there were smaller differences between masking conditions (P<.01). It is pointed out that music offers an interesting opportunity for studying masking under realistic conditions, where spectral and temporal features can be varied independently. The results have implications for composing music with vocal parts, designing acoustic environments and creating a balance between speech perception and privacy in social settings.

Towards parameter-free classification of sound effects in movies

Science.gov (United States)

Chu, Selina; Narayanan, Shrikanth; Kuo, C.-C. J.

2005-08-01

The problem of identifying intense events via multimedia data mining in films is investigated in this work. Movies are mainly characterized by dialog, music, and sound effects. We begin our investigation with detecting interesting events through sound effects. Sound effects are neither speech nor music, but are closely associated with interesting events such as car chases and gun shots. In this work, we utilize low-level audio features including MFCC and energy to identify sound effects. It was shown in previous work that the Hidden Markov model (HMM) works well for speech/audio signals. However, this technique requires a careful choice in designing the model and choosing correct parameters. In this work, we introduce a framework that will avoid such necessity and works well with semi- and non-parametric learning algorithms.
Innovative Speech Reconstructive Surgery

OpenAIRE

Hashem Shemshadi

2003-01-01

Proper speech functioning in human beings depends on precise coordination and timing in a series of complex neuromuscular movements and actions. Starting from the prime energy source, air expelled from the respiratory system; delivery of this air to trigger the vocal cords; swift changes of this phonatory episode to a comprehensible sound in resonance; and final coordination of all head and neck structures to elicit final speech in ...
Serial recall of rhythms and verbal sequences: Impacts of concurrent tasks and irrelevant sound.

Science.gov (United States)

Hall, Debbora; Gathercole, Susan E

2011-08-01

Rhythmic grouping enhances verbal serial recall, yet very little is known about memory for rhythmic patterns. The aim of this study was to compare the cognitive processes supporting memory for rhythmic and verbal sequences using a range of concurrent tasks and irrelevant sounds. In Experiment 1, both concurrent articulation and paced finger tapping during presentation and during a retention interval impaired rhythm recall, while letter recall was only impaired by concurrent articulation. In Experiments 2 and 3, irrelevant sound consisted of irrelevant speech or tones, changing-state or steady-state sound, and syncopated or paced sound during presentation and during a retention interval. Irrelevant speech was more damaging to rhythm and letter recall than was irrelevant tone sound, but there was no effect of changing state on rhythm recall, while letter recall accuracy was disrupted by changing-state sound. Pacing of sound did not consistently affect either rhythm or letter recall. There are similarities in the way speech and rhythms are processed that appear to extend beyond reliance on temporal coding mechanisms involved in serial-order recall.
The relevance of visual information on learning sounds in infancy

NARCIS (Netherlands)

ter Schure, S.M.M.

2016-01-01

Newborn infants are sensitive to combinations of visual and auditory speech. Does this ability to match sounds and sights affect how infants learn the sounds of their native language? And are visual articulations the only type of visual information that can influence sound learning? This
Recovering With Acquired Apraxia of Speech: The First 2 Years.

Science.gov (United States)

Haley, Katarina L; Shafer, Jennifer N; Harmon, Tyson G; Jacks, Adam

2016-12-01

This study was intended to document speech recovery for 1 person with acquired apraxia of speech quantitatively and on the basis of her lived experience. The second author sustained a traumatic brain injury that resulted in acquired apraxia of speech. Over a 2-year period, she documented her recovery through 22 video-recorded monologues. We analyzed these monologues using a combination of auditory perceptual, acoustic, and qualitative methods. Recovery was evident for all quantitative variables examined. For speech sound production, the recovery was most prominent during the first 3 months, but slower improvement was evident for many months. Measures of speaking rate, fluency, and prosody changed more gradually throughout the entire period. A qualitative analysis of topics addressed in the monologues was consistent with the quantitative speech recovery and indicated a subjective dynamic relationship between accuracy and rate, an observation that several factors made speech sound production variable, and a persisting need for cognitive effort while speaking. Speech features improved over an extended time, but the recovery trajectories differed, indicating dynamic reorganization of the underlying speech production system. The relationship among speech dimensions should be examined in other cases and in population samples. The combination of quantitative and qualitative analysis methods offers advantages for understanding clinically relevant aspects of recovery.
The mechanism of speech processing in congenital amusia: evidence from Mandarin speakers.

Directory of Open Access Journals (Sweden)

Fang Liu

Congenital amusia is a neuro-developmental disorder of pitch perception that causes severe problems with music processing but only subtle difficulties in speech processing. This study investigated speech processing in a group of Mandarin speakers with congenital amusia. Thirteen Mandarin amusics and thirteen matched controls participated in a set of tone and intonation perception tasks and two pitch threshold tasks. Compared with controls, amusics showed impaired performance on word discrimination in natural speech and their gliding tone analogs. They also performed worse than controls on discriminating gliding tone sequences derived from statements and questions, and showed elevated thresholds for pitch change detection and pitch direction discrimination. However, they performed as well as controls on word identification, and on statement-question identification and discrimination in natural speech. Overall, tasks that involved multiple acoustic cues to communicative meaning were not impacted by amusia. Only when the tasks relied mainly on pitch sensitivity did amusics show impaired performance compared to controls. These findings help explain why amusia only affects speech processing in subtle ways. Further studies on a larger sample of Mandarin amusics and on amusics of other language backgrounds are needed to consolidate these results.
The mechanism of speech processing in congenital amusia: evidence from Mandarin speakers.

Science.gov (United States)

Liu, Fang; Jiang, Cunmei; Thompson, William Forde; Xu, Yi; Yang, Yufang; Stewart, Lauren

2012-01-01

Congenital amusia is a neuro-developmental disorder of pitch perception that causes severe problems with music processing but only subtle difficulties in speech processing. This study investigated speech processing in a group of Mandarin speakers with congenital amusia. Thirteen Mandarin amusics and thirteen matched controls participated in a set of tone and intonation perception tasks and two pitch threshold tasks. Compared with controls, amusics showed impaired performance on word discrimination in natural speech and their gliding tone analogs. They also performed worse than controls on discriminating gliding tone sequences derived from statements and questions, and showed elevated thresholds for pitch change detection and pitch direction discrimination. However, they performed as well as controls on word identification, and on statement-question identification and discrimination in natural speech. Overall, tasks that involved multiple acoustic cues to communicative meaning were not impacted by amusia. Only when the tasks relied mainly on pitch sensitivity did amusics show impaired performance compared to controls. These findings help explain why amusia only affects speech processing in subtle ways. Further studies on a larger sample of Mandarin amusics and on amusics of other language backgrounds are needed to consolidate these results.
Sound stream segregation: a neuromorphic approach to solve the "cocktail party problem" in real-time.

Science.gov (United States)

Thakur, Chetan Singh; Wang, Runchun M; Afshar, Saeed; Hamilton, Tara J; Tapson, Jonathan C; Shamma, Shihab A; van Schaik, André

2015-01-01

The human auditory system has the ability to segregate complex auditory scenes into a foreground component and a background, allowing us to listen to specific speech sounds from a mixture of sounds. Selective attention plays a crucial role in this process, colloquially known as the "cocktail party effect." It has not been possible to build a machine that can emulate this human ability in real-time. Here, we have developed a framework for the implementation of a neuromorphic sound segregation algorithm in a Field Programmable Gate Array (FPGA). This algorithm is based on the principles of temporal coherence and uses an attention signal to separate a target sound stream from background noise. Temporal coherence implies that auditory features belonging to the same sound source are coherently modulated and evoke highly correlated neural response patterns. The basis for this form of sound segregation is that responses from pairs of channels that are strongly positively correlated belong to the same stream, while channels that are uncorrelated or anti-correlated belong to different streams. In our framework, we have used a neuromorphic cochlea as a frontend sound analyser to extract spatial information of the sound input, which then passes through band pass filters that extract the sound envelope at various modulation rates. Further stages include feature extraction and mask generation, which is finally used to reconstruct the targeted sound. Using sample tonal and speech mixtures, we show that our FPGA architecture is able to segregate sound sources in real-time. The accuracy of segregation is indicated by the high signal-to-noise ratio (SNR) of the segregated stream (90, 77, and 55 dB for simple tone, complex tone, and speech, respectively) as compared to the SNR of the mixture waveform (0 dB). This system may be easily extended for the segregation of complex speech signals, and may thus find various applications in electronic devices such as for sound segregation and
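The temporal-coherence grouping rule stated in this abstract (channels whose responses are strongly positively correlated belong to the same stream; uncorrelated or anti-correlated channels do not) can be illustrated in a few lines of software. The sketch below is not the authors' FPGA design: the function names, the choice of a single attended "anchor" channel, and the 0.5 threshold are hypothetical, and plain Pearson correlation over whole envelopes stands in for whatever coherence measure the hardware uses.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation of two equal-length sequences (0.0 if either is constant)."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        return 0.0  # a flat envelope carries no modulation to correlate with
    return cov / sqrt(var_x * var_y)


def coherence_mask(envelopes, anchor, threshold=0.5):
    """Mark channels whose envelope is positively correlated with the anchor channel.

    `envelopes` is a list of per-channel envelope sequences; `anchor` is the
    index of the channel assumed to be attended. Both are illustrative inputs.
    """
    reference = envelopes[anchor]
    return [pearson(reference, env) > threshold for env in envelopes]


# Toy envelopes: channels 0 and 1 are co-modulated, channel 2 is
# anti-correlated with them, and channel 3 is unmodulated.
envelopes = [
    [0, 1, 0, 1, 0, 1],
    [0, 2, 0, 2, 0, 2],
    [1, 0, 1, 0, 1, 0],
    [1, 1, 1, 1, 1, 1],
]
mask = coherence_mask(envelopes, anchor=0)
```

With these toy envelopes the mask keeps channels 0 and 1 and rejects channels 2 and 3. A real-time system would compute the correlation over short sliding windows rather than over the whole signal.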
Integrating speech in time depends on temporal expectancies and attention.

Science.gov (United States)

Scharinger, Mathias; Steinberg, Johanna; Tavano, Alessandro

2017-08-01

Sensory information that unfolds in time, such as in speech perception, relies on efficient chunking mechanisms in order to yield optimally-sized units for further processing. Whether or not two successive acoustic events receive a one-unit or a two-unit interpretation seems to depend on the fit between their temporal extent and a stipulated temporal window of integration. However, there is ongoing debate on how flexible this temporal window of integration should be, especially for the processing of speech sounds. Furthermore, there is no direct evidence of whether attention may modulate the temporal constraints on the integration window. For this reason, we here examine how different word durations, which lead to different temporal separations of sound onsets, interact with attention. In an electroencephalography (EEG) study, participants actively and passively listened to words where word-final consonants were occasionally omitted. Words had either a natural duration or were artificially prolonged in order to increase the separation of speech sound onsets. Omission responses to incomplete speech input, originating in left temporal cortex, decreased when the critical speech sound was separated from previous sounds by more than 250 msec, i.e., when the separation was larger than the stipulated temporal window of integration (125-150 msec). Attention, on the other hand, only increased omission responses for stimuli with natural durations. We complemented the event-related potential (ERP) analyses by a frequency-domain analysis on the stimulus presentation rate. Notably, the power of stimulation frequency showed the same duration and attention effects as the omission responses. We interpret these findings against the background of existing research on temporal integration windows and further suggest that our findings may be accounted for within the framework of predictive coding. Copyright © 2017 Elsevier Ltd. All rights reserved.
Intelligibility of synthetic speech in the presence of interfering speech

NARCIS (Netherlands)

Eggen, J.H.

1989-01-01

Standard articulation tests are not always sensitive enough to discriminate between speech samples which are of high intelligibility. One can increase the sensitivity of such tests by presenting the test materials in noise. In this way, small differences in intelligibility can be magnified into
The influence of meaning on the perception of speech sounds.

Science.gov (United States)

Kazanina, Nina; Phillips, Colin; Idsardi, William

2006-07-25

As part of knowledge of language, an adult speaker possesses information on which sounds are used in the language and on the distribution of these sounds in a multidimensional acoustic space. However, a speaker must know not only the sound categories of his language but also the functional significance of these categories, in particular, which sound contrasts are relevant for storing words in memory and which sound contrasts are not. Using magnetoencephalographic brain recordings with speakers of Russian and Korean, we demonstrate that a speaker's perceptual space, as reflected in early auditory brain responses, is shaped not only by bottom-up analysis of the distribution of sounds in his language but also by more abstract analysis of the functional significance of those sounds.
Neural Tuning to Low-Level Features of Speech throughout the Perisylvian Cortex.

Science.gov (United States)

Berezutskaya, Julia; Freudenburg, Zachary V; Güçlü, Umut; van Gerven, Marcel A J; Ramsey, Nick F

2017-08-16

Despite a large body of research, we continue to lack a detailed account of how auditory processing of continuous speech unfolds in the human brain. Previous research showed the propagation of low-level acoustic features of speech from posterior superior temporal gyrus toward anterior superior temporal gyrus in the human brain (Hullett et al., 2016). In this study, we investigate what happens to these neural representations past the superior temporal gyrus and how they engage higher-level language processing areas such as inferior frontal gyrus. We used low-level sound features to model neural responses to speech outside of the primary auditory cortex. Two complementary imaging techniques were used with human participants (both males and females): electrocorticography (ECoG) and fMRI. Both imaging techniques showed tuning of the perisylvian cortex to low-level speech features. With ECoG, we found evidence of propagation of the temporal features of speech sounds along the ventral pathway of language processing in the brain toward inferior frontal gyrus. Increasingly coarse temporal features of speech spreading from posterior superior temporal cortex toward inferior frontal gyrus were associated with linguistic features such as voice onset time, duration of the formant transitions, and phoneme, syllable, and word boundaries. The present findings provide the groundwork for a comprehensive bottom-up account of speech comprehension in the human brain. SIGNIFICANCE STATEMENT We know that, during natural speech comprehension, a broad network of perisylvian cortical regions is involved in sound and language processing. Here, we investigated the tuning to low-level sound features within these regions using neural responses to a short feature film. We also looked at whether the tuning organization along these brain regions showed any parallel to the hierarchy of language structures in continuous speech. Our results show that low-level speech features propagate throughout the
Speech recognition from spectral dynamics

Indian Academy of Sciences (India)

Carrier nature of speech; modulation spectrum; spectral dynamics ... the relationships between phonetic values of sounds and their short-term spectral envelopes .... the number of free parameters that need to be estimated from training data.
The Frame Constraint on Experimentally Elicited Speech Errors in Japanese

Science.gov (United States)

Saito, Akie; Inoue, Tomoyoshi

2017-01-01

The so-called syllable position effect in speech errors has been interpreted as reflecting constraints posed by the frame structure of a given language, which is separately operating from linguistic content during speech production. The effect refers to the phenomenon that when a speech error occurs, replaced and replacing sounds tend to be in the…
Speech masking and cancelling and voice obscuration

Science.gov (United States)

Holzrichter, John F.

2013-09-10

A non-acoustic sensor is used to measure a user's speech and then broadcasts an obscuring acoustic signal diminishing the user's vocal acoustic output intensity and/or distorting the voice sounds making them unintelligible to persons nearby. The non-acoustic sensor is positioned proximate or contacting a user's neck or head skin tissue for sensing speech production information.
The Multisensory Sound Lab: Sounds You Can See and Feel.

Science.gov (United States)

Lederman, Norman; Hendricks, Paula

1994-01-01

A multisensory sound lab has been developed at the Model Secondary School for the Deaf (District of Columbia). A special floor allows vibrations to be felt, and a spectrum analyzer displays frequencies and harmonics visually. The lab is used for science education, auditory training, speech therapy, music and dance instruction, and relaxation…
LIBERDADE DE EXPRESSÃO E DISCURSO DO ÓDIO NO BRASIL / FREE SPEECH AND HATE SPEECH IN BRAZIL

Directory of Open Access Journals (Sweden)

Nevita Maria Pessoa de Aquino Franca Luna

2014-12-01

Full Text Available The purpose of this article is to analyze the restriction of free speech when it comes close to hate speech. From this perspective, the aim of this study is to answer the question: what is the understanding adopted by the Brazilian Supreme Court in cases involving the conflict between free speech and hate speech? The methodology combines a bibliographic review of the theoretical assumptions of the research (the concept of free speech and hate speech, and an understanding of the rights of defense of traditionally discriminated minorities) and empirical research (documental and jurisprudential analysis of cases judged by the American, German and Brazilian courts). First, free speech is discussed, defining its meaning, content and purpose. Then, hate speech is identified as an element that inhibits free speech because it offends members of traditionally discriminated minorities, who are outnumbered or in a situation of cultural, socioeconomic or political subordination. Subsequently, some aspects of the American (negative freedom) and German (positive freedom) models are discussed, to demonstrate that different cultures adopt different legal solutions. Finally, it is concluded that the Brazilian understanding approximates the German doctrine, based on the analysis of landmark cases such as those of the publisher Siegfried Ellwanger (2003) and the Samba School Unidos do Viradouro (2008). The understanding adopted in Brazil, a multicultural country made up of different ethnicities, leads to a new process of defending minorities which, despite involving the collision of fundamental rights (dignity, equality and freedom), is still restrained by barriers incompatible with a contemporary pluralistic democracy.
Vocal Noise Cancellation From Respiratory Sounds

National Research Council Canada - National Science Library

Moussavi, Zahra

2001-01-01

Although background noise cancellation for speech or electrocardiographic recording is well established, when the background noise contains vocal noises and the main signal is a breath sound...
Understanding speech when wearing communication headsets and hearing protectors with subband processing.

Science.gov (United States)

Brammer, Anthony J; Yu, Gongqiang; Bernstein, Eric R; Cherniack, Martin G; Peterson, Donald R; Tufts, Jennifer B

2014-08-01

An adaptive, delayless, subband feed-forward control structure is employed to improve the speech signal-to-noise ratio (SNR) in the communication channel of a circumaural headset/hearing protector (HPD) from 90 Hz to 11.3 kHz, and to provide active noise control (ANC) from 50 to 800 Hz to complement the passive attenuation of the HPD. The task involves optimizing the speech SNR for each communication channel subband, subject to limiting the maximum sound level at the ear, maintaining a speech SNR preferred by users, and reducing large inter-band gain differences to improve speech quality. The performance of a proof-of-concept device has been evaluated in a pseudo-diffuse sound field when worn by human subjects under conditions of environmental noise and speech that do not pose a risk to hearing, and by simulation for other conditions. For the environmental noises employed in this study, subband speech SNR control combined with subband ANC produced greater improvement in word scores than subband ANC alone, and improved the consistency of word scores across subjects. The simulation employed a subject-specific linear model, and predicted that word scores are maintained in excess of 90% for sound levels outside the HPD of up to ∼115 dBA.
Stop consonant voicing in young children's speech: Evidence from a cross-sectional study

Science.gov (United States)

Ganser, Emily

There are intuitive reasons to believe that speech-sound acquisition and language acquisition should be related in development. Surprisingly, only recently has research begun to parse just how the two might be related. This study investigated possible correlations between speech-sound acquisition and language acquisition, as part of a large-scale, longitudinal study of the relationship between different types of phonological development and vocabulary growth in the preschool years. Productions of voiced and voiceless stop-initial words were recorded from 96 children aged 28-39 months. Voice Onset Time (VOT, in ms) for each token context was calculated. A mixed-model logistic regression was calculated which predicted whether the sound was intended to be voiced or voiceless based on its VOT. This model estimated the slopes of the logistic function for each child. This slope was referred to as Robustness of Contrast (based on Holliday, Reidy, Beckman, and Edwards, 2015), defined as being the degree of categorical differentiation between the production of two speech sounds or classes of sounds, in this case, voiced and voiceless stops. Results showed a wide range of slopes for individual children, suggesting that slope-derived Robustness of Contrast could be a viable means of measuring a child's acquisition of the voicing contrast. Robustness of Contrast was then compared to traditional measures of speech and language skills to investigate whether there was any correlation between the production of stop voicing and broader measures of speech and language development. The Robustness of Contrast measure was found to correlate with all individual measures of speech and language, suggesting that it might indeed be predictive of later language skills.
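The slope-based Robustness of Contrast measure described above can be illustrated with a small sketch. It is a simplification under stated assumptions: the study fitted a single mixed-model logistic regression across children, whereas this sketch fits an independent logistic curve per child by plain gradient ascent. The VOT values, the function name `fit_logistic_slope`, and the learning-rate settings are all hypothetical.

```python
import math

def fit_logistic_slope(vot_ms, is_voiceless, lr=0.1, steps=5000):
    """Fit P(voiceless | VOT) = sigmoid(b0 + b1 * x) by gradient ascent.

    x is VOT in units of 10 ms (a hypothetical rescaling chosen only to
    keep the optimisation stable). Returns (b0, b1); b1 is the slope.
    """
    x = [v / 10.0 for v in vot_ms]
    n = len(x)
    b0 = b1 = 0.0
    for _ in range(steps):
        g0 = g1 = 0.0
        for xi, yi in zip(x, is_voiceless):
            p = 1.0 / (1.0 + math.exp(-(b0 + b1 * xi)))
            g0 += yi - p          # gradient of log-likelihood w.r.t. b0
            g1 += (yi - p) * xi   # gradient of log-likelihood w.r.t. b1
        b0 += lr * g0 / n
        b1 += lr * g1 / n
    return b0, b1

# Made-up tokens: 0 = intended voiced stop, 1 = intended voiceless stop.
labels = [0, 0, 0, 0, 0, 1, 1, 1, 1, 1]
child_a_vot = [5, 10, 12, 15, 18, 60, 70, 75, 80, 90]   # well-separated categories
child_b_vot = [10, 25, 40, 55, 30, 20, 45, 60, 35, 70]  # overlapping categories

_, slope_a = fit_logistic_slope(child_a_vot, labels)
_, slope_b = fit_logistic_slope(child_b_vot, labels)
print(f"child A slope: {slope_a:.2f}, child B slope: {slope_b:.2f}")
```

A steeper slope means the child's voiced and voiceless productions are more categorically distinct along the VOT dimension. Because child A's made-up data are perfectly separable, the fitted slope keeps growing with more iterations; a mixed model or a penalised fit would shrink such estimates.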

Using ILD or ITD Cues for Sound Source Localization and Speech Understanding in a Complex Listening Environment by Listeners with Bilateral and with Hearing-Preservation Cochlear Implants

Science.gov (United States)

Loiselle, Louise H.; Dorman, Michael F.; Yost, William A.; Cook, Sarah J.; Gifford, Rene H.

2016-01-01

Purpose: To assess the role of interaural time differences and interaural level differences in (a) sound-source localization, and (b) speech understanding in a cocktail party listening environment for listeners with bilateral cochlear implants (CIs) and for listeners with hearing-preservation CIs. Methods: Eleven bilateral listeners with MED-EL…
The role of speech therapy in the therapy of children with central hearing disorders

Directory of Open Access Journals (Sweden)

Agnieszka Kasperczuk-Bajda

2017-09-01

Full Text Available Central auditory processing disorders are one of the main causes of school difficulties among children. Central auditory processing disorder (CAPD) is described as an inability to make use of auditory signals despite their correct reception in the peripheral structures. The disorder is often accompanied by difficulties such as dyslexia, specific learning problems or subnormal speech development. Early diagnosis of the disorder and prompt commencement of therapy allow a child to adjust better to the expectations of his or her environment. The aim of this work is to indicate the role and capabilities of a speech therapist in treating children with CAPD. Auditory training is appropriate for children with central auditory disorders; to be effective, it should be long-lasting, intensive and adjusted to the child’s individual abilities. Therapy should include both passive listening to sounds and exercises in which the child can actively participate. The aim of speech therapy is to develop auditory skills, speaking and communication and to stimulate the child’s cognitive potential. The auditory exercises conducted by the speech therapist include understanding distorted speech, understanding distorted speech in the presence of an interfering signal, separation and integration of information, localization and lateralization, recognizing sound patterns, recognizing sound sequences, differentiating nonverbal stimuli and phonemes, and prosodic training. Therapeutic auditory training that is carried out systematically develops auditory and linguistic competences.
Environmental Sound Training in Cochlear Implant Users

Science.gov (United States)

Shafiro, Valeriy; Sheft, Stanley; Kuvadia, Sejal; Gygi, Brian

2015-01-01

Purpose: The study investigated the effect of a short computer-based environmental sound training regimen on the perception of environmental sounds and speech in experienced cochlear implant (CI) patients. Method: Fourteen CI patients with the average of 5 years of CI experience participated. The protocol consisted of 2 pretests, 1 week apart,…
Auditory feedback perturbation in children with developmental speech disorders

NARCIS (Netherlands)

Terband, H.R.; van Brenk, F.J.; van Doornik-van der Zee, J.C.

2014-01-01

Background/purpose: Several studies indicate a close relation between auditory and speech motor functions in children with speech sound disorders (SSD). The aim of this study was to investigate the ability to compensate and adapt for perturbed auditory feedback in children with SSD compared to
Evaluation of Sound Quality, Boominess and Boxiness in Small Rooms

DEFF Research Database (Denmark)

Weisser, Adam; Rindel, Jens Holger

2006-01-01

The acoustics of small rooms has been studied with emphasis on sound quality, boominess and boxiness when the rooms are used for speech or music. Seven rooms with very different characteristics have been used for the study. Subjective listening tests were made using binaural recordings of reproduced speech and music. The test results were compared with a large number of objective acoustic parameters based on the frequency-dependent reverberation times measured in the rooms. This has led to the proposal of three new acoustic parameters, which have shown high correlation with the subjective ratings. The classical bass ratio definitions showed poor correlation with all subjective ratings. The overall sound quality ratings gave different results for speech and music. For speech the preferred mean RT should be as low as possible, whereas for music there was found a preferred range between 0...
A multigenerational family study of oral and hand motor sequencing ability provides evidence for a familial speech sound disorder subtype

Science.gov (United States)

Peter, Beate; Raskind, Wendy H.

2011-01-01

Purpose: To evaluate phenotypic expressions of speech sound disorder (SSD) in multigenerational families with evidence of familial forms of SSD. Method: Members of five multigenerational families (N = 36) produced rapid sequences of monosyllables and disyllables and tapped computer keys with repetitive and alternating movements. Results: Measures of repetitive and alternating motor speed were correlated within and between the two motor systems. Repetitive and alternating motor speeds increased in children and decreased in adults as a function of age. In two families with children who had severe speech deficits consistent with disrupted praxis, slowed alternating, but not repetitive, oral movements characterized most of the affected children and adults with a history of SSD, and slowed alternating hand movements were seen in some of the biologically related participants as well. Conclusion: Results are consistent with a familial motor-based SSD subtype with incomplete penetrance, motivating new clinical questions about motor-based intervention not only in the oral but also the limb system. PMID:21909176
Tipos de erros de fala em crianças com transtorno fonológico em função do histórico de otite média / Speech errors in children with speech sound disorders according to otitis media history

Directory of Open Access Journals (Sweden)

Haydée Fiszbein Wertzner

2012-12-01

Full Text Available PURPOSE: To describe articulatory indexes for the different types of speech errors and to verify the existence of a preferred type of error in children with speech sound disorder, according to the presence or absence of otitis media history. METHODS: Participants in this prospective and cross-sectional study were 21 subjects aged between 5 years and 2 months and 7 years and 9 months with a diagnosis of speech sound disorder. Subjects were grouped according to the presence of otitis media history. Experimental group 1 (GE1) comprised 14 subjects with a history of otitis media, and experimental group 2 (GE2) comprised seven subjects with no history of otitis media. The number of speech errors (distortions, omissions and substitutions) and the articulatory indexes were calculated. The data were submitted to statistical analysis. RESULTS: Groups GE1 and GE2 differed in their performance on the indexes in the comparison between the two phonology tests applied. In all analyses, the indexes that evaluate substitutions indicated the type of error most often made by children with speech sound disorder. CONCLUSION: The indexes were effective in indicating substitution as the most frequent error in children with speech sound disorder. The higher occurrence of speech errors observed in picture naming in children with a history of otitis media indicates that such errors are possibly associated with difficulty in phonological representation caused by the transient hearing loss they experienced.
Can you hear me yet? An intracranial investigation of speech and non-speech audiovisual interactions in human cortex.

Science.gov (United States)

Rhone, Ariane E; Nourski, Kirill V; Oya, Hiroyuki; Kawasaki, Hiroto; Howard, Matthew A; McMurray, Bob

In everyday conversation, viewing a talker's face can provide information about the timing and content of an upcoming speech signal, resulting in improved intelligibility. Using electrocorticography, we tested whether human auditory cortex in Heschl's gyrus (HG) and on superior temporal gyrus (STG) and motor cortex on precentral gyrus (PreC) were responsive to visual/gestural information prior to the onset of sound and whether early stages of auditory processing were sensitive to the visual content (speech syllable versus non-speech motion). Event-related band power (ERBP) in the high gamma band was content-specific prior to acoustic onset on STG and PreC, and ERBP in the beta band differed in all three areas. Following sound onset, we found no evidence for content-specificity in HG, evidence for visual specificity in PreC, and specificity for both modalities in STG. These results support models of audio-visual processing in which sensory information is integrated in non-primary cortical areas.
Tutorial: Speech Assessment for Multilingual Children Who Do Not Speak the Same Language(s) as the Speech-Language Pathologist.

Science.gov (United States)

McLeod, Sharynne; Verdon, Sarah

2017-08-15

The aim of this tutorial is to support speech-language pathologists (SLPs) undertaking assessments of multilingual children with suspected speech sound disorders, particularly children who speak languages that are not shared with their SLP. The tutorial was written by the International Expert Panel on Multilingual Children's Speech, which comprises 46 researchers (SLPs, linguists, phoneticians, and speech scientists) who have worked in 43 countries and used 27 languages in professional practice. Seventeen panel members met for a 1-day workshop to identify key points for inclusion in the tutorial, 26 panel members contributed to writing this tutorial, and 34 members contributed to revising this tutorial online (some members contributed to more than 1 task). This tutorial draws on international research evidence and professional expertise to provide a comprehensive overview of working with multilingual children with suspected speech sound disorders. This overview addresses referral, case history, assessment, analysis, diagnosis, and goal setting and the SLP's cultural competence and preparation for working with interpreters and multicultural support workers and dealing with organizational and government barriers to and facilitators of culturally competent practice. The issues raised in this tutorial are applied in a hypothetical case study of an English-speaking SLP's assessment of a multilingual Cantonese- and English-speaking 4-year-old boy. Resources are listed throughout the tutorial.
Analysis of glottal source parameters in Parkinsonian speech.

Science.gov (United States)

Hanratty, Jane; Deegan, Catherine; Walsh, Mary; Kirkpatrick, Barry

2016-08-01

Diagnosis and monitoring of Parkinson's disease present a number of challenges, as there is no definitive biomarker despite the broad range of symptoms. Research is ongoing to produce objective measures that can either diagnose Parkinson's or act as an objective decision support tool. Recent research on speech-based measures has demonstrated promising results. This study aims to investigate the characteristics of the glottal source signal in Parkinsonian speech. An experiment is conducted in which a selection of glottal parameters is tested for the ability to discriminate between healthy and Parkinsonian speech. Results for each glottal parameter are presented for a database of 50 healthy speakers and a database of 16 speakers with Parkinsonian speech symptoms. Receiver operating characteristic (ROC) curves were employed to analyse the results, and the area under the ROC curve (AUC) values were used to quantify the performance of each glottal parameter. The results indicate that glottal parameters can be used to discriminate between healthy and Parkinsonian speech, although results varied for each parameter tested. For the task of separating healthy and Parkinsonian speech, 2 out of the 7 glottal parameters tested produced AUC values of over 0.9.
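For readers unfamiliar with the AUC figure quoted above, the following minimal sketch computes it through its rank interpretation: the probability that a randomly chosen Parkinsonian speaker has a higher parameter value than a randomly chosen healthy speaker. The parameter values are invented, and the assumption that higher values indicate Parkinsonian speech is made only for illustration; the abstract does not say which glottal parameters were used or in which direction they differ.

```python
def auc(healthy_scores, patient_scores):
    """Area under the ROC curve via pairwise comparison.

    Counts the fraction of (patient, healthy) pairs in which the patient
    score is higher; ties count as half a win.
    """
    wins = 0.0
    for p in patient_scores:
        for h in healthy_scores:
            if p > h:
                wins += 1.0
            elif p == h:
                wins += 0.5
    return wins / (len(patient_scores) * len(healthy_scores))

# Invented values of one hypothetical glottal parameter.
healthy = [0.2, 0.4, 0.6]
patients = [0.5, 0.7, 0.3]
print(f"AUC = {auc(healthy, patients):.3f}")
```

An AUC of 0.5 means the parameter separates the groups no better than chance, while 1.0 means every patient value exceeds every healthy value.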
Hearing aid processing of loud speech and noise signals: Consequences for loudness perception and listening comfort

DEFF Research Database (Denmark)

Schmidt, Erik

2007-01-01

Hearing aid processing of loud speech and noise signals: Consequences for loudness perception and listening comfort. Sound processing in hearing aids is determined by the fitting rule. The fitting rule describes how the hearing aid should amplify speech and sounds in the surroundings, such that they become audible again for the hearing impaired person. The general goal is to place all sounds within the hearing aid users’ audible range, such that speech intelligibility and listening comfort become as good as possible. Amplification strategies in hearing aids are in many cases based on empirical... sounds, has found that both normal-hearing and hearing-impaired listeners prefer loud sounds to be closer to the most comfortable loudness-level, than suggested by common non-linear fitting rules. During this project, two listening experiments were carried out. In the first experiment, hearing aid users...
Musical expertise and foreign speech perception.

Science.gov (United States)

Martínez-Montes, Eduardo; Hernández-Pérez, Heivet; Chobert, Julie; Morgado-Rodríguez, Lisbet; Suárez-Murias, Carlos; Valdés-Sosa, Pedro A; Besson, Mireille

2013-01-01

The aim of this experiment was to investigate the influence of musical expertise on the automatic perception of foreign syllables and harmonic sounds. Participants were Cuban students with high level of expertise in music or in visual arts and with the same level of general education and socio-economic background. We used a multi-feature Mismatch Negativity (MMN) design with sequences of either syllables in Mandarin Chinese or harmonic sounds, both comprising deviants in pitch contour, duration and Voice Onset Time (VOT) or equivalent that were either far from (Large deviants) or close to (Small deviants) the standard. For both Mandarin syllables and harmonic sounds, results were clear-cut in showing larger MMNs to pitch contour deviants in musicians than in visual artists. Results were less clear for duration and VOT deviants, possibly because of the specific characteristics of the stimuli. Results are interpreted as reflecting similar processing of pitch contour in speech and non-speech sounds. The implications of these results for understanding the influence of intense musical training from childhood to adulthood and of genetic predispositions for music on foreign language perception are discussed.
Musical expertise and foreign speech perception

Directory of Open Access Journals (Sweden)

Eduardo Martínez-Montes

2013-11-01

Full Text Available The aim of this experiment was to investigate the influence of musical expertise on the automatic perception of foreign syllables and harmonic sounds. Participants were Cuban students with high level of expertise in music or in visual arts and with the same level of general education and socio-economic background. We used a multi-feature Mismatch Negativity (MMN) design with sequences of either syllables in Mandarin Chinese or harmonic sounds, both comprising deviants in pitch contour, duration and Voice Onset Time (VOT) or equivalent that were either far from (Large deviants) or close to (Small deviants) the standard. For both Mandarin syllables and harmonic sounds, results were clear-cut in showing larger MMNs to pitch contour deviants in musicians than in visual artists. Results were less clear for duration and VOT deviants, possibly because of the specific characteristics of the stimuli. Results are interpreted as reflecting similar processing of pitch contour in speech and non-speech sounds. The implications of these results for understanding the influence of intense musical training from childhood to adulthood and of genetic predispositions for music on foreign language perception are discussed.
Why the Left Hemisphere Is Dominant for Speech Production: Connecting the Dots

Directory of Open Access Journals (Sweden)

Harvey Martin Sussman

2015-12-01

Full Text Available Evidence from seemingly disparate areas of speech/language research is reviewed to form a unified theoretical account for why the left hemisphere is specialized for speech production. Research findings from studies investigating hemispheric lateralization of infant babbling, the primacy of the syllable in phonological structure, rhyming performance in split-brain patients, rhyming ability and phonetic categorization in children diagnosed with developmental apraxia of speech, rules governing exchange errors in spoonerisms, organizational principles of neocortical control of learned motor behaviors, and multi-electrode recordings of human neuronal responses to speech sounds are described and common threads highlighted. It is suggested that the emergence, in developmental neurogenesis, of a hard-wired, syllabically-organized, neural substrate representing the phonemic sound elements of one’s language, particularly the vocalic nucleus, is the crucial factor underlying the left hemisphere’s dominance for speech production.
Sound stream segregation: a neuromorphic approach to solve the “cocktail party problem” in real-time

Science.gov (United States)

Thakur, Chetan Singh; Wang, Runchun M.; Afshar, Saeed; Hamilton, Tara J.; Tapson, Jonathan C.; Shamma, Shihab A.; van Schaik, André

2015-01-01

The human auditory system has the ability to segregate complex auditory scenes into a foreground component and a background, allowing us to listen to specific speech sounds from a mixture of sounds. Selective attention plays a crucial role in this process, colloquially known as the “cocktail party effect.” It has not been possible to build a machine that can emulate this human ability in real-time. Here, we have developed a framework for the implementation of a neuromorphic sound segregation algorithm in a Field Programmable Gate Array (FPGA). This algorithm is based on the principles of temporal coherence and uses an attention signal to separate a target sound stream from background noise. Temporal coherence implies that auditory features belonging to the same sound source are coherently modulated and evoke highly correlated neural response patterns. The basis for this form of sound segregation is that responses from pairs of channels that are strongly positively correlated belong to the same stream, while channels that are uncorrelated or anti-correlated belong to different streams. In our framework, we have used a neuromorphic cochlea as a frontend sound analyser to extract spatial information of the sound input, which then passes through band pass filters that extract the sound envelope at various modulation rates. Further stages include feature extraction and mask generation, which is finally used to reconstruct the targeted sound. Using sample tonal and speech mixtures, we show that our FPGA architecture is able to segregate sound sources in real-time. The accuracy of segregation is indicated by the high signal-to-noise ratio (SNR) of the segregated stream (90, 77, and 55 dB for simple tone, complex tone, and speech, respectively) as compared to the SNR of the mixture waveform (0 dB). This system may be easily extended for the segregation of complex speech signals, and may thus find various applications in electronic devices such as for sound segregation
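The temporal-coherence principle described above (channels whose envelopes are strongly positively correlated with the attended channel are grouped into one stream) can be illustrated with a toy sketch. This is not the FPGA implementation from the study: the synthetic envelopes, the function names and the 0.5 correlation threshold are all hypothetical choices made for the example.

```python
import math

def pearson(a, b):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    return cov / math.sqrt(var_a * var_b)

def coherence_mask(envelopes, attended, threshold=0.5):
    """Mark channels whose envelope is coherent with the attended channel."""
    reference = envelopes[attended]
    return [pearson(env, reference) > threshold for env in envelopes]

# Two seconds of synthetic channel envelopes sampled at 100 Hz.
t = [k / 100.0 for k in range(200)]
envelopes = [
    [1.0 + 0.5 * math.sin(2 * math.pi * 4 * x) for x in t],  # target, 4 Hz modulation
    [2.0 + 1.0 * math.sin(2 * math.pi * 4 * x) for x in t],  # same modulation, louder
    [1.0 - 0.5 * math.sin(2 * math.pi * 4 * x) for x in t],  # anti-correlated
    [1.0 + 0.5 * math.sin(2 * math.pi * 7 * x) for x in t],  # different rate, uncorrelated
]

mask = coherence_mask(envelopes, attended=0)
print(mask)
```

In a full system the resulting mask would be applied to the channel outputs before resynthesis, so that only channels coherent with the attended one contribute to the reconstructed target stream.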
Sound stream segregation: a neuromorphic approach to solve the ‘cocktail party problem’ in real-time

Directory of Open Access Journals (Sweden)

Chetan Singh Thakur

2015-09-01

Full Text Available The human auditory system has the ability to segregate complex auditory scenes into a foreground component and a background, allowing us to listen to specific speech sounds from a mixture of sounds. Selective attention plays a crucial role in this process, colloquially known as the ‘cocktail party effect’. It has not been possible to build a machine that can emulate this human ability in real-time. Here, we have developed a framework for the implementation of a neuromorphic sound segregation algorithm in a Field Programmable Gate Array (FPGA). This algorithm is based on the principles of temporal coherence and uses an attention signal to separate a target sound stream from background noise. Temporal coherence implies that auditory features belonging to the same sound source are coherently modulated and evoke highly correlated neural response patterns. The basis for this form of sound segregation is that responses from pairs of channels that are strongly positively correlated belong to the same stream, while channels that are uncorrelated or anti-correlated belong to different streams. In our framework, we have used a neuromorphic cochlea as a frontend sound analyser to extract spatial information of the sound input, which then passes through band pass filters that extract the sound envelope at various modulation rates. Further stages include feature extraction and mask generation, which is finally used to reconstruct the targeted sound. Using sample tonal and speech mixtures, we show that our FPGA architecture is able to segregate sound sources in real-time. The accuracy of segregation is indicated by the high signal-to-noise ratio (SNR) of the segregated stream (90, 77 and 55 dB for simple tone, complex tone and speech, respectively) as compared to the SNR of the mixture waveform (0 dB). This system may be easily extended for the segregation of complex speech signals, and may thus find various applications in electronic devices such as for
Restoring speech perception with cochlear implants by spanning defective electrode contacts.

Science.gov (United States)

Frijns, Johan H M; Snel-Bongers, Jorien; Vellinga, Dirk; Schrage, Erik; Vanpoucke, Filiep J; Briaire, Jeroen J

2013-04-01

Even with six defective contacts, spanning can largely restore speech perception with the HiRes 120 speech processing strategy to the level supported by an intact electrode array. Moreover, the sound quality is not degraded. Previous studies have demonstrated reduced speech perception scores (SPS) with defective contacts in HiRes 120. This study investigated whether replacing defective contacts by spanning, i.e. current steering on non-adjacent contacts, is able to restore speech recognition to the level supported by an intact electrode array. Ten adult cochlear implant recipients (HiRes90K, HiFocus1J) with experience with HiRes 120 participated in this study. Three different defective electrode arrays were simulated (six separate defective contacts, three pairs or two triplets). The participants received three take-home strategies and were asked to evaluate the sound quality in five predefined listening conditions. After 3 weeks, SPS were evaluated with monosyllabic words in quiet and in speech-shaped background noise. The participants rated the sound quality equal for all take-home strategies. SPS with background noise were equal for all conditions tested. However, SPS in quiet (85% phonemes correct on average with the full array) decreased significantly with increasing spanning distance, with a 3% decrease for each spanned contact.
Sound production treatment for acquired apraxia of speech: Effects of blocked and random practice on multisyllabic word production.

Science.gov (United States)

Wambaugh, Julie; Nessler, Christina; Wright, Sandra; Mauszycki, Shannon; DeLong, Catharine

2016-10-01

This study was designed to examine the effects of practice schedule, blocked vs random, on outcomes of a behavioural treatment for acquired apraxia of speech (AOS), Sound Production Treatment (SPT). SPT was administered to four speakers with chronic AOS and aphasia in the context of multiple baseline designs across behaviours and participants. Treatment was applied to multiple sound errors within three-to-five syllable words. All participants received both practice schedules: SPT-Random (SPT-R) and SPT-Blocked (SPT-B). Improvements in accuracy of word production for trained items were found for both treatment conditions for all participants. One participant demonstrated better maintenance effects associated with SPT-R. Response generalisation to untreated words varied across participants, but was generally modest and unstable. Stimulus generalisation to production of words in sentence completion was positive for three of the participants. Stimulus generalisation to production of phrases was positive for two of the participants. Findings provide additional efficacy data regarding SPT's effects on articulation of treated items and extend knowledge of the treatment's effects when applied to multiple targets within multisyllabic words.
Effects of musical expertise on oscillatory brain activity in response to emotional sounds.

Science.gov (United States)

Nolden, Sophie; Rigoulot, Simon; Jolicoeur, Pierre; Armony, Jorge L

2017-08-01

Emotions can be conveyed through a variety of channels in the auditory domain, be it via music, non-linguistic vocalizations, or speech prosody. Moreover, recent studies suggest that expertise in one sound category can impact the processing of emotional sounds in other sound categories as they found that musicians process more efficiently emotional musical and vocal sounds than non-musicians. However, the neural correlates of these modulations, especially their time course, are not very well understood. Consequently, we focused here on how the neural processing of emotional information varies as a function of sound category and expertise of participants. Electroencephalogram (EEG) of 20 non-musicians and 17 musicians was recorded while they listened to vocal (speech and vocalizations) and musical sounds. The amplitude of EEG-oscillatory activity in the theta, alpha, beta, and gamma band was quantified and Independent Component Analysis (ICA) was used to identify underlying components of brain activity in each band. Category differences were found in theta and alpha bands, due to larger responses to music and speech than to vocalizations, and in posterior beta, mainly due to differential processing of speech. In addition, we observed greater activation in frontal theta and alpha for musicians than for non-musicians, as well as an interaction between expertise and emotional content of sounds in frontal alpha. The results reflect musicians' expertise in recognition of emotion-conveying music, which seems to also generalize to emotional expressions conveyed by the human voice, in line with previous accounts of effects of expertise on musical and vocal sounds processing. Copyright © 2017 Elsevier Ltd. All rights reserved.
Working memory span in Persian-speaking children with speech sound disorders and normal speech development.

Science.gov (United States)

Afshar, Mohamad Reza; Ghorbani, Ali; Rashedi, Vahid; Jalilevand, Nahid; Kamali, Mohamad

2017-10-01

The aim of this study was to compare working memory span in Persian-speaking preschool children with speech sound disorder (SSD) and their typically speaking peers. Additionally, the study aimed to examine Non-Word Repetition (NWR), Forward Digit Span (FDS) and Backward Digit Span (BDS) in four groups of children with varying severity levels of SSD. The participants in this study comprised 35 children with SSD and 35 typically developing (TD) children (matched for age and sex) as a control group. The participants were between 48 and 72 months of age. Two components of working memory, the phonological loop and the central executive, were compared between the two groups. We used two tasks (NWR and FDS) to assess the phonological loop component, and one task (BDS) to assess the central executive component. Percentage of correct consonants (PCC) was used to calculate the severity of SSD. Significant differences were observed between the two groups in all tasks that assess working memory (p working memory between the various severity groups indicated significant differences between different severities of both NWR and FDS tasks among the SSD children (p 0.05). The result showed that PCC scores in TD children were associated with NWR (p 0.05). The working memory skills were weaker in SSD children, in comparison to TD children. In addition, children with varying levels of severity of SSD differed in terms of NWR and FDS, but not BDS. Copyright © 2017 Elsevier B.V. All rights reserved.

Quality of Mobile Phone and Tablet Mobile Apps for Speech Sound Disorders: Protocol for an Evidence-Based Appraisal.

Science.gov (United States)

Furlong, Lisa M; Morris, Meg E; Erickson, Shane; Serry, Tanya A

2016-11-29

Although mobile apps are readily available for speech sound disorders (SSD), their validity has not been systematically evaluated. This evidence-based appraisal will critically review and synthesize current evidence on available therapy apps for use by children with SSD. The main aims are to (1) identify the types of apps currently available for Android and iOS mobile phones and tablets, and (2) to critique their design features and content using a structured quality appraisal tool. This protocol paper presents and justifies the methods used for a systematic review of mobile apps that provide intervention for use by children with SSD. The primary outcomes of interest are (1) engagement, (2) functionality, (3) aesthetics, (4) information quality, (5) subjective quality, and (6) perceived impact. Quality will be assessed by 2 certified practicing speech-language pathologists using a structured quality appraisal tool. Two app stores will be searched from the 2 largest operating platforms, Android and iOS. Systematic methods of knowledge synthesis shall include searching the app stores using a defined procedure, data extraction, and quality analysis. This search strategy shall enable us to determine how many SSD apps are available for Android and for iOS compatible mobile phones and tablets. It shall also identify the regions of the world responsible for the apps' development, the content and the quality of offerings. Recommendations will be made for speech-language pathologists seeking to use mobile apps in their clinical practice. This protocol provides a structured process for locating apps and appraising the quality, as the basis for evaluating their use in speech pathology for children in English-speaking nations. ©Lisa M Furlong, Meg E Morris, Shane Erickson, Tanya A Serry. Originally published in JMIR Research Protocols (http://www.researchprotocols.org), 29.11.2016.
Musical background not associated with self-perceived hearing performance or speech perception in postlingual cochlear-implant users

NARCIS (Netherlands)

Fuller, Christina; Free, Rolien; Maat, Bert; Baskent, Deniz

In normal-hearing listeners, musical background has been observed to change the sound representation in the auditory system and produce enhanced performance in some speech perception tests. Based on these observations, it has been hypothesized that musical background can influence sound and speech
Speech and orthodontic appliances: a systematic literature review.

Science.gov (United States)

Chen, Junyu; Wan, Jia; You, Lun

2018-01-23

Various types of orthodontic appliances can lead to speech difficulties. However, speech difficulties caused by orthodontic appliances have not been sufficiently investigated by an evidence-based method. The aim of this study is to outline the scientific evidence and mechanism of the speech difficulties caused by orthodontic appliances. Randomized-controlled clinical trials (RCT), controlled clinical trials, and cohort studies focusing on the effect of orthodontic appliances on speech were included. A systematic search was conducted by an electronic search in PubMed, EMBASE, and the Cochrane Library databases, complemented by a manual search. The types of orthodontic appliances, the affected sounds, and duration period of the speech disturbances were extracted. The ROBINS-I tool was applied to evaluate the quality of non-randomized studies, and the bias of RCT was assessed based on the Cochrane Handbook for Systematic Reviews of Interventions. No meta-analyses could be performed due to the heterogeneity in the study designs and treatment modalities. Among 448 screened articles, 13 studies were included (n = 297 patients). Different types of orthodontic appliances such as fixed appliances, orthodontic retainers and palatal expanders could influence the clarity of speech. The /i/, /a/, and /e/ vowels as well as /s/, /z/, /l/, /t/, /d/, /r/, and /ʃ/ consonants could be distorted by appliances. Although most speech impairments could return to normal within weeks, speech distortion of the /s/ sound might last for more than 3 months. The low evidence level grading and heterogeneity were the two main limitations in this systematic review. Lingual fixed appliances, palatal expanders, and Hawley retainers have an evident influence on speech production. The /i/, /s/, /t/, and /d/ sounds are the primarily affected ones. The results of this systematic review should be interpreted with caution and more high-quality RCTs with larger sample sizes and longer follow-up periods are
The role of diffusive architectural surfaces on auditory spatial discrimination in performance venues.

Science.gov (United States)

Robinson, Philip W; Pätynen, Jukka; Lokki, Tapio; Jang, Hyung Suk; Jeon, Jin Yong; Xiang, Ning

2013-06-01

In musical or theatrical performance, some venues allow listeners to individually localize and segregate individual performers, while others produce a well blended ensemble sound. The room acoustic conditions that make this possible, and the psycho-acoustic effects at work are not fully understood. This research utilizes auralizations from measured and simulated performance venues to investigate spatial discrimination of multiple acoustic sources in rooms. Signals were generated from measurements taken in a small theater, and listeners in the audience area were asked to distinguish pairs of speech sources on stage with various spatial separations. This experiment was repeated with the proscenium splay walls treated to be flat, diffusive, or absorptive. Similar experiments were conducted in a simulated hall, utilizing 11 early reflections with various characteristics, and measured late reverberation. The experiments reveal that discriminating the lateral arrangement of two sources is possible at narrower separation angles when reflections come from flat or absorptive rather than diffusive surfaces.
Hidden Markov models in automatic speech recognition

Science.gov (United States)

Wrzoskowicz, Adam

1993-11-01

This article describes a method for constructing an automatic speech recognition system based on hidden Markov models (HMMs). The author discusses the basic concepts of HMM theory and the application of these models to the analysis and recognition of speech signals. The author provides algorithms which make it possible to train the ASR system and recognize signals on the basis of distinct stochastic models of selected speech sound classes. The author describes the specific components of the system and the procedures used to model and recognize speech. The author discusses problems associated with the choice of optimal signal detection and parameterization characteristics and their effect on the performance of the system. The author presents different options for the choice of speech signal segments and their consequences for the ASR process. The author gives special attention to the use of lexical, syntactic, and semantic information for the purpose of improving the quality and efficiency of the system. The author also describes an ASR system developed by the Speech Acoustics Laboratory of the IBPT PAS. The author discusses the results of experiments on the effect of noise on the performance of the ASR system and describes methods of constructing HMMs designed to operate in a noisy environment. The author also describes a language for human-robot communications which was defined as a complex multilevel network from an HMM model of speech sounds geared towards Polish inflections. The author also added mandatory lexical and syntactic rules to the system for its communications vocabulary.
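As a rough illustration of how an HMM-based recognizer can pick the most likely sequence of speech sound classes for a series of observations, the following is a minimal sketch of Viterbi decoding. It is not the system described in the record above: the two sound classes, the quantized "low"/"high" observations, and all probabilities are hypothetical toy values chosen only to make the example runnable.

```python
import math


def viterbi(observations, states, log_start, log_trans, log_emit):
    """Return (most likely state path, log-probability of that path)."""
    # best[t][s]: log-probability of the best path that ends in state s at time t
    best = [{s: log_start[s] + log_emit[s][observations[0]] for s in states}]
    back = [{}]  # back[t][s]: predecessor of s on that best path
    for obs in observations[1:]:
        prev = best[-1]
        scores, pointers = {}, {}
        for s in states:
            p = max(states, key=lambda r: prev[r] + log_trans[r][s])
            scores[s] = prev[p] + log_trans[p][s] + log_emit[s][obs]
            pointers[s] = p
        best.append(scores)
        back.append(pointers)
    # Trace back from the best final state.
    last = max(states, key=lambda s: best[-1][s])
    path = [last]
    for pointers in reversed(back[1:]):
        path.append(pointers[path[-1]])
    path.reverse()
    return path, best[-1][last]


def to_log(table):
    """Convert a nested dict of probabilities to log-probabilities."""
    return {k: {j: math.log(v) for j, v in row.items()} for k, row in table.items()}


# Hypothetical toy model: two sound classes, observations quantized to "low"/"high".
STATES = ["vowel", "fricative"]
LOG_START = {"vowel": math.log(0.5), "fricative": math.log(0.5)}
LOG_TRANS = to_log({
    "vowel": {"vowel": 0.8, "fricative": 0.2},
    "fricative": {"vowel": 0.2, "fricative": 0.8},
})
LOG_EMIT = to_log({
    "vowel": {"low": 0.9, "high": 0.1},
    "fricative": {"low": 0.1, "high": 0.9},
})

path, log_p = viterbi(["low", "low", "high", "high"], STATES, LOG_START, LOG_TRANS, LOG_EMIT)
print(path, math.exp(log_p))
```

Working in log-probabilities avoids numerical underflow on long observation sequences, which is why the sketch adds log terms instead of multiplying probabilities.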
Vehicle surge detection and pathway discrimination by pedestrians who are blind: Effect of adding an alert sound to hybrid electric vehicles on performance.

Science.gov (United States)

Kim, Dae Shik; Emerson, Robert Wall; Naghshineh, Koorosh; Pliskow, Jay; Myers, Kyle

2012-05-01

This study examined the effect of adding an artificially generated alert sound to a quiet vehicle on its detectability and localizability with 15 visually impaired adults. When starting from a stationary position, the hybrid electric vehicle with an alert sound was significantly more quickly and reliably detected than either the identical vehicle without such added sound or the comparable internal combustion engine vehicle. However, no significant difference was found between the vehicles in respect to how accurately the participants could discriminate the path of a given vehicle (straight vs. right turn). These results suggest that adding an artificial sound to a hybrid electric vehicle may help reduce delay in street crossing initiation by a blind pedestrian, but the benefit of such alert sound may not be obvious in determining whether the vehicle in his near parallel lane proceeds straight through the intersection or turns right in front of him.
Surgical improvement of speech disorder caused by amyotrophic lateral sclerosis.

Science.gov (United States)

Saigusa, Hideto; Yamaguchi, Satoshi; Nakamura, Tsuyoshi; Komachi, Taro; Kadosono, Osamu; Ito, Hiroyuki; Saigusa, Makoto; Niimi, Seiji

2012-12-01

Amyotrophic lateral sclerosis (ALS) is a progressive debilitating neurological disease. ALS disturbs the quality of life by affecting speech, swallowing and free mobility of the arms without affecting intellectual function. It is therefore of significance to improve intelligibility and quality of speech sounds, especially for ALS patients with slowly progressive courses. Currently, however, there is no effective or established approach to improve speech disorder caused by ALS. We investigated a surgical procedure to improve speech disorder for some patients with neuromuscular diseases with velopharyngeal closure incompetence. In this study, we performed the surgical procedure for two patients suffering from severe speech disorder caused by slowly progressing ALS. The patients suffered from speech disorder with hypernasality and imprecise and weak articulation during a 6-year course (patient 1) and a 3-year course (patient 2) of slowly progressing ALS. We narrowed the bilateral lateral palatopharyngeal wall at the velopharyngeal port, and performed this surgery under general anesthesia without muscle relaxant for the two patients. Postoperatively, intelligibility and quality of their speech sounds were greatly improved within one month without any speech therapy. The patients were also able to generate longer speech phrases after the surgery. Importantly, there was no serious complication during or after the surgery. In summary, we performed bilateral narrowing of the lateral palatopharyngeal wall as a speech surgery for two patients suffering from severe speech disorder associated with ALS. With this technique, improved intelligibility and quality of speech can be maintained for a longer duration for patients with slowly progressing ALS.
Perceptual and Acoustic Reliability Estimates for the Speech Disorders Classification System (SDCS)

Science.gov (United States)

Shriberg, Lawrence D.; Fourakis, Marios; Hall, Sheryl D.; Karlsson, Heather B.; Lohmeier, Heather L.; McSweeny, Jane L.; Potter, Nancy L.; Scheer-Cohen, Alison R.; Strand, Edythe A.; Tilkens, Christie M.; Wilson, David L.

2010-01-01

A companion paper describes three extensions to a classification system for paediatric speech sound disorders termed the Speech Disorders Classification System (SDCS). The SDCS uses perceptual and acoustic data reduction methods to obtain information on a speaker's speech, prosody, and voice. The present paper provides reliability estimates for…
James Weldon Johnson and the Speech Lab Recordings

Directory of Open Access Journals (Sweden)

Chris Mustazza

2016-03-01

On December 24, 1935, James Weldon Johnson read thirteen of his poems at Columbia University, in a recording session engineered by Columbia Professor of Speech George W. Hibbitt and Barnard colleague Professor W. Cabell Greet, pioneers in the field that became sociolinguistics. Interested in American dialects, Greet and Hibbitt used early sound recording technologies to preserve dialect samples. In the same lab where they recorded T.S. Eliot, Gertrude Stein, and others, James Weldon Johnson read a selection of poems that included several from his seminal collection God’s Trombones and some dialect poems. Mustazza has digitized these and made them publicly available in the PennSound archive. In this essay, Mustazza contextualizes the collection, considering the recordings as sonic inscriptions alongside their textual manifestations. He argues that the collection must be heard within the frames of its production conditions—especially its recording in a speech lab—and that the sound recordings are essential elements in an hermeneutic analysis of the poems. He reasons that the poems’ original topics are reframed and refocused when historicized and contextualized within the frame of The Speech Lab Recordings.
The development of speech production in children with cleft palate

DEFF Research Database (Denmark)

Willadsen, Elisabeth; Chapman, Kathy

2012-01-01

The purpose of this chapter is to provide an overview of speech development of children with cleft palate +/- cleft lip. The chapter will begin with a discussion of the impact of clefting on speech. Next, we will provide a brief description of those factors impacting speech development...... for this population of children. Finally, research examining various aspects of speech development of infants and young children with cleft palate (birth to age five) will be reviewed. This final section will be organized by typical stages of speech sound development (e.g., prespeech, the early word stage...
Temporal factors affecting somatosensory-auditory interactions in speech processing

Directory of Open Access Journals (Sweden)

Takayuki Ito

2014-11-01

Speech perception is known to rely on both auditory and visual information. However, sound-specific somatosensory input has also been shown to influence speech perceptual processing (Ito et al., 2009). In the present study we further addressed the relationship between somatosensory information and speech perceptual processing by testing the hypothesis that the temporal relationship between orofacial movement and sound processing contributes to somatosensory-auditory interaction in speech perception. We examined the changes in event-related potentials in response to multisensory synchronous (simultaneous) and asynchronous (90 ms lag and lead) somatosensory and auditory stimulation compared to individual unisensory auditory and somatosensory stimulation alone. We used a robotic device to apply facial skin somatosensory deformations that were similar in timing and duration to those experienced in speech production. Following synchronous multisensory stimulation the amplitude of the event-related potential was reliably different from the two unisensory potentials. More importantly, the magnitude of the event-related potential difference varied as a function of the relative timing of the somatosensory-auditory stimulation. Event-related activity change due to stimulus timing was seen between 160-220 ms following somatosensory onset, mostly around the parietal area. The results demonstrate a dynamic modulation of somatosensory-auditory convergence and suggest that the contribution of somatosensory information to speech processing depends on the specific temporal order of sensory inputs in speech production.
Introduction. The perception of speech: from sound to meaning.

Science.gov (United States)

Moore, Brian C J; Tyler, Lorraine K; Marslen-Wilson, William

2008-03-12

Spoken language communication is arguably the most important activity that distinguishes humans from non-human species. This paper provides an overview of the review papers that make up this theme issue on the processes underlying speech communication. The volume includes contributions from researchers who specialize in a wide range of topics within the general area of speech perception and language processing. It also includes contributions from key researchers in neuroanatomy and functional neuro-imaging, in an effort to cut across traditional disciplinary boundaries and foster cross-disciplinary interactions in this important and rapidly developing area of the biological and cognitive sciences.
The Prevalence of Speech Disorder in Primary School Students in Yazd-Iran

Directory of Open Access Journals (Sweden)

Sedighah Akhavan Karbasi

2011-01-01

Communication disorders are a widespread disabling problem and are associated with adverse long-term outcomes that affect individuals, families, and the academic achievement of children in the school years, and that affect vocational choices later in adulthood. The aim of this study was to determine the prevalence of speech disorders, specifically stuttering, voice, and speech-sound disorders, in primary school students in Yazd, Iran. In a descriptive study, 7881 primary school students in Yazd were evaluated for speech disorders using a direct, face-to-face assessment technique in 2005. The prevalence of total speech disorders was 14.8%, among whom 13.8% had speech-sound disorder, 1.2% stuttering and 0.47% voice disorder. The prevalence of speech disorders was higher in males (16.7%) than in females (12.7%). The pattern of prevalence of the three speech disorders differed significantly according to gender, parental education and number of family members. There was no significant difference across speech disorders by birth order, religion or paternal consanguinity. These prevalence figures are higher than those reported in most studies using parent or teacher reports.
Language and Speech Improvement for Kindergarten and First Grade. A Supplementary Handbook.

Science.gov (United States)

Cole, Roberta; And Others

The 16-unit language and speech improvement handbook for kindergarten and first grade students contains an introductory section which includes a discussion of the child's developmental speech and language characteristics, a sound development chart, a speech and hearing language screening test, the Henja articulation test, and a general outline of…
On the Perception of Speech Sounds as Biologically Significant Signals

Science.gov (United States)

Pisoni, David B.

2012-01-01

This paper reviews some of the major evidence and arguments currently available to support the view that human speech perception may require the use of specialized neural mechanisms for perceptual analysis. Experiments using synthetically produced speech signals with adults are briefly summarized and extensions of these results to infants and other organisms are reviewed with an emphasis towards detailing those aspects of speech perception that may require some need for specialized species-specific processors. Finally, some comments on the role of early experience in perceptual development are provided as an attempt to identify promising areas of new research in speech perception. PMID:399200
Real time speech formant analyzer and display

Science.gov (United States)

Holland, George E.; Struve, Walter S.; Homer, John F.

1987-01-01

A speech analyzer for interpretation of sound includes a sound input which converts the sound into a signal representing the sound. The signal is passed through a plurality of frequency pass filters to derive a plurality of frequency formants. These formants are converted to voltage signals by frequency-to-voltage converters and then are prepared for visual display in continuous real time. Parameters from the inputted sound are also derived and displayed. The display may then be interpreted by the user. The preferred embodiment includes a microprocessor which is interfaced with a television set for displaying of the sound formants. The microprocessor software enables the sound analyzer to present a variety of display modes for interpretive and therapeutic use by the user.
Evaluating standard airborne sound insulation measures in terms of annoyance, loudness, and audibility ratings.

Science.gov (United States)

Park, H K; Bradley, J S

2009-07-01

This paper reports the results of an evaluation of the merits of standard airborne sound insulation measures with respect to subjective ratings of the annoyance and loudness of transmitted sounds. Subjects listened to speech and music sounds modified to represent transmission through 20 different walls with sound transmission class (STC) ratings from 34 to 58. A number of variations in the standard measures were also considered. These included variations in the 8-dB rule for the maximum allowed deficiency in the STC measure as well as variations in the standard 32-dB total allowed deficiency. Several spectrum adaptation terms were considered in combination with weighted sound reduction index (R(w)) values as well as modifications to the range of included frequencies in the standard rating contour. A STC measure without an 8-dB rule and an R(w) rating with a new spectrum adaptation term were better predictors of annoyance and loudness ratings of speech sounds. R(w) ratings with one of two modified C(tr) spectrum adaptation terms were better predictors of annoyance and loudness ratings of transmitted music sounds. Although some measures were much better predictors of responses to one type of sound than were the standard STC and R(w) values, no measure was remarkably improved for predicting annoyance and loudness ratings of both music and speech sounds.
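To make the two deficiency rules mentioned in the record above concrete, here is a minimal sketch of a contour-fitting rating in the style of the STC procedure: the reference contour is raised until either the summed deficiencies exceed 32 dB or (optionally) any single band falls more than 8 dB below the contour. The contour offsets below are the values commonly quoted for the sixteen third-octave bands from 125 Hz to 4000 Hz and are an assumption here; they should be checked against the ASTM E413 standard before any real use. The function name and the example wall data are hypothetical.

```python
# Third-octave band centre frequencies (Hz) and contour offsets (dB) relative to
# the contour value at 500 Hz. Assumed values; verify against ASTM E413.
STC_BANDS_HZ = [125, 160, 200, 250, 315, 400, 500, 630,
                800, 1000, 1250, 1600, 2000, 2500, 3150, 4000]
CONTOUR_OFFSETS_DB = [-16, -13, -10, -7, -4, -1, 0, 1,
                      2, 3, 4, 4, 4, 4, 4, 4]


def stc_rating(tl_db, max_single=8, max_total=32):
    """Highest contour position (its value at 500 Hz) meeting the deficiency limits.

    tl_db: sixteen transmission-loss values in dB, one per band in STC_BANDS_HZ.
    max_single: largest allowed deficiency in one band; None disables this rule.
    max_total: largest allowed sum of deficiencies over all bands.
    Returns None if even a contour position of 0 violates the limits.
    """
    if len(tl_db) != len(CONTOUR_OFFSETS_DB):
        raise ValueError("expected one transmission-loss value per band")
    best = None
    for rating in range(0, 101):
        # A deficiency is how far the measured value falls below the contour.
        deficiencies = [max(0, rating + off - tl)
                        for off, tl in zip(CONTOUR_OFFSETS_DB, tl_db)]
        if sum(deficiencies) > max_total:
            break
        if max_single is not None and max(deficiencies) > max_single:
            break
        best = rating
    return best


# Hypothetical wall: follows the contour at 50 except for a 12 dB dip at 125 Hz.
wall = [50 + off for off in CONTOUR_OFFSETS_DB]
wall[0] -= 12
print(stc_rating(wall))                   # single-band rule limits the rating
print(stc_rating(wall, max_single=None))  # same wall without the single-band rule
```

For this made-up wall the single low-frequency dip caps the rating when the 8 dB rule is active, and the rating rises once the rule is disabled, which is the kind of variation the study compares against listener ratings.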
Effects of early language, speech, and cognition on later reading: A mediation analysis

Directory of Open Access Journals (Sweden)

Vanessa N Durand

2013-09-01

This longitudinal secondary analysis examined which early language and speech abilities are associated with school-aged reading skills, and whether these associations are mediated by cognitive ability. We analyzed vocabulary, syntax, speech sound maturity, and cognition in a sample of healthy children at age 3 years (N=241) in relation to single word reading (decoding), comprehension, and oral reading fluency in the same children at age 9 to 11 years. All predictor variables and the mediator variable were associated with the three reading outcomes. The predictor variables were all associated with cognitive abilities, the mediator. Cognitive abilities partially mediated the effects of language on reading. After mediation, decoding was associated with speech sound maturity; comprehension was associated with receptive vocabulary; and oral fluency was associated with speech sound maturity, receptive vocabulary, and syntax. In summary, not all of the effects of language on reading could be explained by cognition as a mediator. Specific components of language and speech skills in preschool made independent contributions to reading skills 6 to 8 years later. These early precursors to later reading skill represent potential targets for early intervention to improve reading.
Temporal integration: intentional sound discrimination does not modulate stimulus-driven processes in auditory event synthesis.

Science.gov (United States)

Sussman, Elyse; Winkler, István; Kreuzer, Judith; Saher, Marieke; Näätänen, Risto; Ritter, Walter

2002-12-01

Our previous study showed that the auditory context could influence whether two successive acoustic changes occurring within the temporal integration window (approximately 200 ms) were pre-attentively encoded as a single auditory event or as two discrete events (Cogn Brain Res 12 (2001) 431). The aim of the current study was to assess whether top-down processes could influence the stimulus-driven processes in determining what constitutes an auditory event. Electroencephalogram (EEG) was recorded from 11 scalp electrodes to frequently occurring standard and infrequently occurring deviant sounds. Within the stimulus blocks, deviants either occurred only in pairs (successive feature changes) or both singly and in pairs. Event-related potential indices of change and target detection, the mismatch negativity (MMN) and the N2b component, respectively, were compared with the simultaneously measured performance in discriminating the deviants. Even though subjects could voluntarily distinguish the two successive auditory feature changes from each other, which was also indicated by the elicitation of the N2b target-detection response, top-down processes did not modify the event organization reflected by the MMN response. Top-down processes can extract elemental auditory information from a single integrated acoustic event, but the extraction occurs at a later processing stage than the one whose outcome is indexed by MMN. Initial processes of auditory event-formation are fully governed by the context within which the sounds occur. Perception of the deviants as two separate sound events (the top-down effects) did not change the initial neural representation of the same deviants as one event (indexed by the MMN), without a corresponding change in the stimulus-driven sound organization.
Children with dyslexia show a reduced processing benefit from bimodal speech information compared to their typically developing peers.

Science.gov (United States)

Schaadt, Gesa; van der Meer, Elke; Pannekamp, Ann; Oberecker, Regine; Männel, Claudia

2018-01-17

During information processing, individuals benefit from bimodally presented input, as has been demonstrated for speech perception (i.e., printed letters and speech sounds) or the perception of emotional expressions (i.e., facial expression and voice tuning). While typically developing individuals show this bimodal benefit, school children with dyslexia do not. Currently, it is unknown whether the bimodal processing deficit in dyslexia also occurs for visual-auditory speech processing that is independent of reading and spelling acquisition (i.e., no letter-sound knowledge is required). Here, we tested school children with and without spelling problems on their bimodal perception of video-recorded mouth movements pronouncing syllables. We analyzed the event-related potential Mismatch Response (MMR) to visual-auditory speech information and compared this response to the MMR to monomodal speech information (i.e., auditory-only, visual-only). We found a reduced MMR with later onset to visual-auditory speech information in children with spelling problems compared to children without spelling problems. Moreover, when comparing bimodal and monomodal speech perception, we found that children without spelling problems showed significantly larger responses in the visual-auditory experiment compared to the visual-only response, whereas children with spelling problems did not. Our results suggest that children with dyslexia exhibit general difficulties in bimodal speech perception independently of letter-speech sound knowledge, as apparent in altered bimodal speech perception and lacking benefit from bimodal information. This general deficit in children with dyslexia may underlie the previously reported reduced bimodal benefit for letter-speech sound combinations and similar findings in emotion perception. Copyright © 2018 Elsevier Ltd. All rights reserved.

Recent advances in nonlinear speech processing

CERN Document Server

Faundez-Zanuy, Marcos; Esposito, Antonietta; Cordasco, Gennaro; Drugman, Thomas; Solé-Casals, Jordi; Morabito, Francesco

2016-01-01

This book presents recent advances in nonlinear speech processing beyond nonlinear techniques. It shows that it exploits heuristic and psychological models of human interaction in order to succeed in the implementations of socially believable VUIs and applications for human health and psychological support. The book takes into account the multifunctional role of speech and what is “outside of the box” (see Björn Schuller’s foreword). To this aim, the book is organized in 6 sections, each collecting a small number of short chapters reporting advances “inside” and “outside” themes related to nonlinear speech research. The themes emphasize theoretical and practical issues for modelling socially believable speech interfaces, ranging from efforts to capture the nature of sound changes in linguistic contexts and the timing nature of speech; labors to identify and detect speech features that help in the diagnosis of psychological and neuronal disease, attempts to improve the effectiveness and performa...
Perceptual statistical learning over one week in child speech production.

Science.gov (United States)

Richtsmeier, Peter T; Goffman, Lisa

2017-07-01

What cognitive mechanisms account for the trajectory of speech sound development, in particular, gradually increasing accuracy during childhood? An intriguing potential contributor is statistical learning, a type of learning that has been studied frequently in infant perception but less often in child speech production. To assess the relevance of statistical learning to developing speech accuracy, we carried out a statistical learning experiment with four- and five-year-olds in which statistical learning was examined over one week. Children were familiarized with and tested on word-medial consonant sequences in novel words. There was only modest evidence for statistical learning, primarily in the first few productions of the first session. This initial learning effect nevertheless aligns with previous statistical learning research. Furthermore, the overall learning effect was similar to an estimate of weekly accuracy growth based on normative studies. The results implicate other important factors in speech sound development, particularly learning via production. Copyright © 2017 Elsevier Inc. All rights reserved.
Neural Correlates of Indicators of Sound Change in Cantonese: Evidence from Cortical and Subcortical Processes.

Science.gov (United States)

Maggu, Akshay R; Liu, Fang; Antoniou, Mark; Wong, Patrick C M

2016-01-01

Across time, languages undergo changes in phonetic, syntactic, and semantic dimensions. Social, cognitive, and cultural factors contribute to sound change, a phenomenon in which the phonetics of a language undergo changes over time. Individuals who misperceive and produce speech in a slightly divergent manner (called innovators) contribute to variability in the society, eventually leading to sound change. However, the cause of variability in these individuals is still unknown. In this study, we examined whether such misperceptions are represented in neural processes of the auditory system. We investigated behavioral, subcortical (via FFR), and cortical (via P300) manifestations of sound change processing in Cantonese, a Chinese language in which several lexical tones are merging. Across the merging categories, we observed a similar gradation of speech perception abilities in both behavior and the brain (subcortical and cortical processes). Further, we also found that behavioral evidence of tone merging correlated with subjects' encoding at the subcortical and cortical levels. These findings indicate that tone-merger categories, which are indicators of sound change in Cantonese, are represented neurophysiologically with high fidelity. Using our results, we speculate that innovators encode speech in a slightly deviant neurophysiological manner, and thus produce speech divergently, which eventually spreads across the community and contributes to sound change.
Neural Correlates of Early Sound Encoding and their Relationship to Speech-in-Noise Perception

Directory of Open Access Journals (Sweden)

Emily B. J. Coffey

2017-08-01

Speech-in-noise (SIN) perception is a complex cognitive skill that affects social, vocational, and educational activities. Poor SIN ability particularly affects young and elderly populations, yet varies considerably even among healthy young adults with normal hearing. Although SIN skills are known to be influenced by top-down processes that can selectively enhance lower-level sound representations, the complementary role of feed-forward mechanisms and their relationship to musical training is poorly understood. Using a paradigm that minimizes the main top-down factors that have been implicated in SIN performance such as working memory, we aimed to better understand how robust encoding of periodicity in the auditory system (as measured by the frequency-following response, FFR) contributes to SIN perception. Using magnetoencephalography (MEG), we found that the strength of encoding at the fundamental frequency in the brainstem, thalamus, and cortex is correlated with SIN accuracy. The amplitude of the slower cortical P2 wave was previously also shown to be related to SIN accuracy and FFR strength; we use MEG source localization to show that the P2 wave originates in a temporal region anterior to that of the cortical FFR. We also confirm that the observed enhancements were related to the extent and timing of musicianship. These results are consistent with the hypothesis that basic feed-forward sound encoding affects SIN perception by providing better information to later processing stages, and that modifying this process may be one mechanism through which musical training might enhance the auditory networks that subserve both musical and language functions.
Temporal modulations in speech and music.

Science.gov (United States)

Ding, Nai; Patel, Aniruddh D; Chen, Lin; Butler, Henry; Luo, Cheng; Poeppel, David

2017-10-01

Speech and music have structured rhythms. Here we discuss a major acoustic correlate of spoken and musical rhythms, the slow (0.25-32 Hz) temporal modulations in sound intensity, and compare the modulation properties of speech and music. We analyze these modulations using over 25 h of speech and over 39 h of recordings of Western music. We show that the speech modulation spectrum is highly consistent across 9 languages (including languages with typologically different rhythmic characteristics). A different, but similarly consistent modulation spectrum is observed for music, including classical music played by single instruments of different types, symphonic, jazz, and rock. The temporal modulations of speech and music show broad but well-separated peaks around 5 and 2 Hz, respectively. These acoustically dominant time scales may be intrinsic features of speech and music, a possibility which should be investigated using more culturally diverse samples in each domain. Distinct modulation timescales for speech and music could facilitate their perceptual analysis and its neural processing. Copyright © 2017 Elsevier Ltd. All rights reserved.
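As a rough illustration of what a temporal modulation spectrum measures, the sketch below builds a synthetic intensity envelope with a single 5 Hz modulation and locates the peak of its discrete Fourier transform. The envelope sampling rate, the duration, the modulation depth, and the function name `modulation_spectrum` are assumptions made for this example; the abstract does not describe the study's actual analysis pipeline.

```python
import cmath
import math


def modulation_spectrum(envelope, fs):
    """Return (frequencies, magnitudes) of the envelope's DFT with the mean removed.

    A plain O(n^2) DFT is used so the example needs only the standard library.
    """
    n = len(envelope)
    mean = sum(envelope) / n
    centered = [x - mean for x in envelope]
    freqs, mags = [], []
    for k in range(1, n // 2):
        acc = sum(c * cmath.exp(-2j * math.pi * k * t / n)
                  for t, c in enumerate(centered))
        freqs.append(k * fs / n)
        mags.append(abs(acc) / n)
    return freqs, mags


fs = 100.0                 # envelope sampling rate in Hz (assumed)
n = int(fs * 2.0)          # two seconds of envelope
# Synthetic intensity envelope: constant level plus a 5 Hz modulation.
envelope = [1.0 + 0.5 * math.sin(2 * math.pi * 5.0 * t / fs) for t in range(n)]

freqs, mags = modulation_spectrum(envelope, fs)
peak_hz = freqs[mags.index(max(mags))]
print(f"modulation peak at {peak_hz} Hz")
```

For real recordings the envelope would first have to be extracted from the waveform, a step this toy example skips.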
Magnified Neural Envelope Coding Predicts Deficits in Speech Perception in Noise.

Science.gov (United States)

Millman, Rebecca E; Mattys, Sven L; Gouws, André D; Prendergast, Garreth

2017-08-09

Verbal communication in noisy backgrounds is challenging. Understanding speech in background noise that fluctuates in intensity over time is particularly difficult for hearing-impaired listeners with a sensorineural hearing loss (SNHL). The reduction in fast-acting cochlear compression associated with SNHL exaggerates the perceived fluctuations in intensity in amplitude-modulated sounds. SNHL-induced changes in the coding of amplitude-modulated sounds may have a detrimental effect on the ability of SNHL listeners to understand speech in the presence of modulated background noise. To date, direct evidence for a link between magnified envelope coding and deficits in speech identification in modulated noise has been absent. Here, magnetoencephalography was used to quantify the effects of SNHL on phase locking to the temporal envelope of modulated noise (envelope coding) in human auditory cortex. Our results show that SNHL enhances the amplitude of envelope coding in posteromedial auditory cortex, whereas it enhances the fidelity of envelope coding in posteromedial and posterolateral auditory cortex. This dissociation was more evident in the right hemisphere, demonstrating functional lateralization in enhanced envelope coding in SNHL listeners. However, enhanced envelope coding was not perceptually beneficial. Our results also show that both hearing thresholds and, to a lesser extent, magnified cortical envelope coding in left posteromedial auditory cortex predict speech identification in modulated background noise. We propose a framework in which magnified envelope coding in posteromedial auditory cortex disrupts the segregation of speech from background noise, leading to deficits in speech perception in modulated background noise. SIGNIFICANCE STATEMENT People with hearing loss struggle to follow conversations in noisy environments. Background noise that fluctuates in intensity over time poses a particular challenge. Using magnetoencephalography, we demonstrate
Classifying laughter and speech using audio-visual feature prediction

NARCIS (Netherlands)

Petridis, Stavros; Asghar, Ali; Pantic, Maja

2010-01-01

In this study, a system that discriminates laughter from speech by modelling the relationship between audio and visual features is presented. The underlying assumption is that this relationship is different between speech and laughter. Neural networks are trained which learn the audio-to-visual and
Spectrotemporal Modulation Detection and Speech Perception by Cochlear Implant Users.

Science.gov (United States)

Won, Jong Ho; Moon, Il Joon; Jin, Sunhwa; Park, Heesung; Woo, Jihwan; Cho, Yang-Sun; Chung, Won-Ho; Hong, Sung Hwa

2015-01-01

Spectrotemporal modulation (STM) detection performance was examined for cochlear implant (CI) users. The test involved discriminating between an unmodulated steady noise and a modulated stimulus. The modulated stimulus presents frequency modulation patterns that change in frequency over time. In order to examine STM detection performance for different modulation conditions, two different temporal modulation rates (5 and 10 Hz) and three different spectral modulation densities (0.5, 1.0, and 2.0 cycles/octave) were employed, producing a total of 6 different STM stimulus conditions. In order to explore how electric hearing constrains STM sensitivity for CI users differently from acoustic hearing, normal-hearing (NH) and hearing-impaired (HI) listeners were also tested on the same tasks. STM detection performance was best in NH subjects, followed by HI subjects. On average, CI subjects showed poorest performance, but some CI subjects showed high levels of STM detection performance that was comparable to acoustic hearing. Significant correlations were found between STM detection performance and speech identification performance in quiet and in noise. In order to understand the relative contribution of spectral and temporal modulation cues to speech perception abilities for CI users, spectral and temporal modulation detection was performed separately and related to STM detection and speech perception performance. The results suggest that slow spectral modulation rather than slow temporal modulation may be important for determining speech perception capabilities for CI users. Lastly, test-retest reliability for STM detection was good with no learning. The present study demonstrates that STM detection may be a useful tool to evaluate the ability of CI sound processing strategies to deliver clinically pertinent acoustic modulation information.
Acquired word deafness, and the temporal grain of sound representation in the primary auditory cortex.

Science.gov (United States)

Phillips, D P; Farmer, M E

1990-11-15

This paper explores the nature of the processing disorder which underlies the speech discrimination deficit in the syndrome of acquired word deafness following from pathology to the primary auditory cortex. A critical examination of the evidence on this disorder revealed the following. First, the most profound forms of the condition are expressed not only in an isolation of the cerebral linguistic processor from auditory input, but in a failure of even the perceptual elaboration of the relevant sounds. Second, in agreement with earlier studies, we conclude that the perceptual dimension disturbed in word deafness is a temporal one. We argue, however, that it is not a generalized disorder of auditory temporal processing, but one which is largely restricted to the processing of sounds with temporal content in the milliseconds to tens-of-milliseconds time frame. The perceptual elaboration of sounds with temporal content outside that range, in either direction, may survive the disorder. Third, we present neurophysiological evidence that the primary auditory cortex has a special role in the representation of auditory events in that time frame, but not in the representation of auditory events with temporal grains outside that range.
Development of Bone-Conducted Ultrasonic Hearing Aid for the Profoundly Deaf: Assessments of the Modulation Type with Regard to Intelligibility and Sound Quality

Science.gov (United States)

Nakagawa, Seiji; Fujiyuki, Chika; Kagomiya, Takayuki

2012-07-01

Bone-conducted ultrasound (BCU) is perceived even by the profoundly sensorineural deaf. A novel hearing aid using the perception of amplitude-modulated BCU (BCU hearing aid: BCUHA) has been developed; however, further improvements are needed, especially in terms of articulation and sound quality. In this study, the intelligibility and sound quality of BCU speech with several types of amplitude modulation [double-sideband with transmitted carrier (DSB-TC), double-sideband with suppressed carrier (DSB-SC), and transposed modulation] were evaluated. The results showed that DSB-TC and transposed speech were more intelligible than DSB-SC speech, and transposed speech was closer than the other types of BCU speech to air-conducted speech in terms of sound quality. These results provide useful information for further development of the BCUHA.
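The two double-sideband schemes named in this abstract differ only in whether the carrier itself is transmitted. The sketch below uses the textbook definitions, DSB-TC as (1 + m·x(t))·cos(2πf_c t) and DSB-SC as x(t)·cos(2πf_c t); the 30 kHz carrier, the 192 kHz sampling rate, the 1 kHz test tone, and the function names are assumptions chosen for illustration and are not taken from the study. Transposed modulation is omitted because the abstract does not define it.

```python
import math


def dsb_tc(x, carrier_hz, fs, depth=1.0):
    """Double-sideband with transmitted carrier: (1 + depth * x) * carrier."""
    return [(1.0 + depth * xi) * math.cos(2 * math.pi * carrier_hz * i / fs)
            for i, xi in enumerate(x)]


def dsb_sc(x, carrier_hz, fs):
    """Double-sideband with suppressed carrier: x * carrier."""
    return [xi * math.cos(2 * math.pi * carrier_hz * i / fs)
            for i, xi in enumerate(x)]


fs = 192_000          # sampling rate in Hz (assumed)
carrier_hz = 30_000   # ultrasonic carrier in Hz (assumed)
n = 1920              # 10 ms of signal

# Stand-in for a speech signal: a 1 kHz tone normalized to [-1, 1].
x = [math.sin(2 * math.pi * 1000 * i / fs) for i in range(n)]

tc = dsb_tc(x, carrier_hz, fs)
sc = dsb_sc(x, carrier_hz, fs)
```

With a modulation depth of 1, the sample-wise difference `tc[i] - sc[i]` is the unmodulated carrier, which is the component that DSB-SC leaves out.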
Effects of noise and reverberation on speech perception and listening comprehension of children and adults in a classroom-like setting.

Science.gov (United States)

Klatte, Maria; Lachmann, Thomas; Meis, Markus

2010-01-01

The effects of classroom noise and background speech on speech perception, measured by word-to-picture matching, and listening comprehension, measured by execution of oral instructions, were assessed in first- and third-grade children and adults in a classroom-like setting. For speech perception, in addition to noise, reverberation time (RT) was varied by conducting the experiment in two virtual classrooms with mean RT = 0.47 versus RT = 1.1 s. Children were more impaired than adults by background sounds in both speech perception and listening comprehension. Classroom noise evoked a reliable disruption in children's speech perception even under conditions of short reverberation. RT had no effect on speech perception in silence, but evoked a severe increase in the impairments due to background sounds in all age groups. For listening comprehension, impairments due to background sounds were found in the children, stronger for first- than for third-graders, whereas adults were unaffected. Compared to classroom noise, background speech had a smaller effect on speech perception, but a stronger effect on listening comprehension, remaining significant when speech perception was controlled. This indicates that background speech affects higher-order cognitive processes involved in children's comprehension. Children's ratings of the sound-induced disturbance were low overall and uncorrelated to the actual disruption, indicating that the children did not consciously realize the detrimental effects. The present results confirm earlier findings on the substantial impact of noise and reverberation on children's speech perception, and extend these to classroom-like environmental settings and listening demands closely resembling those faced by children at school.
OLIVE: Speech-Based Video Retrieval

NARCIS (Netherlands)

de Jong, Franciska M.G.; Gauvain, Jean-Luc; den Hartog, Jurgen; den Hartog, Jeremy; Netter, Klaus

1999-01-01

This paper describes the Olive project which aims to support automated indexing of video material by use of human language technologies. Olive is making use of speech recognition to automatically derive transcriptions of the sound tracks, generating time-coded linguistic elements which serve as the
Toward a Natural Speech Understanding System

Science.gov (United States)

1989-10-01

toward the monolingual English 25 msec value. Miyawaki et al. (1975) investigated the /ra/ - /la/ continuum with English and Japanese speakers...Standard Dictionary In order to evaluate some of the claims of the learning theory of speech recognition, a computer model was developed. The NEXus...discrimination of synthetic vowels. Language and Speech, 1962, 5, 171-189. Funk and Wagnalls New Standard Dictionary of the English Language. New York: Funk and
Sixteen-Month-Old Infants' Segment Words from Infant- and Adult-Directed Speech

Science.gov (United States)

Mani, Nivedita; Pätzold, Wiebke

2016-01-01

One of the first challenges facing the young language learner is the task of segmenting words from a natural language speech stream, without prior knowledge of how these words sound. Studies with younger children find that children find it easier to segment words from fluent speech when the words are presented in infant-directed speech, i.e., the…
Audiovisual Speech Perception in Infancy: The Influence of Vowel Identity and Infants' Productive Abilities on Sensitivity to (Mis)Matches between Auditory and Visual Speech Cues

Science.gov (United States)

Altvater-Mackensen, Nicole; Mani, Nivedita; Grossmann, Tobias

2016-01-01

Recent studies suggest that infants' audiovisual speech perception is influenced by articulatory experience (Mugitani et al., 2008; Yeung & Werker, 2013). The current study extends these findings by testing if infants' emerging ability to produce native sounds in babbling impacts their audiovisual speech perception. We tested 44 6-month-olds…
Processing melodic contour and speech intonation in congenital amusics with Mandarin Chinese.

Science.gov (United States)

Jiang, Cunmei; Hamm, Jeff P; Lim, Vanessa K; Kirk, Ian J; Yang, Yufang

2010-07-01

Congenital amusia is a disorder in the perception and production of musical pitch. It has been suggested that early exposure to a tonal language may compensate for the pitch disorder (Peretz, 2008). If so, it is reasonable to expect that there would be different characterizations of pitch perception in music and speech in congenital amusics who speak a tonal language, such as Mandarin. In this study, a group of 11 adults with amusia whose first language was Mandarin were tested with melodic contour and speech intonation discrimination and identification tasks. The participants with amusia were impaired in discriminating and identifying melodic contour. These abnormalities were also detected in identifying both speech and non-linguistic analogue derived patterns for the Mandarin intonation tasks. In addition, there was an overall trend for the participants with amusia to show deficits with respect to controls in the intonation discrimination tasks for both speech and non-linguistic analogues. These findings suggest that the amusics' melodic pitch deficits may extend to the perception of speech, and could potentially result in some language deficits in those who speak a tonal language. Copyright (c) 2010 Elsevier Ltd. All rights reserved.
Developmental apraxia of speech : deficits in phonetic planning and motor programming

NARCIS (Netherlands)

Nijland, Lian

2003-01-01

The speech of children with developmental apraxia of speech (DAS) is highly unintelligible due to many nonsystematic sound substitutions and distortions. There is ongoing debate about the underlying deficit of the disorder. The ultimate goal of this thesis was to answer this question within the
The relationship of speech intelligibility with hearing sensitivity, cognition, and perceived hearing difficulties varies for different speech perception tests

Science.gov (United States)

Heinrich, Antje; Henshaw, Helen; Ferguson, Melanie A.

2015-01-01

Listeners vary in their ability to understand speech in noisy environments. Hearing sensitivity, as measured by pure-tone audiometry, can only partly explain these results, and cognition has emerged as another key concept. Although cognition relates to speech perception, the exact nature of the relationship remains to be fully understood. This study investigates how different aspects of cognition, particularly working memory and attention, relate to speech intelligibility for various tests. Perceptual accuracy of speech perception represents just one aspect of functioning in a listening environment. Activity and participation limits imposed by hearing loss, in addition to the demands of a listening environment, are also important and may be better captured by self-report questionnaires. Understanding how speech perception relates to self-reported aspects of listening forms the second focus of the study. Forty-four listeners aged between 50 and 74 years with mild sensorineural hearing loss were tested on speech perception tests differing in complexity from low (phoneme discrimination in quiet), to medium (digit triplet perception in speech-shaped noise) to high (sentence perception in modulated noise); cognitive tests of attention, memory, and non-verbal intelligence quotient; and self-report questionnaires of general health-related and hearing-specific quality of life. Hearing sensitivity and cognition related to intelligibility differently depending on the speech test: neither was important for phoneme discrimination, hearing sensitivity alone was important for digit triplet perception, and hearing and cognition together played a role in sentence perception. Self-reported aspects of auditory functioning were correlated with speech intelligibility to different degrees, with digit triplets in noise showing the richest pattern. The results suggest that intelligibility tests can vary in their auditory and cognitive demands and their sensitivity to the challenges that
The relationship of speech intelligibility with hearing sensitivity, cognition, and perceived hearing difficulties varies for different speech perception tests

Directory of Open Access Journals (Sweden)

Antje Heinrich

2015-06-01

Listeners vary in their ability to understand speech in noisy environments. Hearing sensitivity, as measured by pure-tone audiometry, can only partly explain these results, and cognition has emerged as another key concept. Although cognition relates to speech perception, the exact nature of the relationship remains to be fully understood. This study investigates how different aspects of cognition, particularly working memory and attention, relate to speech intelligibility for various tests. Perceptual accuracy of speech perception represents just one aspect of functioning in a listening environment. Activity and participation limits imposed by hearing loss, in addition to the demands of a listening environment, are also important and may be better captured by self-report questionnaires. Understanding how speech perception relates to self-reported aspects of listening forms the second focus of the study. Forty-four listeners aged between 50 and 74 years with mild SNHL were tested on speech perception tests differing in complexity from low (phoneme discrimination in quiet), to medium (digit triplet perception in speech-shaped noise) to high (sentence perception in modulated noise); cognitive tests of attention, memory, and nonverbal IQ; and self-report questionnaires of general health-related and hearing-specific quality of life. Hearing sensitivity and cognition related to intelligibility differently depending on the speech test: neither was important for phoneme discrimination, hearing sensitivity alone was important for digit triplet perception, and hearing and cognition together played a role in sentence perception. Self-reported aspects of auditory functioning were correlated with speech intelligibility to different degrees, with digit triplets in noise showing the richest pattern. The results suggest that intelligibility tests can vary in their auditory and cognitive demands and their sensitivity to the challenges that auditory environments pose on
A Diagnostic Marker to Discriminate Childhood Apraxia of Speech from Speech Delay: IV. the Pause Marker Index

Science.gov (United States)

Shriberg, Lawrence D.; Strand, Edythe A.; Fourakis, Marios; Jakielski, Kathy J.; Hall, Sheryl D.; Karlsson, Heather B.; Mabie, Heather L.; McSweeny, Jane L.; Tilkens, Christie M.; Wilson, David L.

2017-01-01

Purpose: Three previous articles provided rationale, methods, and several forms of validity support for a diagnostic marker of childhood apraxia of speech (CAS), termed the pause marker (PM). Goals of the present article were to assess the validity and stability of the PM Index (PMI) to scale CAS severity. Method: PM scores and speech, prosody,…

Influence of musical training on understanding voiced and whispered speech in noise.

Science.gov (United States)

Ruggles, Dorea R; Freyman, Richard L; Oxenham, Andrew J

2014-01-01

This study tested the hypothesis that the previously reported advantage of musicians over non-musicians in understanding speech in noise arises from more efficient or robust coding of periodic voiced speech, particularly in fluctuating backgrounds. Speech intelligibility was measured in listeners with extensive musical training, and in those with very little musical training or experience, using normal (voiced) or whispered (unvoiced) grammatically correct nonsense sentences in noise that was spectrally shaped to match the long-term spectrum of the speech, and was either continuous or gated with a 16-Hz square wave. Performance was also measured in clinical speech-in-noise tests and in pitch discrimination. Musicians exhibited enhanced pitch discrimination, as expected. However, no systematic or statistically significant advantage for musicians over non-musicians was found in understanding either voiced or whispered sentences in either continuous or gated noise. Musicians also showed no statistically significant advantage in the clinical speech-in-noise tests. Overall, the results provide no evidence for a significant difference between young adult musicians and non-musicians in their ability to understand speech in noise.
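The "gated" masker in this abstract is noise switched on and off by a 16-Hz square wave. A minimal sketch of that gating step is given below; the 50% duty cycle, the 16 kHz sampling rate, the fixed random seed, and the use of white Gaussian noise in place of speech-shaped noise are all assumptions for illustration, and `square_gate` is a hypothetical helper name.

```python
import random


def square_gate(n, fs, rate_hz):
    """On/off gate (1.0 or 0.0) from a square wave with a 50% duty cycle (assumed)."""
    return [1.0 if (i * rate_hz / fs) % 1.0 < 0.5 else 0.0 for i in range(n)]


fs = 16_000        # sampling rate in Hz (assumed)
n = fs             # one second of noise
rng = random.Random(0)

# White Gaussian noise as a stand-in; the study used speech-shaped noise.
noise = [rng.gauss(0.0, 1.0) for _ in range(n)]

gate = square_gate(n, fs, rate_hz=16)
gated_noise = [g * v for g, v in zip(gate, noise)]
```

At these settings each gate cycle lasts 1000 samples, so the noise is on for 500 samples and silent for the next 500.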
Adaptation to Delayed Speech Feedback Induces Temporal Recalibration between Vocal Sensory and Auditory Modalities

Directory of Open Access Journals (Sweden)

Kosuke Yamamoto

2011-10-01

We ordinarily perceive our voice sound as occurring simultaneously with vocal production, but the sense of simultaneity in vocalization can be easily interrupted by delayed auditory feedback (DAF). DAF causes normal people to have difficulty speaking fluently but helps people with stuttering to improve speech fluency. However, the underlying temporal mechanism for integrating the motor production of voice and the auditory perception of vocal sound remains unclear. In this study, we investigated the temporal tuning mechanism integrating vocal sensory and voice sounds under DAF with an adaptation technique. Participants read some sentences with specific delay times of DAF (0, 30, 75, 120 ms) for three minutes to induce ‘Lag Adaptation’. After the adaptation, they then judged the simultaneity between motor sensation and vocal sound given feedback in producing simple voice but not speech. We found that speech production with lag adaptation induced a shift in simultaneity responses toward the adapted auditory delays. This indicates that the temporal tuning mechanism in vocalization can be temporally recalibrated after prolonged exposure to delayed vocal sounds. These findings suggest vocalization is finely tuned by the temporal recalibration mechanism, which acutely monitors the integration of temporal delays between motor sensation and vocal sound.
Effects of spectral complexity and sound duration on automatic complex-sound pitch processing in humans - a mismatch negativity study.

Science.gov (United States)

Tervaniemi, M; Schröger, E; Saher, M; Näätänen, R

2000-08-18

The pitch of a spectrally rich sound is known to be more easily perceived than that of a sinusoidal tone. The present study compared the importance of spectral complexity and sound duration in facilitated pitch discrimination. The mismatch negativity (MMN), which reflects automatic neural discrimination, was recorded to a 2.5% pitch change in pure tones with only one sinusoidal frequency component (500 Hz) and in spectrally rich tones with three (500-1500 Hz) and five (500-2500 Hz) harmonic partials. During the recordings, subjects concentrated on watching a silent movie. In separate blocks, stimuli were of 100 and 250 ms in duration. The MMN amplitude was enhanced with both spectrally rich sounds when compared with pure tones. The prolonged sound duration did not significantly enhance the MMN. This suggests that increased spectral rather than temporal information facilitates pitch processing of spectrally rich sounds.
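The stimulus arithmetic in this abstract can be checked with a short sketch: harmonic partials are integer multiples of the 500 Hz fundamental, and a 2.5% pitch change scales every partial by the same factor. The upward direction of the change and the helper names `partials` and `shifted_f0` are assumptions for illustration; the abstract does not state the direction of the deviant.

```python
def partials(f0, n_harmonics):
    """Frequencies of the first n_harmonics harmonic partials of f0, in Hz."""
    return [f0 * k for k in range(1, n_harmonics + 1)]


def shifted_f0(f0, change_percent):
    """Fundamental after a relative pitch change given in percent."""
    return f0 * (1 + change_percent / 100)


standard_f0 = 500.0
deviant_f0 = shifted_f0(standard_f0, 2.5)   # assumed upward change, about 512.5 Hz

for n_harmonics in (1, 3, 5):
    print(n_harmonics, partials(standard_f0, n_harmonics),
          partials(deviant_f0, n_harmonics))
```

The standard tones reproduce the ranges quoted in the abstract: one partial at 500 Hz, three spanning 500-1500 Hz, and five spanning 500-2500 Hz.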
Ethnic and gender discrimination in the private rental housing market in Finland: A field experiment.

Directory of Open Access Journals (Sweden)

Annamaria Öblom

Ethnic and gender discrimination in a variety of markets has been documented in several populations. We conducted an online field experiment to examine ethnic and gender discrimination in the private rental housing market in Finland. We sent 1459 inquiries regarding 800 apartments. We compared responses to standardized apartment inquiries including fictive Arabic-sounding, Finnish-sounding or Swedish-sounding female or male names. We found evidence of discrimination against Arabic-sounding names and male names. Inquiries including Arabic-sounding male names had the lowest probability of receiving a response, receiving a response to about 16% of the inquiries made, while Finnish-sounding female names received a response to 42% of the inquiries. We did not find any evidence of the landlord's gender being associated with the discrimination pattern. The findings suggest that both ethnic and gender discrimination occur in the private rental housing market in Finland.
Ethnic and gender discrimination in the private rental housing market in Finland: A field experiment.

Science.gov (United States)

Öblom, Annamaria; Antfolk, Jan

2017-01-01

Ethnic and gender discrimination in a variety of markets has been documented in several populations. We conducted an online field experiment to examine ethnic and gender discrimination in the private rental housing market in Finland. We sent 1459 inquiries regarding 800 apartments. We compared responses to standardized apartment inquiries including fictive Arabic-sounding, Finnish-sounding or Swedish-sounding female or male names. We found evidence of discrimination against Arabic-sounding names and male names. Inquiries including Arabic-sounding male names had the lowest probability of receiving a response, receiving a response to about 16% of the inquiries made, while Finnish-sounding female names received a response to 42% of the inquiries. We did not find any evidence of the landlord's gender being associated with the discrimination pattern. The findings suggest that both ethnic and gender discrimination occur in the private rental housing market in Finland.
Emotionally conditioning the target-speech voice enhances recognition of the target speech under "cocktail-party" listening conditions.

Science.gov (United States)

Lu, Lingxi; Bao, Xiaohan; Chen, Jing; Qu, Tianshu; Wu, Xihong; Li, Liang

2018-05-01

Under a noisy "cocktail-party" listening condition with multiple people talking, listeners can use various perceptual/cognitive unmasking cues to improve recognition of the target speech against informational speech-on-speech masking. One potential unmasking cue is the emotion expressed in a speech voice, by means of certain acoustical features. However, it was unclear whether emotionally conditioning a target-speech voice that has none of the typical acoustical features of emotions (i.e., an emotionally neutral voice) can be used by listeners for enhancing target-speech recognition under speech-on-speech masking conditions. In this study we examined the recognition of target speech against a two-talker speech masker both before and after the emotionally neutral target voice was paired with a loud female screaming sound that has a marked negative emotional valence. The results showed that recognition of the target speech (especially the first keyword in a target sentence) was significantly improved by emotionally conditioning the target speaker's voice. Moreover, the emotional unmasking effect was independent of the unmasking effect of the perceived spatial separation between the target speech and the masker. Also, (skin conductance) electrodermal responses became stronger after emotional learning when the target speech and masker were perceptually co-located, suggesting an increase of listening efforts when the target speech was informationally masked. These results indicate that emotionally conditioning the target speaker's voice does not change the acoustical parameters of the target-speech stimuli, but the emotionally conditioned vocal features can be used as cues for unmasking target speech.
A frequency bin-wise nonlinear masking algorithm in convolutive mixtures for speech segregation.

Science.gov (United States)

Chi, Tai-Shih; Huang, Ching-Wen; Chou, Wen-Sheng

2012-05-01

A frequency bin-wise nonlinear masking algorithm is proposed in the spectrogram domain for speech segregation in convolutive mixtures. The contributive weight from each speech source to a time-frequency unit of the mixture spectrogram is estimated by a nonlinear function based on location cues. For each sound source, a non-binary mask is formed from the estimated weights and is multiplied to the mixture spectrogram to extract the sound. Head-related transfer functions (HRTFs) are used to simulate convolutive sound mixtures perceived by listeners. Simulation results show our proposed method outperforms convolutive independent component analysis and degenerate unmixing and estimation technique methods in almost all test conditions.
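The masking step described here, a non-binary mask multiplied with the mixture spectrogram, can be sketched independently of how the weights are estimated. In the example below the per-source scores are made-up numbers standing in for the output of the location-cue-based nonlinear function, which the abstract does not specify; normalizing the scores so that the masks sum to one in each time-frequency unit is also an assumption, and `soft_masks` and `apply_mask` are hypothetical names.

```python
def soft_masks(scores):
    """Turn non-negative scores[s][t][f] into masks in [0, 1] that sum to 1 over sources."""
    n_src, n_t, n_f = len(scores), len(scores[0]), len(scores[0][0])
    masks = [[[0.0] * n_f for _ in range(n_t)] for _ in range(n_src)]
    for t in range(n_t):
        for f in range(n_f):
            total = sum(scores[s][t][f] for s in range(n_src))
            for s in range(n_src):
                # Split evenly when no source has any support in this unit.
                masks[s][t][f] = scores[s][t][f] / total if total > 0 else 1.0 / n_src
    return masks


def apply_mask(mask, mixture):
    """Element-wise product of a mask and a mixture magnitude spectrogram."""
    return [[m * x for m, x in zip(m_row, x_row)]
            for m_row, x_row in zip(mask, mixture)]


# Toy mixture magnitude spectrogram: 2 time frames x 2 frequency bins.
mixture = [[4.0, 2.0],
           [1.0, 8.0]]

# Made-up scores for two sources (placeholders for location-cue estimates).
scores = [
    [[3.0, 1.0], [1.0, 1.0]],   # source 0
    [[1.0, 1.0], [3.0, 3.0]],   # source 1
]

masks = soft_masks(scores)
estimates = [apply_mask(m, mixture) for m in masks]
```

Unlike a binary mask, which assigns each time-frequency unit wholly to one source, these fractional weights let both sources share a unit.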
A comparison of sound quality judgments for monaural and binaural hearing aid processed stimuli.

Science.gov (United States)

Balfour, P B; Hawkins, D B

1992-10-01

Fifteen adults with bilaterally symmetrical mild and/or moderate sensorineural hearing loss completed a paired-comparison task designed to elicit sound quality preference judgments for monaural/binaural hearing aid processed signals. Three stimuli (speech-in-quiet, speech-in-noise, and music) were recorded separately in three listening environments (audiometric test booth, living room, and a music/lecture hall) through hearing aids placed on a Knowles Electronics Manikin for Acoustics Research. Judgments were made on eight separate sound quality dimensions (brightness, clarity, fullness, loudness, nearness, overall impression, smoothness, and spaciousness) for each of the three stimuli in three listening environments. Results revealed a distinct binaural preference for all eight sound quality dimensions independent of listening environment. Binaural preferences were strongest for overall impression, fullness, and spaciousness. Stimulus type effect was significant only for fullness and spaciousness, where binaural preferences were strongest for speech-in-quiet. After binaural preference data were obtained, subjects ranked each sound quality dimension with respect to its importance for binaural listening relative to monaural. Clarity was ranked highest in importance and brightness was ranked least important. The key to demonstration of improved binaural hearing aid sound quality may be the use of a paired-comparison format.
Primary progressive aphasia and apraxia of speech.

Science.gov (United States)

Jung, Youngsin; Duffy, Joseph R; Josephs, Keith A

2013-09-01

Primary progressive aphasia is a neurodegenerative syndrome characterized by progressive language dysfunction. The majority of primary progressive aphasia cases can be classified into three subtypes: nonfluent/agrammatic, semantic, and logopenic variants. Each variant presents with unique clinical features, and is associated with distinctive underlying pathology and neuroimaging findings. Unlike primary progressive aphasia, apraxia of speech is a disorder that involves inaccurate production of sounds secondary to impaired planning or programming of speech movements. Primary progressive apraxia of speech is a neurodegenerative form of apraxia of speech, and it should be distinguished from primary progressive aphasia given its discrete clinicopathological presentation. Recently, there have been substantial advances in our understanding of these speech and language disorders. The clinical, neuroimaging, and histopathological features of primary progressive aphasia and apraxia of speech are reviewed in this article. The distinctions among these disorders for accurate diagnosis are increasingly important from a prognostic and therapeutic standpoint. Thieme Medical Publishers 333 Seventh Avenue, New York, NY 10001, USA.
Speech Understanding with a New Implant Technology: A Comparative Study with a New Nonskin Penetrating Baha System

Directory of Open Access Journals (Sweden)

Anja Kurz

2014-01-01

Objective. To compare hearing and speech understanding between a new, nonskin-penetrating Baha system (Baha Attract) and the current Baha system using a skin-penetrating abutment. Methods. Hearing and speech understanding were measured in 16 experienced Baha users. The transmission path via the abutment was compared to a simulated Baha Attract transmission path by attaching the implantable magnet to the abutment and then by adding a sample of artificial skin and the external parts of the Baha Attract system. Four different measurements were performed: bone conduction thresholds directly through the sound processor (BC Direct), aided sound field thresholds, aided speech understanding in quiet, and aided speech understanding in noise. Results. The simulated Baha Attract transmission path introduced an attenuation starting from approximately 5 dB at 1000 Hz, increasing to 20–25 dB above 6000 Hz. However, aided sound field thresholds showed smaller differences, and aided speech understanding in quiet and in noise did not differ significantly between the two transmission paths. Conclusion. The Baha Attract system transmission path introduces predominantly high-frequency attenuation. This attenuation can be partially compensated by adequate fitting of the speech processor. No significant decrease in speech understanding in either quiet or noise was found.
Musical background not associated with self-perceived hearing performance or speech perception in postlingual cochlear-implant users.

Science.gov (United States)

Fuller, Christina; Free, Rolien; Maat, Bert; Başkent, Deniz

2012-08-01

In normal-hearing listeners, musical background has been observed to change the sound representation in the auditory system and produce enhanced performance in some speech perception tests. Based on these observations, it has been hypothesized that musical background can influence sound and speech perception, and as an extension also the quality of life, by cochlear-implant users. To test this hypothesis, this study explored musical background [using the Dutch Musical Background Questionnaire (DMBQ)], and self-perceived sound and speech perception and quality of life [using the Nijmegen Cochlear Implant Questionnaire (NCIQ) and the Speech Spatial and Qualities of Hearing Scale (SSQ)] in 98 postlingually deafened adult cochlear-implant recipients. In addition to self-perceived measures, speech perception scores (percentage of phonemes recognized in words presented in quiet) were obtained from patient records. The self-perceived hearing performance was associated with the objective speech perception. Forty-one respondents (44% of 94 respondents) indicated some form of formal musical training. Fifteen respondents (18% of 83 respondents) judged themselves as having musical training, experience, and knowledge. No association was observed between musical background (quantified by DMBQ), and self-perceived hearing-related performance or quality of life (quantified by NCIQ and SSQ), or speech perception in quiet.
[Clinical characteristics and speech therapy of lingua-apical articulation disorder].

Science.gov (United States)

Zhang, Feng-hua; Jin, Xing-ming; Zhang, Yi-wen; Wu, Hong; Jiang, Fan; Shen, Xiao-ming

2006-03-01

To explore the clinical characteristics and speech therapy of 62 children with lingua-apical articulation disorder. The Peabody Picture Vocabulary Test (PPVT), Gesell development scales (Gesell), Wechsler Intelligence Scale for Preschool Children (WPPSI) and a speech test were administered to 62 children aged 3 to 8 years with lingua-apical articulation disorder. The PPVT was used to measure receptive vocabulary skills. The Gesell and WPPSI were used to assess cognitive and non-verbal ability. The speech test was used to assess speech development. The children received speech therapy and auxiliary oral-motor functional training once or twice a week. First, the target sound was identified according to the speech development milestone; then the method of speech localization was used to clarify the correct articulation placement and manner. Children with oral motor dysfunction also needed a change in the character of their food and oral-motor functional training. The 62 cases with apical articulation disorder were classified into four groups. The combined pattern of articulation disorder was the most common (40 cases, 64.5%), followed by apico-dental disorder (15 cases, 24.2%), palatal disorder (4 cases, 6.5%) and linguo-alveolar disorder (3 cases, 4.8%). Substitution errors involving velars were the most common (95.2%), followed by omission errors (30.6%) and absence of aspiration (12.9%). Oral motor dysfunction was found in some children, with problems such as disordered joint movement of tongue and head, unstable jaw, weak tongue strength and poor coordination of tongue movement. Some children had feeding problems such as a preference for soft food, keeping food in their mouths, eating slowly, and poor chewing. After 5 to 18 therapy sessions, the effective rate of speech therapy reached 82.3%. The lingua-apical articulation disorders can be classified into four groups. The combined pattern of the
Fine-grained pitch processing of music and speech in congenital amusia.

Science.gov (United States)

Tillmann, Barbara; Rusconi, Elena; Traube, Caroline; Butterworth, Brian; Umiltà, Carlo; Peretz, Isabelle

2011-12-01

Congenital amusia is a lifelong disorder of music processing that has been ascribed to impaired pitch perception and memory. The present study tested a large group of amusics (n=17) and provided evidence that their pitch deficit affects pitch processing in speech to a lesser extent: Fine-grained pitch discrimination was better in spoken syllables than in acoustically matched tones. Unlike amusics, control participants performed fine-grained pitch discrimination better for musical material than for verbal material. These findings suggest that pitch extraction can be influenced by the nature of the material (music vs speech), and that amusics' pitch deficit is not restricted to musical material, but extends to segmented speech events. © 2011 Acoustical Society of America
Analysis, Synthesis, and Perception of Musical Sounds: The Sound of Music

CERN Document Server

Beauchamp, James W

2007-01-01

Analysis, Synthesis, and Perception of Musical Sounds contains a detailed treatment of basic methods for analysis and synthesis of musical sounds, including the phase vocoder method, the McAulay-Quatieri frequency-tracking method, the constant-Q transform, and methods for pitch tracking, with several examples shown. Various aspects of musical sound spectra such as spectral envelope, spectral centroid, spectral flux, and spectral irregularity are defined and discussed. One chapter is devoted to the control and synthesis of spectral envelopes. Two advanced methods of analysis/synthesis are covered: "Sines Plus Transients Plus Noise" and "Spectrotemporal Reassignment". Methods for timbre morphing are given. The last two chapters discuss the perception of musical sounds based on discrimination and multidimensional scaling timbre models.
Unvoiced Speech Recognition Using Tissue-Conductive Acoustic Sensor

Directory of Open Access Journals (Sweden)

Heracleous Panikos

2007-01-01

We present the use of stethoscope and silicon NAM (nonaudible murmur) microphones in automatic speech recognition. NAM microphones are special acoustic sensors, which are attached behind the talker's ear and can capture not only normal (audible) speech, but also very quietly uttered speech (nonaudible murmur). As a result, NAM microphones can be applied in automatic speech recognition systems when privacy is desired in human-machine communication. Moreover, NAM microphones show robustness against noise and they might be used in special systems (speech recognition, speech transform, etc.) for sound-impaired people. Using adaptation techniques and a small amount of training data, we achieved for a 20 k dictation task a word accuracy for nonaudible murmur recognition in a clean environment. In this paper, we also investigate nonaudible murmur recognition in noisy environments and the effect of the Lombard reflex on nonaudible murmur recognition. We also propose three methods to integrate audible speech and nonaudible murmur recognition using a stethoscope NAM microphone with very promising results.
CASRA+: A Colloquial Arabic Speech Recognition Application

OpenAIRE

Ramzi A. Haraty; Omar El Ariss

2007-01-01

The research proposed here was for an Arabic speech recognition application, concentrating on the Lebanese dialect. The system starts by sampling the speech, which is the process of transforming the sound from analog to digital, and then extracts the features using Mel-Frequency Cepstral Coefficients (MFCC). The extracted features are then compared with the system's stored model; in this case the stored model chosen was a phoneme-based model. This reference model differs from the direc...
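As a small illustration of the "Mel" part of the MFCC feature extraction mentioned in this abstract, the sketch below converts between hertz and the mel scale and places filter centre frequencies evenly on the mel axis. It uses one widely used form of the mel formula, m = 2595 log10(1 + f/700); the record does not say which variant, filter count, or frequency range the CASRA+ system uses, so the function names and parameters here are assumptions.

```python
import math


def hz_to_mel(freq_hz):
    """Convert frequency in Hz to mels (common 2595 * log10 form)."""
    return 2595.0 * math.log10(1.0 + freq_hz / 700.0)


def mel_to_hz(mel):
    """Inverse of hz_to_mel."""
    return 700.0 * (10.0 ** (mel / 2595.0) - 1.0)


def mel_spaced_frequencies(low_hz, high_hz, n_points):
    """Return n_points frequencies (Hz) evenly spaced on the mel scale.

    ASSUMPTION: n_points, low_hz and high_hz are illustrative parameters,
    not values taken from the described system.
    """
    if n_points < 2:
        raise ValueError("n_points must be at least 2")
    low_mel = hz_to_mel(low_hz)
    high_mel = hz_to_mel(high_hz)
    step = (high_mel - low_mel) / (n_points - 1)
    return [mel_to_hz(low_mel + i * step) for i in range(n_points)]
```

Spacing the points evenly in mels makes them dense at low frequencies and sparse at high frequencies, which is the property MFCC filterbanks rely on.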
Using therapeutic sound with progressive audiologic tinnitus management.

Science.gov (United States)

Henry, James A; Zaugg, Tara L; Myers, Paula J; Schechter, Martin A

2008-09-01

Management of tinnitus generally involves educational counseling, stress reduction, and/or the use of therapeutic sound. This article focuses on therapeutic sound, which can involve three objectives: (a) producing a sense of relief from tinnitus-associated stress (using soothing sound); (b) passively diverting attention away from tinnitus by reducing contrast between tinnitus and the acoustic environment (using background sound); and (c) actively diverting attention away from tinnitus (using interesting sound). Each of these goals can be accomplished using three different types of sound (broadly categorized as environmental sound, music, and speech), resulting in nine combinations of uses of sound and types of sound to manage tinnitus. The authors explain the uses and types of sound, how they can be combined, and how the different combinations are used with Progressive Audiologic Tinnitus Management. They also describe how sound is used with other sound-based methods of tinnitus management (Tinnitus Masking, Tinnitus Retraining Therapy, and Neuromonics).
Electrophysiological evidence for speech-specific audiovisual integration.

Science.gov (United States)

Baart, Martijn; Stekelenburg, Jeroen J; Vroomen, Jean

2014-01-01

Lip-read speech is integrated with heard speech at various neural levels. Here, we investigated the extent to which lip-read induced modulations of the auditory N1 and P2 (measured with EEG) are indicative of speech-specific audiovisual integration, and we explored to what extent the ERPs were modulated by phonetic audiovisual congruency. In order to disentangle speech-specific (phonetic) integration from non-speech integration, we used Sine-Wave Speech (SWS) that was perceived as speech by half of the participants (they were in speech-mode), while the other half was in non-speech mode. Results showed that the N1 obtained with audiovisual stimuli peaked earlier than the N1 evoked by auditory-only stimuli. This lip-read induced speeding up of the N1 occurred for listeners in speech and non-speech mode. In contrast, if listeners were in speech-mode, lip-read speech also modulated the auditory P2, but not if listeners were in non-speech mode, thus revealing speech-specific audiovisual binding. Comparing ERPs for phonetically congruent audiovisual stimuli with ERPs for incongruent stimuli revealed an effect of phonetic stimulus congruency that started at ~200 ms after (in)congruence became apparent. Critically, akin to the P2 suppression, congruency effects were only observed if listeners were in speech mode, and not if they were in non-speech mode. Using identical stimuli, we thus confirm that audiovisual binding involves (partially) different neural mechanisms for sound processing in speech and non-speech mode. © 2013 Published by Elsevier Ltd.
The pathways for intelligible speech: multivariate and univariate perspectives.

Science.gov (United States)

Evans, S; Kyong, J S; Rosen, S; Golestani, N; Warren, J E; McGettigan, C; Mourão-Miranda, J; Wise, R J S; Scott, S K

2014-09-01

An anterior pathway, concerned with extracting meaning from sound, has been identified in nonhuman primates. An analogous pathway has been suggested in humans, but controversy exists concerning the degree of lateralization and the precise location where responses to intelligible speech emerge. We have demonstrated that the left anterior superior temporal sulcus (STS) responds preferentially to intelligible speech (Scott SK, Blank CC, Rosen S, Wise RJS. 2000. Identification of a pathway for intelligible speech in the left temporal lobe. Brain. 123:2400-2406.). A functional magnetic resonance imaging study in Cerebral Cortex used equivalent stimuli and univariate and multivariate analyses to argue for the greater importance of bilateral posterior when compared with the left anterior STS in responding to intelligible speech (Okada K, Rong F, Venezia J, Matchin W, Hsieh IH, Saberi K, Serences JT, Hickok G. 2010. Hierarchical organization of human auditory cortex: evidence from acoustic invariance in the response to intelligible speech. 20:2486-2495.). Here, we also replicate our original study, demonstrating that the left anterior STS exhibits the strongest univariate response and, in decoding using the bilateral temporal cortex, contains the most informative voxels showing an increased response to intelligible speech. In contrast, in classifications using local "searchlights" and a whole brain analysis, we find greater classification accuracy in posterior rather than anterior temporal regions. Thus, we show that the precise nature of the multivariate analysis used will emphasize different response profiles associated with complex sound to speech processing. © The Author 2013. Published by Oxford University Press.
Predicting Prosody from Text for Text-to-Speech Synthesis

CERN Document Server

Rao, K Sreenivasa

2012-01-01

Predicting Prosody from Text for Text-to-Speech Synthesis covers the specific aspects of prosody, mainly focusing on how to predict the prosodic information from linguistic text, and then how to exploit the predicted prosodic knowledge for various speech applications. Author K. Sreenivasa Rao discusses proposed methods along with state-of-the-art techniques for the acquisition and incorporation of prosodic knowledge for developing speech systems. Positional, contextual and phonological features are proposed for representing the linguistic and production constraints of the sound units present in the text. This book is intended for graduate students and researchers working in the area of speech processing.

Audiovisual speech perception development at varying levels of perceptual processing

OpenAIRE

Lalonde, Kaylah; Holt, Rachael Frush

2016-01-01

This study used the auditory evaluation framework [Erber (1982). Auditory Training (Alexander Graham Bell Association, Washington, DC)] to characterize the influence of visual speech on audiovisual (AV) speech perception in adults and children at multiple levels of perceptual processing. Six- to eight-year-old children and adults completed auditory and AV speech perception tasks at three levels of perceptual processing (detection, discrimination, and recognition). The tasks differed in the le...
Adaptive RD Optimized Hybrid Sound Coding

NARCIS (Netherlands)

Schijndel, N.H. van; Bensa, J.; Christensen, M.G.; Colomes, C.; Edler, B.; Heusdens, R.; Jensen, J.; Jensen, S.H.; Kleijn, W.B.; Kot, V.; Kövesi, B.; Lindblom, J.; Massaloux, D.; Niamut, O.A.; Nordén, F.; Plasberg, J.H.; Vafin, R.; Virette, D.; Wübbolt, O.

2008-01-01

Traditionally, sound codecs have been developed with a particular application in mind, their performance being optimized for specific types of input signals, such as speech or audio (music), and application constraints, such as low bit rate, high quality, or low delay. There is, however, an
Investigation of in-vehicle speech intelligibility metrics for normal hearing and hearing impaired listeners

Science.gov (United States)

Samardzic, Nikolina

The effectiveness of in-vehicle speech communication can be a good indicator of the perception of the overall vehicle quality and customer satisfaction. Currently available speech intelligibility metrics do not account in their procedures for essential parameters needed for a complete and accurate evaluation of in-vehicle speech intelligibility. These include the directivity and the distance of the talker with respect to the listener, binaural listening, hearing profile of the listener, vocal effort, and multisensory hearing. In the first part of this research the effectiveness of in-vehicle application of these metrics is investigated in a series of studies to reveal their shortcomings, including a wide range of scores resulting from each of the metrics for a given measurement configuration and vehicle operating condition. In addition, the nature of a possible correlation between the scores obtained from each metric is unknown. The metrics and the subjective perception of speech intelligibility using, for example, the same speech material have not been compared in the literature. As a result, in the second part of this research, an alternative method for speech intelligibility evaluation is proposed for use in the automotive industry by utilizing a virtual reality driving environment for ultimately setting targets, including the associated statistical variability, for future in-vehicle speech intelligibility evaluation. The Speech Intelligibility Index (SII) was evaluated at the sentence Speech Reception Threshold (sSRT) for various listening situations and hearing profiles using acoustic perception jury testing and a variety of talker and listener configurations and background noise. In addition, the effect of individual sources and transfer paths of sound in an operating vehicle to the vehicle interior sound, specifically their effect on speech intelligibility was quantified, in the framework of the newly developed speech intelligibility evaluation method. Lastly
A Diagnostic Marker to Discriminate Childhood Apraxia of Speech from Speech Delay: III. Theoretical Coherence of the Pause Marker with Speech Processing Deficits in Childhood Apraxia of Speech

Science.gov (United States)

Shriberg, Lawrence D.; Strand, Edythe A.; Fourakis, Marios; Jakielski, Kathy J.; Hall, Sheryl D.; Karlsson, Heather B.; Mabie, Heather L.; McSweeny, Jane L.; Tilkens, Christie M.; Wilson, David L.

2017-01-01

Purpose: Previous articles in this supplement described rationale for and development of the pause marker (PM), a diagnostic marker of childhood apraxia of speech (CAS), and studies supporting its validity and reliability. The present article assesses the theoretical coherence of the PM with speech processing deficits in CAS. Method: PM and other…
Early Morphology and Recurring Sound Patterns

DEFF Research Database (Denmark)

Kjærbæk, Laila; Basbøll, Hans; Lambertsen, Claus

Corpus is a longitudinal corpus of spontaneous Child Speech and Child Directed Speech recorded in the children's homes in interaction with their parents or caretaker and transcribed in CHILDES (MacWhinney 2007 a, b), supplemented by parts of Kim Plunkett's Danish corpus (CHILDES) (Plunkett 1985, 1986...... in creating the typologically characteristic syllable structure of Danish with extreme sound reductions (Rischel 2003, Basbøll 2005) presenting a challenge to the language acquiring child (Bleses & Basbøll 2004). Building upon the Danish CDI-studies as well as on the Odense Twin Corpus and experimental data...
Timing in audiovisual speech perception: A mini review and new psychophysical data.

Science.gov (United States)

Venezia, Jonathan H; Thurman, Steven M; Matchin, William; George, Sahara E; Hickok, Gregory

2016-02-01

Recent influential models of audiovisual speech perception suggest that visual speech aids perception by generating predictions about the identity of upcoming speech sounds. These models place stock in the assumption that visual speech leads auditory speech in time. However, it is unclear whether and to what extent temporally-leading visual speech information contributes to perception. Previous studies exploring audiovisual-speech timing have relied upon psychophysical procedures that require artificial manipulation of cross-modal alignment or stimulus duration. We introduce a classification procedure that tracks perceptually relevant visual speech information in time without requiring such manipulations. Participants were shown videos of a McGurk syllable (auditory /apa/ + visual /aka/ = perceptual /ata/) and asked to perform phoneme identification (/apa/ yes-no). The mouth region of the visual stimulus was overlaid with a dynamic transparency mask that obscured visual speech in some frames but not others randomly across trials. Variability in participants' responses (~35 % identification of /apa/ compared to ~5 % in the absence of the masker) served as the basis for classification analysis. The outcome was a high resolution spatiotemporal map of perceptually relevant visual features. We produced these maps for McGurk stimuli at different audiovisual temporal offsets (natural timing, 50-ms visual lead, and 100-ms visual lead). Briefly, temporally-leading (~130 ms) visual information did influence auditory perception. Moreover, several visual features influenced perception of a single speech sound, with the relative influence of each feature depending on both its temporal relation to the auditory signal and its informational content.
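The classification procedure in this abstract relates trial-by-trial random masking to listeners' yes/no responses. A heavily simplified, time-only sketch of that idea is given below: for each video frame it compares how often the frame was visible on "yes" trials versus "no" trials. This difference-of-means form is a generic reverse-correlation estimate chosen for illustration; the study's actual spatiotemporal analysis is not specified in the abstract, and the function name and data layout are assumptions.

```python
def classification_timecourse(masks, responses):
    """Per-frame difference in visibility between 'yes' and 'no' trials.

    masks: one list per trial, with 1 where the frame was visible
           and 0 where it was obscured (hypothetical layout).
    responses: one bool per trial (True = 'yes' response).
    Positive values mark frames whose visibility goes with 'yes' responses,
    negative values mark frames whose visibility goes with 'no' responses.
    """
    if len(masks) != len(responses):
        raise ValueError("masks and responses must have the same length")
    yes = [m for m, r in zip(masks, responses) if r]
    no = [m for m, r in zip(masks, responses) if not r]
    if not yes or not no:
        raise ValueError("need at least one trial of each response type")
    n_frames = len(masks[0])

    def mean_per_frame(trials):
        return [sum(t[i] for t in trials) / len(trials) for i in range(n_frames)]

    yes_mean = mean_per_frame(yes)
    no_mean = mean_per_frame(no)
    return [y - n for y, n in zip(yes_mean, no_mean)]
```

With enough trials, frames that carry no perceptually relevant information should produce values near zero, because their visibility is unrelated to the response.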
Timing in Audiovisual Speech Perception: A Mini Review and New Psychophysical Data

Science.gov (United States)

Venezia, Jonathan H.; Thurman, Steven M.; Matchin, William; George, Sahara E.; Hickok, Gregory

2015-01-01

Recent influential models of audiovisual speech perception suggest that visual speech aids perception by generating predictions about the identity of upcoming speech sounds. These models place stock in the assumption that visual speech leads auditory speech in time. However, it is unclear whether and to what extent temporally-leading visual speech information contributes to perception. Previous studies exploring audiovisual-speech timing have relied upon psychophysical procedures that require artificial manipulation of cross-modal alignment or stimulus duration. We introduce a classification procedure that tracks perceptually-relevant visual speech information in time without requiring such manipulations. Participants were shown videos of a McGurk syllable (auditory /apa/ + visual /aka/ = perceptual /ata/) and asked to perform phoneme identification (/apa/ yes-no). The mouth region of the visual stimulus was overlaid with a dynamic transparency mask that obscured visual speech in some frames but not others randomly across trials. Variability in participants' responses (∼35% identification of /apa/ compared to ∼5% in the absence of the masker) served as the basis for classification analysis. The outcome was a high resolution spatiotemporal map of perceptually-relevant visual features. We produced these maps for McGurk stimuli at different audiovisual temporal offsets (natural timing, 50-ms visual lead, and 100-ms visual lead). Briefly, temporally-leading (∼130 ms) visual information did influence auditory perception. Moreover, several visual features influenced perception of a single speech sound, with the relative influence of each feature depending on both its temporal relation to the auditory signal and its informational content. PMID:26669309
Speech perception benefits of FM and infrared devices to children with hearing aids in a typical classroom.

Science.gov (United States)

Anderson, Karen L; Goldstein, Howard

2004-04-01

Children typically learn in classroom environments that have background noise and reverberation that interfere with accurate speech perception. Amplification technology can enhance the speech perception of students who are hard of hearing. This study used a single-subject alternating treatments design to compare the speech recognition abilities of children who are hard of hearing when they were using hearing aids with each of three frequency modulated (FM) or infrared devices. Eight 9-12-year-olds with mild to severe hearing loss repeated Hearing in Noise Test (HINT) sentence lists under controlled conditions in a typical kindergarten classroom with a background noise level of +10 dB signal-to-noise (S/N) ratio and 1.1 s reverberation time. Participants listened to HINT lists using hearing aids alone and hearing aids in combination with three types of S/N-enhancing devices that are currently used in mainstream classrooms: (a) FM systems linked to personal hearing aids, (b) infrared sound field systems with speakers placed throughout the classroom, and (c) desktop personal sound field FM systems. The infrared ceiling sound field system did not provide benefit beyond that provided by hearing aids alone. Desktop and personal FM systems in combination with personal hearing aids provided substantial improvements in speech recognition. This information can assist in making S/N-enhancing device decisions for students using hearing aids. In a reverberant and noisy classroom setting, classroom sound field devices are not beneficial to speech perception for students with hearing aids, whereas either personal FM or desktop sound field systems provide listening benefits.
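For readers unfamiliar with the decibel figure quoted in this abstract, the short sketch below shows the standard relation between a signal-to-noise ratio in dB and the underlying power ratio, S/N (dB) = 10 log10(P_signal / P_noise). The function names are illustrative and the study's own measurement procedure is not described in the abstract.

```python
import math


def snr_db(signal_power, noise_power):
    """Signal-to-noise ratio in dB from two power values (same units)."""
    if signal_power <= 0 or noise_power <= 0:
        raise ValueError("powers must be positive")
    return 10.0 * math.log10(signal_power / noise_power)


def power_ratio_from_db(db):
    """Power ratio corresponding to a level difference in dB."""
    return 10.0 ** (db / 10.0)
```

Under this definition, a +10 dB S/N ratio means the speech carries ten times the power of the background noise.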
Cochlear neuropathy and the coding of supra-threshold sound.

Science.gov (United States)

Bharadwaj, Hari M; Verhulst, Sarah; Shaheen, Luke; Liberman, M Charles; Shinn-Cunningham, Barbara G

2014-01-01

Many listeners with hearing thresholds within the clinically normal range nonetheless complain of difficulty hearing in everyday settings and understanding speech in noise. Converging evidence from human and animal studies points to one potential source of such difficulties: differences in the fidelity with which supra-threshold sound is encoded in the early portions of the auditory pathway. Measures of auditory subcortical steady-state responses (SSSRs) in humans and animals support the idea that the temporal precision of the early auditory representation can be poor even when hearing thresholds are normal. In humans with normal hearing thresholds (NHTs), paradigms that require listeners to make use of the detailed spectro-temporal structure of supra-threshold sound, such as selective attention and discrimination of frequency modulation (FM), reveal individual differences that correlate with subcortical temporal coding precision. Animal studies show that noise exposure and aging can cause a loss of a large percentage of auditory nerve fibers (ANFs) without any significant change in measured audiograms. Here, we argue that cochlear neuropathy may reduce encoding precision of supra-threshold sound, and that this manifests both behaviorally and in SSSRs in humans. Furthermore, recent studies suggest that noise-induced neuropathy may be selective for higher-threshold, lower-spontaneous-rate nerve fibers. Based on our hypothesis, we suggest some approaches that may yield particularly sensitive, objective measures of supra-threshold coding deficits that arise due to neuropathy. Finally, we comment on the potential clinical significance of these ideas and identify areas for future investigation.
Cochlear Neuropathy and the Coding of Supra-threshold Sound

Directory of Open Access Journals (Sweden)

Hari M Bharadwaj

2014-02-01

Many listeners with hearing thresholds within the clinically normal range nonetheless complain of difficulty hearing in everyday settings and understanding speech in noise. Converging evidence from human and animal studies points to one potential source of such difficulties: differences in the fidelity with which supra-threshold sound is encoded in the early portions of the auditory pathway. Measures of auditory subcortical steady-state responses in humans and animals support the idea that the temporal precision of the early auditory representation can be poor even when hearing thresholds are normal. In humans with normal hearing thresholds, behavioral ability in paradigms that require listeners to make use of the detailed spectro-temporal structure of supra-threshold sound, such as selective attention and discrimination of frequency modulation, correlates with subcortical temporal coding precision. Animal studies show that noise exposure and aging can cause a loss of a large percentage of auditory nerve fibers without any significant change in measured audiograms. Here, we argue that cochlear neuropathy may reduce encoding precision of supra-threshold sound, and that this manifests both behaviorally and in subcortical steady-state responses in humans. Furthermore, recent studies suggest that noise-induced neuropathy may be selective for higher-threshold, lower-spontaneous-rate nerve fibers. Based on our hypothesis, we suggest some approaches that may yield particularly sensitive, objective measures of supra-threshold coding deficits that arise due to neuropathy. Finally, we comment on the potential clinical significance of these ideas and identify areas for future investigation.
The influence of spectral characteristics of early reflections on speech intelligibility

DEFF Research Database (Denmark)

Arweiler, Iris; Buchholz, Jörg

2011-01-01

The auditory system takes advantage of early reflections (ERs) in a room by integrating them with the direct sound (DS) and thereby increasing the effective speech level. In the present paper the benefit from realistic ERs on speech intelligibility in diffuse speech-shaped noise was investigated...... ascribed to their altered spectrum compared to the DS and to the filtering by the torso, head, and pinna. No binaural processing other than a binaural summation effect could be observed....
The effect of instantaneous input dynamic range setting on the speech perception of children with the nucleus 24 implant.

Science.gov (United States)

Davidson, Lisa S; Skinner, Margaret W; Holstad, Beth A; Fears, Beverly T; Richter, Marie K; Matusofsky, Margaret; Brenner, Christine; Holden, Timothy; Birath, Amy; Kettel, Jerrica L; Scollie, Susan

2009-06-01

The purpose of this study was to examine the effects of a wider instantaneous input dynamic range (IIDR) setting on speech perception and comfort in quiet and noise for children wearing the Nucleus 24 implant system and the Freedom speech processor. In addition, children's ability to understand soft and conversational level speech in relation to aided sound-field thresholds was examined. Thirty children (age, 7 to 17 years) with the Nucleus 24 cochlear implant system and the Freedom speech processor with two different IIDR settings (30 versus 40 dB) were tested on the Consonant Nucleus Consonant (CNC) word test at 50 and 60 dB SPL, the Bamford-Kowal-Bench Speech in Noise Test, and a loudness rating task for four-talker speech noise. Aided thresholds for frequency-modulated tones, narrowband noise, and recorded Ling sounds were obtained with the two IIDRs and examined in relation to CNC scores at 50 dB SPL. Speech Intelligibility Indices were calculated using the long-term average speech spectrum of the CNC words at 50 dB SPL measured at each test site and aided thresholds. Group mean CNC scores at 50 dB SPL with the 40 IIDR were significantly higher (p Speech in Noise Test were not significantly different for the two IIDRs. Significantly improved aided thresholds at 250 to 6000 Hz as well as higher Speech Intelligibility Indices afforded improved audibility for speech presented at soft levels (50 dB SPL). These results indicate that an increased IIDR provides improved word recognition for soft levels of speech without compromising comfort of higher levels of speech sounds or sentence recognition in noise.
SUSTAINABILITY IN THE BOWELS OF SPEECHES

Directory of Open Access Journals (Sweden)

Jadir Mauro Galvao

2012-10-01

The theme of sustainability has not yet become an integral part of the theoretical framework that guides our everyday actions, although it often crosses our thoughts and permeates many of our speeches. The big event of 2012, the Rio+20 meeting, drew attention from all corners of the planet to this burning theme, but progress is still timid. Although it is not very clear to us what the term sustainability encompasses, it does not sound entirely strange. We associate it with things like ecology, the planet, waste emitted by factory smokestacks, deforestation, recycling, and global warming, but our goal in this article is less to clarify the term conceptually than to observe how it appears in the speeches given at that conference. When the competent authorities talk about sustainability, what do they relate it to? We intend to investigate, in the lines and between the lines of these speeches, the assumptions associated with the term. To that end, we analyze the speech of the People's Summit, the opening speech of President Dilma, and the emblematic speech of the President of Uruguay, José Pepe Mujica.
Speech cues contribute to audiovisual spatial integration.

Directory of Open Access Journals (Sweden)

Christopher W Bishop

Speech is the most important form of human communication, but ambient sounds and competing talkers often degrade its acoustics. Fortunately, the brain can use visual information, especially its highly precise spatial information, to improve speech comprehension in noisy environments. Previous studies have demonstrated that audiovisual integration depends strongly on spatiotemporal factors. However, some integrative phenomena such as McGurk interference persist even with gross spatial disparities, suggesting that spatial alignment is not necessary for robust integration of audiovisual place-of-articulation cues. It is therefore unclear how speech cues interact with audiovisual spatial integration mechanisms. Here, we combine two well-established psychophysical phenomena, the McGurk effect and the ventriloquist's illusion, to explore this dependency. Our results demonstrate that conflicting spatial cues may not interfere with audiovisual integration of speech, but conflicting speech cues can impede integration in space. This suggests a direct but asymmetrical influence between ventral 'what' and dorsal 'where' pathways.
Brain Plasticity in Speech Training in Native English Speakers Learning Mandarin Tones

Science.gov (United States)

Heinzen, Christina Carolyn

The current study employed behavioral and event-related potential (ERP) measures to investigate brain plasticity associated with second-language (L2) phonetic learning based on an adaptive computer training program. The program utilized the acoustic characteristics of Infant-Directed Speech (IDS) to train monolingual American English-speaking listeners to perceive Mandarin lexical tones. Behavioral identification and discrimination tasks were conducted using naturally recorded speech, carefully controlled synthetic speech, and non-speech control stimuli. The ERP experiments were conducted with selected synthetic speech stimuli in a passive listening oddball paradigm. Identical pre- and post- tests were administered on nine adult listeners, who completed two-to-three hours of perceptual training. The perceptual training sessions used pair-wise lexical tone identification, and progressed through seven levels of difficulty for each tone pair. The levels of difficulty included progression in speaker variability from one to four speakers and progression through four levels of acoustic exaggeration of duration, pitch range, and pitch contour. Behavioral results for the natural speech stimuli revealed significant training-induced improvement in identification of Tones 1, 3, and 4. Improvements in identification of Tone 4 generalized to novel stimuli as well. Additionally, comparison between discrimination of across-category and within-category stimulus pairs taken from a synthetic continuum revealed a training-induced shift toward more native-like categorical perception of the Mandarin lexical tones. Analysis of the Mismatch Negativity (MMN) responses in the ERP data revealed increased amplitude and decreased latency for pre-attentive processing of across-category discrimination as a result of training. There were also laterality changes in the MMN responses to the non-speech control stimuli, which could reflect reallocation of brain resources in processing pitch patterns
The role of reverberation-related binaural cues in the externalization of speech.

Science.gov (United States)

Catic, Jasmina; Santurette, Sébastien; Dau, Torsten

2015-08-01

The perception of externalization of speech sounds was investigated with respect to the monaural and binaural cues available at the listeners' ears in a reverberant environment. Individualized binaural room impulse responses (BRIRs) were used to simulate externalized sound sources via headphones. The measured BRIRs were subsequently modified such that the proportion of the response containing binaural vs monaural information was varied. Normal-hearing listeners were presented with speech sounds convolved with such modified BRIRs. Monaural reverberation cues were found to be sufficient for the externalization of a lateral sound source. In contrast, for a frontal source, an increased amount of binaural cues from reflections was required in order to obtain well externalized sound images. It was demonstrated that the interaction between the interaural cues of the direct sound and the reverberation strongly affects the perception of externalization. An analysis of the short-term binaural cues showed that the amount of fluctuations of the binaural cues corresponded well to the externalization ratings obtained in the listening tests. The results further suggested that the precedence effect is involved in the auditory processing of the dynamic binaural cues that are utilized for externalization perception.
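As a rough illustration of the headphone simulation described in this abstract, the sketch below convolves a mono signal with a two-channel impulse response to obtain left-ear and right-ear signals. The function names and the toy impulse responses are assumptions made for illustration only; the study used measured, individualized BRIRs, and its processing chain is not specified here.

```python
def convolve(signal, impulse_response):
    """Direct-form linear convolution of two sample sequences."""
    out = [0.0] * (len(signal) + len(impulse_response) - 1)
    for i, s in enumerate(signal):
        for j, h in enumerate(impulse_response):
            out[i + j] += s * h
    return out


def binauralize(mono, brir_left, brir_right):
    """Return (left, right) ear signals for headphone playback."""
    return convolve(mono, brir_left), convolve(mono, brir_right)


# Toy example (hypothetical values): the right ear receives the sound
# one sample later and attenuated, as for a source on the listener's left.
mono = [1.0, 0.5, 0.25]
left, right = binauralize(mono, [1.0, 0.0], [0.0, 0.6])
```

A practical implementation would more likely use an FFT-based convolution, since real BRIRs span thousands of samples.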
Auditory-neurophysiological responses to speech during early childhood: Effects of background noise.

Science.gov (United States)

White-Schwoch, Travis; Davies, Evan C; Thompson, Elaine C; Woodruff Carr, Kali; Nicol, Trent; Bradlow, Ann R; Kraus, Nina

2015-10-01

Early childhood is a critical period of auditory learning, during which children are constantly mapping sounds to meaning. But this auditory learning rarely occurs in ideal listening conditions-children are forced to listen against a relentless din. This background noise degrades the neural coding of these critical sounds, in turn interfering with auditory learning. Despite the importance of robust and reliable auditory processing during early childhood, little is known about the neurophysiology underlying speech processing in children so young. To better understand the physiological constraints these adverse listening scenarios impose on speech sound coding during early childhood, auditory-neurophysiological responses were elicited to a consonant-vowel syllable in quiet and background noise in a cohort of typically-developing preschoolers (ages 3-5 yr). Overall, responses were degraded in noise: they were smaller, less stable across trials, slower, and there was poorer coding of spectral content and the temporal envelope. These effects were exacerbated in response to the consonant transition relative to the vowel, suggesting that the neural coding of spectrotemporally-dynamic speech features is more tenuous in noise than the coding of static features-even in children this young. Neural coding of speech temporal fine structure, however, was more resilient to the addition of background noise than coding of temporal envelope information. Taken together, these results demonstrate that noise places a neurophysiological constraint on speech processing during early childhood by causing a breakdown in neural processing of speech acoustics. These results may explain why some listeners have inordinate difficulties understanding speech in noise. Speech-elicited auditory-neurophysiological responses offer objective insight into listening skills during early childhood by reflecting the integrity of neural coding in quiet and noise; this paper documents typical response
Sounds in context

DEFF Research Database (Denmark)

Weed, Ethan

A sound is never just a sound. It is becoming increasingly clear that auditory processing is best thought of not as a one-way afferent stream, but rather as an ongoing interaction between interior processes and the environment. Even the earliest stages of auditory processing in the nervous system...... time-course of contextual influence on auditory processing in three different paradigms: a simple mismatch negativity paradigm with tones of differing pitch, a multi-feature mismatch negativity paradigm in which tones were embedded in a complex musical context, and a cross-modal paradigm, in which...... auditory processing of emotional speech was modulated by an accompanying visual context. I then discuss these results in terms of their implication for how we conceive of the auditory processing stream....
Unconscious improvement in foreign language learning using mismatch negativity neurofeedback: A preliminary study.

Directory of Open Access Journals (Sweden)

Ming Chang

When people learn foreign languages, they find it difficult to perceive speech sounds that are nonexistent in their native language, and extensive training is consequently necessary. Our previous studies have shown that by using neurofeedback based on the mismatch negativity event-related brain potential, participants could unconsciously achieve learning in the auditory discrimination of pure tones that could not be consciously discriminated without the neurofeedback. Here, we examined whether mismatch negativity neurofeedback is effective for helping someone to perceive new speech sounds in foreign language learning. We developed a task for training native Japanese speakers to discriminate between 'l' and 'r' sounds in English, as they usually cannot discriminate between these two sounds. Without participants attending to auditory stimuli or being aware of the nature of the experiment, neurofeedback training helped them to achieve significant improvement in unconscious auditory discrimination and recognition of the target words 'light' and 'right'. There was also improvement in the recognition of other words containing 'l' and 'r' (e.g., 'blight' and 'bright'), even though these words had not been presented during training. This method could be used to facilitate foreign language learning and can be extended to other fields of auditory and clinical research and even other senses.
Swahili speech development: preliminary normative data from typically developing pre-school children in Tanzania.

Science.gov (United States)

Gangji, Nazneen; Pascoe, Michelle; Smouse, Mantoa

2015-01-01

Swahili is widely spoken in East Africa, but to date there are no culturally and linguistically appropriate materials available for speech-language therapists working in the region. The challenges are further exacerbated by the limited research available on the typical acquisition of Swahili phonology. To describe the speech development of 24 typically developing first language Swahili-speaking children between the ages of 3;0 and 5;11 years in Dar es Salaam, Tanzania. A cross-sectional design was used with six groups of four children in 6-month age bands. Single-word speech samples were obtained from each child using a set of culturally appropriate pictures designed to elicit all consonants and vowels of Swahili. Each child's speech was audio-recorded and phonetically transcribed using International Phonetic Alphabet (IPA) conventions. Children's speech development is described in terms of (1) phonetic inventory, (2) syllable structure inventory, (3) phonological processes and (4) percentage consonants correct (PCC) and percentage vowels correct (PVC). Results suggest a gradual progression in the acquisition of speech sounds and syllables between the ages of 3;0 and 5;11 years. Vowel acquisition was completed and most of the consonants acquired by age 3;0. Fricatives /z, s, h/ were later acquired at 4 years and /θ/ and /r/ were the last acquired consonants at age 5;11. Older children were able to produce speech sounds more accurately and had fewer phonological processes in their speech than younger children. Common phonological processes included lateralization and sound preference substitutions. The study contributes a preliminary set of normative data on speech development of Swahili-speaking children. Findings are discussed in relation to theories of phonological development, and may be used as a basis for further normative studies with larger numbers of children and ultimately the development of a contextually relevant assessment of the phonology of Swahili
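The PCC and PVC measures named in this abstract are, in their usual form, the share of target segments produced correctly, expressed as a percentage. The sketch below shows that arithmetic under simplifying assumptions: the transcriptions are already aligned position by position, and the function name and sample data are hypothetical. The study's exact scoring rules are not given in the abstract.

```python
def percent_correct(target, produced):
    """Percentage of target segments produced correctly.

    `target` and `produced` are equal-length lists of phone symbols
    aligned position by position; None can mark an omitted segment.
    """
    if len(target) != len(produced):
        raise ValueError("transcriptions must be aligned")
    if not target:
        raise ValueError("no segments to score")
    correct = sum(t == p for t, p in zip(target, produced))
    return 100.0 * correct / len(target)


# Hypothetical aligned consonants from one child's word sample:
# /r/ is produced as [l] and /z/ as [s], so 6 of 8 targets are correct.
target_consonants = ["k", "t", "b", "r", "s", "z", "m", "n"]
produced_consonants = ["k", "t", "b", "l", "s", "s", "m", "n"]
pcc = percent_correct(target_consonants, produced_consonants)
```

The same function gives PVC when it is applied to the aligned vowel segments instead.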

Unvoiced Speech Recognition Using Tissue-Conductive Acoustic Sensor

Directory of Open Access Journals (Sweden)

Hiroshi Saruwatari

2007-01-01

We present the use of stethoscope and silicon NAM (nonaudible murmur) microphones in automatic speech recognition. NAM microphones are special acoustic sensors, which are attached behind the talker's ear and can capture not only normal (audible) speech, but also very quietly uttered speech (nonaudible murmur). As a result, NAM microphones can be applied in automatic speech recognition systems when privacy is desired in human-machine communication. Moreover, NAM microphones show robustness against noise and they might be used in special systems (speech recognition, speech transform, etc.) for sound-impaired people. Using adaptation techniques and a small amount of training data, we achieved for a 20 k dictation task a 93.9% word accuracy for nonaudible murmur recognition in a clean environment. In this paper, we also investigate nonaudible murmur recognition in noisy environments and the effect of the Lombard reflex on nonaudible murmur recognition. We also propose three methods to integrate audible speech and nonaudible murmur recognition using a stethoscope NAM microphone with very promising results.
Algorithmic modeling of the irrelevant sound effect (ISE) by the hearing sensation fluctuation strength.

Science.gov (United States)

Schlittmeier, Sabine J; Weissgerber, Tobias; Kerber, Stefan; Fastl, Hugo; Hellbrück, Jürgen

2012-01-01

Background sounds, such as narration, music with prominent staccato passages, and office noise impair verbal short-term memory even when these sounds are irrelevant. This irrelevant sound effect (ISE) is evoked by so-called changing-state sounds that are characterized by a distinct temporal structure with varying successive auditory-perceptive tokens. However, because of the absence of an appropriate psychoacoustically based instrumental measure, the disturbing impact of a given speech or nonspeech sound could not be predicted until now, but necessitated behavioral testing. Our database for parametric modeling of the ISE included approximately 40 background sounds (e.g., speech, music, tone sequences, office noise, traffic noise) and corresponding performance data that was collected from 70 behavioral measurements of verbal short-term memory. The hearing sensation fluctuation strength was chosen to model the ISE and describes the percept of fluctuations when listening to slowly modulated sounds (f(mod) background sounds, the algorithm estimated behavioral performance data in 63 of 70 cases within the interquartile ranges. In particular, all real-world sounds were modeled adequately, whereas the algorithm overestimated the (non-)disturbance impact of synthetic steady-state sounds that were constituted by a repeated vowel or tone. Implications of the algorithm's strengths and prediction errors are discussed.
Recognizing speech in a novel accent: the motor theory of speech perception reframed.

Science.gov (United States)

Moulin-Frier, Clément; Arbib, Michael A

2013-08-01

The motor theory of speech perception holds that we perceive the speech of another in terms of a motor representation of that speech. However, when we have learned to recognize a foreign accent, it seems plausible that recognition of a word rarely involves reconstruction of the speech gestures of the speaker rather than the listener. To better assess the motor theory and this observation, we proceed in three stages. Part 1 places the motor theory of speech perception in a larger framework based on our earlier models of the adaptive formation of mirror neurons for grasping, and for viewing extensions of that mirror system as part of a larger system for neuro-linguistic processing, augmented by the present consideration of recognizing speech in a novel accent. Part 2 then offers a novel computational model of how a listener comes to understand the speech of someone speaking the listener's native language with a foreign accent. The core tenet of the model is that the listener uses hypotheses about the word the speaker is currently uttering to update probabilities linking the sound produced by the speaker to phonemes in the native language repertoire of the listener. This, on average, improves the recognition of later words. This model is neutral regarding the nature of the representations it uses (motor vs. auditory). It serves as a reference point for the discussion in Part 3, which proposes a dual-stream neuro-linguistic architecture to revisit claims for and against the motor theory of speech perception and the relevance of mirror neurons, and extracts some implications for the reframing of the motor theory.
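The core tenet stated in this abstract, using a word hypothesis to update the probabilities that link heard sounds to native phonemes, can be illustrated with a very small count-based sketch. This is not the authors' model: the class name, the one-to-one alignment of sounds to phonemes, the smoothing prior, and the toy symbols are all assumptions chosen to keep the example short.

```python
class AccentAdapter:
    """Toy listener that learns which accented sounds realize which phonemes."""

    def __init__(self, phonemes, sounds, prior=1.0):
        # counts[phoneme][sound] starts at a uniform pseudo-count (assumption).
        self.counts = {p: {s: prior for s in sounds} for p in phonemes}

    def likelihood(self, sound, phoneme):
        """Estimate of P(sound | phoneme) from the accumulated counts."""
        row = self.counts[phoneme]
        return row[sound] / sum(row.values())

    def update(self, heard_sounds, hypothesized_phonemes, confidence=1.0):
        """Credit each heard sound to the phoneme the word hypothesis aligns it with."""
        for sound, phoneme in zip(heard_sounds, hypothesized_phonemes):
            self.counts[phoneme][sound] += confidence


# Hypothetical case: the speaker says "zis" and the listener guesses "this".
adapter = AccentAdapter(phonemes=["th", "i", "s"], sounds=["th", "z", "i", "s"])
before = adapter.likelihood("z", "th")   # uniform prior over four sounds
adapter.update(["z", "i", "s"], ["th", "i", "s"])
after = adapter.likelihood("z", "th")    # raised by the confirmed hypothesis
```

After the update, a later word containing the same accented sound is scored as a more likely realization of the native phoneme, which is the sense in which recognition of later words improves on average.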
Result on speech perception after conversion from Spectra® to Freedom®.

Science.gov (United States)

Magalhães, Ana Tereza de Matos; Goffi-Gomez, Maria Valéria Schmidt; Hoshino, Ana Cristina; Tsuji, Robinson Koji; Bento, Ricardo Ferreira; Brito, Rubens

2012-04-01

New technology in the Freedom® speech processor for cochlear implants was developed to improve how incoming acoustic sound is processed; this applies not only to new users, but also to previous generations of cochlear implants. To identify the contribution of this technology for the Nucleus 22® on speech perception tests in silence and in noise, and on audiometric thresholds. A cross-sectional cohort study was undertaken. Seventeen patients were selected. The last map based on the Spectra® was revised and optimized before starting the tests. Troubleshooting was used to identify malfunction. To identify the contribution of the Freedom® technology for the Nucleus 22®, auditory thresholds and speech perception tests were performed in free field in sound-proof booths. Recorded monosyllables and sentences in silence and in noise (SNR = 0 dB) were presented at 60 dB SPL. The nonparametric Wilcoxon test for paired data was used to compare groups. Freedom® applied for the Nucleus 22® showed a statistically significant difference in all speech perception tests and audiometric thresholds. The Freedom® technology improved the performance of speech perception and audiometric thresholds of patients with Nucleus 22®.
Volubility, consonant, and syllable characteristics in infants and toddlers later diagnosed with childhood apraxia of speech: A pilot study.

Science.gov (United States)

Overby, Megan; Caspari, Susan S

2015-01-01

This pilot study explored the volubility, consonant singleton acquisition, and syllable structure development between infants and toddlers (birth-24 months) with typical speech sound production (TYP) and those later diagnosed with childhood apraxia of speech (CAS). A retrospective longitudinal between- and within-subjects research design was utilized (TYP N=2; CAS N=4). Vocalizations from participants were analyzed between birth-24 months from home videotapes, volunteered by the children's parents, according to type (nonresonant vs. resonant), volubility, place and manner of consonant singletons, and syllable shape (V, CV, VC, CVC, VCV, CVCV, VCVC, and "Other"). Volubility between groups was not significant but statistically significant differences were found in the number of: resonant and non-resonant productions; different consonant singletons; different place features; different manner classes; and proportional use of fricative, glottal, and voiceless phones. Infants and toddlers in the CAS group also demonstrated difficulty with CVCs, had limited syllable shapes, and possible regression of vowel syllable structure. Data corroborate parent reports that infants and toddlers later diagnosed with CAS present differently than do those with typical speech sound skills. Additional study with infants and toddlers later diagnosed with non-CAS speech sound disorder is needed. Readers will: (1) describe current perspectives on volubility of infants and toddlers later diagnosed with CAS; (2) describe current perspectives of the consonant singleton and syllable shape inventories of infants and toddlers later diagnosed with CAS; and (3) discuss the potential differences between the speech sound development of infants and toddlers later diagnosed with CAS and those with typical speech sound skill. Copyright © 2015 Elsevier Inc. All rights reserved.
Human Superior Temporal Gyrus Organization of Spectrotemporal Modulation Tuning Derived from Speech Stimuli.

Science.gov (United States)

Hullett, Patrick W; Hamilton, Liberty S; Mesgarani, Nima; Schreiner, Christoph E; Chang, Edward F

2016-02-10

The human superior temporal gyrus (STG) is critical for speech perception, yet the organization of spectrotemporal processing of speech within the STG is not well understood. Here, to characterize the spatial organization of spectrotemporal processing of speech across human STG, we use high-density cortical surface field potential recordings while participants listened to natural continuous speech. While synthetic broad-band stimuli did not yield sustained activation of the STG, spectrotemporal receptive fields could be reconstructed from vigorous responses to speech stimuli. We find that the human STG displays a robust anterior-posterior spatial distribution of spectrotemporal tuning in which the posterior STG is tuned for temporally fast varying speech sounds that have relatively constant energy across the frequency axis (low spectral modulation) while the anterior STG is tuned for temporally slow varying speech sounds that have a high degree of spectral variation across the frequency axis (high spectral modulation). This work illustrates organization of spectrotemporal processing in the human STG, and illuminates processing of ethologically relevant speech signals in a region of the brain specialized for speech perception. Considerable evidence has implicated the human superior temporal gyrus (STG) in speech processing. However, the gross organization of spectrotemporal processing of speech within the STG is not well characterized. Here we use natural speech stimuli and advanced receptive field characterization methods to show that spectrotemporal features within speech are well organized along the posterior-to-anterior axis of the human STG. These findings demonstrate robust functional organization based on spectrotemporal modulation content, and illustrate that much of the encoded information in the STG represents the physical acoustic properties of speech stimuli. Copyright © 2016 the authors 0270-6474/16/362014-13$15.00/0.
One approach to design of speech emotion database

Science.gov (United States)

Uhrin, Dominik; Chmelikova, Zdenka; Tovarek, Jaromir; Partila, Pavol; Voznak, Miroslav

2016-05-01

This article describes a system for evaluating the credibility of recordings with emotional character. The sound recordings form a Czech-language database for training and testing speech emotion recognition systems. These systems are designed to detect human emotions from the voice. Knowing a person's emotional state is useful in the security forces and emergency call services. A person in action (soldier, police officer, or firefighter) is often exposed to stress. Information about the emotional state, derived from the voice, helps the dispatcher adapt control commands for the intervention procedure. Call agents of emergency call services must recognize the mental state of the caller to adjust the mood of the conversation. In this case, the evaluation of the psychological state is the key factor for successful intervention. A quality database of sound recordings is essential for the creation of such systems. Quality databases exist, such as the Berlin Database of Emotional Speech or Humaine. Actors created these databases in an audio studio, which means that the recordings contain simulated rather than real emotions. Our research aims at creating a database of Czech emotional recordings of real human speech. Collecting sound samples for the database is only one of the tasks. Another, no less important, is to evaluate the significance of recordings from the perspective of emotional states. The design of a methodology for evaluating the credibility of emotional recordings is described in this article. The results describe the advantages and applicability of the developed method.
The Sound Quality of Cochlear Implants: Studies With Single-sided Deaf Patients.

Science.gov (United States)

Dorman, Michael F; Natale, Sarah Cook; Butts, Austin M; Zeitler, Daniel M; Carlson, Matthew L

2017-09-01

The goal of the present study was to assess the sound quality of a cochlear implant for single-sided deaf (SSD) patients fit with a cochlear implant (CI). One of the fundamental, unanswered questions in CI research is "what does an implant sound like?" Conventional CI patients must use the memory of a clean signal, often decades old, to judge the sound quality of their CIs. In contrast, SSD-CI patients can rate the similarity of a clean signal presented to the CI ear and candidate, CI-like signals presented to the ear with normal hearing. For Experiment 1 four types of stimuli were created for presentation to the normal hearing ear: noise vocoded signals, sine vocoded signals, frequency shifted, sine vocoded signals and band-pass filtered, natural speech signals. Listeners rated the similarity of these signals to unmodified signals sent to the CI on a scale of 0 to 10 with 10 being a complete match to the CI signal. For Experiment 2 multitrack signal mixing was used to create natural speech signals that varied along multiple dimensions. In Experiment 1 for eight adult SSD-CI listeners, the best median similarity rating to the sound of the CI for noise vocoded signals was 1.9; for sine vocoded signals 2.9; for frequency upshifted signals, 1.9; and for band-pass filtered signals, 5.5. In Experiment 2 for three young listeners, combinations of band-pass filtering and spectral smearing led to ratings of 10. The sound quality of noise and sine vocoders does not generally correspond to the sound quality of cochlear implants fit to SSD patients. Our preliminary conclusion is that natural speech signals that have been muffled to one degree or another by band-pass filtering and/or spectral smearing provide a close, but incomplete, match to CI sound quality for some patients.
Head movements encode emotions during speech and song.

Science.gov (United States)

Livingstone, Steven R; Palmer, Caroline

2016-04-01

When speaking or singing, vocalists often move their heads in an expressive fashion, yet the influence of emotion on vocalists' head motion is unknown. Using a comparative speech/song task, we examined whether vocalists' intended emotions influence head movements and whether those movements influence the perceived emotion. In Experiment 1, vocalists were recorded with motion capture while speaking and singing each statement with different emotional intentions (very happy, happy, neutral, sad, very sad). Functional data analyses showed that head movements differed in translational and rotational displacement across emotional intentions, yet were similar across speech and song, transcending differences in F0 (varied freely in speech, fixed in song) and lexical variability. Head motion specific to emotional state occurred before and after vocalizations, as well as during sound production, confirming that some aspects of movement were not simply a by-product of sound production. In Experiment 2, observers accurately identified vocalists' intended emotion on the basis of silent, face-occluded videos of head movements during speech and song. These results provide the first evidence that head movements encode a vocalist's emotional intent and that observers decode emotional information from these movements. We discuss implications for models of head motion during vocalizations and applied outcomes in social robotics and automated emotion recognition. (c) 2016 APA, all rights reserved.
Improving Robustness against Environmental Sounds for Directing Attention of Social Robots

DEFF Research Database (Denmark)

Thomsen, Nicolai Bæk; Tan, Zheng-Hua; Lindberg, Børge

2015-01-01

This paper presents a multi-modal system for finding out where to direct the attention of a social robot in a dialog scenario, which is robust against environmental sounds (door slamming, phone ringing etc.) and short speech segments. The method is based on combining voice activity detection (VAD......) and sound source localization (SSL) and furthermore apply post-processing to SSL to filter out short sounds. The system is tested against a baseline system in four different real-world experiments, where different sounds are used as interfering sounds. The results are promising and show a clear improvement....
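The post-processing idea mentioned in this abstract, discarding sounds that are too short before deciding where to attend, might look roughly like the sketch below. The frame-wise boolean masks, the function names, and the three-frame minimum are assumptions for illustration; the abstract does not describe the actual filter or how the VAD and SSL outputs are combined.

```python
def remove_short_segments(active, min_frames):
    """Zero out runs of True shorter than `min_frames` in a frame-wise mask."""
    out = list(active)
    start = None
    # A trailing False acts as a sentinel that closes a run ending at the last frame.
    for i, flag in enumerate(list(active) + [False]):
        if flag and start is None:
            start = i
        elif not flag and start is not None:
            if i - start < min_frames:
                out[start:i] = [False] * (i - start)
            start = None
    return out


def attend(vad, ssl_active, min_frames=3):
    """Frames where speech is detected and a sufficiently long source is localized."""
    stable = remove_short_segments(ssl_active, min_frames)
    return [v and s for v, s in zip(vad, stable)]


# Hypothetical masks: a one-frame click, a three-frame utterance, a two-frame blip.
ssl_mask = [True, False, True, True, True, False, True, True]
filtered = remove_short_segments(ssl_mask, min_frames=3)
```

Only the three-frame run survives, so a door slam or phone ring that lasts a frame or two would not redirect the robot's attention in this toy setup.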
Development of a Bone-Conducted Ultrasonic Hearing Aid for the Profoundly Deaf: Evaluation of Sound Quality Using a Semantic Differential Method

Science.gov (United States)

Nakagawa, Seiji; Fujiyuki, Chika; Kagomiya, Takayuki

2013-07-01

Bone-conducted ultrasound (BCU) is perceived even by the profoundly sensorineural deaf. A novel hearing aid using the perception of amplitude-modulated BCU (BCU hearing aid: BCUHA) has been developed. However, there is room for improvement, particularly in terms of sound quality. BCU speech is accompanied by a strong high-pitched tone and contains some distortion. In this study, the sound quality of BCU speech with several types of amplitude modulation [double-sideband with transmitted carrier (DSB-TC), double-sideband with suppressed carrier (DSB-SC), and transposed modulations] and air-conducted (AC) speech was quantitatively evaluated using semantic differential and factor analysis. The results showed that all the types of BCU speech had higher metallic and lower esthetic factor scores than AC speech. On the other hand, transposed speech was closer than the other types of BCU speech to AC speech generally; the transposed speech showed a higher powerfulness factor score than the other types of BCU speech and a higher esthetic factor score than DSB-SC speech. These results provide useful information for further development of the BCUHA.
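For readers unfamiliar with the two double-sideband schemes compared in this abstract, the sketch below shows their textbook definitions: DSB-TC multiplies the carrier by one plus the scaled message, while DSB-SC multiplies the carrier by the message alone. The 30 kHz carrier, the 192 kHz sample rate, and the function names are assumptions for illustration, and the transposed modulation used in the study is not reproduced here.

```python
import math


def dsb_tc(message, carrier_hz, sample_rate, depth=1.0):
    """Double-sideband, transmitted carrier: (1 + depth * m[n]) * cos(2*pi*fc*n/fs)."""
    return [
        (1.0 + depth * m) * math.cos(2 * math.pi * carrier_hz * n / sample_rate)
        for n, m in enumerate(message)
    ]


def dsb_sc(message, carrier_hz, sample_rate):
    """Double-sideband, suppressed carrier: m[n] * cos(2*pi*fc*n/fs)."""
    return [
        m * math.cos(2 * math.pi * carrier_hz * n / sample_rate)
        for n, m in enumerate(message)
    ]


# Hypothetical parameters: a silent message on an ultrasonic carrier.
silence = [0.0] * 8
tc = dsb_tc(silence, carrier_hz=30000, sample_rate=192000)
sc = dsb_sc(silence, carrier_hz=30000, sample_rate=192000)
```

With a silent message, the DSB-TC output still contains the bare carrier while the DSB-SC output is zero, which is the basic difference between the two schemes.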
Timbral aspects of reproduced sound in small rooms. I

DEFF Research Database (Denmark)

Bech, Søren

1995-01-01

This paper reports some of the influences of individual reflections on the timbre of reproduced sound. A single loudspeaker with frequency-independent directivity characteristics, positioned in a listening room of normal size with frequency-independent absorption coefficients of the room surfaces, has been simulated using an electroacoustic setup. The model included the direct sound, 17 individual reflections, and the reverberant field. The threshold of detection and just-noticeable differences for an increase in level were measured for individual reflections using eight subjects for noise and speech. The results have shown that the first-order floor and ceiling reflections are likely to individually contribute to the timbre of reproduced speech. For a noise signal, additional reflections from the left sidewall will contribute individually. The level of the reverberant field has been found...
The organization and reorganization of audiovisual speech perception in the first year of life.

Science.gov (United States)

Danielson, D Kyle; Bruderer, Alison G; Kandhadai, Padmapriya; Vatikiotis-Bateson, Eric; Werker, Janet F

2017-04-01

The period between six and 12 months is a sensitive period for language learning during which infants undergo auditory perceptual attunement, and recent results indicate that this sensitive period may exist across sensory modalities. We tested infants at three stages of perceptual attunement (six, nine, and 11 months) to determine 1) whether they were sensitive to the congruence between heard and seen speech stimuli in an unfamiliar language, and 2) whether familiarization with congruent audiovisual speech could boost subsequent non-native auditory discrimination. Infants at six- and nine-, but not 11-months, detected audiovisual congruence of non-native syllables. Familiarization to incongruent, but not congruent, audiovisual speech changed auditory discrimination at test for six-month-olds but not nine- or 11-month-olds. These results advance the proposal that speech perception is audiovisual from early in ontogeny, and that the sensitive period for audiovisual speech perception may last somewhat longer than that for auditory perception alone.
Near-infrared-spectroscopic study on processing of sounds in the brain; a comparison between native and non-native speakers of Japanese.

Science.gov (United States)

Tsunoda, Koichi; Sekimoto, Sotaro; Itoh, Kenji

2016-06-01

Conclusions: The results suggested that mother tongue Japanese (MJ) and non-mother tongue Japanese (non-MJ) listeners differ in their pattern of brain dominance when listening to sounds from the natural world, in particular insect sounds. These results lend significant support to previous findings from Tsunoda (in 1970). Objectives: This study concentrates on listeners who show clear evidence of a 'speech' brain vs a 'music' brain and determines which side is most active in the processing of insect sounds, using near-infrared spectroscopy. Methods: The present study uses 2-channel near-infrared spectroscopy (NIRS) to provide a more direct measure of left- and right-brain activity while participants listen to each of three types of sounds: Japanese speech, Western violin music, or insect sounds. Data were obtained from 33 participants who showed laterality on opposite sides for Japanese speech and Western music. Results: A majority (80%) of the MJ participants exhibited dominance for insect sounds on the side that was dominant for language, while a majority (62%) of the non-MJ participants exhibited dominance for insect sounds on the side that was dominant for music.
The impact of brief restriction to articulation on children's subsequent speech production.

Science.gov (United States)

Seidl, Amanda; Brosseau-Lapré, Françoise; Goffman, Lisa

2018-02-01

This project explored whether disruption of articulation during listening impacts subsequent speech production in 4-yr-olds with and without speech sound disorder (SSD). During novel word learning, typically-developing children showed effects of articulatory disruption as revealed by larger differences between two acoustic cues to a sound contrast, but children with SSD were unaffected by articulatory disruption. Findings suggest that, when typically developing 4-yr-olds experience an articulatory disruption during a listening task, the children's subsequent production is affected. Children with SSD show less influence of articulatory experience during perception, which could be the result of impaired or attenuated ties between perception and articulation.
Influences on infant speech processing: toward a new synthesis.

Science.gov (United States)

Werker, J F; Tees, R C

1999-01-01

To comprehend and produce language, we must be able to recognize the sound patterns of our language and the rules for how these sounds "map on" to meaning. Human infants are born with a remarkable array of perceptual sensitivities that allow them to detect the basic properties that are common to the world's languages. During the first year of life, these sensitivities undergo modification reflecting an exquisite tuning to just that phonological information that is needed to map sound to meaning in the native language. We review this transition from language-general to language-specific perceptual sensitivity that occurs during the first year of life and consider whether the changes propel the child into word learning. To account for the broad-based initial sensitivities and subsequent reorganizations, we offer an integrated transactional framework based on the notion of a specialized perceptual-motor system that has evolved to serve human speech, but which functions in concert with other developing abilities. In so doing, we highlight the links between infant speech perception, babbling, and word learning.
Song and speech: examining the link between singing talent and speech imitation ability.

Science.gov (United States)

Christiner, Markus; Reiterer, Susanne M

2013-01-01

In previous research on speech imitation, musicality, and an ability to sing were isolated as the strongest indicators of good pronunciation skills in foreign languages. We, therefore, wanted to take a closer look at the nature of the ability to sing, which shares a common ground with the ability to imitate speech. This study focuses on whether good singing performance predicts good speech imitation. Forty-one singers of different levels of proficiency were selected for the study and their ability to sing, to imitate speech, their musical talent and working memory were tested. Results indicated that singing performance is a better indicator of the ability to imitate speech than the playing of a musical instrument. A multiple regression revealed that 64% of the speech imitation score variance could be explained by working memory together with educational background and singing performance. A second multiple regression showed that 66% of the speech imitation variance of completely unintelligible and unfamiliar language stimuli (Hindi) could be explained by working memory together with a singer's sense of rhythm and quality of voice. This supports the idea that both vocal behaviors have a common grounding in terms of vocal and motor flexibility, ontogenetic and phylogenetic development, neural orchestration and auditory memory with singing fitting better into the category of "speech" on the productive level and "music" on the acoustic level. As a result, good singers benefit from vocal and motor flexibility, productively and cognitively, in three ways. (1) Motor flexibility and the ability to sing improve language and musical function. (2) Good singers retain a certain plasticity and are open to new and unusual sound combinations during adulthood both perceptually and productively. (3) The ability to sing improves the memory span of the auditory working memory.
Discrimination? - Exhibition of posters

OpenAIRE

Jakimovska, Jana

2017-01-01

Participation in the exhibition with the students from the Art Academy. The exhibition consisted of 15 posters tackling the subjects of hate speech and discrimination. The exhibition happened thanks to the invitation of the Faculty of Law at UGD, and it was a part of a larger event of launching books on the aforementioned subjects.
Techniques and applications for binaural sound manipulation in human-machine interfaces

Science.gov (United States)

Begault, Durand R.; Wenzel, Elizabeth M.

1992-01-01

The implementation of binaural sound to speech and auditory sound cues (auditory icons) is addressed from both an applications and technical standpoint. Techniques overviewed include processing by means of filtering with head-related transfer functions. Application to advanced cockpit human interface systems is discussed, although the techniques are extendable to any human-machine interface. Research issues pertaining to three-dimensional sound displays under investigation at the Aerospace Human Factors Division at NASA Ames Research Center are described.
Effects of Audio-Visual Integration on the Detection of Masked Speech and Non-Speech Sounds

Science.gov (United States)

Eramudugolla, Ranmalee; Henderson, Rachel; Mattingley, Jason B.

2011-01-01

Integration of simultaneous auditory and visual information about an event can enhance our ability to detect that event. This is particularly evident in the perception of speech, where the articulatory gestures of the speaker's lips and face can significantly improve the listener's detection and identification of the message, especially when that…

Australian children with cleft palate achieve age-appropriate speech by 5 years of age.

Science.gov (United States)

Chacon, Antonia; Parkin, Melissa; Broome, Kate; Purcell, Alison

2017-12-01

Children with cleft palate demonstrate atypical speech sound development, which can influence their intelligibility, literacy and learning. There is limited documentation regarding how speech sound errors change over time in cleft palate speech and the effect that these errors have upon mono- versus polysyllabic word production. The objective of this study was to examine the phonetic and phonological speech skills of children with cleft palate at ages 3 and 5. A cross-sectional observational design was used. Eligible participants were aged 3 or 5 years with a repaired cleft palate. The Diagnostic Evaluation of Articulation and Phonology (DEAP) Articulation subtest and a non-standardised list of mono- and polysyllabic words were administered once for each child. The Profile of Phonology (PROPH) was used to analyse each child's speech. N = 51 children with cleft palate participated in the study. Three-year-old children with cleft palate produced significantly more speech errors than their typically-developing peers, but no difference was apparent at 5 years. The 5-year-olds demonstrated greater phonetic and phonological accuracy than the 3-year-old children. Polysyllabic words were more affected by errors than monosyllables in the 3-year-old group only. Children with cleft palate are prone to phonetic and phonological speech errors in their preschool years. Most of these speech errors approximate typically-developing children by 5 years. At 3 years, word shape has an influence upon phonological speech accuracy. Speech pathology intervention is indicated to support the intelligibility of these children from their earliest stages of development.
Statistical representation of sound textures in the impaired auditory system

DEFF Research Database (Denmark)

McWalter, Richard Ian; Dau, Torsten

2015-01-01

Many challenges exist when it comes to understanding and compensating for hearing impairment. Traditional methods, such as pure tone audiometry and speech intelligibility tests, offer insight into the deficiencies of a hearing-impaired listener, but can only partially reveal the mechanisms that underlie the hearing loss. An alternative approach is to investigate the statistical representation of sounds for hearing-impaired listeners along the auditory pathway. Using models of the auditory periphery and sound synthesis, we aimed to probe hearing-impaired perception for sound textures – temporally...
A speech processing study using an acoustic model of a multiple-channel cochlear implant

Science.gov (United States)

Xu, Ying

1998-10-01

A cochlear implant is an electronic device designed to provide sound information for adults and children who have bilateral profound hearing loss. The task of representing speech signals as electrical stimuli is central to the design and performance of cochlear implants. Studies have shown that the current speech-processing strategies provide significant benefits to cochlear implant users. However, the evaluation and development of speech-processing strategies have been complicated by hardware limitations and large variability in user performance. To alleviate these problems, an acoustic model of a cochlear implant with the SPEAK strategy is implemented in this study, in which a set of acoustic stimuli whose psychophysical characteristics are as close as possible to those produced by a cochlear implant are presented to normal-hearing subjects. To test the effectiveness and feasibility of this acoustic model, a psychophysical experiment was conducted to match the performance of a normal-hearing listener using model-processed signals to that of a cochlear implant user. Good agreement was found between an implanted patient and an age-matched normal-hearing subject in a dynamic signal discrimination experiment, indicating that this acoustic model is a reasonably good approximation of a cochlear implant with the SPEAK strategy. The acoustic model was then used to examine the potential of the SPEAK strategy in terms of its temporal and frequency encoding of speech. It was hypothesized that better temporal and frequency encoding of speech can be accomplished by higher stimulation rates and a larger number of activated channels. Vowel and consonant recognition tests were conducted on normal-hearing subjects using speech tokens processed by the acoustic model, with different combinations of stimulation rate and number of activated channels. The results showed that vowel recognition was best at 600 pps and 8 activated channels, but further increases in stimulation rate and
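The "number of activated channels" parameter can be pictured with a generic maxima-selection step, in which only the channels with the largest envelope values are kept in each analysis frame. This is a rough sketch under assumptions: the function name `select_maxima` and the example values are hypothetical, and the abstract does not describe how the study's acoustic model implements channel selection.

```python
from typing import List, Sequence


def select_maxima(envelopes: Sequence[float], n_active: int) -> List[float]:
    """Keep the n_active largest channel envelopes in one frame; zero the rest.

    envelopes -- one envelope value per filter-bank channel (hypothetical input)
    Ties are broken in favour of the lower channel index.
    """
    ranked = sorted(range(len(envelopes)),
                    key=lambda ch: envelopes[ch], reverse=True)
    keep = set(ranked[:n_active])
    return [e if ch in keep else 0.0 for ch, e in enumerate(envelopes)]
```

Calling `select_maxima` once per frame, at a chosen frame rate, gives the two knobs the abstract varies: how many channels are active and how often they are updated.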
Childhood apraxia of speech and multiple phonological disorders in Cairo-Egyptian Arabic speaking children: language, speech, and oro-motor differences.

Science.gov (United States)

Aziz, Azza Adel; Shohdi, Sahar; Osman, Dalia Mostafa; Habib, Emad Iskander

2010-06-01

Childhood apraxia of speech is a neurological childhood speech-sound disorder in which the precision and consistency of movements underlying speech are impaired in the absence of neuromuscular deficits. Children with childhood apraxia of speech and those with multiple phonological disorder share some common phonological errors that can be misleading in diagnosis. This study asked whether there are significant differences in language, speech and non-speech oral performance between children with childhood apraxia of speech, children with multiple phonological disorder and normal children that can be used for differential diagnosis. Thirty pre-school children between the ages of 4 and 6 years served as participants. Each of these children belonged to one of 3 subject groups: Group 1: multiple phonological disorder; Group 2: suspected cases of childhood apraxia of speech; Group 3: control group with no communication disorder. Assessment procedures included parent interviews, testing of non-speech oral motor skills and testing of speech skills. Data showed that children with suspected childhood apraxia of speech had significantly lower language scores only in their expressive abilities. Non-speech tasks did not identify significant differences between the childhood apraxia of speech and multiple phonological disorder groups except for those which required two sequential motor performances. In speech tasks, both consonant and vowel accuracy were significantly lower and inconsistent in the childhood apraxia of speech group than in the multiple phonological disorder group. Syllable number, shape and sequence accuracy differed significantly in the childhood apraxia of speech group from the other two groups. In addition, children with childhood apraxia of speech showed greater difficulty in processing prosodic features, indicating a clear need to address these variables for differential diagnosis and treatment of children with childhood apraxia of speech.
Reconceptualizing Practice with Multilingual Children with Speech Sound Disorders: People, Practicalities and Policy

Science.gov (United States)

Verdon, Sarah; McLeod, Sharynne; Wong, Sandie

2015-01-01

Background: The speech and language therapy profession is required to provide services to increasingly multilingual caseloads. Much international research has focused on the challenges of speech and language therapists' (SLTs) practice with multilingual children. Aims: To draw on the experience and knowledge of experts in the field to: (1)…
Speech and non-speech processing in children with phonological disorders: an electrophysiological study

Directory of Open Access Journals (Sweden)

Isabela Crivellaro Gonçalves

2011-01-01

OBJECTIVE: To determine whether neurophysiological auditory brainstem responses to clicks and repeated speech stimuli differ between typically developing children and children with phonological disorders. INTRODUCTION: Phonological disorders are language impairments resulting from inadequate use of adult phonological language rules and are among the most common speech and language disorders in children (prevalence: 8-9%). Our hypothesis is that children with phonological disorders have basic differences in the way that their brains encode acoustic signals at brainstem level when compared to normal counterparts. METHODS: We recorded click and speech evoked auditory brainstem responses in 18 typically developing children (control group) and in 18 children who were clinically diagnosed with phonological disorders (research group). The age range of the children was from 7-11 years. RESULTS: The research group exhibited significantly longer latency responses to click stimuli (waves I, III and V) and speech stimuli (waves V and A) when compared to the control group. DISCUSSION: These results suggest that the abnormal encoding of speech sounds may be a biological marker of phonological disorders. However, these results cannot define the biological origins of phonological problems. We also observed that speech-evoked auditory brainstem responses had a higher specificity/sensitivity for identifying phonological disorders than click-evoked auditory brainstem responses. CONCLUSIONS: Early stages of the auditory pathway processing of an acoustic stimulus are not similar in typically developing children and those with phonological disorders. These findings suggest that there are brainstem auditory pathway abnormalities in children with phonological disorders.
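The sensitivity and specificity comparison mentioned in the discussion rests on two standard ratios, sketched below. The function name is hypothetical and the counts in the usage note are invented purely to show the arithmetic; they are not results from the study.

```python
from typing import Tuple


def sensitivity_specificity(tp: int, fn: int, tn: int, fp: int) -> Tuple[float, float]:
    """Return (sensitivity, specificity) from confusion-matrix counts.

    sensitivity = TP / (TP + FN): share of affected children flagged by the test
    specificity = TN / (TN + FP): share of unaffected children cleared by the test
    """
    if tp + fn == 0 or tn + fp == 0:
        raise ValueError("each group must contain at least one child")
    return tp / (tp + fn), tn / (tn + fp)
```

With made-up counts for two groups of 18 children, `sensitivity_specificity(tp=15, fn=3, tn=16, fp=2)` gives 15/18 and 16/18, so a measure that flags more of the affected group without flagging more controls scores higher on both.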
Differential Diagnosis of Speech Sound Disorders in Danish-speaking Children

DEFF Research Database (Denmark)

Clausen, Marit Carolin; Fox-Boyer, Anette

...in selecting the right intervention approach to resolve the SSD. Different quantitative and qualitative measurements are currently used to subgroup children with SSD. A quantitative method of classifying children is by accuracy of their productions. According to this approach, the severity of children’s SSD is classified by calculating the percentage of correctly produced consonants (i.e. percentage consonants correct, PCC-A) (Shriberg et al., 1997). Alternatively, a qualitative approach seeks to ascertain which types of phonological processes are present in children’s speech, i.e. developmental or idiosyncratic...... and clinical decision-making about the need for intervention should not be based on the quantitative approach only. A qualitative classification approach is needed for a distinct subgrouping of children with SSD whereas PCC-A can be used as additional information about the severity of the SSD. Keywords: speech...
Speech impairment in Down syndrome: a review.

Science.gov (United States)

Kent, Ray D; Vorperian, Houri K

2013-02-01

This review summarizes research on disorders of speech production in Down syndrome (DS) for the purposes of informing clinical services and guiding future research. Review of the literature was based on searches using MEDLINE, Google Scholar, PsycINFO, and HighWire Press, as well as consideration of reference lists in retrieved documents (including online sources). Search terms emphasized functions related to voice, articulation, phonology, prosody, fluency, and intelligibility. The following conclusions pertain to four major areas of review: voice, speech sounds, fluency and prosody, and intelligibility. The first major area is voice. Although a number of studies have reported on vocal abnormalities in DS, major questions remain about the nature and frequency of the phonatory disorder. Results of perceptual and acoustic studies have been mixed, making it difficult to draw firm conclusions or even to identify sensitive measures for future study. The second major area is speech sounds. Articulatory and phonological studies show that speech patterns in DS are a combination of delayed development and errors not seen in typical development. Delayed (i.e., developmental) and disordered (i.e., nondevelopmental) patterns are evident by the age of about 3 years, although DS-related abnormalities possibly appear earlier, even in infant babbling. The third major area is fluency and prosody. Stuttering and/or cluttering occur in DS at rates of 10%-45%, compared with about 1% in the general population. Research also points to significant disturbances in prosody. The fourth major area is intelligibility. Studies consistently show marked limitations in this area, but only recently has the research gone beyond simple rating scales.
Processing changes when listening to foreign-accented speech

Directory of Open Access Journals (Sweden)

Carlos Romero-Rivas

2015-03-01

This study investigates the mechanisms responsible for fast changes in processing foreign-accented speech. Event Related brain Potentials (ERPs) were obtained while native speakers of Spanish listened to native and foreign-accented speakers of Spanish. We observed a less positive P200 component for foreign-accented speech relative to native speech comprehension. This suggests that the extraction of spectral information and other important acoustic features was hampered during foreign-accented speech comprehension. However, the amplitude of the N400 component for foreign-accented speech comprehension decreased across the experiment, suggesting the use of a higher level, lexical mechanism. Furthermore, during native speech comprehension, semantic violations in the critical words elicited an N400 effect followed by a late positivity. During foreign-accented speech comprehension, semantic violations only elicited an N400 effect. Overall, our results suggest that, despite a lack of improvement in phonetic discrimination, native listeners experience changes at lexical-semantic levels of processing after brief exposure to foreign-accented speech. Moreover, these results suggest that lexical access, semantic integration and linguistic re-analysis processes are permeable to external factors, such as the accent of the speaker.
Assessment of Danish-speaking children’s phonological development and speech disorders

DEFF Research Database (Denmark)

Clausen, Marit Carolin; Fox-Boyer, Annette

2018-01-01

The identification of speech sound disorders is an important everyday task for speech and language therapists (SLTs) working with children. Therefore, assessment tools are needed that are able to correctly identify and diagnose a child with a suspected speech disorder and furthermore, that provide...... of the existing speech assessments in Denmark showed that none of the materials fulfilled current recommendations identified in research literature. Therefore, the aim of this paper is to describe the evaluation of a newly constructed instrument for assessing the speech development and disorders of Danish...... with suspected speech disorder (Clausen and Fox-Boyer, in prep). The results indicated that the instrument showed strong inter-examiner reliability for both populations as well as a high content and diagnostic validity. Hence, the study showed that the LogoFoVa can be regarded as a reliable and valid tool...
Treating speech subsystems in childhood apraxia of speech with tactual input: the PROMPT approach.

Science.gov (United States)

Dale, Philip S; Hayden, Deborah A

2013-11-01

Prompts for Restructuring Oral Muscular Phonetic Targets (PROMPT; Hayden, 2004; Hayden, Eigen, Walker, & Olsen, 2010), a treatment approach for the improvement of speech sound disorders in children, uses tactile-kinesthetic-proprioceptive (TKP) cues to support and shape movements of the oral articulators. No research to date has systematically examined the efficacy of PROMPT for children with childhood apraxia of speech (CAS). Four children (ages 3;6 [years;months] to 4;8), all meeting the American Speech-Language-Hearing Association (2007) criteria for CAS, were treated using PROMPT. All children received 8 weeks of 2 × per week treatment, including at least 4 weeks of full PROMPT treatment that included TKP cues. During the first 4 weeks, 2 of the 4 children received treatment that included all PROMPT components except TKP cues. This design permitted both between-subjects and within-subjects comparisons to evaluate the effect of TKP cues. Gains in treatment were measured by standardized tests and by criterion-referenced measures based on the production of untreated probe words, reflecting change in speech movements and auditory perceptual accuracy. All 4 children made significant gains during treatment, but measures of motor speech control and untreated word probes provided evidence for more gain when TKP cues were included. PROMPT as a whole appears to be effective for treating children with CAS, and the inclusion of TKP cues appears to facilitate greater effect.
Phonemic discrimination and its relation to phonological disorder

Directory of Open Access Journals (Sweden)

Karine Leyla de Castro Oliveira

2012-12-01

A nonsystematic review was performed on the importance of phonemic discrimination for the acquisition of speech sounds and its relation to phonological disorder. Studies indicate that phonemic discrimination is an essential ability in the process of acquiring speech sounds and that children with phonological disorder have difficulties with that ability.
Speech production gains following constraint-induced movement therapy in children with hemiparesis.

Science.gov (United States)

Allison, Kristen M; Reidy, Teressa Garcia; Boyle, Mary; Naber, Erin; Carney, Joan; Pidcock, Frank S

2017-01-01

The purpose of this study was to investigate changes in speech skills of children who have hemiparesis and speech impairment after participation in a constraint-induced movement therapy (CIMT) program. While case studies have reported collateral speech gains following CIMT, the effect of CIMT on speech production has not previously been directly investigated to the knowledge of these investigators. Eighteen children with hemiparesis and co-occurring speech impairment participated in a 21-day clinical CIMT program. The Goldman-Fristoe Test of Articulation-2 (GFTA-2) was used to assess children's articulation of speech sounds before and after the intervention. Changes in percent of consonants correct (PCC) on the GFTA-2 were used as a measure of change in speech production. Children made significant gains in PCC following CIMT. Gains were similar in children with left and right-sided hemiparesis, and across age groups. This study reports significant collateral gains in speech production following CIMT and suggests benefits of CIMT may also spread to speech motor domains.
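Percent of consonants correct is a simple ratio, and a small sketch may make the outcome measure concrete. The scoring below (exact match between target and produced consonant) is a simplifying assumption, the function name is hypothetical, and the example transcriptions are invented; the GFTA-2 scoring rules themselves are not reproduced here.

```python
from typing import Sequence, Tuple


def percent_consonants_correct(pairs: Sequence[Tuple[str, str]]) -> float:
    """PCC = 100 * (consonants produced correctly) / (consonants attempted).

    pairs -- (target, produced) consonant symbols, one pair per attempted consonant
    """
    if not pairs:
        raise ValueError("at least one attempted consonant is required")
    correct = sum(1 for target, produced in pairs if target == produced)
    return 100.0 * correct / len(pairs)
```

Scoring the same word list before and after an intervention and subtracting the two PCC values gives the kind of change score the abstract describes.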
Dynamic encoding of speech sequence probability in human temporal cortex.

Science.gov (United States)

Leonard, Matthew K; Bouchard, Kristofer E; Tang, Claire; Chang, Edward F

2015-05-06

Sensory processing involves identification of stimulus features, but also integration with the surrounding sensory and cognitive context. Previous work in animals and humans has shown fine-scale sensitivity to context in the form of learned knowledge about the statistics of the sensory environment, including relative probabilities of discrete units in a stream of sequential auditory input. These statistics are a defining characteristic of one of the most important sequential signals humans encounter: speech. For speech, extensive exposure to a language tunes listeners to the statistics of sound sequences. To address how speech sequence statistics are neurally encoded, we used high-resolution direct cortical recordings from human lateral superior temporal cortex as subjects listened to words and nonwords with varying transition probabilities between sound segments. In addition to their sensitivity to acoustic features (including contextual features, such as coarticulation), we found that neural responses dynamically encoded the language-level probability of both preceding and upcoming speech sounds. Transition probability first negatively modulated neural responses, followed by positive modulation of neural responses, consistent with coordinated predictive and retrospective recognition processes, respectively. Furthermore, transition probability encoding was different for real English words compared with nonwords, providing evidence for online interactions with high-order linguistic knowledge. These results demonstrate that sensory processing of deeply learned stimuli involves integrating physical stimulus features with their contextual sequential structure. Despite not being consciously aware of phoneme sequence statistics, listeners use this information to process spoken input and to link low-level acoustic representations with linguistic information about word identity and meaning.
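A transition probability between sound segments can be estimated from counts of adjacent pairs. The sketch below computes forward probabilities P(next | current) from a toy corpus; the function name, the corpus, and the segment labels are invented for illustration, and the abstract does not state how the study's probabilities were derived.

```python
from collections import Counter
from typing import Dict, List, Sequence, Tuple


def transition_probabilities(
        sequences: Sequence[List[str]]) -> Dict[Tuple[str, str], float]:
    """Estimate P(b | a) for adjacent segments a, b from bigram counts."""
    pair_counts: Counter = Counter()
    first_counts: Counter = Counter()
    for seq in sequences:
        for a, b in zip(seq, seq[1:]):
            pair_counts[(a, b)] += 1
            first_counts[a] += 1  # times a occurs with a following segment
    return {(a, b): count / first_counts[a]
            for (a, b), count in pair_counts.items()}
```

A backward probability P(previous | current), relevant to the retrospective effects the abstract mentions, could be obtained the same way by dividing by the count of the second segment instead.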
Pitch Based Sound Classification

DEFF Research Database (Denmark)

Nielsen, Andreas Brinch; Hansen, Lars Kai; Kjems, U

2006-01-01

A sound classification model is presented that can classify signals into music, noise and speech. The model extracts the pitch of the signal using the harmonic product spectrum. Based on the pitch estimate and a pitch error measure, features are created and used in a probabilistic model with soft-max output function. Both linear and quadratic inputs are used. The model is trained on 2 hours of sound and tested on publicly available data. A test classification error below 0.05 with 1 s classification windows is achieved. Furthermore, it is shown that linear input performs as well as a quadratic, and that even though classification gets marginally better, not much is achieved by increasing the window size beyond 1 s.
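The harmonic product spectrum mentioned above can be sketched in its textbook form: multiply the magnitude spectrum by copies of itself sampled at integer multiples of each bin, then pick the bin with the largest product. This is a minimal illustration under assumptions, not the paper's implementation: the function names, the number of harmonics, and the naive DFT (used to keep the example dependency-free) are choices made here.

```python
import cmath
import math
from typing import List, Sequence


def magnitude_spectrum(x: Sequence[float]) -> List[float]:
    """Naive DFT magnitudes for bins 0..N/2 (O(N^2), fine for short frames)."""
    n = len(x)
    return [abs(sum(x[t] * cmath.exp(-2j * math.pi * k * t / n)
                    for t in range(n)))
            for k in range(n // 2 + 1)]


def hps_pitch(x: Sequence[float], fs: float, harmonics: int = 3) -> float:
    """Estimate the fundamental frequency (Hz) of one frame with the HPS."""
    mag = magnitude_spectrum(x)
    limit = len(mag) // harmonics  # keeps k * harmonics inside the spectrum
    best_bin, best_val = 1, -1.0
    for k in range(1, limit):      # skip the DC bin
        product = 1.0
        for h in range(1, harmonics + 1):
            product *= mag[k * h]
        if product > best_val:
            best_bin, best_val = k, product
    return best_bin * fs / len(x)
```

Because every harmonic of the true fundamental contributes to the product at the fundamental's bin, the estimate stays at the fundamental even when a higher harmonic is the strongest single peak.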
Error Consistency in Acquired Apraxia of Speech with Aphasia: Effects of the Analysis Unit

Science.gov (United States)

Haley, Katarina L.; Cunningham, Kevin T.; Eaton, Catherine Torrington; Jacks, Adam

2018-01-01

Purpose: Diagnostic recommendations for acquired apraxia of speech (AOS) have been contradictory concerning whether speech sound errors are consistent or variable. Studies have reported divergent findings that, on face value, could argue either for or against error consistency as a diagnostic criterion. The purpose of this study was to explain…
Indonesian Text-To-Speech System Using Diphone Concatenative Synthesis

Directory of Open Access Journals (Sweden)

Sutarman

2015-02-01

In this paper, we describe the design and development of an Indonesian diphone database for concatenative synthesis, which uses segments of recorded speech to convert text to speech and save the result as an audio file such as WAV or MP3. Designing and developing the Indonesian diphone database involves several steps. First, the diphone database is built: a list of sample words containing the diphones is created, giving priority to diphones located in the middle of a word and otherwise at the beginning or end; the sample words are recorded and segmented; and the diphones are created with the tool Diphone Studio 1.3. Second, the system is developed using Microsoft Visual Delphi 6.0, including the conversion of input numbers, acronyms, words, and sentences into diphone representations. The Indonesian text-to-speech system involves two kinds of conversion: first, converting the text to be spoken into phonemes, and second, converting the phonemes into speech. The method used in this research is called diphone concatenative synthesis, in which recorded sound segments are collected. Every segment consists of a diphone (2 phonemes). This synthesizer may produce voice with a high level of naturalness. The Indonesian text-to-speech system can differentiate special phonemes, as in ‘Beda’ and ‘Bedak’, but samples of other specific words need to be added to the system. The system can also handle texts with abbreviations, and there is a facility for adding such words.
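The second conversion step, from a phoneme sequence to the diphone units that are concatenated, can be sketched as below. The boundary symbol, the unit naming scheme and the letter-based phoneme symbols are hypothetical; the paper does not specify how its database labels diphones.

```python
def to_diphones(phonemes, boundary="_"):
    """Turn a phoneme sequence into overlapping diphone labels.

    A boundary (silence) symbol is added at both ends so that the
    word-initial and word-final transitions are also covered.
    """
    padded = [boundary] + list(phonemes) + [boundary]
    return [f"{a}-{b}" for a, b in zip(padded, padded[1:])]


# Hypothetical letter-based transcriptions of the two example words.
beda = to_diphones(["b", "e", "d", "a"])
bedak = to_diphones(["b", "e", "d", "a", "k"])
print(beda)   # ['_-b', 'b-e', 'e-d', 'd-a', 'a-_']
print(bedak)  # ['_-b', 'b-e', 'e-d', 'd-a', 'a-k', 'k-_']
```

Each label would then be looked up in the recorded diphone database and the matching audio segments joined in order.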
Residual Neural Processing of Musical Sound Features in Adult Cochlear Implant Users

Science.gov (United States)

Timm, Lydia; Vuust, Peter; Brattico, Elvira; Agrawal, Deepashri; Debener, Stefan; Büchner, Andreas; Dengler, Reinhard; Wittfoth, Matthias

2014-01-01

Auditory processing in general and music perception in particular are hampered in adult cochlear implant (CI) users. To examine the residual music perception skills and their underlying neural correlates in CI users implanted in adolescence or adulthood, we conducted an electrophysiological and behavioral study comparing adult CI users with normal-hearing age-matched controls (NH controls). We used a newly developed musical multi-feature paradigm, which makes it possible to test automatic auditory discrimination of six different types of sound feature changes inserted within a musically enriched setting lasting only 20 min. The presentation of stimuli did not require the participants’ attention, allowing the study of the early automatic stage of feature processing in the auditory cortex. For the CI users, we obtained mismatch negativity (MMN) brain responses to five feature changes but not to changes of rhythm, whereas we obtained MMNs for all the feature changes in the NH controls. Furthermore, the MMNs of CI users were reduced in amplitude and later than those of NH controls for changes of pitch and guitar timbre. No other group differences in MMN parameters were found to changes in intensity and saxophone timbre. Furthermore, the MMNs in CI users reflected the behavioral scores from a respective discrimination task and were correlated with patients’ age and speech intelligibility. Our results suggest that even though CI users are not performing at the same level as NH controls in neural discrimination of pitch-based features, they do possess potential neural abilities for music processing. However, CI users showed a disrupted ability to automatically discriminate rhythmic changes compared with controls. The current behavioral and MMN findings highlight the residual neural skills for music processing even in CI users who have been implanted in adolescence or adulthood. Highlights: -Automatic brain responses to musical feature changes
Integrated Phoneme Subspace Method for Speech Feature Extraction

Directory of Open Access Journals (Sweden)

Park Hyunsin

2009-01-01

Speech feature extraction has been a key focus in robust speech recognition research. In this work, we discuss data-driven linear feature transformations applied to feature vectors in the logarithmic mel-frequency filter bank domain. Transformations are based on principal component analysis (PCA), independent component analysis (ICA), and linear discriminant analysis (LDA). Furthermore, this paper introduces a new feature extraction technique that collects the correlation information among phoneme subspaces and reconstructs feature space for representing phonemic information efficiently. The proposed speech feature vector is generated by projecting an observed vector onto an integrated phoneme subspace (IPS) based on PCA or ICA. The performance of the new feature was evaluated for isolated word speech recognition. The proposed method provided higher recognition accuracy than conventional methods in clean and reverberant environments.
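To make the idea of a data-driven linear transformation concrete, the sketch below finds the first principal component of a few feature vectors by power iteration and projects a vector onto it. It illustrates plain PCA only, not the integrated phoneme subspace method itself, and the three-band "log mel" frames are made-up numbers.

```python
import math


def first_principal_component(rows, iterations=200):
    """Return (means, top eigenvector of the sample covariance) via power iteration."""
    n, d = len(rows), len(rows[0])
    means = [sum(r[j] for r in rows) / n for j in range(d)]
    centered = [[r[j] - means[j] for j in range(d)] for r in rows]
    cov = [[sum(c[i] * c[j] for c in centered) / (n - 1) for j in range(d)]
           for i in range(d)]
    # Caveat: the start vector must not be orthogonal to the top eigenvector.
    v = [1.0] + [0.0] * (d - 1)
    for _ in range(iterations):
        w = [sum(cov[i][j] * v[j] for j in range(d)) for i in range(d)]
        norm = math.sqrt(sum(x * x for x in w))
        v = [x / norm for x in w]
    return means, v


def project(row, means, component):
    """Coordinate of a mean-centered feature vector along one component."""
    return sum((x - m) * c for x, m, c in zip(row, means, component))


# Hypothetical log mel filter bank frames (3 bands, 4 frames).
frames = [
    [1.0, 2.0, 0.5],
    [2.0, 4.0, 0.5],
    [3.0, 6.0, 0.5],
    [4.0, 8.0, 0.5],
]
means, component = first_principal_component(frames)
score = project(frames[0], means, component)
print(component, score)
```

Here all variance lies along the direction (1, 2, 0), so the component converges to that direction after normalization. A full system would keep several components and would normally rely on a numerical library.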
Differential modulation of auditory responses to attended and unattended speech in different listening conditions.

Science.gov (United States)

Kong, Ying-Yee; Mullangi, Ala; Ding, Nai

2014-10-01

This study investigates how top-down attention modulates neural tracking of the speech envelope in different listening conditions. In the quiet conditions, a single speech stream was presented and the subjects paid attention to the speech stream (active listening) or watched a silent movie instead (passive listening). In the competing speaker (CS) conditions, two speakers of opposite genders were presented diotically. Ongoing electroencephalographic (EEG) responses were measured in each condition and cross-correlated with the speech envelope of each speaker at different time lags. In quiet, active and passive listening resulted in similar neural responses to the speech envelope. In the CS conditions, however, the shape of the cross-correlation function was remarkably different between the attended and unattended speech. The cross-correlation with the attended speech showed stronger N1 and P2 responses but a weaker P1 response compared to the cross-correlation with the unattended speech. Furthermore, the N1 response to the attended speech in the CS condition was enhanced and delayed compared with the active listening condition in quiet, while the P2 response to the unattended speaker in the CS condition was attenuated compared with the passive listening in quiet. Taken together, these results demonstrate that top-down attention differentially modulates envelope-tracking neural activity at different time lags and suggest that top-down attention can both enhance the neural responses to the attended sound stream and suppress the responses to the unattended sound stream.
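Cross-correlating a response with the speech envelope "at different time lags" can be sketched as computing a correlation coefficient between the envelope and the response shifted by each lag. The signals below are synthetic: the "response" is simply the envelope delayed by five samples, an assumption made so the expected peak lag is known. Real EEG analyses add filtering, multiple channels and noise handling.

```python
import math
import random


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    return cov / math.sqrt(vx * vy)


def lagged_correlation(envelope, response, max_lag):
    """Correlate envelope[t] with response[t + lag] for lag = 0..max_lag."""
    n = len(envelope)
    return {lag: pearson(envelope[:n - lag], response[lag:])
            for lag in range(max_lag + 1)}


# Synthetic demo: the response is the envelope delayed by 5 samples.
rng = random.Random(0)
envelope = [rng.random() for _ in range(300)]
delay = 5
response = [0.0] * delay + envelope[:-delay]

corr = lagged_correlation(envelope, response, max_lag=20)
best_lag = max(corr, key=corr.get)
print(best_lag, round(corr[best_lag], 3))
```

In the study, peaks and troughs of this function at particular lags are what get labeled as P1, N1 and P2 responses.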

Speech auditory brainstem response (speech ABR) characteristics depending on recording conditions, and hearing status: an experimental parametric study.

Science.gov (United States)

Akhoun, Idrick; Moulin, Annie; Jeanvoine, Arnaud; Ménard, Mikael; Buret, François; Vollaire, Christian; Scorretti, Riccardo; Veuillet, Evelyne; Berger-Vachon, Christian; Collet, Lionel; Thai-Van, Hung

2008-11-15

Speech elicited auditory brainstem responses (Speech ABR) have been shown to be an objective measurement of speech processing in the brainstem. Given the simultaneous stimulation and recording, and the similarities between the recording and the speech stimulus envelope, there is a great risk of artefactual recordings. This study sought to systematically investigate the source of artefactual contamination in Speech ABR response. In a first part, we measured the sound level thresholds over which artefactual responses were obtained, for different types of transducers and experimental setup parameters. A watermelon model was used to model the human head susceptibility to electromagnetic artefact. It was found that impedances between the electrodes had a great effect on electromagnetic susceptibility and that the most prominent artefact is due to the transducer's electromagnetic leakage. The only artefact-free condition was obtained with insert-earphones shielded in a Faraday cage linked to common ground. In a second part of the study, using the previously defined artefact-free condition, we recorded speech ABR in unilateral deaf subjects and bilateral normal hearing subjects. In an additional control condition, Speech ABR was recorded with the insert-earphones used to deliver the stimulation, unplugged from the ears, so that the subjects did not perceive the stimulus. No responses were obtained from the deaf ear of unilaterally hearing impaired subjects, nor in the insert-out-of-the-ear condition in all the subjects, showing that Speech ABR reflects the functioning of the auditory pathways.
When speaker identity is unavoidable: Neural processing of speaker identity cues in natural speech.

Science.gov (United States)

Tuninetti, Alba; Chládková, Kateřina; Peter, Varghese; Schiller, Niels O; Escudero, Paola

2017-11-01

Speech sound acoustic properties vary largely across speakers and accents. When perceiving speech, adult listeners normally disregard non-linguistic variation caused by speaker or accent differences, in order to comprehend the linguistic message, e.g. to correctly identify a speech sound or a word. Here we tested whether the process of normalizing speaker and accent differences, facilitating the recognition of linguistic information, is found at the level of neural processing, and whether it is modulated by the listeners' native language. In a multi-deviant oddball paradigm, native and nonnative speakers of Dutch were exposed to naturally-produced Dutch vowels varying in speaker, sex, accent, and phoneme identity. Unexpectedly, the analysis of mismatch negativity (MMN) amplitudes elicited by each type of change shows a large degree of early perceptual sensitivity to non-linguistic cues. This finding on perception of naturally-produced stimuli contrasts with previous studies examining the perception of synthetic stimuli wherein adult listeners automatically disregard acoustic cues to speaker identity. The present finding bears relevance to speech normalization theories, suggesting that at an unattended level of processing, listeners are indeed sensitive to changes in fundamental frequency in natural speech tokens.
MMSE Estimator for Children’s Speech with Car and Weather Noise

Science.gov (United States)

Sayuthi, V.

2018-04-01

Previous research has noted that most people need and use vehicles for various purposes, now and in the future, as a means of travel. Many activities can be carried out in a vehicle, such as enjoying entertainment and doing work, so vehicles are not only a means of travel. In this study, we examine the speech of a girl in a vehicle that is affected by noise from the car and from the surrounding weather, in this case rain. Vehicle sounds may come from the car engine or the car air conditioner. The minimum mean square error (MMSE) estimator is used to recover the child’s clean speech, with the simulation treating the signals as random processes characterized by the autocorrelation of both the child’s voice and the disturbing noise signal. This MMSE estimator can be regarded as a Wiener filter, since the clean sound is reconstructed from the noisy one. We expect that the results of this study can serve as a basis for the development of entertainment or communication technology for vehicle passengers in the future, particularly using MMSE estimators.
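The link between the linear MMSE estimator and the Wiener filter can be sketched per frequency bin: if speech and noise are uncorrelated with powers S and N, the gain S / (S + N) minimizes the mean square error, which then equals S·N / (S + N). The power values below are hypothetical, and this frequency-domain form is a simplification of the autocorrelation-based formulation the abstract refers to.

```python
def wiener_gain(signal_power, noise_power):
    """Linear MMSE gain for one bin, assuming uncorrelated signal and noise."""
    return signal_power / (signal_power + noise_power)


def mmse_error(signal_power, noise_power):
    """Mean square error that remains after applying the Wiener gain."""
    return signal_power * noise_power / (signal_power + noise_power)


def apply_wiener(noisy_bins, signal_psd, noise_psd):
    """Scale each noisy spectral value by its per-bin Wiener gain."""
    return [wiener_gain(s, n) * y
            for y, s, n in zip(noisy_bins, signal_psd, noise_psd)]


# Hypothetical power spectra: speech dominates the first bin, noise the last.
signal_psd = [4.0, 1.0, 0.25]
noise_psd = [1.0, 1.0, 1.0]
gains = [wiener_gain(s, n) for s, n in zip(signal_psd, noise_psd)]
cleaned = apply_wiener([2.0, 2.0, 2.0], signal_psd, noise_psd)
print(gains)    # close to [0.8, 0.5, 0.2]
print(cleaned)  # bins with a poor signal-to-noise ratio are attenuated most
```

In practice the two power spectra are not known and have to be estimated, for example from noise-only segments of the recording.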
Speech intelligibility for normal hearing and hearing-impaired listeners in simulated room acoustic conditions

DEFF Research Database (Denmark)

Arweiler, Iris; Dau, Torsten; Poulsen, Torben

Speech intelligibility depends on many factors such as room acoustics, the acoustical properties and location of the signal and the interferers, and the ability of the (normal and impaired) auditory system to process monaural and binaural sounds. In the present study, the effect of reverberation...... on spatial release from masking was investigated in normal hearing and hearing impaired listeners using three types of interferers: speech shaped noise, an interfering female talker and speech-modulated noise. Speech reception thresholds (SRT) were obtained in three simulated environments: a listening room......, a classroom and a church. The data from the study provide constraints for existing models of speech intelligibility prediction (based on the speech intelligibility index, SII, or the speech transmission index, STI) which have shortcomings when reverberation and/or fluctuating noise affect speech...
Irrelevant speech does not interfere with serial recall in early blind listeners.

Science.gov (United States)

Kattner, Florian; Ellermeier, Wolfgang

2014-01-01

Phonological working memory is known to be (a) inversely related to the duration of the items to be learned (word-length effect), and (b) impaired by the presence of irrelevant speech-like sounds (irrelevant-speech effect). Because it is still debated whether these memory disruptions are subject to attentional control, both effects were studied in sighted participants and in a sample of early blind individuals who are expected to be superior in selectively attending to auditory stimuli. Results show that, while performance depended on word length in both groups, irrelevant speech interfered with recall only in the sighted group, but not in blind participants. This suggests that blind listeners may be able to effectively prevent irrelevant sound from being encoded in the phonological store, presumably due to superior auditory processing. The occurrence of a word-length effect, however, implies that blind and sighted listeners are utilizing the same phonological rehearsal mechanism in order to maintain information in the phonological store.
Song and speech: examining the link between singing talent and speech imitation ability

Science.gov (United States)

Christiner, Markus; Reiterer, Susanne M.

2013-01-01

In previous research on speech imitation, musicality, and an ability to sing were isolated as the strongest indicators of good pronunciation skills in foreign languages. We, therefore, wanted to take a closer look at the nature of the ability to sing, which shares a common ground with the ability to imitate speech. This study focuses on whether good singing performance predicts good speech imitation. Forty-one singers of different levels of proficiency were selected for the study and their ability to sing, to imitate speech, their musical talent and working memory were tested. Results indicated that singing performance is a better indicator of the ability to imitate speech than the playing of a musical instrument. A multiple regression revealed that 64% of the speech imitation score variance could be explained by working memory together with educational background and singing performance. A second multiple regression showed that 66% of the speech imitation variance of completely unintelligible and unfamiliar language stimuli (Hindi) could be explained by working memory together with a singer's sense of rhythm and quality of voice. This supports the idea that both vocal behaviors have a common grounding in terms of vocal and motor flexibility, ontogenetic and phylogenetic development, neural orchestration and auditory memory with singing fitting better into the category of “speech” on the productive level and “music” on the acoustic level. As a result, good singers benefit from vocal and motor flexibility, productively and cognitively, in three ways. (1) Motor flexibility and the ability to sing improve language and musical function. (2) Good singers retain a certain plasticity and are open to new and unusual sound combinations during adulthood both perceptually and productively. (3) The ability to sing improves the memory span of the auditory working memory. PMID:24319438
Non-linear Dynamics of Speech in Schizophrenia

DEFF Research Database (Denmark)

Fusaroli, Riccardo; Simonsen, Arndis; Weed, Ethan

(regularity and complexity) of speech. Our aims are (1) to achieve a more fine-grained understanding of the speech patterns in schizophrenia than has previously been achieved using traditional, linear measures of prosody and fluency, and (2) to employ the results in a supervised machine-learning process......-effects inference. SANS and SAPS scores were predicted using a 10-fold cross-validated multiple linear regression. Both analyses were iterated 1000 times to test for stability of results. Results: Voice dynamics allowed discrimination of patients with schizophrenia from healthy controls with a balanced accuracy of 85...
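Balanced accuracy, the measure reported above, is the mean of sensitivity and specificity, which keeps a classifier from looking good merely by favoring the larger group. A minimal sketch with made-up labels (1 = patient, 0 = control; not data from the study):

```python
def balanced_accuracy(y_true, y_pred, positive=1):
    """Mean of sensitivity (true positive rate) and specificity (true negative rate)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    tn = sum(1 for t, p in pairs if t != positive and p != positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    return (sensitivity + specificity) / 2


# Made-up, imbalanced example: 2 patients and 8 controls.
y_true = [1, 1, 0, 0, 0, 0, 0, 0, 0, 0]
y_pred = [1, 0, 0, 0, 0, 0, 0, 0, 0, 0]

plain = sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)
balanced = balanced_accuracy(y_true, y_pred)
print(plain, balanced)  # 0.9 0.75
```

The plain accuracy of 0.9 hides the fact that half of the patients were missed, which the balanced figure of 0.75 exposes.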
Processing Complex Sounds Passing through the Rostral Brainstem: The New Early Filter Model

Science.gov (United States)

Marsh, John E.; Campbell, Tom A.

2016-01-01

The rostral brainstem receives both “bottom-up” input from the ascending auditory system and “top-down” descending corticofugal connections. Speech information passing through the inferior colliculus of elderly listeners reflects the periodicity envelope of a speech syllable. This information arguably also reflects a composite of temporal-fine-structure (TFS) information from the higher frequency vowel harmonics of that repeated syllable. The amplitude of those higher frequency harmonics, bearing even higher frequency TFS information, correlates positively with the word recognition ability of elderly listeners under reverberatory conditions. Also relevant is that working memory capacity (WMC), which is subject to age-related decline, constrains the processing of sounds at the level of the brainstem. Turning to the effects of a visually presented sensory or memory load on auditory processes, there is a load-dependent reduction of that processing, as manifest in the auditory brainstem responses (ABR) evoked by to-be-ignored clicks. Wave V decreases in amplitude with increases in the visually presented memory load. A visually presented sensory load also produces a load-dependent reduction of a slightly different sort: The sensory load of visually presented information limits the disruptive effects of background sound upon working memory performance. A new early filter model is thus advanced whereby systems within the frontal lobe (affected by sensory or memory load) cholinergically influence top-down corticofugal connections. Those corticofugal connections constrain the processing of complex sounds such as speech at the level of the brainstem. Selective attention thereby limits the distracting effects of background sound entering the higher auditory system via the inferior colliculus. Processing TFS in the brainstem relates to perception of speech under adverse conditions. Attentional selectivity is crucial when the signal heard is degraded or masked: e
Background Noise Degrades Central Auditory Processing in Toddlers.

Science.gov (United States)

Niemitalo-Haapola, Elina; Haapala, Sini; Jansson-Verkasalo, Eira; Kujala, Teija

2015-01-01

Noise, as an unwanted sound, has become one of modern society's environmental conundrums, and many children are exposed to higher noise levels than previously assumed. However, the effects of background noise on central auditory processing of toddlers, who are still acquiring language skills, have so far not been determined. The authors evaluated the effects of background noise on toddlers' speech-sound processing by recording event-related brain potentials. The hypothesis was that background noise modulates neural speech-sound encoding and degrades speech-sound discrimination. Obligatory P1 and N2 responses for standard syllables and the mismatch negativity (MMN) response for five different syllable deviants presented in a linguistic multifeature paradigm were recorded in silent and background noise conditions. The participants were 18 typically developing 22- to 26-month-old monolingual children with healthy ears. The results showed that the P1 amplitude was smaller and the N2 amplitude larger in the noisy conditions compared with the silent conditions. In the noisy condition, the MMN was absent for the intensity and vowel changes and diminished for the consonant, frequency, and vowel duration changes embedded in speech syllables. Furthermore, the frontal MMN component was attenuated in the noisy condition. However, noise had no effect on P1, N2, or MMN latencies. The results from this study suggest multiple effects of background noise on the central auditory processing of toddlers. It modulates the early stages of sound encoding and dampens neural discrimination vital for accurate speech perception. These results imply that speech processing of toddlers, who may spend long periods of daytime in noisy conditions, is vulnerable to background noise. In noisy conditions, toddlers' neural representations of some speech sounds might be weakened. Thus, special attention should be paid to acoustic conditions and background noise levels in children's daily environments
Rate and rhythm control strategies for apraxia of speech in nonfluent primary progressive aphasia.

Science.gov (United States)

Beber, Bárbara Costa; Berbert, Monalise Costa Batista; Grawer, Ruth Siqueira; Cardoso, Maria Cristina de Almeida Freitas

2018-01-01

The nonfluent/agrammatic variant of primary progressive aphasia is characterized by apraxia of speech and agrammatism. Apraxia of speech limits patients' communication due to slow speaking rate, sound substitutions, articulatory groping, false starts and restarts, segmentation of syllables, and increased difficulty with increasing utterance length. Speech and language therapy is known to benefit individuals with apraxia of speech due to stroke, but little is known about its effects in primary progressive aphasia. This is a case report of a 72-year-old, illiterate housewife, who was diagnosed with nonfluent primary progressive aphasia and received speech and language therapy for apraxia of speech. Rate and rhythm control strategies for apraxia of speech were trained to improve initiation of speech. We discuss the importance of these strategies to alleviate apraxia of speech in this condition and the future perspectives in the area.
Sex Discrimination in Employment Practices.

Science.gov (United States)

California Univ., Los Angeles. Univ. Extension.

The conference on sex discrimination in employment practices was held at the University of California at Los Angeles in cooperation with the Women's Bureau of the Department of Labor. Speeches included: (1) "New Legislation--New Action" by Rosalind K. Loring and William Foster, (2) "Compliance Policies and Procedures for Business and Industry" by…
The sound of feelings: electrophysiological responses to emotional speech in alexithymia.

Directory of Open Access Journals (Sweden)

Katharina Sophia Goerlich

Alexithymia is a personality trait characterized by difficulties in the cognitive processing of emotions (cognitive dimension) and in the experience of emotions (affective dimension). Previous research focused mainly on visual emotional processing in the cognitive alexithymia dimension. We investigated the impact of both alexithymia dimensions on electrophysiological responses to emotional speech in 60 female subjects. During unattended processing, subjects watched a movie while an emotional prosody oddball paradigm was presented in the background. During attended processing, subjects detected deviants in emotional prosody. The cognitive alexithymia dimension was associated with a left-hemisphere bias during early stages of unattended emotional speech processing, and with generally reduced amplitudes of the late P3 component during attended processing. In contrast, the affective dimension did not modulate unattended emotional prosody perception, but was associated with reduced P3 amplitudes during attended processing particularly to emotional prosody spoken in high intensity. Our results provide evidence for a dissociable impact of the two alexithymia dimensions on electrophysiological responses during the attended and unattended processing of emotional prosody. The observed electrophysiological modulations are indicative of a reduced sensitivity to the emotional qualities of speech, which may be a contributing factor to problems in interpersonal communication associated with alexithymia.
Restricted Consonant Inventories of 2-Year-Old Finnish Children with a History of Recurrent Acute Otitis Media

Science.gov (United States)

Haapala, Sini; Niemitalo-Haapola, Elina; Raappana, Antti; Kujala, Tiia; Kujala, Teija; Jansson-Verkasalo, Eira

2015-01-01

Many children experience recurrent acute otitis media (RAOM) in early childhood. In a previous study, 2-year-old children with RAOM were shown to have immature neural patterns for speech sound discrimination. The present study further investigated the consonant inventories of these same children using natural speech samples. The results showed…
Sensorimotor speech disorders in Parkinson's disease: Programming and execution deficits

Directory of Open Access Journals (Sweden)

Karin Zazo Ortiz

Introduction: Dysfunction in the basal ganglia circuits is a determining factor in the physiopathology of the classic signs of Parkinson's disease (PD), and hypokinetic dysarthria is commonly related to PD. Regarding speech disorders associated with PD, the latest four-level framework of speech complicates the traditional view of dysarthria as a motor execution disorder. Based on findings that dysfunctions in basal ganglia can cause speech disorders, and on the premise that the speech deficits seen in PD are not related to an execution motor disorder alone but also to a disorder at the motor programming level, the main objective of this study was to investigate the presence of sensorimotor disorders of programming (besides the execution disorders previously described) in PD patients. Methods: A cross-sectional study was conducted in a sample of 60 adults matched for gender, age and education: 30 adult patients diagnosed with idiopathic PD (PDG) and 30 healthy adults (CG). All types of articulation errors were reanalyzed to investigate the nature of these errors. Interjections, hesitations and repetitions of words or sentences (during discourse) were considered typical disfluencies; blocking and episodes of palilalia (words or syllables) were analyzed as atypical disfluencies. We analysed features including successive self-initiated trials, phoneme distortions, self-correction, repetition of sounds and syllables, prolonged movement transitions, and additions or omissions of sounds and syllables, in order to identify programming and/or execution failures. Orofacial agility was also investigated. Results: The PDG had worse performance on all sensorimotor speech tasks. All PD patients had hypokinetic dysarthria. Conclusion: The clinical characteristics found suggest both execution and programming sensorimotor speech disorders in PD patients.
Not all sounds sound the same: Parkinson's disease affects differently emotion processing in music and in speech prosody.

Science.gov (United States)

Lima, César F; Garrett, Carolina; Castro, São Luís

2013-01-01

Does emotion processing in music and speech prosody recruit common neurocognitive mechanisms? To examine this question, we implemented a cross-domain comparative design in Parkinson's disease (PD). Twenty-four patients and 25 controls performed emotion recognition tasks for music and spoken sentences. In music, patients had impaired recognition of happiness and peacefulness, and intact recognition of sadness and fear; this pattern was independent of general cognitive and perceptual abilities. In speech, patients had a small global impairment, which was significantly mediated by executive dysfunction. Hence, PD affected differently musical and prosodic emotions. This dissociation indicates that the mechanisms underlying the two domains are partly independent.
Sound sensitivity of neurons in rat hippocampus during performance of a sound-guided task

Science.gov (United States)

Vinnik, Ekaterina; Honey, Christian; Schnupp, Jan; Diamond, Mathew E.

2012-01-01

To investigate how hippocampal neurons encode sound stimuli, and the conjunction of sound stimuli with the animal's position in space, we recorded from neurons in the CA1 region of hippocampus in rats while they performed a sound discrimination task. Four different sounds were used, two associated with water reward on the right side of the animal and the other two with water reward on the left side. This allowed us to separate neuronal activity related to sound identity from activity related to response direction. To test the effect of spatial context on sound coding, we trained rats to carry out the task on two identical testing platforms at different locations in the same room. Twenty-one percent of the recorded neurons exhibited sensitivity to sound identity, as quantified by the difference in firing rate for the two sounds associated with the same response direction. Sensitivity to sound identity was often observed on only one of the two testing platforms, indicating an effect of spatial context on sensory responses. Forty-three percent of the neurons were sensitive to response direction, and the probability that any one neuron was sensitive to response direction was statistically independent from its sensitivity to sound identity. There was no significant coding for sound identity when the rats heard the same sounds outside the behavioral task. These results suggest that CA1 neurons encode sound stimuli, but only when those sounds are associated with actions. PMID:22219030
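The statement that sensitivity to response direction was statistically independent of sensitivity to sound identity has a simple quantitative reading: under independence, the fraction of neurons sensitive to both should be close to the product of the two reported fractions. The sketch below works this out; the population size of 200 is a hypothetical figure used only to turn fractions into counts.

```python
def expected_joint_counts(n, p_a, p_b):
    """Expected 2x2 cell counts when properties A and B are independent."""
    return {
        "both": n * p_a * p_b,
        "a_only": n * p_a * (1 - p_b),
        "b_only": n * (1 - p_a) * p_b,
        "neither": n * (1 - p_a) * (1 - p_b),
    }


p_sound = 0.21      # fraction sensitive to sound identity (from the abstract)
p_direction = 0.43  # fraction sensitive to response direction (from the abstract)

expected_both = p_sound * p_direction
counts = expected_joint_counts(200, p_sound, p_direction)  # 200 neurons is hypothetical
print(round(expected_both, 4))  # 0.0903, i.e. about 9% of neurons
print(counts)
```

An observed joint fraction far from about 9% would have argued against independence; the abstract reports that no such departure was found.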
Awareness of rhythm patterns in speech and music in children with specific language impairments

Directory of Open Access Journals (Sweden)

Ruth Cumming

2015-12-01

Children with specific language impairments (SLIs) show impaired perception and production of language, and also show impairments in perceiving auditory cues to rhythm (amplitude rise time [ART] and sound duration) and in tapping to a rhythmic beat. Here we explore potential links between language development and rhythm perception in 45 children with SLI and 50 age-matched controls. We administered three rhythmic tasks, a musical beat detection task, a tapping-to-music task, and a novel music/speech task, which varied rhythm and pitch cues independently or together in both speech and music. Via low-pass filtering, the music sounded as though it was played from a low-quality radio and the speech sounded as though it was muffled (heard behind the door). We report data for all of the SLI children (N = 45, IQ varying), as well as for two independent subgroupings with intact IQ. One subgroup, Pure SLI, had intact phonology and reading (N = 16); the other, SLI PPR (N = 15), had impaired phonology and reading. When IQ varied (all SLI children), we found significant group differences in all the rhythmic tasks. For the Pure SLI group, there were rhythmic impairments in the tapping task only. For children with SLI and poor phonology (SLI PPR), group differences were found in all of the filtered speech/music AXB tasks. We conclude that difficulties with rhythmic cues in both speech and music are present in children with SLIs, but that some rhythmic measures are more sensitive than others. The data are interpreted within a ‘prosodic phrasing’ hypothesis, and we discuss the potential utility of rhythmic and musical interventions in remediating speech and language difficulties in children.
From birdsong to human speech recognition: Bayesian inference on a hierarchy of nonlinear dynamical systems.

Science.gov (United States)

Yildiz, Izzet B; von Kriegstein, Katharina; Kiebel, Stefan J

2013-01-01

Our knowledge about the computational mechanisms underlying human learning and recognition of sound sequences, especially speech, is still very limited. One difficulty in deciphering the exact means by which humans recognize speech is that experimental findings at a neuronal, microscopic level are scarce. Here, we show that our neuronal-computational understanding of speech learning and recognition may be vastly improved by looking at an animal model, i.e., the songbird, which faces the same challenge as humans: to learn and decode complex auditory input in an online fashion. Motivated by striking similarities between the human and songbird neural recognition systems at the macroscopic level, we assumed that the human brain uses the same computational principles at a microscopic level and translated a birdsong model into a novel human sound learning and recognition model with an emphasis on speech. We show that the resulting Bayesian model with a hierarchy of nonlinear dynamical systems can learn speech samples such as words rapidly and recognize them robustly, even in adverse conditions. In addition, we show that recognition can be performed even when words are spoken by different speakers and with different accents, an everyday situation in which current state-of-the-art speech recognition models often fail. The model can also be used to qualitatively explain behavioral data on human speech learning and derive predictions for future experiments.
Human phoneme recognition depending on speech-intrinsic variability.

Science.gov (United States)

Meyer, Bernd T; Jürgens, Tim; Wesker, Thorsten; Brand, Thomas; Kollmeier, Birger

2010-11-01

The influence of different sources of speech-intrinsic variation (speaking rate, effort, style and dialect or accent) on human speech perception was investigated. In listening experiments with 16 listeners, confusions of consonant-vowel-consonant (CVC) and vowel-consonant-vowel (VCV) sounds in speech-weighted noise were analyzed. Experiments were based on the OLLO logatome speech database, which was designed for a man-machine comparison. It contains utterances spoken by 50 speakers from five dialect/accent regions and covers several intrinsic variations. By comparing results depending on intrinsic and extrinsic variations (i.e., different levels of masking noise), the degradation induced by variabilities can be expressed in terms of the SNR. The spectral level distance between the respective speech segment and the long-term spectrum of the masking noise was found to be a good predictor for recognition rates, while phoneme confusions were influenced by the distance to spectrally close phonemes. An analysis based on transmitted information of articulatory features showed that voicing and manner of articulation are comparatively robust cues in the presence of intrinsic variations, whereas the coding of place is more degraded. The database and detailed results have been made available for comparisons between human speech recognition (HSR) and automatic speech recognizers (ASR).
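Note on the "transmitted information" analysis mentioned in the abstract above: the abstract does not give a formula, so the following is only a minimal sketch of the measure as it is commonly computed from a confusion matrix (the mutual information between stimulus and response classes, optionally normalized by the stimulus entropy). The function name `transmitted_information` and the example counts are hypothetical and are not taken from the study.

```python
import math


def transmitted_information(confusions):
    """Return (T, T_rel) for a confusion-count matrix.

    confusions[i][j] is the number of trials on which stimulus class i
    (e.g. "voiced") was reported as response class j.
    T is the mutual information in bits; T_rel is T divided by the
    entropy of the stimulus classes (0 = chance, 1 = perfect transmission).
    """
    total = sum(sum(row) for row in confusions)
    if total == 0:
        raise ValueError("confusion matrix contains no trials")

    n_cols = len(confusions[0])
    row_p = [sum(row) / total for row in confusions]
    col_p = [sum(row[j] for row in confusions) / total for j in range(n_cols)]

    t = 0.0
    for i, row in enumerate(confusions):
        for j, count in enumerate(row):
            if count > 0:  # empty cells contribute nothing to the sum
                p = count / total
                t += p * math.log2(p / (row_p[i] * col_p[j]))

    stimulus_entropy = -sum(p * math.log2(p) for p in row_p if p > 0)
    t_rel = t / stimulus_entropy if stimulus_entropy > 0 else 0.0
    return t, t_rel


# Illustrative counts only (hypothetical, not data from the study):
# rows = presented voiced/unvoiced, columns = reported voiced/unvoiced.
voicing = [[45, 5], [8, 42]]
t_bits, t_rel = transmitted_information(voicing)
print(f"T = {t_bits:.3f} bits, relative T = {t_rel:.3f}")
```

To score a single articulatory feature such as voicing, manner, or place, the phoneme confusion matrix would first be collapsed into that feature's classes; a feature whose relative value stays high across speaking conditions would count as a robust cue in the sense used in the abstract.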
