WorldWideScience

Sample records for speech intelligibility articulation

  1. Intelligibility of synthetic speech in the presence of interfering speech

    NARCIS (Netherlands)

    Eggen, J.H.

    1989-01-01

    Standard articulation tests are not always sensitive enough to discriminate between speech samples which are of high intelligibility. One can increase the sensitivity of such tests by presenting the test materials in noise. In this way, small differences in intelligibility can be magnified into

  2. [Clinical characteristics and speech therapy of lingua-apical articulation disorder].

    Science.gov (United States)

    Zhang, Feng-hua; Jin, Xing-ming; Zhang, Yi-wen; Wu, Hong; Jiang, Fan; Shen, Xiao-ming

    2006-03-01

    To explore the clinical characteristics and speech therapy of 62 children with lingua-apical articulation disorder. The Peabody Picture Vocabulary Test (PPVT), Gesell Developmental Schedules (Gesell), Wechsler Preschool and Primary Scale of Intelligence (WPPSI) and a speech test were administered to 62 children aged 3 to 8 years with lingua-apical articulation disorder. The PPVT was used to measure receptive vocabulary skills; the Gesell and WPPSI were used to assess cognitive and non-verbal ability; the speech test was used to assess speech development. The children received speech therapy and auxiliary oral-motor functional training once or twice a week. First, the target sound was identified according to speech development milestones; then the method of speech localization was used to establish the correct articulation placement and manner. For children with oral-motor dysfunction, it was also necessary to modify food texture and administer oral-motor functional training. The 62 cases of apical articulation disorder were classified into four groups. The combined pattern of articulation disorder was the most common (40 cases, 64.5%), followed by apico-dental disorder (15 cases, 24.2%), palatal disorder (4 cases, 6.5%) and linguo-alveolar disorder (3 cases, 4.8%). Substitution errors of velars were the most common (95.2%), followed by omission errors (30.6%) and absence of aspiration (12.9%). Oral-motor dysfunction was found in some children, with problems such as disordered joint movement of tongue and head, unstable jaw, weak tongue strength and poor coordination of tongue movement. Some children had feeding problems such as a preference for soft food, holding food in the mouth, eating slowly, and poor chewing. After 5 to 18 therapy sessions, the effective rate of speech therapy reached 82.3%. The lingua-apical articulation disorders can be classified into four groups. The combined pattern of the

  3. Lexical effects on speech production and intelligibility in Parkinson's disease

    Science.gov (United States)

    Chiu, Yi-Fang

    Individuals with Parkinson's disease (PD) often have speech deficits that lead to reduced speech intelligibility. Previous research provides a rich database regarding the articulatory deficits associated with PD, including restricted vowel space (Skodda, Visser, & Schlegel, 2011) and flatter formant transitions (Tjaden & Wilding, 2004; Walsh & Smith, 2012). However, few studies consider the effect of higher-level structural variables of word usage frequency and the number of similar-sounding words (i.e., neighborhood density) on lower-level articulation or on listeners' perception of dysarthric speech. The purpose of the study was to examine the interaction of lexical properties and speech articulation, as measured acoustically, in speakers with PD and healthy controls (HC), and the effect of lexical properties on the perception of their speech. Individuals diagnosed with PD and age-matched healthy controls read sentences with words that varied in word frequency and neighborhood density. Acoustic analysis was performed to compare second formant transitions in diphthongs, an indicator of the dynamics of tongue movement during speech production, across different lexical characteristics. Young listeners transcribed the spoken sentences, and transcription accuracy was compared across lexical conditions. The acoustic results indicate that both PD and HC speakers adjusted their articulation based on lexical properties, but the PD group had significant reductions in second formant transitions compared to HC. Both groups of speakers increased second formant transitions for words with low frequency and low density, but the lexical effect was diphthong-dependent. The change in second formant slope was limited in the PD group when the required formant movement for the diphthong was small. The data from listeners' perception of the speech by PD and HC show that listeners identified high-frequency words with greater accuracy, suggesting the use of lexical knowledge during the

  4. Speech misperception: speaking and seeing interfere differently with hearing.

    Directory of Open Access Journals (Sweden)

    Takemi Mochida

    Speech perception is thought to be linked to speech motor production. This linkage is considered to mediate multimodal aspects of speech perception, such as audio-visual and audio-tactile integration. However, direct coupling between articulatory movement and auditory perception has been little studied. The present study reveals a clear dissociation between the effects of a listener's own speech action and the effects of viewing another's speech movements on the perception of auditory phonemes. We assessed the intelligibility of the syllables [pa], [ta], and [ka] when listeners silently and simultaneously articulated syllables that were congruent/incongruent with the syllables they heard. The intelligibility was compared with a condition where the listeners simultaneously watched another's mouth producing congruent/incongruent syllables, but did not articulate. The intelligibility of [ta] and [ka] was degraded by articulating [ka] and [ta] respectively, which are associated with the same primary articulator (the tongue) as the heard syllables. But it was not affected by articulating [pa], which is associated with a different primary articulator (the lips) from the heard syllables. In contrast, the intelligibility of [ta] and [ka] was degraded by watching the production of [pa]. These results indicate that the articulatory-induced distortion of speech perception occurs in an articulator-specific manner, while visually induced distortion does not. The articulator-specific nature of the auditory-motor interaction in speech perception suggests that speech motor processing directly contributes to our ability to hear speech.

  5. Joint Dictionary Learning-Based Non-Negative Matrix Factorization for Voice Conversion to Improve Speech Intelligibility After Oral Surgery.

    Science.gov (United States)

    Fu, Szu-Wei; Li, Pei-Chun; Lai, Ying-Hui; Yang, Cheng-Chien; Hsieh, Li-Chun; Tsao, Yu

    2017-11-01

    Objective: This paper focuses on machine learning-based voice conversion (VC) techniques for improving the speech intelligibility of surgical patients who have had parts of their articulators removed. Because of the removal of parts of the articulator, a patient's speech may be distorted and difficult to understand. To overcome this problem, VC methods can be applied to convert the distorted speech so that it is clearer and more intelligible. To design an effective VC method, two key points must be considered: 1) the amount of training data may be limited (because speaking for a long time is usually difficult for postoperative patients); 2) rapid conversion is desirable (for better communication). Methods: We propose a novel joint dictionary learning-based non-negative matrix factorization (JD-NMF) algorithm. Compared to conventional VC techniques, JD-NMF can perform VC efficiently and effectively with only a small amount of training data. Results: The experimental results demonstrate that the proposed JD-NMF method not only achieves notably higher short-time objective intelligibility (STOI) scores (a standardized objective intelligibility evaluation metric) than those obtained using the original unconverted speech but is also significantly more efficient and effective than a conventional exemplar-based NMF VC method. Conclusion: The proposed JD-NMF method may outperform the state-of-the-art exemplar-based NMF VC method in terms of STOI scores under the desired scenario. Significance: We confirmed the advantages of the proposed joint training criterion for the NMF-based VC. Moreover, we verified that the proposed JD-NMF can effectively improve the speech intelligibility scores of oral surgery patients.
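
The joint-dictionary idea can be illustrated with a toy sketch (not the authors' implementation): temporally aligned source and target spectra are stacked so that a single NMF factorization forces both to share one activation matrix. The matrices, rank, and iteration count below are made-up illustrative values.

```python
import random

def nmf(V, r, iters=200, seed=0):
    # Plain multiplicative-update NMF: V (m x n) ~= W (m x r) @ H (r x n).
    rng = random.Random(seed)
    m, n = len(V), len(V[0])
    W = [[rng.random() + 0.1 for _ in range(r)] for _ in range(m)]
    H = [[rng.random() + 0.1 for _ in range(n)] for _ in range(r)]

    def matmul(A, B):
        return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
                 for j in range(len(B[0]))] for i in range(len(A))]

    def T(A):
        return [list(row) for row in zip(*A)]

    eps = 1e-9
    for _ in range(iters):
        WH = matmul(W, H)
        WtV, WtWH = matmul(T(W), V), matmul(T(W), WH)
        for i in range(r):                      # H <- H * (W^T V) / (W^T W H)
            for j in range(n):
                H[i][j] *= WtV[i][j] / (WtWH[i][j] + eps)
        WH = matmul(W, H)
        VHt, WHHt = matmul(V, T(H)), matmul(WH, T(H))
        for i in range(m):                      # W <- W * (V H^T) / (W H H^T)
            for j in range(r):
                W[i][j] *= VHt[i][j] / (WHHt[i][j] + eps)
    return W, H

# Joint training: stack aligned source and target spectra so both are
# explained by the SAME activations H (toy 2-bin x 3-frame data).
V_src = [[1.0, 0.2, 0.1], [0.1, 1.0, 0.2]]
V_tgt = [[0.2, 1.0, 0.1], [1.0, 0.1, 0.9]]
W, H = nmf(V_src + V_tgt, r=2)
W_src, W_tgt = W[:2], W[2:]   # split the joint dictionary back apart
# At conversion time, new source frames would be decomposed against W_src
# and resynthesized with W_tgt using the resulting activations.
```

Because source and target rows are factorized together, the two sub-dictionaries stay paired column-for-column, which is what lets a small amount of training data suffice.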

  6. ARTICULATION DISORDERS IN SERBIAN LANGUAGE IN CHILDREN WITH SPEECH PATHOLOGY.

    Science.gov (United States)

    Dmitrić, Tanja; Veselinović, Mila; Mitrović, Slobodan M

    2015-01-01

    Articulation is the result of the action of the speech organs, meaning the clean, clear and distinct pronunciation of speech sounds in words. A prospective study included 24 children between 5 and 15 years of age, of both sexes. All children were monolingual, Serbian being their native language. The quality of articulation was tested with the Triage articulation test. Neither omission nor distortion of plosives was observed in any of them, whereas substitution of plosives occurred in 12% of patients. Omission of affricates was not observed in any of the subjects, but substitution and distortion occurred in 29% and 76% of subjects, respectively. Omission of fricatives was found in 29% of subjects, substitution in 52%, and distortion in 82% of subjects. Omission and distortion of nasals were not recorded in any of the subjects, and substitution occurred in 6% of children. Omission of laterals was observed in 6%, substitution in 46% and distortion in 52% of subjects with articulation disorders. Discussion and Conclusion: Articulation disorders were observed not only in children diagnosed with dyslalia but in those with dysphasia and stuttering as well. Children with speech disorders articulate vowels best, then nasals and plosives. Articulation of fricatives and laterals was found to be most severely deviated, including all three disorders, i.e. substitution, omission and distortion. Spasms of speech muscles and vegetative reactions were also observed in this study, but only in children with stuttering.

  7. Predicting speech intelligibility in conditions with nonlinearly processed noisy speech

    DEFF Research Database (Denmark)

    Jørgensen, Søren; Dau, Torsten

    2013-01-01

    The speech-based envelope power spectrum model (sEPSM; [1]) was proposed in order to overcome the limitations of the classical speech transmission index (STI) and speech intelligibility index (SII). The sEPSM applies the signal-tonoise ratio in the envelope domain (SNRenv), which was demonstrated...... to successfully predict speech intelligibility in conditions with nonlinearly processed noisy speech, such as processing with spectral subtraction. Moreover, a multiresolution version (mr-sEPSM) was demonstrated to account for speech intelligibility in various conditions with stationary and fluctuating...
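
The core quantity of the sEPSM family can be sketched in a few lines; the band powers and the root-sum-of-squares combination below are simplified placeholders, not the model's full multi-resolution machinery.

```python
def snr_env(p_env_mix, p_env_noise, floor=0.001):
    # Per-modulation-band envelope SNR: (P_mix - P_noise) / P_noise,
    # floored at a small positive value so the metric stays defined.
    return [max((pm - pn) / pn, floor) for pm, pn in zip(p_env_mix, p_env_noise)]

def overall_snr_env(band_snrs):
    # Combine bands as if their contributions were independent
    # (root-sum-of-squares across modulation bands).
    return sum(s * s for s in band_snrs) ** 0.5

# Toy envelope powers in two modulation bands (noisy mixture vs. noise alone):
bands = snr_env([2.0, 1.5], [1.0, 1.0])
metric = overall_snr_env(bands)
```

The key contrast with STI/SII is visible even in this sketch: the metric is computed from envelope power after processing, so nonlinear operations such as spectral subtraction, which add envelope noise power, lower SNRenv where they would not lower a classical SNR.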

  8. Intelligibility for Binaural Speech with Discarded Low-SNR Speech Components.

    Science.gov (United States)

    Schoenmaker, Esther; van de Par, Steven

    2016-01-01

    Speech intelligibility in multitalker settings improves when the target speaker is spatially separated from the interfering speakers. A factor that may contribute to this improvement is the improved detectability of target-speech components due to binaural interaction in analogy to the Binaural Masking Level Difference (BMLD). This would allow listeners to hear target speech components within specific time-frequency intervals that have a negative SNR, similar to the improvement in the detectability of a tone in noise when these contain disparate interaural difference cues. To investigate whether these negative-SNR target-speech components indeed contribute to speech intelligibility, a stimulus manipulation was performed where all target components were removed when local SNRs were smaller than a certain criterion value. It can be expected that for sufficiently high criterion values target speech components will be removed that do contribute to speech intelligibility. For spatially separated speakers, assuming that a BMLD-like detection advantage contributes to intelligibility, degradation in intelligibility is expected already at criterion values below 0 dB SNR. However, for collocated speakers it is expected that higher criterion values can be applied without impairing speech intelligibility. Results show that degradation of intelligibility for separated speakers is only seen for criterion values of 0 dB and above, indicating a negligible contribution of a BMLD-like detection advantage in multitalker settings. These results show that the spatial benefit is related to a spatial separation of speech components at positive local SNRs rather than to a BMLD-like detection improvement for speech components at negative local SNRs.
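
The stimulus manipulation described here amounts to thresholding time-frequency cells on their local SNR; a minimal sketch (with made-up cell magnitudes and no actual filterbank):

```python
import math

def discard_low_snr(target_tf, masker_tf, criterion_db):
    # Keep a target time-frequency cell only when its local SNR (in dB)
    # meets the criterion; otherwise zero it out.
    kept = []
    for t, m in zip(target_tf, masker_tf):
        local_snr_db = 10.0 * math.log10(t / m)
        kept.append(t if local_snr_db >= criterion_db else 0.0)
    return kept

# First cell: ~ +3 dB local SNR (kept); second: -10 dB (discarded at a 0 dB criterion).
out = discard_low_snr([1.0, 0.1], [0.5, 1.0], criterion_db=0.0)
```

Raising `criterion_db` removes progressively more of the target; the study's logic is that if negative-SNR components mattered (via a BMLD-like advantage), intelligibility should already drop for criteria below 0 dB.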

  9. Direct and indirect measures of speech articulator motions using low power EM sensors

    International Nuclear Information System (INIS)

    Barnes, T; Burnett, G; Gable, T; Holzrichter, J F; Ng, L

    1999-01-01

    Low power Electromagnetic (EM) Wave sensors can measure general properties of human speech articulator motions, as speech is produced. See Holzrichter, Burnett, Ng, and Lea, J.Acoust.Soc.Am. 103 (1) 622 (1998). Experiments have demonstrated extremely accurate pitch measurements ( and lt; 1 Hz per pitch cycle) and accurate onset of voiced speech. Recent measurements of pressure-induced tracheal motions enable very good spectra and amplitude estimates of a voiced excitation function. The use of the measured excitation functions and pitch synchronous processing enable the determination of each pitch cycle of an accurate transfer function and, indirectly, of the corresponding articulator motions. In addition, direct measurements have been made of EM wave reflections from articulator interfaces, including jaw, tongue, and palate, simultaneously with acoustic and glottal open/close signals. While several types of EM sensors are suitable for speech articulator measurements, the homodyne sensor has been found to provide good spatial and temporal resolution for several applications

  10. Speech Intelligibility Evaluation for Mobile Phones

    DEFF Research Database (Denmark)

    Jørgensen, Søren; Cubick, Jens; Dau, Torsten

    2015-01-01

    In the development process of modern telecommunication systems, such as mobile phones, it is common practice to use computer models to objectively evaluate the transmission quality of the system, instead of time-consuming perceptual listening tests. Such models have typically focused on the quality...... of the transmitted speech, while little or no attention has been provided to speech intelligibility. The present study investigated to what extent three state-of-the art speech intelligibility models could predict the intelligibility of noisy speech transmitted through mobile phones. Sentences from the Danish...... Dantale II speech material were mixed with three different kinds of background noise, transmitted through three different mobile phones, and recorded at the receiver via a local network simulator. The speech intelligibility of the transmitted sentences was assessed by six normal-hearing listeners...

  11. Sensory integration dysfunction affects efficacy of speech therapy on children with functional articulation disorders

    Directory of Open Access Journals (Sweden)

    Tung LC

    2013-01-01

    Li-Chen Tung,1,# Chin-Kai Lin,2,# Ching-Lin Hsieh,3,4 Ching-Chi Chen,1 Chin-Tsan Huang,1 Chun-Hou Wang5,6 1Department of Physical Medicine and Rehabilitation, Chi Mei Medical Center, Tainan, 2Program of Early Intervention, Department of Early Childhood Education, National Taichung University of Education, Taichung, 3School of Occupational Therapy, College of Medicine, National Taiwan University, Taipei, 4Department of Physical Medicine and Rehabilitation, National Taiwan University Hospital, Taipei, 5School of Physical Therapy, College of Medical Science and Technology, Chung Shan Medical University, Taichung, 6Physical Therapy Room, Chung Shan Medical University Hospital, Taichung, Taiwan. #These authors contributed equally. Background: Articulation disorders in young children are due to defects occurring at a certain stage in sensory and motor development. Some children with functional articulation disorders may also have sensory integration dysfunction (SID). We hypothesized that speech therapy would be less efficacious in children with SID than in those without SID. Hence, the purpose of this study was to compare the efficacy of speech therapy in two groups of children with functional articulation disorders: those without and those with SID. Method: A total of 30 young children with functional articulation disorders were divided into two groups, the no-SID group (15 children) and the SID group (15 children). The number of pronunciation mistakes was evaluated before and after speech therapy. Results: There were no statistically significant differences in age, sex, sibling order, education of parents, and pretest number of mistakes in pronunciation between the two groups (P > 0.05). The mean and standard deviation in the pre- and posttest number of mistakes in pronunciation were 10.5 ± 3.2 and 3.3 ± 3.3 in the no-SID group, and 10.1 ± 2.9 and 6.9 ± 3.5 in the SID group, respectively. Results showed great changes after speech therapy treatment (F

  12. Intelligibility of clear speech: effect of instruction.

    Science.gov (United States)

    Lam, Jennifer; Tjaden, Kris

    2013-10-01

    The authors investigated how clear speech instructions influence sentence intelligibility. Twelve speakers produced sentences in habitual, clear, hearing impaired, and overenunciate conditions. Stimuli were amplitude normalized and mixed with multitalker babble for orthographic transcription by 40 listeners. The main analysis investigated percentage-correct intelligibility scores as a function of the 4 conditions and speaker sex. Additional analyses included listener response variability, individual speaker trends, and an alternate intelligibility measure: proportion of content words correct. Relative to the habitual condition, the overenunciate condition was associated with the greatest intelligibility benefit, followed by the hearing impaired and clear conditions. Ten speakers followed this trend. The results indicated different patterns of clear speech benefit for male and female speakers. Greater listener variability was observed for speakers with inherently low habitual intelligibility compared to speakers with inherently high habitual intelligibility. Stable proportions of content words were observed across conditions. Clear speech instructions affected the magnitude of the intelligibility benefit. The instruction to overenunciate may be most effective in clear speech training programs. The findings may help explain the range of clear speech intelligibility benefit previously reported. Listener variability analyses suggested the importance of obtaining multiple listener judgments of intelligibility, especially for speakers with inherently low habitual intelligibility.

  13. Dealing with Phrase Level Co-Articulation (PLC) in speech recognition: a first approach

    NARCIS (Netherlands)

    Ordelman, Roeland J.F.; van Hessen, Adrianus J.; van Leeuwen, David A.; Robinson, Tony; Renals, Steve

    1999-01-01

    Whereas nowadays within-word co-articulation effects are usually sufficiently dealt with in automatic speech recognition, this is not always the case with phrase level co-articulation effects (PLC). This paper describes a first approach in dealing with phrase level co-articulation by applying these

  14. Optimizing acoustical conditions for speech intelligibility in classrooms

    Science.gov (United States)

    Yang, Wonyoung

    High speech intelligibility is imperative in classrooms where verbal communication is critical. However, the optimal acoustical conditions to achieve a high degree of speech intelligibility have previously been investigated with inconsistent results, and practical room-acoustical solutions to optimize the acoustical conditions for speech intelligibility have not been developed. This experimental study validated auralization for speech-intelligibility testing, investigated the optimal reverberation for speech intelligibility for both normal and hearing-impaired listeners using more realistic room-acoustical models, and proposed an optimal sound-control design for speech intelligibility based on the findings. The auralization technique was used to perform subjective speech-intelligibility tests. The validation study, comparing auralization results with those of real classroom speech-intelligibility tests, found that if the room to be auralized is not very absorptive or noisy, speech-intelligibility tests using auralization are valid. The speech-intelligibility tests were done in two different auralized sound fields (approximately diffuse and non-diffuse) using the Modified Rhyme Test and both normal and hearing-impaired listeners. A hybrid room-acoustical prediction program was used throughout the work; together with a 1/8 scale-model classroom, it was also used to evaluate the effects of ceiling barriers and reflectors. For both subject groups, in approximately diffuse sound fields, when the speech source was closer to the listener than the noise source, the optimal reverberation time was zero. When the noise source was closer to the listener than the speech source, the optimal reverberation time was 0.4 s (with another peak at 0.0 s) with relative output power levels of the speech and noise sources SNS = 5 dB, and 0.8 s with SNS = 0 dB. 
In non-diffuse sound fields, when the noise source was between the speaker and the listener, the optimal reverberation time was 0.6 s with

  15. Human speech articulator measurements using low power, 2 GHz homodyne sensors

    International Nuclear Information System (INIS)

    Barnes, T; Burnett, G C; Holzrichter, J F

    1999-01-01

    Very low power, short-range microwave "radar-like" sensors can measure the motions and vibrations of internal human speech articulators as speech is produced. In these animate (and also in inanimate) acoustic systems, microwave sensors can measure vibration information associated with excitation sources and other interfaces. These data, together with the corresponding acoustic data, enable the calculation of system transfer functions. This information appears to be useful for a surprisingly wide range of applications, such as speech coding and recognition, speaker or object identification, speech and musical instrument synthesis, noise cancellation, and other applications

  16. Detecting self-produced speech errors before and after articulation: An ERP investigation

    Directory of Open Access Journals (Sweden)

    Kevin Michael Trewartha

    2013-11-01

    It has been argued that speech production errors are monitored by the same neural system involved in monitoring other types of action errors. Behavioral evidence has shown that speech errors can be detected and corrected prior to articulation, yet the neural basis for such pre-articulatory speech error monitoring is poorly understood. The current study investigated speech error monitoring using a phoneme-substitution task known to elicit speech errors. Stimulus-locked event-related potential (ERP) analyses comparing correct and incorrect utterances were used to assess pre-articulatory error monitoring, and response-locked ERP analyses were used to assess post-articulatory monitoring. Our novel finding in the stimulus-locked analysis revealed that words that ultimately led to a speech error were associated with a larger P2 component at midline sites (FCz, Cz, and CPz). This early positivity may reflect the detection of an error in speech formulation, or a predictive mechanism to signal the potential for an upcoming speech error. The data also revealed that general conflict monitoring mechanisms are involved during this task, as both correct and incorrect responses elicited an anterior N2 component typically associated with conflict monitoring. The response-locked analyses corroborated previous observations that self-produced speech errors led to a fronto-central ERN. These results demonstrate that speech errors can be detected prior to articulation, and that speech error monitoring relies on a central error monitoring mechanism.

  17. Speech Intelligibility

    Science.gov (United States)

    Brand, Thomas

    Speech intelligibility (SI) is important in various fields of research, engineering, and diagnostics for quantifying very different phenomena: the quality of recordings, of communication and playback devices, the reverberation of auditoria, characteristics of hearing impairment, the benefit of using hearing aids, or combinations of these.

  18. Human speech articulator measurements using low power, 2 GHz homodyne sensors

    Energy Technology Data Exchange (ETDEWEB)

    Barnes, T; Burnett, G C; Holzrichter, J F

    1999-06-29

    Very low power, short-range microwave "radar-like" sensors can measure the motions and vibrations of internal human speech articulators as speech is produced. In these animate (and also in inanimate) acoustic systems, microwave sensors can measure vibration information associated with excitation sources and other interfaces. These data, together with the corresponding acoustic data, enable the calculation of system transfer functions. This information appears to be useful for a surprisingly wide range of applications, such as speech coding and recognition, speaker or object identification, speech and musical instrument synthesis, noise cancellation, and other applications.

  19. Comparison of two speech privacy measurements, articulation index (AI) and speech privacy noise isolation class (NIC'), in open workplaces

    Science.gov (United States)

    Yoon, Heakyung C.; Loftness, Vivian

    2002-05-01

    Lack of speech privacy has been reported to be the main source of dissatisfaction among occupants in open workplaces, according to workplace surveys. Two speech privacy measurements, the Articulation Index (AI), standardized by the American National Standards Institute in 1969, and the Speech Privacy Noise Isolation Class (NIC', Noise Isolation Class Prime), adapted from the Noise Isolation Class (NIC) by the U.S. General Services Administration (GSA) in 1979, have been claimed as objective tools to measure speech privacy in open offices. To evaluate which of them, normal privacy for AI or satisfied privacy for NIC', is the better tool for assessing speech privacy in a dynamic open office environment, measurements were taken in the field. AIs and NIC's were measured for different partition heights and workplace configurations, following ASTM E1130 (Standard Test Method for Objective Measurement of Speech Privacy in Open Offices Using Articulation Index) and GSA tests PBS-C.1 (Method for the Direct Measurement of Speech-Privacy Potential (SPP) Based on Subjective Judgments) and PBS-C.2 (Public Building Service Standard Method of Test Method for the Sufficient Verification of Speech-Privacy Potential (SPP) Based on Objective Measurements Including Methods for the Rating of Functional Interzone Attenuation and NC-Background), respectively.
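
For orientation, the band-weighted idea behind the Articulation Index can be sketched as follows; the three bands, the weights, and the simple 0-30 dB clipping are an illustrative reduction, not the full ANSI S3.5-1969 procedure (which uses many more bands and tabulated importance functions).

```python
def articulation_index(band_snr_db, weights):
    # Simplified AI: clip each band's SNR to a 30 dB useful range,
    # normalize to 0..1, weight by band importance, and sum.
    assert abs(sum(weights) - 1.0) < 1e-6   # importance weights sum to 1
    ai = 0.0
    for snr, w in zip(band_snr_db, weights):
        ai += w * (min(max(snr, 0.0), 30.0) / 30.0)
    return ai

# Toy three-band example: one fully useful band, one half-useful, one masked.
ai = articulation_index([30.0, 15.0, 0.0], [0.5, 0.3, 0.2])
```

For privacy assessment the logic inverts: a *low* AI at a listener's position (heavily masked speech bands) indicates *high* speech privacy.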

  20. The impact of brief restriction to articulation on children's subsequent speech production.

    Science.gov (United States)

    Seidl, Amanda; Brosseau-Lapré, Françoise; Goffman, Lisa

    2018-02-01

    This project explored whether disruption of articulation during listening impacts subsequent speech production in 4-yr-olds with and without speech sound disorder (SSD). During novel word learning, typically-developing children showed effects of articulatory disruption as revealed by larger differences between two acoustic cues to a sound contrast, but children with SSD were unaffected by articulatory disruption. Findings suggest that, when typically developing 4-yr-olds experience an articulatory disruption during a listening task, the children's subsequent production is affected. Children with SSD show less influence of articulatory experience during perception, which could be the result of impaired or attenuated ties between perception and articulation.

  1. Segmental intelligibility of synthetic speech produced by rule.

    Science.gov (United States)

    Logan, J S; Greene, B G; Pisoni, D B

    1989-08-01

    This paper reports the results of an investigation that employed the modified rhyme test (MRT) to measure the segmental intelligibility of synthetic speech generated automatically by rule. Synthetic speech produced by ten text-to-speech systems was studied and compared to natural speech. A variation of the standard MRT was also used to study the effects of response set size on perceptual confusions. Results indicated that the segmental intelligibility scores formed a continuum. Several systems displayed very high levels of performance that were close to or equal to scores obtained with natural speech; other systems displayed substantially worse performance compared to natural speech. The overall performance of the best system, DECtalk--Paul, was equivalent to the data obtained with natural speech for consonants in syllable-initial position. The findings from this study are discussed in terms of the use of a set of standardized procedures for measuring intelligibility of synthetic speech under controlled laboratory conditions. Recent work investigating the perception of synthetic speech under more severe conditions in which greater demands are made on the listener's processing resources is also considered. The wide range of intelligibility scores obtained in the present study demonstrates important differences in perception and suggests that not all synthetic speech is perceptually equivalent to the listener.

  2. Segmental intelligibility of synthetic speech produced by rule

    Science.gov (United States)

    Logan, John S.; Greene, Beth G.; Pisoni, David B.

    2012-01-01

    This paper reports the results of an investigation that employed the modified rhyme test (MRT) to measure the segmental intelligibility of synthetic speech generated automatically by rule. Synthetic speech produced by ten text-to-speech systems was studied and compared to natural speech. A variation of the standard MRT was also used to study the effects of response set size on perceptual confusions. Results indicated that the segmental intelligibility scores formed a continuum. Several systems displayed very high levels of performance that were close to or equal to scores obtained with natural speech; other systems displayed substantially worse performance compared to natural speech. The overall performance of the best system, DECtalk—Paul, was equivalent to the data obtained with natural speech for consonants in syllable-initial position. The findings from this study are discussed in terms of the use of a set of standardized procedures for measuring intelligibility of synthetic speech under controlled laboratory conditions. Recent work investigating the perception of synthetic speech under more severe conditions in which greater demands are made on the listener’s processing resources is also considered. The wide range of intelligibility scores obtained in the present study demonstrates important differences in perception and suggests that not all synthetic speech is perceptually equivalent to the listener. PMID:2527884

  3. Modelling speech intelligibility in adverse conditions

    DEFF Research Database (Denmark)

    Jørgensen, Søren; Dau, Torsten

    2013-01-01

    Jørgensen and Dau (J Acoust Soc Am 130:1475-1487, 2011) proposed the speech-based envelope power spectrum model (sEPSM) in an attempt to overcome the limitations of the classical speech transmission index (STI) and speech intelligibility index (SII) in conditions with nonlinearly processed speech...... subjected to phase jitter, a condition in which the spectral structure of the speech signal is strongly affected, while the broadband temporal envelope is kept largely intact. In contrast, the effects of this distortion can be predicted successfully by the spectro-temporal modulation...... suggest that the SNRenv might reflect a powerful decision metric, while some explicit across-frequency analysis seems crucial in some conditions. How such across-frequency analysis is "realized" in the auditory system remains unresolved....
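    The SNRenv decision metric described above can be illustrated with a toy computation. The sketch below is a strongly simplified, single-band version of the idea, with assumed function names; the published sEPSM analyses envelopes across multiple audio-frequency and modulation-frequency bands.

```python
import numpy as np

def snr_env(mixture_env, noise_env):
    """Illustrative single-band envelope-domain SNR (SNRenv) in dB.

    A simplified sketch of the sEPSM decision metric: the AC envelope
    power of the noisy speech is compared with that of the noise alone.
    The single-band treatment and the subtraction-based speech-power
    estimate are assumptions for illustration only.
    """
    def env_power(env):
        env = np.asarray(env, dtype=float)
        dc = env.mean()
        # Envelope power normalised by the squared DC component
        return np.mean((env - dc) ** 2) / (dc ** 2)

    p_mix = env_power(mixture_env)
    p_noise = env_power(noise_env)
    # Speech envelope power approximated by subtraction, floored at a
    # small positive value to keep the ratio defined
    p_speech = max(p_mix - p_noise, 1e-8)
    return 10.0 * np.log10(p_speech / max(p_noise, 1e-8))
```

    A more strongly modulated speech envelope relative to the same noise floor yields a higher SNRenv, mirroring the model's prediction of better intelligibility.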

  4. Preschool Speech Error Patterns Predict Articulation and Phonological Awareness Outcomes in Children with Histories of Speech Sound Disorders

    Science.gov (United States)

    Preston, Jonathan L.; Hull, Margaret; Edwards, Mary Louise

    2013-01-01

    Purpose: To determine if speech error patterns in preschoolers with speech sound disorders (SSDs) predict articulation and phonological awareness (PA) outcomes almost 4 years later. Method: Twenty-five children with histories of preschool SSDs (and normal receptive language) were tested at an average age of 4;6 (years;months) and were followed up…

  5. The role of across-frequency envelope processing for speech intelligibility

    DEFF Research Database (Denmark)

    Chabot-Leclerc, Alexandre; Jørgensen, Søren; Dau, Torsten

    2013-01-01

    Speech intelligibility models consist of a preprocessing part that transforms the stimuli into some internal (auditory) representation, and a decision metric that quantifies effects of transmission channel, speech interferers, and auditory processing on the speech intelligibility. Here, two recent...... speech intelligibility models, the spectro-temporal modulation index [STMI; Elhilali et al. (2003)] and the speech-based envelope power spectrum model [sEPSM; Jørgensen and Dau (2011)] were evaluated in conditions of noisy speech subjected to reverberation, and to nonlinear distortions through either...

  7. Speech Intelligibility Advantages using an Acoustic Beamformer Display

    Science.gov (United States)

    Begault, Durand R.; Sunder, Kaushik; Godfroy, Martine; Otto, Peter

    2015-01-01

    A speech intelligibility test conforming to the Modified Rhyme Test of ANSI S3.2 "Method for Measuring the Intelligibility of Speech Over Communication Systems" was conducted using a prototype 12-channel acoustic beamformer system. The target speech material (signal) was identified against speech babble (noise), with calculated signal-noise ratios of 0, 5 and 10 dB. The signal was delivered at a fixed beam orientation of 135 deg (re 90 deg as the frontal direction of the array) and the noise at 135 deg (co-located) and 0 deg (separated). A significant improvement in intelligibility from 57% to 73% was found for spatial separation for the same signal-noise ratio (0 dB). Significant effects for improved intelligibility due to spatial separation were also found for higher signal-noise ratios (5 and 10 dB).
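    Stimuli for tests like this are typically prepared by rescaling the babble relative to the target speech before mixing. A minimal sketch, assuming RMS-based calibration (the study's exact procedure is not stated):

```python
import numpy as np

def mix_at_snr(signal, noise, snr_db):
    """Scale the noise so that the mixture has the requested SNR in dB.

    RMS-based scaling of a babble masker against the target speech is a
    common calibration; the function name and approach are illustrative.
    """
    signal = np.asarray(signal, dtype=float)
    noise = np.asarray(noise, dtype=float)
    rms_s = np.sqrt(np.mean(signal ** 2))
    rms_n = np.sqrt(np.mean(noise ** 2))
    # Noise RMS needed so that 20*log10(rms_s / rms_noise) == snr_db
    target_rms_n = rms_s / (10.0 ** (snr_db / 20.0))
    return signal + noise * (target_rms_n / rms_n)
```

    At 0 dB the scaled noise has the same RMS as the target; each 5 dB step, as in the study's conditions, halves the noise power relative to the previous step by about a factor of three in energy.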

  8. Speech Intelligibility in Noise Using Throat and Acoustic Microphones

    National Research Council Canada - National Science Library

    Acker-Mills, Barbara

    2004-01-01

    ... speech intelligibility. Speech intelligibility for signals generated by an acoustic microphone, a throat microphone, and the two microphones together was assessed using the Modified Rhyme Test (MRT...

  9. Modeling speech intelligibility in adverse conditions

    DEFF Research Database (Denmark)

    Dau, Torsten

    2012-01-01

    ) in conditions with nonlinearly processed speech. Instead of considering the reduction of the temporal modulation energy as the intelligibility metric, as assumed in the STI, the sEPSM applies the signal-to-noise ratio in the envelope domain (SNRenv). This metric was shown to be the key for predicting...... understanding speech when more than one person is talking, even when reduced audibility has been fully compensated for by a hearing aid. The reasons for these difficulties are not well understood. This presentation highlights recent concepts of the monaural and binaural signal processing strategies employed...... by the normal as well as impaired auditory system. Jørgensen and Dau [(2011). J. Acoust. Soc. Am. 130, 1475-1487] proposed the speech-based envelope power spectrum model (sEPSM) in an attempt to overcome the limitations of the classical speech transmission index (STI) and speech intelligibility index (SII...

  10. Intelligibility of speech of children with speech and sound disorders

    OpenAIRE

    Ivetac, Tina

    2014-01-01

    The purpose of this study is to examine speech intelligibility of children with primary speech and sound disorders aged 3 to 6 years in everyday life. The research problem is based on the degree to which parents or guardians, immediate family members (sister, brother, grandparents), extended family members (aunt, uncle, cousin), child's friends, other acquaintances, child's teachers and strangers understand the speech of children with speech sound disorders. We examined whether the level ...

  11. Longitudinal follow-up to evaluate speech disorders in early-treated patients with infantile-onset Pompe disease.

    Science.gov (United States)

    Zeng, Yin-Ting; Hwu, Wuh-Liang; Torng, Pao-Chuan; Lee, Ni-Chung; Shieh, Jeng-Yi; Lu, Lu; Chien, Yin-Hsiu

    2017-05-01

    Patients with infantile-onset Pompe disease (IOPD) can be treated by recombinant human acid alpha glucosidase (rhGAA) replacement beginning at birth with excellent survival rates, but they still commonly present with speech disorders. This study investigated the progress of speech disorders in these early-treated patients and ascertained the relationship with treatments. Speech disorders, including hypernasal resonance, articulation disorders, and speech intelligibility, were scored by speech-language pathologists using auditory perception in seven early-treated patients over a period of 6 years. Statistical analysis of the first and last evaluations of the patients was performed with the Wilcoxon signed-rank test. A total of 29 speech samples were analyzed. All the patients suffered from hypernasality, articulation disorder, and impairment in speech intelligibility at the age of 3 years. The conditions were stable, and 2 patients developed normal or near normal speech during follow-up. Speech therapy and a high dose of rhGAA appeared to improve articulation in 6 of the 7 patients (86%, p = 0.028) by decreasing the omission of consonants, which consequently increased speech intelligibility (p = 0.041). Severity of hypernasality greatly reduced only in 2 patients (29%, p = 0.131). Speech disorders were common even in early and successfully treated patients with IOPD; however, aggressive speech therapy and high-dose rhGAA could improve their speech disorders. Copyright © 2016 European Paediatric Neurology Society. Published by Elsevier Ltd. All rights reserved.

  12. Mode of communication and classroom placement impact on speech intelligibility.

    Science.gov (United States)

    Tobey, Emily A; Rekart, Deborah; Buckley, Kristi; Geers, Ann E

    2004-05-01

    To examine the impact of classroom placement and mode of communication on speech intelligibility scores in children aged 8 to 9 years using multichannel cochlear implants. Classroom placement (special education, partial mainstream, and full mainstream) and mode of communication (total communication and auditory-oral) reported via parental rating scales before and 4 times after implantation were the independent variables. Speech intelligibility scores obtained at 8 to 9 years of age were the dependent variables. The study included 131 congenitally deafened children between the ages of 8 and 9 years who received a multichannel cochlear implant before the age of 5 years. Higher speech intelligibility scores at 8 to 9 years of age were significantly associated with enrollment in auditory-oral programs rather than enrollment in total communication programs, regardless of when the mode of communication was used (before or after implantation). Speech intelligibility at 8 to 9 years of age was not significantly influenced by classroom placement before implantation, regardless of mode of communication. After implantation, however, there were significant associations between classroom placement and speech intelligibility scores at 8 to 9 years of age. Higher speech intelligibility scores at 8 to 9 years of age were associated with classroom exposure to normal-hearing peers in full or partial mainstream placements than in self-contained, special education placements. Higher speech intelligibility scores in 8- to 9-year-old congenitally deafened cochlear implant recipients were associated with educational settings that emphasize oral communication development. Educational environments that incorporate exposure to normal-hearing peers were also associated with higher speech intelligibility scores at 8 to 9 years of age.

  13. The pathways for intelligible speech: multivariate and univariate perspectives.

    Science.gov (United States)

    Evans, S; Kyong, J S; Rosen, S; Golestani, N; Warren, J E; McGettigan, C; Mourão-Miranda, J; Wise, R J S; Scott, S K

    2014-09-01

    An anterior pathway, concerned with extracting meaning from sound, has been identified in nonhuman primates. An analogous pathway has been suggested in humans, but controversy exists concerning the degree of lateralization and the precise location where responses to intelligible speech emerge. We have demonstrated that the left anterior superior temporal sulcus (STS) responds preferentially to intelligible speech (Scott SK, Blank CC, Rosen S, Wise RJS. 2000. Identification of a pathway for intelligible speech in the left temporal lobe. Brain. 123:2400-2406.). A functional magnetic resonance imaging study in Cerebral Cortex used equivalent stimuli and univariate and multivariate analyses to argue for the greater importance of the bilateral posterior STS when compared with the left anterior STS in responding to intelligible speech (Okada K, Rong F, Venezia J, Matchin W, Hsieh IH, Saberi K, Serences JT, Hickok G. 2010. Hierarchical organization of human auditory cortex: evidence from acoustic invariance in the response to intelligible speech. Cereb Cortex. 20:2486-2495.). Here, we also replicate our original study, demonstrating that the left anterior STS exhibits the strongest univariate response and, in decoding using the bilateral temporal cortex, contains the most informative voxels showing an increased response to intelligible speech. In contrast, in classifications using local "searchlights" and a whole brain analysis, we find greater classification accuracy in posterior rather than anterior temporal regions. Thus, we show that the precise nature of the multivariate analysis used will emphasize different response profiles associated with complex sound to speech processing. © The Author 2013. Published by Oxford University Press.

  14. Improving the speech intelligibility in classrooms

    Science.gov (United States)

    Lam, Choi Ling Coriolanus

    One of the major acoustical concerns in classrooms is the establishment of effective verbal communication between teachers and students. Non-optimal acoustical conditions, resulting in reduced verbal communication, can cause two main problems. First, they can reduce learning efficiency. Second, they can cause fatigue, stress, vocal strain and health problems, such as headaches and sore throats, among teachers who are forced to compensate for poor acoustical conditions by raising their voices. In addition, inadequate acoustical conditions can prompt the use of public address systems, and improper use of such amplifiers or loudspeakers can impair students' hearing. The social costs of poor classroom acoustics are large because they impair children's learning. This invisible problem has far-reaching implications for learning, but is easily solved. Much research has been carried out that accurately and concisely summarizes the findings on classroom acoustics, yet a number of challenging questions remain unanswered. Most objective indices for speech intelligibility are essentially based on studies of western languages; although several studies of tonal languages such as Mandarin have been conducted, there is much less work on Cantonese. In this research, measurements were made in unoccupied rooms to investigate the acoustical parameters and characteristics of the classrooms. Speech intelligibility tests based on English, Mandarin and Cantonese, and a survey, were carried out on students aged from 5 to 22 years. The aim is to investigate the differences in intelligibility between English, Mandarin and Cantonese in Hong Kong classrooms. The significance of the speech transmission index (STI) in relation to Phonetically Balanced (PB) word scores will be further developed, together with an empirical relationship between speech intelligibility in classrooms and the variations

  15. Speech Intelligibility and Childhood Verbal Apraxia in Children with Down Syndrome

    Science.gov (United States)

    Kumin, Libby

    2006-01-01

    Many children with Down syndrome have difficulty with speech intelligibility. The present study used a parent survey to learn more about a specific factor that affects speech intelligibility, i.e. childhood verbal apraxia. One of the factors that affects speech intelligibility for children with Down syndrome is difficulty with voluntarily…

  16. Speech Intelligibility Prediction Based on Mutual Information

    DEFF Research Database (Denmark)

    Jensen, Jesper; Taal, Cees H.

    2014-01-01

    This paper deals with the problem of predicting the average intelligibility of noisy and potentially processed speech signals, as observed by a group of normal-hearing listeners. We propose a model which performs this prediction based on the hypothesis that intelligibility is monotonically related...... to the mutual information between critical-band amplitude envelopes of the clean signal and the corresponding noisy/processed signal. The resulting intelligibility predictor turns out to be a simple function of the mean-square error (mse) that arises when estimating a clean critical-band amplitude using...... a minimum mean-square error (mmse) estimator based on the noisy/processed amplitude. The proposed model predicts that speech intelligibility cannot be improved by any processing of noisy critical-band amplitudes. Furthermore, the proposed intelligibility predictor performs well (ρ > 0.95) in predicting...
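    The relationship sketched in the abstract, an intelligibility index that is a simple function of the per-band estimation error, can be illustrated under a jointly Gaussian assumption: the normalised mmse in one band equals 1 - rho**2, where rho is the correlation between clean and noisy amplitude envelopes, and the corresponding mutual information is -0.5*log(1 - rho**2). The code below is a toy version with assumed names and shapes, not the authors' model:

```python
import numpy as np

def mi_intelligibility_index(clean_bands, noisy_bands):
    """Toy mse/mutual-information intelligibility predictor.

    Inputs are arrays of shape (bands, frames) holding critical-band
    amplitude envelopes of the clean and the noisy/processed signal.
    The Gaussian assumption and the final per-band averaging are
    simplifications for illustration only.
    """
    clean = np.atleast_2d(np.asarray(clean_bands, dtype=float))
    noisy = np.atleast_2d(np.asarray(noisy_bands, dtype=float))
    info = 0.0
    for c, n in zip(clean, noisy):
        rho = np.corrcoef(c, n)[0, 1]
        rho2 = min(rho ** 2, 1.0 - 1e-12)  # guard against log(0)
        # Gaussian-channel mutual information for this band
        info += -0.5 * np.log(1.0 - rho2)
    return info / clean.shape[0]           # average information per band
```

    Heavier distortion lowers the per-band correlation, which lowers the index, consistent with the monotonic relationship the model hypothesises.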

  17. Speech Intelligibility and Hearing Protector Selection

    Science.gov (United States)

    2016-08-29

    not only affect the listener of speech communication in a noisy environment, HPDs can also affect the speaker. Tufts and Frank (2003) found that...of hearing protection on speech intelligibility in noise. Sound and Vibration. 20(10): 12-14. Berger, E. H. 1980. EARLog #4 – The

  18. Using the Speech Transmission Index for predicting non-native speech intelligibility

    NARCIS (Netherlands)

    Wijngaarden, S.J. van; Bronkhorst, A.W.; Houtgast, T.; Steeneken, H.J.M.

    2004-01-01

    While the Speech Transmission Index (STI) is widely applied for prediction of speech intelligibility in room acoustics and telecommunication engineering, it is unclear how to interpret STI values when non-native talkers or listeners are involved. Based on subjectively measured psychometric functions

  19. Automated Intelligibility Assessment of Pathological Speech Using Phonological Features

    Directory of Open Access Journals (Sweden)

    Catherine Middag

    2009-01-01

    It is commonly acknowledged that word or phoneme intelligibility is an important criterion in the assessment of the communication efficiency of a pathological speaker. People have therefore put a lot of effort in the design of perceptual intelligibility rating tests. These tests usually have the drawback that they employ unnatural speech material (e.g., nonsense words) and that they cannot fully exclude errors due to listener bias. Therefore, there is a growing interest in the application of objective automatic speech recognition technology to automate the intelligibility assessment. Current research is headed towards the design of automated methods which can be shown to produce ratings that correspond well with those emerging from a well-designed and well-performed perceptual test. In this paper, a novel methodology that is built on previous work (Middag et al., 2008) is presented. It utilizes phonological features, automatic speech alignment based on acoustic models that were trained on normal speech, context-dependent speaker feature extraction, and intelligibility prediction based on a small model that can be trained on pathological speech samples. The experimental evaluation of the new system reveals that the root mean squared error of the discrepancies between perceived and computed intelligibilities can be as low as 8 on a scale of 0 to 100.

  20. Predicting Speech Intelligibility Decline in Amyotrophic Lateral Sclerosis Based on the Deterioration of Individual Speech Subsystems

    Science.gov (United States)

    Yunusova, Yana; Wang, Jun; Zinman, Lorne; Pattee, Gary L.; Berry, James D.; Perry, Bridget; Green, Jordan R.

    2016-01-01

    Purpose: To determine the mechanisms of speech intelligibility impairment due to neurologic impairments, intelligibility decline was modeled as a function of co-occurring changes in the articulatory, resonatory, phonatory, and respiratory subsystems. Method: Sixty-six individuals diagnosed with amyotrophic lateral sclerosis (ALS) were studied longitudinally. The disease-related changes in articulatory, resonatory, phonatory, and respiratory subsystems were quantified using multiple instrumental measures, which were subjected to a principal component analysis and mixed effects models to derive a set of speech subsystem predictors. A stepwise approach was used to select the best set of subsystem predictors to model the overall decline in intelligibility. Results: Intelligibility was modeled as a function of five predictors that corresponded to velocities of lip and jaw movements (articulatory), number of syllable repetitions in the alternating motion rate task (articulatory), nasal airflow (resonatory), maximum fundamental frequency (phonatory), and speech pauses (respiratory). The model accounted for 95.6% of the variance in intelligibility, among which the articulatory predictors showed the most substantial independent contribution (57.7%). Conclusion: Articulatory impairments characterized by reduced velocities of lip and jaw movements and resonatory impairments characterized by increased nasal airflow served as the subsystem predictors of the longitudinal decline of speech intelligibility in ALS. Declines in maximum performance tasks such as the alternating motion rate preceded declines in intelligibility, thus serving as early predictors of bulbar dysfunction. Following the rapid decline in speech intelligibility, a precipitous decline in maximum performance tasks subsequently occurred. PMID:27148967

  1. The Effect of Background Noise on Intelligibility of Dysphonic Speech

    Science.gov (United States)

    Ishikawa, Keiko; Boyce, Suzanne; Kelchner, Lisa; Powell, Maria Golla; Schieve, Heidi; de Alarcon, Alessandro; Khosla, Sid

    2017-01-01

    Purpose: The aim of this study is to determine the effect of background noise on the intelligibility of dysphonic speech and to examine the relationship between intelligibility in noise and an acoustic measure of dysphonia--cepstral peak prominence (CPP). Method: A study of speech perception was conducted using speech samples from 6 adult speakers…

  2. Real-Time Control of an Articulatory-Based Speech Synthesizer for Brain Computer Interfaces.

    Directory of Open Access Journals (Sweden)

    Florent Bocquelet

    2016-11-01

    Restoring natural speech in paralyzed and aphasic people could be achieved using a Brain-Computer Interface (BCI) controlling a speech synthesizer in real-time. To reach this goal, a prerequisite is to develop a speech synthesizer producing intelligible speech in real-time with a reasonable number of control parameters. We present here an articulatory-based speech synthesizer that can be controlled in real-time for future BCI applications. This synthesizer converts movements of the main speech articulators (tongue, jaw, velum, and lips) into intelligible speech. The articulatory-to-acoustic mapping is performed using a deep neural network (DNN) trained on electromagnetic articulography (EMA) data recorded on a reference speaker synchronously with the produced speech signal. This DNN is then used in both offline and online modes to map the position of sensors glued on different speech articulators into acoustic parameters that are further converted into an audio signal using a vocoder. In offline mode, highly intelligible speech could be obtained as assessed by perceptual evaluation performed by 12 listeners. Then, to anticipate future BCI applications, we further assessed the real-time control of the synthesizer by both the reference speaker and new speakers, in a closed-loop paradigm using EMA data recorded in real time. A short calibration period was used to compensate for differences in sensor positions and articulatory differences between new speakers and the reference speaker. We found that real-time synthesis of vowels and consonants was possible with good intelligibility. In conclusion, these results open the way to future speech BCI applications using such an articulatory-based speech synthesizer.

  3. Predicting Speech Intelligibility with a Multiple Speech Subsystems Approach in Children with Cerebral Palsy

    Science.gov (United States)

    Lee, Jimin; Hustad, Katherine C.; Weismer, Gary

    2014-01-01

    Purpose: Speech acoustic characteristics of children with cerebral palsy (CP) were examined with a multiple speech subsystems approach; speech intelligibility was evaluated using a prediction model in which acoustic measures were selected to represent three speech subsystems. Method: Nine acoustic variables reflecting different subsystems, and…

  4. Effectiveness of Speech Therapy in Adults with Intellectual Disabilities

    Science.gov (United States)

    Terband, Hayo; Coppens-Hofman, Marjolein C.; Reffeltrath, Maaike; Maassen, Ben A. M.

    2018-01-01

    Background: This study investigated the effect of speech therapy in a heterogeneous group of adults with intellectual disability. Method: Thirty-six adults with mild and moderate intellectual disabilities (IQs 40-70; age 18-40 years) with reported poor speech intelligibility received tailored training in articulation and listening skills delivered…

  5. Experimental comparison between speech transmission index, rapid speech transmission index, and speech intelligibility index.

    Science.gov (United States)

    Larm, Petra; Hongisto, Valtteri

    2006-02-01

    During the acoustical design of, e.g., auditoria or open-plan offices, it is important to know how speech can be perceived in various parts of the room. Different objective methods have been developed to measure and predict speech intelligibility, and these have been extensively used in various spaces. In this study, two such methods were compared, the speech transmission index (STI) and the speech intelligibility index (SII). Also the simplification of the STI, the room acoustics speech transmission index (RASTI), was considered. These quantities are all based on determining an apparent speech-to-noise ratio on selected frequency bands and summing them using a specific weighting. For comparison, some data were needed on the possible differences of these methods resulting from the calculation scheme and also measuring equipment. Their prediction accuracy was also of interest. Measurements were made in a laboratory having adjustable noise level and absorption, and in a real auditorium. It was found that the measurement equipment, especially the selection of the loudspeaker, can greatly affect the accuracy of the results. The prediction accuracy of the RASTI was found acceptable, if the input values for the prediction are accurately known, even though the studied space was not ideally diffuse.
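    All three indices share the scheme this abstract describes: an apparent per-band speech-to-noise ratio is mapped to an audibility value and the bands are combined with importance weights. A minimal SII-style sketch (not the ANSI S3.5 procedure itself; the 30-dB dynamic-range mapping is the standard simplification, and the weights here are placeholders):

```python
import numpy as np

def sii_like_index(snr_db_per_band, band_importance):
    """Minimal SII-style computation over per-band SNRs.

    Each band's SNR in dB is mapped linearly to an audibility value in
    [0, 1] over an assumed 30-dB range (-15 to +15 dB), then combined
    with band-importance weights that should sum to one.
    """
    snr = np.asarray(snr_db_per_band, dtype=float)
    w = np.asarray(band_importance, dtype=float)
    audibility = np.clip((snr + 15.0) / 30.0, 0.0, 1.0)
    return float(np.sum(w * audibility))
```

    With all bands at +15 dB or better the index saturates at 1; at -15 dB or worse it reaches 0, matching the bounded behaviour of the STI and SII.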

  6. Assessment of speech intelligibility in background noise and reverberation

    DEFF Research Database (Denmark)

    Nielsen, Jens Bo

    Reliable methods for assessing speech intelligibility are essential within hearing research, audiology, and related areas. Such methods can be used for obtaining a better understanding of how speech intelligibility is affected by, e.g., various environmental factors or different types of hearing...... impairment. In this thesis, two sentence-based tests for speech intelligibility in Danish were developed. The first test is the Conversational Language Understanding Evaluation (CLUE), which is based on the principles of the original American-English Hearing in Noise Test (HINT). The second test...... is a modified version of CLUE where the speech material and the scoring rules have been reconsidered. An extensive validation of the modified test was conducted with both normal-hearing and hearing-impaired listeners. The validation showed that the test produces reliable results for both groups of listeners...

  7. SII-Based Speech Preprocessing for Intelligibility Improvement in Noise

    DEFF Research Database (Denmark)

    Taal, Cees H.; Jensen, Jesper

    2013-01-01

    filter sets certain frequency bands to zero when they do not contribute to intelligibility anymore. Experiments show large intelligibility improvements with the proposed method when used in stationary speech-shaped noise. However, it was also found that the method does not perform well for speech...... corrupted by a competing speaker. This is due to the fact that the SII is not a reliable intelligibility predictor for fluctuating noise sources. MATLAB code is provided....

  8. Predicting automatic speech recognition performance over communication channels from instrumental speech quality and intelligibility scores

    NARCIS (Netherlands)

    Gallardo, L.F.; Möller, S.; Beerends, J.

    2017-01-01

    The performance of automatic speech recognition based on coded-decoded speech heavily depends on the quality of the transmitted signals, determined by channel impairments. This paper examines relationships between speech recognition performance and measurements of speech quality and intelligibility

  9. Speech characteristics in a Ugandan child with a rare paramedian craniofacial cleft: a case report.

    Science.gov (United States)

    Van Lierde, K M; Bettens, K; Luyten, A; De Ley, S; Tungotyo, M; Balumukad, D; Galiwango, G; Bauters, W; Vermeersch, H; Hodges, A

    2013-03-01

    The purpose of this study is to describe the speech characteristics in an English-speaking Ugandan boy of 4.5 years who has a rare paramedian craniofacial cleft (unilateral lip, alveolar, palatal, nasal and maxillary cleft, and associated hypertelorism). Closure of the lip together with the closure of the hard and soft palate (one-stage palatal closure) was performed at the age of 5 months. Objective as well as subjective speech assessment techniques were used. The speech samples were perceptually judged for articulation, intelligibility and nasality. The Nasometer was used for the objective measurement of the nasalance values. The most striking communication problems in this child with the rare craniofacial cleft are an incomplete phonetic inventory, a severely impaired speech intelligibility with the presence of very severe hypernasality, mild nasal emission, phonetic disorders (omission of several consonants, decreased intraoral pressure in plosives, insufficient frication of fricatives and the use of a middorsum palatal stop) and phonological disorders (deletion of initial and final consonants and consonant clusters). The increased objective nasalance values are in agreement with the presence of the audible nasality disorders. The results revealed that several phonetic and phonological articulation disorders together with a decreased speech intelligibility and resonance disorders are present in the child with a rare craniofacial cleft. To what extent a secondary surgery for velopharyngeal insufficiency, combined with speech therapy, will improve speech intelligibility, articulation and resonance characteristics is a subject for further research. The results of such analyses may ultimately serve as a starting point for specific surgical and logopedic treatment that addresses the specific needs of children with rare facial clefts. Copyright © 2012 Elsevier Ireland Ltd. All rights reserved.
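    The nasalance value reported by Nasometer-style devices is the nasal acoustic energy divided by the total (nasal plus oral) energy, expressed as a percentage. A simplified sketch from two separately recorded channels (the actual instrument band-filters both channels before the ratio is taken; names are illustrative):

```python
import numpy as np

def nasalance_percent(nasal_signal, oral_signal):
    """Nasalance percentage from nasal and oral channel recordings.

    Uses RMS energy of each channel; this omits the band-pass filtering
    applied by real instruments and is meant only as a sketch.
    """
    nasal = np.sqrt(np.mean(np.square(np.asarray(nasal_signal, float))))
    oral = np.sqrt(np.mean(np.square(np.asarray(oral_signal, float))))
    return 100.0 * nasal / (nasal + oral)
```

    Equal energy in both channels gives 50%; purely oral output gives 0%, and higher values are consistent with the hypernasality described above.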

  10. Improvement of intelligibility of ideal binary-masked noisy speech by adding background noise.

    Science.gov (United States)

    Cao, Shuyang; Li, Liang; Wu, Xihong

    2011-04-01

    When a target-speech/masker mixture is processed with the signal-separation technique, ideal binary mask (IBM), intelligibility of target speech is remarkably improved in both normal-hearing listeners and hearing-impaired listeners. Intelligibility of speech can also be improved by filling in speech gaps with un-modulated broadband noise. This study investigated whether intelligibility of target speech in the IBM-treated target-speech/masker mixture can be further improved by adding a broadband-noise background. The results of this study show that following the IBM manipulation, which remarkably released target speech from speech-spectrum noise, foreign-speech, or native-speech masking (experiment 1), adding a broadband-noise background with the signal-to-noise ratio no less than 4 dB significantly improved intelligibility of target speech when the masker was either noise (experiment 2) or speech (experiment 3). The results suggest that since adding the noise background shallows the areas of silence in the time-frequency domain of the IBM-treated target-speech/masker mixture, the abruption of transient changes in the mixture is smoothed and the perceived continuity of target-speech components becomes enhanced, leading to improved target-speech intelligibility. The findings are useful for advancing computational auditory scene analysis, hearing-aid/cochlear-implant designs, and understanding of speech perception under "cocktail-party" conditions.
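    The IBM manipulation used in these experiments keeps the time-frequency units where the local speech-to-noise ratio exceeds a criterion and discards the rest. A minimal sketch over magnitude spectrograms, with illustrative names and a 0-dB local criterion assumed as the default:

```python
import numpy as np

def ideal_binary_mask(speech_tf, noise_tf, lc_db=0.0):
    """Ideal binary mask over time-frequency magnitude arrays.

    Units where the local speech-to-noise ratio exceeds the local
    criterion (lc_db) are kept (1); all others are discarded (0).
    Inputs are magnitude spectrograms of the separately known speech
    and noise signals.
    """
    s = np.asarray(speech_tf, dtype=float)
    n = np.asarray(noise_tf, dtype=float)
    # Small constants avoid log(0) and division by zero
    local_snr_db = 20.0 * np.log10((s + 1e-12) / (n + 1e-12))
    return (local_snr_db > lc_db).astype(float)
```

    The mask is applied by element-wise multiplication with the mixture spectrogram before resynthesis; the study's added broadband-noise background then fills the zeroed regions.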

  11. Speech intelligibility of laryngectomized patients who use different types of vocal communication

    OpenAIRE

    Šehović Ivana; Petrović-Lazić Mirjana

    2016-01-01

    Modern methods of speech rehabilitation after a total laryngectomy have achieved considerable success, giving patients the possibility of establishing intelligible and functional speech after an adequate rehabilitation treatment. The aim of this paper was to examine the speech intelligibility of laryngectomized patients who use different types of vocal communication: esophageal speech, speech with a tracheoesophageal prosthesis and speech with an electronic laryngeal prosthesis. The research was conduct...

  12. Sensitivity of the Speech Intelligibility Index to the Assumed Dynamic Range

    Science.gov (United States)

    Jin, In-Ki; Kates, James M.; Arehart, Kathryn H.

    2017-01-01

    Purpose: This study aims to evaluate the sensitivity of the speech intelligibility index (SII) to the assumed speech dynamic range (DR) in different languages and with different types of stimuli. Method: Intelligibility prediction uses the absolute transfer function (ATF) to map the SII value to the predicted intelligibility for given stimuli.…
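
    To illustrate how an assumed dynamic range enters an SII-style calculation, here is a deliberately simplified sketch; the full ANSI S3.5 procedure has many additional terms, and all numbers below are illustrative:

```python
def simplified_sii(speech_spl, noise_spl, band_importance, dyn_range=30.0):
    """Greatly simplified SII-style index: per-band audibility is the
    speech-to-noise ratio shifted up by dyn_range/2, clipped to
    [0, dyn_range], normalized, and weighted by band importance.
    Shows only how the assumed dynamic range (DR) enters the result."""
    if abs(sum(band_importance) - 1.0) > 1e-6:
        raise ValueError("band importance weights must sum to 1")
    sii = 0.0
    for s, n, w in zip(speech_spl, noise_spl, band_importance):
        snr = s - n
        audibility = min(max(snr + dyn_range / 2.0, 0.0), dyn_range) / dyn_range
        sii += w * audibility
    return sii

# Identical speech/noise levels, two different assumed dynamic ranges:
speech = [60.0, 55.0, 50.0]
noise = [50.0, 50.0, 50.0]
weights = [0.4, 0.4, 0.2]
print(round(simplified_sii(speech, noise, weights, dyn_range=30.0), 3))  # 0.7
print(round(simplified_sii(speech, noise, weights, dyn_range=40.0), 3))  # 0.65
```

Changing only the assumed DR changes the predicted index for the same acoustic conditions, which is exactly the sensitivity the study investigates.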

  13. Variability and Intelligibility of Clarified Speech to Different Listener Groups

    Science.gov (United States)

    Silber, Ronnie F.

    Two studies examined the modifications that adult speakers make in speech to disadvantaged listeners. Previous research focusing on speech to deaf individuals and to young children has shown that adults clarify speech when addressing these two populations. Acoustic measurements suggest that the signal undergoes similar changes for both populations. Perceptual tests corroborate these results for the deaf population, but are nonsystematic in developmental studies. The differences in the findings for these populations and the nonsystematic results in the developmental literature may be due to methodological factors. The present experiments addressed these methodological questions. Studies of speech to hearing-impaired listeners have used read nonsense sentences, for which speakers received explicit clarification instructions and feedback, while in the child literature, excerpts of real-time conversations were used. Therefore, linguistic samples were not precisely matched. In this study, experiments used various linguistic materials. Experiment 1 used a children's story; experiment 2, nonsense sentences. Four mothers read both types of material in four ways: (1) in "normal" adult speech, (2) in "babytalk," (3) under the clarification instructions used in the hearing-impaired studies (instructed clear speech) and (4) in (spontaneous) clear speech without instruction. No extra practice or feedback was given. Sentences were presented to 40 normal-hearing college students with and without simultaneous masking noise. Results were separately tabulated for content and function words, and analyzed using standard statistical tests. The major finding in the study was individual variation in speaker intelligibility. "Real world" speakers vary in their baseline intelligibility. The four speakers also showed unique patterns of intelligibility as a function of each independent variable. Results were as follows. Nonsense sentences were less intelligible than story…

  14. Predicting Intelligibility Gains in Dysarthria through Automated Speech Feature Analysis

    Science.gov (United States)

    Fletcher, Annalise R.; Wisler, Alan A.; McAuliffe, Megan J.; Lansford, Kaitlin L.; Liss, Julie M.

    2017-01-01

    Purpose: Behavioral speech modifications have variable effects on the intelligibility of speakers with dysarthria. In the companion article, a significant relationship was found between measures of speakers' baseline speech and their intelligibility gains following cues to speak louder and reduce rate (Fletcher, McAuliffe, Lansford, Sinex, &…

  15. Speech Intelligibility in Severe Adductor Spasmodic Dysphonia

    Science.gov (United States)

    Bender, Brenda K.; Cannito, Michael P.; Murry, Thomas; Woodson, Gayle E.

    2004-01-01

    This study compared speech intelligibility in nondisabled speakers and speakers with adductor spasmodic dysphonia (ADSD) before and after botulinum toxin (Botox) injection. Standard speech samples were obtained from 10 speakers diagnosed with severe ADSD prior to and 1 month following Botox injection, as well as from 10 age- and gender-matched…

  16. The influence of spectral and spatial characteristics of early reflections on speech intelligibility

    DEFF Research Database (Denmark)

    Arweiler, Iris; Buchholz, Jörg; Dau, Torsten

    The auditory system employs different strategies to facilitate speech intelligibility in complex listening conditions. One of them is the integration of early reflections (ER’s) with the direct sound (DS) to increase the effective speech level. So far the underlying mechanisms of ER processing have...... of listeners that speech intelligibility improved with added ER energy, but less than with added DS energy. An efficiency factor was introduced to quantify this effect. The difference in speech intelligibility could be mainly ascribed to the differences in the spectrum between the speech signals....... binaural). The direction-dependency could be explained by the spectral changes introduced by the pinna, head, and torso. The results will be important with regard to the influence of signal processing strategies in modern hearing aids on speech intelligibility, because they might alter the spectral...

  17. Investigation of in-vehicle speech intelligibility metrics for normal hearing and hearing impaired listeners

    Science.gov (United States)

    Samardzic, Nikolina

    The effectiveness of in-vehicle speech communication can be a good indicator of the perception of overall vehicle quality and customer satisfaction. Currently available speech intelligibility metrics do not account in their procedures for essential parameters needed for a complete and accurate evaluation of in-vehicle speech intelligibility. These include the directivity and the distance of the talker with respect to the listener, binaural listening, the hearing profile of the listener, vocal effort, and multisensory hearing. In the first part of this research, the effectiveness of in-vehicle application of these metrics is investigated in a series of studies to reveal their shortcomings, including a wide range of scores resulting from each of the metrics for a given measurement configuration and vehicle operating condition. In addition, the nature of a possible correlation between the scores obtained from each metric is unknown. The metrics and the subjective perception of speech intelligibility using, for example, the same speech material have not been compared in the literature. As a result, in the second part of this research, an alternative method for speech intelligibility evaluation is proposed for use in the automotive industry by utilizing a virtual reality driving environment for ultimately setting targets, including the associated statistical variability, for future in-vehicle speech intelligibility evaluation. The Speech Intelligibility Index (SII) was evaluated at the sentence Speech Reception Threshold (sSRT) for various listening situations and hearing profiles using acoustic perception jury testing and a variety of talker and listener configurations and background noise. In addition, the effect of individual sources and transfer paths of sound in an operating vehicle on the vehicle interior sound, specifically their effect on speech intelligibility, was quantified in the framework of the newly developed speech intelligibility evaluation method. Lastly…

  18. Microscopic prediction of speech intelligibility in spatially distributed speech-shaped noise for normal-hearing listeners.

    Science.gov (United States)

    Geravanchizadeh, Masoud; Fallah, Ali

    2015-12-01

    A binaural and psychoacoustically motivated intelligibility model, based on a well-known monaural microscopic model, is proposed. This model simulates a phoneme recognition task in the presence of spatially distributed speech-shaped noise in anechoic scenarios. In the proposed model, binaural advantage effects are considered by generating a feature vector for a dynamic-time-warping speech recognizer. This vector consists of three subvectors: two monaural subvectors to model better-ear hearing, and a binaural subvector to simulate the binaural unmasking effect. The binaural unit of the model is based on equalization-cancellation theory. The model operates blindly, meaning that separate recordings of speech and noise are not required for the predictions. Speech intelligibility tests were conducted with 12 normal-hearing listeners by collecting speech reception thresholds (SRTs) in the presence of single and multiple sources of speech-shaped noise. The comparison of the model predictions with the measured binaural SRTs, and with the predictions of a macroscopic binaural model called extended equalization-cancellation, shows that this approach predicts intelligibility in anechoic scenarios with good precision. The square of the correlation coefficient (r²) and the mean absolute error between the model predictions and the measurements are 0.98 and 0.62 dB, respectively.
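
    The dynamic-time-warping recognizer at the core of such microscopic models compares feature sequences by elastic alignment. A minimal sketch of the DTW distance (1-D features for brevity, not the model's actual feature vectors):

```python
import math

def dtw_distance(a, b):
    """Minimal dynamic time warping distance between two 1-D feature
    sequences, as used by template-matching ("microscopic") recognizers:
    each test utterance is compared against every phoneme template and
    the closest template wins."""
    n, m = len(a), len(b)
    inf = math.inf
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = abs(a[i - 1] - b[j - 1])
            cost[i][j] = d + min(cost[i - 1][j],      # insertion
                                 cost[i][j - 1],      # deletion
                                 cost[i - 1][j - 1])  # match
    return cost[n][m]

# A time-stretched sequence still matches its template exactly:
template = [0.0, 1.0, 2.0, 1.0, 0.0]
test = [0.0, 1.0, 1.0, 2.0, 2.0, 1.0, 0.0]
print(dtw_distance(template, test))  # 0.0
```

Because DTW absorbs timing differences, recognition errors in the simulation reflect spectral degradation by the noise rather than tempo, which is what makes the task a proxy for intelligibility.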

  19. Speech Intelligibility and Personality Peer-Ratings of Young Adults with Cochlear Implants

    Science.gov (United States)

    Freeman, Valerie

    2018-01-01

    Speech intelligibility, or how well a speaker's words are understood by others, affects listeners' judgments of the speaker's competence and personality. Deaf cochlear implant (CI) users vary widely in speech intelligibility, and their speech may have a noticeable "deaf" quality, both of which could evoke negative stereotypes or…

  20. Associations between speech features and phenotypic severity in Treacher Collins syndrome.

    Science.gov (United States)

    Asten, Pamela; Akre, Harriet; Persson, Christina

    2014-04-28

    Treacher Collins syndrome (TCS, OMIM 154500) is a rare congenital disorder of craniofacial development. Characteristic hypoplastic malformations of the ears, zygomatic arch, mandible and pharynx have been described in detail. However, reports on the impact of these malformations on speech are few. Exploring speech features and investigating if speech function is related to phenotypic severity are essential for optimizing follow-up and treatment. Articulation, nasal resonance, voice and intelligibility were examined in 19 individuals (5-74 years, median 34 years) divided into three groups comprising children 5-10 years (n = 4), adolescents 11-18 years (n = 4) and adults 29 years and older (n = 11). A speech composite score (0-6) was calculated to reflect the variability of speech deviations. TCS severity scores of phenotypic expression and total scores of Nordic Orofacial Test-Screening (NOT-S) measuring orofacial dysfunction were used in analyses of correlation with speech characteristics (speech composite scores). Children and adolescents presented with significantly higher speech composite scores (median 4, range 1-6) than adults (median 1, range 0-5). Nearly all children and adolescents (6/8) displayed speech deviations of articulation, nasal resonance and voice, while only three adults were identified with multiple speech aberrations. The variability of speech dysfunction in TCS was exhibited by individual combinations of speech deviations in 13/19 participants. The speech composite scores correlated with TCS severity scores and NOT-S total scores. Speech composite scores higher than 4 were associated with cleft palate. The percent of intelligible words in connected speech was significantly lower in children and adolescents (median 77%, range 31-99) than in adults (98%, range 93-100). Intelligibility of speech among the children was markedly inconsistent and clearly affecting the understandability. Multiple speech deviations were identified in…

  1. Speech intelligibility enhancement after maxillary denture treatment and its impact on quality of life.

    Science.gov (United States)

    Knipfer, Christian; Riemann, Max; Bocklet, Tobias; Noeth, Elmar; Schuster, Maria; Sokol, Biljana; Eitner, Stephan; Nkenke, Emeka; Stelzle, Florian

    2014-01-01

    Tooth loss and its prosthetic rehabilitation significantly affect speech intelligibility. However, little is known about the influence of speech deficiencies on oral health-related quality of life (OHRQoL). The aim of this study was to investigate whether speech intelligibility enhancement through prosthetic rehabilitation significantly influences OHRQoL in patients wearing complete maxillary dentures. Speech intelligibility by means of an automatic speech recognition system (ASR) was prospectively evaluated and compared with subjectively assessed Oral Health Impact Profile (OHIP) scores. Speech was recorded in 28 edentulous patients 1 week prior to the fabrication of new complete maxillary dentures and 6 months thereafter. Speech intelligibility was computed based on the word accuracy (WA) by means of an ASR and compared with a matched control group. One week before and 6 months after rehabilitation, patients assessed themselves for OHRQoL. Speech intelligibility improved significantly after 6 months. Subjects reported a significantly higher OHRQoL after maxillary rehabilitation with complete dentures. No significant correlation was found between the OHIP sum score or its subscales and the WA. Speech intelligibility enhancement achieved through the fabrication of new complete maxillary dentures might not be in the forefront of the patients' perception of their quality of life. For the improvement of OHRQoL in patients wearing complete maxillary dentures, food intake and mastication as well as freedom from pain play a more prominent role.
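
    The word accuracy (WA) measure produced by the ASR can be sketched as follows; this is the standard definition, not the specific recognizer used in the study:

```python
def word_accuracy(reference, hypothesis):
    """Word accuracy (WA) as used in ASR-based intelligibility scoring:
    WA = (N - E) / N, where N is the number of reference words and E is
    the minimum number of substitutions, deletions, and insertions
    needed to turn the reference into the recognizer's hypothesis."""
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("empty reference")
    n, m = len(ref), len(hyp)
    # Levenshtein distance over words counts S + D + I on the best path.
    d = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(n + 1):
        d[i][0] = i
    for j in range(m + 1):
        d[0][j] = j
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            sub = 0 if ref[i - 1] == hyp[j - 1] else 1
            d[i][j] = min(d[i - 1][j] + 1,        # deletion
                          d[i][j - 1] + 1,        # insertion
                          d[i - 1][j - 1] + sub)  # substitution/match
    return (n - d[n][m]) / n

print(word_accuracy("the quick brown fox", "the quick brown socks"))  # 0.75
```

Higher WA means the recognizer (standing in for a listener) recovered more of the intended words, which is why WA serves as an objective intelligibility proxy here.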

  2. Investigating the effects of noise-estimation errors in simulated cochlear implant speech intelligibility

    DEFF Research Database (Denmark)

    Kressner, Abigail Anne; May, Tobias; Malik Thaarup Høegh, Rasmus

    2017-01-01

    A recent study suggested that the most important factor for obtaining high speech intelligibility in noise with cochlear implant recipients is to preserve the low-frequency amplitude modulations of speech across time and frequency by, for example, minimizing the amount of noise in speech gaps....... In contrast, other studies have argued that the transients provide the most information. Thus, the present study investigates the relative impact of these two factors in the framework of noise reduction by systematically correcting noise-estimation errors within speech segments, speech gaps......, and the transitions between them. Speech intelligibility in noise was measured using a cochlear implant simulation tested on normal-hearing listeners. The results suggest that minimizing noise in the speech gaps can substantially improve intelligibility, especially in modulated noise. However, significantly larger...

  3. Simultaneous natural speech and AAC interventions for children with childhood apraxia of speech: lessons from a speech-language pathologist focus group.

    Science.gov (United States)

    Oommen, Elizabeth R; McCarthy, John W

    2015-03-01

    In childhood apraxia of speech (CAS), children exhibit varying levels of speech intelligibility depending on the nature of errors in articulation and prosody. Augmentative and alternative communication (AAC) strategies are beneficial, and commonly adopted with children with CAS. This study focused on the decision-making process and strategies adopted by speech-language pathologists (SLPs) when simultaneously implementing interventions that focused on natural speech and AAC. Eight SLPs, with significant clinical experience in CAS and AAC interventions, participated in an online focus group. Thematic analysis revealed eight themes: key decision-making factors; treatment history and rationale; benefits; challenges; therapy strategies and activities; collaboration with team members; recommendations; and other comments. Results are discussed along with clinical implications and directions for future research.

  4. Effects of Audio-Visual Information on the Intelligibility of Alaryngeal Speech

    Science.gov (United States)

    Evitts, Paul M.; Portugal, Lindsay; Van Dine, Ami; Holler, Aline

    2010-01-01

    Background: There is minimal research on the contribution of visual information to speech intelligibility for individuals with a laryngectomy (IWL). Aims: The purpose of this project was to determine the effects of mode of presentation (audio-only, audio-visual) on alaryngeal speech intelligibility. Method: Twenty-three naive listeners were…

  5. Transient and sustained cortical activity elicited by connected speech of varying intelligibility

    Directory of Open Access Journals (Sweden)

    Tiitinen Hannu

    2012-12-01

    Background: The robustness of speech perception in the face of acoustic variation is founded on the ability of the auditory system to integrate the acoustic features of speech and to segregate them from background noise. This auditory scene analysis process is facilitated by top-down mechanisms, such as recognition memory for speech content. However, the cortical processes underlying these facilitatory mechanisms remain unclear. The present magnetoencephalography (MEG) study examined how the activity of auditory cortical areas is modulated by acoustic degradation and intelligibility of connected speech. The experimental design allowed for the comparison of cortical activity patterns elicited by acoustically identical stimuli which were perceived as either intelligible or unintelligible. Results: In the experiment, a set of sentences was presented to the subject in distorted, undistorted, and again in distorted form. The intervening exposure to undistorted versions of the sentences rendered the initially unintelligible, distorted sentences intelligible, as evidenced by an increase from 30% to 80% in the proportion of sentences reported as intelligible. These perceptual changes were reflected in the activity of the auditory cortex, with the auditory N1m response (~100 ms) being more prominent for the distorted stimuli than for the intact ones. In the time range of the auditory P2m response (>200 ms), the auditory cortex as well as regions anterior and posterior to this area generated a stronger response to sentences which were intelligible than unintelligible. During the sustained field (>300 ms), stronger activity was elicited by degraded stimuli in the auditory cortex and by intelligible sentences in areas posterior to the auditory cortex. Conclusions: The current findings suggest that the auditory system comprises bottom-up and top-down processes which are reflected in transient and sustained brain activity. It appears that analysis of acoustic features occurs…

  6. The relationship of speech intelligibility with hearing sensitivity, cognition, and perceived hearing difficulties varies for different speech perception tests

    Science.gov (United States)

    Heinrich, Antje; Henshaw, Helen; Ferguson, Melanie A.

    2015-01-01

    Listeners vary in their ability to understand speech in noisy environments. Hearing sensitivity, as measured by pure-tone audiometry, can only partly explain these results, and cognition has emerged as another key concept. Although cognition relates to speech perception, the exact nature of the relationship remains to be fully understood. This study investigates how different aspects of cognition, particularly working memory and attention, relate to speech intelligibility for various tests. Perceptual accuracy of speech perception represents just one aspect of functioning in a listening environment. Activity and participation limits imposed by hearing loss, in addition to the demands of a listening environment, are also important and may be better captured by self-report questionnaires. Understanding how speech perception relates to self-reported aspects of listening forms the second focus of the study. Forty-four listeners aged between 50 and 74 years with mild sensorineural hearing loss were tested on speech perception tests differing in complexity from low (phoneme discrimination in quiet), to medium (digit triplet perception in speech-shaped noise) to high (sentence perception in modulated noise); cognitive tests of attention, memory, and non-verbal intelligence quotient; and self-report questionnaires of general health-related and hearing-specific quality of life. Hearing sensitivity and cognition related to intelligibility differently depending on the speech test: neither was important for phoneme discrimination, hearing sensitivity alone was important for digit triplet perception, and hearing and cognition together played a role in sentence perception. Self-reported aspects of auditory functioning were correlated with speech intelligibility to different degrees, with digit triplets in noise showing the richest pattern. 
The results suggest that intelligibility tests can vary in their auditory and cognitive demands and their sensitivity to the challenges that…

  7. Relationship between Speech Intelligibility and Speech Comprehension in Babble Noise

    Science.gov (United States)

    Fontan, Lionel; Tardieu, Julien; Gaillard, Pascal; Woisard, Virginie; Ruiz, Robert

    2015-01-01

    Purpose: The authors investigated the relationship between the intelligibility and comprehension of speech presented in babble noise. Method: Forty participants listened to French imperative sentences (commands for moving objects) in a multitalker babble background for which intensity was experimentally controlled. Participants were instructed to…

  8. Speech outcome after surgical treatment for oral and oropharyngeal cancer : A longitudinal assessment of patients reconstructed by a microvascular flap

    NARCIS (Netherlands)

    Borggreven, PA; Verdonck-de Leeuw, [No Value; Langendijk, JA; Doornaert, P; Koster, MN; de Bree, R; Leemans, R

    Background. The aim of the study was to analyze speech outcome for patients with advanced oral/oropharyngeal cancer treated with reconstructive surgery and adjuvant radiotherapy. Methods. Speech tests (communicative suitability, intelligibility, articulation, nasality, and consonant errors) were

  9. Speech intelligibility for normal hearing and hearing-impaired listeners in simulated room acoustic conditions

    DEFF Research Database (Denmark)

    Arweiler, Iris; Dau, Torsten; Poulsen, Torben

    Speech intelligibility depends on many factors such as room acoustics, the acoustical properties and location of the signal and the interferers, and the ability of the (normal and impaired) auditory system to process monaural and binaural sounds. In the present study, the effect of reverberation...... on spatial release from masking was investigated in normal hearing and hearing impaired listeners using three types of interferers: speech shaped noise, an interfering female talker and speech-modulated noise. Speech reception thresholds (SRT) were obtained in three simulated environments: a listening room......, a classroom and a church. The data from the study provide constraints for existing models of speech intelligibility prediction (based on the speech intelligibility index, SII, or the speech transmission index, STI) which have shortcomings when reverberation and/or fluctuating noise affect speech...

  10. Parkinson Disease Detection from Speech Articulation Neuromechanics

    Directory of Open Access Journals (Sweden)

    Pedro Gómez-Vilda

    2017-08-01

    Aim: The research described is intended to give a description of articulation dynamics as a correlate of the kinematic behavior of the jaw-tongue biomechanical system, encoded as a probability distribution of an absolute joint velocity. This distribution may be used in detecting and grading speech from patients affected by neurodegenerative illnesses, such as Parkinson Disease. Hypothesis: The work hypothesis is that the probability density function of the absolute joint velocity includes information on the stability of phonation when applied to sustained vowels, as well as on fluency if applied to connected speech. Methods: A dataset of sustained vowels recorded from Parkinson Disease patients is contrasted with similar recordings from normative subjects. The probability distribution of the absolute kinematic velocity of the jaw-tongue system is extracted from each utterance. A Random Least Squares Feed-Forward Network (RLSFN) has been used as a binary classifier working on the pathological and normative datasets in a leave-one-out strategy. Monte Carlo simulations have been conducted to estimate the influence of the stochastic nature of the classifier. Two datasets for each gender were tested (males and females), including 26 normative and 53 pathological subjects in the male set, and 25 normative and 38 pathological in the female set. Results: Male and female data subsets were tested in single runs, yielding equal error rates under 0.6% (accuracy over 99.4%). Due to the stochastic nature of each experiment, Monte Carlo runs were conducted to test the reliability of the methodology. The average detection results after 200 Monte Carlo runs of a 200-hyperplane hidden-layer RLSFN are given in terms of sensitivity (males: 0.9946, females: 0.9942), specificity (males: 0.9944, females: 0.9941) and accuracy (males: 0.9945, females: 0.9942). The area under the ROC curve is 0.9947 (males) and 0.9945 (females). The equal error rate is 0.0054 (males) and 0.0057 (females)…
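
    The equal error rate reported above is the operating point where false acceptances and false rejections balance. A minimal sketch of computing it from classifier scores (the scores and labels below are toy values, not the study's data):

```python
def equal_error_rate(scores, labels):
    """Equal error rate (EER): sweep a decision threshold over the
    classifier scores and return the smallest achievable max(FRR, FAR),
    i.e. the point where false rejections and false acceptances balance.
    Scores are detector outputs (higher = more likely pathological);
    labels are 1 for pathological, 0 for normative."""
    pos = [s for s, l in zip(scores, labels) if l == 1]
    neg = [s for s, l in zip(scores, labels) if l == 0]
    best = 1.0
    for thr in sorted(set(scores)):  # every observed score as threshold
        frr = sum(1 for s in pos if s < thr) / len(pos)   # misses
        far = sum(1 for s in neg if s >= thr) / len(neg)  # false alarms
        best = min(best, max(frr, far))
    return best

# Perfectly separated scores give an EER of 0:
print(equal_error_rate([0.9, 0.8, 0.2, 0.1], [1, 1, 0, 0]))  # 0.0
```

An EER of 0.0054, as reported for the male set, means the balanced miss and false-alarm rates are both about half a percent.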

  11. Effect of Whole-Body Vibration on Speech. Part 2; Effect on Intelligibility

    Science.gov (United States)

    Begault, Durand R.

    2011-01-01

    The effect of whole-body vibration on speech intelligibility was measured for talkers reading Diagnostic Rhyme Test material while exposed to 0.7 g whole-body vibration to simulate space vehicle launch. Across all talkers, the effect of vibration was to degrade the percentage of correctly transcribed words from 83% to 74%. The magnitude of the effect of vibration on speech communication varies between individuals, for both talkers and listeners. A worst-case scenario for intelligibility would be the most sensitive listener hearing the most sensitive talker; one participant's intelligibility was reduced by 26% (97% to 71%) for one of the talkers.

  12. Exploring the role of brain oscillations in speech perception in noise: Intelligibility of isochronously retimed speech

    Directory of Open Access Journals (Sweden)

    Vincent Aubanel

    2016-08-01

    A growing body of evidence shows that brain oscillations track speech. This mechanism is thought to maximise processing efficiency by allocating resources to important speech information, effectively parsing speech into units of appropriate granularity for further decoding. However, some aspects of this mechanism remain unclear. First, while periodicity is an intrinsic property of this physiological mechanism, speech is only quasi-periodic, so it is not clear whether periodicity would present an advantage in processing. Second, it is still a matter of debate which aspect of speech triggers or maintains cortical entrainment, from bottom-up cues such as fluctuations of the amplitude envelope of speech to higher-level linguistic cues such as syntactic structure. We present data from a behavioural experiment assessing the effect of isochronous retiming of speech on speech perception in noise. Two types of anchor points were defined for retiming speech, namely syllable onsets and amplitude envelope peaks. For each anchor point type, retiming was implemented at two hierarchical levels, a slow time scale around 2.5 Hz and a fast time scale around 4 Hz. Results show that while any temporal distortion resulted in reduced speech intelligibility, isochronous speech anchored to P-centers (approximated by stressed-syllable vowel onsets) was significantly more intelligible than a matched anisochronous retiming, suggesting a facilitative role of periodicity defined on linguistically motivated units in processing speech in noise.
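
    Isochronous retiming maps quasi-periodic anchor points onto a strictly periodic grid. A minimal sketch of computing the target grid (the warping of the audio itself is omitted, and the anchor times below are illustrative, not the study's stimuli):

```python
def isochronous_targets(anchor_times):
    """Map quasi-periodic anchor points (e.g. stressed-syllable vowel
    onsets approximating P-centers) onto a strictly periodic grid with
    the same start time and mean rate; the retimed speech would then be
    time-warped so that each anchor lands on its target."""
    if len(anchor_times) < 2:
        return list(anchor_times)
    start = anchor_times[0]
    period = (anchor_times[-1] - start) / (len(anchor_times) - 1)
    return [start + i * period for i in range(len(anchor_times))]

# Quasi-periodic onsets around 2.5 Hz become exactly periodic:
print(isochronous_targets([0.0, 0.35, 0.85, 1.2]))  # ~[0.0, 0.4, 0.8, 1.2]
```

A matched anisochronous control, as in the experiment, would shift each target by the same magnitudes but without the periodic structure.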

  13. Perception of synthetic speech produced automatically by rule: Intelligibility of eight text-to-speech systems.

    Science.gov (United States)

    Greene, Beth G; Logan, John S; Pisoni, David B

    1986-03-01

    We present the results of studies designed to measure the segmental intelligibility of eight text-to-speech systems and a natural speech control, using the Modified Rhyme Test (MRT). Results indicated that the voices tested could be grouped into four categories: natural speech, high-quality synthetic speech, moderate-quality synthetic speech, and low-quality synthetic speech. The overall performance of the best synthesis system, DECtalk-Paul, was equivalent to natural speech only in terms of performance on initial consonants. The findings are discussed in terms of recent work investigating the perception of synthetic speech under more severe conditions. Suggestions for future research on improving the quality of synthetic speech are also considered.

  15. Speech intelligibility of normal listeners and persons with impaired hearing in traffic noise

    Science.gov (United States)

    Aniansson, G.; Peterson, Y.

    1983-10-01

    Speech intelligibility (PB words) in traffic-like noise was investigated in a laboratory situation simulating three common listening situations, indoors at 1 and 4 m and outdoors at 1 m. The maximum noise levels still permitting 75% intelligibility of PB words in these three listening situations were also defined. A total of 269 persons were examined. Forty-six had normal hearing, 90 a presbycusis-type hearing loss, 95 a noise-induced hearing loss and 38 a conductive hearing loss. In the indoor situation the majority of the groups with impaired hearing retained good speech intelligibility in 40 dB(A) masking noise. Lowering the noise level to less than 40 dB(A) resulted in a minor, usually insignificant, improvement in speech intelligibility. Listeners with normal hearing maintained good speech intelligibility in the outdoor listening situation at noise levels up to 60 dB(A), without lip-reading (i.e., using non-auditory information). For groups with impaired hearing due to age and/or noise, representing 8% of the population in Sweden, the noise level outdoors had to be lowered to less than 50 dB(A), in order to achieve good speech intelligibility at 1 m without lip-reading.

  16. Successful and rapid response of speech bulb reduction program combined with speech therapy in velopharyngeal dysfunction: a case report.

    Science.gov (United States)

    Shin, Yu-Jeong; Ko, Seung-O

    2015-12-01

    Velopharyngeal dysfunction in cleft palate patients following primary palate repair may result in nasal air emission, hypernasality, articulation disorders and poor speech intelligibility. Among conservative treatment methods, a speech aid prosthesis combined with speech therapy is widely used. However, because treatment takes a long time (more than a year) and predictability is low, some clinicians prefer surgical intervention. Thus, the purpose of this report was to draw attention to the effectiveness of speech aid prostheses by introducing a successfully treated case. In this clinical report, a speech bulb reduction program with intensive speech therapy was applied to a patient with velopharyngeal dysfunction, and treatment was completed within 5 months, an unusually short period for speech aid therapy. Furthermore, the advantages of pre-operative speech aid therapy are discussed.

  17. Bridging the Gap Between Speech and Language: Using Multimodal Treatment in a Child With Apraxia.

    Science.gov (United States)

    Tierney, Cheryl D; Pitterle, Kathleen; Kurtz, Marie; Nakhla, Mark; Todorow, Carlyn

    2016-09-01

    Childhood apraxia of speech is a neurologic speech sound disorder in which children have difficulty constructing words and sounds due to poor motor planning and coordination of the articulators required for speech sound production. We report the case of a 3-year-old boy strongly suspected to have childhood apraxia of speech at 18 months of age who used multimodal communication to facilitate language development throughout his work with a speech language pathologist. In 18 months of an intensive structured program, he exhibited atypical rapid improvement, progressing from having no intelligible speech to achieving age-appropriate articulation. We suspect that early introduction of sign language by family proved to be a highly effective form of language development, that when coupled with intensive oro-motor and speech sound therapy, resulted in rapid resolution of symptoms. Copyright © 2016 by the American Academy of Pediatrics.

  18. Prediction and Optimization of Speech Intelligibility in Adverse Conditions

    NARCIS (Netherlands)

    Taal, C.H.

    2013-01-01

    In digital speech-communication systems like mobile phones, public address systems and hearing aids, conveying the message is one of the most important goals. This can be challenging since the intelligibility of the speech may be harmed at various stages before, during and after the transmission.

  19. Comparing the information conveyed by envelope modulation for speech intelligibility, speech quality, and music quality.

    Science.gov (United States)

    Kates, James M; Arehart, Kathryn H

    2015-10-01

    This paper uses mutual information to quantify the relationship between envelope modulation fidelity and perceptual responses. Data from several previous experiments that measured speech intelligibility, speech quality, and music quality are evaluated for normal-hearing and hearing-impaired listeners. A model of the auditory periphery is used to generate envelope signals, and envelope modulation fidelity is calculated using the normalized cross-covariance of the degraded signal envelope with that of a reference signal. Two procedures are used to describe the envelope modulation: (1) modulation within each auditory frequency band and (2) spectro-temporal processing that analyzes the modulation of spectral ripple components fit to successive short-time spectra. The results indicate that low modulation rates provide the highest information for intelligibility, while high modulation rates provide the highest information for speech and music quality. The low-to-mid auditory frequencies are most important for intelligibility, while mid frequencies are most important for speech quality and high frequencies are most important for music quality. Differences between the spectral ripple components used for the spectro-temporal analysis were not significant in five of the six experimental conditions evaluated. The results indicate that different modulation-rate and auditory-frequency weights may be appropriate for indices designed to predict different types of perceptual relationships.
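The envelope modulation fidelity described above, the normalized cross-covariance between a degraded-signal envelope and a reference envelope, can be sketched as follows. This is a minimal broadband illustration using a Hilbert-transform envelope; the function names are ours, and the actual model extracts envelopes per auditory channel after a peripheral model:

```python
import numpy as np
from scipy.signal import hilbert

def envelope(x):
    """Temporal envelope via the magnitude of the analytic signal."""
    return np.abs(hilbert(x))

def envelope_fidelity(degraded, reference):
    """Normalized cross-covariance of the two envelopes (range -1 to 1)."""
    e_d = envelope(degraded)
    e_r = envelope(reference)
    e_d = e_d - np.mean(e_d)
    e_r = e_r - np.mean(e_r)
    return float(np.dot(e_d, e_r) / (np.linalg.norm(e_d) * np.linalg.norm(e_r)))
```

A value near 1 indicates that the degraded signal preserves the reference modulation pattern; additive noise or distortion lowers the value.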

  20. Predicting speech intelligibility based on the signal-to-noise envelope power ratio after modulation-frequency selective processing

    DEFF Research Database (Denmark)

    Jørgensen, Søren; Dau, Torsten

    2011-01-01

    A model for predicting the intelligibility of processed noisy speech is proposed. The speech-based envelope power spectrum model has a similar structure as the model of Ewert and Dau [(2000). J. Acoust. Soc. Am. 108, 1181-1196], developed to account for modulation detection and masking data. The model estimates the speech-to-noise envelope power ratio, SNRenv, at the output of a modulation filterbank and relates this metric to speech intelligibility using the concept of an ideal observer. Predictions were compared to data on the intelligibility of speech presented in stationary speech-shaped noise. The […] process provides a key measure of speech intelligibility. © 2011 Acoustical Society of America.
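The SNRenv idea can be illustrated with a toy, single-band sketch: compute the envelope power of the noisy mixture and of the noise alone within a low-rate modulation band, and take their ratio. The assumptions here are ours for illustration (a broadband Hilbert envelope, a 1-16 Hz Butterworth modulation band, a small floor on the estimated speech envelope power); the full sEPSM instead uses an auditory filterbank followed by a modulation filterbank:

```python
import numpy as np
from scipy.signal import hilbert, butter, sosfiltfilt

def mod_band_env_power(x, fs, f_lo=1.0, f_hi=16.0):
    """Envelope power in a low modulation-rate band, normalized by DC power."""
    env = np.abs(hilbert(x))
    dc = np.mean(env)
    sos = butter(2, [f_lo, f_hi], btype="bandpass", fs=fs, output="sos")
    env_ac = sosfiltfilt(sos, env - dc)
    return np.mean(env_ac ** 2) / dc ** 2

def snr_env_db(noisy_speech, noise, fs=16000):
    """Toy SNRenv: speech envelope power = mixture's minus the noise's."""
    p_mix = mod_band_env_power(noisy_speech, fs)
    p_noise = mod_band_env_power(noise, fs)
    p_speech = max(p_mix - p_noise, 1e-4 * p_noise)  # floor keeps the ratio finite
    return 10.0 * np.log10(p_speech / p_noise)
```

Because speech envelopes fluctuate slowly (roughly 1-16 Hz) while a stationary noise envelope carries little power at those rates, the ratio grows with the amount of intact slow modulation, mirroring the model's link to intelligibility.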

  1. Application of artificial intelligence principles to the analysis of "crazy" speech.

    Science.gov (United States)

    Garfield, D A; Rapp, C

    1994-04-01

    Artificial intelligence computer simulation methods can be used to investigate psychotic or "crazy" speech. Here, symbolic reasoning algorithms establish semantic networks that schematize speech. These semantic networks consist of two main structures: case frames and object taxonomies. Node-based reasoning rules apply to object taxonomies and pathway-based reasoning rules apply to case frames. Normal listeners may recognize speech as "crazy talk" based on violations of node- and pathway-based reasoning rules. In this article, three separate segments of schizophrenic speech illustrate violations of these rules. This artificial intelligence approach is compared and contrasted with other neurolinguistic approaches and is discussed as a conceptual link between neurobiological and psychodynamic understandings of psychopathology.

  2. The role of auditory spectro-temporal modulation filtering and the decision metric for speech intelligibility prediction

    DEFF Research Database (Denmark)

    Chabot-Leclerc, Alexandre; Jørgensen, Søren; Dau, Torsten

    2014-01-01

    Speech intelligibility models typically consist of a preprocessing part that transforms stimuli into some internal (auditory) representation and a decision metric that relates the internal representation to speech intelligibility. The present study analyzed the role of modulation filtering in the preprocessing of different speech intelligibility models by comparing predictions from models that either assume a spectro-temporal (i.e., two-dimensional) or a temporal-only (i.e., one-dimensional) modulation filterbank. Furthermore, the role of the decision metric for speech intelligibility was investigated […] subtraction. The results suggested that a decision metric based on the SNRenv may provide a more general basis for predicting speech intelligibility than a metric based on the MTF. Moreover, the one-dimensional modulation filtering process was found to be sufficient to account for the data when combined […]

  3. The relationship of speech intelligibility with hearing sensitivity, cognition, and perceived hearing difficulties varies for different speech perception tests

    Directory of Open Access Journals (Sweden)

    Antje eHeinrich

    2015-06-01

    Listeners vary in their ability to understand speech in noisy environments. Hearing sensitivity, as measured by pure-tone audiometry, can only partly explain these results, and cognition has emerged as another key concept. Although cognition relates to speech perception, the exact nature of the relationship remains to be fully understood. This study investigates how different aspects of cognition, particularly working memory and attention, relate to speech intelligibility for various tests. Perceptual accuracy of speech perception represents just one aspect of functioning in a listening environment. Activity and participation limits imposed by hearing loss, in addition to the demands of a listening environment, are also important and may be better captured by self-report questionnaires. Understanding how speech perception relates to self-reported aspects of listening forms the second focus of the study. Forty-four listeners aged between 50 and 74 years with mild sensorineural hearing loss (SNHL) were tested on speech perception tests differing in complexity from low (phoneme discrimination in quiet), to medium (digit triplet perception in speech-shaped noise), to high (sentence perception in modulated noise); cognitive tests of attention, memory, and nonverbal IQ; and self-report questionnaires of general health-related and hearing-specific quality of life. Hearing sensitivity and cognition related to intelligibility differently depending on the speech test: neither was important for phoneme discrimination, hearing sensitivity alone was important for digit triplet perception, and hearing and cognition together played a role in sentence perception. Self-reported aspects of auditory functioning were correlated with speech intelligibility to different degrees, with digit triplets in noise showing the richest pattern. The results suggest that intelligibility tests can vary in their auditory and cognitive demands and their sensitivity to the challenges that auditory environments pose on […]

  4. The Effect of Noise on Relationships Between Speech Intelligibility and Self-Reported Communication Measures in Tracheoesophageal Speakers.

    Science.gov (United States)

    Eadie, Tanya L; Otero, Devon Sawin; Bolt, Susan; Kapsner-Smith, Mara; Sullivan, Jessica R

    2016-08-01

    The purpose of this study was to examine how sentence intelligibility relates to self-reported communication in tracheoesophageal speakers when speech intelligibility is measured in quiet and noise. Twenty-four tracheoesophageal speakers who were at least 1 year postlaryngectomy provided audio recordings of 5 sentences from the Sentence Intelligibility Test. Speakers also completed self-reported measures of communication: the Voice Handicap Index-10 and the Communicative Participation Item Bank short form. Speech recordings were presented to 2 groups of inexperienced listeners who heard sentences in quiet or noise. Listeners transcribed the sentences to yield speech intelligibility scores. Very weak relationships were found between intelligibility in quiet and measures of voice handicap and communicative participation. Slightly stronger, but still weak and nonsignificant, relationships were observed between measures of intelligibility in noise and both self-reported measures. However, 12 speakers who were more than 65% intelligible in noise showed strong and statistically significant relationships with both self-reported measures (R² = .76-.79). Speech intelligibility in quiet is a weak predictor of self-reported communication measures in tracheoesophageal speakers. Speech intelligibility in noise may be a better metric of self-reported communicative function for speakers who demonstrate higher speech intelligibility in noise.

  5. The benefit of combining a deep neural network architecture with ideal ratio mask estimation in computational speech segregation to improve speech intelligibility.

    Science.gov (United States)

    Bentsen, Thomas; May, Tobias; Kressner, Abigail A; Dau, Torsten

    2018-01-01

    Computational speech segregation attempts to automatically separate speech from noise. This is challenging in conditions with interfering talkers and low signal-to-noise ratios. Recent approaches have adopted deep neural networks and successfully demonstrated speech intelligibility improvements. A selection of components may be responsible for the success with these state-of-the-art approaches: the system architecture, a time frame concatenation technique and the learning objective. The aim of this study was to explore the roles and the relative contributions of these components by measuring speech intelligibility in normal-hearing listeners. A substantial improvement of 25.4 percentage points in speech intelligibility scores was found going from a subband-based architecture, in which a Gaussian Mixture Model-based classifier predicts the distributions of speech and noise for each frequency channel, to a state-of-the-art deep neural network-based architecture. Another improvement of 13.9 percentage points was obtained by changing the learning objective from the ideal binary mask, in which individual time-frequency units are labeled as either speech- or noise-dominated, to the ideal ratio mask, where the units are assigned a continuous value between zero and one. Therefore, both components play significant roles and by combining them, speech intelligibility improvements were obtained in a six-talker condition at a low signal-to-noise ratio.
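The two learning objectives contrasted above can be written down directly for magnitude spectrograms of the premixed speech and noise. This is a generic sketch: the 0 dB local criterion and the power-ratio form of the IRM are common conventions, not necessarily the exact definitions used in the study:

```python
import numpy as np

def ideal_binary_mask(speech_mag, noise_mag, lc_db=0.0):
    """Label each time-frequency unit 1 (speech-dominated) if the local SNR
    exceeds the criterion lc_db, else 0 (noise-dominated)."""
    snr_db = 20.0 * np.log10(speech_mag / np.maximum(noise_mag, 1e-12))
    return (snr_db > lc_db).astype(float)

def ideal_ratio_mask(speech_mag, noise_mag):
    """Assign each unit a continuous value in [0, 1] (power-ratio form)."""
    s2, n2 = speech_mag ** 2, noise_mag ** 2
    return s2 / (s2 + n2)
```

Applying the IBM zeroes out noise-dominated units entirely, while the IRM attenuates them smoothly; this is the change in learning objective behind the reported 13.9-percentage-point improvement.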

  6. Hidden Hearing Loss and Computational Models of the Auditory Pathway: Predicting Speech Intelligibility Decline

    Science.gov (United States)

    2016-11-28

    Christopher J. Smalt […] representation of speech intelligibility in noise. The auditory-periphery model of Zilany et al. (JASA 2009, 2014) is used to make predictions of auditory nerve (AN) responses to speech stimuli under a variety of difficult listening conditions. The resulting cochlear neurogram, a spectrogram […]

  7. Predicting speech intelligibility in adverse conditions: evaluation of the speech-based envelope power spectrum model

    DEFF Research Database (Denmark)

    Jørgensen, Søren; Dau, Torsten

    2011-01-01

    The speech-based envelope power spectrum model (sEPSM) [Jørgensen and Dau (2011). J. Acoust. Soc. Am., 130 (3), 1475–1487] estimates the envelope signal-to-noise ratio (SNRenv) of distorted speech and accurately describes the speech recognition thresholds (SRT) for normal-hearing listeners. Here, the model is evaluated in adverse conditions by comparing predictions to measured data from [Kjems et al. (2009). J. Acoust. Soc. Am. 126 (3), 1415-1426], where speech is mixed with four different interferers, including speech-shaped noise, bottle noise, car noise, and cafe noise. The model accounts well for the differences in intelligibility observed for the different interferers. None of the standardized models successfully describe these data.

  8. Multimodal Speech Capture System for Speech Rehabilitation and Learning.

    Science.gov (United States)

    Sebkhi, Nordine; Desai, Dhyey; Islam, Mohammad; Lu, Jun; Wilson, Kimberly; Ghovanloo, Maysam

    2017-11-01

    Speech-language pathologists (SLPs) are trained to correct articulation of people diagnosed with motor speech disorders by analyzing articulators' motion and assessing speech outcome while patients speak. To assist SLPs in this task, we are presenting the multimodal speech capture system (MSCS) that records and displays kinematics of key speech articulators, the tongue and lips, along with voice, using unobtrusive methods. Collected speech modalities, tongue motion, lips gestures, and voice are visualized not only in real-time to provide patients with instant feedback but also offline to allow SLPs to perform post-analysis of articulators' motion, particularly the tongue, with its prominent but hardly visible role in articulation. We describe the MSCS hardware and software components, and demonstrate its basic visualization capabilities by a healthy individual repeating the words "Hello World." A proof-of-concept prototype has been successfully developed for this purpose, and will be used in future clinical studies to evaluate its potential impact on accelerating speech rehabilitation by enabling patients to speak naturally. Pattern matching algorithms to be applied to the collected data can provide patients with quantitative and objective feedback on their speech performance, unlike current methods that are mostly subjective, and may vary from one SLP to another.

  9. A Binaural Grouping Model for Predicting Speech Intelligibility in Multitalker Environments

    Directory of Open Access Journals (Sweden)

    Jing Mi

    2016-09-01

    Spatially separating speech maskers from target speech often leads to a large intelligibility improvement. Modeling this phenomenon has long been of interest to binaural-hearing researchers for uncovering brain mechanisms and for improving signal-processing algorithms in hearing-assistive devices. Much of the previous binaural modeling work focused on the unmasking enabled by binaural cues at the periphery, and little quantitative modeling has been directed toward the grouping or source-separation benefits of binaural processing. In this article, we propose a binaural model that focuses on grouping, specifically on the selection of time-frequency units that are dominated by signals from the direction of the target. The proposed model uses Equalization-Cancellation (EC) processing with a binary decision rule to estimate a time-frequency binary mask. EC processing is carried out to cancel the target signal and the energy change between the EC input and output is used as a feature that reflects target dominance in each time-frequency unit. The processing in the proposed model requires little computational resources and is straightforward to implement. In combination with the Coherence-based Speech Intelligibility Index, the model is applied to predict the speech intelligibility data measured by Marrone et al. The predicted speech reception threshold matches the pattern of the measured data well, even though the predicted intelligibility improvements relative to the colocated condition are larger than some of the measured data, which may reflect the lack of internal noise in this initial version of the model.

  10. A Binaural Grouping Model for Predicting Speech Intelligibility in Multitalker Environments.

    Science.gov (United States)

    Mi, Jing; Colburn, H Steven

    2016-10-03

    Spatially separating speech maskers from target speech often leads to a large intelligibility improvement. Modeling this phenomenon has long been of interest to binaural-hearing researchers for uncovering brain mechanisms and for improving signal-processing algorithms in hearing-assistive devices. Much of the previous binaural modeling work focused on the unmasking enabled by binaural cues at the periphery, and little quantitative modeling has been directed toward the grouping or source-separation benefits of binaural processing. In this article, we propose a binaural model that focuses on grouping, specifically on the selection of time-frequency units that are dominated by signals from the direction of the target. The proposed model uses Equalization-Cancellation (EC) processing with a binary decision rule to estimate a time-frequency binary mask. EC processing is carried out to cancel the target signal and the energy change between the EC input and output is used as a feature that reflects target dominance in each time-frequency unit. The processing in the proposed model requires little computational resources and is straightforward to implement. In combination with the Coherence-based Speech Intelligibility Index, the model is applied to predict the speech intelligibility data measured by Marrone et al. The predicted speech reception threshold matches the pattern of the measured data well, even though the predicted intelligibility improvements relative to the colocated condition are larger than some of the measured data, which may reflect the lack of internal noise in this initial version of the model. © The Author(s) 2016.
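The EC-based target-dominance feature can be sketched in a deliberately simplified form. We assume here that the target is frontal and therefore identical at the two ears (diotic), so cancellation reduces to subtracting the ear signals; the actual model applies interaural level and time equalization per time-frequency unit, and the 0.5 threshold below is an arbitrary illustrative choice:

```python
import numpy as np

def ec_target_dominance(left, right):
    """Relative energy drop after cancelling a frontal (diotic) target.

    Subtracting the ears removes components that are identical at both ears
    (the frontal target), so a large energy drop signals target dominance.
    """
    e_in = 0.5 * (np.sum(left ** 2) + np.sum(right ** 2))
    e_out = np.sum((left - right) ** 2)
    return (e_in - e_out) / max(e_in, 1e-12)

def ec_binary_mask(left_units, right_units, threshold=0.5):
    """Binary mask over a list of paired time-frequency unit waveforms."""
    return [1 if ec_target_dominance(l, r) > threshold else 0
            for l, r in zip(left_units, right_units)]
```

Units dominated by the frontal target show a large energy drop after cancellation and are kept (mask = 1); units dominated by maskers from other directions are discarded.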

  11. Preschool speech intelligibility and vocabulary skills predict long-term speech and language outcomes following cochlear implantation in early childhood.

    Science.gov (United States)

    Castellanos, Irina; Kronenberger, William G; Beer, Jessica; Henning, Shirley C; Colson, Bethany G; Pisoni, David B

    2014-07-01

    Speech and language measures during grade school predict adolescent speech-language outcomes in children who receive cochlear implants (CIs), but no research has examined whether speech and language functioning at even younger ages is predictive of long-term outcomes in this population. The purpose of this study was to examine whether early preschool measures of speech and language performance predict speech-language functioning in long-term users of CIs. Early measures of speech intelligibility and receptive vocabulary (obtained during preschool ages of 3-6 years) in a sample of 35 prelingually deaf, early-implanted children predicted speech perception, language, and verbal working memory skills up to 18 years later. Age of onset of deafness and age at implantation added additional variance to preschool speech intelligibility in predicting some long-term outcome scores, but the relationship between preschool speech-language skills and later speech-language outcomes was not significantly attenuated by the addition of these hearing history variables. These findings suggest that speech and language development during the preschool years is predictive of long-term speech and language functioning in early-implanted, prelingually deaf children. As a result, measures of speech-language functioning at preschool ages can be used to identify and adjust interventions for very young CI users who may be at long-term risk for suboptimal speech and language outcomes.

  12. Development of a Danish speech intelligibility test

    DEFF Research Database (Denmark)

    Nielsen, Jens Bo; Dau, Torsten

    2009-01-01

    A Danish speech intelligibility test for assessing the speech recognition threshold in noise (SRTN) has been developed. The test consists of 180 sentences distributed in 18 phonetically balanced lists. The sentences are based on an open word-set and represent everyday language. […] The test was verified with 14 normal-hearing listeners; the overall SRTN lies at a signal-to-noise ratio of -3.15 dB with a standard deviation of 1.0 dB. The list-SRTNs deviate less than 0.5 dB from the overall mean.
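The abstract does not state how the SRTN was tracked; a common way to estimate a sentence-based SRT is an adaptive up-down procedure that lowers the SNR after a correct response and raises it after an error, converging near 50% correct. The sketch below is illustrative only, not the authors' procedure:

```python
def track_srt(responses, start_snr_db=0.0, step_db=2.0):
    """Simple 1-up/1-down staircase over sentence trials.

    `responses` is an iterable of booleans (sentence scored correct or not).
    Returns the list of SNRs presented; averaging the tail of this track
    gives the SRT estimate.
    """
    snr = start_snr_db
    track = []
    for correct in responses:
        track.append(snr)
        snr += -step_db if correct else step_db
    return track
```

Averaging the SNR values over the final trials (or reversals) of such a track estimates the SRT; list SRTNs deviating less than 0.5 dB from the -3.15 dB mean indicates well-balanced lists.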

  13. Acoustic richness modulates the neural networks supporting intelligible speech processing.

    Science.gov (United States)

    Lee, Yune-Sang; Min, Nam Eun; Wingfield, Arthur; Grossman, Murray; Peelle, Jonathan E

    2016-03-01

    The information contained in a sensory signal plays a critical role in determining what neural processes are engaged. Here we used interleaved silent steady-state (ISSS) functional magnetic resonance imaging (fMRI) to explore how human listeners cope with different degrees of acoustic richness during auditory sentence comprehension. Twenty-six healthy young adults underwent scanning while hearing sentences that varied in acoustic richness (high vs. low spectral detail) and syntactic complexity (subject-relative vs. object-relative center-embedded clause structures). We manipulated acoustic richness by presenting the stimuli as unprocessed full-spectrum speech, or noise-vocoded with 24 channels. Importantly, although the vocoded sentences were spectrally impoverished, all sentences were highly intelligible. These manipulations allowed us to test how intelligible speech processing was affected by orthogonal linguistic and acoustic demands. Acoustically rich speech showed stronger activation than acoustically less-detailed speech in a bilateral temporoparietal network with more pronounced activity in the right hemisphere. By contrast, listening to sentences with greater syntactic complexity resulted in increased activation of a left-lateralized network including left posterior lateral temporal cortex, left inferior frontal gyrus, and left dorsolateral prefrontal cortex. Significant interactions between acoustic richness and syntactic complexity occurred in left supramarginal gyrus, right superior temporal gyrus, and right inferior frontal gyrus, indicating that the regions recruited for syntactic challenge differed as a function of acoustic properties of the speech. Our findings suggest that the neural systems involved in speech perception are finely tuned to the type of information available, and that reducing the richness of the acoustic signal dramatically alters the brain's response to spoken language, even when intelligibility is high. Copyright © 2015 Elsevier.

  14. The influence of masker type on early reflection processing and speech intelligibility (L)

    DEFF Research Database (Denmark)

    Arweiler, Iris; Buchholz, Jörg M.; Dau, Torsten

    2013-01-01

    Arweiler and Buchholz [J. Acoust. Soc. Am. 130, 996-1005 (2011)] showed that, while the energy of early reflections (ERs) in a room improves speech intelligibility, the benefit is smaller than that provided by the energy of the direct sound (DS). In terms of integration of ERs and DS, binaural listening did not provide a benefit from ERs apart from a binaural energy summation, such that monaural auditory processing could account for the data. However, a diffuse speech-shaped noise (SSN) was used in the speech intelligibility experiments, which does not provide distinct binaural cues to the auditory system. In the present study, the monaural and binaural benefit from ERs for speech intelligibility was investigated using three directional maskers presented from 90° azimuth: a SSN, a multi-talker babble, and a reversed two-talker masker. For normal-hearing as well as hearing-impaired listeners […]

  15. Vowel Generation for Children with Cerebral Palsy using Myocontrol of a Speech Synthesizer

    Directory of Open Access Journals (Sweden)

    Chuanxin M Niu

    2015-01-01

    For children with severe cerebral palsy (CP), social and emotional interactions can be significantly limited due to impaired speech motor function. However, if it is possible to extract continuous voluntary control signals from the electromyograph (EMG) of limb muscles, then EMG may be used to drive the synthesis of intelligible speech with controllable speed, intonation and articulation. We report an important first step: the feasibility of controlling a vowel synthesizer using non-speech muscles. A classic formant-based speech synthesizer is adapted to allow the lowest two formants to be controlled by surface EMG from skeletal muscles. EMG signals are filtered using a non-linear Bayesian filtering algorithm that provides the high bandwidth and accuracy required for speech tasks. The frequencies of the first two formants determine points in a 2D plane, and vowels are targets on this plane. We focus on testing the overall feasibility of producing intelligible English vowels with myocontrol using two straightforward EMG-formant mappings. More mappings can be tested in the future to optimize intelligibility. Vowel generation was tested on 10 healthy adults and 4 patients with dyskinetic CP. Five English vowels were generated by subjects in pseudo-random order, after only 10 minutes of device familiarization. The fraction of vowels correctly identified by 4 naive listeners exceeded 80% for the vowels generated by healthy adults and 57% for vowels generated by patients with CP. Our goal is a continuous virtual voice with personalized intonation and articulation that will restore not only the intellectual content but also the social and emotional content of speech for children and adults with severe movement disorders.
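The classic formant-based synthesis that the myocontrol interface drives can be illustrated with a minimal two-formant sketch: an impulse-train glottal source passed through cascaded second-order resonators at F1 and F2. All parameter values below (bandwidths, f0, gain normalization) are our own illustrative choices, not those of the study:

```python
import numpy as np
from scipy.signal import lfilter

def formant_resonator(x, f_c, bw, fs):
    """Second-order (two-pole) resonator centred at f_c with bandwidth bw."""
    r = np.exp(-np.pi * bw / fs)          # pole radius from bandwidth
    theta = 2.0 * np.pi * f_c / fs        # pole angle from centre frequency
    a = [1.0, -2.0 * r * np.cos(theta), r * r]
    b = [1.0 - r]                         # rough gain scaling (illustrative)
    return lfilter(b, a, x)

def synthesize_vowel(f1, f2, fs=16000, dur=0.3, f0=120.0):
    """Drive two cascaded formant resonators with an impulse train at f0."""
    n = int(fs * dur)
    source = np.zeros(n)
    source[::int(fs / f0)] = 1.0          # glottal impulse train
    y = formant_resonator(source, f1, 80.0, fs)
    y = formant_resonator(y, f2, 120.0, fs)
    return y / (np.max(np.abs(y)) + 1e-12)  # peak-normalize
```

In the myocontrol scheme described above, the filtered EMG signals would set (f1, f2) continuously, moving the output among vowel targets in the F1-F2 plane.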

  16. Australian children with cleft palate achieve age-appropriate speech by 5 years of age.

    Science.gov (United States)

    Chacon, Antonia; Parkin, Melissa; Broome, Kate; Purcell, Alison

    2017-12-01

    Children with cleft palate demonstrate atypical speech sound development, which can influence their intelligibility, literacy and learning. There is limited documentation regarding how speech sound errors change over time in cleft palate speech and the effect that these errors have upon mono- versus polysyllabic word production. The objective of this study was to examine the phonetic and phonological speech skills of children with cleft palate at ages 3 and 5. A cross-sectional observational design was used. Eligible participants were aged 3 or 5 years with a repaired cleft palate. The Diagnostic Evaluation of Articulation and Phonology (DEAP) Articulation subtest and a non-standardised list of mono- and polysyllabic words were administered once for each child. The Profile of Phonology (PROPH) was used to analyse each child's speech. N = 51 children with cleft palate participated in the study. Three-year-old children with cleft palate produced significantly more speech errors than their typically-developing peers, but no difference was apparent at 5 years. The 5-year-olds demonstrated greater phonetic and phonological accuracy than the 3-year-old children. Polysyllabic words were more affected by errors than monosyllables in the 3-year-old group only. Children with cleft palate are prone to phonetic and phonological speech errors in their preschool years. Most of these speech errors approximate those of typically-developing children by 5 years. At 3 years, word shape has an influence upon phonological speech accuracy. Speech pathology intervention is indicated to support the intelligibility of these children from their earliest stages of development. Copyright © 2017 Elsevier B.V. All rights reserved.

  17. Activating Articulation Skills through Theraplay.

    Science.gov (United States)

    Kupperman, Phyllis; And Others

    1980-01-01

    Speech theraplay, a method of remediation for children with articulation disorders, is described. The approach is based on parent-child interactions that are postulated to activate articulation acquisition. The results of a six-week study indicated improvement in the articulation abilities of six children (3 to 4 years old) with this method.

  18. Speech outcome in unilateral complete cleft lip and palate patients: a descriptive study.

    Science.gov (United States)

    Rullo, R; Di Maggio, D; Addabbo, F; Rullo, F; Festa, V M; Perillo, L

    2014-09-01

    In this study, resonance and articulation disorders were examined in a group of patients surgically treated for cleft lip and palate, considering family social background and the children's ability to self-monitor their speech output while speaking. Fifty children (32 males and 18 females), mean age 6.5 ± 1.6 years, affected by non-syndromic complete unilateral cleft of the lip and palate underwent the same surgical protocol. The speech level was evaluated using Accordi's speech assessment protocol, which focuses on intelligibility, nasality, nasal air escape, pharyngeal friction, and glottal stop. Pearson product-moment correlation analysis was used to detect significant associations between the analysed parameters. A total of 16% (8 children) of the sample had a severe to moderate degree of nasality and nasal air escape, with presence of pharyngeal friction and glottal stop, which obviously compromise speech intelligibility. Ten children (20%) showed a barely acceptable phonological outcome: nasality and nasal air escape were mild to moderate, but intelligibility remained poor. Thirty-two children (64%) had normal speech. Statistical analysis revealed a significant correlation between the severity of nasal resonance and nasal air escape (p ≤ 0.05). No statistically significant correlation was found between final intelligibility and patient social background, nor between final intelligibility and patient age. The differences in speech outcome could be explained by a specific, subjective, and inborn ability, different for each child, to self-monitor their speech output.

  19. Improving on hidden Markov models: An articulatorily constrained, maximum likelihood approach to speech recognition and speech coding

    Energy Technology Data Exchange (ETDEWEB)

    Hogden, J.

    1996-11-05

    The goal of the proposed research is to test a statistical model of speech recognition that incorporates the knowledge that speech is produced by relatively slow motions of the tongue, lips, and other speech articulators. This model is called Maximum Likelihood Continuity Mapping (Malcom). Many speech researchers believe that by using constraints imposed by articulator motions, we can improve or replace the current hidden Markov model based speech recognition algorithms. Unfortunately, previous efforts to incorporate information about articulation into speech recognition algorithms have suffered because (1) slight inaccuracies in our knowledge or the formulation of our knowledge about articulation may decrease recognition performance, (2) small changes in the assumptions underlying models of speech production can lead to large changes in the speech derived from the models, and (3) collecting measurements of human articulator positions in sufficient quantity for training a speech recognition algorithm is still impractical. The most interesting (and in fact, unique) quality of Malcom is that, even though Malcom makes use of a mapping between acoustics and articulation, Malcom can be trained to recognize speech using only acoustic data. By learning the mapping between acoustics and articulation using only acoustic data, Malcom avoids the difficulties involved in collecting articulator position measurements and does not require an articulatory synthesizer model to estimate the mapping between vocal tract shapes and speech acoustics. Preliminary experiments that demonstrate that Malcom can learn the mapping between acoustics and articulation are discussed. Potential applications of Malcom aside from speech recognition are also discussed. Finally, specific deliverables resulting from the proposed research are described.

  20. Surgical improvement of speech disorder caused by amyotrophic lateral sclerosis.

    Science.gov (United States)

    Saigusa, Hideto; Yamaguchi, Satoshi; Nakamura, Tsuyoshi; Komachi, Taro; Kadosono, Osamu; Ito, Hiroyuki; Saigusa, Makoto; Niimi, Seiji

    2012-12-01

    Amyotrophic lateral sclerosis (ALS) is a progressive, debilitating neurological disease. ALS disturbs the quality of life by affecting speech, swallowing and free mobility of the arms, without affecting intellectual function. It is therefore of significance to improve the intelligibility and quality of speech sounds, especially for ALS patients with slowly progressive courses. Currently, however, there is no effective or established approach to improve the speech disorder caused by ALS. We investigated a surgical procedure to improve speech in some patients with neuromuscular diseases and velopharyngeal closure incompetence. In this study, we performed the surgical procedure for two patients suffering from severe speech disorder caused by slowly progressing ALS. The patients suffered from speech disorder with hypernasality and imprecise and weak articulation during a 6-year course (patient 1) and a 3-year course (patient 2) of slowly progressing ALS. We narrowed the bilateral lateral palatopharyngeal walls at the velopharyngeal port, performing the surgery under general anesthesia without muscle relaxant in both patients. Postoperatively, the intelligibility and quality of their speech sounds improved greatly within one month without any speech therapy. The patients were also able to generate longer speech phrases after the surgery. Importantly, there were no serious complications during or after the surgery. In summary, we performed bilateral narrowing of the lateral palatopharyngeal wall as a speech surgery for two patients suffering from severe speech disorder associated with ALS. With this technique, improved intelligibility and quality of speech can be maintained for a longer duration in patients with slowly progressing ALS.

  1. Modeling Speech Intelligibility in Hearing Impaired Listeners

    DEFF Research Database (Denmark)

    Scheidiger, Christoph; Jørgensen, Søren; Dau, Torsten

    2014-01-01

    … speech, e.g. phase jitter or spectral subtraction. Recent studies predict SI for normal-hearing (NH) listeners based on a signal-to-noise ratio measure in the envelope domain (SNRenv), in the framework of the speech-based envelope power spectrum model (sEPSM, [20, 21]). These models have shown good agreement with measured data under a broad range of conditions, including stationary and modulated interferers, reverberation, and spectral subtraction. Despite the advances in modeling intelligibility in NH listeners, a broadly applicable model that can predict SI in hearing-impaired (HI) listeners is not yet available. As a first step towards such a model, this study investigates to what extent effects of hearing impairment on SI can be modeled in the sEPSM framework. Preliminary results show that, by only modeling the loss of audibility, the model cannot account for the higher speech reception …
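The SNRenv metric at the core of the sEPSM can be illustrated with a small numerical sketch. This is a deliberately simplified single-channel toy in Python/numpy, not the published model: the envelope is taken as the Hilbert envelope of the full-band signal, envelope power is the AC power normalized by the squared mean, and the modulation filterbank is omitted.

```python
import numpy as np

def envelope(x):
    """Hilbert envelope via the analytic signal (FFT method)."""
    n = len(x)
    X = np.fft.fft(x)
    h = np.zeros(n)
    h[0] = 1
    h[1:(n + 1) // 2] = 2
    if n % 2 == 0:
        h[n // 2] = 1
    return np.abs(np.fft.ifft(X * h))

def env_power(x):
    """AC power of the envelope, normalized by its DC (mean) squared."""
    e = envelope(x)
    return np.var(e) / np.mean(e) ** 2

def snr_env(mix, noise):
    """Long-term SNRenv: excess envelope power of the mixture over the noise alone."""
    p_mix, p_noise = env_power(mix), env_power(noise)
    return max(p_mix - p_noise, 0.0) / p_noise

rng = np.random.default_rng(0)
fs = 16000
t = np.arange(fs) / fs
# Toy "speech": a 1 kHz carrier with a strong 4 Hz envelope modulation.
speech = (1 + 0.9 * np.sin(2 * np.pi * 4 * t)) * np.sin(2 * np.pi * 1000 * t)
noise = rng.standard_normal(len(t))
snr_low = snr_env(0.2 * speech + noise, noise)   # poor acoustic SNR
snr_high = snr_env(2.0 * speech + noise, noise)  # good acoustic SNR
```

A higher speech level in the mixture leaves more speech modulation power above the noise's intrinsic envelope fluctuations, so `snr_high` exceeds `snr_low`; the full model maps this quantity to intelligibility via an ideal-observer stage.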

  2. Methods and models for quantitative assessment of speech intelligibility in cross-language communication

    NARCIS (Netherlands)

    Wijngaarden, S.J. van; Steeneken, H.J.M.; Houtgast, T.

    2001-01-01

    To deal with the effects of nonnative speech communication on speech intelligibility, one must know the magnitude of these effects. To measure this magnitude, suitable test methods must be available. Many of the methods used in cross-language speech communication research are not very suitable for …

  3. Peripheral facial palsy: Speech, communication and oral motor function.

    Science.gov (United States)

    Movérare, T; Lohmander, A; Hultcrantz, M; Sjögreen, L

    2017-02-01

    The aim of the present study was to examine the effect of acquired unilateral peripheral facial palsy on speech, communication and oral functions and to study the relationship between the degree of facial palsy and articulation, saliva control, eating ability and lip force. In this descriptive study, 27 patients (15 men and 12 women, mean age 48 years) with unilateral peripheral facial palsy were included if they were graded under 70 on the Sunnybrook Facial Grading System. The assessment was carried out in connection with customary visits to the ENT Clinic and comprised lip force, articulation and intelligibility, together with perceived communication ability and the ability to eat and control saliva, assessed through self-response questionnaires. The patients with unilateral facial palsy had significantly lower lip force, poorer articulation and poorer ability to eat and control saliva compared with reference data from healthy populations. The degree of facial palsy correlated significantly with lip force but not with articulation, intelligibility, perceived communication ability or reported ability to eat and control saliva. Acquired peripheral facial palsy may affect communication and the ability to eat and control saliva. Physicians should be aware that there is no direct correlation between the degree of facial palsy and the possible effect on communication, eating ability and saliva control. Physicians are therefore recommended to ask specific questions relating to problems with these functions during customary medical visits and offer possible intervention by a speech-language pathologist or a physiotherapist. Copyright © 2016 Elsevier Masson SAS. All rights reserved.

  4. Adequacy of velopharyngeal closure and speech competency following prosthetic management of soft palate resection

    International Nuclear Information System (INIS)

    ElDakkak, M

    1999-01-01

    Ten patients who had undergone soft palate resection for the removal of palatal tumors were studied. In each patient, the surgical defect involved the posterior margin of the soft palate and led to velopharyngeal insufficiency. None of the patients had suffered any speech, hearing or nasal problems before surgery. For each patient, a speech aid obturator was constructed and worn for at least one month before the evaluation. Prosthetic management of each subject was evaluated as reflected in the adequacy of velopharyngeal closure and speech competency. Various aspects of speech, including intelligibility, articulation, nasality, hoarseness and overall speech, were correlated with the adequacy of velopharyngeal closure. (author)

  6. Prediction and constraint in audiovisual speech perception.

    Science.gov (United States)

    Peelle, Jonathan E; Sommers, Mitchell S

    2015-07-01

    During face-to-face conversational speech listeners must efficiently process a rapid and complex stream of multisensory information. Visual speech can serve as a critical complement to auditory information because it provides cues to both the timing of the incoming acoustic signal (the amplitude envelope, influencing attention and perceptual sensitivity) and its content (place and manner of articulation, constraining lexical selection). Here we review behavioral and neurophysiological evidence regarding listeners' use of visual speech information. Multisensory integration of audiovisual speech cues improves recognition accuracy, particularly for speech in noise. Even when speech is intelligible based solely on auditory information, adding visual information may reduce the cognitive demands placed on listeners through increasing the precision of prediction. Electrophysiological studies demonstrate that oscillatory cortical entrainment to speech in auditory cortex is enhanced when visual speech is present, increasing sensitivity to important acoustic cues. Neuroimaging studies also suggest increased activity in auditory cortex when congruent visual information is available, but additionally emphasize the involvement of heteromodal regions of posterior superior temporal sulcus as playing a role in integrative processing. We interpret these findings in a framework of temporally-focused lexical competition in which visual speech information affects auditory processing to increase sensitivity to acoustic information through an early integration mechanism, and a late integration stage that incorporates specific information about a speaker's articulators to constrain the number of possible candidates in a spoken utterance. Ultimately it is words compatible with both auditory and visual information that most strongly determine successful speech perception during everyday listening. Thus, audiovisual speech perception is accomplished through multiple stages of integration …

  7. Reference-Free Assessment of Speech Intelligibility Using Bispectrum of an Auditory Neurogram

    Science.gov (United States)

    Hossain, Mohammad E.; Jassim, Wissam A.; Zilany, Muhammad S. A.

    2016-01-01

    Sensorineural hearing loss occurs due to damage to the inner and outer hair cells of the peripheral auditory system. Hearing loss can cause decreases in audibility, dynamic range, frequency and temporal resolution of the auditory system, and all of these effects are known to affect speech intelligibility. In this study, a new reference-free speech intelligibility metric is proposed using 2-D neurograms constructed from the output of a computational model of the auditory periphery. The responses of the auditory-nerve fibers with a wide range of characteristic frequencies were simulated to construct neurograms. The features of the neurograms were extracted using third-order statistics referred to as the bispectrum. The phase coupling of the neurogram bispectrum provides unique insight into the presence (or deficit) of supra-threshold nonlinearities beyond audibility for listeners with normal hearing (or hearing loss). The speech intelligibility scores predicted by the proposed method were compared to the behavioral scores for listeners with normal hearing and hearing loss both in quiet and under noisy background conditions. The results were also compared to the performance of some existing methods. The predicted results showed a good fit with a small error, suggesting that the subjective scores can be estimated reliably using the proposed neural-response-based metric. The proposed metric also had a wide dynamic range, and the predicted scores were well-separated as a function of hearing loss. The proposed metric successfully captures the effects of hearing loss and supra-threshold nonlinearities on speech intelligibility. This metric could be applied to evaluate the performance of various speech-processing algorithms designed for hearing aids and cochlear implants. PMID:26967160
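The third-order statistic used here, the bispectrum, averages B(f1, f2) = E[X(f1)·X(f2)·X*(f1+f2)] over signal segments, so only components whose phases are coupled (φ1 + φ2 = φ3) survive the averaging. A minimal numpy sketch of a segment-averaged bispectrum, applied to a synthetic phase-coupled triad rather than an auditory neurogram:

```python
import numpy as np

def bispectrum(x, nfft=256, hop=128):
    """Segment-averaged bispectrum magnitude |E[X(f1) X(f2) X*(f1+f2)]|."""
    win = np.hanning(nfft)
    segs = [x[i:i + nfft] * win for i in range(0, len(x) - nfft + 1, hop)]
    f = np.arange(nfft // 2)
    B = np.zeros((nfft // 2, nfft // 2), dtype=complex)
    for s in segs:
        X = np.fft.fft(s)
        # X(f1) X(f2) conj(X(f1+f2)) for all bin pairs (f1, f2)
        B += np.outer(X[f], X[f]) * np.conj(X[(f[:, None] + f[None, :]) % nfft])
    return np.abs(B) / len(segs)

rng = np.random.default_rng(1)
fs, n = 1024, 1 << 14
t = np.arange(n) / fs
phi1, phi2 = rng.uniform(0, 2 * np.pi, 2)
# 60 Hz + 100 Hz components, plus a 160 Hz component whose phase is their sum
x = (np.cos(2 * np.pi * 60 * t + phi1) + np.cos(2 * np.pi * 100 * t + phi2)
     + np.cos(2 * np.pi * 160 * t + phi1 + phi2)
     + 0.1 * rng.standard_normal(n))
B = bispectrum(x)
# With nfft=256 at fs=1024, bins are 4 Hz wide: 60 Hz -> bin 15, 100 Hz -> bin 25.
coupled_peak = B[15, 25]
```

Because the triad's phases cancel in the product, the (60, 100) Hz cell accumulates coherently across segments and stands far above the background, which is the property the paper exploits to detect supra-threshold nonlinearities in neurograms.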

  8. Audiovisual materials are effective for enhancing the correction of articulation disorders in children with cleft palate.

    Science.gov (United States)

    Pamplona, María Del Carmen; Ysunza, Pablo Antonio; Morales, Santiago

    2017-02-01

    Children with cleft palate frequently show speech disorders known as compensatory articulation. Compensatory articulation requires a prolonged period of speech intervention that should include reinforcement at home. However, relatives frequently do not know how to work with their children at home. The aim was to study whether the use of audiovisual materials especially designed for complementing speech pathology treatment in children with compensatory articulation can be effective for stimulating articulation practice at home and consequently enhancing speech normalization in children with cleft palate. Eighty-two patients with compensatory articulation were studied. Patients were randomly divided into two groups. Both groups received speech pathology treatment aimed to correct articulation placement. In addition, patients from the active group received a set of audiovisual materials to be used at home. Parents were instructed about strategies and ideas about how to use the materials with their children. Severity of compensatory articulation was compared at the onset and at the end of the speech intervention. After the speech therapy period, the group of patients using audiovisual materials at home demonstrated significantly greater improvement in articulation, as compared with the patients receiving speech pathology treatment on-site without audiovisual supporting materials. The results of this study suggest that audiovisual materials especially designed for practicing adequate articulation placement at home can be effective for reinforcing and enhancing speech pathology treatment of patients with cleft palate and compensatory articulation. Copyright © 2016 Elsevier Ireland Ltd. All rights reserved.

  9. Towards Contactless Silent Speech Recognition Based on Detection of Active and Visible Articulators Using IR-UWB Radar.

    Science.gov (United States)

    Shin, Young Hoon; Seo, Jiwon

    2016-10-29

    People with hearing or speaking disabilities are deprived of the benefits of conventional speech recognition technology because it is based on acoustic signals. Recent research has focused on silent speech recognition systems that are based on the motions of a speaker's vocal tract and articulators. Because most silent speech recognition systems use contact sensors that are very inconvenient to users or optical systems that are susceptible to environmental interference, a contactless and robust solution is hence required. Toward this objective, this paper presents a series of signal processing algorithms for a contactless silent speech recognition system using an impulse-radio ultra-wideband (IR-UWB) radar. The IR-UWB radar is used to remotely and wirelessly detect motions of the lips and jaw. In order to extract the necessary features of lip and jaw motions from the received radar signals, we propose a feature extraction algorithm. The proposed algorithm noticeably improved speech recognition performance compared to the existing algorithm during our word recognition test with five speakers. We also propose a speech activity detection algorithm to automatically select speech segments from continuous input signals. Thus, speech recognition processing is performed only when speech segments are detected. Our testbed consists of commercial off-the-shelf radar products, and the proposed algorithms are readily applicable without designing specialized radar hardware for silent speech processing.

  10. Effects of Instantaneous Multiband Dynamic Compression on Speech Intelligibility

    Directory of Open Access Journals (Sweden)

    Herzke Tobias

    2005-01-01

    The recruitment phenomenon, that is, the reduced dynamic range between threshold and uncomfortable level, is attributed to the loss of instantaneous dynamic compression on the basilar membrane. Despite this, hearing aids commonly use slow-acting dynamic compression for its compensation, because this was found to be the most successful strategy in terms of speech quality and intelligibility rehabilitation. Former attempts to use fast-acting compression gave ambiguous results, raising the question as to whether auditory-based recruitment compensation by instantaneous compression is in principle applicable in hearing aids. This study thus investigates instantaneous multiband dynamic compression based on an auditory filterbank. Instantaneous envelope compression is performed in each frequency band of a gammatone filterbank, which provides a combination of time and frequency resolution comparable to the normal healthy cochlea. The gain characteristics used for dynamic compression are deduced from categorical loudness scaling. In speech intelligibility tests, the instantaneous dynamic compression scheme was compared against a linear amplification scheme, which used the same filterbank for frequency analysis, but employed constant gain factors that restored the sound level for medium perceived loudness in each frequency band. In subjective comparisons, five of nine subjects preferred the linear amplification scheme and would not accept the instantaneous dynamic compression in hearing aids. Four of nine subjects did not perceive any quality differences. A sentence intelligibility test in noise (Oldenburg sentence test) showed little to no negative effects of the instantaneous dynamic compression, compared to linear amplification. A word intelligibility test in quiet (one-syllable rhyme test) showed that the subjects benefit from the larger amplification at low levels provided by instantaneous dynamic compression. Further analysis showed that the increase …
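The processing chain described in this record (filterbank analysis, instantaneous envelope compression per band, resynthesis) can be sketched in a few lines. This toy substitutes a crude FFT brick-wall filterbank for the gammatone bank and a fixed power law for the gains deduced from categorical loudness scaling; both substitutions are ours:

```python
import numpy as np

def analytic(x):
    """Analytic signal via the FFT (one-sided spectrum)."""
    n = len(x)
    X = np.fft.fft(x)
    h = np.zeros(n)
    h[0] = 1
    h[1:(n + 1) // 2] = 2
    if n % 2 == 0:
        h[n // 2] = 1
    return np.fft.ifft(X * h)

def band_split(x, fs, edges):
    """Brick-wall FFT filterbank (a crude stand-in for a gammatone bank)."""
    X = np.fft.fft(x)
    freqs = np.abs(np.fft.fftfreq(len(x), 1 / fs))
    return [np.real(np.fft.ifft(X * ((freqs >= lo) & (freqs < hi))))
            for lo, hi in zip(edges[:-1], edges[1:])]

def instantaneous_compress(x, fs, edges, exponent=0.5):
    """Compress each band's Hilbert envelope sample by sample (2:1 for 0.5),
    then resynthesize with the band's original instantaneous phase."""
    out = np.zeros(len(x))
    for band in band_split(x, fs, edges):
        z = analytic(band)
        env, phase = np.abs(z), np.angle(z)
        out += env ** exponent * np.cos(phase)
    return out

fs = 16000
t = np.arange(fs) / fs
# Weak low band (amplitude 0.01) plus strong high band (amplitude 1.0)
x = 0.01 * np.sin(2 * np.pi * 500 * t) + np.sin(2 * np.pi * 3000 * t)
y = instantaneous_compress(x, fs, edges=[100, 1000, 6000])
```

After compression the 500 Hz component is lifted from 0.01 toward 0.1 (0.01^0.5) while the unity-amplitude 3000 Hz component is left unchanged, so the across-band dynamic range shrinks from 40 dB to 20 dB, which is exactly the low-level amplification the rhyme-test subjects benefited from.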

  11. Perceptual Speech Assessment After Anterior Maxillary Distraction in Patients With Cleft Maxillary Hypoplasia.

    Science.gov (United States)

    Richardson, Sunil; Seelan, Nikkie S; Selvaraj, Dhivakar; Khandeparker, Rakshit V; Gnanamony, Sangeetha

    2016-06-01

    To assess speech outcomes after anterior maxillary distraction (AMD) in patients with cleft-related maxillary hypoplasia. Fifty-eight patients at least 10 years old with cleft-related maxillary hypoplasia were included in this study irrespective of gender, type of cleft lip and palate, and amount of required advancement. AMD was carried out in all patients using a tooth-borne palatal distractor by a single oral and maxillofacial surgeon. Perceptual speech assessment was performed by 2 speech language pathologists preoperatively, before placement of the distractor device, and 6 months postoperatively using the scoring system of Perkins et al (Plast Reconstr Surg 116:72, 2005); the system evaluates velopharyngeal insufficiency (VPI), resonance, nasal air emission, articulation errors, and intelligibility. The data obtained were tabulated and subjected to statistical analysis using Wilcoxon signed rank test. A P value less than .05 was considered significant. Eight patients were lost to follow-up. At 6-month follow-up, improvements of 62% (n = 31), 64% (n = 32), 50% (n = 25), 68% (n = 34), and 70% (n = 35) in VPI, resonance, nasal air emission, articulation, and intelligibility, respectively, were observed, with worsening of all parameters in 1 patient (2%). The results for all tested parameters were highly significant (P ≤ .001). AMD offers a substantial improvement in speech for all 5 parameters of perceptual speech assessment. Copyright © 2016 The American Association of Oral and Maxillofacial Surgeons. Published by Elsevier Inc. All rights reserved.
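The Wilcoxon signed-rank test used for the pre/post comparison ranks the absolute paired differences and sums the ranks by sign. A self-contained sketch with hypothetical 5-point perceptual scores (illustrative only, not the study's data):

```python
import numpy as np

def wilcoxon_w(pre, post):
    """Wilcoxon signed-rank statistic: rank the absolute paired differences,
    sum the ranks by sign, and report the smaller sum. Zero differences are
    discarded; tied |differences| receive their average rank."""
    d = np.asarray(post, float) - np.asarray(pre, float)
    d = d[d != 0]
    ranks = np.empty(len(d))
    ranks[np.argsort(np.abs(d))] = np.arange(1, len(d) + 1)
    for v in np.unique(np.abs(d)):          # average ranks over ties
        tie = np.abs(d) == v
        ranks[tie] = ranks[tie].mean()
    return min(ranks[d > 0].sum(), ranks[d < 0].sum()), len(d)

# Hypothetical perceptual severity scores for 10 patients (lower is better):
pre  = [3, 4, 2, 5, 3, 4, 3, 2, 4, 3]
post = [2, 2, 1, 3, 3, 2, 1, 1, 2, 2]
w, n = wilcoxon_w(pre, post)
```

Here every non-tied patient improves, so the smaller rank sum is 0 for n = 9 usable pairs, the strongest possible result; the study's highly significant P values reflect a similarly consistent direction of change.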

  12. The influence of spectral characteristics of early reflections on speech intelligibility

    DEFF Research Database (Denmark)

    Arweiler, Iris; Buchholz, Jörg

    2011-01-01

    The auditory system takes advantage of early reflections (ERs) in a room by integrating them with the direct sound (DS) and thereby increasing the effective speech level. In the present paper the benefit from realistic ERs on speech intelligibility in diffuse speech-shaped noise was investigated … ascribed to their altered spectrum compared to the DS and to the filtering by the torso, head, and pinna. No binaural processing other than a binaural summation effect could be observed.

  13. Factors Affecting Acoustics and Speech Intelligibility in the Operating Room: Size Matters.

    Science.gov (United States)

    McNeer, Richard R; Bennett, Christopher L; Horn, Danielle Bodzin; Dudaryk, Roman

    2017-06-01

    Noise in health care settings has increased since 1960 and represents a significant source of dissatisfaction among staff and patients and risk to patient safety. Operating rooms (ORs), in which effective communication is crucial, are particularly noisy. Speech intelligibility is impacted by noise, room architecture, and acoustics. For example, sound reverberation time (RT60) increases with room size, which can negatively impact intelligibility, while room objects are hypothesized to have the opposite effect. We explored these relationships by investigating room construction and acoustics of the surgical suites at our institution. We studied our ORs during times of nonuse. Room dimensions were measured to calculate room volumes (VR). Room content was assessed by estimating size and assigning items into 5 volume categories to arrive at an adjusted room content volume (VC) metric. Psychoacoustic analyses were performed by playing sweep tones from a speaker and recording the impulse responses (i.e., the resulting sound fields) from 3 locations in each room. The recordings were used to calculate 6 psychoacoustic indices of intelligibility. Multiple linear regression was performed using VR and VC as predictor variables and each intelligibility index as an outcome variable. A total of 40 ORs were studied. The surgical suites were characterized by a large degree of construction and surface finish heterogeneity and varied in size from 71.2 to 196.4 m³ (average VR = 131.1 [34.2] m³). A nonsignificant correlation was observed between VR and VC (Pearson correlation = 0.223, P = .166). Multiple linear regression model fits and β coefficients for VR were highly significant for each of the intelligibility indices and were best for RT60 (R² = 0.666, F(2, 37) = 39.9, P < .001). The size and contents of an OR can predict a range of psychoacoustic indices of speech intelligibility. Specifically, increasing OR size correlated with worse speech intelligibility, while increasing amounts of OR contents …
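The stated link between room size and reverberation can be made concrete with Sabine's classic formula, RT60 ≈ 0.161·V/A, where V is the room volume in m³ and A the total absorption in m² sabins. The formula and the surface areas below are our illustrative assumptions, not values from the study:

```python
def rt60_sabine(volume_m3, surface_m2, avg_absorption):
    """Sabine estimate: RT60 = 0.161 * V / (S * alpha), in seconds."""
    return 0.161 * volume_m3 / (surface_m2 * avg_absorption)

# Hypothetical ORs spanning the reported volume range, with identical
# finishes (assumed average absorption coefficient 0.1) and assumed
# total surface areas:
small = rt60_sabine(volume_m3=71.2, surface_m2=110.0, avg_absorption=0.1)
large = rt60_sabine(volume_m3=196.4, surface_m2=215.0, avg_absorption=0.1)
```

Because volume grows faster than surface area as rooms scale up, RT60 rises with room size for fixed finishes, consistent with the paper's finding that larger ORs had worse intelligibility indices.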

  14. Audiomotor Perceptual Training Enhances Speech Intelligibility in Background Noise.

    Science.gov (United States)

    Whitton, Jonathon P; Hancock, Kenneth E; Shannon, Jeffrey M; Polley, Daniel B

    2017-11-06

    Sensory and motor skills can be improved with training, but learning is often restricted to practice stimuli. As an exception, training on closed-loop (CL) sensorimotor interfaces, such as action video games and musical instruments, can impart a broad spectrum of perceptual benefits. Here we ask whether computerized CL auditory training can enhance speech understanding in levels of background noise that approximate a crowded restaurant. Elderly hearing-impaired subjects trained for 8 weeks on a CL game that, like a musical instrument, challenged them to monitor subtle deviations between predicted and actual auditory feedback as they moved their fingertip through a virtual soundscape. We performed our study as a randomized, double-blind, placebo-controlled trial by training other subjects in an auditory working-memory (WM) task. Subjects in both groups improved at their respective auditory tasks and reported comparable expectations for improved speech processing, thereby controlling for placebo effects. Whereas speech intelligibility was unchanged after WM training, subjects in the CL training group could correctly identify 25% more words in spoken sentences or digit sequences presented in high levels of background noise. Numerically, CL audiomotor training provided more than three times the benefit of our subjects' hearing aids for speech processing in noisy listening conditions. Gains in speech intelligibility could be predicted from gameplay accuracy and baseline inhibitory control. However, benefits did not persist in the absence of continuing practice. These studies employ stringent clinical standards to demonstrate that perceptual learning on a computerized audio game can transfer to "real-world" communication challenges. Copyright © 2017 Elsevier Ltd. All rights reserved.

  15. Dead regions in the cochlea: Implications for speech recognition and applicability of articulation index theory

    DEFF Research Database (Denmark)

    Vestergaard, Martin David

    2003-01-01

    Dead regions in the cochlea have been suggested to be responsible for the failure of hearing aid users to benefit from apparently increased audibility in terms of speech intelligibility. As an alternative to the more cumbersome psychoacoustic tuning curve measurement, threshold-equalizing noise (TEN) …

  16. Syllabic compression and speech intelligibility in hearing impaired listeners

    NARCIS (Netherlands)

    Verschuure, J.; Dreschler, W. A.; de Haan, E. H.; van Cappellen, M.; Hammerschlag, R.; Maré, M. J.; Maas, A. J.; Hijmans, A. C.

    1993-01-01

    Syllabic compression has not been shown unequivocally to improve speech intelligibility in hearing-impaired listeners. This paper attempts to explain the poor results by introducing the concept of minimum overshoots. The concept was tested with a digital signal processor on hearing-impaired listeners.

  17. A multi-resolution envelope-power based model for speech intelligibility

    DEFF Research Database (Denmark)

    Jørgensen, Søren; Ewert, Stephan D.; Dau, Torsten

    2013-01-01

    The speech-based envelope power spectrum model (sEPSM) presented by Jørgensen and Dau [(2011). J. Acoust. Soc. Am. 130, 1475-1487] estimates the envelope power signal-to-noise ratio (SNRenv) after modulation-frequency selective processing. Changes in this metric were shown to account well … to conditions with stationary interferers, due to the long-term integration of the envelope power, and cannot account for the increased intelligibility typically obtained with fluctuating maskers. Here, a multi-resolution version of the sEPSM is presented where the SNRenv is estimated in temporal segments with a modulation-filter dependent duration. The multi-resolution sEPSM is demonstrated to account for intelligibility obtained in conditions with stationary and fluctuating interferers, and noisy speech distorted by reverberation or spectral subtraction. The results support the hypothesis that the SNRenv …

  18. The importance for speech intelligibility of random fluctuations in "steady" background noise.

    Science.gov (United States)

    Stone, Michael A; Füllgrabe, Christian; Mackinnon, Robert C; Moore, Brian C J

    2011-11-01

    Spectrally shaped steady noise is commonly used as a masker of speech. The effects of inherent random fluctuations in amplitude of such a noise are typically ignored. Here, the importance of these random fluctuations was assessed by comparing two cases. For one, speech was mixed with steady speech-shaped noise and N-channel tone vocoded, a process referred to as signal-domain mixing (SDM); this preserved the random fluctuations of the noise. For the second, the envelope of speech alone was extracted for each vocoder channel and a constant was added corresponding to the root-mean-square value of the noise envelope for that channel. This is referred to as envelope-domain mixing (EDM); it removed the random fluctuations of the noise. Sinusoidally modulated noise and a single talker were also used as backgrounds, with both SDM and EDM. Speech intelligibility was measured for N = 12, 19, and 30, with the target-to-background ratio fixed at -7 dB. For SDM, performance was best for the speech background and worst for the steady noise. For EDM, this pattern was reversed. Intelligibility with steady noise was consistently very poor for SDM, but near-ceiling for EDM, demonstrating that the random fluctuations in steady noise have a large effect.
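The contrast between signal-domain mixing (SDM) and envelope-domain mixing (EDM) comes down to where the noise enters the envelope path. A single-channel numpy sketch (the paper uses an N-channel tone vocoder; one Hilbert-envelope channel is our simplification):

```python
import numpy as np

def envelope(x):
    """Hilbert envelope via the analytic signal (FFT method)."""
    n = len(x)
    X = np.fft.fft(x)
    h = np.zeros(n)
    h[0] = 1
    h[1:(n + 1) // 2] = 2
    if n % 2 == 0:
        h[n // 2] = 1
    return np.abs(np.fft.ifft(X * h))

rng = np.random.default_rng(0)
fs = 16000
t = np.arange(fs) / fs
speech = (1 + np.sin(2 * np.pi * 4 * t)) * np.sin(2 * np.pi * 1000 * t)
noise = rng.standard_normal(len(t))

# SDM: mix in the signal domain, then extract the channel envelope --
# the noise's own random envelope fluctuations survive.
env_sdm = envelope(speech + noise)

# EDM: extract the speech envelope alone, then add a constant equal to the
# RMS of the noise envelope -- the random fluctuations are removed.
env_n = envelope(noise)
env_edm = envelope(speech) + np.sqrt(np.mean(env_n ** 2))
```

Sample-to-sample variability of `env_sdm` is dominated by the noise's rapid random fluctuations, whereas `env_edm` only carries the slow speech modulation plus a DC pedestal, which is why EDM made "steady" noise a far weaker masker in the study.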

  19. Production Variability and Single Word Intelligibility in Aphasia and Apraxia of Speech

    Science.gov (United States)

    Haley, Katarina L.; Martin, Gwenyth

    2011-01-01

    This study was designed to estimate test-retest reliability of orthographic speech intelligibility testing in speakers with aphasia and AOS and to examine its relationship to the consistency of speaker and listener responses. Monosyllabic single word speech samples were recorded from 13 speakers with coexisting aphasia and AOS. These words were…

  20. Development of Bone-Conducted Ultrasonic Hearing Aid for the Profoundly Deaf: Assessments of the Modulation Type with Regard to Intelligibility and Sound Quality

    Science.gov (United States)

    Nakagawa, Seiji; Fujiyuki, Chika; Kagomiya, Takayuki

    2012-07-01

    Bone-conducted ultrasound (BCU) is perceived even by the profoundly sensorineural deaf. A novel hearing aid using the perception of amplitude-modulated BCU (BCU hearing aid: BCUHA) has been developed; however, further improvements are needed, especially in terms of articulation and sound quality. In this study, the intelligibility and sound quality of BCU speech with several types of amplitude modulation [double-sideband with transmitted carrier (DSB-TC), double-sideband with suppressed carrier (DSB-SC), and transposed modulation] were evaluated. The results showed that DSB-TC and transposed speech were more intelligible than DSB-SC speech, and transposed speech was closer than the other types of BCU speech to air-conducted speech in terms of sound quality. These results provide useful information for further development of the BCUHA.
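The three modulation schemes compared in this record differ only in how the speech signal multiplies the ultrasonic carrier. A numpy sketch at audio rate (a real BCUHA carrier is ultrasonic, around 30 kHz; the 10 kHz carrier, 4 Hz toy envelope, and the half-wave-rectified form of the transposed modulator are our illustrative assumptions):

```python
import numpy as np

fs = 48000
t = np.arange(fs) / fs
m = 0.5 * (1 + np.sin(2 * np.pi * 4 * t))     # toy speech envelope in [0, 1]
speech = m * np.sin(2 * np.pi * 150 * t)      # toy low-frequency speech band
carrier = np.sin(2 * np.pi * 10000 * t)       # illustrative carrier

dsb_tc = (1 + m) * carrier                    # DSB-TC: carrier transmitted
dsb_sc = m * carrier                          # DSB-SC: carrier suppressed
transposed = np.maximum(speech, 0) * carrier  # rectified speech as modulator

rms = lambda x: float(np.sqrt(np.mean(x ** 2)))
```

DSB-TC keeps the carrier present even in speech pauses (envelope 1 + m(t) never reaches zero), DSB-SC's envelope collapses with |m(t)|, and the transposed scheme imprints rectified temporal fine structure of the speech band onto the carrier, the property linked to its better sound quality.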

  1. Prediction of speech intelligibility based on a correlation metric in the envelope power spectrum domain

    DEFF Research Database (Denmark)

    Relano-Iborra, Helia; May, Tobias; Zaar, Johannes

    A powerful tool to investigate speech perception is the use of speech intelligibility prediction models. Recently, a model was presented, termed the correlation-based speech-based envelope power spectrum model (sEPSMcorr) [1], based on the auditory processing of the multi-resolution speech-based envelope power spectrum model (mr-sEPSM) …

  2. Attitudes toward speech disorders: sampling the views of Cantonese-speaking Americans.

    Science.gov (United States)

    Bebout, L; Arthur, B

    1997-01-01

    Speech-language pathologists who serve clients from cultural backgrounds that are not familiar to them may encounter culturally influenced attitudinal differences. A questionnaire with statements about 4 speech disorders (dysfluency, cleft palate, speech of the deaf, and misarticulations) was given to a focus group of Chinese Americans and a comparison group of non-Chinese Americans. The focus group was much more likely to believe that persons with speech disorders could improve their own speech by "trying hard," was somewhat more likely to say that people who use deaf speech and people with cleft palates might be "emotionally disturbed," and was generally more likely to view deaf speech as a limitation. The comparison group was more pessimistic about stuttering children's acceptance by their peers than was the focus group. The two subject groups agreed about other items, such as the likelihood that older children with articulation problems are "less intelligent" than their peers.

  3. Direct magnitude estimates of speech intelligibility in dysarthria: effects of a chosen standard.

    Science.gov (United States)

    Weismer, Gary; Laures, Jacqueline S

    2002-06-01

    Direct magnitude estimation (DME) has been used frequently as a perceptual scaling technique in studies of the speech intelligibility of persons with speech disorders. The technique is typically used with a standard, or reference stimulus, chosen as a good exemplar of "midrange" intelligibility. In several published studies, the standard has been chosen subjectively, usually on the basis of the expertise of the investigators. The current experiment demonstrates that a fixed set of sentence-level utterances, obtained from 4 individuals with dysarthria (2 with Parkinson disease, 2 with traumatic brain injury) as well as 3 neurologically normal speakers, is scaled differently depending on the identity of the standard. Four different standards were used in the main experiment, three of which were judged qualitatively in two independent evaluations to be good exemplars of midrange intelligibility. Acoustic analyses did not reveal obvious differences between these four standards but suggested that the standard with the worst-scaled intelligibility had much poorer voice source characteristics compared to the other three standards. Results are discussed in terms of possible standardization of midrange intelligibility exemplars for DME experiments.

  4. Examining explanations for fundamental frequency's contribution to speech intelligibility in noise

    Science.gov (United States)

    Schlauch, Robert S.; Miller, Sharon E.; Watson, Peter J.

    2005-09-01

    Laures and Weismer [JSLHR, 42, 1148 (1999)] reported that speech with natural variation in fundamental frequency (F0) is more intelligible in noise than speech with a flattened F0 contour. Cognitive-linguistic explanations have been offered to account for this drop in intelligibility in the flattened condition, but a lower-level mechanism related to auditory streaming may be responsible. Numerous psychoacoustic studies have demonstrated that modulating a tone enables a listener to segregate it from background sounds. To test these rival hypotheses, speech recognition in noise was measured for sentences with six different F0 contours: unmodified, flattened at the mean, natural but exaggerated, reversed, and frequency modulated (at rates of 2.5 and 5.0 Hz). The 180 stimulus sentences were produced by five talkers (30 sentences per condition). Speech recognition scores for fifteen listeners replicated earlier findings, showing that flattening the F0 contour results in a roughly 10% reduction in recognition of key words compared with the natural condition. Although the exaggerated condition produced results comparable to those of the flattened condition, the other conditions with unnatural F0 contours all yielded significantly poorer performance than the flattened condition. These results support the cognitive-linguistic explanations for the reduction in performance.
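    The flattened and exaggerated conditions can both be viewed as rescaling F0 excursions around the utterance mean on a frame-wise pitch track. A minimal sketch (NumPy; the function name is hypothetical, and the actual stimuli were resynthesized speech, which is beyond this sketch):

    ```python
    import numpy as np

    def scale_f0_contour(f0_track, scale):
        """Rescale F0 excursions around the voiced-frame mean.

        scale = 0 flattens the contour at the mean, 1 leaves it natural,
        and > 1 exaggerates it; unvoiced frames (F0 = 0) are left untouched.
        """
        f0 = np.asarray(f0_track, dtype=float)
        voiced = f0 > 0
        if not voiced.any():
            return f0.copy()
        mean = f0[voiced].mean()
        return np.where(voiced, mean + scale * (f0 - mean), 0.0)
    ```

    Here scale=0 yields the flattened condition, scale=1 the unmodified contour, and scale>1 an exaggerated contour such as the one tested above.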

  5. Comparing Binaural Pre-processing Strategies II: Speech Intelligibility of Bilateral Cochlear Implant Users.

    Science.gov (United States)

    Baumgärtel, Regina M; Hu, Hongmei; Krawczyk-Becker, Martin; Marquardt, Daniel; Herzke, Tobias; Coleman, Graham; Adiloğlu, Kamil; Bomke, Katrin; Plotz, Karsten; Gerkmann, Timo; Doclo, Simon; Kollmeier, Birger; Hohmann, Volker; Dietz, Mathias

    2015-12-30

    Several binaural audio signal enhancement algorithms were evaluated with respect to their potential to improve speech intelligibility in noise for users of bilateral cochlear implants (CIs). 50% speech reception thresholds (SRT50) were assessed using an adaptive procedure in three distinct, realistic noise scenarios. All scenarios were highly nonstationary, complex, and included a significant amount of reverberation. Other aspects, such as the perfectly frontal target position, were idealized laboratory settings, allowing the algorithms to perform better than in corresponding real-world conditions. Eight bilaterally implanted CI users, wearing devices from three manufacturers, participated in the study. In all noise conditions, a substantial improvement in SRT50 compared to the unprocessed signal was observed for most of the algorithms tested, with the largest improvements generally provided by binaural minimum variance distortionless response (MVDR) beamforming algorithms. The largest overall improvement in speech intelligibility was achieved by an adaptive binaural MVDR in a spatially separated, single competing talker noise scenario. A no-pre-processing condition and adaptive differential microphones without a binaural link served as the two baseline conditions. SRT50 improvements provided by the binaural MVDR beamformers surpassed the performance of the adaptive differential microphones in most cases. Speech intelligibility improvements predicted by instrumental measures were shown to account for some but not all aspects of the perceptually obtained SRT50 improvements measured in bilaterally implanted CI users. © The Author(s) 2015.
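    The MVDR beamformers that performed best in this evaluation share a standard closed form: given a noise covariance matrix R and a steering vector d toward the target, the weights w = R⁻¹d / (dᴴR⁻¹d) minimize output noise power while passing the target direction undistorted. A single-frequency-bin sketch (NumPy; the function name and diagonal-loading default are illustrative assumptions, not details of the evaluated algorithms):

    ```python
    import numpy as np

    def mvdr_weights(R, d, diag_load=1e-6):
        """MVDR beamformer weights: w = R^{-1} d / (d^H R^{-1} d).

        R: (M, M) Hermitian noise covariance across M microphones.
        d: (M,) steering vector toward the target source.
        """
        M = R.shape[0]
        # Diagonal loading for numerical robustness when R is ill-conditioned.
        Rl = R + diag_load * np.trace(R).real / M * np.eye(M)
        Rinv_d = np.linalg.solve(Rl, d)
        return Rinv_d / (d.conj() @ Rinv_d)
    ```

    The distortionless constraint wᴴd = 1 holds by construction; adaptive variants, like the adaptive binaural MVDR above, re-estimate R over time as the noise scene changes.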

  6. Speech impairment in Down syndrome: a review.

    Science.gov (United States)

    Kent, Ray D; Vorperian, Houri K

    2013-02-01

    This review summarizes research on disorders of speech production in Down syndrome (DS) for the purposes of informing clinical services and guiding future research. Review of the literature was based on searches using MEDLINE, Google Scholar, PsycINFO, and HighWire Press, as well as consideration of reference lists in retrieved documents (including online sources). Search terms emphasized functions related to voice, articulation, phonology, prosody, fluency, and intelligibility. The following conclusions pertain to four major areas of review: voice, speech sounds, fluency and prosody, and intelligibility. The first major area is voice. Although a number of studies have reported on vocal abnormalities in DS, major questions remain about the nature and frequency of the phonatory disorder. Results of perceptual and acoustic studies have been mixed, making it difficult to draw firm conclusions or even to identify sensitive measures for future study. The second major area is speech sounds. Articulatory and phonological studies show that speech patterns in DS are a combination of delayed development and errors not seen in typical development. Delayed (i.e., developmental) and disordered (i.e., nondevelopmental) patterns are evident by the age of about 3 years, although DS-related abnormalities possibly appear earlier, even in infant babbling. The third major area is fluency and prosody. Stuttering and/or cluttering occur in DS at rates of 10%-45%, compared with about 1% in the general population. Research also points to significant disturbances in prosody. The fourth major area is intelligibility. Studies consistently show marked limitations in this area, but only recently has the research gone beyond simple rating scales.

  7. On the relationship between auditory cognition and speech intelligibility in cochlear implant users: An ERP study.

    Science.gov (United States)

    Finke, Mareike; Büchner, Andreas; Ruigendijk, Esther; Meyer, Martin; Sandmann, Pascale

    2016-07-01

    There is a high degree of variability in speech intelligibility outcomes across cochlear-implant (CI) users. To better understand how auditory cognition affects speech intelligibility with the CI, we performed an electroencephalography study in which we examined the relationship between central auditory processing, cognitive abilities, and speech intelligibility. Postlingually deafened CI users (N=13) and matched normal-hearing (NH) listeners (N=13) performed an oddball task with words presented in different background conditions (quiet, stationary noise, modulated noise). Participants had to categorize words as living (targets) or non-living entities (standards). We also assessed participants' working memory (WM) capacity and verbal abilities. For the oddball task, we found lower hit rates and prolonged response times in CI users when compared with NH listeners. Noise-related prolongation of the N1 amplitude was found for all participants. Further, we observed group-specific modulation effects of event-related potentials (ERPs) as a function of background noise. While NH listeners showed stronger noise-related modulation of the N1 latency, CI users revealed enhanced modulation effects of the N2/N4 latency. In general, higher-order processing (N2/N4, P3) was prolonged in CI users in all background conditions when compared with NH listeners. The longer N2/N4 latency in CI users suggests that these individuals have difficulty mapping acoustic-phonetic features onto lexical representations. These difficulties seem to be greater in speech-in-noise conditions than with speech in a quiet background. Correlation analyses showed that shorter ERP latencies were related to enhanced speech intelligibility (N1, N2/N4), better lexical fluency (N1), and lower ratings of listening effort (N2/N4) in CI users. In sum, our findings suggest that CI users and NH listeners differ with regard to both the sensory and the higher-order processing of speech in quiet as well as in

  8. Stuttering Frequency, Speech Rate, Speech Naturalness, and Speech Effort During the Production of Voluntary Stuttering.

    Science.gov (United States)

    Davidow, Jason H; Grossman, Heather L; Edge, Robin L

    2018-05-01

    Voluntary stuttering techniques involve persons who stutter purposefully interjecting disfluencies into their speech. Little research has been conducted on the impact of these techniques on the speech pattern of persons who stutter. The present study examined whether changes in the frequency of voluntary stuttering accompanied changes in stuttering frequency, articulation rate, speech naturalness, and speech effort. In total, 12 persons who stutter aged 16-34 years participated. Participants read four 300-syllable passages during a control condition, and three voluntary stuttering conditions that involved attempting to produce purposeful, tension-free repetitions of initial sounds or syllables of a word for two or more repetitions (i.e., bouncing). The three voluntary stuttering conditions included bouncing on 5%, 10%, and 15% of syllables read. Friedman tests and follow-up Wilcoxon signed ranks tests were conducted for the statistical analyses. Stuttering frequency, articulation rate, and speech naturalness were significantly different between the voluntary stuttering conditions. Speech effort did not differ between the voluntary stuttering conditions. Stuttering frequency was significantly lower during the three voluntary stuttering conditions compared to the control condition, and speech effort was significantly lower during two of the three voluntary stuttering conditions compared to the control condition. Due to changes in articulation rate across the voluntary stuttering conditions, it is difficult to conclude, as has been suggested previously, that voluntary stuttering is the reason for stuttering reductions found when using voluntary stuttering techniques. Additionally, future investigations should examine different types of voluntary stuttering over an extended period of time to determine their impact on stuttering frequency, speech rate, speech naturalness, and speech effort.

  9. Intelligibility of Digital Speech Masked by Noise: Normal Hearing and Hearing Impaired Listeners

    Science.gov (United States)

    1990-06-01

    spectrograms of these phrases were generated by a List Processing Language (LISP) program on a Symbolics 3670 artificial intelligence computer (see Figure 10). ...speech, and the amount of difference varies with the type of vocoder.

  10. Comparative Study of Features of Social Intelligence and Speech Behavior of Children of Primary School Age with Impaired Mental Function

    Directory of Open Access Journals (Sweden)

    Shcherban D.

    2018-04-01

    Full Text Available The article discusses the concept of social intelligence and its characteristics in children of primary school age with impaired mental function. The concept and its main features, including speech, are discussed, and the significance of delayed mental development for social intelligence and speech behavior is considered. The concept of speech behavior is also analyzed: the author defines the phenomenon and describes the specific features that distinguish its structure, which consists of six components: verbal, emotional, motivational, ethical (moral), prognostic, and semantic (cognitive). Particular attention is paid to the position of social intelligence in the structure of the speech behavior of children of primary school age with impaired mental function. Indicators of social intelligence were analyzed from the point of view of the speech behavior of children with different rates of mental development and compared with its components at a qualitative level. The study used both the author's own and well-known techniques.

  11. Assessing Speech Intelligibility in Children with Hearing Loss: Toward Revitalizing a Valuable Clinical Tool

    Science.gov (United States)

    Ertmer, David J.

    2011-01-01

    Background: Newborn hearing screening, early intervention programs, and advancements in cochlear implant and hearing aid technology have greatly increased opportunities for children with hearing loss to become intelligible talkers. Optimizing speech intelligibility requires that progress be monitored closely. Although direct assessment of…

  12. Speech disorders - children

    Science.gov (United States)

    ... disorder; Voice disorders; Vocal disorders; Disfluency; Communication disorder - speech disorder; Speech disorder - stuttering ... evaluation tools that can help identify and diagnose speech disorders: Denver Articulation Screening Examination Goldman-Fristoe Test of ...

  13. Short-term effect of short, intensive speech therapy on articulation and resonance in Ugandan patients with cleft (lip and) palate

    NARCIS (Netherlands)

    Anke Luyten; H. Vermeersch; A. Hodges; K. Bettens; K. van Lierde; G. Galiwango

    2016-01-01

    Objectives: The purpose of the current study was to assess the short-term effectiveness of short and intensive speech therapy provided to patients with cleft (lip and) palate (C(L)P) in terms of articulation and resonance. Methods: Five Ugandan patients (age: 7.3-19.6 years) with non-syndromic C(L)P

  14. The effect of F0 contour on the intelligibility of speech in the presence of interfering sounds for Mandarin Chinese.

    Science.gov (United States)

    Chen, Jing; Yang, Hongying; Wu, Xihong; Moore, Brian C J

    2018-02-01

    In Mandarin Chinese, the fundamental frequency (F0) contour defines lexical "Tones" that differ in meaning despite being phonetically identical. Flattening the F0 contour impairs the intelligibility of Mandarin Chinese in background sounds. This might occur because the flattening introduces misleading lexical information. To avoid this effect, two types of speech were used: single-Tone speech contained Tones 1 and 0 only, which have a flat F0 contour; multi-Tone speech contained all Tones and had a varying F0 contour. The intelligibility of speech in steady noise was slightly better for single-Tone speech than for multi-Tone speech. The intelligibility of speech in a two-talker masker, with the difference in mean F0 between the target and masker matched across conditions, was worse for the multi-Tone target in the multi-Tone masker than for any other combination of target and masker, probably because informational masking was maximal for this combination. The introduction of a perceived spatial separation between the target and masker, via the precedence effect, led to better performance for all target-masker combinations, especially the multi-Tone target in the multi-Tone masker. In summary, a flat F0 contour does not reduce the intelligibility of Mandarin Chinese when the introduction of misleading lexical cues is avoided.

  15. Predicting speech intelligibility based on a correlation metric in the envelope power spectrum domain

    DEFF Research Database (Denmark)

    Relaño-Iborra, Helia; May, Tobias; Zaar, Johannes

    2016-01-01

    A speech intelligibility prediction model is proposed that combines the auditory processing front end of the multi-resolution speech-based envelope power spectrum model [mr-sEPSM; Jørgensen, Ewert, and Dau (2013). J. Acoust. Soc. Am. 134(1), 436–446] with a correlation back end inspired by the sh...
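    The correlation back end compares envelope representations of clean and degraded speech. A heavily simplified broadband sketch (NumPy/SciPy; the actual mr-sEPSM front end applies gammatone and modulation filterbanks, which are omitted here, and all names and parameters are illustrative):

    ```python
    import numpy as np
    from scipy.signal import hilbert

    def envelope(x, win=160):
        """Temporal envelope: Hilbert magnitude smoothed by a moving average."""
        env = np.abs(hilbert(x))
        kernel = np.ones(win) / win
        return np.convolve(env, kernel, mode="same")

    def envelope_correlation(clean, degraded, win=160):
        """Pearson correlation between clean and degraded envelopes (-1..1)."""
        e1, e2 = envelope(clean, win), envelope(degraded, win)
        e1 -= e1.mean()
        e2 -= e2.mean()
        return float(e1 @ e2 / (np.linalg.norm(e1) * np.linalg.norm(e2) + 1e-12))
    ```

    Broadly, a higher correlation between the clean and processed envelopes predicts higher intelligibility; the published model computes such correlations within individual audio and modulation channels and then combines them.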

  16. Evaluation of speech intelligibility in open-plan offices

    OpenAIRE

    Chevret , Patrick; EBISSOU , Ange; Parizet , Etienne

    2012-01-01

    International audience; In open-plan offices, ambient noise made up of intelligible conversations is generally perceived as one of the most important annoyances for tasks requiring concentration. This annoyance has been shown to reduce task performance and to cause health problems for workers in the medium and long term (tiredness, stress, etc.). Consequently, improving working conditions should involve the evaluation of speech annoyance, which could give rise to recommendati...

  17. Relations Between the Intelligibility of Speech in Noise and Psychophysical Measures of Hearing Measured in Four Languages Using the Auditory Profile Test Battery

    Directory of Open Access Journals (Sweden)

    T. E. M. Van Esch

    2015-12-01

    Full Text Available The aim of the present study was to determine the relations between the intelligibility of speech in noise and measures of auditory resolution, loudness recruitment, and cognitive function. The analyses were based on data published earlier as part of the presentation of the Auditory Profile, a test battery implemented in four languages. Tests of the intelligibility of speech, resolution, loudness recruitment, and lexical decision making were administered using headphones in five centers in Germany, the Netherlands, Sweden, and the United Kingdom. Correlations and stepwise linear regression models were calculated. In total, 72 hearing-impaired listeners aged 22 to 91 years with a broad range of hearing losses were included in the study. Several significant correlations were found with the intelligibility of speech in noise. Stepwise linear regression analyses showed that pure-tone average, age, spectral and temporal resolution, and loudness recruitment were significant predictors of the intelligibility of speech in fluctuating noise. Complex interrelationships between auditory factors and the intelligibility of speech in noise were revealed using the Auditory Profile data set in four languages. After taking into account the effects of pure-tone average and age, spectral and temporal resolution and loudness recruitment had added value in predicting variation among listeners in the intelligibility of speech in noise. The results of the lexical decision-making test were not related to the intelligibility of speech in noise in the population studied.

  18. Effect of the Number of Presentations on Listener Transcriptions and Reliability in the Assessment of Speech Intelligibility in Children

    Science.gov (United States)

    Lagerberg, Tove B.; Johnels, Jakob Åsberg; Hartelius, Lena; Persson, Christina

    2015-01-01

    Background: The assessment of intelligibility is an essential part of establishing the severity of a speech disorder. The intelligibility of a speaker is affected by a number of different variables relating, "inter alia," to the speech material, the listener and the listener task. Aims: To explore the impact of the number of…

  19. Is the Speech Transmission Index (STI) a robust measure of sound system speech intelligibility performance?

    Science.gov (United States)

    Mapp, Peter

    2002-11-01

    Although RaSTI is a good indicator of the speech intelligibility capability of auditoria and similar spaces, during the past 2-3 years it has been shown that RaSTI is not a robust predictor of sound system intelligibility performance. Instead, it is now recommended, within both national and international codes and standards, that full STI measurement and analysis be employed. However, new research is reported that indicates that STI is not as flawless or robust as many believe. The paper highlights a number of potential error mechanisms. It is shown that the measurement technique and signal excitation stimulus can have a significant effect on the overall result and accuracy, particularly where DSP-based equipment is employed. It is also shown that in its current state of development, STI is not capable of appropriately accounting for a number of fundamental speech and system attributes, including typical sound system frequency response variations and anomalies. This is particularly shown to be the case when a system is operating under reverberant conditions. Comparisons between actual system measurements and corresponding word score data are reported, where errors of up to 50% were found. The implications for VA and PA system performance verification will be discussed.
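    STI rests on the modulation transfer function (MTF); for a linear, noise-free channel, the MTF follows from the squared impulse response via Schroeder's relation. A minimal sketch (NumPy; the full IEC 60268-16 procedure adds octave-band analysis, band weighting, and noise and masking corrections omitted here):

    ```python
    import numpy as np

    def mtf(h, fs, mod_freqs):
        """Modulation transfer function from an impulse response (Schroeder).

        m(F) = |sum h^2(t) exp(-j 2 pi F t)| / sum h^2(t)
        """
        t = np.arange(len(h)) / fs
        h2 = np.asarray(h, dtype=float) ** 2
        denom = h2.sum()
        return np.array([np.abs(np.sum(h2 * np.exp(-2j * np.pi * F * t))) / denom
                         for F in mod_freqs])

    def transmission_index(m):
        """Convert modulation indices to transmission indices in [0, 1]."""
        snr = 10 * np.log10(m / (1 - m))   # apparent SNR in dB
        snr = np.clip(snr, -15.0, 15.0)
        return (snr + 15.0) / 30.0
    ```

    In the standard procedure, transmission indices at fourteen modulation frequencies (0.63-12.5 Hz) in seven octave bands are weighted and averaged into the single STI figure; the measurement-technique errors discussed above enter through the estimate of m(F).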

  20. Neural Entrainment to Speech Modulates Speech Intelligibility

    NARCIS (Netherlands)

    Riecke, Lars; Formisano, Elia; Sorger, Bettina; Baskent, Deniz; Gaudrain, Etienne

    2018-01-01

    Speech is crucial for communication in everyday life. Speech-brain entrainment, the alignment of neural activity to the slow temporal fluctuations (envelope) of acoustic speech input, is a ubiquitous element of current theories of speech processing. Associations between speech-brain entrainment and

  1. Digitised evaluation of speech intelligibility using vowels in maxillectomy patients.

    Science.gov (United States)

    Sumita, Y I; Hattori, M; Murase, M; Elbashti, M E; Taniguchi, H

    2018-03-01

    Among the functional disabilities that patients face following maxillectomy, speech impairment is a major factor influencing quality of life. Proper rehabilitation of speech, which may include prosthodontic and surgical treatments and speech therapy, requires accurate evaluation of speech intelligibility (SI). A simple, less time-consuming yet accurate evaluation is desirable both for maxillectomy patients and the various clinicians providing maxillofacial treatment. This study sought to determine the utility of digital acoustic analysis of vowels for the prediction of SI in maxillectomy patients, based on a comprehensive understanding of speech production in the vocal tract of maxillectomy patients and its perception. Speech samples were collected from 33 male maxillectomy patients (mean age 57.4 years) in two conditions, without and with a maxillofacial prosthesis, and formant data for the vowels /a/, /e/, /i/, /o/, and /u/ were calculated based on linear predictive coding. The frequency range of formant 2 (F2) was determined from the difference between the minimum and maximum frequency. An SI test was also conducted to reveal the relationship between SI score and F2 range. Statistical analyses were applied. F2 range and SI score were significantly different between the two conditions without and with a prosthesis (both P maxillectomy. © 2017 John Wiley & Sons Ltd.
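    Formant estimation by linear predictive coding, as used in this study, fits an all-pole model to each vowel frame and reads formant frequencies from the angles of the model's poles. A rough sketch of the autocorrelation method (NumPy/SciPy; the order, thresholds, and names are illustrative assumptions rather than the study's analysis settings):

    ```python
    import numpy as np
    from scipy.linalg import solve_toeplitz

    def lpc_coefficients(x, order):
        """Autocorrelation-method LPC: solve the Yule-Walker equations."""
        x = np.asarray(x, dtype=float) * np.hamming(len(x))
        r = np.correlate(x, x, mode="full")[len(x) - 1:len(x) + order]
        a = solve_toeplitz((r[:-1], r[:-1]), r[1:])
        return np.concatenate(([1.0], -a))  # A(z) = 1 - sum a_k z^-k

    def formants(x, fs, order=10):
        """Candidate formant frequencies (Hz) from the LPC root angles."""
        a = lpc_coefficients(x, order)
        roots = np.roots(a)
        roots = roots[np.imag(roots) > 0]   # keep one of each conjugate pair
        freqs = np.angle(roots) * fs / (2 * np.pi)
        return np.sort(freqs[freqs > 90])   # discard near-DC roots
    ```

    The F2 range reported above would then be the difference between the maximum and minimum second-formant values measured across the five vowels.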

  2. Effects of Hearing Loss and Fast-Acting Compression on Amplitude Modulation Perception and Speech Intelligibility

    DEFF Research Database (Denmark)

    Wiinberg, Alan; Jepsen, Morten Løve; Epp, Bastian

    2018-01-01

    Objective: The purpose was to investigate the effects of hearing-loss and fast-acting compression on speech intelligibility and two measures of temporal modulation sensitivity. Design: Twelve adults with normal hearing (NH) and 16 adults with mild to moderately severe sensorineural hearing loss......, the MDD thresholds were higher for the group with hearing loss than for the group with NH. Fast-acting compression increased the modulation detection thresholds, while no effect of compression on the MDD thresholds was observed. The speech reception thresholds obtained in stationary noise were slightly...... of the modulation detection thresholds, compression does not seem to provide a benefit for speech intelligibility. Furthermore, fast-acting compression may not be able to restore MDD thresholds to the values observed for listeners with NH, suggesting that the two measures of amplitude modulation sensitivity...

  3. Balancing speech intelligibility versus sound exposure in selection of personal hearing protection equipment for Chinook aircrews

    NARCIS (Netherlands)

    Wijngaarden, S.J. van; Rots, G.

    2001-01-01

    Background: Aircrews are often exposed to high ambient sound levels, especially in military aviation. Since long-term exposure to such noise may cause hearing damage, selection of adequate hearing protective devices is crucial. Such devices also affect speech intelligibility. When speech

  4. Assessing the effect of physical differences in the articulation of consonants and vowels on audiovisual temporal perception

    Science.gov (United States)

    Vatakis, Argiro; Maragos, Petros; Rodomagoulakis, Isidoros; Spence, Charles

    2012-01-01

    We investigated how the physical differences associated with the articulation of speech affect the temporal aspects of audiovisual speech perception. Video clips of consonants and vowels uttered by three different speakers were presented. The video clips were analyzed using an auditory-visual signal saliency model in order to compare signal saliency and behavioral data. Participants made temporal order judgments (TOJs) regarding which speech-stream (auditory or visual) had been presented first. The sensitivity of participants' TOJs and the point of subjective simultaneity (PSS) were analyzed as a function of the place, manner of articulation, and voicing for consonants, and the height/backness of the tongue and lip-roundedness for vowels. We expected that in the case of the place of articulation and roundedness, where the visual-speech signal is more salient, temporal perception of speech would be modulated by the visual-speech signal. No such effect was expected for the manner of articulation or height. The results demonstrate that for place and manner of articulation, participants' temporal percept was affected (although not always significantly) by highly-salient speech-signals with the visual-signals requiring smaller visual-leads at the PSS. This was not the case when height was evaluated. These findings suggest that in the case of audiovisual speech perception, a highly salient visual-speech signal may lead to higher probabilities regarding the identity of the auditory-signal that modulate the temporal window of multisensory integration of the speech-stimulus. PMID:23060756

  5. Working memory and intelligibility of hearing-aid processed speech

    Science.gov (United States)

    Souza, Pamela E.; Arehart, Kathryn H.; Shen, Jing; Anderson, Melinda; Kates, James M.

    2015-01-01

    Previous work suggested that individuals with low working memory capacity may be at a disadvantage in adverse listening environments, including situations with background noise or substantial modification of the acoustic signal. This study explored the relationship between patient factors (including working memory capacity) and intelligibility and quality of modified speech for older individuals with sensorineural hearing loss. The modification was created using a combination of hearing aid processing [wide-dynamic range compression (WDRC) and frequency compression (FC)] applied to sentences in multitalker babble. The extent of signal modification was quantified via an envelope fidelity index. We also explored the contribution of components of working memory by including measures of processing speed and executive function. We hypothesized that listeners with low working memory capacity would perform more poorly than those with high working memory capacity across all situations, and would also be differentially affected by high amounts of signal modification. Results showed a significant effect of working memory capacity for speech intelligibility, and an interaction between working memory, amount of hearing loss and signal modification. Signal modification was the major predictor of quality ratings. These data add to the literature on hearing-aid processing and working memory by suggesting that the working memory-intelligibility effects may be related to aggregate signal fidelity, rather than to the specific signal manipulation. They also suggest that for individuals with low working memory capacity, sensorineural loss may be most appropriately addressed with WDRC and/or FC parameters that maintain the fidelity of the signal envelope. PMID:25999874

  6. Working memory and intelligibility of hearing-aid processed speech

    Directory of Open Access Journals (Sweden)

    Pamela eSouza

    2015-05-01

    Full Text Available Previous work suggested that individuals with low working memory capacity may be at a disadvantage in adverse listening environments, including situations with background noise or substantial modification of the acoustic signal. This study explored the relationship between patient factors (including working memory capacity) and intelligibility and quality of modified speech for older individuals with sensorineural hearing loss. The modification was created using a combination of hearing aid processing (wide-dynamic range compression and frequency compression) applied to sentences in multitalker babble. The extent of signal modification was quantified via an envelope fidelity index. We also explored the contribution of components of working memory by including measures of processing speed and executive function. We hypothesized that listeners with low working memory capacity would perform more poorly than those with high working memory capacity across all situations, and would also be differentially affected by high amounts of signal modification. Results showed a significant effect of working memory capacity for speech intelligibility, and an interaction between working memory, amount of hearing loss and signal modification. Signal modification was the major predictor of quality ratings. These data add to the literature on hearing-aid processing and working memory by suggesting that the working memory-intelligibility effects may be related to aggregate signal fidelity, rather than to the specific signal manipulation. They also suggest that for individuals with low working memory capacity, sensorineural loss may be most appropriately addressed with wide-dynamic range compression and/or frequency compression parameters that maintain the fidelity of the signal envelope.

  7. The Role of Music in Speech Intelligibility of Learners with Post Lingual Hearing Impairment in Selected Units in Lusaka District

    Science.gov (United States)

    Katongo, Emily Mwamba; Ndhlovu, Daniel

    2015-01-01

    This study sought to establish the role of music in speech intelligibility of learners with Post Lingual Hearing Impairment (PLHI) and strategies teachers used to enhance speech intelligibility in learners with PLHI in selected special units for the deaf in Lusaka district. The study used a descriptive research design. Qualitative and quantitative…

  8. Perceptual Measures of Speech from Individuals with Parkinson's Disease and Multiple Sclerosis: Intelligibility and beyond

    Science.gov (United States)

    Sussman, Joan E.; Tjaden, Kris

    2012-01-01

    Purpose: The primary purpose of this study was to compare percent correct word and sentence intelligibility scores for individuals with multiple sclerosis (MS) and Parkinson's disease (PD) with scaled estimates of speech severity obtained for a reading passage. Method: Speech samples for 78 talkers were judged, including 30 speakers with MS, 16…

  9. Articulation Speaks to Executive Function: An Investigation in 4- to 6-Year-Olds

    Directory of Open Access Journals (Sweden)

    Nicole Netelenbos

    2018-02-01

    Executive function (EF) and language learning play a prominent role in early childhood development. Empirical research continues to point to a concurrent relation between these two faculties. What has been given little attention, however, is the association between EF and speech articulation abilities in children. This study investigated this relation in children aged 4–6 years. Significant correlations indicated that children with better EF [assessed via parental report on the Behavior Rating Inventory of Executive Function (BRIEF)] exhibited stronger speech sound production abilities in the articulation of the "s" and "sh" sounds. Furthermore, regression analyses revealed that the Global Executive Composite (GEC) of the BRIEF served as a predictor for speech sound proficiency and that speech sound proficiency served as a predictor for the GEC. Together, these results demonstrate the imbricated nature of EF and speech sound production while bearing theoretical and practical implications. From a theoretical standpoint, the close link between EF and speech articulation may indicate a common ontogenetic pathway. From a practical perspective, the results suggest that children with speech difficulties could be at higher risk for EF deficits.

  10. Treatment Model in Children with Speech Disorders and Its Therapeutic Efficiency

    Directory of Open Access Journals (Sweden)

    Barberena, Luciana

    2014-05-01

    Introduction: Speech articulation disorders affect the intelligibility of speech. Studies on therapeutic models show the effectiveness of the communication treatment. Objective: To analyze the progress achieved by treatment with the ABAB—Withdrawal and Multiple Probes Model in children with different degrees of phonological disorders. Methods: The diagnosis of speech articulation disorder was determined by speech and hearing evaluation and complementary tests. The subjects of this research were eight children, with an average age of 5:5. The children were distributed into four groups according to the degree of their phonological disorders, based on the percentage of correct consonants, as follows: severe, moderate to severe, mild to moderate, and mild. The phonological treatment applied was the ABAB—Withdrawal and Multiple Probes Model. The development of the therapy by generalization was observed through the comparison between two analyses, contrastive and distinctive features, at the moments of evaluation and reevaluation. Results: The following types of generalization were found: to items not used in the treatment (other words), to another position in the word, within a sound class, to other classes of sounds, and to another syllable structure. Conclusion: The different types of generalization studied showed the expansion of production and proper use of therapy-trained targets in other contexts or untrained environments. Therefore, the analysis of the generalizations proved to be an important criterion to measure the therapeutic efficacy.

  11. Sensorimotor influences on speech perception in infancy.

    Science.gov (United States)

    Bruderer, Alison G; Danielson, D Kyle; Kandhadai, Padmapriya; Werker, Janet F

    2015-11-03

    The influence of speech production on speech perception is well established in adults. However, because adults have a long history of both perceiving and producing speech, the extent to which the perception-production linkage is due to experience is unknown. We addressed this issue by asking whether articulatory configurations can influence infants' speech perception performance. To eliminate influences from specific linguistic experience, we studied preverbal, 6-mo-old infants and tested the discrimination of a nonnative, and hence never-before-experienced, speech sound distinction. In three experimental studies, we used teething toys to control the position and movement of the tongue tip while the infants listened to the speech sounds. Using ultrasound imaging technology, we verified that the teething toys consistently and effectively constrained the movement and positioning of infants' tongues. With a looking-time procedure, we found that temporarily restraining infants' articulators impeded their discrimination of a nonnative consonant contrast but only when the relevant articulator was selectively restrained to prevent the movements associated with producing those sounds. Our results provide striking evidence that even before infants speak their first words and without specific listening experience, sensorimotor information from the articulators influences speech perception. These results transform theories of speech perception by suggesting that even at the initial stages of development, oral-motor movements influence speech sound discrimination. Moreover, an experimentally induced "impairment" in articulator movement can compromise speech perception performance, raising the question of whether long-term oral-motor impairments may impact perceptual development.

  12. Speech rehabilitation of maxillectomy patients with hollow bulb obturator.

    Science.gov (United States)

    Kumar, Pravesh; Jain, Veena; Thakar, Alok

    2012-09-01

    To evaluate the effect of hollow bulb obturator prosthesis on articulation and nasalance in maxillectomy patients. A total of 10 patients, who were to undergo maxillectomy, falling under Aramany classes I and II, with normal speech and hearing patterns were selected for the study. They were provided with definitive maxillary obturators after complete healing of the defect. The patients were asked to wear the obturator for six weeks, and speech analysis was done to measure changes in articulation and nasalance at four different stages of treatment, namely, preoperative, postoperative (after complete healing, that is, 3-4 months after surgery), after 24 hours, and after six weeks of providing the obturators. Articulation was measured objectively for distortion, addition, substitution, and omission by a speech pathologist, and nasalance was measured by Dr. Speech software. The statistical comparison of preoperative and six-week post-rehabilitation levels showed no significant difference in articulation or nasalance. Comparison of the post-surgery complete-healing stage with six weeks after rehabilitation showed significant differences in both nasalance and articulation. Providing an obturator brings speech closer to presurgical levels of articulation, and nasality also improves.

  13. Speech rehabilitation of maxillectomy patients with hollow bulb obturator

    Directory of Open Access Journals (Sweden)

    Pravesh Kumar

    2012-01-01

    Aim: To evaluate the effect of hollow bulb obturator prosthesis on articulation and nasalance in maxillectomy patients. Materials and Methods: A total of 10 patients, who were to undergo maxillectomy, falling under Aramany classes I and II, with normal speech and hearing patterns were selected for the study. They were provided with definitive maxillary obturators after complete healing of the defect. The patients were asked to wear the obturator for six weeks, and speech analysis was done to measure changes in articulation and nasalance at four different stages of treatment, namely, preoperative, postoperative (after complete healing, that is, 3-4 months after surgery), after 24 hours, and after six weeks of providing the obturators. Articulation was measured objectively for distortion, addition, substitution, and omission by a speech pathologist, and nasalance was measured by Dr. Speech software. Results: The statistical comparison of preoperative and six-week post-rehabilitation levels showed no significant difference in articulation or nasalance. Comparison of the post-surgery complete-healing stage with six weeks after rehabilitation showed significant differences in both nasalance and articulation. Conclusion: Providing an obturator brings speech closer to presurgical levels of articulation, and nasality also improves.

  14. The effect of emotion on articulation rate in persistence and recovery of childhood stuttering.

    Science.gov (United States)

    Erdemir, Aysu; Walden, Tedra A; Jefferson, Caswell M; Choi, Dahye; Jones, Robin M

    2018-06-01

    This study investigated the possible association of emotional processes and articulation rate in preschool-age children who stutter and persist (persisting), children who stutter and recover (recovered), and children who do not stutter (nonstuttering). The participants were ten persisting, ten recovered, and ten nonstuttering children between the ages of 3 and 5 years, who were classified as persisting, recovered, or nonstuttering approximately 2-2.5 years after the experimental testing took place. The children were exposed to three emotionally arousing video clips (baseline, positive, and negative) and produced a narrative based on a text-free storybook following each video clip. From the audio recordings of these narratives, individual utterances were transcribed and articulation rates were calculated. Results indicated that persisting children exhibited significantly slower articulation rates following the negative emotion condition, unlike recovered and nonstuttering children, whose articulation rates were not affected by either of the two emotion-inducing conditions. Moreover, all stuttering children displayed faster rates during fluent compared to stuttered speech; however, the recovered children were significantly faster than the persisting children during fluent speech. Negative emotion plays a detrimental role in the speech-motor control processes of children who persist, whereas children who eventually recover seem to exhibit a relatively more stable and mature speech-motor system. This suggests that complex interactions between speech-motor and emotional processes are at play in stuttering recovery and persistence, and that articulation rates following negative emotion, or during stuttered versus fluent speech, might be considered potential factors for prospectively predicting persistence of and recovery from stuttering. Copyright © 2017 Elsevier Inc. All rights reserved.

  15. Modeling speech intelligibility based on the signal-to-noise envelope power ratio

    DEFF Research Database (Denmark)

    Jørgensen, Søren

    of modulation frequency selectivity in the auditory processing of sound with a decision metric for intelligibility that is based on the signal-to-noise envelope power ratio (SNRenv). The proposed speech-based envelope power spectrum model (sEPSM) is demonstrated to account for the effects of stationary...... through three commercially available mobile phones. The model successfully accounts for the performance across the phones in conditions with a stationary speech-shaped background noise, whereas deviations were observed in conditions with “Traffic” and “Pub” noise. Overall, the results of this thesis...

  16. Intelligibility of emotional speech in younger and older adults.

    Science.gov (United States)

    Dupuis, Kate; Pichora-Fuller, M Kathleen

    2014-01-01

    Little is known about the influence of vocal emotions on speech understanding. Word recognition accuracy for stimuli spoken to portray seven emotions (anger, disgust, fear, sadness, neutral, happiness, and pleasant surprise) was tested in younger and older listeners. Emotions were presented in either mixed (heterogeneous emotions mixed in a list) or blocked (homogeneous emotion blocked in a list) conditions. Three main hypotheses were tested. First, vocal emotion affects word recognition accuracy; specifically, portrayals of fear enhance word recognition accuracy because listeners orient to threatening information and/or distinctive acoustical cues such as high pitch mean and variation. Second, older listeners recognize words less accurately than younger listeners, but the effects of different emotions on intelligibility are similar across age groups. Third, blocking emotions in a list results in better word recognition accuracy, especially for older listeners, and reduces the effect of emotion on intelligibility, because as listeners develop expectations about vocal emotion, the allocation of processing resources can shift from emotional to lexical processing. Emotion was the within-subjects variable: all participants heard speech stimuli consisting of a carrier phrase followed by a target word spoken by either a younger or an older talker, with an equal number of stimuli portraying each of the seven vocal emotions. The speech was presented in multi-talker babble at signal-to-noise ratios adjusted for each talker and each listener age group. Listener age (younger, older), condition (mixed, blocked), and talker (younger, older) were the main between-subjects variables. Fifty-six students (Mage = 18.3 years) were recruited from an undergraduate psychology course; 56 older adults (Mage = 72.3 years) were recruited from a volunteer pool. All participants had clinically normal pure-tone audiometric thresholds at frequencies ≤3000 Hz. There were significant main effects of

  17. The nature of articulation errors in Egyptian Arabic-speaking children with velopharyngeal insufficiency due to cleft palate.

    Science.gov (United States)

    Abou-Elsaad, Tamer; Baz, Hemmat; Afsah, Omayma; Mansy, Alzahraa

    2015-09-01

    Even with early surgical repair, the majority of children with cleft palate demonstrate articulation errors and typical cleft palate speech. The aim was to determine the nature of articulation errors of Arabic consonants in Egyptian Arabic-speaking children with velopharyngeal insufficiency (VPI). Thirty Egyptian Arabic-speaking children with VPI due to cleft palate (whether primarily or secondarily repaired) were studied. Auditory perceptual assessment (APA) of the children's speech was conducted. Nasopharyngoscopy was done to assess velopharyngeal port (VPP) movements while the child repeated speech tasks. The Mansoura Arabic Articulation Test (MAAT) was performed to analyze the consonant articulation of these children. The most frequent type of articulatory error observed was substitution, more specifically backing. Pharyngealization of anterior fricatives was the most frequent substitution, especially for the /s/ sound. The most frequent sounds substituted for other sounds were /ʔ/, followed by /k/ and /n/. Significant correlations were found between the degree of open nasality, VPP closure, and the articulation errors. On the other hand, the sounds /ʔ/, /ħ/, /ʕ/, /n/, /w/, and /j/ were normally articulated throughout the studied group. Determining the articulation errors in children with VPI could guide therapists in designing appropriate speech therapy programs for these cases. Copyright © 2015 Elsevier Ireland Ltd. All rights reserved.

  18. Subjective and objective measurement of the intelligibility of synthesized speech impaired by the very low bit rate STANAG 4591 codec including packet loss

    NARCIS (Netherlands)

    Počta, P.; Beerends, J.G.

    2017-01-01

    This paper deals with the intelligibility of speech coded by the STANAG 4591 standard codec, including packet loss, using synthesized speech input. Both subjective and objective assessments are used. It is shown that this codec significantly degrades intelligibility when compared to a standard

  19. The Comorbidity between Attention-Deficit/Hyperactivity Disorder (ADHD) in Children and Arabic Speech Sound Disorder

    Directory of Open Access Journals (Sweden)

    Ruaa Osama Hariri

    2016-04-01

    Children with Attention-Deficit/Hyperactivity Disorder (ADHD) often have co-existing learning disabilities and developmental weaknesses or delays in some areas, including speech (Rief, 2005). Seeing that phonological disorders include articulation errors and other forms of speech disorders, studies of children with ADHD symptoms who demonstrate signs of phonological disorders in their native Arabic language are lacking. The purpose of this study is to provide a description of Arabic language deficits and to present a theoretical model of potential associations between phonological language deficits and ADHD. Dodd and McCormack's (1995) four-subgroup classification of speech disorder and the phonological disorders pertaining to the Arabic language provided by a Saudi Institute for Speech and Hearing are examined within the theoretical framework. Since intervention may improve articulation and focuses a child's attention on the sound structure of words, findings in this study are based on the assumption that children with ADHD may acquire phonology for their Arabic language in the same way, and following the same developmental stages, as intelligible children. Both quantitative and qualitative analyses have shown that the ADHD group analyzed in this study had indeed failed to acquire most of their Arabic consonants as they should have. Keywords: speech sound disorder, attention-deficit/hyperactivity, developmental disorder, phonological disorder, language disorder/delay, language impairment

  20. The development of multisensory speech perception continues into the late childhood years.

    Science.gov (United States)

    Ross, Lars A; Molholm, Sophie; Blanco, Daniella; Gomez-Ramirez, Manuel; Saint-Amour, Dave; Foxe, John J

    2011-06-01

    Observing a speaker's articulations substantially improves the intelligibility of spoken speech, especially under noisy listening conditions. This multisensory integration of speech inputs is crucial to effective communication. Appropriate development of this ability has major implications for children in classroom and social settings, and deficits in it have been linked to a number of neurodevelopmental disorders, especially autism. It is clear from structural imaging studies that there is a prolonged maturational course within regions of the perisylvian cortex that persists into late childhood, and these regions have been firmly established as being crucial to speech and language functions. Given this protracted maturational timeframe, we reasoned that multisensory speech processing might well show a similarly protracted developmental course. Previous work in adults has shown that audiovisual enhancement in word recognition is most apparent within a restricted range of signal-to-noise ratios (SNRs). Here, we investigated when these properties emerge during childhood by testing multisensory speech recognition abilities in typically developing children aged between 5 and 14 years, and comparing them with those of adults. By parametrically varying SNRs, we found that children benefited significantly less from observing visual articulations, displaying considerably less audiovisual enhancement. The findings suggest that improvement in the ability to recognize speech-in-noise and in audiovisual integration during speech perception continues quite late into the childhood years. The implication is that a considerable amount of multisensory learning remains to be achieved during the later schooling years, and that explicit efforts to accommodate this learning may well be warranted. European Journal of Neuroscience © 2011 Federation of European Neuroscience Societies and Blackwell Publishing Ltd. No claim to original US government works.

  1. Expanding the phenotypic profile of Kleefstra syndrome: A female with low-average intelligence and childhood apraxia of speech.

    Science.gov (United States)

    Samango-Sprouse, Carole; Lawson, Patrick; Sprouse, Courtney; Stapleton, Emily; Sadeghin, Teresa; Gropman, Andrea

    2016-05-01

    Kleefstra syndrome (KS) is a rare neurogenetic disorder most commonly caused by deletion in the 9q34.3 chromosomal region and is associated with intellectual disabilities, severe speech delay, and motor planning deficits. To our knowledge, this is the first patient (PQ, a 6-year-old female) with a 9q34.3 deletion who has near normal intelligence, and developmental dyspraxia with childhood apraxia of speech (CAS). At 6, the Wechsler Preschool and Primary Intelligence testing (WPPSI-III) revealed a Verbal IQ of 81 and Performance IQ of 79. The Beery Buktenica Test of Visual Motor Integration, 5th Edition (VMI) indicated severe visual motor deficits: VMI = 51; Visual Perception = 48; Motor Coordination explanation for the previously reported speech delay and expressive language disorder. Further research is warranted on the impact of CAS on intelligence and behavioral outcome in KS. Therapeutic and prognostic implications are discussed. © 2016 Wiley Periodicals, Inc.

  2. Recognizing emotional speech in Persian: a validated database of Persian emotional speech (Persian ESD).

    Science.gov (United States)

    Keshtiari, Niloofar; Kuhlmann, Michael; Eslami, Moharram; Klann-Delius, Gisela

    2015-03-01

    Research on emotional speech often requires valid stimuli for assessing perceived emotion through prosody and lexical content. To date, no comprehensive emotional speech database for Persian is officially available. The present article reports the process of designing, compiling, and evaluating a comprehensive emotional speech database for colloquial Persian. The database contains a set of 90 validated novel Persian sentences classified in five basic emotional categories (anger, disgust, fear, happiness, and sadness), as well as a neutral category. These sentences were validated in two experiments by a group of 1,126 native Persian speakers. The sentences were articulated by two native Persian speakers (one male, one female) in three conditions: (1) congruent (emotional lexical content articulated in a congruent emotional voice), (2) incongruent (neutral sentences articulated in an emotional voice), and (3) baseline (all emotional and neutral sentences articulated in neutral voice). The speech materials comprise about 470 sentences. The validity of the database was evaluated by a group of 34 native speakers in a perception test. Utterances recognized better than five times chance performance (71.4 %) were regarded as valid portrayals of the target emotions. Acoustic analysis of the valid emotional utterances revealed differences in pitch, intensity, and duration, attributes that may help listeners to correctly classify the intended emotion. The database is designed to be used as a reliable material source (for both text and speech) in future cross-cultural or cross-linguistic studies of emotional speech, and it is available for academic research purposes free of charge. To access the database, please contact the first author.

  3. The benefit of combining a deep neural network architecture with ideal ratio mask estimation in computational speech segregation to improve speech intelligibility

    DEFF Research Database (Denmark)

    Bentsen, Thomas; May, Tobias; Kressner, Abigail Anne

    2018-01-01

    Computational speech segregation attempts to automatically separate speech from noise. This is challenging in conditions with interfering talkers and low signal-to-noise ratios. Recent approaches have adopted deep neural networks and successfully demonstrated speech intelligibility improvements....... A selection of components may be responsible for the success with these state-of-the-art approaches: the system architecture, a time frame concatenation technique and the learning objective. The aim of this study was to explore the roles and the relative contributions of these components by measuring speech......, to a state-of-the-art deep neural network-based architecture. Another improvement of 13.9 percentage points was obtained by changing the learning objective from the ideal binary mask, in which individual time-frequency units are labeled as either speech- or noise-dominated, to the ideal ratio mask, where...

  4. Cued Speech Transliteration: Effects of Accuracy and Lag Time on Message Intelligibility

    Science.gov (United States)

    Krause, Jean C.; Lopez, Katherine A.

    2017-01-01

    This paper is the second in a series concerned with the level of access afforded to students who use educational interpreters. The first paper (Krause & Tessler, 2016) focused on factors affecting accuracy of messages produced by Cued Speech (CS) transliterators (expression). In this study, factors affecting intelligibility (reception by deaf…

  5. Visual context enhanced. The joint contribution of iconic gestures and visible speech to degraded speech comprehension.

    NARCIS (Netherlands)

    Drijvers, L.; Özyürek, A.

    2017-01-01

    Purpose: This study investigated whether and to what extent iconic co-speech gestures contribute to information from visible speech to enhance degraded speech comprehension at different levels of noise-vocoding. Previous studies of the contributions of these 2 visual articulators to speech

  6. Stimulation of the pedunculopontine nucleus area in Parkinson's disease: effects on speech and intelligibility.

    Science.gov (United States)

    Pinto, Serge; Ferraye, Murielle; Espesser, Robert; Fraix, Valérie; Maillet, Audrey; Guirchoum, Jennifer; Layani-Zemour, Deborah; Ghio, Alain; Chabardès, Stéphan; Pollak, Pierre; Debû, Bettina

    2014-10-01

    Improvement of gait disorders following pedunculopontine nucleus area stimulation in patients with Parkinson's disease has previously been reported and led us to propose this surgical treatment to patients who progressively developed severe gait disorders and freezing despite optimal dopaminergic drug treatment and subthalamic nucleus stimulation. The outcome of our prospective study on the first six patients was somewhat mitigated, as freezing of gait and falls related to freezing were improved by low frequency electrical stimulation of the pedunculopontine nucleus area in some, but not all, patients. Here, we report the speech data prospectively collected in these patients with Parkinson's disease. Indeed, because subthalamic nucleus surgery may lead to speech impairment and a worsening of dysarthria in some patients with Parkinson's disease, we felt it was important to precisely examine any possible modulations of speech for a novel target for deep brain stimulation. Our results suggested a trend towards speech degradation related to the pedunculopontine nucleus area surgery (off stimulation) for aero-phonatory control (maximum phonation time), phono-articulatory coordination (oral diadochokinesis) and speech intelligibility. Possibly, the observed speech degradation may also be linked to the clinical characteristics of the group of patients. The influence of pedunculopontine nucleus area stimulation per se was more complex, depending on the nature of the task: it had a deleterious effect on maximum phonation time and oral diadochokinesis, and mixed effects on speech intelligibility. Whereas levodopa intake and subthalamic nucleus stimulation alone had no and positive effects on speech dimensions, respectively, a negative interaction between the two treatments was observed both before and after pedunculopontine nucleus area surgery. This combination effect did not seem to be modulated by pedunculopontine nucleus area stimulation. 
Although limited in our group of

  7. Subjective Quality Measurement of Speech Its Evaluation, Estimation and Applications

    CERN Document Server

    Kondo, Kazuhiro

    2012-01-01

    It is becoming crucial to accurately estimate and monitor speech quality in various ambient environments to guarantee high-quality speech communication. This practical, hands-on book presents speech intelligibility measurement methods so that readers can start measuring or estimating the speech intelligibility of their own systems. The book also introduces subjective and objective speech quality measures and describes speech intelligibility measurement methods in detail. It introduces a diagnostic rhyme test that uses rhyming word pairs and includes: an investigation into the effect of word familiarity on speech intelligibility; speech intelligibility measurement of localized speech in virtual 3-D acoustic space using the rhyme test; and estimation of speech intelligibility using objective measures, including the ITU-standard PESQ measures and automatic speech recognizers.

  8. Speech perception, production and intelligibility in French-speaking children with profound hearing loss and early cochlear implantation after congenital cytomegalovirus infection.

    Science.gov (United States)

    Laccourreye, L; Ettienne, V; Prang, I; Couloigner, V; Garabedian, E-N; Loundon, N

    2015-12-01

    To analyze speech in children with profound hearing loss following congenital cytomegalovirus (cCMV) infection with cochlear implantation (CI) before the age of 3 years. In a cohort of 15 children with profound hearing loss, speech perception, production and intelligibility were assessed before and 3 years after CI; variables impacting results were explored. Post-CI, median word recognition was 74% on closed-list and 48% on open-list testing; 80% of children acquired speech production; and 60% were intelligible for all listeners or listeners attentive to lip-reading and/or aware of the child's hearing loss. Univariate analysis identified 3 variables (mean post-CI hearing threshold, bilateral vestibular areflexia, and brain abnormality on MRI) with significant negative impact on the development of speech perception, production and intelligibility. CI showed positive impact on hearing and speech in children with post-cCMV profound hearing loss. Our study demonstrated the key role of maximizing post-CI hearing gain. A few children had insufficient progress, especially in case of bilateral vestibular areflexia and/or brain abnormality on MRI. This led us to suggest that balance rehabilitation and speech therapy should be intensified in such cases. Copyright © 2015 Elsevier Masson SAS. All rights reserved.

  9. Articulation and Noncomprehension Signaling in Adolescent and Adult Males with Down Syndrome and Fragile X Syndrome

    Science.gov (United States)

    Fedak, Larissa Ann

    2012-01-01

    The purpose of this study was to determine whether or not decreased articulation of speech played a role in the ability of an individual with Down syndrome or Fragile X syndrome to signal noncomprehension and whether the two groups differed in their levels of articulation of speech and noncomprehension signaling ability. The research was conducted…

  10. Intelligibility in Context Scale: Normative and Validation Data for English-Speaking Preschoolers.

    Science.gov (United States)

    McLeod, Sharynne; Crowe, Kathryn; Shahaeian, Ameneh

    2015-07-01

    The purpose of this study was to describe normative and validation data on the Intelligibility in Context Scale (ICS; McLeod, Harrison, & McCormack, 2012c) for English-speaking children. The ICS is a 7-item, parent-report measure of children's speech intelligibility with a range of communicative partners. Data were collected from the parents of 803 Australian English-speaking children ranging in age from 4;0 (years;months) to 5;5 (37.0% were multilingual). The mean ICS score was 4.4 (SD = 0.7) out of a possible total score of 5. Children's speech was reported to be most intelligible to their parents, followed by their immediate family, friends, and teachers; children's speech was least intelligible to strangers. The ICS had high internal consistency (α = .94). Significant differences in scores were identified on the basis of sex and age but not on the basis of socioeconomic status or the number of languages spoken. There were significant differences in scores between children whose parents had concerns about their child's speech (M = 3.9) and those who did not (M = 4.6). The optimal cutoff score yielded a sensitivity of .82 and a specificity of .58. Test-retest reliability and criterion validity were established for 184 children with a speech sound disorder. There was a significant low correlation between the ICS mean score and percentage of phonemes correct (r = .30), percentage of consonants correct (r = .24), and percentage of vowels correct (r = .30) on the Diagnostic Evaluation of Articulation and Phonology (Dodd, Hua, Crosbie, Holm, & Ozanne, 2002). Thirty-one parents completed the ICS related to English and another language spoken by their child with a speech sound disorder. The significant correlations between the scores suggest that the ICS may be robust between languages. This article provides normative ICS data for English-speaking children and additional validation of the psychometric properties of the ICS.
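The ICS scoring that the abstract describes (seven items rated 1-5, summarized as a mean score, with a cutoff for screening decisions) can be sketched in a few lines. This is an illustrative reconstruction, not the published instrument: the validity check and the cutoff value of 4.0 used below are assumptions for demonstration; the study determined its optimal cutoff from the reported sensitivity (.82) and specificity (.58).

```python
# Sketch of ICS-style scoring (illustrative; not the published instrument).
def ics_mean(ratings):
    """Mean of the seven 5-point ICS items (1 = never, 5 = always understood)."""
    if len(ratings) != 7 or not all(1 <= r <= 5 for r in ratings):
        raise ValueError("expected seven ratings on a 1-5 scale")
    return sum(ratings) / 7

def flag_for_assessment(ratings, cutoff=4.0):
    """Flag a child whose mean item score falls at or below a chosen cutoff."""
    return ics_mean(ratings) <= cutoff
```

A child rated "usually understood" or better across all partners would thus score near the reported sample mean of 4.4.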

  11. INTEGRATING MACHINE TRANSLATION AND SPEECH SYNTHESIS COMPONENT FOR ENGLISH TO DRAVIDIAN LANGUAGE SPEECH TO SPEECH TRANSLATION SYSTEM

    Directory of Open Access Journals (Sweden)

    J. SANGEETHA

    2015-02-01

    Full Text Available This paper provides an interface between the machine translation and speech synthesis system for converting English speech to Tamil text in English to Tamil speech to speech translation system. The speech translation system consists of three modules: automatic speech recognition, machine translation and text to speech synthesis. Many procedures for incorporation of speech recognition and machine translation have been projected. Still speech synthesis system has not yet been measured. In this paper, we focus on integration of machine translation and speech synthesis, and report a subjective evaluation to investigate the impact of speech synthesis, machine translation and the integration of machine translation and speech synthesis components. Here we implement a hybrid machine translation (combination of rule based and statistical machine translation and concatenative syllable based speech synthesis technique. In order to retain the naturalness and intelligibility of synthesized speech Auto Associative Neural Network (AANN prosody prediction is used in this work. The results of this system investigation demonstrate that the naturalness and intelligibility of the synthesized speech are strongly influenced by the fluency and correctness of the translated text.
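The three-module architecture described above (automatic speech recognition, machine translation, text-to-speech synthesis) amounts to a composition of three stages. The skeleton below is only a control-flow sketch under stated assumptions: every component body is a stand-in stub, including the toy one-word `lexicon`; the paper's hybrid rule-based/statistical MT and AANN prosody model are not reproduced here.

```python
# Control-flow sketch of a three-stage speech-to-speech pipeline (all stubs).
def recognize(audio):
    """ASR stub: English speech -> English text (assumes a precomputed transcript)."""
    return audio["transcript"]

def translate(text):
    """MT stub: English -> Tamil via a toy lexicon (placeholder for hybrid MT)."""
    lexicon = {"hello": "vanakkam"}  # illustrative single-entry rule table
    return " ".join(lexicon.get(w, w) for w in text.lower().split())

def synthesize(text):
    """TTS stub: would concatenate syllable units; here just wraps the text."""
    return {"waveform": None, "text": text}

def speech_to_speech(audio):
    return synthesize(translate(recognize(audio)))
```

The design point the paper makes is that the MT and TTS stages must be evaluated jointly, since translation errors degrade the naturalness of the synthesized output.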

  12. Long-Term Follow-Up Study of Young Adults Treated for Unilateral Complete Cleft Lip, Alveolus, and Palate by a Treatment Protocol Including Two-Stage Palatoplasty: Speech Outcomes.

    Science.gov (United States)

    Kappen, Isabelle Francisca Petronella Maria; Bittermann, Dirk; Janssen, Laura; Bittermann, Gerhard Koendert Pieter; Boonacker, Chantal; Haverkamp, Sarah; de Wilde, Hester; Van Der Heul, Marise; Specken, Tom FJMC; Koole, Ron; Kon, Moshe; Breugem, Corstiaan Cornelis; Mink van der Molen, Aebele Barber

    2017-05-01

    No consensus exists on the optimal treatment protocol for orofacial clefts or the optimal timing of cleft palate closure. This study investigated factors influencing speech outcomes after two-stage palate repair in adults with a non-syndromal complete unilateral cleft lip and palate (UCLP). This was a retrospective analysis of adult patients with a UCLP who underwent two-stage palate closure and were treated at our tertiary cleft centre. Patients ≥17 years of age were invited for a final speech assessment. Their medical history was obtained from their medical files, and speech outcomes were assessed by a speech pathologist during the follow-up consultation. Forty-eight patients were included in the analysis, with a mean age of 21 years (standard deviation, 3.4 years). Their mean age at the time of hard and soft palate closure was 3 years and 8.0 months, respectively. In 40% of the patients, a pharyngoplasty was performed. On a 5-point intelligibility scale, 84.4% received a score of 1 or 2, meaning that their speech was intelligible. We observed a significant correlation between intelligibility scores and the incidence of articulation errors (P<0.001). In total, 36% showed mild to moderate hypernasality during the speech assessment, and 11%-17% of the patients exhibited increased nasalance scores, assessed through nasometry. The present study describes long-term speech outcomes after two-stage palatoplasty with hard palate closure at a mean age of 3 years old. We observed moderate long-term intelligibility scores, a relatively high incidence of persistent hypernasality, and a high pharyngoplasty incidence.

  13. Speech-language pathologists' practices regarding assessment, analysis, target selection, intervention, and service delivery for children with speech sound disorders.

    Science.gov (United States)

    McLeod, Sharynne; Baker, Elise

    2014-01-01

    A survey of 231 Australian speech-language pathologists (SLPs) was undertaken to describe practices regarding assessment, analysis, target selection, intervention, and service delivery for children with speech sound disorders (SSD). The participants typically worked in private practice, education, or community health settings and 67.6% had a waiting list for services. For each child, most of the SLPs spent 10-40 min in pre-assessment activities, 30-60 min undertaking face-to-face assessments, and 30-60 min completing paperwork after assessments. During an assessment SLPs typically conducted a parent interview, single-word speech sampling, collected a connected speech sample, and used informal tests. They also determined children's stimulability and estimated intelligibility. With multilingual children, informal assessment procedures and English-only tests were commonly used and SLPs relied on family members or interpreters to assist. Common analysis techniques included determination of phonological processes, substitutions-omissions-distortions-additions (SODA), and phonetic inventory. Participants placed high priority on selecting target sounds that were stimulable, early developing, and in error across all word positions, and 60.3% felt very confident or confident selecting an appropriate intervention approach. Eight intervention approaches were frequently used: auditory discrimination, minimal pairs, cued articulation, phonological awareness, traditional articulation therapy, auditory bombardment, Nuffield Centre Dyspraxia Programme, and core vocabulary. Children typically received individual therapy with an SLP in a clinic setting. Parents often observed and participated in sessions and SLPs typically included siblings and grandparents in intervention sessions. Parent training and home programs were more frequently used than group therapy. Two-thirds kept up-to-date by reading journal articles monthly or every 6 months. There were many similarities with

  14. Contribution of Binaural Masking Release to Improved Speech Intelligibility for different Masker types.

    Science.gov (United States)

    Sutojo, Sarinah; van de Par, Steven; Schoenmaker, Esther

    2018-06-01

    In situations with competing talkers or in the presence of masking noise, speech intelligibility can be improved by spatially separating the target speaker from the interferers. This advantage is generally referred to as spatial release from masking (SRM), and different mechanisms have been suggested to explain it. One proposed mechanism to benefit from spatial cues is binaural masking release, which is purely stimulus driven. According to this mechanism, the spatial benefit results from differences in the binaural cues of target and masker, which need to appear simultaneously in time and frequency to improve signal detection. In an alternatively proposed mechanism, the differences in the interaural cues improve the segregation of auditory streams, a process which involves top-down processing rather than being purely stimulus driven. Other than the cues that produce binaural masking release, the interaural cue differences between target and interferer required to improve stream segregation do not have to appear simultaneously in time and frequency. This study is concerned with the contribution of binaural masking release to SRM for three masker types that differ with respect to the amount of energetic masking they exert. Speech intelligibility was measured, employing a stimulus manipulation that inhibits binaural masking release, and analyzed with a metric to account for the number of better-ear glimpses. Results indicate that the contribution of the stimulus-driven binaural masking release plays a minor role, while binaural stream segregation and the availability of glimpses in the better ear had a stronger influence on improving speech intelligibility. This article is protected by copyright. All rights reserved.
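The better-ear glimpse metric mentioned in this abstract can be illustrated with a minimal sketch: for each time-frequency cell, take whichever ear has the higher local signal-to-noise ratio and count the cells that exceed a criterion. The 0 dB criterion and the flat list of cells below are illustrative assumptions, not the authors' exact metric.

```python
# Illustrative better-ear glimpse count over per-cell SNRs (dB), one value
# per time-frequency cell; the criterion level is an assumption.
def better_ear_glimpses(snr_left, snr_right, criterion_db=0.0):
    """Count cells where the better ear's local SNR exceeds the criterion."""
    return sum(max(l, r) > criterion_db
               for l, r in zip(snr_left, snr_right))
```

A larger count means more of the target speech is audible at at least one ear, which the study found to predict intelligibility better than binaural masking release alone.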

  15. Visual Context Enhanced: The Joint Contribution of Iconic Gestures and Visible Speech to Degraded Speech Comprehension

    Science.gov (United States)

    Drijvers, Linda; Ozyurek, Asli

    2017-01-01

    Purpose: This study investigated whether and to what extent iconic co-speech gestures contribute to information from visible speech to enhance degraded speech comprehension at different levels of noise-vocoding. Previous studies of the contributions of these 2 visual articulators to speech comprehension have only been performed separately. Method:…

  16. A Danish open-set speech corpus for competing-speech studies

    DEFF Research Database (Denmark)

    Nielsen, Jens Bo; Dau, Torsten; Neher, Tobias

    2014-01-01

    Studies investigating speech-on-speech masking effects commonly use closed-set speech materials such as the coordinate response measure [Bolia et al. (2000). J. Acoust. Soc. Am. 107, 1065-1066]. However, these studies typically result in very low (i.e., negative) speech recognition thresholds (SRTs) when the competing speech signals are spatially separated. To achieve higher SRTs that correspond more closely to natural communication situations, an open-set, low-context, multi-talker speech corpus was developed. Three sets of 268 unique Danish sentences were created, and each set was recorded with one of three professional female talkers. The intelligibility of each sentence in the presence of speech-shaped noise was measured. For each talker, 200 approximately equally intelligible sentences were then selected and systematically distributed into 10 test lists. Test list homogeneity was assessed...

  17. Automatic Speech Recognition Systems for the Evaluation of Voice and Speech Disorders in Head and Neck Cancer

    OpenAIRE

    Andreas Maier; Tino Haderlein; Florian Stelzle; Elmar Nöth; Emeka Nkenke; Frank Rosanowski; Anne Schützenberger; Maria Schuster

    2010-01-01

    In patients suffering from head and neck cancer, speech intelligibility is often restricted. For assessment and outcome measurements, automatic speech recognition systems have previously been shown to be appropriate for objective and quick evaluation of intelligibility. In this study we investigate the applicability of the method to speech disorders caused by head and neck cancer. Intelligibility was quantified by speech recognition on recordings of a standard text read by 41 German laryngect...

  18. Automatic Speech Recognition Systems for the Evaluation of Voice and Speech Disorders in Head and Neck Cancer

    Directory of Open Access Journals (Sweden)

    Andreas Maier

    2010-01-01

    Full Text Available In patients suffering from head and neck cancer, speech intelligibility is often restricted. For assessment and outcome measurements, automatic speech recognition systems have previously been shown to be appropriate for objective and quick evaluation of intelligibility. In this study we investigate the applicability of the method to speech disorders caused by head and neck cancer. Intelligibility was quantified by speech recognition on recordings of a standard text read by 41 German laryngectomized patients with cancer of the larynx or hypopharynx and 49 German patients who had suffered from oral cancer. The speech recognition provides the percentage of correctly recognized words of a sequence, that is, the word recognition rate. Automatic evaluation was compared to perceptual ratings by a panel of experts and to an age-matched control group. Both patient groups showed significantly lower word recognition rates than the control group. Automatic speech recognition yielded word recognition rates which complied with experts' evaluation of intelligibility on a significant level. Automatic speech recognition serves as a good means with low effort to objectify and quantify the most important aspect of pathologic speech—the intelligibility. The system was successfully applied to voice and speech disorders.
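The word recognition rate the system reports, the percentage of reference words that the recognizer reproduced, can be illustrated as follows. A production scorer would align hypothesis to reference with an edit-distance alignment; the position-wise comparison below is a deliberate simplification for illustration only.

```python
# Simplified word-recognition-rate sketch (position-wise, not edit-distance
# aligned), in the spirit of the intelligibility measure described above.
def word_recognition_rate(reference, hypothesis):
    """Percentage of reference words matched at the same position."""
    ref, hyp = reference.split(), hypothesis.split()
    correct = sum(r == h for r, h in zip(ref, hyp))
    return 100.0 * correct / len(ref)
```

Because insertions and deletions shift subsequent words, real scoring tools align the two word sequences first; this sketch only conveys the ratio being computed.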

  19. Illustrated Speech Anatomy.

    Science.gov (United States)

    Shearer, William M.

    Written for students in the fields of speech correction and audiology, the text deals with the following: structures involved in respiration; the skeleton and the processes of inhalation and exhalation; phonation and pitch, the larynx, and esophageal speech; muscles involved in articulation; muscles involved in resonance; and the anatomy of the…

  20. The relationship between the intelligibility of time-compressed speech and speech in noise in young and elderly listeners

    Science.gov (United States)

    Versfeld, Niek J.; Dreschler, Wouter A.

    2002-01-01

    A conventional measure to determine the ability to understand speech in noisy backgrounds is the so-called speech reception threshold (SRT) for sentences. It yields the signal-to-noise ratio (in dB) for which half of the sentences are correctly perceived. The SRT defines to what degree speech must be audible to a listener in order to become just intelligible. There are indications that elderly listeners have greater difficulty in understanding speech in adverse listening conditions than young listeners. This may be partly due to the differences in hearing sensitivity (presbycusis), hence audibility, but other factors, such as temporal acuity, may also play a significant role. A potential measure for the temporal acuity may be the threshold to which speech can be accelerated, or compressed in time. A new test is introduced where the speech rate is varied adaptively. In analogy to the SRT, the time-compression threshold (or TCT) then is defined as the speech rate (expressed in syllables per second) for which half of the sentences are correctly perceived. In experiment I, the TCT test is introduced and normative data are provided. In experiment II, four groups of subjects (young and elderly normal-hearing and hearing-impaired subjects) participated, and the SRT's in stationary and fluctuating speech-shaped noise were determined, as well as the TCT. The results show that the SRT in fluctuating noise and the TCT are highly correlated. All tests indicate that, even after correction for the hearing loss, elderly normal-hearing subjects perform worse than young normal-hearing subjects. The results indicate that the use of the TCT test or the SRT test in fluctuating noise is preferred over the SRT test in stationary noise.
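The adaptive procedure behind both the SRT and the new TCT test can be sketched as a simple one-up/one-down track: the level (SNR in dB for the SRT, speech rate in syllables per second for the TCT) moves toward harder after each correct response and toward easier after each error, converging on the 50%-correct point. The step size and starting level below are illustrative assumptions, not the authors' exact protocol.

```python
# One-up/one-down adaptive track sketch (step size and start are assumptions).
def adaptive_track(responses, start=0.0, step=2.0):
    """Return the level presented on each trial, given correct/incorrect responses."""
    level, levels = start, []
    for correct in responses:
        levels.append(level)
        level += -step if correct else step  # harder after correct, easier after error
    return levels
```

A threshold estimate is then typically taken as the mean of the levels presented after the first few reversals of the track's direction.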

  1. Speech rate in Parkinson's disease: A controlled study.

    Science.gov (United States)

    Martínez-Sánchez, F; Meilán, J J G; Carro, J; Gómez Íñiguez, C; Millian-Morell, L; Pujante Valverde, I M; López-Alburquerque, T; López, D E

    2016-09-01

    Speech disturbances will affect most patients with Parkinson's disease (PD) over the course of the disease. The origin and severity of these symptoms are of clinical and diagnostic interest. To evaluate the clinical pattern of speech impairment in PD patients and identify significant differences in speech rate and articulation compared to control subjects, speech rate and articulation in a reading task were measured using an automatic analytical method. A total of 39 PD patients in the 'on' state and 45 age- and sex-matched asymptomatic controls participated in the study. None of the patients experienced dyskinesias or motor fluctuations during the test. The patients with PD displayed a significant reduction in speech and articulation rates; there were no significant correlations between the studied speech parameters and patient characteristics such as L-dopa dose, duration of the disorder, age, and UPDRS III scores and Hoehn & Yahr scales. Patients with PD show a characteristic pattern of declining speech rate. These results suggest that in PD, disfluencies are the result of the movement disorder affecting the physiology of speech production systems. Copyright © 2014 Sociedad Española de Neurología. Published by Elsevier España, S.L.U. All rights reserved.
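The two measures this study contrasts differ only in their denominator: speech rate counts syllables over the whole sample (pauses included), while articulation rate counts them over speaking time only. A minimal sketch, with hypothetical values in the test:

```python
# Illustrative speech-rate vs. articulation-rate computation (syllables/second).
def speech_rate(n_syllables, total_s):
    """Syllables per second over the whole sample, pauses included."""
    return n_syllables / total_s

def articulation_rate(n_syllables, total_s, pause_s):
    """Syllables per second over speaking time only, pauses excluded."""
    return n_syllables / (total_s - pause_s)
```

For a given recording, articulation rate is therefore always at least as high as speech rate, and the gap between the two grows with pausing.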

  2. Validation of the Intelligibility in Context Scale for Jamaican Creole-Speaking Preschoolers.

    Science.gov (United States)

    Washington, Karla N; McDonald, Megan M; McLeod, Sharynne; Crowe, Kathryn; Devonish, Hubert

    2017-08-15

    To describe validation of the Intelligibility in Context Scale (ICS; McLeod, Harrison, & McCormack, 2012a) and ICS-Jamaican Creole (ICS-JC; McLeod, Harrison, & McCormack, 2012b) in a sample of typically developing 3- to 6-year-old Jamaicans. One hundred and forty-five preschooler-parent dyads participated in the study. Parents completed the 7-item ICS (n = 145) and ICS-JC (n = 98) to rate children's speech intelligibility (5-point scale) across communication partners (parents, immediate family, extended family, friends, acquaintances, strangers). Preschoolers completed the Diagnostic Evaluation of Articulation and Phonology (DEAP; Dodd, Hua, Crosbie, Holm, & Ozanne, 2006) in English and Jamaican Creole to establish speech-sound competency. For this sample, we examined validity and reliability (interrater, test-retest, internal consistency) evidence using measures of speech-sound production: (a) percentage of consonants correct, (b) percentage of vowels correct, and (c) percentage of phonemes correct. ICS and ICS-JC ratings showed preschoolers were always (5) to usually (4) understood across communication partners (ICS, M = 4.43; ICS-JC, M = 4.50). Both tools demonstrated excellent internal consistency (α = .91), high interrater reliability, and high test-retest reliability. Significant correlations between the two tools and between each measure and language-specific percentage of consonants correct, percentage of vowels correct, and percentage of phonemes correct provided criterion-validity evidence. A positive correlation between the ICS and age further strengthened validity evidence for that measure. Both tools show promising evidence of reliability and validity in describing functional speech intelligibility for this group of typically developing Jamaican preschoolers.
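The percentage-of-segments-correct family of measures used here as criterion evidence (percentage of consonants, vowels, or phonemes correct) all reduce to the same ratio: correct segments divided by attempted segments, times 100. The sketch below assumes the target and produced transcriptions have already been aligned segment by segment, which real scoring does with a phonetic alignment.

```python
# Illustrative percentage-correct computation over pre-aligned segment lists.
def percent_correct(target_segments, produced_segments):
    """Correct segments / target segments x 100 (alignment assumed done)."""
    correct = sum(t == p for t, p in zip(target_segments, produced_segments))
    return 100.0 * correct / len(target_segments)
```

Restricting the segment lists to consonants gives PCC, to vowels gives PVC, and to all phonemes gives PPC.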

  3. Conflict monitoring in speech processing : An fMRI study of error detection in speech production and perception

    NARCIS (Netherlands)

    Gauvin, Hanna; De Baene, W.; Brass, Marcel; Hartsuiker, Robert

    2016-01-01

    To minimize the number of errors in speech, and thereby facilitate communication, speech is monitored before articulation. It is, however, unclear at which level during speech production monitoring takes place, and what mechanisms are used to detect and correct errors. The present study investigated

  4. Alternative Speech Communication System for Persons with Severe Speech Disorders

    Science.gov (United States)

    Selouani, Sid-Ahmed; Sidi Yakoub, Mohammed; O'Shaughnessy, Douglas

    2009-12-01

    Assistive speech-enabled systems are proposed to help both French and English speaking persons with various speech disorders. The proposed assistive systems use automatic speech recognition (ASR) and speech synthesis in order to enhance the quality of communication. These systems aim at improving the intelligibility of pathologic speech making it as natural as possible and close to the original voice of the speaker. The resynthesized utterances use new basic units, a new concatenating algorithm and a grafting technique to correct the poorly pronounced phonemes. The ASR responses are uttered by the new speech synthesis system in order to convey an intelligible message to listeners. Experiments involving four American speakers with severe dysarthria and two Acadian French speakers with sound substitution disorders (SSDs) are carried out to demonstrate the efficiency of the proposed methods. An improvement of the Perceptual Evaluation of the Speech Quality (PESQ) value of 5% and more than 20% is achieved by the speech synthesis systems that deal with SSD and dysarthria, respectively.

  5. Management Strategy on Mobile Intelligent Speech Industrial Chain

    Institute of Scientific and Technical Information of China (English)

    王忠; 赵惠

    2014-01-01

    Intelligent speech technology for mobile devices will open up the mobile intelligent speech market and change users' way of life. An analysis of the current state of mobile intelligent speech applications points to the following trends: speech will become the "entrance" to the mobile internet; the market will settle into an oligopolistic competitive pattern; and start-up companies will become the main driving force of innovation. Although domestic enterprises hold some technical advantages, their overall competitive strength still lags behind that of foreign enterprises. The paper suggests that the government integrate the industrial chain through industry alliances, pool enterprise research and development through project cooperation, and guide market development through pilot projects.

  6. Pilot Workload and Speech Analysis: A Preliminary Investigation

    Science.gov (United States)

    Bittner, Rachel M.; Begault, Durand R.; Christopher, Bonny R.

    2013-01-01

    Prior research has questioned the effectiveness of speech analysis to measure the stress, workload, truthfulness, or emotional state of a talker. The question remains regarding the utility of speech analysis for restricted vocabularies such as those used in aviation communications. A part-task experiment was conducted in which participants performed Air Traffic Control read-backs in different workload environments. Participant's subjective workload and the speech qualities of fundamental frequency (F0) and articulation rate were evaluated. A significant increase in subjective workload rating was found for high workload segments. F0 was found to be significantly higher during high workload while articulation rates were found to be significantly slower. No correlation was found to exist between subjective workload and F0 or articulation rate.

  7. Ear, Hearing and Speech

    DEFF Research Database (Denmark)

    Poulsen, Torben

    2000-01-01

    An introduction is given to the anatomy and the function of the ear, basic psychoacoustic matters (hearing threshold, loudness, masking), the speech signal, and speech intelligibility. The lecture note is written for the course: Fundamentals of Acoustics and Noise Control (51001).

  8. Long-Term Follow-Up Study of Young Adults Treated for Unilateral Complete Cleft Lip, Alveolus, and Palate by a Treatment Protocol Including Two-Stage Palatoplasty: Speech Outcomes

    Directory of Open Access Journals (Sweden)

    Isabelle Francisca Petronella Maria Kappen

    2017-05-01

    Full Text Available Background: No consensus exists on the optimal treatment protocol for orofacial clefts or the optimal timing of cleft palate closure. This study investigated factors influencing speech outcomes after two-stage palate repair in adults with a non-syndromal complete unilateral cleft lip and palate (UCLP). Methods: This was a retrospective analysis of adult patients with a UCLP who underwent two-stage palate closure and were treated at our tertiary cleft centre. Patients ≥17 years of age were invited for a final speech assessment. Their medical history was obtained from their medical files, and speech outcomes were assessed by a speech pathologist during the follow-up consultation. Results: Forty-eight patients were included in the analysis, with a mean age of 21 years (standard deviation, 3.4 years). Their mean age at the time of hard and soft palate closure was 3 years and 8.0 months, respectively. In 40% of the patients, a pharyngoplasty was performed. On a 5-point intelligibility scale, 84.4% received a score of 1 or 2, meaning that their speech was intelligible. We observed a significant correlation between intelligibility scores and the incidence of articulation errors (P<0.001). In total, 36% showed mild to moderate hypernasality during the speech assessment, and 11%–17% of the patients exhibited increased nasalance scores, assessed through nasometry. Conclusions: The present study describes long-term speech outcomes after two-stage palatoplasty with hard palate closure at a mean age of 3 years old. We observed moderate long-term intelligibility scores, a relatively high incidence of persistent hypernasality, and a high pharyngoplasty incidence.

  9. Long-Term Follow-Up Study of Young Adults Treated for Unilateral Complete Cleft Lip, Alveolus, and Palate by a Treatment Protocol Including Two-Stage Palatoplasty: Speech Outcomes

    Science.gov (United States)

    Bittermann, Dirk; Janssen, Laura; Bittermann, Gerhard Koendert Pieter; Boonacker, Chantal; Haverkamp, Sarah; de Wilde, Hester; Van Der Heul, Marise; Specken, Tom FJMC; Koole, Ron; Kon, Moshe; Breugem, Corstiaan Cornelis; Mink van der Molen, Aebele Barber

    2017-01-01

    Background No consensus exists on the optimal treatment protocol for orofacial clefts or the optimal timing of cleft palate closure. This study investigated factors influencing speech outcomes after two-stage palate repair in adults with a non-syndromal complete unilateral cleft lip and palate (UCLP). Methods This was a retrospective analysis of adult patients with a UCLP who underwent two-stage palate closure and were treated at our tertiary cleft centre. Patients ≥17 years of age were invited for a final speech assessment. Their medical history was obtained from their medical files, and speech outcomes were assessed by a speech pathologist during the follow-up consultation. Results Forty-eight patients were included in the analysis, with a mean age of 21 years (standard deviation, 3.4 years). Their mean age at the time of hard and soft palate closure was 3 years and 8.0 months, respectively. In 40% of the patients, a pharyngoplasty was performed. On a 5-point intelligibility scale, 84.4% received a score of 1 or 2, meaning that their speech was intelligible. We observed a significant correlation between intelligibility scores and the incidence of articulation errors (P<0.001). In total, 36% showed mild to moderate hypernasality during the speech assessment, and 11%–17% of the patients exhibited increased nasalance scores, assessed through nasometry. Conclusions The present study describes long-term speech outcomes after two-stage palatoplasty with hard palate closure at a mean age of 3 years old. We observed moderate long-term intelligibility scores, a relatively high incidence of persistent hypernasality, and a high pharyngoplasty incidence. PMID:28573094

  10. Deep Learning-Based Noise Reduction Approach to Improve Speech Intelligibility for Cochlear Implant Recipients.

    Science.gov (United States)

    Lai, Ying-Hui; Tsao, Yu; Lu, Xugang; Chen, Fei; Su, Yu-Ting; Chen, Kuang-Chao; Chen, Yu-Hsuan; Chen, Li-Ching; Po-Hung Li, Lieber; Lee, Chin-Hui

    2018-01-20

    We investigate the clinical effectiveness of a novel deep learning-based noise reduction (NR) approach under noisy conditions with challenging noise types at low signal-to-noise ratio (SNR) levels for Mandarin-speaking cochlear implant (CI) recipients. The deep learning-based NR approach used in this study consists of two modules: noise classifier (NC) and deep denoising autoencoder (DDAE), thus termed (NC + DDAE). In a series of comprehensive experiments, we conduct qualitative and quantitative analyses on the NC module and the overall NC + DDAE approach. Moreover, we evaluate the speech recognition performance of the NC + DDAE NR and classical single-microphone NR approaches for Mandarin-speaking CI recipients under different noisy conditions. The testing set contains Mandarin sentences corrupted by two types of maskers, two-talker babble noise and a construction jackhammer noise, at 0 and 5 dB SNR levels. Two conventional NR techniques and the proposed deep learning-based approach are used to process the noisy utterances. We qualitatively compare the NR approaches by the amplitude envelope and spectrogram plots of the processed utterances. Quantitative objective measures include (1) normalized covariance measure to test the intelligibility of the utterances processed by each of the NR approaches; and (2) speech recognition tests conducted by nine Mandarin-speaking CI recipients. These nine CI recipients use their own clinical speech processors during testing. The experimental results of objective evaluation and listening test indicate that under challenging listening conditions, the proposed NC + DDAE NR approach yields higher intelligibility scores than the two compared classical NR techniques, under both matched and mismatched training-testing conditions. Compared to those two conventional NR techniques, it also provides superior noise suppression and introduces less distortion.
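The two-module NC + DDAE design described in this abstract amounts to a dispatch step: a noise classifier picks which denoiser, trained for the detected noise type, processes the signal. The sketch below shows only that control flow; both the classifier rule and the per-type "denoisers" are placeholder stubs, since the paper's versions are neural networks operating on spectral features.

```python
# Control-flow sketch of the NC + DDAE idea (classifier and denoisers are stubs).
def classify_noise(frame_features):
    """Stub classifier: a toy range-based rule standing in for the NC network."""
    return "babble" if max(frame_features) - min(frame_features) > 1.0 else "jackhammer"

# One placeholder "denoiser" per noise type, standing in for type-specific DDAEs.
DENOISERS = {
    "babble": lambda frames: [0.5 * v for v in frames],
    "jackhammer": lambda frames: [0.8 * v for v in frames],
}

def nc_ddae(noisy_frames):
    """Classify the noise type, then apply the matching denoiser."""
    return DENOISERS[classify_noise(noisy_frames)](noisy_frames)
```

The matched/mismatched distinction in the results refers to whether the test noise type was seen during training of the selected denoiser.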

  11. Multisensory and Modality Specific Processing of Visual Speech in Different Regions of the Premotor Cortex

    Directory of Open Access Journals (Sweden)

    Daniel eCallan

    2014-05-01

    Full Text Available Behavioral and neuroimaging studies have demonstrated that brain regions involved with speech production also support speech perception, especially under degraded conditions. The premotor cortex has been shown to be active during both observation and execution of action (‘Mirror System’ properties), and may facilitate speech perception by mapping unimodal and multimodal sensory features onto articulatory speech gestures. For this functional magnetic resonance imaging (fMRI) study, participants identified vowels produced by a speaker in audio-visual (saw the speaker’s articulating face and heard her voice), visual only (only saw the speaker’s articulating face), and audio only (only heard the speaker’s voice) conditions with varying audio signal-to-noise ratios in order to determine the regions of the premotor cortex involved with multisensory and modality specific processing of visual speech gestures. The task was designed so that identification could be made with a high level of accuracy from visual only stimuli to control for task difficulty and differences in intelligibility. The results of the fMRI analysis for visual only and audio-visual conditions showed overlapping activity in inferior frontal gyrus and premotor cortex. The left ventral inferior premotor cortex showed properties of multimodal (audio-visual) enhancement with a degraded auditory signal. The left inferior parietal lobule and right cerebellum also showed these properties. The left ventral superior and dorsal premotor cortex did not show this multisensory enhancement effect, but there was greater activity for the visual only over audio-visual conditions in these areas. The results suggest that the inferior regions of the ventral premotor cortex are involved with integrating multisensory information, whereas more superior and dorsal regions of the premotor cortex are involved with mapping unimodal (in this case visual) sensory features of the speech signal with

  12. Facial-muscle weakness, speech disorders and dysphagia are common in patients with classic infantile Pompe disease treated with enzyme therapy.

    Science.gov (United States)

    van Gelder, C M; van Capelle, C I; Ebbink, B J; Moor-van Nugteren, I; van den Hout, J M P; Hakkesteegt, M M; van Doorn, P A; de Coo, I F M; Reuser, A J J; de Gier, H H W; van der Ploeg, A T

    2012-05-01

    Classic infantile Pompe disease is an inherited generalized glycogen storage disorder caused by deficiency of lysosomal acid α-glucosidase. If left untreated, patients die before one year of age. Although enzyme-replacement therapy (ERT) has significantly prolonged lifespan, it has also revealed new aspects of the disease. For up to 11 years, we investigated the frequency and consequences of facial-muscle weakness, speech disorders and dysphagia in long-term survivors. Sequential photographs were used to determine the timing and severity of facial-muscle weakness. Using standardized articulation tests and fibreoptic endoscopic evaluation of swallowing, we investigated speech and swallowing function in a subset of patients. This study included 11 patients with classic infantile Pompe disease. Median age at the start of ERT was 2.4 months (range 0.1-8.3 months), and median age at the end of the study was 4.3 years (range 7.7 months -12.2 years). All patients developed facial-muscle weakness before the age of 15 months. Speech was studied in four patients. Articulation was disordered, with hypernasal resonance and reduced speech intelligibility in all four. Swallowing function was studied in six patients, the most important findings being ineffective swallowing with residues of food (5/6), penetration or aspiration (3/6), and reduced pharyngeal and/or laryngeal sensibility (2/6). We conclude that facial-muscle weakness, speech disorders and dysphagia are common in long-term survivors receiving ERT for classic infantile Pompe disease. To improve speech and reduce the risk for aspiration, early treatment by a speech therapist and regular swallowing assessments are recommended.

  13. The Effects of Otitis Media on Articulation. Final Report for 1982-1983.

    Science.gov (United States)

    Roberts, Joanne Erwick

    The study examined the relationship in 44 preschoolers (considered to have varying degrees of predicted risk for poor school performance) between otitis media (middle ear disease) during the first 3 years of life and speech production (articulation) during preschool and school age years. Speech production accuracy was assessed by the number of…

  14. A blind algorithm for recovering articulator positions from acoustics

    Energy Technology Data Exchange (ETDEWEB)

    Hogden, John E [Los Alamos National Laboratory

    2009-01-01

    MIMICRI is a signal-processing algorithm that has been shown to blindly infer and invert memoryless nonlinear functions of unobservable bandlimited signals, such as the mapping from the unobservable positions of the speech articulators to observable speech sounds. We review results of using MIMICRI on toy problems and on human speech data. We note that MIMICRI requires that the user specify two parameters: the dimensionality and pass-band of the unobservable signals. We show how to use cross-validation to help estimate the passband. An unexpected consequence of this work is that it helps separate signals with overlapping frequency bands.

  15. Memory performance on the Auditory Inference Span Test is independent of background noise type for young adults with normal hearing at high speech intelligibility.

    Science.gov (United States)

    Rönnberg, Niklas; Rudner, Mary; Lunner, Thomas; Stenfelt, Stefan

    2014-01-01

    Listening in noise is often perceived to be effortful. This is partly because cognitive resources are engaged in separating the target signal from background noise, leaving fewer resources for storage and processing of the content of the message in working memory. The Auditory Inference Span Test (AIST) is designed to assess listening effort by measuring the ability to maintain and process heard information. The aim of this study was to use AIST to investigate the effect of background noise types and signal-to-noise ratio (SNR) on listening effort, as a function of working memory capacity (WMC) and updating ability (UA). The AIST was administered in three types of background noise: steady-state speech-shaped noise, amplitude modulated speech-shaped noise, and unintelligible speech. Three SNRs targeting 90% speech intelligibility or better were used in each of the three noise types, giving nine different conditions. The reading span test assessed WMC, while UA was assessed with the letter memory test. Twenty young adults with normal hearing participated in the study. Results showed that AIST performance was not influenced by noise type at the same intelligibility level, but became worse with worse SNR when background noise was speech-like. Performance on AIST also decreased with increasing memory load level. Correlations between AIST performance and the cognitive measurements suggested that WMC is of more importance for listening when SNRs are worse, while UA is of more importance for listening in easier SNRs. The results indicated that in young adults with normal hearing, the effort involved in listening in noise at high intelligibility levels is independent of the noise type. However, when noise is speech-like and intelligibility decreases, listening effort increases, probably due to extra demands on cognitive resources added by the informational masking created by the speech fragments and vocal sounds in the background noise.
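    The nine conditions above are built by presenting the same speech material in each noise type at fixed signal-to-noise ratios. A minimal sketch of how noise is typically scaled to hit a target SNR before mixing (a hypothetical helper, not the study's actual procedure):

```python
import math

def mix_at_snr(speech, noise, snr_db):
    """Scale `noise` so the speech-to-noise power ratio of the mix
    equals `snr_db`, then add it to `speech` sample by sample."""
    p_speech = sum(s * s for s in speech) / len(speech)
    p_noise = sum(n * n for n in noise) / len(noise)
    # Required noise power is p_speech / 10^(snr_db/10).
    scale = math.sqrt(p_speech / (p_noise * 10 ** (snr_db / 10)))
    return [s + scale * n for s, n in zip(speech, noise)]
```

    With `snr_db` set to each of the three target values, the same speech tokens can be reused across all noise-type-by-SNR conditions.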

  16. Memory performance on the Auditory Inference Span Test is independent of background noise type for young adults with normal hearing at high speech intelligibility

    Directory of Open Access Journals (Sweden)

    Niklas eRönnberg

    2014-12-01

    Full Text Available Listening in noise is often perceived to be effortful. This is partly because cognitive resources are engaged in separating the target signal from background noise, leaving fewer resources for storage and processing of the content of the message in working memory. The Auditory Inference Span Test (AIST) is designed to assess listening effort by measuring the ability to maintain and process heard information. The aim of this study was to use AIST to investigate the effect of background noise types and signal-to-noise ratio (SNR) on listening effort, as a function of working memory capacity (WMC) and updating ability (UA). The AIST was administered in three types of background noise: steady-state speech-shaped noise, amplitude modulated speech-shaped noise, and unintelligible speech. Three SNRs targeting 90% speech intelligibility or better were used in each of the three noise types, giving nine different conditions. The reading span test assessed WMC, while UA was assessed with the letter memory test. Twenty young adults with normal hearing participated in the study. Results showed that AIST performance was not influenced by noise type at the same intelligibility level, but became worse with worse SNR when background noise was speech-like. Performance on AIST also decreased with increasing memory load level. Correlations between AIST performance and the cognitive measurements suggested that WMC is of more importance for listening when SNRs are worse, while UA is of more importance for listening in easier SNRs. The results indicated that in young adults with normal hearing, the effort involved in listening in noise at high intelligibility levels is independent of the noise type. However, when noise is speech-like and intelligibility decreases, listening effort increases, probably due to extra demands on cognitive resources added by the informational masking created by the speech fragments and vocal sounds in the background noise.

  17. Articulação compensatória associada à fissura de palato ou disfunção velofaríngea: revisão de literatura Compensatory articulation associated to cleft palate or velopharyngeal dysfunction: literature review

    Directory of Open Access Journals (Sweden)

    Viviane Cristina de Castro Marino

    2012-06-01

    Full Text Available BACKGROUND: compensatory articulation in cleft lip and palate. PURPOSE: to contribute information regarding the types of compensatory articulation described in the literature and to discuss the implications and contributions of clinical and instrumental evaluation in identifying these speech productions. CONCLUSION: compensatory articulation deserves attention from Brazilian clinicians and researchers, since these productions occur frequently in children and adults with cleft palate or velopharyngeal dysfunction, compromising their speech intelligibility and consequently their quality of life. Speech-language pathologists need to deepen their knowledge of the different types of compensatory articulation and of the procedures for evaluating these productions, and should establish preventive programs that favor phonological acquisition without the development of these compensations.

  18. Listeners Experience Linguistic Masking Release in Noise-Vocoded Speech-in-Speech Recognition

    Science.gov (United States)

    Viswanathan, Navin; Kokkinakis, Kostas; Williams, Brittany T.

    2018-01-01

    Purpose: The purpose of this study was to evaluate whether listeners with normal hearing perceiving noise-vocoded speech-in-speech demonstrate better intelligibility of target speech when the background speech was mismatched in language (linguistic release from masking [LRM]) and/or location (spatial release from masking [SRM]) relative to the…

  19. Preschool speech articulation and nonword repetition abilities may help predict eventual recovery or persistence of stuttering.

    Science.gov (United States)

    Spencer, Caroline; Weber-Fox, Christine

    2014-09-01

    In preschool children, we investigated whether expressive and receptive language, phonological, articulatory, and/or verbal working memory proficiencies aid in predicting eventual recovery or persistence of stuttering. Participants included 65 children, including 25 children who do not stutter (CWNS) and 40 who stutter (CWS) recruited at age 3;9-5;8. At initial testing, participants were administered the Test of Auditory Comprehension of Language, 3rd edition (TACL-3), Structured Photographic Expressive Language Test, 3rd edition (SPELT-3), Bankson-Bernthal Test of Phonology-Consonant Inventory subtest (BBTOP-CI), Nonword Repetition Test (NRT; Dollaghan & Campbell, 1998), and Test of Auditory Perceptual Skills-Revised (TAPS-R) auditory number memory and auditory word memory subtests. Stuttering behaviors of CWS were assessed in subsequent years, forming groups whose stuttering eventually persisted (CWS-Per; n=19) or recovered (CWS-Rec; n=21). Proficiency scores in morphosyntactic skills, consonant production, verbal working memory for known words, and phonological working memory and speech production for novel nonwords obtained at the initial testing were analyzed for each group. CWS-Per were less proficient than CWNS and CWS-Rec in measures of consonant production (BBTOP-CI) and repetition of novel phonological sequences (NRT). In contrast, receptive language, expressive language, and verbal working memory abilities did not distinguish CWS-Rec from CWS-Per. Binary logistic regression analysis indicated that preschool BBTOP-CI scores and overall NRT proficiency significantly predicted future recovery status. Results suggest that phonological and speech articulation abilities in the preschool years should be considered with other predictive factors as part of a comprehensive risk assessment for the development of chronic stuttering. 

  20. Encouraging Student Reflection and Articulation Using a Learning Companion: A Commentary

    Science.gov (United States)

    Goodman, Bradley; Linton, Frank; Gaimari, Robert

    2016-01-01

    Our 1998 paper "Encouraging Student Reflection and Articulation using a Learning Companion" (Goodman et al. 1998) was a stepping stone in the progression of learning companions for intelligent tutoring systems (ITS). A simulated learning companion, acting as a peer in an intelligent tutoring environment, ensures the availability of a…

  1. Speech Motor Programming in Apraxia of Speech: Evidence from a Delayed Picture-Word Interference Task

    Science.gov (United States)

    Mailend, Marja-Liisa; Maas, Edwin

    2013-01-01

    Purpose: Apraxia of speech (AOS) is considered a speech motor programming impairment, but the specific nature of the impairment remains a matter of debate. This study investigated 2 hypotheses about the underlying impairment in AOS framed within the Directions Into Velocities of Articulators (DIVA; Guenther, Ghosh, & Tourville, 2006) model: The…

  2. Speech disorders did not correlate with age at onset of Parkinson’s disease

    Directory of Open Access Journals (Sweden)

    Alice Estevo Dias

    2016-02-01

    Full Text Available ABSTRACT Speech disorders are common manifestations of Parkinson's disease. Objective To compare speech articulation in patients according to age at onset of the disease. Methods Fifty patients were divided into two groups: Group I consisted of 30 patients with age at onset between 40 and 55 years; Group II consisted of 20 patients with age at onset after 65 years. All patients were evaluated based on the Unified Parkinson's Disease Rating Scale scores, the Hoehn and Yahr scale, and speech evaluation by perceptual and acoustic analysis. Results There was no statistically significant difference between the two groups regarding neurological involvement and speech characteristics. Correlation analysis indicated differences in speech articulation in relation to staging and axial scores of rigidity and bradykinesia for middle- and late-onset groups. Conclusions Impairment of speech articulation did not correlate with age at onset of disease, but was positively related with disease duration and higher scores in both groups.

  3. Computer-assisted CI fitting: Is the learning capacity of the intelligent agent FOX beneficial for speech understanding?

    Science.gov (United States)

    Meeuws, Matthias; Pascoal, David; Bermejo, Iñigo; Artaso, Miguel; De Ceulaer, Geert; Govaerts, Paul J

    2017-07-01

    The software application FOX ('Fitting to Outcome eXpert') is an intelligent agent to assist in the programming of cochlear implant (CI) processors. The current version utilizes a mixture of deterministic and probabilistic logic which is able to improve over time through a learning effect. This study aimed at assessing whether this learning capacity yields measurable improvements in speech understanding. A retrospective study was performed on 25 consecutive CI recipients with a median CI use experience of 10 years who came for their annual CI follow-up fitting session. All subjects were assessed by means of speech audiometry with open-set monosyllables at 40, 55, 70, and 85 dB SPL in quiet with their home MAP. Other psychoacoustic tests were executed depending on the audiologist's clinical judgment. The home MAP and the corresponding test results were entered into FOX. If FOX suggested MAP changes, they were implemented and another speech audiometry was performed with the new MAP. FOX suggested MAP changes in 21 subjects (84%). The within-subject comparison showed a significant median improvement of 10, 3, 1, and 7% at 40, 55, 70, and 85 dB SPL, respectively. All but two subjects showed an instantaneous improvement in their mean speech audiometric score. Persons with long-term CI use, who received a FOX-assisted CI fitting at least 6 months ago, displayed improved speech understanding after MAP modifications recommended by the current version of FOX. This can be explained only by intrinsic improvements in FOX's algorithms, as they have resulted from learning. This learning is an inherent feature of artificial intelligence and it may yield measurable benefit in speech understanding even in long-term CI recipients.

  4. Classroom acoustics design for speakers’ comfort and speech intelligibility: a European perspective

    DEFF Research Database (Denmark)

    Garcia, David Pelegrin; Rasmussen, Birgit; Brunskog, Jonas

    2014-01-01

    Current European regulatory requirements or guidelines for reverberation time in classrooms have the goal of enhancing speech intelligibility for students and reducing noise levels in classrooms. At the same time, school teachers suffer frequently from voice problems due to high vocal load… intelligibility for students. Two room acoustic parameters are shown relevant for a speaker: the voice support, linked to vocal effort, and the decay time derived from an oral-binaural impulse response, linked to vocal comfort. Theoretical prediction models for room-averaged values of these parameters… The recommended values of reverberation time in fully occupied classrooms for flexible teaching methods are between 0.45 s and 0.6 s (between 0.6 and 0.7 s in an unoccupied but furnished condition) for classrooms with fewer than 40 students and volumes below 210 m³. When designing larger classrooms, a dedicated…

  5. Assessment of the Speech Intelligibility Performance of Post Lingual Cochlear Implant Users at Different Signal-to-Noise Ratios Using the Turkish Matrix Test

    Directory of Open Access Journals (Sweden)

    Zahra Polat

    2016-10-01

    Full Text Available Background: Spoken word recognition and speech perception tests in quiet are used routinely to assess the benefit that child and adult cochlear implant users receive from their devices. Cochlear implant users generally demonstrate high-level performance in these test materials, as they are able to achieve high-level speech perception in quiet situations. Although these test materials provide valuable information regarding cochlear implant (CI) users’ performance in optimal listening conditions, they do not give realistic information regarding performance in adverse listening conditions, which is the case in the everyday environment. Aims: The aim of this study was to assess the speech intelligibility performance of postlingual CI users in the presence of noise at different signal-to-noise ratios with the Matrix Test developed for the Turkish language. Study Design: Cross-sectional study. Methods: Thirty postlingual adult implant users, who had been using their implants for a minimum of one year, were evaluated with the Turkish Matrix Test. Subjects’ speech intelligibility was measured using the adaptive and non-adaptive Matrix Test in quiet and noisy environments. Results: The results of the study show a correlation between the Pure Tone Average (PTA) values of the subjects and Matrix Test Speech Reception Threshold (SRT) values in quiet. Hence, it is also possible to assess the PTA values of CI users using the Matrix Test. However, no correlation was found between Matrix SRT values in quiet and Matrix SRT values in noise. Similarly, the correlation between PTA values and intelligibility scores in noise was also not significant. Therefore, it may not be possible to assess the intelligibility performance of CI users using test batteries performed in quiet conditions. Conclusion: The Matrix Test can be used to assess the benefit CI users receive from their systems in everyday life, since it is possible to perform

  6. Oral Articulatory Control in Childhood Apraxia of Speech

    Science.gov (United States)

    Grigos, Maria I.; Moss, Aviva; Lu, Ying

    2015-01-01

    Purpose: The purpose of this research was to examine spatial and temporal aspects of articulatory control in children with childhood apraxia of speech (CAS), children with speech delay characterized by an articulation/phonological impairment (SD), and controls with typical development (TD) during speech tasks that increased in word length. Method:…

  7. The Influence of Cochlear Mechanical Dysfunction, Temporal Processing Deficits, and Age on the Intelligibility of Audible Speech in Noise for Hearing-Impaired Listeners

    Directory of Open Access Journals (Sweden)

    Peter T. Johannesen

    2016-05-01

    Full Text Available The aim of this study was to assess the relative importance of cochlear mechanical dysfunction, temporal processing deficits, and age on the ability of hearing-impaired listeners to understand speech in noisy backgrounds. Sixty-eight listeners took part in the study. They were provided with linear, frequency-specific amplification to compensate for their audiometric losses, and intelligibility was assessed for speech-shaped noise (SSN) and a time-reversed two-talker masker (R2TM). Behavioral estimates of cochlear gain loss and residual compression were available from a previous study and were used as indicators of cochlear mechanical dysfunction. Temporal processing abilities were assessed using frequency modulation detection thresholds. Age, audiometric thresholds, and the difference between audiometric threshold and cochlear gain loss were also included in the analyses. Stepwise multiple linear regression models were used to assess the relative importance of the various factors for intelligibility. Results showed that (a) cochlear gain loss was unrelated to intelligibility, (b) residual cochlear compression was related to intelligibility in SSN but not in a R2TM, (c) temporal processing was strongly related to intelligibility in a R2TM and much less so in SSN, and (d) age per se impaired intelligibility. In summary, all factors affected intelligibility, but their relative importance varied across maskers.
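    Stepwise multiple linear regression of the kind used above can be sketched as a greedy forward selection over candidate predictors. The version below ranks candidates by gain in R² rather than the F-test/p-value entry and exit criteria a statistics package would apply, so it is a simplified illustration; the predictor names and data are made up:

```python
import numpy as np

def forward_stepwise(X, y, names, max_vars=None):
    """Greedy forward selection: repeatedly add the predictor that most
    increases R^2, stopping when no candidate improves the fit."""
    def r_squared(cols):
        # Ordinary least squares with an intercept on the chosen columns.
        A = np.column_stack([np.ones(len(y))] + [X[:, c] for c in cols])
        beta, *_ = np.linalg.lstsq(A, y, rcond=None)
        resid = y - A @ beta
        total = y - y.mean()
        return 1.0 - (resid @ resid) / (total @ total)

    selected, remaining, best = [], list(range(X.shape[1])), 0.0
    while remaining and (max_vars is None or len(selected) < max_vars):
        gain, col = max((r_squared(selected + [c]), c) for c in remaining)
        if gain - best < 1e-6:  # no meaningful improvement: stop
            break
        selected.append(col)
        remaining.remove(col)
        best = gain
    return [names[c] for c in selected], best
```

    Running it on listener data would report which factors (e.g. temporal processing, compression) enter the model first for each masker type.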

  8. Dysfluencies in the speech of adults with intellectual disabilities and reported speech difficulties.

    Science.gov (United States)

    Coppens-Hofman, Marjolein C; Terband, Hayo R; Maassen, Ben A M; van Schrojenstein Lantman-De Valk, Henny M J; van Zaalen-op't Hof, Yvonne; Snik, Ad F M

    2013-01-01

    In individuals with an intellectual disability, speech dysfluencies are more common than in the general population. In clinical practice, these fluency disorders are generally diagnosed and treated as stuttering rather than cluttering. The aim was to characterise the type of dysfluencies in adults with intellectual disabilities and reported speech difficulties, with an emphasis on manifestations of stuttering and cluttering, a distinction intended to help optimise treatment aimed at improving fluency and intelligibility. The dysfluencies in the spontaneous speech of 28 adults (18-40 years; 16 men) with mild and moderate intellectual disabilities (IQs 40-70), who were characterised as poorly intelligible by their caregivers, were analysed using the speech norms for typically developing adults and children. The speakers were subsequently assigned to different diagnostic categories by relating their resulting dysfluency profiles to mean articulatory rate and articulatory rate variability. Twenty-two (75%) of the participants showed clinically significant dysfluencies, of which 21% were classified as cluttering, 29% as cluttering-stuttering and 25% as clear cluttering at normal articulatory rate. The characteristic pattern of stuttering did not occur. The dysfluencies in the speech of adults with intellectual disabilities and poor intelligibility show patterns that are specific for this population. Together, the results suggest that in this specific group of dysfluent speakers interventions should be aimed at cluttering rather than stuttering.

  9. Speech production gains following constraint-induced movement therapy in children with hemiparesis.

    Science.gov (United States)

    Allison, Kristen M; Reidy, Teressa Garcia; Boyle, Mary; Naber, Erin; Carney, Joan; Pidcock, Frank S

    2017-01-01

    The purpose of this study was to investigate changes in speech skills of children who have hemiparesis and speech impairment after participation in a constraint-induced movement therapy (CIMT) program. While case studies have reported collateral speech gains following CIMT, the effect of CIMT on speech production has not previously been directly investigated to the knowledge of these investigators. Eighteen children with hemiparesis and co-occurring speech impairment participated in a 21-day clinical CIMT program. The Goldman-Fristoe Test of Articulation-2 (GFTA-2) was used to assess children's articulation of speech sounds before and after the intervention. Changes in percent of consonants correct (PCC) on the GFTA-2 were used as a measure of change in speech production. Children made significant gains in PCC following CIMT. Gains were similar in children with left and right-sided hemiparesis, and across age groups. This study reports significant collateral gains in speech production following CIMT and suggests benefits of CIMT may also spread to speech motor domains.
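    The PCC metric itself reduces to a simple ratio once the target and produced consonants have been phonetically transcribed and aligned. A simplified sketch (the GFTA-2 scoring rules are more detailed than this):

```python
def percent_consonants_correct(target, produced):
    """Percent Consonants Correct: the share of target consonants that
    were produced correctly in an aligned pair of transcriptions."""
    if len(target) != len(produced):
        raise ValueError("transcriptions must be aligned to equal length")
    correct = sum(t == p for t, p in zip(target, produced))
    return 100.0 * correct / len(target)
```

    Pre/post comparison then amounts to computing this score on the same word list before and after the intervention.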

  10. Intelligibility and Clarity of Reverberant Speech: Effects of Wide Dynamic Range Compression Release Time and Working Memory

    Science.gov (United States)

    Reinhart, Paul N.; Souza, Pamela E.

    2016-01-01

    Purpose: The purpose of this study was to examine the effects of varying wide dynamic range compression (WDRC) release time on intelligibility and clarity of reverberant speech. The study also considered the role of individual working memory. Method: Thirty older listeners with mild to moderately-severe sloping sensorineural hearing loss…

  11. A Cross-Language Study of Acoustic Predictors of Speech Intelligibility in Individuals with Parkinson's Disease

    Science.gov (United States)

    Kim, Yunjung; Choi, Yaelin

    2017-01-01

    Purpose: The present study aimed to compare acoustic models of speech intelligibility in individuals with the same disease (Parkinson's disease [PD]) and presumably similar underlying neuropathologies but with different native languages (American English [AE] and Korean). Method: A total of 48 speakers from the 4 speaker groups (AE speakers with…

  12. Evaluation of articulation of Turkish phonemes after removable partial denture application

    Directory of Open Access Journals (Sweden)

    Özbeki Murat

    2003-01-01

    Full Text Available In this study, the adaptation of patients to removable partial dentures was evaluated in relation to the articulation of Turkish phonemes. Articulation of the /t,d,n,l,r/, /g,k/, /b,p,m/ and /s,z,ş,v,f,y,j,h,c/ phonemes was evaluated by three speech pathologists, using records taken from 15 patients before the insertion of a removable partial denture, just after insertion, and one week later. The test consisted of evaluation of phoneme articulation in independent syllables in terms of distortion, omission, substitution, mass effect, hypernasality and hyponasality. Data were evaluated with the Cochran Q, McNemar and Kruskal-Wallis tests. The results showed that for some phonemes, problems in articulation occurred after the insertion of a removable partial denture, while for others a significant amelioration was observed after insertion. In general, problems in the articulation of the evaluated phonemes were resolved after one week of use.

  13. Use of a Deep Recurrent Neural Network to Reduce Wind Noise: Effects on Judged Speech Intelligibility and Sound Quality.

    Science.gov (United States)

    Keshavarzi, Mahmoud; Goehring, Tobias; Zakis, Justin; Turner, Richard E; Moore, Brian C J

    2018-01-01

    Despite great advances in hearing-aid technology, users still experience problems with noise in windy environments. The potential benefits of using a deep recurrent neural network (RNN) for reducing wind noise were assessed. The RNN was trained using recordings of the output of the two microphones of a behind-the-ear hearing aid in response to male and female speech at various azimuths in the presence of noise produced by wind from various azimuths with a velocity of 3 m/s, using the "clean" speech as a reference. A paired-comparison procedure was used to compare all possible combinations of three conditions for subjective intelligibility and for sound quality or comfort. The conditions were unprocessed noisy speech, noisy speech processed using the RNN, and noisy speech that was high-pass filtered (which also reduced wind noise). Eighteen native English-speaking participants were tested, nine with normal hearing and nine with mild-to-moderate hearing impairment. Frequency-dependent linear amplification was provided for the latter. Processing using the RNN was significantly preferred over no processing by both subject groups for both subjective intelligibility and sound quality, although the magnitude of the preferences was small. High-pass filtering (HPF) was not significantly preferred over no processing. Although RNN was significantly preferred over HPF only for sound quality for the hearing-impaired participants, for the results as a whole, there was a preference for RNN over HPF. Overall, the results suggest that reduction of wind noise using an RNN is possible and might have beneficial effects when used in hearing aids.

  15. The Influence of Visual and Auditory Information on the Perception of Speech and Non-Speech Oral Movements in Patients with Left Hemisphere Lesions

    Science.gov (United States)

    Schmid, Gabriele; Thielmann, Anke; Ziegler, Wolfram

    2009-01-01

    Patients with lesions of the left hemisphere often suffer from oral-facial apraxia, apraxia of speech, and aphasia. In these patients, visual features often play a critical role in speech and language therapy, when pictured lip shapes or the therapist's visible mouth movements are used to facilitate speech production and articulation. This demands…

  16. The (in)dependence of articulation and lexical planning during isolated word production.

    Science.gov (United States)

    Buz, Esteban; Jaeger, T Florian

    The number of phonological neighbors to a word (PND) can affect its lexical planning and pronunciation. Similar parallel effects on planning and articulation have been observed for other lexical variables, such as a word's contextual predictability. Such parallelism is frequently taken to indicate that effects on articulation are mediated by effects on the time course of lexical planning. We test this mediation assumption for PND and find it unsupported. In a picture naming experiment, we measure speech onset latencies (planning), word durations, and vowel dispersion (articulation). We find that PND predicts both latencies and durations. Further, latencies predict durations. However, the effects of PND and latency on duration are independent: parallel effects do not imply mediation. We discuss the consequences for accounts of lexical planning, articulation, and the link between them. In particular, our results suggest that ease of planning does not explain effects of PND on articulation.

  17. Methods of speech rate analysis: a pilot study.

    Science.gov (United States)

    Costa, Luanna Maria Oliveira; Martins-Reis, Vanessa de Oliveira; Celeste, Letícia Côrrea

    2016-01-01

    To describe the performance of fluent adults in different measures of speech rate. The study included 24 fluent adults of both genders, speakers of Brazilian Portuguese, born and still living in the metropolitan region of Belo Horizonte, state of Minas Gerais, aged between 18 and 59 years. Participants were grouped by age: G1 (18-29 years), G2 (30-39 years), G3 (40-49 years), and G4 (50-59 years). The speech samples were obtained following the methodology of the Speech Fluency Assessment Protocol. In addition to the measures of speech rate proposed by the protocol (speech rate in words and syllables per minute), the speech rate in phonemes per second and the articulation rate with and without disfluencies were calculated. We used the nonparametric Friedman test and the Wilcoxon test for multiple comparisons. Groups were compared using the nonparametric Kruskal-Wallis test. The significance level was 5%. There were significant differences between measures of speech rate involving syllables, and the multiple comparisons showed that all three measures differed from one another. There was no effect of age on the studied measures. These findings corroborate previous studies. The inclusion of temporal acoustic measures such as speech rate in phonemes per second and articulation rate with and without disfluencies can be a complementary approach in the evaluation of speech rate.
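
The rate measures named above are simple ratios of unit counts to durations. A minimal sketch, assuming manually annotated counts and durations (all numeric inputs below are hypothetical examples, not data from the study):

```python
# Illustrative computation of the speech-rate measures described above.
# Inputs are hypothetical annotated counts and durations in seconds.

def speech_rate(n_units: int, total_dur_s: float) -> float:
    """Units (words, syllables, or phonemes) per minute of total speech time."""
    return n_units / (total_dur_s / 60.0)

def articulation_rate(n_syllables: int, total_dur_s: float,
                      disfluent_dur_s: float = 0.0) -> float:
    """Syllables per second, optionally excluding time spent in disfluencies."""
    return n_syllables / (total_dur_s - disfluent_dur_s)

# Hypothetical sample: 180 syllables in 60 s of speech, 6 s of it disfluent.
print(speech_rate(180, 60.0))             # syllables per minute -> 180.0
print(articulation_rate(180, 60.0))       # with disfluencies -> 3.0 syll/s
print(articulation_rate(180, 60.0, 6.0))  # without disfluencies -> ~3.33 syll/s
```

The same `speech_rate` helper covers phonemes per second if the count is phonemes and the result is divided by 60.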

  18. Hemispheric asymmetries in speech perception: sense, nonsense and modulations.

    Directory of Open Access Journals (Sweden)

    Stuart Rosen

    Full Text Available The well-established left hemisphere specialisation for language processing has long been claimed to be based on a low-level auditory specialisation for specific acoustic features in speech, particularly regarding 'rapid temporal processing'. A novel analysis/synthesis technique was used to construct a variety of sounds based on simple sentences which could be manipulated in spectro-temporal complexity, and whether they were intelligible or not. All sounds consisted of two noise-excited spectral prominences (based on the lower two formants in the original speech) which could be static or varying in frequency and/or amplitude independently. Dynamically varying both acoustic features based on the same sentence led to intelligible speech, but when either or both acoustic features were static, the stimuli were not intelligible. Using the frequency dynamics from one sentence with the amplitude dynamics of another led to unintelligible sounds of comparable spectro-temporal complexity to the intelligible ones. Positron emission tomography (PET) was used to compare which brain regions were active when participants listened to the different sounds. Neural activity to spectral and amplitude modulations sufficient to support speech intelligibility (without actually being intelligible) was seen bilaterally, with a right temporal lobe dominance. A left-dominant response was seen only to intelligible sounds. It thus appears that the left hemisphere specialisation for speech is based on the linguistic properties of utterances, not on particular acoustic features.

  19. Suppressed Alpha Oscillations Predict Intelligibility of Speech and its Acoustic Details

    Science.gov (United States)

    Weisz, Nathan

    2012-01-01

    Modulations of human alpha oscillations (8–13 Hz) accompany many cognitive processes, but their functional role in auditory perception has proven elusive: Do oscillatory dynamics of alpha reflect acoustic details of the speech signal and are they indicative of comprehension success? Acoustically presented words were degraded in acoustic envelope and spectrum in an orthogonal design, and electroencephalogram responses in the frequency domain were analyzed in 24 participants, who rated word comprehensibility after each trial. First, the alpha power suppression during and after a degraded word depended monotonically on spectral and, to a lesser extent, envelope detail. The magnitude of this alpha suppression exhibited an additional and independent influence on later comprehension ratings. Second, source localization of alpha suppression yielded superior parietal, prefrontal, as well as anterior temporal brain areas. Third, multivariate classification of the time–frequency pattern across participants showed that patterns of late posterior alpha power best supported above-chance classification of word intelligibility. Results suggest that both magnitude and topography of late alpha suppression in response to single words can indicate a listener's sensitivity to acoustic features and the ability to comprehend speech under adverse listening conditions. PMID:22100354

  20. COMPARATIVE ANALYSIS OF ARTICULATION AND PHONOLOGY DISORDERS IN FUNCTION OF DIFFERENTIAL DIAGNOSIS

    Directory of Open Access Journals (Sweden)

    Ana POPOSKA

    2010-04-01

    Full Text Available Sound expression is the first impression of speech and language; whatever its origin, faulty pronunciation is its first sign. If a developmental speech-language disorder appears in the early school years, it is often accompanied by disruption of the phonological-articulation segment. This research aims to establish and compare the frequency, type, and articulatory and acoustic characteristics of the disordered sounds both in children with dyslalia and in children with specific language impairment (SLI). This small-scale investigation included 71 examinees aged 6 to 8: 35 with dyslalia and 36 with SLI. Their performance on two relevant tests was compared. Some of the more important conclusions are: children with dyslalia mostly showed distorted sounds, while those with SLI mostly substituted the disordered sound. In dyslalia, fricatives were most affected, whereas in SLI all sound groups were usually disordered. In both tested groups, the most common type of disorder involved misplacement of sound formation. Children with articulation disorders accompanied by poor sound discrimination had difficulties not only with phonetic contrasts but also with other linguistic aspects.

  1. Audiovisual Speech Synchrony Measure: Application to Biometrics

    Directory of Open Access Journals (Sweden)

    Gérard Chollet

    2007-01-01

    Full Text Available Speech is a means of communication which is intrinsically bimodal: the audio signal originates from the dynamics of the articulators. This paper reviews recent works in the field of audiovisual speech, and more specifically techniques developed to measure the level of correspondence between audio and visual speech. It overviews the most common audio and visual speech front-end processing, transformations performed on audio, visual, or joint audiovisual feature spaces, and the actual measure of correspondence between audio and visual speech. Finally, the use of synchrony measure for biometric identity verification based on talking faces is experimented on the BANCA database.

  2. PRESCHOOL SPEECH ARTICULATION AND NONWORD REPETITION ABILITIES MAY HELP PREDICT EVENTUAL RECOVERY OR PERSISTENCE OF STUTTERING

    Science.gov (United States)

    Spencer, Caroline; Weber-Fox, Christine

    2014-01-01

    Purpose In preschool children, we investigated whether expressive and receptive language, phonological, articulatory, and/or verbal working memory proficiencies aid in predicting eventual recovery or persistence of stuttering. Methods Participants included 65 children, including 25 children who do not stutter (CWNS) and 40 who stutter (CWS) recruited at age 3;9–5;8. At initial testing, participants were administered the Test of Auditory Comprehension of Language, 3rd edition (TACL-3), Structured Photographic Expressive Language Test, 3rd edition (SPELT-3), Bankson-Bernthal Test of Phonology-Consonant Inventory subtest (BBTOP-CI), Nonword Repetition Test (NRT; Dollaghan & Campbell, 1998), and Test of Auditory Perceptual Skills-Revised (TAPS-R) auditory number memory and auditory word memory subtests. Stuttering behaviors of CWS were assessed in subsequent years, forming groups whose stuttering eventually persisted (CWS-Per; n=19) or recovered (CWS-Rec; n=21). Proficiency scores in morphosyntactic skills, consonant production, verbal working memory for known words, and phonological working memory and speech production for novel nonwords obtained at the initial testing were analyzed for each group. Results CWS-Per were less proficient than CWNS and CWS-Rec in measures of consonant production (BBTOP-CI) and repetition of novel phonological sequences (NRT). In contrast, receptive language, expressive language, and verbal working memory abilities did not distinguish CWS-Rec from CWS-Per. Binary logistic regression analysis indicated that preschool BBTOP-CI scores and overall NRT proficiency significantly predicted future recovery status. Conclusion Results suggest that phonological and speech articulation abilities in the preschool years should be considered with other predictive factors as part of a comprehensive risk assessment for the development of chronic stuttering. PMID:25173455

  3. A NOVEL APPROACH TO STUTTERED SPEECH CORRECTION

    Directory of Open Access Journals (Sweden)

    Alim Sabur Ajibola

    2016-06-01

    Full Text Available Stuttered speech is a dysfluency-rich speech, more prevalent in males than females. It has been associated with insufficient air pressure or poor articulation, even though the root causes are more complex. The primary features include prolonged speech and repetitive speech, while some of its secondary features include anxiety, fear, and shame. This study used LPC analysis and synthesis algorithms to reconstruct the stuttered speech. The results were evaluated using cepstral distance, Itakura-Saito distance, mean square error, and likelihood ratio. These measures implied perfect speech reconstruction quality. ASR was used for further testing, and the results showed that all the reconstructed speech samples were perfectly recognized, while only three samples of the original speech were perfectly recognized.
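
The LPC analysis/synthesis pipeline the study relies on can be sketched as a plain-NumPy round trip: estimate all-pole coefficients with the Levinson-Durbin recursion, inverse-filter to get the residual ("analysis"), then all-pole filter the residual back ("synthesis"). This is an illustrative sketch of standard LPC, not the authors' implementation; the toy signal, model order, and whole-signal (rather than frame-by-frame) processing are assumptions.

```python
import numpy as np

def autocorr(x, order):
    """First order+1 autocorrelation lags of x."""
    full = np.correlate(x, x, mode="full")
    mid = len(x) - 1
    return full[mid:mid + order + 1]

def levinson(r, order):
    """Levinson-Durbin recursion: LPC coefficients a (with a[0] = 1)."""
    a = np.zeros(order + 1)
    a[0] = 1.0
    err = r[0]
    for i in range(1, order + 1):
        acc = r[i] + np.dot(a[1:i], r[i - 1:0:-1])
        k = -acc / err
        a[1:i] = a[1:i] + k * a[i - 1:0:-1]  # update reflection terms
        a[i] = k
        err *= (1.0 - k * k)
    return a

def analysis(x, a):
    """Inverse (analysis) filter: residual e[n] = sum_k a[k] * x[n-k]."""
    p = len(a) - 1
    e = np.zeros_like(x)
    for n in range(len(x)):
        for k in range(min(p, n) + 1):
            e[n] += a[k] * x[n - k]
    return e

def synthesis(e, a):
    """All-pole (synthesis) filter: exact inverse of the analysis filter."""
    p = len(a) - 1
    x = np.zeros_like(e)
    for n in range(len(e)):
        x[n] = e[n]
        for k in range(1, min(p, n) + 1):
            x[n] -= a[k] * x[n - k]
    return x

# Toy "speech": a 200 Hz tone with a little noise (hypothetical signal).
rng = np.random.default_rng(0)
fs = 8000
x = np.sin(2 * np.pi * 200 * np.arange(2000) / fs) \
    + 0.05 * rng.standard_normal(2000)

a = levinson(autocorr(x, 12), 12)
residual = analysis(x, a)       # analysis stage
x_hat = synthesis(residual, a)  # synthesis stage reconstructs x
```

Because synthesis is the exact algebraic inverse of analysis, the round trip reconstructs the signal to floating-point precision, which is consistent with the near-perfect reconstruction quality the distance measures reported.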

  4. Prevalence of Speech Disorders in Arak Primary School Students, 2014-2015

    Directory of Open Access Journals (Sweden)

    Abdoreza Yavari

    2016-09-01

    Full Text Available Abstract Background: Speech disorders may produce irreparable damage to a child's speech and language development from a psychosocial point of view. Voice, speech sound production, and fluency disorders are speech disorders that may result from delay or impairment of the speech motor control mechanism, central nervous system disorders, improper language stimulation, or voice abuse. Materials and Methods: This study examined the prevalence of speech disorders in 1393 Arak students in grades 1 to 6 of primary school. After collecting continuous speech samples, picture description, passage reading, and a phonetic test, we recorded the pathological signs of stuttering, articulation disorder, and voice disorders on a special sheet. Results: The prevalence of articulation, voice, and stuttering disorders was 8%, 3.5%, and 1%, respectively, and the overall prevalence of speech disorders was 11.9%. The prevalence of speech disorders decreased with increasing student grade. 12.2% of boy students and 11.7% of girl students of primary school in Arak had speech disorders. Conclusion: The prevalence of speech disorders among primary school students in Arak is similar to that in Kermanshah, but smaller than in many similar studies in Iran. It seems that racial and cultural diversity has some effect on the prevalence of speech disorders in Arak city.

  5. Changes in speech following maxillary distraction osteogenesis.

    Science.gov (United States)

    Guyette, T W; Polley, J W; Figueroa, A; Smith, B E

    2001-05-01

    The purpose of this study was to describe changes in articulation and velopharyngeal function following maxillary distraction osteogenesis. This is a descriptive, post hoc clinical report comparing the performance of patients before and after maxillary distraction. The independent variable was maxillary distraction, while the dependent variables were resonance, articulation errors, and velopharyngeal function. The data were collected at a tertiary health care center in Chicago. The data from pre- and postoperative evaluations of 18 maxillary distraction patients were used. The outcome measures were severity of hypernasality and hyponasality, velopharyngeal orifice size as estimated using the pressure-flow technique, and number and type of articulation errors. At the long-term follow-up, 16.7% exhibited a significant increase in hypernasality. Seventy-five percent of patients with preoperative hyponasality experienced improved nasal resonance. Articulation improved in 67% of patients by the 1-year follow-up. In a predominantly cleft palate population, the risk for velopharyngeal insufficiency following maxillary distraction is similar to the risk observed in Le Fort I maxillary advancement. Patients being considered for maxillary distraction surgery should receive pre- and postoperative speech evaluations and be counseled about risks for changes in their speech.

  6. Partially overlapping sensorimotor networks underlie speech praxis and verbal short-term memory: evidence from apraxia of speech following acute stroke.

    Science.gov (United States)

    Hickok, Gregory; Rogalsky, Corianne; Chen, Rong; Herskovits, Edward H; Townsley, Sarah; Hillis, Argye E

    2014-01-01

    We tested the hypothesis that motor planning and programming of speech articulation and verbal short-term memory (vSTM) depend on partially overlapping networks of neural regions. We evaluated this proposal by testing 76 individuals with acute ischemic stroke for impairment in motor planning of speech articulation (apraxia of speech, AOS) and vSTM in the first day of stroke, before the opportunity for recovery or reorganization of structure-function relationships. We also evaluated areas of both infarct and low blood flow that might have contributed to AOS or impaired vSTM in each person. We found that AOS was associated with tissue dysfunction in motor-related areas (posterior primary motor cortex, pars opercularis; premotor cortex, insula) and sensory-related areas (primary somatosensory cortex, secondary somatosensory cortex, parietal operculum/auditory cortex); while impaired vSTM was associated with primarily motor-related areas (pars opercularis and pars triangularis, premotor cortex, and primary motor cortex). These results are consistent with the hypothesis, also supported by functional imaging data, that both speech praxis and vSTM rely on partially overlapping networks of brain regions.

  7. Evaluation of articulation simulation system using artificial maxillectomy models.

    Science.gov (United States)

    Elbashti, M E; Hattori, M; Sumita, Y I; Taniguchi, H

    2015-09-01

    Acoustic evaluation is valuable for guiding the treatment of maxillofacial defects and determining the effectiveness of rehabilitation with an obturator prosthesis. Model simulations are important in terms of pre-surgical planning and pre- and post-operative speech function. This study aimed to evaluate the acoustic characteristics of voice generated by an articulation simulation system using a vocal tract model with or without artificial maxillectomy defects. More specifically, we aimed to establish a speech simulation system for maxillectomy defect models that both surgeons and maxillofacial prosthodontists can use in guiding treatment planning. Artificially simulated maxillectomy defects were prepared according to Aramany's classification (Classes I-VI) in a three-dimensional vocal tract plaster model of a subject uttering the vowel /a/. Formant and nasalance acoustic data were analysed using Computerized Speech Lab and the Nasometer, respectively. Formants and nasalance of simulated /a/ sounds were successfully detected and analysed. Values of Formants 1 and 2 for the non-defect model were 675.43 and 976.64 Hz, respectively. Median values of Formants 1 and 2 for the defect models were 634.36 and 1026.84 Hz, respectively. Nasalance was 11% in the non-defect model, whereas median nasalance was 28% in the defect models. The results suggest that an articulation simulation system can be used to help surgeons and maxillofacial prosthodontists plan post-surgical defects that will facilitate maxillofacial rehabilitation. © 2015 John Wiley & Sons Ltd.

  8. Are you a good mimic? Neuro-acoustic signatures for speech imitation ability

    Directory of Open Access Journals (Sweden)

    Susanne Maria Reiterer

    2013-10-01

    Full Text Available We investigated individual differences in speech imitation ability in late bilinguals using a neuro-acoustic approach. 138 German-English bilinguals matched on various behavioral measures were tested for speech imitation ability in a foreign language, Hindi, and categorised into high and low ability groups. Brain activations and speech recordings were obtained from 26 participants from the two extreme groups as they performed a functional neuroimaging experiment which required them to imitate sentences in three conditions: (A) German, (B) English, and (C) German with a fake English accent. We used a recently developed acoustic analysis, namely the 'articulation space', as a metric to compare the speech imitation abilities of the two groups. Across all three conditions, direct comparisons between the two groups revealed brain activations (FWE corrected, p < 0.05) that were more widespread, with significantly higher peak activity in the left supramarginal gyrus and postcentral areas, for the low ability group. The high ability group, on the other hand, showed a significantly larger articulation space in all three conditions. In addition, articulation space correlated positively with imitation ability (Pearson's r = 0.7, p < 0.01). Our results suggest that an expanded articulation space for high ability individuals allows access to a larger repertoire of sounds, thereby providing skilled imitators greater flexibility in pronunciation and language learning.

  9. Infants' preference for native audiovisual speech dissociated from congruency preference.

    Directory of Open Access Journals (Sweden)

    Kathleen Shaw

    Full Text Available Although infant speech perception is often studied in isolated modalities, infants' experience with speech is largely multimodal (i.e., the speech sounds they hear are accompanied by articulating faces). Across two experiments, we tested infants' sensitivity to the relationship between the auditory and visual components of audiovisual speech in their native (English) and non-native (Spanish) language. In Experiment 1, infants' looking times were measured during a preferential looking task in which they saw two simultaneous visual speech streams articulating a story, one in English and the other in Spanish, while they heard either the English or the Spanish version of the story. In Experiment 2, looking times from another group of infants were measured as they watched single displays of congruent and incongruent combinations of English and Spanish audio and visual speech streams. Findings demonstrated an age-related increase in looking towards the native relative to the non-native visual speech stream when accompanied by the corresponding (native) auditory speech. This increase in native language preference did not appear to be driven by a difference in preference for native vs. non-native audiovisual congruence, as we observed no difference in looking times at the audiovisual streams in Experiment 2.

  10. Comment on "Monkey vocal tracts are speech-ready".

    Science.gov (United States)

    Lieberman, Philip

    2017-07-01

    Monkey vocal tracts are capable of producing monkey speech, not the full range of articulate human speech. The evolution of human speech entailed both anatomy and brains. Fitch, de Boer, Mathur, and Ghazanfar in Science Advances claim that "monkey vocal tracts are speech-ready," and conclude that "…the evolution of human speech capabilities required neural change rather than modifications of vocal anatomy." Neither premise is consistent either with the data presented and the conclusions reached by de Boer and Fitch themselves in their own published papers on the role of anatomy in the evolution of human speech or with the body of independent studies published since the 1950s.

  11. Speech profile of patients undergoing primary palatoplasty.

    Science.gov (United States)

    Menegueti, Katia Ignacio; Mangilli, Laura Davison; Alonso, Nivaldo; Andrade, Claudia Regina Furquim de

    2017-10-26

    To characterize the profile and speech characteristics of patients undergoing primary palatoplasty in a Brazilian university hospital, considering the time of intervention (early, before two years of age; late, after two years of age). Participants were 97 patients of both genders with cleft palate and/or cleft lip and palate, assigned to the Speech-language Pathology Department, who had been submitted to primary palatoplasty and presented no prior history of speech-language therapy. Patients were divided into two groups: early intervention group (EIG) - 43 patients undergoing primary palatoplasty before 2 years of age and late intervention group (LIG) - 54 patients undergoing primary palatoplasty after 2 years of age. All patients underwent speech-language pathology assessment. The following parameters were assessed: resonance classification, presence of nasal turbulence, presence of weak intraoral air pressure, presence of audible nasal air emission, speech understandability, and compensatory articulation disorder (CAD). At the statistical significance level of 5% (p≤0.05), no significant difference was observed between the groups in the following parameters: resonance classification (p=0.067); level of hypernasality (p=0.113); presence of nasal turbulence (p=0.179); presence of weak intraoral air pressure (p=0.152); presence of nasal air emission (p=0.369); and speech understandability (p=0.113). The groups differed with respect to the presence of compensatory articulation disorders (p=0.020), with the LIG presenting a higher occurrence of altered phonemes. It was possible to assess the general profile and speech characteristics of the study participants. Patients submitted to early primary palatoplasty present a better speech profile.

  12. The impact of exploiting spectro-temporal context in computational speech segregation

    DEFF Research Database (Denmark)

    Bentsen, Thomas; Kressner, Abigail Anne; Dau, Torsten

    2018-01-01

    Computational speech segregation aims to automatically segregate speech from interfering noise, often by employing ideal binary mask estimation. Several studies have tried to exploit contextual information in speech to improve mask estimation accuracy by using two frequently-used strategies that (1...... for measured intelligibility. The findings may have implications for the design of speech segregation systems, and for the selection of a cost function that correlates with intelligibility....
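
The ideal binary mask mentioned above has a simple definition: a time-frequency unit is kept when its local speech-to-noise ratio exceeds a local criterion, and discarded otherwise. A minimal sketch, assuming magnitude spectrograms and premixed signals (a real segregation system must estimate the mask from the mixture alone; the toy matrices are hypothetical):

```python
import numpy as np

def ideal_binary_mask(speech_mag, noise_mag, lc_db=0.0):
    """Ideal binary mask over a time-frequency representation: 1 where the
    local speech-to-noise ratio in dB exceeds the local criterion lc_db,
    else 0. Computed here from premixed signals for illustration only."""
    eps = 1e-12  # avoid log of zero
    local_snr_db = 20.0 * np.log10((speech_mag + eps) / (noise_mag + eps))
    return (local_snr_db > lc_db).astype(float)

# Hypothetical 3x3 magnitude spectrograms (frequency x time).
S = np.array([[2.0, 0.1, 1.0],
              [0.5, 3.0, 0.2],
              [1.0, 1.0, 0.1]])
N = np.array([[1.0, 1.0, 1.0],
              [1.0, 1.0, 1.0],
              [0.5, 2.0, 1.0]])
mask = ideal_binary_mask(S, N)
```

Applying `mask` elementwise to the mixture spectrogram retains only the speech-dominated units, which is the target that mask-estimation systems are trained to approximate.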

  13. Cognitive functions in Childhood Apraxia of Speech

    NARCIS (Netherlands)

    Nijland, L.; Terband, H.; Maassen, B.

    2015-01-01

    Purpose: Childhood Apraxia of Speech (CAS) is diagnosed on the basis of specific speech characteristics, in the absence of problems in hearing, intelligence, and language comprehension. This does not preclude the possibility that children with this speech disorder might demonstrate additional

  14. Audiovisual Asynchrony Detection in Human Speech

    Science.gov (United States)

    Maier, Joost X.; Di Luca, Massimiliano; Noppeney, Uta

    2011-01-01

    Combining information from the visual and auditory senses can greatly enhance intelligibility of natural speech. Integration of audiovisual speech signals is robust even when temporal offsets are present between the component signals. In the present study, we characterized the temporal integration window for speech and nonspeech stimuli with…

  15. A music perception disorder (congenital amusia) influences speech comprehension.

    Science.gov (United States)

    Liu, Fang; Jiang, Cunmei; Wang, Bei; Xu, Yi; Patel, Aniruddh D

    2015-01-01

    This study investigated the underlying link between speech and music by examining whether and to what extent congenital amusia, a musical disorder characterized by degraded pitch processing, would impact spoken sentence comprehension for speakers of Mandarin, a tone language. Sixteen Mandarin-speaking amusics and 16 matched controls were tested on the intelligibility of news-like Mandarin sentences with natural and flat fundamental frequency (F0) contours (created via speech resynthesis) under four signal-to-noise (SNR) conditions (no noise, +5, 0, and -5dB SNR). While speech intelligibility in quiet and extremely noisy conditions (SNR=-5dB) was not significantly compromised by flattened F0, both amusic and control groups achieved better performance with natural-F0 sentences than flat-F0 sentences under moderately noisy conditions (SNR=+5 and 0dB). Relative to normal listeners, amusics demonstrated reduced speech intelligibility in both quiet and noise, regardless of whether the F0 contours of the sentences were natural or flattened. This deficit in speech intelligibility was not associated with impaired pitch perception in amusia. These findings provide evidence for impaired speech comprehension in congenital amusia, suggesting that the deficit of amusics extends beyond pitch processing and includes segmental processing. Copyright © 2014 Elsevier Ltd. All rights reserved.
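
Creating the four SNR conditions above amounts to scaling the masker relative to the speech before mixing. A minimal sketch of mixing at a target SNR (the tone-plus-noise stimuli are hypothetical stand-ins for the sentence materials):

```python
import numpy as np

def mix_at_snr(speech, noise, snr_db):
    """Scale the masker so the speech-to-noise power ratio equals snr_db,
    then add it to the speech. Equal-length 1-D signals assumed."""
    p_speech = np.mean(speech ** 2)
    p_noise = np.mean(noise ** 2)
    gain = np.sqrt(p_speech / (p_noise * 10.0 ** (snr_db / 10.0)))
    return speech + gain * noise

# Hypothetical stimuli: a 220 Hz tone as "speech", white noise as masker.
rng = np.random.default_rng(1)
speech = np.sin(2 * np.pi * 220 * np.arange(16000) / 16000.0)
mixture = mix_at_snr(speech, rng.standard_normal(16000), snr_db=-5.0)
```

The same call with `snr_db` set to +5 or 0 produces the other noisy conditions; the no-noise condition is simply the unmixed speech.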

  16. Speech outcomes of early palatal repair with or without intravelar veloplasty in children with complete unilateral cleft lip and palate.

    Science.gov (United States)

    Doucet, Jean-Charles; Herlin, Christian; Captier, Guillaume; Baylon, Hélène; Verdeil, Mélanie; Bigorre, Michèle

    2013-12-01

    We compared the early speech outcomes of 40 consecutive children with complete unilateral cleft lip and palate (UCLP) who had been treated according to different 2-stage protocols: the Malek protocol (soft palate closure without intravelar veloplasty at 3 months; lip and hard palate repair at 6 months) (n=20), and the Talmant protocol (cheilorhinoplasty and soft palate repair with intravelar veloplasty at 6 months; hard palate closure at 18 months) (n=20). We compared the speech assessments obtained at a mean (SD) age of 3.3 (0.35) years after treatment by the same surgeon. The main outcome measures evaluated were acquisition and intelligibility of speech, velopharyngeal insufficiency, and incidence of complications. A delay in speech articulation of one year or more was seen more often in patients treated by the Malek protocol (11/20) than in those treated according to the Talmant protocol (3/20, p=0.019). Good intelligibility was noted in 15/20 in the Talmant group compared with 6/20 in the Malek group (p=0.010). Assessment with an aerophonoscope showed that nasal air emission was most pronounced in patients in the Malek group (p=0.007). Velopharyngeal insufficiency was present in 11/20 in the Malek group, and in 3/20 in the Talmant group (p=0.019). No patients in the Talmant group had an oronasal fistula. In complete unilateral cleft lip and palate, early speech outcomes were better in the Talmant group because intravelar veloplasty was successful and there were no fistulas after closure of the hard palate in 2 layers. Copyright © 2013 The British Association of Oral and Maxillofacial Surgeons. All rights reserved.

  17. Language and Speech Improvement for Kindergarten and First Grade. A Supplementary Handbook.

    Science.gov (United States)

    Cole, Roberta; And Others

    The 16-unit language and speech improvement handbook for kindergarten and first grade students contains an introductory section which includes a discussion of the child's developmental speech and language characteristics, a sound development chart, a speech and hearing language screening test, the Henja articulation test, and a general outline of…

  18. The superior precentral gyrus of the insula does not appear to be functionally specialized for articulation.

    Science.gov (United States)

    Fedorenko, Evelina; Fillmore, Paul; Smith, Kimberly; Bonilha, Leonardo; Fridriksson, Julius

    2015-04-01

    Broca (Broca P. Bull Soc Anat Paris 36: 330-357, 1861) influentially argued that posterior left inferior frontal gyrus supports speech articulation. According to an alternative proposal (e.g., Dronkers NF. Nature 384: 159-161, 1996; Wise RJ, Greene J, Buchel C, Scott SK. Lancet 353: 1057-1061, 1999; Baldo JV, Wilkins DP, Ogar J, Willock S, Dronkers NF. Cortex 47: 800-807, 2011), a region in the anterior insula [specifically, the superior precentral gyrus of the insula (SPGI)] is the seat of articulatory abilities. Moreover, Dronkers and colleagues have argued that the SPGI is functionally specialized for (complex) speech articulation. Here, we evaluate this claim using individual-subject functional MRI (fMRI) analyses (e.g., Fedorenko E, Hsieh PJ, Nieto-Castanon A, Whitfield-Gabrieli S, Kanwisher N. J Neurophysiol 104: 1177-1194, 2010). We find that the SPGI responds weakly, if at all, during articulation (parts of Broca's area respond 3-4 times more strongly) and does not show a stronger response to higher articulatory demands. This holds regardless of whether the SPGI is defined functionally (by selecting the most articulation-responsive voxels in the vicinity of the SPGI in each subject individually) or anatomically (by using masks drawn on each individual subject's anatomy). Critically, nonspeech oral movements activate the SPGI more strongly than articulation, especially under the anatomical definition of the SPGI. In line with Hillis et al. (Hillis AE, Work M, Barker PB, Jacobs MA, Breese EL, Maurer K. Brain 127: 1479-1487, 2004; also Trupe L, Varma DD, Gomez Y, Race D, Leigh R, Hillis AE, Gottesman RF. Stroke 44: 740-744, 2013), we argue that previous links between the SPGI, and perhaps anterior insula more generally, and articulation may be due to its high base rate of ischemic damage (and activation in fMRI; Yarkoni T, Poldrack RA, Nichols TE, Van Essen DC, Wager TD. 
Nat Methods 8: 665-670, 2011), combined with its proximity to regions that more directly

  19. Team Management of a Young Adult with a Traumatic Cleft Repair

    Directory of Open Access Journals (Sweden)

    Nandini V Kamat

    2011-01-01

    This case report describes the management of an adult patient with a traumatically repaired cleft. The maxilla was deficient in all three planes and the mandible appeared protrusive. The patient's speech had nasalance and articulation problems. The patient was informed about the possibility of speech deterioration after surgery, and the available options were explained. After appropriate presurgical orthodontics, the maxilla was surgically expanded and also advanced by 5 mm. After surgery, speech was more intelligible because of improved articulation. An optimum esthetic and functional result was achieved through a coordinated team approach.

  20. Analysis of a simplified normalized covariance measure based on binary weighting functions for predicting the intelligibility of noise-suppressed speech.

    Science.gov (United States)

    Chen, Fei; Loizou, Philipos C

    2010-12-01

    The normalized covariance measure (NCM) has been shown previously to predict reliably the intelligibility of noise-suppressed speech containing non-linear distortions. This study analyzes a simplified NCM measure that requires only a small number of bands (not necessarily contiguous) and uses simple binary (1 or 0) weighting functions. The rationale behind the use of a small number of bands is to account for the fact that the spectral information contained in contiguous or nearby bands is correlated and redundant. The modified NCM measure was evaluated with speech intelligibility scores obtained by normal-hearing listeners in 72 noisy conditions involving noise-suppressed speech corrupted by four different types of maskers (car, babble, train, and street interferences). High correlation (r = 0.8) was obtained with the modified NCM measure even when only one band was used. Further analysis revealed a masker-specific pattern of correlations when only one band was used, and bands with low correlation signified the corresponding envelopes that have been severely distorted by the noise-suppression algorithm and/or the masker. Correlation improved to r = 0.84 when only two disjoint bands (centered at 325 and 1874 Hz) were used. Even further improvements in correlation (r = 0.85) were obtained when three or four lower-frequency (<700 Hz) bands were selected.
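
Under the simplification the study describes (a few bands, binary 0/1 weights), the core per-band computation is a normalized covariance between the clean and processed envelopes. A minimal sketch of that stage only; the full NCM additionally maps each band's covariance to an apparent SNR before weighting, and the envelope extraction (filterbank plus Hilbert envelope) is omitted here:

```python
import numpy as np

def normalized_covariance(env_clean, env_proc):
    """Normalized covariance (Pearson correlation) between the clean and
    processed envelopes of one frequency band."""
    c = env_clean - env_clean.mean()
    p = env_proc - env_proc.mean()
    return float(np.dot(c, p) / (np.linalg.norm(c) * np.linalg.norm(p)))

def simplified_ncm(clean_envs, proc_envs, band_weights):
    """Binary-weighted average of per-band normalized covariances, e.g.
    with weights selecting only the bands near 325 and 1874 Hz."""
    rs = np.array([normalized_covariance(c, p)
                   for c, p in zip(clean_envs, proc_envs)])
    w = np.asarray(band_weights, dtype=float)
    return float(np.dot(w, rs) / w.sum())

# Toy demo (hypothetical envelopes): identical envelopes give NCM ~ 1.
env = np.sin(2 * np.pi * 4 * np.linspace(0.0, 1.0, 100)) + 2.0
print(simplified_ncm([env, env], [env, env], [1, 1]))
```

Severely distorted bands drive their covariance toward zero or below, which is why the per-band values can flag envelopes damaged by the noise-suppression algorithm or the masker.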

  1. An Ecosystem of Intelligent ICT Tools for Speech-Language Therapy Based on a Formal Knowledge Model.

    Science.gov (United States)

    Robles-Bykbaev, Vladimir; López-Nores, Martín; Pazos-Arias, José; Quisi-Peralta, Diego; García-Duque, Jorge

    2015-01-01

    Language and communication constitute the mainstays of development for several intellectual and cognitive skills in humans. However, millions of people around the world suffer from disabilities and disorders related to language and communication, while most countries lack corresponding health-care and rehabilitation services. On these grounds, we are working to develop an ecosystem of intelligent ICT tools to support speech-language pathologists, doctors, students, patients, and their relatives. This ecosystem has several layers and components, integrating Electronic Health Records management, standardized vocabularies, a knowledge database, an ontology of concepts from the speech-language domain, and an expert system. We discuss the advantages of this approach through experiments carried out in several institutions assisting children with a wide spectrum of disabilities.

  2. The impact of language co-activation on L1 and L2 speech fluency

    NARCIS (Netherlands)

    Bergmann, Christopher; Sprenger, Simone A.; Schmid, Monika S.

    2015-01-01

    Fluent speech depends on the availability of well-established linguistic knowledge and routines for speech planning and articulation. A lack of speech fluency in late second-language (L2) learners may point to a deficiency of these representations, due to incomplete acquisition. Experiments on…

  3. Between-Word Simplification Patterns in the Continuous Speech of Children with Speech Sound Disorders

    Science.gov (United States)

    Klein, Harriet B.; Liu-Shea, May

    2009-01-01

    Purpose: This study was designed to identify and describe between-word simplification patterns in the continuous speech of children with speech sound disorders. It was hypothesized that word combinations would reveal phonological changes that were unobserved with single words, possibly accounting for discrepancies between the intelligibility of…

  4. Speech and Swallowing in Parkinson’s Disease

    OpenAIRE

    Tjaden, Kris

    2008-01-01

    Dysarthria and dysphagia occur frequently in Parkinson’s disease (PD). Reduced speech intelligibility is a significant functional limitation of dysarthria, and in the case of PD is likely related to articulatory and phonatory impairment. Prosodically-based treatments show the most promise for addressing these deficits as well as for maximizing speech intelligibility. Communication-oriented strategies also may help to enhance mutual understanding between a speaker and listener. Dysphagia in PD ca...

  5. Cognitive Functions in Childhood Apraxia of Speech

    Science.gov (United States)

    Nijland, Lian; Terband, Hayo; Maassen, Ben

    2015-01-01

    Purpose: Childhood apraxia of speech (CAS) is diagnosed on the basis of specific speech characteristics, in the absence of problems in hearing, intelligence, and language comprehension. This does not preclude the possibility that children with this speech disorder might demonstrate additional problems. Method: Cognitive functions were investigated…

  6. Binary Masking & Speech Intelligibility

    DEFF Research Database (Denmark)

    Boldt, Jesper

    The purpose of this thesis is to examine how binary masking can be used to increase intelligibility in situations where hearing impaired listeners have difficulties understanding what is being said. The major part of the experiments carried out in this thesis can be categorized as either experiments under ideal conditions or as experiments under more realistic conditions useful for real-life applications such as hearing aids. In the experiments under ideal conditions, the previously defined ideal binary mask is evaluated using hearing impaired listeners, and a novel binary mask -- the target binary mask -- is introduced. The target binary mask shows the same substantial increase in intelligibility as the ideal binary mask and is proposed as a new reference for binary masking. In the category of real-life applications, two new methods are proposed: a method for estimation of the ideal binary…
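    An ideal binary mask retains the time-frequency units where the target dominates the masker and discards the rest. A minimal sketch under assumed parameters (a 512-sample STFT window and a 0 dB local-SNR criterion; not the thesis's exact configuration):

```python
import numpy as np
from scipy.signal import stft, istft

def ideal_binary_mask(target, noise, fs, lc_db=0.0):
    """Mask = 1 where the local target-to-noise ratio exceeds lc_db."""
    _, _, T = stft(target, fs, nperseg=512)
    _, _, N = stft(noise, fs, nperseg=512)
    local_snr = 20 * np.log10((np.abs(T) + 1e-12) / (np.abs(N) + 1e-12))
    return (local_snr > lc_db).astype(float)

def apply_mask(mixture, mask, fs):
    """Resynthesize the mixture with the binary mask applied."""
    _, _, M = stft(mixture, fs, nperseg=512)
    _, x = istft(M * mask, fs, nperseg=512)
    return x

rng = np.random.default_rng(1)
fs = 8000
target = np.sin(2 * np.pi * 440 * np.arange(fs) / fs)  # toy target
noise = 0.5 * rng.standard_normal(fs)                   # toy masker
mask = ideal_binary_mask(target, noise, fs)
enhanced = apply_mask(target + noise, mask, fs)
```

    The mask is "ideal" only because it uses the separately known target and masker; estimating it blindly from the mixture is the real-life problem the thesis addresses.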

  7. The Effectiveness of Clear Speech as a Masker

    Science.gov (United States)

    Calandruccio, Lauren; Van Engen, Kristin; Dhar, Sumitrajit; Bradlow, Ann R.

    2010-01-01

    Purpose: It is established that speaking clearly is an effective means of enhancing intelligibility. Because any signal-processing scheme modeled after known acoustic-phonetic features of clear speech will likely affect both target and competing speech, it is important to understand how speech recognition is affected when a competing speech signal…

  8. Partially Overlapping Sensorimotor Networks Underlie Speech Praxis and Verbal Short-Term Memory: Evidence from Apraxia of Speech Following Acute Stroke

    Directory of Open Access Journals (Sweden)

    Gregory eHickok

    2014-08-01

    We tested the hypothesis that motor planning and programming of speech articulation and verbal short-term memory (vSTM) depend on partially overlapping networks of neural regions. We evaluated this proposal by testing 76 individuals with acute ischemic stroke for impairment in motor planning of speech articulation (apraxia of speech; AOS) and vSTM in the first day of stroke, before the opportunity for recovery or reorganization of structure-function relationships. We also evaluated areas of both infarct and low blood flow that might have contributed to AOS or impaired vSTM in each person. We found that AOS was associated with tissue dysfunction in motor-related areas (posterior primary motor cortex, pars opercularis, premotor cortex, insula) and sensory-related areas (primary somatosensory cortex, secondary somatosensory cortex, parietal operculum/auditory cortex), while impaired vSTM was associated with primarily motor-related areas (pars opercularis and pars triangularis, premotor cortex, and primary motor cortex). These results are consistent with the hypothesis, also supported by functional imaging data, that both speech praxis and vSTM rely on partially overlapping networks of brain regions.

  9. [Speech perception development in children with dyslexia].

    Science.gov (United States)

    Ortiz, Rosario; Jiménez, Juan E; Muñetón, Mercedes; Rojas, Estefanía; Estévez, Adelina; Guzmán, Remedios; Rodríguez, Cristina; Naranjo, Francisco

    2008-11-01

    Several studies have indicated that dyslexics show a deficit in speech perception (SP). The main purpose of this research was to determine the development of SP in dyslexics and normal readers matched by grade, from 2nd to 6th grade of primary school, and to establish whether the phonetic contrasts relevant for SP change during development, taking individual differences into account. The achievement of both groups was compared on the phonetic tasks: voicing contrast, place of articulation contrast, and manner of articulation contrast. The results showed that the dyslexics performed more poorly than the normal readers in SP. In the place of articulation contrast, the developmental pattern was similar in both groups, but not in voicing and manner of articulation. Manner of articulation has more influence on SP, and it develops further than the other contrast tasks in both groups.

  10. Dynamically adapted context-specific hyper-articulation: Feedback from interlocutors affects speakers’ subsequent pronunciations

    Science.gov (United States)

    Buz, Esteban; Tanenhaus, Michael K.; Jaeger, T. Florian

    2016-01-01

    We ask whether speakers can adapt their productions when feedback from their interlocutors suggests that previous productions were perceptually confusable. To address this question, we use a novel web-based task-oriented paradigm for speech recording, in which participants produce instructions towards a (simulated) partner with naturalistic response times. We manipulate (1) whether a target word with a voiceless plosive (e.g., pill) occurs in the presence of a voiced competitor (bill) or an unrelated word (food) and (2) whether or not the simulated partner occasionally misunderstands the target word. Speakers hyper-articulated the target word when a voiced competitor was present. Moreover, the size of the hyper-articulation effect was nearly doubled when partners occasionally misunderstood the instruction. A novel type of distributional analysis further suggests that hyper-articulation did not change the target of production, but rather reduced the probability of perceptually ambiguous or confusable productions. These results were obtained in the absence of explicit clarification requests, and persisted across words and over trials. Our findings suggest that speakers adapt their pronunciations based on the perceived communicative success of their previous productions in the current environment. We discuss why speakers make adaptive changes to their speech and what mechanisms might underlie speakers’ ability to do so. PMID:27375344

  11. When Does Speech Sound Disorder Matter for Literacy? The Role of Disordered Speech Errors, Co-Occurring Language Impairment and Family Risk of Dyslexia

    Science.gov (United States)

    Hayiou-Thomas, Marianna E.; Carroll, Julia M.; Leavett, Ruth; Hulme, Charles; Snowling, Margaret J.

    2017-01-01

    Background: This study considers the role of early speech difficulties in literacy development, in the context of additional risk factors. Method: Children were identified with speech sound disorder (SSD) at the age of 3½ years, on the basis of performance on the Diagnostic Evaluation of Articulation and Phonology. Their literacy skills were…

  12. Speech motor coordination in Dutch-speaking children with DAS studied with EMMA

    NARCIS (Netherlands)

    Nijland, L.; Maassen, B.A.M.; Hulstijn, W.; Peters, H.F.M.

    2004-01-01

    Developmental apraxia of speech (DAS) is generally classified as a 'speech motor' disorder. Direct measurement of articulatory movement is, however, virtually non-existent. In the present study we investigated the coordination between articulators in children with DAS using kinematic measurements.

  13. Ordinal models of audiovisual speech perception

    DEFF Research Database (Denmark)

    Andersen, Tobias

    2011-01-01

    Audiovisual information is integrated in speech perception. One manifestation of this is the McGurk illusion in which watching the articulating face alters the auditory phonetic percept. Understanding this phenomenon fully requires a computational model with predictive power. Here, we describe...

  14. Deep Brain Stimulation of the Subthalamic Nucleus Parameter Optimization for Vowel Acoustics and Speech Intelligibility in Parkinson's Disease

    Science.gov (United States)

    Knowles, Thea; Adams, Scott; Abeyesekera, Anita; Mancinelli, Cynthia; Gilmore, Greydon; Jog, Mandar

    2018-01-01

    Purpose: The settings of 3 electrical stimulation parameters were adjusted in 12 speakers with Parkinson's disease (PD) with deep brain stimulation of the subthalamic nucleus (STN-DBS) to examine their effects on vowel acoustics and speech intelligibility. Method: Participants were tested under permutations of low, mid, and high STN-DBS frequency,…

  15. Rhythm Perception and Its Role in Perception and Learning of Dysrhythmic Speech.

    Science.gov (United States)

    Borrie, Stephanie A; Lansford, Kaitlin L; Barrett, Tyson S

    2017-03-01

    The perception of rhythm cues plays an important role in recognizing spoken language, especially in adverse listening conditions. Indeed, this has been shown to hold true even when the rhythm cues themselves are dysrhythmic. This study investigates whether expertise in rhythm perception provides a processing advantage for perception (initial intelligibility) and learning (intelligibility improvement) of naturally dysrhythmic speech, dysarthria. Fifty young adults with typical hearing participated in 3 key tests, including a rhythm perception test, a receptive vocabulary test, and a speech perception and learning test, with standard pretest, familiarization, and posttest phases. Initial intelligibility scores were calculated as the proportion of correct pretest words, while intelligibility improvement scores were calculated by subtracting this proportion from the proportion of correct posttest words. Rhythm perception scores predicted intelligibility improvement scores but not initial intelligibility. On the other hand, receptive vocabulary scores predicted initial intelligibility scores but not intelligibility improvement. Expertise in rhythm perception appears to provide an advantage for processing dysrhythmic speech, but a familiarization experience is required for the advantage to be realized. Findings are discussed in relation to the role of rhythm in speech processing and shed light on processing models that consider the consequence of rhythm abnormalities in dysarthria.

  16. Aerodynamic Indices of Velopharyngeal Function in Childhood Apraxia of Speech

    Science.gov (United States)

    Sealey, Linda R.; Giddens, Cheryl L.

    2010-01-01

    Childhood apraxia of speech (CAS) is characterized as a deficit in the motor processes of speech for the volitional control of the articulators, including the velum. One of the many characteristics attributed to children with CAS is intermittent or inconsistent hypernasality. The purpose of this study was to document differences in velopharyngeal…

  17. The role of periodicity in perceiving speech in quiet and in background noise.

    Science.gov (United States)

    Steinmetzger, Kurt; Rosen, Stuart

    2015-12-01

    The ability of normal-hearing listeners to perceive sentences in quiet and in background noise was investigated in a variety of conditions mixing the presence and absence of periodicity (i.e., voicing) in both target and masker. Experiment 1 showed that in quiet, aperiodic noise-vocoded speech and speech with a natural amount of periodicity were equally intelligible, while fully periodic speech was much harder to understand. In Experiments 2 and 3, speech reception thresholds for these targets were measured in the presence of four different maskers: speech-shaped noise, harmonic complexes with a dynamically varying F0 contour, and 10 Hz amplitude-modulated versions of both. For experiment 2, results of experiment 1 were used to identify conditions with equal intelligibility in quiet, while in experiment 3 target intelligibility in quiet was near ceiling. In the presence of a masker, periodicity in the target speech mattered little, but listeners strongly benefited from periodicity in the masker. Substantial fluctuating-masker benefits required the target speech to be almost perfectly intelligible in quiet. In summary, results suggest that the ability to exploit periodicity cues may be an even more important factor when attempting to understand speech embedded in noise than the ability to benefit from masker fluctuations.

  18. SPEECH EVALUATION WITH AND WITHOUT PALATAL OBTURATOR IN PATIENTS SUBMITTED TO MAXILLECTOMY

    Science.gov (United States)

    de Carvalho-Teles, Viviane; Pegoraro-Krook, Maria Inês; Lauris, José Roberto Pereira

    2006-01-01

    Most patients who have undergone resection of the maxillae due to benign or malignant tumors in the palatomaxillary region present with speech and swallowing disorders. Coupling of the oral and nasal cavities increases nasal resonance, resulting in hypernasality and unintelligible speech. Prosthodontic rehabilitation of maxillary resections with effective separation of the oral and nasal cavities can improve speech and esthetics, and assist the psychosocial adjustment of the patient as well. The objective of this study was to evaluate the efficacy of the palatal obturator prosthesis on speech intelligibility and resonance of 23 patients with age ranging from 18 to 83 years (Mean = 49.5 years), who had undergone inframedial-structural maxillectomy. The patients were requested to count from 1 to 20, to repeat 21 words and to spontaneously speak for 15 seconds, once with and again without the prosthesis, for tape recording purposes. The resonance and speech intelligibility were judged by 5 speech language pathologists from the tape recordings samples. The results have shown that the majority of patients (82.6%) significantly improved their speech intelligibility, and 16 patients (69.9%) exhibited a significant hypernasality reduction with the obturator in place. The results of this study indicated that maxillary obturator prosthesis was efficient to improve the speech intelligibility and resonance in patients who had undergone maxillectomy. PMID:19089242

  19. Association of Velopharyngeal Insufficiency With Quality of Life and Patient-Reported Outcomes After Speech Surgery.

    Science.gov (United States)

    Bhuskute, Aditi; Skirko, Jonathan R; Roth, Christina; Bayoumi, Ahmed; Durbin-Johnson, Blythe; Tollefson, Travis T

    2017-09-01

    Patients with cleft palate and other causes of velopharyngeal insufficiency (VPI) suffer adverse effects on social interactions and communication. Measurement of these patient-reported outcomes is needed to help guide surgical and nonsurgical care. To further validate the VPI Effects on Life Outcomes (VELO) instrument, measure the change in quality of life (QOL) after speech surgery, and test the association of change in speech with change in QOL. Prospective descriptive cohort including children and young adults undergoing speech surgery for VPI in a tertiary academic center. Participants completed the validated VELO instrument before and after surgical treatment. The main outcome measures were preoperative and postoperative VELO scores and the perceptual speech assessment of speech intelligibility. The VELO scores are divided into subscale domains. Changes in VELO after surgery were analyzed using linear regression models. VELO scores were analyzed as a function of speech intelligibility, adjusting for age and cleft type. The correlation between speech intelligibility rating and VELO scores was estimated using the polyserial correlation. Twenty-nine patients (13 males and 16 females) were included. Mean (SD) age was 7.9 (4.1) years (range, 4-20 years). Pharyngeal flap was used in 14 (48%) cases, Furlow palatoplasty in 12 (41%), and sphincter pharyngoplasty in 1 (3%). The mean (SD) preoperative speech intelligibility rating was 1.71 (1.08), which decreased postoperatively to 0.79 (0.93) in the 24 patients who completed the protocol. Speech intelligibility was correlated with preoperative and postoperative total VELO scores. Speech surgery improves VPI-specific quality of life. We confirmed validation in a population of untreated patients with VPI and included pharyngeal flap surgery, which had not previously been included in validation studies. The VELO instrument provides patient-specific outcomes, which allows a broader understanding of the…

  20. The Effect of Otitis Media on Articulation in Children with Cerebral Palsy.

    Science.gov (United States)

    Van der Vyver, Marguerite; And Others

    1988-01-01

    A study involving 20 Afrikaans-speaking children with cerebral palsy found that recurrent otitis media in early childhood had a negative effect on the articulation abilities of the 7- to 11-year-old children, but that other factors, such as intelligence, also played a role. (JDD)

  1. Gender recognition from unconstrained and articulated human body.

    Science.gov (United States)

    Wu, Qin; Guo, Guodong

    2014-01-01

    Gender recognition has many useful applications, ranging from business intelligence to image search and social activity analysis. Traditional research on gender recognition focuses on face images in a constrained environment. This paper proposes a method for gender recognition in articulated human body images acquired from an unconstrained environment in the real world. A systematic study of some critical issues in body-based gender recognition, such as which body parts are informative, how many body parts are needed to combine together, and what representations are good for articulated body-based gender recognition, is also presented. This paper also pursues data fusion schemes and efficient feature dimensionality reduction based on the partial least squares estimation. Extensive experiments are performed on two unconstrained databases which have not been explored before for gender recognition.

  3. Integration of auditory and visual speech information

    NARCIS (Netherlands)

    Hall, M.; Smeele, P.M.T.; Kuhl, P.K.

    1998-01-01

    The integration of auditory and visual speech is observed when the modes specify different places of articulation. Influences of auditory variation on integration were examined using consonant identification, plus quality and similarity ratings. Auditory identification predicted auditory-visual

  4. Connections of Grasping and Horizontal Hand Movements with Articulation in Czech Speakers

    Czech Academy of Sciences Publication Activity Database

    Tiainen, M.; Lukavský, Jiří; Tiippana, K.; Vainio, M.; Šimko, J.; Felisberti, F.; Vainio, L.

    2017-01-01

    Vol. 8, April (2017), pp. 1-10, article no. 516. ISSN 1664-1078 Grant - others:AV ČR(CZ) StrategieAV21/14 Program:StrategieAV Institutional support: RVO:68081740 Keywords: articulation * motor actions * language * grasping * manual gestures * speech * manual actions Subject RIV: AN - Psychology OECD field: Cognitive sciences Impact factor: 2.323, year: 2016

  5. Acoustic properties of naturally produced clear speech at normal speaking rates

    Science.gov (United States)

    Krause, Jean C.; Braida, Louis D.

    2004-01-01

    Sentences spoken ``clearly'' are significantly more intelligible than those spoken ``conversationally'' for hearing-impaired listeners in a variety of backgrounds [Picheny et al., J. Speech Hear. Res. 28, 96-103 (1985); Uchanski et al., ibid. 39, 494-509 (1996); Payton et al., J. Acoust. Soc. Am. 95, 1581-1592 (1994)]. While producing clear speech, however, talkers often reduce their speaking rate significantly [Picheny et al., J. Speech Hear. Res. 29, 434-446 (1986); Uchanski et al., ibid. 39, 494-509 (1996)]. Yet speaking slowly is not solely responsible for the intelligibility benefit of clear speech (over conversational speech), since a recent study [Krause and Braida, J. Acoust. Soc. Am. 112, 2165-2172 (2002)] showed that talkers can produce clear speech at normal rates with training. This finding suggests that clear speech has inherent acoustic properties, independent of rate, that contribute to improved intelligibility. Identifying these acoustic properties could lead to improved signal processing schemes for hearing aids. To gain insight into these acoustical properties, conversational and clear speech produced at normal speaking rates were analyzed at three levels of detail (global, phonological, and phonetic). Although results suggest that talkers may have employed different strategies to achieve clear speech at normal rates, two global-level properties were identified that appear likely to be linked to the improvements in intelligibility provided by clear/normal speech: increased energy in the 1000-3000-Hz range of long-term spectra and increased modulation depth of low frequency modulations of the intensity envelope. Other phonological and phonetic differences associated with clear/normal speech include changes in (1) frequency of stop burst releases, (2) VOT of word-initial voiceless stop consonants, and (3) short-term vowel spectra.
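    The two global-level properties identified (energy in the 1000-3000 Hz range of the long-term spectrum, and modulation depth of the low-frequency intensity envelope) can each be measured with a short sketch. The window sizes, the 10 Hz envelope cutoff, and the toy signal below are illustrative assumptions, not the paper's exact analysis:

```python
import numpy as np
from scipy.signal import welch, butter, sosfiltfilt

def band_energy_fraction(x, fs, lo=1000.0, hi=3000.0):
    """Fraction of long-term spectral energy falling in [lo, hi] Hz."""
    f, pxx = welch(x, fs, nperseg=1024)
    band = (f >= lo) & (f <= hi)
    return float(pxx[band].sum() / pxx.sum())

def modulation_depth(x, fs, cutoff=10.0):
    """Depth of low-frequency modulations of the intensity envelope."""
    intensity = x ** 2
    sos = butter(2, cutoff, btype="lowpass", fs=fs, output="sos")
    env = np.clip(sosfiltfilt(sos, intensity), 0, None)
    return float((env.max() - env.min()) / (env.max() + env.min() + 1e-12))

fs = 16000
t = np.arange(2 * fs) / fs
# amplitude-modulated tone as a stand-in for a speech recording
x = (1 + 0.8 * np.sin(2 * np.pi * 4 * t)) * np.sin(2 * np.pi * 2000 * t)
frac = band_energy_fraction(x, fs)
depth = modulation_depth(x, fs)
```

    Comparing these two quantities between conversational and clear/normal recordings of the same talker would mirror the global-level analysis the abstract describes.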

  6. Viewing speech in action: speech articulation videos in the public domain that demonstrate the sounds of the International Phonetic Alphabet (IPA)

    OpenAIRE

    Nakai, S.; Beavan, D.; Lawson, E.; Leplâtre, G.; Scobbie, J. M.; Stuart-Smith, J.

    2016-01-01

    In this article, we introduce recently released, publicly available resources, which allow users to watch videos of hidden articulators (e.g. the tongue) during the production of various types of sounds found in the world’s languages. The articulation videos on these resources are linked to a clickable International Phonetic Alphabet chart ([International Phonetic Association. 1999. Handbook of the International Phonetic Association: A Guide to the Use of the International Phonetic Alphabet. ...

  7. Gender Recognition from Unconstrained and Articulated Human Body

    OpenAIRE

    Wu, Qin; Guo, Guodong

    2014-01-01

    Gender recognition has many useful applications, ranging from business intelligence to image search and social activity analysis. Traditional research on gender recognition focuses on face images in a constrained environment. This paper proposes a method for gender recognition in articulated human body images acquired from an unconstrained environment in the real world. A systematic study of some critical issues in body-based gender recognition, such as which body parts are informative, ho...

  8. Speech recognition using articulatory and excitation source features

    CERN Document Server

    Rao, K Sreenivasa

    2017-01-01

    This book discusses the contribution of articulatory and excitation source information in discriminating sound units. The authors focus on excitation source component of speech -- and the dynamics of various articulators during speech production -- for enhancement of speech recognition (SR) performance. Speech recognition is analyzed for read, extempore, and conversation modes of speech. Five groups of articulatory features (AFs) are explored for speech recognition, in addition to conventional spectral features. Each chapter provides the motivation for exploring the specific feature for SR task, discusses the methods to extract those features, and finally suggests appropriate models to capture the sound unit specific knowledge from the proposed features. The authors close by discussing various combinations of spectral, articulatory and source features, and the desired models to enhance the performance of SR systems.

  9. [Velopharyngeal closure pattern and speech performance among submucous cleft palate patients].

    Science.gov (United States)

    Heng, Yin; Chunli, Guo; Bing, Shi; Yang, Li; Jingtao, Li

    2017-06-01

    To characterize the velopharyngeal closure patterns and speech performance among submucous cleft palate patients. Patients with submucous cleft palate visiting the Department of Cleft Lip and Palate Surgery, West China Hospital of Stomatology, Sichuan University between 2008 and 2016 were reviewed. Outcomes of subjective speech evaluation, including velopharyngeal function and consonant articulation, and of objective nasopharyngeal endoscopy, including the mobility of the soft palate and pharyngeal walls, were retrospectively analyzed. A total of 353 cases were retrieved in this study, among which 138 (39.09%) demonstrated velopharyngeal competence, 176 (49.86%) velopharyngeal incompetence, and 39 (11.05%) marginal velopharyngeal incompetence. A total of 268 cases were subjected to nasopharyngeal endoscopy examination, where 167 (62.31%) demonstrated a circular closure pattern, 89 (33.21%) a coronal pattern, and 12 (4.48%) a sagittal pattern. Passavant's ridge was present in 45.51% (76/167) of patients with circular closure and 13.48% (12/89) of patients with coronal closure. Among the 353 patients included in this study, 137 (38.81%) presented normal articulation, 124 (35.13%) consonant elimination, 51 (14.45%) compensatory articulation, 36 (10.20%) consonant weakening, 25 (7.08%) consonant replacement, and 36 (10.20%) multiple articulation errors. Circular closure was the most prevalent velopharyngeal closure pattern among patients with submucous cleft palate, and high-pressure consonant deletion was the most common articulation abnormality. Articulation errors occurred more frequently among patients with a low velopharyngeal closure rate.

  10. Non-Intrusive Intelligibility Prediction Using a Codebook-Based Approach

    DEFF Research Database (Denmark)

    Sørensen, Charlotte; Kavalekalam, Mathew Shaji; Xenaki, Angeliki

    2017-01-01

    It could be beneficial for users of hearing aids if these were able to automatically adjust the processing according to the speech intelligibility in the specific acoustic environment. Most speech intelligibility metrics are intrusive, i.e., they require a clean reference signal, which is rarely available. … The results show a high correlation between the proposed non-intrusive codebook-based STOI (NIC-STOI) and the intrusive STOI, indicating that NIC-STOI is a suitable metric for automatic classification of speech signals.
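    Intrusive metrics such as STOI score intelligibility by correlating short-time envelope segments of the degraded signal against the clean reference, which is exactly the signal a hearing aid in the field does not have. A toy illustration of that comparison (the 30-sample segment length, single-band handling, and synthetic envelopes are simplifying assumptions, not the STOI or NIC-STOI definition):

```python
import numpy as np

def short_time_envelope_corr(clean_env, degraded_env, seg=30):
    """Mean Pearson correlation of short-time envelope segments."""
    scores = []
    for i in range(0, len(clean_env) - seg + 1, seg):
        c = clean_env[i:i + seg]
        d = degraded_env[i:i + seg]
        c = (c - c.mean()) / (c.std() + 1e-12)  # standardize segment
        d = (d - d.mean()) / (d.std() + 1e-12)
        scores.append(float(np.mean(c * d)))    # = Pearson r
    return float(np.mean(scores))

rng = np.random.default_rng(0)
clean_env = np.abs(rng.standard_normal(300)).cumsum() % 5  # toy envelope
noisy_env = clean_env + 0.1 * rng.standard_normal(300)
score = short_time_envelope_corr(clean_env, noisy_env)
```

    The non-intrusive idea in the abstract is to replace `clean_env` with an estimate drawn from a codebook of speech models, so that only the degraded signal is needed at run time.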

  11. Speech intelligibility after gingivectomy of excess palatal tissue

    Directory of Open Access Journals (Sweden)

    Aruna Balasundaram

    2014-01-01

    To appreciate any enhancement in speech following gingivectomy of enlarged anterior palatal gingiva. The periodontal literature has documented various conditions, pathophysiology, and treatment modalities of gingival enlargement, but the relationship between gingival maladies and speech alteration has received scant attention. This case report describes enhancement of an altered speech pattern secondary to a gingivectomy procedure. A systemically healthy 24-year-old female patient presented with bilateral anterior gingival enlargement, provisionally diagnosed as "gingival abscess with inflammatory enlargement" in relation to the palatal aspect of the right maxillary canine to the left maxillary canine. A bilateral gingivectomy was performed by external bevel incision on the anterior palatal gingiva, and a large wedge of epithelium and connective tissue was removed. The patient and her close acquaintances noticed a great improvement in her pronunciation and enunciation of sounds like "t", "d", "n", "l", and "th" following removal of the excess palatal gingival tissue, and the improvement was also appreciated with a visual analog scale score. Linguistic research documents the significance of tongue-palate contact during speech; excess gingival tissue in the palatal region disrupts speech by altering tongue-palate contact, and periodontal surgery such as gingivectomy may improve the disrupted phonetics. Excess palatal gingival tissue impedes tongue-palate contact and interferes with speech: pronunciation of consonants like "t", "d", "n", "l", and "th" is altered by anteriorly enlarged palatal gingiva, and excision of the enlarged tissue results in improvement of speech.

  12. Cognitive Processing Speed, Working Memory, and the Intelligibility of Hearing Aid-Processed Speech in Persons with Hearing Impairment

    Directory of Open Access Journals (Sweden)

    Wycliffe Kabaywe Yumba

    2017-08-01

    Full Text Available Previous studies have demonstrated that successful listening with advanced signal processing in digital hearing aids is associated with individual cognitive capacity, particularly working memory capacity (WMC). This study aimed to examine the relationship between cognitive abilities (cognitive processing speed and WMC) and individual listeners' responses to digital signal processing settings in adverse listening conditions. A total of 194 native Swedish speakers (83 women and 111 men), aged 33–80 years (mean = 60.75 years, SD = 8.89), with bilateral, symmetrical mild to moderate sensorineural hearing loss, who had completed a lexical decision speed test (measuring cognitive processing speed) and a semantic word-pair span test (SWPST, capturing WMC), participated in this study. The Hagerman test (capturing speech recognition in noise) was conducted using an experimental hearing aid with three digital signal processing settings: (1) linear amplification without noise reduction (NoP), (2) linear amplification with noise reduction (NR), and (3) non-linear amplification without NR ("fast-acting compression"). The results showed that cognitive processing speed was a better predictor of speech intelligibility in noise, regardless of the type of signal processing algorithm used. That is, there was a stronger association between cognitive processing speed and NR outcomes and fast-acting compression outcomes (in steady-state noise). We observed a weaker relationship between working memory and NR, and WMC did not relate to fast-acting compression. WMC was a relatively weaker predictor of speech intelligibility in noise. These findings might have been different if the participants had been provided with training and/or allowed to acclimatize to binary masking noise reduction or fast-acting compression.

  13. Correlational Analysis of Speech Intelligibility Tests and Metrics for Speech Transmission

    Science.gov (United States)

    2017-12-04

    sounds, are more prone to masking than the high-energy, wide-spectrum vowels. Such contaminated speech is still audible but not clear. Thus, speech...

  14. Effects of Speaking Task on Intelligibility in Parkinson's Disease

    Science.gov (United States)

    Tjaden, Kris; Wilding, Greg

    2011-01-01

    Intelligibility tests for dysarthria typically provide an estimate of overall severity for speech materials elicited through imitation or read from a printed script. The extent to which these types of tasks and procedures reflect intelligibility for extemporaneous speech is not well understood. The purpose of this study was to compare…

  15. Relations Between the Intelligibility of Speech in Noise and Psychophysical Measures of Hearing Measured in Four Languages Using the Auditory Profile Test Battery

    NARCIS (Netherlands)

    van Esch, T. E. M.; Dreschler, W. A.

    2015-01-01

    The aim of the present study was to determine the relations between the intelligibility of speech in noise and measures of auditory resolution, loudness recruitment, and cognitive function. The analyses were based on data published earlier as part of the presentation of the Auditory Profile, a test

  16. Simultaneous Treatment of Grammatical and Speech-Comprehensibility Deficits in Children with Down Syndrome

    Science.gov (United States)

    Camarata, Stephen; Yoder, Paul; Camarata, Mary

    2006-01-01

    Children with Down syndrome often display speech-comprehensibility and grammatical deficits beyond what would be predicted based upon general mental age. Historically, speech-comprehensibility has often been treated using traditional articulation therapy and oral-motor training so there may be little or no coordination of grammatical and…

  17. Spectrotemporal modulation sensitivity for hearing-impaired listeners: dependence on carrier center frequency and the relationship to speech intelligibility.

    Science.gov (United States)

    Mehraei, Golbarg; Gallun, Frederick J; Leek, Marjorie R; Bernstein, Joshua G W

    2014-07-01

    Poor speech understanding in noise by hearing-impaired (HI) listeners is only partly explained by elevated audiometric thresholds. Suprathreshold-processing impairments such as reduced temporal or spectral resolution or temporal fine-structure (TFS) processing ability might also contribute. Although speech contains dynamic combinations of temporal and spectral modulation and TFS content, these capabilities are often treated separately. Modulation-depth detection thresholds for spectrotemporal modulation (STM) applied to octave-band noise were measured for normal-hearing and HI listeners as a function of temporal modulation rate (4-32 Hz), spectral ripple density [0.5-4 cycles/octave (c/o)] and carrier center frequency (500-4000 Hz). STM sensitivity was worse than normal for HI listeners only for a low-frequency carrier (1000 Hz) at low temporal modulation rates (4-12 Hz) and a spectral ripple density of 2 c/o, and for a high-frequency carrier (4000 Hz) at a high spectral ripple density (4 c/o). STM sensitivity for the 4-Hz, 4-c/o condition for a 4000-Hz carrier and for the 4-Hz, 2-c/o condition for a 1000-Hz carrier were correlated with speech-recognition performance in noise after partialling out the audiogram-based speech-intelligibility index. Poor speech-reception and STM-detection performance for HI listeners may be related to a combination of reduced frequency selectivity and a TFS-processing deficit limiting the ability to track spectral-peak movements.

  18. FOXP2 and the neuroanatomy of speech and language.

    Science.gov (United States)

    Vargha-Khadem, Faraneh; Gadian, David G; Copp, Andrew; Mishkin, Mortimer

    2005-02-01

    That speech and language are innate capacities of the human brain has long been widely accepted, but only recently has an entry point into the genetic basis of these remarkable faculties been found. The discovery of a mutation in FOXP2 in a family with a speech and language disorder has enabled neuroscientists to trace the neural expression of this gene during embryological development, track the effects of this gene mutation on brain structure and function, and so begin to decipher that part of our neural inheritance that culminates in articulate speech.

  19. Do long-term tongue piercings affect speech quality?

    Science.gov (United States)

    Heinen, Esther; Birkholz, Peter; Willmes, Klaus; Neuschaefer-Rube, Christiane

    2017-10-01

    To explore possible effects of tongue piercing on perceived speech quality. Using a quasi-experimental design, we analyzed the effect of tongue piercing on speech in a perception experiment. Samples of spontaneous speech and read speech were recorded from 20 long-term pierced and 20 non-pierced individuals (10 males, 10 females each). The individuals having a tongue piercing were recorded with attached and removed piercing. The audio samples were blindly rated by 26 female and 20 male laypersons and by 5 female speech-language pathologists with regard to perceived speech quality along 5 dimensions: speech clarity, speech rate, prosody, rhythm and fluency. We found no statistically significant differences for any of the speech quality dimensions between the pierced and non-pierced individuals, neither for the read nor for the spontaneous speech. In addition, neither length nor position of piercing had a significant effect on speech quality. The removal of tongue piercings had no effects on speech performance either. Rating differences between laypersons and speech-language pathologists were not dependent on the presence of a tongue piercing. People are able to perfectly adapt their articulation to long-term tongue piercings such that their speech quality is not perceptually affected.

  20. Method of controlling innovative articulation for articulated vehicle

    Directory of Open Access Journals (Sweden)

    Szumilas Mateusz

    2018-01-01

    Full Text Available Operation of an articulated vehicle depends on appropriate damping action in its rotary articulation. In order to analyse the impact of articulation control on the motion of the vehicle, a model of the vehicle with a controllable hydraulic damping system has been developed. A 90-degree turn and lane-change manoeuvres were simulated using LabVIEW software. Modification of the damping parameters of the articulation, according to the velocity and articulation angle of the vehicle, proved to have a significant impact on the vehicle's motion stability. Moreover, the sensor layer necessary for the control algorithm, as well as the diagnostic system, is described.
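    The control idea — damping scheduled on vehicle velocity and articulation angle, with the damper torque opposing the articulation rate — can be sketched as a simple gain schedule. The coefficients and the blending rule below are illustrative assumptions, not values from the paper:

```python
def articulation_damping(velocity, angle, angle_rate,
                         c_min=1.0e3, c_max=2.0e4,
                         v_ref=10.0, angle_ref=0.5):
    """Hypothetical damping schedule for an articulated vehicle: the
    damping coefficient grows with vehicle speed and articulation angle,
    and the damper torque opposes the articulation rate. All gains and
    reference values are illustrative."""
    gain = min(1.0, 0.5 * abs(velocity) / v_ref + 0.5 * abs(angle) / angle_ref)
    c = c_min + (c_max - c_min) * gain        # blend between soft and stiff damping
    return -c * angle_rate                    # torque opposing articulation motion
```

    In the paper the schedule is realized by a controllable hydraulic damper and evaluated in 90-degree-turn and lane-change manoeuvres.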

  1. Progressive apraxia of speech as a window into the study of speech planning processes.

    Science.gov (United States)

    Laganaro, Marina; Croisier, Michèle; Bagou, Odile; Assal, Frédéric

    2012-09-01

    We present a 3-year follow-up study of a patient with progressive apraxia of speech (PAoS), aimed at investigating whether the theoretical organization of phonetic encoding is reflected in the progressive disruption of speech. As decreased speech rate was the most striking pattern of disruption during the first 2 years, durational analyses were carried out longitudinally on syllables excised from spontaneous, repetition and reading speech samples. The crucial result of the present study is the demonstration of an effect of syllable frequency on duration: the progressive disruption of articulation rate did not affect all syllables in the same way, but followed a gradient that was a function of the frequency of use of syllable-sized motor programs. The combination of data from this case of PAoS with previous psycholinguistic and neurolinguistic data points to a frequency organization of syllable-sized speech-motor plans. In this study we also illustrate how PAoS can be exploited in theoretical and clinical investigations of phonetic encoding, as it represents a unique opportunity to investigate speech while it progressively disrupts. Copyright © 2011 Elsevier Srl. All rights reserved.

  2. Audiovisual Speech Perception and Eye Gaze Behavior of Adults with Asperger Syndrome

    Science.gov (United States)

    Saalasti, Satu; Katsyri, Jari; Tiippana, Kaisa; Laine-Hernandez, Mari; von Wendt, Lennart; Sams, Mikko

    2012-01-01

    Audiovisual speech perception was studied in adults with Asperger syndrome (AS), by utilizing the McGurk effect, in which conflicting visual articulation alters the perception of heard speech. The AS group perceived the audiovisual stimuli differently from age, sex and IQ matched controls. When a voice saying /p/ was presented with a face…

  3. Speech enhancement theory and practice

    CERN Document Server

    Loizou, Philipos C

    2013-01-01

    With the proliferation of mobile devices and hearing devices, including hearing aids and cochlear implants, there is a growing and pressing need to design algorithms that can improve speech intelligibility without sacrificing quality. Responding to this need, Speech Enhancement: Theory and Practice, Second Edition introduces readers to the basic problems of speech enhancement and the various algorithms proposed to solve these problems. Updated and expanded, this second edition of the bestselling textbook broadens its scope to include evaluation measures and enhancement algorithms aimed at impr

  4. Robust intelligent sliding model control using recurrent cerebellar model articulation controller for uncertain nonlinear chaotic systems

    International Nuclear Information System (INIS)

    Peng Yafu

    2009-01-01

    In this paper, a robust intelligent sliding mode control (RISMC) scheme using an adaptive recurrent cerebellar model articulation controller (RCMAC) is developed for a class of uncertain nonlinear chaotic systems. The RISMC system offers a design approach to drive the state trajectory to track a desired trajectory, and it is comprised of an adaptive RCMAC and a robust controller. The adaptive RCMAC is used to mimic an ideal sliding mode control (SMC) law in the presence of unknown system dynamics, and the robust controller is designed to compensate for the residual approximation error so as to guarantee stability. Moreover, the Taylor linearization technique is employed to derive the linearized model of the RCMAC. All adaptation laws of the RISMC system are derived from Lyapunov stability analysis and the projection algorithm, so that the stability of the system can be guaranteed. Finally, the proposed RISMC system is applied to control a Van der Pol oscillator, a Genesio chaotic system and a Chua's chaotic circuit. The effectiveness of the proposed control scheme is verified by simulation results with unknown system dynamics and external disturbance. In addition, the advantages of the proposed RISMC are indicated in comparison with an SMC system.
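    The "ideal SMC" that the adaptive RCMAC is trained to mimic can be written in closed form once the plant dynamics are assumed known. A minimal Python sketch for the Van der Pol tracking case (the gains, step size, and reference trajectory are illustrative, not taken from the paper):

```python
import math

def simulate_smc(t_end=10.0, dt=1e-3, mu=1.0, lam=2.0, k=1.0):
    """Ideal sliding mode control of a Van der Pol oscillator
    x1' = x2, x2' = -x1 + mu*(1 - x1^2)*x2 + u, tracking xd = sin(t).
    The RCMAC in the paper approximates this law when the dynamics are
    unknown; here they are assumed known. Returns the final tracking error."""
    x1, x2, t = 0.5, 0.0, 0.0
    while t < t_end:
        xd_dot, xd_ddot = math.cos(t), -math.sin(t)
        e = x1 - math.sin(t)
        e_dot = x2 - xd_dot
        s = e_dot + lam * e                  # sliding surface s = e' + lam*e
        sgn = (s > 0) - (s < 0)
        # feedback linearization plus a switching term that drives s to 0
        u = xd_ddot + x1 - mu * (1 - x1 ** 2) * x2 - lam * e_dot - k * sgn
        x1 += dt * x2                        # Euler integration of the plant
        x2 += dt * (-x1 + mu * (1 - x1 ** 2) * x2 + u)
        t += dt
    return x1 - math.sin(t)
```

    With exact dynamics the switching term forces s to zero in finite time, after which the tracking error decays at rate lam; the RCMAC replaces the exact-cancellation terms with a learned approximation.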

  5. Neuronal populations in the occipital cortex of the blind synchronize to the temporal dynamics of speech

    Science.gov (United States)

    Van Ackeren, Markus Johannes; Barbero, Francesca M; Mattioni, Stefania; Bottini, Roberto

    2018-01-01

    The occipital cortex of early blind individuals (EB) activates during speech processing, challenging the notion of a hard-wired neurobiology of language. But, at what stage of speech processing do occipital regions participate in EB? Here we demonstrate that parieto-occipital regions in EB enhance their synchronization to acoustic fluctuations in human speech in the theta-range (corresponding to syllabic rate), irrespective of speech intelligibility. Crucially, enhanced synchronization to the intelligibility of speech was selectively observed in primary visual cortex in EB, suggesting that this region is at the interface between speech perception and comprehension. Moreover, EB showed overall enhanced functional connectivity between temporal and occipital cortices that are sensitive to speech intelligibility and altered directionality when compared to the sighted group. These findings suggest that the occipital cortex of the blind adopts an architecture that allows the tracking of speech material, and therefore does not fully abstract from the reorganized sensory inputs it receives. PMID:29338838

  6. Functional connectivity between face-movement and speech-intelligibility areas during auditory-only speech perception.

    Science.gov (United States)

    Schall, Sonja; von Kriegstein, Katharina

    2014-01-01

    It has been proposed that internal simulation of the talking face of visually-known speakers facilitates auditory speech recognition. One prediction of this view is that brain areas involved in auditory-only speech comprehension interact with visual face-movement sensitive areas, even under auditory-only listening conditions. Here, we test this hypothesis using connectivity analyses of functional magnetic resonance imaging (fMRI) data. Participants (17 normal participants, 17 developmental prosopagnosics) first learned six speakers via brief voice-face or voice-occupation training (comprehension. Overall, the present findings indicate that learned visual information is integrated into the analysis of auditory-only speech and that this integration results from the interaction of task-relevant face-movement and auditory speech-sensitive areas.

  7. Representational Similarity Analysis Reveals Heterogeneous Networks Supporting Speech Motor Control

    DEFF Research Database (Denmark)

    Zheng, Zane; Cusack, Rhodri; Johnsrude, Ingrid

    The everyday act of speaking involves the complex processes of speech motor control. One important feature of such control is regulation of articulation when auditory concomitants of speech do not correspond to the intended motor gesture. While theoretical accounts of speech monitoring posit...... multiple functional components required for detection of errors in speech planning (e.g., Levelt, 1983), neuroimaging studies generally indicate either single brain regions sensitive to speech production errors, or small, discrete networks. Here we demonstrate that the complex system controlling speech...... is supported by a complex neural network that is involved in linguistic, motoric and sensory processing. With the aid of novel real-time acoustic analyses and representational similarity analyses of fMRI signals, our data show functionally differentiated networks underlying auditory feedback control of speech....

  8. STANFORD ARTIFICIAL INTELLIGENCE PROJECT.

    Science.gov (United States)

    ARTIFICIAL INTELLIGENCE , GAME THEORY, DECISION MAKING, BIONICS, AUTOMATA, SPEECH RECOGNITION, GEOMETRIC FORMS, LEARNING MACHINES, MATHEMATICAL MODELS, PATTERN RECOGNITION, SERVOMECHANISMS, SIMULATION, BIBLIOGRAPHIES.

  9. Exploring Australian speech-language pathologists' use and perceptions ofnon-speech oral motor exercises.

    Science.gov (United States)

    Rumbach, Anna F; Rose, Tanya A; Cheah, Mynn

    2018-01-29

    To explore Australian speech-language pathologists' use of non-speech oral motor exercises, and rationales for using/not using non-speech oral motor exercises in clinical practice. A total of 124 speech-language pathologists practising in Australia, working with paediatric and/or adult clients with speech sound difficulties, completed an online survey. The majority of speech-language pathologists reported that they did not use non-speech oral motor exercises when working with paediatric or adult clients with speech sound difficulties. However, more than half of the speech-language pathologists working with adult clients who have dysarthria reported using non-speech oral motor exercises with this population. The most frequently reported rationale for using non-speech oral motor exercises in speech sound difficulty management was to improve awareness/placement of articulators. The majority of speech-language pathologists agreed there is no clear clinical or research evidence base to support non-speech oral motor exercise use with clients who have speech sound difficulties. This study provides an overview of Australian speech-language pathologists' reported use and perceptions of non-speech oral motor exercises' applicability and efficacy in treating paediatric and adult clients who have speech sound difficulties. The research findings provide speech-language pathologists with insight into how and why non-speech oral motor exercises are currently used, and add to the knowledge base regarding Australian speech-language pathology practice of non-speech oral motor exercises in the treatment of speech sound difficulties. Implications for Rehabilitation: Non-speech oral motor exercises refer to oral motor activities which do not involve speech, but involve the manipulation or stimulation of oral structures including the lips, tongue, jaw, and soft palate. Non-speech oral motor exercises are intended to improve the function (e.g., movement, strength) of oral structures.

  10. Measures to Evaluate the Effects of DBS on Speech Production

    Science.gov (United States)

    Weismer, Gary; Yunusova, Yana; Bunton, Kate

    2011-01-01

    The purpose of this paper is to review and evaluate measures of speech production that could be used to document effects of Deep Brain Stimulation (DBS) on speech performance, especially in persons with Parkinson disease (PD). A small set of evaluative criteria for these measures is presented first, followed by consideration of several speech physiology and speech acoustic measures that have been studied frequently and reported on in the literature on normal speech production, and speech production affected by neuromotor disorders (dysarthria). Each measure is reviewed and evaluated against the evaluative criteria. Embedded within this review and evaluation is a presentation of new data relating speech motions to speech intelligibility measures in speakers with PD, amyotrophic lateral sclerosis (ALS), and control speakers (CS). These data are used to support the conclusion that at the present time the slope of second formant transitions (F2 slope), an acoustic measure, is well suited to make inferences to speech motion and to predict speech intelligibility. The use of other measures should not be ruled out, however, and we encourage further development of evaluative criteria for speech measures designed to probe the effects of DBS or any treatment with potential effects on speech production and communication skills. PMID:24932066
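    The F2-slope measure singled out here is simply the rate of change of the second formant across a transition, usually obtained by a least-squares fit to formant-tracker output. A toy Python sketch (the formant values in the example are fabricated):

```python
def f2_slope(times_ms, f2_hz):
    """Least-squares slope of a second-formant transition, in Hz/ms.
    Shallower F2 slopes accompany reduced articulatory movement and, per
    the paper, predict lower speech intelligibility."""
    n = len(times_ms)
    mt = sum(times_ms) / n
    mf = sum(f2_hz) / n
    num = sum((t - mt) * (f - mf) for t, f in zip(times_ms, f2_hz))
    den = sum((t - mt) ** 2 for t in times_ms)
    return num / den

# A rising transition of 100 Hz per 10 ms has a slope of 10 Hz/ms.
slope = f2_slope([0.0, 10.0, 20.0, 30.0], [1200.0, 1300.0, 1400.0, 1500.0])
```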

  11. Distributed Speech Enhancement in Wireless Acoustic Sensor Networks

    NARCIS (Netherlands)

    Zeng, Y.

    2015-01-01

    In digital speech communication applications like hands-free mobile telephony, hearing aids and human-to-computer communication systems, the recorded speech signals are typically corrupted by background noise. As a result, their quality and intelligibility can get severely degraded. Traditional

  12. The role of high-frequency envelope fluctuations for speech masking release

    DEFF Research Database (Denmark)

    Jørgensen, Søren; Dau, Torsten

    2013-01-01

    The speech-based envelope power spectrum model (sEPSM; Jørgensen and Dau, 2011; Jørgensen et al., 2013) was shown to successfully predict speech intelligibility in conditions with stationary and fluctuating interferers, reverberation, and spectral subtraction. The key element in the model...... was the multi-resolution estimation of the signal-to-noise ratio in the envelope domain (SNRenv) at the output of a modulation filterbank. The simulations suggested that mainly modulation filters centered in the range from 1-8 Hz contribute to speech intelligibility in the case of stationary maskers whereas...... modulation filters tuned to frequencies above 16 Hz might be important in the case of fluctuating maskers. In the present study, the role of high-frequency envelope fluctuations for speech masking release was further investigated in conditions of speech-on-speech masking. Simulations were compared to various...
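    The SNRenv idea can be illustrated in a single band: compare the envelope power of the noisy speech to that of the noise alone. The real sEPSM does this at the output of each filter in a modulation filterbank and combines bands; the sketch below (all names illustrative) skips the filterbank entirely:

```python
def envelope(x, win=32):
    """Crude envelope: rectify and smooth with a causal moving average
    (the model uses the Hilbert envelope plus a modulation filterbank)."""
    rect = [abs(v) for v in x]
    return [sum(rect[max(0, i - win):i + 1]) / (i + 1 - max(0, i - win))
            for i in range(len(rect))]

def env_power(x):
    """Normalized envelope power: AC power of the envelope divided by its
    squared mean, as in the sEPSM normalization."""
    e = envelope(x)
    m = sum(e) / len(e)
    return sum((v - m) ** 2 for v in e) / len(e) / (m * m + 1e-12)

def snr_env(mixture, noise, floor=0.001):
    """Single-band SNRenv: excess envelope power of the mixture over the
    noise alone, relative to the noise envelope power."""
    pn = max(env_power(noise), floor)
    return max(env_power(mixture) - pn, floor) / pn
```

    Modulated speech raises the envelope power of the mixture above that of the noise, so SNRenv grows with the strength of preserved speech modulations.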

  14. Memory for speech and speech for memory.

    Science.gov (United States)

    Locke, J L; Kutz, K J

    1975-03-01

    Thirty kindergarteners, 15 who substituted /w/ for /r/ and 15 with correct articulation, received two perception tests and a memory test that included /w/ and /r/ in minimally contrastive syllables. Although both groups had nearly perfect perception of the experimenter's productions of /w/ and /r/, misarticulating subjects perceived their own tape-recorded w/r productions as /w/. In the memory task these same misarticulating subjects committed significantly more /w/-/r/ confusions in unspoken recall. The discussion considers why people subvocally rehearse; a developmental period in which children do not rehearse; ways subvocalization may aid recall, including motor and acoustic encoding; an echoic store that provides additional recall support if subjects rehearse vocally; and perception of self- and other-produced phonemes by misarticulating children, including its relevance to a motor theory of perception. Evidence is presented that speech for memory can be sufficiently impaired to cause memory disorder. Conceptions that restrict speech disorder to an impairment of communication are challenged.

  15. Drama techniques as part of cluttering therapy according to the verbotonal method

    OpenAIRE

    Hercigonja Salamoni, Darija; Rendulić, Ana

    2017-01-01

    Cluttering is a syndrome characterised by a wide range of symptoms. It always contains one or more key elements such as an abnormally fast speech rate, a greater than expected number of disfluencies, reduced intelligibility due to over-coarticulation and indistinct articulation, inappropriate breaks in the speech pattern, monotone speech, disturbances in language planning, etc. Drama activities and storytelling share a number of features that allow spontaneous use during the therapy process and detachment ...

  16. Metaheuristic applications to speech enhancement

    CERN Document Server

    Kunche, Prajna

    2016-01-01

    This book serves as a basic reference for those interested in the application of metaheuristics to speech enhancement. The major goal of the book is to explain the basic concepts of optimization methods and their use in heuristic optimization in speech enhancement to scientists, practicing engineers, and academic researchers in speech processing. The authors discuss why it has been a challenging problem for researchers to develop new enhancement algorithms that aid the quality and intelligibility of degraded speech. They present powerful optimization methods for speech enhancement that can help solve noise reduction problems. Readers will be able to understand the fundamentals of speech processing as well as the optimization techniques, learn how speech enhancement algorithms are implemented by utilizing optimization methods, and will be given the tools to develop new algorithms. The authors also provide a comprehensive literature survey on the topic.

  17. The Prevalence of Speech and Language Disorders in French-Speaking Preschool Children From Yaoundé (Cameroon).

    Science.gov (United States)

    Tchoungui Oyono, Lilly; Pascoe, Michelle; Singh, Shajila

    2018-05-17

    The purpose of this study was to determine the prevalence of speech and language disorders in French-speaking preschool-age children in Yaoundé, the capital city of Cameroon. A total of 460 participants aged 3-5 years were recruited from the 7 communes of Yaoundé using a 2-stage cluster sampling method. Speech and language assessment was undertaken using a standardized speech and language test, the Evaluation du Langage Oral (Khomsi, 2001), which was purposefully renormed on the sample. A predetermined cutoff of 2 SDs below the normative mean was applied to identify articulation, expressive language, and receptive language disorders. Fluency and voice disorders were identified using clinical judgment by a speech-language pathologist. Overall prevalence was calculated as follows: speech disorders, 14.7%; language disorders, 4.3%; and speech and language disorders, 17.1%. In terms of disorders, prevalence findings were as follows: articulation disorders, 3.6%; expressive language disorders, 1.3%; receptive language disorders, 3%; fluency disorders, 8.4%; and voice disorders, 3.6%. Prevalence figures are higher than those reported for other countries and emphasize the urgent need to develop speech and language services for the Cameroonian population.
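    The 2-SD criterion used for the articulation and language measures is easy to make concrete: renorm the test on the sample, then flag children scoring more than 2 SDs below the sample mean. A minimal sketch (the scores below are fabricated):

```python
import statistics

def prevalence_below_cutoff(scores, n_sd=2.0):
    """Fraction of children scoring more than n_sd standard deviations
    below the sample mean -- the cutoff criterion used to identify
    articulation and language disorders after renorming on the sample."""
    mu = statistics.mean(scores)
    sd = statistics.pstdev(scores)          # population SD of the renormed sample
    cutoff = mu - n_sd * sd
    return sum(s < cutoff for s in scores) / len(scores)
```

    Fluency and voice disorders were identified by clinical judgment instead, so they would not go through this cutoff.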

  18. Perceptual effects of noise reduction by time-frequency masking of noisy speech.

    Science.gov (United States)

    Brons, Inge; Houben, Rolph; Dreschler, Wouter A

    2012-10-01

    Time-frequency masking is a method for noise reduction that is based on the time-frequency representation of a speech in noise signal. Depending on the estimated signal-to-noise ratio (SNR), each time-frequency unit is either attenuated or not. A special type of a time-frequency mask is the ideal binary mask (IBM), which has access to the real SNR (ideal). The IBM either retains or removes each time-frequency unit (binary mask). The IBM provides large improvements in speech intelligibility and is a valuable tool for investigating how different factors influence intelligibility. This study extends the standard outcome measure (speech intelligibility) with additional perceptual measures relevant for noise reduction: listening effort, noise annoyance, speech naturalness, and overall preference. Four types of time-frequency masking were evaluated: the original IBM, a tempered version of the IBM (called ITM) which applies limited and non-binary attenuation, and non-ideal masking (also tempered) with two different types of noise-estimation algorithms. The results from ideal masking imply that there is a trade-off between intelligibility and sound quality, which depends on the attenuation strength. Additionally, the results for non-ideal masking suggest that subjective measures can show effects of noise reduction even if noise reduction does not lead to differences in intelligibility.
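    The IBM is simple to state: with separate access to the speech and the noise, keep a time-frequency unit when its true local SNR exceeds a local criterion and discard it otherwise. A toy Python sketch over time frames only (a real IBM operates on time-frequency units from an STFT or a gammatone filterbank):

```python
import math

def ideal_binary_mask(speech, noise, frame=160, lc_db=0.0):
    """Toy ideal binary mask over time frames only: a frame of the
    speech-plus-noise mixture is kept when the true local SNR exceeds
    the local criterion lc_db, else zeroed. Real IBMs apply this
    keep/discard rule per time-frequency unit."""
    mixture = [s + n for s, n in zip(speech, noise)]
    out = []
    for i in range(0, len(mixture) - frame + 1, frame):
        ps = sum(s * s for s in speech[i:i + frame]) + 1e-12
        pn = sum(n * n for n in noise[i:i + frame]) + 1e-12
        keep = 10.0 * math.log10(ps / pn) > lc_db   # oracle SNR decision
        out.extend(m if keep else 0.0 for m in mixture[i:i + frame])
    return out
```

    Tempering the mask, as in the ITM condition of the study, would replace the keep/zero decision with limited, non-binary attenuation.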

  19. Maximum likelihood PSD estimation for speech enhancement in reverberant and noisy conditions

    DEFF Research Database (Denmark)

    Kuklasinski, Adam; Doclo, Simon; Jensen, Jesper

    2016-01-01

    of the estimator is in speech enhancement algorithms, such as the Multi-channel Wiener Filter (MWF) and the Minimum Variance Distortionless Response (MVDR) beamformer. We evaluate these two algorithms in a speech dereverberation task and compare the performance obtained using the proposed and a competing PSD...... estimator. Instrumental performance measures indicate an advantage of the proposed estimator over the competing one. In a speech intelligibility test all algorithms significantly improved the word intelligibility score. While the results suggest a minor advantage of using the proposed PSD estimator...

  20. Structural analysis of a speech disorder of children with a mild mental retardation

    Directory of Open Access Journals (Sweden)

    Franc Smole

    2004-05-01

    Full Text Available The aim of this research was to define the structure of the speech disorder of children with a mild mental retardation. 100 subjects were chosen among pupils from the 1st to the 4th grade of elementary school who were under logopaedic treatment. To determine speech comprehension, Reynell's developmental scales were used, and for evaluation of speech articulation, the Three-position test for articulation evaluation. With the Bender test we determined a child's mental age as well as signs of psychological dysfunction of organic nature. For the field of phonological consciousness, a Test of reading and writing disturbances was applied. Speech fluency was evaluated by the Riley test. Evaluation scales were adapted for determining speech-language levels and the motor skills of the speech organs and hands. Data on the results of the psychological test and on the family were summed up from the diagnostic treatment guidance documents. Social behaviour in school was evaluated by the children's teachers. Six factors which hierarchically define the structure of the speech disorder were determined by factor analysis. We found that signs of a child's brain lesion are the factor with the most influence on a child's mental age. The results of this research might be helpful to logopaedists in planning logopaedic treatment for children with a mild mental retardation.

  1. Contribution of envelope periodicity to release from speech-on-speech masking

    DEFF Research Database (Denmark)

    Christiansen, Claus; MacDonald, Ewen; Dau, Torsten

    2013-01-01

    Masking release (MR) is the improvement in speech intelligibility for a fluctuating interferer compared to stationary noise. Reduction in MR due to vocoder processing is usually linked to distortions in the temporal fine structure of the stimuli and a corresponding reduction in the fundamental frequency…

  2. The Comorbidity between Attention-Deficit/Hyperactivity Disorder (ADHD) in Children and Arabic Speech Sound Disorder

    Science.gov (United States)

    Hariri, Ruaa Osama

    2016-01-01

    Children with Attention-Deficit/Hyperactivity Disorder (ADHD) often have co-existing learning disabilities and developmental weaknesses or delays in some areas, including speech (Rief, 2005). Seeing that phonological disorders include articulation errors and other forms of speech disorders, studies pertaining to children with ADHD symptoms who…

  3. Listeners' Perceptions of Speech and Language Disorders

    Science.gov (United States)

    Allard, Emily R.; Williams, Dale F.

    2008-01-01

    Using semantic differential scales with nine trait pairs, 445 adults rated five audio-taped speech samples, one depicting an individual without a disorder and four portraying communication disorders. Statistical analyses indicated that the no disorder sample was rated higher with respect to the trait of employability than were the articulation,…

  4. Speech cues contribute to audiovisual spatial integration.

    Directory of Open Access Journals (Sweden)

    Christopher W Bishop

    Full Text Available Speech is the most important form of human communication, but ambient sounds and competing talkers often degrade its acoustics. Fortunately, the brain can use visual information, especially its highly precise spatial information, to improve speech comprehension in noisy environments. Previous studies have demonstrated that audiovisual integration depends strongly on spatiotemporal factors. However, some integrative phenomena such as McGurk interference persist even with gross spatial disparities, suggesting that spatial alignment is not necessary for robust integration of audiovisual place-of-articulation cues. It is therefore unclear how speech cues interact with audiovisual spatial integration mechanisms. Here, we combine two well-established psychophysical phenomena, the McGurk effect and the ventriloquist's illusion, to explore this dependency. Our results demonstrate that conflicting spatial cues may not interfere with audiovisual integration of speech, but conflicting speech cues can impede integration in space. This suggests a direct but asymmetrical influence between the ventral 'what' and dorsal 'where' pathways.

  5. Consequences of phonological ability for intelligibility of speech in youngsters with cleft palate

    DEFF Research Database (Denmark)

    Willadsen, Elisabeth; Poulsen, Mads

    In a previous Randomized Clinical Trial (RCT) including children with unilateral cleft lip and palate, it was found that children who had the entire palate closed by 12 months of age had almost as good phonological abilities as a control group, whereas a group of children with a residual cleft in the hard palate performed significantly worse at 3 years of age (Willadsen, in press). To investigate the influence of phonological ability on intelligibility of speech in 14 children from each of the 3 groups, an investigation including 84 lay listeners was conducted. The lay listeners were presented … groups. Based on a pilot study a difference between the groups was observed, suggesting that control children were most easily understood, followed by children with closed palates, who were more easily understood than children with a residual cleft.

  6. Effects of human fatigue on speech signals

    Science.gov (United States)

    Stamoulis, Catherine

    2004-05-01

    Cognitive performance may be significantly affected by fatigue. In the case of critical personnel, such as pilots, monitoring human fatigue is essential to ensure safety and success of a given operation. One of the modalities that may be used for this purpose is speech, which is sensitive to respiratory changes and increased muscle tension of vocal cords, induced by fatigue. Age, gender, vocal tract length, physical and emotional state may significantly alter speech intensity, duration, rhythm, and spectral characteristics. In addition to changes in speech rhythm, fatigue may also affect the quality of speech, such as articulation. In a noisy environment, detecting fatigue-related changes in speech signals, particularly subtle changes at the onset of fatigue, may be difficult. Therefore, in a performance-monitoring system, speech parameters which are significantly affected by fatigue need to be identified and extracted from input signals. For this purpose, a series of experiments was performed under slowly varying cognitive load conditions and at different times of the day. The results of the data analysis are presented here.

  7. Subjective and Objective Quality Assessment of Single-Channel Speech Separation Algorithms

    DEFF Research Database (Denmark)

    Mowlaee, Pejman; Saeidi, Rahim; Christensen, Mads Græsbøll

    2012-01-01

    Previous studies on performance evaluation of single-channel speech separation (SCSS) algorithms mostly focused on automatic speech recognition (ASR) accuracy as their performance measure. Assessing the separated signals by metrics other than this has the benefit that the results are expected to carry over to other applications beyond ASR. In this paper, in addition to conventional speech quality metrics (PESQ and SNRloss), we also evaluate the separation systems' output using different source separation metrics: blind source separation evaluation (BSS EVAL) and perceptual evaluation … that PESQ and PEASS quality metrics predict well the subjective quality of separated signals obtained by the separation systems. From the results it is observed that the short-time objective intelligibility (STOI) measure predicts the speech intelligibility results.
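The STOI measure referenced above is built on the correlation between short-time temporal envelopes of the clean and degraded signals. The sketch below illustrates only that core idea on broadband envelopes with synthetic toy signals; it is not the real metric, which operates on one-third-octave bands with additional normalization and clipping.

```python
import numpy as np

def envelope_correlation(clean, degraded, win=256):
    """Crude intelligibility proxy in the spirit of STOI (not the real
    metric): correlation between short-time magnitude envelopes of the
    clean and degraded signals."""
    n = (min(len(clean), len(degraded)) // win) * win
    c = np.abs(clean[:n]).reshape(-1, win).mean(axis=1)
    g = np.abs(degraded[:n]).reshape(-1, win).mean(axis=1)
    c -= c.mean()
    g -= g.mean()
    return float((c @ g) / (np.linalg.norm(c) * np.linalg.norm(g) + 1e-12))

# Synthetic "speech": noise carrier with a slow 3 Hz amplitude modulation.
rng = np.random.default_rng(1)
speech = np.sin(2 * np.pi * 3 * np.linspace(0, 1, 8000)) * rng.standard_normal(8000)
mildly_degraded = speech + 0.1 * rng.standard_normal(8000)
badly_degraded = speech + 2.0 * rng.standard_normal(8000)

print(envelope_correlation(speech, mildly_degraded))
print(envelope_correlation(speech, badly_degraded))
```

Heavier degradation flattens the short-time envelope, so the correlation with the clean envelope drops, mirroring the monotone relation such measures exploit.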

  8. Requirements for the evaluation of computational speech segregation systems

    DEFF Research Database (Denmark)

    May, Tobias; Dau, Torsten

    2014-01-01

    Recent studies on computational speech segregation reported improved speech intelligibility in noise when estimating and applying an ideal binary mask with supervised learning algorithms. However, an important requirement for such systems in technical applications is their robustness to acoustic … associated with perceptual attributes in speech segregation. The results could help establish a framework for a systematic evaluation of future segregation systems.
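The ideal binary mask (IBM) central to these segregation systems retains the time-frequency units whose local SNR exceeds a local criterion (LC) and discards the rest. A minimal sketch with made-up power values:

```python
import numpy as np

def ideal_binary_mask(speech_power, noise_power, lc_db=0.0):
    """Ideal binary mask: 1 where the local SNR of a time-frequency
    unit exceeds the local criterion (LC, in dB), 0 elsewhere."""
    snr_db = 10 * np.log10(np.maximum(speech_power, 1e-12) /
                           np.maximum(noise_power, 1e-12))
    return (snr_db > lc_db).astype(float)

# Hypothetical 3x4 grid of per-unit powers (rows: bands, cols: frames).
S = np.array([[4.0, 0.1, 2.0, 0.5],
              [0.2, 3.0, 0.1, 0.8],
              [1.5, 0.9, 0.3, 6.0]])
N = np.ones_like(S)  # unit noise power in every unit

M = ideal_binary_mask(S, N)
print(M)  # 1s exactly where speech power exceeds noise power (LC = 0 dB)
```

In a real system the mask would be estimated from the noisy mixture by a supervised classifier; here the clean speech and noise powers are simply given, which is what makes the mask "ideal".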

  9. Accuracy of Repetition of Digitized and Synthesized Speech for Young Children in Background Noise

    Science.gov (United States)

    Drager, Kathryn D. R.; Clark-Serpentine, Elizabeth A.; Johnson, Kate E.; Roeser, Jennifer L.

    2006-01-01

    Purpose: The present study investigated the intelligibility of digitized and synthesized speech output in background noise for children 3-5 years old. The purpose of the study was to determine whether there was a difference in the intelligibility (ability to repeat) of 3 types of speech output (digitized, DECTalk synthesized, and MacinTalk…

  10. On-device mobile speech recognition

    OpenAIRE

    Mustafa, MK

    2016-01-01

    Despite many years of research, Speech Recognition remains an active area of research in Artificial Intelligence. Currently, the most common commercial application of this technology on mobile devices uses a wireless client–server approach to meet the computational and memory demands of the speech recognition process. Unfortunately, such an approach is unlikely to remain viable when fully applied over the approximately 7.22 billion mobile phones currently in circulation. In this thesis we p...

  11. Speech Outcomes after Tonsillectomy in Patients with Known Velopharyngeal Insufficiency

    Directory of Open Access Journals (Sweden)

    L. M. Paulson

    2012-01-01

    Full Text Available Introduction. Controversy exists over whether tonsillectomy will affect speech in patients with known velopharyngeal insufficiency (VPI), particularly in those with cleft palate. Methods. All patients seen at the OHSU Doernbecher Children's Hospital VPI clinic between 1997 and 2010 with VPI who underwent tonsillectomy were reviewed. Speech parameters were assessed before and after tonsillectomy. Wilcoxon rank-sum testing was used to evaluate for significance. Results. A total of 46 patients with VPI underwent tonsillectomy during this period. Twenty-three had pre- and postoperative speech evaluation sufficient for analysis. The majority (87%) had a history of cleft palate. Indications for tonsillectomy included obstructive sleep apnea in 11 (48%) and staged tonsillectomy prior to pharyngoplasty in 10 (43%). There was no significant difference between pre- and postoperative speech intelligibility or velopharyngeal competency in this population. Conclusion. In this study, tonsillectomy in patients with VPI did not significantly alter speech intelligibility or velopharyngeal competence.

  12. Severe Multisensory Speech Integration Deficits in High-Functioning School-Aged Children with Autism Spectrum Disorder (ASD) and Their Resolution During Early Adolescence

    Science.gov (United States)

    Foxe, John J.; Molholm, Sophie; Del Bene, Victor A.; Frey, Hans-Peter; Russo, Natalie N.; Blanco, Daniella; Saint-Amour, Dave; Ross, Lars A.

    2015-01-01

    Under noisy listening conditions, visualizing a speaker's articulations substantially improves speech intelligibility. This multisensory speech integration ability is crucial to effective communication, and the appropriate development of this capacity greatly impacts a child's ability to successfully navigate educational and social settings. Research shows that multisensory integration abilities continue developing late into childhood. The primary aim here was to track the development of these abilities in children with autism, since multisensory deficits are increasingly recognized as a component of the autism spectrum disorder (ASD) phenotype. The abilities of high-functioning ASD children (n = 84) to integrate seen and heard speech were assessed cross-sectionally, while environmental noise levels were systematically manipulated, comparing them with age-matched neurotypical children (n = 142). Severe integration deficits were uncovered in ASD, which were increasingly pronounced as background noise increased. These deficits were evident in school-aged ASD children (5–12 year olds), but were fully ameliorated in ASD children entering adolescence (13–15 year olds). The severity of multisensory deficits uncovered has important implications for educators and clinicians working in ASD. We consider the observation that the multisensory speech system recovers substantially in adolescence as an indication that it is likely amenable to intervention during earlier childhood, with potentially profound implications for the development of social communication abilities in ASD children. PMID:23985136

  13. Developmental language and speech disability.

    Science.gov (United States)

    Spiel, G; Brunner, E; Allmayer, B; Pletz, A

    2001-09-01

    Speech disabilities (articulation deficits) and language disorders, both expressive (vocabulary) and receptive (language comprehension), are not uncommon in children. An overview of these, along with a global description of the impairment of communication and the clinical characteristics of language developmental disorders, is presented in this article. The diagnostic tables applied in the European and Anglo-American speech areas, ICD-10 and DSM-IV, are explained and compared. Because of their strengths and weaknesses, an alternative classification of language and speech developmental disorders is proposed, which allows a differentiation between expressive and receptive language capabilities with regard to the semantic and the morphological/syntactic domains. Prevalence and comorbidity rates, psychosocial influences, biological factors and the biological-social interaction are discussed. The necessity of using standardized examinations is emphasised. General logopaedic treatment paradigms, specific therapy concepts and an overview of prognosis are described.

  14. On Optimal Linear Filtering of Speech for Near-End Listening Enhancement

    DEFF Research Database (Denmark)

    Taal, Cees H.; Jensen, Jesper; Leijon, Arne

    2013-01-01

    In this letter the focus is on linear filtering of speech before degradation due to additive background noise. The goal is to design the filter such that the speech intelligibility index (SII) is maximized when the speech is played back in a known noisy environment. Moreover, a power constraint is…
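The SII that this letter maximizes is, at its core, an importance-weighted sum of band audibilities derived from clipped band SNRs. The sketch below shows that core computation with invented band levels and weights; it is a simplification of, not a substitute for, the full ANSI S3.5 procedure.

```python
import numpy as np

def sii_approx(speech_db, noise_db, band_importance):
    """Simplified SII-style index: per-band SNRs are clipped to
    [-15, +15] dB, mapped linearly onto [0, 1] "audibility", and
    combined with band-importance weights (a sketch, not ANSI S3.5)."""
    snr = np.clip(np.asarray(speech_db, float) - np.asarray(noise_db, float),
                  -15.0, 15.0)
    audibility = (snr + 15.0) / 30.0
    w = np.asarray(band_importance, float)
    return float(np.sum(w * audibility) / np.sum(w))

speech = [60.0, 62.0, 58.0, 55.0]   # hypothetical band levels, dB
noise = [50.0, 65.0, 40.0, 55.0]    # hypothetical noise levels, dB
importance = [0.2, 0.3, 0.3, 0.2]   # hypothetical band-importance weights

print(round(sii_approx(speech, noise, importance), 3))  # → 0.687
```

Under a power constraint, maximizing such an index amounts to redistributing speech energy toward the bands where extra level buys the most importance-weighted audibility.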

  15. Intelligent Multi-Media Integrated Interface Project

    Science.gov (United States)

    1990-06-01

    been devoted to the application of artificial intelligence technology to the development of human-computer interface technology that integrates speech… RADC-TR-90-128, Final Technical Report, June 1990, AD-A225 973. Intelligent Multi-Media Integrated Interface Project, Calspan-University of Buffalo. J. G. Neal, J. M…

  16. [Intermodal timing cues for audio-visual speech recognition].

    Science.gov (United States)

    Hashimoto, Masahiro; Kumashiro, Masaharu

    2004-06-01

    The purpose of this study was to investigate the limitations of lip-reading advantages for Japanese young adults by desynchronizing visual and auditory information in speech. In the experiment, audio-visual speech stimuli were presented under six test conditions: audio-alone, and audio-visual with either 0, 60, 120, 240 or 480 ms of audio delay. The stimuli were video recordings of the face of a female Japanese speaker saying long and short Japanese sentences. The intelligibility of the audio-visual stimuli was measured as a function of audio delay in sixteen untrained young subjects. Speech intelligibility under audio-delay conditions of less than 120 ms was significantly better than that under the audio-alone condition. On the other hand, the delay of 120 ms corresponded to the mean mora duration measured for the audio stimuli. The results implied that audio delays of up to 120 ms would not disrupt the lip-reading advantage, because visual and auditory information in speech seemed to be integrated on a syllabic time scale. Potential applications of this research include noisy workplaces in which a worker must extract relevant speech from all the other competing noises.

  17. Musicians do not benefit from differences in fundamental frequency when listening to speech in competing speech backgrounds

    DEFF Research Database (Denmark)

    Madsen, Sara Miay Kim; Whiteford, Kelly L.; Oxenham, Andrew J.

    2017-01-01

    Recent studies disagree on whether musicians have an advantage over non-musicians in understanding speech in noise. However, it has been suggested that musicians may be able to use differences in fundamental frequency (F0) to better understand target speech in the presence of interfering talkers. Here we studied a relatively large (N=60) cohort of young adults, equally divided between non-musicians and highly trained musicians, to test whether the musicians were better able to understand speech either in noise or in a two-talker competing speech masker. The target speech and competing speech were presented with either their natural F0 contours or on a monotone F0, and the F0 difference between the target and masker was systematically varied. As expected, speech intelligibility improved with increasing F0 difference between the target and the two-talker masker for both natural and monotone…

  18. SynFace—Speech-Driven Facial Animation for Virtual Speech-Reading Support

    Directory of Open Access Journals (Sweden)

    Giampiero Salvi

    2009-01-01

    Full Text Available This paper describes SynFace, a supportive technology that aims at enhancing audio-based spoken communication in adverse acoustic conditions by providing the missing visual information in the form of an animated talking head. Firstly, we describe the system architecture, consisting of a 3D animated face model controlled from the speech input by a specifically optimised phonetic recogniser. Secondly, we report on speech intelligibility experiments with focus on multilinguality and robustness to audio quality. The system, already available for Swedish, English, and Flemish, was optimised for German and for the Swedish wide-band speech quality available in TV, radio, and Internet communication. Lastly, the paper covers experiments with nonverbal motions driven from the speech signal. It is shown that turn-taking gestures can be used to affect the flow of human-human dialogues. We have focused specifically on two categories of cues that may be extracted from the acoustic signal: prominence/emphasis and interactional cues (turn-taking/back-channelling).

  19. Dysarthria in Mandarin-Speaking Children with Cerebral Palsy: Speech Subsystem Profiles

    Science.gov (United States)

    Chen, Li-Mei; Hustad, Katherine C.; Kent, Ray D.; Lin, Yu Ching

    2018-01-01

    Purpose: This study explored the speech characteristics of Mandarin-speaking children with cerebral palsy (CP) and typically developing (TD) children to determine (a) how children in the 2 groups may differ in their speech patterns and (b) the variables correlated with speech intelligibility for words and sentences. Method: Data from 6 children…

  20. Eyes and ears: Using eye tracking and pupillometry to understand challenges to speech recognition.

    Science.gov (United States)

    Van Engen, Kristin J; McLaughlin, Drew J

    2018-05-04

    Although human speech recognition is often experienced as relatively effortless, a number of common challenges can render the task more difficult. Such challenges may originate in talkers (e.g., unfamiliar accents, varying speech styles), the environment (e.g., noise), or in listeners themselves (e.g., hearing loss, aging, different native language backgrounds). Each of these challenges can reduce the intelligibility of spoken language, but even when intelligibility remains high, they can place greater processing demands on listeners. Noisy conditions, for example, can lead to poorer recall for speech, even when it has been correctly understood. Speech intelligibility measures, memory tasks, and subjective reports of listener difficulty all provide critical information about the effects of such challenges on speech recognition. Eye tracking and pupillometry complement these methods by providing objective physiological measures of online cognitive processing during listening. Eye tracking records the moment-to-moment direction of listeners' visual attention, which is closely time-locked to unfolding speech signals, and pupillometry measures the moment-to-moment size of listeners' pupils, which dilate in response to increased cognitive load. In this paper, we review the uses of these two methods for studying challenges to speech recognition.

  1. Self-Reported Speech Problems in Adolescents and Young Adults with 22q11.2 Deletion Syndrome: A Cross-Sectional Cohort Study

    Directory of Open Access Journals (Sweden)

    Nicole E Spruijt

    2014-09-01

    Full Text Available Background. Speech problems are a common clinical feature of the 22q11.2 deletion syndrome. The objectives of this study were to inventory the speech history and current self-reported speech rating of adolescents and young adults, and to examine the possible variables influencing the current speech ratings, including cleft palate, surgery, speech and language therapy, intelligence quotient, and age at assessment. Methods. In this cross-sectional cohort study, 50 adolescents and young adults with the 22q11.2 deletion syndrome (ages 12-26 years, 67% female) filled out questionnaires. A neuropsychologist administered an age-appropriate intelligence quotient test. The demographics, histories, and intelligence of patients with normal speech (speech rating = 1) were compared to those of patients with different speech (speech rating > 1). Results. Of the 50 patients, a minority (26%) had a cleft palate, nearly half (46%) underwent a pharyngoplasty, and all (100%) had speech and language therapy. Poorer speech ratings were correlated with more years of speech and language therapy (Spearman's correlation = 0.418, P = 0.004; 95% confidence interval, 0.145-0.632). Only 34% had normal speech ratings. The groups with normal and different speech were not significantly different with respect to the demographic variables; a history of cleft palate, surgery, or speech and language therapy; and the intelligence quotient. Conclusions. All adolescents and young adults with the 22q11.2 deletion syndrome had undergone speech and language therapy, and nearly half of them underwent pharyngoplasty. Only 34% attained normal speech ratings. Those with poorer speech ratings had speech and language therapy for more years.
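The Spearman correlation reported in this record is simply the Pearson correlation computed on ranks. A self-contained sketch with invented data (the study's actual values are not reproduced here); note that statistical packages such as SciPy's `spearmanr` additionally assign tied values their average rank.

```python
import numpy as np

def spearman_rho(x, y):
    """Spearman's rank correlation: Pearson correlation of the ranks.
    (Ties receive arbitrary consecutive ranks in this sketch; real
    implementations assign tied values their average rank.)"""
    def ranks(a):
        order = np.argsort(a)
        r = np.empty(len(a))
        r[order] = np.arange(1, len(a) + 1)
        return r
    rx, ry = ranks(x), ranks(y)
    rx -= rx.mean()
    ry -= ry.mean()
    return float((rx @ ry) / np.sqrt((rx @ rx) * (ry @ ry)))

# Invented illustration: more therapy years, poorer (higher) speech rating.
therapy_years = np.array([1, 2, 3, 5, 8])
speech_rating = np.array([1, 1, 2, 3, 4])
print(spearman_rho(therapy_years, speech_rating))  # → 1.0 (monotone increase)
```

Because it works on ranks, the statistic captures any monotone association, which is why it suits ordinal outcomes like a speech rating scale.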

  2. Signal Processing Methods for Removing the Effects of Whole Body Vibration upon Speech

    Science.gov (United States)

    Bitner, Rachel M.; Begault, Durand R.

    2014-01-01

    Humans may be exposed to whole-body vibration in environments where clear speech communications are crucial, particularly during the launch phases of space flight and in high-performance aircraft. Prior research has shown that high levels of vibration cause a decrease in speech intelligibility. However, the effects of whole-body vibration upon speech are not well understood, and no attempt has been made to restore speech distorted by whole-body vibration. In this paper, a model for speech under whole-body vibration is proposed and a method to remove its effect is described. The method described reduces the perceptual effects of vibration, yields higher ASR accuracy scores, and may significantly improve intelligibility. Possible applications include incorporation within communication systems to improve radio communication in environments such as spaceflight, aviation, or off-road vehicle operations.

  3. Improving user-friendliness by using visually supported speech recognition

    NARCIS (Netherlands)

    Waals, J.A.J.S.; Kooi, F.L.; Kriekaard, J.J.

    2002-01-01

    While speech recognition in principle may be one of the most natural interfaces, in practice it is not due to the lack of user-friendliness. Words are regularly interpreted wrong, and subjects tend to articulate in an exaggerated manner. We explored the potential of visually supported error

  4. The effect of room acoustics on the measured speech privacy in two typical European open plan offices

    NARCIS (Netherlands)

    Wenmaekers, R.H.C.; Hout, van N.H.A.M.; Luxemburg, van L.C.J.; Hak, C.C.J.M.

    2009-01-01

    The reverberation time and the background noise level are often used as the most important design parameters in European open plan offices to achieve a comfortable acoustic climate and to control speech intelligibility. Good speech intelligibility is desired for people working together, but bad

  5. The role of short-time intensity and envelope power for speech intelligibility and psychoacoustic masking.

    Science.gov (United States)

    Biberger, Thomas; Ewert, Stephan D

    2017-08-01

    The generalized power spectrum model [GPSM; Biberger and Ewert (2016). J. Acoust. Soc. Am. 140, 1023-1038], combining the "classical" concept of the power-spectrum model (PSM) and the envelope power spectrum model (EPSM), was demonstrated to account for several psychoacoustic and speech intelligibility (SI) experiments. The PSM path of the model uses long-time power signal-to-noise ratios (SNRs), while the EPSM path uses short-time envelope power SNRs. A systematic comparison of existing SI models for several spectro-temporal manipulations of speech maskers and gender combinations of target and masker speakers [Schubotz et al. (2016). J. Acoust. Soc. Am. 140, 524-540] showed the importance of short-time power features. Conversely, Jørgensen et al. [(2013). J. Acoust. Soc. Am. 134, 436-446] demonstrated a higher predictive power of short-time envelope power SNRs than power SNRs using reverberation and spectral subtraction. Here the GPSM was extended to utilize short-time power SNRs and was shown to account for all psychoacoustic and SI data of the three mentioned studies. The best processing strategy was to exclusively use either power or envelope-power SNRs, depending on the experimental task. By analyzing both domains, the suggested model might provide a useful tool for clarifying the contribution of amplitude modulation masking and energetic masking.
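The two SNR domains contrasted in this abstract can be illustrated directly: a long-time power SNR compares overall levels and ignores modulation, while an envelope power SNR compares normalized modulation (AC) power. The toy envelopes below are invented; the actual GPSM operates on the outputs of peripheral and modulation filterbanks.

```python
import numpy as np

def power_snr_db(target, masker):
    """Long-time power SNR (the PSM path): ratio of mean powers."""
    return 10 * np.log10(np.mean(np.square(target)) / np.mean(np.square(masker)))

def envelope_power_snr_db(target_env, masker_env):
    """Short-time envelope power SNR (the EPSM path): ratio of the
    envelopes' normalized AC power (variance over squared mean),
    i.e., modulation depth rather than overall level."""
    def norm_env_power(e):
        e = np.asarray(e, float)
        return np.var(e) / np.mean(e) ** 2
    return 10 * np.log10(norm_env_power(target_env) / norm_env_power(masker_env))

t = np.linspace(0.0, 1.0, 8000)
speech_env = 1.0 + 0.9 * np.sin(2 * np.pi * 4 * t)  # deeply modulated envelope
noise_env = 1.0 + 0.1 * np.sin(2 * np.pi * 7 * t)   # nearly flat envelope

print(power_snr_db(speech_env, noise_env))           # small: levels are similar
print(envelope_power_snr_db(speech_env, noise_env))  # large: modulation differs
```

The example makes the abstract's point concrete: two signals can be nearly indistinguishable in the long-time power domain yet far apart in the envelope power domain, which is why the model's best strategy depends on the experimental task.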

  6. The Hypothesis of Apraxia of Speech in Children with Autism Spectrum Disorder

    OpenAIRE

    Shriberg, Lawrence D.; Paul, Rhea; Black, Lois M.; van Santen, Jan P.

    2011-01-01

    In a sample of 46 children aged 4 to 7 years with Autism Spectrum Disorder (ASD) and intelligible speech, there was no statistical support for the hypothesis of concomitant Childhood Apraxia of Speech (CAS). Perceptual and acoustic measures of participants’ speech, prosody, and voice were compared with data from 40 typically-developing children, 13 preschool children with Speech Delay, and 15 participants aged 5 to 49 years with CAS in neurogenetic disorders. Speech Delay and Speech Errors, r...

  7. Evaluation of speech transmission in open public spaces affected by combined noises.

    Science.gov (United States)

    Lee, Pyoung Jik; Jeon, Jin Yong

    2011-07-01

    In the present study, the effects of interference from combined noises on speech transmission were investigated in a simulated open public space. Sound fields for dominant noises were predicted using a typical urban square model surrounded by buildings. Then road traffic noise and two types of construction noise, corresponding to stationary and impulsive noises, were selected as background noises. Listening tests were performed on a group of adults, and the quality of speech transmission was evaluated using listening difficulty as well as intelligibility scores. During the listening tests, two factors that affect speech transmission performance were considered: (1) the temporal characteristics of the construction noise (stationary or impulsive) and (2) the levels of the construction and road traffic noises. The results indicated that word intelligibility scores and listening difficulty ratings were affected by the temporal characteristics of construction noise due to fluctuations in the background noise level. It was also observed that listening difficulty ratings showed larger variation than word intelligibility scores in describing speech transmission in noisy open public spaces.

  8. Differences between the production of [s] and [ʃ] in the speech of adults, typically developing children, and children with speech sound disorders: An ultrasound study.

    Science.gov (United States)

    Francisco, Danira Tavares; Wertzner, Haydée Fiszbein

    2017-01-01

    This study describes the criteria used in ultrasound to measure the differences between the tongue contours that produce [s] and [ʃ] sounds in the speech of adults, typically developing children (TDC), and children with speech sound disorder (SSD) with the phonological process of palatal fronting. Overlapping images of the tongue contours from 35 subjects producing the [s] and [ʃ] sounds were analysed to select 11 spokes on a radial grid spread over the tongue contour. The difference between the mean contours of the [s] and [ʃ] sounds was calculated for each spoke. A cluster analysis produced groups with some consistency in the pattern of articulation across subjects; it differentiated adults and TDC to some extent, and children with SSD with a high level of success. Children with SSD were less likely to show differentiation of the tongue contours between the articulation of [s] and [ʃ].

  9. Acoustic Changes in the Speech of Children with Cerebral Palsy Following an Intensive Program of Dysarthria Therapy

    Science.gov (United States)

    Pennington, Lindsay; Lombardo, Eftychia; Steen, Nick; Miller, Nick

    2018-01-01

    Background: The speech intelligibility of children with dysarthria and cerebral palsy has been observed to increase following therapy focusing on respiration and phonation. Aims: To determine if speech intelligibility change following intervention is associated with change in acoustic measures of voice. Methods & Procedures: We recorded 16…

  10. Oral and Hand Movement Speeds Are Associated with Expressive Language Ability in Children with Speech Sound Disorder

    Science.gov (United States)

    Peter, Beate

    2012-01-01

    This study tested the hypothesis that children with speech sound disorder have generalized slowed motor speeds. It evaluated associations among oral and hand motor speeds and measures of speech (articulation and phonology) and language (receptive vocabulary, sentence comprehension, sentence imitation), in 11 children with moderate to severe SSD…

  11. Acoustic assessment of speech privacy curtains in two nursing units

    Science.gov (United States)

    Pope, Diana S.; Miller-Klein, Erik T.

    2016-01-01

    Hospitals have complex soundscapes that create challenges to patient care. Extraneous noise and high reverberation rates impair speech intelligibility, which leads to raised voices. In an unintended spiral, the increasing noise may result in diminished speech privacy, as people speak loudly to be heard over the din. The products available to improve hospital soundscapes include construction materials that absorb sound (acoustic ceiling tiles, carpet, wall insulation) and reduce reverberation rates. Enhanced privacy curtains are now available and offer potential for a relatively simple way to improve speech privacy and speech intelligibility by absorbing sound at the hospital patient's bedside. Acoustic assessments were performed over 2 days on two nursing units with a similar design in the same hospital. One unit was built with the 1970s’ standard hospital construction and the other was newly refurbished (2013) with sound-absorbing features. In addition, we determined the effect of an enhanced privacy curtain versus standard privacy curtains using acoustic measures of speech privacy and speech intelligibility indexes. Privacy curtains provided auditory protection for the patients. In general, that protection was increased by the use of enhanced privacy curtains. On average, the enhanced curtain improved sound absorption from 20% to 30%; however, there was considerable variability, depending on the configuration of the rooms tested. Enhanced privacy curtains provide measurable improvement to the acoustics of patient rooms but cannot overcome larger acoustic design issues. To shorten reverberation time, additional absorption, and compact and more fragmented nursing unit floor plate shapes should be considered. PMID:26780959

  12. Acoustic assessment of speech privacy curtains in two nursing units.

    Science.gov (United States)

    Pope, Diana S; Miller-Klein, Erik T

    2016-01-01

    Hospitals have complex soundscapes that create challenges to patient care. Extraneous noise and high reverberation rates impair speech intelligibility, which leads to raised voices. In an unintended spiral, the increasing noise may result in diminished speech privacy, as people speak loudly to be heard over the din. The products available to improve hospital soundscapes include construction materials that absorb sound (acoustic ceiling tiles, carpet, wall insulation) and reduce reverberation rates. Enhanced privacy curtains are now available and offer potential for a relatively simple way to improve speech privacy and speech intelligibility by absorbing sound at the hospital patient's bedside. Acoustic assessments were performed over 2 days on two nursing units with a similar design in the same hospital. One unit was built with the 1970s' standard hospital construction and the other was newly refurbished (2013) with sound-absorbing features. In addition, we determined the effect of an enhanced privacy curtain versus standard privacy curtains using acoustic measures of speech privacy and speech intelligibility indexes. Privacy curtains provided auditory protection for the patients. In general, that protection was increased by the use of enhanced privacy curtains. On an average, the enhanced curtain improved sound absorption from 20% to 30%; however, there was considerable variability, depending on the configuration of the rooms tested. Enhanced privacy curtains provide measureable improvement to the acoustics of patient rooms but cannot overcome larger acoustic design issues. To shorten reverberation time, additional absorption, and compact and more fragmented nursing unit floor plate shapes should be considered.

  13. Acoustic assessment of speech privacy curtains in two nursing units

    Directory of Open Access Journals (Sweden)

    Diana S Pope

    2016-01-01

    Full Text Available Hospitals have complex soundscapes that create challenges to patient care. Extraneous noise and high reverberation rates impair speech intelligibility, which leads to raised voices. In an unintended spiral, the increasing noise may result in diminished speech privacy, as people speak loudly to be heard over the din. The products available to improve hospital soundscapes include construction materials that absorb sound (acoustic ceiling tiles, carpet, wall insulation) and reduce reverberation rates. Enhanced privacy curtains are now available and offer potential for a relatively simple way to improve speech privacy and speech intelligibility by absorbing sound at the hospital patient's bedside. Acoustic assessments were performed over 2 days on two nursing units with a similar design in the same hospital. One unit was built with the 1970s' standard hospital construction and the other was newly refurbished (2013) with sound-absorbing features. In addition, we determined the effect of an enhanced privacy curtain versus standard privacy curtains using acoustic measures of speech privacy and speech intelligibility indexes. Privacy curtains provided auditory protection for the patients. In general, that protection was increased by the use of enhanced privacy curtains. On average, the enhanced curtain improved sound absorption from 20% to 30%; however, there was considerable variability, depending on the configuration of the rooms tested. Enhanced privacy curtains provide measurable improvement to the acoustics of patient rooms but cannot overcome larger acoustic design issues. To shorten reverberation time, additional absorption, and compact and more fragmented nursing unit floor plate shapes should be considered.
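
    The three records above all hinge on reverberation time and surface absorption. As a rough illustration of the physics involved, the sketch below applies Sabine's classic formula, RT60 = 0.161·V/A; the room volume, surface areas, and absorption coefficients are hypothetical, not values from the study.

```python
def sabine_rt60(volume_m3, surfaces):
    """Sabine's reverberation-time estimate: RT60 = 0.161 * V / A, where
    A = sum(S_i * alpha_i) is the total absorption (m^2 sabins) over
    (area, absorption-coefficient) pairs. A rough single-number model;
    real patient rooms violate its diffuse-field assumptions."""
    total_absorption = sum(area * alpha for area, alpha in surfaces)
    return 0.161 * volume_m3 / total_absorption

# Hypothetical 50 m^3 patient room; coefficients are illustrative, not measured
hard_room = [(80.0, 0.05)]                    # hard surfaces only
with_curtain = [(70.0, 0.05), (10.0, 0.55)]   # 10 m^2 swapped for curtain fabric
print(sabine_rt60(50.0, hard_room))     # ~2.01 s
print(sabine_rt60(50.0, with_curtain))  # ~0.89 s: more absorption, shorter RT60
```

    The example shows why even a modest area of absorptive curtain can shorten reverberation time noticeably, consistent with the abstracts' conclusion that curtains help but cannot replace room-level acoustic design.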

  14. Visual feedback of tongue movement for novel speech sound learning

    Directory of Open Access Journals (Sweden)

    William F Katz

    2015-11-01

    Full Text Available Pronunciation training studies have yielded important information concerning the processing of audiovisual (AV) information. Second language (L2) learners show increased reliance on bottom-up, multimodal input for speech perception (compared to monolingual individuals). However, little is known about the role of viewing one’s own speech articulation processes during speech training. The current study investigated whether real-time, visual feedback for tongue movement can improve a speaker’s learning of non-native speech sounds. An interactive 3D tongue visualization system based on electromagnetic articulography (EMA) was used in a speech training experiment. Native speakers of American English produced a novel speech sound (/ɖ̠/; a voiced, coronal, palatal stop) before, during, and after trials in which they viewed their own speech movements using the 3D model. Talkers’ productions were evaluated using kinematic (tongue-tip spatial positioning) and acoustic (burst spectra) measures. The results indicated a rapid gain in accuracy associated with visual feedback training. The findings are discussed with respect to neural models for multimodal speech processing.

  15. Digitized Speech Characteristics in Patients with Maxillectomy Defects.

    Science.gov (United States)

    Elbashti, Mahmoud E; Sumita, Yuka I; Hattori, Mariko; Aswehlee, Amel M; Taniguchi, Hisashi

    2017-12-06

    Accurate evaluation of speech characteristics through formant frequency measurement is important for proper speech rehabilitation in patients after maxillectomy. This study aimed to evaluate the utility of digital acoustic analysis and vowel pentagon space for the prediction of speech ability after maxillectomy, by comparing the acoustic characteristics of vowel articulation in three classes of maxillectomy defects. Aramany's classifications I, II, and IV were used to group 27 male patients after maxillectomy. Digital acoustic analysis of five Japanese vowels-/a/, /e/, /i/, /o/, and /u/-was performed using a speech analysis system. First formant (F1) and second formant (F2) frequencies were calculated using an autocorrelation method. Data were plotted on an F1-F2 plane for each patient, and the F1 and F2 ranges were calculated. The vowel pentagon spaces were also determined. One-way ANOVA was applied to compare all results between the three groups. Class II maxillectomy patients had a significantly higher F2 range than did Class I and Class IV patients (p = 0.002). In contrast, there was no significant difference in the F1 range between the three classes. The vowel pentagon spaces were significantly larger in class II maxillectomy patients than in Class I and Class IV patients (p = 0.014). The results of this study indicate that the acoustic characteristics of maxillectomy patients are affected by the defect area. This finding may provide information for obturator design based on vowel articulation and defect class. © 2017 by the American College of Prosthodontists.
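
    The "vowel pentagon space" in the record above is the area of the polygon spanned by the five vowels' (F1, F2) points. A minimal sketch using the shoelace formula; the formant values in the example are hypothetical, and the study's exact area computation is not specified in the abstract.

```python
def vowel_space_area(formants):
    """Area (shoelace formula) of the polygon traced by (F1, F2) vowel
    points; for five vowels this is the 'vowel pentagon space'. The points
    must be ordered around the polygon's boundary."""
    n = len(formants)
    twice_area = 0.0
    for i in range(n):
        x1, y1 = formants[i]
        x2, y2 = formants[(i + 1) % n]
        twice_area += x1 * y2 - x2 * y1
    return abs(twice_area) / 2.0

# Hypothetical (F1, F2) values in Hz for /a, o, u, i, e/, ordered around the hull
pentagon = [(750, 1200), (450, 800), (350, 700), (300, 2300), (500, 1900)]
print(vowel_space_area(pentagon))   # Hz^2; a larger area means more distinct vowels
```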

  16. Kalman filter for speech enhancement in cocktail party scenarios using a codebook-based approach

    DEFF Research Database (Denmark)

    Kavalekalam, Mathew Shaji; Christensen, Mads Græsbøll; Gran, Fredrik

    2016-01-01

    Enhancement of speech in non-stationary background noise is a challenging task, and conventional single channel speech enhancement algorithms have not been able to improve the speech intelligibility in such scenarios. The work proposed in this paper investigates a single channel Kalman filter based...... trained codebook over a generic speech codebook in relation to the performance of the speech enhancement system....
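
    The record above drives a Kalman filter with codebook-trained speech models. As a much-simplified illustration of the filter mechanics only, here is a scalar Kalman filter with a random-walk signal model; the codebook-driven AR modeling of the actual paper is not reproduced, and the noise variances q and r are arbitrary.

```python
def kalman_smooth(observations, q=1e-3, r=0.5):
    """Scalar Kalman filter with a random-walk signal model x_t = x_{t-1} + w
    (process variance q) and observation y_t = x_t + v (measurement variance
    r). The paper instead selects AR speech/noise models from trained
    codebooks; this toy model only shows the predict/update mechanics."""
    x, p = observations[0], 1.0   # initial state estimate and error variance
    estimates = []
    for y in observations:
        p += q                    # predict: variance grows by process noise
        k = p / (p + r)           # Kalman gain balances prediction vs. data
        x += k * (y - x)          # correct the estimate with the innovation
        p *= 1.0 - k              # posterior error variance shrinks
        estimates.append(x)
    return estimates

# A noisy constant is pulled toward its underlying value
noisy = [1.0, 1.4, 0.7, 1.2, 0.9, 1.1, 0.8, 1.3]
print(kalman_smooth(noisy)[-1])
```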

  17. Children's Attitudes Toward Peers With Unintelligible Speech Associated With Cleft Lip and/or Palate.

    Science.gov (United States)

    Lee, Alice; Gibbon, Fiona E; Spivey, Kimberley

    2017-05-01

    The objective of this study was to investigate whether reduced speech intelligibility in children with cleft palate affects social and personal attribute judgments made by typically developing children of different ages. The study (1) measured the correlation between intelligibility scores of speech samples from children with cleft palate and social and personal attribute judgments made by typically developing children based on these samples and (2) compared the attitude judgments made by children of different ages. Participants were 90 typically developing children, 30 in each of three age groups (7 to 8 years, 9 to 10 years, and 11 to 12 years). Speech intelligibility scores and typically developing children's attitudes were measured using eight social and personal attributes on a three-point rating scale. There was a significant correlation between the speech intelligibility scores and attitude judgments for a number of traits: "sick-healthy" as rated by the children aged 7 to 8 years, "no friends-friends" by the children aged 9 to 10 years, and "ugly-good looking" and "no friends-friends" by the children aged 11 to 12 years. Children aged 7 to 8 years gave significantly lower ratings for "mean-kind" but higher ratings for "shy-outgoing" when compared with the other two groups. Typically developing children tended to make negative social and personal attribute judgments about children with cleft palate based solely on the intelligibility of their speech. Society, educators, and health professionals should work together to ensure that children with cleft palate are not stigmatized by their peers.

  18. Phonologically based assessment and intervention in Spastic Cerebral Palsy: A case analysis

    Directory of Open Access Journals (Sweden)

    Michael A. Crary

    1981-11-01

    Full Text Available The articulation errors of one adult subject demonstrating a spastic variety of congenital cerebral palsy were evaluated via a phonological process analysis. This analysis indicated that a stopping process (replacement of fricatives with homorganic stops) was the most detrimental to the subject's intelligibility. Subsequent to this analysis, a phonemic contrasting programme was initiated toward the goal of minimizing the influence of the stopping process. Results of spontaneous speech sample analyses indicated that this approach was successful in increasing the percentage of correctly produced fricative patterns. Success in this case suggests the applicability of a linguistically based intervention approach in structural/functional disturbances of speech articulation.

  19. Tools for the assessment of childhood apraxia of speech.

    Science.gov (United States)

    Gubiani, Marileda Barichello; Pagliarin, Karina Carlesso; Keske-Soares, Marcia

    2015-01-01

    This study systematically reviews the literature on the main tools used to evaluate childhood apraxia of speech (CAS). The search strategy included the Scopus, PubMed, and Embase databases. Empirical studies that used tools for assessing CAS were selected. Articles were selected by two independent researchers. The search retrieved 695 articles, of which 12 were included in the study. Five tools were identified: Verbal Motor Production Assessment for Children, Dynamic Evaluation of Motor Speech Skill, The Orofacial Praxis Test, Kaufman Speech Praxis Test for Children, and Madison Speech Assessment Protocol. There are few instruments available for CAS assessment, and most of them are intended to assess praxis and/or orofacial movements, sequences of orofacial movements, articulation of syllables and phonemes, spontaneous speech, and prosody. There are some tests for assessment and diagnosis of CAS. However, few studies on this topic have been conducted at the national level, and few protocols are available to assess and assist in an accurate diagnosis.

  20. Nobel peace speech

    Directory of Open Access Journals (Sweden)

    Joshua FRYE

    2017-07-01

    Full Text Available The Nobel Peace Prize has long been considered the premier peace prize in the world. According to Geir Lundestad, Secretary of the Nobel Committee, of the 300-some peace prizes awarded worldwide, “none is in any way as well known and as highly respected as the Nobel Peace Prize” (Lundestad, 2001). Nobel peace speech is a unique and significant international site of public discourse committed to articulating the universal grammar of peace. Spanning over 100 years of sociopolitical history on the world stage, Nobel Peace Laureates richly represent an important cross-section of domestic and international issues increasingly germane to many publics. Communication scholars’ interest in this rhetorical genre has increased in the past decade. Yet, the norm has been to analyze a single speech artifact from a prestigious or controversial winner rather than examine the collection of speeches for generic commonalities of import. In this essay, we analyze the discourse of Nobel peace speech inductively and argue that the organizing principle of the Nobel peace speech genre is the repetitive form of normative liberal principles and values that function as rhetorical topoi. These topoi include freedom and justice and appeal to the inviolable, inborn right of human beings to exercise certain political and civil liberties and the expectation of equality of protection from totalitarian and tyrannical abuses. The significance of this essay to contemporary communication theory is to expand our theoretical understanding of rhetoric’s role in the maintenance and development of an international and cross-cultural vocabulary for the grammar of peace.

  1. The Hypothesis of Apraxia of Speech in Children with Autism Spectrum Disorder

    Science.gov (United States)

    Shriberg, Lawrence D.; Paul, Rhea; Black, Lois M.; van Santen, Jan P.

    2011-01-01

    In a sample of 46 children aged 4-7 years with Autism Spectrum Disorder (ASD) and intelligible speech, there was no statistical support for the hypothesis of concomitant Childhood Apraxia of Speech (CAS). Perceptual and acoustic measures of participants' speech, prosody, and voice were compared with data from 40 typically-developing children, 13…

  2. [Improving speech comprehension using a new cochlear implant speech processor].

    Science.gov (United States)

    Müller-Deile, J; Kortmann, T; Hoppe, U; Hessel, H; Morsnowski, A

    2009-06-01

    The aim of this multicenter clinical field study was to assess the benefits of the new Freedom 24 sound processor for cochlear implant (CI) users implanted with the Nucleus 24 cochlear implant system. The study included 48 postlingually profoundly deaf experienced CI users who demonstrated speech comprehension performance with their current speech processor on the Oldenburg sentence test (OLSA) in quiet conditions of at least 80% correct scores and who were able to perform adaptive speech threshold testing using the OLSA in noisy conditions. Following baseline measures of speech comprehension performance with their current speech processor, subjects were upgraded to the Freedom 24 speech processor. After a take-home trial period of at least 2 weeks, subject performance was evaluated by measuring the speech reception threshold with the Freiburg multisyllabic word test and speech intelligibility with the Freiburg monosyllabic word test at 50 dB and 70 dB in the sound field. The results demonstrated highly significant benefits for speech comprehension with the new speech processor. Significant benefits for speech comprehension were also demonstrated with the new speech processor when tested in competing background noise. In contrast, use of the Abbreviated Profile of Hearing Aid Benefit (APHAB) did not prove to be a suitably sensitive assessment tool for comparative subjective self-assessment of hearing benefits with each processor. Use of the preprocessing algorithm known as adaptive dynamic range optimization (ADRO) in the Freedom 24 led to additional improvements over the standard upgrade map for speech comprehension in quiet and showed equivalent performance in noise. Through use of the preprocessing beam-forming algorithm BEAM, subjects demonstrated a highly significant improved signal-to-noise ratio for speech comprehension thresholds (i.e., signal-to-noise ratio for 50% speech comprehension scores) when tested with an adaptive procedure using the Oldenburg

  3. Treating speech subsystems in childhood apraxia of speech with tactual input: the PROMPT approach.

    Science.gov (United States)

    Dale, Philip S; Hayden, Deborah A

    2013-11-01

    Prompts for Restructuring Oral Muscular Phonetic Targets (PROMPT; Hayden, 2004; Hayden, Eigen, Walker, & Olsen, 2010), a treatment approach for the improvement of speech sound disorders in children, uses tactile-kinesthetic-proprioceptive (TKP) cues to support and shape movements of the oral articulators. No research to date has systematically examined the efficacy of PROMPT for children with childhood apraxia of speech (CAS). Four children (ages 3;6 [years;months] to 4;8), all meeting the American Speech-Language-Hearing Association (2007) criteria for CAS, were treated using PROMPT. All children received 8 weeks of twice-weekly treatment, including at least 4 weeks of full PROMPT treatment that included TKP cues. During the first 4 weeks, 2 of the 4 children received treatment that included all PROMPT components except TKP cues. This design permitted both between-subjects and within-subjects comparisons to evaluate the effect of TKP cues. Gains in treatment were measured by standardized tests and by criterion-referenced measures based on the production of untreated probe words, reflecting change in speech movements and auditory perceptual accuracy. All 4 children made significant gains during treatment, but measures of motor speech control and untreated word probes provided evidence for more gain when TKP cues were included. PROMPT as a whole appears to be effective for treating children with CAS, and the inclusion of TKP cues appears to facilitate greater effect.

  4. Semi-non-intrusive objective intelligibility measure using spatial filtering in hearing aids

    DEFF Research Database (Denmark)

    Sørensen, Charlotte; Boldt, Jesper Bünsow; Gran, Frederik

    2016-01-01

    -intrusive metrics have not been able to achieve acceptable intelligibility predictions. This paper presents a new semi-non-intrusive intelligibility measure based on an existing intrusive measure, STOI, where an estimate of the clean speech is extracted using spatial filtering in the hearing aid. The results......Reliable non-intrusive online assessment of speech intelligibility can play a key role for the functioning of hearing aids, e.g. as guidance for adjusting the hearing aid settings to the environment. While existing intrusive metrics can provide a precise and reliable measure, the current non...
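
    STOI, the intrusive measure this record builds on, scores intelligibility from correlations between short-time envelopes of clean and degraded speech across time-frequency units. The sketch below shows that core quantity for a single envelope pair; STOI's one-third-octave filterbank, segmentation, normalization, and clipping steps are omitted.

```python
def envelope_correlation(clean_env, degraded_env):
    """Pearson correlation between short-time envelopes of clean and
    degraded speech: the core quantity behind intrusive measures such as
    STOI, which averages similar correlations over short time-frequency
    units (that machinery is omitted in this single-pair sketch)."""
    n = len(clean_env)
    mc, md = sum(clean_env) / n, sum(degraded_env) / n
    cov = sum((c - mc) * (d - md) for c, d in zip(clean_env, degraded_env))
    var = (sum((c - mc) ** 2 for c in clean_env)
           * sum((d - md) ** 2 for d in degraded_env)) ** 0.5
    return cov / var if var else 0.0

# An undistorted (merely rescaled) envelope correlates perfectly with the clean one
print(envelope_correlation([0.1, 0.5, 0.9, 0.3], [0.2, 1.0, 1.8, 0.6]))
```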

  5. ACOUSTIC SPEECH RECOGNITION FOR MARATHI LANGUAGE USING SPHINX

    Directory of Open Access Journals (Sweden)

    Aman Ankit

    2016-09-01

    Full Text Available Speech recognition, or speech-to-text processing, is the process of recognizing human speech by computer and converting it into text. In speech recognition, transcripts are created by taking recordings of speech as audio and their text transcriptions. Speech-based applications which include Natural Language Processing (NLP) techniques are popular and an active area of research. Input to such applications is in natural language and output is obtained in natural language. Speech recognition mostly revolves around three approaches, namely the acoustic-phonetic approach, the pattern recognition approach, and the artificial intelligence approach. Creation of an acoustic model requires a large database of speech and training algorithms. The output of an ASR system is the recognition and translation of spoken language into text by computers and computerized devices. ASR today finds enormous application in tasks that require human-machine interfaces, such as voice dialing. Our key contribution in this paper is to create corpora for the Marathi language and explore the use of the Sphinx engine for automatic speech recognition.

  6. SPEECH DISORDERS IN PRIMARY SCHOOL STUDENTS OF ISFAHAN (1998-9

    Directory of Open Access Journals (Sweden)

    B SHAFIEI

    2002-06-01

    Full Text Available Introduction. The aim of this study was to describe the frequency of speech disorders in primary school students.
    Methods. In a cross-sectional study, 300 first- and second-grade primary school students were examined for speech disorders.
    Results. Of the 300 subjects, 280 were normal (without speech disorders), 15 had articulation disorders, 2 had voice disorders, 3 had resonance disorders, and none had fluency disorders.
    Discussion. The findings of this study are supported by former studies in other countries, except for the frequency of fluency disorders, which may be due to the small sample size of the present study.

  7. Measurements on the movement of the lower jaw in speech

    NARCIS (Netherlands)

    Nooteboom, S.G.; Slis, I.H.

    1970-01-01

    This report concerns some preliminary measurements on the movement of the lower jaw in speech. Such measurements may be interesting for several reasons. One is that, more easily than measurements of the movements of other articulators, they may give some insight into the effect of stress,

  8. Predicting the effect of spectral subtraction on the speech recognition threshold based on the signal-to-noise ratio in the envelope domain

    DEFF Research Database (Denmark)

    Jørgensen, Søren; Dau, Torsten

    2011-01-01

    rarely been evaluated perceptually in terms of speech intelligibility. This study analyzed the effects of the spectral subtraction strategy proposed by Berouti et al. [ICASSP 4 (1979), 208-211] on the speech recognition threshold (SRT) obtained with sentences presented in stationary speech-shaped noise....... The SRT was measured in five normal-hearing listeners in six conditions of spectral subtraction. The results showed an increase of the SRT after processing, i.e. a decreased speech intelligibility, in contrast to what is predicted by the Speech Transmission Index (STI). Here, another approach is proposed......, denoted the speech-based envelope power spectrum model (sEPSM) which predicts the intelligibility based on the signal-to-noise ratio in the envelope domain. In contrast to the STI, the sEPSM is sensitive to the increased amount of the noise envelope power as a consequence of the spectral subtraction
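
    Berouti-style spectral subtraction, as evaluated in the record above, subtracts a scaled noise power estimate from each spectral bin and clamps the result to a spectral floor. A per-frame sketch on magnitude spectra; the STFT analysis and overlap-add resynthesis around this step are omitted, and the alpha and beta settings are illustrative, not the study's.

```python
def spectral_subtract(noisy_mag, noise_mag, alpha=2.0, beta=0.01):
    """Berouti-style spectral subtraction for one frame of magnitude
    spectra: subtract an over-estimated noise power (factor alpha) per bin
    and clamp the result to a spectral floor (beta times the noise power),
    which limits musical-noise artifacts at the cost of residual noise."""
    enhanced = []
    for s, n in zip(noisy_mag, noise_mag):
        power = s * s - alpha * n * n   # over-subtraction in the power domain
        floor = beta * n * n            # keep a small residual noise floor
        enhanced.append(max(power, floor) ** 0.5)
    return enhanced

# Bins dominated by speech survive; noise-only bins collapse to the floor
print(spectral_subtract([10.0, 1.0], [1.0, 1.0]))
```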

  9. Human phoneme recognition depending on speech-intrinsic variability.

    Science.gov (United States)

    Meyer, Bernd T; Jürgens, Tim; Wesker, Thorsten; Brand, Thomas; Kollmeier, Birger

    2010-11-01

    The influence of different sources of speech-intrinsic variation (speaking rate, effort, style and dialect or accent) on human speech perception was investigated. In listening experiments with 16 listeners, confusions of consonant-vowel-consonant (CVC) and vowel-consonant-vowel (VCV) sounds in speech-weighted noise were analyzed. Experiments were based on the OLLO logatome speech database, which was designed for a man-machine comparison. It contains utterances spoken by 50 speakers from five dialect/accent regions and covers several intrinsic variations. By comparing results depending on intrinsic and extrinsic variations (i.e., different levels of masking noise), the degradation induced by variabilities can be expressed in terms of the SNR. The spectral level distance between the respective speech segment and the long-term spectrum of the masking noise was found to be a good predictor for recognition rates, while phoneme confusions were influenced by the distance to spectrally close phonemes. An analysis based on transmitted information of articulatory features showed that voicing and manner of articulation are comparatively robust cues in the presence of intrinsic variations, whereas the coding of place is more degraded. The database and detailed results have been made available for comparisons between human speech recognition (HSR) and automatic speech recognizers (ASR).
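
    The record above reports that the spectral level distance between a speech segment and the long-term spectrum of the masking noise predicted recognition rates. Below is a sketch of one plausible form of that distance, a mean per-band level difference in dB; the paper's exact filterbank and averaging details are not given in the abstract.

```python
import math

def spectral_level_distance(speech_bands, noise_bands):
    """Mean level difference (dB) between a speech segment's band powers
    and the long-term band powers of the masking noise. A hypothetical
    reading of the predictor described in the abstract, over parallel
    lists of linear band powers."""
    diffs = [10.0 * math.log10(s / n) for s, n in zip(speech_bands, noise_bands)]
    return sum(diffs) / len(diffs)

# A segment 10x the noise power in every band sits 10 dB above it
print(spectral_level_distance([2.0, 4.0, 8.0], [0.2, 0.4, 0.8]))
```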

  10. Speech coding

    Energy Technology Data Exchange (ETDEWEB)

    Ravishankar, C., Hughes Network Systems, Germantown, MD

    1998-05-08

    coding techniques are equally applicable to any voice signal whether or not it carries any intelligible information, as the term speech implies. Other terms that are commonly used are speech compression and voice compression, since the fundamental idea behind speech coding is to reduce (compress) the transmission rate (or, equivalently, the bandwidth) and/or reduce storage requirements. In this document the terms speech and voice shall be used interchangeably.
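
    A concrete example of the rate-reduction idea in the record above is logarithmic companding, as in G.711 μ-law telephony coding: amplitudes are compressed through a logarithmic curve so that a coarse (8-bit) quantizer still resolves quiet speech well. The sketch below shows only the companding curve itself, not the quantizer or bitstream.

```python
import math

def mu_law_encode(x, mu=255):
    """Compress a sample in [-1, 1] with the mu-law companding curve used
    by G.711; quantizing the compressed value coarsely is what reduces the
    bit rate while keeping quiet speech intelligible."""
    return math.copysign(math.log1p(mu * abs(x)) / math.log1p(mu), x)

def mu_law_decode(y, mu=255):
    """Invert the companding curve to recover the sample."""
    return math.copysign(math.expm1(abs(y) * math.log1p(mu)) / mu, y)

sample = 0.5
roundtrip = mu_law_decode(mu_law_encode(sample))
print(abs(roundtrip - sample) < 1e-9)   # the curve itself is invertible; only quantization loses information
```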

  11. Survey on Chatbot Design Techniques in Speech Conversation Systems

    OpenAIRE

    Sameera A. Abdul-Kader; Dr. John Woods

    2015-01-01

    Human-Computer Speech is gaining momentum as a technique of computer interaction. There has been a recent upsurge in speech based search engines and assistants such as Siri, Google Chrome and Cortana. Natural Language Processing (NLP) techniques such as NLTK for Python can be applied to analyse speech, and intelligent responses can be found by designing an engine to provide appropriate human like responses. This type of programme is called a Chatbot, which is the focus of this study. This pap...

  12. Intra-oral pressure-based voicing control of electrolaryngeal speech with intra-oral vibrator.

    Science.gov (United States)

    Takahashi, Hirokazu; Nakao, Masayuki; Kikuchi, Yataro; Kaga, Kimitaka

    2008-07-01

    In normal speech, coordinated activities of intrinsic laryngeal muscles suspend a glottal sound at utterance of voiceless consonants, automatically realizing a voicing control. In electrolaryngeal speech, however, the lack of voicing control is one of the causes of unclear voice, voiceless consonants tending to be misheard as the corresponding voiced consonants. In the present work, we developed an intra-oral vibrator with an intra-oral pressure sensor that detected utterance of voiceless phonemes during the intra-oral electrolaryngeal speech, and demonstrated that an intra-oral pressure-based voicing control could improve the intelligibility of the speech. The test voices were obtained from one electrolaryngeal speaker and one normal speaker. We first investigated on the speech analysis software how a voice onset time (VOT) and first formant (F1) transition of the test consonant-vowel syllables contributed to voiceless/voiced contrasts, and developed an adequate voicing control strategy. We then compared the intelligibility of consonant-vowel syllables among the intra-oral electrolaryngeal speech with and without online voicing control. The increase of intra-oral pressure, typically with a peak ranging from 10 to 50 gf/cm2, could reliably identify utterance of voiceless consonants. The speech analysis and intelligibility test then demonstrated that a short VOT caused the misidentification of the voiced consonants due to a clear F1 transition. Finally, taking these results together, the online voicing control, which suspended the prosthetic tone while the intra-oral pressure exceeded 2.5 gf/cm2 and during the 35 milliseconds that followed, proved efficient to improve the voiceless/voiced contrast.
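
    The control rule in the record above (suspend the prosthetic tone while intra-oral pressure exceeds 2.5 gf/cm2 and for the 35 milliseconds that follow) can be sketched as a simple threshold-plus-hold gate. The sampling rate and pressure trace below are hypothetical.

```python
def voicing_gate(pressure, fs_hz, threshold=2.5, hold_ms=35.0):
    """Tone on/off decisions from an intra-oral pressure trace: the tone is
    suspended while pressure exceeds `threshold` (gf/cm2) and for `hold_ms`
    afterwards, following the rule reported in the abstract. Returns a list
    of booleans, True where the electrolarynx tone should sound."""
    hold_samples = int(round(hold_ms * fs_hz / 1000.0))
    off_until = -1            # sample index until which the tone stays off
    gate = []
    for i, p in enumerate(pressure):
        if p > threshold:
            off_until = i + hold_samples   # restart the hold window
        gate.append(i > off_until)
    return gate

# Hypothetical 1 kHz pressure trace: a single burst above threshold at t = 5 ms
trace = [0.0] * 5 + [5.0] + [0.0] * 50
gate = voicing_gate(trace, fs_hz=1000)
print(gate[4], gate[5], gate[41])   # True False True
```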

  13. Lingual–Alveolar Contact Pressure During Speech in Amyotrophic Lateral Sclerosis: Preliminary Findings

    Science.gov (United States)

    Knollhoff, Stephanie; Barohn, Richard J.

    2017-01-01

    Purpose This preliminary study on lingual–alveolar contact pressures (LACP) in people with amyotrophic lateral sclerosis (ALS) had several aims: (a) to evaluate whether the protocol induced fatigue, (b) to compare LACP during speech (LACP-Sp) and during maximum isometric pressing (LACP-Max) in people with ALS (PALS) versus healthy controls, (c) to compare the percentage of LACP-Max utilized during speech (%Max) for PALS versus controls, and (d) to evaluate relationships between LACP-Sp and LACP-Max with word intelligibility. Method Thirteen PALS and 12 healthy volunteers produced /t, d, s, z, l, n/ sounds while LACP-Sp was recorded. LACP-Max was obtained before and after the speech protocol. Word intelligibility was obtained from auditory–perceptual judgments. Results LACP-Max values measured before and after completion of the speech protocol did not differ. LACP-Sp and LACP-Max were statistically lower in the ALS bulbar group compared with controls and PALS with only spinal symptoms. There was no statistical difference between groups for %Max. LACP-Sp and LACP-Max were correlated with word intelligibility. Conclusions It was feasible to obtain LACP-Sp measures without inducing fatigue. Reductions in LACP-Sp and LACP-Max for bulbar speakers might reflect tongue weakness. Although confirmation of results is needed, the data indicate that individuals with high word intelligibility maintained LACP-Sp at or above 2 kPa and LACP-Max at or above 50 kPa. PMID:28335033

  14. Combined Aphasia and Apraxia of Speech Treatment (CAAST): effects of a novel therapy.

    Science.gov (United States)

    Wambaugh, Julie L; Wright, Sandra; Nessler, Christina; Mauszycki, Shannon C

    2014-12-01

    This investigation was designed to examine the effects of a newly developed treatment for aphasia and acquired apraxia of speech (AOS). Combined Aphasia and Apraxia of Speech Treatment (CAAST) targets language and speech production simultaneously, with treatment techniques derived from Response Elaboration Training (Kearns, 1985) and Sound Production Treatment (Wambaugh, Kalinyak-Fliszar, West, & Doyle, 1998). The purpose of this study was to determine whether CAAST was associated with positive changes in verbal language and speech production with speakers with aphasia and AOS. Four participants with chronic aphasia and AOS received CAAST applied sequentially to sets of pictures in the context of multiple baseline designs. CAAST entailed elaboration of participant-initiated utterances, with sound production training applied as needed to the elaborated productions. The dependent variables were (a) production of correct information units (CIUs; Nicholas & Brookshire, 1993) in response to experimental picture stimuli, (b) percentage of consonants correct in sentence repetition, and (c) speech intelligibility. CAAST was associated with increased CIU production in trained and untrained picture sets for all participants. Gains in sound production accuracy and speech intelligibility varied across participants; a modification of CAAST to provide additional speech production treatment may be desirable.

  15. The role of high-level processes for oscillatory phase entrainment to speech sound

    Directory of Open Access Journals (Sweden)

    Benedikt eZoefel

    2015-12-01

    Full Text Available Constantly bombarded with input, the brain has the need to filter out relevant information while ignoring the irrelevant rest. A powerful tool may be represented by neural oscillations which entrain their high-excitability phase to important input while their low-excitability phase attenuates irrelevant information. Indeed, the alignment between brain oscillations and speech improves intelligibility and helps dissociating speakers during a cocktail party. Although well-investigated, the contribution of low- and high-level processes to phase entrainment to speech sound has only recently begun to be understood. Here, we review those findings, and concentrate on three main results: (1) Phase entrainment to speech sound is modulated by attention or predictions, likely supported by top-down signals and indicating higher-level processes involved in the brain’s adjustment to speech. (2) As phase entrainment to speech can be observed without systematic fluctuations in sound amplitude or spectral content, it does not only reflect a passive steady-state ringing of the cochlea, but entails a higher-level process. (3) The role of intelligibility for phase entrainment is debated. Recent results suggest that intelligibility modulates the behavioral consequences of entrainment, rather than directly affecting the strength of entrainment in auditory regions. We conclude that phase entrainment to speech reflects a sophisticated mechanism: Several high-level processes interact to optimally align neural oscillations with predicted events of high relevance, even when they are hidden in a continuous stream of background noise.

  16. Auditory-motor mapping training as an intervention to facilitate speech output in non-verbal children with autism: a proof of concept study.

    Directory of Open Access Journals (Sweden)

    Catherine Y Wan

    Full Text Available Although up to 25% of children with autism are non-verbal, there are very few interventions that can reliably produce significant improvements in speech output. Recently, a novel intervention called Auditory-Motor Mapping Training (AMMT) has been developed, which aims to promote speech production directly by training the association between sounds and articulatory actions using intonation and bimanual motor activities. AMMT capitalizes on the inherent musical strengths of children with autism, and offers activities that they intrinsically enjoy. It also engages and potentially stimulates a network of brain regions that may be dysfunctional in autism. Here, we report an initial efficacy study to provide 'proof of concept' for AMMT. Six non-verbal children with autism participated. Prior to treatment, the children had no intelligible words. They each received 40 individual sessions of AMMT 5 times per week, over an 8-week period. Probe assessments were conducted periodically during baseline, therapy, and follow-up sessions. After therapy, all children showed significant improvements in their ability to articulate words and phrases, with generalization to items that were not practiced during therapy sessions. Because these children had no or minimal vocal output prior to treatment, the acquisition of speech sounds and word approximations through AMMT represents a critical step in expressive language development in children with autism.

  17. Perception of foreign-accented clear speech by younger and older English listeners

    OpenAIRE

    Li, Chi-Nin

    2009-01-01

    Naturally produced English clear speech has been shown to be more intelligible than English conversational speech. However, little is known about the extent of the clear speech effects in the production of nonnative English, and perception of foreign-accented English by younger and older listeners. The present study examined whether Cantonese speakers would employ the same strategies as those used by native English speakers in producing clear speech in their second language. Also, the clear s...

  18. Speech Clarity Index (Ψ): A Distance-Based Speech Quality Indicator and Recognition Rate Prediction for Dysarthric Speakers with Cerebral Palsy

    Science.gov (United States)

    Kayasith, Prakasith; Theeramunkong, Thanaruk

    It is a tedious and subjective task to measure the severity of dysarthria by manually evaluating a speaker's speech using the available standard assessment methods based on human perception. This paper presents an automated approach to assessing the speech quality of a dysarthric speaker with cerebral palsy. With consideration of two complementary factors, speech consistency and speech distinction, a speech quality indicator called the speech clarity index (Ψ) is proposed as a measure of the speaker's ability to produce a consistent speech signal for a given word and distinguishable speech signals for different words. As an application, it can be used to assess speech quality and to forecast the speech recognition rate for an individual dysarthric speaker before the exhaustive implementation of an automatic speech recognition system for that speaker. The effectiveness of Ψ as a predictor of speech recognition rate is evaluated by rank-order inconsistency, correlation coefficient, and root-mean-square difference. The evaluations were done by comparing its predicted recognition rates with those predicted by the standard methods, the articulatory and intelligibility tests, based on two recognition systems (HMM and ANN). The results show that Ψ is a promising indicator for predicting the recognition rate of dysarthric speech. All experiments were done on a speech corpus composed of speech data from eight normal speakers and eight dysarthric speakers.
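The two factors behind Ψ, consistency of repeated productions of the same word and distinction between different words, can be illustrated with a toy distance-based score. The feature vectors and the ratio below are invented for illustration only; the published definition of Ψ is not reproduced here.

```python
import math
from itertools import combinations

def euclidean(a, b):
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def mean(xs):
    return sum(xs) / len(xs)

def clarity_index(word_samples):
    """Toy clarity score: mean between-word centroid distance divided by mean
    within-word distance over utterance feature vectors.  Higher = clearer
    (consistent repetitions of each word, well-separated words).  This
    illustrates the consistency/distinction idea, not the published Psi."""
    within = [euclidean(u, v)
              for reps in word_samples.values()
              for u, v in combinations(reps, 2)]
    centroids = {word: tuple(mean(col) for col in zip(*reps))
                 for word, reps in word_samples.items()}
    between = [euclidean(centroids[a], centroids[b])
               for a, b in combinations(centroids, 2)]
    return mean(between) / mean(within)

# Hypothetical 2-D acoustic feature vectors: three words, two repetitions each.
clear_speaker = {"pa": [(1.0, 1.1), (1.1, 1.0)],
                 "ta": [(5.0, 1.0), (5.1, 1.1)],
                 "ka": [(1.0, 5.0), (1.1, 5.1)]}
slurred_speaker = {"pa": [(1.0, 1.0), (2.5, 2.4)],
                   "ta": [(2.0, 1.5), (3.0, 2.5)],
                   "ka": [(1.5, 2.0), (2.6, 3.0)]}
print(clarity_index(clear_speaker) > clarity_index(slurred_speaker))  # → True
```

A speaker whose repetitions cluster tightly and whose words stay acoustically far apart scores high; slurred, overlapping productions drive the score down, mirroring the paper's link between clarity and recognition rate.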

  19. A Pilot Investigation of Speech Sound Disorder Intervention Delivered by Telehealth to School-Age Children

    Directory of Open Access Journals (Sweden)

    Sue Grogan-Johnson

    2011-05-01

    Full Text Available This article describes a school-based telehealth service delivery model and reports outcomes made by school-age students with speech sound disorders in a rural Ohio school district. Speech therapy using computer-based speech sound intervention materials was provided either by live interactive videoconferencing (telehealth) or conventional side-by-side intervention. Progress was measured using pre- and post-intervention scores on the Goldman Fristoe Test of Articulation-2 (Goldman & Fristoe, 2002). Students in both service delivery models made significant improvements in speech sound production, with students in the telehealth condition demonstrating greater mastery of their Individual Education Plan (IEP) goals. Live interactive videoconferencing thus appears to be a viable method for delivering intervention for speech sound disorders to children in a rural, public school setting. Keywords: Telehealth, telerehabilitation, videoconferencing, speech sound disorder, speech therapy, speech-language pathology; E-Helper

  20. The design of a device for hearer and feeler differentiation, part A. [speech modulated hearing device

    Science.gov (United States)

    Creecy, R.

    1974-01-01

    A speech-modulated white-noise device is reported that conveys the rhythmic characteristics of a speech signal for intelligible reception by deaf persons. The signal is composed of random amplitudes and frequencies modulated by the speech-envelope characteristics of rhythm and stress. The time-intensity parameters of speech are conveyed through vibro-tactile stimuli.

  1. Characteristics of motor speech phenotypes in multiple sclerosis.

    Science.gov (United States)

    Rusz, Jan; Benova, Barbora; Ruzickova, Hana; Novotny, Michal; Tykalova, Tereza; Hlavnicka, Jan; Uher, Tomas; Vaneckova, Manuela; Andelova, Michaela; Novotna, Klara; Kadrnozkova, Lucie; Horakova, Dana

    2018-01-01

    Motor speech disorders in multiple sclerosis (MS) are poorly understood and their quantitative, objective acoustic characterization remains limited. Additionally, little data regarding relationships between the severity of speech disorders and neurological involvement in MS, as well as the contribution of pyramidal and cerebellar functional systems to speech phenotypes, is available. Speech data were acquired from 141 MS patients with Expanded Disability Status Scale (EDSS) scores ranging from 1 to 6.5 and 70 matched healthy controls. Objective acoustic speech assessment including subtests on phonation, oral diadochokinesis, articulation and prosody was performed. The prevalence of dysarthria in our MS cohort was 56% while the severity was generally mild and primarily consisted of a combination of spastic and ataxic components. Prosodic-articulatory disorder presenting with monopitch, articulatory decay, excess loudness variations and slow rate was the most salient. Speech disorders reflected subclinical motor impairment with 78% accuracy in discriminating between a subgroup of asymptomatic MS (EDSS oral diadochokinesis and the 9-Hole Peg Test (r = -0.65, p oral diadochokinesis and excess loudness variations significantly separated pure pyramidal and mixed pyramidal-cerebellar MS subgroups. Automated speech analyses may provide valuable biomarkers of disease progression in MS as dysarthria represents a common and early manifestation that reflects disease disability and underlying pyramidal-cerebellar pathophysiology. Copyright © 2017 Elsevier B.V. All rights reserved.

  2. [Modeling developmental aspects of sensorimotor control of speech production].

    Science.gov (United States)

    Kröger, B J; Birkholz, P; Neuschaefer-Rube, C

    2007-05-01

    Detailed knowledge of the neurophysiology of speech acquisition is important for understanding the developmental aspects of speech perception and production and for understanding developmental disorders of speech perception and production. A computer-implemented neural model of sensorimotor control of speech production was developed. The model is capable of demonstrating in detail the neural functions of different cortical areas during speech production. (i) Two sensory and two motor maps or neural representations, together with the appertaining neural mappings or projections, establish the sensorimotor feedback control system. These maps and mappings are already formed and trained during the prelinguistic phase of speech acquisition. (ii) The feedforward sensorimotor control system comprises the lexical map (representations of sounds, syllables, and words of the first language) and the mappings from the lexical to the sensory and motor maps. The training of the appertaining mappings forms the linguistic phase of speech acquisition. (iii) Three prelinguistic learning phases (silent mouthing, quasi-stationary vocalic articulation, and realisation of articulatory protogestures) can be defined on the basis of our simulation studies using the computational neural model. These learning phases can be associated with temporal phases of prelinguistic speech acquisition obtained from natural data. The neural model illuminates the detailed function of specific cortical areas during speech production. In particular, it can be shown that developmental disorders of speech production may result from a delayed or incorrect process within one of the prelinguistic learning phases defined by the neural model.

  3. Reprogramming the articulated robotic arm for glass handling by using Arduino microcontroller

    Science.gov (United States)

    Razali, Zol Bahri; Kader, Mohamed Mydin M. Abdul; Kadir, Mohd Asmadi Akmal; Daud, Mohd Hisam

    2017-09-01

    The application of articulated robotic arms in industry is growing as robots increasingly replace humans, especially in harmful tasks. However, problems arise with the programs used to control such arms. The purpose of this project was therefore to design, fabricate and integrate an articulated robotic arm, controlled by an Arduino microcontroller, for a glass-handling sorting system. The project was designed to segregate glass from non-glass waste, a pioneering step towards recycling. The robotic arm operates with four servo motors: three for the body and one for the holding mechanism. The system is controlled by an Arduino microcontroller and fitted with an optical sensor to distinguish the objects to be handled. A SolidWorks model was used to produce the detailed design of the robotic arm, and the mechanical properties were analysed using CAD software.

  4. Sensory Intelligence for Extraction of an Abstract Auditory Rule: A Cross-Linguistic Study.

    Science.gov (United States)

    Guo, Xiao-Tao; Wang, Xiao-Dong; Liang, Xiu-Yuan; Wang, Ming; Chen, Lin

    2018-02-21

    In a complex linguistic environment, while speech sounds can greatly vary, some shared features are often invariant. These invariant features constitute so-called abstract auditory rules. Our previous study has shown that with auditory sensory intelligence, the human brain can automatically extract the abstract auditory rules in the speech sound stream, presumably serving as the neural basis for speech comprehension. However, whether the sensory intelligence for extraction of abstract auditory rules in speech is inherent or experience-dependent remains unclear. To address this issue, we constructed a complex speech sound stream using auditory materials in Mandarin Chinese, in which syllables had a flat lexical tone but differed in other acoustic features to form an abstract auditory rule. This rule was occasionally and randomly violated by the syllables with the rising, dipping or falling tone. We found that both Chinese and foreign speakers detected the violations of the abstract auditory rule in the speech sound stream at a pre-attentive stage, as revealed by the whole-head recordings of mismatch negativity (MMN) in a passive paradigm. However, MMNs peaked earlier in Chinese speakers than in foreign speakers. Furthermore, Chinese speakers showed different MMN peak latencies for the three deviant types, which paralleled recognition points. These findings indicate that the sensory intelligence for extraction of abstract auditory rules in speech sounds is innate but shaped by language experience. Copyright © 2018 IBRO. Published by Elsevier Ltd. All rights reserved.

  5. Prevalence of Speech Disorders in Elementary School Students in Jordan

    Science.gov (United States)

    Al-Jazi, Aya Bassam; Al-Khamra, Rana

    2015-01-01

    Goal: The aim of this study was to find the prevalence of speech (articulation, voice, and fluency) disorders among elementary school students from first grade to fourth grade. This research was based on the screening implemented as part of the Madrasati Project, which is designed to serve the school system in Jordan. Method: A sample of 1,231…

  6. Development of The Viking Speech Scale to classify the speech of children with cerebral palsy.

    Science.gov (United States)

    Pennington, Lindsay; Virella, Daniel; Mjøen, Tone; da Graça Andrada, Maria; Murray, Janice; Colver, Allan; Himmelmann, Kate; Rackauskaite, Gija; Greitane, Andra; Prasauskiene, Audrone; Andersen, Guro; de la Cruz, Javier

    2013-10-01

    Surveillance registers monitor the prevalence of cerebral palsy and the severity of resulting impairments across time and place. The motor disorders of cerebral palsy can affect children's speech production and limit their intelligibility. We describe the development of a scale to classify children's speech performance for use in cerebral palsy surveillance registers, and its reliability across raters and across time. Speech and language therapists, other healthcare professionals and parents classified the speech of 139 children with cerebral palsy (85 boys, 54 girls; mean age 6.03 years, SD 1.09) from observation and previous knowledge of the children. Another group of health professionals rated children's speech from information in their medical notes. With the exception of parents, raters reclassified children's speech at least four weeks after their initial classification. Raters were asked to rate how easy the scale was to use and how well the scale described the child's speech production using Likert scales. Inter-rater reliability was moderate to substantial (k>.58 for all comparisons). Test-retest reliability was substantial to almost perfect for all groups (k>.68). Over 74% of raters found the scale easy or very easy to use; 66% of parents and over 70% of health care professionals judged the scale to describe children's speech well or very well. We conclude that the Viking Speech Scale is a reliable tool to describe the speech performance of children with cerebral palsy, which can be applied through direct observation of children or through case note review. Copyright © 2013 Elsevier Ltd. All rights reserved.
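The reliability coefficients reported above (k > .58 across raters, k > .68 test-retest) are chance-corrected agreement statistics of the Cohen's kappa family. A minimal sketch of Cohen's kappa for two raters follows; the Viking Speech Scale levels and the sample ratings are invented for illustration.

```python
from collections import Counter

def cohens_kappa(rater_a, rater_b):
    """Chance-corrected agreement between two raters over paired labels:
    kappa = (p_observed - p_expected) / (1 - p_expected)."""
    n = len(rater_a)
    p_obs = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    freq_a, freq_b = Counter(rater_a), Counter(rater_b)
    p_exp = sum(freq_a[c] * freq_b[c] for c in freq_a) / n ** 2
    return (p_obs - p_exp) / (1 - p_exp)

# Invented Viking Speech Scale levels (I-IV) for ten children, two raters.
slt    = ["I", "I", "II", "II",  "III", "III", "IV", "I", "II", "IV"]
parent = ["I", "I", "II", "III", "III", "III", "IV", "I", "II", "III"]
kappa = cohens_kappa(slt, parent)
print(round(kappa, 2))  # → 0.73, "substantial" on the usual benchmarks
```

Raw percent agreement here is 80%, but kappa discounts the agreement expected by chance from each rater's label frequencies, which is why surveillance studies like this one report kappa rather than simple agreement.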

  7. DISORDERS OF THE SOUND ARTICULATION IN PRETERM CHILDREN

    Directory of Open Access Journals (Sweden)

    Vesela MILANKOV

    2009-11-01

    Full Text Available Speech and language development is a good indicator of a child's cognitive development. The risk factors influencing the development and functioning of prematurely born children are multiple. In addition to articulation disorders, there are motor, cognitive and social aspects of delayed development. Premature babies are born before they are physically ready to leave the womb. Most babies born after about 26 weeks of gestational age have a chance of survival, but they are at a greater risk of medical complications, since the earlier children are born, the less developed their organs are. Aim: To demonstrate basic parameters, establish differences, and determine the characteristics of sound articulation disorders in full-term and preterm children. Methodology: The research was conducted at the Clinic of Child Habilitation and Rehabilitation in Novi Sad. A prospective study was carried out comprising 61 Serbian-speaking children with a mean age of 4 years. The inclusion criteria were gestational age and birth weight; only children without major neurologic or systemic disabilities were included. The sample comprised 31 full-term children with gestational age ≥38 weeks and birth weight ≥3000 g, while the preterm group comprised 30 children with gestational age ≤32 weeks and birth weight ≤1500 g. Results indicate a difference between full-term and preterm children with regard to articulation disorders, of which sound distortion was the statistically significant one. In the overall sample, substitution with distortion was the most frequent disorder, and interdental sigmatism was the most represented. Conclusion: The results lead to the conclusion that preterm children, being a high-risk group, need to be followed up until age two and provided with timely professional help at pre-school age, since numerous adverse factors affect their overall development.

  8. Development of a Low-Cost, Noninvasive, Portable Visual Speech Recognition Program.

    Science.gov (United States)

    Kohlberg, Gavriel D; Gal, Ya'akov Kobi; Lalwani, Anil K

    2016-09-01

    Loss of speech following tracheostomy and laryngectomy severely limits communication to simple gestures and facial expressions that are largely ineffective. To facilitate communication in these patients, we seek to develop a low-cost, noninvasive, portable, and simple visual speech recognition program (VSRP) to convert articulatory facial movements into speech. A Microsoft Kinect-based VSRP was developed to capture spatial coordinates of lip movements and translate them into speech. The articulatory speech movements associated with 12 sentences were used to train an artificial neural network classifier. The accuracy of the classifier was then evaluated on a separate, previously unseen set of articulatory speech movements. The VSRP was successfully implemented and tested in 5 subjects. It achieved an accuracy rate of 77.2% (65.0%-87.6% for the 5 speakers) on a 12-sentence data set. The mean time to classify an individual sentence was 2.03 milliseconds (1.91-2.16). We have demonstrated the feasibility of a low-cost, noninvasive, portable VSRP based on Kinect to accurately predict speech from articulation movements in clinically trivial time. This VSRP could be used as a novel communication device for aphonic patients. © The Author(s) 2016.

  9. Fonoterapia em glossectomia total: estudo de caso Speech therapy in total glossectomy: case study

    Directory of Open Access Journals (Sweden)

    Camila Alves Vieira

    2011-12-01

    evaluation, the subject was diagnosed with severe mechanical oropharyngeal dysphagia and alteration in speech articulation. Speech rehabilitation used direct and indirect therapies. Indirect therapy focused on oral motor control, sensitivity, mobility, motricity, tonus and posture of the structures adjacent to the resected tongue. Direct therapy used the head-back posture maneuver to help the ejection of food into the pharynx. The patient started exclusive oral feeding, except for solid foods, after ten months in treatment. Over-articulation, speed and rhythm exercises were used to improve speech intelligibility. Thus, the results of the speech-language pathology intervention were considered positive, and the patient was discharged after a year in treatment. It is concluded that tongue resections cause significant sequelae to the swallowing and speech functions and, therefore, speech-language pathology intervention is indispensable for the modification and adaptation of these functions, in addition to providing the patient with better quality of life.

  10. Pure apraxia of speech due to infarct in premotor cortex.

    Science.gov (United States)

    Patira, Riddhi; Ciniglia, Lauren; Calvert, Timothy; Altschuler, Eric L

    Apraxia of speech (AOS) is now recognized as an articulation disorder distinct from dysarthria and aphasia. Various lesions have been associated with AOS in studies that are limited in precise localization due to variability in the size and type of pathology. We present a case of pure AOS in the setting of an acute stroke, localizing more precisely than before the brain area responsible for AOS: the dorsal premotor cortex (dPMC). The dPMC is in a unique position to plan and coordinate speech production by virtue of its connections with the nearby motor cortex harboring the corticobulbar tract, the supplementary motor area, the inferior frontal operculum, and the temporo-parietal area via the dorsal stream of the dual-stream model of speech processing. The role of the dPMC is further supported as part of the dorsal stream in the dual-stream model of speech processing, as well as a controller in the hierarchical state feedback control model. Copyright © 2017 Polish Neurological Society. Published by Elsevier Urban & Partner Sp. z o.o. All rights reserved.

  11. Sleep Disrupts High-Level Speech Parsing Despite Significant Basic Auditory Processing.

    Science.gov (United States)

    Makov, Shiri; Sharon, Omer; Ding, Nai; Ben-Shachar, Michal; Nir, Yuval; Zion Golumbic, Elana

    2017-08-09

    The extent to which the sleeping brain processes sensory information remains unclear. This is particularly true for continuous and complex stimuli such as speech, in which information is organized into hierarchically embedded structures. Recently, novel metrics for assessing the neural representation of continuous speech have been developed using noninvasive brain recordings that have thus far only been tested during wakefulness. Here we investigated, for the first time, the sleeping brain's capacity to process continuous speech at different hierarchical levels using a newly developed Concurrent Hierarchical Tracking (CHT) approach that allows monitoring the neural representation and processing-depth of continuous speech online. Speech sequences were compiled with syllables, words, phrases, and sentences occurring at fixed time intervals such that different linguistic levels correspond to distinct frequencies. This enabled us to distinguish their neural signatures in brain activity. We compared the neural tracking of intelligible versus unintelligible (scrambled and foreign) speech across states of wakefulness and sleep using high-density EEG in humans. We found that neural tracking of stimulus acoustics was comparable across wakefulness and sleep and similar across all conditions regardless of speech intelligibility. In contrast, neural tracking of higher-order linguistic constructs (words, phrases, and sentences) was only observed for intelligible speech during wakefulness and could not be detected at all during nonrapid eye movement or rapid eye movement sleep. These results suggest that, whereas low-level auditory processing is relatively preserved during sleep, higher-level hierarchical linguistic parsing is severely disrupted, thereby revealing the capacity and limits of language processing during sleep. 
SIGNIFICANCE STATEMENT Despite the persistence of some sensory processing during sleep, it is unclear whether high-level cognitive processes such as speech
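The Concurrent Hierarchical Tracking logic described above, in which syllables, words, phrases, and sentences occur at fixed rates so that each linguistic level maps onto a distinct spectral frequency, can be illustrated with a naive DFT on simulated responses. The specific rates (4, 2, and 1 Hz), amplitudes, and noise level below are invented for illustration, not taken from the study.

```python
import math
import random

def dft_power(signal, fs, freq):
    """Power of `signal` at `freq` Hz from a single DFT bin (naive, stdlib-only)."""
    n = len(signal)
    re = sum(s * math.cos(2 * math.pi * freq * i / fs) for i, s in enumerate(signal))
    im = sum(s * math.sin(2 * math.pi * freq * i / fs) for i, s in enumerate(signal))
    return (re * re + im * im) / n

random.seed(1)
fs, dur = 64, 16                       # Hz, seconds; chosen so 1/2/4 Hz fall on exact bins
t = [i / fs for i in range(fs * dur)]

# Simulated "awake, intelligible speech" response: tracking at the syllable
# (4 Hz), word (2 Hz) and sentence (1 Hz) rates, plus noise.
awake = [math.sin(2 * math.pi * 4 * x) +
         0.8 * math.sin(2 * math.pi * 2 * x) +
         0.6 * math.sin(2 * math.pi * 1 * x) +
         random.gauss(0, 0.5) for x in t]
# Simulated "asleep" response: acoustic (syllable-rate) tracking only.
asleep = [math.sin(2 * math.pi * 4 * x) + random.gauss(0, 0.5) for x in t]

for f in (4, 2, 1):
    print(f, round(dft_power(awake, fs, f), 1), round(dft_power(asleep, fs, f), 1))
```

In this toy version, both conditions show a strong 4 Hz (acoustic) peak, while the word- and sentence-rate peaks survive only in the "awake" signal, mirroring the paper's contrast between preserved low-level and disrupted high-level tracking during sleep.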

  12. Age-related changes to spectral voice characteristics affect judgments of prosodic, segmental, and talker attributes for child and adult speech

    Science.gov (United States)

    Dilley, Laura C.; Wieland, Elizabeth A.; Gamache, Jessica L.; McAuley, J. Devin; Redford, Melissa A.

    2013-01-01

    Purpose As children mature, changes in voice spectral characteristics covary with changes in speech, language, and behavior. Spectral characteristics were manipulated to alter the perceived ages of talkers’ voices while leaving critical acoustic-prosodic correlates intact, to determine whether perceived age differences were associated with differences in judgments of prosodic, segmental, and talker attributes. Method Speech was modified by lowering formants and fundamental frequency, for 5-year-old children’s utterances, or raising them, for adult caregivers’ utterances. Next, participants differing in awareness of the manipulation (Exp. 1a) or amount of speech-language training (Exp. 1b) made judgments of prosodic, segmental, and talker attributes. Exp. 2 investigated the effects of spectral modification on intelligibility. Finally, in Exp. 3 trained analysts used formal prosody coding to assess prosodic characteristics of spectrally-modified and unmodified speech. Results Differences in perceived age were associated with differences in ratings of speech rate, fluency, intelligibility, likeability, anxiety, cognitive impairment, and speech-language disorder/delay; effects of training and awareness of the manipulation on ratings were limited. There were no significant effects of the manipulation on intelligibility or formally coded prosody judgments. Conclusions Age-related voice characteristics can greatly affect judgments of speech and talker characteristics, raising cautionary notes for developmental research and clinical work. PMID:23275414

  13. Speech and Communication Changes Reported by People with Parkinson's Disease.

    Science.gov (United States)

    Schalling, Ellika; Johansson, Kerstin; Hartelius, Lena

    2017-01-01

    Changes in communicative functions are common in Parkinson's disease (PD), but there are only limited data provided by individuals with PD on how these changes are perceived, what their consequences are, and what type of intervention is provided. To present self-reported information about speech and communication, the impact on communicative participation, and the amount and type of speech-language pathology services received by people with PD. Respondents with PD recruited via the Swedish Parkinson's Disease Society filled out a questionnaire accessed via a Web link or provided in a paper version. Of 188 respondents, 92.5% reported at least one symptom related to communication; the most common symptoms were weak voice, word-finding difficulties, imprecise articulation, and getting off topic in conversation. The speech and communication problems resulted in restricted communicative participation for between a quarter and a third of the respondents, and their speech caused embarrassment sometimes or more often to more than half. Forty-five percent of the respondents had received speech-language pathology services. Most respondents reported both speech and language symptoms, and many experienced restricted communicative participation. Access to speech-language pathology services is still inadequate. Services should also address cognitive/linguistic aspects to meet the needs of people with PD. © 2018 S. Karger AG, Basel.

  14. Morphometric Differences of Vocal Tract Articulators in Different Loudness Conditions in Singing.

    Science.gov (United States)

    Echternach, Matthias; Burk, Fabian; Burdumy, Michael; Traser, Louisa; Richter, Bernhard

    2016-01-01

    Dynamic MRI analysis of phonation has gathered interest in voice and speech physiology. However, there are limited data addressing the extent to which articulation depends on loudness. Twelve professional singers of different voice classifications were analysed with respect to vocal tract profiles recorded with dynamic real-time MRI at 25 fps in different pitch and loudness conditions. The subjects were asked to sing ascending scales on the vowel /a/ in three loudness conditions (comfortable = mf, very soft = pp, and very loud = ff). Furthermore, fundamental frequency and sound pressure level were analysed from the simultaneously recorded optical audio signal after noise cancellation. The data show articulatory differences with respect to changes of both pitch and loudness: lip opening and pharynx width increased. While the vertical larynx position rose with pitch, it was lower at greater loudness. Lip opening and pharynx width in particular were more strongly correlated with sound pressure level than with pitch. For the vowel /a/, loudness has an effect on articulation during singing, which should be considered when articulatory vocal tract data are interpreted.

  15. Handling risk attitudes for preference learning and intelligent decision support

    DEFF Research Database (Denmark)

    Franco de los Ríos, Camilo; Hougaard, Jens Leth; Nielsen, Kurt

    2015-01-01

    Intelligent decision support should allow integrating human knowledge with efficient algorithms for making interpretable and useful recommendations on real world decision problems. Attitudes and preferences articulate and come together under a decision process that should be explicitly modeled...

  16. Development of equally intelligible Telugu sentence-lists to test speech recognition in noise.

    Science.gov (United States)

    Tanniru, Kishore; Narne, Vijaya Kumar; Jain, Chandni; Konadath, Sreeraj; Singh, Niraj Kumar; Sreenivas, K J Ramadevi; K, Anusha

    2017-09-01

    To develop sentence lists in the Telugu language for the assessment of the speech recognition threshold (SRT) in the presence of background noise, through identification of the mean signal-to-noise ratio required to attain a 50% sentence recognition score (SRTn). This study was conducted in three phases. The first phase involved the selection and recording of Telugu sentences. In the second phase, 20 lists, each consisting of 10 sentences with equal intelligibility, were formulated using a numerical optimisation procedure. In the third phase, the SRTn of the developed lists was estimated using adaptive procedures on individuals with normal hearing. A total of 68 native Telugu speakers with normal hearing participated in the study. Of these, 18 (including the speakers) performed various subjective measures in the first phase, 20 performed sentence/word recognition in noise in the second phase, and 30 participated in the list equivalency procedures in the third phase. In all, 15 lists of comparable difficulty were formulated as test material. The mean SRTn across these lists corresponded to -2.74 (SD = 0.21). The developed sentence lists provide a valid and reliable tool to measure SRTn in native Telugu speakers.
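An adaptive SRTn procedure of the kind used in the third phase can be sketched as a simple 1-up/1-down staircase: the SNR drops after each correctly repeated sentence and rises after each error, so the track converges on the SNR giving 50% correct. Everything below (the logistic simulated listener, its true SRT of -2.7 dB chosen to echo the reported mean, the 2 dB step, and the reversal-averaging rule) is an assumption for illustration, not the study's actual protocol.

```python
import math
import random

def simulate_listener(snr, srt_true=-2.7, slope=1.0):
    """Probability of repeating a sentence correctly at a given SNR (logistic
    psychometric function; parameters are invented for this sketch)."""
    return 1.0 / (1.0 + math.exp(-slope * (snr - srt_true)))

def adaptive_srt(trials=200, start_snr=4.0, step=2.0, seed=7):
    """1-up/1-down staircase: converges on the SNR yielding 50% correct (SRTn)."""
    rng = random.Random(seed)
    snr, last_correct, reversals = start_snr, None, []
    for _ in range(trials):
        correct = rng.random() < simulate_listener(snr)
        if last_correct is not None and correct != last_correct:
            reversals.append(snr)       # track direction changed here
        last_correct = correct
        snr += -step if correct else step
    # Average the later reversal points; early ones still reflect the start level.
    tail = reversals[4:]
    return sum(tail) / len(tail)

print(round(adaptive_srt(), 1))  # estimate near the simulated SRT of -2.7 dB
```

Averaging reversal points is the textbook estimator for a 1-up/1-down track, which targets the 50% point of the psychometric function, exactly the SRTn definition used above.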

  17. Examination of Articulation in Patient Using Obturator by Means of Computer Planning

    Directory of Open Access Journals (Sweden)

    Somaieh Allahiary

    2013-02-01

    Full Text Available Background and Aims: Approximately 5% of cancers involve structures of the oral cavity. Partial resection of the maxilla (maxillectomy) may be performed in these cases. Maxillectomy often results in significant functional disabilities, such as inability in mastication, deglutition and speech, with an adverse impact on the psychological status and social life of patients. An obturator prosthesis is a prosthodontic treatment to separate the nasal and oral cavities and restore the critical functions mentioned above. Assessment of speech is used to examine the speech function restored by the treatment. The purpose of this study was to evaluate speech in patients with a resected maxilla who had been treated with an obturator prosthesis, drawn from a pool of related patients in the Prosthodontics department of the dental faculty, Tehran University of Medical Sciences. The evaluation was performed with computer software using a sentence intelligibility (SI) test. Materials and Methods: This cross-sectional study was conducted on 10 subjects (23-66 years) referred to the Prosthodontics department of the faculty who had received an obturator. After a primary examination of the prosthesis, the patients completed the SI test in an acoustic room under the guidance of a speech therapist, who then analysed the performed tests. In addition, SI with and without the prosthesis was evaluated by a lay audience. Statistical analyses were performed using the Wilcoxon signed-rank test and weighted kappa. Results: Significant differences were found between SI tests with and without the obturators (P<0.001). Two of the 10 patients showed problems in speech function when using the obturator. Conclusion: Within the limitations of the present study, obturators had a significant effect on improving the speech outcomes of the examined patients. Improvement in quality of life could be predicted.

  18. Perception of the multisensory coherence of fluent audiovisual speech in infancy: its emergence and the role of experience.

    Science.gov (United States)

    Lewkowicz, David J; Minar, Nicholas J; Tift, Amy H; Brandon, Melissa

    2015-02-01

    To investigate the developmental emergence of the perception of the multisensory coherence of native and non-native audiovisual fluent speech, we tested 4-, 8- to 10-, and 12- to 14-month-old English-learning infants. Infants first viewed two identical female faces articulating two different monologues in silence and then in the presence of an audible monologue that matched the visible articulations of one of the faces. Neither the 4-month-old nor 8- to 10-month-old infants exhibited audiovisual matching in that they did not look longer at the matching monologue. In contrast, the 12- to 14-month-old infants exhibited matching and, consistent with the emergence of perceptual expertise for the native language, perceived the multisensory coherence of native-language monologues earlier in the test trials than that of non-native language monologues. Moreover, the matching of native audible and visible speech streams observed in the 12- to 14-month-olds did not depend on audiovisual synchrony, whereas the matching of non-native audible and visible speech streams did depend on synchrony. Overall, the current findings indicate that the perception of the multisensory coherence of fluent audiovisual speech emerges late in infancy, that audiovisual synchrony cues are more important in the perception of the multisensory coherence of non-native speech than that of native audiovisual speech, and that the emergence of this skill most likely is affected by perceptual narrowing. Copyright © 2014 Elsevier Inc. All rights reserved.

  19. The relationship between high-frequency pure-tone hearing loss, hearing in noise test (HINT) thresholds, and the articulation index.

    Science.gov (United States)

    Vermiglio, Andrew J; Soli, Sigfrid D; Freed, Daniel J; Fisher, Laurel M

    2012-01-01

    Speech recognition in noise testing has been conducted at least since the 1940s (Dickson et al, 1946). The ability to recognize speech in noise is a distinct function of the auditory system (Plomp, 1978). According to Kochkin (2002), difficulty recognizing speech in noise is the primary complaint of hearing aid users. However, speech recognition in noise testing has not found widespread use in the field of audiology (Mueller, 2003; Strom, 2003; Tannenbaum and Rosenfeld, 1996). The audiogram has been used as the "gold standard" for hearing ability. However, the audiogram is a poor indicator of speech recognition in noise ability. This study investigates the relationship between pure-tone thresholds, the articulation index, and the ability to recognize speech in quiet and in noise. Pure-tone thresholds were measured for audiometric frequencies 250-6000 Hz. Pure-tone threshold groups were created. These included a normal threshold group and slight, mild, severe, and profound high-frequency pure-tone threshold groups. Speech recognition thresholds in quiet and in noise were obtained using the Hearing in Noise Test (HINT) (Nilsson et al, 1994; Vermiglio, 2008). The articulation index was determined by using Pavlovic's method with pure-tone thresholds (Pavlovic, 1989, 1991). Two hundred seventy-eight participants were tested. All participants were native speakers of American English. Sixty-three of the original participants were removed in order to create groups of participants with normal low-frequency pure-tone thresholds and relatively symmetrical high-frequency pure-tone threshold groups. The final set of 215 participants had a mean age of 33 yr with a range of 17-59 yr. Pure-tone threshold data were collected using the Hughson-Westlake procedure. Speech recognition data were collected using a Windows-based HINT software system. Statistical analyses were conducted using descriptive, correlational, and multivariate analysis of covariance (MANCOVA) statistics. The
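    A threshold-based articulation index of the kind referenced here is, at heart, an importance-weighted sum of band audibilities derived from pure-tone thresholds. The sketch below illustrates that idea only; the band-importance weights, speech levels, and 30 dB dynamic range are illustrative placeholders, not Pavlovic's published values.

```python
import numpy as np

# Hypothetical band-importance weights (sum to 1.0) and speech levels,
# for illustration only; a real implementation would use published values.
BANDS_HZ = [250, 500, 1000, 2000, 4000, 6000]
IMPORTANCE = np.array([0.10, 0.14, 0.22, 0.27, 0.20, 0.07])

def articulation_index(thresholds_db_hl, speech_levels_db_hl):
    """Approximate a threshold-based articulation index: each band
    contributes its importance weight scaled by the audible fraction
    of an assumed ~30 dB speech dynamic range above threshold."""
    thresholds = np.asarray(thresholds_db_hl, dtype=float)
    speech = np.asarray(speech_levels_db_hl, dtype=float)
    audibility = np.clip((speech - thresholds) / 30.0, 0.0, 1.0)
    return float(np.sum(IMPORTANCE * audibility))

# Normal thresholds: full audibility in every band, AI near 1.0.
print(articulation_index([5, 5, 5, 5, 5, 5], [50, 55, 50, 45, 40, 35]))
# A high-frequency loss removes the 2-6 kHz bands' contributions.
print(articulation_index([5, 5, 10, 50, 70, 80], [50, 55, 50, 45, 40, 35]))
```

    This mirrors the study's premise: two listeners with identical low-frequency thresholds but different high-frequency losses can have very different articulation indices.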

  20. Sentence-Level Movements in Parkinson's Disease: Loud, Clear, and Slow Speech

    Science.gov (United States)

    Kearney, Elaine; Giles, Renuka; Haworth, Brandon; Faloutsos, Petros; Baljko, Melanie; Yunusova, Yana

    2017-01-01

    Purpose: To further understand the effect of Parkinson's disease (PD) on articulatory movements in speech and to expand our knowledge of therapeutic treatment strategies, this study examined movements of the jaw, tongue blade, and tongue dorsum during sentence production with respect to speech intelligibility and compared the effect of varying…

  1. Visemic Processing in Audiovisual Discrimination of Natural Speech: A Simultaneous fMRI-EEG Study

    Science.gov (United States)

    Dubois, Cyril; Otzenberger, Helene; Gounot, Daniel; Sock, Rudolph; Metz-Lutz, Marie-Noelle

    2012-01-01

    In a noisy environment, visual perception of articulatory movements improves natural speech intelligibility. Parallel to phonemic processing based on auditory signal, visemic processing constitutes a counterpart based on "visemes", the distinctive visual units of speech. Aiming at investigating the neural substrates of visemic processing in a…

  2. Intelligent Shutter Speech Control System Based on DSP

    Directory of Open Access Journals (Sweden)

    Yonghong Deng

    2017-01-01

    Full Text Available Based on the TMS320F28035 DSP, this paper designs a smart shutter voice-control system that realizes the opening and closing of shutters, intelligent switching of the lighting mode, and solar power supply through voice control. The traditional control mode is converted to voice control, complemented by automatic lighting and a solar power supply. The system makes daily life more convenient while satisfying the contemporary pursuit of both intelligence and environmental protection. The whole system is simple, low-cost, safe and reliable.

  3. Within-subjects comparison of the HiRes and Fidelity120 speech processing strategies: speech perception and its relation to place-pitch sensitivity.

    Science.gov (United States)

    Donaldson, Gail S; Dawson, Patricia K; Borden, Lamar Z

    2011-01-01

    Previous studies have confirmed that current steering can increase the number of discriminable pitches available to many cochlear implant (CI) users; however, the ability to perceive additional pitches has not been linked to improved speech perception. The primary goals of this study were to determine (1) whether adult CI users can achieve higher levels of spectral cue transmission with a speech processing strategy that implements current steering (Fidelity120) than with a predecessor strategy (HiRes) and, if so, (2) whether the magnitude of improvement can be predicted from individual differences in place-pitch sensitivity. A secondary goal was to determine whether Fidelity120 supports higher levels of speech recognition in noise than HiRes. A within-subjects repeated measures design evaluated speech perception performance with Fidelity120 relative to HiRes in 10 adult CI users. Subjects used the novel strategy (either HiRes or Fidelity120) for 8 wks during the main study; a subset of five subjects used Fidelity120 for three additional months after the main study. Speech perception was assessed for the spectral cues related to vowel F1 frequency, vowel F2 frequency, and consonant place of articulation; overall transmitted information for vowels and consonants; and sentence recognition in noise. Place-pitch sensitivity was measured for electrode pairs in the apical, middle, and basal regions of the implanted array using a psychophysical pitch-ranking task. With one exception, there was no effect of strategy (HiRes versus Fidelity120) on the speech measures tested, either during the main study (N = 10) or after extended use of Fidelity120 (N = 5). The exception was a small but significant advantage for HiRes over Fidelity120 for consonant perception during the main study. Examination of individual subjects' data revealed that 3 of 10 subjects demonstrated improved perception of one or more spectral cues with Fidelity120 relative to HiRes after 8 wks or longer

  4. Telehealth Delivery of Rapid Syllable Transitions (ReST) Treatment for Childhood Apraxia of Speech

    Science.gov (United States)

    Thomas, Donna C.; McCabe, Patricia; Ballard, Kirrie J.; Lincoln, Michelle

    2016-01-01

    Background: Rapid Syllable Transitions (ReST) treatment uses pseudo-word targets with varying lexical stress to target simultaneously articulation, prosodic accuracy and coarticulatory transitions in childhood apraxia of speech (CAS). The treatment is efficacious for the acquisition of imitated pseudo-words, and generalization of skill to…

  5. Comparison of speech performance in labial and lingual orthodontic patients: A prospective study

    Science.gov (United States)

    Rai, Ambesh Kumar; Rozario, Joe E.; Ganeshkar, Sanjay V.

    2014-01-01

    Background: The intensity and duration of the speech difficulty inherently associated with lingual therapy is a significant concern in orthodontics. This study was designed to evaluate and compare the duration of changes in speech between labial and lingual orthodontics. Materials and Methods: A prospective longitudinal clinical study was designed to assess the speech of 24 patients undergoing labial or lingual orthodontic treatment. An objective spectrographic evaluation of the /s/ sound was done using the software PRAAT version 5.0.47, a semi-objective auditive evaluation of articulation was done by four speech pathologists, and a subjective assessment of speech was done by four laypersons. The tests were performed before (T1), within 24 h (T2), after 1 week (T3) and after 1 month (T4) of the start of therapy. The Mann-Whitney U-test for independent samples was used to assess the significance of differences between the labial and lingual appliances. A speech alteration with P < 0.05 was considered significant. Results: Both appliance systems caused a comparable speech difficulty immediately after bonding (T2). Although speech recovered within a week in the labial group (T3), the lingual group continued to experience discomfort even after a month (T4). PMID:25540661

  6. Correlation between crystallographic computing and artificial intelligence research

    Energy Technology Data Exchange (ETDEWEB)

    Feigenbaum, E A [Stanford Univ., CA; Engelmore, R S; Johnson, C K

    1977-01-01

    Artificial intelligence research, as a part of computer science, has produced a variety of programs of experimental and applications interest: programs for scientific inference, chemical synthesis planning, robot control, extraction of meaning from English sentences, speech understanding, interpretation of visual images, and so on. The symbolic manipulation techniques used in artificial intelligence provide a framework for analyzing and coding the knowledge base of a problem independently of an algorithmic implementation. A possible application of artificial intelligence methodology to protein crystallography is described. 2 figures, 2 tables.

  7. Effect of emotion and articulation of speech on the Uncanny Valley in virtual characters

    DEFF Research Database (Denmark)

    Tinwell, Angela; Grimshaw, Mark Nicholas; Abdel Nabi, Debbie

    2011-01-01

    This paper presents a study of how exaggerated facial expression in the lower face region affects perception of emotion and the Uncanny Valley phenomenon in realistic, human-like, virtual characters. Characters communicated the six basic emotions: anger, disgust, fear, happiness, sadness and surprise...... with normal and exaggerated mouth movements. Measures were taken for perceived familiarity and human-likeness. The results showed that an increased intensity of articulation significantly reduced the uncanny for anger, yet increased perception of the uncanny for characters expressing happiness...

  8. Dysarthric Bengali speech: A neurolinguistic study

    Directory of Open Access Journals (Sweden)

    Chakraborty N

    2008-01-01

    Full Text Available Background and Aims: Dysarthria affects linguistic domains such as respiration, phonation, articulation, resonance and prosody due to upper motor neuron, lower motor neuron, cerebellar or extrapyramidal tract lesions. Although Bengali is one of the major languages globally, dysarthric Bengali speech has not been subjected to neurolinguistic analysis. We attempted such an analysis with the goal of identifying the speech defects of native Bengali speakers in the various types of dysarthria encountered in neurological disorders. Settings and Design: A cross-sectional observational study was conducted with 66 dysarthric subjects, predominantly middle-aged males, attending the Neuromedicine OPD of a tertiary care teaching hospital in Kolkata. Materials and Methods: After neurological examination, an instrument comprising commonly used Bengali words and a text block covering all Bengali vowels and consonants was used to carry out perceptual analysis of dysarthric speech. From recorded speech, 24 parameters pertaining to five linguistic domains were assessed. The Kruskal-Wallis analysis of variance, Chi-square test and Fisher's exact test were used for analysis. Results: The dysarthria types were spastic (15 subjects), flaccid (10), mixed (12), hypokinetic (12), hyperkinetic (9) and ataxic (8). Of the 24 parameters assessed, 15 were found to occur in one or more types with a prevalence of at least 25%. Imprecise consonants were the most frequently occurring defect in most dysarthrias. The spectrum of defects in each type was identified. Some parameters were capable of distinguishing between types. Conclusions: This perceptual analysis has defined the linguistic defects likely to be encountered in dysarthric Bengali speech in neurological disorders. The speech distortion can be described and distinguished by a limited number of parameters. This may be of importance to the speech therapist and neurologist in planning rehabilitation and further management.

  9. Rule-Based Storytelling Text-to-Speech (TTS) Synthesis

    Directory of Open Access Journals (Sweden)

    Ramli Izzad

    2016-01-01

    Full Text Available In recent years, various real-life applications such as talking books, gadgets and humanoid robots have drawn attention to research in the area of expressive speech synthesis. Speech synthesis is widely used in various applications. However, there is a growing need for expressive speech synthesis, especially for communication and robotics. In this paper, global and local rules are developed to convert neutral speech to storytelling-style speech for the Malay language. To generate the rules, modifications of prosodic parameters such as pitch, intensity, duration, tempo and pauses are considered. The modification of prosodic parameters is examined by performing prosodic analysis on a story collected from an experienced female and a male storyteller. The global and local rules are applied at the sentence level and synthesized using HNM. Subjective tests were conducted to evaluate the synthesized storytelling speech quality of both rules based on naturalness, intelligibility, and similarity to the original storytelling speech. The results showed that the global rule gives better results than the local rule.

  10. Impairment of vowel articulation as a possible marker of disease progression in Parkinson's disease.

    Directory of Open Access Journals (Sweden)

    Sabine Skodda

    Full Text Available PURPOSE: The aim of the current study was to survey whether vowel articulation in speakers with Parkinson's disease (PD) shows specific changes in the course of the disease. METHOD: 67 patients with PD (42 male) and 40 healthy speakers (20 male) were tested and retested after an average time interval of 34 months. Participants had to read a given text as the source for subsequent calculation of the triangular vowel space area (tVSA) and vowel articulation index (VAI). Measurement of tVSA and VAI was based upon analysis of the first and second formants of the vowels /α/, /i/ and /u/ extracted from defined words within the text. RESULTS: At the first visit, VAI values were reduced in male and female PD patients as compared to the control group, and showed a further decrease at the second visit. Only in female parkinsonian speakers was VAI correlated with overall speech impairment based upon perceptual impression. VAI and tVSA were correlated with gait impairment, but no correlations were seen between VAI and global motor impairment or overall disease duration. tVSA showed a similar reduction in the PD group as compared to the control group and was also found to decline further between the first and second examination in female, but not in male, speakers with PD. CONCLUSIONS: Measurement of VAI seems to be superior to tVSA in describing impaired vowel articulation and its further decline in the course of the disease in PD. Since impairment of vowel articulation was found to be independent of global motor function but correlated with gait dysfunction, measurement of vowel articulation might have the potential to serve as a marker of axial disease progression.
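    Both metrics in this record have standard closed-form definitions from the corner-vowel formants: tVSA is the area of the F1/F2 triangle spanned by /a/, /i/ and /u/ (shoelace formula), and VAI is a ratio built so that distinct articulation raises it and vowel centralization lowers it. A sketch with illustrative formant values (typical of a male speaker, not data from the study):

```python
def vowel_metrics(f1_a, f2_a, f1_i, f2_i, f1_u, f2_u):
    """Triangular vowel space area (tVSA, Hz^2) and vowel articulation
    index (VAI) from the corner-vowel formants of /a/, /i/ and /u/."""
    # Shoelace formula for the area of the F1/F2 triangle
    tvsa = 0.5 * abs(f1_i * (f2_a - f2_u)
                     + f1_a * (f2_u - f2_i)
                     + f1_u * (f2_i - f2_a))
    # Formants that rise with distinct articulation over those that rise
    # with centralization; lower VAI means more centralized vowels
    vai = (f2_i + f1_a) / (f1_i + f1_u + f2_a + f2_u)
    return tvsa, vai

# Illustrative formant values in Hz
tvsa, vai = vowel_metrics(f1_a=750, f2_a=1300, f1_i=300, f2_i=2200,
                          f1_u=350, f2_u=800)
print(tvsa, round(vai, 3))  # 292500.0 1.073
```

    Because VAI normalizes one set of formants by another, it is less sensitive than tVSA to inter-speaker vocal tract differences, which is consistent with the study's finding that VAI describes the decline more reliably.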

  11. Speech Transduction Based on Linguistic Content

    DEFF Research Database (Denmark)

    Juel Henrichsen, Peter; Christiansen, Thomas Ulrich

    Digital hearing aids use a variety of advanced digital signal processing methods in order to improve speech intelligibility. These methods are based on knowledge about the acoustics outside the ear as well as psychoacoustics. This paper investigates the recent observation that speech elements...... with a high degree of information can be robustly identified based on basic acoustic properties, i.e., function words have greater spectral tilt than content words for each of the 18 Danish talkers investigated. In this paper we examine these spectral tilt differences as a function of time based on a speech...... material six times the duration of previous investigations. Our results show that the correlation of spectral tilt with information content is relatively constant across time, even if averaged across talkers. This indicates that it is possible to devise a robust method for estimating information density...

  12. DynaLearn-An Intelligent Learning Environment for Learning Conceptual Knowledge

    NARCIS (Netherlands)

    Bredeweg, Bert; Liem, Jochem; Beek, Wouter; Linnebank, Floris; Gracia, Jorge; Lozano, Esther; Wißner, Michael; Bühling, René; Salles, Paulo; Noble, Richard; Zitek, Andreas; Borisova, Petya; Mioduser, David

    2013-01-01

    Articulating thought in computer-based media is a powerful means for humans to develop their understanding of phenomena. We have created DynaLearn, an intelligent learning environment that allows learners to acquire conceptual knowledge by constructing and simulating qualitative models of how systems

  13. Using Zebra-speech to study sequential and simultaneous speech segregation in a cochlear-implant simulation.

    Science.gov (United States)

    Gaudrain, Etienne; Carlyon, Robert P

    2013-01-01

    Previous studies have suggested that cochlear implant users may have particular difficulties exploiting opportunities to glimpse clear segments of a target speech signal in the presence of a fluctuating masker. Although it has been proposed that this difficulty is associated with a deficit in linking the glimpsed segments across time, the details of this mechanism are yet to be explained. The present study introduces a method called Zebra-speech developed to investigate the relative contribution of simultaneous and sequential segregation mechanisms in concurrent speech perception, using a noise-band vocoder to simulate cochlear implants. One experiment showed that the saliency of the difference between the target and the masker is a key factor for Zebra-speech perception, as it is for sequential segregation. Furthermore, forward masking played little or no role, confirming that intelligibility was not limited by energetic masking but by across-time linkage abilities. In another experiment, a binaural cue was used to distinguish the target and the masker. It showed that the relative contribution of simultaneous and sequential segregation depended on the spectral resolution, with listeners relying more on sequential segregation when the spectral resolution was reduced. The potential of Zebra-speech as a segregation enhancement strategy for cochlear implants is discussed.

  14. Listener Perception of Monopitch, Naturalness, and Intelligibility for Speakers with Parkinson's Disease

    Science.gov (United States)

    Anand, Supraja; Stepp, Cara E.

    2015-01-01

    Purpose: Given the potential significance of speech naturalness to functional and social rehabilitation outcomes, the objective of this study was to examine the effect of listener perceptions of monopitch on speech naturalness and intelligibility in individuals with Parkinson's disease (PD). Method: Two short utterances were extracted from…

  15. Speech and language intervention in bilinguals

    Directory of Open Access Journals (Sweden)

    Eliane Ramos

    2011-12-01

    Full Text Available Increasingly, speech and language pathologists (SLPs) around the world are faced with the unique set of issues presented by their bilingual clients. Some professional associations in different countries have presented recommendations for assessing and treating bilingual populations. In children, most of the studies have focused on intervention for language and phonology/articulation impairments, and very few focus on stuttering. In general, studies of language intervention tend to agree that intervention in the first language (L1) either increases performance in the second language (L2) or does not hinder it. In bilingual adults, monolingual versus bilingual intervention is especially relevant in cases of aphasia; dysarthria in bilinguals has barely been approached. Most studies of cross-linguistic effects in bilingual aphasics have focused on lexical retrieval training. It has been noted that even though a majority of studies have disclosed cross-linguistic generalization from one language to the other, some methodological weaknesses are evident. It is concluded that even though speech and language intervention in bilinguals represents a most important clinical area in speech-language pathology, much more research using larger samples and controlling for potentially confounding variables is evidently required.

  16. Speech comprehension aided by multiple modalities: behavioural and neural interactions

    Science.gov (United States)

    McGettigan, Carolyn; Faulkner, Andrew; Altarelli, Irene; Obleser, Jonas; Baverstock, Harriet; Scott, Sophie K.

    2014-01-01

    Speech comprehension is a complex human skill, the performance of which requires the perceiver to combine information from several sources – e.g. voice, face, gesture, linguistic context – to achieve an intelligible and interpretable percept. We describe a functional imaging investigation of how auditory, visual and linguistic information interact to facilitate comprehension. Our specific aims were to investigate the neural responses to these different information sources, alone and in interaction, and further to use behavioural speech comprehension scores to address sites of intelligibility-related activation in multifactorial speech comprehension. In fMRI, participants passively watched videos of spoken sentences, in which we varied Auditory Clarity (with noise-vocoding), Visual Clarity (with Gaussian blurring) and Linguistic Predictability. Main effects of enhanced signal with increased auditory and visual clarity were observed in overlapping regions of posterior STS. Two-way interactions of the factors (auditory × visual, auditory × predictability) in the neural data were observed outside temporal cortex, where positive signal change in response to clearer facial information and greater semantic predictability was greatest at intermediate levels of auditory clarity. Overall changes in stimulus intelligibility by condition (as determined using an independent behavioural experiment) were reflected in the neural data by increased activation predominantly in bilateral dorsolateral temporal cortex, as well as inferior frontal cortex and left fusiform gyrus. Specific investigation of intelligibility changes at intermediate auditory clarity revealed a set of regions, including posterior STS and fusiform gyrus, showing enhanced responses to both visual and linguistic information. Finally, an individual differences analysis showed that greater comprehension performance in the scanning participants (measured in a post-scan behavioural test) was associated with

  17. Influence of timing of delayed hard palate closure on articulation skills in 3-year-old Danish children with unilateral cleft lip and palate.

    Science.gov (United States)

    Willadsen, Elisabeth; Boers, Maria; Schöps, Antje; Kisling-Møller, Mia; Nielsen, Joan Bogh; Jørgensen, Line Dahl; Andersen, Mikael; Bolund, Stig; Andersen, Helene Søgaard

    2018-01-01

    Differing results regarding articulation skills in young children with cleft palate (CP) have been reported and often interpreted as a consequence of different surgical protocols. To assess the influence of different timing of hard palate closure in a two-stage procedure on articulation skills in 3-year-olds born with unilateral cleft lip and palate (UCLP). Secondary aims were to compare results with peers without CP, and to investigate whether there are gender differences in articulation skills. Furthermore, burden of treatment was estimated in terms of secondary surgery, hearing and speech therapy. A randomized controlled trial (RCT). Early hard palate closure (EHPC) at 12 months versus late hard palate closure (LHPC) at 36 months in a two-stage procedure was tested in a cohort of 126 Danish-speaking children born with non-syndromic UCLP. All participants had the lip and soft palate closed around 4 months of age. Audio and video recordings of a naming test were available from 113 children (32 girls and 81 boys) and were transcribed phonetically. Recordings were obtained prior to hard palate closure in the LHPC group. The main outcome measures were percentage of consonants correct adjusted (PCC-A) and consonant errors from blinded assessments. Results from 36 Danish-speaking children without CP obtained previously by Willadsen in 2012 were used for comparison. Children with EHPC produced significantly more target consonants correctly (83%) than children with LHPC (48%; p < .001). In addition, children with LHPC produced significantly more active cleft speech characteristics than children with EHPC (p < .001). Boys achieved significantly lower PCC-A scores than girls (p = .04) and produced significantly more consonant errors than girls (p = .02). No significant differences were found between groups regarding burden of treatment. The control group performed significantly better than the EHPC and LHPC groups on all compared variables. © 2017 Royal College of Speech
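    The outcome measure above, percentage of consonants correct, is a simple proportion over phonetically transcribed productions; the adjusted variant (PCC-A) additionally counts common clinical distortions as correct. The sketch below shows only that scoring idea, not the full PCC-A protocol:

```python
def pcc_adjusted(scored_consonants):
    """Percentage of consonants correct from per-consonant judgements
    ('correct', 'distortion', 'error'). In PCC-A-style scoring, common
    distortions are counted as correct; this is a simplified sketch,
    not the full clinical adjustment rules."""
    counted_correct = sum(s in ("correct", "distortion")
                          for s in scored_consonants)
    return 100.0 * counted_correct / len(scored_consonants)

# 8 correct + 1 distortion count as correct out of 12 consonants
scores = ["correct"] * 8 + ["distortion"] * 1 + ["error"] * 3
print(pcc_adjusted(scores))  # 75.0
```

    On this scale, the 83% (EHPC) versus 48% (LHPC) difference reported above corresponds directly to the proportion of target consonants scored as acceptable in the naming test.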

  18. Using the Self-Select Paradigm to Delineate the Nature of Speech Motor Programming

    Science.gov (United States)

    Wright, David L.; Robin, Don A.; Rhee, Jooyhun; Vaculin, Amber; Jacks, Adam; Guenther, Frank H.; Fox, Peter T.

    2009-01-01

    Purpose: The authors examined the involvement of 2 speech motor programming processes identified by S. T. Klapp (1995, 2003) during the articulation of utterances differing in syllable and sequence complexity. According to S. T. Klapp, 1 process, INT, resolves the demands of the programmed unit, whereas a second process, SEQ, oversees the serial…

  19. Perception of the Multisensory Coherence of Fluent Audiovisual Speech in Infancy: Its Emergence & the Role of Experience

    Science.gov (United States)

    Lewkowicz, David J.; Minar, Nicholas J.; Tift, Amy H.; Brandon, Melissa

    2014-01-01

    To investigate the developmental emergence of the ability to perceive the multisensory coherence of native and non-native audiovisual fluent speech, we tested 4-, 8–10, and 12–14 month-old English-learning infants. Infants first viewed two identical female faces articulating two different monologues in silence and then in the presence of an audible monologue that matched the visible articulations of one of the faces. Neither the 4-month-old nor the 8–10 month-old infants exhibited audio-visual matching in that neither group exhibited greater looking at the matching monologue. In contrast, the 12–14 month-old infants exhibited matching and, consistent with the emergence of perceptual expertise for the native language, they perceived the multisensory coherence of native-language monologues earlier in the test trials than of non-native language monologues. Moreover, the matching of native audible and visible speech streams observed in the 12–14 month olds did not depend on audio-visual synchrony whereas the matching of non-native audible and visible speech streams did depend on synchrony. Overall, the current findings indicate that the perception of the multisensory coherence of fluent audiovisual speech emerges late in infancy, that audio-visual synchrony cues are more important in the perception of the multisensory coherence of non-native than native audiovisual speech, and that the emergence of this skill most likely is affected by perceptual narrowing. PMID:25462038

  20. Prediction of speech masking release for fluctuating interferers based on the envelope power signal-to-noise ratio

    DEFF Research Database (Denmark)

    Jørgensen, Søren; Dau, Torsten

    2012-01-01

    -hearing listeners in conditions with additive stationary noise, reverberation, and nonlinear processing with spectral subtraction. The latter condition represents a case in which the standardized speech intelligibility index and speech transmission index fail. However, the sEPSM is limited to conditions...... for the stationary and non-stationary interferers, demonstrating further that the envelope SNR is crucial for speech comprehension....
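    The sEPSM's decision metric is the envelope power signal-to-noise ratio: the modulation power of the speech relative to that of the interferer. The sketch below is a heavily simplified, single-band stand-in for that quantity (the actual model uses a modulation filterbank and further processing; the function names are illustrative):

```python
import numpy as np
from scipy import signal

def envelope_power(x, fs, cutoff_hz=30.0):
    """AC power of the low-pass temporal envelope, normalised by the
    envelope's DC power. A single-band simplification of the sEPSM's
    modulation analysis, not the model itself."""
    env = np.abs(signal.hilbert(x))               # Hilbert envelope
    b, a = signal.butter(2, cutoff_hz / (fs / 2))
    env = signal.filtfilt(b, a, env)              # keep slow modulations
    dc = env.mean()
    return float(np.mean((env - dc) ** 2) / dc ** 2)

def snr_env_db(noisy_speech, noise, fs):
    """Envelope SNR in dB: envelope power attributable to speech in the
    mixture, relative to the envelope power of the noise alone."""
    p_mix = envelope_power(noisy_speech, fs)
    p_noise = envelope_power(noise, fs)
    p_speech = max(p_mix - p_noise, 1e-8)         # floor keeps log defined
    return 10.0 * np.log10(p_speech / p_noise)

# Demo: a 4 Hz amplitude-modulated tone carries far more envelope power
# than the unmodulated carrier, hence a positive envelope SNR.
fs = 8000
t = np.arange(0, 1.0, 1.0 / fs)
carrier = np.sin(2 * np.pi * 1000 * t)
modulated = (1.0 + 0.8 * np.sin(2 * np.pi * 4 * t)) * carrier
```

    This captures why fluctuating maskers yield masking release in the model: pauses in the interferer leave the target's envelope modulations, and hence the envelope SNR, largely intact.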

  1. A novel speech prosthesis for mandibular guidance therapy in hemimandibulectomy patient: A clinical report

    Directory of Open Access Journals (Sweden)

    Raghavendra Adaki

    2016-01-01

    Full Text Available Treating diverse maxillofacial patients poses a challenge to the maxillofacial prosthodontist. Rehabilitation of hemimandibulectomy patients must aim at restoring mastication and other functions such as intelligible speech, swallowing, and esthetics. Prosthetic methods such as the palatal ramp and mandibular guiding flange reposition the deviated mandible. Such prostheses can also be used to restore speech in patients with debilitated speech following surgical resection. This clinical report details a hemimandibulectomy patient provided with an interim removable dental speech prosthesis with a composite resin flange for mandibular guidance therapy.

  2. Perceptual restoration of degraded speech is preserved with advancing age.

    Science.gov (United States)

    Saija, Jefta D; Akyürek, Elkan G; Andringa, Tjeerd C; Başkent, Deniz

    2014-02-01

    Cognitive skills, such as processing speed, memory functioning, and the ability to divide attention, are known to diminish with aging. The present study shows that, despite these changes, older adults can successfully compensate for degradations in speech perception. Critically, the older participants of this study were not pre-selected for high performance on cognitive tasks, but only screened for normal hearing. We measured the compensation for speech degradation using phonemic restoration, where intelligibility of degraded speech is enhanced using top-down repair mechanisms. Linguistic knowledge, Gestalt principles of perception, and expectations based on situational and linguistic context are used to effectively fill in the inaudible masked speech portions. A positive compensation effect was previously observed only with young normal hearing people, but not with older hearing-impaired populations, leaving the question whether the lack of compensation was due to aging or due to age-related hearing problems. Older participants in the present study showed poorer intelligibility of degraded speech than the younger group, as expected from previous reports of aging effects. However, in conditions that induce top-down restoration, a robust compensation was observed. Speech perception by the older group was enhanced, and the enhancement effect was similar to that observed with the younger group. This effect was even stronger with slowed-down speech, which gives more time for cognitive processing. Based on previous research, the likely explanations for these observations are that older adults can overcome age-related cognitive deterioration by relying on linguistic skills and vocabulary that they have accumulated over their lifetime. Alternatively, or simultaneously, they may use different cerebral activation patterns or exert more mental effort. This positive finding on top-down restoration skills by the older individuals suggests that new cognitive training methods

  3. Linearized motion estimation for articulated planes.

    Science.gov (United States)

    Datta, Ankur; Sheikh, Yaser; Kanade, Takeo

    2011-04-01

    In this paper, we describe the explicit application of articulation constraints for estimating the motion of a system of articulated planes. We relate articulations to the relative homography between planes and show that these articulations translate into linearized equality constraints on a linear least-squares system, which can be solved efficiently using a Karush-Kuhn-Tucker system. The articulation constraints can be applied for both gradient-based and feature-based motion estimation algorithms and to illustrate this, we describe a gradient-based motion estimation algorithm for an affine camera and a feature-based motion estimation algorithm for a projective camera that explicitly enforces articulation constraints. We show that explicit application of articulation constraints leads to numerically stable estimates of motion. The simultaneous computation of motion estimates for all of the articulated planes in a scene allows us to handle scene areas where there is limited texture information and areas that leave the field of view. Our results demonstrate the wide applicability of the algorithm in a variety of challenging real-world cases such as human body tracking, motion estimation of rigid, piecewise planar scenes, and motion estimation of triangulated meshes.
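    The construction described above, linearized articulation constraints imposed as equality constraints on a linear least-squares system and solved through a Karush-Kuhn-Tucker system, can be sketched generically. This is a sketch of equality-constrained least squares in general, not the paper's implementation:

```python
import numpy as np

def constrained_lls(A, b, C, d):
    """Solve min ||Ax - b||^2 subject to Cx = d via the KKT system
        [[A^T A, C^T],
         [C,     0  ]] [x; lam] = [A^T b; d],
    where lam holds the Lagrange multipliers of the equality constraints."""
    n, m = A.shape[1], C.shape[0]
    K = np.block([[A.T @ A, C.T],
                  [C, np.zeros((m, m))]])
    rhs = np.concatenate([A.T @ b, d])
    return np.linalg.solve(K, rhs)[:n]

# Toy example: fit x in R^2 to noisy observations while forcing the
# (stand-in "articulation") constraint x[0] + x[1] = 1 to hold exactly.
A = np.array([[1.0, 0.0],
              [0.0, 1.0],
              [1.0, 1.0]])
b = np.array([0.9, 0.3, 1.0])
C = np.array([[1.0, 1.0]])
d = np.array([1.0])
x = constrained_lls(A, b, C, d)   # constraint satisfied exactly
```

    Solving one joint KKT system for all planes is what lets poorly textured planes borrow information from well-textured neighbours through the shared constraints.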

  4. The effect of deep brain stimulation on the speech motor system.

    Science.gov (United States)

    Mücke, Doris; Becker, Johannes; Barbe, Michael T; Meister, Ingo; Liebhart, Lena; Roettger, Timo B; Dembek, Till; Timmermann, Lars; Grice, Martine

    2014-08-01

    Chronic deep brain stimulation of the nucleus ventralis intermedius is an effective treatment for individuals with medication-resistant essential tremor. However, these individuals report that stimulation has a deleterious effect on their speech. The present study investigates one important factor leading to these effects: the coordination of oral and glottal articulation. Sixteen native-speaking German adults with essential tremor, between 26 and 86 years old, with and without chronic deep brain stimulation of the nucleus ventralis intermedius and 12 healthy, age-matched subjects were recorded performing a fast syllable repetition task (/papapa/, /tatata/, /kakaka/). Syllable duration and voicing-to-syllable ratio as well as parameters related directly to consonant production, voicing during constriction, and frication during constriction were measured. Voicing during constriction was greater in subjects with essential tremor than in controls, indicating a perseveration of voicing into the voiceless consonant. Stimulation led to fewer voiceless intervals (voicing-to-syllable ratio), indicating a reduced degree of glottal abduction during the entire syllable cycle. Stimulation also induced incomplete oral closures (frication during constriction), indicating imprecise oral articulation. The detrimental effect of stimulation on the speech motor system can be quantified using acoustic measures at the subsyllabic level.

  5. Effects of social cognitive impairment on speech disorder in schizophrenia.

    Science.gov (United States)

    Docherty, Nancy M; McCleery, Amanda; Divilbiss, Marielle; Schumann, Emily B; Moe, Aubrey; Shakeel, Mohammed K

    2013-05-01

    Disordered speech in schizophrenia impairs social functioning because it impedes communication with others. Treatment approaches targeting this symptom have been limited by an incomplete understanding of its causes. This study examined the process underpinnings of speech disorder, assessed in terms of communication failure. Contributions of impairments in 2 social cognitive abilities, emotion perception and theory of mind (ToM), to speech disorder were assessed in 63 patients with schizophrenia or schizoaffective disorder and 21 nonpsychiatric participants, after controlling for the effects of verbal intelligence and impairments in basic language-related neurocognitive abilities. After removal of the effects of the neurocognitive variables, impairments in emotion perception and ToM each explained additional variance in speech disorder in the patients but not the controls. The neurocognitive and social cognitive variables, taken together, explained 51% of the variance in speech disorder in the patients. Schizophrenic disordered speech may be less a concomitant of "positive" psychotic process than of illness-related limitations in neurocognitive and social cognitive functioning.

  6. Morphometric Differences of Vocal Tract Articulators in Different Loudness Conditions in Singing.

    Directory of Open Access Journals (Sweden)

    Matthias Echternach

    Full Text Available Dynamic MRI analysis of phonation has gathered interest in voice and speech physiology. However, there are limited data addressing the extent to which articulation depends on loudness. Twelve professional singers of different voice classifications were analysed with respect to vocal tract profiles recorded with dynamic real-time MRI at 25 fps under different pitch and loudness conditions. The subjects were asked to sing ascending scales on the vowel /a/ in three loudness conditions (comfortable = mf, very soft = pp, very loud = ff, respectively). Furthermore, fundamental frequency and sound pressure level were analysed from the simultaneously recorded optical audio signal after noise cancellation. The data show articulatory differences with respect to changes of both pitch and loudness: lip opening and pharynx width increased. While the vertical larynx position rose with pitch, it was lower at greater loudness. In particular, lip opening and pharynx width were more strongly correlated with sound pressure level than with pitch. For the vowel /a/, loudness thus has an effect on articulation during singing, which should be considered when articulatory vocal tract data are interpreted.

  7. The effect of varying talker identity and listening conditions on gaze behavior during audiovisual speech perception.

    Science.gov (United States)

    Buchan, Julie N; Paré, Martin; Munhall, Kevin G

    2008-11-25

    During face-to-face conversation the face provides auditory and visual linguistic information, and also conveys information about the identity of the speaker. This study investigated behavioral strategies involved in gathering visual information while watching talking faces. The effects of varying talker identity and varying the intelligibility of speech (by adding acoustic noise) on gaze behavior were measured with an eyetracker. Varying the intelligibility of the speech by adding noise had a noticeable effect on the location and duration of fixations. When noise was present subjects adopted a vantage point that was more centralized on the face by reducing the frequency of the fixations on the eyes and mouth and lengthening the duration of their gaze fixations on the nose and mouth. Varying talker identity resulted in a more modest change in gaze behavior that was modulated by the intelligibility of the speech. Although subjects generally used similar strategies to extract visual information in both talker variability conditions, when noise was absent there were more fixations on the mouth when viewing a different talker every trial as opposed to the same talker every trial. These findings provide a useful baseline for studies examining gaze behavior during audiovisual speech perception and perception of dynamic faces.

  8. The Use of Electropalatography in the Treatment of Acquired Apraxia of Speech.

    Science.gov (United States)

    Mauszycki, Shannon C; Wright, Sandra; Dingus, Nicole; Wambaugh, Julie L

    2016-12-01

    This investigation was designed to examine the effects of an articulatory-kinematic treatment in conjunction with visual biofeedback (VBFB) via electropalatography (EPG) on the accuracy of articulation for acquired apraxia of speech (AOS). A multiple-baseline design across participants and behaviors was used with 4 individuals with chronic AOS and aphasia. Accuracy of target speech sounds in treated and untreated phrases in probe sessions served as the dependent variable. Participants received an articulatory-kinematic treatment in combination with VBFB, which was sequentially applied to 3 stimulus sets composed of 2-word phrases with a target speech sound for each set. Positive changes in articulatory accuracy were observed for participants for the majority of treated speech sounds. Also, there was generalization to untreated phrases for most trained speech sounds. Two participants had better long-term maintenance of treated speech sounds in both trained and untrained stimuli. Findings indicate EPG may be a potential treatment tool for AOS. It appears that individuals with AOS can benefit from VBFB via EPG in improving articulatory accuracy. However, further research is needed to determine if VBFB is more advantageous than behavioral treatments that have been proven effective in improving speech production for speakers with AOS.

  9. Automatic Speech Recognition from Neural Signals: A Focused Review

    Directory of Open Access Journals (Sweden)

    Christian Herff

    2016-09-01

    Full Text Available Speech interfaces have become widely accepted and are nowadays integrated in various real-life applications and devices. They have become a part of our daily life. However, speech interfaces presume the ability to produce intelligible speech, which might be impossible due to loud environments, the need not to disturb bystanders, or the inability to produce speech (i.e., patients suffering from locked-in syndrome). For these reasons it would be highly desirable not to speak but to simply imagine oneself saying words or sentences. Interfaces based on imagined speech would enable fast and natural communication without the need for audible speech and would give a voice to otherwise mute people. This focused review analyzes the potential of different brain imaging techniques to recognize speech from neural signals by applying Automatic Speech Recognition technology. We argue that modalities based on metabolic processes, such as functional Near Infrared Spectroscopy and functional Magnetic Resonance Imaging, are less suited for Automatic Speech Recognition from neural signals due to their low temporal resolution, but are very useful for the investigation of the underlying neural mechanisms involved in speech processes. In contrast, electrophysiologic activity is fast enough to capture speech processes and is therefore better suited for ASR. Our experimental results indicate the potential of these signals for speech recognition from neural data, with a focus on invasively measured brain activity (electrocorticography). As a first example of Automatic Speech Recognition techniques applied to neural signals, we discuss the Brain-to-text system.

  10. Development of a serial order in speech constrained by articulatory coordination.

    Science.gov (United States)

    Oohashi, Hiroki; Watanabe, Hama; Taga, Gentaro

    2013-01-01

    Universal linguistic constraints seem to govern the organization of sound sequences in words. However, our understanding of the origin and development of these constraints is incomplete. One possibility is that the development of neuromuscular control of articulators acts as a constraint for the emergence of sequences in words. Repetitions of the same consonant observed in early infancy and an increase in variation of consonantal sequences over months of age have been interpreted as a consequence of the development of neuromuscular control. Yet, it is not clear how sequential coordination of articulators such as lips, tongue apex and tongue dorsum constrains sequences of labial, coronal and dorsal consonants in words over the course of development. We examined longitudinal development of consonant-vowel-consonant(-vowel) sequences produced by Japanese children between 7 and 60 months of age. The sequences were classified according to places of articulation for corresponding consonants. The analyses of individual and group data show that infants prefer repetitive and fronting articulations, as shown in previous studies. Furthermore, we reveal that serial order of different places of articulations within the same organ appears earlier and then gradually develops, whereas serial order of different articulatory organs appears later and then rapidly develops. In the same way, we also analyzed the sequences produced by English children and obtained similar developmental trends. These results suggest that the development of intra- and inter-articulator coordination constrains the acquisition of serial orders in speech with the complexity that characterizes adult language.

  11. Cognitive load during speech perception in noise: the influence of age, hearing loss, and cognition on the pupil response.

    Science.gov (United States)

    Zekveld, Adriana A; Kramer, Sophia E; Festen, Joost M

    2011-01-01

    The aim of the present study was to evaluate the influence of age, hearing loss, and cognitive ability on the cognitive processing load during listening to speech presented in noise. Cognitive load was assessed by means of pupillometry (i.e., examination of pupil dilation), supplemented with subjective ratings. Two groups of subjects participated: 38 middle-aged participants (mean age = 55 yrs) with normal hearing and 36 middle-aged participants (mean age = 61 yrs) with hearing loss. Using three Speech Reception Threshold (SRT) in stationary noise tests, we estimated the speech-to-noise ratios (SNRs) required for the correct repetition of 50%, 71%, or 84% of the sentences (SRT50%, SRT71%, and SRT84%, respectively). We examined the pupil response during listening: the peak amplitude, the peak latency, the mean dilation, and the pupil response duration. For each condition, participants rated the experienced listening effort and estimated their performance level. Participants also performed the Text Reception Threshold (TRT) test, a test of processing speed, and a word vocabulary test. Data were compared with previously published data from young participants with normal hearing. Hearing loss was related to relatively poor SRTs, and higher speech intelligibility was associated with lower effort and higher performance ratings. For listeners with normal hearing, increasing age was associated with poorer TRTs and slower processing speed but with larger word vocabulary. A multivariate repeated-measures analysis of variance indicated main effects of group and SNR and an interaction effect between these factors on the pupil response. The peak latency was relatively short and the mean dilation was relatively small at low intelligibility levels for the middle-aged groups, whereas the reverse was observed for high intelligibility levels. The decrease in the pupil response as a function of increasing SNR was relatively small for the listeners with hearing loss. Spearman

  12. CDS is not what you think - Hypoarticulation in Danish Child Directed Speech

    DEFF Research Database (Denmark)

    Dideriksen, Christina Rejkjær; Fusaroli, Riccardo

    et al. 2008). A previous study relying on lab-elicited stimuli indicated that Danish CDS might be peculiar, with a surprising lack of increased articulation (Bohn 2013). In the current study, we focused on longer naturalistic recordings in an environment known and safe for both child and mother...... common CDS acoustic traits: increased pitch and pitch variability and lower speech rate. However, we also find a significantly reduced vowel space when compared to adult-directed speech, which is especially surprising given the wide range of Danish vocalic sounds. We are currently extending the analysis...... and cultural affordances and the many complex routes to learn a language....

  13. Aging and Spectro-Temporal Integration of Speech

    Directory of Open Access Journals (Sweden)

    John H. Grose

    2016-10-01

    Full Text Available The purpose of this study was to determine the effects of age on the spectro-temporal integration of speech. The hypothesis was that the integration of speech fragments distributed over frequency, time, and ear of presentation is reduced in older listeners—even for those with good audiometric hearing. Younger, middle-aged, and older listeners (10 per group with good audiometric hearing participated. They were each tested under seven conditions that encompassed combinations of spectral, temporal, and binaural integration. Sentences were filtered into two bands centered at 500 Hz and 2500 Hz, with criterion bandwidth tailored for each participant. In some conditions, the speech bands were individually square wave interrupted at a rate of 10 Hz. Configurations of uninterrupted, synchronously interrupted, and asynchronously interrupted frequency bands were constructed that constituted speech fragments distributed across frequency, time, and ear of presentation. The over-arching finding was that, for most configurations, performance was not differentially affected by listener age. Although speech intelligibility varied across condition, there was no evidence of performance deficits in older listeners in any condition. This study indicates that age, per se, does not necessarily undermine the ability to integrate fragments of speech dispersed across frequency and time.
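The interruption paradigm in this record is simple to sketch. The following is a hedged illustration of a 10 Hz square-wave gate applied to a filtered speech band, with in-phase (synchronous) and opposite-phase (asynchronous) gating across the two bands; the sample rate and the 50% duty cycle are assumptions made for the example, not values taken from the study.

```python
# Sketch of 10 Hz square-wave interruption of speech bands, as in the
# abstract. SAMPLE_RATE and the 50% duty cycle are assumed values.

SAMPLE_RATE = 16000   # Hz (assumed for the sketch)
RATE = 10             # interruption rate in Hz, as in the abstract

def square_gate(n_samples, rate=RATE, fs=SAMPLE_RATE, phase=0.0):
    """1/0 gate, on for the first half of each cycle (50% duty cycle)."""
    period = fs / rate
    return [1.0 if ((i / period + phase) % 1.0) < 0.5 else 0.0
            for i in range(n_samples)]

def interrupt(band, phase=0.0):
    """Apply the square-wave gate to one filtered speech band."""
    return [s * g for s, g in zip(band, square_gate(len(band), phase=phase))]

# Stand-ins for the 500 Hz and 2500 Hz filtered speech bands.
low_band = [1.0] * SAMPLE_RATE
high_band = [1.0] * SAMPLE_RATE

# Synchronous: both bands share one gate; asynchronous: the high band
# is gated in opposite phase, so fragments alternate across frequency.
sync_low = interrupt(low_band)
sync_high = interrupt(high_band)
async_high = interrupt(high_band, phase=0.5)
```

In the asynchronous configuration every instant carries exactly one of the two bands, which is what forces the listener to integrate fragments across both frequency and time.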

  14. Validating a Method to Assess Lipreading, Audiovisual Gain, and Integration During Speech Reception With Cochlear-Implanted and Normal-Hearing Subjects Using a Talking Head.

    Science.gov (United States)

    Schreitmüller, Stefan; Frenken, Miriam; Bentz, Lüder; Ortmann, Magdalene; Walger, Martin; Meister, Hartmut

    Watching a talker's mouth is beneficial for speech reception (SR) in many communication settings, especially in noise and when hearing is impaired. Measures for audiovisual (AV) SR can be valuable in the framework of diagnosing or treating hearing disorders. This study addresses the lack of standardized methods in many languages for assessing lipreading, AV gain, and integration. A new method is validated that supplements a German speech audiometric test with visualizations of the synthetic articulation of an avatar, which makes it feasible to lip-sync auditory speech in a highly standardized way. Three hypotheses were formed according to the literature on AV SR that used live or filmed talkers. It was tested whether the respective effects could be reproduced with synthetic articulation: (1) cochlear implant (CI) users have higher visual-only SR than normal-hearing (NH) individuals, and younger individuals obtain higher lipreading scores than older persons. (2) Both CI and NH listeners gain from presenting AV over unimodal (auditory or visual) sentences in noise. (3) Both CI and NH listeners efficiently integrate complementary auditory and visual speech features. In a controlled, cross-sectional study with 14 experienced CI users (mean age 47.4) and 14 NH individuals (mean age 46.3, similar broad age distribution), lipreading, AV gain, and integration of a German matrix sentence test were assessed. Visual speech stimuli were synthesized by the articulation of the Talking Head system "MASSY" (Modular Audiovisual Speech Synthesizer), which displayed standardized articulation with respect to the visibility of German phones. In line with the hypotheses and previous literature, CI users had a higher mean visual-only SR than NH individuals (CI, 38%; NH, 12%; p < 0.001). Age was correlated with lipreading such that within each group, younger individuals obtained higher visual-only scores than older persons (rCI = -0.54; p = 0.046; rNH = -0.78; p < 0.001). Both CI and NH

  15. Individual differences in speech-in-noise perception parallel neural speech processing and attention in preschoolers

    Science.gov (United States)

    Thompson, Elaine C.; Carr, Kali Woodruff; White-Schwoch, Travis; Otto-Meyer, Sebastian; Kraus, Nina

    2016-01-01

    From bustling classrooms to unruly lunchrooms, school settings are noisy. To learn effectively in the unwelcome company of numerous distractions, children must clearly perceive speech in noise. In older children and adults, speech-in-noise perception is supported by sensory and cognitive processes, but the correlates underlying this critical listening skill in young children (3–5 year olds) remain undetermined. Employing a longitudinal design (two evaluations separated by ~12 months), we followed a cohort of 59 preschoolers, ages 3.0–4.9, assessing word-in-noise perception, cognitive abilities (intelligence, short-term memory, attention), and neural responses to speech. Results reveal changes in word-in-noise perception parallel changes in processing of the fundamental frequency (F0), an acoustic cue known for playing a role central to speaker identification and auditory scene analysis. Four unique developmental trajectories (speech-in-noise perception groups) confirm this relationship, in that improvements and declines in word-in-noise perception couple with enhancements and diminishments of F0 encoding, respectively. Improvements in word-in-noise perception also pair with gains in attention. Word-in-noise perception does not relate to strength of neural harmonic representation or short-term memory. These findings reinforce previously-reported roles of F0 and attention in hearing speech in noise in older children and adults, and extend this relationship to preschool children. PMID:27864051
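Since F0 encoding is central to the result above, a minimal sketch of how F0 can be estimated from a waveform may be useful: pick the lag of the autocorrelation peak within a plausible pitch range. The sample rate and search range below are illustrative assumptions, and this acoustic sketch is not the neural-response measure used in the study.

```python
import math

# Autocorrelation-based F0 estimate on a synthetic harmonic complex.
# FS and the 80-400 Hz search range are assumed, illustrative values.

FS = 8000  # sample rate in Hz (assumed)

def estimate_f0(x, fs=FS, fmin=80, fmax=400):
    """Return fs / lag for the lag with the largest autocorrelation
    within the [fmin, fmax] pitch search range."""
    best_lag, best_r = None, float('-inf')
    for lag in range(fs // fmax, fs // fmin + 1):
        r = sum(x[i] * x[i + lag] for i in range(len(x) - lag))
        if r > best_r:
            best_lag, best_r = lag, r
    return fs / best_lag

# A 200 Hz harmonic complex: the autocorrelation peaks at one period
# (40 samples at 8 kHz), recovering the fundamental.
sig = [sum(math.sin(2 * math.pi * h * 200 * t / FS) for h in (1, 2, 3))
       for t in range(800)]
f0 = estimate_f0(sig)  # close to 200.0 Hz
```

The harmonic complex has most of its energy above the fundamental, yet the periodicity (and hence F0) is still recoverable from the autocorrelation lag, which is one reason F0 is such a robust cue for speaker identification.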

  16. Maximum Likelihood PSD Estimation for Speech Enhancement in Reverberation and Noise

    DEFF Research Database (Denmark)

    Kuklasinski, Adam; Doclo, Simon; Jensen, Søren Holdt

    2016-01-01

    In this contribution we focus on the problem of power spectral density (PSD) estimation from multiple microphone signals in reverberant and noisy environments. The PSD estimation method proposed in this paper is based on the maximum likelihood (ML) methodology. In particular, we derive a novel ML...... instrumental measures and is shown to be higher than when the competing estimator is used. Moreover, we perform a speech intelligibility test where we demonstrate that both the proposed and the competing PSD estimators lead to similar intelligibility improvements......., it is shown numerically that the mean squared estimation error achieved by the proposed method is near the limit set by the corresponding Cram´er-Rao lower bound. The speech dereverberation performance of a multi-channel Wiener filter (MWF) based on the proposed PSD estimators is measured using several...

  17. High-frequency energy in singing and speech

    Science.gov (United States)

    Monson, Brian Bruce

    While human speech and the human voice generate acoustical energy up to (and beyond) 20 kHz, the energy above approximately 5 kHz has been largely neglected. Evidence is accruing that this high-frequency energy contains perceptual information relevant to speech and voice, including percepts of quality, localization, and intelligibility. The present research was an initial step in the long-range goal of characterizing high-frequency energy in singing voice and speech, with particular regard for its perceptual role and its potential for modification during voice and speech production. In this study, a database of high-fidelity recordings of talkers was created and used for a broad acoustical analysis and general characterization of high-frequency energy, as well as specific characterization of phoneme category, voice and speech intensity level, and mode of production (speech versus singing) by high-frequency energy content. Directionality of radiation of high-frequency energy from the mouth was also examined. The recordings were used for perceptual experiments wherein listeners were asked to discriminate between speech and voice samples that differed only in high-frequency energy content. Listeners were also subjected to gender discrimination tasks, mode-of-production discrimination tasks, and transcription tasks with samples of speech and singing that contained only high-frequency content. The combination of these experiments has revealed that (1) human listeners are able to detect very subtle level changes in high-frequency energy, and (2) human listeners are able to extract significant perceptual information from high-frequency energy.
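Characterizing high-frequency energy starts from a simple quantity: the fraction of spectral energy above a cutoff. Below is a dependency-free sketch of that measure (naive DFT, assumed sample rate and window length) on a synthetic two-tone signal, not an analysis of the recordings described in the abstract.

```python
import math

# Proportion of signal energy above a 5 kHz cutoff, the region the
# abstract says is usually neglected. A naive DFT keeps the sketch
# dependency-free; a real analysis would use an FFT over recorded
# speech. FS and N are assumed values.

FS = 32000   # sample rate in Hz (assumed)
N = 320      # 10 ms window, so DFT bins are 100 Hz apart

def energy_above(signal, cutoff_hz, fs=FS):
    """Fraction of positive-frequency spectral energy above cutoff_hz."""
    n = len(signal)
    total = high = 0.0
    for k in range(1, n // 2):
        re = sum(signal[t] * math.cos(2 * math.pi * k * t / n) for t in range(n))
        im = sum(signal[t] * math.sin(2 * math.pi * k * t / n) for t in range(n))
        p = re * re + im * im
        total += p
        if k * fs / n > cutoff_hz:
            high += p
    return high / total

# Equal-amplitude tones at 1 kHz and 8 kHz: half the energy lies
# above a 5 kHz cutoff.
sig = [math.sin(2 * math.pi * 1000 * t / FS) + math.sin(2 * math.pi * 8000 * t / FS)
       for t in range(N)]
ratio = energy_above(sig, 5000.0)  # close to 0.5
```

For real speech this ratio is far smaller than 0.5, which is precisely why the band above 5 kHz was long treated as negligible despite the perceptual information it carries.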

  18. Joint Service Aircrew Mask (JSAM) - Tactical Aircraft (TA) A/P22P-14A Respirator Assembly (V)5: Speech Intelligibility Performance with Double Hearing Protection, HGU-84/P Flight Helmet

    Science.gov (United States)

    2017-04-06

    The JSAM-TA Respirator Assembly (V)5 (Figure 2) is a chemical, biological, and radiological respirator assembly manufactured by Cam Lock... [Only fragments of this record survive: portions of a standard distribution statement and of a subject sizing matrix for speech intelligibility testing with the HGU-84/P flight helmet.]

  19. A new time-adaptive discrete bionic wavelet transform for enhancing speech from adverse noise environment

    Science.gov (United States)

    Palaniswamy, Sumithra; Duraisamy, Prakash; Alam, Mohammad Showkat; Yuan, Xiaohui

    2012-04-01

    Automatic speech processing systems are widely used in everyday life, such as in mobile communication, speech and speaker recognition, and assisting the hearing impaired. In speech communication systems, the quality and intelligibility of speech is of utmost importance for ease and accuracy of information exchange. To obtain an intelligible speech signal that is also pleasant to listen to, noise reduction is essential. In this paper a new Time Adaptive Discrete Bionic Wavelet Thresholding (TADBWT) scheme is proposed. The proposed technique uses a Daubechies mother wavelet to achieve better enhancement of speech from additive non-stationary noises which occur in real life, such as street noise and factory noise. Due to the integration of a human auditory system model into the wavelet transform, the bionic wavelet transform (BWT) has great potential for speech enhancement and may lead to a new path in speech processing. In the proposed technique, discrete BWT is first applied to noisy speech to derive TADBWT coefficients. Then the adaptive nature of the BWT is captured by introducing a time-varying linear factor which updates the coefficients at each scale over time. This approach has shown better performance than existing algorithms at lower input SNR due to modified soft level-dependent thresholding on time-adaptive coefficients. The objective and subjective test results confirmed the competency of the TADBWT technique. The effectiveness of the proposed technique is also evaluated for a speaker recognition task under noisy environments. The recognition results show that the TADBWT technique yields better performance when compared to alternate methods, specifically at lower input SNR.
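The core operation in wavelet-domain enhancement, soft thresholding of detail coefficients, can be sketched compactly. The example below uses a single-level Haar transform and a fixed threshold for brevity; the paper's method instead uses a Daubechies-based bionic wavelet with a time-adaptive, level-dependent threshold.

```python
# Sketch of wavelet-domain denoising by soft thresholding. A single-
# level Haar transform and a fixed threshold stand in for the paper's
# bionic wavelet and time-adaptive thresholding.

def haar_forward(x):
    """One level of the (orthogonal up to scaling) Haar transform."""
    approx = [(a + b) / 2 for a, b in zip(x[0::2], x[1::2])]
    detail = [(a - b) / 2 for a, b in zip(x[0::2], x[1::2])]
    return approx, detail

def haar_inverse(approx, detail):
    out = []
    for a, d in zip(approx, detail):
        out += [a + d, a - d]
    return out

def soft(c, thr):
    """Soft thresholding: shrink toward zero, zero out small coefficients."""
    if c > thr:
        return c - thr
    if c < -thr:
        return c + thr
    return 0.0

def denoise(x, thr):
    approx, detail = haar_forward(x)
    detail = [soft(d, thr) for d in detail]   # noise lives mostly in detail
    return haar_inverse(approx, detail)

# Toy use: a slowly varying "signal" with small sample-to-sample
# "noise"; thresholding removes the detail-band fluctuation.
noisy = [10.0, 10.2, 12.0, 12.2, 14.0, 14.2]
clean = denoise(noisy, thr=0.2)
```

With `thr=0.0` the transform reconstructs the input exactly; raising the threshold trades residual noise against distortion of genuine detail, which is why level- and time-dependent thresholds matter in practice.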

  20. Feedforward and Feedback Control in Apraxia of Speech: Effects of Noise Masking on Vowel Production

    Science.gov (United States)

    Maas, Edwin; Mailend, Marja-Liisa; Guenther, Frank H.

    2015-01-01

    Purpose: This study was designed to test two hypotheses about apraxia of speech (AOS) derived from the Directions Into Velocities of Articulators (DIVA) model (Guenther et al., 2006): the feedforward system deficit hypothesis and the feedback system deficit hypothesis. Method: The authors used noise masking to minimize auditory feedback during…

  1. Typical versus delayed speech onset influences verbal reporting of autistic interests.

    Science.gov (United States)

    Chiodo, Liliane; Majerus, Steve; Mottron, Laurent

    2017-01-01

    The distinction between autism and Asperger syndrome has been abandoned in the DSM-5. However, this clinical categorization largely overlaps with the presence or absence of a speech onset delay which is associated with clinical, cognitive, and neural differences. It is unknown whether these different speech development pathways and associated cognitive differences are involved in the heterogeneity of the restricted interests that characterize autistic adults. This study tested the hypothesis that speech onset delay, or conversely, early mastery of speech, orients the nature and verbal reporting of adult autistic interests. The occurrence of a priori defined descriptors for perceptual and thematic dimensions were determined, as well as the perceived function and benefits, in the response of autistic people to a semi-structured interview on their intense interests. The number of words, grammatical categories, and proportion of perceptual/thematic descriptors were computed and compared between groups by variance analyses. The participants comprised 40 autistic adults grouped according to the presence (N = 20) or absence (N = 20) of speech onset delay, as well as 20 non-autistic adults, also with intense interests, matched for non-verbal intelligence using Raven's Progressive Matrices. The overall nature, function, and benefit of intense interests were similar across autistic subgroups, and between autistic and non-autistic groups. However, autistic participants with a history of speech onset delay used more perceptual than thematic descriptors when talking about their interests, whereas the opposite was true for autistic individuals without speech onset delay. This finding remained significant after controlling for linguistic differences observed between the two groups. Verbal reporting, but not the nature or positive function, of intense interests differed between adult autistic individuals depending on their speech acquisition history: oral reporting of

  2. Eighth International Conference on Intelligent Systems and Knowledge Engineering

    CERN Document Server

    Li, Tianrui; ISKE 2013; Foundations of Intelligent Systems; Knowledge Engineering and Management; Practical Applications of Intelligent Systems

    2014-01-01

    "Foundations of Intelligent Systems" presents selected papers from the 2013 International Conference on Intelligent Systems and Knowledge Engineering (ISKE2013). The aim of this conference is to bring together experts from different expertise areas to discuss the state-of-the-art in Intelligent Systems and Knowledge Engineering, and to present new research results and perspectives on future development. The topics in this volume include, but are not limited to: Artificial Intelligence Theories, Pattern Recognition, Intelligent System Models, Speech Recognition, Computer Vision, Multi-Agent Systems, Machine Learning, Soft Computing and Fuzzy Systems, Biological Inspired Computation, Game Theory, Cognitive Systems and Information Processing, Computational Intelligence, etc. The proceedings benefit both researchers and practitioners who want to utilize intelligent methods in their specific research fields. Dr. Zhenkun Wen is a Professor at the College of Computer and Software Engineering, Shenzhen University...

  3. Spoken Language Understanding Systems for Extracting Semantic Information from Speech

    CERN Document Server

    Tur, Gokhan

    2011-01-01

    Spoken language understanding (SLU) is an emerging field between speech and language processing, investigating human/machine and human/human communication by leveraging technologies from signal processing, pattern recognition, machine learning and artificial intelligence. SLU systems are designed to extract the meaning from speech utterances and their applications are vast, from voice search in mobile devices to meeting summarization, attracting interest from both commercial and academic sectors. Both human/machine and human/human communications can benefit from the application of SLU, usin

  4. Neural Spike-Train Analyses of the Speech-Based Envelope Power Spectrum Model

    Science.gov (United States)

    Rallapalli, Varsha H.

    2016-01-01

    Diagnosing and treating hearing impairment is challenging because people with similar degrees of sensorineural hearing loss (SNHL) often have different speech-recognition abilities. The speech-based envelope power spectrum model (sEPSM) has demonstrated that the signal-to-noise ratio (SNRENV) from a modulation filter bank provides a robust speech-intelligibility measure across a wider range of degraded conditions than many long-standing models. In the sEPSM, noise (N) is assumed to: (a) reduce S + N envelope power by filling in dips within clean speech (S) and (b) introduce an envelope noise floor from intrinsic fluctuations in the noise itself. While the promise of SNRENV has been demonstrated for normal-hearing listeners, it has not been thoroughly extended to hearing-impaired listeners because of limited physiological knowledge of how SNHL affects speech-in-noise envelope coding relative to noise alone. Here, envelope coding to speech-in-noise stimuli was quantified from auditory-nerve model spike trains using shuffled correlograms, which were analyzed in the modulation-frequency domain to compute modulation-band estimates of neural SNRENV. Preliminary spike-train analyses show strong similarities to the sEPSM, demonstrating feasibility of neural SNRENV computations. Results suggest that individual differences can occur based on differential degrees of outer- and inner-hair-cell dysfunction in listeners currently diagnosed into the single audiological SNHL category. The predicted acoustic-SNR dependence in individual differences suggests that the SNR-dependent rate of susceptibility could be an important metric in diagnosing individual differences. Future measurements of the neural SNRENV in animal studies with various forms of SNHL will provide valuable insight for understanding individual differences in speech-in-noise intelligibility.
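The SNRENV idea described above can be sketched as the envelope power of the speech-plus-noise mixture in excess of the noise-alone envelope power, normalized by the latter. The envelope extraction below (rectification plus a moving average) and the toy signals are simplifying assumptions; the sEPSM proper computes this per band of a modulation filter bank.

```python
import math

# Toy sketch of the SNRenv quantity behind the sEPSM: excess envelope
# power of the mixture over the noise alone, in dB. Rectification plus
# moving average replaces the model's modulation filter bank.

def envelope(x, win=32):
    """Crude envelope: rectification followed by a moving average."""
    rect = [abs(v) for v in x]
    return [sum(rect[max(0, i - win):i + 1]) / (i + 1 - max(0, i - win))
            for i in range(len(rect))]

def env_power(x):
    """AC power of the envelope (variance around its mean)."""
    e = envelope(x)
    mu = sum(e) / len(e)
    return sum((v - mu) ** 2 for v in e) / len(e)

def snr_env_db(mix, noise):
    """Excess envelope power of the mixture relative to noise alone."""
    pn = env_power(noise)
    ps = env_power(mix) - pn
    return 10 * math.log10(ps / pn) if ps > 0 and pn > 0 else float('-inf')

# Toy signals: a steady carrier as "noise" and a slowly amplitude-
# modulated carrier as the "speech + noise" mixture. The modulation
# adds envelope power, so SNRenv comes out positive.
noise = [math.sin(0.5 * i) for i in range(2000)]
mix = [(1 + 0.8 * math.sin(0.01 * i)) * math.sin(0.5 * i) for i in range(2000)]
db = snr_env_db(mix, noise)
```

This also shows why noise "fills in" speech dips, as the abstract puts it: anything that flattens the mixture's envelope shrinks the excess envelope power and drives the predicted intelligibility down.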

  5. Objective voice and speech analysis of persons with chronic hoarseness by prosodic analysis of speech samples.

    Science.gov (United States)

    Haderlein, Tino; Döllinger, Michael; Matoušek, Václav; Nöth, Elmar

    2016-10-01

    Automatic voice assessment is often performed using sustained vowels. In contrast, speech analysis of read-out texts can be applied to voice and speech assessment. Automatic speech recognition and prosodic analysis were used to find regression formulae between automatic and perceptual assessment of four voice and four speech criteria. The regression was trained with 21 men and 62 women (average age 49.2 years) and tested with another set of 24 men and 49 women (48.3 years), all suffering from chronic hoarseness. They read the text 'Der Nordwind und die Sonne' ('The North Wind and the Sun'). Five voice and speech therapists evaluated the data on 5-point Likert scales. Ten prosodic and recognition accuracy measures (features) were identified which describe all the examined criteria. Inter-rater correlation within the expert group was between r = 0.63 for the criterion 'match of breath and sense units' and r = 0.87 for the overall voice quality. Human-machine correlation was between r = 0.40 for the match of breath and sense units and r = 0.82 for intelligibility. The perceptual ratings of different criteria were highly correlated with each other. Likewise, the feature sets modeling the criteria were very similar. The automatic method is suitable for assessing chronic hoarseness in general and for subgroups of functional and organic dysphonia. In its current version, it is almost as reliable as a randomly picked rater from a group of voice and speech therapists.

  6. 1st International Conference on Intelligent Computing, Communication and Devices

    CERN Document Server

    Patnaik, Srikanta; Ichalkaranje, Nikhil

    2015-01-01

    In the history of mankind, three revolutions that have shaped human life are the tool-making revolution, the agricultural revolution, and the industrial revolution. They transformed not only the economy and civilization but the overall development of society. The intelligence revolution is probably next, and society may perceive it within the next 10 years. ICCD-2014 covers all dimensions of the intelligent sciences: Intelligent Computing, Intelligent Communication, and Intelligent Devices. This volume covers contributions on Intelligent Communication from areas such as communications and wireless ad hoc and sensor networks; speech and natural language processing, including signal, image, and video processing; and mobile broadband and optical networks, which are key to ground-breaking inventions in intelligent communication technologies. Secondly, an intelligent device is any type of equipment, instrument, or machine that has its own computing capability. Contributions from ...

  7. Measurement of speech levels in the presence of time varying background noise

    Science.gov (United States)

    Pearsons, K. S.; Horonjeff, R.

    1982-01-01

    Short-term speech level measurements which could be used to note changes in vocal effort in a time varying noise environment were studied. Knowing the changes in speech level would in turn allow prediction of intelligibility in the presence of aircraft flyover noise. Tests indicated that it is possible to use two second samples of speech to estimate long term root mean square speech levels. Other tests were also performed in which people read out loud during aircraft flyover noise. Results of these tests indicate that people do indeed raise their voice during flyovers at a rate of about 3-1/2 dB for each 10 dB increase in background level. This finding is in agreement with other tests of speech levels in the presence of steady state background noise.
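
    The roughly 3.5 dB rise in vocal effort per 10 dB of background level reported here amounts to a simple linear rule of thumb, which can be written out directly. The baseline speech and noise levels in the example are hypothetical, not values from the study.

```python
def predicted_speech_level(base_speech_db, base_noise_db, noise_db, slope=0.35):
    """Predict raised vocal effort under a louder background, using the
    ~3.5 dB rise per 10 dB of noise reported above (a rule of thumb,
    not a calibrated model)."""
    rise = max(0.0, noise_db - base_noise_db) * slope
    return base_speech_db + rise
```

    For instance, a talker producing 65 dB speech in 55 dB background noise would be predicted to raise their voice to about 72 dB during a 75 dB flyover.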

  8. Blind speech separation system for humanoid robot with FastICA for audio filtering and separation

    Science.gov (United States)

    Budiharto, Widodo; Santoso Gunawan, Alexander Agung

    2016-07-01

    Nowadays, there are many developments in building intelligent humanoid robots, mainly in order to handle voice and image. In this research, we propose a blind speech separation system using FastICA for audio filtering and separation that can be used in education or entertainment. Our main problem is to separate multiple speech sources and also to filter irrelevant noises. After the speech separation step, the results are integrated with our previous speech and face recognition system, which is based on a Bioloid GP robot with a Raspberry Pi 2 as controller. The experimental results show that the accuracy of our blind speech separation system is about 88% in command and query recognition cases.
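
    The FastICA step at the heart of such a system can be illustrated on synthetic two-channel mixtures. The sketch below is a minimal deflationary FastICA with a tanh contrast function, written out for clarity; a real system would use a library implementation, and the mixing matrix in the usage example is invented for the demonstration.

```python
import numpy as np

def fast_ica(X, n_iter=200, seed=0):
    """Minimal FastICA (tanh contrast, deflation). X holds the mixed
    observations, one channel per row; returns estimated sources."""
    rng = np.random.default_rng(seed)
    X = X - X.mean(axis=1, keepdims=True)
    # Whiten: rotate/scale so channels are uncorrelated with unit variance
    d, E = np.linalg.eigh(np.cov(X))
    Z = (E @ np.diag(d ** -0.5) @ E.T) @ X
    n = Z.shape[0]
    W = np.zeros((n, n))
    for i in range(n):
        w = rng.normal(size=n)
        w /= np.linalg.norm(w)
        for _ in range(n_iter):
            s = w @ Z
            g = np.tanh(s)
            # Fixed-point update: E[Z g(w'Z)] - E[g'(w'Z)] w
            w_new = (Z * g).mean(axis=1) - (1.0 - g ** 2).mean() * w
            w_new -= W[:i].T @ (W[:i] @ w_new)   # decorrelate from found rows
            w_new /= np.linalg.norm(w_new)
            converged = abs(abs(w_new @ w) - 1.0) < 1e-10
            w = w_new
            if converged:
                break
        W[i] = w
    return W @ Z
```

    On a two-channel mixture of a sine and a square wave, this recovers the sources up to permutation, sign, and scale, which is the inherent ambiguity of blind separation.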

  9. 21 CFR 872.6140 - Articulation paper.

    Science.gov (United States)

    2010-04-01

    ... 21 Food and Drugs 8 2010-04-01 2010-04-01 false Articulation paper. 872.6140 Section 872.6140 Food... DEVICES DENTAL DEVICES Miscellaneous Devices § 872.6140 Articulation paper. (a) Identification. Articulation paper is a device composed of paper coated with an ink dye intended to be placed between the...

  10. THE SCANDCLEFT RANDOMISED CONTROLLED TRIALS: SPEECH OUTCOMES IN 5-YEAR-OLDS WITH UCLP – consonant proficiency and errors

    DEFF Research Database (Denmark)

    Willadsen, Elisabeth; Persson, Christina; Lohmander, Anette

    2017-01-01

    Background and aim: Normal articulation before school start is a main objective in cleft palate treatment. The aim was to investigate if differences exist in consonant proficiency at age 5 years between children with unilateral cleft lip and palate (UCLP) randomised to different surgical protocol...... in terms of secondary pharyngeal surgeries, number of fistulae, and speech therapy visits differed. Trial registration: ISRCTN29932826. Keywords: Primary palatal repair, unilateral cleft lip and palate, consonant proficiency, cleft speech characteristics, randomised clinical trial...

  11. Speech chronemics--a hidden dimension of speech. Theoretical background, measurement and clinical validity.

    Science.gov (United States)

    Krüger, H P

    1989-02-01

    The term "speech chronemics" is introduced to characterize a research strategy which extracts from the physical qualities of the speech signal only the pattern of ons ("speaking") and offs ("pausing"). Research in this field can be structured along the methodological dimensions "unit of time", "number of speakers", and "quality of the prosodic measures". It is shown that a researcher's choice of method largely determines the outcome of the study. Then, with the Logoport, a new portable measurement device is presented. It enables the researcher to study speaking behavior over long periods of time (up to 24 hours) in the normal environment of the subjects. Two experiments are reported. The first shows the validity of articulation pauses as an index of variations in the physiological state of the organism. The second study shows that a new beta-blocking agent has sociotropic effects: in a long-term trial, socially high-strung subjects showed improved interaction behavior (compared with placebo and with socially easy-going persons) in their everyday life. Finally, the need for a comprehensive theoretical foundation and for standardization of measurement situations and methods is emphasized.
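
    The on/off reduction that speech chronemics starts from can be sketched as a short-time energy detector: one boolean per frame marking "speaking" versus "pausing". The frame length and relative threshold below are illustrative assumptions, not the Logoport's actual parameters.

```python
import numpy as np

def on_off_pattern(x, fs, frame_ms=20, rel_thresh=0.1):
    """Reduce a waveform to the on/off ('speaking'/'pausing') pattern:
    True where a frame's short-time energy exceeds a relative threshold.
    Energy-based sketch only; real detectors add hangover and smoothing."""
    frame = int(fs * frame_ms / 1000)
    n = len(x) // frame
    energy = np.array([np.mean(x[i * frame:(i + 1) * frame] ** 2)
                       for i in range(n)])
    return energy > rel_thresh * energy.max()
```

    Pause statistics (counts and durations of off-runs) then follow directly from this boolean sequence.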

  12. Sensorimotor Representation of Speech Perception. Cross-Decoding of Place of Articulation Features during Selective Attention to Syllables in 7T fMRI

    NARCIS (Netherlands)

    Archila-Meléndez, Mario E.; Valente, Giancarlo; Correia, Joao M.; Rouhl, Rob P. W.; van Kranen-Mastenbroek, Vivianne H.; Jansma, Bernadette M.

    2018-01-01

    Sensorimotor integration, the translation between acoustic signals and motoric programs, may constitute a crucial mechanism for speech. During speech perception, the acoustic-motoric translations include the recruitment of cortical areas for the representation of speech articulatory features, such

  13. The history of articulators: the "Articulator Wars" phenomenon with some circumstances leading up to it.

    Science.gov (United States)

    Starcke, Edgar N; Engelmeier, Robert L; Belles, Donald M

    2010-06-01

    At the dawn of the 20th century, all was not well with the practice of "plate prostheses." Removable prosthodontics had been degrading for several decades and was now generally in low esteem, even though there had been many significant advances. W. E. Walker had introduced adjustable condylar guides, George Snow, the facebow, and Carl Christensen, a method for clinically measuring the condylar inclines. Nevertheless, the average practicing dentist was still using simple hinge articulators and was apathetic to the deplorable state of the artificial teeth available; however, this was all going to change dramatically when two dentists, Alfred Gysi and J. Leon Williams, working together between 1910 and 1914, presented to the profession the "Trubyte Artificial Tooth System" that embodied both a typal system for selecting anterior teeth and new posterior occlusal carvings that made possible, for the first time, the articulation of artificial teeth. This incited many of prosthetic dentistry's elite to introduce their own theories of mandibular movement and the articulators that they designed to reflect those theories. The intense debates that ensued, both in the meeting halls and in the literature, were numerous and lasted for decades. At the time, the "Articulator Wars" had both positive and negative consequences. Today, with many of the "Articulator Wars" issues remaining as part of the practice of dentistry, the "Articulator Wars" can be considered a phenomenon of enlightenment.

  14. Development of a serial order in speech constrained by articulatory coordination.

    Directory of Open Access Journals (Sweden)

    Hiroki Oohashi

    Universal linguistic constraints seem to govern the organization of sound sequences in words. However, our understanding of the origin and development of these constraints is incomplete. One possibility is that the development of neuromuscular control of the articulators acts as a constraint on the emergence of sequences in words. Repetitions of the same consonant observed in early infancy, and an increase in the variation of consonantal sequences over months of age, have been interpreted as consequences of the development of neuromuscular control. Yet it is not clear how sequential coordination of articulators such as the lips, tongue apex, and tongue dorsum constrains sequences of labial, coronal, and dorsal consonants in words over the course of development. We examined the longitudinal development of consonant-vowel-consonant(-vowel) sequences produced by Japanese children between 7 and 60 months of age. The sequences were classified according to the places of articulation of the corresponding consonants. Analyses of individual and group data show that infants prefer repetitive and fronting articulations, as shown in previous studies. Furthermore, we reveal that serial order across different places of articulation within the same organ appears earlier and then gradually develops, whereas serial order across different articulatory organs appears later and then rapidly develops. We also analyzed sequences produced by English children in the same way and obtained similar developmental trends. These results suggest that the development of intra- and inter-articulator coordination constrains the acquisition of serial order in speech with the complexity that characterizes adult language.

  15. The Contribution of Auditory and Cognitive Factors to Intelligibility of Words and Sentences in Noise.

    Science.gov (United States)

    Heinrich, Antje; Knight, Sarah

    2016-01-01

    Understanding the causes for speech-in-noise (SiN) perception difficulties is complex, and is made even more difficult by the fact that listening situations can vary widely in target and background sounds. While there is general agreement that both auditory and cognitive factors are important, their exact relationship to SiN perception across various listening situations remains unclear. This study manipulated the characteristics of the listening situation in two ways: first, target stimuli were either isolated words, or words heard in the context of low- (LP) and high-predictability (HP) sentences; second, the background sound, speech-modulated noise, was presented at two signal-to-noise ratios. Speech intelligibility was measured for 30 older listeners (aged 62-84) with age-normal hearing and related to individual differences in cognition (working memory, inhibition and linguistic skills) and hearing (PTA(0.25-8 kHz) and temporal processing). The results showed that while the effect of hearing thresholds on intelligibility was rather uniform, the influence of cognitive abilities was more specific to a certain listening situation. By revealing a complex picture of relationships between intelligibility and cognition, these results may help us understand some of the inconsistencies in the literature as regards cognitive contributions to speech perception.

  16. Electropalatography in the Description and Treatment of Speech Disorders in Five Children with Cerebral Palsy

    Science.gov (United States)

    Nordberg, Ann; Carlsson, Goran; Lohmander, Anette

    2011-01-01

    Some children with cerebral palsy have articulation disorders that are resistant to conventional speech therapy. The aim of this study was to investigate whether the visual feedback method of electropalatography (EPG) could be an effective tool for treating five children (mean age of 9.4 years) with dysarthria and cerebral palsy and to explore…

  17. Impact of speech-generating devices on the language development of a child with childhood apraxia of speech: a case study.

    Science.gov (United States)

    Lüke, Carina

    2016-01-01

    The purpose of the study was to evaluate the effectiveness of speech-generating devices (SGDs) on the communication and language development of a 2-year-old boy with severe childhood apraxia of speech (CAS). An A-B design was used over a treatment period of 1 year, followed by three additional follow-up measurements, in order to evaluate the implementation of SGDs in the speech therapy of a 2;7-year-old boy with severe CAS. In total, 53 therapy sessions were videotaped and analyzed to better understand his communicative (operationalized as means of communication) and linguistic (operationalized as intelligibility and consistency of speech productions, lexical and grammatical development) development. The trend lines of baseline phase A and intervention phase B were compared, and the percentage of non-overlapping data points was calculated to verify the value of the intervention. The use of SGDs led to an immediate increase in the communicative development of the child. An increase in all linguistic variables was observed, with a latency effect of eight to nine treatment sessions. The implementation of SGDs in speech therapy has the potential to be highly effective with regard to both communicative and linguistic competencies in young children with severe CAS. Implications for Rehabilitation: Childhood apraxia of speech (CAS) is a neurological speech sound disorder which results in significant deficits in speech production and leads to a higher risk for language, reading, and spelling difficulties. Speech-generating devices (SGDs), as one method of augmentative and alternative communication (AAC), can effectively enhance the communicative and linguistic development of children with severe CAS.

  18. The Relationship Between Apraxia of Speech and Oral Apraxia: Association or Dissociation?

    Science.gov (United States)

    Whiteside, Sandra P; Dyson, Lucy; Cowell, Patricia E; Varley, Rosemary A

    2015-11-01

    Acquired apraxia of speech (AOS) is a motor speech disorder that affects the implementation of articulatory gestures and the fluency and intelligibility of speech. Oral apraxia (OA) is an impairment of nonspeech volitional movement. Although many speakers with AOS also display difficulties with volitional nonspeech oral movements, the relationship between the 2 conditions is unclear. This study explored the relationship between speech and volitional nonspeech oral movement impairment in a sample of 50 participants with AOS. We examined levels of association and dissociation between speech and OA using a battery of nonspeech oromotor, speech, and auditory/aphasia tasks. There was evidence of a moderate positive association between the 2 impairments across participants. However, individual profiles revealed patterns of dissociation between the 2 in a few cases, with evidence of double dissociation of speech and oral apraxic impairment. We discuss the implications of these relationships for models of oral motor and speech control. © The Author 2015. Published by Oxford University Press. All rights reserved. For permissions, please e-mail: journals.permissions@oup.com.

  19. Articulating feedstock delivery device

    Science.gov (United States)

    Jordan, Kevin

    2013-11-05

    A fully articulable feedstock delivery device that is designed to operate at pressure and temperature extremes. The device incorporates an articulating ball assembly which allows for more accurate delivery of the feedstock to a target location. The device is suitable for a variety of applications including, but not limited to, delivery of feedstock to a high-pressure reaction chamber or process zone.

  20. Fundamental Frequency and Direction-of-Arrival Estimation for Multichannel Speech Enhancement

    DEFF Research Database (Denmark)

    Karimian-Azari, Sam

    Audio systems receive the speech signals of interest usually in the presence of noise. The noise has profound impacts on the quality and intelligibility of the speech signals, and it is therefore clear that the noisy signals must be cleaned up before being played back, stored, or analyzed. We can...... estimate the speech signal of interest from the noisy signals using a priori knowledge about it. A human speech signal is broadband and consists of both voiced and unvoiced parts. The voiced part is quasi-periodic with a time-varying fundamental frequency (or pitch as it is commonly referred to). We...... their time differences which eventually may further reduce the effects of noise. This thesis introduces a number of principles and methods to estimate periodic signals in noisy environments with application to multichannel speech enhancement. We propose model-based signal enhancement concerning the model...

  1. Perceptions of University Instructors When Listening to International Student Speech

    Science.gov (United States)

    Sheppard, Beth; Elliott, Nancy; Baese-Berk, Melissa

    2017-01-01

    Intensive English Program (IEP) Instructors and content faculty both listen to international students at the university. For these two groups of instructors, this study compared perceptions of international student speech by collecting comprehensibility ratings and transcription samples for intelligibility scores. No significant differences were…

  2. Acoustic-phonetic and artificial neural network feature analysis to assess speech quality of stop consonants produced by patients treated for oral or oropharyngeal cancer

    NARCIS (Netherlands)

    de Bruijn, Marieke J.; ten Bosch, Louis; Kuik, Dirk J.; Witte, Birgit I.; Langendijk, Johannes A.; Leemans, C. Rene; Verdonck-de Leeuw, Irma M.

    Speech impairment often occurs in patients after treatment for head and neck cancer. A specific speech characteristic that influences intelligibility and speech quality is voice-onset-time (VOT) in stop consonants. VOT is one of the functionally most relevant parameters that distinguishes voiced and

  3. A speech-controlled environmental control system for people with severe dysarthria.

    Science.gov (United States)

    Hawley, Mark S; Enderby, Pam; Green, Phil; Cunningham, Stuart; Brownsell, Simon; Carmichael, James; Parker, Mark; Hatzis, Athanassios; O'Neill, Peter; Palmer, Rebecca

    2007-06-01

    Automatic speech recognition (ASR) can provide a rapid means of controlling electronic assistive technology. Off-the-shelf ASR systems function poorly for users with severe dysarthria because of the increased variability of their articulations. We have developed a limited vocabulary speaker dependent speech recognition application which has greater tolerance to variability of speech, coupled with a computerised training package which assists dysarthric speakers to improve the consistency of their vocalisations and provides more data for recogniser training. These applications, and their implementation as the interface for a speech-controlled environmental control system (ECS), are described. The results of field trials to evaluate the training program and the speech-controlled ECS are presented. The user-training phase increased the recognition rate from 88.5% to 95.4% (p<0.001). Recognition rates were good for people with even the most severe dysarthria in everyday usage in the home (mean word recognition rate 86.9%). Speech-controlled ECS were less accurate (mean task completion accuracy 78.6% versus 94.8%) but were faster to use than switch-scanning systems, even taking into account the need to repeat unsuccessful operations (mean task completion time 7.7s versus 16.9s, p<0.001). It is concluded that a speech-controlled ECS is a viable alternative to switch-scanning systems for some people with severe dysarthria and would lead, in many cases, to more efficient control of the home.

  4. Silent Speech Recognition as an Alternative Communication Device for Persons with Laryngectomy.

    Science.gov (United States)

    Meltzner, Geoffrey S; Heaton, James T; Deng, Yunbin; De Luca, Gianluca; Roy, Serge H; Kline, Joshua C

    2017-12-01

    Each year thousands of individuals require surgical removal of their larynx (voice box) due to trauma or disease, and thereby require an alternative voice source or assistive device to verbally communicate. Although natural voice is lost after laryngectomy, most muscles controlling speech articulation remain intact. Surface electromyographic (sEMG) activity of speech musculature can be recorded from the neck and face, and used for automatic speech recognition to provide speech-to-text or synthesized speech as an alternative means of communication. This is true even when speech is mouthed or spoken in a silent (subvocal) manner, making it an appropriate communication platform after laryngectomy. In this study, 8 individuals at least 6 months after total laryngectomy were recorded using 8 sEMG sensors on their face (4) and neck (4) while reading phrases constructed from a 2,500-word vocabulary. A unique set of phrases were used for training phoneme-based recognition models for each of the 39 commonly used phonemes in English, and the remaining phrases were used for testing word recognition of the models based on phoneme identification from running speech. Word error rates were on average 10.3% for the full 8-sensor set (averaging 9.5% for the top 4 participants), and 13.6% when reducing the sensor set to 4 locations per individual (n=7). This study provides a compelling proof-of-concept for sEMG-based alaryngeal speech recognition, with the strong potential to further improve recognition performance.
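
    The word error rates quoted above follow the standard edit-distance formulation, shown here for clarity since it is how any such recogniser is scored: the minimum number of word substitutions, insertions, and deletions needed to turn the hypothesis into the reference, divided by the reference length.

```python
def word_error_rate(ref, hyp):
    """Word error rate via Levenshtein distance over words
    (standard scoring metric for speech recognition)."""
    r, h = ref.split(), hyp.split()
    # dp[i][j] = edits to turn first i reference words into first j hypothesis words
    dp = [[0] * (len(h) + 1) for _ in range(len(r) + 1)]
    for i in range(len(r) + 1):
        dp[i][0] = i            # i deletions
    for j in range(len(h) + 1):
        dp[0][j] = j            # j insertions
    for i in range(1, len(r) + 1):
        for j in range(1, len(h) + 1):
            sub = dp[i - 1][j - 1] + (r[i - 1] != h[j - 1])
            dp[i][j] = min(sub, dp[i - 1][j] + 1, dp[i][j - 1] + 1)
    return dp[len(r)][len(h)] / len(r)
```

    A 10.3% WER thus means roughly one word in ten was substituted, inserted, or deleted relative to the read phrases.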

  5. Auditory and Non-Auditory Contributions for Unaided Speech Recognition in Noise as a Function of Hearing Aid Use.

    Science.gov (United States)

    Gieseler, Anja; Tahden, Maike A S; Thiel, Christiane M; Wagener, Kirsten C; Meis, Markus; Colonius, Hans

    2017-01-01

    Differences in understanding speech in noise among hearing-impaired individuals cannot be explained entirely by hearing thresholds alone, suggesting the contribution of factors beyond the standard auditory ones derived from the audiogram. This paper reports two analyses addressing individual differences in the explanation of unaided speech-in-noise performance among n = 438 elderly hearing-impaired listeners (mean age = 71.1 ± 5.8 years). The main analysis was designed to identify clinically relevant auditory and non-auditory measures for speech-in-noise prediction, using auditory tests (audiogram, categorical loudness scaling) and cognitive tests (verbal-intelligence test, screening test of dementia), as well as questionnaires assessing various self-reported measures (health status, socio-economic status, and subjective hearing problems). Using stepwise linear regression analysis, 62% of the variance in unaided speech-in-noise performance was explained, with the measures Pure-tone average (PTA), Age, and Verbal intelligence emerging as the three most important predictors. In the complementary analysis, individuals with the same hearing loss profile were separated into hearing aid users (HAU) and non-users (NU) and compared regarding potential differences in the test measures and in explaining unaided speech-in-noise recognition. The groupwise comparisons revealed significant differences in auditory measures and self-reported subjective hearing problems, while no differences in the cognitive domain were found. Furthermore, groupwise regression analyses revealed that Verbal intelligence had predictive value in both groups, whereas Age and PTA emerged as significant only in the group of hearing aid non-users.

  6. Vocabulary Facilitates Speech Perception in Children With Hearing Aids.

    Science.gov (United States)

    Klein, Kelsey E; Walker, Elizabeth A; Kirby, Benjamin; McCreery, Ryan W

    2017-08-16

    We examined the effects of vocabulary, lexical characteristics (age of acquisition and phonotactic probability), and auditory access (aided audibility and daily hearing aid [HA] use) on speech perception skills in children with HAs. Participants included 24 children with HAs and 25 children with normal hearing (NH), ages 5-12 years. Groups were matched on age, expressive and receptive vocabulary, articulation, and nonverbal working memory. Participants repeated monosyllabic words and nonwords in noise. Stimuli varied on age of acquisition, lexical frequency, and phonotactic probability. Performance in each condition was measured by the signal-to-noise ratio at which the child could accurately repeat 50% of the stimuli. Children from both groups with larger vocabularies showed better performance than children with smaller vocabularies on nonwords and late-acquired words but not early-acquired words. Overall, children with HAs showed poorer performance than children with NH. Auditory access was not associated with speech perception for the children with HAs. Children with HAs show deficits in sensitivity to phonological structure but appear to take advantage of vocabulary skills to support speech perception in the same way as children with NH. Further investigation is needed to understand the causes of the gap that exists between the overall speech perception abilities of children with HAs and children with NH.
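
    The "signal-to-noise ratio at which the child could accurately repeat 50% of the stimuli" is a speech reception threshold. Given percent-correct scores measured at several SNRs, a simple way to estimate it is linear interpolation across the psychometric function. This is an illustrative stand-in for the adaptive procedure actually used, and it assumes performance rises monotonically with SNR.

```python
import numpy as np

def srt50(snrs_db, pct_correct):
    """Interpolate the SNR (dB) at which performance crosses 50% correct.
    Assumes percent correct increases monotonically with SNR."""
    snr = np.asarray(snrs_db, float)
    pct = np.asarray(pct_correct, float)
    order = np.argsort(snr)
    return float(np.interp(50.0, pct[order], snr[order]))
```

    A lower (more negative) threshold indicates better performance, since the listener tolerates more noise for the same accuracy.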

  7. A Systematic Review of Cross-Linguistic and Multilingual Speech and Language Outcomes for Children with Hearing Loss

    Science.gov (United States)

    Crowe, Kathryn; McLeod, Sharynne

    2014-01-01

    The purpose of this study was to systematically review the factors affecting the language, speech intelligibility, speech production, and lexical tone development of children with hearing loss who use spoken languages other than English. Relevant studies of children with hearing loss published between 2000 and 2011 were reviewed with reference to…

  8. Intelligence Unleashed: An argument for AI in Education

    OpenAIRE

    Luckin, R.; Holmes, W.

    2016-01-01

    This paper on artificial intelligence in education (AIEd) has two aims. The first: to explain to a non-specialist, interested, reader what AIEd is: its goals, how it is built, and how it works. The second: to set out the argument for what AIEd can offer teaching and learning, both now and in the future, with an eye towards improving learning and life outcomes for all. Computer systems that are artificially intelligent interact with the world using capabilities (such as speech recognition) and...

  9. Can you hear me yet? An intracranial investigation of speech and non-speech audiovisual interactions in human cortex.

    Science.gov (United States)

    Rhone, Ariane E; Nourski, Kirill V; Oya, Hiroyuki; Kawasaki, Hiroto; Howard, Matthew A; McMurray, Bob

    In everyday conversation, viewing a talker's face can provide information about the timing and content of an upcoming speech signal, resulting in improved intelligibility. Using electrocorticography, we tested whether human auditory cortex in Heschl's gyrus (HG) and on superior temporal gyrus (STG), and motor cortex on precentral gyrus (PreC), were responsive to visual/gestural information prior to the onset of sound, and whether early stages of auditory processing were sensitive to the visual content (speech syllable versus non-speech motion). Event-related band power (ERBP) in the high gamma band was content-specific prior to acoustic onset on STG and PreC, and ERBP in the beta band differed in all three areas. Following sound onset, we found no evidence for content-specificity in HG, evidence for visual specificity in PreC, and specificity for both modalities in STG. These results support models of audio-visual processing in which sensory information is integrated in non-primary cortical areas.

  10. The Effect of Furlow Palatoplasty Timing on Speech Outcomes in Submucous Cleft Palate.

    Science.gov (United States)

    Swanson, Jordan W; Mitchell, Brianne T; Cohen, Marilyn; Solot, Cynthia; Jackson, Oksana; Low, David; Bartlett, Scott P; Taylor, Jesse A

    2017-08-01

    Because some patients with submucous cleft palate (SMCP) are asymptomatic, surgical treatment is conventionally delayed until hypernasal resonance is identified during speech production. We aimed to identify whether speech outcomes after repair of an SMCP are influenced by age at repair. We retrospectively studied nonsyndromic children with SMCP. Speech results before and after any surgical treatment or physical management of the palate were compared using the Pittsburgh Weighted Speech Scoring system. Furlow palatoplasty was performed on 40 nonsyndromic patients with SMCP, and 26 patients were not surgically treated. Total composite speech scores improved significantly among children repaired between 3 and 4 years of age (P = 0.02), but not among those repaired after 4 years of age (P = 0.63). Twelve (86%) of 14 patients repaired after 4 years of age had borderline or incompetent speech (composite Pittsburgh Weighted Speech Score ≥3), compared with 2 (29%) of 7 repaired between 3 and 4 years of age (P = 0.0068), despite worse prerepair scores in the latter group. Resonance improved in children repaired after 4 years of age, but articulation errors persisted to a greater degree than in those treated before 4 years of age (P = 0.01). Conclusions: SMCP repair before 4 years of age appears to be associated with lower ultimate rates of borderline or incompetent speech. Speech of patients repaired at or after 4 years of age seems to be characterized by persistent misarticulation. These findings highlight the importance of timely diagnosis and management.

  11. Speech therapy in peripheral facial palsy: an orofacial myofunctional approach

    Directory of Open Access Journals (Sweden)

    Hipólito Virgílio Magalhães Júnior

    2009-12-01

    Objective: To delineate the contributions of speech therapy to the rehabilitation of peripheral facial palsy, describing the role of the orofacial myofunctional approach in this process. Methods: A literature review of articles published since 1995, conducted from March to December 2008, based on the characterization of peripheral facial palsy and its relation to speech-language disorders involving orofacial mobility, speech, and chewing, among others. The review prioritized scientific journal articles and specific book chapters from the studied period. As inclusion criteria, the literature should contain data on peripheral facial palsy, references to changes in the stomatognathic system, and the orofacial myofunctional approach. We excluded studies that addressed central paralysis, congenital palsy, and palsies of non-idiopathic causes. Results: The literature has addressed the contribution of speech therapy to the rehabilitation of facial symmetry, with improvement in the retention of liquids and soft foods during chewing and swallowing. The orofacial myofunctional approach contextualized the role of speech therapy in improving the coordination of speech articulation and in the gain of oral control during chewing and swallowing. Conclusion: Speech therapy in peripheral facial palsy contributed to, and was outlined by, applying the orofacial myofunctional approach to the reestablishment of facial symmetry, working on the functions of the stomatognathic system, including orofacial exercises and chewing training in association with articulation training. A greater number of publications is needed in this specific area of speech therapy.

  12. 2nd International Conference on Intelligent Technologies and Engineering Systems

    CERN Document Server

    Chen, Cheng-Yi; Yang, Cheng-Fu

    2014-01-01

    This book includes the original, peer reviewed research papers from the 2nd International Conference on Intelligent Technologies and Engineering Systems (ICITES2013), which took place on December 12-14, 2013 at Cheng Shiu University in Kaohsiung, Taiwan. Topics covered include: laser technology, wireless and mobile networking, lean and agile manufacturing, speech processing, microwave dielectrics, intelligent circuits and systems, 3D graphics, communications, and structure dynamics and control.

  13. The effect of instantaneous input dynamic range setting on the speech perception of children with the nucleus 24 implant.

    Science.gov (United States)

    Davidson, Lisa S; Skinner, Margaret W; Holstad, Beth A; Fears, Beverly T; Richter, Marie K; Matusofsky, Margaret; Brenner, Christine; Holden, Timothy; Birath, Amy; Kettel, Jerrica L; Scollie, Susan

    2009-06-01

    The purpose of this study was to examine the effects of a wider instantaneous input dynamic range (IIDR) setting on speech perception and comfort in quiet and noise for children wearing the Nucleus 24 implant system and the Freedom speech processor. In addition, children's ability to understand soft and conversational level speech in relation to aided sound-field thresholds was examined. Thirty children (age, 7 to 17 years) with the Nucleus 24 cochlear implant system and the Freedom speech processor with two different IIDR settings (30 versus 40 dB) were tested on the Consonant Nucleus Consonant (CNC) word test at 50 and 60 dB SPL, the Bamford-Kowal-Bench Speech in Noise Test, and a loudness rating task for four-talker speech noise. Aided thresholds for frequency-modulated tones, narrowband noise, and recorded Ling sounds were obtained with the two IIDRs and examined in relation to CNC scores at 50 dB SPL. Speech Intelligibility Indices were calculated using the long-term average speech spectrum of the CNC words at 50 dB SPL measured at each test site and aided thresholds. Group mean CNC scores at 50 dB SPL with the 40 IIDR were significantly higher than with the 30 IIDR, whereas scores on the Bamford-Kowal-Bench Speech in Noise Test were not significantly different for the two IIDRs. Significantly improved aided thresholds at 250 to 6000 Hz as well as higher Speech Intelligibility Indices afforded improved audibility for speech presented at soft levels (50 dB SPL). These results indicate that an increased IIDR provides improved word recognition for soft levels of speech without compromising comfort of higher levels of speech sounds or sentence recognition in noise.
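
The Speech Intelligibility Index used in this study weights the audibility of speech in each frequency band by that band's importance for intelligibility. As a rough illustrative sketch (not the authors' clinical computation; the band SNRs, equal weights, and the simplified (SNR + 15)/30 audibility mapping are assumptions drawn from the general ANSI S3.5 framework):

```python
def speech_intelligibility_index(band_snrs_db, band_importance):
    """Simplified SII: sum of band-importance-weighted audibility.

    Each band's audibility is (SNR + 15)/30, clamped to [0, 1], so a band
    contributes fully once its SNR reaches +15 dB and nothing below -15 dB.
    """
    assert abs(sum(band_importance) - 1.0) < 1e-9, "importance weights must sum to 1"
    sii = 0.0
    for snr, weight in zip(band_snrs_db, band_importance):
        audibility = min(1.0, max(0.0, (snr + 15.0) / 30.0))
        sii += weight * audibility
    return sii

# Four equally weighted bands, each at 0 dB SNR, give an SII of 0.5.
print(speech_intelligibility_index([0, 0, 0, 0], [0.25, 0.25, 0.25, 0.25]))  # 0.5
```

Raising aided thresholds (and hence band SNRs for soft speech), as the wider IIDR did, increases the index monotonically under this mapping.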

  14. Phonological processes in the speech of school-age children with hearing loss: Comparisons with children with normal hearing.

    Science.gov (United States)

    Asad, Areej Nimer; Purdy, Suzanne C; Ballard, Elaine; Fairgray, Liz; Bowen, Caroline

    2018-04-27

    In this descriptive study, phonological processes were examined in the speech of children aged 5;0-7;6 (years; months) with mild to profound hearing loss using hearing aids (HAs) and cochlear implants (CIs), in comparison to their peers. A second aim was to compare phonological processes of HA and CI users. Children with hearing loss (CWHL, N = 25) were compared to children with normal hearing (CWNH, N = 30) with similar age, gender, linguistic, and socioeconomic backgrounds. Speech samples obtained from a list of 88 words, derived from three standardized speech tests, were analyzed using the CASALA (Computer Aided Speech and Language Analysis) program to evaluate participants' phonological systems, based on lax (a process appeared at least twice in the speech of at least two children) and strict (a process appeared at least five times in the speech of at least two children) counting criteria. Developmental phonological processes were eliminated in the speech of younger and older CWNH while eleven developmental phonological processes persisted in the speech of both age groups of CWHL. CWHL showed a similar trend of age of elimination to CWNH, but at a slower rate. Children with HAs and CIs produced similar phonological processes. Final consonant deletion, weak syllable deletion, backing, and glottal replacement were present in the speech of HA users, affecting their overall speech intelligibility. Developmental and non-developmental phonological processes persist in the speech of children with mild to profound hearing loss compared to their peers with typical hearing. The findings indicate that it is important for clinicians to consider phonological assessment in pre-school CWHL and the use of evidence-based speech therapy in order to reduce non-developmental and non-age-appropriate developmental processes, thereby enhancing their speech intelligibility. Copyright © 2018 Elsevier Inc. All rights reserved.

  15. Influence of musical training on understanding voiced and whispered speech in noise.

    Science.gov (United States)

    Ruggles, Dorea R; Freyman, Richard L; Oxenham, Andrew J

    2014-01-01

    This study tested the hypothesis that the previously reported advantage of musicians over non-musicians in understanding speech in noise arises from more efficient or robust coding of periodic voiced speech, particularly in fluctuating backgrounds. Speech intelligibility was measured in listeners with extensive musical training, and in those with very little musical training or experience, using normal (voiced) or whispered (unvoiced) grammatically correct nonsense sentences in noise that was spectrally shaped to match the long-term spectrum of the speech, and was either continuous or gated with a 16-Hz square wave. Performance was also measured in clinical speech-in-noise tests and in pitch discrimination. Musicians exhibited enhanced pitch discrimination, as expected. However, no systematic or statistically significant advantage for musicians over non-musicians was found in understanding either voiced or whispered sentences in either continuous or gated noise. Musicians also showed no statistically significant advantage in the clinical speech-in-noise tests. Overall, the results provide no evidence for a significant difference between young adult musicians and non-musicians in their ability to understand speech in noise.

  16. Social and Cognitive Impressions of Adults Who Do and Do Not Stutter Based on Listeners' Perceptions of Read-Speech Samples

    Directory of Open Access Journals (Sweden)

    Lauren J. Amick

    2017-07-01

    Full Text Available Stuttering is a neurodevelopmental disorder characterized by frequent and involuntary disruptions during speech production. Adults who stutter are often subject to negative perceptions. The present study examined whether negative social and cognitive impressions are formed when listening to speech, even without any knowledge about the speaker. Two experiments were conducted in which naïve participants were asked to listen to and provide ratings on samples of read speech produced by adults who stutter and typically-speaking adults, without knowledge about the individuals who produced the speech. In both experiments, listeners rated speaker cognitive ability, likeability, and anxiety, as well as a number of speech characteristics that included fluency, naturalness, intelligibility, the likelihood the speaker had a speech-and-language disorder (Experiment 1 only), and rate and volume (both Experiments 1 and 2). The speech of adults who stutter was perceived to be less fluent, natural, and intelligible, and to be slower and louder than the speech of typical adults. Adults who stutter were also perceived to have lower cognitive ability, to be less likeable, and to be more anxious than the typical adult speakers. Relations between speech characteristics and social and cognitive impressions were found independent of whether or not the speaker stuttered (i.e., they were found for both adults who stutter and typically-speaking adults) and did not depend on being cued that some of the speakers may have had a speech-language impairment.

  17. The effect of viewing speech on auditory speech processing is different in the left and right hemispheres.

    Science.gov (United States)

    Davis, Chris; Kislyuk, Daniel; Kim, Jeesun; Sams, Mikko

    2008-11-25

    We used whole-head magnetoencephalography (MEG) to record changes in neuromagnetic N100m responses generated in the left and right auditory cortex as a function of the match between visual and auditory speech signals. Stimuli were auditory-only (AO) and auditory-visual (AV) presentations of /pi/, /ti/ and /vi/. Three types of intensity matched auditory stimuli were used: intact speech (Normal), frequency band filtered speech (Band) and speech-shaped white noise (Noise). The behavioural task was to detect the /vi/ syllables, which comprised 12% of stimuli. N100m responses were measured to averaged /pi/ and /ti/ stimuli. Behavioural data showed that identification of the stimuli was faster and more accurate for Normal than for Band stimuli, and for Band than for Noise stimuli. Reaction times were faster for AV than AO stimuli. MEG data showed that in the left hemisphere, N100m to both AO and AV stimuli was largest for the Normal, smaller for Band and smallest for Noise stimuli. In the right hemisphere, Normal and Band AO stimuli elicited N100m responses of quite similar amplitudes, but N100m amplitude to Noise was about half of that. There was a reduction in N100m for the AV compared to the AO conditions. The size of this reduction for each stimulus type was the same in the left hemisphere but graded in the right (being largest to the Normal, smaller to the Band and smallest to the Noise stimuli). The N100m decrease for the Normal stimuli was significantly larger in the right than in the left hemisphere. We suggest that the effect of processing visual speech seen in the right hemisphere likely reflects suppression of the auditory response based on AV cues for place of articulation.

  18. Multichannel infinite clipping as a form of sampling of speech signals

    International Nuclear Information System (INIS)

    Guidarelli, G.

    1985-01-01

    A remarkable improvement in both intelligibility and naturalness of infinitely clipped speech can be achieved by means of a multichannel system in which the speech signal is split into several band-pass channels before clipping and subsequently reconstructed by summing the clipped outputs of each channel. A possible explanation of this improvement is given, based on the so-called zero-based representation of band-limited signals, in which the zero-crossing sequence is considered a set of samples of the signal.
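
The split-clip-sum structure described above can be sketched in a few lines of Python. The one-pole filter bank and its coefficients below are illustrative stand-ins for proper band-pass channels, not the paper's actual filters; the sketch only demonstrates the multichannel clipping idea:

```python
import math

def one_pole_lowpass(x, alpha):
    """One-pole IIR low-pass: y[n] = y[n-1] + alpha * (x[n] - y[n-1])."""
    y, out = 0.0, []
    for s in x:
        y += alpha * (s - y)
        out.append(y)
    return out

def band_split(x, alphas):
    """Split a signal into len(alphas) + 1 bands by successive low-pass differences."""
    bands, residual = [], list(x)
    for a in alphas:
        low = one_pole_lowpass(residual, a)
        bands.append([r - l for r, l in zip(residual, low)])  # part above this cutoff
        residual = low
    bands.append(residual)  # the final low band
    return bands

def infinite_clip(x):
    """Infinite clipping: keep only the sign, i.e. the zero-crossing pattern."""
    return [1.0 if s > 0 else -1.0 for s in x]

def multichannel_clip(x, alphas=(0.5, 0.2, 0.05)):
    """Clip each band separately, then sum the clipped channel outputs."""
    clipped = [infinite_clip(band) for band in band_split(x, alphas)]
    return [sum(samples) for samples in zip(*clipped)]

# A toy two-component signal: slow tone plus a faster, weaker tone.
signal = [math.sin(2 * math.pi * 0.01 * n) + 0.5 * math.sin(2 * math.pi * 0.1 * n)
          for n in range(200)]
reconstructed = multichannel_clip(signal)
```

Unlike single-channel clipping, the summed output takes graded values (here in {-4, -2, 0, 2, 4}), which is what preserves some amplitude structure of the original signal.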

  19. Artificial intelligence, expert systems, computer vision, and natural language processing

    Science.gov (United States)

    Gevarter, W. B.

    1984-01-01

    An overview of artificial intelligence (AI), its core ingredients, and its applications is presented. The knowledge representation, logic, problem solving approaches, languages, and computers pertaining to AI are examined, and the state of the art in AI is reviewed. The use of AI in expert systems, computer vision, natural language processing, speech recognition and understanding, speech synthesis, problem solving, and planning is examined. Basic AI topics, including automation, search-oriented problem solving, knowledge representation, and computational logic, are discussed.

  20. An overview of artificial intelligence and robotics. Volume 1: Artificial intelligence. Part B: Applications

    Science.gov (United States)

    Gevarter, W. B.

    1983-01-01

    Artificial Intelligence (AI) is an emerging technology that has recently attracted considerable attention. Many applications are now under development. This report, Part B of a three part report on AI, presents overviews of the key application areas: Expert Systems, Computer Vision, Natural Language Processing, Speech Interfaces, and Problem Solving and Planning. The basic approaches to such systems, the state-of-the-art, existing systems and future trends and expectations are covered.

  1. Phoneme Compression: processing of the speech signal and effects on speech intelligibility in hearing-Impaired listeners

    NARCIS (Netherlands)

    A. Goedegebure (Andre)

    2005-01-01

    textabstractHearing-aid users often continue to have problems with poor speech understanding in difficult acoustical conditions. Another commonly reported problem is that certain sounds become too loud whereas other sounds are still not audible. Dynamic range compression is a signal processing

  2. The speech-based envelope power spectrum model (sEPSM) family: Development, achievements, and current challenges

    DEFF Research Database (Denmark)

    Relano-Iborra, Helia; Chabot-Leclerc, Alexandre; Scheidiger, Christoph

    2017-01-01

    Intelligibility models provide insights regarding the effects of target speech characteristics, transmission channels and/or auditory processing on the speech perception performance of listeners. In 2011, Jørgensen and Dau proposed the speech-based envelope power spectrum model [sEPSM, Jørgensen… Later developments have extended the predictive power of the original model to a broad range of conditions. This contribution presents the most recent developments within the sEPSM “family:” (i) a binaural extension, the B-sEPSM [Chabot-Leclerc et al. (2016). J. Acoust. Soc. Am. 140(1), 192-205], which combines better…

  3. Intelligence and Schooling. Fueling the Education Explosion: Proceedings of Conference 2 (Cleveland, Ohio, November 17-18, 1983).

    Science.gov (United States)

    Gardner, Mary, Ed.; Reed-Mundell, Charlene, Ed.

    These proceedings contain presentations from a conference whose major topics were real-world intelligence, artificial intelligence, and linkage between the education and corporate sectors. "People, Perspectives...Potential and Possibilities" (Elyse S. Fleming), which was the conference's closing speech, briefly summarizes the information…

  4. Speech timing and working memory in profoundly deaf children after cochlear implantation

    OpenAIRE

    Burkholder, Rose A.; Pisoni, David B.

    2003-01-01

    Thirty-seven profoundly deaf children between 8- and 9-years-old with cochlear implants and a comparison group of normal-hearing children were studied to measure speaking rates, digit spans, and speech timing during digit span recall. The deaf children displayed longer sentence durations and pauses during recall and shorter digit spans compared to the normal-hearing children. Articulation rates, measured from sentence durations, were strongly correlated with immediate memory span in both norm...

  5. Grammatical constraints on phonological encoding in speech production.

    Science.gov (United States)

    Heller, Jordana R; Goldrick, Matthew

    2014-12-01

    To better understand the influence of grammatical encoding on the retrieval and encoding of phonological word-form information during speech production, we examine how grammatical class constraints influence the activation of phonological neighbors (words phonologically related to the target--e.g., MOON, TWO for target TUNE). Specifically, we compare how neighbors that share a target's grammatical category (here, nouns) influence its planning and retrieval, assessed by picture naming latencies, and phonetic encoding, assessed by word productions in picture names, when grammatical constraints are strong (in sentence contexts) versus weak (bare naming). Within-category (noun) neighbors influenced planning time and phonetic encoding more strongly in sentence contexts. This suggests that grammatical encoding constrains phonological processing; the influence of phonological neighbors is grammatically dependent. Moreover, effects on planning times could not fully account for phonetic effects, suggesting that phonological interaction affects articulation after speech onset. These results support production theories integrating grammatical, phonological, and phonetic processes.

  6. Speech Production in 3-Year-Old Internationally Adopted Children with Unilateral Cleft Lip and Palate

    Science.gov (United States)

    Larsson, AnnaKarin; Schölin, Johnna; Mark, Hans; Jönsson, Radi; Persson, Christina

    2017-01-01

    Background: In the last decade, a large number of children with cleft lip and palate have been adopted to Sweden. A majority of the children were born in China and they usually arrive in Sweden with an unoperated palate. There is currently a lack of knowledge regarding speech and articulation development in this group of children, who also have to…

  7. Articulating Atmospheres

    DEFF Research Database (Denmark)

    Kinch, Sofie

    2011-01-01

    This paper presents an architectural approach to designing computational interfaces by articulating the notion of atmosphere in the field of interaction design. It draws upon the concept of kinesthetic interaction and a philosophical notion on atmosphere emphasizing the importance of bodily...

  8. [The controversy of routine articulator mounting in orthodontics].

    Science.gov (United States)

    Wang, Li; Han, Xianglong; Bai, Ding

    2013-06-01

    Articulators have been widely used by clinicians in dentistry, but routine articulator mounting remains controversial in orthodontics. Orthodontists oriented toward gnathology approve of routine articulator mounting, while non-gnathologic orthodontists disapprove of it. This article reviews the arguments of orthodontists for and against routine articulator mounting, based on considerations of the bite, temporomandibular disorders (TMD), periodontitis, and so on.

  9. Automatic initial and final segmentation in cleft palate speech of Mandarin speakers.

    Directory of Open Access Journals (Sweden)

    Ling He

    Full Text Available The speech unit segmentation is an important pre-processing step in the analysis of cleft palate speech. In Mandarin, one syllable is composed of two parts: initial and final. In cleft palate speech, the resonance disorders occur at the finals and the voiced initials, while the articulation disorders occur at the unvoiced initials. Thus, the initials and finals are the minimum speech units that can reflect the characteristics of cleft palate speech disorders. In this work, an automatic initial/final segmentation method is proposed. The tested cleft palate speech utterances were collected from the Cleft Palate Speech Treatment Center in the Hospital of Stomatology, Sichuan University, which treats the largest number of cleft palate patients in China. The cleft palate speech data include 824 speech segments, and the control samples contain 228 speech segments. The syllables are first extracted from the speech utterances. The proposed syllable extraction method avoids a training stage and achieves good performance for both voiced and unvoiced speech. The syllables are then classified as having "quasi-unvoiced" or "quasi-voiced" initials, and respective initial/final segmentation methods are proposed for these two types of syllables. Moreover, a two-step segmentation method is proposed: the rough locations of syllable and initial/final boundaries are refined in the second segmentation step, in order to improve the robustness of the segmentation accuracy. The experiments show that the initial/final segmentation accuracies for syllables with quasi-unvoiced initials are higher than for those with quasi-voiced initials. For the cleft palate speech, the mean time error is 4.4 ms for syllables with quasi-unvoiced initials and 25.7 ms for syllables with quasi-voiced initials, and the correct segmentation accuracy P30 for all the syllables is 91.69%. For the control samples, P30 for all the
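
The paper's own algorithm is not reproduced in the record, but the classic acoustic cues behind a voiced/unvoiced decision of this kind, short-time energy and zero-crossing rate, can be illustrated with a small sketch. The frame length, thresholds, and synthetic signals are arbitrary assumptions for demonstration only:

```python
import math
import random

def frame_features(samples, frame_len=80):
    """Short-time energy and zero-crossing rate for consecutive frames."""
    feats = []
    for i in range(0, len(samples) - frame_len + 1, frame_len):
        frame = samples[i:i + frame_len]
        energy = sum(s * s for s in frame) / frame_len
        zcr = sum(1 for a, b in zip(frame, frame[1:]) if a * b < 0) / frame_len
        feats.append((energy, zcr))
    return feats

def classify_frames(feats, energy_thresh=0.05, zcr_thresh=0.25):
    """High energy with few zero crossings suggests quasi-voiced; otherwise quasi-unvoiced."""
    return ["quasi-voiced" if e > energy_thresh and z < zcr_thresh else "quasi-unvoiced"
            for e, z in feats]

rng = random.Random(1)
voiced = [math.sin(2 * math.pi * 0.02 * n) for n in range(400)]   # low-frequency tone
unvoiced = [rng.uniform(-0.2, 0.2) for _ in range(400)]           # weak noise burst
labels = classify_frames(frame_features(voiced + unvoiced))
```

On this synthetic signal the first five frames (the tone) come out quasi-voiced and the last five (the noise) quasi-unvoiced; real cleft palate speech would of course require far more robust features.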

  10. The effect of sensorineural hearing loss and tinnitus on speech recognition over air and bone conduction military communications headsets.

    Science.gov (United States)

    Manning, Candice; Mermagen, Timothy; Scharine, Angelique

    2017-06-01

    Military personnel are at risk for hearing loss due to noise exposure during deployment (USACHPPM, 2008). Despite mandated use of hearing protection, hearing loss and tinnitus are prevalent due to reluctance to use hearing protection. Bone conduction headsets can offer good speech intelligibility for normal hearing (NH) listeners while allowing the ears to remain open in quiet environments and the use of hearing protection when needed. Those who suffer from tinnitus, the experience of perceiving a sound not produced by an external source, often show degraded speech recognition; however, it is unclear whether this is a result of decreased hearing sensitivity or increased distractibility (Moon et al., 2015). It has been suggested that the vibratory stimulation of a bone conduction headset might ameliorate the effects of tinnitus on speech perception; however, there is currently no research to support or refute this claim (Hoare et al., 2014). Speech recognition of words presented over air conduction and bone conduction headsets was measured for three groups of listeners: those with normal hearing, sensorineural hearing loss, and/or tinnitus. Three speech-to-noise ratios (SNR = 0, -6, -12 dB) were created by embedding speech items in pink noise. Better speech recognition performance was observed with the bone conduction headset regardless of hearing profile, and speech intelligibility was a function of SNR. Discussion will include study limitations and the implications of these findings for those serving in the military. Published by Elsevier B.V.

  11. An Intelligibility Assessment of Toddlers with Cleft Lip and Palate Who Received and Did Not Receive Presurgical Infant Orthopedic Treatment.

    Science.gov (United States)

    Konst, Emmy M.; Weersink-Braks, Hanny; Rietveld, Toni; Peters, Herman

    2000-01-01

    The influence of presurgical infant orthopedic treatment (PIO) on speech intelligibility was evaluated with 10 toddlers who used PIO during the first year of life and 10 who did not. Treated children were rated as exhibiting greater intelligibility; however, transcription data indicated that there were no group differences in actual intelligibility.…

  12. Accent, Intelligibility, and the Role of the Listener: Perceptions of English-Accented German by Native German Speakers

    Science.gov (United States)

    Hayes-Harb, Rachel; Watzinger-Tharp, Johanna

    2012-01-01

    We explore the relationship between accentedness and intelligibility, and investigate how listeners' beliefs about nonnative speech interact with their accentedness and intelligibility judgments. Native German speakers and native English learners of German produced German sentences, which were presented to 12 native German speakers in accentedness…

  13. Effect of talker and speaking style on the Speech Transmission Index (L)

    NARCIS (Netherlands)

    Wijngaarden, S.J. van; Houtgast, T.

    2004-01-01

    The Speech Transmission Index (STI) is routinely applied for predicting the intelligibility of messages (sentences) in noise and reverberation. Despite clear evidence that the STI is capable of doing so accurately, recent results indicate that the STI sometimes underestimates the effect of
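
The STI referred to in this record is derived from the modulation transfer function of the transmission channel. A simplified sketch of the core mapping from modulation transfer values to an index (ignoring the standard's band-importance weighting and redundancy corrections, which are part of the full IEC 60268-16 procedure):

```python
import math

def transmission_index(m):
    """Map one modulation transfer value m in (0, 1) to a transmission index."""
    snr = 10.0 * math.log10(m / (1.0 - m))   # effective signal-to-noise ratio in dB
    snr = max(-15.0, min(15.0, snr))         # clip to the +/-15 dB range
    return (snr + 15.0) / 30.0

def sti(m_values):
    """Simplified STI: average transmission index over modulation conditions."""
    return sum(transmission_index(m) for m in m_values) / len(m_values)

print(sti([0.5]))  # m = 0.5 gives 0 dB effective SNR, hence STI = 0.5
```

Talker and speaking-style effects, the subject of this letter, show up in this framework as changes in the speech modulation spectrum that the fixed channel-based m values do not capture.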

  14. Use of listening strategies for the speech of individuals with dysarthria and cerebral palsy.

    Science.gov (United States)

    Hustad, Katherine C; Dardis, Caitlin M; Kramper, Amy J

    2011-03-01

    This study examined listeners' endorsement of cognitive, linguistic, segmental, and suprasegmental strategies employed when listening to speakers with dysarthria. The study also examined whether strategy endorsement differed between listeners who earned the highest and lowest intelligibility scores. Speakers were eight individuals with dysarthria and cerebral palsy. Listeners were 80 individuals who transcribed speech stimuli and rated their use of each of 24 listening strategies on a 4-point scale. Results showed that cognitive and linguistic strategies were most highly endorsed. Use of listening strategies did not differ between listeners with the highest and lowest intelligibility scores. Results suggest that there may be a core of strategies common to listeners of speakers with dysarthria that may be supplemented by additional strategies, based on characteristics of the speaker and speech signal.

  15. On Autonomous Articulated Vehicles

    OpenAIRE

    Nayl, Thaker

    2015-01-01

    The objective of this thesis is to address the problems of modeling, path planning and path following for an articulated vehicle in a realistic environment and in the presence of multiple obstacles. In greater detail, the problem of the kinematic modeling of an articulated vehicle is revisited through the proposal of a proper model in which the dimensions and properties of the vehicle can be fully described, rather than considering it as a unit point. Based on this approach, nonlinear and line...

  16. The Interaction of Temporal and Spectral Acoustic Information with Word Predictability on Speech Intelligibility

    Science.gov (United States)

    Shahsavarani, Somayeh Bahar

    High-level, top-down information such as linguistic knowledge is a salient cortical resource that influences speech perception under most listening conditions. But are all listeners able to exploit these resources for speech facilitation to the same extent? It was found that children with cochlear implants showed different patterns of benefit from contextual information in speech perception compared with their normal-hearing peers. Previous studies have discussed the role of non-acoustic factors such as linguistic and cognitive capabilities to account for this discrepancy. Given the fact that the amount of acoustic information encoded and processed by auditory nerves of listeners with cochlear implants differs from normal-hearing listeners and even varies across individuals with cochlear implants, it is important to study the interaction of specific acoustic properties of the speech signal with contextual cues. This relationship has been mostly neglected in previous research. In this dissertation, we aimed to explore how different acoustic dimensions interact to affect listeners' abilities to combine top-down information with bottom-up information in speech perception, beyond the known effects of linguistic and cognitive capacities shown previously. Specifically, the present study investigated whether there were any distinct context effects based on the resolution of spectral versus slowly-varying temporal information in perception of spectrally impoverished speech. To that end, two experiments were conducted. In both experiments, a noise-vocoded technique was adopted to generate spectrally-degraded speech to approximate acoustic cues delivered to listeners with cochlear implants. The frequency resolution was manipulated by varying the number of frequency channels. The temporal resolution was manipulated by low-pass filtering of the amplitude envelope with varying low-pass cutoff frequencies. The stimuli were presented to normal-hearing native speakers of American
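
A noise vocoder of the kind described, where each band's slowly varying amplitude envelope modulates a noise carrier in the same band, can be sketched as follows. The one-pole filters, channel edges, and envelope cutoff below are illustrative placeholders, not the dissertation's actual processing chain; varying `n_channels` and `env_alpha` corresponds to the spectral and temporal resolution manipulations described:

```python
import math
import random

def lowpass(x, alpha):
    """One-pole IIR low-pass filter; larger alpha means a higher cutoff."""
    y, out = 0.0, []
    for s in x:
        y += alpha * (s - y)
        out.append(y)
    return out

def bandpass(x, alpha_lo, alpha_hi):
    """Crude band-pass as the difference of two low-passes (alpha_hi > alpha_lo)."""
    return [h - l for h, l in zip(lowpass(x, alpha_hi), lowpass(x, alpha_lo))]

def noise_vocode(x, n_channels=4, env_alpha=0.05, seed=0):
    """Replace each band's fine structure with band-limited noise scaled by the
    band's amplitude envelope, then sum the channels."""
    rng = random.Random(seed)
    noise = [rng.uniform(-1.0, 1.0) for _ in x]
    alphas = [0.02 * (2 ** k) for k in range(n_channels + 1)]  # channel edges
    out = [0.0] * len(x)
    for k in range(n_channels):
        band = bandpass(x, alphas[k], alphas[k + 1])
        envelope = lowpass([abs(s) for s in band], env_alpha)   # envelope extraction
        carrier = bandpass(noise, alphas[k], alphas[k + 1])     # matched noise band
        for i in range(len(x)):
            out[i] += envelope[i] * carrier[i]
    return out

signal = [math.sin(2 * math.pi * 0.05 * n) for n in range(256)]
vocoded = noise_vocode(signal)
```

With few channels and a low envelope cutoff the output preserves only coarse spectro-temporal structure, which is the degradation the experiments exploit.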

  17. Gender Issues and Language Articulation; a Brief Look at Pros of Gender Neutral Language Articulation

    Science.gov (United States)

    Ebrahimi, Pouria

    2009-01-01

    In the language produced by learners--in both oral and written form--a predominance of masculine language use can be observed. This brings to light the fact that gender has been a largely unexamined factor in the process of language teaching. Although learners are apparently accustomed to forming masculine-centered articulation, non-sexist…

  18. 78 FR 49693 - Speech-to-Speech and Internet Protocol (IP) Speech-to-Speech Telecommunications Relay Services...

    Science.gov (United States)

    2013-08-15

    Speech-to-Speech and Internet Protocol (IP) Speech-to-Speech Telecommunications Relay Services; Telecommunications Relay Services and Speech-to-Speech Services for Individuals With Hearing and Speech Disabilities, Report and Order (Order), document…

  19. Speech abilities in preschool children with speech sound disorder with and without co-occurring language impairment.

    Science.gov (United States)

    Macrae, Toby; Tyler, Ann A

    2014-10-01

    The authors compared preschool children with co-occurring speech sound disorder (SSD) and language impairment (LI) to children with SSD only in their numbers and types of speech sound errors. In this post hoc quasi-experimental study, independent samples t tests were used to compare the groups in the standard score from different tests of articulation/phonology, percent consonants correct, and the number of omission, substitution, distortion, typical, and atypical error patterns used in the production of different wordlists that had similar levels of phonetic and structural complexity. In comparison with children with SSD only, children with SSD and LI used similar numbers but different types of errors, including more omission patterns (p < .001, d = 1.55) and fewer distortion patterns (p = .022, d = 1.03). There were no significant differences in substitution, typical, and atypical error pattern use. Frequent omission error pattern use may reflect a more compromised linguistic system characterized by absent phonological representations for target sounds (see Shriberg et al., 2005). Research is required to examine the diagnostic potential of early frequent omission error pattern use in predicting later diagnoses of co-occurring SSD and LI and/or reading problems.

  20. Speech recognition in natural background noise.

    Directory of Open Access Journals (Sweden)

    Julien Meyer

    Full Text Available In the real world, human speech recognition nearly always involves listening in background noise. The impact of such noise on speech signals and on intelligibility performance increases with the separation of the listener from the speaker. The present behavioral experiment provides an overview of the effects of such acoustic disturbances on speech perception in conditions approaching ecologically valid contexts. We analysed the intelligibility loss in spoken word lists with increasing listener-to-speaker distance in a typical low-level natural background noise. The noise was combined with the simple spherical amplitude attenuation due to distance, basically changing the signal-to-noise ratio (SNR). Therefore, our study draws attention to some of the most basic environmental constraints that have pervaded spoken communication throughout human history. We evaluated the ability of native French participants to recognize French monosyllabic words (spoken at 65.3 dB(A), reference at 1 meter) at distances between 11 and 33 meters, which corresponded to the SNRs most revealing of the progressive effect of the selected natural noise (-8.8 dB to -18.4 dB). Our results showed that in such conditions, the identity of vowels is mostly preserved, with the striking peculiarity of the absence of confusion in vowels. The results also confirmed the functional role of consonants during lexical identification. The extensive analysis of recognition scores, confusion patterns and associated acoustic cues revealed that sonorant, sibilant and burst properties were the most important parameters influencing phoneme recognition. Altogether these analyses allowed us to extract a resistance scale from consonant recognition scores. We also identified specific perceptual consonant confusion groups depending on the position in the words (onset vs. coda). Finally our data suggested that listeners may access some acoustic cues of the CV transition, opening interesting perspectives for
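
The distance manipulation in this study follows from simple spherical spreading: level falls by 20·log10(d) dB relative to the 1 m reference. As an arithmetic check, the two reported endpoint SNRs are both consistent with a constant background of roughly 53.3 dB(A); that background value is inferred here, not stated in the abstract:

```python
import math

def speech_level_dba(distance_m, ref_level=65.3, ref_dist=1.0):
    """Spherical spreading: level falls by 20*log10(d) dB re the 1 m reference."""
    return ref_level - 20.0 * math.log10(distance_m / ref_dist)

def snr_db(distance_m, noise_level_dba):
    """SNR when a fixed background noise level meets distance-attenuated speech."""
    return speech_level_dba(distance_m) - noise_level_dba

print(round(snr_db(11, 53.3), 1))  # -8.8
print(round(snr_db(33, 53.3), 1))  # -18.4
```

Tripling the distance (11 m to 33 m) costs 20·log10(3) ≈ 9.5 dB, matching the reported 9.6 dB span of SNRs to within rounding.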

  1. Speech recognition in natural background noise.

    Science.gov (United States)

    Meyer, Julien; Dentel, Laure; Meunier, Fanny

    2013-01-01

    In the real world, human speech recognition nearly always involves listening in background noise. The impact of such noise on speech signals and on intelligibility performance increases with the separation of the listener from the speaker. The present behavioral experiment provides an overview of the effects of such acoustic disturbances on speech perception in conditions approaching ecologically valid contexts. We analysed the intelligibility loss in spoken word lists with increasing listener-to-speaker distance in a typical low-level natural background noise. The noise was combined with the simple spherical amplitude attenuation due to distance, which essentially changes the signal-to-noise ratio (SNR). Our study therefore draws attention to some of the most basic environmental constraints that have pervaded spoken communication throughout human history. We evaluated the ability of native French participants to recognize French monosyllabic words (spoken at 65.3 dB(A), reference at 1 meter) at distances between 11 and 33 meters, which corresponded to the SNRs most revealing of the progressive effect of the selected natural noise (-8.8 dB to -18.4 dB). Our results showed that in such conditions the identity of vowels is mostly preserved; strikingly, vowels were never confused with one another. The results also confirmed the functional role of consonants during lexical identification. The extensive analysis of recognition scores, confusion patterns and associated acoustic cues revealed that sonorant, sibilant and burst properties were the most important parameters influencing phoneme recognition. Altogether, these analyses allowed us to extract a resistance scale from consonant recognition scores. We also identified specific perceptual consonant confusion groups depending on their place in the words (onset vs. coda). Finally, our data suggested that listeners may access some acoustic cues of the CV transition, opening interesting perspectives for future studies.
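    The distance-to-SNR mapping described above follows directly from spherical spreading (the speech level drops by 20·log10(d) dB relative to its 1 m reference). As a sketch of the arithmetic: the ambient noise level of about 53.3 dB(A) used below is our back-calculation from the reported figures, not a value stated in the record.

```python
import math

def snr_at_distance(d_meters, speech_db_at_1m=65.3, noise_db=53.3):
    # Spherical spreading attenuates the speech level by 20*log10(d / 1 m) dB;
    # the (assumed constant) background noise level then sets the SNR.
    speech_db = speech_db_at_1m - 20 * math.log10(d_meters)
    return speech_db - noise_db

print(round(snr_at_distance(11), 1))  # -8.8, matching the SNR reported at 11 m
print(round(snr_at_distance(33), 1))  # -18.4, matching the SNR reported at 33 m
```

    Tripling the distance (11 m to 33 m) costs 20·log10(3) ≈ 9.5 dB, which is exactly the span between the two reported SNRs.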

  2. Everyday listeners' impressions of speech produced by individuals with adductor spasmodic dysphonia.

    Science.gov (United States)

    Nagle, Kathleen F; Eadie, Tanya L; Yorkston, Kathryn M

    2015-01-01

    Individuals with adductor spasmodic dysphonia (ADSD) have reported that unfamiliar communication partners appear to judge them as sneaky, nervous or not intelligent, apparently based on the quality of their speech; however, there is minimal research into the actual everyday perspective of listening to ADSD speech. The purpose of this study was to investigate the impressions of listeners hearing ADSD speech for the first time using a mixed-methods design. Everyday listeners were interviewed following sessions in which they made ratings of ADSD speech. A semi-structured interview approach was used and data were analyzed using thematic content analysis. Three major themes emerged: (1) everyday listeners make judgments about speakers with ADSD; (2) ADSD speech does not sound normal to everyday listeners; and (3) rating overall severity is difficult for everyday listeners. Participants described ADSD speech similarly to existing literature; however, some listeners inaccurately extrapolated speaker attributes based solely on speech samples. Listeners may draw erroneous conclusions about individuals with ADSD and these biases may affect the communicative success of these individuals. Results have implications for counseling individuals with ADSD, as well as the need for education and awareness about ADSD. Copyright © 2015 Elsevier Inc. All rights reserved.

  3. Causal inference of asynchronous audiovisual speech

    Directory of Open Access Journals (Sweden)

    John F Magnotti

    2013-11-01

    Full Text Available During speech perception, humans integrate auditory information from the voice with visual information from the face. This multisensory integration increases perceptual precision, but only if the two cues come from the same talker; this requirement has been largely ignored by current models of speech perception. We describe a generative model of multisensory speech perception that includes this critical step of determining the likelihood that the voice and face information have a common cause. A key feature of the model is that it is based on a principled analysis of how an observer should solve this causal inference problem using the asynchrony between the two cues and the reliability of the cues. This allows the model to make predictions about the behavior of subjects performing a synchrony judgment task, predictive power that does not exist in other approaches, such as post hoc fitting of Gaussian curves to behavioral data. We tested the model predictions against the performance of 37 subjects performing a synchrony judgment task viewing audiovisual speech under a variety of manipulations, including varying asynchronies, intelligibility, and visual cue reliability. The causal inference model outperformed the Gaussian model across two experiments, providing a better fit to the behavioral data with fewer parameters. Because the causal inference model is derived from a principled understanding of the task, model parameters are directly interpretable in terms of stimulus and subject properties.
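    The causal-inference step this model formalizes can be sketched in a few lines: compare the likelihood of the measured audiovisual asynchrony under a common cause (asynchronies cluster near zero) against independent causes (asynchrony roughly uniform over the inspection window). The parameter values below are illustrative assumptions, not those fitted in the study.

```python
import math

def p_common_cause(asynchrony_ms, sigma_ms=80.0, prior=0.5, window_ms=500.0):
    # Likelihood under a common cause: Gaussian around zero asynchrony,
    # with sigma reflecting the (un)reliability of the timing cues.
    like_common = math.exp(-asynchrony_ms**2 / (2 * sigma_ms**2)) \
        / (sigma_ms * math.sqrt(2 * math.pi))
    # Likelihood under independent causes: uniform over +/- window_ms.
    like_indep = 1.0 / (2 * window_ms)
    num = like_common * prior
    return num / (num + like_indep * (1 - prior))
```

    A "synchronous" judgment is then predicted whenever this posterior exceeds some criterion; widening sigma_ms (less reliable cues) flattens the curve, which is one way cue-reliability manipulations can enter such a model.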

  4. Development and evaluation of the British English coordinate response measure speech-in-noise test as an occupational hearing assessment tool.

    Science.gov (United States)

    Semeraro, Hannah D; Rowan, Daniel; van Besouw, Rachel M; Allsopp, Adrian A

    2017-10-01

    The studies described in this article outline the design and development of a British English version of the coordinate response measure (CRM) speech-in-noise (SiN) test. Our interest in the CRM is as a SiN test with high face validity for occupational auditory fitness-for-duty (AFFD) assessment. Study 1 used the method of constant stimuli to measure and adjust the psychometric functions of each target word, producing a speech corpus with equal intelligibility. Having ensured that all the target words had similar intelligibility, Studies 2 and 3 presented the CRM in an adaptive procedure in stationary speech-spectrum noise to measure speech reception thresholds and evaluate the test-retest reliability of the CRM SiN test. Studies 1 (n = 20) and 2 (n = 30) were completed by normal-hearing civilians. Study 3 (n = 22) was completed by hearing-impaired military personnel. The results display good test-retest reliability (95% confidence interval (CI)) for listeners with and without hearing impairment. The British English CRM using stationary speech-spectrum noise is a "ready to use" SiN test, suitable for investigation as an AFFD assessment tool for military personnel.
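    An adaptive speech-reception-threshold track of the kind used in Studies 2 and 3 can be sketched as a simple one-down/one-up staircase. The record does not state the actual step rule of the CRM test, so the step size and reversal-averaging rule here are illustrative assumptions.

```python
import random

def track_srt(p_correct, start_snr=0.0, step_db=2.0, trials=60, seed=7):
    # One-down/one-up: SNR falls after a correct response and rises after an
    # error, so the track oscillates around the 50%-correct point.
    rng = random.Random(seed)
    snr, last_dir, reversal_snrs = start_snr, None, []
    for _ in range(trials):
        direction = -1 if rng.random() < p_correct(snr) else 1
        if last_dir is not None and direction != last_dir:
            reversal_snrs.append(snr)
        last_dir = direction
        snr += direction * step_db
    # Estimate the SRT as the mean of the last few reversal points.
    tail = reversal_snrs[-6:]
    return sum(tail) / len(tail)
```

    Fed a steep logistic psychometric function centred at, say, -6 dB SNR, the estimate settles near -6 dB; other up/down ratios target other points on the psychometric function.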

  5. Speech and language development in 2-year-old children with cerebral palsy.

    Science.gov (United States)

    Hustad, Katherine C; Allison, Kristen; McFadd, Emily; Riehle, Katherine

    2014-06-01

    We examined early speech and language development in children who had cerebral palsy (CP). Questions addressed whether children could be classified into early profile groups on the basis of speech and language skills and whether there were differences on selected speech and language measures among groups. Speech and language assessments were completed on 27 children with CP who were between the ages of 24 and 30 months (mean age 27.1 months; SD 1.8). We examined several measures of expressive and receptive language, along with speech intelligibility. Two-step cluster analysis was used to identify homogeneous groups of children based on their performance on the seven dependent variables characterizing speech and language performance. The three groups identified were children not yet talking (44% of the sample), those whose talking abilities appeared to be emerging (41%), and established talkers (15%). Group differences were evident on all variables except receptive language skills. Overall, 85% of the 2-year-old children with CP in this study had clinical speech and/or language delays relative to age expectations. Findings suggest that children with CP should receive speech and language assessment and treatment at or before 2 years of age.
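    The grouping step above can be illustrated with a plain k-means pass over a single intelligibility-style feature. The study itself used two-step cluster analysis over seven variables; this one-dimensional sketch with invented scores only shows the mechanics of centroid-based profiling.

```python
def kmeans_1d(values, k=3, iters=25):
    # Spread the initial centres across the observed range.
    vs = sorted(values)
    centers = [vs[i * (len(vs) - 1) // (k - 1)] for i in range(k)]
    for _ in range(iters):
        # Assign each value to its nearest centre, then recompute centres.
        groups = [[] for _ in range(k)]
        for v in values:
            nearest = min(range(k), key=lambda i: abs(v - centers[i]))
            groups[nearest].append(v)
        centers = [sum(g) / len(g) if g else centers[i]
                   for i, g in enumerate(groups)]
    return centers, groups

# Hypothetical percent-intelligible-word scores for eight children:
centers, groups = kmeans_1d([1, 2, 3, 40, 45, 50, 88, 92])
```

    With these made-up scores the three centres land near 2, 45 and 90, i.e. "not yet talking", "emerging" and "established" profiles.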

  6. Intravelar veloplasty in cleft lip, alveolus and palate and outcome of speech and language acquisition: a prospective study.

    Science.gov (United States)

    Bitter, Klaus; Wegener, Carla; Gomille, Nadine

    2003-12-01

    Speech and language acquisition are major, important criteria in the treatment outcomes of cleft lip and palate patients. A generally accepted and definitive treatment protocol regarding surgical techniques and the time schedule does not yet exist. In the world literature, there are reports of velo-pharyngeal insufficiency rates between 7 and 30%. In a prospective study, all children aged 312 months with cleft lip, alveolus and palate, or cleft palate only, underwent an intravelar veloplasty. Follow-up monitoring consisted of frequent clinical linguistic checks and supervision of language development, without a planned intention of articulation therapy before the age of about 5 years. Three hundred and ninety-seven children with non-syndromic clefts were included in this study, the youngest being 8 years old. Sixty children (15%) showed deviations in language and speech acquisition. Of these, 56 (14%) received articulation therapy after their 5th birthday. Of these 56 children, 45 overcame their problems with speech therapy alone, whereas 11 (3%) needed a velo-pharyngeoplasty. Although these results are much better than those reported in other cohorts, some children still have velo-pharyngeal incompetence for no apparent reason. One possible explanation might be surgical: on occasion, the intravelar muscle bundle is divided into two parts, and the palato-pharyngeal part, which runs isolated more laterally, can be missed during reconstruction and retropositioning.

  7. A magnetic resonance imaging study on the articulatory and acoustic speech parameters of Malay vowels.

    Science.gov (United States)

    Zourmand, Alireza; Mirhassani, Seyed Mostafa; Ting, Hua-Nong; Bux, Shaik Ismail; Ng, Kwan Hoong; Bilgen, Mehmet; Jalaludin, Mohd Amin

    2014-07-25

    The phonetic properties of six Malay vowels are investigated using magnetic resonance imaging (MRI) to visualize the vocal tract in order to obtain dynamic articulatory parameters during speech production. To resolve image blurring due to tongue movement during the scanning process, a method based on active contour extraction is used to track tongue contours. The proposed method efficiently tracks tongue contours despite the partial blurring of MRI images. Consequently, the articulatory parameters are effectively measured as tongue movement is observed, and the specific shape of the tongue and its position are determined for all six uttered Malay vowels. Speech rehabilitation procedures demand some kind of visually perceivable prototype of speech articulation. To investigate the validity of the measured articulatory parameters against the acoustic theory of speech production, an acoustic analysis of the vowels uttered by the subjects was performed. When the acoustic and articulatory parameters of the uttered speech were examined together, a correlation between formant frequencies and articulatory parameters was observed. The experiments reported a positive correlation between the constriction location of the tongue body and the first formant frequency, as well as a negative correlation between the constriction location of the tongue tip and the second formant frequency. The results demonstrate that the proposed method is an effective tool for the dynamic study of speech production.
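    The positive and negative correlations reported between constriction location and formant frequency are plain Pearson correlations; a minimal implementation, with made-up data chosen only to show the sign convention:

```python
def pearson_r(xs, ys):
    # Pearson's r: covariance normalised by the two standard deviations.
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = sum((x - mx) ** 2 for x in xs) ** 0.5
    sy = sum((y - my) ** 2 for y in ys) ** 0.5
    return cov / (sx * sy)

# Illustrative only: F1 rising with tongue-body constriction location (r > 0),
# F2 falling with tongue-tip constriction location (r < 0).
print(pearson_r([1, 2, 3, 4], [300, 360, 420, 480]))    # approx. 1.0
print(pearson_r([1, 2, 3, 4], [2200, 2000, 1800, 1600]))  # approx. -1.0
```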

  8. A phonological analysis of the expressive and receptive articulatory difficulties of an aphasic with apraxia of speech: A case study

    Directory of Open Access Journals (Sweden)

    Aura Kagan

    1977-11-01

    Full Text Available The expressive and receptive phonological errors of an aphasic subject with mild apraxia of speech were analysed in terms of a distinctive feature framework. The results indicated that errors could be characterized linguistically and that such information could be of therapeutic significance. The relationship between articulation problems and ability to discriminate phonemes was investigated. Although no direct relationship was found, discrimination errors followed linguistic trends demonstrated in the articulation errors. The findings of this study suggest that the traditional idea of apraxia as a non-linguistic and purely motor disorder needs re-examination.

  9. Automated analysis of connected speech reveals early biomarkers of Parkinson's disease in patients with rapid eye movement sleep behaviour disorder.

    Science.gov (United States)

    Hlavnička, Jan; Čmejla, Roman; Tykalová, Tereza; Šonka, Karel; Růžička, Evžen; Rusz, Jan

    2017-02-02

    For generations, the evaluation of speech abnormalities in neurodegenerative disorders such as Parkinson's disease (PD) has been limited to perceptual tests or user-controlled laboratory analysis based upon rather small samples of human vocalizations. Our study introduces a fully automated method that yields significant features related to respiratory deficits, dysphonia, imprecise articulation and dysrhythmia from acoustic microphone data of natural connected speech for predicting early and distinctive patterns of neurodegeneration. We compared speech recordings of 50 subjects with rapid eye movement sleep behaviour disorder (RBD), 30 newly diagnosed, untreated PD patients and 50 healthy controls, and showed that subliminal parkinsonian speech deficits can be reliably captured even in RBD patients, which are at high risk of developing PD or other synucleinopathies. Thus, automated vocal analysis should soon be able to contribute to screening and diagnostic procedures for prodromal parkinsonian neurodegeneration in natural environments.
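    One building block of such automated analysis, pause detection for dysrhythmia-style timing features, can be sketched from a frame-wise energy envelope. This is a generic illustration under assumed threshold and frame values, not the feature extraction actually used in the study.

```python
def pause_segments(energy_db, frame_ms=10, threshold_db=-40, min_ms=60):
    # Mark frames below an energy threshold and keep only silent runs long
    # enough to count as pauses; returns (start_ms, end_ms) pairs.
    pauses, run_start = [], None
    for i, e in enumerate(energy_db + [0.0]):  # sentinel closes a final run
        if e < threshold_db and run_start is None:
            run_start = i
        elif e >= threshold_db and run_start is not None:
            if (i - run_start) * frame_ms >= min_ms:
                pauses.append((run_start * frame_ms, i * frame_ms))
            run_start = None
    return pauses
```

    Summary statistics over these segments (pause count, mean duration, pause-to-speech ratio) are the kind of timing features a dysrhythmia measure could build on.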

  10. Exploring expressivity and emotion with artificial voice and speech technologies.

    Science.gov (United States)

    Pauletto, Sandra; Balentine, Bruce; Pidcock, Chris; Jones, Kevin; Bottaci, Leonardo; Aretoulaki, Maria; Wells, Jez; Mundy, Darren P; Balentine, James

    2013-10-01

    Emotion in audio-voice signals, as synthesized by text-to-speech (TTS) technologies, was investigated to formulate a theory of expression for user interface design. Emotional parameters were specified with markup tags, and the resulting audio was further modulated with post-processing techniques. Software was then developed to link a selected TTS synthesizer with an automatic speech recognition (ASR) engine, producing a chatbot that could speak and listen. Using these two artificial voice subsystems, investigators explored both artistic and psychological implications of artificial speech emotion. Goals of the investigation were interdisciplinary, with interest in musical composition, augmentative and alternative communication (AAC), commercial voice announcement applications, human-computer interaction (HCI), and artificial intelligence (AI). The work-in-progress points towards an emerging interdisciplinary ontology for artificial voices. As one study output, HCI tools are proposed for future collaboration.

  11. Sensorimotor speech disorders in Parkinson's disease: Programming and execution deficits

    Directory of Open Access Journals (Sweden)

    Karin Zazo Ortiz

    Full Text Available ABSTRACT Introduction: Dysfunction in the basal ganglia circuits is a determining factor in the physiopathology of the classic signs of Parkinson's disease (PD), and hypokinetic dysarthria is commonly related to PD. Regarding speech disorders associated with PD, the latest four-level framework of speech complicates the traditional view of dysarthria as a motor execution disorder. Based on findings that dysfunctions in the basal ganglia can cause speech disorders, and on the premise that the speech deficits seen in PD are related not to an execution motor disorder alone but also to a disorder at the motor programming level, the main objective of this study was to investigate the presence of sensorimotor disorders of programming (besides the execution disorders previously described) in PD patients. Methods: A cross-sectional study was conducted in a sample of 60 adults matched for gender, age and education: 30 adult patients diagnosed with idiopathic PD (PDG) and 30 healthy adults (CG). All types of articulation errors were reanalyzed to investigate the nature of these errors. Interjections, hesitations and repetitions of words or sentences (during discourse) were considered typical disfluencies; blocking and episodes of palilalia (words or syllables) were analyzed as atypical disfluencies. We analysed features including successive self-initiated trials, phoneme distortions, self-correction, repetition of sounds and syllables, prolonged movement transitions, and additions or omissions of sounds and syllables, in order to identify programming and/or execution failures. Orofacial agility was also investigated. Results: The PDG had worse performance on all sensorimotor speech tasks. All PD patients had hypokinetic dysarthria. Conclusion: The clinical characteristics found suggest both execution and programming sensorimotor speech disorders in PD patients.

  12. Oral motor deficits in speech-impaired children with autism

    Science.gov (United States)

    Belmonte, Matthew K.; Saxena-Chandhok, Tanushree; Cherian, Ruth; Muneer, Reema; George, Lisa; Karanth, Prathibha

    2013-01-01

    Absence of communicative speech in autism has been presumed to reflect a fundamental deficit in the use of language, but at least in a subpopulation may instead stem from motor and oral motor issues. Clinical reports of disparity between receptive vs. expressive speech/language abilities reinforce this hypothesis. Our early-intervention clinic develops skills prerequisite to learning and communication, including sitting, attending, and pointing or reference, in children below 6 years of age. In a cohort of 31 children, gross and fine motor skills and activities of daily living as well as receptive and expressive speech were assessed at intake and after 6 and 10 months of intervention. Oral motor skills were evaluated separately within the first 5 months of the child's enrolment in the intervention programme and again at 10 months of intervention. Assessment used a clinician-rated structured report, normed against samples of 360 (for motor and speech skills) and 90 (for oral motor skills) typically developing children matched for age, cultural environment and socio-economic status. In the full sample, oral and other motor skills correlated with receptive and expressive language both in terms of pre-intervention measures and in terms of learning rates during the intervention. A motor-impaired group comprising a third of the sample was discriminated by an uneven profile of skills with oral motor and expressive language deficits out of proportion to the receptive language deficit. This group learnt language more slowly, and ended intervention lagging in oral motor skills. In individuals incapable of the degree of motor sequencing and timing necessary for speech movements, receptive language may outstrip expressive speech. Our data suggest that autistic motor difficulties could range from more basic skills such as pointing to more refined skills such as articulation, and need to be assessed and addressed across this entire range in each individual. PMID:23847480

  13. Oral Motor Deficits in Speech-Impaired Children with Autism

    Directory of Open Access Journals (Sweden)

    Matthew K Belmonte

    2013-07-01

    Full Text Available Absence of communicative speech in autism has been presumed to reflect a fundamental deficit in the use of language, but at least in a subpopulation may instead stem from motor and oral motor issues. Clinical reports of disparity between receptive versus expressive speech / language abilities reinforce this hypothesis. Our early-intervention clinic develops skills prerequisite to learning and communication, including sitting, attending, and pointing or reference, in children below 6 years of age. In a cohort of 31 children, gross and fine motor skills and activities of daily living as well as receptive and expressive speech were assessed at intake and after 6 and 10 months of intervention. Oral motor skills were evaluated separately within the first 5 months of the child's enrolment in the intervention programme and again at 10 months of intervention. Assessment used a clinician-rated structured report, normed against samples of 360 (for motor and speech skills) and 90 (for oral motor skills) typically developing children matched for age, cultural environment and socio-economic status. In the full sample, oral and other motor skills correlated with receptive and expressive language both in terms of pre-intervention measures and in terms of learning rates during the intervention. A motor-impaired group comprising a third of the sample was discriminated by an uneven profile of skills, with oral motor and expressive language deficits out of proportion to the receptive language deficit. This group learnt language more slowly, and ended intervention lagging in oral motor skills. In individuals incapable of the degree of motor sequencing and timing necessary for speech movements, receptive language may outstrip expressive speech. Our data suggest that autistic motor difficulties could range from more basic skills such as pointing to more refined skills such as articulation, and need to be assessed and addressed across this entire range in each individual.

  14. Fine-structure processing, frequency selectivity and speech perception in hearing-impaired listeners

    DEFF Research Database (Denmark)

    Strelcyk, Olaf; Dau, Torsten

    2008-01-01

    Hearing-impaired people often experience great difficulty with speech communication when background noise is present, even if reduced audibility has been compensated for. Other impairment factors must be involved. In order to minimize confounding effects, the subjects participating in this study...... consisted of groups with homogeneous, symmetric audiograms. The perceptual listening experiments assessed the intelligibility of full-spectrum as well as low-pass filtered speech in the presence of stationary and fluctuating interferers, the individual's frequency selectivity and the integrity of temporal...... modulation were obtained. In addition, these binaural and monaural thresholds were measured in a stationary background noise in order to assess the persistence of the fine-structure processing to interfering noise. Apart from elevated speech reception thresholds, the hearing impaired listeners showed poorer...

  15. Text-to-audiovisual speech synthesizer for children with learning disabilities.

    Science.gov (United States)

    Mendi, Engin; Bayrak, Coskun

    2013-01-01

    Learning disabilities affect the ability of children to learn, despite their having normal intelligence. Assistive tools can greatly increase the functional capabilities of children with learning disorders affecting writing, reading, or listening. In this article, we describe a text-to-audiovisual synthesizer that can serve as an assistive tool for such children. The system automatically converts an input text to audiovisual speech, synchronizing the head, eye, and lip movements of a three-dimensional face model with appropriate facial expressions and the word flow of the text. The proposed system can enhance speech perception and help children with learning deficits to improve their chances of success.

  16. Speech-to-Speech Relay Service

    Science.gov (United States)

    Consumer Guide Speech to Speech Relay Service Speech-to-Speech (STS) is one form of Telecommunications Relay Service (TRS). TRS is a service that allows persons with hearing and speech disabilities ...

  17. Learning-induced neural plasticity of speech processing before birth.

    Science.gov (United States)

    Partanen, Eino; Kujala, Teija; Näätänen, Risto; Liitola, Auli; Sambeth, Anke; Huotilainen, Minna

    2013-09-10

    Learning, the foundation of adaptive and intelligent behavior, is based on plastic changes in neural assemblies, reflected by the modulation of electric brain responses. In infancy, auditory learning implicates the formation and strengthening of neural long-term memory traces, improving discrimination skills, in particular those forming the prerequisites for speech perception and understanding. Although previous behavioral observations show that newborns react differentially to unfamiliar sounds vs. familiar sound material that they were exposed to as fetuses, the neural basis of fetal learning has not thus far been investigated. Here we demonstrate direct neural correlates of human fetal learning of speech-like auditory stimuli. We presented variants of words to fetuses; unlike infants with no exposure to these stimuli, the exposed fetuses showed enhanced brain activity (mismatch responses) in response to pitch changes for the trained variants after birth. Furthermore, a significant correlation existed between the amount of prenatal exposure and brain activity, with greater activity being associated with a higher amount of prenatal speech exposure. Moreover, the learning effect was generalized to other types of similar speech sounds not included in the training material. Consequently, our results indicate neural commitment specifically tuned to the speech features heard before birth and their memory representations.

  18. Weak responses to auditory feedback perturbation during articulation in persons who stutter: evidence for abnormal auditory-motor transformation.

    Directory of Open Access Journals (Sweden)

    Shanqing Cai

    Full Text Available Previous empirical observations have led researchers to propose that auditory feedback (the auditory perception of self-produced sounds when speaking) functions abnormally in the speech motor systems of persons who stutter (PWS). Researchers have theorized that an important neural basis of stuttering is the aberrant integration of auditory information into incipient speech motor commands. Because of the circumstantial support for these hypotheses and the differences and contradictions between them, there is a need for carefully designed experiments that directly examine auditory-motor integration during speech production in PWS. In the current study, we used real-time manipulation of auditory feedback to directly investigate whether the speech motor system of PWS utilizes auditory feedback abnormally during articulation and to characterize potential deficits of this auditory-motor integration. Twenty-one PWS and 18 fluent control participants were recruited. Using a short-latency formant-perturbation system, we examined participants' compensatory responses to unanticipated perturbation of auditory feedback of the first formant frequency during the production of the monophthong [ε]. The PWS showed compensatory responses that were qualitatively similar to the controls' and had close-to-normal latencies (∼150 ms), but the magnitudes of their responses were substantially and significantly smaller than those of the control participants (by 47% on average, p<0.05). Measurements of auditory acuity indicate that the weaker-than-normal compensatory responses in PWS were not attributable to a deficit in low-level auditory processing. These findings are consistent with the hypothesis that stuttering is associated with functional defects in the inverse models responsible for the transformation from the domain of auditory targets and auditory error information into the domain of speech motor commands.

  19. Effects of cognitive impairment on prosodic parameters of speech production planning in multiple sclerosis.

    Science.gov (United States)

    De Looze, Céline; Moreau, Noémie; Renié, Laurent; Kelly, Finnian; Ghio, Alain; Rico, Audrey; Audoin, Bertrand; Viallet, François; Pelletier, Jean; Petrone, Caterina

    2017-05-24

    Cognitive impairment (CI) affects 40-65% of patients with multiple sclerosis (MS). CI can have a negative impact on a patient's everyday activities, such as engaging in conversations. Speech production planning ability is crucial for successful verbal interactions and thus for preserving social and occupational skills. This study investigates the effect of cognitive-linguistic demand and CI on speech production planning in MS, as reflected in speech prosody. A secondary aim is to explore the clinical potential of prosodic features for the prediction of an individual's cognitive status in MS. A total of 45 subjects, comprising 22 healthy controls (HC) and 23 patients in the early stages of relapsing-remitting MS, underwent neuropsychological tests probing specific cognitive processes involved in speech production planning. All subjects also performed a read-speech task, in which they had to read isolated sentences manipulated for phonological length. Results show that the speech of MS patients with CI is mainly affected at the temporal level (articulation and speech rate, pause duration). Regression analyses further indicate that rate measures are correlated with working memory scores. In addition, linear discriminant analysis shows that the ROC AUC for identifying MS patients with CI is 0.70 (95% confidence interval: 0.68-0.73). Our findings indicate that prosodic planning is deficient in patients with MS-CI and that the scope of planning depends on patients' cognitive abilities. We discuss how speech-based approaches could be used as an ecological method for the assessment and monitoring of CI in MS. © 2017 The British Psychological Society.
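    The reported ROC AUC of 0.70 has a concrete probabilistic reading: it is the probability that a randomly chosen patient with CI receives a higher risk score than a randomly chosen patient without CI. Via the Mann-Whitney identity this can be computed directly; the scores below are invented for illustration, not the study's data.

```python
def roc_auc(scores, labels):
    # AUC equals the fraction of (positive, negative) pairs the classifier
    # ranks correctly, counting ties as half a win.
    pos = [s for s, l in zip(scores, labels) if l == 1]
    neg = [s for s, l in zip(scores, labels) if l == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

# Hypothetical prosody-based risk scores; label 1 = patient with CI.
print(roc_auc([0.2, 0.4, 0.35, 0.8, 0.6, 0.55],
              [0,   0,   1,    1,   1,   0]))  # approx. 0.78 on this toy data
```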

  20. Intelligent audio analysis

    CERN Document Server

    Schuller, Björn W

    2013-01-01

    This book provides the reader with the knowledge necessary for comprehension of the field of Intelligent Audio Analysis. It first introduces standard methods and discusses the typical Intelligent Audio Analysis chain going from audio data to audio features to audio recognition. Further, an introduction to audio source separation, enhancement and robustness is given. After the introductory parts, the book shows several applications for the three types of audio: speech, music, and general sound. Each task is briefly introduced, followed by a description of the specific data and methods applied, experiments and results, and a conclusion for this specific task. The book provides benchmark results and standardized test-beds for a broader range of audio analysis tasks. The main focus thereby lies on the parallel advancement of realism in audio analysis, as too often today's results are overly optimistic owing to idealized testing conditions, and it serves to stimulate synergies arising from transfer of ...