aided speech-identification performance: Topics by WorldWideScience.org

Sample records for aided speech-identification performance

Comparison of Gated Audiovisual Speech Identification in Elderly Hearing Aid Users and Elderly Normal-Hearing Individuals

Science.gov (United States)

Lidestam, Björn; Rönnberg, Jerker

2016-01-01

The present study compared elderly hearing aid (EHA) users (n = 20) with elderly normal-hearing (ENH) listeners (n = 20) in terms of isolation points (IPs, the shortest time required for correct identification of a speech stimulus) and accuracy of audiovisual gated speech stimuli (consonants, words, and final words in highly and less predictable sentences) presented in silence. In addition, we compared the IPs of audiovisual speech stimuli from the present study with auditory ones extracted from a previous study, to determine the impact of the addition of visual cues. Both participant groups achieved ceiling levels in terms of accuracy in the audiovisual identification of gated speech stimuli; however, the EHA group needed longer IPs for the audiovisual identification of consonants and words. The benefit of adding visual cues to auditory speech stimuli was more evident in the EHA group, as audiovisual presentation significantly shortened the IPs for consonants, words, and final words in less predictable sentences; in the ENH group, audiovisual presentation only shortened the IPs for consonants and words. In conclusion, although the audiovisual benefit was greater for EHA group, this group had inferior performance compared with the ENH group in terms of IPs when supportive semantic context was lacking. Consequently, EHA users needed the initial part of the audiovisual speech signal to be longer than did their counterparts with normal hearing to reach the same level of accuracy in the absence of a semantic context. PMID:27317667
Comparison of Gated Audiovisual Speech Identification in Elderly Hearing Aid Users and Elderly Normal-Hearing Individuals: Effects of Adding Visual Cues to Auditory Speech Stimuli.

Science.gov (United States)

Moradi, Shahram; Lidestam, Björn; Rönnberg, Jerker

2016-06-17

The present study compared elderly hearing aid (EHA) users (n = 20) with elderly normal-hearing (ENH) listeners (n = 20) in terms of isolation points (IPs, the shortest time required for correct identification of a speech stimulus) and accuracy of audiovisual gated speech stimuli (consonants, words, and final words in highly and less predictable sentences) presented in silence. In addition, we compared the IPs of audiovisual speech stimuli from the present study with auditory ones extracted from a previous study, to determine the impact of the addition of visual cues. Both participant groups achieved ceiling levels in terms of accuracy in the audiovisual identification of gated speech stimuli; however, the EHA group needed longer IPs for the audiovisual identification of consonants and words. The benefit of adding visual cues to auditory speech stimuli was more evident in the EHA group, as audiovisual presentation significantly shortened the IPs for consonants, words, and final words in less predictable sentences; in the ENH group, audiovisual presentation only shortened the IPs for consonants and words. In conclusion, although the audiovisual benefit was greater for EHA group, this group had inferior performance compared with the ENH group in terms of IPs when supportive semantic context was lacking. Consequently, EHA users needed the initial part of the audiovisual speech signal to be longer than did their counterparts with normal hearing to reach the same level of accuracy in the absence of a semantic context. © The Author(s) 2016.
Speech-enabled Computer-aided Translation

DEFF Research Database (Denmark)

Mesa-Lao, Bartolomé

2014-01-01

The present study has surveyed post-editor trainees’ views and attitudes before and after the introduction of speech technology as a front end to a computer-aided translation workbench. The aim of the survey was (i) to identify attitudes and perceptions among post-editor trainees before performing...... a post-editing task using automatic speech recognition (ASR); and (ii) to assess the degree to which post-editors’ attitudes and expectations to the use of speech technology changed after actually using it. The survey was based on two questionnaires: the first one administered before the participants...
Auditory and Cognitive Factors Underlying Individual Differences in Aided Speech-Understanding among Older Adults

Directory of Open Access Journals (Sweden)

Larry E. Humes

2013-10-01

Full Text Available This study was designed to address individual differences in aided speech understanding among a relatively large group of older adults. The group of older adults consisted of 98 adults (50 female and 48 male ranging in age from 60 to 86 (mean = 69.2. Hearing loss was typical for this age group and about 90% had not worn hearing aids. All subjects completed a battery of tests, including cognitive (6 measures, psychophysical (17 measures, and speech-understanding (9 measures, as well as the Speech, Spatial and Qualities of Hearing (SSQ self-report scale. Most of the speech-understanding measures made use of competing speech and the non-speech psychophysical measures were designed to tap phenomena thought to be relevant for the perception of speech in competing speech (e.g., stream segregation, modulation-detection interference. All measures of speech understanding were administered with spectral shaping applied to the speech stimuli to fully restore audibility through at least 4000 Hz. The measures used were demonstrated to be reliable in older adults and, when compared to a reference group of 28 young normal-hearing adults, age-group differences were observed on many of the measures. Principal-components factor analysis was applied successfully to reduce the number of independent and dependent (speech understanding measures for a multiple-regression analysis. Doing so yielded one global cognitive-processing factor and five non-speech psychoacoustic factors (hearing loss, dichotic signal detection, multi-burst masking, stream segregation, and modulation detection as potential predictors. To this set of six potential predictor variables were added subject age, Environmental Sound Identification (ESI, and performance on the text-recognition-threshold (TRT task (a visual analog of interrupted speech recognition. These variables were used to successfully predict one global aided speech-understanding factor, accounting for about 60% of the variance.
The Efficacy of Short-term Gated Audiovisual Speech Training for Improving Auditory Sentence Identification in Noise in Elderly Hearing Aid Users

Science.gov (United States)

Moradi, Shahram; Wahlin, Anna; Hällgren, Mathias; Rönnberg, Jerker; Lidestam, Björn

2017-01-01

This study aimed to examine the efficacy and maintenance of short-term (one-session) gated audiovisual speech training for improving auditory sentence identification in noise in experienced elderly hearing-aid users. Twenty-five hearing aid users (16 men and 9 women), with an average age of 70.8 years, were randomly divided into an experimental (audiovisual training, n = 14) and a control (auditory training, n = 11) group. Participants underwent gated speech identification tasks comprising Swedish consonants and words presented at 65 dB sound pressure level with a 0 dB signal-to-noise ratio (steady-state broadband noise), in audiovisual or auditory-only training conditions. The Hearing-in-Noise Test was employed to measure participants’ auditory sentence identification in noise before the training (pre-test), promptly after training (post-test), and 1 month after training (one-month follow-up). The results showed that audiovisual training improved auditory sentence identification in noise promptly after the training (post-test vs. pre-test scores); furthermore, this improvement was maintained 1 month after the training (one-month follow-up vs. pre-test scores). Such improvement was not observed in the control group, neither promptly after the training nor at the one-month follow-up. However, no significant between-groups difference nor an interaction between groups and session was observed. Conclusion: Audiovisual training may be considered in aural rehabilitation of hearing aid users to improve listening capabilities in noisy conditions. However, the lack of a significant between-groups effect (audiovisual vs. auditory) or an interaction between group and session calls for further research. PMID:28348542
Should visual speech cues (speechreading) be considered when fitting hearing aids?

Science.gov (United States)

Grant, Ken

2002-05-01

When talker and listener are face-to-face, visual speech cues become an important part of the communication environment, and yet, these cues are seldom considered when designing hearing aids. Models of auditory-visual speech recognition highlight the importance of complementary versus redundant speech information for predicting auditory-visual recognition performance. Thus, for hearing aids to work optimally when visual speech cues are present, it is important to know whether the cues provided by amplification and the cues provided by speechreading complement each other. In this talk, data will be reviewed that show nonmonotonicity between auditory-alone speech recognition and auditory-visual speech recognition, suggesting that efforts designed solely to improve auditory-alone recognition may not always result in improved auditory-visual recognition. Data will also be presented showing that one of the most important speech cues for enhancing auditory-visual speech recognition performance, voicing, is often the cue that benefits least from amplification.
Effects of noise and working memory capacity on memory processing of speech for hearing-aid users.

Science.gov (United States)

Ng, Elaine Hoi Ning; Rudner, Mary; Lunner, Thomas; Pedersen, Michael Syskind; Rönnberg, Jerker

2013-07-01

It has been shown that noise reduction algorithms can reduce the negative effects of noise on memory processing in persons with normal hearing. The objective of the present study was to investigate whether a similar effect can be obtained for persons with hearing impairment and whether such an effect is dependent on individual differences in working memory capacity. A sentence-final word identification and recall (SWIR) test was conducted in two noise backgrounds with and without noise reduction as well as in quiet. Working memory capacity was measured using a reading span (RS) test. Twenty-six experienced hearing-aid users with moderate to moderately severe sensorineural hearing loss. Noise impaired recall performance. Competing speech disrupted memory performance more than speech-shaped noise. For late list items the disruptive effect of the competing speech background was virtually cancelled out by noise reduction for persons with high working memory capacity. Noise reduction can reduce the adverse effect of noise on memory for speech for persons with good working memory capacity. We argue that the mechanism behind this is faster word identification that enhances encoding into working memory.
Contralateral Bimodal Stimulation: A Way to Enhance Speech Performance in Arabic-Speaking Cochlear Implant Patients.

Science.gov (United States)

Abdeltawwab, Mohamed M; Khater, Ahmed; El-Anwar, Mohammad W

2016-01-01

The combination of acoustic and electric stimulation as a way to enhance speech recognition performance in cochlear implant (CI) users has generated considerable interest in the recent years. The purpose of this study was to evaluate the bimodal advantage of the FS4 speech processing strategy in combination with hearing aids (HA) as a means to improve low-frequency resolution in CI patients. Nineteen postlingual CI adults were selected to participate in this study. All patients wore implants on one side and HA on the contralateral side with residual hearing. Monosyllabic word recognition, speech in noise, and emotion and talker identification were assessed using CI with fine structure processing/FS4 and high-definition continuous interleaved sampling strategies, HA alone, and a combination of CI and HA. The bimodal stimulation showed improvement in speech performance and emotion identification for the question/statement/order tasks, which was statistically significant compared to patients with CI alone, but there were no significant statistical differences in intragender talker discrimination and emotion identification for the happy/angry/neutral tasks. The poorest performance was obtained with HA only, and it was statistically significant compared to the other modalities. The bimodal stimulation showed enhanced speech performance in CI patients, and it improves the limitations provided by electric or acoustic stimulation alone. © 2016 S. Karger AG, Basel.
Vocabulary Facilitates Speech Perception in Children With Hearing Aids.

Science.gov (United States)

Klein, Kelsey E; Walker, Elizabeth A; Kirby, Benjamin; McCreery, Ryan W

2017-08-16

We examined the effects of vocabulary, lexical characteristics (age of acquisition and phonotactic probability), and auditory access (aided audibility and daily hearing aid [HA] use) on speech perception skills in children with HAs. Participants included 24 children with HAs and 25 children with normal hearing (NH), ages 5-12 years. Groups were matched on age, expressive and receptive vocabulary, articulation, and nonverbal working memory. Participants repeated monosyllabic words and nonwords in noise. Stimuli varied on age of acquisition, lexical frequency, and phonotactic probability. Performance in each condition was measured by the signal-to-noise ratio at which the child could accurately repeat 50% of the stimuli. Children from both groups with larger vocabularies showed better performance than children with smaller vocabularies on nonwords and late-acquired words but not early-acquired words. Overall, children with HAs showed poorer performance than children with NH. Auditory access was not associated with speech perception for the children with HAs. Children with HAs show deficits in sensitivity to phonological structure but appear to take advantage of vocabulary skills to support speech perception in the same way as children with NH. Further investigation is needed to understand the causes of the gap that exists between the overall speech perception abilities of children with HAs and children with NH.
A Joint Approach for Single-Channel Speaker Identification and Speech Separation

DEFF Research Database (Denmark)

Mowlaee, Pejman; Saeidi, Rahim; Christensen, Mads Græsbøll

2012-01-01

) accuracy, here, we report the objective and subjective results as well. The results show that the proposed system performs as well as the best of the state-of-the-art in terms of perceived quality while its performance in terms of speaker identification and automatic speech recognition results......In this paper, we present a novel system for joint speaker identification and speech separation. For speaker identification a single-channel speaker identification algorithm is proposed which provides an estimate of signal-to-signal ratio (SSR) as a by-product. For speech separation, we propose...... a sinusoidal model-based algorithm. The speech separation algorithm consists of a double-talk/single-talk detector followed by a minimum mean square error estimator of sinusoidal parameters for finding optimal codevectors from pre-trained speaker codebooks. In evaluating the proposed system, we start from...
Predictors of auditory performance in hearing-aid users: The role of cognitive function and auditory lifestyle (A)

DEFF Research Database (Denmark)

Vestergaard, Martin David

2006-01-01

no objective benefit can be measured. It has been suggested that lack of agreement between various hearing-aid outcome components can be explained by individual differences in cognitive function and auditory lifestyle. We measured speech identification, self-report outcome, spectral and temporal resolution...... of hearing, cognitive skills, and auditory lifestyle in 25 new hearing-aid users. The purpose was to assess the predictive power of the nonauditory measures while looking at the relationships between measures from various auditory-performance domains. The results showed that only moderate correlation exists...... between objective and subjective hearing-aid outcome. Different self-report outcome measures showed a different amount of correlation with objective auditory performance. Cognitive skills were found to play a role in explaining speech performance and spectral and temporal abilities, and auditory lifestyle...
Multilevel Analysis in Analyzing Speech Data

Science.gov (United States)

Guddattu, Vasudeva; Krishna, Y.

2011-01-01

The speech produced by human vocal tract is a complex acoustic signal, with diverse applications in phonetics, speech synthesis, automatic speech recognition, speaker identification, communication aids, speech pathology, speech perception, machine translation, hearing research, rehabilitation and assessment of communication disorders and many…
Assessment of hearing aid algorithms using a master hearing aid: the influence of hearing aid experience on the relationship between speech recognition and cognitive capacity.

Science.gov (United States)

Rählmann, Sebastian; Meis, Markus; Schulte, Michael; Kießling, Jürgen; Walger, Martin; Meister, Hartmut

2017-04-27

Model-based hearing aid development considers the assessment of speech recognition using a master hearing aid (MHA). It is known that aided speech recognition in noise is related to cognitive factors such as working memory capacity (WMC). This relationship might be mediated by hearing aid experience (HAE). The aim of this study was to examine the relationship of WMC and speech recognition with a MHA for listeners with different HAE. Using the MHA, unaided and aided 80% speech recognition thresholds in noise were determined. Individual WMC capacity was assed using the Verbal Learning and Memory Test (VLMT) and the Reading Span Test (RST). Forty-nine hearing aid users with mild to moderate sensorineural hearing loss divided into three groups differing in HAE. Whereas unaided speech recognition did not show a significant relationship with WMC, a significant correlation could be observed between WMC and aided speech recognition. However, this only applied to listeners with HAE of up to approximately three years, and a consistent weakening of the correlation could be observed with more experience. Speech recognition scores obtained in acute experiments with an MHA are less influenced by individual cognitive capacity when experienced HA users are taken into account.
Acceptable noise level (ANL) with Danish and non-semantic speech materials in adult hearing-aid users

DEFF Research Database (Denmark)

Olsen, Steen Østergaard; Lantz, Johannes; Nielsen, Lars Holme

2012-01-01

The acceptable noise level (ANL) test is used for quantification of the amount of background noise subjects accept when listening to speech. This study investigates Danish hearing-aid users' ANL performance using Danish and non-semantic speech signals, the repeatability of ANL, and the association...... between ANL and outcome of the international outcome inventory for hearing aids (IOI-HA)....
Temporal and spatio-temporal vibrotactile displays for voice fundamental frequency: an initial evaluation of a new vibrotactile speech perception aid with normal-hearing and hearing-impaired individuals.

Science.gov (United States)

Auer, E T; Bernstein, L E; Coulter, D C

1998-10-01

Four experiments were performed to evaluate a new wearable vibrotactile speech perception aid that extracts fundamental frequency (F0) and displays the extracted F0 as a single-channel temporal or an eight-channel spatio-temporal stimulus. Specifically, we investigated the perception of intonation (i.e., question versus statement) and emphatic stress (i.e., stress on the first, second, or third word) under Visual-Alone (VA), Visual-Tactile (VT), and Tactile-Alone (TA) conditions and compared performance using the temporal and spatio-temporal vibrotactile display. Subjects were adults with normal hearing in experiments I-III and adults with severe to profound hearing impairments in experiment IV. Both versions of the vibrotactile speech perception aid successfully conveyed intonation. Vibrotactile stress information was successfully conveyed, but vibrotactile stress information did not enhance performance in VT conditions beyond performance in VA conditions. In experiment III, which involved only intonation identification, a reliable advantage for the spatio-temporal display was obtained. Differences between subject groups were obtained for intonation identification, with more accurate VT performance by those with normal hearing. Possible effects of long-term hearing status are discussed.
Hearing Aid-Induced Plasticity in the Auditory System of Older Adults: Evidence from Speech Perception

Science.gov (United States)

Lavie, Limor; Banai, Karen; Karni, Avi; Attias, Joseph

2015-01-01

Purpose: We tested whether using hearing aids can improve unaided performance in speech perception tasks in older adults with hearing impairment. Method: Unaided performance was evaluated in dichotic listening and speech-in-noise tests in 47 older adults with hearing impairment; 36 participants in 3 study groups were tested before hearing aid…
Audiovisual Temporal Recalibration for Speech in Synchrony Perception and Speech Identification

Science.gov (United States)

Asakawa, Kaori; Tanaka, Akihiro; Imai, Hisato

We investigated whether audiovisual synchrony perception for speech could change after observation of the audiovisual temporal mismatch. Previous studies have revealed that audiovisual synchrony perception is re-calibrated after exposure to a constant timing difference between auditory and visual signals in non-speech. In the present study, we examined whether this audiovisual temporal recalibration occurs at the perceptual level even for speech (monosyllables). In Experiment 1, participants performed an audiovisual simultaneity judgment task (i.e., a direct measurement of the audiovisual synchrony perception) in terms of the speech signal after observation of the speech stimuli which had a constant audiovisual lag. The results showed that the “simultaneous” responses (i.e., proportion of responses for which participants judged the auditory and visual stimuli to be synchronous) at least partly depended on exposure lag. In Experiment 2, we adopted the McGurk identification task (i.e., an indirect measurement of the audiovisual synchrony perception) to exclude the possibility that this modulation of synchrony perception was solely attributable to the response strategy using stimuli identical to those of Experiment 1. The characteristics of the McGurk effect reported by participants depended on exposure lag. Thus, it was shown that audiovisual synchrony perception for speech could be modulated following exposure to constant lag both in direct and indirect measurement. Our results suggest that temporal recalibration occurs not only in non-speech signals but also in monosyllabic speech at the perceptual level.
Acceptable noise level (ANL) with Danish and non-semantic speech materials in adult hearing-aid users

DEFF Research Database (Denmark)

Olsen, Steen Østergaard; Lantz, Johannes; Nielsen, Lars Holme

2012-01-01

The acceptable noise level (ANL) test is used for quantification of the amount of background noise subjects accept when listening to speech. This study investigates Danish hearing-aid users' ANL performance using Danish and non-semantic speech signals, the repeatability of ANL, and the association...
Visual Speech Fills in Both Discrimination and Identification of Non-Intact Auditory Speech in Children

Science.gov (United States)

Jerger, Susan; Damian, Markus F.; McAlpine, Rachel P.; Abdi, Herve

2018-01-01

To communicate, children must discriminate and identify speech sounds. Because visual speech plays an important role in this process, we explored how visual speech influences phoneme discrimination and identification by children. Critical items had intact visual speech (e.g. baez) coupled to non-intact (excised onsets) auditory speech (signified…
SPEECH VISUALIZATION SISTEM AS A BASIS FOR SPEECH TRAINING AND COMMUNICATION AIDS

Directory of Open Access Journals (Sweden)

Oliana KRSTEVA

1997-09-01

Full Text Available One receives much more information through a visual sense than through a tactile one. However, most visual aids for hearing-impaired persons are not wearable because it is difficult to make them compact and it is not a best way to mask always their vision.Generally it is difficult to get the integrated patterns by a single mathematical transform of signals, such as a Foruier transform. In order to obtain the integrated pattern speech parameters should be carefully extracted by an analysis according as each parameter, and a visual pattern, which can intuitively be understood by anyone, must be synthesized from them. Successful integration of speech parameters will never disturb understanding of individual features, so that the system can be used for speech training and communication.

Visual speech alters the discrimination and identification of non-intact auditory speech in children with hearing loss.

Science.gov (United States)

Jerger, Susan; Damian, Markus F; McAlpine, Rachel P; Abdi, Hervé

2017-03-01

Understanding spoken language is an audiovisual event that depends critically on the ability to discriminate and identify phonemes yet we have little evidence about the role of early auditory experience and visual speech on the development of these fundamental perceptual skills. Objectives of this research were to determine 1) how visual speech influences phoneme discrimination and identification; 2) whether visual speech influences these two processes in a like manner, such that discrimination predicts identification; and 3) how the degree of hearing loss affects this relationship. Such evidence is crucial for developing effective intervention strategies to mitigate the effects of hearing loss on language development. Participants were 58 children with early-onset sensorineural hearing loss (CHL, 53% girls, M = 9;4 yrs) and 58 children with normal hearing (CNH, 53% girls, M = 9;4 yrs). Test items were consonant-vowel (CV) syllables and nonwords with intact visual speech coupled to non-intact auditory speech (excised onsets) as, for example, an intact consonant/rhyme in the visual track (Baa or Baz) coupled to non-intact onset/rhyme in the auditory track (/-B/aa or/-B/az). The items started with an easy-to-speechread/B/or difficult-to-speechread/G/onset and were presented in the auditory (static face) vs. audiovisual (dynamic face) modes. We assessed discrimination for intact vs. non-intact different pairs (e.g., Baa:/-B/aa). We predicted that visual speech would cause the non-intact onset to be perceived as intact and would therefore generate more same-as opposed to different-responses in the audiovisual than auditory mode. We assessed identification by repetition of nonwords with non-intact onsets (e.g.,/-B/az). We predicted that visual speech would cause the non-intact onset to be perceived as intact and would therefore generate more Baz-as opposed to az- responses in the audiovisual than auditory mode. Performance in the audiovisual mode showed more same
Visual Speech Alters the Discrimination and Identification of Non-Intact Auditory Speech in Children with Hearing Loss

Science.gov (United States)

Jerger, Susan; Damian, Markus F.; McAlpine, Rachel P.; Abdi, Hervé

2017-01-01

Performance in the audiovisual mode showed more same responses for the intact vs. non-intact different pairs (e.g., Baa:/–B/aa) and more intact onset responses for nonword repetition (Baz for/–B/az). Thus visual speech altered both discrimination and identification in the CHL—to a large extent for the /B/ onsets but only minimally for the /G/ onsets. The CHL identified the stimuli similarly to the CNH but did not discriminate the stimuli similarly. A bias-free measure of the children’s discrimination skills (i.e., d’ analysis) revealed that the CHL had greater difficulty discriminating intact from non-intact speech in both modes. As the degree of HL worsened, the ability to discriminate the intact vs. non-intact onsets in the auditory mode worsened. Discrimination ability in CHL significantly predicted their identification of the onsets—even after variation due to the other variables was controlled. Conclusions These results clearly established that visual speech can fill in non-intact auditory speech, and this effect, in turn, made the non-intact onsets more difficult to discriminate from intact speech and more likely to be perceived as intact. Such results 1) demonstrate the value of visual speech at multiple levels of linguistic processing and 2) support intervention programs that view visual speech as a powerful asset for developing spoken language in CHL. PMID:28167003
Comparison of Speech Perception in Background Noise with Acceptance of Background Noise in Aided and Unaided Conditions.

Science.gov (United States)

Nabelek, Anna K.; Tampas, Joanna W.; Burchfield, Samuel B.

2004-01-01

l, speech perception in noiseBackground noise is a significant factor influencing hearing-aid satisfaction and is a major reason for rejection of hearing aids. Attempts have been made by previous researchers to relate the use of hearing aids to speech perception in noise (SPIN), with an expectation of improved speech perception followed by an…
Simultaneous Assessment of Speech Identification and Spatial Discrimination

Directory of Open Access Journals (Sweden)

Jennifer K. Bizley

2015-12-01

Full Text Available With increasing numbers of children and adults receiving bilateral cochlear implants, there is an urgent need for assessment tools that enable testing of binaural hearing abilities. Current test batteries are either limited in scope or are of an impractical duration for routine testing. Here, we report a behavioral test that enables combined testing of speech identification and spatial discrimination in noise. In this task, multitalker babble was presented from all speakers, and pairs of speech tokens were sequentially presented from two adjacent speakers. Listeners were required to identify both words from a closed set of four possibilities and to determine whether the second token was presented to the left or right of the first. In Experiment 1, normal-hearing adult listeners were tested at 15° intervals throughout the frontal hemifield. Listeners showed highest spatial discrimination performance in and around the frontal midline, with a decline at more eccentric locations. In contrast, speech identification abilities were least accurate near the midline and showed an improvement in performance at more lateral locations. In Experiment 2, normal-hearing listeners were assessed using a restricted range of speaker locations designed to match those found in clinical testing environments. Here, speakers were separated by 15° around the midline and 30° at more lateral locations. This resulted in a similar pattern of behavioral results as in Experiment 1. We conclude, this test offers the potential to assess both spatial discrimination and the ability to use spatial information for unmasking in clinical populations.
The influence of hearing aids on the speech and language development of children with hearing loss.

Science.gov (United States)

Tomblin, J Bruce; Oleson, Jacob J; Ambrose, Sophie E; Walker, Elizabeth; Moeller, Mary Pat

2014-05-01

IMPORTANCE Hearing loss (HL) in children can be deleterious to their speech and language development. The standard of practice has been early provision of hearing aids (HAs) to moderate these effects; however, there have been few empirical studies evaluating the effectiveness of this practice on speech and language development among children with mild-to-severe HL. OBJECTIVE To investigate the contributions of aided hearing and duration of HA use to speech and language outcomes in children with mild-to-severe HL. DESIGN, SETTING, AND PARTICIPANTS An observational cross-sectional design was used to examine the association of aided hearing levels and length of HA use with levels of speech and language outcomes. One hundred eighty 3- and 5-year-old children with HL were recruited through records of Universal Newborn Hearing Screening and referrals from clinical service providers in the general community in 6 US states. INTERVENTIONS All but 4 children had been fitted with HAs, and measures of aided hearing and the duration of HA use were obtained. MAIN OUTCOMES AND MEASURES Standardized measures of speech and language ability were obtained. RESULTS Measures of the gain in hearing ability for speech provided by the HA were significantly correlated with levels of speech (ρ179 = 0.20; P = .008) and language: ρ155 = 0.21; P = .01) ability. These correlations were indicative of modest levels of association between aided hearing and speech and language outcomes. These benefits were found for children with mild and moderate-to-severe HL. In addition, the amount of benefit from aided hearing interacted with the duration of HA experience (Speech: F4,161 = 4.98; P < .001; Language: F4,138 = 2.91; P < .02). Longer duration of HA experience was most beneficial for children who had the best aided hearing. CONCLUSIONS AND RELEVANCE The degree of improved hearing provided by HAs was associated with better speech and language development in children
Speech understanding and directional hearing for hearing-impaired subjects with in-the-ear and behind-the-ear hearing aids

NARCIS (Netherlands)

Leeuw, A. R.; Dreschler, W. A.

1987-01-01

With respect to acoustical properties, in-the-ear (ITE) aids should give better understanding and directional hearing than behind-the-ear (BTE) aids. Also hearing-impaired subjects often prefer ITEs. A study was performed to assess objectively the improvement in speech understanding and directional
Sound quality measures for speech in noise through a commercial hearing aid implementing digital noise reduction.

Science.gov (United States)

Ricketts, Todd A; Hornsby, Benjamin W Y

2005-05-01

This brief report discusses the affect of digital noise reduction (DNR) processing on aided speech recognition and sound quality measures in 14 adults fitted with a commercial hearing aid. Measures of speech recognition and sound quality were obtained in two different speech-in-noise conditions (71 dBA speech, +6 dB SNR and 75 dBA speech, +1 dB SNR). The results revealed that the presence or absence of DNR processing did not impact speech recognition in noise (either positively or negatively). Paired comparisons of sound quality for the same speech in noise signals, however, revealed a strong preference for DNR processing. These data suggest that at least one implementation of DNR processing is capable of providing improved sound quality, for speech in noise, in the absence of improved speech recognition.
Bandwidth Extension of Telephone Speech Aided by Data Embedding

Directory of Open Access Journals (Sweden)

Sagi Ariel

2007-01-01

Full Text Available A system for bandwidth extension of telephone speech, aided by data embedding, is presented. The proposed system uses the transmitted analog narrowband speech signal as a carrier of the side information needed to carry out the bandwidth extension. The upper band of the wideband speech is reconstructed at the receiving end from two components: a synthetic wideband excitation signal, generated from the narrowband telephone speech and a wideband spectral envelope, parametrically represented and transmitted as embedded data in the telephone speech. We propose a novel data embedding scheme, in which the scalar Costa scheme is combined with an auditory masking model allowing high rate transparent embedding, while maintaining a low bit error rate. The signal is transformed to the frequency domain via the discrete Hartley transform (DHT and is partitioned into subbands. Data is embedded in an adaptively chosen subset of subbands by modifying the DHT coefficients. In our simulations, high quality wideband speech was obtained from speech transmitted over a telephone line (characterized by spectral magnitude distortion, dispersion, and noise, in which side information data is transparently embedded at the rate of 600 information bits/second and with a bit error rate of approximately . In a listening test, the reconstructed wideband speech was preferred (at different degrees over conventional telephone speech in of the test utterances.
Bandwidth Extension of Telephone Speech Aided by Data Embedding

Directory of Open Access Journals (Sweden)

David Malah

2007-01-01

Full Text Available A system for bandwidth extension of telephone speech, aided by data embedding, is presented. The proposed system uses the transmitted analog narrowband speech signal as a carrier of the side information needed to carry out the bandwidth extension. The upper band of the wideband speech is reconstructed at the receiving end from two components: a synthetic wideband excitation signal, generated from the narrowband telephone speech and a wideband spectral envelope, parametrically represented and transmitted as embedded data in the telephone speech. We propose a novel data embedding scheme, in which the scalar Costa scheme is combined with an auditory masking model allowing high rate transparent embedding, while maintaining a low bit error rate. The signal is transformed to the frequency domain via the discrete Hartley transform (DHT and is partitioned into subbands. Data is embedded in an adaptively chosen subset of subbands by modifying the DHT coefficients. In our simulations, high quality wideband speech was obtained from speech transmitted over a telephone line (characterized by spectral magnitude distortion, dispersion, and noise, in which side information data is transparently embedded at the rate of 600 information bits/second and with a bit error rate of approximately 3⋅10−4. In a listening test, the reconstructed wideband speech was preferred (at different degrees over conventional telephone speech in 92.5% of the test utterances.
Objective Prediction of Hearing Aid Benefit Across Listener Groups Using Machine Learning: Speech Recognition Performance With Binaural Noise-Reduction Algorithms.

Science.gov (United States)

Schädler, Marc R; Warzybok, Anna; Kollmeier, Birger

2018-01-01

The simulation framework for auditory discrimination experiments (FADE) was adopted and validated to predict the individual speech-in-noise recognition performance of listeners with normal and impaired hearing with and without a given hearing-aid algorithm. FADE uses a simple automatic speech recognizer (ASR) to estimate the lowest achievable speech reception thresholds (SRTs) from simulated speech recognition experiments in an objective way, independent from any empirical reference data. Empirical data from the literature were used to evaluate the model in terms of predicted SRTs and benefits in SRT with the German matrix sentence recognition test when using eight single- and multichannel binaural noise-reduction algorithms. To allow individual predictions of SRTs in binaural conditions, the model was extended with a simple better ear approach and individualized by taking audiograms into account. In a realistic binaural cafeteria condition, FADE explained about 90% of the variance of the empirical SRTs for a group of normal-hearing listeners and predicted the corresponding benefits with a root-mean-square prediction error of 0.6 dB. This highlights the potential of the approach for the objective assessment of benefits in SRT without prior knowledge about the empirical data. The predictions for the group of listeners with impaired hearing explained 75% of the empirical variance, while the individual predictions explained less than 25%. Possibly, additional individual factors should be considered for more accurate predictions with impaired hearing. A competing talker condition clearly showed one limitation of current ASR technology, as the empirical performance with SRTs lower than -20 dB could not be predicted.
Effect of hearing aids use on speech stimulus decoding through speech-evoked ABR

Directory of Open Access Journals (Sweden)

Renata Aparecida Leite

Full Text Available Abstract Introduction The electrophysiological responses obtained with the complex auditory brainstem response (cABR provide objective measures of subcortical processing of speech and other complex stimuli. The cABR has also been used to verify the plasticity in the auditory pathway in the subcortical regions. Objective To compare the results of cABR obtained in children using hearing aids before and after 9 months of adaptation, as well as to compare the results of these children with those obtained in children with normal hearing. Methods Fourteen children with normal hearing (Control Group - CG and 18 children with mild to moderate bilateral sensorineural hearing loss (Study Group - SG, aged 7-12 years, were evaluated. The children were submitted to pure tone and vocal audiometry, acoustic immittance measurements and ABR with speech stimulus, being submitted to the evaluations at three different moments: initial evaluation (M0, 3 months after the initial evaluation (M3 and 9 months after the evaluation (M9; at M0, the children assessed in the study group did not use hearing aids yet. Results When comparing the CG and the SG, it was observed that the SG had a lower median for the V-A amplitude at M0 and M3, lower median for the latency of the component V at M9 and a higher median for the latency of component O at M3 and M9. A reduction in the latency of component A at M9 was observed in the SG. Conclusion Children with mild to moderate hearing loss showed speech stimulus processing deficits and the main impairment is related to the decoding of the transient portion of this stimulus spectrum. It was demonstrated that the use of hearing aids promoted neuronal plasticity of the Central Auditory Nervous System after an extended time of sensory stimulation.
Hearing aid processing of loud speech and noise signals: Consequences for loudness perception and listening comfort

DEFF Research Database (Denmark)

Schmidt, Erik

2007-01-01

sounds, has found that both normal-hearing and hearing-impaired listeners prefer loud sounds to be closer to the most comfortable loudness-level, than suggested by common non-linear fitting rules. During this project, two listening experiments were carried out. In the first experiment, hearing aid users......Hearing aid processing of loud speech and noise signals: Consequences for loudness perception and listening comfort. Sound processing in hearing aids is determined by the fitting rule. The fitting rule describes how the hearing aid should amplify speech and sounds in the surroundings......, such that they become audible again for the hearing impaired person. The general goal is to place all sounds within the hearing aid users’ audible range, such that speech intelligibility and listening comfort become as good as possible. Amplification strategies in hearing aids are in many cases based on empirical...
Multistage audiovisual integration of speech: dissociating identification and detection

DEFF Research Database (Denmark)

Eskelund, Kasper; Tuomainen, Jyrki; Andersen, Tobias

2011-01-01

Speech perception integrates auditory and visual information. This is evidenced by the McGurk illusion where seeing the talking face influences the auditory phonetic percept and by the audiovisual detection advantage where seeing the talking face influences the detectability of the acoustic speech...... signal. Here we show that identification of phonetic content and detection can be dissociated as speech-specific and non-specific audiovisual integration effects. To this end, we employed synthetically modified stimuli, sine wave speech (SWS), which is an impoverished speech signal that only observers...... informed of its speech-like nature recognize as speech. While the McGurk illusion only occurred for informed observers the audiovisual detection advantage occurred for naïve observers as well. This finding supports a multi-stage account of audiovisual integration of speech in which the many attributes...
Effects of low harmonics on tone identification in natural and vocoded speech.

Science.gov (United States)

Liu, Chang; Azimi, Behnam; Tahmina, Qudsia; Hu, Yi

2012-11-01

This study investigated the contribution of low-frequency harmonics to identifying Mandarin tones in natural and vocoded speech in quiet and noisy conditions. Results showed that low-frequency harmonics of natural speech led to highly accurate tone identification; however, for vocoded speech, low-frequency harmonics yielded lower tone identification than stimuli with full harmonics, except for tone 4. Analysis of the correlation between tone accuracy and the amplitude-F0 correlation index suggested that "more" speech contents (i.e., more harmonics) did not necessarily yield better tone recognition for vocoded speech, especially when the amplitude contour of the signals did not co-vary with the F0 contour.
Comparison of bimodal and bilateral cochlear implant users on speech recognition with competing talker, music perception, affective prosody discrimination, and talker identification.

Science.gov (United States)

Cullington, Helen E; Zeng, Fan-Gang

2011-02-01

Despite excellent performance in speech recognition in quiet, most cochlear implant users have great difficulty with speech recognition in noise, music perception, identifying tone of voice, and discriminating different talkers. This may be partly due to the pitch coding in cochlear implant speech processing. Most current speech processing strategies use only the envelope information; the temporal fine structure is discarded. One way to improve electric pitch perception is to use residual acoustic hearing via a hearing aid on the nonimplanted ear (bimodal hearing). This study aimed to test the hypothesis that bimodal users would perform better than bilateral cochlear implant users on tasks requiring good pitch perception. Four pitch-related tasks were used. 1. Hearing in Noise Test (HINT) sentences spoken by a male talker with a competing female, male, or child talker. 2. Montreal Battery of Evaluation of Amusia. This is a music test with six subtests examining pitch, rhythm and timing perception, and musical memory. 3. Aprosodia Battery. This has five subtests evaluating aspects of affective prosody and recognition of sarcasm. 4. Talker identification using vowels spoken by 10 different talkers (three men, three women, two boys, and two girls). Bilateral cochlear implant users were chosen as the comparison group. Thirteen bimodal and 13 bilateral adult cochlear implant users were recruited; all had good speech perception in quiet. There were no significant differences between the mean scores of the bimodal and bilateral groups on any of the tests, although the bimodal group did perform better than the bilateral group on almost all tests. Performance on the different pitch-related tasks was not correlated, meaning that if a subject performed one task well they would not necessarily perform well on another. The correlation between the bimodal users' hearing threshold levels in the aided ear and their performance on these tasks was weak. Although the bimodal cochlear
Noise Reduction with Microphone Arrays for Speaker Identification

Energy Technology Data Exchange (ETDEWEB)

Cohen, Z

2011-12-22

Reducing acoustic noise in audio recordings is an ongoing problem that plagues many applications. This noise is hard to reduce because of interfering sources and non-stationary behavior of the overall background noise. Many single channel noise reduction algorithms exist but are limited in that the more the noise is reduced; the more the signal of interest is distorted due to the fact that the signal and noise overlap in frequency. Specifically acoustic background noise causes problems in the area of speaker identification. Recording a speaker in the presence of acoustic noise ultimately limits the performance and confidence of speaker identification algorithms. In situations where it is impossible to control the environment where the speech sample is taken, noise reduction filtering algorithms need to be developed to clean the recorded speech of background noise. Because single channel noise reduction algorithms would distort the speech signal, the overall challenge of this project was to see if spatial information provided by microphone arrays could be exploited to aid in speaker identification. The goals are: (1) Test the feasibility of using microphone arrays to reduce background noise in speech recordings; (2) Characterize and compare different multichannel noise reduction algorithms; (3) Provide recommendations for using these multichannel algorithms; and (4) Ultimately answer the question - Can the use of microphone arrays aid in speaker identification?
Multistage audiovisual integration of speech: dissociating identification and detection.

Science.gov (United States)

Eskelund, Kasper; Tuomainen, Jyrki; Andersen, Tobias S

2011-02-01

Speech perception integrates auditory and visual information. This is evidenced by the McGurk illusion where seeing the talking face influences the auditory phonetic percept and by the audiovisual detection advantage where seeing the talking face influences the detectability of the acoustic speech signal. Here, we show that identification of phonetic content and detection can be dissociated as speech-specific and non-specific audiovisual integration effects. To this end, we employed synthetically modified stimuli, sine wave speech (SWS), which is an impoverished speech signal that only observers informed of its speech-like nature recognize as speech. While the McGurk illusion only occurred for informed observers, the audiovisual detection advantage occurred for naïve observers as well. This finding supports a multistage account of audiovisual integration of speech in which the many attributes of the audiovisual speech signal are integrated by separate integration processes.
The Performance-Perceptual Test (PPT) and its relationship to aided reported handicap and hearing aid satisfaction.

Science.gov (United States)

Saunders, Gabrielle H; Forsline, Anna

2006-06-01

Results of objective clinical tests (e.g., measures of speech understanding in noise) often conflict with subjective reports of hearing aid benefit and satisfaction. The Performance-Perceptual Test (PPT) is an outcome measure in which objective and subjective evaluations are made by using the same test materials, testing format, and unit of measurement (signal-to-noise ratio, S/N), permitting a direct comparison between measured and perceived ability to hear. Two variables are measured: a Performance Speech Reception Threshold in Noise (SRTN) for 50% correct performance and a Perceptual SRTN, which is the S/N at which listeners perceive that they can understand the speech material. A third variable is computed: the Performance-Perceptual Discrepancy (PPDIS); it is the difference between the Performance and Perceptual SRTNs and measures the extent to which listeners "misjudge" their hearing ability. Saunders et al. in 2004 examined the relation between PPT scores and unaided hearing handicap. In this publication, the relations between the PPT, residual aided handicap, and hearing aid satisfaction are described. Ninety-four individuals between the ages of 47 and 86 yr participated. All had symmetrical sensorineural hearing loss and had worn binaural hearing aids for at least 6 wk before participating. All subjects underwent routine audiological examination and completed the PPT, the Hearing Handicap Inventory for the Elderly/Adults (HHIE/A), and the Satisfaction for Amplification in Daily Life questionnaire. Sixty-five subjects attended one research visit for participation in this study, and 29 attended a second visit to complete the PPT a second time. Performance and Perceptual SRTN and PPDIS scores were normally distributed and showed excellent test-retest reliability. Aided SRTNs were significantly better than unaided SRTNs; aided and unaided PPDIS values did not differ. Stepwise multiple linear regression showed that the PPDIS, the Performance SRTN, and age were
Objective Prediction of Hearing Aid Benefit Across Listener Groups Using Machine Learning: Speech Recognition Performance With Binaural Noise-Reduction Algorithms

Science.gov (United States)

Schädler, Marc R.; Warzybok, Anna; Kollmeier, Birger

2018-01-01

The simulation framework for auditory discrimination experiments (FADE) was adopted and validated to predict the individual speech-in-noise recognition performance of listeners with normal and impaired hearing with and without a given hearing-aid algorithm. FADE uses a simple automatic speech recognizer (ASR) to estimate the lowest achievable speech reception thresholds (SRTs) from simulated speech recognition experiments in an objective way, independent from any empirical reference data. Empirical data from the literature were used to evaluate the model in terms of predicted SRTs and benefits in SRT with the German matrix sentence recognition test when using eight single- and multichannel binaural noise-reduction algorithms. To allow individual predictions of SRTs in binaural conditions, the model was extended with a simple better ear approach and individualized by taking audiograms into account. In a realistic binaural cafeteria condition, FADE explained about 90% of the variance of the empirical SRTs for a group of normal-hearing listeners and predicted the corresponding benefits with a root-mean-square prediction error of 0.6 dB. This highlights the potential of the approach for the objective assessment of benefits in SRT without prior knowledge about the empirical data. The predictions for the group of listeners with impaired hearing explained 75% of the empirical variance, while the individual predictions explained less than 25%. Possibly, additional individual factors should be considered for more accurate predictions with impaired hearing. A competing talker condition clearly showed one limitation of current ASR technology, as the empirical performance with SRTs lower than −20 dB could not be predicted. PMID:29692200
Speech perception benefits of FM and infrared devices to children with hearing aids in a typical classroom.

Science.gov (United States)

Anderson, Karen L; Goldstein, Howard

2004-04-01

Children typically learn in classroom environments that have background noise and reverberation that interfere with accurate speech perception. Amplification technology can enhance the speech perception of students who are hard of hearing. This study used a single-subject alternating treatments design to compare the speech recognition abilities of children who are, hard of hearing when they were using hearing aids with each of three frequency modulated (FM) or infrared devices. Eight 9-12-year-olds with mild to severe hearing loss repeated Hearing in Noise Test (HINT) sentence lists under controlled conditions in a typical kindergarten classroom with a background noise level of +10 dB signal-to-noise (S/N) ratio and 1.1 s reverberation time. Participants listened to HINT lists using hearing aids alone and hearing aids in combination with three types of S/N-enhancing devices that are currently used in mainstream classrooms: (a) FM systems linked to personal hearing aids, (b) infrared sound field systems with speakers placed throughout the classroom, and (c) desktop personal sound field FM systems. The infrared ceiling sound field system did not provide benefit beyond that provided by hearing aids alone. Desktop and personal FM systems in combination with personal hearing aids provided substantial improvements in speech recognition. This information can assist in making S/N-enhancing device decisions for students using hearing aids. In a reverberant and noisy classroom setting, classroom sound field devices are not beneficial to speech perception for students with hearing aids, whereas either personal FM or desktop sound field systems provide listening benefits.

Performance Assessment of Dynaspeak Speech Recognition System on Inflight Databases

National Research Council Canada - National Science Library

Barry, Timothy

2004-01-01

.... To aid in the assessment of various commercially available speech recognition systems, several aircraft speech databases have been developed at the Air Force Research Laboratory's Human Effectiveness Directorate...
Speech endpoint detection with non-language speech sounds for generic speech processing applications

Science.gov (United States)

McClain, Matthew; Romanowski, Brian

2009-05-01

Non-language speech sounds (NLSS) are sounds produced by humans that do not carry linguistic information. Examples of these sounds are coughs, clicks, breaths, and filled pauses such as "uh" and "um" in English. NLSS are prominent in conversational speech, but can be a significant source of errors in speech processing applications. Traditionally, these sounds are ignored by speech endpoint detection algorithms, where speech regions are identified in the audio signal prior to processing. The ability to filter NLSS as a pre-processing step can significantly enhance the performance of many speech processing applications, such as speaker identification, language identification, and automatic speech recognition. In order to be used in all such applications, NLSS detection must be performed without the use of language models that provide knowledge of the phonology and lexical structure of speech. This is especially relevant to situations where the languages used in the audio are not known apriori. We present the results of preliminary experiments using data from American and British English speakers, in which segments of audio are classified as language speech sounds (LSS) or NLSS using a set of acoustic features designed for language-agnostic NLSS detection and a hidden-Markov model (HMM) to model speech generation. The results of these experiments indicate that the features and model used are capable of detection certain types of NLSS, such as breaths and clicks, while detection of other types of NLSS such as filled pauses will require future research.
Auditory and Non-Auditory Contributions for Unaided Speech Recognition in Noise as a Function of Hearing Aid Use.

Science.gov (United States)

Gieseler, Anja; Tahden, Maike A S; Thiel, Christiane M; Wagener, Kirsten C; Meis, Markus; Colonius, Hans

2017-01-01

Differences in understanding speech in noise among hearing-impaired individuals cannot be explained entirely by hearing thresholds alone, suggesting the contribution of other factors beyond standard auditory ones as derived from the audiogram. This paper reports two analyses addressing individual differences in the explanation of unaided speech-in-noise performance among n = 438 elderly hearing-impaired listeners ( mean = 71.1 ± 5.8 years). The main analysis was designed to identify clinically relevant auditory and non-auditory measures for speech-in-noise prediction using auditory (audiogram, categorical loudness scaling) and cognitive tests (verbal-intelligence test, screening test of dementia), as well as questionnaires assessing various self-reported measures (health status, socio-economic status, and subjective hearing problems). Using stepwise linear regression analysis, 62% of the variance in unaided speech-in-noise performance was explained, with measures Pure-tone average (PTA), Age , and Verbal intelligence emerging as the three most important predictors. In the complementary analysis, those individuals with the same hearing loss profile were separated into hearing aid users (HAU) and non-users (NU), and were then compared regarding potential differences in the test measures and in explaining unaided speech-in-noise recognition. The groupwise comparisons revealed significant differences in auditory measures and self-reported subjective hearing problems, while no differences in the cognitive domain were found. Furthermore, groupwise regression analyses revealed that Verbal intelligence had a predictive value in both groups, whereas Age and PTA only emerged significant in the group of hearing aid NU.
Dynamic relation between working memory capacity and speech recognition in noise during the first 6 months of hearing aid use.

Science.gov (United States)

Ng, Elaine H N; Classon, Elisabet; Larsby, Birgitta; Arlinger, Stig; Lunner, Thomas; Rudner, Mary; Rönnberg, Jerker

2014-11-23

The present study aimed to investigate the changing relationship between aided speech recognition and cognitive function during the first 6 months of hearing aid use. Twenty-seven first-time hearing aid users with symmetrical mild to moderate sensorineural hearing loss were recruited. Aided speech recognition thresholds in noise were obtained in the hearing aid fitting session as well as at 3 and 6 months postfitting. Cognitive abilities were assessed using a reading span test, which is a measure of working memory capacity, and a cognitive test battery. Results showed a significant correlation between reading span and speech reception threshold during the hearing aid fitting session. This relation was significantly weakened over the first 6 months of hearing aid use. Multiple regression analysis showed that reading span was the main predictor of speech recognition thresholds in noise when hearing aids were first fitted, but that the pure-tone average hearing threshold was the main predictor 6 months later. One way of explaining the results is that working memory capacity plays a more important role in speech recognition in noise initially rather than after 6 months of use. We propose that new hearing aid users engage working memory capacity to recognize unfamiliar processed speech signals because the phonological form of these signals cannot be automatically matched to phonological representations in long-term memory. As familiarization proceeds, the mismatch effect is alleviated, and the engagement of working memory capacity is reduced. © The Author(s) 2014.
Working memory and intelligibility of hearing-aid processed speech

Science.gov (United States)

Souza, Pamela E.; Arehart, Kathryn H.; Shen, Jing; Anderson, Melinda; Kates, James M.

2015-01-01

Previous work suggested that individuals with low working memory capacity may be at a disadvantage in adverse listening environments, including situations with background noise or substantial modification of the acoustic signal. This study explored the relationship between patient factors (including working memory capacity) and intelligibility and quality of modified speech for older individuals with sensorineural hearing loss. The modification was created using a combination of hearing aid processing [wide-dynamic range compression (WDRC) and frequency compression (FC)] applied to sentences in multitalker babble. The extent of signal modification was quantified via an envelope fidelity index. We also explored the contribution of components of working memory by including measures of processing speed and executive function. We hypothesized that listeners with low working memory capacity would perform more poorly than those with high working memory capacity across all situations, and would also be differentially affected by high amounts of signal modification. Results showed a significant effect of working memory capacity for speech intelligibility, and an interaction between working memory, amount of hearing loss and signal modification. Signal modification was the major predictor of quality ratings. These data add to the literature on hearing-aid processing and working memory by suggesting that the working memory-intelligibility effects may be related to aggregate signal fidelity, rather than to the specific signal manipulation. They also suggest that for individuals with low working memory capacity, sensorineural loss may be most appropriately addressed with WDRC and/or FC parameters that maintain the fidelity of the signal envelope. PMID:25999874
Working memory and intelligibility of hearing-aid processed speech

Directory of Open Access Journals (Sweden)

Pamela eSouza

2015-05-01

Full Text Available Previous work suggested that individuals with low working memory capacity may be at a disadvantage in adverse listening environments, including situations with background noise or substantial modification of the acoustic signal. This study explored the relationship between patient factors (including working memory capacity and intelligibility and quality of modified speech for older individuals with sensorineural hearing loss. The modification was created using a combination of hearing aid processing (wide-dynamic range compression and frequency compression applied to sentences in multitalker babble. The extent of signal modification was quantified via an envelope fidelity index. We also explored the contribution of components of working memory by including measures of processing speed and executive function. We hypothesized that listeners with low working memory capacity would perform more poorly than those with high working memory capacity across all situations, and would also be differentially affected by high amounts of signal modification. Results showed a significant effect of working memory capacity for speech intelligibility, and an interaction between working memory, amount of hearing loss and signal modification. Signal modification was the major predictor of quality ratings. These data add to the literature on hearing-aid processing and working memory by suggesting that the working memory-intelligibility effects may be related to aggregate signal fidelity, rather than on the specific signal manipulation. They also suggest that for individuals with low working memory capacity, sensorineural loss may be most appropriately addressed with wide-dynamic range compression and/or frequency compression parameters that maintain the fidelity of the signal envelope.
Gender Identification Using High-Frequency Speech Energy: Effects of Increasing the Low-Frequency Limit.

Science.gov (United States)

Donai, Jeremy J; Halbritter, Rachel M

The purpose of this study was to investigate the ability of normal-hearing listeners to use high-frequency energy for gender identification from naturally produced speech signals. Two experiments were conducted using a repeated-measures design. Experiment 1 investigated the effects of increasing high-pass filter cutoff (i.e., increasing the low-frequency spectral limit) on gender identification from naturally produced vowel segments. Experiment 2 studied the effects of increasing high-pass filter cutoff on gender identification from naturally produced sentences. Confidence ratings for the gender identification task were also obtained for both experiments. Listeners in experiment 1 were capable of extracting talker gender information at levels significantly above chance from vowel segments high-pass filtered up to 8.5 kHz. Listeners in experiment 2 also performed above chance on the gender identification task from sentences high-pass filtered up to 12 kHz. Cumulatively, the results of both experiments provide evidence that normal-hearing listeners can utilize information from the very high-frequency region (above 4 to 5 kHz) of the speech signal for talker gender identification. These findings are at variance with current assumptions regarding the perceptual information regarding talker gender within this frequency region. The current results also corroborate and extend previous studies of the use of high-frequency speech energy for perceptual tasks. These findings have potential implications for the study of information contained within the high-frequency region of the speech spectrum and the role this region may play in navigating the auditory scene, particularly when the low-frequency portion of the spectrum is masked by environmental noise sources or for listeners with substantial hearing loss in the low-frequency region and better hearing sensitivity in the high-frequency region (i.e., reverse slope hearing loss).
Noise-robust speech triage.

Science.gov (United States)

Bartos, Anthony L; Cipr, Tomas; Nelson, Douglas J; Schwarz, Petr; Banowetz, John; Jerabek, Ladislav

2018-04-01

A method is presented in which conventional speech algorithms are applied, with no modifications, to improve their performance in extremely noisy environments. It has been demonstrated that, for eigen-channel algorithms, pre-training multiple speaker identification (SID) models at a lattice of signal-to-noise-ratio (SNR) levels and then performing SID using the appropriate SNR dependent model was successful in mitigating noise at all SNR levels. In those tests, it was found that SID performance was optimized when the SNR of the testing and training data were close or identical. In this current effort multiple i-vector algorithms were used, greatly improving both processing throughput and equal error rate classification accuracy. Using identical approaches in the same noisy environment, performance of SID, language identification, gender identification, and diarization were significantly improved. A critical factor in this improvement is speech activity detection (SAD) that performs reliably in extremely noisy environments, where the speech itself is barely audible. To optimize SAD operation at all SNR levels, two algorithms were employed. The first maximized detection probability at low levels (-10 dB ≤ SNR < +10 dB) using just the voiced speech envelope, and the second exploited features extracted from the original speech to improve overall accuracy at higher quality levels (SNR ≥ +10 dB).
Gated audiovisual speech identification in silence vs. noise: effects on time and accuracy

Science.gov (United States)

Moradi, Shahram; Lidestam, Björn; Rönnberg, Jerker

2013-01-01

This study investigated the degree to which audiovisual presentation (compared to auditory-only presentation) affected isolation point (IPs, the amount of time required for the correct identification of speech stimuli using a gating paradigm) in silence and noise conditions. The study expanded on the findings of Moradi et al. (under revision), using the same stimuli, but presented in an audiovisual instead of an auditory-only manner. The results showed that noise impeded the identification of consonants and words (i.e., delayed IPs and lowered accuracy), but not the identification of final words in sentences. In comparison with the previous study by Moradi et al., it can be concluded that the provision of visual cues expedited IPs and increased the accuracy of speech stimuli identification in both silence and noise. The implication of the results is discussed in terms of models for speech understanding. PMID:23801980
Music and Speech Perception in Children Using Sung Speech.

Science.gov (United States)

Nie, Yingjiu; Galvin, John J; Morikawa, Michael; André, Victoria; Wheeler, Harley; Fu, Qian-Jie

2018-01-01

This study examined music and speech perception in normal-hearing children with some or no musical training. Thirty children (mean age = 11.3 years), 15 with and 15 without formal music training participated in the study. Music perception was measured using a melodic contour identification (MCI) task; stimuli were a piano sample or sung speech with a fixed timbre (same word for each note) or a mixed timbre (different words for each note). Speech perception was measured in quiet and in steady noise using a matrix-styled sentence recognition task; stimuli were naturally intonated speech or sung speech with a fixed pitch (same note for each word) or a mixed pitch (different notes for each word). Significant musician advantages were observed for MCI and speech in noise but not for speech in quiet. MCI performance was significantly poorer with the mixed timbre stimuli. Speech performance in noise was significantly poorer with the fixed or mixed pitch stimuli than with spoken speech. Across all subjects, age at testing and MCI performance were significantly correlated with speech performance in noise. MCI and speech performance in quiet was significantly poorer for children than for adults from a related study using the same stimuli and tasks; speech performance in noise was significantly poorer for young than for older children. Long-term music training appeared to benefit melodic pitch perception and speech understanding in noise in these pediatric listeners.
Speech-specific audiovisual perception affects identification but not detection of speech

DEFF Research Database (Denmark)

Eskelund, Kasper; Andersen, Tobias

Speech perception is audiovisual as evidenced by the McGurk effect in which watching incongruent articulatory mouth movements can change the phonetic auditory speech percept. This type of audiovisual integration may be specific to speech or be applied to all stimuli in general. To investigate...... of audiovisual integration specific to speech perception. However, the results of Tuomainen et al. might have been influenced by another effect. When observers were naïve, they had little motivation to look at the face. When informed, they knew that the face was relevant for the task and this could increase...... visual detection task. In our first experiment, observers presented with congruent and incongruent audiovisual sine-wave speech stimuli did only show a McGurk effect when informed of the speech nature of the stimulus. Performance on the secondary visual task was very good, thus supporting the finding...
Speech Perception Engages a General Timer: Evidence from a Divided Attention Word Identification Task

Science.gov (United States)

Casini, Laurence; Burle, Boris; Nguyen, Noel

2009-01-01

Time is essential to speech. The duration of speech segments plays a critical role in the perceptual identification of these segments, and therefore in that of spoken words. Here, using a French word identification task, we show that vowels are perceived as shorter when attention is divided between two tasks, as compared to a single task control…
Timing in audiovisual speech perception: A mini review and new psychophysical data.

Science.gov (United States)

Venezia, Jonathan H; Thurman, Steven M; Matchin, William; George, Sahara E; Hickok, Gregory

2016-02-01

Recent influential models of audiovisual speech perception suggest that visual speech aids perception by generating predictions about the identity of upcoming speech sounds. These models place stock in the assumption that visual speech leads auditory speech in time. However, it is unclear whether and to what extent temporally-leading visual speech information contributes to perception. Previous studies exploring audiovisual-speech timing have relied upon psychophysical procedures that require artificial manipulation of cross-modal alignment or stimulus duration. We introduce a classification procedure that tracks perceptually relevant visual speech information in time without requiring such manipulations. Participants were shown videos of a McGurk syllable (auditory /apa/ + visual /aka/ = perceptual /ata/) and asked to perform phoneme identification (/apa/ yes-no). The mouth region of the visual stimulus was overlaid with a dynamic transparency mask that obscured visual speech in some frames but not others randomly across trials. Variability in participants' responses (~35 % identification of /apa/ compared to ~5 % in the absence of the masker) served as the basis for classification analysis. The outcome was a high resolution spatiotemporal map of perceptually relevant visual features. We produced these maps for McGurk stimuli at different audiovisual temporal offsets (natural timing, 50-ms visual lead, and 100-ms visual lead). Briefly, temporally-leading (~130 ms) visual information did influence auditory perception. Moreover, several visual features influenced perception of a single speech sound, with the relative influence of each feature depending on both its temporal relation to the auditory signal and its informational content.
Timing in Audiovisual Speech Perception: A Mini Review and New Psychophysical Data

Science.gov (United States)

Venezia, Jonathan H.; Thurman, Steven M.; Matchin, William; George, Sahara E.; Hickok, Gregory

2015-01-01

Recent influential models of audiovisual speech perception suggest that visual speech aids perception by generating predictions about the identity of upcoming speech sounds. These models place stock in the assumption that visual speech leads auditory speech in time. However, it is unclear whether and to what extent temporally-leading visual speech information contributes to perception. Previous studies exploring audiovisual-speech timing have relied upon psychophysical procedures that require artificial manipulation of cross-modal alignment or stimulus duration. We introduce a classification procedure that tracks perceptually-relevant visual speech information in time without requiring such manipulations. Participants were shown videos of a McGurk syllable (auditory /apa/ + visual /aka/ = perceptual /ata/) and asked to perform phoneme identification (/apa/ yes-no). The mouth region of the visual stimulus was overlaid with a dynamic transparency mask that obscured visual speech in some frames but not others randomly across trials. Variability in participants' responses (∼35% identification of /apa/ compared to ∼5% in the absence of the masker) served as the basis for classification analysis. The outcome was a high resolution spatiotemporal map of perceptually-relevant visual features. We produced these maps for McGurk stimuli at different audiovisual temporal offsets (natural timing, 50-ms visual lead, and 100-ms visual lead). Briefly, temporally-leading (∼130 ms) visual information did influence auditory perception. Moreover, several visual features influenced perception of a single speech sound, with the relative influence of each feature depending on both its temporal relation to the auditory signal and its informational content. PMID:26669309
Assessing the efficacy of hearing-aid amplification using a phoneme test

DEFF Research Database (Denmark)

Scheidiger, Christoph; Allen, Jont B; Dau, Torsten

2017-01-01

Consonant-vowel (CV) perception experiments provide valuable insights into how humans process speech. Here, two CV identification experiments were conducted in a group of hearing-impaired (HI) listeners, using 14 consonants followed by the vowel /ɑ/. The CVs were presented in quiet and with added......, in combination with a well-controlled phoneme speech test, may be used to assess the impact of hearing-aid signal processing on speech intelligibility....
On the dynamics of the preference-performance relation for hearing aid noise reduction

DEFF Research Database (Denmark)

Fischer, Rosa-Linde; Wagener, Kirsten C.; Vormann, Matthias

on the data collected during the first laboratory assessment of the study. In particular, the influence of hearing aid experience and individual noise sensitivity on the preference-performance relation will be presented and discussed. REFERENCES S. Getzmann, E. Wascher and M. Falkenstein (2015). "What does......Previous research has shown that hearing aid users can differ substantially in their preference for noise reduction (NR) strength, and that preference for and speech recognition with NR processing typically are not correlated (e.g. Neher 2014; Serman et al. 2016). In other words, hearing aid users...... may prefer a certain NR setting, but perform better with a different one. The aim of the present work was to investigate the influence of individual noise sensitivity, hearing aid experience and acclimatization on the preference-performance relation for different NR settings. For this purpose...
Predicting automatic speech recognition performance over communication channels from instrumental speech quality and intelligibility scores

NARCIS (Netherlands)

Gallardo, L.F.; Möller, S.; Beerends, J.

2017-01-01

The performance of automatic speech recognition based on coded-decoded speech heavily depends on the quality of the transmitted signals, determined by channel impairments. This paper examines relationships between speech recognition performance and measurements of speech quality and intelligibility
Investigating the Role of Working Memory in Speech-in-noise Identification for Listeners with Normal Hearing.

Science.gov (United States)

Füllgrabe, Christian; Rosen, Stuart

2016-01-01

With the advent of cognitive hearing science, increased attention has been given to individual differences in cognitive functioning and their explanatory power in accounting for inter-listener variability in understanding speech in noise (SiN). The psychological construct that has received most interest is working memory (WM), representing the ability to simultaneously store and process information. Common lore and theoretical models assume that WM-based processes subtend speech processing in adverse perceptual conditions, such as those associated with hearing loss or background noise. Empirical evidence confirms the association between WM capacity (WMC) and SiN identification in older hearing-impaired listeners. To assess whether WMC also plays a role when listeners without hearing loss process speech in acoustically adverse conditions, we surveyed published and unpublished studies in which the Reading-Span test (a widely used measure of WMC) was administered in conjunction with a measure of SiN identification. The survey revealed little or no evidence for an association between WMC and SiN performance. We also analysed new data from 132 normal-hearing participants sampled from across the adult lifespan (18-91 years), for a relationship between Reading-Span scores and identification of matrix sentences in noise. Performance on both tasks declined with age, and correlated weakly even after controlling for the effects of age and audibility (r = 0.39, p ≤ 0.001, one-tailed). However, separate analyses for different age groups revealed that the correlation was only significant for middle-aged and older groups but not for the young (< 40 years) participants.
Achieving effective hearing aid fitting within one month after identification of childhood permanent hearing impairment.

Science.gov (United States)

Bastanza, G; Gallus, R; De Carlini, M; Picciotti, P M; Muzzi, E; Ciciriello, E; Orzan, E; Conti, G

2016-02-01

Diagnosis of child permanent hearing impairment (PHI) can be made with extreme timeliness compared to the past thanks to improvements in PHI identification through newborn hearing screening programmes. It now becomes essential to provide an effective amplification as quickly as possible in order to restore auditory function and favour speech and language development. The early fitting of hearing aids and possible later cochlear implantation indeed prompts the development of central auditory pathways, connections with secondary sensory brain areas, as well as with motor and articulatory cortex. The aim of this paper is to report the results of a strategic analysis that involves identification of strengths, weaknesses, opportunities and threats regarding the process of achieving early amplification in all cases of significant childhood PHI. The analysis is focused on the Italian situation and is part of the Italian Ministry of Health project CCM 2013 "Preventing Communication Disorders: a Regional Program for Early Identification, Intervention and Care of Hearing Impaired Children". © Copyright by Società Italiana di Otorinolaringologia e Chirurgia Cervico-Facciale.
Individual differences in selective attention predict speech identification at a cocktail party.

Science.gov (United States)

Oberfeld, Daniel; Klöckner-Nowotny, Felicitas

2016-08-31

Listeners with normal hearing show considerable individual differences in speech understanding when competing speakers are present, as in a crowded restaurant. Here, we show that one source of this variance are individual differences in the ability to focus selective attention on a target stimulus in the presence of distractors. In 50 young normal-hearing listeners, the performance in tasks measuring auditory and visual selective attention was associated with sentence identification in the presence of spatially separated competing speakers. Together, the measures of selective attention explained a similar proportion of variance as the binaural sensitivity for the acoustic temporal fine structure. Working memory span, age, and audiometric thresholds showed no significant association with speech understanding. These results suggest that a reduced ability to focus attention on a target is one reason why some listeners with normal hearing sensitivity have difficulty communicating in situations with background noise.

[Improving speech comprehension using a new cochlear implant speech processor].

Science.gov (United States)

Müller-Deile, J; Kortmann, T; Hoppe, U; Hessel, H; Morsnowski, A

2009-06-01

The aim of this multicenter clinical field study was to assess the benefits of the new Freedom 24 sound processor for cochlear implant (CI) users implanted with the Nucleus 24 cochlear implant system. The study included 48 postlingually profoundly deaf experienced CI users who demonstrated speech comprehension performance with their current speech processor on the Oldenburg sentence test (OLSA) in quiet conditions of at least 80% correct scores and who were able to perform adaptive speech threshold testing using the OLSA in noisy conditions. Following baseline measures of speech comprehension performance with their current speech processor, subjects were upgraded to the Freedom 24 speech processor. After a take-home trial period of at least 2 weeks, subject performance was evaluated by measuring the speech reception threshold with the Freiburg multisyllabic word test and speech intelligibility with the Freiburg monosyllabic word test at 50 dB and 70 dB in the sound field. The results demonstrated highly significant benefits for speech comprehension with the new speech processor. Significant benefits for speech comprehension were also demonstrated with the new speech processor when tested in competing background noise.In contrast, use of the Abbreviated Profile of Hearing Aid Benefit (APHAB) did not prove to be a suitably sensitive assessment tool for comparative subjective self-assessment of hearing benefits with each processor. Use of the preprocessing algorithm known as adaptive dynamic range optimization (ADRO) in the Freedom 24 led to additional improvements over the standard upgrade map for speech comprehension in quiet and showed equivalent performance in noise. Through use of the preprocessing beam-forming algorithm BEAM, subjects demonstrated a highly significant improved signal-to-noise ratio for speech comprehension thresholds (i.e., signal-to-noise ratio for 50% speech comprehension scores) when tested with an adaptive procedure using the Oldenburg
How does susceptibility to proactive interference relate to speech recognition in aided and unaided conditions?

Science.gov (United States)

Ellis, Rachel J; Rönnberg, Jerker

2015-01-01

Proactive interference (PI) is the capacity to resist interference to the acquisition of new memories from information stored in the long-term memory. Previous research has shown that PI correlates significantly with the speech-in-noise recognition scores of younger adults with normal hearing. In this study, we report the results of an experiment designed to investigate the extent to which tests of visual PI relate to the speech-in-noise recognition scores of older adults with hearing loss, in aided and unaided conditions. The results suggest that measures of PI correlate significantly with speech-in-noise recognition only in the unaided condition. Furthermore the relation between PI and speech-in-noise recognition differs to that observed in younger listeners without hearing loss. The findings suggest that the relation between PI tests and the speech-in-noise recognition scores of older adults with hearing loss relates to capability of the test to index cognitive flexibility.
How does susceptibility to proactive interference relate to speech recognition in aided and unaided conditions?

Directory of Open Access Journals (Sweden)

Rachel Jane Ellis

2015-08-01

Full Text Available Proactive interference (PI is the capacity to resist interference to the acquisition of new memories from information stored in the long-term memory. Previous research has shown that PI correlates significantly with the speech-in-noise recognition scores of younger adults with normal hearing. In this study, we report the results of an experiment designed to investigate the extent to which tests of visual PI relate to the speech-in-noise recognition scores of older adults with hearing loss, in aided and unaided conditions. The results suggest that measures of PI correlate significantly with speech-in-noise recognition only in the unaided condition. Furthermore the relation between PI and speech-in-noise recognition differs to that observed in younger listeners without hearing loss. The findings suggest that the relation between PI tests and the speech-in-noise recognition scores of older adults with hearing loss relates to capability of the test to index cognitive flexibility.
Successful and rapid response of speech bulb reduction program combined with speech therapy in velopharyngeal dysfunction: a case report.

Science.gov (United States)

Shin, Yu-Jeong; Ko, Seung-O

2015-12-01

Velopharyngeal dysfunction in cleft palate patients following the primary palate repair may result in nasal air emission, hypernasality, articulation disorder and poor intelligibility of speech. Among conservative treatment methods, speech aid prosthesis combined with speech therapy is widely used method. However because of its long time of treatment more than a year and low predictability, some clinicians prefer a surgical intervention. Thus, the purpose of this report was to increase an attention on the effectiveness of speech aid prosthesis by introducing a case that was successfully treated. In this clinical report, speech bulb reduction program with intensive speech therapy was applied for a patient with velopharyngeal dysfunction and it was rapidly treated by 5months which was unusually short period for speech aid therapy. Furthermore, advantages of pre-operative speech aid therapy were discussed.
Individual differences in selective attention predict speech identification at a cocktail party

Science.gov (United States)

Oberfeld, Daniel; Klöckner-Nowotny, Felicitas

2016-01-01

Listeners with normal hearing show considerable individual differences in speech understanding when competing speakers are present, as in a crowded restaurant. Here, we show that one source of this variance are individual differences in the ability to focus selective attention on a target stimulus in the presence of distractors. In 50 young normal-hearing listeners, the performance in tasks measuring auditory and visual selective attention was associated with sentence identification in the presence of spatially separated competing speakers. Together, the measures of selective attention explained a similar proportion of variance as the binaural sensitivity for the acoustic temporal fine structure. Working memory span, age, and audiometric thresholds showed no significant association with speech understanding. These results suggest that a reduced ability to focus attention on a target is one reason why some listeners with normal hearing sensitivity have difficulty communicating in situations with background noise. DOI: http://dx.doi.org/10.7554/eLife.16747.001 PMID:27580272
Cingulo-opercular activity affects incidental memory encoding for speech in noise.

Science.gov (United States)

Vaden, Kenneth I; Teubner-Rhodes, Susan; Ahlstrom, Jayne B; Dubno, Judy R; Eckert, Mark A

2017-08-15

Correctly understood speech in difficult listening conditions is often difficult to remember. A long-standing hypothesis for this observation is that the engagement of cognitive resources to aid speech understanding can limit resources available for memory encoding. This hypothesis is consistent with evidence that speech presented in difficult conditions typically elicits greater activity throughout cingulo-opercular regions of frontal cortex that are proposed to optimize task performance through adaptive control of behavior and tonic attention. However, successful memory encoding of items for delayed recognition memory tasks is consistently associated with increased cingulo-opercular activity when perceptual difficulty is minimized. The current study used a delayed recognition memory task to test competing predictions that memory encoding for words is enhanced or limited by the engagement of cingulo-opercular activity during challenging listening conditions. An fMRI experiment was conducted with twenty healthy adult participants who performed a word identification in noise task that was immediately followed by a delayed recognition memory task. Consistent with previous findings, word identification trials in the poorer signal-to-noise ratio condition were associated with increased cingulo-opercular activity and poorer recognition memory scores on average. However, cingulo-opercular activity decreased for correctly identified words in noise that were not recognized in the delayed memory test. These results suggest that memory encoding in difficult listening conditions is poorer when elevated cingulo-opercular activity is not sustained. Although increased attention to speech when presented in difficult conditions may detract from more active forms of memory maintenance (e.g., sub-vocal rehearsal), we conclude that task performance monitoring and/or elevated tonic attention supports incidental memory encoding in challenging listening conditions. Copyright © 2017
Use of amplitude modulation cues recovered from frequency modulation for cochlear implant users when original speech cues are severely degraded.

Science.gov (United States)

Won, Jong Ho; Shim, Hyun Joon; Lorenzi, Christian; Rubinstein, Jay T

2014-06-01

Won et al. (J Acoust Soc Am 132:1113-1119, 2012) reported that cochlear implant (CI) speech processors generate amplitude-modulation (AM) cues recovered from broadband speech frequency modulation (FM) and that CI users can use these cues for speech identification in quiet. The present study was designed to extend this finding for a wide range of listening conditions, where the original speech cues were severely degraded by manipulating either the acoustic signals or the speech processor. The manipulation of the acoustic signals included the presentation of background noise, simulation of reverberation, and amplitude compression. The manipulation of the speech processor included changing the input dynamic range and the number of channels. For each of these conditions, multiple levels of speech degradation were tested. Speech identification was measured for CI users and compared for stimuli having both AM and FM information (intact condition) or FM information only (FM condition). Each manipulation degraded speech identification performance for both intact and FM conditions. Performance for the intact and FM conditions became similar for stimuli having the most severe degradations. Identification performance generally overlapped for the intact and FM conditions. Moreover, identification performance for the FM condition was better than chance performance even at the maximum level of distortion. Finally, significant correlations were found between speech identification scores for the intact and FM conditions. Altogether, these results suggest that despite poor frequency selectivity, CI users can make efficient use of AM cues recovered from speech FM in difficult listening situations.
An experimental Dutch keyboard-to-speech system for the speech impaired

NARCIS (Netherlands)

Deliege, R.J.H.

1989-01-01

An experimental Dutch keyboard-to-speech system has been developed to explor the possibilities and limitations of Dutch speech synthesis in a communication aid for the speech impaired. The system uses diphones and a formant synthesizer chip for speech synthesis. Input to the system is in
Sound frequency affects speech emotion perception: results from congenital amusia.

Science.gov (United States)

Lolli, Sydney L; Lewenstein, Ari D; Basurto, Julian; Winnik, Sean; Loui, Psyche

2015-01-01

Congenital amusics, or "tone-deaf" individuals, show difficulty in perceiving and producing small pitch differences. While amusia has marked effects on music perception, its impact on speech perception is less clear. Here we test the hypothesis that individual differences in pitch perception affect judgment of emotion in speech, by applying low-pass filters to spoken statements of emotional speech. A norming study was first conducted on Mechanical Turk to ensure that the intended emotions from the Macquarie Battery for Evaluation of Prosody were reliably identifiable by US English speakers. The most reliably identified emotional speech samples were used in Experiment 1, in which subjects performed a psychophysical pitch discrimination task, and an emotion identification task under low-pass and unfiltered speech conditions. Results showed a significant correlation between pitch-discrimination threshold and emotion identification accuracy for low-pass filtered speech, with amusics (defined here as those with a pitch discrimination threshold >16 Hz) performing worse than controls. This relationship with pitch discrimination was not seen in unfiltered speech conditions. Given the dissociation between low-pass filtered and unfiltered speech conditions, we inferred that amusics may be compensating for poorer pitch perception by using speech cues that are filtered out in this manipulation. To assess this potential compensation, Experiment 2 was conducted using high-pass filtered speech samples intended to isolate non-pitch cues. No significant correlation was found between pitch discrimination and emotion identification accuracy for high-pass filtered speech. Results from these experiments suggest an influence of low frequency information in identifying emotional content of speech.
Low Delay Noise Reduction and Dereverberation for Hearing Aids

Directory of Open Access Journals (Sweden)

Heinrich W. Löllmann

2009-01-01

Full Text Available A new system for single-channel speech enhancement is proposed which achieves a joint suppression of late reverberant speech and background noise with a low signal delay and low computational complexity. It is based on a generalized spectral subtraction rule which depends on the variances of the late reverberant speech and background noise. The calculation of the spectral variances of the late reverberant speech requires an estimate of the reverberation time (RT which is accomplished by a maximum likelihood (ML approach. The enhancement with this blind RT estimation achieves almost the same speech quality as by using the actual RT. In comparison to commonly used post-filters in hearing aids which only perform a noise reduction, a significantly better objective and subjective speech quality is achieved. The proposed system performs time-domain filtering with coefficients adapted in the non-uniform (Bark-scaled frequency-domain. This allows to achieve a high speech quality with low signal delay which is important for speech enhancement in hearing aids or related applications such as hands-free communication systems.
Influence of hearing loss on children’s identification of spondee words in a speech-shaped noise or a two-talker masker

Science.gov (United States)

Leibold, Lori J.; Hillock-Dunn, Andrea; Duncan, Nicole; Roush, Patricia A.; Buss, Emily

2013-01-01

This study compared spondee identification performance in presence of speech-shaped noise or two competing talkers across children with hearing loss and age-matched children with normal hearing. The results showed a greater masking effect for children with hearing loss compared to children with normal hearing for both masker conditions. However, the magnitude of this group difference was significantly larger for the two-talker compared to the speech-shaped noise masker. These results support the hypothesis that hearing loss influences children’s perceptual processing abilities. PMID:23492919
Modern prescription theory and application: realistic expectations for speech recognition with hearing AIDS.

Science.gov (United States)

Johnson, Earl E

2013-01-01

A major decision at the time of hearing aid fitting and dispensing is the amount of amplification to provide listeners (both adult and pediatric populations) for the appropriate compensation of sensorineural hearing impairment across a range of frequencies (e.g., 160-10000 Hz) and input levels (e.g., 50-75 dB sound pressure level). This article describes modern prescription theory for hearing aids within the context of a risk versus return trade-off and efficient frontier analyses. The expected return of amplification recommendations (i.e., generic prescriptions such as National Acoustic Laboratories-Non-Linear 2, NAL-NL2, and Desired Sensation Level Multiple Input/Output, DSL m[i/o]) for the Speech Intelligibility Index (SII) and high-frequency audibility were traded against a potential risk (i.e., loudness). The modeled performance of each prescription was compared one with another and with the efficient frontier of normal hearing sensitivity (i.e., a reference point for the most return with the least risk). For the pediatric population, NAL-NL2 was more efficient for SII, while DSL m[i/o] was more efficient for high-frequency audibility. For the adult population, NAL-NL2 was more efficient for SII, while the two prescriptions were similar with regard to high-frequency audibility. In terms of absolute return (i.e., not considering the risk of loudness), however, DSL m[i/o] prescribed more outright high-frequency audibility than NAL-NL2 for either aged population, particularly, as hearing loss increased. Given the principles and demonstrated accuracy of desensitization (reduced utility of audibility with increasing hearing loss) observed at the group level, additional high-frequency audibility beyond that of NAL-NL2 is not expected to make further contributions to speech intelligibility (recognition) for the average listener.
Preliminary study of acoustic analysis for evaluating speech-aid oral prostheses: Characteristic dips in octave spectrum for comparison of nasality.

Science.gov (United States)

Chang, Yen-Liang; Hung, Chao-Ho; Chen, Po-Yueh; Chen, Wei-Chang; Hung, Shih-Han

2015-10-01

Acoustic analysis is often used in speech evaluation but seldom for the evaluation of oral prostheses designed for reconstruction of surgical defect. This study aimed to introduce the application of acoustic analysis for patients with velopharyngeal insufficiency (VPI) due to oral surgery and rehabilitated with oral speech-aid prostheses. The pre- and postprosthetic rehabilitation acoustic features of sustained vowel sounds from two patients with VPI were analyzed and compared with the acoustic analysis software Praat. There were significant differences in the octave spectrum of sustained vowel speech sound between the pre- and postprosthetic rehabilitation. Acoustic measurements of sustained vowels for patients before and after prosthetic treatment showed no significant differences for all parameters of fundamental frequency, jitter, shimmer, noise-to-harmonics ratio, formant frequency, F1 bandwidth, and band energy difference. The decrease in objective nasality perceptions correlated very well with the decrease in dips of the spectra for the male patient with a higher speech bulb height. Acoustic analysis may be a potential technique for evaluating the functions of oral speech-aid prostheses, which eliminates dysfunctions due to the surgical defect and contributes to a high percentage of intelligible speech. Octave spectrum analysis may also be a valuable tool for detecting changes in nasality characteristics of the voice during prosthetic treatment of VPI. Copyright © 2014. Published by Elsevier B.V.
Relationship between perceptual learning in speech and statistical learning in younger and older adults

Directory of Open Access Journals (Sweden)

Thordis Marisa Neger

2014-09-01

Full Text Available Within a few sentences, listeners learn to understand severely degraded speech such as noise-vocoded speech. However, individuals vary in the amount of such perceptual learning and it is unclear what underlies these differences. The present study investigates whether perceptual learning in speech relates to statistical learning, as sensitivity to probabilistic information may aid identification of relevant cues in novel speech input. If statistical learning and perceptual learning (partly draw on the same general mechanisms, then statistical learning in a non-auditory modality using non-linguistic sequences should predict adaptation to degraded speech.In the present study, 73 older adults (aged over 60 years and 60 younger adults (aged between 18 and 30 years performed a visual artificial grammar learning task and were presented with sixty meaningful noise-vocoded sentences in an auditory recall task. Within age groups, sentence recognition performance over exposure was analyzed as a function of statistical learning performance, and other variables that may predict learning (i.e., hearing, vocabulary, attention switching control, working memory and processing speed. Younger and older adults showed similar amounts of perceptual learning, but only younger adults showed significant statistical learning. In older adults, improvement in understanding noise-vocoded speech was constrained by age. In younger adults, amount of adaptation was associated with lexical knowledge and with statistical learning ability. Thus, individual differences in general cognitive abilities explain listeners' variability in adapting to noise-vocoded speech. Results suggest that perceptual and statistical learning share mechanisms of implicit regularity detection, but that the ability to detect statistical regularities is impaired in older adults if visual sequences are presented quickly.
Double Fourier analysis for Emotion Identification in Voiced Speech

International Nuclear Information System (INIS)

Sierra-Sosa, D.; Bastidas, M.; Ortiz P, D.; Quintero, O.L.

2016-01-01

We propose a novel analysis alternative, based on two Fourier Transforms for emotion recognition from speech. Fourier analysis allows for display and synthesizes different signals, in terms of power spectral density distributions. A spectrogram of the voice signal is obtained performing a short time Fourier Transform with Gaussian windows, this spectrogram portraits frequency related features, such as vocal tract resonances and quasi-periodic excitations during voiced sounds. Emotions induce such characteristics in speech, which become apparent in spectrogram time-frequency distributions. Later, the signal time-frequency representation from spectrogram is considered an image, and processed through a 2-dimensional Fourier Transform in order to perform the spatial Fourier analysis from it. Finally features related with emotions in voiced speech are extracted and presented. (paper)
Signal-to-Signal Ratio Independent Speaker Identification for Co-channel Speech Signals

DEFF Research Database (Denmark)

Saeidi, Rahim; Mowlaee, Pejman; Kinnunen, Tomi

2010-01-01

In this paper, we consider speaker identification for the co-channel scenario in which speech mixture from speakers is recorded by one microphone only. The goal is to identify both of the speakers from their mixed signal. High recognition accuracies have already been reported when an accurately...
Sound frequency affects speech emotion perception: Results from congenital amusia

Directory of Open Access Journals (Sweden)

Sydney eLolli

2015-09-01

Full Text Available Congenital amusics, or tone-deaf individuals, show difficulty in perceiving and producing small pitch differences. While amusia has marked effects on music perception, its impact on speech perception is less clear. Here we test the hypothesis that individual differences in pitch perception affect judgment of emotion in speech, by applying band-pass filters to spoken statements of emotional speech. A norming study was first conducted on Mechanical Turk to ensure that the intended emotions from the Macquarie Battery for Evaluation of Prosody (MBEP were reliably identifiable by US English speakers. The most reliably identified emotional speech samples were used in in Experiment 1, in which subjects performed a psychophysical pitch discrimination task, and an emotion identification task under band-pass and unfiltered speech conditions. Results showed a significant correlation between pitch discrimination threshold and emotion identification accuracy for band-pass filtered speech, with amusics (defined here as those with a pitch discrimination threshold > 16 Hz performing worse than controls. This relationship with pitch discrimination was not seen in unfiltered speech conditions. Given the dissociation between band-pass filtered and unfiltered speech conditions, we inferred that amusics may be compensating for poorer pitch perception by using speech cues that are filtered out in this manipulation.
Comparison of speech perception performance between Sprint/Esprit 3G and Freedom processors in children implanted with nucleus cochlear implants.

Science.gov (United States)

Santarelli, Rosamaria; Magnavita, Vincenzo; De Filippi, Roberta; Ventura, Laura; Genovese, Elisabetta; Arslan, Edoardo

2009-04-01

To compare speech perception performance in children fitted with previous generation Nucleus sound processor, Sprint or Esprit 3G, and the Freedom, the most recently released system from the Cochlear Corporation that features a larger input dynamic range. Prospective intrasubject comparative study. University Medical Center. Seventeen prelingually deafened children who had received the Nucleus 24 cochlear implant and used the Sprint or Esprit 3G sound processor. Cochlear implantation with Cochlear device. Speech perception was evaluated at baseline (Sprint, n = 11; Esprit 3G, n = 6) and after 1 month's experience with the Freedom sound processor. Identification and recognition of disyllabic words and identification of vowels were performed via recorded voice in quiet (70 dB [A]), in the presence of background noise at various levels of signal-to-noise ratio (+10, +5, 0, -5) and at a soft presentation level (60 dB [A]). Consonant identification and recognition of disyllabic words, trisyllabic words, and sentences were evaluated in live voice. Frequency discrimination was measured in a subset of subjects (n = 5) by using an adaptive, 3-interval, 3-alternative, forced-choice procedure. Identification of disyllabic words administered at a soft presentation level showed a significant increase when switching to the Freedom compared with the previously worn processor in children using the Sprint or Esprit 3G. Identification and recognition of disyllabic words in the presence of background noise as well as consonant identification and sentence recognition increased significantly for the Freedom compared with the previously worn device only in children fitted with the Sprint. Frequency discrimination was significantly better when switching to the Freedom compared with the previously worn processor. Serial comparisons revealed that that speech perception performance evaluated in children aged 5 to 15 years was superior with the Freedom than previous generations of Nucleus
Research of Features of the Phonetic System of Speech and Identification of Announcers on the Voice

Directory of Open Access Journals (Sweden)

Roman Aleksandrovich Vasilyev

2013-02-01

Full Text Available In the work the method of the phonetic analysis of speech — allocation of the list of elementary speech units such as separate phonemes from a continuous stream of informal conversation of the specific announcer is offered. The practical algorithm of identification of the announcer — process of definition speaking of the set of announcers is described.
Joint Single-Channel Speech Separation and Speaker Identification

DEFF Research Database (Denmark)

Mowlaee, Pejman; Saeidi, Rahim; Tan, Zheng-Hua

2010-01-01

In this paper, we propose a closed loop system to improve the performance of single-channel speech separation in a speaker independent scenario. The system is composed of two interconnected blocks: a separation block and a speaker identiſcation block. The improvement is accomplished by incorporat......In this paper, we propose a closed loop system to improve the performance of single-channel speech separation in a speaker independent scenario. The system is composed of two interconnected blocks: a separation block and a speaker identiſcation block. The improvement is accomplished...... enhances the quality of the separated output signals. To assess the improvements, the results are reported in terms of PESQ for both target and masked signals....

Enhancement of speech signals - with a focus on voiced speech models

DEFF Research Database (Denmark)

Nørholm, Sidsel Marie

This thesis deals with speech enhancement, i.e., noise reduction in speech signals. This has applications in, e.g., hearing aids and teleconference systems. We consider a signal-driven approach to speech enhancement where a model of the speech is assumed and filters are generated based...... on this model. The basic model used in this thesis is the harmonic model which is a commonly used model for describing the voiced part of the speech signal. We show that it can be beneficial to extend the model to take inharmonicities or the non-stationarity of speech into account. Extending the model...
Speech Understanding with a New Implant Technology: A Comparative Study with a New Nonskin Penetrating Baha System

Directory of Open Access Journals (Sweden)

Anja Kurz

2014-01-01

Full Text Available Objective. To compare hearing and speech understanding between a new, nonskin penetrating Baha system (Baha Attract to the current Baha system using a skin-penetrating abutment. Methods. Hearing and speech understanding were measured in 16 experienced Baha users. The transmission path via the abutment was compared to a simulated Baha Attract transmission path by attaching the implantable magnet to the abutment and then by adding a sample of artificial skin and the external parts of the Baha Attract system. Four different measurements were performed: bone conduction thresholds directly through the sound processor (BC Direct, aided sound field thresholds, aided speech understanding in quiet, and aided speech understanding in noise. Results. The simulated Baha Attract transmission path introduced an attenuation starting from approximately 5 dB at 1000 Hz, increasing to 20–25 dB above 6000 Hz. However, aided sound field threshold shows smaller differences and aided speech understanding in quiet and in noise does not differ significantly between the two transmission paths. Conclusion. The Baha Attract system transmission path introduces predominately high frequency attenuation. This attenuation can be partially compensated by adequate fitting of the speech processor. No significant decrease in speech understanding in either quiet or in noise was found.
Hearing aid processing strategies for listeners with different auditory profiles: Insights from the BEAR project

DEFF Research Database (Denmark)

Wu, Mengfan; El-Haj-Ali, Mouhamad; Sanchez Lopez, Raul

hearing aid settings that differed in terms of signal-to-noise ratio (SNR) improvement and temporal and spectral speech distortions were selected for testing based on a comprehensive technical evaluation of different parameterisations of the hearing aid simulator. Speech-in-noise perception was assessed...... stimulus comparison paradigm. RESULTS We hypothesize that the perceptual outcomes from the six hearing aid settings will differ across listeners with different auditory profiles. More specifically, we expect listeners showing high sensitivity to temporal and spectral differences to perform best with and....../or to favour hearing aid settings that preserve those cues. In contrast, we expect listeners showing low sensitivity to temporal and spectral differences to perform best with and/or to favour settings that maximize SNR improvement, independent of any additional speech distortions. Altogether, we anticipate...
Speech-specificity of two audiovisual integration effects

DEFF Research Database (Denmark)

Eskelund, Kasper; Tuomainen, Jyrki; Andersen, Tobias

2010-01-01

Seeing the talker’s articulatory mouth movements can influence the auditory speech percept both in speech identification and detection tasks. Here we show that these audiovisual integration effects also occur for sine wave speech (SWS), which is an impoverished speech signal that naïve observers...... often fail to perceive as speech. While audiovisual integration in the identification task only occurred when observers were informed of the speech-like nature of SWS, integration occurred in the detection task both for informed and naïve observers. This shows that both speech-specific and general...... mechanisms underlie audiovisual integration of speech....
Impact of Hearing Aid Technology on Outcomes in Daily Life II: Speech Understanding and Listening Effort.

Science.gov (United States)

Johnson, Jani A; Xu, Jingjing; Cox, Robyn M

2016-01-01

Modern hearing aid (HA) devices include a collection of acoustic signal-processing features designed to improve listening outcomes in a variety of daily auditory environments. Manufacturers market these features at successive levels of technological sophistication. The features included in costlier premium hearing devices are designed to result in further improvements to daily listening outcomes compared with the features included in basic hearing devices. However, independent research has not substantiated such improvements. This research was designed to explore differences in speech-understanding and listening-effort outcomes for older adults using premium-feature and basic-feature HAs in their daily lives. For this participant-blinded, repeated, crossover trial 45 older adults (mean age 70.3 years) with mild-to-moderate sensorineural hearing loss wore each of four pairs of bilaterally fitted HAs for 1 month. HAs were premium- and basic-feature devices from two major brands. After each 1-month trial, participants' speech-understanding and listening-effort outcomes were evaluated in the laboratory and in daily life. Three types of speech-understanding and listening-effort data were collected: measures of laboratory performance, responses to standardized self-report questionnaires, and participant diary entries about daily communication. The only statistically significant superiority for the premium-feature HAs occurred for listening effort in the loud laboratory condition and was demonstrated for only one of the tested brands. The predominant complaint of older adults with mild-to-moderate hearing impairment is difficulty understanding speech in various settings. The combined results of all the outcome measures used in this research suggest that, when fitted using scientifically based practices, both premium- and basic-feature HAs are capable of providing considerable, but essentially equivalent, improvements to speech understanding and listening effort in daily
The mechanism of speech processing in congenital amusia: evidence from Mandarin speakers.

Directory of Open Access Journals (Sweden)

Fang Liu

Full Text Available Congenital amusia is a neuro-developmental disorder of pitch perception that causes severe problems with music processing but only subtle difficulties in speech processing. This study investigated speech processing in a group of Mandarin speakers with congenital amusia. Thirteen Mandarin amusics and thirteen matched controls participated in a set of tone and intonation perception tasks and two pitch threshold tasks. Compared with controls, amusics showed impaired performance on word discrimination in natural speech and their gliding tone analogs. They also performed worse than controls on discriminating gliding tone sequences derived from statements and questions, and showed elevated thresholds for pitch change detection and pitch direction discrimination. However, they performed as well as controls on word identification, and on statement-question identification and discrimination in natural speech. Overall, tasks that involved multiple acoustic cues to communicative meaning were not impacted by amusia. Only when the tasks relied mainly on pitch sensitivity did amusics show impaired performance compared to controls. These findings help explain why amusia only affects speech processing in subtle ways. Further studies on a larger sample of Mandarin amusics and on amusics of other language backgrounds are needed to consolidate these results.
The mechanism of speech processing in congenital amusia: evidence from Mandarin speakers.

Science.gov (United States)

Liu, Fang; Jiang, Cunmei; Thompson, William Forde; Xu, Yi; Yang, Yufang; Stewart, Lauren

2012-01-01

Congenital amusia is a neuro-developmental disorder of pitch perception that causes severe problems with music processing but only subtle difficulties in speech processing. This study investigated speech processing in a group of Mandarin speakers with congenital amusia. Thirteen Mandarin amusics and thirteen matched controls participated in a set of tone and intonation perception tasks and two pitch threshold tasks. Compared with controls, amusics showed impaired performance on word discrimination in natural speech and their gliding tone analogs. They also performed worse than controls on discriminating gliding tone sequences derived from statements and questions, and showed elevated thresholds for pitch change detection and pitch direction discrimination. However, they performed as well as controls on word identification, and on statement-question identification and discrimination in natural speech. Overall, tasks that involved multiple acoustic cues to communicative meaning were not impacted by amusia. Only when the tasks relied mainly on pitch sensitivity did amusics show impaired performance compared to controls. These findings help explain why amusia only affects speech processing in subtle ways. Further studies on a larger sample of Mandarin amusics and on amusics of other language backgrounds are needed to consolidate these results.
The Effects of Background Noise on the Performance of an Automatic Speech Recogniser

Science.gov (United States)

Littlefield, Jason; HashemiSakhtsari, Ahmad

2002-11-01

Ambient or environmental noise is a major factor that affects the performance of an automatic speech recognizer. Large vocabulary, speaker-dependent, continuous speech recognizers are commercially available. Speech recognizers, perform well in a quiet environment, but poorly in a noisy environment. Speaker-dependent speech recognizers require training prior to them being tested, where the level of background noise in both phases affects the performance of the recognizer. This study aims to determine whether the best performance of a speech recognizer occurs when the levels of background noise during the training and test phases are the same, and how the performance is affected when the levels of background noise during the training and test phases are different. The relationship between the performance of the speech recognizer and upgrading the computer speed and amount of memory as well as software version was also investigated.
Auditory, Visual, and Auditory-Visual Speech Perception by Individuals with Cochlear Implants versus Individuals with Hearing Aids

Science.gov (United States)

Most, Tova; Rothem, Hilla; Luntz, Michal

2009-01-01

The researchers evaluated the contribution of cochlear implants (CIs) to speech perception by a sample of prelingually deaf individuals implanted after age 8 years. This group was compared with a group with profound hearing impairment (HA-P), and with a group with severe hearing impairment (HA-S), both of which used hearing aids. Words and…
Modeling Driving Performance Using In-Vehicle Speech Data From a Naturalistic Driving Study.

Science.gov (United States)

Kuo, Jonny; Charlton, Judith L; Koppel, Sjaan; Rudin-Brown, Christina M; Cross, Suzanne

2016-09-01

We aimed to (a) describe the development and application of an automated approach for processing in-vehicle speech data from a naturalistic driving study (NDS), (b) examine the influence of child passenger presence on driving performance, and (c) model this relationship using in-vehicle speech data. Parent drivers frequently engage in child-related secondary behaviors, but the impact on driving performance is unknown. Applying automated speech-processing techniques to NDS audio data would facilitate the analysis of in-vehicle driver-child interactions and their influence on driving performance. Speech activity detection and speaker diarization algorithms were applied to audio data from a Melbourne-based NDS involving 42 families. Multilevel models were developed to evaluate the effect of speech activity and the presence of child passengers on driving performance. Speech activity was significantly associated with velocity and steering angle variability. Child passenger presence alone was not associated with changes in driving performance. However, speech activity in the presence of two child passengers was associated with the most variability in driving performance. The effects of in-vehicle speech on driving performance in the presence of child passengers appear to be heterogeneous, and multiple factors may need to be considered in evaluating their impact. This goal can potentially be achieved within large-scale NDS through the automated processing of observational data, including speech. Speech-processing algorithms enable new perspectives on driving performance to be gained from existing NDS data, and variables that were once labor-intensive to process can be readily utilized in future research. © 2016, Human Factors and Ergonomics Society.
Speech perception in older hearing impaired listeners: benefits of perceptual training.

Directory of Open Access Journals (Sweden)

David L Woods

Full Text Available Hearing aids (HAs only partially restore the ability of older hearing impaired (OHI listeners to understand speech in noise, due in large part to persistent deficits in consonant identification. Here, we investigated whether adaptive perceptual training would improve consonant-identification in noise in sixteen aided OHI listeners who underwent 40 hours of computer-based training in their homes. Listeners identified 20 onset and 20 coda consonants in 9,600 consonant-vowel-consonant (CVC syllables containing different vowels (/ɑ/, /i/, or /u/ and spoken by four different talkers. Consonants were presented at three consonant-specific signal-to-noise ratios (SNRs spanning a 12 dB range. Noise levels were adjusted over training sessions based on d' measures. Listeners were tested before and after training to measure (1 changes in consonant-identification thresholds using syllables spoken by familiar and unfamiliar talkers, and (2 sentence reception thresholds (SeRTs using two different sentence tests. Consonant-identification thresholds improved gradually during training. Laboratory tests of d' thresholds showed an average improvement of 9.1 dB, with 94% of listeners showing statistically significant training benefit. Training normalized consonant confusions and improved the thresholds of some consonants into the normal range. Benefits were equivalent for onset and coda consonants, syllables containing different vowels, and syllables presented at different SNRs. Greater training benefits were found for hard-to-identify consonants and for consonants spoken by familiar than unfamiliar talkers. SeRTs, tested with simple sentences, showed less elevation than consonant-identification thresholds prior to training and failed to show significant training benefit, although SeRT improvements did correlate with improvements in consonant thresholds. We argue that the lack of SeRT improvement reflects the dominant role of top-down semantic processing in
Study of the Ability of Articulation Index (Al for Predicting the Unaided and Aided Speech Recognition Performance of 25 to 65 Years Old Hearing-Impaired Adults

Directory of Open Access Journals (Sweden)

Ghasem Mohammad Khani

2001-05-01

Full Text Available Background: In recent years there has been increased interest in the use of Al for assessing hearing handicap and for measuring the potential effectiveness of amplification system. AI is an expression of proportion of average speech signal that is audible to a given patient, and it can vary between 0.0 to 1.0. Method and Materials: This cross-sectional analytical study was carried out in department of audiology, rehabilitation, faculty, IUMS form 31 Oct 98 to 7 March 1999, on 40 normal hearing persons (80 ears; 19 males and 21 females and 40 hearing impaired persons (61 ears; 36 males and 25 females, 25-65 years old with moderate to moderately severe SNI-IL The pavlovic procedure (1988 for calculating Al, open set taped standard mono syllabic word lists, and the real -ear probe- tube microphone system to measure insertion gain were used, through test-retest. Results: 1/A significant correlation was shown between the Al scores and the speech recognition scores of normal hearing and hearing-impaired group with and without the hearing aid (P<0.05 2/ There was no significant differences in age group & sex: also 3 In test-retest measures of the insertion gain in each test and 4/No significant in test-retest of speech recognition test score. Conclusion: According to these results the Al can predict the unaided and aided monosyllabic recognition test scores very well, and age and sex variables have no effect on its ability. Therefore with respect to high reliability of the Al results and its simplicity, easy -to- use, cost effective, and little time consuming for calculation, its recommended the wide use of the Al, especially in clinical situation.
Speech, stone tool-making and the evolution of language.

Science.gov (United States)

Cataldo, Dana Michelle; Migliano, Andrea Bamberg; Vinicius, Lucio

2018-01-01

The 'technological hypothesis' proposes that gestural language evolved in early hominins to enable the cultural transmission of stone tool-making skills, with speech appearing later in response to the complex lithic industries of more recent hominins. However, no flintknapping study has assessed the efficiency of speech alone (unassisted by gesture) as a tool-making transmission aid. Here we show that subjects instructed by speech alone underperform in stone tool-making experiments in comparison to subjects instructed through either gesture alone or 'full language' (gesture plus speech), and also report lower satisfaction with their received instruction. The results provide evidence that gesture was likely to be selected over speech as a teaching aid in the earliest hominin tool-makers; that speech could not have replaced gesturing as a tool-making teaching aid in later hominins, possibly explaining the functional retention of gesturing in the full language of modern humans; and that speech may have evolved for reasons unrelated to tool-making. We conclude that speech is unlikely to have evolved as tool-making teaching aid superior to gesture, as claimed by the technological hypothesis, and therefore alternative views should be considered. For example, gestural language may have evolved to enable tool-making in earlier hominins, while speech may have later emerged as a response to increased trade and more complex inter- and intra-group interactions in Middle Pleistocene ancestors of Neanderthals and Homo sapiens; or gesture and speech may have evolved in parallel rather than in sequence.
Evaluating the Performance of a Visually Guided Hearing Aid Using a Dynamic Auditory-Visual Word Congruence Task.

Science.gov (United States)

Roverud, Elin; Best, Virginia; Mason, Christine R; Streeter, Timothy; Kidd, Gerald

2017-12-15

The "visually guided hearing aid" (VGHA), consisting of a beamforming microphone array steered by eye gaze, is an experimental device being tested for effectiveness in laboratory settings. Previous studies have found that beamforming without visual steering can provide significant benefits (relative to natural binaural listening) for speech identification in spatialized speech or noise maskers when sound sources are fixed in location. The aim of the present study was to evaluate the performance of the VGHA in listening conditions in which target speech could switch locations unpredictably, requiring visual steering of the beamforming. To address this aim, the present study tested an experimental simulation of the VGHA in a newly designed dynamic auditory-visual word congruence task. Ten young normal-hearing (NH) and 11 young hearing-impaired (HI) adults participated. On each trial, three simultaneous spoken words were presented from three source positions (-30, 0, and 30 azimuth). An auditory-visual word congruence task was used in which participants indicated whether there was a match between the word printed on a screen at a location corresponding to the target source and the spoken target word presented acoustically from that location. Performance was compared for a natural binaural condition (stimuli presented using impulse responses measured on KEMAR), a simulated VGHA condition (BEAM), and a hybrid condition that combined lowpass-filtered KEMAR and highpass-filtered BEAM information (BEAMAR). In some blocks, the target remained fixed at one location across trials, and in other blocks, the target could transition in location between one trial and the next with a fixed but low probability. Large individual variability in performance was observed. There were significant benefits for the hybrid BEAMAR condition relative to the KEMAR condition on average for both NH and HI groups when the targets were fixed. Although not apparent in the averaged data, some
Automatic Speech Signal Analysis for Clinical Diagnosis and Assessment of Speech Disorders

CERN Document Server

Baghai-Ravary, Ladan

2013-01-01

Automatic Speech Signal Analysis for Clinical Diagnosis and Assessment of Speech Disorders provides a survey of methods designed to aid clinicians in the diagnosis and monitoring of speech disorders such as dysarthria and dyspraxia, with an emphasis on the signal processing techniques, statistical validity of the results presented in the literature, and the appropriateness of methods that do not require specialized equipment, rigorously controlled recording procedures or highly skilled personnel to interpret results. Such techniques offer the promise of a simple and cost-effective, yet objective, assessment of a range of medical conditions, which would be of great value to clinicians. The ideal scenario would begin with the collection of examples of the clients’ speech, either over the phone or using portable recording devices operated by non-specialist nursing staff. The recordings could then be analyzed initially to aid diagnosis of conditions, and subsequently to monitor the clients’ progress and res...
Hearing speech in music.

Science.gov (United States)

Ekström, Seth-Reino; Borg, Erik

2011-01-01

The masking effect of a piano composition, played at different speeds and in different octaves, on speech-perception thresholds was investigated in 15 normal-hearing and 14 moderately-hearing-impaired subjects. Running speech (just follow conversation, JFC) testing and use of hearing aids increased the everyday validity of the findings. A comparison was made with standard audiometric noises [International Collegium of Rehabilitative Audiology (ICRA) noise and speech spectrum-filtered noise (SPN)]. All masking sounds, music or noise, were presented at the same equivalent sound level (50 dBA). The results showed a significant effect of piano performance speed and octave (Ptempo had the largest effect; and high octave and slow tempo, the smallest. Music had a lower masking effect than did ICRA noise with two or six speakers at normal vocal effort (Pmusic offers an interesting opportunity for studying masking under realistic conditions, where spectral and temporal features can be varied independently. The results have implications for composing music with vocal parts, designing acoustic environments and creating a balance between speech perception and privacy in social settings.
Differences in speech processing among elderly hearing-impaired listeners with or without hearing aid experience: Eye-tracking and fMRI measurements

DEFF Research Database (Denmark)

Habicht, Julia; Behler, Oliver; Kollmeier, Birger

2017-01-01

In contrast to the effects of hearing loss, the effects of hearing aid (HA) experience on speech-in-noise (SIN) processing are underexplored. Using an eye-tracking paradigm that allows determining how fast a participant can grasp the meaning of a sentence presented in noise together with two pict...... support the idea that HA experience positively influences the ability to process SIN quickly and that it reduces the recruitment of brain regions outside the core speech-comprehension network....
FBI fingerprint identification automation study. AIDS 3 evaluation report. Volume 4: Economic feasibility

Science.gov (United States)

Mulhall, B. D. L.

1980-01-01

The results of the economic analysis of the AIDS 3 system design are presented. AIDS 3 evaluated a set of economic feasibility measures including life cycle cost, implementation cost, annual operating expenditures and annual capital expenditures. The economic feasibility of AIDS 3 was determined by comparing the evaluated measures with the same measures, where applicable, evaluated for the current system. A set of future work load scenarios was constructed using JPL's environmental evaluation study of the fingerprint identification system. AIDS 3 and the current system were evaluated for each of the economic feasibility measures for each of the work load scenarios. They were compared for a set of performance measures, including response time and accuracy, and for a set of cost/benefit ratios, including cost per transaction and cost per technical search. Benefit measures related to the economic feasibility of the system are also presented, including the required number of employees and the required employee skill mix.
Social Anxiety, Affect, Cortisol Response and Performance on a Speech Task.

Science.gov (United States)

Losiak, Wladyslaw; Blaut, Agata; Klosowska, Joanna; Slowik, Natalia

2016-01-01

Social anxiety is characterized by increased emotional reactivity to social stimuli, but results of studies focusing on affective reactions of socially anxious subjects in the situation of social exposition are inconclusive, especially in the case of endocrinological measures of affect. This study was designed to examine individual differences in endocrinological and affective reactions to social exposure as well as in performance on a speech task in a group of students (n = 44) comprising subjects with either high or low levels of social anxiety. Measures of salivary cortisol and positive and negative affect were taken before and after an impromptu speech. Self-ratings and observer ratings of performance were also obtained. Cortisol levels and negative affect increased in both groups after the speech task, and positive affect decreased; however, group × affect interactions were not significant. Assessments conducted after the speech task revealed that highly socially anxious participants had lower observer ratings of performance while cortisol increase and changes in self-reported affect were not related to performance. Socially anxious individuals do not differ from nonanxious individuals in affective reactions to social exposition, but reveal worse performance at a speech task. © 2015 S. Karger AG, Basel.
High-performance speech recognition using consistency modeling

Science.gov (United States)

Digalakis, Vassilios; Murveit, Hy; Monaco, Peter; Neumeyer, Leo; Sankar, Ananth

1994-12-01

The goal of SRI's consistency modeling project is to improve the raw acoustic modeling component of SRI's DECIPHER speech recognition system and develop consistency modeling technology. Consistency modeling aims to reduce the number of improper independence assumptions used in traditional speech recognition algorithms so that the resulting speech recognition hypotheses are more self-consistent and, therefore, more accurate. At the initial stages of this effort, SRI focused on developing the appropriate base technologies for consistency modeling. We first developed the Progressive Search technology that allowed us to perform large-vocabulary continuous speech recognition (LVCSR) experiments. Since its conception and development at SRI, this technique has been adopted by most laboratories, including other ARPA contracting sites, doing research on LVSR. Another goal of the consistency modeling project is to attack difficult modeling problems, when there is a mismatch between the training and testing phases. Such mismatches may include outlier speakers, different microphones and additive noise. We were able to either develop new, or transfer and evaluate existing, technologies that adapted our baseline genonic HMM recognizer to such difficult conditions.

Temporal visual cues aid speech recognition

DEFF Research Database (Denmark)

Zhou, Xiang; Ross, Lars; Lehn-Schiøler, Tue

2006-01-01

of audio to generate an artificial talking-face video and measured word recognition performance on simple monosyllabic words. RESULTS: When presenting words together with the artificial video we find that word recognition is improved over purely auditory presentation. The effect is significant (p......BACKGROUND: It is well known that under noisy conditions, viewing a speaker's articulatory movement aids the recognition of spoken words. Conventionally it is thought that the visual input disambiguates otherwise confusing auditory input. HYPOTHESIS: In contrast we hypothesize...... that it is the temporal synchronicity of the visual input that aids parsing of the auditory stream. More specifically, we expected that purely temporal information, which does not convey information such as place of articulation may facility word recognition. METHODS: To test this prediction we used temporal features...
Maximum likelihood based multi-channel isotropic reverberation reduction for hearing aids

DEFF Research Database (Denmark)

Kuklasiński, Adam; Doclo, Simon; Jensen, Søren Holdt

2014-01-01

We propose a multi-channel Wiener filter for speech dereverberation in hearing aids. The proposed algorithm uses joint maximum likelihood estimation of the speech and late reverberation spectral variances, under the assumption that the late reverberant sound field is cylindrically isotropic....... The dereverberation performance of the algorithm is evaluated using computer simulations with realistic hearing aid microphone signals including head-related effects. The algorithm is shown to work well with signals reverberated both by synthetic and by measured room impulse responses, achieving improvements...
Selling health data: de-identification, privacy, and speech.

Science.gov (United States)

Kaplan, Bonnie

2015-07-01

Two court cases that involve selling prescription data for pharmaceutical marketing affect biomedical informatics, patient and clinician privacy, and regulation. Sorrell v. IMS Health Inc. et al. in the United States and R v. Department of Health, Ex Parte Source Informatics Ltd. in the United Kingdom concern privacy and health data protection, data de-identification and reidentification, drug detailing (marketing), commercial benefit from the required disclosure of personal information, clinician privacy and the duty of confidentiality, beneficial and unsavory uses of health data, regulating health technologies, and considering data as speech. Individuals should, at the very least, be aware of how data about them are collected and used. Taking account of how those data are used is needed so societal norms and law evolve ethically as new technologies affect health data privacy and protection.
Performance Aided Design

DEFF Research Database (Denmark)

Parigi, Dario

2014-01-01

paradigm where the increasing integration of parametric tools and performative analysis is changing the way we learn and design. The term Performance Aided Architectural Design (PAD) is proposed at the Master of Science of Architecture and Design at Aalborg University, with the aim of extending a tectonic...... tradition of architecture with computational tools, preparing the basis for the creation of the figure of a modern master builder, sitting at the boundary of the disciplines of architecture and engineering. Performance Aided Design focuses on the role of performative analysis, embedded tectonics......, and computational methods tools to trigger creativity and innovative understanding of relation between form material and a increasingly wide range of performances in architectural design. The ultimate goal is to pursue a design approach that aims at embracing rather than excluding the complexity implicit...
Speech coding, reconstruction and recognition using acoustics and electromagnetic waves

International Nuclear Information System (INIS)

Holzrichter, J.F.; Ng, L.C.

1998-01-01

The use of EM radiation in conjunction with simultaneously recorded acoustic speech information enables a complete mathematical coding of acoustic speech. The methods include the forming of a feature vector for each pitch period of voiced speech and the forming of feature vectors for each time frame of unvoiced, as well as for combined voiced and unvoiced speech. The methods include how to deconvolve the speech excitation function from the acoustic speech output to describe the transfer function each time frame. The formation of feature vectors defining all acoustic speech units over well defined time frames can be used for purposes of speech coding, speech compression, speaker identification, language-of-speech identification, speech recognition, speech synthesis, speech translation, speech telephony, and speech teaching. 35 figs
Speech coding, reconstruction and recognition using acoustics and electromagnetic waves

Science.gov (United States)

Holzrichter, John F.; Ng, Lawrence C.

1998-01-01

The use of EM radiation in conjunction with simultaneously recorded acoustic speech information enables a complete mathematical coding of acoustic speech. The methods include the forming of a feature vector for each pitch period of voiced speech and the forming of feature vectors for each time frame of unvoiced, as well as for combined voiced and unvoiced speech. The methods include how to deconvolve the speech excitation function from the acoustic speech output to describe the transfer function each time frame. The formation of feature vectors defining all acoustic speech units over well defined time frames can be used for purposes of speech coding, speech compression, speaker identification, language-of-speech identification, speech recognition, speech synthesis, speech translation, speech telephony, and speech teaching.
Sound localization and speech identification in the frontal median plane with a hear-through headset

DEFF Research Database (Denmark)

Hoffmann, Pablo F.; Møller, Anders Kalsgaard; Christensen, Flemming

2014-01-01

signals can be superimposed via earphone reproduction. An important aspect of the hear-through headset is its transparency, i.e. how close to real life can the electronically amplied sounds be perceived. Here we report experiments conducted to evaluate the auditory transparency of a hear-through headset...... prototype by comparing human performance in natural, hear-through, and fully occluded conditions for two spatial tasks: frontal vertical-plane sound localization and speech-on-speech spatial release from masking. Results showed that localization performance was impaired by the hear-through headset relative...... to the natural condition though not as much as in the fully occluded condition. Localization was affected the least when the sound source was in front of the listeners. Different from the vertical localization performance, results from the speech task suggest that normal speech-on-speech spatial release from...
Extraction Of Electronic Evidence From VoIP: Identification & Analysis Of Digital Speech

Directory of Open Access Journals (Sweden)

David Irwin

2012-09-01

Full Text Available The Voice over Internet Protocol (VoIP is increasing in popularity as a cost effective and efficient means of making telephone calls via the Internet. However, VoIP may also be an attractive method of communication to criminals as their true identity may be hidden and voice and video communications are encrypted as they are deployed across the Internet. This produces in a new set of challenges for forensic analysts compared with traditional wire-tapping of the Public Switched Telephone Network (PSTN infrastructure, which is not applicable to VoIP. Therefore, other methods of recovering electronic evidence from VoIP are required.Â This research investigates the analysis and recovery of digitised human, which persists in computer memory after a VoIP call.This paper proposes a proof of concept how remnants of digitised human speech from a VoIP call may be identified within a forensic memory capture based on how the human voice is detected via a microphone and encoded to a digital format using the sound card of your personal computer. This digital format is unencrypted whist processed in Random Access Memory (RAM before it is passed to the VoIP application for encryption and Â transmission over the Internet. Similarly, an incoming encrypted VoIP call is decrypted by the VoIP application and passes through RAM unencrypted in order to be played via the speaker output.A series of controlled tests were undertaken whereby RAM captures were analysed for remnants of digital speech after a VoIP audio call with known conversation. The identification and analysis of digital speech from RAM attempts to construct an automatic process for the identification and subsequent reconstruction of the audio content of a VoIP call.
Predicting the perceived sound quality of frequency-compressed speech.

Directory of Open Access Journals (Sweden)

Rainer Huber

Full Text Available The performance of objective speech and audio quality measures for the prediction of the perceived quality of frequency-compressed speech in hearing aids is investigated in this paper. A number of existing quality measures have been applied to speech signals processed by a hearing aid, which compresses speech spectra along frequency in order to make information contained in higher frequencies audible for listeners with severe high-frequency hearing loss. Quality measures were compared with subjective ratings obtained from normal hearing and hearing impaired children and adults in an earlier study. High correlations were achieved with quality measures computed by quality models that are based on the auditory model of Dau et al., namely, the measure PSM, computed by the quality model PEMO-Q; the measure qc, computed by the quality model proposed by Hansen and Kollmeier; and the linear subcomponent of the HASQI. For the prediction of quality ratings by hearing impaired listeners, extensions of some models incorporating hearing loss were implemented and shown to achieve improved prediction accuracy. Results indicate that these objective quality measures can potentially serve as tools for assisting in initial setting of frequency compression parameters.
The effect of aided language stimulation on vocabulary acquisition in children with little or no functional speech.

Science.gov (United States)

Dada, Shakila; Alant, Erna

2009-02-01

To describe the nature and frequency of the aided language stimulation program and determine the effects of a 3-week-long aided language stimulation program on the vocabulary acquisition skills of children with little or no functional speech (LNFS). Four children participated in this single-subject, multiple-probe study across activities. The aided language stimulation program comprised 3 activities: arts and crafts, food preparation, and story time activity. Each activity was repeated over the duration of 5 subsequent sessions. Eight target vocabulary items were taught within each activity. The acquisition of all 24 target items was probed throughout the duration of the 3-week intervention period. The frequency and nature of the aided language stimulation provided met the criterion of being used 70% of the time and providing aided language stimulation with an 80:20 ratio of statements to questions. The results indicated that all 4 participants acquired the target vocabulary items. There were, however, variations in the rate of acquisition. This study explores the impact of aided language stimulation on vocabulary acquisition in children. The most important clinical implication of this study is that a 3-week intervention program in aided language stimulation was sufficient to facilitate the comprehension of at least 24 vocabulary items in 4 children with LNFS.
Identification of suicidal tendencies in individuals using a quantitative analysis of their speech production

Directory of Open Access Journals (Sweden)

Zagorovskaya Olga Vladimirovna

2016-04-01

Full Text Available Suicide is one of the top factors contributing to deaths around the globe with individuals not making other people aware of their suicidal plans. Therefore it is of increasing importance to develop the methods of identification of these individuals. One of the directions of the ongoing research is identification of typological features of their speech patterns using the methods of mathematical linguistics and automatic text processing. Most studies addressing the problem use materials written in English. The article presents the analysis of the above studies and points out the ways of dealing with the issue employing materials written in Russian.
Speech comprehension aided by multiple modalities: behavioural and neural interactions

Science.gov (United States)

McGettigan, Carolyn; Faulkner, Andrew; Altarelli, Irene; Obleser, Jonas; Baverstock, Harriet; Scott, Sophie K.

2014-01-01

Speech comprehension is a complex human skill, the performance of which requires the perceiver to combine information from several sources – e.g. voice, face, gesture, linguistic context – to achieve an intelligible and interpretable percept. We describe a functional imaging investigation of how auditory, visual and linguistic information interact to facilitate comprehension. Our specific aims were to investigate the neural responses to these different information sources, alone and in interaction, and further to use behavioural speech comprehension scores to address sites of intelligibility-related activation in multifactorial speech comprehension. In fMRI, participants passively watched videos of spoken sentences, in which we varied Auditory Clarity (with noise-vocoding), Visual Clarity (with Gaussian blurring) and Linguistic Predictability. Main effects of enhanced signal with increased auditory and visual clarity were observed in overlapping regions of posterior STS. Two-way interactions of the factors (auditory × visual, auditory × predictability) in the neural data were observed outside temporal cortex, where positive signal change in response to clearer facial information and greater semantic predictability was greatest at intermediate levels of auditory clarity. Overall changes in stimulus intelligibility by condition (as determined using an independent behavioural experiment) were reflected in the neural data by increased activation predominantly in bilateral dorsolateral temporal cortex, as well as inferior frontal cortex and left fusiform gyrus. Specific investigation of intelligibility changes at intermediate auditory clarity revealed a set of regions, including posterior STS and fusiform gyrus, showing enhanced responses to both visual and linguistic information. Finally, an individual differences analysis showed that greater comprehension performance in the scanning participants (measured in a post-scan behavioural test) were associated with
The Effect of a Voice Activity Detector on the Speech Enhancement

DEFF Research Database (Denmark)

Dau, Torsten; Catic, Jasmina; Buchholz, Jörg

2010-01-01

A multimicrophone speech enhancement algorithm for binaural hearing aids that preserves interaural time delays was proposed recently. The algorithm is based on multichannel Wiener filtering and relies on a voice activity detector (VAD) for estimation of second-order statistics. Here, the effect...... of a VAD on the speech enhancement of this algorithm was evaluated using an envelopebased VAD, and the performance was compared to that achieved using an ideal error-free VAD. The performance was considered for stationary directional noise and nonstationary diffuse noise interferers at input SNRs from −10...
Is There a Relationship between Speech Identification in Noise and Categorical Perception in Children with Dyslexia?

Science.gov (United States)

Calcus, Axelle; Lorenzi, Christian; Collet, Gregory; Colin, Cécile; Kolinsky, Régine

2016-01-01

Purpose: Children with dyslexia have been suggested to experience deficits in both categorical perception (CP) and speech identification in noise (SIN) perception. However, results regarding both abilities are inconsistent, and the relationship between them is still unclear. Therefore, this study aimed to investigate the relationship between CP…
Interactive Speech-Defect Diagnostic/Therapeutic/Prosthetic Aid

Science.gov (United States)

Bates, R. H. T.; Brieseman, N. P.; Clark, T. M.; Elder, A. G.; Fright, W. R.; Garden, K. L.; Kennedy, W. K.; Squires, P. L.; Thorpe, C. W.; Jelinek, H. J.; Turner, S. G.

1987-11-01

We have designed and built a portable real-time speech processing system, which incorporates a TMS 32010 (i.e. a co-processor) within an IBM personal computer. The system design is discussed as is the speech therapy software that has been implemented. Displays of loudness, pitch and vocal tract cross-section as computed by the system are illustrated. Preliminary results show that an estimate of the glottal excitation, as extracted using shift-and-add, vary between individuals. We indicate why the estimate of the glottal excitation may be useful in the diagnosis of glottal disorders.
A Review of Standardized Tests of Nonverbal Oral and Speech Motor Performance in Children

Science.gov (United States)

McCauley, Rebecca J.; Strand, Edythe A.

2008-01-01

Purpose: To review the content and psychometric characteristics of 6 published tests currently available to aid in the study, diagnosis, and treatment of motor speech disorders in children. Method: We compared the content of the 6 tests and critically evaluated the degree to which important psychometric characteristics support the tests' use for…
Tactile Aids

Directory of Open Access Journals (Sweden)

Mohtaramossadat Homayuni

1996-04-01

Full Text Available Tactile aids, which translate sound waves into vibrations that can be felt by the skin, have been used for decades by people with severe/profound hearing loss to enhance speech/language development and improve speechreading.The development of tactile aids dates from the efforts of Goults and his co-workers in the 1920s; Although The power supply was too voluminous and it was difficult to carry specially by children, it was too huge and heavy to be carried outside the laboratories and its application was restricted to the experimental usage. Nowadays great advances have been performed in producing this instrument and its numerous models is available in markets around the world.
How to perform first aid.

Science.gov (United States)

Gloster, Annabella Satu; Johnson, Phillip John

2016-01-13

RATIONALE AND KEY POINTS: This article aims to help nurses to perform first aid in a safe, effective and patient-centred manner. First aid comprises a series of simple, potentially life-saving steps that an individual can perform with minimal equipment. Although it is not a legal requirement to respond to an emergency situation outside of work, nurses have a professional duty to respond and provide care within the limits of their competency. First aid is the provision of immediate medical assistance to an ill or injured person until definitive medical treatment can be accessed. First aid can save lives and it is essential that nurses understand the basic principles. REFLECTIVE ACTIVITY: Clinical skills articles can help update your practice and ensure it remains evidence based. Apply this article to your practice. Reflect on and write a short account of: 1. Your skill in performing first aid and any areas where you may need to extend your knowledge. 2. How reading this article will change your practice. Subscribers can upload their reflective accounts at: rcni.com/portfolio .
A Motor Speech Assessment for Children with Severe Speech Disorders: Reliability and Validity Evidence

Science.gov (United States)

Strand, Edythe A.; McCauley, Rebecca J.; Weigand, Stephen D.; Stoeckel, Ruth E.; Baas, Becky S.

2013-01-01

Purpose: In this article, the authors report reliability and validity evidence for the Dynamic Evaluation of Motor Speech Skill (DEMSS), a new test that uses dynamic assessment to aid in the differential diagnosis of childhood apraxia of speech (CAS). Method: Participants were 81 children between 36 and 79 months of age who were referred to the…
Auditory and language skills of children using hearing aids

Directory of Open Access Journals (Sweden)

Leticia Macedo Penna

2015-04-01

Full Text Available INTRODUCTION: Hearing loss may impair the development of a child. The rehabilitation process for individuals with hearing loss depends on effective interventions.OBJECTIVE: To describe the linguistic profile and the hearing skills of children using hearing aids, to characterize the rehabilitation process and to analyze its association with the children's degree of hearing loss.METHODS: Cross-sectional study with a non-probabilistic sample of 110 children using hearing aids (6-10 years of age for mild to profound hearing loss. Tests of language, speech perception, phonemic discrimination, and school performance were performed. The associations were verified by the following tests: chi-squared for linear trend and Kruskal-Wallis.RESULTS: About 65% of the children had altered vocabulary, whereas 89% and 94% had altered phonology and inferior school performance, respectively. The degree of hearing loss was associated with differences in the median age of diagnosis; the age at which the hearing aids were adapted and at which speech therapy was started; and the performance on auditory tests and the type of communication used.CONCLUSION: The diagnosis of hearing loss and the clinical interventions occurred late, contributing to impairments in auditory and language development.

Speech Alarms Pilot Study

Science.gov (United States)

Sandor, Aniko; Moses, Haifa

2016-01-01

Speech alarms have been used extensively in aviation and included in International Building Codes (IBC) and National Fire Protection Association's (NFPA) Life Safety Code. However, they have not been implemented on space vehicles. Previous studies conducted at NASA JSC showed that speech alarms lead to faster identification and higher accuracy. This research evaluated updated speech and tone alerts in a laboratory environment and in the Human Exploration Research Analog (HERA) in a realistic setup.
Speech in spinocerebellar ataxia.

Science.gov (United States)

Schalling, Ellika; Hartelius, Lena

2013-12-01

Spinocerebellar ataxias (SCAs) are a heterogeneous group of autosomal dominant cerebellar ataxias clinically characterized by progressive ataxia, dysarthria and a range of other concomitant neurological symptoms. Only a few studies include detailed characterization of speech symptoms in SCA. Speech symptoms in SCA resemble ataxic dysarthria but symptoms related to phonation may be more prominent. One study to date has shown an association between differences in speech and voice symptoms related to genotype. More studies of speech and voice phenotypes are motivated, to possibly aid in clinical diagnosis. In addition, instrumental speech analysis has been demonstrated to be a reliable measure that may be used to monitor disease progression or therapy outcomes in possible future pharmacological treatments. Intervention by speech and language pathologists should go beyond assessment. Clinical guidelines for management of speech, communication and swallowing need to be developed for individuals with progressive cerebellar ataxia. Copyright © 2013 Elsevier Inc. All rights reserved.
The performance of an automatic acoustic-based program classifier compared to hearing aid users' manual selection of listening programs.

Science.gov (United States)

Searchfield, Grant D; Linford, Tania; Kobayashi, Kei; Crowhen, David; Latzel, Matthias

2018-03-01

To compare preference for and performance of manually selected programmes to an automatic sound classifier, the Phonak AutoSense OS. A single blind repeated measures study. Participants were fit with Phonak Virto V90 ITE aids; preferences for different listening programmes were compared across four different sound scenarios (speech in: quiet, noise, loud noise and a car). Following a 4-week trial preferences were reassessed and the users preferred programme was compared to the automatic classifier for sound quality and hearing in noise (HINT test) using a 12 loudspeaker array. Twenty-five participants with symmetrical moderate-severe sensorineural hearing loss. Participant preferences of manual programme for scenarios varied considerably between and within sessions. A HINT Speech Reception Threshold (SRT) advantage was observed for the automatic classifier over participant's manual selection for speech in quiet, loud noise and car noise. Sound quality ratings were similar for both manual and automatic selections. The use of a sound classifier is a viable alternative to manual programme selection.
The effect of hearing aid noise reduction on listening effort in hearing-impaired adults.

Science.gov (United States)

Desjardins, Jamie L; Doherty, Karen A

2014-01-01

The purpose of the present study was to evaluate the effect of a noise-reduction (NR) algorithm on the listening effort hearing-impaired participants expend on a speech in noise task. Twelve hearing-impaired listeners fitted with behind-the-ear hearing aids with a fast-acting modulation-based NR algorithm participated in this study. A dual-task paradigm was used to measure listening effort with and without the NR enabled in the hearing aid. The primary task was a sentence-in-noise task presented at fixed overall speech performance levels of 76% (moderate listening condition) and 50% (difficult listening condition) correct performance, and the secondary task was a visual-tracking test. Participants also completed measures of working memory (Reading Span test), and processing speed (Digit Symbol Substitution Test) ability. Participants' speech recognition in noise scores did not significantly change with the NR algorithm activated in the hearing aid in either listening condition. The NR algorithm significantly decreased listening effort, but only in the more difficult listening condition. Last, there was a tendency for participants with faster processing speeds to expend less listening effort with the NR algorithm when listening to speech in background noise in the difficult listening condition. The NR algorithm reduced the listening effort adults with hearing loss must expend to understand speech in noise.
Relating hearing loss and executive functions to hearing aid users’ preference for, and speech recognition with, different combinations of binaural noise reduction and microphone directionality

Directory of Open Access Journals (Sweden)

Tobias eNeher

2014-12-01

Full Text Available Knowledge of how executive functions relate to preferred hearing aid (HA processing is sparse and seemingly inconsistent with related knowledge for speech recognition outcomes. This study thus aimed to find out if (1 performance on a measure of reading span (RS is related to preferred binaural noise reduction (NR strength, (2 similar relations exist for two different, nonverbal measures of executive function, (3 pure-tone average hearing loss (PTA, signal-to-noise ratio (SNR, and microphone directionality (DIR also influence preferred NR strength, and (4 preference and speech recognition outcomes are similar. Sixty elderly HA users took part. Six HA conditions consisting of omnidirectional or cardioid microphones followed by inactive, moderate, or strong binaural NR as well as linear amplification were tested. Outcome was assessed at fixed SNRs using headphone simulations of a frontal target talker in a busy cafeteria. Analyses showed positive effects of active NR and DIR on preference, and negative and positive effects of, respectively, strong NR and DIR on speech recognition. Also, while moderate NR was the most preferred NR setting overall, preference for strong NR increased with SNR. No relation between RS and preference was found. However, larger PTA was related to weaker preference for inactive NR and stronger preference for strong NR for both microphone modes. Equivalent (but weaker relations between worse performance on one nonverbal measure of executive function and the HA conditions without DIR were found. For speech recognition, there were relations between HA condition, PTA, and RS, but their pattern differed from that for preference. Altogether, these results indicate that, while moderate NR works well in general, a notable proportion of HA users prefer stronger NR. Furthermore, PTA and executive functions can account for some of the variability in preference for, and speech recognition with, different binaural NR and DIR settings.
Talker Variability in Audiovisual Speech Perception

Directory of Open Access Journals (Sweden)

Shannon eHeald

2014-07-01

Full Text Available A change in talker is a change in the context for the phonetic interpretation of acoustic patterns of speech. Different talkers have different mappings between acoustic patterns and phonetic categories and listeners need to adapt to these differences. Despite this complexity, listeners are adept at comprehending speech in multiple-talker contexts, albeit at a slight but measurable performance cost (e.g., slower recognition. So far, this talker-variability cost has been demonstrated only in audio-only speech. Other research in single-talker contexts have shown, however, that when listeners are able to see a talker’s face, speech recognition is improved under adverse listening (e.g., noise or distortion conditions that can increase uncertainty in the mapping between acoustic patterns and phonetic categories. Does seeing a talker's face reduce the cost of word recognition in multiple-talker contexts? We used a speeded word-monitoring task in which listeners make quick judgments about target-word recognition in single- and multiple-talker contexts. Results show faster recognition performance in single-talker conditions compared to multiple-talker conditions for both audio-only and audio-visual speech. However, recognition time in a multiple-talker context was slower in the audio-visual condition compared to audio-only condition. These results suggest that seeing a talker’s face during speech perception may slow recognition by increasing the importance of talker identification, signaling to the listener a change in talker has occurred.
Cognitive Processing Speed, Working Memory, and the Intelligibility of Hearing Aid-Processed Speech in Persons with Hearing Impairment

Directory of Open Access Journals (Sweden)

Wycliffe Kabaywe Yumba

2017-08-01

Full Text Available Previous studies have demonstrated that successful listening with advanced signal processing in digital hearing aids is associated with individual cognitive capacity, particularly working memory capacity (WMC. This study aimed to examine the relationship between cognitive abilities (cognitive processing speed and WMC and individual listeners’ responses to digital signal processing settings in adverse listening conditions. A total of 194 native Swedish speakers (83 women and 111 men, aged 33–80 years (mean = 60.75 years, SD = 8.89, with bilateral, symmetrical mild to moderate sensorineural hearing loss who had completed a lexical decision speed test (measuring cognitive processing speed and semantic word-pair span test (SWPST, capturing WMC participated in this study. The Hagerman test (capturing speech recognition in noise was conducted using an experimental hearing aid with three digital signal processing settings: (1 linear amplification without noise reduction (NoP, (2 linear amplification with noise reduction (NR, and (3 non-linear amplification without NR (“fast-acting compression”. The results showed that cognitive processing speed was a better predictor of speech intelligibility in noise, regardless of the types of signal processing algorithms used. That is, there was a stronger association between cognitive processing speed and NR outcomes and fast-acting compression outcomes (in steady state noise. We observed a weaker relationship between working memory and NR, but WMC did not relate to fast-acting compression. WMC was a relatively weaker predictor of speech intelligibility in noise. These findings might have been different if the participants had been provided with training and or allowed to acclimatize to binary masking noise reduction or fast-acting compression.
Comparison of Two Music Training Approaches on Music and Speech Perception in Cochlear Implant Users.

Science.gov (United States)

Fuller, Christina D; Galvin, John J; Maat, Bert; Başkent, Deniz; Free, Rolien H

2018-01-01

In normal-hearing (NH) adults, long-term music training may benefit music and speech perception, even when listening to spectro-temporally degraded signals as experienced by cochlear implant (CI) users. In this study, we compared two different music training approaches in CI users and their effects on speech and music perception, as it remains unclear which approach to music training might be best. The approaches differed in terms of music exercises and social interaction. For the pitch/timbre group, melodic contour identification (MCI) training was performed using computer software. For the music therapy group, training involved face-to-face group exercises (rhythm perception, musical speech perception, music perception, singing, vocal emotion identification, and music improvisation). For the control group, training involved group nonmusic activities (e.g., writing, cooking, and woodworking). Training consisted of weekly 2-hr sessions over a 6-week period. Speech intelligibility in quiet and noise, vocal emotion identification, MCI, and quality of life (QoL) were measured before and after training. The different training approaches appeared to offer different benefits for music and speech perception. Training effects were observed within-domain (better MCI performance for the pitch/timbre group), with little cross-domain transfer of music training (emotion identification significantly improved for the music therapy group). While training had no significant effect on QoL, the music therapy group reported better perceptual skills across training sessions. These results suggest that more extensive and intensive training approaches that combine pitch training with the social aspects of music therapy may further benefit CI users.
A comparison between the first-fit settings of two multichannel digital signal-processing strategies: music quality ratings and speech-in-noise scores.

Science.gov (United States)

Higgins, Paul; Searchfield, Grant; Coad, Gavin

2012-06-01

The aim of this study was to determine which level-dependent hearing aid digital signal-processing strategy (DSP) participants preferred when listening to music and/or performing a speech-in-noise task. Two receiver-in-the-ear hearing aids were compared: one using 32-channel adaptive dynamic range optimization (ADRO) and the other wide dynamic range compression (WDRC) incorporating dual fast (4 channel) and slow (15 channel) processing. The manufacturers' first-fit settings based on participants' audiograms were used in both cases. Results were obtained from 18 participants on a quick speech-in-noise (QuickSIN; Killion, Niquette, Gudmundsen, Revit, & Banerjee, 2004) task and for 3 music listening conditions (classical, jazz, and rock). Participants preferred the quality of music and performed better at the QuickSIN task using the hearing aids with ADRO processing. A potential reason for the better performance of the ADRO hearing aids was less fluctuation in output with change in sound dynamics. ADRO processing has advantages for both music quality and speech recognition in noise over the multichannel WDRC processing that was used in the study. Further evaluations of which DSP aspects contribute to listener preference are required.
Speech Intelligibility

Science.gov (United States)

Brand, Thomas

Speech intelligibility (SI) is important for different fields of research, engineering and diagnostics in order to quantify very different phenomena like the quality of recordings, communication and playback devices, the reverberation of auditoria, characteristics of hearing impairment, benefit using hearing aids or combinations of these things.
APPRECIATING SPEECH THROUGH GAMING

Directory of Open Access Journals (Sweden)

Mario T Carreon

2014-06-01

Full Text Available This paper discusses the Speech and Phoneme Recognition as an Educational Aid for the Deaf and Hearing Impaired (SPREAD application and the ongoing research on its deployment as a tool for motivating deaf and hearing impaired students to learn and appreciate speech. This application uses the Sphinx-4 voice recognition system to analyze the vocalization of the student and provide prompt feedback on their pronunciation. The packaging of the application as an interactive game aims to provide additional motivation for the deaf and hearing impaired student through visual motivation for them to learn and appreciate speech.
DFT-Domain Based Single-Microphone Noise Reduction for Speech Enhancement

DEFF Research Database (Denmark)

C. Hendriks, Richard; Gerkmann, Timo; Jensen, Jesper

As speech processing devices like mobile phones, voice controlled devices, and hearing aids have increased in popularity, people expect them to work anywhere and at any time without user intervention. However, the presence of acoustical disturbances limits the use of these applications, degrades...... their performance, or causes the user difficulties in understanding the conversation or appreciating the device. A common way to reduce the effects of such disturbances is through the use of single-microphone noise reduction algorithms for speech enhancement. The field of single-microphone noise reduction...
Speech enhancement theory and practice

CERN Document Server

Loizou, Philipos C

2013-01-01

With the proliferation of mobile devices and hearing devices, including hearing aids and cochlear implants, there is a growing and pressing need to design algorithms that can improve speech intelligibility without sacrificing quality. Responding to this need, Speech Enhancement: Theory and Practice, Second Edition introduces readers to the basic problems of speech enhancement and the various algorithms proposed to solve these problems. Updated and expanded, this second edition of the bestselling textbook broadens its scope to include evaluation measures and enhancement algorithms aimed at impr
Vibrant Soundbridge and Bone Conduction Hearing Aid in Patients with Bilateral Malformation of External Ear

Directory of Open Access Journals (Sweden)

Mondelli, Maria Fernanda Capoani Garcia

2015-10-01

Full Text Available Introduction Hearing loss is the most common clinical finding in patients with malformation of the external ear canal. Among the possibilities of treatment, there is the adaptation of hearing aids by bone conduction and the adaptation of implantable hearing aids. Objective To assess speech perception with the use of Vibrant Soundbridge (VBS - MED-EL, Innsbruck, Austria associated with additional amplification in patients with bilateral craniofacial malformation. Method We evaluated 11 patients with bilateral malformation over 12 years with mixed hearing loss or bilateral conductive. They were using the Softband (Oticon Medical, Sweden and bone conduction hearing aid in the ear opposite the one with the VSB. We performed the evaluation of speech perception using the Hearing in Noise Test. Results Participants were eight men and three women with a mean of 19.5 years. The signal / noise ratio presented significant results in patients fitted with VSB and bone conduction hearing aid. Conclusion The results of speech perception were significantly better with use of VBS combined with bone conduction hearing aids.
Metaheuristic applications to speech enhancement

CERN Document Server

Kunche, Prajna

2016-01-01

This book serves as a basic reference for those interested in the application of metaheuristics to speech enhancement. The major goal of the book is to explain the basic concepts of optimization methods and their use in heuristic optimization in speech enhancement to scientists, practicing engineers, and academic researchers in speech processing. The authors discuss why it has been a challenging problem for researchers to develop new enhancement algorithms that aid in the quality and intelligibility of degraded speech. They present powerful optimization methods to speech enhancement that can help to solve the noise reduction problems. Readers will be able to understand the fundamentals of speech processing as well as the optimization techniques, how the speech enhancement algorithms are implemented by utilizing optimization methods, and will be given the tools to develop new algorithms. The authors also provide a comprehensive literature survey regarding the topic.
Design and performance of an analysis-by-synthesis class of predictive speech coders

Science.gov (United States)

Rose, Richard C.; Barnwell, Thomas P., III

1990-01-01

The performance of a broad class of analysis-by-synthesis linear predictive speech coders is quantified experimentally. The class of coders includes a number of well-known techniques as well as a very large number of speech coders which have not been named or studied. A general formulation for deriving the parametric representation used in all of the coders in the class is presented. A new coder, named the self-excited vocoder, is discussed because of its good performance with low complexity, and because of the insight this coder gives to analysis-by-synthesis coders in general. The results of a study comparing the performances of different members of this class are presented. The study takes the form of a series of formal subjective and objective speech quality tests performed on selected coders. The results of this study lead to some interesting and important observations concerning the controlling parameters for analysis-by-synthesis speech coders.
Hearing speech in music

Directory of Open Access Journals (Sweden)

Seth-Reino Ekström

2011-01-01

Full Text Available The masking effect of a piano composition, played at different speeds and in different octaves, on speech-perception thresholds was investigated in 15 normal-hearing and 14 moderately-hearing-impaired subjects. Running speech (just follow conversation, JFC testing and use of hearing aids increased the everyday validity of the findings. A comparison was made with standard audiometric noises [International Collegium of Rehabilitative Audiology (ICRA noise and speech spectrum-filtered noise (SPN]. All masking sounds, music or noise, were presented at the same equivalent sound level (50 dBA. The results showed a significant effect of piano performance speed and octave (P<.01. Low octave and fast tempo had the largest effect; and high octave and slow tempo, the smallest. Music had a lower masking effect than did ICRA noise with two or six speakers at normal vocal effort (P<.01 and SPN (P<.05. Subjects with hearing loss had higher masked thresholds than the normal-hearing subjects (P<.01, but there were smaller differences between masking conditions (P<.01. It is pointed out that music offers an interesting opportunity for studying masking under realistic conditions, where spectral and temporal features can be varied independently. The results have implications for composing music with vocal parts, designing acoustic environments and creating a balance between speech perception and privacy in social settings.
Communication through Performance: Hausa Performance Art ...

African Journals Online (AJOL)

The human voice is a natural instrument with a natural capability. Thus, speech with the aid of performance and music has been combined since earliest times to communicate valuable insights into human nature and universal themes of life. Such themes include life, death, good and evil. This paper examined performance ...
Direcionalidade e redução de ruído em AASI: percepção de fala e benefício Directivity and noise reduction in hearing aids: speech perception and benefit

Directory of Open Access Journals (Sweden)

Camila Angélica Quintino

2010-10-01

Full Text Available Aparelho de Amplificação Sonora Individual (AASI. OBJETIVO: Comparar o desempenho, benefício e a satisfação de usuários de AASI intra-aural e retroauricular digital com algoritmo de redução de ruído e microfones omnidirecional e direcional. MATERIAL E MÉTODO: 34 usuários de AASI digital foram avaliados por meio do reconhecimento de sentenças no ruído e dos questionários APHAB e IOI. Estudo prospectivo. RESULTADOS: Melhores resultados foram obtidos com AASI intra-aurais e AASI direcionais, no entanto, não houve diferença estatística significante entre os grupos. CONCLUSÃO: A direcionalidade favoreceu o reconhecimento de fala no ruído e o benefício obtido em vida diária.Hearing aid. AIM: To compare the performance, benefit and satisfaction of users of ITE, CIC and BTE digital hearing aid with noise reduction and omnidirectional and directional microphones. METHOD: 34 users of hearing aid were evaluated by means of speech perception in noise tests and APHAB and IOI self assessment questionnaires. Prospective study. RESULTS: Better results were obtained by users of ITE, CIC and directional hearing aids, however, no statistical significance was found between the groups. CONCLUSION: Directivity improved speech perception in noise and benefit in daily life situations.
Influence of musical training on perception of L2 speech

NARCIS (Netherlands)

Sadakata, M.; Zanden, L.D.T. van der; Sekiyama, K.

2010-01-01

The current study reports specific cases in which a positive transfer of perceptual ability from the music domain to the language domain occurs. We tested whether musical training enhances discrimination and identification performance of L2 speech sounds (timing features, nasal consonants and

Molecular identification of protozoa causing AIDS-associated cholangiopathy in Buenos Aires, Argentina.

Science.gov (United States)

Nétor Velásquez, Jorge; Marta, Edgardo; Alicia di Risio, Cecilia; Etchart, Cristina; Gancedo, Elisa; Victor Chertcoff, Agustín; Bruno Malandrini, Jorge; Germán Astudillo, Osvaldo; Carnevale, Silvana

2012-12-01

Several species of microsporidia and coccidia are protozoa parasites responsible for cholan-giopathy disease in patients infected with human immunodeficiency virus (HIV). The goals of this work were to identift opportunistic protozoa by molecular methods and describe the clinical manifestations at the gastrointestinal tract and the biliary system in patients with AIDS-associated cholangiopathy from Buenos Aires, Argentina. This study included 11 adult HIV-infected individuals with diagnosis ofAIDS- associated cholangiopathy. An upper gastrointestinal endoscopy with biopsy specimen collection and a stool analysis for parasites were performed on each patient. The ultrasound analysis revealed bile ducts compromise. An endoscopic retrograde cholangiopancreatography and a magnetic resonance cholangiography were carried out. The identification to the species level was performed on biopsy specimens by molecular methods. Microorganisms were identified in 10 cases. The diagnosis in patients with sclerosing cholangitis was cryptosporidiosis in 3 cases, cystoisosporosis in 1 and microsporidiosis in 1. In patients with sclerosing cholangitis and papillary stenosis the diagnosis was microsporidiosis in 2 cases, cryptosporidiosis in 2 and cryptosporidiosis associated with microsporidiosis in 1. In 3 cases with cryptosporidiosis the species was Cryptosporidium hominis, 1 of them was associated with Enterocytozoon bieneusi, and the other 2 were coinfected with Cryptosporidium parvum. In the 4 cases with microsporidiosis the species was Enterocytozoon bieneusi. These results suggest that molecular methods may be useful tools to identify emerging protozoa in patients with AIDS-associated cholangiopathy.
Assessment of broadband SNR estimation for hearing aid applications

DEFF Research Database (Denmark)

May, Tobias; Kowalewski, Borys; Fereczkowski, Michal

2017-01-01

was systematically investigated. The most accurate approach utilized an estimation of the clean speech power spectral density (PSD) and the noisy speech power across a sliding window of 1280 ms and achieved an total SNR estimation error below 3 dB across a wide variety of background noises and input SNRs......An accurate estimation of the broadband input signal-to-noise ratio (SNR) is a prerequisite for many hearing-aid algorithms. An extensive comparison of three SNR estimation algorithms was performed. Moreover, the influence of the duration of the analysis window on the SNR estimation performance...
Distributed Speech Enhancement in Wireless Acoustic Sensor Networks

NARCIS (Netherlands)

Zeng, Y.

2015-01-01

In digital speech communication applications like hands-free mobile telephony, hearing aids and human-to-computer communication systems, the recorded speech signals are typically corrupted by background noise. As a result, their quality and intelligibility can get severely degraded. Traditional
FUSING SPEECH SIGNAL AND PALMPRINT FEATURES FOR AN SECURED AUTHENTICATION SYSTEM

Directory of Open Access Journals (Sweden)

P.K. Mahesh

2011-11-01

Full Text Available In the application of Biometric authentication, personal identification is regarded as an effective method for automatic recognition, with a high confidence, a person’s identity. Using multimodal biometric systems we typically get better performance compare to single biometric modality. This paper proposes the multimodal biometrics system for identity verification using two traits, i.e., speech signal and palmprint. Integrating the palmprint and speech information increases robustness of person authentication. The proposed system is designed for applications where the training data contains a speech signal and palmprint. It is well known that the performance of person authentication using only speech signal or palmprint is deteriorated by feature changes with time. The final decision is made by fusion at matching score level architecture in which feature vectors are created independently for query measures and are then compared to the enrolment templates, which are stored during database preparation.
The Relationship between Binaural Benefit and Difference in Unilateral Speech Recognition Performance for Bilateral Cochlear Implant Users

Science.gov (United States)

Yoon, Yang-soo; Li, Yongxin; Kang, Hou-Yong; Fu, Qian-Jie

2011-01-01

Objective The full benefit of bilateral cochlear implants may depend on the unilateral performance with each device, the speech materials, processing ability of the user, and/or the listening environment. In this study, bilateral and unilateral speech performances were evaluated in terms of recognition of phonemes and sentences presented in quiet or in noise. Design Speech recognition was measured for unilateral left, unilateral right, and bilateral listening conditions; speech and noise were presented at 0° azimuth. The “binaural benefit” was defined as the difference between bilateral performance and unilateral performance with the better ear. Study Sample 9 adults with bilateral cochlear implants participated. Results On average, results showed a greater binaural benefit in noise than in quiet for all speech tests. More importantly, the binaural benefit was greater when unilateral performance was similar across ears. As the difference in unilateral performance between ears increased, the binaural advantage decreased; this functional relationship was observed across the different speech materials and noise levels even though there was substantial intra- and inter-subject variability. Conclusions The results indicate that subjects who show symmetry in speech recognition performance between implanted ears in general show a large binaural benefit. PMID:21696329
Deep Recurrent Convolutional Neural Network: Improving Performance For Speech Recognition

OpenAIRE

Zhang, Zewang; Sun, Zheng; Liu, Jiaqi; Chen, Jingwen; Huo, Zhao; Zhang, Xiao

2016-01-01

A deep learning approach has been widely applied in sequence modeling problems. In terms of automatic speech recognition (ASR), its performance has significantly been improved by increasing large speech corpus and deeper neural network. Especially, recurrent neural network and deep convolutional neural network have been applied in ASR successfully. Given the arising problem of training speed, we build a novel deep recurrent convolutional network for acoustic modeling and then apply deep resid...
Social performance deficits in social anxiety disorder: reality during conversation and biased perception during speech.

Science.gov (United States)

Voncken, Marisol J; Bögels, Susan M

2008-12-01

Cognitive models emphasize that patients with social anxiety disorder (SAD) are mainly characterized by biased perception of their social performance. In addition, there is a growing body of evidence showing that SAD patients suffer from actual deficits in social interaction. To unravel what characterizes SAD patients the most, underestimation of social performance (defined as the discrepancy between self-perceived and observer-perceived social performance), or actual (observer-perceived) social performance, 48 patients with SAD and 27 normal control participants were observed during a speech and conversation. Consistent with the cognitive model of SAD, patients with SAD underestimated their social performance relative to control participants during the two interactions, but primarily during the speech. Actual social performance deficits were clearly apparent in the conversation but not in the speech. In conclusion, interactions that pull for more interpersonal skills, like a conversation, elicit more actual social performance deficits whereas, situations with a performance character, like a speech, bring about more cognitive distortions in patients with SAD.
Experimental studies of forensic odontology to aid in the identification process.

Science.gov (United States)

Saxena, Susmita; Sharma, Preeti; Gupta, Nitin

2010-07-01

The importance of dental identification is on the increase year after year. With the passage of time, the role of forensic odontology has increased as very often teeth and dental restorations are the only means of identification. Forensic odontology has played a key role in identification of persons in mass disasters (aviation, earthquakes, Tsunamis), in crime investigations, in ethnic studies, and in identification of decomposed and disfigured bodies like that of drowned persons, fire victims, and victims of motor vehicle accidents. The various methods employed in forensic odontology include tooth prints, radiographs, photographic study, rugoscopy, cheiloscopy and molecular methods. Investigative methods applied in forensic odontology are reasonably reliable, yet the shortcomings must be accounted for to make it a more meaningful and relevant procedure. This paper gives an overview of the various experimental studies to aid in the identification processes, discussing their feasibilities and limitations in day-to-day practice.
Cross-modal Association between Auditory and Visuospatial Information in Mandarin Tone Perception in Noise by Native and Non-native Perceivers

Directory of Open Access Journals (Sweden)

Beverly Hannah

2017-12-01

Full Text Available Speech perception involves multiple input modalities. Research has indicated that perceivers establish cross-modal associations between auditory and visuospatial events to aid perception. Such intermodal relations can be particularly beneficial for speech development and learning, where infants and non-native perceivers need additional resources to acquire and process new sounds. This study examines how facial articulatory cues and co-speech hand gestures mimicking pitch contours in space affect non-native Mandarin tone perception. Native English as well as Mandarin perceivers identified tones embedded in noise with either congruent or incongruent Auditory-Facial (AF and Auditory-FacialGestural (AFG inputs. Native Mandarin results showed the expected ceiling-level performance in the congruent AF and AFG conditions. In the incongruent conditions, while AF identification was primarily auditory-based, AFG identification was partially based on gestures, demonstrating the use of gestures as valid cues in tone identification. The English perceivers’ performance was poor in the congruent AF condition, but improved significantly in AFG. While the incongruent AF identification showed some reliance on facial information, incongruent AFG identification relied more on gestural than auditory-facial information. These results indicate positive effects of facial and especially gestural input on non-native tone perception, suggesting that cross-modal (visuospatial resources can be recruited to aid auditory perception when phonetic demands are high. The current findings may inform patterns of tone acquisition and development, suggesting how multi-modal speech enhancement principles may be applied to facilitate speech learning.
Exploration of available feature detection and identification systems and their performance on radiographs

Science.gov (United States)

Wantuch, Andrew C.; Vita, Joshua A.; Jimenez, Edward S.; Bray, Iliana E.

2016-10-01

Despite object detection, recognition, and identification being very active areas of computer vision research, many of the available tools to aid in these processes are designed with only photographs in mind. Although some algorithms used specifically for feature detection and identification may not take explicit advantage of the colors available in the image, they still under-perform on radiographs, which are grayscale images. We are especially interested in the robustness of these algorithms, specifically their performance on a preexisting database of X-ray radiographs in compressed JPEG form, with multiple ways of describing pixel information. We will review various aspects of the performance of available feature detection and identification systems, including MATLABs Computer Vision toolbox, VLFeat, and OpenCV on our non-ideal database. In the process, we will explore possible reasons for the algorithms' lessened ability to detect and identify features from the X-ray radiographs.
A Support Vector Machine-Based Gender Identification Using Speech Signal

Science.gov (United States)

Lee, Kye-Hwan; Kang, Sang-Ick; Kim, Deok-Hwan; Chang, Joon-Hyuk

We propose an effective voice-based gender identification method using a support vector machine (SVM). The SVM is a binary classification algorithm that classifies two groups by finding the voluntary nonlinear boundary in a feature space and is known to yield high classification performance. In the present work, we compare the identification performance of the SVM with that of a Gaussian mixture model (GMM)-based method using the mel frequency cepstral coefficients (MFCC). A novel approach of incorporating a features fusion scheme based on a combination of the MFCC and the fundamental frequency is proposed with the aim of improving the performance of gender identification. Experimental results demonstrate that the gender identification performance using the SVM is significantly better than that of the GMM-based scheme. Moreover, the performance is substantially improved when the proposed features fusion technique is applied.
Idaho's Three-Tiered System for Speech-Language Paratherapist Training and Utilization.

Science.gov (United States)

Longhurst, Thomas M.

1997-01-01

Discusses the development and current implementation of Idaho's three-tiered system of speech-language paratherapists. Support personnel providing speech-language services to learners with special communication needs in educational settings must obtain one of three certification levels: (1) speech-language aide, (2) associate degree…
Speech perception in noise in unilateral hearing loss

OpenAIRE

Mondelli, Maria Fernanda Capoani Garcia; dos Santos, Marina de Marchi; José, Maria Renata

2016-01-01

ABSTRACT INTRODUCTION: Unilateral hearing loss is characterized by a decrease of hearing in one ear only. In the presence of ambient noise, individuals with unilateral hearing loss are faced with greater difficulties understanding speech than normal listeners. OBJECTIVE: To evaluate the speech perception of individuals with unilateral hearing loss in speech perception with and without competitive noise, before and after the hearing aid fitting process. METHODS: The study included 30 adu...
Effect of Deep Brain Stimulation on Speech Performance in Parkinson's Disease

Directory of Open Access Journals (Sweden)

Sabine Skodda

2012-01-01

Full Text Available Deep brain stimulation (DBS has been reported to be successful in relieving the core motor symptoms of Parkinson's disease (PD and motor fluctuations in the more advanced stages of the disease. However, data on the effects of DBS on speech performance are inconsistent. While there are some series of patients documenting that speech function was relatively unaffected by DBS of the nucleus subthalamicus (STN, other investigators reported on improvements of distinct parameters of oral control and voice. Though, these ameliorations of single speech modalities were not always accompanied by an improvement of overall speech intelligibility. On the other hand, there are also indications for an induction of dysarthria as an adverse effect of STN-DBS occurring at least in some patients with PD. Since a deterioration of speech function has more often been observed under high stimulation amplitudes, this phenomenon has been ascribed to a spread of current-to-adjacent pathways which might also be the reason for the sporadic observation of an onset of dysarthria under DBS of other basal ganglia targets (e.g., globus pallidus internus/GPi or thalamus/Vim. The aim of this paper is to review and evaluate reports in the literature on the effects of DBS on speech function in PD.
Investigating executive functions in children with severe speech and movement disorders using structured tasks.

Science.gov (United States)

Stadskleiv, Kristine; von Tetzchner, Stephen; Batorowicz, Beata; van Balkom, Hans; Dahlgren-Sandberg, Annika; Renner, Gregor

2014-01-01

Executive functions are the basis for goal-directed activity and include planning, monitoring, and inhibition, and language seems to play a role in the development of these functions. There is a tradition of studying executive function in both typical and atypical populations, and the present study investigates executive functions in children with severe speech and motor impairments who are communicating using communication aids with graphic symbols, letters, and/or words. There are few neuropsychological studies of children in this group and little is known about their cognitive functioning, including executive functions. It was hypothesized that aided communication would tax executive functions more than speech. Twenty-nine children using communication aids and 27 naturally speaking children participated. Structured tasks resembling everyday activities, where the action goals had to be reached through communication with a partner, were used to get information about executive functions. The children (a) directed the partner to perform actions like building a Lego tower from a model the partner could not see and (b) gave information about an object without naming it to a person who had to guess what object it was. The executive functions of planning, monitoring, and impulse control were coded from the children's on-task behavior. Both groups solved most of the tasks correctly, indicating that aided communicators are able to use language to direct another person to do a complex set of actions. Planning and lack of impulsivity was positively related to task success in both groups. The aided group completed significantly fewer tasks, spent longer time and showed more variation in performance than the comparison group. The aided communicators scored lower on planning and showed more impulsivity than the comparison group, while both groups showed an equal degree of monitoring of the work progress. The results are consistent with the hypothesis that aided language tax
Automated dental identification system: An aid to forensic odontology

Directory of Open Access Journals (Sweden)

Parvathi Devi

2011-01-01

Full Text Available Automated dental identification system is computer-aided software for the postmortem identification of deceased individuals based on dental characteristics specifically radiographs. This system is receiving increased attention because of the large number of victims encountered in the mass disasters and it is 90% more time saving and accurate than the conventional radiographic methods. This technique is based on the intensity of the overall region of tooth image and therefore it does not necessitate the presence of sharp boundary between the teeth. It provides automated search and matching capabilities for digitized radiographs and photographic dental images and compares the teeth present in multiple digitized dental records in order to access their similarity. This paper highlights the functionality of its components and techniques used in realizing these components.
Comparison of speech performance in labial and lingual orthodontic patients: A prospective study

Science.gov (United States)

Rai, Ambesh Kumar; Rozario, Joe E.; Ganeshkar, Sanjay V.

2014-01-01

Background: The intensity and duration of speech difficulty inherently associated with lingual therapy is a significant issue of concern in orthodontics. This study was designed to evaluate and to compare the duration of changes in speech between labial and lingual orthodontics. Materials and Methods: A prospective longitudinal clinical study was designed to assess speech of 24 patients undergoing labial or lingual orthodontic treatment. An objective spectrographic evaluation of/s/sound was done using software PRAAT version 5.0.47, a semiobjective auditive evaluation of articulation was done by four speech pathologists and a subjective assessment of speech was done by four laypersons. The tests were performed before (T1), within 24 h (T2), after 1 week (T3) and after 1 month (T4) of the start of therapy. The Mann-Whitney U-test for independent samples was used to assess the significance difference between the labial and lingual appliances. A speech alteration with P appliance systems caused a comparable speech difficulty immediately after bonding (T2). Although the speech recovered within a week in the labial group (T3), the lingual group continued to experience discomfort even after a month (T4). PMID:25540661
The impact of cochlear implantation on speech understanding, subjective hearing performance, and tinnitus perception in patients with unilateral severe to profound hearing loss.

Science.gov (United States)

Távora-Vieira, Dayse; Marino, Roberta; Acharya, Aanand; Rajan, Gunesh P

2015-03-01

This study aimed to determine the impact of cochlear implantation on speech understanding in noise, subjective perception of hearing, and tinnitus perception of adult patients with unilateral severe to profound hearing loss and to investigate whether duration of deafness and age at implantation would influence the outcomes. In addition, this article describes the auditory training protocol used for unilaterally deaf patients. This is a prospective study of subjects undergoing cochlear implantation for unilateral deafness with or without associated tinnitus. Speech perception in noise was tested using the Bamford-Kowal-Bench speech-in-noise test presented at 65 dB SPL. The Speech, Spatial, and Qualities of Hearing Scale and the Abbreviated Profile of Hearing Aid Benefit were used to evaluate the subjective perception of hearing with a cochlear implant and quality of life. Tinnitus disturbance was measured using the Tinnitus Reaction Questionnaire. Data were collected before cochlear implantation and 3, 6, 12, and 24 months after implantation. Twenty-eight postlingual unilaterally deaf adults with or without tinnitus were implanted. There was a significant improvement in speech perception in noise across time in all spatial configurations. There was an overall significant improvement on the subjective perception of hearing and quality of life. Tinnitus disturbance reduced significantly across time. Age at implantation and duration of deafness did not influence the outcomes significantly. Cochlear implantation provided significant improvement in speech understanding in challenging situations, subjective perception of hearing performance, and quality of life. Cochlear implantation also resulted in reduced tinnitus disturbance. Age at implantation and duration of deafness did not seem to influence the outcomes.
Hearing loss and speech perception in noise difficulties in Fanconi anemia.

Science.gov (United States)

Verheij, Emmy; Oomen, Karin P Q; Smetsers, Stephanie E; van Zanten, Gijsbert A; Speleman, Lucienne

2017-10-01

Fanconi anemia is a hereditary chromosomal instability disorder. Hearing loss and ear abnormalities are among the many manifestations reported in this disorder. In addition, Fanconi anemia patients often complain about hearing difficulties in situations with background noise (speech perception in noise difficulties). Our study aimed to describe the prevalence of hearing loss and speech perception in noise difficulties in Dutch Fanconi anemia patients. Retrospective chart review. A retrospective chart review was conducted at a Dutch tertiary care center. All patients with Fanconi anemia at clinical follow-up in our hospital were included. Medical files were reviewed to collect data on hearing loss and speech perception in noise difficulties. In total, 49 Fanconi anemia patients were included. Audiograms were available in 29 patients and showed hearing loss in 16 patients (55%). Conductive hearing loss was present in 24.1%, sensorineural in 20.7%, and mixed in 10.3%. A speech in noise test was performed in 17 patients; speech perception in noise was subnormal in nine patients (52.9%) and abnormal in two patients (11.7%). Hearing loss and speech perception in noise abnormalities are common in Fanconi anemia. Therefore, pure tone audiograms and speech in noise tests should be performed, preferably already at a young age, because hearing aids or assistive listening devices could be very valuable in developing language and communication skills. 4. Laryngoscope, 127:2358-2361, 2017. © 2016 The American Laryngological, Rhinological and Otological Society, Inc.
Effect of Deep Brain Stimulation on Speech Performance in Parkinson's Disease

OpenAIRE

Skodda, Sabine

2012-01-01

Deep brain stimulation (DBS) has been reported to be successful in relieving the core motor symptoms of Parkinson's disease (PD) and motor fluctuations in the more advanced stages of the disease. However, data on the effects of DBS on speech performance are inconsistent. While there are some series of patients documenting that speech function was relatively unaffected by DBS of the nucleus subthalamicus (STN), other investigators reported on improvements of distinct parameters of oral control...

The effects of speech controls on performance in advanced helicopters in a double stimulation paradigm

Science.gov (United States)

Bortolussi, Michael R.; Vidulich, Michael A.

1991-01-01

The potential benefit of speech as a control modality has been investigated with mixed results. Earlier studies suggests that speech controls can reduce the potential of manual control overloads and improve time-sharing performance. However, these benefits were not without costs. Pilots reported higher workload levels associated with the use of speech controls. To further investigate these previous findings, an experiment was conducted in a simulation of an advanced single-pilot, scout/attack helicopter at NASA-Ames' ICAB (interchangeable cab) facility. Objective performance data suggested that speech control modality was effective in reducing interference of discrete, time-shared responses during continuous flight control activity. Subjective ratings, however, indicated that the speech control modality increased workload. Post-flight debriefing indicated that these results were mainly due to the increased effort to speak precisely to a less than perfect voice recognition system.
Vocal Performance and Speech Intonation: Bob Dylan’s “Like a Rolling Stone”

Directory of Open Access Journals (Sweden)

Michael Daley

2007-03-01

Full Text Available This article proposes a linguistic analysis of a recorded performance of a single verse of one of Dylan’s most popular songs—the originally released studio recording of “Like A Rolling Stone”—and describes more specifically the ways in which intonation relates to lyrics and performance. This analysis is used as source material for a close reading of the semantic, affective, and “playful” meanings of the performance, and is compared with some published accounts of the song’s reception. The author has drawn on the linguistic methodology formulated by Michael Halliday, who has found speech intonation (which includes pitch movement, timbre, syllabic rhythm, and loudness to be an integral part of English grammar and crucial to the transmission of certain kinds of meaning. Speech intonation is a deeply-rooted and powerfully meaningful aspect of human communication. This article argues that is plausible that a system so powerful in speech might have some bearing on the communication of meaning in sung performance.
Prediction and Optimization of Speech Intelligibility in Adverse Conditions

NARCIS (Netherlands)

Taal, C.H.

2013-01-01

In digital speech-communication systems like mobile phones, public address systems and hearing aids, conveying the message is one of the most important goals. This can be challenging since the intelligibility of the speech may be harmed at various stages before, during and after the transmission
Speech Compression

Directory of Open Access Journals (Sweden)

Jerry D. Gibson

2016-06-01

Full Text Available Speech compression is a key technology underlying digital cellular communications, VoIP, voicemail, and voice response systems. We trace the evolution of speech coding based on the linear prediction model, highlight the key milestones in speech coding, and outline the structures of the most important speech coding standards. Current challenges, future research directions, fundamental limits on performance, and the critical open problem of speech coding for emergency first responders are all discussed.
Speech recognition and parent-ratings from auditory development questionnaires in children who are hard of hearing

Science.gov (United States)

McCreery, Ryan W.; Walker, Elizabeth A.; Spratford, Meredith; Oleson, Jacob; Bentler, Ruth; Holte, Lenore; Roush, Patricia

2015-01-01

Objectives Progress has been made in recent years in the provision of amplification and early intervention for children who are hard of hearing. However, children who use hearing aids (HA) may have inconsistent access to their auditory environment due to limitations in speech audibility through their HAs or limited HA use. The effects of variability in children’s auditory experience on parent-report auditory skills questionnaires and on speech recognition in quiet and in noise were examined for a large group of children who were followed as part of the Outcomes of Children with Hearing Loss study. Design Parent ratings on auditory development questionnaires and children’s speech recognition were assessed for 306 children who are hard of hearing. Children ranged in age from 12 months to 9 years of age. Three questionnaires involving parent ratings of auditory skill development and behavior were used, including the LittlEARS Auditory Questionnaire, Parents Evaluation of Oral/Aural Performance in Children Rating Scale, and an adaptation of the Speech, Spatial and Qualities of Hearing scale. Speech recognition in quiet was assessed using the Open and Closed set task, Early Speech Perception Test, Lexical Neighborhood Test, and Phonetically-balanced Kindergarten word lists. Speech recognition in noise was assessed using the Computer-Assisted Speech Perception Assessment. Children who are hard of hearing were compared to peers with normal hearing matched for age, maternal educational level and nonverbal intelligence. The effects of aided audibility, HA use and language ability on parent responses to auditory development questionnaires and on children’s speech recognition were also examined. Results Children who are hard of hearing had poorer performance than peers with normal hearing on parent ratings of auditory skills and had poorer speech recognition. Significant individual variability among children who are hard of hearing was observed. Children with greater
Multi-channel PSD Estimators for Speech Dereverberation

DEFF Research Database (Denmark)

Kuklasinski, Adam; Doclo, Simon; Gerkmann, Timo

2015-01-01

densities (PSDs). We first derive closed-form expressions for the mean square error (MSE) of both PSD estimators and then show that one estimator – previously used for speech dereverberation by the authors – always yields a better MSE. Only in the case of a two microphone array or for special spatial...... distributions of the interference both estimators yield the same MSE. The theoretically derived MSE values are in good agreement with numerical simulation results and with instrumental speech quality measures in a realistic speech dereverberation task for binaural hearing aids....
A Dynamic Speech Comprehension Test for Assessing Real-World Listening Ability.

Science.gov (United States)

Best, Virginia; Keidser, Gitte; Freeston, Katrina; Buchholz, Jörg M

2016-07-01

Many listeners with hearing loss report particular difficulties with multitalker communication situations, but these difficulties are not well predicted using current clinical and laboratory assessment tools. The overall aim of this work is to create new speech tests that capture key aspects of multitalker communication situations and ultimately provide better predictions of real-world communication abilities and the effect of hearing aids. A test of ongoing speech comprehension introduced previously was extended to include naturalistic conversations between multiple talkers as targets, and a reverberant background environment containing competing conversations. In this article, we describe the development of this test and present a validation study. Thirty listeners with normal hearing participated in this study. Speech comprehension was measured for one-, two-, and three-talker passages at three different signal-to-noise ratios (SNRs), and working memory ability was measured using the reading span test. Analyses were conducted to examine passage equivalence, learning effects, and test-retest reliability, and to characterize the effects of number of talkers and SNR. Although we observed differences in difficulty across passages, it was possible to group the passages into four equivalent sets. Using this grouping, we achieved good test-retest reliability and observed no significant learning effects. Comprehension performance was sensitive to the SNR but did not decrease as the number of talkers increased. Individual performance showed associations with age and reading span score. This new dynamic speech comprehension test appears to be valid and suitable for experimental purposes. Further work will explore its utility as a tool for predicting real-world communication ability and hearing aid benefit. American Academy of Audiology.
HIV/AIDS AND THE RELIGIOUS LEADERS

African Journals Online (AJOL)

OLUDURO

Arguably, no religion is free from this epidemic ...... 89 There have been documented reports in Burundi and Malaysia of couples offering fake HIV/AIDS ..... Sidibé M 2010 Having faith: The global challenge of HIV and AIDS (Speech delivered.
Modeling speech intelligibility in adverse conditions

DEFF Research Database (Denmark)

Dau, Torsten

2012-01-01

) in conditions with nonlinearly processed speech. Instead of considering the reduction of the temporal modulation energy as the intelligibility metric, as assumed in the STI, the sEPSM applies the signal-to-noise ratio in the envelope domain (SNRenv). This metric was shown to be the key for predicting...... understanding speech when more than one person is talking, even when reduced audibility has been fully compensated for by a hearing aid. The reasons for these difficulties are not well understood. This presentation highlights recent concepts of the monaural and binaural signal processing strategies employed...... by the normal as well as impaired auditory system. Jørgensen and Dau [(2011). J. Acoust. Soc. Am. 130, 1475-1487] proposed the speech-based envelope power spectrum model (sEPSM) in an attempt to overcome the limitations of the classical speech transmission index (STI) and speech intelligibility index (SII...
Integration of auditory and visual speech information

NARCIS (Netherlands)

Hall, M.; Smeele, P.M.T.; Kuhl, P.K.

1998-01-01

The integration of auditory and visual speech is observed when modes specify different places of articulation. Influences of auditory variation on integration were examined using consonant identifi-cation, plus quality and similarity ratings. Auditory identification predicted auditory-visual
Quantifying the bystander-effect of 2.5G mobile telephones on the speech perception of digital hearing aid users.

Science.gov (United States)

Vlastarakos, P V; Nikolopoulos, T P; Manolopoulos, L; Stamou, A; Halkiotis, K K; Ferekidis, E; Georgiou, E

2012-01-01

To quantify the bystander-effect of 2.5G mobile telephones (2.5G-MTs) on the speech perception of digital hearing-aid (dHA) users. Differences in the susceptibility of behind-the-ear (BTE) compared to in-to-the-ear (ITE) dHAs were also assessed. Prospective-comparative study conducted at a tertiary referral centre (ENT Department) and a HA-fitting laboratory. Key-word recognition scores from open-sentence lists were calculated. Power-analysis determined that a minimum of 60 subjects with SNHL (30 in each group), using either BTE or ITE dHAs, were required for reliable study outcomes. Sixty-four adults were tested with a functioning 2.5G-MT at almost physical contact with their ear; thirty subjects used BTE and 34 ITE dHAs. Aided word recognition score differences between studied groups and within each group, while a 2.5G-MT was activated. Cut-off inclusion criterion regarding baseline aided word recognition score was 75%. Baseline aided word recognition scores for ITE dHAs were better compared to BTE ones (p ITE dHAs proved more susceptible to electromagnetic interference (p ITE HAs is not confirmed by the results of the present study. EBM level of evidence: 2c.
The Soft Palate Friendly Speech Bulb for Velopharyngeal Insufficiency.

Science.gov (United States)

Kahlon, Sukhdeep Singh; Kahlon, Monaliza; Gupta, Shilpa; Dhingra, Parvinder Singh

2016-09-01

Velopharyngeal insufficiency is an anatomic defect of the soft palate making palatopharyngeal sphincter incomplete. It is an important concern to address in patients with bilateral cleft lip and palate. Speech aid prosthesis or speech bulbs are best choice in cases where surgically repaired soft palate is too short to contact pharyngeal walls during function but these prosthesis have been associated with inadequate marginal closure, ulcerations and patient discomfort. Here is a case report of untreated bilateral cleft lip and palate associated with palatal insufficiency treated by means of palate friendly innovative speech bulb. This modified speech bulb is a combination of hard acrylic and soft lining material. The hard self-curing acrylic resin covers only the hard palate area and a permanent soft silicone lining material covering the soft palate area. A claw-shaped wire component was extended backwards from acrylic and was embedded in soft silicone to aid in retention and approximation of two materials. The advantage of adding the soft lining material in posterior area helped in covering the adequate superior extension and margins for maximal pharyngeal activity. This also improved the hypernasality, speech, comfort and overall patient acceptance.
Advantages of binaural amplification to acceptable noise level of directional hearing aid users.

Science.gov (United States)

Kim, Ja-Hee; Lee, Jae Hee; Lee, Ho-Ki

2014-06-01

The goal of the present study was to examine whether Acceptable Noise Levels (ANLs) would be lower (greater acceptance of noise) in binaural listening than in monaural listening condition and also whether meaningfulness of background speech noise would affect ANLs for directional microphone hearing aid users. In addition, any relationships between the individual binaural benefits on ANLs and the individuals' demographic information were investigated. Fourteen hearing aid users (mean age, 64 years) participated for experimental testing. For the ANL calculation, listeners' most comfortable listening levels and background noise level were measured. Using Korean ANL material, ANLs of all participants were evaluated under monaural and binaural amplification with a counterbalanced order. The ANLs were also compared across five types of competing speech noises, consisting of 1- through 8-talker background speech maskers. Seven young normal-hearing listeners (mean age, 27 years) participated for the same measurements as a pilot testing. The results demonstrated that directional hearing aid users accepted more noise (lower ANLs) with binaural amplification than with monaural amplification, regardless of the type of competing speech. When the background speech noise became more meaningful, hearing-impaired listeners accepted less amount of noise (higher ANLs), revealing that ANL is dependent on the intelligibility of the competing speech. The individuals' binaural advantages in ANLs were significantly greater for the listeners with longer experience of hearing aids, yet not related to their age or hearing thresholds. Binaural directional microphone processing allowed hearing aid users to accept a greater amount of background noise, which may in turn improve listeners' hearing aid success. Informational masking substantially influenced background noise acceptance. Given a significant association between ANLs and duration of hearing aid usage, ANL measurement can be useful for
Tune in or tune out: age-related differences in listening to speech in music.

Science.gov (United States)

Russo, Frank A; Pichora-Fuller, M Kathleen

2008-10-01

To examine age-related differences in listening to speech in music. In the first experiment, the effect of music familiarity on word identification was compared with a standard measure of word identification in multitalker babble. The average level of the backgrounds was matched and two speech-to-background ratios were tested. In the second experiment, recognition recall was measured for background music heard during a word identification task. For older adults, word identification did not depend on the type of background, but for younger adults word identification was better when the background was familiar music than when it was unfamiliar music or babble. Younger listeners remembered background music better than older listeners, with the pattern of false alarms suggesting that younger listeners consciously processed the background music more than older listeners. In other words, younger listeners attempted to "tune in" the music background, but older listeners attempted to "tune out" the background. These findings reveal age-related differences in listening to speech in music. When older listeners are confronted with a music background they tend to focus attention on the speech foreground. In contrast, younger listeners attend to both the speech foreground and music background. When music is familiar, this strategy adopted by younger listeners seems to be beneficial to word identification.
Identification of irradiated spices with aid of scintillation counter

International Nuclear Information System (INIS)

Uusheimo, K.

1989-08-01

The aim off the work was to determine how one can identify gamma-irradiated spices with aid of a scintillation counter (LKB/Wallac 1219 RackBeta Spectral) by chemiluminescence measurements. Even though scintillation counters are more sensitive than real luminometers they have not been capable in identifying the irradiated spices after contact with photosensitizer like luminol, isoluminol and lucigenin presumably because the actual chemiluminescence reaction took place before the sample vial reached the measuring range. Whereas it was noticed that the identification of pure, dry allspice, black pepper, white peppar and cardemom was possible without any solutions when there were also present similar unirradiated spices. The identification was possible even after 23 weeks duration depending on the dose of the irradiation (10 kGy or 50 kGy) and the weight of the samples (1 g or 9 g). The duration of the investigation was 23 weeks
The effect of instantaneous input dynamic range setting on the speech perception of children with the nucleus 24 implant.

Science.gov (United States)

Davidson, Lisa S; Skinner, Margaret W; Holstad, Beth A; Fears, Beverly T; Richter, Marie K; Matusofsky, Margaret; Brenner, Christine; Holden, Timothy; Birath, Amy; Kettel, Jerrica L; Scollie, Susan

2009-06-01

The purpose of this study was to examine the effects of a wider instantaneous input dynamic range (IIDR) setting on speech perception and comfort in quiet and noise for children wearing the Nucleus 24 implant system and the Freedom speech processor. In addition, children's ability to understand soft and conversational level speech in relation to aided sound-field thresholds was examined. Thirty children (age, 7 to 17 years) with the Nucleus 24 cochlear implant system and the Freedom speech processor with two different IIDR settings (30 versus 40 dB) were tested on the Consonant Nucleus Consonant (CNC) word test at 50 and 60 dB SPL, the Bamford-Kowal-Bench Speech in Noise Test, and a loudness rating task for four-talker speech noise. Aided thresholds for frequency-modulated tones, narrowband noise, and recorded Ling sounds were obtained with the two IIDRs and examined in relation to CNC scores at 50 dB SPL. Speech Intelligibility Indices were calculated using the long-term average speech spectrum of the CNC words at 50 dB SPL measured at each test site and aided thresholds. Group mean CNC scores at 50 dB SPL with the 40 IIDR were significantly higher (p Speech in Noise Test were not significantly different for the two IIDRs. Significantly improved aided thresholds at 250 to 6000 Hz as well as higher Speech Intelligibility Indices afforded improved audibility for speech presented at soft levels (50 dB SPL). These results indicate that an increased IIDR provides improved word recognition for soft levels of speech without compromising comfort of higher levels of speech sounds or sentence recognition in noise.
Issues in Identification and Assessment of Children with Autism and a Proposed Resource Toolkit for Speech-Language Pathologists.

Science.gov (United States)

Hus, Yvette

2017-01-01

The prevalence of autism spectrum disorder (ASD) has increased significantly in the last decade as have treatment choices. Nonetheless, the vastly diverse autism topic includes issues related to naming, description, iden-tification, assessment, and differentiation from other neu-rodevelopmental conditions. ASD issues directly impact speech-language pathologists (SLPs) who often see these children as the second contact, after pediatric medical practitioners. Because of shared symptomology, differentiation among neurodevelopmental disorders is crucial as it impacts treatment, educational choices, and the performance trajectory of affected children. To highlight issues in: identification and differentiation of ASD from other communication and language challenges, the prevalence differences between ASD gender phenotypes, and the insufficient consideration of cultural factors in evaluating ASD in children. A second objective was to propose a tool to assist SLPs in the management of autism in children. A universal resource toolkit development project for SLP communities at large is proposed. The resource is comprised of research-based observation and screening tools for caregivers and educators, as well as parent questionnaires for portraying the children's function in the family, cultural com-munity, and educational setting. © 2017 S. Karger AG, Basel.
Is the Speech Transmission Index (STI) a robust measure of sound system speech intelligibility performance?

Science.gov (United States)

Mapp, Peter

2002-11-01

Although RaSTI is a good indicator of the speech intelligibility capability of auditoria and similar spaces, during the past 2-3 years it has been shown that RaSTI is not a robust predictor of sound system intelligibility performance. Instead, it is now recommended, within both national and international codes and standards, that full STI measurement and analysis be employed. However, new research is reported, that indicates that STI is not as flawless, nor robust as many believe. The paper highlights a number of potential error mechanisms. It is shown that the measurement technique and signal excitation stimulus can have a significant effect on the overall result and accuracy, particularly where DSP-based equipment is employed. It is also shown that in its current state of development, STI is not capable of appropriately accounting for a number of fundamental speech and system attributes, including typical sound system frequency response variations and anomalies. This is particularly shown to be the case when a system is operating under reverberant conditions. Comparisons between actual system measurements and corresponding word score data are reported where errors of up to 50 implications for VA and PA system performance verification will be discussed.
Musical background not associated with self-perceived hearing performance or speech perception in postlingual cochlear-implant users.

Science.gov (United States)

Fuller, Christina; Free, Rolien; Maat, Bert; Başkent, Deniz

2012-08-01

In normal-hearing listeners, musical background has been observed to change the sound representation in the auditory system and produce enhanced performance in some speech perception tests. Based on these observations, it has been hypothesized that musical background can influence sound and speech perception, and as an extension also the quality of life, by cochlear-implant users. To test this hypothesis, this study explored musical background [using the Dutch Musical Background Questionnaire (DMBQ)], and self-perceived sound and speech perception and quality of life [using the Nijmegen Cochlear Implant Questionnaire (NCIQ) and the Speech Spatial and Qualities of Hearing Scale (SSQ)] in 98 postlingually deafened adult cochlear-implant recipients. In addition to self-perceived measures, speech perception scores (percentage of phonemes recognized in words presented in quiet) were obtained from patient records. The self-perceived hearing performance was associated with the objective speech perception. Forty-one respondents (44% of 94 respondents) indicated some form of formal musical training. Fifteen respondents (18% of 83 respondents) judged themselves as having musical training, experience, and knowledge. No association was observed between musical background (quantified by DMBQ), and self-perceived hearing-related performance or quality of life (quantified by NCIQ and SSQ), or speech perception in quiet.
Multimodal Speech Capture System for Speech Rehabilitation and Learning.

Science.gov (United States)

Sebkhi, Nordine; Desai, Dhyey; Islam, Mohammad; Lu, Jun; Wilson, Kimberly; Ghovanloo, Maysam

2017-11-01

Speech-language pathologists (SLPs) are trained to correct articulation of people diagnosed with motor speech disorders by analyzing articulators' motion and assessing speech outcome while patients speak. To assist SLPs in this task, we are presenting the multimodal speech capture system (MSCS) that records and displays kinematics of key speech articulators, the tongue and lips, along with voice, using unobtrusive methods. Collected speech modalities, tongue motion, lips gestures, and voice are visualized not only in real-time to provide patients with instant feedback but also offline to allow SLPs to perform post-analysis of articulators' motion, particularly the tongue, with its prominent but hardly visible role in articulation. We describe the MSCS hardware and software components, and demonstrate its basic visualization capabilities by a healthy individual repeating the words "Hello World." A proof-of-concept prototype has been successfully developed for this purpose, and will be used in future clinical studies to evaluate its potential impact on accelerating speech rehabilitation by enabling patients to speak naturally. Pattern matching algorithms to be applied to the collected data can provide patients with quantitative and objective feedback on their speech performance, unlike current methods that are mostly subjective, and may vary from one SLP to another.

Only Behavioral But Not Self-Report Measures of Speech Perception Correlate with Cognitive Abilities.

Science.gov (United States)

Heinrich, Antje; Henshaw, Helen; Ferguson, Melanie A

2016-01-01

Good speech perception and communication skills in everyday life are crucial for participation and well-being, and are therefore an overarching aim of auditory rehabilitation. Both behavioral and self-report measures can be used to assess these skills. However, correlations between behavioral and self-report speech perception measures are often low. One possible explanation is that there is a mismatch between the specific situations used in the assessment of these skills in each method, and a more careful matching across situations might improve consistency of results. The role that cognition plays in specific speech situations may also be important for understanding communication, as speech perception tests vary in their cognitive demands. In this study, the role of executive function, working memory (WM) and attention in behavioral and self-report measures of speech perception was investigated. Thirty existing hearing aid users with mild-to-moderate hearing loss aged between 50 and 74 years completed a behavioral test battery with speech perception tests ranging from phoneme discrimination in modulated noise (easy) to words in multi-talker babble (medium) and keyword perception in a carrier sentence against a distractor voice (difficult). In addition, a self-report measure of aided communication, residual disability from the Glasgow Hearing Aid Benefit Profile, was obtained. Correlations between speech perception tests and self-report measures were higher when specific speech situations across both were matched. Cognition correlated with behavioral speech perception test results but not with self-report. Only the most difficult speech perception test, keyword perception in a carrier sentence with a competing distractor voice, engaged executive functions in addition to WM. In conclusion, any relationship between behavioral and self-report speech perception is not mediated by a shared correlation with cognition.
Interdependence of linguistic and indexical speech perception skills in school-age children with early cochlear implantation.

Science.gov (United States)

Geers, Ann E; Davidson, Lisa S; Uchanski, Rosalie M; Nicholas, Johanna G

2013-09-01

This study documented the ability of experienced pediatric cochlear implant (CI) users to perceive linguistic properties (what is said) and indexical attributes (emotional intent and talker identity) of speech, and examined the extent to which linguistic (LSP) and indexical (ISP) perception skills are related. Preimplant-aided hearing, age at implantation, speech processor technology, CI-aided thresholds, sequential bilateral cochlear implantation, and academic integration with hearing age-mates were examined for their possible relationships to both LSP and ISP skills. Sixty 9- to 12-year olds, first implanted at an early age (12 to 38 months), participated in a comprehensive test battery that included the following LSP skills: (1) recognition of monosyllabic words at loud and soft levels, (2) repetition of phonemes and suprasegmental features from nonwords, and (3) recognition of key words from sentences presented within a noise background, and the following ISP skills: (1) discrimination of across-gender and within-gender (female) talkers and (2) identification and discrimination of emotional content from spoken sentences. A group of 30 age-matched children without hearing loss completed the nonword repetition, and talker- and emotion-perception tasks for comparison. Word-recognition scores decreased with signal level from a mean of 77% correct at 70 dB SPL to 52% at 50 dB SPL. On average, CI users recognized 50% of key words presented in sentences that were 9.8 dB above background noise. Phonetic properties were repeated from nonword stimuli at about the same level of accuracy as suprasegmental attributes (70 and 75%, respectively). The majority of CI users identified emotional content and differentiated talkers significantly above chance levels. Scores on LSP and ISP measures were combined into separate principal component scores and these components were highly correlated (r = 0.76). Both LSP and ISP component scores were higher for children who received a CI
Hearing aid fitting for visual and hearing impaired patients with Usher syndrome type IIa.

Science.gov (United States)

Hartel, B P; Agterberg, M J H; Snik, A F; Kunst, H P M; van Opstal, A J; Bosman, A J; Pennings, R J E

2017-08-01

Usher syndrome is the leading cause of hereditary deaf-blindness. Most patients with Usher syndrome type IIa start using hearing aids from a young age. A serious complaint refers to interference between sound localisation abilities and adaptive sound processing (compression), as present in today's hearing aids. The aim of this study was to investigate the effect of advanced signal processing on binaural hearing, including sound localisation. In this prospective study, patients were fitted with hearing aids with a nonlinear (compression) and linear amplification programs. Data logging was used to objectively evaluate the use of either program. Performance was evaluated with a speech-in-noise test, a sound localisation test and two questionnaires focussing on self-reported benefit. Data logging confirmed that the reported use of hearing aids was high. The linear program was used significantly more often (average use: 77%) than the nonlinear program (average use: 17%). The results for speech intelligibility in noise and sound localisation did not show a significant difference between type of amplification. However, the self-reported outcomes showed higher scores on 'ease of communication' and overall benefit, and significant lower scores on disability for the new hearing aids when compared to their previous hearing aids with compression amplification. Patients with Usher syndrome type IIa prefer a linear amplification over nonlinear amplification when fitted with novel hearing aids. Apart from a significantly higher logged use, no difference in speech in noise and sound localisation was observed between linear and nonlinear amplification with the currently used tests. Further research is needed to evaluate the reasons behind the preference for the linear settings. © 2016 The Authors. Clinical Otolaryngology Published by John Wiley & Sons Ltd.
Investigating executive functions in children with severe speech and movement disorders using structured tasks

Directory of Open Access Journals (Sweden)

Kristine eStadskleiv

2014-09-01

Full Text Available Executive functions are the basis for goal-directed activity and include planning, monitoring, and inhibition, and language seems to play a role in the development of these functions. There is a tradition of studying executive function in both typical and atypical populations, and the present study investigates executive functions in children with severe speech and motor impairments who are communicating using communication aids with graphic symbols, letters and/or words. There are few neuropsychological studies of children in this group and little is known about their cognitive functioning, including executive functions. It was hypothesized that aided communication would tax executive functions more than speech. 29 children using communication aids and 27 naturally speaking children participated. Structured tasks resembling everyday activities, where the action goals had to be reached through communication with a partner, were used to get information about executive functions. The children a directed the partner to perform actions like building a Lego tower from a model the partner could not see and b gave information about an object without naming it to a person who had to guess what object it was. The executive functions of planning, monitoring and impulse control were coded from the children’s on-task behavior. Both groups solved most of the tasks correctly, indicating that aided communicators are able to use language to direct another person to do a complex set of actions. Planning and lack of impulsivity was positively related to task success in both groups. The aided group completed significantly fewer tasks, spent longer time and showed more variation in performance than the comparison group. The aided communicators scored lower on planning and showed more impulsivity than the comparison group, while both groups showed an equal degree of monitoring of the work progress. The results are consistent with the hypothesis that aided language
Beamforming under Quantization Errors in Wireless Binaural Hearing Aids

Directory of Open Access Journals (Sweden)

Srinivasan Sriram

2008-01-01

Full Text Available Improving the intelligibility of speech in different environments is one of the main objectives of hearing aid signal processing algorithms. Hearing aids typically employ beamforming techniques using multiple microphones for this task. In this paper, we discuss a binaural beamforming scheme that uses signals from the hearing aids worn on both the left and right ears. Specifically, we analyze the effect of a low bit rate wireless communication link between the left and right hearing aids on the performance of the beamformer. The scheme is comprised of a generalized sidelobe canceller (GSC that has two inputs: observations from one ear, and quantized observations from the other ear, and whose output is an estimate of the desired signal. We analyze the performance of this scheme in the presence of a localized interferer as a function of the communication bit rate using the resultant mean-squared error as the signal distortion measure.
Beamforming under Quantization Errors in Wireless Binaural Hearing Aids

Directory of Open Access Journals (Sweden)

Kees Janse

2008-09-01

Full Text Available Improving the intelligibility of speech in different environments is one of the main objectives of hearing aid signal processing algorithms. Hearing aids typically employ beamforming techniques using multiple microphones for this task. In this paper, we discuss a binaural beamforming scheme that uses signals from the hearing aids worn on both the left and right ears. Specifically, we analyze the effect of a low bit rate wireless communication link between the left and right hearing aids on the performance of the beamformer. The scheme is comprised of a generalized sidelobe canceller (GSC that has two inputs: observations from one ear, and quantized observations from the other ear, and whose output is an estimate of the desired signal. We analyze the performance of this scheme in the presence of a localized interferer as a function of the communication bit rate using the resultant mean-squared error as the signal distortion measure.
Memory for speech and speech for memory.

Science.gov (United States)

Locke, J L; Kutz, K J

1975-03-01

Thirty kindergarteners, 15 who substituted /w/ for /r/ and 15 with correct articulation, received two perception tests and a memory test that included /w/ and /r/ in minimally contrastive syllables. Although both groups had nearly perfect perception of the experimenter's productions of /w/ and /r/, misarticulating subjects perceived their own tape-recorded w/r productions as /w/. In the memory task these same misarticulating subjects committed significantly more /w/-/r/ confusions in unspoken recall. The discussion considers why people subvocally rehearse; a developmental period in which children do not rehearse; ways subvocalization may aid recall, including motor and acoustic encoding; an echoic store that provides additional recall support if subjects rehearse vocally, and perception of self- and other- produced phonemes by misarticulating children-including its relevance to a motor theory of perception. Evidence is presented that speech for memory can be sufficiently impaired to cause memory disorder. Conceptions that restrict speech disorder to an impairment of communication are challenged.
Representational Similarity Analysis Reveals Heterogeneous Networks Supporting Speech Motor Control

DEFF Research Database (Denmark)

Zheng, Zane; Cusack, Rhodri; Johnsrude, Ingrid

The everyday act of speaking involves the complex processes of speech motor control. One important feature of such control is regulation of articulation when auditory concomitants of speech do not correspond to the intended motor gesture. While theoretical accounts of speech monitoring posit...... multiple functional components required for detection of errors in speech planning (e.g., Levelt, 1983), neuroimaging studies generally indicate either single brain regions sensitive to speech production errors, or small, discrete networks. Here we demonstrate that the complex system controlling speech...... is supported by a complex neural network that is involved in linguistic, motoric and sensory processing. With the aid of novel real-time acoustic analyses and representational similarity analyses of fMRI signals, our data show functionally differentiated networks underlying auditory feedback control of speech....
Spectrotemporal Modulation Detection and Speech Perception by Cochlear Implant Users.

Science.gov (United States)

Won, Jong Ho; Moon, Il Joon; Jin, Sunhwa; Park, Heesung; Woo, Jihwan; Cho, Yang-Sun; Chung, Won-Ho; Hong, Sung Hwa

2015-01-01

Spectrotemporal modulation (STM) detection performance was examined for cochlear implant (CI) users. The test involved discriminating between an unmodulated steady noise and a modulated stimulus. The modulated stimulus presents frequency modulation patterns that change in frequency over time. In order to examine STM detection performance for different modulation conditions, two different temporal modulation rates (5 and 10 Hz) and three different spectral modulation densities (0.5, 1.0, and 2.0 cycles/octave) were employed, producing a total 6 different STM stimulus conditions. In order to explore how electric hearing constrains STM sensitivity for CI users differently from acoustic hearing, normal-hearing (NH) and hearing-impaired (HI) listeners were also tested on the same tasks. STM detection performance was best in NH subjects, followed by HI subjects. On average, CI subjects showed poorest performance, but some CI subjects showed high levels of STM detection performance that was comparable to acoustic hearing. Significant correlations were found between STM detection performance and speech identification performance in quiet and in noise. In order to understand the relative contribution of spectral and temporal modulation cues to speech perception abilities for CI users, spectral and temporal modulation detection was performed separately and related to STM detection and speech perception performance. The results suggest that that slow spectral modulation rather than slow temporal modulation may be important for determining speech perception capabilities for CI users. Lastly, test-retest reliability for STM detection was good with no learning. The present study demonstrates that STM detection may be a useful tool to evaluate the ability of CI sound processing strategies to deliver clinically pertinent acoustic modulation information.
Brain Plasticity in Speech Training in Native English Speakers Learning Mandarin Tones

Science.gov (United States)

Heinzen, Christina Carolyn

The current study employed behavioral and event-related potential (ERP) measures to investigate brain plasticity associated with second-language (L2) phonetic learning based on an adaptive computer training program. The program utilized the acoustic characteristics of Infant-Directed Speech (IDS) to train monolingual American English-speaking listeners to perceive Mandarin lexical tones. Behavioral identification and discrimination tasks were conducted using naturally recorded speech, carefully controlled synthetic speech, and non-speech control stimuli. The ERP experiments were conducted with selected synthetic speech stimuli in a passive listening oddball paradigm. Identical pre- and post- tests were administered on nine adult listeners, who completed two-to-three hours of perceptual training. The perceptual training sessions used pair-wise lexical tone identification, and progressed through seven levels of difficulty for each tone pair. The levels of difficulty included progression in speaker variability from one to four speakers and progression through four levels of acoustic exaggeration of duration, pitch range, and pitch contour. Behavioral results for the natural speech stimuli revealed significant training-induced improvement in identification of Tones 1, 3, and 4. Improvements in identification of Tone 4 generalized to novel stimuli as well. Additionally, comparison between discrimination of across-category and within-category stimulus pairs taken from a synthetic continuum revealed a training-induced shift toward more native-like categorical perception of the Mandarin lexical tones. Analysis of the Mismatch Negativity (MMN) responses in the ERP data revealed increased amplitude and decreased latency for pre-attentive processing of across-category discrimination as a result of training. There were also laterality changes in the MMN responses to the non-speech control stimuli, which could reflect reallocation of brain resources in processing pitch patterns
Perception of synthetic speech produced automatically by rule: Intelligibility of eight text-to-speech systems.

Science.gov (United States)

Greene, Beth G; Logan, John S; Pisoni, David B

1986-03-01

We present the results of studies designed to measure the segmental intelligibility of eight text-to-speech systems and a natural speech control, using the Modified Rhyme Test (MRT). Results indicated that the voices tested could be grouped into four categories: natural speech, high-quality synthetic speech, moderate-quality synthetic speech, and low-quality synthetic speech. The overall performance of the best synthesis system, DECtalk-Paul, was equivalent to natural speech only in terms of performance on initial consonants. The findings are discussed in terms of recent work investigating the perception of synthetic speech under more severe conditions. Suggestions for future research on improving the quality of synthetic speech are also considered.
Perception of synthetic speech produced automatically by rule: Intelligibility of eight text-to-speech systems

Science.gov (United States)

GREENE, BETH G.; LOGAN, JOHN S.; PISONI, DAVID B.

2012-01-01

We present the results of studies designed to measure the segmental intelligibility of eight text-to-speech systems and a natural speech control, using the Modified Rhyme Test (MRT). Results indicated that the voices tested could be grouped into four categories: natural speech, high-quality synthetic speech, moderate-quality synthetic speech, and low-quality synthetic speech. The overall performance of the best synthesis system, DECtalk-Paul, was equivalent to natural speech only in terms of performance on initial consonants. The findings are discussed in terms of recent work investigating the perception of synthetic speech under more severe conditions. Suggestions for future research on improving the quality of synthetic speech are also considered. PMID:23225916
Phoneme Compression: processing of the speech signal and effects on speech intelligibility in hearing-Impaired listeners

NARCIS (Netherlands)

A. Goedegebure (Andre)

2005-01-01

textabstractHearing-aid users often continue to have problems with poor speech understanding in difficult acoustical conditions. Another generally accounted problem is that certain sounds become too loud whereas other sounds are still not audible. Dynamic range compression is a signal processing
Gender and vocal production mode discrimination using the high frequencies for speech and singing

Science.gov (United States)

Monson, Brian B.; Lotto, Andrew J.; Story, Brad H.

2014-01-01

Humans routinely produce acoustical energy at frequencies above 6 kHz during vocalization, but this frequency range is often not represented in communication devices and speech perception research. Recent advancements toward high-definition (HD) voice and extended bandwidth hearing aids have increased the interest in the high frequencies. The potential perceptual information provided by high-frequency energy (HFE) is not well characterized. We found that humans can accomplish tasks of gender discrimination and vocal production mode discrimination (speech vs. singing) when presented with acoustic stimuli containing only HFE at both amplified and normal levels. Performance in these tasks was robust in the presence of low-frequency masking noise. No substantial learning effect was observed. Listeners also were able to identify the sung and spoken text (excerpts from “The Star-Spangled Banner”) with very few exposures. These results add to the increasing evidence that the high frequencies provide at least redundant information about the vocal signal, suggesting that its representation in communication devices (e.g., cell phones, hearing aids, and cochlear implants) and speech/voice synthesizers could improve these devices and benefit normal-hearing and hearing-impaired listeners. PMID:25400613
Performance of Jet Substructure Techniques and Boosted Object Identification in ATLAS

CERN Document Server

Lacey, J; The ATLAS collaboration

2014-01-01

ATLAS has implemented and commissioned many new jet substructure techniques to aid in the identification and interpretation of hadronic final states originating from Lorentz-boosted heavy particles produced at the LHC. These techniques include quantum jets, jet charge, jet shapes, quark/gluon, boosted boson and top quark tagging, along with grooming methods such as pruning, trimming, and filtering. These techniques have been validated using the large 2012 ATLAS dataset. Presented here is a summary of the state of the art jet substructure and tagging techniques developed in ATLAS, their performance and recent results.
Auditory and Non-Auditory Contributions for Unaided Speech Recognition in Noise as a Function of Hearing Aid Use

OpenAIRE

Gieseler, Anja; Tahden, Maike A. S.; Thiel, Christiane M.; Wagener, Kirsten C.; Meis, Markus; Colonius, Hans

2017-01-01

Differences in understanding speech in noise among hearing-impaired individuals cannot be explained entirely by hearing thresholds alone, suggesting the contribution of other factors beyond standard auditory ones as derived from the audiogram. This paper reports two analyses addressing individual differences in the explanation of unaided speech-in-noise performance among n = 438 elderly hearing-impaired listeners (mean = 71.1 ± 5.8 years). The main analysis was designed to identify clinically...
Effects of Age and Working Memory Capacity on Speech Recognition Performance in Noise Among Listeners With Normal Hearing.

Science.gov (United States)

Gordon-Salant, Sandra; Cole, Stacey Samuels

2016-01-01

This study aimed to determine if younger and older listeners with normal hearing who differ on working memory span perform differently on speech recognition tests in noise. Older adults typically exhibit poorer speech recognition scores in noise than younger adults, which is attributed primarily to poorer hearing sensitivity and more limited working memory capacity in older than younger adults. Previous studies typically tested older listeners with poorer hearing sensitivity and shorter working memory spans than younger listeners, making it difficult to discern the importance of working memory capacity on speech recognition. This investigation controlled for hearing sensitivity and compared speech recognition performance in noise by younger and older listeners who were subdivided into high and low working memory groups. Performance patterns were compared for different speech materials to assess whether or not the effect of working memory capacity varies with the demands of the specific speech test. The authors hypothesized that (1) normal-hearing listeners with low working memory span would exhibit poorer speech recognition performance in noise than those with high working memory span; (2) older listeners with normal hearing would show poorer speech recognition scores than younger listeners with normal hearing, when the two age groups were matched for working memory span; and (3) an interaction between age and working memory would be observed for speech materials that provide contextual cues. Twenty-eight older (61 to 75 years) and 25 younger (18 to 25 years) normal-hearing listeners were assigned to groups based on age and working memory status. Northwestern University Auditory Test No. 6 words and Institute of Electrical and Electronics Engineers sentences were presented in noise using an adaptive procedure to measure the signal-to-noise ratio corresponding to 50% correct performance. Cognitive ability was evaluated with two tests of working memory (Listening
Schizophrenia alters intra-network functional connectivity in the caudate for detecting speech under informational speech masking conditions.

Science.gov (United States)

Zheng, Yingjun; Wu, Chao; Li, Juanhua; Li, Ruikeng; Peng, Hongjun; She, Shenglin; Ning, Yuping; Li, Liang

2018-04-04

Speech recognition under noisy "cocktail-party" environments involves multiple perceptual/cognitive processes, including target detection, selective attention, irrelevant signal inhibition, sensory/working memory, and speech production. Compared to health listeners, people with schizophrenia are more vulnerable to masking stimuli and perform worse in speech recognition under speech-on-speech masking conditions. Although the schizophrenia-related speech-recognition impairment under "cocktail-party" conditions is associated with deficits of various perceptual/cognitive processes, it is crucial to know whether the brain substrates critically underlying speech detection against informational speech masking are impaired in people with schizophrenia. Using functional magnetic resonance imaging (fMRI), this study investigated differences between people with schizophrenia (n = 19, mean age = 33 ± 10 years) and their matched healthy controls (n = 15, mean age = 30 ± 9 years) in intra-network functional connectivity (FC) specifically associated with target-speech detection under speech-on-speech-masking conditions. The target-speech detection performance under the speech-on-speech-masking condition in participants with schizophrenia was significantly worse than that in matched healthy participants (healthy controls). Moreover, in healthy controls, but not participants with schizophrenia, the strength of intra-network FC within the bilateral caudate was positively correlated with the speech-detection performance under the speech-masking conditions. Compared to controls, patients showed altered spatial activity pattern and decreased intra-network FC in the caudate. In people with schizophrenia, the declined speech-detection performance under speech-on-speech masking conditions is associated with reduced intra-caudate functional connectivity, which normally contributes to detecting target speech against speech masking via its functions of suppressing masking-speech signals.
Computerized System to Aid Deaf Children in Speech Learning

National Research Council Canada - National Science Library

Riella, Rodrigo

2001-01-01

.... The aim of this analyzer is not to find the distinction between spoken words, main objective of a speech recognizer but to calculate a level of correctness in the toggle of a specific word, Voice...
Speech Recognition and Cognitive Skills in Bimodal Cochlear Implant Users

Science.gov (United States)

Hua, Håkan; Johansson, Björn; Magnusson, Lennart; Lyxell, Björn; Ellis, Rachel J.

2017-01-01

Purpose: To examine the relation between speech recognition and cognitive skills in bimodal cochlear implant (CI) and hearing aid users. Method: Seventeen bimodal CI users (28-74 years) were recruited to the study. Speech recognition tests were carried out in quiet and in noise. The cognitive tests employed included the Reading Span Test and the…

Communicative performance of adolescents with severe speech impairment: influence of context.

Science.gov (United States)

Dalton, B M; Bedrosian, J L

1989-08-01

The communicative performance of 4 preoperational-level adolescents, using limited speech, gestures, and communication board techniques, was examined in a two-part investigation. In Part 1, each subject participated in an academic interaction with a teacher in a therapy room. Data were transcribed and coded for communication mode, function, and role. Two subjects were found to predominantly use the speech mode, while the remaining 2 predominantly used board and one other mode. The majority of productions consisted of responses to requests, and the initiator role was infrequently occupied. These findings were similar to those reported in previous investigations conducted in classroom settings. In Part 2, another examination of the communicative performance of these subjects was conducted in spontaneous interactions involving speaking and nonspeaking peers in a therapy room. Using the same data analysis procedures, gesture and speech modes predominated for 3 of the subjects in the nonspeaking peer interactions. The remaining subject exhibited minimal interaction. No consistent pattern of mode usage was exhibited across the speaking peer interactions. In the nonspeaking peer interactions, request predominated. In contrast, a variety of communication functions was exhibited in the speaking peer interactions. Both the initiator and the maintainer roles were occupied in the majority of interactions. Pertinent variables and clinical implications are discussed.
Vocabulary comprehension and strategies in name construction among children using aided communication.

Science.gov (United States)

Deliberato, Débora; Jennische, Margareta; Oxley, Judith; Nunes, Leila Regina d'Oliveira de Paula; Walter, Cátia Crivelenti de Figueiredo; Massaro, Munique; Almeida, Maria Amélia; Stadskleiv, Kristine; Basil, Carmen; Coronas, Marc; Smith, Martine; von Tetzchner, Stephen

2018-03-01

Vocabulary learning reflects the language experiences of the child, both in typical and atypical development, although the vocabulary development of children who use aided communication may differ from children who use natural speech. This study compared the performance of children using aided communication with that of peers using natural speech on two measures of vocabulary knowledge: comprehension of graphic symbols and labeling of common objects. There were 92 participants not considered intellectually disabled in the aided group. The reference group consisted of 60 participants without known disorders. The comprehension task consisted of 63 items presented individually in each participant's graphic system, together with four colored line drawings. Participants were required to indicate which drawing corresponded to the symbol. In the expressive labelling task, 20 common objects presented in drawings had to be named. Both groups indicated the correct drawing for most of the items in the comprehension tasks, with a small advantage for the reference group. The reference group named most objects quickly and accurately, demonstrating that the objects were common and easily named. The aided language group named the majority correctly and in addition used a variety of naming strategies; they required more time than the reference group. The results give insights into lexical processing in aided communication and may have implications for aided language intervention.
Relating working memory to compression parameters in clinically fit hearing AIDS.

Science.gov (United States)

Souza, Pamela E; Sirow, Lynn

2014-12-01

Several laboratory studies have demonstrated that working memory may influence response to compression speed in controlled (i.e., laboratory) comparisons of compression. In this study, the authors explored whether the same relationship would occur under less controlled conditions, as might occur in a typical audiology clinic. Participants included 27 older adults who sought hearing care in a private practice audiology clinic. Working memory was measured for each participant using a reading span test. The authors examined the relationship between working memory and aided speech recognition in noise, using clinically fit hearing aids with a range of compression speeds. Working memory, amount of hearing loss, and age each contributed to speech recognition, but the contribution depended on the speed of the compression processor. For fast-acting compression, the best performance was obtained by patients with high working memory. For slow-acting compression, speech recognition was affected by age and amount of hearing loss but was not affected by working memory. Despite the expectation of greater variability from differences in compression implementation, number of compression channels, or attendant signal processing, the relationship between working memory and compression speed showed a similar pattern as results from more controlled, laboratory-based studies.
Alleviating speech and deglutition: Role of a prosthodontist in multidisciplinary management of velopharyngeal insufficiency.

Science.gov (United States)

Nanda, Aditi; Koli, Dheeraj; Sharma, Sunanda; Suryavanshi, Shalini; Verma, Mahesh

2015-01-01

Surgical resection of soft palate due to cancer affects the effective functioning of the velopharyngeal mechanism (speech and deglutition). With the loss of speech intelligibility, hyper resonance in voice and impaired function of swallowing (due to nasal regurgitation), there is a depreciation in the quality of life of such an individual. In a multidisciplinary setup, the role of a prosthodontist has been described to rehabilitate such patients by fabrication of speech aid prosthesis. The design and method of fabrication of the prosthesis are simple and easy to perform. The use of prosthesis, together with training (of speech) by a speech pathologist resulted in improvement in speech. Furthermore, an improvement in swallowing had been noted, resulting in an improved nutritional intake and general well-being of an individual. The take-home message is that in the treatment of oral cancer, feasible, and rapid rehabilitation should be endeavored in order to make the patient socially more acceptable. The onus lies on the prosthodontist to practise the same in a rapid manner before the moral of the patient becomes low due to the associated stigma of cancer.
Audiovisual integration in children listening to spectrally degraded speech.

Science.gov (United States)

Maidment, David W; Kang, Hi Jee; Stewart, Hannah J; Amitay, Sygal

2015-02-01

The study explored whether visual information improves speech identification in typically developing children with normal hearing when the auditory signal is spectrally degraded. Children (n=69) and adults (n=15) were presented with noise-vocoded sentences from the Children's Co-ordinate Response Measure (Rosen, 2011) in auditory-only or audiovisual conditions. The number of bands was adaptively varied to modulate the degradation of the auditory signal, with the number of bands required for approximately 79% correct identification calculated as the threshold. The youngest children (4- to 5-year-olds) did not benefit from accompanying visual information, in comparison to 6- to 11-year-old children and adults. Audiovisual gain also increased with age in the child sample. The current data suggest that children younger than 6 years of age do not fully utilize visual speech cues to enhance speech perception when the auditory signal is degraded. This evidence not only has implications for understanding the development of speech perception skills in children with normal hearing but may also inform the development of new treatment and intervention strategies that aim to remediate speech perception difficulties in pediatric cochlear implant users.
Intelligent hearing aids: the next revolution.

Science.gov (United States)

Tao Zhang; Mustiere, Fred; Micheyl, Christophe

2016-08-01

The first revolution in hearing aids came from nonlinear amplification, which allows better compensation for both soft and loud sounds. The second revolution stemmed from the introduction of digital signal processing, which allows better programmability and more sophisticated algorithms. The third revolution in hearing aids is wireless, which allows seamless connectivity between a pair of hearing aids and with more and more external devices. Each revolution has fundamentally transformed hearing aids and pushed the entire industry forward significantly. Machine learning has received significant attention in recent years and has been applied in many other industries, e.g., robotics, speech recognition, genetics, and crowdsourcing. We argue that the next revolution in hearing aids is machine intelligence. In fact, this revolution is already quietly happening. We will review the development in at least three major areas: applications of machine learning in speech enhancement; applications of machine learning in individualization and customization of signal processing algorithms; applications of machine learning in improving the efficiency and effectiveness of clinical tests. With the advent of the internet of things, the above developments will accelerate. This revolution will bring patient satisfactions to a new level that has never been seen before.
Individual differneces in degraded speech perception

Science.gov (United States)

Carbonell, Kathy M.

One of the lasting concerns in audiology is the unexplained individual differences in speech perception performance even for individuals with similar audiograms. One proposal is that there are cognitive/perceptual individual differences underlying this vulnerability and that these differences are present in normal hearing (NH) individuals but do not reveal themselves in studies that use clear speech produced in quiet (because of a ceiling effect). However, previous studies have failed to uncover cognitive/perceptual variables that explain much of the variance in NH performance on more challenging degraded speech tasks. This lack of strong correlations may be due to either examining the wrong measures (e.g., working memory capacity) or to there being no reliable differences in degraded speech performance in NH listeners (i.e., variability in performance is due to measurement noise). The proposed project has 3 aims; the first, is to establish whether there are reliable individual differences in degraded speech performance for NH listeners that are sustained both across degradation types (speech in noise, compressed speech, noise-vocoded speech) and across multiple testing sessions. The second aim is to establish whether there are reliable differences in NH listeners' ability to adapt their phonetic categories based on short-term statistics both across tasks and across sessions; and finally, to determine whether performance on degraded speech perception tasks are correlated with performance on phonetic adaptability tasks, thus establishing a possible explanatory variable for individual differences in speech perception for NH and hearing impaired listeners.
Auditory-model based assessment of the effects of hearing loss and hearing-aid compression on spectral and temporal resolution

DEFF Research Database (Denmark)

Kowalewski, Borys; MacDonald, Ewen; Strelcyk, Olaf

2016-01-01

. However, due to the complexity of speech and its robustness to spectral and temporal alterations, the effects of DRC on speech perception have been mixed and controversial. The goal of the present study was to obtain a clearer understanding of the interplay between hearing loss and DRC by means......Most state-of-the-art hearing aids apply multi-channel dynamic-range compression (DRC). Such designs have the potential to emulate, at least to some degree, the processing that takes place in the healthy auditory system. One way to assess hearing-aid performance is to measure speech intelligibility....... Outcomes were simulated using the auditory processing model of Jepsen et al. (2008) with the front end modified to include effects of hearing impairment and DRC. The results were compared to experimental data from normal-hearing and hearing-impaired listeners....
Population Health in Pediatric Speech and Language Disorders: Available Data Sources and a Research Agenda for the Field.

Science.gov (United States)

Raghavan, Ramesh; Camarata, Stephen; White, Karl; Barbaresi, William; Parish, Susan; Krahn, Gloria

2018-05-17

The aim of the study was to provide an overview of population science as applied to speech and language disorders, illustrate data sources, and advance a research agenda on the epidemiology of these conditions. Computer-aided database searches were performed to identify key national surveys and other sources of data necessary to establish the incidence, prevalence, and course and outcome of speech and language disorders. This article also summarizes a research agenda that could enhance our understanding of the epidemiology of these disorders. Although the data yielded estimates of prevalence and incidence for speech and language disorders, existing sources of data are inadequate to establish reliable rates of incidence, prevalence, and outcomes for speech and language disorders at the population level. Greater support for inclusion of speech and language disorder-relevant questions is necessary in national health surveys to build the population science in the field.
Musician effect on perception of spectro-temporally degraded speech, vocal emotion, and music in young adolescents.

NARCIS (Netherlands)

Başkent, Deniz; Fuller, Christina; Galvin, John; Schepel, Like; Gaudrain, Etienne; Free, Rolien

2018-01-01

In adult normal-hearing musicians, perception of music, vocal emotion, and speech in noise has been previously shown to be better than non-musicians, sometimes even with spectro-temporally degraded stimuli. In this study, melodic contour identification, vocal emotion identification, and speech
Speech and language support: How physicians can identify and treat speech and language delays in the office setting.

Science.gov (United States)

Moharir, Madhavi; Barnett, Noel; Taras, Jillian; Cole, Martha; Ford-Jones, E Lee; Levin, Leo

2014-01-01

Failure to recognize and intervene early in speech and language delays can lead to multifaceted and potentially severe consequences for early child development and later literacy skills. While routine evaluations of speech and language during well-child visits are recommended, there is no standardized (office) approach to facilitate this. Furthermore, extensive wait times for speech and language pathology consultation represent valuable lost time for the child and family. Using speech and language expertise, and paediatric collaboration, key content for an office-based tool was developed. early and accurate identification of speech and language delays as well as children at risk for literacy challenges; appropriate referral to speech and language services when required; and teaching and, thus, empowering parents to create rich and responsive language environments at home. Using this tool, in combination with the Canadian Paediatric Society's Read, Speak, Sing and Grow Literacy Initiative, physicians will be better positioned to offer practical strategies to caregivers to enhance children's speech and language capabilities. The tool represents a strategy to evaluate speech and language delays. It depicts age-specific linguistic/phonetic milestones and suggests interventions. The tool represents a practical interim treatment while the family is waiting for formal speech and language therapy consultation.
Robust Speaker Authentication Based on Combined Speech and Voiceprint Recognition

Science.gov (United States)

Malcangi, Mario

2009-08-01

Personal authentication is becoming increasingly important in many applications that have to protect proprietary data. Passwords and personal identification numbers (PINs) prove not to be robust enough to ensure that unauthorized people do not use them. Biometric authentication technology may offer a secure, convenient, accurate solution but sometimes fails due to its intrinsically fuzzy nature. This research aims to demonstrate that combining two basic speech processing methods, voiceprint identification and speech recognition, can provide a very high degree of robustness, especially if fuzzy decision logic is used.
Methods and Application of Phonetic Label Alignment in Speech Processing Tasks

Directory of Open Access Journals (Sweden)

M. Myslivec

2000-12-01

Full Text Available The paper deals with the problem of automatic phonetic segmentation ofspeech signals, namely for speech analysis and recognition purposes.Several methods and approaches are described and evaluated from thepoint of view of their accuracy. A complete instruction for creating anannotated database for training a Czech speech recognition system isprovided together with the authors' own experience. The results of thework have found practical applications, for example, in developing atool for semi-automatic speech segmentation, building alarge-vocabulary phoneme-based speech recognition system and designingan aid for learning and practicing pronunciation of words or phrases inthe native or a foreign language.
Semi-non-intrusive objective intelligibility measure using spatial filtering in hearing aids

DEFF Research Database (Denmark)

Sørensen, Charlotte; Boldt, Jesper Bünsow; Gran, Frederik

2016-01-01

-intrusive metrics have not been able to achieve acceptable intelligibility predictions. This paper presents a new semi-non-intrusive intelligibility measure based on an existing intrusive measure, STOI, where an estimate of the clean speech is extracted using spatial filtering in the hearing aid. The results......Reliable non-intrusive online assessment of speech intelligibility can play a key role for the functioning of hearing aids, e.g. as guidance for adjusting the hearing aid settings to the environment. While existing intrusive metrics can provide a precise and reliable measure, the current non...
An Interactive Human Interface Arm Robot with the Development of Food Aid

Directory of Open Access Journals (Sweden)

NASHWAN D. Zaki

2012-03-01

Full Text Available A robotic system for the disabled who needs supports at meal is proposed. A feature of this system is that the robotic aid system can communicate with the operator using the speech recognition and speech synthesis functions. Another feature is that the robotic aid system uses an image processing, and by doing this the system can recognize the environmental situations of the dishes, cups and so on. Due to this image processing function, the operator does not need to specify the position and the posture of the dishes and target objects. Furthermore, combination communication between speech and image processing will enables a friendly man-machine to communicate with each other, since speech and visual information are essential in the human communication.
Cognitive Spare Capacity and Speech Communication: A Narrative Overview

Directory of Open Access Journals (Sweden)

Mary Rudner

2014-01-01

Full Text Available Background noise can make speech communication tiring and cognitively taxing, especially for individuals with hearing impairment. It is now well established that better working memory capacity is associated with better ability to understand speech under adverse conditions as well as better ability to benefit from the advanced signal processing in modern hearing aids. Recent work has shown that although such processing cannot overcome hearing handicap, it can increase cognitive spare capacity, that is, the ability to engage in higher level processing of speech. This paper surveys recent work on cognitive spare capacity and suggests new avenues of investigation.
Speech-based Class Attendance

Science.gov (United States)

Faizel Amri, Umar; Nur Wahidah Nik Hashim, Nik; Hazrin Hany Mohamad Hanif, Noor

2017-11-01

In the department of engineering, students are required to fulfil at least 80 percent of class attendance. Conventional method requires student to sign his/her initial on the attendance sheet. However, this method is prone to cheating by having another student signing for their fellow classmate that is absent. We develop our hypothesis according to a verse in the Holy Qur’an (95:4), “We have created men in the best of mould”. Based on the verse, we believe each psychological characteristic of human being is unique and thus, their speech characteristic should be unique. In this paper we present the development of speech biometric-based attendance system. The system requires user’s voice to be installed in the system as trained data and it is saved in the system for registration of the user. The following voice of the user will be the test data in order to verify with the trained data stored in the system. The system uses PSD (Power Spectral Density) and Transition Parameter as the method for feature extraction of the voices. Euclidean and Mahalanobis distances are used in order to verified the user’s voice. For this research, ten subjects of five females and five males were chosen to be tested for the performance of the system. The system performance in term of recognition rate is found to be 60% correct identification of individuals.
Role of working memory and lexical knowledge in perceptual restoration of interrupted speech.

Science.gov (United States)

Nagaraj, Naveen K; Magimairaj, Beula M

2017-12-01

The role of working memory (WM) capacity and lexical knowledge in perceptual restoration (PR) of missing speech was investigated using the interrupted speech perception paradigm. Speech identification ability, which indexed PR, was measured using low-context sentences periodically interrupted at 1.5 Hz. PR was measured for silent gated, low-frequency speech noise filled, and low-frequency fine-structure and envelope filled interrupted conditions. WM capacity was measured using verbal and visuospatial span tasks. Lexical knowledge was assessed using both receptive vocabulary and meaning from context tests. Results showed that PR was better for speech noise filled condition than other conditions tested. Both receptive vocabulary and verbal WM capacity explained unique variance in PR for the speech noise filled condition, but were unrelated to performance in the silent gated condition. It was only receptive vocabulary that uniquely predicted PR for fine-structure and envelope filled conditions. These findings suggest that the contribution of lexical knowledge and verbal WM during PR depends crucially on the information content that replaced the silent intervals. When perceptual continuity was partially restored by filler speech noise, both lexical knowledge and verbal WM capacity facilitated PR. Importantly, for fine-structure and envelope filled interrupted conditions, lexical knowledge was crucial for PR.
Modern Tools in Patient-Centred Speech Therapy for Romanian Language

Directory of Open Access Journals (Sweden)

Mirela Danubianu

2016-03-01

Full Text Available The most common way to communicate with those around us is speech. Suffering from a speech disorder can have negative social effects: from leaving the individuals with low confidence and moral to problems with social interaction and the ability to live independently like adults. The speech therapy intervention is a complex process having particular objectives such as: discovery and identification of speech disorder and directing the therapy to correction, recovery, compensation, adaptation and social integration of patients. Computer-based Speech Therapy systems are a real help for therapists by creating a special learning environment. The Romanian language is a phonetic one, with special linguistic particularities. This paper aims to present a few computer-based speech therapy systems developed for the treatment of various speech disorders specific to Romanian language.
Identification of speech transients using variable frame rate analysis and wavelet packets.

Science.gov (United States)

Rasetshwane, Daniel M; Boston, J Robert; Li, Ching-Chung

2006-01-01

Speech transients are important cues for identifying and discriminating speech sounds. Yoo et al. and Tantibundhit et al. were successful in identifying speech transients and, emphasizing them, improving the intelligibility of speech in noise. However, their methods are computationally intensive and unsuitable for real-time applications. This paper presents a method to identify and emphasize speech transients that combines subband decomposition by the wavelet packet transform with variable frame rate (VFR) analysis and unvoiced consonant detection. The VFR analysis is applied to each wavelet packet to define a transitivity function that describes the extent to which the wavelet coefficients of that packet are changing. Unvoiced consonant detection is used to identify unvoiced consonant intervals and the transitivity function is amplified during these intervals. The wavelet coefficients are multiplied by the transitivity function for that packet, amplifying the coefficients localized at times when they are changing and attenuating coefficients at times when they are steady. Inverse transform of the modified wavelet packet coefficients produces a signal corresponding to speech transients similar to the transients identified by Yoo et al. and Tantibundhit et al. A preliminary implementation of the algorithm runs more efficiently.

Part-of-speech effects on text-to-speech synthesis

CSIR Research Space (South Africa)

Schlunz, GI

2010-11-01

Full Text Available One of the goals of text-to-speech (TTS) systems is to produce natural-sounding synthesised speech. Towards this end various natural language processing (NLP) tasks are performed to model the prosodic aspects of the TTS voice. One of the fundamental...
Music Training Can Improve Music and Speech Perception in Pediatric Mandarin-Speaking Cochlear Implant Users.

Science.gov (United States)

Cheng, Xiaoting; Liu, Yangwenyi; Shu, Yilai; Tao, Duo-Duo; Wang, Bing; Yuan, Yasheng; Galvin, John J; Fu, Qian-Jie; Chen, Bing

2018-01-01

Due to limited spectral resolution, cochlear implants (CIs) do not convey pitch information very well. Pitch cues are important for perception of music and tonal language; it is possible that music training may improve performance in both listening tasks. In this study, we investigated music training outcomes in terms of perception of music, lexical tones, and sentences in 22 young (4.8 to 9.3 years old), prelingually deaf Mandarin-speaking CI users. Music perception was measured using a melodic contour identification (MCI) task. Speech perception was measured for lexical tones and sentences presented in quiet. Subjects received 8 weeks of MCI training using pitch ranges not used for testing. Music and speech perception were measured at 2, 4, and 8 weeks after training was begun; follow-up measures were made 4 weeks after training was stopped. Mean baseline performance was 33.2%, 76.9%, and 45.8% correct for MCI, lexical tone recognition, and sentence recognition, respectively. After 8 weeks of MCI training, mean performance significantly improved by 22.9, 14.4, and 14.5 percentage points for MCI, lexical tone recognition, and sentence recognition, respectively ( p music and speech performance. The results suggest that music training can significantly improve pediatric Mandarin-speaking CI users' music and speech perception.
State Aid and Student Performance: A Supply-Demand Analysis

Science.gov (United States)

Kinnucan, Henry W.; Zheng, Yuqing; Brehmer, Gerald

2006-01-01

Using a supply-demand framework, a six-equation model is specified to generate hypotheses about the relationship between state aid and student performance. Theory predicts that an increase in state or federal aid provides an incentive to decrease local funding, but that the disincentive associated with increased state aid is moderated when federal…
Exploring the link between cognitive abilities and speech recognition in the elderly under different listening conditions

DEFF Research Database (Denmark)

Nuesse, Theresa; Steenken, Rike; Neher, Tobias

2018-01-01

, and it has been suggested that differences in cognitive abilities may also be important. The objective of this study was to investigate associations between performance in cognitive tasks and speech recognition under different listening conditions in older adults with either age appropriate hearing...... or hearing-impairment. To that end, speech recognition threshold (SRT) measurements were performed under several masking conditions that varied along the perceptual dimensions of dip listening, spatial separation, and informational masking. In addition, a neuropsychological test battery was administered......, which included measures of verbal working- and short-term memory, executive functioning, selective and divided attention, and lexical and semantic abilities. Age-matched groups of older adults with either age-appropriate hearing (ENH, N = 20) or aided hearing impairment (EHI, N = 21) participated...
The Soft Palate Friendly Speech Bulb for Velopharyngeal Insufficiency

OpenAIRE

Kahlon, Sukhdeep Singh; Kahlon, Monaliza; Gupta, Shilpa; Dhingra, Parvinder Singh

2016-01-01

Velopharyngeal insufficiency is an anatomic defect of the soft palate making palatopharyngeal sphincter incomplete. It is an important concern to address in patients with bilateral cleft lip and palate. Speech aid prosthesis or speech bulbs are best choice in cases where surgically repaired soft palate is too short to contact pharyngeal walls during function but these prosthesis have been associated with inadequate marginal closure, ulcerations and patient discomfort. Here is a case report of...
Speech and language support: How physicians can identify and treat speech and language delays in the office setting

Science.gov (United States)

Moharir, Madhavi; Barnett, Noel; Taras, Jillian; Cole, Martha; Ford-Jones, E Lee; Levin, Leo

2014-01-01

Failure to recognize and intervene early in speech and language delays can lead to multifaceted and potentially severe consequences for early child development and later literacy skills. While routine evaluations of speech and language during well-child visits are recommended, there is no standardized (office) approach to facilitate this. Furthermore, extensive wait times for speech and language pathology consultation represent valuable lost time for the child and family. Using speech and language expertise, and paediatric collaboration, key content for an office-based tool was developed. The tool aimed to help physicians achieve three main goals: early and accurate identification of speech and language delays as well as children at risk for literacy challenges; appropriate referral to speech and language services when required; and teaching and, thus, empowering parents to create rich and responsive language environments at home. Using this tool, in combination with the Canadian Paediatric Society’s Read, Speak, Sing and Grow Literacy Initiative, physicians will be better positioned to offer practical strategies to caregivers to enhance children’s speech and language capabilities. The tool represents a strategy to evaluate speech and language delays. It depicts age-specific linguistic/phonetic milestones and suggests interventions. The tool represents a practical interim treatment while the family is waiting for formal speech and language therapy consultation. PMID:24627648
Effects of Instantaneous Multiband Dynamic Compression on Speech Intelligibility

Directory of Open Access Journals (Sweden)

Herzke Tobias

2005-01-01

Full Text Available The recruitment phenomenon, that is, the reduced dynamic range between threshold and uncomfortable level, is attributed to the loss of instantaneous dynamic compression on the basilar membrane. Despite this, hearing aids commonly use slow-acting dynamic compression for its compensation, because this was found to be the most successful strategy in terms of speech quality and intelligibility rehabilitation. Former attempts to use fast-acting compression gave ambiguous results, raising the question as to whether auditory-based recruitment compensation by instantaneous compression is in principle applicable in hearing aids. This study thus investigates instantaneous multiband dynamic compression based on an auditory filterbank. Instantaneous envelope compression is performed in each frequency band of a gammatone filterbank, which provides a combination of time and frequency resolution comparable to the normal healthy cochlea. The gain characteristics used for dynamic compression are deduced from categorical loudness scaling. In speech intelligibility tests, the instantaneous dynamic compression scheme was compared against a linear amplification scheme, which used the same filterbank for frequency analysis, but employed constant gain factors that restored the sound level for medium perceived loudness in each frequency band. In subjective comparisons, five of nine subjects preferred the linear amplification scheme and would not accept the instantaneous dynamic compression in hearing aids. Four of nine subjects did not perceive any quality differences. A sentence intelligibility test in noise (Oldenburg sentence test showed little to no negative effects of the instantaneous dynamic compression, compared to linear amplification. A word intelligibility test in quiet (one-syllable rhyme test showed that the subjects benefit from the larger amplification at low levels provided by instantaneous dynamic compression. Further analysis showed that the increase
How is the McGurk effect modulated by Cued Speech in deaf and hearing adults?

OpenAIRE

Bayard, Clémence; Colin, Cécile; Leybaert, Jacqueline

2014-01-01

Speech perception for both hearing and deaf people involves an integrative process between auditory and lip-reading information. In order to disambiguate information from lips, manual cues from Cued Speech may be added. Cued Speech (CS) is a system of manual aids developed to help deaf people to clearly and completely understand speech visually (Cornett, 1967). Within this system, both labial and manual information, as lone input sources, remain ambiguous. Perceivers, therefore, have to combi...
Speech Transduction Based on Linguistic Content

DEFF Research Database (Denmark)

Juel Henrichsen, Peter; Christiansen, Thomas Ulrich

Digital hearing aids use a variety of advanced digital signal processing methods in order to improve speech intelligibility. These methods are based on knowledge about the acoustics outside the ear as well as psychoacoustics. This paper investigates the recent observation that speech elements...... with a high degree of information can be robustly identified based on basic acoustic properties, i.e., function words have greater spectral tilt than content words for each of the 18 Danish talkers investigated. In this paper we examine these spectral tilt differences as a function of time based on a speech...... material six times the duration of previous investigations. Our results show that the correlation of spectral tilt with information content is relatively constant across time, even if averaged across talkers. This indicates that it is possible to devise a robust method for estimating information density...
The analysis of speech acts patterns in two Egyptian inaugural speeches

Directory of Open Access Journals (Sweden)

Imad Hayif Sameer

2017-09-01

Full Text Available The theory of speech acts, which clarifies what people do when they speak, is not about individual words or sentences that form the basic elements of human communication, but rather about particular speech acts that are performed when uttering words. A speech act is the attempt at doing something purely by speaking. Many things can be done by speaking. Speech acts are studied under what is called speech act theory, and belong to the domain of pragmatics. In this paper, two Egyptian inaugural speeches from El-Sadat and El-Sisi, belonging to different periods were analyzed to find out whether there were differences within this genre in the same culture or not. The study showed that there was a very small difference between these two speeches which were analyzed according to Searle’s theory of speech acts. In El Sadat’s speech, commissives came to occupy the first place. Meanwhile, in El–Sisi’s speech, assertives occupied the first place. Within the speeches of one culture, we can find that the differences depended on the circumstances that surrounded the elections of the Presidents at the time. Speech acts were tools they used to convey what they wanted and to obtain support from their audiences.
Comprehension of synthetic speech and digitized natural speech by adults with aphasia.

Science.gov (United States)

Hux, Karen; Knollman-Porter, Kelly; Brown, Jessica; Wallace, Sarah E

2017-09-01

Using text-to-speech technology to provide simultaneous written and auditory content presentation may help compensate for chronic reading challenges if people with aphasia can understand synthetic speech output; however, inherent auditory comprehension challenges experienced by people with aphasia may make understanding synthetic speech difficult. This study's purpose was to compare the preferences and auditory comprehension accuracy of people with aphasia when listening to sentences generated with digitized natural speech, Alex synthetic speech (i.e., Macintosh platform), or David synthetic speech (i.e., Windows platform). The methodology required each of 20 participants with aphasia to select one of four images corresponding in meaning to each of 60 sentences comprising three stimulus sets. Results revealed significantly better accuracy given digitized natural speech than either synthetic speech option; however, individual participant performance analyses revealed three patterns: (a) comparable accuracy regardless of speech condition for 30% of participants, (b) comparable accuracy between digitized natural speech and one, but not both, synthetic speech option for 45% of participants, and (c) greater accuracy with digitized natural speech than with either synthetic speech option for remaining participants. Ranking and Likert-scale rating data revealed a preference for digitized natural speech and David synthetic speech over Alex synthetic speech. Results suggest many individuals with aphasia can comprehend synthetic speech options available on popular operating systems. Further examination of synthetic speech use to support reading comprehension through text-to-speech technology is thus warranted. Copyright © 2017 Elsevier Inc. All rights reserved.
Silent Speech Recognition as an Alternative Communication Device for Persons with Laryngectomy.

Science.gov (United States)

Meltzner, Geoffrey S; Heaton, James T; Deng, Yunbin; De Luca, Gianluca; Roy, Serge H; Kline, Joshua C

2017-12-01

Each year thousands of individuals require surgical removal of their larynx (voice box) due to trauma or disease, and thereby require an alternative voice source or assistive device to verbally communicate. Although natural voice is lost after laryngectomy, most muscles controlling speech articulation remain intact. Surface electromyographic (sEMG) activity of speech musculature can be recorded from the neck and face, and used for automatic speech recognition to provide speech-to-text or synthesized speech as an alternative means of communication. This is true even when speech is mouthed or spoken in a silent (subvocal) manner, making it an appropriate communication platform after laryngectomy. In this study, 8 individuals at least 6 months after total laryngectomy were recorded using 8 sEMG sensors on their face (4) and neck (4) while reading phrases constructed from a 2,500-word vocabulary. A unique set of phrases were used for training phoneme-based recognition models for each of the 39 commonly used phonemes in English, and the remaining phrases were used for testing word recognition of the models based on phoneme identification from running speech. Word error rates were on average 10.3% for the full 8-sensor set (averaging 9.5% for the top 4 participants), and 13.6% when reducing the sensor set to 4 locations per individual (n=7). This study provides a compelling proof-of-concept for sEMG-based alaryngeal speech recognition, with the strong potential to further improve recognition performance.
Prediction of IOI-HA Scores Using Speech Reception Thresholds and Speech Discrimination Scores in Quiet

DEFF Research Database (Denmark)

Brännström, K Jonas; Lantz, Johannes; Nielsen, Lars Holme

2014-01-01

), and speech discrimination scores (SDSs) in quiet or in noise are common assessments made prior to hearing aid (HA) fittings. It is not known whether SRT and SDS in quiet relate to HA outcome measured with the International Outcome Inventory for Hearing Aids (IOI-HA). PURPOSE: The aim of the present study...... COLLECTION AND ANALYSIS: The psychometric properties were evaluated and compared to previous studies using the IOI-HA. The associations and differences between the outcome scores and a number of descriptive variables (age, gender, fitted monaurally/binaurally with HA, first-time/experienced HA users, years...
The contribution of dynamic visual cues to audiovisual speech perception.

Science.gov (United States)

Jaekl, Philip; Pesquita, Ana; Alsius, Agnes; Munhall, Kevin; Soto-Faraco, Salvador

2015-08-01

Seeing a speaker's facial gestures can significantly improve speech comprehension, especially in noisy environments. However, the nature of the visual information from the speaker's facial movements that is relevant for this enhancement is still unclear. Like auditory speech signals, visual speech signals unfold over time and contain both dynamic configural information and luminance-defined local motion cues; two information sources that are thought to engage anatomically and functionally separate visual systems. Whereas, some past studies have highlighted the importance of local, luminance-defined motion cues in audiovisual speech perception, the contribution of dynamic configural information signalling changes in form over time has not yet been assessed. We therefore attempted to single out the contribution of dynamic configural information to audiovisual speech processing. To this aim, we measured word identification performance in noise using unimodal auditory stimuli, and with audiovisual stimuli. In the audiovisual condition, speaking faces were presented as point light displays achieved via motion capture of the original talker. Point light displays could be isoluminant, to minimise the contribution of effective luminance-defined local motion information, or with added luminance contrast, allowing the combined effect of dynamic configural cues and local motion cues. Audiovisual enhancement was found in both the isoluminant and contrast-based luminance conditions compared to an auditory-only condition, demonstrating, for the first time the specific contribution of dynamic configural cues to audiovisual speech improvement. These findings imply that globally processed changes in a speaker's facial shape contribute significantly towards the perception of articulatory gestures and the analysis of audiovisual speech. Copyright © 2015 Elsevier Ltd. All rights reserved.
Musical background not associated with self-perceived hearing performance or speech perception in postlingual cochlear-implant users

NARCIS (Netherlands)

Fuller, Christina; Free, Rolien; Maat, Bert; Baskent, Deniz

In normal-hearing listeners, musical background has been observed to change the sound representation in the auditory system and produce enhanced performance in some speech perception tests. Based on these observations, it has been hypothesized that musical background can influence sound and speech
A Decision-Tree-Based Algorithm for Speech/Music Classification and Segmentation

Directory of Open Access Journals (Sweden)

Lavner Yizhar

2009-01-01

Full Text Available We present an efficient algorithm for segmentation of audio signals into speech or music. The central motivation to our study is consumer audio applications, where various real-time enhancements are often applied. The algorithm consists of a learning phase and a classification phase. In the learning phase, predefined training data is used for computing various time-domain and frequency-domain features, for speech and music signals separately, and estimating the optimal speech/music thresholds, based on the probability density functions of the features. An automatic procedure is employed to select the best features for separation. In the test phase, initial classification is performed for each segment of the audio signal, using a three-stage sieve-like approach, applying both Bayesian and rule-based methods. To avoid erroneous rapid alternations in the classification, a smoothing technique is applied, averaging the decision on each segment with past segment decisions. Extensive evaluation of the algorithm, on a database of more than 12 hours of speech and more than 22 hours of music showed correct identification rates of 99.4% and 97.8%, respectively, and quick adjustment to alternating speech/music sections. In addition to its accuracy and robustness, the algorithm can be easily adapted to different audio types, and is suitable for real-time operation.
FPGA Implementation for GMM-Based Speaker Identification

Directory of Open Access Journals (Sweden)

Phaklen EhKan

2011-01-01

Full Text Available In today's society, highly accurate personal identification systems are required. Passwords or pin numbers can be forgotten or forged and are no longer considered to offer a high level of security. The use of biological features, biometrics, is becoming widely accepted as the next level for security systems. Biometric-based speaker identification is a method of identifying persons from their voice. Speaker-specific characteristics exist in speech signals due to different speakers having different resonances of the vocal tract. These differences can be exploited by extracting feature vectors such as Mel-Frequency Cepstral Coefficients (MFCCs from the speech signal. A well-known statistical modelling process, the Gaussian Mixture Model (GMM, then models the distribution of each speaker's MFCCs in a multidimensional acoustic space. The GMM-based speaker identification system has features that make it promising for hardware acceleration. This paper describes the hardware implementation for classification of a text-independent GMM-based speaker identification system. The aim was to produce a system that can perform simultaneous identification of large numbers of voice streams in real time. This has important potential applications in security and in automated call centre applications. A speedup factor of ninety was achieved compared to a software implementation on a standard PC.
The influence of age, hearing, and working memory on the speech comprehension benefit derived from an automatic speech recognition system.

Science.gov (United States)

Zekveld, Adriana A; Kramer, Sophia E; Kessens, Judith M; Vlaming, Marcel S M G; Houtgast, Tammo

2009-04-01

The aim of the current study was to examine whether partly incorrect subtitles that are automatically generated by an Automatic Speech Recognition (ASR) system, improve speech comprehension by listeners with hearing impairment. In an earlier study (Zekveld et al. 2008), we showed that speech comprehension in noise by young listeners with normal hearing improves when presenting partly incorrect, automatically generated subtitles. The current study focused on the effects of age, hearing loss, visual working memory capacity, and linguistic skills on the benefit obtained from automatically generated subtitles during listening to speech in noise. In order to investigate the effects of age and hearing loss, three groups of participants were included: 22 young persons with normal hearing (YNH, mean age = 21 years), 22 middle-aged adults with normal hearing (MA-NH, mean age = 55 years) and 30 middle-aged adults with hearing impairment (MA-HI, mean age = 57 years). The benefit from automatic subtitling was measured by Speech Reception Threshold (SRT) tests (Plomp & Mimpen, 1979). Both unimodal auditory and bimodal audiovisual SRT tests were performed. In the audiovisual tests, the subtitles were presented simultaneously with the speech, whereas in the auditory test, only speech was presented. The difference between the auditory and audiovisual SRT was defined as the audiovisual benefit. Participants additionally rated the listening effort. We examined the influences of ASR accuracy level and text delay on the audiovisual benefit and the listening effort using a repeated measures General Linear Model analysis. In a correlation analysis, we evaluated the relationships between age, auditory SRT, visual working memory capacity and the audiovisual benefit and listening effort. The automatically generated subtitles improved speech comprehension in noise for all ASR accuracies and delays covered by the current study. Higher ASR accuracy levels resulted in more benefit obtained
78 FR 49693 - Speech-to-Speech and Internet Protocol (IP) Speech-to-Speech Telecommunications Relay Services...

Science.gov (United States)

2013-08-15

...-Speech Services for Individuals with Hearing and Speech Disabilities, Report and Order (Order), document...] Speech-to-Speech and Internet Protocol (IP) Speech-to-Speech Telecommunications Relay Services; Telecommunications Relay Services and Speech-to-Speech Services for Individuals With Hearing and Speech Disabilities...
Profiling Speech and Pausing in Amyotrophic Lateral Sclerosis (ALS and Frontotemporal Dementia (FTD.

Directory of Open Access Journals (Sweden)

Yana Yunusova

Full Text Available This study examines reading aloud in patients with amyotrophic lateral sclerosis (ALS and those with frontotemporal dementia (FTD in order to determine whether differences in patterns of speaking and pausing exist between patients with primary motor vs. primary cognitive-linguistic deficits, and in contrast to healthy controls.136 participants were included in the study: 33 controls, 85 patients with ALS, and 18 patients with either the behavioural variant of FTD (FTD-BV or progressive nonfluent aphasia (FTD-PNFA. Participants with ALS were further divided into 4 non-overlapping subgroups--mild, respiratory, bulbar (with oral-motor deficit and bulbar-respiratory--based on the presence and severity of motor bulbar or respiratory signs. All participants read a passage aloud. Custom-made software was used to perform speech and pause analyses, and this provided measures of speaking and articulatory rates, duration of speech, and number and duration of pauses. These measures were statistically compared in different subgroups of patients.The results revealed clear differences between patient groups and healthy controls on the passage reading task. A speech-based motor function measure (i.e., articulatory rate was able to distinguish patients with bulbar ALS or FTD-PNFA from those with respiratory ALS or FTD-BV. Distinguishing the disordered groups proved challenging based on the pausing measures.This study demonstrated the use of speech measures in the identification of those with an oral-motor deficit, and showed the usefulness of performing a relatively simple reading test to assess speech versus pause behaviors across the ALS-FTD disease continuum. The findings also suggest that motor speech assessment should be performed as part of the diagnostic workup for patients with FTD.

Human speech articulator measurements using low power, 2GHz Homodyne sensors

International Nuclear Information System (INIS)

Barnes, T; Burnett, G C; Holzrichter, J F

1999-01-01

Very low power, short-range microwave ''radar-like'' sensors can measure the motions and vibrations of internal human speech articulators as speech is produced. In these animate (and also in inanimate acoustic systems) microwave sensors can measure vibration information associated with excitation sources and other interfaces. These data, together with the corresponding acoustic data, enable the calculation of system transfer functions. This information appears to be useful for a surprisingly wide range of applications such as speech coding and recognition, speaker or object identification, speech and musical instrument synthesis, noise cancellation, and other applications
Audiomotor Perceptual Training Enhances Speech Intelligibility in Background Noise.

Science.gov (United States)

Whitton, Jonathon P; Hancock, Kenneth E; Shannon, Jeffrey M; Polley, Daniel B

2017-11-06

Sensory and motor skills can be improved with training, but learning is often restricted to practice stimuli. As an exception, training on closed-loop (CL) sensorimotor interfaces, such as action video games and musical instruments, can impart a broad spectrum of perceptual benefits. Here we ask whether computerized CL auditory training can enhance speech understanding in levels of background noise that approximate a crowded restaurant. Elderly hearing-impaired subjects trained for 8 weeks on a CL game that, like a musical instrument, challenged them to monitor subtle deviations between predicted and actual auditory feedback as they moved their fingertip through a virtual soundscape. We performed our study as a randomized, double-blind, placebo-controlled trial by training other subjects in an auditory working-memory (WM) task. Subjects in both groups improved at their respective auditory tasks and reported comparable expectations for improved speech processing, thereby controlling for placebo effects. Whereas speech intelligibility was unchanged after WM training, subjects in the CL training group could correctly identify 25% more words in spoken sentences or digit sequences presented in high levels of background noise. Numerically, CL audiomotor training provided more than three times the benefit of our subjects' hearing aids for speech processing in noisy listening conditions. Gains in speech intelligibility could be predicted from gameplay accuracy and baseline inhibitory control. However, benefits did not persist in the absence of continuing practice. These studies employ stringent clinical standards to demonstrate that perceptual learning on a computerized audio game can transfer to "real-world" communication challenges. Copyright © 2017 Elsevier Ltd. All rights reserved.
Speech and Hearing Problems among Older People.

Science.gov (United States)

Carstenson, Blue

1978-01-01

Findings from speech and hearing tests of older people in South Dakota community senior programs indicate the need for better testing and therapy procedures. Lipreading may be more effective than hearing aids, and factors other than hearing may be involved. Some problems and needs are noted. (MF)
Neural Correlates of Selective Attention With Hearing Aid Use Followed by ReadMyQuips Auditory Training Program.

Science.gov (United States)

Rao, Aparna; Rishiq, Dania; Yu, Luodi; Zhang, Yang; Abrams, Harvey

The objectives of this study were to investigate the effects of hearing aid use and the effectiveness of ReadMyQuips (RMQ), an auditory training program, on speech perception performance and auditory selective attention using electrophysiological measures. RMQ is an audiovisual training program designed to improve speech perception in everyday noisy listening environments. Participants were adults with mild to moderate hearing loss who were first-time hearing aid users. After 4 weeks of hearing aid use, the experimental group completed RMQ training in 4 weeks, and the control group received listening practice on audiobooks during the same period. Cortical late event-related potentials (ERPs) and the Hearing in Noise Test (HINT) were administered at prefitting, pretraining, and post-training to assess effects of hearing aid use and RMQ training. An oddball paradigm allowed tracking of changes in P3a and P3b ERPs to distractors and targets, respectively. Behavioral measures were also obtained while ERPs were recorded from participants. After 4 weeks of hearing aid use but before auditory training, HINT results did not show a statistically significant change, but there was a significant P3a reduction. This reduction in P3a was correlated with improvement in d prime (d') in the selective attention task. Increased P3b amplitudes were also correlated with improvement in d' in the selective attention task. After training, this correlation between P3b and d' remained in the experimental group, but not in the control group. Similarly, HINT testing showed improved speech perception post training only in the experimental group. The criterion calculated in the auditory selective attention task showed a reduction only in the experimental group after training. ERP measures in the auditory selective attention task did not show any changes related to training. Hearing aid use was associated with a decrement in involuntary attention switch to distractors in the auditory selective
Hearing Aid and children

Directory of Open Access Journals (Sweden)

Jamileh Fatahi

2002-07-01

Full Text Available In order to develop oral communication, hearing impaired infants and young children must be able to hear speech comfortably and consistently. To day children with all degrees of hearing loss may be condidates for some kinds of amlification. As children differ from adults, many Factors should be consider in hearing aid selection, evaluation and fitting. For example the child age when he or she is candidate for custom instruments? Do we consider programmable Hearing aid? Are multi memory instruments appropriate for them? What about directional microphones? What style of hearing aid do we select? In this paper such questions are Answered.
[Hearing aid application performance evaluation questionnaire to presbycusis].

Science.gov (United States)

Chen, Xianghong; Zhou, Huifang; Zhang, Jing; Wang, Liqun

2011-02-01

By matching patients with presbycusis hearing aids,hearing aid performance assessment questionnaire to fill out to assess the effect of its use and targeted to solve problems encountered in its use and improve the quality of life of older persons. Through face to face way to investigate and analyse patients with hearing aids fitting, totally 30 subjects accepted the analysis, preliminary assessment of the use of hearing aids in patient with presbycusis results and solve problems encountered in its use by using SPSS software to analyze the collecting data. HHIE questionnaire on statistical analysis, obtained in patients with hearing loss use hearing aids after the problem is a significant improvement statistical analysis of the SADL questionnaire, the conclusion is relatively satisfied with the overall satisfaction. Effects Assessment Questionnaire in patients with hearing aids hearing impairment can be epitomized the disabled after use to improve the situation and understand the satisfaction of patients with hearing aids can be an initial effect as the rehabilitation of a reliable subjective assessment of the impact assessment indicators.
Hearing impairment, cognition and speech understanding: exploratory factor analyses of a comprehensive test battery for a group of hearing aid users, the n200 study.

Science.gov (United States)

Rönnberg, Jerker; Lunner, Thomas; Ng, Elaine Hoi Ning; Lidestam, Björn; Zekveld, Adriana Agatha; Sörqvist, Patrik; Lyxell, Björn; Träff, Ulf; Yumba, Wycliffe; Classon, Elisabet; Hällgren, Mathias; Larsby, Birgitta; Signoret, Carine; Pichora-Fuller, M Kathleen; Rudner, Mary; Danielsson, Henrik; Stenfelt, Stefan

2016-11-01

The aims of the current n200 study were to assess the structural relations between three classes of test variables (i.e. HEARING, COGNITION and aided speech-in-noise OUTCOMES) and to describe the theoretical implications of these relations for the Ease of Language Understanding (ELU) model. Participants were 200 hard-of-hearing hearing-aid users, with a mean age of 60.8 years. Forty-three percent were females and the mean hearing threshold in the better ear was 37.4 dB HL. LEVEL1 factor analyses extracted one factor per test and/or cognitive function based on a priori conceptualizations. The more abstract LEVEL 2 factor analyses were performed separately for the three classes of test variables. The HEARING test variables resulted in two LEVEL 2 factors, which we labelled SENSITIVITY and TEMPORAL FINE STRUCTURE; the COGNITIVE variables in one COGNITION factor only, and OUTCOMES in two factors, NO CONTEXT and CONTEXT. COGNITION predicted the NO CONTEXT factor to a stronger extent than the CONTEXT outcome factor. TEMPORAL FINE STRUCTURE and SENSITIVITY were associated with COGNITION and all three contributed significantly and independently to especially the NO CONTEXT outcome scores (R(2) = 0.40). All LEVEL 2 factors are important theoretically as well as for clinical assessment.
Mathematical modeling and signal processing in speech and hearing sciences

CERN Document Server

Xin, Jack

2014-01-01

The aim of the book is to give an accessible introduction of mathematical models and signal processing methods in speech and hearing sciences for senior undergraduate and beginning graduate students with basic knowledge of linear algebra, differential equations, numerical analysis, and probability. Speech and hearing sciences are fundamental to numerous technological advances of the digital world in the past decade, from music compression in MP3 to digital hearing aids, from network based voice enabled services to speech interaction with mobile phones. Mathematics and computation are intimately related to these leaps and bounds. On the other hand, speech and hearing are strongly interdisciplinary areas where dissimilar scientific and engineering publications and approaches often coexist and make it difficult for newcomers to enter.
School performance and school behavior of children affected by AIDS in China

Science.gov (United States)

Tu, Xiaoming; Lv, Yunfei; Li, Xiaoming; Fang, Xiaoyi; Zhao, Guoxiang; Lin, Xiuyun; Hong, Yan; Zhang, Liying; Stanton, Bonita

2009-01-01

It is generally recognized that the AIDS epidemic will have a negative effect on the orphans’ school education. However, few studies have been carried out to examine the school performance and school behavior of AIDS orphans and vulnerable children (children living with HIV-infected parents). Using both self-report and teacher evaluation data of 1625 children from rural central China, we examined the impact of parental HIV/AIDS on children's school performances (academic marks, educational expectation, and student leadership) and school behaviors (e.g., aggression, shy/anxious and assertive social skills). Results indicate that AIDS orphans and vulnerable children had disadvantages in school performances in comparison to their peers from the same community who did not experience AIDS-related death and illness in their family (comparison children). AIDS orphans had the lowest academic marks based on the reports of both children and teachers. Educational expectation was significantly lower among AIDS orphans and vulnerable children than comparison children from teacher's perspective. AIDS orphans were significantly more likely to demonstrate aggressive, impulsive and anxious behaviors than non-orphans. Moreover, orphans have more learning difficulties. Vulnerable children were also at a disadvantage on most measures. The data suggest that a greater attention is needed to the school performance and behavior of children affected by AIDS. The findings also indicate that AIDS relief and assistance program for children should go beyond the school attendance and make efforts to improve their school performance and education aspiration. PMID:20107622
Memory of AMR coded speech distorted by packet loss

OpenAIRE

Nykänen, Arne; Lindegren, David; Wruck, Louisa; Ljung, Robert; Odelius, Johan; Möller, Sebastian

2014-01-01

Previous studies have shown that free recall of spoken word lists is impaired if the speech is presented in background noise, even if the signal-to-noise ratio is kept at a level allowing full word identification. The objective of this study was to examine recall rates for word lists presented in noise and word lists coded by an AMR (Adaptive Multi Rate) telephone codec distorted by packet loss. Twenty subjects performed a word recall test. Word lists consisting of ten words were played to th...
Effects of Audio-Visual Integration on the Detection of Masked Speech and Non-Speech Sounds

Science.gov (United States)

Eramudugolla, Ranmalee; Henderson, Rachel; Mattingley, Jason B.

2011-01-01

Integration of simultaneous auditory and visual information about an event can enhance our ability to detect that event. This is particularly evident in the perception of speech, where the articulatory gestures of the speaker's lips and face can significantly improve the listener's detection and identification of the message, especially when that…
The Effects of Hearing Aid Directional Microphone and Noise Reduction Processing on Listening Effort in Older Adults with Hearing Loss.

Science.gov (United States)

Desjardins, Jamie L

2016-01-01

Older listeners with hearing loss may exert more cognitive resources to maintain a level of listening performance similar to that of younger listeners with normal hearing. Unfortunately, this increase in cognitive load, which is often conceptualized as increased listening effort, may come at the cost of cognitive processing resources that might otherwise be available for other tasks. The purpose of this study was to evaluate the independent and combined effects of a hearing aid directional microphone and a noise reduction (NR) algorithm on reducing the listening effort older listeners with hearing loss expend on a speech-in-noise task. Participants were fitted with study worn commercially available behind-the-ear hearing aids. Listening effort on a sentence recognition in noise task was measured using an objective auditory-visual dual-task paradigm. The primary task required participants to repeat sentences presented in quiet and in a four-talker babble. The secondary task was a digital visual pursuit rotor-tracking test, for which participants were instructed to use a computer mouse to track a moving target around an ellipse that was displayed on a computer screen. Each of the two tasks was presented separately and concurrently at a fixed overall speech recognition performance level of 50% correct with and without the directional microphone and/or the NR algorithm activated in the hearing aids. In addition, participants reported how effortful it was to listen to the sentences in quiet and in background noise in the different hearing aid listening conditions. Fifteen older listeners with mild sloping to severe sensorineural hearing loss participated in this study. Listening effort in background noise was significantly reduced with the directional microphones activated in the hearing aids. However, there was no significant change in listening effort with the hearing aid NR algorithm compared to no noise processing. Correlation analysis between objective and self
Contribution to automatic speech recognition. Analysis of the direct acoustical signal. Recognition of isolated words and phoneme identification

International Nuclear Information System (INIS)

Dupeyrat, Benoit

1981-01-01

This report deals with the acoustical-phonetic step of the automatic recognition of the speech. The parameters used are the extrema of the acoustical signal (coded in amplitude and duration). This coding method, the properties of which are described, is simple and well adapted to a digital processing. The quality and the intelligibility of the coded signal after reconstruction are particularly satisfactory. An experiment for the automatic recognition of isolated words has been carried using this coding system. We have designed a filtering algorithm operating on the parameters of the coding. Thus the characteristics of the formants can be derived under certain conditions which are discussed. Using these characteristics the identification of a large part of the phonemes for a given speaker was achieved. Carrying on the studies has required the development of a particular methodology of real time processing which allowed immediate evaluation of the improvement of the programs. Such processing on temporal coding of the acoustical signal is extremely powerful and could represent, used in connection with other methods an efficient tool for the automatic processing of the speech.(author) [fr
Declarations, accusations and judgement: examining conflict of interest discourses as performative speech-acts.

Science.gov (United States)

Mayes, Christopher; Lipworth, Wendy; Kerridge, Ian

2016-09-01

Concerns over conflicts of interest (COI) in academic research and medical practice continue to provoke a great deal of discussion. What is most obvious in this discourse is that when COIs are declared, or perceived to exist in others, there is a focus on both the descriptive question of whether there is a COI and, subsequently, the normative question of whether it is good, bad or neutral. We contend, however, that in addition to the descriptive and normative, COI declarations and accusations can be understood as performatives. In this article, we apply J.L. Austin's performative speech-act theory to COI discourses and illustrate how this works using a contemporary case study of COI in biomedical publishing. We argue that using Austin's theory of performative speech-acts serves to highlight the social arrangements and role of authorities in COI discourse and so provides a rich framework to examine declarations, accusations and judgements of COI that often arise in the context of biomedical research and practice.
Identification of Changes along a Continuum of Speech Intonation is Impaired in Congenital Amusia.

Science.gov (United States)

Hutchins, Sean; Gosselin, Nathalie; Peretz, Isabelle

2010-01-01

A small number of individuals have severe musical problems that have neuro-genetic underpinnings. This musical disorder is termed "congenital amusia," an umbrella term for lifelong musical disabilities that cannot be attributed to deafness, lack of exposure, or brain damage after birth. Amusics seem to lack the ability to detect fine pitch differences in tone sequences. However, differences between statements and questions, which vary in final pitch, are well perceived by most congenital amusic individuals. We hypothesized that the origin of this apparent domain-specificity of the disorder lies in the range of pitch variations, which are very coarse in speech as compared to music. Here, we tested this hypothesis by using a continuum of gradually increasing final pitch in both speech and tone sequences. To this aim, nine amusic cases and nine matched controls were presented with statements and questions that varied on a pitch continuum from falling to rising in 11 steps. The sentences were either naturally spoken or were tone sequence versions of these. The task was to categorize the sentences as statements or questions and the tone sequences as falling or rising. In each case, the observation of an S-shaped identification function indicates that amusics can accurately identify unambiguous examples of statements and questions but have problems with fine variations between these endpoints. Thus, the results indicate that a deficient pitch perception might compromise music, not because it is specialized for that domain but because music requirements are more fine-grained.
Speech-to-Speech Relay Service

Science.gov (United States)

Consumer Guide Speech to Speech Relay Service Speech-to-Speech (STS) is one form of Telecommunications Relay Service (TRS). TRS is a service that allows persons with hearing and speech disabilities ...
Visual Context Enhanced: The Joint Contribution of Iconic Gestures and Visible Speech to Degraded Speech Comprehension

Science.gov (United States)

Drijvers, Linda; Ozyurek, Asli

2017-01-01

Purpose: This study investigated whether and to what extent iconic co-speech gestures contribute to information from visible speech to enhance degraded speech comprehension at different levels of noise-vocoding. Previous studies of the contributions of these 2 visual articulators to speech comprehension have only been performed separately. Method:…
Speech understanding in noise with an eyeglass hearing aid: asymmetric fitting and the head shadow benefit of anterior microphones.

Science.gov (United States)

Mens, Lucas H M

2011-01-01

To test speech understanding in noise using array microphones integrated in an eyeglass device and to test if microphones placed anteriorly at the temple provide better directivity than above the pinna. Sentences were presented from the front and uncorrelated noise from 45, 135, 225 and 315°. Fifteen hearing impaired participants with a significant speech discrimination loss were included, as well as 5 normal hearing listeners. The device (Varibel) improved speech understanding in noise compared to most conventional directional devices with a directional benefit of 5.3 dB in the asymmetric fit mode, which was not significantly different from the bilateral fully directional mode (6.3 dB). Anterior microphones outperformed microphones at a conventional position above the pinna by 2.6 dB. By integrating microphones in an eyeglass frame, a long array can be used resulting in a higher directionality index and improved speech understanding in noise. An asymmetric fit did not significantly reduce performance and can be considered to increase acceptance and environmental awareness. Directional microphones at the temple seemed to profit more from the head shadow than above the pinna, better suppressing noise from behind the listener.
Adequacy of velopharyngeal closure and speech competency following prosthetic management of soft palate resection

International Nuclear Information System (INIS)

ElDakkak, M

1999-01-01

Ten patients who had undergone soft palate resection for the removal of palatal tumors were studied. In each patient, the surgical defect involved the posterior margin of the soft palate and lead to velopharyngeal insufficiency. None of the patients suffered any speech, hearing or nasal problems before surgery. For each patient, a speech aid obturator was constructed and was used at least one month before the evaluation. Prosthetic management of each subject was evaluated as reflected in adequacy of velopharyngeal closure and speech competency. Various aspects of speech including intelligibility, articulation, nasality, hoarseness and overall speech were correlated with the adequacy of velopharyngeal closure. (author)
Human speech articulator measurements using low power, 2GHz Homodyne sensors

Energy Technology Data Exchange (ETDEWEB)

Barnes, T; Burnett, G C; Holzrichter, J F

1999-06-29

Very low power, short-range microwave ''radar-like'' sensors can measure the motions and vibrations of internal human speech articulators as speech is produced. In these animate (and also in inanimate acoustic systems) microwave sensors can measure vibration information associated with excitation sources and other interfaces. These data, together with the corresponding acoustic data, enable the calculation of system transfer functions. This information appears to be useful for a surprisingly wide range of applications such as speech coding and recognition, speaker or object identification, speech and musical instrument synthesis, noise cancellation, and other applications.

Rapid Statistical Learning Supporting Word Extraction From Continuous Speech.

Science.gov (United States)

Batterink, Laura J

2017-07-01

The identification of words in continuous speech, known as speech segmentation, is a critical early step in language acquisition. This process is partially supported by statistical learning, the ability to extract patterns from the environment. Given that speech segmentation represents a potential bottleneck for language acquisition, patterns in speech may be extracted very rapidly, without extensive exposure. This hypothesis was examined by exposing participants to continuous speech streams composed of novel repeating nonsense words. Learning was measured on-line using a reaction time task. After merely one exposure to an embedded novel word, learners demonstrated significant learning effects, as revealed by faster responses to predictable than to unpredictable syllables. These results demonstrate that learners gained sensitivity to the statistical structure of unfamiliar speech on a very rapid timescale. This ability may play an essential role in early stages of language acquisition, allowing learners to rapidly identify word candidates and "break in" to an unfamiliar language.
Experience with speech sounds is not necessary for cue trading by budgerigars (Melopsittacus undulatus.

Directory of Open Access Journals (Sweden)

Mary Flaherty

Full Text Available The influence of experience with human speech sounds on speech perception in budgerigars, vocal mimics whose speech exposure can be tightly controlled in a laboratory setting, was measured. Budgerigars were divided into groups that differed in auditory exposure and then tested on a cue-trading identification paradigm with synthetic speech. Phonetic cue trading is a perceptual phenomenon observed when changes on one cue dimension are offset by changes in another cue dimension while still maintaining the same phonetic percept. The current study examined whether budgerigars would trade the cues of voice onset time (VOT and the first formant onset frequency when identifying syllable initial stop consonants and if this would be influenced by exposure to speech sounds. There were a total of four different exposure groups: No speech exposure (completely isolated, Passive speech exposure (regular exposure to human speech, and two Speech-trained groups. After the exposure period, all budgerigars were tested for phonetic cue trading using operant conditioning procedures. Birds were trained to peck keys in response to different synthetic speech sounds that began with "d" or "t" and varied in VOT and frequency of the first formant at voicing onset. Once training performance criteria were met, budgerigars were presented with the entire intermediate series, including ambiguous sounds. Responses on these trials were used to determine which speech cues were used, if a trading relation between VOT and the onset frequency of the first formant was present, and whether speech exposure had an influence on perception. Cue trading was found in all birds and these results were largely similar to those of a group of humans. Results indicated that prior speech experience was not a requirement for cue trading by budgerigars. The results are consistent with theories that explain phonetic cue trading in terms of a rich auditory encoding of the speech signal.
Profiling Speech and Pausing in Amyotrophic Lateral Sclerosis (ALS) and Frontotemporal Dementia (FTD)

Science.gov (United States)

Yunusova, Yana; Graham, Naida L.; Shellikeri, Sanjana; Phuong, Kent; Kulkarni, Madhura; Rochon, Elizabeth; Tang-Wai, David F.; Chow, Tiffany W.; Black, Sandra E.; Zinman, Lorne H.; Green, Jordan R.

2016-01-01

Objective This study examines reading aloud in patients with amyotrophic lateral sclerosis (ALS) and those with frontotemporal dementia (FTD) in order to determine whether differences in patterns of speaking and pausing exist between patients with primary motor vs. primary cognitive-linguistic deficits, and in contrast to healthy controls. Design 136 participants were included in the study: 33 controls, 85 patients with ALS, and 18 patients with either the behavioural variant of FTD (FTD-BV) or progressive nonfluent aphasia (FTD-PNFA). Participants with ALS were further divided into 4 non-overlapping subgroups—mild, respiratory, bulbar (with oral-motor deficit) and bulbar-respiratory—based on the presence and severity of motor bulbar or respiratory signs. All participants read a passage aloud. Custom-made software was used to perform speech and pause analyses, and this provided measures of speaking and articulatory rates, duration of speech, and number and duration of pauses. These measures were statistically compared in different subgroups of patients. Results The results revealed clear differences between patient groups and healthy controls on the passage reading task. A speech-based motor function measure (i.e., articulatory rate) was able to distinguish patients with bulbar ALS or FTD-PNFA from those with respiratory ALS or FTD-BV. Distinguishing the disordered groups proved challenging based on the pausing measures. Conclusions and Relevance This study demonstrated the use of speech measures in the identification of those with an oral-motor deficit, and showed the usefulness of performing a relatively simple reading test to assess speech versus pause behaviors across the ALS—FTD disease continuum. The findings also suggest that motor speech assessment should be performed as part of the diagnostic workup for patients with FTD. PMID:26789001
Development and assessment of two fixed-array microphones for use with hearing aids

NARCIS (Netherlands)

Bilsen, F.A.; Soede, W.; Berkhout, A.J.

1993-01-01

Hearing-impaired listeners often have great difficulty understanding speech in situations with background noise (e.g., meetings, parties) . Conventional hearing aids offer insufficient directivity to significantly reduce background noise relative to the desired speech signal . Based on array
The influence of spectral and spatial characteristics of early reflections on speech intelligibility

DEFF Research Database (Denmark)

Arweiler, Iris; Buchholz, Jörg; Dau, Torsten

The auditory system employs different strategies to facilitate speech intelligibility in complex listening conditions. One of them is the integration of early reflections (ER’s) with the direct sound (DS) to increase the effective speech level. So far the underlying mechanisms of ER processing have...... of listeners that speech intelligibility improved with added ER energy, but less than with added DS energy. An efficiency factor was introduced to quantify this effect. The difference in speech intelligibility could be mainly ascribed to the differences in the spectrum between the speech signals....... binaural). The direction-dependency could be explained by the spectral changes introduced by the pinna, head, and torso. The results will be important with regard to the influence of signal processing strategies in modern hearing aids on speech intelligibility, because they might alter the spectral...
Technology assisted speech and language therapy.

Science.gov (United States)

Glykas, Michael; Chytas, Panagiotis

2004-06-30

Speech and language therapists (SLTs) are faced daily with a diversity of speech and language disabilities, which are associated with a variety of conditions ranging from client groups with overall cognitive deficits to those with more specific difficulties. It is desirable that those working with such a range of problems and with such a demanding workload, plan care efficiently. Therefore, the introduction of methodologies, reference models of work and tools, which significantly improve the effectiveness of therapy, are particularly welcome. This paper describes the first web-based tool for diagnosis, treatment and e-Learning in the field of language and speech therapy. The system allows SLTs to find the optimum treatment for each patient, it also allows any non-specialist user-SLT, patient or helper (relative etc.)-to explore their creativity, by designing their own communication aid in an interactive manner, with the use of editors such as: configuration and vocabulary. The system has been tested and piloted by potential users in Greece and the UK.
Reference-Free Assessment of Speech Intelligibility Using Bispectrum of an Auditory Neurogram

Science.gov (United States)

Hossain, Mohammad E.; Jassim, Wissam A.; Zilany, Muhammad S. A.

2016-01-01

Sensorineural hearing loss occurs due to damage to the inner and outer hair cells of the peripheral auditory system. Hearing loss can cause decreases in audibility, dynamic range, frequency and temporal resolution of the auditory system, and all of these effects are known to affect speech intelligibility. In this study, a new reference-free speech intelligibility metric is proposed using 2-D neurograms constructed from the output of a computational model of the auditory periphery. The responses of the auditory-nerve fibers with a wide range of characteristic frequencies were simulated to construct neurograms. The features of the neurograms were extracted using third-order statistics referred to as bispectrum. The phase coupling of neurogram bispectrum provides a unique insight for the presence (or deficit) of supra-threshold nonlinearities beyond audibility for listeners with normal hearing (or hearing loss). The speech intelligibility scores predicted by the proposed method were compared to the behavioral scores for listeners with normal hearing and hearing loss both in quiet and under noisy background conditions. The results were also compared to the performance of some existing methods. The predicted results showed a good fit with a small error suggesting that the subjective scores can be estimated reliably using the proposed neural-response-based metric. The proposed metric also had a wide dynamic range, and the predicted scores were well-separated as a function of hearing loss. The proposed metric successfully captures the effects of hearing loss and supra-threshold nonlinearities on speech intelligibility. This metric could be applied to evaluate the performance of various speech-processing algorithms designed for hearing aids and cochlear implants. PMID:26967160
Low-Arousal Speech Noise Improves Performance in N-Back Task: An ERP Study

Science.gov (United States)

Zhang, Dandan; Jin, Yi; Luo, Yuejia

2013-01-01

The relationship between noise and human performance is a crucial topic in ergonomic research. However, the brain dynamics of the emotional arousal effects of background noises are still unclear. The current study employed meaningless speech noises in the n-back working memory task to explore the changes of event-related potentials (ERPs) elicited by the noises with low arousal level vs. high arousal level. We found that the memory performance in low arousal condition were improved compared with the silent and the high arousal conditions; participants responded more quickly and had larger P2 and P3 amplitudes in low arousal condition while the performance and ERP components showed no significant difference between high arousal and silent conditions. These findings suggested that the emotional arousal dimension of background noises had a significant influence on human working memory performance, and that this effect was independent of the acoustic characteristics of noises (e.g., intensity) and the meaning of speech materials. The current findings improve our understanding of background noise effects on human performance and lay the ground for the investigation of patients with attention deficits. PMID:24204607
Low-arousal speech noise improves performance in N-back task: an ERP study.

Science.gov (United States)

Han, Longzhu; Liu, Yunzhe; Zhang, Dandan; Jin, Yi; Luo, Yuejia

2013-01-01

The relationship between noise and human performance is a crucial topic in ergonomic research. However, the brain dynamics of the emotional arousal effects of background noises are still unclear. The current study employed meaningless speech noises in the n-back working memory task to explore the changes of event-related potentials (ERPs) elicited by the noises with low arousal level vs. high arousal level. We found that the memory performance in low arousal condition were improved compared with the silent and the high arousal conditions; participants responded more quickly and had larger P2 and P3 amplitudes in low arousal condition while the performance and ERP components showed no significant difference between high arousal and silent conditions. These findings suggested that the emotional arousal dimension of background noises had a significant influence on human working memory performance, and that this effect was independent of the acoustic characteristics of noises (e.g., intensity) and the meaning of speech materials. The current findings improve our understanding of background noise effects on human performance and lay the ground for the investigation of patients with attention deficits.
Low-arousal speech noise improves performance in N-back task: an ERP study.

Directory of Open Access Journals (Sweden)

Longzhu Han

Full Text Available The relationship between noise and human performance is a crucial topic in ergonomic research. However, the brain dynamics of the emotional arousal effects of background noises are still unclear. The current study employed meaningless speech noises in the n-back working memory task to explore the changes of event-related potentials (ERPs elicited by the noises with low arousal level vs. high arousal level. We found that the memory performance in low arousal condition were improved compared with the silent and the high arousal conditions; participants responded more quickly and had larger P2 and P3 amplitudes in low arousal condition while the performance and ERP components showed no significant difference between high arousal and silent conditions. These findings suggested that the emotional arousal dimension of background noises had a significant influence on human working memory performance, and that this effect was independent of the acoustic characteristics of noises (e.g., intensity and the meaning of speech materials. The current findings improve our understanding of background noise effects on human performance and lay the ground for the investigation of patients with attention deficits.
Performance of wavelet analysis and neural networks for pathological voices identification

Science.gov (United States)

Salhi, Lotfi; Talbi, Mourad; Abid, Sabeur; Cherif, Adnane

2011-09-01

Within the medical environment, diverse techniques exist to assess the state of the voice of the patient. The inspection technique is inconvenient for a number of reasons, such as its high cost, the duration of the inspection, and above all, the fact that it is an invasive technique. This study focuses on a robust, rapid and accurate system for automatic identification of pathological voices. This system employs non-invasive, non-expensive and fully automated method based on hybrid approach: wavelet transform analysis and neural network classifier. First, we present the results obtained in our previous study while using classic feature parameters. These results allow visual identification of pathological voices. Second, quantified parameters drifting from the wavelet analysis are proposed to characterise the speech sample. On the other hand, a system of multilayer neural networks (MNNs) has been developed which carries out the automatic detection of pathological voices. The developed method was evaluated using voice database composed of recorded voice samples (continuous speech) from normophonic or dysphonic speakers. The dysphonic speakers were patients of a National Hospital 'RABTA' of Tunis Tunisia and a University Hospital in Brussels, Belgium. Experimental results indicate a success rate ranging between 75% and 98.61% for discrimination of normal and pathological voices using the proposed parameters and neural network classifier. We also compared the average classification rate based on the MNN, Gaussian mixture model and support vector machines.
The role of bone conduction hearing aids in congenital unilateral hearing loss: A systematic review.

Science.gov (United States)

Liu, C Carrie; Livingstone, Devon; Yunker, Warren K

2017-03-01

To systematically review the literature on the audiological and/or quality of life benefits of a bone conduction hearing aid (BCHA) in children with congenital unilateral conductive or sensorineural deafness. A systematic search was performed according to the PRISMA guidelines using the PubMed, Medline, and Embase databases. Data were collected on the following outcomes of interest: speech reception threshold, speech discrimination, sound localization, and quality of life measures. Given the heterogeneity of the data for quantitative analysis, the results are qualitatively summarized. Eight studies were included in the review. Four studies examined the audiological outcomes associated with bone conduction hearing aid implantation. There was a consistent gain in speech reception thresholds and speech discrimination, especially in noisy environments. Results pertaining to sound localization was inconsistent. The studies that examined quality of life measures reported a high usage rate of BCHAs among children. Quality of life improvements are reported with suggested benefit in the subdomain of learning. Given the potential benefits of a BCHA, along with the fact that it can be safely trialed using a headband, it is reasonable to trial a BCHA in children with congenital unilateral deafness. Should the trial offer audiological and/or quality of life benefits for the individual child, then BCHA implantation can be considered. Copyright © 2017 Elsevier B.V. All rights reserved.
Transitioning hearing aid users with severe and profound loss to a new gain/frequency response: benefit, perception, and acceptance.

Science.gov (United States)

Convery, Elizabeth; Keidser, Gitte

2011-03-01

Adults with severe and profound hearing loss tend to be long-term, full-time users of amplification who are highly reliant on their hearing aids. As a result of these characteristics, they are often reluctant to update their hearing aids when new features or signal-processing algorithms become available. Due to the electroacoustic constraints of older devices, many severely and profoundly hearing-impaired adults continue to wear hearing aids that provide more low- and mid-frequency gain and less high-frequency gain than would be prescribed by the National Acoustic Laboratories' revised formula with profound correction factor (NAL-RP). To investigate the effect of a gradual change in gain/frequency response on experienced hearing-aid wearers with moderately severe to profound hearing loss. Double-blind, randomized controlled trial. Twenty-three experienced adult hearing-aid users with severe and profound hearing loss participated in the study. Participants were selected for inclusion in the study if the gain/frequency response of their own hearing aids differed significantly from their NAL-RP prescription. Participants were assigned either to a control or to an experimental group balanced for aided ear three-frequency pure-tone average (PTA) and age. Participants were fitted with Siemens Artis 2 SP behind-the-ear (BTE) hearing aids that were matched to the gain/frequency response of their own hearing aids for a 65 dB SPL input level. The experimental group progressed incrementally to their NAL-RP targets over the course of 15 wk, while the control group maintained their initial settings throughout the study. Aided speech discrimination testing, loudness scaling, and structured questionnaires were completed at 3, 6, 9, 12, and 15 wk postfitting. A paired comparison between the old and new gain/frequency responses was completed at 1 and 15 wk postfitting. Statistical analysis was conducted to examine differences between the experimental and control groups and changes
Hearing aids in children: the importance of the verification and validation processes.

Science.gov (United States)

Rissatto, Mara Renata; Novaes, Beatriz Cavalcanti de Albuquerque Caiuby

2009-01-01

during the fitting of hearing aids in children it is important, besides using a verification protocol, to have a validation process. to describe and discuss the use of a protocol for the fitting and the verification of hearing aids in children, as well as the impact of the adjustment of the acoustic characteristics in speech perception tasks. ten children aging from three to eleven years were enrolled in this study. All children presented bilateral sensorineural hearing impairment, were users of hearing aids and were followed at a public hearing health care service in Bahia. The children were submitted to the following procedures: pure tone air and bone conduction thresholds; real-ear coupler difference (RECD); verification with real-ear measurement equipment: coupler gain/output and insertion gain and to speech perception tasks: 'The Six-Sound Test' (Ling, 2006) and the 'Word Associations for Syllable Perception' (WASP - Koch, 1999). The programmed electro acoustic characteristics of the hearing aids were compared to the electro acoustic characteristics prescribed by the DSL [i/o] v4.1 software. The speech perception tasks were reapplied on three occasions: straight after the modification of the electro acoustic characteristics, after 30 days and 60 days. for more than 50% of the tested children, the programmed electro acoustic characteristics of the hearing aids did not correspond to that suggested by the DSL [i/o] software. Adequate prescription was verified in 70% of the investigated sample; this was also confirmed by the results in the speech perception tasks (p=0.000). This data confirmed that the mean percentage of correct answers increased after the modification of the electro acoustic characteristics. the use of a protocol that verifies and validates the fitting of hearing aids in children is necessary.
Intelligibility for Binaural Speech with Discarded Low-SNR Speech Components.

Science.gov (United States)

Schoenmaker, Esther; van de Par, Steven

2016-01-01

Speech intelligibility in multitalker settings improves when the target speaker is spatially separated from the interfering speakers. A factor that may contribute to this improvement is the improved detectability of target-speech components due to binaural interaction in analogy to the Binaural Masking Level Difference (BMLD). This would allow listeners to hear target speech components within specific time-frequency intervals that have a negative SNR, similar to the improvement in the detectability of a tone in noise when these contain disparate interaural difference cues. To investigate whether these negative-SNR target-speech components indeed contribute to speech intelligibility, a stimulus manipulation was performed where all target components were removed when local SNRs were smaller than a certain criterion value. It can be expected that for sufficiently high criterion values target speech components will be removed that do contribute to speech intelligibility. For spatially separated speakers, assuming that a BMLD-like detection advantage contributes to intelligibility, degradation in intelligibility is expected already at criterion values below 0 dB SNR. However, for collocated speakers it is expected that higher criterion values can be applied without impairing speech intelligibility. Results show that degradation of intelligibility for separated speakers is only seen for criterion values of 0 dB and above, indicating a negligible contribution of a BMLD-like detection advantage in multitalker settings. These results show that the spatial benefit is related to a spatial separation of speech components at positive local SNRs rather than to a BMLD-like detection improvement for speech components at negative local SNRs.
A comparative study of hearing aids and round window application of the vibrant sound bridge (VSB) for patients with mixed or conductive hearing loss.

Science.gov (United States)

Marino, Roberta; Linton, Nicola; Eikelboom, Robert H; Statham, Elle; Rajan, Gunesh P

2013-04-01

This study was undertaken to determine the efficacy of the round window (RW) application of the vibrant soundbridge (VSB) in patients with mixed or conductive hearing loss. Speech in quiet and in noise were compared to preoperative data attained with conventional hearing aids so that each subject served as his or her own control in a single test protocol. Eighteen adults implanted monaurally with the VSB in the poorer hearing ear. Experience with the VSB ranged from nine to 25 months. Sixteen of the 18 subjects were successful VSB users, wearing their device all waking hours. There was no significant deterioration in the averaged bone conduction results preoperatively versus post-operatively (p>0.05). Speech recognition in quiet results were not significantly different to performance attained whilst wearing hearing aids (p>0.05). Speech recognition in noise performance was substantially improved with use of the VSB in most test conditions. For the majority of the subjects, the VSB was an effective method of hearing restoration for their mixed and conductive hearing loss.
A false sense of security: safety behaviors erode objective speech performance in individuals with social anxiety disorder.

Science.gov (United States)

Rowa, Karen; Paulitzki, Jeffrey R; Ierullo, Maria D; Chiang, Brenda; Antony, Martin M; McCabe, Randi E; Moscovitch, David A

2015-05-01

In the current study, 55 participants with a diagnosis of generalized social anxiety disorder (SAD), 23 participants with a diagnosis of an anxiety disorder other than SAD with no comorbid SAD, and 50 healthy controls completed a speech task as well as self-reported measures of safety behavior use. Speeches were videotaped and coded for global and specific indicators of performance by two raters who were blind to participants' diagnostic status. Results suggested that the objective performance of people with SAD was poorer than that of both control groups, who did not differ from each other. Moreover, self-reported use of safety behaviors during the speech strongly mediated the relationship between diagnostic group and observers' performance ratings. These results are consistent with contemporary cognitive-behavioral and interpersonal models of SAD and suggest that socially anxious individuals' performance skills may be undermined by the use of safety behaviors. These data provide further support for recommendations from previous studies that the elimination of safety behaviors ought to be a priority in cognitive behavioral therapy for SAD. Copyright © 2014. Published by Elsevier Ltd.
Automated Speech Rate Measurement in Dysarthria

Science.gov (United States)

Martens, Heidi; Dekens, Tomas; Van Nuffelen, Gwen; Latacz, Lukas; Verhelst, Werner; De Bodt, Marc

2015-01-01

Purpose: In this study, a new algorithm for automated determination of speech rate (SR) in dysarthric speech is evaluated. We investigated how reliably the algorithm calculates the SR of dysarthric speech samples when compared with calculation performed by speech-language pathologists. Method: The new algorithm was trained and tested using Dutch…
The Contribution of Cognitive Factors to Individual Differences in Understanding Noise-Vocoded Speech in Young and Older Adults

Directory of Open Access Journals (Sweden)

Stephanie Rosemann

2017-06-01

Full Text Available Noise-vocoded speech is commonly used to simulate the sensation after cochlear implantation as it consists of spectrally degraded speech. High individual variability exists in learning to understand both noise-vocoded speech and speech perceived through a cochlear implant (CI. This variability is partly ascribed to differing cognitive abilities like working memory, verbal skills or attention. Although clinically highly relevant, up to now, no consensus has been achieved about which cognitive factors exactly predict the intelligibility of speech in noise-vocoded situations in healthy subjects or in patients after cochlear implantation. We aimed to establish a test battery that can be used to predict speech understanding in patients prior to receiving a CI. Young and old healthy listeners completed a noise-vocoded speech test in addition to cognitive tests tapping on verbal memory, working memory, lexicon and retrieval skills as well as cognitive flexibility and attention. Partial-least-squares analysis revealed that six variables were important to significantly predict vocoded-speech performance. These were the ability to perceive visually degraded speech tested by the Text Reception Threshold, vocabulary size assessed with the Multiple Choice Word Test, working memory gauged with the Operation Span Test, verbal learning and recall of the Verbal Learning and Retention Test and task switching abilities tested by the Comprehensive Trail-Making Test. Thus, these cognitive abilities explain individual differences in noise-vocoded speech understanding and should be considered when aiming to predict hearing-aid outcome.
Automatic speech recognition (ASR) based approach for speech therapy of aphasic patients: A review

Science.gov (United States)

Jamal, Norezmi; Shanta, Shahnoor; Mahmud, Farhanahani; Sha'abani, MNAH

2017-09-01

This paper reviews the state-of-the-art an automatic speech recognition (ASR) based approach for speech therapy of aphasic patients. Aphasia is a condition in which the affected person suffers from speech and language disorder resulting from a stroke or brain injury. Since there is a growing body of evidence indicating the possibility of improving the symptoms at an early stage, ASR based solutions are increasingly being researched for speech and language therapy. ASR is a technology that transfers human speech into transcript text by matching with the system's library. This is particularly useful in speech rehabilitation therapies as they provide accurate, real-time evaluation for speech input from an individual with speech disorder. ASR based approaches for speech therapy recognize the speech input from the aphasic patient and provide real-time feedback response to their mistakes. However, the accuracy of ASR is dependent on many factors such as, phoneme recognition, speech continuity, speaker and environmental differences as well as our depth of knowledge on human language understanding. Hence, the review examines recent development of ASR technologies and its performance for individuals with speech and language disorders.

Improving Understanding of Emotional Speech Acoustic Content

Science.gov (United States)

Tinnemore, Anna

Children with cochlear implants show deficits in identifying emotional intent of utterances without facial or body language cues. A known limitation to cochlear implants is the inability to accurately portray the fundamental frequency contour of speech which carries the majority of information needed to identify emotional intent. Without reliable access to the fundamental frequency, other methods of identifying vocal emotion, if identifiable, could be used to guide therapies for training children with cochlear implants to better identify vocal emotion. The current study analyzed recordings of adults speaking neutral sentences with a set array of emotions in a child-directed and adult-directed manner. The goal was to identify acoustic cues that contribute to emotion identification that may be enhanced in child-directed speech, but are also present in adult-directed speech. Results of this study showed that there were significant differences in the variation of the fundamental frequency, the variation of intensity, and the rate of speech among emotions and between intended audiences.
Song and speech: examining the link between singing talent and speech imitation ability.

Science.gov (United States)

Christiner, Markus; Reiterer, Susanne M

2013-01-01

In previous research on speech imitation, musicality, and an ability to sing were isolated as the strongest indicators of good pronunciation skills in foreign languages. We, therefore, wanted to take a closer look at the nature of the ability to sing, which shares a common ground with the ability to imitate speech. This study focuses on whether good singing performance predicts good speech imitation. Forty-one singers of different levels of proficiency were selected for the study and their ability to sing, to imitate speech, their musical talent and working memory were tested. Results indicated that singing performance is a better indicator of the ability to imitate speech than the playing of a musical instrument. A multiple regression revealed that 64% of the speech imitation score variance could be explained by working memory together with educational background and singing performance. A second multiple regression showed that 66% of the speech imitation variance of completely unintelligible and unfamiliar language stimuli (Hindi) could be explained by working memory together with a singer's sense of rhythm and quality of voice. This supports the idea that both vocal behaviors have a common grounding in terms of vocal and motor flexibility, ontogenetic and phylogenetic development, neural orchestration and auditory memory with singing fitting better into the category of "speech" on the productive level and "music" on the acoustic level. As a result, good singers benefit from vocal and motor flexibility, productively and cognitively, in three ways. (1) Motor flexibility and the ability to sing improve language and musical function. (2) Good singers retain a certain plasticity and are open to new and unusual sound combinations during adulthood both perceptually and productively. (3) The ability to sing improves the memory span of the auditory working memory.
Song and speech: examining the link between singing talent and speech imitation ability

Directory of Open Access Journals (Sweden)

Markus eChristiner

2013-11-01

Full Text Available In previous research on speech imitation, musicality and an ability to sing were isolated as the strongest indicators of good pronunciation skills in foreign languages. We, therefore, wanted to take a closer look at the nature of the ability to sing, which shares a common ground with the ability to imitate speech. This study focuses on whether good singing performance predicts good speech imitation. Fourty-one singers of different levels of proficiency were selected for the study and their ability to sing, to imitate speech, their musical talent and working memory were tested. Results indicated that singing performance is a better indicator of the ability to imitate speech than the playing of a musical instrument. A multiple regression revealed that 64 % of the speech imitation score variance could be explained by working memory together with educational background and singing performance. A second multiple regression showed that 66 % of the speech imitation variance of completely unintelligible and unfamiliar language stimuli (Hindi could be explained by working memory together with a singer’s sense of rhythm and quality of voice. This supports the idea that both vocal behaviors have a common grounding in terms of vocal and motor flexibility, ontogenetic and phylogenetic development, neural orchestration and sound memory with singing fitting better into the category of "speech" on the productive level and "music" on the acoustic level. As a result, good singers benefit from vocal and motor flexibility, productively and cognitively, in three ways. 1. Motor flexibility and the ability to sing improve language and musical function. 2. Good singers retain a certain plasticity and are open to new and unusual sound combinations during adulthood both perceptually and productively. 3. The ability to sing improves the memory span of the auditory short term memory.
Speech Processing to Improve the Perception of Speech in Background Noise for Children With Auditory Processing Disorder and Typically Developing Peers.

Science.gov (United States)

Flanagan, Sheila; Zorilă, Tudor-Cătălin; Stylianou, Yannis; Moore, Brian C J

2018-01-01

Auditory processing disorder (APD) may be diagnosed when a child has listening difficulties but has normal audiometric thresholds. For adults with normal hearing and with mild-to-moderate hearing impairment, an algorithm called spectral shaping with dynamic range compression (SSDRC) has been shown to increase the intelligibility of speech when background noise is added after the processing. Here, we assessed the effect of such processing using 8 children with APD and 10 age-matched control children. The loudness of the processed and unprocessed sentences was matched using a loudness model. The task was to repeat back sentences produced by a female speaker when presented with either speech-shaped noise (SSN) or a male competing speaker (CS) at two signal-to-background ratios (SBRs). Speech identification was significantly better with SSDRC processing than without, for both groups. The benefit of SSDRC processing was greater for the SSN than for the CS background. For the SSN, scores were similar for the two groups at both SBRs. For the CS, the APD group performed significantly more poorly than the control group. The overall improvement produced by SSDRC processing could be useful for enhancing communication in a classroom where the teacher's voice is broadcast using a wireless system.
Speech coding

Energy Technology Data Exchange (ETDEWEB)

Ravishankar, C., Hughes Network Systems, Germantown, MD

1998-05-08

Speech is the predominant means of communication between human beings and since the invention of the telephone by Alexander Graham Bell in 1876, speech services have remained to be the core service in almost all telecommunication systems. Original analog methods of telephony had the disadvantage of speech signal getting corrupted by noise, cross-talk and distortion Long haul transmissions which use repeaters to compensate for the loss in signal strength on transmission links also increase the associated noise and distortion. On the other hand digital transmission is relatively immune to noise, cross-talk and distortion primarily because of the capability to faithfully regenerate digital signal at each repeater purely based on a binary decision. Hence end-to-end performance of the digital link essentially becomes independent of the length and operating frequency bands of the link Hence from a transmission point of view digital transmission has been the preferred approach due to its higher immunity to noise. The need to carry digital speech became extremely important from a service provision point of view as well. Modem requirements have introduced the need for robust, flexible and secure services that can carry a multitude of signal types (such as voice, data and video) without a fundamental change in infrastructure. Such a requirement could not have been easily met without the advent of digital transmission systems, thereby requiring speech to be coded digitally. The term Speech Coding is often referred to techniques that represent or code speech signals either directly as a waveform or as a set of parameters by analyzing the speech signal. In either case, the codes are transmitted to the distant end where speech is reconstructed or synthesized using the received set of codes. A more generic term that is applicable to these techniques that is often interchangeably used with speech coding is the term voice coding. This term is more generic in the sense that the
Individual differences in speech-in-noise perception parallel neural speech processing and attention in preschoolers

Science.gov (United States)

Thompson, Elaine C.; Carr, Kali Woodruff; White-Schwoch, Travis; Otto-Meyer, Sebastian; Kraus, Nina

2016-01-01

From bustling classrooms to unruly lunchrooms, school settings are noisy. To learn effectively in the unwelcome company of numerous distractions, children must clearly perceive speech in noise. In older children and adults, speech-in-noise perception is supported by sensory and cognitive processes, but the correlates underlying this critical listening skill in young children (3–5 year olds) remain undetermined. Employing a longitudinal design (two evaluations separated by ~12 months), we followed a cohort of 59 preschoolers, ages 3.0–4.9, assessing word-in-noise perception, cognitive abilities (intelligence, short-term memory, attention), and neural responses to speech. Results reveal changes in word-in-noise perception parallel changes in processing of the fundamental frequency (F0), an acoustic cue known for playing a role central to speaker identification and auditory scene analysis. Four unique developmental trajectories (speech-in-noise perception groups) confirm this relationship, in that improvements and declines in word-in-noise perception couple with enhancements and diminishments of F0 encoding, respectively. Improvements in word-in-noise perception also pair with gains in attention. Word-in-noise perception does not relate to strength of neural harmonic representation or short-term memory. These findings reinforce previously-reported roles of F0 and attention in hearing speech in noise in older children and adults, and extend this relationship to preschool children. PMID:27864051
Assessment of Danish-speaking children’s phonological development and speech disorders

DEFF Research Database (Denmark)

Clausen, Marit Carolin; Fox-Boyer, Annette

2018-01-01

The identification of speech sounds disorders is an important everyday task for speech and language therapists (SLTs) working with children. Therefore, assessment tools are needed that are able to correctly identify and diagnose a child with a suspected speech disorder and furthermore, that provide...... of the existing speech assessments in Denmark showed that none of the materials fulfilled current recommendations identified in research literature. Therefore, the aim of this paper is to describe the evaluation of a newly constructed instrument for assessing the speech development and disorders of Danish...... with suspected speech disorder (Clausen and Fox-Boyer, in prep). The results indicated that the instrument showed strong inter-examiner reliability for both populations as well as a high content and diagnostic validity. Hence, the study showed that the LogoFoVa can be regarded as a reliable and valid tool...
Speech perception benefits of internet versus conventional telephony for hearing-impaired individuals.

Science.gov (United States)

Mantokoudis, Georgios; Dubach, Patrick; Pfiffner, Flurin; Kompis, Martin; Caversaccio, Marco; Senn, Pascal

2012-07-16

Telephone communication is a challenge for many hearing-impaired individuals. One important technical reason for this difficulty is the restricted frequency range (0.3-3.4 kHz) of conventional landline telephones. Internet telephony (voice over Internet protocol [VoIP]) is transmitted with a larger frequency range (0.1-8 kHz) and therefore includes more frequencies relevant to speech perception. According to a recently published, laboratory-based study, the theoretical advantage of ideal VoIP conditions over conventional telephone quality has translated into improved speech perception by hearing-impaired individuals. However, the speech perception benefits of nonideal VoIP network conditions, which may occur in daily life, have not been explored. VoIP use cannot be recommended to hearing-impaired individuals before its potential under more realistic conditions has been examined. To compare realistic VoIP network conditions, under which digital data packets may be lost, with ideal conventional telephone quality with respect to their impact on speech perception by hearing-impaired individuals. We assessed speech perception using standardized test material presented under simulated VoIP conditions with increasing digital data packet loss (from 0% to 20%) and compared with simulated ideal conventional telephone quality. We monaurally tested 10 adult users of cochlear implants, 10 adult users of hearing aids, and 10 normal-hearing adults in the free sound field, both in quiet and with background noise. Across all participant groups, mean speech perception scores using VoIP with 0%, 5%, and 10% packet loss were 15.2% (range 0%-53%), 10.6% (4%-46%), and 8.8% (7%-33%) higher, respectively, than with ideal conventional telephone quality. Speech perception did not differ between VoIP with 20% packet loss and conventional telephone quality. The maximum benefits were observed under ideal VoIP conditions without packet loss and were 36% (P = .001) for cochlear implant users, 18
Special needs children with speech and hearing difficulties: prevalence and unmet needs.

Science.gov (United States)

Kenney, Mary Kay; Kogan, Michael D

2011-01-01

The purpose of this study was to establish prevalences and sociodemographic characteristics associated with parent-reported speech and hearing difficulties among children with special health care needs (CSHCN); determine unmet needs for therapy, hearing aids, and communication devices; and examine the association between unmet needs and resources such as health insurance, early intervention/special education, and a medical home. Data were analyzed for 300,910 children without special health care needs and 40,723 CSHCN from the 2005-2006 National Survey of Children with Special Health Care Needs. Prevalence, sociodemographic characteristics, and unmet needs for 7132 CSHCN with speech difficulties and 1982 CSHCN with hearing difficulties were assessed. Logistic regression was used to determine the associations between unmet needs for therapy or hearing/communication devices and resources for addressing needs for therapy, hearing, and communication aids. The parent-reported prevalence of speech difficulty among CSHCN in the general population was 2.9% and approximately 20% among all CSHCN, in contrast to the lower prevalence of hearing difficulty (0.7% and 5%, respectively). Relative unmet need was greatest for communication devices and least for hearing aids. The strongest association with reducing unmet needs was having a medical home, and the most significant aspect of medical home was having effective care coordination. Having a medical home is significantly associated with fewer unmet needs for therapy and hearing/communication devices among CSHCN with speech and hearing difficulties. Care coordination may constitute an important factor that allows the primary care provider to link with services that CSHCN with communication problems require. Published by Elsevier Inc.
Collective speech acts

NARCIS (Netherlands)

Meijers, A.W.M.; Tsohatzidis, S.L.

2007-01-01

From its early development in the 1960s, speech act theory always had an individualistic orientation. It focused exclusively on speech acts performed by individual agents. Paradigmatic examples are ‘I promise that p’, ‘I order that p’, and ‘I declare that p’. There is a single speaker and a single
Cochlear implants: 100 pediatric case conversions from the body worn to the nucleus esprit 22 ear level speech processor.

Science.gov (United States)

Dodd, M C; Nikolopoulos, T P; Totten, C; Cope, Y; O'Donoghue, G M

2005-07-01

To assess performance of Nucleus 22 mini system pediatric users converted from the Spectra 22 body-worn to the ESPrit 22 ear-level speech processor using aided thresholds and speech discrimination measures before and after the conversion. Spectra 22 body-worn speech processor users were chosen using preselection criteria (stable map, ability to report on the quality of the signal, no device problems). The subjects underwent tuning, map conversion, fitting of the ESPrit 22, and aided soundfield threshold and speech discrimination testing. The first 100 consecutive conversions are analyzed in this study. Fifty children (50%) were female, and 50 (50%) were male. The average age at implantation was 4.6 years (median 4.3 years, range 1.7 to 11 years). The average age of fitting the ear level speech processor was 11.1 years (median 11 years, range 6.2 to 18.2 years). Tertiary referral pediatric cochlear implant center in the United Kingdom. Of the 100 fittings attempted, all Spectra 22 maps could to be converted for use in the ESPrit 22. Of these 100 fittings, 44 were straightforward with no adjustment to map parameters being required, and 56 needed rate reductions and other map adjustments to achieve the conversion. The difference of the mean thresholds before and after the conversion did not exceed 2 dB across the frequencies studied (0.5-4 kHz). In 95% of the cases, the differences were less than 9 dB(A). With regard to speech discrimination testing, the mean threshold before the conversion was 53.4 dB and after the conversion 52.7 dB. Of the 100 conversions, only five children stopped using the ESPrit 22 despite fitting being achieved. Conversion from the Spectra 22 body worn to the ESPrit 22 ear level speech processor was found to be feasible in all the 100 cases studied. Only a minority (5%) of children chose not to use the ear level speech processor suggesting that children and parents were satisfied from the conversion.
Hearing impairment, cognition and speech understanding: exploratory factor analyses of a comprehensive test battery for a group of hearing aid users, the n200 study

Science.gov (United States)

Rönnberg, Jerker; Lunner, Thomas; Ng, Elaine Hoi Ning; Lidestam, Björn; Zekveld, Adriana Agatha; Sörqvist, Patrik; Lyxell, Björn; Träff, Ulf; Yumba, Wycliffe; Classon, Elisabet; Hällgren, Mathias; Larsby, Birgitta; Signoret, Carine; Pichora-Fuller, M. Kathleen; Rudner, Mary; Danielsson, Henrik; Stenfelt, Stefan

2016-01-01

Abstract Objective: The aims of the current n200 study were to assess the structural relations between three classes of test variables (i.e. HEARING, COGNITION and aided speech-in-noise OUTCOMES) and to describe the theoretical implications of these relations for the Ease of Language Understanding (ELU) model. Study sample: Participants were 200 hard-of-hearing hearing-aid users, with a mean age of 60.8 years. Forty-three percent were females and the mean hearing threshold in the better ear was 37.4 dB HL. Design: LEVEL1 factor analyses extracted one factor per test and/or cognitive function based on a priori conceptualizations. The more abstract LEVEL 2 factor analyses were performed separately for the three classes of test variables. Results: The HEARING test variables resulted in two LEVEL 2 factors, which we labelled SENSITIVITY and TEMPORAL FINE STRUCTURE; the COGNITIVE variables in one COGNITION factor only, and OUTCOMES in two factors, NO CONTEXT and CONTEXT. COGNITION predicted the NO CONTEXT factor to a stronger extent than the CONTEXT outcome factor. TEMPORAL FINE STRUCTURE and SENSITIVITY were associated with COGNITION and all three contributed significantly and independently to especially the NO CONTEXT outcome scores (R2 = 0.40). Conclusions: All LEVEL 2 factors are important theoretically as well as for clinical assessment. PMID:27589015
Identification of changes along a continuum of speech intonation is impaired in congenital amusia

Directory of Open Access Journals (Sweden)

Sean eHutchins

2010-12-01

Full Text Available A small number of individuals have severe musical problems that have neuro-genetic underpinnings. This musical disorder is termed congenital amusia, an umbrella term for lifelong musical disabilities that cannot be attributed to deafness, lack of exposure, or brain damage after birth. Amusics seem to lack the ability to detect fine pitch differences in tone sequences. However, differences between statements and questions, which vary in final pitch, are well perceived by most congenital amusic individuals. We hypothesized that the origin of this apparent domain-specificity of the disorder lies in the range of pitch variations, which are very coarse in speech as compared to music. Here, we tested this hypothesis by using a continuum of gradually increasing final pitch in both speech and tone sequences. To this aim, nine amusic cases and nine matched controls were presented with statements and questions that varied on a pitch continuum from falling to rising in 11 steps. The sentences were either naturally spoken or were tone sequence versions of these. The task was to categorize the sentences as statements or questions and the tone sequences, as falling or rising. In each case, the observation of an S-shaped identification function indicates that amusics can accurately identify unambiguous examples of statements and questions but have problems with fine variations between these endpoints. Thus, the results indicate that a deficient pitch perception might compromise music, not because it is specialized for that domain but because music requirements are more fine-grained.
Hearing and seeing meaning in noise. Alpha, beta and gamma oscillations predict gestural enhancement of degraded speech comprehension

NARCIS (Netherlands)

Drijvers, L.; Özyürek, A.; Jensen, O.

2018-01-01

During face-to-face communication, listeners integrate speech with gestures. The semantic information conveyed by iconic gestures (e.g., a drinking gesture) can aid speech comprehension in adverse listening conditions. In this magnetoencephalography (MEG) study, we investigated the spatiotemporal
Improvement of intelligibility of ideal binary-masked noisy speech by adding background noise.

Science.gov (United States)

Cao, Shuyang; Li, Liang; Wu, Xihong

2011-04-01

When a target-speech/masker mixture is processed with the signal-separation technique, ideal binary mask (IBM), intelligibility of target speech is remarkably improved in both normal-hearing listeners and hearing-impaired listeners. Intelligibility of speech can also be improved by filling in speech gaps with un-modulated broadband noise. This study investigated whether intelligibility of target speech in the IBM-treated target-speech/masker mixture can be further improved by adding a broadband-noise background. The results of this study show that following the IBM manipulation, which remarkably released target speech from speech-spectrum noise, foreign-speech, or native-speech masking (experiment 1), adding a broadband-noise background with the signal-to-noise ratio no less than 4 dB significantly improved intelligibility of target speech when the masker was either noise (experiment 2) or speech (experiment 3). The results suggest that since adding the noise background shallows the areas of silence in the time-frequency domain of the IBM-treated target-speech/masker mixture, the abruption of transient changes in the mixture is smoothed and the perceived continuity of target-speech components becomes enhanced, leading to improved target-speech intelligibility. The findings are useful for advancing computational auditory scene analysis, hearing-aid/cochlear-implant designs, and understanding of speech perception under "cocktail-party" conditions.
Music and hearing aids--an introduction.

Science.gov (United States)

Chasin, Marshall

2012-09-01

Modern digital hearing aids have provided improved fidelity over those of earlier decades for speech. The same however cannot be said for music. Most modern hearing aids have a limitation of their "front end," which comprises the analog-to-digital (A/D) converter. For a number of reasons, the spectral nature of music as an input to a hearing aid is beyond the optimal operating conditions of the "front end" components. Amplified music tends to be of rather poor fidelity. Once the music signal is distorted, no amount of software manipulation that occurs later in the circuitry can improve things. The solution is not a software issue. Some characteristics of music that make it difficult to be transduced without significant distortion include an increased sound level relative to that of speech, and the crest factor- the difference in dB between the instantaneous peak of a signal and its RMS value. Clinical strategies and technical innovations have helped to improve the fidelity of amplified music and these include a reduction of the level of the input that is presented to the A/D converter.
DELVING INTO SPEECH ACT A Case Of Indonesian EFL Young Learners

Directory of Open Access Journals (Sweden)

Swastika Septiani, S.Pd

2017-04-01

Full Text Available This study attempts to describe the use of speech acts applied in primary school. This study is intended to identify the speech acts performed in primary school, to find the most dominant speech acts performed in elementary school, to give brief description of how speech acts applied in primary school, and to know how to apply the result of the study in English teaching learning to young learners. The speech acts performed in primary school is classified based on Searle‘s theory of speech acts. The most dominant speech acts performed in primary school is Directive (41.17%, the second speech act mostly performed is Declarative (33.33%, the third speech act mostly performed is Representative and Expressive (each 11.76%, and the least speech act performed is Commisive (1.9%. The speech acts performed in elementary school is applied on the context of situation determined by the National Education Standards Agency (BSNP. The speech acts performed in fourth grade have to be applied in the context of classroom, and the speech acts performed in fifth grade have to be applied in the context of school, whereas the speech acts performed in sixth grade have to be applied in the context of the students‘ surroundings. The result of this study is highy expected to give significant contribution to English teaching learning to young learners. By acknowledging the characteristics of young learners, the way they learn English as a foreign language, the teachers are expected to have inventive strategies and various techniques to create a fun and condusive atmosphere in English class.
Development of a Bone-Conducted Ultrasonic Hearing Aid for the Profoundly Deaf: Evaluation of Sound Quality Using a Semantic Differential Method

Science.gov (United States)

Nakagawa, Seiji; Fujiyuki, Chika; Kagomiya, Takayuki

2013-07-01

Bone-conducted ultrasound (BCU) is perceived even by the profoundly sensorineural deaf. A novel hearing aid using the perception of amplitude-modulated BCU (BCU hearing aid: BCUHA) has been developed. However, there is room for improvement particularly in terms of sound quality. BCU speech is accompanied by a strong high-pitched tone and contain some distortion. In this study, the sound quality of BCU speech with several types of amplitude modulation [double-sideband with transmitted carrier (DSB-TC), double-sideband with suppressed carrier (DSB-SC), and transposed modulations] and air-conducted (AC) speech was quantitatively evaluated using semantic differential and factor analysis. The results showed that all the types of BCU speech had higher metallic and lower esthetic factor scores than AC speech. On the other hand, transposed speech was closer than the other types of BCU speech to AC speech generally; the transposed speech showed a higher powerfulness factor score than the other types of BCU speech and a higher esthetic factor score than DSB-SC speech. These results provide useful information for further development of the BCUHA.
An algorithm of improving speech emotional perception for hearing aid

Science.gov (United States)

Xi, Ji; Liang, Ruiyu; Fei, Xianju

2017-07-01

In this paper, a speech emotion recognition (SER) algorithm was proposed to improve the emotional perception of hearing-impaired people. The algorithm utilizes multiple kernel technology to overcome the drawback of SVM: slow training speed. Firstly, in order to improve the adaptive performance of Gaussian Radial Basis Function (RBF), the parameter determining the nonlinear mapping was optimized on the basis of Kernel target alignment. Then, the obtained Kernel Function was used as the basis kernel of Multiple Kernel Learning (MKL) with slack variable that could solve the over-fitting problem. However, the slack variable also brings the error into the result. Therefore, a soft-margin MKL was proposed to balance the margin against the error. Moreover, the relatively iterative algorithm was used to solve the combination coefficients and hyper-plane equations. Experimental results show that the proposed algorithm can acquire an accuracy of 90% for five kinds of emotions including happiness, sadness, anger, fear and neutral. Compared with KPCA+CCA and PIM-FSVM, the proposed algorithm has the highest accuracy.
Studying performation: the arrangement of speech, calculation and writing acts within dispositifs : Carbon accounting for strategizing in a large corporation

OpenAIRE

Le Breton , Morgane; Aggeri , Franck

2016-01-01

International audience; This paper aims at proposing an analytical framework for performation process that is performation through speech, calculation and writing acts connected within a “dispositif”. This analytical framework is put into practice in the case study of a French large corporation which has built a low-carbon strategy based on carbon accounting tools. We have found that low-carbon strategy is performed through carbon accounting tools since speech, calculation and writing acts ar...

Development of The Viking Speech Scale to classify the speech of children with cerebral palsy.

Science.gov (United States)

Pennington, Lindsay; Virella, Daniel; Mjøen, Tone; da Graça Andrada, Maria; Murray, Janice; Colver, Allan; Himmelmann, Kate; Rackauskaite, Gija; Greitane, Andra; Prasauskiene, Audrone; Andersen, Guro; de la Cruz, Javier

2013-10-01

Surveillance registers monitor the prevalence of cerebral palsy and the severity of resulting impairments across time and place. The motor disorders of cerebral palsy can affect children's speech production and limit their intelligibility. We describe the development of a scale to classify children's speech performance for use in cerebral palsy surveillance registers, and its reliability across raters and across time. Speech and language therapists, other healthcare professionals and parents classified the speech of 139 children with cerebral palsy (85 boys, 54 girls; mean age 6.03 years, SD 1.09) from observation and previous knowledge of the children. Another group of health professionals rated children's speech from information in their medical notes. With the exception of parents, raters reclassified children's speech at least four weeks after their initial classification. Raters were asked to rate how easy the scale was to use and how well the scale described the child's speech production using Likert scales. Inter-rater reliability was moderate to substantial (k>.58 for all comparisons). Test-retest reliability was substantial to almost perfect for all groups (k>.68). Over 74% of raters found the scale easy or very easy to use; 66% of parents and over 70% of health care professionals judged the scale to describe children's speech well or very well. We conclude that the Viking Speech Scale is a reliable tool to describe the speech performance of children with cerebral palsy, which can be applied through direct observation of children or through case note review. Copyright © 2013 Elsevier Ltd. All rights reserved.
Evaluation of pitch coding alternatives for vibrotactile stimulation in speech training of the deaf

Energy Technology Data Exchange (ETDEWEB)

Barbacena, I L; Barros, A T [CEFET/PB, Joao Pessoa - PB (Brazil); Freire, R C S [DEE, UFCG, Campina Grande-PB (Brazil); Vieira, E C A [CEFET/PB, Joao Pessoa - PB (Brazil)

2007-11-15

Use of vibrotactile feedback stimulation as an aid for speech vocalization by the hearing impaired or deaf is reviewed. Architecture of a vibrotactile based speech therapy system is proposed. Different formulations for encoding the fundamental frequency of the vocalized speech into the pulsed stimulation frequency are proposed and investigated. Simulation results are also presented to obtain a comparative evaluation of the effectiveness of the different formulated transformations. Results of the perception sensitivity to the vibrotactile stimulus frequency to verify effectiveness of the above transformations are included.
Evaluation of pitch coding alternatives for vibrotactile stimulation in speech training of the deaf

International Nuclear Information System (INIS)

Barbacena, I L; Barros, A T; Freire, R C S; Vieira, E C A

2007-01-01

Use of vibrotactile feedback stimulation as an aid for speech vocalization by the hearing impaired or deaf is reviewed. Architecture of a vibrotactile based speech therapy system is proposed. Different formulations for encoding the fundamental frequency of the vocalized speech into the pulsed stimulation frequency are proposed and investigated. Simulation results are also presented to obtain a comparative evaluation of the effectiveness of the different formulated transformations. Results of the perception sensitivity to the vibrotactile stimulus frequency to verify effectiveness of the above transformations are included
Approaches for Language Identification in Mismatched Environments

Science.gov (United States)

2016-09-08

domain adaptation, unsupervised learning , deep neural networks, bottleneck features 1. Introduction and task Spoken language identification (LID) is...the process of identifying the language in a spoken speech utterance. In recent years, great improvements in LID system performance have been seen...be the case in practice. Lastly, we conduct an out-of-set experiment where VoA data from 9 other languages (Amharic, Creole, Croatian, English
New Directions In Radioisotope Spectrum Identification

International Nuclear Information System (INIS)

Salaymeh, S.; Jeffcoat, R.

2010-01-01

Recent studies have found the performance of commercial handheld detectors with automatic RIID software to be less than acceptable. Previously, we have explored approaches rooted in speech processing such as cepstral features and information-theoretic measures. Scientific advances are often made when researchers identify mathematical or physical commonalities between different fields and are able to apply mature techniques or algorithms developed in one field to another field which shares some of the same challenges. The authors of this paper have identified similarities between the unsolved problems faced in gamma-spectroscopy for automated radioisotope identification and the challenges of the much larger body of research in speech processing. Our research has led to a probabilistic framework for describing and solving radioisotope identification problems. Many heuristic approaches to classification in current use, including for radioisotope classification, make implicit probabilistic assumptions which are not clear to the users and, if stated explicitly, might not be considered desirable. Our framework leads to a classification approach with demonstrable improvements using standard feature sets on proof-of-concept simulated and field-collected data.
Peripheral auditory processing and speech reception in impaired hearing

DEFF Research Database (Denmark)

Strelcyk, Olaf

One of the most common complaints of people with impaired hearing concerns their difficulty with understanding speech. Particularly in the presence of background noise, hearing-impaired people often encounter great difficulties with speech communication. In most cases, the problem persists even...... if reduced audibility has been compensated for by hearing aids. It has been hypothesized that part of the difficulty arises from changes in the perception of sounds that are well above hearing threshold, such as reduced frequency selectivity and deficits in the processing of temporal fine structure (TFS......) at the output of the inner-ear (cochlear) filters. The purpose of this work was to investigate these aspects in detail. One chapter studies relations between frequency selectivity, TFS processing, and speech reception in listeners with normal and impaired hearing, using behavioral listening experiments. While...
Research and development of a versatile portable speech prosthesis

Science.gov (United States)

1981-01-01

The Versatile Portable Speech Prosthesis (VPSP), a synthetic speech output communication aid for non-speaking people is described. It was intended initially for severely physically limited people with cerebral palsy who are in electric wheelchairs. Hence, it was designed to be placed on a wheelchair and powered from a wheelchair battery. It can easily be separated from the wheelchair. The VPSP is versatile because it is designed to accept any means of single switch, multiple switch, or keyboard control which physically limited people have the ability to use. It is portable because it is mounted on and can go with the electric wheelchair. It is a speech prosthesis, obviously, because it speaks with a synthetic voice for people unable to speak with their own voices. Both hardware and software are described.
Job Performance Aid Methods (for Job Guide Manuals and Other Formats).

Science.gov (United States)

James, Frank W.

The report provides simplified instructions for writing and illustrating Job Performance Aids (JPAs). JPAs are step-by-step work instructions geared to the intellectual level of the performer and background training aids for psychological task preparedness. The first two sections of the report discuss the origin of JPAs and the principles of task…
An analysis of machine translation and speech synthesis in speech-to-speech translation system

OpenAIRE

Hashimoto, K.; Yamagishi, J.; Byrne, W.; King, S.; Tokuda, K.

2011-01-01

This paper provides an analysis of the impacts of machine translation and speech synthesis on speech-to-speech translation systems. The speech-to-speech translation system consists of three components: speech recognition, machine translation and speech synthesis. Many techniques for integration of speech recognition and machine translation have been proposed. However, speech synthesis has not yet been considered. Therefore, in this paper, we focus on machine translation and speech synthesis, ...
Song and speech: examining the link between singing talent and speech imitation ability

Science.gov (United States)

Christiner, Markus; Reiterer, Susanne M.

2013-01-01

In previous research on speech imitation, musicality, and an ability to sing were isolated as the strongest indicators of good pronunciation skills in foreign languages. We, therefore, wanted to take a closer look at the nature of the ability to sing, which shares a common ground with the ability to imitate speech. This study focuses on whether good singing performance predicts good speech imitation. Forty-one singers of different levels of proficiency were selected for the study and their ability to sing, to imitate speech, their musical talent and working memory were tested. Results indicated that singing performance is a better indicator of the ability to imitate speech than the playing of a musical instrument. A multiple regression revealed that 64% of the speech imitation score variance could be explained by working memory together with educational background and singing performance. A second multiple regression showed that 66% of the speech imitation variance of completely unintelligible and unfamiliar language stimuli (Hindi) could be explained by working memory together with a singer's sense of rhythm and quality of voice. This supports the idea that both vocal behaviors have a common grounding in terms of vocal and motor flexibility, ontogenetic and phylogenetic development, neural orchestration and auditory memory with singing fitting better into the category of “speech” on the productive level and “music” on the acoustic level. As a result, good singers benefit from vocal and motor flexibility, productively and cognitively, in three ways. (1) Motor flexibility and the ability to sing improve language and musical function. (2) Good singers retain a certain plasticity and are open to new and unusual sound combinations during adulthood both perceptually and productively. (3) The ability to sing improves the memory span of the auditory working memory. PMID:24319438
Effects of irrelevant speech and traffic noise on speech perception and cognitive performance in elementary school children.

Science.gov (United States)

Klatte, Maria; Meis, Markus; Sukowski, Helga; Schick, August

2007-01-01

The effects of background noise of moderate intensity on short-term storage and processing of verbal information were analyzed in 6 to 8 year old children. In line with adult studies on "irrelevant sound effect" (ISE), serial recall of visually presented digits was severely disrupted by background speech that the children did not understand. Train noises of equal Intensity however, had no effect. Similar results were demonstrated with tasks requiring storage and processing of heard information. Memory for nonwords, execution of oral instructions and categorizing speech sounds were significantly disrupted by irrelevant speech. The affected functions play a fundamental role in the acquisition of spoken and written language. Implications concerning current models of the ISE and the acoustic conditions in schools and kindergardens are discussed.
Segmental intelligibility of synthetic speech produced by rule.

Science.gov (United States)

Logan, J S; Greene, B G; Pisoni, D B

1989-08-01

This paper reports the results of an investigation that employed the modified rhyme test (MRT) to measure the segmental intelligibility of synthetic speech generated automatically by rule. Synthetic speech produced by ten text-to-speech systems was studied and compared to natural speech. A variation of the standard MRT was also used to study the effects of response set size on perceptual confusions. Results indicated that the segmental intelligibility scores formed a continuum. Several systems displayed very high levels of performance that were close to or equal to scores obtained with natural speech; other systems displayed substantially worse performance compared to natural speech. The overall performance of the best system, DECtalk--Paul, was equivalent to the data obtained with natural speech for consonants in syllable-initial position. The findings from this study are discussed in terms of the use of a set of standardized procedures for measuring intelligibility of synthetic speech under controlled laboratory conditions. Recent work investigating the perception of synthetic speech under more severe conditions in which greater demands are made on the listener's processing resources is also considered. The wide range of intelligibility scores obtained in the present study demonstrates important differences in perception and suggests that not all synthetic speech is perceptually equivalent to the listener.
Segmental intelligibility of synthetic speech produced by rule

Science.gov (United States)

Logan, John S.; Greene, Beth G.; Pisoni, David B.

2012-01-01

This paper reports the results of an investigation that employed the modified rhyme test (MRT) to measure the segmental intelligibility of synthetic speech generated automatically by rule. Synthetic speech produced by ten text-to-speech systems was studied and compared to natural speech. A variation of the standard MRT was also used to study the effects of response set size on perceptual confusions. Results indicated that the segmental intelligibility scores formed a continuum. Several systems displayed very high levels of performance that were close to or equal to scores obtained with natural speech; other systems displayed substantially worse performance compared to natural speech. The overall performance of the best system, DECtalk—Paul, was equivalent to the data obtained with natural speech for consonants in syllable-initial position. The findings from this study are discussed in terms of the use of a set of standardized procedures for measuring intelligibility of synthetic speech under controlled laboratory conditions. Recent work investigating the perception of synthetic speech under more severe conditions in which greater demands are made on the listener’s processing resources is also considered. The wide range of intelligibility scores obtained in the present study demonstrates important differences in perception and suggests that not all synthetic speech is perceptually equivalent to the listener. PMID:2527884
Noise and pitch interact during the cortical segregation of concurrent speech.

Science.gov (United States)

Bidelman, Gavin M; Yellamsetty, Anusha

2017-08-01

Behavioral studies reveal listeners exploit intrinsic differences in voice fundamental frequency (F0) to segregate concurrent speech sounds-the so-called "F0-benefit." More favorable signal-to-noise ratio (SNR) in the environment, an extrinsic acoustic factor, similarly benefits the parsing of simultaneous speech. Here, we examined the neurobiological substrates of these two cues in the perceptual segregation of concurrent speech mixtures. We recorded event-related brain potentials (ERPs) while listeners performed a speeded double-vowel identification task. Listeners heard two concurrent vowels whose F0 differed by zero or four semitones presented in either clean (no noise) or noise-degraded (+5 dB SNR) conditions. Behaviorally, listeners were more accurate in correctly identifying both vowels for larger F0 separations but F0-benefit was more pronounced at more favorable SNRs (i.e., pitch × SNR interaction). Analysis of the ERPs revealed that only the P2 wave (∼200 ms) showed a similar F0 x SNR interaction as behavior and was correlated with listeners' perceptual F0-benefit. Neural classifiers applied to the ERPs further suggested that speech sounds are segregated neurally within 200 ms based on SNR whereas segregation based on pitch occurs later in time (400-700 ms). The earlier timing of extrinsic SNR compared to intrinsic F0-based segregation implies that the cortical extraction of speech from noise is more efficient than differentiating speech based on pitch cues alone, which may recruit additional cortical processes. Findings indicate that noise and pitch differences interact relatively early in cerebral cortex and that the brain arrives at the identities of concurrent speech mixtures as early as ∼200 ms. Copyright © 2017 Elsevier B.V. All rights reserved.
Performance Study of Objective Speech Quality Measurement for Modern Wireless-VoIP Communications

Directory of Open Access Journals (Sweden)

Chan Wai-Yip

2009-01-01

Full Text Available Wireless-VoIP communications introduce perceptual degradations that are not present with traditional VoIP communications. This paper investigates the effects of such degradations on the performance of three state-of-the-art standard objective quality measurement algorithms—PESQ, P.563, and an "extended" E-model. The comparative study suggests that measurement performance is significantly affected by acoustic background noise type and level as well as speech codec and packet loss concealment strategy. On our data, PESQ attains superior overall performance and P.563 and E-model attain comparable performance figures.
Phonological processes in the speech of school-age children with hearing loss: Comparisons with children with normal hearing.

Science.gov (United States)

Asad, Areej Nimer; Purdy, Suzanne C; Ballard, Elaine; Fairgray, Liz; Bowen, Caroline

2018-04-27

In this descriptive study, phonological processes were examined in the speech of children aged 5;0-7;6 (years; months) with mild to profound hearing loss using hearing aids (HAs) and cochlear implants (CIs), in comparison to their peers. A second aim was to compare phonological processes of HA and CI users. Children with hearing loss (CWHL, N = 25) were compared to children with normal hearing (CWNH, N = 30) with similar age, gender, linguistic, and socioeconomic backgrounds. Speech samples obtained from a list of 88 words, derived from three standardized speech tests, were analyzed using the CASALA (Computer Aided Speech and Language Analysis) program to evaluate participants' phonological systems, based on lax (a process appeared at least twice in the speech of at least two children) and strict (a process appeared at least five times in the speech of at least two children) counting criteria. Developmental phonological processes were eliminated in the speech of younger and older CWNH while eleven developmental phonological processes persisted in the speech of both age groups of CWHL. CWHL showed a similar trend of age of elimination to CWNH, but at a slower rate. Children with HAs and CIs produced similar phonological processes. Final consonant deletion, weak syllable deletion, backing, and glottal replacement were present in the speech of HA users, affecting their overall speech intelligibility. Developmental and non-developmental phonological processes persist in the speech of children with mild to profound hearing loss compared to their peers with typical hearing. The findings indicate that it is important for clinicians to consider phonological assessment in pre-school CWHL and the use of evidence-based speech therapy in order to reduce non-developmental and non-age-appropriate developmental processes, thereby enhancing their speech intelligibility. Copyright © 2018 Elsevier Inc. All rights reserved.
Reducing Channel Interaction Through Cochlear Implant Programming May Improve Speech Perception

Directory of Open Access Journals (Sweden)

Julie A. Bierer

2016-06-01

Full Text Available Speech perception among cochlear implant (CI listeners is highly variable. High degrees of channel interaction are associated with poorer speech understanding. Two methods for reducing channel interaction, focusing electrical fields, and deactivating subsets of channels were assessed by the change in vowel and consonant identification scores with different program settings. The main hypotheses were that (a focused stimulation will improve phoneme recognition and (b speech perception will improve when channels with high thresholds are deactivated. To select high-threshold channels for deactivation, subjects’ threshold profiles were processed to enhance the peaks and troughs, and then an exclusion or inclusion criterion based on the mean and standard deviation was used. Low-threshold channels were selected manually and matched in number and apex-to-base distribution. Nine ears in eight adult CI listeners with Advanced Bionics HiRes90k devices were tested with six experimental programs. Two, all-channel programs, (a 14-channel partial tripolar (pTP and (b 14-channel monopolar (MP, and four variable-channel programs, derived from these two base programs, (c pTP with high- and (d low-threshold channels deactivated, and (e MP with high- and (f low-threshold channels deactivated, were created. Across subjects, performance was similar with pTP and MP programs. However, poorer performing subjects (scoring 2. These same subjects showed slightly more benefit with the reduced channel MP programs (5 and 6. Subjective ratings were consistent with performance. These finding suggest that reducing channel interaction may benefit poorer performing CI listeners.
Development of Bone-Conducted Ultrasonic Hearing Aid for the Profoundly Deaf: Assessments of the Modulation Type with Regard to Intelligibility and Sound Quality

Science.gov (United States)

Nakagawa, Seiji; Fujiyuki, Chika; Kagomiya, Takayuki

2012-07-01

Bone-conducted ultrasound (BCU) is perceived even by the profoundly sensorineural deaf. A novel hearing aid using the perception of amplitude-modulated BCU (BCU hearing aid: BCUHA) has been developed; however, further improvements are needed, especially in terms of articulation and sound quality. In this study, the intelligibility and sound quality of BCU speech with several types of amplitude modulation [double-sideband with transmitted carrier (DSB-TC), double-sideband with suppressed carrier (DSB-SC), and transposed modulation] were evaluated. The results showed that DSB-TC and transposed speech were more intelligible than DSB-SC speech, and transposed speech was closer than the other types of BCU speech to air-conducted speech in terms of sound quality. These results provide useful information for further development of the BCUHA.
The Effect of English Verbal Songs on Connected Speech Aspects of Adult English Learners’ Speech Production

Directory of Open Access Journals (Sweden)

Farshid Tayari Ashtiani

2015-02-01

Full Text Available The present study was an attempt to investigate the impact of English verbal songs on connected speech aspects of adult English learners’ speech production. 40 participants were selected based on the results of their performance in a piloted and validated version of NELSON test given to 60 intermediate English learners in a language institute in Tehran. Then they were equally distributed in two control and experimental groups and received a validated pretest of reading aloud and speaking in English. Afterward, the treatment was performed in 18 sessions by singing preselected songs culled based on some criteria such as popularity, familiarity, amount, and speed of speech delivery, etc. In the end, the posttests of reading aloud and speaking in English were administered. The results revealed that the treatment had statistically positive effects on the connected speech aspects of English learners’ speech production at statistical .05 level of significance. Meanwhile, the results represented that there was not any significant difference between the experimental group’s mean scores on the posttests of reading aloud and speaking. It was thus concluded that providing the EFL learners with English verbal songs could positively affect connected speech aspects of both modes of speech production, reading aloud and speaking. The Findings of this study have pedagogical implications for language teachers to be more aware and knowledgeable of the benefits of verbal songs to promote speech production of language learners in terms of naturalness and fluency. Keywords: English Verbal Songs, Connected Speech, Speech Production, Reading Aloud, Speaking
Speech acts and performances of scientific citizenship: Examining how scientists talk about therapeutic cloning.

Science.gov (United States)

Marks, Nicola J

2014-07-01

Scientists play an important role in framing public engagement with science. Their language can facilitate or impede particular interactions taking place with particular citizens: scientists' "speech acts" can "perform" different types of "scientific citizenship". This paper examines how scientists in Australia talked about therapeutic cloning during interviews and during the 2006 parliamentary debates on stem cell research. Some avoided complex labels, thereby facilitating public examination of this field. Others drew on language that only opens a space for publics to become educated, not to participate in a more meaningful way. Importantly, public utterances made by scientists here contrast with common international utterances: they did not focus on the therapeutic but the research promises of therapeutic cloning. Social scientists need to pay attention to the performative aspects of language in order to promote genuine citizen involvement in techno-science. Speech Act Theory is a useful analytical tool for this.

Teaching communication aid use in everyday conversation

DEFF Research Database (Denmark)

Pilesjö, Maja Sigurd; Norén, Niklas

2017-01-01

This Conversation Analysis study investigated how a speech and language therapist (SLT) created opportunities for communication aid use in multiparty conversation. An SLT interacted with a child with multiple disabilities and her grandparents in a home setting, using a bliss board. The analyses...
Speech Perception Benefits of Internet Versus Conventional Telephony for Hearing-Impaired Individuals

Science.gov (United States)

Dubach, Patrick; Pfiffner, Flurin; Kompis, Martin; Caversaccio, Marco

2012-01-01

Background Telephone communication is a challenge for many hearing-impaired individuals. One important technical reason for this difficulty is the restricted frequency range (0.3–3.4 kHz) of conventional landline telephones. Internet telephony (voice over Internet protocol [VoIP]) is transmitted with a larger frequency range (0.1–8 kHz) and therefore includes more frequencies relevant to speech perception. According to a recently published, laboratory-based study, the theoretical advantage of ideal VoIP conditions over conventional telephone quality has translated into improved speech perception by hearing-impaired individuals. However, the speech perception benefits of nonideal VoIP network conditions, which may occur in daily life, have not been explored. VoIP use cannot be recommended to hearing-impaired individuals before its potential under more realistic conditions has been examined. Objective To compare realistic VoIP network conditions, under which digital data packets may be lost, with ideal conventional telephone quality with respect to their impact on speech perception by hearing-impaired individuals. Methods We assessed speech perception using standardized test material presented under simulated VoIP conditions with increasing digital data packet loss (from 0% to 20%) and compared with simulated ideal conventional telephone quality. We monaurally tested 10 adult users of cochlear implants, 10 adult users of hearing aids, and 10 normal-hearing adults in the free sound field, both in quiet and with background noise. Results Across all participant groups, mean speech perception scores using VoIP with 0%, 5%, and 10% packet loss were 15.2% (range 0%–53%), 10.6% (4%–46%), and 8.8% (7%–33%) higher, respectively, than with ideal conventional telephone quality. Speech perception did not differ between VoIP with 20% packet loss and conventional telephone quality. The maximum benefits were observed under ideal VoIP conditions without packet loss and
Speech to Text Software Evaluation Report

CERN Document Server

Martins Santo, Ana Luisa

2017-01-01

This document compares out-of-box performance of three commercially available speech recognition software: Vocapia VoxSigma TM , Google Cloud Speech, and Lime- craft Transcriber. It is defined a set of evaluation criteria and test methods for speech recognition softwares. The evaluation of these softwares in noisy environments are also included for the testing purposes. Recognition accuracy was compared using noisy environments and languages. Testing in ”ideal” non-noisy environment of a quiet room has been also performed for comparison.
Salvation and Speech Act. Reading Luther with the Aid of Searle’s Analysis of Declarations

Directory of Open Access Journals (Sweden)

Randolph Jacob R.

2017-05-01

Full Text Available Many Luther scholars have made passing reference to Martin Luther’s theology of the Word as a ‘speech-act’ theology. This essay aims to probe points of continuity and discontinuity between Luther’s understanding of the Word, as exemplified in the promise of God, and a particular speech-act philosophy as posited by John Searle. The analysis of Searle in the area of declarations, as well as a survey of Lutheran conceptions of the Word of promise in both sacrament and Scripture, will evidence specific moments of clarity in Luther’s so-called ‘speech-act’ theology and provide a helpful paradigm for viewing the creative impact of the Word as conceived by Luther.
The Diagnostic Conference Planning Questionnaire for Speech-Language Pathology.

Science.gov (United States)

Houle, Gail Ruppert

1990-01-01

The article describes a tool to increase professional effectiveness in supervisory conferencing in speech-language pathology based on the dual areas of role expectations for clinicians and personal needs as derived from Maslow's hierarchy of needs. The conferencing questionnaire aids in recognizing the needs of the supervisee, stating problems,…
Refining Stimulus Parameters in Assessing Infant Speech Perception Using Visual Reinforcement Infant Speech Discrimination: Sensation Level.

Science.gov (United States)

Uhler, Kristin M; Baca, Rosalinda; Dudas, Emily; Fredrickson, Tammy

2015-01-01

Speech perception measures have long been considered an integral piece of the audiological assessment battery. Currently, a prelinguistic, standardized measure of speech perception is missing in the clinical assessment battery for infants and young toddlers. Such a measure would allow systematic assessment of speech perception abilities of infants as well as the potential to investigate the impact early identification of hearing loss and early fitting of amplification have on the auditory pathways. To investigate the impact of sensation level (SL) on the ability of infants with normal hearing (NH) to discriminate /a-i/ and /ba-da/ and to determine if performance on the two contrasts are significantly different in predicting the discrimination criterion. The design was based on a survival analysis model for event occurrence and a repeated measures logistic model for binary outcomes. The outcome for survival analysis was the minimum SL for criterion and the outcome for the logistic regression model was the presence/absence of achieving the criterion. Criterion achievement was designated when an infant's proportion correct score was >0.75 on the discrimination performance task. Twenty-two infants with NH sensitivity participated in this study. There were 9 males and 13 females, aged 6-14 mo. Testing took place over two to three sessions. The first session consisted of a hearing test, threshold assessment of the two speech sounds (/a/ and /i/), and if time and attention allowed, visual reinforcement infant speech discrimination (VRISD). The second session consisted of VRISD assessment for the two test contrasts (/a-i/ and /ba-da/). The presentation level started at 50 dBA. If the infant was unable to successfully achieve criterion (>0.75) at 50 dBA, the presentation level was increased to 70 dBA followed by 60 dBA. Data examination included an event analysis, which provided the probability of criterion distribution across SL. The second stage of the analysis was a
Effects of human fatigue on speech signals

Science.gov (United States)

Stamoulis, Catherine

2004-05-01

Cognitive performance may be significantly affected by fatigue. In the case of critical personnel, such as pilots, monitoring human fatigue is essential to ensure safety and success of a given operation. One of the modalities that may be used for this purpose is speech, which is sensitive to respiratory changes and increased muscle tension of vocal cords, induced by fatigue. Age, gender, vocal tract length, physical and emotional state may significantly alter speech intensity, duration, rhythm, and spectral characteristics. In addition to changes in speech rhythm, fatigue may also affect the quality of speech, such as articulation. In a noisy environment, detecting fatigue-related changes in speech signals, particularly subtle changes at the onset of fatigue, may be difficult. Therefore, in a performance-monitoring system, speech parameters which are significantly affected by fatigue need to be identified and extracted from input signals. For this purpose, a series of experiments was performed under slowly varying cognitive load conditions and at different times of the day. The results of the data analysis are presented here.
Application of the wavelet transform for speech processing

Science.gov (United States)

Maes, Stephane

1994-01-01

Speaker identification and word spotting will shortly play a key role in space applications. An approach based on the wavelet transform is presented that, in the context of the 'modulation model,' enables extraction of speech features which are used as input for the classification process.
Speech Synthesis Applied to Language Teaching.

Science.gov (United States)

Sherwood, Bruce

1981-01-01

The experimental addition of speech output to computer-based Esperanto lessons using speech synthesized from text is described. Because of Esperanto's phonetic spelling and simple rhythm, it is particularly easy to describe the mechanisms of Esperanto synthesis. Attention is directed to how the text-to-speech conversion is performed and the ways…
Issues in developing valid assessments of speech pathology students' performance in the workplace.

Science.gov (United States)

McAllister, Sue; Lincoln, Michelle; Ferguson, Alison; McAllister, Lindy

2010-01-01

Workplace-based learning is a critical component of professional preparation in speech pathology. A validated assessment of this learning is seen to be 'the gold standard', but it is difficult to develop because of design and validation issues. These issues include the role and nature of judgement in assessment, challenges in measuring quality, and the relationship between assessment and learning. Valid assessment of workplace-based performance needs to capture the development of competence over time and account for both occupation specific and generic competencies. This paper reviews important conceptual issues in the design of valid and reliable workplace-based assessments of competence including assessment content, process, impact on learning, measurement issues, and validation strategies. It then goes on to share what has been learned about quality assessment and validation of a workplace-based performance assessment using competency-based ratings. The outcomes of a four-year national development and validation of an assessment tool are described. A literature review of issues in conceptualizing, designing, and validating workplace-based assessments was conducted. Key factors to consider in the design of a new tool were identified and built into the cycle of design, trialling, and data analysis in the validation stages of the development process. This paper provides an accessible overview of factors to consider in the design and validation of workplace-based assessment tools. It presents strategies used in the development and national validation of a tool COMPASS, used in an every speech pathology programme in Australia, New Zealand, and Singapore. The paper also describes Rasch analysis, a model-based statistical approach which is useful for establishing validity and reliability of assessment tools. Through careful attention to conceptual and design issues in the development and trialling of workplace-based assessments, it has been possible to develop the
Preschool Children's Performance on Profiling Elements of Prosody in Speech-Communication (PEPS-C)

Science.gov (United States)

Gibbon, Fiona E.; Smyth, Heather

2013-01-01

Profiling Elements of Prosody in Speech-Communication (PEPS-C) has not been used widely to assess prosodic abilities of preschool children. This study was therefore aimed at investigating typically developing 4-year-olds' performance on PEPS-C. PEPS-C was presented to 30 typically developing 4-year-olds recruited in southern Ireland. Children were…
Performance Evaluation of Speech Recognition Systems as a Next-Generation Pilot-Vehicle Interface Technology

Science.gov (United States)

Arthur, Jarvis J., III; Shelton, Kevin J.; Prinzel, Lawrence J., III; Bailey, Randall E.

2016-01-01

During the flight trials known as Gulfstream-V Synthetic Vision Systems Integrated Technology Evaluation (GV-SITE), a Speech Recognition System (SRS) was used by the evaluation pilots. The SRS system was intended to be an intuitive interface for display control (rather than knobs, buttons, etc.). This paper describes the performance of the current "state of the art" Speech Recognition System (SRS). The commercially available technology was evaluated as an application for possible inclusion in commercial aircraft flight decks as a crew-to-vehicle interface. Specifically, the technology is to be used as an interface from aircrew to the onboard displays, controls, and flight management tasks. A flight test of a SRS as well as a laboratory test was conducted.
Diminutives facilitate word segmentation in natural speech: cross-linguistic evidence.

Science.gov (United States)

Kempe, Vera; Brooks, Patricia J; Gillis, Steven; Samson, Graham

2007-06-01

Final-syllable invariance is characteristic of diminutives (e.g., doggie), which are a pervasive feature of the child-directed speech registers of many languages. Invariance in word endings has been shown to facilitate word segmentation (Kempe, Brooks, & Gillis, 2005) in an incidental-learning paradigm in which synthesized Dutch pseudonouns were used. To broaden the cross-linguistic evidence for this invariance effect and to increase its ecological validity, adult English speakers (n=276) were exposed to naturally spoken Dutch or Russian pseudonouns presented in sentence contexts. A forced choice test was given to assess target recognition, with foils comprising unfamiliar syllable combinations in Experiments 1 and 2 and syllable combinations straddling word boundaries in Experiment 3. A control group (n=210) received the recognition test with no prior exposure to targets. Recognition performance improved with increasing final-syllable rhyme invariance, with larger increases for the experimental group. This confirms that word ending invariance is a valid segmentation cue in artificial, as well as naturalistic, speech and that diminutives may aid segmentation in a number of languages.
Neural Entrainment to Speech Modulates Speech Intelligibility

NARCIS (Netherlands)

Riecke, Lars; Formisano, Elia; Sorger, Bettina; Baskent, Deniz; Gaudrain, Etienne

2018-01-01

Speech is crucial for communication in everyday life. Speech-brain entrainment, the alignment of neural activity to the slow temporal fluctuations (envelope) of acoustic speech input, is a ubiquitous element of current theories of speech processing. Associations between speech-brain entrainment and
The influence of environmental sound training on the perception of spectrally degraded speech and environmental sounds.

Science.gov (United States)

Shafiro, Valeriy; Sheft, Stanley; Gygi, Brian; Ho, Kim Thien N

2012-06-01

Perceptual training with spectrally degraded environmental sounds results in improved environmental sound identification, with benefits shown to extend to untrained speech perception as well. The present study extended those findings to examine longer-term training effects as well as effects of mere repeated exposure to sounds over time. Participants received two pretests (1 week apart) prior to a week-long environmental sound training regimen, which was followed by two posttest sessions, separated by another week without training. Spectrally degraded stimuli, processed with a four-channel vocoder, consisted of a 160-item environmental sound test, word and sentence tests, and a battery of basic auditory abilities and cognitive tests. Results indicated significant improvements in all speech and environmental sound scores between the initial pretest and the last posttest with performance increments following both exposure and training. For environmental sounds (the stimulus class that was trained), the magnitude of positive change that accompanied training was much greater than that due to exposure alone, with improvement for untrained sounds roughly comparable to the speech benefit from exposure. Additional tests of auditory and cognitive abilities showed that speech and environmental sound performance were differentially correlated with tests of spectral and temporal-fine-structure processing, whereas working memory and executive function were correlated with speech, but not environmental sound perception. These findings indicate generalizability of environmental sound training and provide a basis for implementing environmental sound training programs for cochlear implant (CI) patients.
Speech perception performance of subjects with type I diabetes mellitus in noise

Directory of Open Access Journals (Sweden)

Bárbara Cristiane Sordi Silva

Full Text Available Abstract Introduction: Diabetes mellitus (DM is a chronic metabolic disorder of various origins that occurs when the pancreas fails to produce insulin in sufficient quantities or when the organism fails to respond to this hormone in an efficient manner. Objective: To evaluate the speech recognition in subjects with type I diabetes mellitus (DMI in quiet and in competitive noise. Methods: It was a descriptive, observational and cross-section study. We included 40 participants of both genders aged 18-30 years, divided into a control group (CG of 20 healthy subjects with no complaints or auditory changes, paired for age and gender with the study group, consisting of 20 subjects with a diagnosis of DMI. First, we applied basic audiological evaluations (pure tone audiometry, speech audiometry and immittance audiometry for all subjects; after these evaluations, we applied Sentence Recognition Threshold in Quiet (SRTQ and Sentence Recognition Threshold in Noise (SRTN in free field, using the List of Sentences in Portuguese test. Results: All subjects showed normal bilateral pure tone threshold, compatible speech audiometry and "A" tympanometry curve. Group comparison revealed a statistically significant difference for SRTQ (p = 0.0001, SRTN (p < 0.0001 and the signal-to-noise ratio (p < 0.0001. Conclusion: The performance of DMI subjects in SRTQ and SRTN was worse compared to the subjects without diabetes.
Music and hearing aids.

Science.gov (United States)

Madsen, Sara M K; Moore, Brian C J

2014-10-31

The signal processing and fitting methods used for hearing aids have mainly been designed to optimize the intelligibility of speech. Little attention has been paid to the effectiveness of hearing aids for listening to music. Perhaps as a consequence, many hearing-aid users complain that they are not satisfied with their hearing aids when listening to music. This issue inspired the Internet-based survey presented here. The survey was designed to identify the nature and prevalence of problems associated with listening to live and reproduced music with hearing aids. Responses from 523 hearing-aid users to 21 multiple-choice questions are presented and analyzed, and the relationships between responses to questions regarding music and questions concerned with information about the respondents, their hearing aids, and their hearing loss are described. Large proportions of the respondents reported that they found their hearing aids to be helpful for listening to both live and reproduced music, although less so for the former. The survey also identified problems such as distortion, acoustic feedback, insufficient or excessive gain, unbalanced frequency response, and reduced tone quality. The results indicate that the enjoyment of listening to music with hearing aids could be improved by an increase of the input and output dynamic range, extension of the low-frequency response, and improvement of feedback cancellation and automatic gain control systems. © The Author(s) 2014.
Music and Hearing Aids

Directory of Open Access Journals (Sweden)

Sara M. K. Madsen

2014-10-01

Full Text Available The signal processing and fitting methods used for hearing aids have mainly been designed to optimize the intelligibility of speech. Little attention has been paid to the effectiveness of hearing aids for listening to music. Perhaps as a consequence, many hearing-aid users complain that they are not satisfied with their hearing aids when listening to music. This issue inspired the Internet-based survey presented here. The survey was designed to identify the nature and prevalence of problems associated with listening to live and reproduced music with hearing aids. Responses from 523 hearing-aid users to 21 multiple-choice questions are presented and analyzed, and the relationships between responses to questions regarding music and questions concerned with information about the respondents, their hearing aids, and their hearing loss are described. Large proportions of the respondents reported that they found their hearing aids to be helpful for listening to both live and reproduced music, although less so for the former. The survey also identified problems such as distortion, acoustic feedback, insufficient or excessive gain, unbalanced frequency response, and reduced tone quality. The results indicate that the enjoyment of listening to music with hearing aids could be improved by an increase of the input and output dynamic range, extension of the low-frequency response, and improvement of feedback cancellation and automatic gain control systems.
TongueToSpeech (TTS): Wearable wireless assistive device for augmented speech.

Science.gov (United States)

Marjanovic, Nicholas; Piccinini, Giacomo; Kerr, Kevin; Esmailbeigi, Hananeh

2017-07-01

Speech is an important aspect of human communication; individuals with speech impairment are unable to communicate vocally in real time. Our team has developed the TongueToSpeech (TTS) device with the goal of augmenting speech communication for the vocally impaired. The proposed device is a wearable wireless assistive device that incorporates a capacitive touch keyboard interface embedded inside a discrete retainer. This device connects to a computer, tablet or a smartphone via Bluetooth connection. The developed TTS application converts text typed by the tongue into audible speech. Our studies have concluded that an 8-contact point configuration between the tongue and the TTS device would yield the best user precision and speed performance. On average using the TTS device inside the oral cavity takes 2.5 times longer than the pointer finger using a T9 (Text on 9 keys) keyboard configuration to type the same phrase. In conclusion, we have developed a discrete noninvasive wearable device that allows the vocally impaired individuals to communicate in real time.
A novel GLM-based method for the Automatic IDentification of functional Events (AIDE) in fNIRS data recorded in naturalistic environments.

Science.gov (United States)

Pinti, Paola; Merla, Arcangelo; Aichelburg, Clarisse; Lind, Frida; Power, Sarah; Swingler, Elizabeth; Hamilton, Antonia; Gilbert, Sam; Burgess, Paul W; Tachtsidis, Ilias

2017-07-15

Recent technological advances have allowed the development of portable functional Near-Infrared Spectroscopy (fNIRS) devices that can be used to perform neuroimaging in the real-world. However, as real-world experiments are designed to mimic everyday life situations, the identification of event onsets can be extremely challenging and time-consuming. Here, we present a novel analysis method based on the general linear model (GLM) least square fit analysis for the Automatic IDentification of functional Events (or AIDE) directly from real-world fNIRS neuroimaging data. In order to investigate the accuracy and feasibility of this method, as a proof-of-principle we applied the algorithm to (i) synthetic fNIRS data simulating both block-, event-related and mixed-design experiments and (ii) experimental fNIRS data recorded during a conventional lab-based task (involving maths). AIDE was able to recover functional events from simulated fNIRS data with an accuracy of 89%, 97% and 91% for the simulated block-, event-related and mixed-design experiments respectively. For the lab-based experiment, AIDE recovered more than the 66.7% of the functional events from the fNIRS experimental measured data. To illustrate the strength of this method, we then applied AIDE to fNIRS data recorded by a wearable system on one participant during a complex real-world prospective memory experiment conducted outside the lab. As part of the experiment, there were four and six events (actions where participants had to interact with a target) for the two different conditions respectively (condition 1: social-interact with a person; condition 2: non-social-interact with an object). AIDE managed to recover 3/4 events and 3/6 events for conditions 1 and 2 respectively. The identified functional events were then corresponded to behavioural data from the video recordings of the movements and actions of the participant. Our results suggest that "brain-first" rather than "behaviour-first" analysis is

Effect of gap detection threshold on consistency of speech in children with speech sound disorder.

Science.gov (United States)

Sayyahi, Fateme; Soleymani, Zahra; Akbari, Mohammad; Bijankhan, Mahmood; Dolatshahi, Behrooz

2017-02-01

The present study examined the relationship between gap detection threshold and speech error consistency in children with speech sound disorder. The participants were children five to six years of age who were categorized into three groups of typical speech, consistent speech disorder (CSD) and inconsistent speech disorder (ISD).The phonetic gap detection threshold test was used for this study, which is a valid test comprised six syllables with inter-stimulus intervals between 20-300ms. The participants were asked to listen to the recorded stimuli three times and indicate whether they heard one or two sounds. There was no significant difference between the typical and CSD groups (p=0.55), but there were significant differences in performance between the ISD and CSD groups and the ISD and typical groups (p=0.00). The ISD group discriminated between speech sounds at a higher threshold. Children with inconsistent speech errors could not distinguish speech sounds during time-limited phonetic discrimination. It is suggested that inconsistency in speech is a representation of inconsistency in auditory perception, which causes by high gap detection threshold. Copyright © 2016 Elsevier Ltd. All rights reserved.
APPLICATION OF INFORMATION AND COMMUNICATION TECHNOLOGIES IN COMPUTER AIDED LANGUAGE LEARNING

Directory of Open Access Journals (Sweden)

I. B. Tampel

2013-01-01

Full Text Available The article deals with the various ways of application for automatic speech recognition, Text-to-Speech technology, pronunciation and communication skills training, vocabulary check of the taught person, audition skills training in computer aided language learning (CALL-system. In spite of some constraints such technologies application is effective both for education problems simplification and for comfort growth of the system application.
Childhood apraxia of speech and multiple phonological disorders in Cairo-Egyptian Arabic speaking children: language, speech, and oro-motor differences.

Science.gov (United States)

Aziz, Azza Adel; Shohdi, Sahar; Osman, Dalia Mostafa; Habib, Emad Iskander

2010-06-01

Childhood apraxia of speech is a neurological childhood speech-sound disorder in which the precision and consistency of movements underlying speech are impaired in the absence of neuromuscular deficits. Children with childhood apraxia of speech and those with multiple phonological disorder share some common phonological errors that can be misleading in diagnosis. This study posed a question about a possible significant difference in language, speech and non-speech oral performances between children with childhood apraxia of speech, multiple phonological disorder and normal children that can be used for a differential diagnostic purpose. 30 pre-school children between the ages of 4 and 6 years served as participants. Each of these children represented one of 3 possible subject-groups: Group 1: multiple phonological disorder; Group 2: suspected cases of childhood apraxia of speech; Group 3: control group with no communication disorder. Assessment procedures included: parent interviews; testing of non-speech oral motor skills and testing of speech skills. Data showed that children with suspected childhood apraxia of speech showed significantly lower language score only in their expressive abilities. Non-speech tasks did not identify significant differences between childhood apraxia of speech and multiple phonological disorder groups except for those which required two sequential motor performances. In speech tasks, both consonant and vowel accuracy were significantly lower and inconsistent in childhood apraxia of speech group than in the multiple phonological disorder group. Syllable number, shape and sequence accuracy differed significantly in the childhood apraxia of speech group than the other two groups. In addition, children with childhood apraxia of speech showed greater difficulty in processing prosodic features indicating a clear need to address these variables for differential diagnosis and treatment of children with childhood apraxia of speech. Copyright (c
Evaluation of a clinical auditory profile in hearing-aid candidates

DEFF Research Database (Denmark)

Thorup, Nicoline; Santurette, Sébastien; Jørgensen, Søren

2015-01-01

by default. However, this does not necessary lead to the same HA benefit. This study aimed at identifying clinically relevant tests that may be informative in addition to the audiogram and relate more directly to HA benefit. Twenty-nine HI listeners performed fast tests of loudness perception, spectral...... and temporal resolution, binaural hearing, speech intelligibility in stationary and fluctuating noise, and a working-memory test. Six weeks after HA fitting they answered the International Outcome Inventory – Hearing Aid evaluation. The HI group was homogeneous based on the audiogram, but only one test...
Word-length algorithm for language identification of under-resourced languages

Directory of Open Access Journals (Sweden)

Ali Selamat

2016-10-01

Full Text Available Language identification is widely used in machine learning, text mining, information retrieval, and speech processing. Available techniques for solving the problem of language identification do require large amount of training text that are not available for under-resourced languages which form the bulk of the World’s languages. The primary objective of this study is to propose a lexicon based algorithm which is able to perform language identification using minimal training data. Because language identification is often the first step in many natural language processing tasks, it is necessary to explore techniques that will perform language identification in the shortest possible time. Hence, the second objective of this research is to study the effect of the proposed algorithm on the run-time performance of language identification. Precision, recall, and F1 measures were used to determine the effectiveness of the proposed word length algorithm using datasets drawn from the Universal Declaration of Human Rights Act in 15 languages. The experimental results show good accuracy on language identification at the document level and at the sentence level based on the available dataset. The improved algorithm also showed significant improvement in run time performance compared with the spelling checker approach.
Vowel production of Mandarin-speaking hearing aid users with different types of hearing loss.

Directory of Open Access Journals (Sweden)

Yu-Chen Hung

Full Text Available In contrast with previous research focusing on cochlear implants, this study examined the speech performance of hearing aid users with conductive (n = 11, mixed (n = 10, and sensorineural hearing loss (n = 7 and compared it with the speech of hearing control. Speech intelligibility was evaluated by computing the vowel space area defined by the Mandarin Chinese corner vowels /a, u, i/. The acoustic differences between the vowels were assessed using the Euclidean distance. The results revealed that both the conductive and mixed hearing loss groups exhibited a reduced vowel working space, but no significant difference was found between the sensorineural hearing loss and normal hearing groups. An analysis using the Euclidean distance further showed that the compression of vowel space area in conductive hearing loss can be attributed to the substantial lowering of the second formant of /i/. The differences in vowel production between groups are discussed in terms of the occlusion effect and the signal transmission media of various hearing devices.
Characteristics of Fluency and Speech in Two Families with High Incidences of Stuttering

Science.gov (United States)

Stager, Sheila V.; Freeman, Frances J.; Braun, Allen

2015-01-01

Purpose: This study presents data from 2 families with high incidence of stuttering, comparing methods of phenotype assignment and exploring the presence of other fluency disorders and corresponding speech characteristics. Method: Three methods for assigning phenotype of stuttering were used: self-identification, family identification, and expert…
Data-driven analysis of functional brain interactions during free listening to music and speech.

Science.gov (United States)

Fang, Jun; Hu, Xintao; Han, Junwei; Jiang, Xi; Zhu, Dajiang; Guo, Lei; Liu, Tianming

2015-06-01

Natural stimulus functional magnetic resonance imaging (N-fMRI) such as fMRI acquired when participants were watching video streams or listening to audio streams has been increasingly used to investigate functional mechanisms of the human brain in recent years. One of the fundamental challenges in functional brain mapping based on N-fMRI is to model the brain's functional responses to continuous, naturalistic and dynamic natural stimuli. To address this challenge, in this paper we present a data-driven approach to exploring functional interactions in the human brain during free listening to music and speech streams. Specifically, we model the brain responses using N-fMRI by measuring the functional interactions on large-scale brain networks with intrinsically established structural correspondence, and perform music and speech classification tasks to guide the systematic identification of consistent and discriminative functional interactions when multiple subjects were listening music and speech in multiple categories. The underlying premise is that the functional interactions derived from N-fMRI data of multiple subjects should exhibit both consistency and discriminability. Our experimental results show that a variety of brain systems including attention, memory, auditory/language, emotion, and action networks are among the most relevant brain systems involved in classic music, pop music and speech differentiation. Our study provides an alternative approach to investigating the human brain's mechanism in comprehension of complex natural music and speech.
Discrimination and identification of long vowels in children with typical language development and specific language impairment

Science.gov (United States)

Datta, Hia; Shafer, Valerie; Kurtzberg, Diane

2004-05-01

Researchers have claimed that children with specific language impairment (SLI) have particular difficulties in discriminating and identifying phonetically similar and brief speech sounds (Stark and Heinz, 1966; Studdert-Kennedy and Bradley, 1997; Sussman, 1993). In a recent study (Shafer et al., 2004), children with SLI were reported to have difficulty in processing brief (50 ms), phonetically similar vowels (/I-E/). The current study investigated perception of long (250 ms), phonetically similar vowels (/I-E/) in 8- to 10-year-old children with SLI and typical language development (TLD). The purpose was to examine whether phonetic similarity in vowels leads to poorer speech-perception in the SLI group. Behavioral and electrophysiological methods were employed to examine discrimination and identification of a nine-step vowel continuum from /I/ to /E/. Similar performances in discrimination were found for both groups, indicating that lengthening vowel duration indeed improves discrimination of phonetically similar vowels. However, these children with SLI showed poor behavioral identification, demonstrating that phonetic similarity of speech sounds, irrespective of their duration, contribute to the speech perception difficulty observed in SLI population. These findings suggest that the deficit in these children with SLI is at the level of working memory or long term memory representation of speech.
Language-specific strategy for programming hearing aids - A double-blind randomized controlled crossover study.

Science.gov (United States)

Matsumoto, Nozomu; Suzuki, Nobuyoshi; Iwasaki, Satoshi; Ishikawa, Kazuha; Tsukiji, Hiroki; Higashino, Yoshie; Tabuki, Tomoko; Nakagawa, Takashi

2018-08-01

Voice-aligned compression (VAC) is a method used in Oticon's hearing aids to provide more comfortable hearing without sacrificing speech discrimination. The complex, non-linear compression curve for the VAC strategy is designed based on the frequency profile of certain spoken Western languages. We hypothesized that hearing aids could be further customized for Japanese-speaking users by modifying the compression curve using the frequency profile of spoken Japanese. A double-blind randomized controlled crossover study was performed to determine whether or not Oticon's modified amplification strategy (VAC-J) provides subjectively preferable hearing aids for Japanese-speaking hearing aid users compared to the same company's original amplification strategy (VAC). The participants were randomized to two groups. The VAC-first group received a pair of hearing aids programmed using the VAC strategy and wore them for three weeks, and then received a pair of hearing aids programmed using VAC-J strategy and wore them for three weeks. The VAC-J-first group underwent the same study, but they received hearing aids in the reverse sequence. A Speech, Spatial and Qualities (SSQ) questionnaire was administered before beginning to use the hearing aids, at the end of using the first pair of hearing aids, and at the end of using the second pair of hearing aids. Twenty-five participants that met the inclusion/exclusion criteria from January 1 to October 31, 2016, were randomized to two groups. Twenty-two participants completed the study. There were no statistically significant differences in the increment of SSQ scores between the participants when using the VAC- or the VAC-J-programmed hearing aids. However, participants preferred the VAC-J strategy to the VAC strategy at the end of the study, and this difference was statistically significant. Japanese-speaking hearing aid users preferred using hearing aids that were fitted with the VAC-J strategy. Our results show that the VAC strategy
[On the use of the spectral speech characteristics for the determination of biometric parameters of the vocal tract in forensic medical identification of the speaker's personality].

Science.gov (United States)

Kaganov, A Sh

2014-01-01

The objective of the present study was to elucidate the relationship between the spectral speech characteristics and the biometric parameters of the speaker's vocal tract. The secondary objective was to consider the theoretical basis behind the medico-criminalistic personality identification from the biometric parameters of the speaker's vocal tract. The article is based on the results of real forensic medical investigations and the literature data.
Roots of Performance - Aided Design in Utzon´s design principles

DEFF Research Database (Denmark)

Parigi, Dario

2014-01-01

on paper, to an evolving paradigm where the increasing integration of parametric tools, performative analysis and computational methods is changing the way we learn and design. Its constitutive factors are: 1) embedded tectonics, 2) performance simulation 3) computational methods.......This paper discuss an emerging paradigm here identified as PAD, acronym of Performance-Aided Design, that aims at embracing complexity in the design process, and tackling it with digital tools. Computer Aided Design tools are gradually shifting from the mere translation of the work once carried...
The effect of multimicrophone noise reduction systems on sound source localization by users of binaural hearing aids.

Science.gov (United States)

Van den Bogaert, Tim; Doclo, Simon; Wouters, Jan; Moonen, Marc

2008-07-01

This paper evaluates the influence of three multimicrophone noise reduction algorithms on the ability to localize sound sources. Two recently developed noise reduction techniques for binaural hearing aids were evaluated, namely, the binaural multichannel Wiener filter (MWF) and the binaural multichannel Wiener filter with partial noise estimate (MWF-N), together with a dual-monaural adaptive directional microphone (ADM), which is a widely used noise reduction approach in commercial hearing aids. The influence of the different algorithms on perceived sound source localization and their noise reduction performance was evaluated. It is shown that noise reduction algorithms can have a large influence on localization and that (a) the ADM only preserves localization in the forward direction over azimuths where limited or no noise reduction is obtained; (b) the MWF preserves localization of the target speech component but may distort localization of the noise component. The latter is dependent on signal-to-noise ratio and masking effects; (c) the MWF-N enables correct localization of both the speech and the noise components; (d) the statistical Wiener filter approach introduces a better combination of sound source localization and noise reduction performance than the ADM approach.
MutAid: Sanger and NGS Based Integrated Pipeline for Mutation Identification, Validation and Annotation in Human Molecular Genetics.

Directory of Open Access Journals (Sweden)

Ram Vinay Pandey

Full Text Available Traditional Sanger sequencing as well as Next-Generation Sequencing have been used for the identification of disease causing mutations in human molecular research. The majority of currently available tools are developed for research and explorative purposes and often do not provide a complete, efficient, one-stop solution. As the focus of currently developed tools is mainly on NGS data analysis, no integrative solution for the analysis of Sanger data is provided and consequently a one-stop solution to analyze reads from both sequencing platforms is not available. We have therefore developed a new pipeline called MutAid to analyze and interpret raw sequencing data produced by Sanger or several NGS sequencing platforms. It performs format conversion, base calling, quality trimming, filtering, read mapping, variant calling, variant annotation and analysis of Sanger and NGS data under a single platform. It is capable of analyzing reads from multiple patients in a single run to create a list of potential disease causing base substitutions as well as insertions and deletions. MutAid has been developed for expert and non-expert users and supports four sequencing platforms including Sanger, Illumina, 454 and Ion Torrent. Furthermore, for NGS data analysis, five read mappers including BWA, TMAP, Bowtie, Bowtie2 and GSNAP and four variant callers including GATK-HaplotypeCaller, SAMTOOLS, Freebayes and VarScan2 pipelines are supported. MutAid is freely available at https://sourceforge.net/projects/mutaid.
MutAid: Sanger and NGS Based Integrated Pipeline for Mutation Identification, Validation and Annotation in Human Molecular Genetics.

Science.gov (United States)

Pandey, Ram Vinay; Pabinger, Stephan; Kriegner, Albert; Weinhäusel, Andreas

2016-01-01

Traditional Sanger sequencing as well as Next-Generation Sequencing have been used for the identification of disease causing mutations in human molecular research. The majority of currently available tools are developed for research and explorative purposes and often do not provide a complete, efficient, one-stop solution. As the focus of currently developed tools is mainly on NGS data analysis, no integrative solution for the analysis of Sanger data is provided and consequently a one-stop solution to analyze reads from both sequencing platforms is not available. We have therefore developed a new pipeline called MutAid to analyze and interpret raw sequencing data produced by Sanger or several NGS sequencing platforms. It performs format conversion, base calling, quality trimming, filtering, read mapping, variant calling, variant annotation and analysis of Sanger and NGS data under a single platform. It is capable of analyzing reads from multiple patients in a single run to create a list of potential disease causing base substitutions as well as insertions and deletions. MutAid has been developed for expert and non-expert users and supports four sequencing platforms including Sanger, Illumina, 454 and Ion Torrent. Furthermore, for NGS data analysis, five read mappers including BWA, TMAP, Bowtie, Bowtie2 and GSNAP and four variant callers including GATK-HaplotypeCaller, SAMTOOLS, Freebayes and VarScan2 pipelines are supported. MutAid is freely available at https://sourceforge.net/projects/mutaid.
Integrating speech technology to meet crew station design requirements

Science.gov (United States)

Simpson, Carol A.; Ruth, John C.; Moore, Carolyn A.

The last two years have seen improvements in speech generation and speech recognition technology that make speech I/O for crew station controls and displays viable for operational systems. These improvements include increased robustness of algorithm performance in high levels of background noise, increased vocabulary size, improved performance in the connected speech mode, and less speaker dependence. This improved capability makes possible far more sophisticated user interface design than was possible with earlier technology. Engineering, linguistic, and human factors design issues are discussed in the context of current voice I/O technology performance.
Speech recognition using articulatory and excitation source features

CERN Document Server

Rao, K Sreenivasa

2017-01-01

This book discusses the contribution of articulatory and excitation source information in discriminating sound units. The authors focus on excitation source component of speech -- and the dynamics of various articulators during speech production -- for enhancement of speech recognition (SR) performance. Speech recognition is analyzed for read, extempore, and conversation modes of speech. Five groups of articulatory features (AFs) are explored for speech recognition, in addition to conventional spectral features. Each chapter provides the motivation for exploring the specific feature for SR task, discusses the methods to extract those features, and finally suggests appropriate models to capture the sound unit specific knowledge from the proposed features. The authors close by discussing various combinations of spectral, articulatory and source features, and the desired models to enhance the performance of SR systems.
Effects of directional microphone and adaptive multichannel noise reduction algorithm on cochlear implant performance.

Science.gov (United States)

Chung, King; Zeng, Fan-Gang; Acker, Kyle N

2006-10-01

Although cochlear implant (CI) users have enjoyed good speech recognition in quiet, they still have difficulties understanding speech in noise. We conducted three experiments to determine whether a directional microphone and an adaptive multichannel noise reduction algorithm could enhance CI performance in noise and whether Speech Transmission Index (STI) can be used to predict CI performance in various acoustic and signal processing conditions. In Experiment I, CI users listened to speech in noise processed by 4 hearing aid settings: omni-directional microphone, omni-directional microphone plus noise reduction, directional microphone, and directional microphone plus noise reduction. The directional microphone significantly improved speech recognition in noise. Both directional microphone and noise reduction algorithm improved overall preference. In Experiment II, normal hearing individuals listened to the recorded speech produced by 4- or 8-channel CI simulations. The 8-channel simulation yielded similar speech recognition results as in Experiment I, whereas the 4-channel simulation produced no significant difference among the 4 settings. In Experiment III, we examined the relationship between STIs and speech recognition. The results suggested that STI could predict actual and simulated CI speech intelligibility with acoustic degradation and the directional microphone, but not the noise reduction algorithm. Implications for intelligibility enhancement are discussed.
Spoken Word Recognition Errors in Speech Audiometry: A Measure of Hearing Performance?

Directory of Open Access Journals (Sweden)

Martine Coene

2015-01-01

Full Text Available This report provides a detailed analysis of incorrect responses from an open-set spoken word-repetition task which is part of a Dutch speech audiometric test battery. Single-consonant confusions were analyzed from 230 normal hearing participants in terms of the probability of choice of a particular response on the basis of acoustic-phonetic, lexical, and frequency variables. The results indicate that consonant confusions are better predicted by lexical knowledge than by acoustic properties of the stimulus word. A detailed analysis of the transmission of phonetic features indicates that “voicing” is best preserved whereas “manner of articulation” yields most perception errors. As consonant confusion matrices are often used to determine the degree and type of a patient’s hearing impairment, to predict a patient’s gain in hearing performance with hearing devices and to optimize the device settings in view of maximum output, the observed findings are highly relevant for the audiological practice. Based on our findings, speech audiometric outcomes provide a combined auditory-linguistic profile of the patient. The use of confusion matrices might therefore not be the method best suited to measure hearing performance. Ideally, they should be complemented by other listening task types that are known to have less linguistic bias, such as phonemic discrimination.
[Evaluation of the Freiburg monosyllabic speech test in background noise].

Science.gov (United States)

Löhler, J; Akcicek, B; Pilnik, M; Saager-Post, K; Dazert, S; Biedron, S; Oeken, J; Mürbe, D; Löbert, J; Laszig, R; Wesarg, T; Langer, C; Plontke, S; Rahne, T; Machate, U; Noppeney, R; Schultz, K; Plinkert, P; Hoth, S; Praetorius, M; Schlattmann, P; Meister, E F; Pau, H W; Ehrt, K; Hagen, R; Shehata-Dieler, W; Cebulla, M; Walther, L E; Ernst, A

2013-07-01

The Freiburg speech test has been the gold standard in speech audiometry in Germany for many years. Previously, however, this test had not been evaluated in assessing the effectiveness of a hearing aid in background noise. Furthermore, the validity of particular word lists used in the test has been questioned repeatedly in the past, due to a suspected higher variation within these lists as compared to the other word list used. In this prospective study, two groups of subjects [normal hearing control subjects and patients with SNHL (sensorineural hearing loss) that had been fitted with hearing aid] were examined. In a first group, 113 control subjects with normal age- and gender-related pure tone thresholds were assessed by means of the Freiburg monosyllabic test under free-field conditions at 65 dB. The second group comprised 104 patients that had been fitted with hearing aids at least 3 months previously to treat their SNHL. Members of the SNHL group were assessed by means of the Freiburg monosyllabic test both with and without hearing aids, and in the presence or absence of background noise (CCITT-noise; 65/60 dB signal-noise ratio, in accordance with the Comité Consultatif International Téléphonique et Télégraphique), under free-field conditions at 65 dB. The first (control) group exhibited no gender-related differences in the Freiburg test results. In a few instances, inter-individual variability of responses was observed, although the reasons for this remain to be clarified. Within the second (patient) group, the Freiburg test results under the four different measurement conditions differed significantly from each other (p>0.05). This group exhibited a high degree of inter-individual variability between responses. In light of this, no significant differences in outcome could be assigned to the different word lists employed in the Freiburg speech test. The Freiburg monosyllabic test is able to assess the extent of hearing loss, as well as the effectiveness

Rehabilitation of an edentulous cleft lip and palate patient with a soft palate defect using a bar-retained, implant-supported speech-aid prosthesis: a clinical report.

Science.gov (United States)

Hakan Tuna, S; Pekkan, Gurel; Buyukgural, Bulent

2009-01-01

Prosthetic rehabilitation of an edentulous cleft lip and palate patient with a combined hard and soft palate defect is a great challenge, due to the lack of retention of the obturator prosthesis as a result of its weight and the inability to obtain a border seal. Dental implants improve the retention, stability, and occlusal function of prostheses when used in carefully selected cleft lip and palate cases. This clinical report presents an edentulous unilateral cleft lip and palate patient who has hard and soft palate defects and an atrophied maxilla, treated with an implant-supported speech-aid prosthesis.
Sensorimotor influences on speech perception in infancy.

Science.gov (United States)

Bruderer, Alison G; Danielson, D Kyle; Kandhadai, Padmapriya; Werker, Janet F

2015-11-03

The influence of speech production on speech perception is well established in adults. However, because adults have a long history of both perceiving and producing speech, the extent to which the perception-production linkage is due to experience is unknown. We addressed this issue by asking whether articulatory configurations can influence infants' speech perception performance. To eliminate influences from specific linguistic experience, we studied preverbal, 6-mo-old infants and tested the discrimination of a nonnative, and hence never-before-experienced, speech sound distinction. In three experimental studies, we used teething toys to control the position and movement of the tongue tip while the infants listened to the speech sounds. Using ultrasound imaging technology, we verified that the teething toys consistently and effectively constrained the movement and positioning of infants' tongues. With a looking-time procedure, we found that temporarily restraining infants' articulators impeded their discrimination of a nonnative consonant contrast but only when the relevant articulator was selectively restrained to prevent the movements associated with producing those sounds. Our results provide striking evidence that even before infants speak their first words and without specific listening experience, sensorimotor information from the articulators influences speech perception. These results transform theories of speech perception by suggesting that even at the initial stages of development, oral-motor movements influence speech sound discrimination. Moreover, an experimentally induced "impairment" in articulator movement can compromise speech perception performance, raising the question of whether long-term oral-motor impairments may impact perceptual development.
Dialogue enabling speech-to-text user assistive agent system for hearing-impaired person.

Science.gov (United States)

Lee, Seongjae; Kang, Sunmee; Han, David K; Ko, Hanseok

2016-06-01

A novel approach for assisting bidirectional communication between people of normal hearing and hearing-impaired is presented. While the existing hearing-impaired assistive devices such as hearing aids and cochlear implants are vulnerable in extreme noise conditions or post-surgery side effects, the proposed concept is an alternative approach wherein spoken dialogue is achieved by means of employing a robust speech recognition technique which takes into consideration of noisy environmental factors without any attachment into human body. The proposed system is a portable device with an acoustic beamformer for directional noise reduction and capable of performing speech-to-text transcription function, which adopts a keyword spotting method. It is also equipped with an optimized user interface for hearing-impaired people, rendering intuitive and natural device usage with diverse domain contexts. The relevant experimental results confirm that the proposed interface design is feasible for realizing an effective and efficient intelligent agent for hearing-impaired.
performance evaluation of a pilot paraplegic centricity mobility aid

African Journals Online (AJOL)

eobe

PERFORMANCE EVALUATION OF A PILOT PARAPLEGIC CENTRICITY. MOBILITY AID. MOBILITY ... The result of the test showed a remarkable improvement in. Wilcoxin's signed rank test. .... RESEARCH METHOD. RESEARCH METHOD.
Performance of optimized McRAPD in identification of 9 yeast species frequently isolated from patient samples: potential for automation.

Science.gov (United States)

Trtkova, Jitka; Pavlicek, Petr; Ruskova, Lenka; Hamal, Petr; Koukalova, Dagmar; Raclavsky, Vladislav

2009-11-10

Rapid, easy, economical and accurate species identification of yeasts isolated from clinical samples remains an important challenge for routine microbiological laboratories, because susceptibility to antifungal agents, probability to develop resistance and ability to cause disease vary in different species. To overcome the drawbacks of the currently available techniques we have recently proposed an innovative approach to yeast species identification based on RAPD genotyping and termed McRAPD (Melting curve of RAPD). Here we have evaluated its performance on a broader spectrum of clinically relevant yeast species and also examined the potential of automated and semi-automated interpretation of McRAPD data for yeast species identification. A simple fully automated algorithm based on normalized melting data identified 80% of the isolates correctly. When this algorithm was supplemented by semi-automated matching of decisive peaks in first derivative plots, 87% of the isolates were identified correctly. However, a computer-aided visual matching of derivative plots showed the best performance with average 98.3% of the accurately identified isolates, almost matching the 99.4% performance of traditional RAPD fingerprinting. Since McRAPD technique omits gel electrophoresis and can be performed in a rapid, economical and convenient way, we believe that it can find its place in routine identification of medically important yeasts in advanced diagnostic laboratories that are able to adopt this technique. It can also serve as a broad-range high-throughput technique for epidemiological surveillance.
The relationship between dorsolateral prefrontal activation and speech performance-based social anxiety using functional near infrared spectroscopy.

Science.gov (United States)

Glassman, Lisa H; Kuster, Anootnara T; Shaw, Jena A; Forman, Evan M; Izzetoglu, Meltem; Matteucci, Alyssa; Herbert, James D

2017-06-01

Functional near-infrared (fNIR) spectroscopy is a promising new technology that has demonstrated utility in the study of normal human cognition. We utilized fNIR spectroscopy to examine the effect of social anxiety and performance on hemodynamic activity in the dorsolateral prefrontal cortex (DLPFC). Socially phobic participants and non-clinical participants with varying levels of social anxiety completed a public speaking task in front of a small virtual audience while the DLPFC was being monitored by the fNIR device. The relationship between anxiety and both blood volume (BV) and deoxygenated hemoglobin (Hb) varied significantly as a function of speech performance, such that individuals with low social anxiety who performed well showed an increase in DLPFC activation relative to those who did not perform well. This result suggests that effortful thinking and/or efficient top-down inhibitory control may have been required to complete an impromptu speech task with good performance. In contrast, good performers who were highly socially anxious showed lower DLPFC activation relative to good performers who were low in social anxiety, suggesting autopilot thinking or less-effortful thinking. In poor performers, slight increases in DLPFC activation were observed from low to highly anxious individuals, which may reflect a shift from effortless thinking to heightened self-focused attention. Heightened self-focused attention, poor inhibitory control resulting in excessive fear or anxiety, or low motivation may lower performance. These results suggest that there can be different underlying mechanisms in the brain that affect the level of speech performance in individuals with varying degrees of social anxiety. This study highlights the utility of the fNIR device in the assessment of changes in DLPFC in response to exposure to realistic phobic stimuli, and further supports the potential utility of this technology in the study of the neurophysiology of anxiety disorders.
Dissociated Crossed Speech Areas in a Tumour Patient

Directory of Open Access Journals (Sweden)

Jörg Mauler

2017-05-01

Full Text Available In the past, the eloquent areas could be deliberately localised by the invasive Wada test. The very rare cases of dissociated crossed speech areas were accidentally found based on the clinical symptomatology. Today functional magnetic resonance imaging (fMRI-based imaging can be employed to non-invasively localise the eloquent areas in brain tumour patients for therapy planning. A 41-year-old, left-handed man with a low-grade glioma in the left frontal operculum extending to the insular cortex, tension headaches, and anomic aphasia over 5 months underwent a pre-operative speech area localisation fMRI measurement, which revealed the evidence of the transhemispheric disposition, where the dominant Wernicke speech area is located on the left and the Broca’s area is strongly lateralised to the right hemisphere. The outcome of the Wada test and the intraoperative cortico-subcortical stimulation mapping were congruent with this finding. After tumour removal, language area function was fully preserved. Upon the occurrence of brain tumours with a risk of impaired speech function, the rare dissociate crossed speech areas disposition may gain a clinically relevant meaning by allowing for more extended tumour removal. Hence, for its identification, diagnostics which take into account both brain hemispheres, such as fMRI, are recommended.
Inner Speech's Relationship With Overt Speech in Poststroke Aphasia.

Science.gov (United States)

Stark, Brielle C; Geva, Sharon; Warburton, Elizabeth A

2017-09-18

Relatively preserved inner speech alongside poor overt speech has been documented in some persons with aphasia (PWA), but the relationship of overt speech with inner speech is still largely unclear, as few studies have directly investigated these factors. The present study investigates the relationship of relatively preserved inner speech in aphasia with selected measures of language and cognition. Thirty-eight persons with chronic aphasia (27 men, 11 women; average age 64.53 ± 13.29 years, time since stroke 8-111 months) were classified as having relatively preserved inner and overt speech (n = 21), relatively preserved inner speech with poor overt speech (n = 8), or not classified due to insufficient measurements of inner and/or overt speech (n = 9). Inner speech scores (by group) were correlated with selected measures of language and cognition from the Comprehensive Aphasia Test (Swinburn, Porter, & Al, 2004). The group with poor overt speech showed a significant relationship of inner speech with overt naming (r = .95, p speech and language and cognition factors were not significant for the group with relatively good overt speech. As in previous research, we show that relatively preserved inner speech is found alongside otherwise severe production deficits in PWA. PWA with poor overt speech may rely more on preserved inner speech for overt picture naming (perhaps due to shared resources with verbal working memory) and for written picture description (perhaps due to reliance on inner speech due to perceived task difficulty). Assessments of inner speech may be useful as a standard component of aphasia screening, and therapy focused on improving and using inner speech may prove clinically worthwhile. https://doi.org/10.23641/asha.5303542.
Auditory Brainstem Response to Complex Sounds Predicts Self-Reported Speech-in-Noise Performance

Science.gov (United States)

Anderson, Samira; Parbery-Clark, Alexandra; White-Schwoch, Travis; Kraus, Nina

2013-01-01

Purpose: To compare the ability of the auditory brainstem response to complex sounds (cABR) to predict subjective ratings of speech understanding in noise on the Speech, Spatial, and Qualities of Hearing Scale (SSQ; Gatehouse & Noble, 2004) relative to the predictive ability of the Quick Speech-in-Noise test (QuickSIN; Killion, Niquette,…
LinguaTag: an Emotional Speech Analysis Application

OpenAIRE

Cullen, Charlie; Vaughan, Brian; Kousidis, Spyros

2008-01-01

The analysis of speech, particularly for emotional content, is an open area of current research. Ongoing work has developed an emotional speech corpus for analysis, and defined a vowel stress method by which this analysis may be performed. This paper documents the development of LinguaTag, an open source speech analysis software application which implements this vowel stress emotional speech analysis method developed as part of research into the acoustic and linguistic correlates of emotional...
Distributed neural signatures of natural audiovisual speech and music in the human auditory cortex.

Science.gov (United States)

Salmi, Juha; Koistinen, Olli-Pekka; Glerean, Enrico; Jylänki, Pasi; Vehtari, Aki; Jääskeläinen, Iiro P; Mäkelä, Sasu; Nummenmaa, Lauri; Nummi-Kuisma, Katarina; Nummi, Ilari; Sams, Mikko

2017-08-15

During a conversation or when listening to music, auditory and visual information are combined automatically into audiovisual objects. However, it is still poorly understood how specific type of visual information shapes neural processing of sounds in lifelike stimulus environments. Here we applied multi-voxel pattern analysis to investigate how naturally matching visual input modulates supratemporal cortex activity during processing of naturalistic acoustic speech, singing and instrumental music. Bayesian logistic regression classifiers with sparsity-promoting priors were trained to predict whether the stimulus was audiovisual or auditory, and whether it contained piano playing, speech, or singing. The predictive performances of the classifiers were tested by leaving one participant at a time for testing and training the model using the remaining 15 participants. The signature patterns associated with unimodal auditory stimuli encompassed distributed locations mostly in the middle and superior temporal gyrus (STG/MTG). A pattern regression analysis, based on a continuous acoustic model, revealed that activity in some of these MTG and STG areas were associated with acoustic features present in speech and music stimuli. Concurrent visual stimulus modulated activity in bilateral MTG (speech), lateral aspect of right anterior STG (singing), and bilateral parietal opercular cortex (piano). Our results suggest that specific supratemporal brain areas are involved in processing complex natural speech, singing, and piano playing, and other brain areas located in anterior (facial speech) and posterior (music-related hand actions) supratemporal cortex are influenced by related visual information. Those anterior and posterior supratemporal areas have been linked to stimulus identification and sensory-motor integration, respectively. Copyright © 2017 Elsevier Inc. All rights reserved.
Applications of Hilbert Spectral Analysis for Speech and Sound Signals

Science.gov (United States)

Huang, Norden E.

2003-01-01

A new method for analyzing nonlinear and nonstationary data has been developed, and the natural applications are to speech and sound signals. The key part of the method is the Empirical Mode Decomposition method with which any complicated data set can be decomposed into a finite and often small number of Intrinsic Mode Functions (IMF). An IMF is defined as any function having the same numbers of zero-crossing and extrema, and also having symmetric envelopes defined by the local maxima and minima respectively. The IMF also admits well-behaved Hilbert transform. This decomposition method is adaptive, and, therefore, highly efficient. Since the decomposition is based on the local characteristic time scale of the data, it is applicable to nonlinear and nonstationary processes. With the Hilbert transform, the Intrinsic Mode Functions yield instantaneous frequencies as functions of time, which give sharp identifications of imbedded structures. This method invention can be used to process all acoustic signals. Specifically, it can process the speech signals for Speech synthesis, Speaker identification and verification, Speech recognition, and Sound signal enhancement and filtering. Additionally, as the acoustical signals from machinery are essentially the way the machines are talking to us. Therefore, the acoustical signals, from the machines, either from sound through air or vibration on the machines, can tell us the operating conditions of the machines. Thus, we can use the acoustic signal to diagnosis the problems of machines.
Aurally Aided Visual Search Performance Comparing Virtual Audio Systems

DEFF Research Database (Denmark)

Larsen, Camilla Horne; Lauritsen, David Skødt; Larsen, Jacob Junker

2014-01-01

Due to increased computational power, reproducing binaural hearing in real-time applications, through usage of head-related transfer functions (HRTFs), is now possible. This paper addresses the differences in aurally-aided visual search performance between a HRTF enhanced audio system (3D) and an...... with white dots. The results indicate that 3D audio yields faster search latencies than panning audio, especially with larger amounts of distractors. The applications of this research could fit virtual environments such as video games or virtual simulations.......Due to increased computational power, reproducing binaural hearing in real-time applications, through usage of head-related transfer functions (HRTFs), is now possible. This paper addresses the differences in aurally-aided visual search performance between a HRTF enhanced audio system (3D...
Aurally Aided Visual Search Performance Comparing Virtual Audio Systems

DEFF Research Database (Denmark)

Larsen, Camilla Horne; Lauritsen, David Skødt; Larsen, Jacob Junker

2014-01-01

Due to increased computational power reproducing binaural hearing in real-time applications, through usage of head-related transfer functions (HRTFs), is now possible. This paper addresses the differences in aurally-aided visual search performance between an HRTF enhanced audio system (3D) and an...... with white dots. The results indicate that 3D audio yields faster search latencies than panning audio, especially with larger amounts of distractors. The applications of this research could fit virtual environments such as video games or virtual simulations.......Due to increased computational power reproducing binaural hearing in real-time applications, through usage of head-related transfer functions (HRTFs), is now possible. This paper addresses the differences in aurally-aided visual search performance between an HRTF enhanced audio system (3D...
Implementation of integrated circuit and design of SAR ADC for fully implantable hearing aids.

Science.gov (United States)

Kim, Jong Hoon; Lee, Jyung Hyun; Cho, Jin-Ho

2017-07-20

The hearing impaired population has been increasing; many people suffer from hearing problems. To deal with this difficulty, various types of hearing aids are being rapidly developed. In particular, fully implantable hearing aids are being actively studied to improve the performance of existing hearing aids and to reduce the stigma of hearing loss patients. It has to be of small size and low-power consumption for easy implantation and long-term use. The objective of the study was to implement a small size and low-power consumption successive approximation register analog-to-digital converter (SAR ADC) for fully implantable hearing aids. The ADC was selected as the SAR ADC because its analog circuit components are less required by the feedback circuit of the SAR ADC than the sigma-delta ADC which is conventionally used in hearing aids, and it has advantages in the area and power consumption. So, the circuit of SAR ADC is designed considering the speech region of humans because the objective is to deliver the speech signals of humans to hearing loss patients. If the switch of sample and hold works in the on/off positions, the charge injection and clock feedthrough are produced by a parasitic capacitor. These problems affect the linearity of the hold voltage, and as a result, an error of the bit conversion is generated. In order to solve the problem, a CMOS switch that consists of NMOS and PMOS was used, and it reduces the charge injection because the charge carriers in the NMOS and PMOS have inversed polarity. So, 16 bit conversion is performed before the occurrence of the Least Significant Bit (LSB) error. In order to minimize the offset voltage and power consumption of the designed comparator, we designed a preamplifier with current mirror. Therefore, the power consumption was reduced by the power control switch used in the comparator. The layout of the designed SAR ADC was performed by Virtuoso Layout Editor (Cadence, USA). In the layout result, the size of the
Phonological and Executive Working Memory in L2 Task-Based Speech Planning and Performance

Science.gov (United States)

Wen, Zhisheng

2016-01-01

The present study sets out to explore the distinctive roles played by two working memory (WM) components in various aspects of L2 task-based speech planning and performance. A group of 40 post-intermediate proficiency level Chinese EFL learners took part in the empirical study. Following the tenets and basic principles of the…
Let's all speak together! Exploring the masking effects of various languages on spoken word identification in multi-linguistic babble.

Science.gov (United States)

Gautreau, Aurore; Hoen, Michel; Meunier, Fanny

2013-01-01

This study aimed to characterize the linguistic interference that occurs during speech-in-speech comprehension by combining offline and online measures, which included an intelligibility task (at a -5 dB Signal-to-Noise Ratio) and 2 lexical decision tasks (at a -5 dB and 0 dB SNR) that were performed with French spoken target words. In these 3 experiments we always compared the masking effects of speech backgrounds (i.e., 4-talker babble) that were produced in the same language as the target language (i.e., French) or in unknown foreign languages (i.e., Irish and Italian) to the masking effects of corresponding non-speech backgrounds (i.e., speech-derived fluctuating noise). The fluctuating noise contained similar spectro-temporal information as babble but lacked linguistic information. At -5 dB SNR, both tasks revealed significantly divergent results between the unknown languages (i.e., Irish and Italian) with Italian and French hindering French target word identification to a similar extent, whereas Irish led to significantly better performances on these tasks. By comparing the performances obtained with speech and fluctuating noise backgrounds, we were able to evaluate the effect of each language. The intelligibility task showed a significant difference between babble and fluctuating noise for French, Irish and Italian, suggesting acoustic and linguistic effects for each language. However, the lexical decision task, which reduces the effect of post-lexical interference, appeared to be more accurate, as it only revealed a linguistic effect for French. Thus, although French and Italian had equivalent masking effects on French word identification, the nature of their interference was different. This finding suggests that the differences observed between the masking effects of Italian and Irish can be explained at an acoustic level but not at a linguistic level.
Speech Acquisition and Automatic Speech Recognition for Integrated Spacesuit Audio Systems

Science.gov (United States)

Huang, Yiteng; Chen, Jingdong; Chen, Shaoyan

2010-01-01

A voice-command human-machine interface system has been developed for spacesuit extravehicular activity (EVA) missions. A multichannel acoustic signal processing method has been created for distant speech acquisition in noisy and reverberant environments. This technology reduces noise by exploiting differences in the statistical nature of signal (i.e., speech) and noise that exists in the spatial and temporal domains. As a result, the automatic speech recognition (ASR) accuracy can be improved to the level at which crewmembers would find the speech interface useful. The developed speech human/machine interface will enable both crewmember usability and operational efficiency. It can enjoy a fast rate of data/text entry, small overall size, and can be lightweight. In addition, this design will free the hands and eyes of a suited crewmember. The system components and steps include beam forming/multi-channel noise reduction, single-channel noise reduction, speech feature extraction, feature transformation and normalization, feature compression, model adaption, ASR HMM (Hidden Markov Model) training, and ASR decoding. A state-of-the-art phoneme recognizer can obtain an accuracy rate of 65 percent when the training and testing data are free of noise. When it is used in spacesuits, the rate drops to about 33 percent. With the developed microphone array speech-processing technologies, the performance is improved and the phoneme recognition accuracy rate rises to 44 percent. The recognizer can be further improved by combining the microphone array and HMM model adaptation techniques and using speech samples collected from inside spacesuits. In addition, arithmetic complexity models for the major HMMbased ASR components were developed. They can help real-time ASR system designers select proper tasks when in the face of constraints in computational resources.
Musician advantage for speech-on-speech perception

NARCIS (Netherlands)

Başkent, Deniz; Gaudrain, Etienne

Evidence for transfer of musical training to better perception of speech in noise has been mixed. Unlike speech-in-noise, speech-on-speech perception utilizes many of the skills that musical training improves, such as better pitch perception and stream segregation, as well as use of higher-level
Cochlear Implantation in Patients With Usher Syndrome Type IIa Increases Performance and Quality of Life.

Science.gov (United States)

Hartel, Bas P; van Nierop, Josephine W I; Huinck, Wendy J; Rotteveel, Liselotte J C; Mylanus, Emmanuel A M; Snik, Ad F; Kunst, Henricus P M; Pennings, Ronald J E

2017-07-01

Usher syndrome type IIa (USH2a) is characterized by congenital moderate to severe hearing impairment and retinitis pigmentosa. Hearing rehabilitation starts in early childhood with the application of hearing aids. In some patients with USH2a, severe progression of hearing impairment leads to insufficient speech intelligibility with hearing aids and issues with adequate communication and safety. Cochlear implantation (CI) is the next step in rehabilitation of such patients. This study evaluates the performance and benefit of CI in patients with USH2a. Retrospective case-control study to evaluate the performance and benefit of CI in 16 postlingually deaf adults (eight patients with USH2a and eight matched controls). Performance and benefit were evaluated by a speech intelligibility test and three quality-of-life questionnaires. Patients with USH2a with a mean age of 59 years at implantation exhibited good performance after CI. The phoneme scores improved significantly from 41 to 87% in patients with USH2a (p = 0.02) and from 30 to 86% in the control group (p = 0.001). The results of the questionnaire survey demonstrated a clear benefit from CI. There were no differences in performance or benefit between patients with USH2a and control patients before and after CI. CI increases speech intelligibility and improves quality of life in patients with USH2a.

Educational Applications for Blind and Partially Sighted Pupils Based on Speech Technologies for Serbian

Science.gov (United States)

Lučić, Branko; Ostrogonac, Stevan; Vujnović Sedlar, Nataša; Sečujski, Milan

2015-01-01

The inclusion of persons with disabilities has always represented an important issue. Advancements within the field of computer science have enabled the development of different types of aids, which have significantly improved the quality of life of the disabled. However, for some disabilities, such as visual impairment, the purpose of these aids is to establish an alternative communication channel and thus overcome the user's disability. Speech technologies play the crucial role in this process. This paper presents the ongoing efforts to create a set of educational applications based on speech technologies for Serbian for the early stages of education of blind and partially sighted children. Two educational applications dealing with memory exercises and comprehension of geometrical shapes are presented, along with the initial tests results obtained from research including visually impaired pupils. PMID:26171422
Educational Applications for Blind and Partially Sighted Pupils Based on Speech Technologies for Serbian.

Science.gov (United States)

Lučić, Branko; Ostrogonac, Stevan; Vujnović Sedlar, Nataša; Sečujski, Milan

2015-01-01

The inclusion of persons with disabilities has always represented an important issue. Advancements within the field of computer science have enabled the development of different types of aids, which have significantly improved the quality of life of the disabled. However, for some disabilities, such as visual impairment, the purpose of these aids is to establish an alternative communication channel and thus overcome the user's disability. Speech technologies play the crucial role in this process. This paper presents the ongoing efforts to create a set of educational applications based on speech technologies for Serbian for the early stages of education of blind and partially sighted children. Two educational applications dealing with memory exercises and comprehension of geometrical shapes are presented, along with the initial tests results obtained from research including visually impaired pupils.
Use of a Deep Recurrent Neural Network to Reduce Wind Noise: Effects on Judged Speech Intelligibility and Sound Quality.

Science.gov (United States)

Keshavarzi, Mahmoud; Goehring, Tobias; Zakis, Justin; Turner, Richard E; Moore, Brian C J

2018-01-01

Despite great advances in hearing-aid technology, users still experience problems with noise in windy environments. The potential benefits of using a deep recurrent neural network (RNN) for reducing wind noise were assessed. The RNN was trained using recordings of the output of the two microphones of a behind-the-ear hearing aid in response to male and female speech at various azimuths in the presence of noise produced by wind from various azimuths with a velocity of 3 m/s, using the "clean" speech as a reference. A paired-comparison procedure was used to compare all possible combinations of three conditions for subjective intelligibility and for sound quality or comfort. The conditions were unprocessed noisy speech, noisy speech processed using the RNN, and noisy speech that was high-pass filtered (which also reduced wind noise). Eighteen native English-speaking participants were tested, nine with normal hearing and nine with mild-to-moderate hearing impairment. Frequency-dependent linear amplification was provided for the latter. Processing using the RNN was significantly preferred over no processing by both subject groups for both subjective intelligibility and sound quality, although the magnitude of the preferences was small. High-pass filtering (HPF) was not significantly preferred over no processing. Although RNN was significantly preferred over HPF only for sound quality for the hearing-impaired participants, for the results as a whole, there was a preference for RNN over HPF. Overall, the results suggest that reduction of wind noise using an RNN is possible and might have beneficial effects when used in hearing aids.
Use of a Deep Recurrent Neural Network to Reduce Wind Noise: Effects on Judged Speech Intelligibility and Sound Quality

Science.gov (United States)

Keshavarzi, Mahmoud; Goehring, Tobias; Zakis, Justin; Turner, Richard E.; Moore, Brian C. J.

2018-01-01

Despite great advances in hearing-aid technology, users still experience problems with noise in windy environments. The potential benefits of using a deep recurrent neural network (RNN) for reducing wind noise were assessed. The RNN was trained using recordings of the output of the two microphones of a behind-the-ear hearing aid in response to male and female speech at various azimuths in the presence of noise produced by wind from various azimuths with a velocity of 3 m/s, using the “clean” speech as a reference. A paired-comparison procedure was used to compare all possible combinations of three conditions for subjective intelligibility and for sound quality or comfort. The conditions were unprocessed noisy speech, noisy speech processed using the RNN, and noisy speech that was high-pass filtered (which also reduced wind noise). Eighteen native English-speaking participants were tested, nine with normal hearing and nine with mild-to-moderate hearing impairment. Frequency-dependent linear amplification was provided for the latter. Processing using the RNN was significantly preferred over no processing by both subject groups for both subjective intelligibility and sound quality, although the magnitude of the preferences was small. High-pass filtering (HPF) was not significantly preferred over no processing. Although RNN was significantly preferred over HPF only for sound quality for the hearing-impaired participants, for the results as a whole, there was a preference for RNN over HPF. Overall, the results suggest that reduction of wind noise using an RNN is possible and might have beneficial effects when used in hearing aids. PMID:29708061
Audiovisual Speech Synchrony Measure: Application to Biometrics

Directory of Open Access Journals (Sweden)

Gérard Chollet

2007-01-01

Full Text Available Speech is a means of communication which is intrinsically bimodal: the audio signal originates from the dynamics of the articulators. This paper reviews recent works in the field of audiovisual speech, and more specifically techniques developed to measure the level of correspondence between audio and visual speech. It overviews the most common audio and visual speech front-end processing, transformations performed on audio, visual, or joint audiovisual feature spaces, and the actual measure of correspondence between audio and visual speech. Finally, the use of synchrony measure for biometric identity verification based on talking faces is experimented on the BANCA database.
Acoustic properties of naturally produced clear speech at normal speaking rates

Science.gov (United States)

Krause, Jean C.; Braida, Louis D.

2004-01-01

Sentences spoken ``clearly'' are significantly more intelligible than those spoken ``conversationally'' for hearing-impaired listeners in a variety of backgrounds [Picheny et al., J. Speech Hear. Res. 28, 96-103 (1985); Uchanski et al., ibid. 39, 494-509 (1996); Payton et al., J. Acoust. Soc. Am. 95, 1581-1592 (1994)]. While producing clear speech, however, talkers often reduce their speaking rate significantly [Picheny et al., J. Speech Hear. Res. 29, 434-446 (1986); Uchanski et al., ibid. 39, 494-509 (1996)]. Yet speaking slowly is not solely responsible for the intelligibility benefit of clear speech (over conversational speech), since a recent study [Krause and Braida, J. Acoust. Soc. Am. 112, 2165-2172 (2002)] showed that talkers can produce clear speech at normal rates with training. This finding suggests that clear speech has inherent acoustic properties, independent of rate, that contribute to improved intelligibility. Identifying these acoustic properties could lead to improved signal processing schemes for hearing aids. To gain insight into these acoustical properties, conversational and clear speech produced at normal speaking rates were analyzed at three levels of detail (global, phonological, and phonetic). Although results suggest that talkers may have employed different strategies to achieve clear speech at normal rates, two global-level properties were identified that appear likely to be linked to the improvements in intelligibility provided by clear/normal speech: increased energy in the 1000-3000-Hz range of long-term spectra and increased modulation depth of low frequency modulations of the intensity envelope. Other phonological and phonetic differences associated with clear/normal speech include changes in (1) frequency of stop burst releases, (2) VOT of word-initial voiceless stop consonants, and (3) short-term vowel spectra.
Proceedings of the specialist meeting on operator aids for severe accidents management and training (SAMOA)

Energy Technology Data Exchange (ETDEWEB)

NONE

1993-07-01

The SAMOA meeting, held in Halden (Norway) in 1993, presented 17 papers grouped into three sessions which titles are: operator aids for control rooms, operator aids for technical support centers, simulation tools for operator training. The specialist meeting also addressed the question of identification of information needs not covered by the instrumentation, examined means to perform phenomenological behaviour assessments needed to support station procedures, and discussed computational aids/methods for predicting accident progression and consequences
Proceedings of the specialist meeting on operator aids for severe accidents management and training (SAMOA)

International Nuclear Information System (INIS)

1993-01-01

The SAMOA meeting, held in Halden (Norway) in 1993, presented 17 papers grouped into three sessions which titles are: operator aids for control rooms, operator aids for technical support centers, simulation tools for operator training. The specialist meeting also addressed the question of identification of information needs not covered by the instrumentation, examined means to perform phenomenological behaviour assessments needed to support station procedures, and discussed computational aids/methods for predicting accident progression and consequences
INTEGRATING MACHINE TRANSLATION AND SPEECH SYNTHESIS COMPONENT FOR ENGLISH TO DRAVIDIAN LANGUAGE SPEECH TO SPEECH TRANSLATION SYSTEM

Directory of Open Access Journals (Sweden)

J. SANGEETHA

2015-02-01

Full Text Available This paper provides an interface between the machine translation and speech synthesis system for converting English speech to Tamil text in English to Tamil speech to speech translation system. The speech translation system consists of three modules: automatic speech recognition, machine translation and text to speech synthesis. Many procedures for incorporation of speech recognition and machine translation have been projected. Still speech synthesis system has not yet been measured. In this paper, we focus on integration of machine translation and speech synthesis, and report a subjective evaluation to investigate the impact of speech synthesis, machine translation and the integration of machine translation and speech synthesis components. Here we implement a hybrid machine translation (combination of rule based and statistical machine translation and concatenative syllable based speech synthesis technique. In order to retain the naturalness and intelligibility of synthesized speech Auto Associative Neural Network (AANN prosody prediction is used in this work. The results of this system investigation demonstrate that the naturalness and intelligibility of the synthesized speech are strongly influenced by the fluency and correctness of the translated text.
Computer Aided Solvent Selection and Design Framework

DEFF Research Database (Denmark)

Mitrofanov, Igor; Conte, Elisa; Abildskov, Jens

and computer-aided tools and methods for property prediction and computer-aided molecular design (CAMD) principles. This framework is applicable for solvent selection and design in product design as well as process design. The first module of the framework is dedicated to the solvent selection and design...... in terms of: physical and chemical properties (solvent-pure properties); Environment, Health and Safety (EHS) characteristic (solvent-EHS properties); operational properties (solvent–solute properties). 3. Performing the search. The search step consists of two stages. The first is a generation and property...... identification of solvent candidates using special software ProCAMD and ProPred, which are the implementations of computer-aided molecular techniques. The second consists of assigning the RS-indices following the reaction–solvent and then consulting the known solvent database and identifying the set of solvents...
Speech reception with different bilateral directional processing schemes: Influence of binaural hearing, audiometric asymmetry, and acoustic scenario.

Science.gov (United States)

Neher, Tobias; Wagener, Kirsten C; Latzel, Matthias

2017-09-01

Hearing aid (HA) users can differ markedly in their benefit from directional processing (or beamforming) algorithms. The current study therefore investigated candidacy for different bilateral directional processing schemes. Groups of elderly listeners with symmetric (N = 20) or asymmetric (N = 19) hearing thresholds for frequencies below 2 kHz, a large spread in the binaural intelligibility level difference (BILD), and no difference in age, overall degree of hearing loss, or performance on a measure of selective attention took part. Aided speech reception was measured using virtual acoustics together with a simulation of a linked pair of completely occluding behind-the-ear HAs. Five processing schemes and three acoustic scenarios were used. The processing schemes differed in the tradeoff between signal-to-noise ratio (SNR) improvement and binaural cue preservation. The acoustic scenarios consisted of a frontal target talker presented against two speech maskers from ±60° azimuth or spatially diffuse cafeteria noise. For both groups, a significant interaction between BILD, processing scheme, and acoustic scenario was found. This interaction implied that, in situations with lateral speech maskers, HA users with BILDs larger than about 2 dB profited more from preserved low-frequency binaural cues than from greater SNR improvement, whereas for smaller BILDs the opposite was true. Audiometric asymmetry reduced the influence of binaural hearing. In spatially diffuse noise, the maximal SNR improvement was generally beneficial. N 0 S π detection performance at 500 Hz predicted the benefit from low-frequency binaural cues. Together, these findings provide a basis for adapting bilateral directional processing to individual and situational influences. Further research is needed to investigate their generalizability to more realistic HA conditions (e.g., with low-frequency vent-transmitted sound). Copyright © 2017 Elsevier B.V. All rights reserved.
A comparison of sound quality judgments for monaural and binaural hearing aid processed stimuli.

Science.gov (United States)

Balfour, P B; Hawkins, D B

1992-10-01

Fifteen adults with bilaterally symmetrical mild and/or moderate sensorineural hearing loss completed a paired-comparison task designed to elicit sound quality preference judgments for monaural/binaural hearing aid processed signals. Three stimuli (speech-in-quiet, speech-in-noise, and music) were recorded separately in three listening environments (audiometric test booth, living room, and a music/lecture hall) through hearing aids placed on a Knowles Electronics Manikin for Acoustics Research. Judgments were made on eight separate sound quality dimensions (brightness, clarity, fullness, loudness, nearness, overall impression, smoothness, and spaciousness) for each of the three stimuli in three listening environments. Results revealed a distinct binaural preference for all eight sound quality dimensions independent of listening environment. Binaural preferences were strongest for overall impression, fullness, and spaciousness. Stimulus type effect was significant only for fullness and spaciousness, where binaural preferences were strongest for speech-in-quiet. After binaural preference data were obtained, subjects ranked each sound quality dimension with respect to its importance for binaural listening relative to monaural. Clarity was ranked highest in importance and brightness was ranked least important. The key to demonstration of improved binaural hearing aid sound quality may be the use of a paired-comparison format.
Using Motivated Sequence in Persuasive Speaking: The Speech for Charity

Science.gov (United States)

McDermott, Virginia M.

2004-01-01

Objective: To select a charitable organization to receive the class monetary donation. Type of speech: Persuasive. Point value: 100 points, which is 20% of course grade. Requirements: (a) References: 5; (b) Length: 5-7 minutes; (c) Visual aid: Yes; (d) Outline: Yes; (e) Prerequisite reading: Chapter 15 (Lucas, 2001), Chapter 7 (McKerrow, Gronbeck,…
Speech recognition systems on the Cell Broadband Engine

Energy Technology Data Exchange (ETDEWEB)

Liu, Y; Jones, H; Vaidya, S; Perrone, M; Tydlitat, B; Nanda, A

2007-04-20

In this paper we describe our design, implementation, and first results of a prototype connected-phoneme-based speech recognition system on the Cell Broadband Engine{trademark} (Cell/B.E.). Automatic speech recognition decodes speech samples into plain text (other representations are possible) and must process samples at real-time rates. Fortunately, the computational tasks involved in this pipeline are highly data-parallel and can receive significant hardware acceleration from vector-streaming architectures such as the Cell/B.E. Identifying and exploiting these parallelism opportunities is challenging, but also critical to improving system performance. We observed, from our initial performance timings, that a single Cell/B.E. processor can recognize speech from thousands of simultaneous voice channels in real time--a channel density that is orders-of-magnitude greater than the capacity of existing software speech recognizers based on CPUs (central processing units). This result emphasizes the potential for Cell/B.E.-based speech recognition and will likely lead to the future development of production speech systems using Cell/B.E. clusters.
The Influence of High-Frequency Envelope Information on Low-Frequency Vowel Identification in Noise.

Directory of Open Access Journals (Sweden)

Wiebke Schubotz

Full Text Available Vowel identification in noise using consonant-vowel-consonant (CVC logatomes was used to investigate a possible interplay of speech information from different frequency regions. It was hypothesized that the periodicity conveyed by the temporal envelope of a high frequency stimulus can enhance the use of the information carried by auditory channels in the low-frequency region that share the same periodicity. It was further hypothesized that this acts as a strobe-like mechanism and would increase the signal-to-noise ratio for the voiced parts of the CVCs. In a first experiment, different high-frequency cues were provided to test this hypothesis, whereas a second experiment examined more closely the role of amplitude modulations and intact phase information within the high-frequency region (4-8 kHz. CVCs were either natural or vocoded speech (both limited to a low-pass cutoff-frequency of 2.5 kHz and were presented in stationary 3-kHz low-pass filtered masking noise. The experimental results did not support the hypothesized use of periodicity information for aiding low-frequency perception.
The Influence of High-Frequency Envelope Information on Low-Frequency Vowel Identification in Noise.

Science.gov (United States)

Schubotz, Wiebke; Brand, Thomas; Kollmeier, Birger; Ewert, Stephan D

2016-01-01

Vowel identification in noise using consonant-vowel-consonant (CVC) logatomes was used to investigate a possible interplay of speech information from different frequency regions. It was hypothesized that the periodicity conveyed by the temporal envelope of a high frequency stimulus can enhance the use of the information carried by auditory channels in the low-frequency region that share the same periodicity. It was further hypothesized that this acts as a strobe-like mechanism and would increase the signal-to-noise ratio for the voiced parts of the CVCs. In a first experiment, different high-frequency cues were provided to test this hypothesis, whereas a second experiment examined more closely the role of amplitude modulations and intact phase information within the high-frequency region (4-8 kHz). CVCs were either natural or vocoded speech (both limited to a low-pass cutoff-frequency of 2.5 kHz) and were presented in stationary 3-kHz low-pass filtered masking noise. The experimental results did not support the hypothesized use of periodicity information for aiding low-frequency perception.
Difficulty understanding speech in noise by the hearing impaired: underlying causes and technological solutions.

Science.gov (United States)

Healy, Eric W; Yoho, Sarah E

2016-08-01

A primary complaint of hearing-impaired individuals involves poor speech understanding when background noise is present. Hearing aids and cochlear implants often allow good speech understanding in quiet backgrounds. But hearing-impaired individuals are highly noise intolerant, and existing devices are not very effective at combating background noise. As a result, speech understanding in noise is often quite poor. In accord with the significance of the problem, considerable effort has been expended toward understanding and remedying this issue. Fortunately, our understanding of the underlying issues is reasonably good. In sharp contrast, effective solutions have remained elusive. One solution that seems promising involves a single-microphone machine-learning algorithm to extract speech from background noise. Data from our group indicate that the algorithm is capable of producing vast increases in speech understanding by hearing-impaired individuals. This paper will first provide an overview of the speech-in-noise problem and outline why hearing-impaired individuals are so noise intolerant. An overview of our approach to solving this problem will follow.
Speech-in-speech perception and executive function involvement.

Directory of Open Access Journals (Sweden)

Marcela Perrone-Bertolotti

Full Text Available This present study investigated the link between speech-in-speech perception capacities and four executive function components: response suppression, inhibitory control, switching and working memory. We constructed a cross-modal semantic priming paradigm using a written target word and a spoken prime word, implemented in one of two concurrent auditory sentences (cocktail party situation. The prime and target were semantically related or unrelated. Participants had to perform a lexical decision task on visual target words and simultaneously listen to only one of two pronounced sentences. The attention of the participant was manipulated: The prime was in the pronounced sentence listened to by the participant or in the ignored one. In addition, we evaluate the executive function abilities of participants (switching cost, inhibitory-control cost and response-suppression cost and their working memory span. Correlation analyses were performed between the executive and priming measurements. Our results showed a significant interaction effect between attention and semantic priming. We observed a significant priming effect in the attended but not in the ignored condition. Only priming effects obtained in the ignored condition were significantly correlated with some of the executive measurements. However, no correlation between priming effects and working memory capacity was found. Overall, these results confirm, first, the role of attention for semantic priming effect and, second, the implication of executive functions in speech-in-noise understanding capacities.
Cortical oscillations and entrainment in speech processing during working memory load

DEFF Research Database (Denmark)

Hjortkjær, Jens; Märcher-Rørsted, Jonatan; Fuglsang, Søren A

2018-01-01

Neuronal oscillations are thought to play an important role in working memory (WM) and speech processing. Listening to speech in real-life situations is often cognitively demanding but it is unknown whether WM load influences how auditory cortical activity synchronizes to speech features. Here, we...... developed an auditory n-back paradigm to investigate cortical entrainment to speech envelope fluctuations under different degrees of WM load. We measured the electroencephalogram, pupil dilations and behavioural performance from 22 subjects listening to continuous speech with an embedded n-back task....... The speech stimuli consisted of long spoken number sequences created to match natural speech in terms of sentence intonation, syllabic rate and phonetic content. To burden different WM functions during speech processing, listeners performed an n-back task on the speech sequences in different levels...
Audiovisual Integration in Children Listening to Spectrally Degraded Speech

Science.gov (United States)

Maidment, David W.; Kang, Hi Jee; Stewart, Hannah J.; Amitay, Sygal

2015-01-01

Purpose: The study explored whether visual information improves speech identification in typically developing children with normal hearing when the auditory signal is spectrally degraded. Method: Children (n = 69) and adults (n = 15) were presented with noise-vocoded sentences from the Children's Co-ordinate Response Measure (Rosen, 2011) in…

The selective role of premotor cortex in speech perception: a contribution to phoneme judgements but not speech comprehension.

Science.gov (United States)

Krieger-Redwood, Katya; Gaskell, M Gareth; Lindsay, Shane; Jefferies, Elizabeth

2013-12-01

Several accounts of speech perception propose that the areas involved in producing language are also involved in perceiving it. In line with this view, neuroimaging studies show activation of premotor cortex (PMC) during phoneme judgment tasks; however, there is debate about whether speech perception necessarily involves motor processes, across all task contexts, or whether the contribution of PMC is restricted to tasks requiring explicit phoneme awareness. Some aspects of speech processing, such as mapping sounds onto meaning, may proceed without the involvement of motor speech areas if PMC specifically contributes to the manipulation and categorical perception of phonemes. We applied TMS to three sites-PMC, posterior superior temporal gyrus, and occipital pole-and for the first time within the TMS literature, directly contrasted two speech perception tasks that required explicit phoneme decisions and mapping of speech sounds onto semantic categories, respectively. TMS to PMC disrupted explicit phonological judgments but not access to meaning for the same speech stimuli. TMS to two further sites confirmed that this pattern was site specific and did not reflect a generic difference in the susceptibility of our experimental tasks to TMS: stimulation of pSTG, a site involved in auditory processing, disrupted performance in both language tasks, whereas stimulation of occipital pole had no effect on performance in either task. These findings demonstrate that, although PMC is important for explicit phonological judgments, crucially, PMC is not necessary for mapping speech onto meanings.
Sound Classification in Hearing Aids Inspired by Auditory Scene Analysis

Science.gov (United States)

Büchler, Michael; Allegro, Silvia; Launer, Stefan; Dillier, Norbert

2005-12-01

A sound classification system for the automatic recognition of the acoustic environment in a hearing aid is discussed. The system distinguishes the four sound classes "clean speech," "speech in noise," "noise," and "music." A number of features that are inspired by auditory scene analysis are extracted from the sound signal. These features describe amplitude modulations, spectral profile, harmonicity, amplitude onsets, and rhythm. They are evaluated together with different pattern classifiers. Simple classifiers, such as rule-based and minimum-distance classifiers, are compared with more complex approaches, such as Bayes classifier, neural network, and hidden Markov model. Sounds from a large database are employed for both training and testing of the system. The achieved recognition rates are very high except for the class "speech in noise." Problems arise in the classification of compressed pop music, strongly reverberated speech, and tonal or fluctuating noises.
Magnified Neural Envelope Coding Predicts Deficits in Speech Perception in Noise.

Science.gov (United States)

Millman, Rebecca E; Mattys, Sven L; Gouws, André D; Prendergast, Garreth

2017-08-09

Verbal communication in noisy backgrounds is challenging. Understanding speech in background noise that fluctuates in intensity over time is particularly difficult for hearing-impaired listeners with a sensorineural hearing loss (SNHL). The reduction in fast-acting cochlear compression associated with SNHL exaggerates the perceived fluctuations in intensity in amplitude-modulated sounds. SNHL-induced changes in the coding of amplitude-modulated sounds may have a detrimental effect on the ability of SNHL listeners to understand speech in the presence of modulated background noise. To date, direct evidence for a link between magnified envelope coding and deficits in speech identification in modulated noise has been absent. Here, magnetoencephalography was used to quantify the effects of SNHL on phase locking to the temporal envelope of modulated noise (envelope coding) in human auditory cortex. Our results show that SNHL enhances the amplitude of envelope coding in posteromedial auditory cortex, whereas it enhances the fidelity of envelope coding in posteromedial and posterolateral auditory cortex. This dissociation was more evident in the right hemisphere, demonstrating functional lateralization in enhanced envelope coding in SNHL listeners. However, enhanced envelope coding was not perceptually beneficial. Our results also show that both hearing thresholds and, to a lesser extent, magnified cortical envelope coding in left posteromedial auditory cortex predict speech identification in modulated background noise. We propose a framework in which magnified envelope coding in posteromedial auditory cortex disrupts the segregation of speech from background noise, leading to deficits in speech perception in modulated background noise. SIGNIFICANCE STATEMENT People with hearing loss struggle to follow conversations in noisy environments. Background noise that fluctuates in intensity over time poses a particular challenge. Using magnetoencephalography, we demonstrate
Copenhagen failure : a rhetorical treatise of how speeches unite and divide mankind

OpenAIRE

Kortetmäki, Teea

2010-01-01

The purpose of this treatise is to analyse five of the Copenhagen Climate Convention's main speeches to see how they supported or weakened the agreement possibilities in the convention. Particular focus will be on the elements that divide or unite negotiators and whether the summit's failing outcome is already built in the pre-planned speeches held at the main podium. Theoretically, the study builds on Kenneth Burke's identification thesis and Elizabeth L. Malone's climate change debate an...
Preschool speech intelligibility and vocabulary skills predict long-term speech and language outcomes following cochlear implantation in early childhood.

Science.gov (United States)

Castellanos, Irina; Kronenberger, William G; Beer, Jessica; Henning, Shirley C; Colson, Bethany G; Pisoni, David B

2014-07-01

Speech and language measures during grade school predict adolescent speech-language outcomes in children who receive cochlear implants (CIs), but no research has examined whether speech and language functioning at even younger ages is predictive of long-term outcomes in this population. The purpose of this study was to examine whether early preschool measures of speech and language performance predict speech-language functioning in long-term users of CIs. Early measures of speech intelligibility and receptive vocabulary (obtained during preschool ages of 3-6 years) in a sample of 35 prelingually deaf, early-implanted children predicted speech perception, language, and verbal working memory skills up to 18 years later. Age of onset of deafness and age at implantation added additional variance to preschool speech intelligibility in predicting some long-term outcome scores, but the relationship between preschool speech-language skills and later speech-language outcomes was not significantly attenuated by the addition of these hearing history variables. These findings suggest that speech and language development during the preschool years is predictive of long-term speech and language functioning in early-implanted, prelingually deaf children. As a result, measures of speech-language functioning at preschool ages can be used to identify and adjust interventions for very young CI users who may be at long-term risk for suboptimal speech and language outcomes.
Hearing Handicap and Speech Recognition Correlate With Self-Reported Listening Effort and Fatigue.

Science.gov (United States)

Alhanbali, Sara; Dawes, Piers; Lloyd, Simon; Munro, Kevin J

To investigate the correlations between hearing handicap, speech recognition, listening effort, and fatigue. Eighty-four adults with hearing loss (65 to 85 years) completed three self-report questionnaires: the Fatigue Assessment Scale, the Effort Assessment Scale, and the Hearing Handicap Inventory for Elderly. Audiometric assessment included pure-tone audiometry and speech recognition in noise. There was a significant positive correlation between handicap and fatigue (r = 0.39, p speech recognition and fatigue (r = 0.22, p speech recognition both correlate with self-reported listening effort and fatigue, which is consistent with a model of listening effort and fatigue where perceived difficulty is related to sustained effort and fatigue for unrewarding tasks over which the listener has low control. A clinical implication is that encouraging clients to recognize and focus on the pleasure and positive experiences of listening may result in greater satisfaction and benefit from hearing aid use.
Do long-term tongue piercings affect speech quality?

Science.gov (United States)

Heinen, Esther; Birkholz, Peter; Willmes, Klaus; Neuschaefer-Rube, Christiane

2017-10-01

To explore possible effects of tongue piercing on perceived speech quality. Using a quasi-experimental design, we analyzed the effect of tongue piercing on speech in a perception experiment. Samples of spontaneous speech and read speech were recorded from 20 long-term pierced and 20 non-pierced individuals (10 males, 10 females each). The individuals having a tongue piercing were recorded with attached and removed piercing. The audio samples were blindly rated by 26 female and 20 male laypersons and by 5 female speech-language pathologists with regard to perceived speech quality along 5 dimensions: speech clarity, speech rate, prosody, rhythm and fluency. We found no statistically significant differences for any of the speech quality dimensions between the pierced and non-pierced individuals, neither for the read nor for the spontaneous speech. In addition, neither length nor position of piercing had a significant effect on speech quality. The removal of tongue piercings had no effects on speech performance either. Rating differences between laypersons and speech-language pathologists were not dependent on the presence of a tongue piercing. People are able to perfectly adapt their articulation to long-term tongue piercings such that their speech quality is not perceptually affected.
Deep bottleneck features for spoken language identification.

Directory of Open Access Journals (Sweden)

Bing Jiang

Full Text Available A key problem in spoken language identification (LID is to design effective representations which are specific to language information. For example, in recent years, representations based on both phonotactic and acoustic features have proven their effectiveness for LID. Although advances in machine learning have led to significant improvements, LID performance is still lacking, especially for short duration speech utterances. With the hypothesis that language information is weak and represented only latently in speech, and is largely dependent on the statistical properties of the speech content, existing representations may be insufficient. Furthermore they may be susceptible to the variations caused by different speakers, specific content of the speech segments, and background noise. To address this, we propose using Deep Bottleneck Features (DBF for spoken LID, motivated by the success of Deep Neural Networks (DNN in speech recognition. We show that DBFs can form a low-dimensional compact representation of the original inputs with a powerful descriptive and discriminative capability. To evaluate the effectiveness of this, we design two acoustic models, termed DBF-TV and parallel DBF-TV (PDBF-TV, using a DBF based i-vector representation for each speech utterance. Results on NIST language recognition evaluation 2009 (LRE09 show significant improvements over state-of-the-art systems. By fusing the output of phonotactic and acoustic approaches, we achieve an EER of 1.08%, 1.89% and 7.01% for 30 s, 10 s and 3 s test utterances respectively. Furthermore, various DBF configurations have been extensively evaluated, and an optimal system proposed.
Brainstem encoding of speech and musical stimuli in congenital amusia: Evidence from Cantonese speakers

Directory of Open Access Journals (Sweden)

Fang eLiu

2015-01-01

Full Text Available Congenital amusia is a neurodevelopmental disorder of musical processing that also impacts subtle aspects of speech processing. It remains debated at what stage(s of auditory processing deficits in amusia arise. In this study, we investigated whether amusia originates from impaired subcortical encoding of speech (in quiet and noise and musical sounds in the brainstem. Fourteen Cantonese-speaking amusics and 14 matched controls passively listened to six Cantonese lexical tones in quiet, two Cantonese tones in noise (signal-to-noise ratios at 0 and 20 dB, and two cello tones in quiet while their frequency-following responses (FFRs to these tones were recorded. All participants also completed a behavioral lexical tone identification task. The results indicated normal brainstem encoding of pitch in speech (in quiet and noise and musical stimuli in amusics relative to controls, as measured by FFR pitch strength, pitch error, and stimulus-to-response correlation. There was also no group difference in neural conduction time or FFR amplitudes. Both groups demonstrated better FFRs to speech (in quiet and noise than to musical stimuli. However, a significant group difference was observed for tone identification, with amusics showing significantly lower accuracy than controls. Analysis of the tone confusion matrices suggested that amusics were more likely than controls to confuse between tones that shared similar acoustic features. Interestingly, this deficit in lexical tone identification was not coupled with brainstem abnormality for either speech or musical stimuli. Together, our results suggest that the amusic brainstem is not functioning abnormally, although higher-order linguistic pitch processing is impaired in amusia. This finding has significant implications for theories of central auditory processing, requiring further investigations into how different stages of auditory processing interact in the human brain.
Brainstem encoding of speech and musical stimuli in congenital amusia: evidence from Cantonese speakers.

Science.gov (United States)

Liu, Fang; Maggu, Akshay R; Lau, Joseph C Y; Wong, Patrick C M

2014-01-01

Congenital amusia is a neurodevelopmental disorder of musical processing that also impacts subtle aspects of speech processing. It remains debated at what stage(s) of auditory processing deficits in amusia arise. In this study, we investigated whether amusia originates from impaired subcortical encoding of speech (in quiet and noise) and musical sounds in the brainstem. Fourteen Cantonese-speaking amusics and 14 matched controls passively listened to six Cantonese lexical tones in quiet, two Cantonese tones in noise (signal-to-noise ratios at 0 and 20 dB), and two cello tones in quiet while their frequency-following responses (FFRs) to these tones were recorded. All participants also completed a behavioral lexical tone identification task. The results indicated normal brainstem encoding of pitch in speech (in quiet and noise) and musical stimuli in amusics relative to controls, as measured by FFR pitch strength, pitch error, and stimulus-to-response correlation. There was also no group difference in neural conduction time or FFR amplitudes. Both groups demonstrated better FFRs to speech (in quiet and noise) than to musical stimuli. However, a significant group difference was observed for tone identification, with amusics showing significantly lower accuracy than controls. Analysis of the tone confusion matrices suggested that amusics were more likely than controls to confuse between tones that shared similar acoustic features. Interestingly, this deficit in lexical tone identification was not coupled with brainstem abnormality for either speech or musical stimuli. Together, our results suggest that the amusic brainstem is not functioning abnormally, although higher-order linguistic pitch processing is impaired in amusia. This finding has significant implications for theories of central auditory processing, requiring further investigations into how different stages of auditory processing interact in the human brain.
Brainstem encoding of speech and musical stimuli in congenital amusia: evidence from Cantonese speakers

Science.gov (United States)

Liu, Fang; Maggu, Akshay R.; Lau, Joseph C. Y.; Wong, Patrick C. M.

2015-01-01

Congenital amusia is a neurodevelopmental disorder of musical processing that also impacts subtle aspects of speech processing. It remains debated at what stage(s) of auditory processing deficits in amusia arise. In this study, we investigated whether amusia originates from impaired subcortical encoding of speech (in quiet and noise) and musical sounds in the brainstem. Fourteen Cantonese-speaking amusics and 14 matched controls passively listened to six Cantonese lexical tones in quiet, two Cantonese tones in noise (signal-to-noise ratios at 0 and 20 dB), and two cello tones in quiet while their frequency-following responses (FFRs) to these tones were recorded. All participants also completed a behavioral lexical tone identification task. The results indicated normal brainstem encoding of pitch in speech (in quiet and noise) and musical stimuli in amusics relative to controls, as measured by FFR pitch strength, pitch error, and stimulus-to-response correlation. There was also no group difference in neural conduction time or FFR amplitudes. Both groups demonstrated better FFRs to speech (in quiet and noise) than to musical stimuli. However, a significant group difference was observed for tone identification, with amusics showing significantly lower accuracy than controls. Analysis of the tone confusion matrices suggested that amusics were more likely than controls to confuse between tones that shared similar acoustic features. Interestingly, this deficit in lexical tone identification was not coupled with brainstem abnormality for either speech or musical stimuli. Together, our results suggest that the amusic brainstem is not functioning abnormally, although higher-order linguistic pitch processing is impaired in amusia. This finding has significant implications for theories of central auditory processing, requiring further investigations into how different stages of auditory processing interact in the human brain. PMID:25646077
A multicenter study on objective and subjective benefits with a transcutaneous bone-anchored hearing aid device

DEFF Research Database (Denmark)

Hougaard, Dan Dupont; Boldsen, Soren Kjaergaard; Jensen, Anne Marie

2017-01-01

Examination of objective as well as subjective outcomes with a new transcutaneous bone-anchored hearing aid device. The study was designed as a prospective multicenter consecutive case-series study involving tertiary referral centers at two Danish University Hospitals. A total of 23 patients were...... implanted. Three were lost to follow-up. Patients had single-sided deafness, conductive or mixed hearing loss. Intervention: Rehabilitative. Aided and unaided sound field hearing was evaluated objectively using (1) pure warble tone thresholds, (2) pure-tone average (PTA4), (3) speech discrimination score...... (SDS) in quiet, and (4) speech reception threshold 50% at 70 dB SPL noise level (SRT50%). Subjective benefit was evaluated by three validated questionnaires: (1) the IOI-HA, (2) the SSQ-12, and (3) a questionnaire evaluating both the frequency and the duration of hearing aid usage. The mean aided PTA4...
Early experience with the cochlear ESPrit ear-level speech processor in children.

Science.gov (United States)

Totten, C; Cope, Y; McCormick, B

2000-12-01

The ESPrit ear-level speech processor has recently become available in the United Kingdom for use with the Nucleus CI24M multichannel cochlear implant. We report on the use of this ear-level processor with 6 children, ages 8 to 15 years. In this study, all patients were initially fitted with the SPrint body-worn processor, this being a prerequisite for programming the ESPrit. Five of the children were fitted successfully with the ESPrit and are using their devices consistently. The results show that patient experience with the ESPrit has been favorable, although there have been some device and programming difficulties. Aided threshold measures show that the ESPrit processor performs at least as well as the SPrint processor, with a trend toward improved aided thresholds for the ESPrit processor compared with the SPrint processor. Further study of the functional benefit of both of these devices may confirm these potential gains. The ESPrit device currently has a disadvantage for children in that it does not support FM radio hearing aid use. Finally, caution is advised in the fitting of the ESPrit in very young children or inexperienced listeners, because of difficulties in monitoring device function.
Speech recognition in natural background noise.

Directory of Open Access Journals (Sweden)

Julien Meyer

Full Text Available In the real world, human speech recognition nearly always involves listening in background noise. The impact of such noise on speech signals and on intelligibility performance increases with the separation of the listener from the speaker. The present behavioral experiment provides an overview of the effects of such acoustic disturbances on speech perception in conditions approaching ecologically valid contexts. We analysed the intelligibility loss in spoken word lists with increasing listener-to-speaker distance in a typical low-level natural background noise. The noise was combined with the simple spherical amplitude attenuation due to distance, basically changing the signal-to-noise ratio (SNR. Therefore, our study draws attention to some of the most basic environmental constraints that have pervaded spoken communication throughout human history. We evaluated the ability of native French participants to recognize French monosyllabic words (spoken at 65.3 dB(A, reference at 1 meter at distances between 11 to 33 meters, which corresponded to the SNRs most revealing of the progressive effect of the selected natural noise (-8.8 dB to -18.4 dB. Our results showed that in such conditions, identity of vowels is mostly preserved, with the striking peculiarity of the absence of confusion in vowels. The results also confirmed the functional role of consonants during lexical identification. The extensive analysis of recognition scores, confusion patterns and associated acoustic cues revealed that sonorant, sibilant and burst properties were the most important parameters influencing phoneme recognition. . Altogether these analyses allowed us to extract a resistance scale from consonant recognition scores. We also identified specific perceptual consonant confusion groups depending of the place in the words (onset vs. coda. Finally our data suggested that listeners may access some acoustic cues of the CV transition, opening interesting perspectives for
Speech recognition in natural background noise.

Science.gov (United States)

Meyer, Julien; Dentel, Laure; Meunier, Fanny

2013-01-01

In the real world, human speech recognition nearly always involves listening in background noise. The impact of such noise on speech signals and on intelligibility performance increases with the separation of the listener from the speaker. The present behavioral experiment provides an overview of the effects of such acoustic disturbances on speech perception in conditions approaching ecologically valid contexts. We analysed the intelligibility loss in spoken word lists with increasing listener-to-speaker distance in a typical low-level natural background noise. The noise was combined with the simple spherical amplitude attenuation due to distance, basically changing the signal-to-noise ratio (SNR). Therefore, our study draws attention to some of the most basic environmental constraints that have pervaded spoken communication throughout human history. We evaluated the ability of native French participants to recognize French monosyllabic words (spoken at 65.3 dB(A), reference at 1 meter) at distances between 11 to 33 meters, which corresponded to the SNRs most revealing of the progressive effect of the selected natural noise (-8.8 dB to -18.4 dB). Our results showed that in such conditions, identity of vowels is mostly preserved, with the striking peculiarity of the absence of confusion in vowels. The results also confirmed the functional role of consonants during lexical identification. The extensive analysis of recognition scores, confusion patterns and associated acoustic cues revealed that sonorant, sibilant and burst properties were the most important parameters influencing phoneme recognition. . Altogether these analyses allowed us to extract a resistance scale from consonant recognition scores. We also identified specific perceptual consonant confusion groups depending of the place in the words (onset vs. coda). Finally our data suggested that listeners may access some acoustic cues of the CV transition, opening interesting perspectives for future studies.
Language and speech outcomes of children with hearing loss and additional disabilities: Identifying the variables that influence performance at 5 years of age

Science.gov (United States)

Cupples, Linda; Ching, Teresa Y.C.; Button, Laura; Leigh, Greg; Marnane, Vivienne; Whitfield, Jessica; Gunnourie, Miriam; Martin, Louise

2016-01-01

Objective This study examined language and speech outcomes in young children with hearing loss and additional disabilities. Design Receptive and expressive language skills and speech output accuracy were evaluated using direct assessment and caregiver report. Results were analysed first for the entire participant cohort, and then to compare results for children with hearing aids (HAs) versus cochlear implants (CIs). Study sample A population-based cohort of 146 5-year-old children with hearing loss and additional disabilities took part. Results Across all participants, multiple regressions showed that better language outcomes were associated with milder hearing loss, use of oral communication, higher levels of cognitive ability and maternal education, and earlier device fitting. Speech output accuracy was associated with use of oral communication only. Average outcomes were similar for children with HAs versus CIs, but their associations with demographic variables differed. For HA users, results resembled those for the whole cohort. For CI users, only use of oral communication and higher cognitive ability levels were significantly associated with better language outcomes. Conclusions The results underscore the importance of early device fitting for children with additional disabilities. Strong conclusions cannot be drawn for CI users given the small number of participants with complete data. PMID:27630013
Surgical improvement of speech disorder caused by amyotrophic lateral sclerosis.

Science.gov (United States)

Saigusa, Hideto; Yamaguchi, Satoshi; Nakamura, Tsuyoshi; Komachi, Taro; Kadosono, Osamu; Ito, Hiroyuki; Saigusa, Makoto; Niimi, Seiji

2012-12-01

Amyotrophic lateral sclerosis (ALS) is a progressive debilitating neurological disease. ALS disturbs the quality of life by affecting speech, swallowing and free mobility of the arms without affecting intellectual function. It is therefore of significance to improve intelligibility and quality of speech sounds, especially for ALS patients with slowly progressive courses. Currently, however, there is no effective or established approach to improve speech disorder caused by ALS. We investigated a surgical procedure to improve speech disorder for some patients with neuromuscular diseases with velopharyngeal closure incompetence. In this study, we performed the surgical procedure for two patients suffering from severe speech disorder caused by slowly progressing ALS. The patients suffered from speech disorder with hypernasality and imprecise and weak articulation during a 6-year course (patient 1) and a 3-year course (patient 2) of slowly progressing ALS. We narrowed bilateral lateral palatopharyngeal wall at velopharyngeal port, and performed this surgery under general anesthesia without muscle relaxant for the two patients. Postoperatively, intelligibility and quality of their speech sounds were greatly improved within one month without any speech therapy. The patients were also able to generate longer speech phrases after the surgery. Importantly, there was no serious complication during or after the surgery. In summary, we performed bilateral narrowing of lateral palatopharyngeal wall as a speech surgery for two patients suffering from severe speech disorder associated with ALS. With this technique, improved intelligibility and quality of speech can be maintained for longer duration for the patients with slowly progressing ALS.
Exploring the Link Between Cognitive Abilities and Speech Recognition in the Elderly Under Different Listening Conditions

Directory of Open Access Journals (Sweden)

Theresa Nuesse

2018-05-01

Full Text Available Elderly listeners are known to differ considerably in their ability to understand speech in noise. Several studies have addressed the underlying factors that contribute to these differences. These factors include audibility, and age-related changes in supra-threshold auditory processing abilities, and it has been suggested that differences in cognitive abilities may also be important. The objective of this study was to investigate associations between performance in cognitive tasks and speech recognition under different listening conditions in older adults with either age appropriate hearing or hearing-impairment. To that end, speech recognition threshold (SRT measurements were performed under several masking conditions that varied along the perceptual dimensions of dip listening, spatial separation, and informational masking. In addition, a neuropsychological test battery was administered, which included measures of verbal working and short-term memory, executive functioning, selective and divided attention, and lexical and semantic abilities. Age-matched groups of older adults with either age-appropriate hearing (ENH, n = 20 or aided hearing impairment (EHI, n = 21 participated. In repeated linear regression analyses, composite scores of cognitive test outcomes (evaluated using PCA were included to predict SRTs. These associations were different for the two groups. When hearing thresholds were controlled for, composed cognitive factors were significantly associated with the SRTs for the ENH listeners. Whereas better lexical and semantic abilities were associated with lower (better SRTs in this group, there was a negative association between attentional abilities and speech recognition in the presence of spatially separated speech-like maskers. For the EHI group, the pure-tone thresholds (averaged across 0.5, 1, 2, and 4 kHz were significantly associated with the SRTs, despite the fact that all signals were amplified and therefore in principle
Speech recognition in individuals with sensorineural hearing loss

Directory of Open Access Journals (Sweden)

Adriana Neves de Andrade

Full Text Available ABSTRACT INTRODUCTION: Hearing loss can negatively influence the communication performance of individuals, who should be evaluated with suitable material and in situations of listening close to those found in everyday life. OBJECTIVE: To analyze and compare the performance of patients with mild-to-moderate sensorineural hearing loss in speech recognition tests carried out in silence and with noise, according to the variables ear (right and left and type of stimulus presentation. METHODS: The study included 19 right-handed individuals with mild-to-moderate symmetrical bilateral sensorineural hearing loss, submitted to the speech recognition test with words in different modalities and speech test with white noise and pictures. RESULTS: There was no significant difference between right and left ears in any of the tests. The mean number of correct responses in the speech recognition test with pictures, live voice, and recorded monosyllables was 97.1%, 85.9%, and 76.1%, respectively, whereas after the introduction of noise, the performance decreased to 72.6% accuracy. CONCLUSIONS: The best performances in the Speech Recognition Percentage Index were obtained using monosyllabic stimuli, represented by pictures presented in silence, with no significant differences between the right and left ears. After the introduction of competitive noise, there was a decrease in individuals' performance.
State of the art in perceptual design of hearing aids

Science.gov (United States)

Edwards, Brent W.; van Tasell, Dianne J.

2002-05-01

Hearing aid capabilities have increased dramatically over the past six years, in large part due to the development of small, low-power digital signal processing chips suitable for hearing aid applications. As hearing aid signal processing capabilities increase, there will be new opportunities to apply perceptually based knowledge to technological development. Most hearing loss compensation techniques in today's hearing aids are based on simple estimates of audibility and loudness. As our understanding of the psychoacoustical and physiological characteristics of sensorineural hearing loss improves, the result should be improved design of hearing aids and fitting methods. The state of the art in hearing aids will be reviewed, including form factors, user requirements, and technology that improves speech intelligibility, sound quality, and functionality. General areas of auditory perception that remain unaddressed by current hearing aid technology will be discussed.

Is AGC beneficial in hearing aids?

Science.gov (United States)

King, A B; Martin, M C

1984-02-01

Three different functions of Automatic Gain Control (AGC) circuits in hearing aids are distinguished and the evidence for their benefits is considered. The value of AGC's function as a relatively distortion-free means of limiting output has been well established. With regard to compression, the benefit of short-term or 'syllabic' compression has not been demonstrated convincingly. Most evaluations of this type of AGC have looked for increase in speech intelligibility, but theoretical predictions of its effect do not appear to take account of the acoustic cues to consonant contrasts actually used by hearing impaired people, and empirical studies have often used listening conditions which do not give a realistic test of benefit. Relatively little attention has been paid to long-term compression, or to the effect of AGC on comfort rather than intelligibility. Listening tests carried out at the RNID and reported here have shown that AGC can benefit hearing aid users by allowing them to listen to a wider range of sound levels without either strain or discomfort, and, if time constants are well chosen, without adverse effects on speech intelligibility in quiet or in noise.
Formal auditory training in adult hearing aid users

Directory of Open Access Journals (Sweden)

Daniela Gil

2010-01-01

Full Text Available INTRODUCTION: Individuals with sensorineural hearing loss are often able to regain some lost auditory function with the help of hearing aids. However, hearing aids are not able to overcome auditory distortions such as impaired frequency resolution and speech understanding in noisy environments. The coexistence of peripheral hearing loss and a central auditory deficit may contribute to patient dissatisfaction with amplification, even when audiological tests indicate nearly normal hearing thresholds. OBJECTIVE: This study was designed to validate the effects of a formal auditory training program in adult hearing aid users with mild to moderate sensorineural hearing loss. METHODS: Fourteen bilateral hearing aid users were divided into two groups: seven who received auditory training and seven who did not. The training program was designed to improve auditory closure, figure-to-ground for verbal and nonverbal sounds and temporal processing (frequency and duration of sounds. Pre- and post-training evaluations included measuring electrophysiological and behavioral auditory processing and administration of the Abbreviated Profile of Hearing Aid Benefit (APHAB self-report scale. RESULTS: The post-training evaluation of the experimental group demonstrated a statistically significant reduction in P3 latency, improved performance in some of the behavioral auditory processing tests and higher hearing aid benefit in noisy situations (p-value < 0,05. No changes were noted for the control group (p-value <0,05. CONCLUSION: The results demonstrated that auditory training in adult hearing aid users can lead to a reduction in P3 latency, improvements in sound localization, memory for nonverbal sounds in sequence, auditory closure, figure-to-ground for verbal sounds and greater benefits in reverberant and noisy environments.
'Who's a good boy?!' Dogs prefer naturalistic dog-directed speech.

Science.gov (United States)

Benjamin, Alex; Slocombe, Katie

2018-05-01

Infant-directed speech (IDS) is a special speech register thought to aid language acquisition and improve affiliation in human infants. Although IDS shares some of its properties with dog-directed speech (DDS), it is unclear whether the production of DDS is functional, or simply an overgeneralisation of IDS within Western cultures. One recent study found that, while puppies attended more to a script read with DDS compared with adult-directed speech (ADS), adult dogs displayed no preference. In contrast, using naturalistic speech and a more ecologically valid set-up, we found that adult dogs attended to and showed more affiliative behaviour towards a speaker of DDS than of ADS. To explore whether this preference for DDS was modulated by the dog-specific words typically used in DDS, the acoustic features (prosody) of DDS or a combination of the two, we conducted a second experiment. Here the stimuli from experiment 1 were produced with reversed prosody, meaning the prosody and content of ADS and DDS were mismatched. The results revealed no significant effect of speech type, or content, suggesting that it is maybe the combination of the acoustic properties and the dog-related content of DDS that modulates the preference shown for naturalistic DDS. Overall, the results of this study suggest that naturalistic DDS, comprising of both dog-directed prosody and dog-relevant content words, improves dogs' attention and may strengthen the affiliative bond between humans and their pets.
The pathways for intelligible speech: multivariate and univariate perspectives.

Science.gov (United States)

Evans, S; Kyong, J S; Rosen, S; Golestani, N; Warren, J E; McGettigan, C; Mourão-Miranda, J; Wise, R J S; Scott, S K

2014-09-01

An anterior pathway, concerned with extracting meaning from sound, has been identified in nonhuman primates. An analogous pathway has been suggested in humans, but controversy exists concerning the degree of lateralization and the precise location where responses to intelligible speech emerge. We have demonstrated that the left anterior superior temporal sulcus (STS) responds preferentially to intelligible speech (Scott SK, Blank CC, Rosen S, Wise RJS. 2000. Identification of a pathway for intelligible speech in the left temporal lobe. Brain. 123:2400-2406.). A functional magnetic resonance imaging study in Cerebral Cortex used equivalent stimuli and univariate and multivariate analyses to argue for the greater importance of bilateral posterior when compared with the left anterior STS in responding to intelligible speech (Okada K, Rong F, Venezia J, Matchin W, Hsieh IH, Saberi K, Serences JT,Hickok G. 2010. Hierarchical organization of human auditory cortex: evidence from acoustic invariance in the response to intelligible speech. 20: 2486-2495.). Here, we also replicate our original study, demonstrating that the left anterior STS exhibits the strongest univariate response and, in decoding using the bilateral temporal cortex, contains the most informative voxels showing an increased response to intelligible speech. In contrast, in classifications using local "searchlights" and a whole brain analysis, we find greater classification accuracy in posterior rather than anterior temporal regions. Thus, we show that the precise nature of the multivariate analysis used will emphasize different response profiles associated with complex sound to speech processing. © The Author 2013. Published by Oxford University Press.
Improving on hidden Markov models: An articulatorily constrained, maximum likelihood approach to speech recognition and speech coding

Energy Technology Data Exchange (ETDEWEB)

Hogden, J.

1996-11-05

The goal of the proposed research is to test a statistical model of speech recognition that incorporates the knowledge that speech is produced by relatively slow motions of the tongue, lips, and other speech articulators. This model is called Maximum Likelihood Continuity Mapping (Malcom). Many speech researchers believe that by using constraints imposed by articulator motions, we can improve or replace the current hidden Markov model based speech recognition algorithms. Unfortunately, previous efforts to incorporate information about articulation into speech recognition algorithms have suffered because (1) slight inaccuracies in our knowledge or the formulation of our knowledge about articulation may decrease recognition performance, (2) small changes in the assumptions underlying models of speech production can lead to large changes in the speech derived from the models, and (3) collecting measurements of human articulator positions in sufficient quantity for training a speech recognition algorithm is still impractical. The most interesting (and in fact, unique) quality of Malcom is that, even though Malcom makes use of a mapping between acoustics and articulation, Malcom can be trained to recognize speech using only acoustic data. By learning the mapping between acoustics and articulation using only acoustic data, Malcom avoids the difficulties involved in collecting articulator position measurements and does not require an articulatory synthesizer model to estimate the mapping between vocal tract shapes and speech acoustics. Preliminary experiments that demonstrate that Malcom can learn the mapping between acoustics and articulation are discussed. Potential applications of Malcom aside from speech recognition are also discussed. Finally, specific deliverables resulting from the proposed research are described.
Improved Methods for Pitch Synchronous Linear Prediction Analysis of Speech

OpenAIRE

劉, 麗清

2015-01-01

Linear prediction (LP) analysis has been applied to speech system over the last few decades. LP technique is well-suited for speech analysis due to its ability to model speech production process approximately. Hence LP analysis has been widely used for speech enhancement, low-bit-rate speech coding in cellular telephony, speech recognition, characteristic parameter extraction (vocal tract resonances frequencies, fundamental frequency called pitch) and so on. However, the performance of the co...
Modeling Speech Level as a Function of Background Noise Level and Talker-to-Listener Distance for Talkers Wearing Hearing Protection Devices

Science.gov (United States)

Bouserhal, Rachel E.; Bockstael, Annelies; MacDonald, Ewen; Falk, Tiago H.; Voix, Jérémie

2017-01-01

Purpose: Studying the variations in speech levels with changing background noise level and talker-to-listener distance for talkers wearing hearing protection devices (HPDs) can aid in understanding communication in background noise. Method: Speech was recorded using an intra-aural HPD from 12 different talkers at 5 different distances in 3…
Preferred Compression Speed for Speech and Music and Its Relationship to Sensitivity to Temporal Fine Structure.

Science.gov (United States)

Moore, Brian C J; Sęk, Aleksander

2016-09-07

Multichannel amplitude compression is widely used in hearing aids. The preferred compression speed varies across individuals. Moore (2008) suggested that reduced sensitivity to temporal fine structure (TFS) may be associated with preference for slow compression. This idea was tested using a simulated hearing aid. It was also assessed whether preferences for compression speed depend on the type of stimulus: speech or music. Twenty-two hearing-impaired subjects were tested, and the stimulated hearing aid was fitted individually using the CAM2A method. On each trial, a given segment of speech or music was presented twice. One segment was processed with fast compression and the other with slow compression, and the order was balanced across trials. The subject indicated which segment was preferred and by how much. On average, slow compression was preferred over fast compression, more so for music, but there were distinct individual differences, which were highly correlated for speech and music. Sensitivity to TFS was assessed using the difference limen for frequency at 2000 Hz and by two measures of sensitivity to interaural phase at low frequencies. The results for the difference limens for frequency, but not the measures of sensitivity to interaural phase, supported the suggestion that preference for compression speed is affected by sensitivity to TFS. © The Author(s) 2016.
Assessment of the Speech Intelligibility Performance of Post Lingual Cochlear Implant Users at Different Signal-to-Noise Ratios Using the Turkish Matrix Test

Directory of Open Access Journals (Sweden)

Zahra Polat

2016-10-01

Full Text Available Background: Spoken word recognition and speech perception tests in quiet are being used as a routine in assessment of the benefit which children and adult cochlear implant users receive from their devices. Cochlear implant users generally demonstrate high level performances in these test materials as they are able to achieve high level speech perception ability in quiet situations. Although these test materials provide valuable information regarding Cochlear Implant (CI users’ performances in optimal listening conditions, they do not give realistic information regarding performances in adverse listening conditions, which is the case in the everyday environment. Aims: The aim of this study was to assess the speech intelligibility performance of post lingual CI users in the presence of noise at different signal-to-noise ratio with the Matrix Test developed for Turkish language. Study Design: Cross-sectional study. Methods: The thirty post lingual implant user adult subjects, who had been using implants for a minimum of one year, were evaluated with Turkish Matrix test. Subjects’ speech intelligibility was measured using the adaptive and non-adaptive Matrix Test in quiet and noisy environments. Results: The results of the study show a correlation between Pure Tone Average (PTA values of the subjects and Matrix test Speech Reception Threshold (SRT values in the quiet. Hence, it is possible to asses PTA values of CI users using the Matrix Test also. However, no correlations were found between Matrix SRT values in the quiet and Matrix SRT values in noise. Similarly, the correlation between PTA values and intelligibility scores in noise was also not significant. Therefore, it may not be possible to assess the intelligibility performance of CI users using test batteries performed in quiet conditions. Conclusion: The Matrix Test can be used to assess the benefit of CI users from their systems in everyday life, since it is possible to perform
Speech emotion recognition methods: A literature review

Science.gov (United States)

Basharirad, Babak; Moradhaseli, Mohammadreza

2017-10-01

Recently, attention of the emotional speech signals research has been boosted in human machine interfaces due to availability of high computation capability. There are many systems proposed in the literature to identify the emotional state through speech. Selection of suitable feature sets, design of a proper classifications methods and prepare an appropriate dataset are the main key issues of speech emotion recognition systems. This paper critically analyzed the current available approaches of speech emotion recognition methods based on the three evaluating parameters (feature set, classification of features, accurately usage). In addition, this paper also evaluates the performance and limitations of available methods. Furthermore, it highlights the current promising direction for improvement of speech emotion recognition systems.
Apraxia of Speech

Science.gov (United States)

... Health Info » Voice, Speech, and Language Apraxia of Speech On this page: What is apraxia of speech? ... about apraxia of speech? What is apraxia of speech? Apraxia of speech (AOS)—also known as acquired ...
The interaction between awareness of one's own speech disorder with linguistics variables: distinctive features and severity of phonological disorder.

Science.gov (United States)

Dias, Roberta Freitas; Melo, Roberta Michelon; Mezzomo, Carolina Lisbôa; Mota, Helena Bolli

2013-01-01

To analyze the possible relationship among the awareness of one's own speech disorder and some aspects of the phonological system, as the number and the type of changed distinctive features, as well as the interaction among the severity of the disorder and the non-specification of distinctive features. The analyzed group has 23 children with diagnosis of speech disorder, aged 5:0 to 7:7. The speech data were analyzed through the Distinctive Features Analysis and classified by the Percentage of Correct Consonants. One also applied the Awareness of one's own speech disorder test. The children were separated in two groups: with awareness of their own speech disorder established (more than 50% of correct identification) and without awareness of their own speech disorder established (less than 50% of correct identification). Finally, the variables of this research were submitted to analysis using descriptive and inferential statistics. The type of changed distinctive features weren't different between the groups, as well as the total of changed features and the severity disorder. However, a correlation between the severity disorder and the non-specification of distinctive features was verified, because the more severe disorders have more changes in these linguistic variables. The awareness of one's own speech disorder doesn't seem to be directly influenced by the type and by the number of changed distinctive features, neither by the speech disorder severity. Moreover, one verifies that the greater phonological disorder severity, the greater the number of changed distinctive features.
Evaluation of Speech Recognition of Cochlear Implant Recipients Using Adaptive, Digital Remote Microphone Technology and a Speech Enhancement Sound Processing Algorithm.

Science.gov (United States)

Wolfe, Jace; Morais, Mila; Schafer, Erin; Agrawal, Smita; Koch, Dawn

2015-05-01

Cochlear implant recipients often experience difficulty with understanding speech in the presence of noise. Cochlear implant manufacturers have developed sound processing algorithms designed to improve speech recognition in noise, and research has shown these technologies to be effective. Remote microphone technology utilizing adaptive, digital wireless radio transmission has also been shown to provide significant improvement in speech recognition in noise. There are no studies examining the potential improvement in speech recognition in noise when these two technologies are used simultaneously. The goal of this study was to evaluate the potential benefits and limitations associated with the simultaneous use of a sound processing algorithm designed to improve performance in noise (Advanced Bionics ClearVoice) and a remote microphone system that incorporates adaptive, digital wireless radio transmission (Phonak Roger). A two-by-two way repeated measures design was used to examine performance differences obtained without these technologies compared to the use of each technology separately as well as the simultaneous use of both technologies. Eleven Advanced Bionics (AB) cochlear implant recipients, ages 11 to 68 yr. AzBio sentence recognition was measured in quiet and in the presence of classroom noise ranging in level from 50 to 80 dBA in 5-dB steps. Performance was evaluated in four conditions: (1) No ClearVoice and no Roger, (2) ClearVoice enabled without the use of Roger, (3) ClearVoice disabled with Roger enabled, and (4) simultaneous use of ClearVoice and Roger. Speech recognition in quiet was better than speech recognition in noise for all conditions. Use of ClearVoice and Roger each provided significant improvement in speech recognition in noise. The best performance in noise was obtained with the simultaneous use of ClearVoice and Roger. ClearVoice and Roger technology each improves speech recognition in noise, particularly when used at the same time
Single-Sided Deafness: Impact of Cochlear Implantation on Speech Perception in Complex Noise and on Auditory Localization Accuracy.

Science.gov (United States)

Döge, Julia; Baumann, Uwe; Weissgerber, Tobias; Rader, Tobias

2017-12-01

To assess auditory localization accuracy and speech reception threshold (SRT) in complex noise conditions in adult patients with acquired single-sided deafness, after intervention with a cochlear implant (CI) in the deaf ear. Nonrandomized, open, prospective patient series. Tertiary referral university hospital. Eleven patients with late-onset single-sided deafness (SSD) and normal hearing in the unaffected ear, who received a CI. All patients were experienced CI users. Unilateral cochlear implantation. Speech perception was tested in a complex multitalker equivalent noise field consisting of multiple sound sources. Speech reception thresholds in noise were determined in aided (with CI) and unaided conditions. Localization accuracy was assessed in complete darkness. Acoustic stimuli were radiated by multiple loudspeakers distributed in the frontal horizontal plane between -60 and +60 degrees. In the aided condition, results show slightly improved speech reception scores compared with the unaided condition in most of the patients. For 8 of the 11 subjects, SRT was improved between 0.37 and 1.70 dB. Three of the 11 subjects showed deteriorations between 1.22 and 3.24 dB SRT. Median localization error decreased significantly by 12.9 degrees compared with the unaided condition. CI in single-sided deafness is an effective treatment to improve the auditory localization accuracy. Speech reception in complex noise conditions is improved to a lesser extent in 73% of the participating CI SSD patients. However, the absence of true binaural interaction effects (summation, squelch) impedes further improvements. The development of speech processing strategies that respect binaural interaction seems to be mandatory to advance speech perception in demanding listening situations in SSD patients.
Muon identification and performance in the ATLAS experiment

CERN Document Server

Rettie, Sebastien; The ATLAS collaboration

2018-01-01

Muon reconstruction and identification play a fundamental role in many analyses of central importance in the LHC run-2 Physics programme. The algorithms and the criteria used in ATLAS for the reconstruction and identification of muons with transverse momentum from a few GeV to the TeV scale will be presented. Their performance is measured in data based on the decays of Z and J/$\\psi$ to a pair of muons, that provide a large statistics calibration sample. Reconstruction and identification efficiencies are evaluated, as well as momentum scales and resolutions, and the results are used to derive precise MC simulation corrections. Isolation selection criteria and their performances in presence of high pileup will also be presented.
A harmonic excitation state-space approach to blind separation of speech

DEFF Research Database (Denmark)

Olsson, Rasmus Kongsgaard; Hansen, Lars Kai

2005-01-01

We discuss an identification framework for noisy speech mixtures. A block-based generative model is formulated that explicitly incorporates the time-varying harmonic plus noise (H+N) model for a number of latent sources observed through noisy convolutive mixtures. All parameters including...
Does the acceptable noise level (ANL) predict hearing-aid use?

DEFF Research Database (Denmark)

Olsen, Steen Østergaard; Brännström, K Jonas

2014-01-01

OBJECTIVE: It has been suggested that individuals have an inherent acceptance of noise in the presence of speech, and that different acceptance of noise results in different hearing-aid (HA) use. The acceptable noise level (ANL) has been proposed for measurement of this property. It has been...... claimed that the ANL magnitude can predict hearing-aid use patterns. Many papers have been published reporting on different aspects of ANL, but none have challenged the predictive power of ANL. The purpose of this study was to discuss whether ANL can predict HA use and how more reliable ANL results can...... reviewed journals as well as a number of papers from trade journals, posters and oral presentations from audiology conventions. CONCLUSIONS: An inherent acceptance of noise in the presence of speech may exist, but no method for precise measurement of ANL is available. The ANL model for prediction of HA use...
Source-system windowing for speech analysis

NARCIS (Netherlands)

Yegnanarayana, B.; Satyanarayana Murthy, P.; Eggen, J.H.

1993-01-01

In this paper we propose a speech-analysis method to bring out characteristics of the vocal tract system in short segments which are much less than a pitch period. The method performs windowing in the source and system components of the speech signal and recombines them to obtain a signal reflecting
Smartphone Application for the Analysis of Prosodic Features in Running Speech with a Focus on Bipolar Disorders: System Performance Evaluation and Case Study.

Science.gov (United States)

Guidi, Andrea; Salvi, Sergio; Ottaviano, Manuel; Gentili, Claudio; Bertschy, Gilles; de Rossi, Danilo; Scilingo, Enzo Pasquale; Vanello, Nicola

2015-11-06

Bipolar disorder is one of the most common mood disorders characterized by large and invalidating mood swings. Several projects focus on the development of decision support systems that monitor and advise patients, as well as clinicians. Voice monitoring and speech signal analysis can be exploited to reach this goal. In this study, an Android application was designed for analyzing running speech using a smartphone device. The application can record audio samples and estimate speech fundamental frequency, F0, and its changes. F0-related features are estimated locally on the smartphone, with some advantages with respect to remote processing approaches in terms of privacy protection and reduced upload costs. The raw features can be sent to a central server and further processed. The quality of the audio recordings, algorithm reliability and performance of the overall system were evaluated in terms of voiced segment detection and features estimation. The results demonstrate that mean F0 from each voiced segment can be reliably estimated, thus describing prosodic features across the speech sample. Instead, features related to F0 variability within each voiced segment performed poorly. A case study performed on a bipolar patient is presented.
Smartphone Application for the Analysis of Prosodic Features in Running Speech with a Focus on Bipolar Disorders: System Performance Evaluation and Case Study

Science.gov (United States)

Guidi, Andrea; Salvi, Sergio; Ottaviano, Manuel; Gentili, Claudio; Bertschy, Gilles; de Rossi, Danilo; Scilingo, Enzo Pasquale; Vanello, Nicola

2015-01-01

Bipolar disorder is one of the most common mood disorders characterized by large and invalidating mood swings. Several projects focus on the development of decision support systems that monitor and advise patients, as well as clinicians. Voice monitoring and speech signal analysis can be exploited to reach this goal. In this study, an Android application was designed for analyzing running speech using a smartphone device. The application can record audio samples and estimate speech fundamental frequency, F0, and its changes. F0-related features are estimated locally on the smartphone, with some advantages with respect to remote processing approaches in terms of privacy protection and reduced upload costs. The raw features can be sent to a central server and further processed. The quality of the audio recordings, algorithm reliability and performance of the overall system were evaluated in terms of voiced segment detection and features estimation. The results demonstrate that mean F0 from each voiced segment can be reliably estimated, thus describing prosodic features across the speech sample. Instead, features related to F0 variability within each voiced segment performed poorly. A case study performed on a bipolar patient is presented. PMID:26561811

Smartphone Application for the Analysis of Prosodic Features in Running Speech with a Focus on Bipolar Disorders: System Performance Evaluation and Case Study

Directory of Open Access Journals (Sweden)

Andrea Guidi

2015-11-01

Full Text Available Bipolar disorder is one of the most common mood disorders characterized by large and invalidating mood swings. Several projects focus on the development of decision support systems that monitor and advise patients, as well as clinicians. Voice monitoring and speech signal analysis can be exploited to reach this goal. In this study, an Android application was designed for analyzing running speech using a smartphone device. The application can record audio samples and estimate speech fundamental frequency, F0, and its changes. F0-related features are estimated locally on the smartphone, with some advantages with respect to remote processing approaches in terms of privacy protection and reduced upload costs. The raw features can be sent to a central server and further processed. The quality of the audio recordings, algorithm reliability and performance of the overall system were evaluated in terms of voiced segment detection and features estimation. The results demonstrate that mean F0 from each voiced segment can be reliably estimated, thus describing prosodic features across the speech sample. Instead, features related to F0 variability within each voiced segment performed poorly. A case study performed on a bipolar patient is presented.
The feasibility of miniaturizing the versatile portable speech prosthesis: A market survey of commercial products

Science.gov (United States)

Walklet, T.

1981-01-01

The feasibility of a miniature versatile portable speech prosthesis (VPSP) was analyzed and information on its potential users and on other similar devices was collected. The VPSP is a device that incorporates speech synthesis technology. The objective is to provide sufficient information to decide whether there is valuable technology to contribute to the miniaturization of the VPSP. The needs of potential users are identified, the development status of technologies similar or related to those used in the VPSP are evaluated. The VPSP, a computer based speech synthesis system fits on a wheelchair. The purpose was to produce a device that provides communication assistance in educational, vocational, and social situations to speech impaired individuals. It is expected that the VPSP can be a valuable aid for persons who are also motor impaired, which explains the placement of the system on a wheelchair.
Evaluation of a speaker identification system with and without fusion using three databases in the presence of noise and handset effects

Science.gov (United States)

S. Al-Kaltakchi, Musab T.; Woo, Wai L.; Dlay, Satnam; Chambers, Jonathon A.

2017-12-01

In this study, a speaker identification system is considered consisting of a feature extraction stage which utilizes both power normalized cepstral coefficients (PNCCs) and Mel frequency cepstral coefficients (MFCC). Normalization is applied by employing cepstral mean and variance normalization (CMVN) and feature warping (FW), together with acoustic modeling using a Gaussian mixture model-universal background model (GMM-UBM). The main contributions are comprehensive evaluations of the effect of both additive white Gaussian noise (AWGN) and non-stationary noise (NSN) (with and without a G.712 type handset) upon identification performance. In particular, three NSN types with varying signal to noise ratios (SNRs) were tested corresponding to street traffic, a bus interior, and a crowded talking environment. The performance evaluation also considered the effect of late fusion techniques based on score fusion, namely, mean, maximum, and linear weighted sum fusion. The databases employed were TIMIT, SITW, and NIST 2008; and 120 speakers were selected from each database to yield 3600 speech utterances. As recommendations from the study, mean fusion is found to yield overall best performance in terms of speaker identification accuracy (SIA) with noisy speech, whereas linear weighted sum fusion is overall best for original database recordings.
Tracking Change in Children with Severe and Persisting Speech Difficulties

Science.gov (United States)

Newbold, Elisabeth Joy; Stackhouse, Joy; Wells, Bill

2013-01-01

Standardised tests of whole-word accuracy are popular in the speech pathology and developmental psychology literature as measures of children's speech performance. However, they may not be sensitive enough to measure changes in speech output in children with severe and persisting speech difficulties (SPSD). To identify the best ways of doing this,…
Speech recognition in individuals with sensorineural hearing loss.

Science.gov (United States)

de Andrade, Adriana Neves; Iorio, Maria Cecilia Martinelli; Gil, Daniela

2016-01-01

Hearing loss can negatively influence the communication performance of individuals, who should be evaluated with suitable material and in situations of listening close to those found in everyday life. To analyze and compare the performance of patients with mild-to-moderate sensorineural hearing loss in speech recognition tests carried out in silence and with noise, according to the variables ear (right and left) and type of stimulus presentation. The study included 19 right-handed individuals with mild-to-moderate symmetrical bilateral sensorineural hearing loss, submitted to the speech recognition test with words in different modalities and speech test with white noise and pictures. There was no significant difference between right and left ears in any of the tests. The mean number of correct responses in the speech recognition test with pictures, live voice, and recorded monosyllables was 97.1%, 85.9%, and 76.1%, respectively, whereas after the introduction of noise, the performance decreased to 72.6% accuracy. The best performances in the Speech Recognition Percentage Index were obtained using monosyllabic stimuli, represented by pictures presented in silence, with no significant differences between the right and left ears. After the introduction of competitive noise, there was a decrease in individuals' performance. Copyright © 2015 Associação Brasileira de Otorrinolaringologia e Cirurgia Cérvico-Facial. Published by Elsevier Editora Ltda. All rights reserved.
Common neural substrates support speech and non-speech vocal tract gestures.

Science.gov (United States)

Chang, Soo-Eun; Kenney, Mary Kay; Loucks, Torrey M J; Poletto, Christopher J; Ludlow, Christy L

2009-08-01

The issue of whether speech is supported by the same neural substrates as non-speech vocal tract gestures has been contentious. In this fMRI study we tested whether producing non-speech vocal tract gestures in humans shares the same functional neuroanatomy as non-sense speech syllables. Production of non-speech vocal tract gestures, devoid of phonological content but similar to speech in that they had familiar acoustic and somatosensory targets, was compared to the production of speech syllables without meaning. Brain activation related to overt production was captured with BOLD fMRI using a sparse sampling design for both conditions. Speech and non-speech were compared using voxel-wise whole brain analyses, and ROI analyses focused on frontal and temporoparietal structures previously reported to support speech production. Results showed substantial activation overlap between speech and non-speech function in regions. Although non-speech gesture production showed greater extent and amplitude of activation in the regions examined, both speech and non-speech showed comparable left laterality in activation for both target perception and production. These findings posit a more general role of the previously proposed "auditory dorsal stream" in the left hemisphere--to support the production of vocal tract gestures that are not limited to speech processing.
Whole-exome sequencing supports genetic heterogeneity in childhood apraxia of speech.

Science.gov (United States)

Worthey, Elizabeth A; Raca, Gordana; Laffin, Jennifer J; Wilk, Brandon M; Harris, Jeremy M; Jakielski, Kathy J; Dimmock, David P; Strand, Edythe A; Shriberg, Lawrence D

2013-10-02

Childhood apraxia of speech (CAS) is a rare, severe, persistent pediatric motor speech disorder with associated deficits in sensorimotor, cognitive, language, learning and affective processes. Among other neurogenetic origins, CAS is the disorder segregating with a mutation in FOXP2 in a widely studied, multigenerational London family. We report the first whole-exome sequencing (WES) findings from a cohort of 10 unrelated participants, ages 3 to 19 years, with well-characterized CAS. As part of a larger study of children and youth with motor speech sound disorders, 32 participants were classified as positive for CAS on the basis of a behavioral classification marker using auditory-perceptual and acoustic methods that quantify the competence, precision and stability of a speaker's speech, prosody and voice. WES of 10 randomly selected participants was completed using the Illumina Genome Analyzer IIx Sequencing System. Image analysis, base calling, demultiplexing, read mapping, and variant calling were performed using Illumina software. Software developed in-house was used for variant annotation, prioritization and interpretation to identify those variants likely to be deleterious to neurodevelopmental substrates of speech-language development. Among potentially deleterious variants, clinically reportable findings of interest occurred on a total of five chromosomes (Chr3, Chr6, Chr7, Chr9 and Chr17), which included six genes either strongly associated with CAS (FOXP1 and CNTNAP2) or associated with disorders with phenotypes overlapping CAS (ATP13A4, CNTNAP1, KIAA0319 and SETX). A total of 8 (80%) of the 10 participants had clinically reportable variants in one or two of the six genes, with variants in ATP13A4, KIAA0319 and CNTNAP2 being the most prevalent. Similar to the results reported in emerging WES studies of other complex neurodevelopmental disorders, our findings from this first WES study of CAS are interpreted as support for heterogeneous genetic origins of
Aging and Spectro-Temporal Integration of Speech

Directory of Open Access Journals (Sweden)

John H. Grose

2016-10-01

Full Text Available The purpose of this study was to determine the effects of age on the spectro-temporal integration of speech. The hypothesis was that the integration of speech fragments distributed over frequency, time, and ear of presentation is reduced in older listeners—even for those with good audiometric hearing. Younger, middle-aged, and older listeners (10 per group with good audiometric hearing participated. They were each tested under seven conditions that encompassed combinations of spectral, temporal, and binaural integration. Sentences were filtered into two bands centered at 500 Hz and 2500 Hz, with criterion bandwidth tailored for each participant. In some conditions, the speech bands were individually square wave interrupted at a rate of 10 Hz. Configurations of uninterrupted, synchronously interrupted, and asynchronously interrupted frequency bands were constructed that constituted speech fragments distributed across frequency, time, and ear of presentation. The over-arching finding was that, for most configurations, performance was not differentially affected by listener age. Although speech intelligibility varied across condition, there was no evidence of performance deficits in older listeners in any condition. This study indicates that age, per se, does not necessarily undermine the ability to integrate fragments of speech dispersed across frequency and time.
Assessing Speech Intelligibility in Children with Hearing Loss: Toward Revitalizing a Valuable Clinical Tool

Science.gov (United States)

Ertmer, David J.

2011-01-01

Background: Newborn hearing screening, early intervention programs, and advancements in cochlear implant and hearing aid technology have greatly increased opportunities for children with hearing loss to become intelligible talkers. Optimizing speech intelligibility requires that progress be monitored closely. Although direct assessment of…
Stimulus variability and the phonetic relevance hypothesis: effects of variability in speaking style, fundamental frequency, and speaking rate on spoken word identification.

Science.gov (United States)

Sommers, Mitchell S; Barcroft, Joe

2006-04-01

Three experiments were conducted to examine the effects of trial-to-trial variations in speaking style, fundamental frequency, and speaking rate on identification of spoken words. In addition, the experiments investigated whether any effects of stimulus variability would be modulated by phonetic confusability (i.e., lexical difficulty). In Experiment 1, trial-to-trial variations in speaking style reduced the overall identification performance compared with conditions containing no speaking-style variability. In addition, the effects of variability were greater for phonetically confusable words than for phonetically distinct words. In Experiment 2, variations in fundamental frequency were found to have no significant effects on spoken word identification and did not interact with lexical difficulty. In Experiment 3, two different methods for varying speaking rate were found to have equivalent negative effects on spoken word recognition and similar interactions with lexical difficulty. Overall, the findings are consistent with a phonetic-relevance hypothesis, in which accommodating sources of acoustic-phonetic variability that affect phonetically relevant properties of speech signals can impair spoken word identification. In contrast, variability in parameters of the speech signal that do not affect phonetically relevant properties are not expected to affect overall identification performance. Implications of these findings for the nature and development of lexical representations are discussed.
Memory performance on the Auditory Inference Span Test is independent of background noise type for young adults with normal hearing at high speech intelligibility.

Science.gov (United States)

Rönnberg, Niklas; Rudner, Mary; Lunner, Thomas; Stenfelt, Stefan

2014-01-01

Listening in noise is often perceived to be effortful. This is partly because cognitive resources are engaged in separating the target signal from background noise, leaving fewer resources for storage and processing of the content of the message in working memory. The Auditory Inference Span Test (AIST) is designed to assess listening effort by measuring the ability to maintain and process heard information. The aim of this study was to use AIST to investigate the effect of background noise types and signal-to-noise ratio (SNR) on listening effort, as a function of working memory capacity (WMC) and updating ability (UA). The AIST was administered in three types of background noise: steady-state speech-shaped noise, amplitude modulated speech-shaped noise, and unintelligible speech. Three SNRs targeting 90% speech intelligibility or better were used in each of the three noise types, giving nine different conditions. The reading span test assessed WMC, while UA was assessed with the letter memory test. Twenty young adults with normal hearing participated in the study. Results showed that AIST performance was not influenced by noise type at the same intelligibility level, but became worse with worse SNR when background noise was speech-like. Performance on AIST also decreased with increasing memory load level. Correlations between AIST performance and the cognitive measurements suggested that WMC is of more importance for listening when SNRs are worse, while UA is of more importance for listening in easier SNRs. The results indicated that in young adults with normal hearing, the effort involved in listening in noise at high intelligibility levels is independent of the noise type. However, when noise is speech-like and intelligibility decreases, listening effort increases, probably due to extra demands on cognitive resources added by the informational masking created by the speech fragments and vocal sounds in the background noise.
Memory performance on the Auditory Inference Span Test is independent of background noise type for young adults with normal hearing at high speech intelligibility

Directory of Open Access Journals (Sweden)

Niklas eRönnberg

2014-12-01

Full Text Available Listening in noise is often perceived to be effortful. This is partly because cognitive resources are engaged in separating the target signal from background noise, leaving fewer resources for storage and processing of the content of the message in working memory. The Auditory Inference Span Test (AIST is designed to assess listening effort by measuring the ability to maintain and process heard information. The aim of this study was to use AIST to investigate the effect of background noise types and signal-to-noise ratio (SNR on listening effort, as a function of working memory capacity (WMC and updating ability (UA. The AIST was administered in three types of background noise: steady-state speech-shaped noise, amplitude modulated speech-shaped noise, and unintelligible speech. Three SNRs targeting 90% speech intelligibility or better were used in each of the three noise types, giving nine different conditions. The reading span test assessed WMC, while UA was assessed with the letter memory test. Twenty young adults with normal hearing participated in the study. Results showed that AIST performance was not influenced by noise type at the same intelligibility level, but became worse with worse SNR when background noise was speech-like. Performance on AIST also decreased with increasing MLL. Correlations between AIST performance and the cognitive measurements suggested that WMC is of more importance for listening when SNRs are worse, while UA is of more importance for listening in easier SNRs. The results indicated that in young adults with normal hearing, the effort involved in listening in noise at high intelligibility levels is independent of the noise type. However, when noise is speech-like and intelligibility decreases, listening effort increases, probably due to extra demands on cognitive resources added by the informational masking created by the speech-fragments and vocal sounds in the background noise.
FBI fingerprint identification automation study: AIDS 3 evaluation report. Volume 9: Functional requirements

Science.gov (United States)

1980-01-01

The current system and subsystem used by the Identification Division are described. System constraints that dictate the system environment are discussed and boundaries within which solutions must be found are described. The functional requirements were related to the performance requirements. These performance requirements were then related to their applicable subsystems. The flow of data, documents, or other pieces of information from one subsystem to another or from the external world into the identification system is described. Requirements and design standards for a computer based system are presented.
Theoretical Issues of Validity in the Measurement of Aided Speech Reception Threshold in Noise for Comparing Nonlinear Hearing Aid Systems.

Science.gov (United States)

Naylor, Graham

2016-07-01

Adaptive Speech Reception Threshold in noise (SRTn) measurements are often used to make comparisons between alternative hearing aid (HA) systems. Such measurements usually do not constrain the signal-to-noise ratio (SNR) at which testing takes place. Meanwhile, HA systems increasingly include nonlinear features that operate differently in different SNRs, and listeners differ in their inherent SNR requirements. To show that SRTn measurements, as commonly used in comparisons of alternative HA systems, suffer from threats to their validity, to illustrate these threats with examples of potentially invalid conclusions in the research literature, and to propose ways to tackle these threats. An examination of the nature of SRTn measurements in the context of test theory, modern nonlinear HAs, and listener diversity. Examples from the audiological research literature were used to estimate typical interparticipant variation in SRTn and to illustrate cases where validity may have been compromised. There can be no doubt that SRTn measurements, when used to compare nonlinear HA systems, in principle, suffer from threats to their internal and external/ecological validity. Interactions between HA nonlinearities and SNR, and interparticipant differences in inherent SNR requirements, can act to generate misleading results. In addition, SRTn may lie at an SNR outside the range for which the HA system is designed or expected to operate in. Although the extent of invalid conclusions in the literature is difficult to evaluate, examples of studies were nevertheless identified where the risk of each form of invalidity is significant. Reliable data on ecological SNRs is becoming available, so that ecological validity can be assessed. Methodological developments that can reduce the risk of invalid conclusions include variations on the SRTn measurement procedure itself, manipulations of stimulus or scoring conditions to place SRTn in an ecologically relevant range, and design and analysis
Speech Silicon: An FPGA Architecture for Real-Time Hidden Markov-Model-Based Speech Recognition

Directory of Open Access Journals (Sweden)

Schuster Jeffrey

2006-01-01

Full Text Available This paper examines the design of an FPGA-based system-on-a-chip capable of performing continuous speech recognition on medium sized vocabularies in real time. Through the creation of three dedicated pipelines, one for each of the major operations in the system, we were able to maximize the throughput of the system while simultaneously minimizing the number of pipeline stalls in the system. Further, by implementing a token-passing scheme between the later stages of the system, the complexity of the control was greatly reduced and the amount of active data present in the system at any time was minimized. Additionally, through in-depth analysis of the SPHINX 3 large vocabulary continuous speech recognition engine, we were able to design models that could be efficiently benchmarked against a known software platform. These results, combined with the ability to reprogram the system for different recognition tasks, serve to create a system capable of performing real-time speech recognition in a vast array of environments.
Speech Silicon: An FPGA Architecture for Real-Time Hidden Markov-Model-Based Speech Recognition

Directory of Open Access Journals (Sweden)

Alex K. Jones

2006-11-01

Full Text Available This paper examines the design of an FPGA-based system-on-a-chip capable of performing continuous speech recognition on medium sized vocabularies in real time. Through the creation of three dedicated pipelines, one for each of the major operations in the system, we were able to maximize the throughput of the system while simultaneously minimizing the number of pipeline stalls in the system. Further, by implementing a token-passing scheme between the later stages of the system, the complexity of the control was greatly reduced and the amount of active data present in the system at any time was minimized. Additionally, through in-depth analysis of the SPHINX 3 large vocabulary continuous speech recognition engine, we were able to design models that could be efficiently benchmarked against a known software platform. These results, combined with the ability to reprogram the system for different recognition tasks, serve to create a system capable of performing real-time speech recognition in a vast array of environments.
The speech signal segmentation algorithm using pitch synchronous analysis

Directory of Open Access Journals (Sweden)

Amirgaliyev Yedilkhan

2017-03-01

Full Text Available Parameterization of the speech signal using the algorithms of analysis synchronized with the pitch frequency is discussed. Speech parameterization is performed by the average number of zero transitions function and the signal energy function. Parameterization results are used to segment the speech signal and to isolate the segments with stable spectral characteristics. Segmentation results can be used to generate a digital voice pattern of a person or be applied in the automatic speech recognition. Stages needed for continuous speech segmentation are described.
21 CFR 886.5915 - Optical vision aid.

Science.gov (United States)

2010-04-01

... 21 Food and Drugs 8 2010-04-01 2010-04-01 false Optical vision aid. 886.5915 Section 886.5915 Food... DEVICES OPHTHALMIC DEVICES Therapeutic Devices § 886.5915 Optical vision aid. (a) Identification. An optical vision aid is a device that consists of a magnifying lens with an accompanying AC-powered or...
Performance factors of mobile rich media job aids for community health workers.

Science.gov (United States)

Florez-Arango, Jose F; Iyengar, M Sriram; Dunn, Kim; Zhang, Jiajie

2011-01-01

To study and analyze the possible benefits on performance of community health workers using point-of-care clinical guidelines implemented as interactive rich media job aids on small-format mobile platforms. A crossover study with one intervention (rich media job aids) and one control (traditional job aids), two periods, with 50 community health workers, each subject solving a total 15 standardized cases per period per period (30 cases in total per subject). Error rate per case and task, protocol compliance. A total of 1394 cases were evaluated. Intervention reduces errors by an average of 33.15% (p = 0.001) and increases protocol compliance 30.18% (p technologies in general, and the use of rich media clinical guidelines on cell phones in particular, for the improvement of community health worker performance in developing countries.
Audio-Visual Speech Recognition Using MPEG-4 Compliant Visual Features

Directory of Open Access Journals (Sweden)

Petar S. Aleksic

2002-11-01

Full Text Available We describe an audio-visual automatic continuous speech recognition system, which significantly improves speech recognition performance over a wide range of acoustic noise levels, as well as under clean audio conditions. The system utilizes facial animation parameters (FAPs supported by the MPEG-4 standard for the visual representation of speech. We also describe a robust and automatic algorithm we have developed to extract FAPs from visual data, which does not require hand labeling or extensive training procedures. The principal component analysis (PCA was performed on the FAPs in order to decrease the dimensionality of the visual feature vectors, and the derived projection weights were used as visual features in the audio-visual automatic speech recognition (ASR experiments. Both single-stream and multistream hidden Markov models (HMMs were used to model the ASR system, integrate audio and visual information, and perform a relatively large vocabulary (approximately 1000 words speech recognition experiments. The experiments performed use clean audio data and audio data corrupted by stationary white Gaussian noise at various SNRs. The proposed system reduces the word error rate (WER by 20% to 23% relatively to audio-only speech recognition WERs, at various SNRs (0Ã¢Â€Â“30 dB with additive white Gaussian noise, and by 19% relatively to audio-only speech recognition WER under clean audio conditions.

Decision support aids with anthropomorphic characteristics influence trust and performance in younger and older adults.

Science.gov (United States)

Pak, Richard; Fink, Nicole; Price, Margaux; Bass, Brock; Sturre, Lindsay

2012-01-01

This study examined the use of deliberately anthropomorphic automation on younger and older adults' trust, dependence and performance on a diabetes decision-making task. Research with anthropomorphic interface agents has shown mixed effects in judgments of preferences but has rarely examined effects on performance. Meanwhile, research in automation has shown some forms of anthropomorphism (e.g. etiquette) have effects on trust and dependence on automation. Participants answered diabetes questions with no-aid, a non-anthropomorphic aid or an anthropomorphised aid. Trust and dependence in the aid was measured. A minimally anthropomorphic aide primarily affected younger adults' trust in the aid. Dependence, however, for both age groups was influenced by the anthropomorphic aid. Automation that deliberately embodies person-like characteristics can influence trust and dependence on reasonably reliable automation. However, further research is necessary to better understand the specific aspects of the aid that affect different age groups. Automation that embodies human-like characteristics may be useful in situations where there is under-utilisation of reasonably reliable aids by enhancing trust and dependence in that aid. Practitioner Summary: The design of decision-support aids on consumer devices (e.g. smartphones) may influence the level of trust that users place in that system and their amount of use. This study is the first step in articulating how the design of aids may influence user's trust and use of such systems.
Source Separation via Spectral Masking for Speech Recognition Systems

Directory of Open Access Journals (Sweden)

Gustavo Fernandes Rodrigues

2012-12-01

Full Text Available In this paper we present an insight into the use of spectral masking techniques in time-frequency domain, as a preprocessing step for the speech signal recognition. Speech recognition systems have their performance negatively affected in noisy environments or in the presence of other speech signals. The limits of these masking techniques for different levels of the signal-to-noise ratio are discussed. We show the robustness of the spectral masking techniques against four types of noise: white, pink, brown and human speech noise (bubble noise. The main contribution of this work is to analyze the performance limits of recognition systems using spectral masking. We obtain an increase of 18% on the speech hit rate, when the speech signals were corrupted by other speech signals or bubble noise, with different signal-to-noise ratio of approximately 1, 10 and 20 dB. On the other hand, applying the ideal binary masks to mixtures corrupted by white, pink and brown noise, results an average growth of 9% on the speech hit rate, with the same different signal-to-noise ratio. The experimental results suggest that the masking spectral techniques are more suitable for the case when it is applied a bubble noise, which is produced by human speech, than for the case of applying white, pink and brown noise.
Acquirement and enhancement of remote speech signals

Science.gov (United States)

Lü, Tao; Guo, Jin; Zhang, He-yong; Yan, Chun-hui; Wang, Can-jin

2017-07-01

To address the challenges of non-cooperative and remote acoustic detection, an all-fiber laser Doppler vibrometer (LDV) is established. The all-fiber LDV system can offer the advantages of smaller size, lightweight design and robust structure, hence it is a better fit for remote speech detection. In order to improve the performance and the efficiency of LDV for long-range hearing, the speech enhancement technology based on optimally modified log-spectral amplitude (OM-LSA) algorithm is used. The experimental results show that the comprehensible speech signals within the range of 150 m can be obtained by the proposed LDV. The signal-to-noise ratio ( SNR) and mean opinion score ( MOS) of the LDV speech signal can be increased by 100% and 27%, respectively, by using the speech enhancement technology. This all-fiber LDV, which combines the speech enhancement technology, can meet the practical demand in engineering.
[Velopharyngeal closure pattern and speech performance among submucous cleft palate patients].

Science.gov (United States)

Heng, Yin; Chunli, Guo; Bing, Shi; Yang, Li; Jingtao, Li

2017-06-01

To characterize the velopharyngeal closure patterns and speech performance among submucous cleft palate patients. Patients with submucous cleft palate visiting the Department of Cleft Lip and Palate Surgery, West China Hospital of Stomatology, Sichuan University between 2008 and 2016 were reviewed. Outcomes of subjective speech evaluation including velopharyngeal function, consonant articulation, and objective nasopharyngeal endoscopy including the mobility of soft palate, pharyngeal walls were retrospectively analyzed. A total of 353 cases were retrieved in this study, among which 138 (39.09%) demonstrated velopharyngeal competence, 176 (49.86%) velopharyngeal incompetence, and 39 (11.05%) marginal velopharyngeal incompetence. A total of 268 cases were subjected to nasopharyngeal endoscopy examination, where 167 (62.31%) demonstrated circular closure pattern, 89 (33.21%) coronal pattern, and 12 (4.48%) sagittal pattern. Passavant's ridge existed in 45.51% (76/167) patients with circular closure and 13.48% (12/89) patients with coronal closure. Among the 353 patients included in this study, 137 (38.81%) presented normal articulation, 124 (35.13%) consonant elimination, 51 (14.45%) compensatory articulation, 36 (10.20%) consonant weakening, 25 (7.08%) consonant replacement, and 36 (10.20%) multiple articulation errors. Circular closure was the most prevalent velopharyngeal closure pattern among patients with submucous cleft palate, and high-pressure consonant deletion was the most common articulation abnormality. Articulation error occurred more frequently among patients with a low velopharyngeal closure rate.
[Progressive noise induced hearing loss caused by hearing AIDS, a dilemma for the worker and the expert alike].

Science.gov (United States)

Feldmann, H

2001-12-01

Investigating cases of noise induced hearing loss the expert is often confronted with the situation that the hearing loss is progressive although the noise exposure has been reduced to almost non-damaging levels. Other causes such as age, hereditary deafness, head injuries, blasts, internal diseases can be excluded. Hearing aids as sources of damaging noise? By consulting the protocol of the hearing-aid acoustician and by own examinations the expert should obtain the following data: loudness level that yields best discrimination score of speech; level of discomfort for tones and speech, discrimination score that is achieved under free field condition with a speech level of 65 dB, using the hearing aids. Furthermore he should explore the circumstances under which the hearing aids are used: how many hours per day, at what occasions etc.? It is likely that in using the hearing aids they are adjusted to emit an intensity level identical to the one yielding the optimal discrimination score. If this e. g. is 100 dB and the hearing aids are used for 2 hours per day this would be equivalent to an exposure to industrial noise of 94 dB (A) for 8 hours daily without ear protection. Among all individuals working under industrial noise exposure today only about 1 - 2 % having unusually vulnerable inner ears will suffer a noise induced hearing loss. On the other hand workers in industrial noise are accustomed to loud noise levels, usually have a raised threshold of discomfort and therefore are likely to adjust their hearing aids to such high intensities. The expert will have to decide whether in an individual case the industrial noise exposure or the use of the hearing aids is the dominant risk for further damage. The consequences in respect to the regulations of the workers' health insurance are discussed.
The effect of viewing speech on auditory speech processing is different in the left and right hemispheres.

Science.gov (United States)

Davis, Chris; Kislyuk, Daniel; Kim, Jeesun; Sams, Mikko

2008-11-25

We used whole-head magnetoencephalograpy (MEG) to record changes in neuromagnetic N100m responses generated in the left and right auditory cortex as a function of the match between visual and auditory speech signals. Stimuli were auditory-only (AO) and auditory-visual (AV) presentations of /pi/, /ti/ and /vi/. Three types of intensity matched auditory stimuli were used: intact speech (Normal), frequency band filtered speech (Band) and speech-shaped white noise (Noise). The behavioural task was to detect the /vi/ syllables which comprised 12% of stimuli. N100m responses were measured to averaged /pi/ and /ti/ stimuli. Behavioural data showed that identification of the stimuli was faster and more accurate for Normal than for Band stimuli, and for Band than for Noise stimuli. Reaction times were faster for AV than AO stimuli. MEG data showed that in the left hemisphere, N100m to both AO and AV stimuli was largest for the Normal, smaller for Band and smallest for Noise stimuli. In the right hemisphere, Normal and Band AO stimuli elicited N100m responses of quite similar amplitudes, but N100m amplitude to Noise was about half of that. There was a reduction in N100m for the AV compared to the AO conditions. The size of this reduction for each stimulus type was same in the left hemisphere but graded in the right (being largest to the Normal, smaller to the Band and smallest to the Noise stimuli). The N100m decrease for the Normal stimuli was significantly larger in the right than in the left hemisphere. We suggest that the effect of processing visual speech seen in the right hemisphere likely reflects suppression of the auditory response based on AV cues for place of articulation.
Speech Production and Speech Discrimination by Hearing-Impaired Children.

Science.gov (United States)

Novelli-Olmstead, Tina; Ling, Daniel

1984-01-01

Seven hearing impaired children (five to seven years old) assigned to the Speakers group made highly significant gains in speech production and auditory discrimination of speech, while Listeners made only slight speech production gains and no gains in auditory discrimination. Combined speech and auditory training was more effective than auditory…
Development and preliminary evaluation of a pediatric Spanish-English speech perception task.

Science.gov (United States)

Calandruccio, Lauren; Gomez, Bianca; Buss, Emily; Leibold, Lori J

2014-06-01

The purpose of this study was to develop a task to evaluate children's English and Spanish speech perception abilities in either noise or competing speech maskers. Eight bilingual Spanish-English and 8 age-matched monolingual English children (ages 4.9-16.4 years) were tested. A forced-choice, picture-pointing paradigm was selected for adaptively estimating masked speech reception thresholds. Speech stimuli were spoken by simultaneous bilingual Spanish-English talkers. The target stimuli were 30 disyllabic English and Spanish words, familiar to 5-year-olds and easily illustrated. Competing stimuli included either 2-talker English or 2-talker Spanish speech (corresponding to target language) and spectrally matched noise. For both groups of children, regardless of test language, performance was significantly worse for the 2-talker than for the noise masker condition. No difference in performance was found between bilingual and monolingual children. Bilingual children performed significantly better in English than in Spanish in competing speech. For all listening conditions, performance improved with increasing age. Results indicated that the stimuli and task were appropriate for speech recognition testing in both languages, providing a more conventional measure of speech-in-noise perception as well as a measure of complex listening. Further research is needed to determine performance for Spanish-dominant listeners and to evaluate the feasibility of implementation into routine clinical use.
Stuttering Frequency, Speech Rate, Speech Naturalness, and Speech Effort During the Production of Voluntary Stuttering.

Science.gov (United States)

Davidow, Jason H; Grossman, Heather L; Edge, Robin L

2018-05-01

Voluntary stuttering techniques involve persons who stutter purposefully interjecting disfluencies into their speech. Little research has been conducted on the impact of these techniques on the speech pattern of persons who stutter. The present study examined whether changes in the frequency of voluntary stuttering accompanied changes in stuttering frequency, articulation rate, speech naturalness, and speech effort. In total, 12 persons who stutter aged 16-34 years participated. Participants read four 300-syllable passages during a control condition, and three voluntary stuttering conditions that involved attempting to produce purposeful, tension-free repetitions of initial sounds or syllables of a word for two or more repetitions (i.e., bouncing). The three voluntary stuttering conditions included bouncing on 5%, 10%, and 15% of syllables read. Friedman tests and follow-up Wilcoxon signed ranks tests were conducted for the statistical analyses. Stuttering frequency, articulation rate, and speech naturalness were significantly different between the voluntary stuttering conditions. Speech effort did not differ between the voluntary stuttering conditions. Stuttering frequency was significantly lower during the three voluntary stuttering conditions compared to the control condition, and speech effort was significantly lower during two of the three voluntary stuttering conditions compared to the control condition. Due to changes in articulation rate across the voluntary stuttering conditions, it is difficult to conclude, as has been suggested previously, that voluntary stuttering is the reason for stuttering reductions found when using voluntary stuttering techniques. Additionally, future investigations should examine different types of voluntary stuttering over an extended period of time to determine their impact on stuttering frequency, speech rate, speech naturalness, and speech effort.
Learning trajectories for speech motor performance in children with specific language impairment.

Science.gov (United States)

Richtsmeier, Peter T; Goffman, Lisa

2015-01-01

Children with specific language impairment (SLI) often perform below expected levels, including on tests of motor skill and in learning tasks, particularly procedural learning. In this experiment we examined the possibility that children with SLI might also have a motor learning deficit. Twelve children with SLI and thirteen children with typical development (TD) produced complex nonwords in an imitation task. Productions were collected across three blocks, with the first and second blocks on the same day and the third block one week later. Children's lip movements while producing the nonwords were recorded using an Optotrak camera system. Movements were then analyzed for production duration and stability. Movement analyses indicated that both groups of children produced shorter productions in later blocks (corroborated by an acoustic analysis), and the rate of change was comparable for the TD and SLI groups. A nonsignificant trend for more stable productions was also observed in both groups. SLI is regularly accompanied by a motor deficit, and this study does not dispute that. However, children with SLI learned to make more efficient productions at a rate similar to their peers with TD, revealing some modification of the motor deficit associated with SLI. The reader will learn about deficits commonly associated with specific language impairment (SLI) that often occur alongside the hallmark language deficit. The authors present an experiment showing that children with SLI improved speech motor performance at a similar rate compared to typically developing children. The implication is that speech motor learning is not impaired in children with SLI. Copyright © 2015 Elsevier Inc. All rights reserved.
Experiments on Automatic Recognition of Nonnative Arabic Speech

Directory of Open Access Journals (Sweden)

Douglas O'Shaughnessy

2008-05-01

Full Text Available The automatic recognition of foreign-accented Arabic speech is a challenging task since it involves a large number of nonnative accents. As well, the nonnative speech data available for training are generally insufficient. Moreover, as compared to other languages, the Arabic language has sparked a relatively small number of research efforts. In this paper, we are concerned with the problem of nonnative speech in a speaker independent, large-vocabulary speech recognition system for modern standard Arabic (MSA. We analyze some major differences at the phonetic level in order to determine which phonemes have a significant part in the recognition performance for both native and nonnative speakers. Special attention is given to specific Arabic phonemes. The performance of an HMM-based Arabic speech recognition system is analyzed with respect to speaker gender and its native origin. The WestPoint modern standard Arabic database from the language data consortium (LDC and the hidden Markov Model Toolkit (HTK are used throughout all experiments. Our study shows that the best performance in the overall phoneme recognition is obtained when nonnative speakers are involved in both training and testing phases. This is not the case when a language model and phonetic lattice networks are incorporated in the system. At the phonetic level, the results show that female nonnative speakers perform better than nonnative male speakers, and that emphatic phonemes yield a significant decrease in performance when they are uttered by both male and female nonnative speakers.
Experiments on Automatic Recognition of Nonnative Arabic Speech

Directory of Open Access Journals (Sweden)

Selouani Sid-Ahmed

2008-01-01

Full Text Available The automatic recognition of foreign-accented Arabic speech is a challenging task since it involves a large number of nonnative accents. As well, the nonnative speech data available for training are generally insufficient. Moreover, as compared to other languages, the Arabic language has sparked a relatively small number of research efforts. In this paper, we are concerned with the problem of nonnative speech in a speaker independent, large-vocabulary speech recognition system for modern standard Arabic (MSA. We analyze some major differences at the phonetic level in order to determine which phonemes have a significant part in the recognition performance for both native and nonnative speakers. Special attention is given to specific Arabic phonemes. The performance of an HMM-based Arabic speech recognition system is analyzed with respect to speaker gender and its native origin. The WestPoint modern standard Arabic database from the language data consortium (LDC and the hidden Markov Model Toolkit (HTK are used throughout all experiments. Our study shows that the best performance in the overall phoneme recognition is obtained when nonnative speakers are involved in both training and testing phases. This is not the case when a language model and phonetic lattice networks are incorporated in the system. At the phonetic level, the results show that female nonnative speakers perform better than nonnative male speakers, and that emphatic phonemes yield a significant decrease in performance when they are uttered by both male and female nonnative speakers.
Inner Speech: Development, Cognitive Functions, Phenomenology, and Neurobiology

Science.gov (United States)

2015-01-01

Inner speech—also known as covert speech or verbal thinking—has been implicated in theories of cognitive development, speech monitoring, executive function, and psychopathology. Despite a growing body of knowledge on its phenomenology, development, and function, approaches to the scientific study of inner speech have remained diffuse and largely unintegrated. This review examines prominent theoretical approaches to inner speech and methodological challenges in its study, before reviewing current evidence on inner speech in children and adults from both typical and atypical populations. We conclude by considering prospects for an integrated cognitive science of inner speech, and present a multicomponent model of the phenomenon informed by developmental, cognitive, and psycholinguistic considerations. Despite its variability among individuals and across the life span, inner speech appears to perform significant functions in human cognition, which in some cases reflect its developmental origins and its sharing of resources with other cognitive processes. PMID:26011789
Effect of age at cochlear implantation on auditory and speech development of children with auditory neuropathy spectrum disorder.

Science.gov (United States)

Liu, Yuying; Dong, Ruijuan; Li, Yuling; Xu, Tianqiu; Li, Yongxin; Chen, Xueqing; Gong, Shusheng

2014-12-01

To evaluate the auditory and speech abilities in children with auditory neuropathy spectrum disorder (ANSD) after cochlear implantation (CI) and determine the role of age at implantation. Ten children participated in this retrospective case series study. All children had evidence of ANSD. All subjects had no cochlear nerve deficiency on magnetic resonance imaging and had used the cochlear implants for a period of 12-84 months. We divided our children into two groups: children who underwent implantation before 24 months of age and children who underwent implantation after 24 months of age. Their auditory and speech abilities were evaluated using the following: behavioral audiometry, the Categories of Auditory Performance (CAP), the Meaningful Auditory Integration Scale (MAIS), the Infant-Toddler Meaningful Auditory Integration Scale (IT-MAIS), the Standard-Chinese version of the Monosyllabic Lexical Neighborhood Test (LNT), the Multisyllabic Lexical Neighborhood Test (MLNT), the Speech Intelligibility Rating (SIR) and the Meaningful Use of Speech Scale (MUSS). All children showed progress in their auditory and language abilities. The 4-frequency average hearing level (HL) (500Hz, 1000Hz, 2000Hz and 4000Hz) of aided hearing thresholds ranged from 17.5 to 57.5dB HL. All children developed time-related auditory perception and speech skills. Scores of children with ANSD who received cochlear implants before 24 months tended to be better than those of children who received cochlear implants after 24 months. Seven children completed the Mandarin Lexical Neighborhood Test. Approximately half of the children showed improved open-set speech recognition. Cochlear implantation is helpful for children with ANSD and may be a good optional treatment for many ANSD children. In addition, children with ANSD fitted with cochlear implants before 24 months tended to acquire auditory and speech skills better than children fitted with cochlear implants after 24 months. Copyright © 2014
Common neural substrates support speech and non-speech vocal tract gestures

OpenAIRE

Chang, Soo-Eun; Kenney, Mary Kay; Loucks, Torrey M.J.; Poletto, Christopher J.; Ludlow, Christy L.

2009-01-01

The issue of whether speech is supported by the same neural substrates as non-speech vocal-tract gestures has been contentious. In this fMRI study we tested whether producing non-speech vocal tract gestures in humans shares the same functional neuroanatomy as non-sense speech syllables. Production of non-speech vocal tract gestures, devoid of phonological content but similar to speech in that they had familiar acoustic and somatosensory targets, were compared to the production of speech sylla...
Introductory speeches

International Nuclear Information System (INIS)

2001-01-01

This CD is multimedia presentation of programme safety upgrading of Bohunice V1 NPP. This chapter consist of introductory commentary and 4 introductory speeches (video records): (1) Introductory speech of Vincent Pillar, Board chairman and director general of Slovak electric, Plc. (SE); (2) Introductory speech of Stefan Schmidt, director of SE - Bohunice Nuclear power plants; (3) Introductory speech of Jan Korec, Board chairman and director general of VUJE Trnava, Inc. - Engineering, Design and Research Organisation, Trnava; Introductory speech of Dietrich Kuschel, Senior vice-president of FRAMATOME ANP Project and Engineering
Cognitive control components and speech symptoms in people with schizophrenia.

Science.gov (United States)

Becker, Theresa M; Cicero, David C; Cowan, Nelson; Kerns, John G

2012-03-30

Previous schizophrenia research suggests poor cognitive control is associated with schizophrenia speech symptoms. However, cognitive control is a broad construct. Two important cognitive control components are poor goal maintenance and poor verbal working memory storage. In the current research, people with schizophrenia (n=45) performed three cognitive tasks that varied in their goal maintenance and verbal working memory storage demands. Speech symptoms were assessed using clinical rating scales, ratings of disorganized speech from typed transcripts, and self-reported disorganization. Overall, alogia was associated with both goal maintenance and verbal working memory tasks. Objectively rated disorganized speech was associated with poor goal maintenance and with a task that included both goal maintenance and verbal working memory storage demands. In contrast, self-reported disorganization was unrelated to either amount of objectively rated disorganized speech or to cognitive control task performance, instead being associated with negative mood symptoms. Overall, our results suggest that alogia is associated with both poor goal maintenance and poor verbal working memory storage and that disorganized speech is associated with poor goal maintenance. In addition, patients' own assessment of their disorganization is related to negative mood, but perhaps not to objective disorganized speech or to cognitive control task performance. Published by Elsevier Ireland Ltd.
Mental practice with interactive 3D visual aids enhances surgical performance.

Science.gov (United States)

Yiasemidou, Marina; Glassman, Daniel; Mushtaq, Faisal; Athanasiou, Christos; Williams, Mark-Mon; Jayne, David; Miskovic, Danilo

2017-10-01

Evidence suggests that Mental Practice (MP) could be used to finesse surgical skills. However, MP is cognitively demanding and may be dependent on the ability of individuals to produce mental images. In this study, we hypothesised that the provision of interactive 3D visual aids during MP could facilitate surgical skill performance. 20 surgical trainees were case-matched to one of three different preparation methods prior to performing a simulated Laparoscopic Cholecystectomy (LC). Two intervention groups underwent a 25-minute MP session; one with interactive 3D visual aids depicting the relevant surgical anatomy (3D-MP group, n = 5) and one without (MP-Only, n = 5). A control group (n = 10) watched a didactic video of a real LC. Scores relating to technical performance and safety were recorded by a surgical simulator. The Control group took longer to complete the procedure relative to the 3D&MP condition (p = .002). The number of movements was also statistically different across groups (p = .001), with the 3D&MP group making fewer movements relative to controls (p = .001). Likewise, the control group moved further in comparison to the 3D&MP condition and the MP-Only condition (p = .004). No reliable differences were observed for safety metrics. These data provide evidence for the potential value of MP in improving performance. Furthermore, they suggest that 3D interactive visual aids during MP could potentially enhance performance, beyond the benefits of MP alone. These findings pave the way for future RCTs on surgical preparation and performance.
[Prosody, speech input and language acquisition].

Science.gov (United States)

Jungheim, M; Miller, S; Kühn, D; Ptok, M

2014-04-01

In order to acquire language, children require speech input. The prosody of the speech input plays an important role. In most cultures adults modify their code when communicating with children. Compared to normal speech this code differs especially with regard to prosody. For this review a selective literature search in PubMed and Scopus was performed. Prosodic characteristics are a key feature of spoken language. By analysing prosodic features, children gain knowledge about underlying grammatical structures. Child-directed speech (CDS) is modified in a way that meaningful sequences are highlighted acoustically so that important information can be extracted from the continuous speech flow more easily. CDS is said to enhance the representation of linguistic signs. Taking into consideration what has previously been described in the literature regarding the perception of suprasegmentals, CDS seems to be able to support language acquisition due to the correspondence of prosodic and syntactic units. However, no findings have been reported, stating that the linguistically reduced CDS could hinder first language acquisition.
Predicting speech intelligibility in conditions with nonlinearly processed noisy speech

DEFF Research Database (Denmark)

Jørgensen, Søren; Dau, Torsten

2013-01-01

The speech-based envelope power spectrum model (sEPSM; [1]) was proposed in order to overcome the limitations of the classical speech transmission index (STI) and speech intelligibility index (SII). The sEPSM applies the signal-tonoise ratio in the envelope domain (SNRenv), which was demonstrated...... to successfully predict speech intelligibility in conditions with nonlinearly processed noisy speech, such as processing with spectral subtraction. Moreover, a multiresolution version (mr-sEPSM) was demonstrated to account for speech intelligibility in various conditions with stationary and fluctuating...

Exploring Australian speech-language pathologists' use and perceptions ofnon-speech oral motor exercises.

Science.gov (United States)

Rumbach, Anna F; Rose, Tanya A; Cheah, Mynn

2018-01-29

To explore Australian speech-language pathologists' use of non-speech oral motor exercises, and rationales for using/not using non-speech oral motor exercises in clinical practice. A total of 124 speech-language pathologists practising in Australia, working with paediatric and/or adult clients with speech sound difficulties, completed an online survey. The majority of speech-language pathologists reported that they did not use non-speech oral motor exercises when working with paediatric or adult clients with speech sound difficulties. However, more than half of the speech-language pathologists working with adult clients who have dysarthria reported using non-speech oral motor exercises with this population. The most frequently reported rationale for using non-speech oral motor exercises in speech sound difficulty management was to improve awareness/placement of articulators. The majority of speech-language pathologists agreed there is no clear clinical or research evidence base to support non-speech oral motor exercise use with clients who have speech sound difficulties. This study provides an overview of Australian speech-language pathologists' reported use and perceptions of non-speech oral motor exercises' applicability and efficacy in treating paediatric and adult clients who have speech sound difficulties. The research findings provide speech-language pathologists with insight into how and why non-speech oral motor exercises are currently used, and adds to the knowledge base regarding Australian speech-language pathology practice of non-speech oral motor exercises in the treatment of speech sound difficulties. Implications for Rehabilitation Non-speech oral motor exercises refer to oral motor activities which do not involve speech, but involve the manipulation or stimulation of oral structures including the lips, tongue, jaw, and soft palate. Non-speech oral motor exercises are intended to improve the function (e.g., movement, strength) of oral structures. The
Dead regions in the cochlea: Implications for speech recognition and applicability of articulation index theory

DEFF Research Database (Denmark)

Vestergaard, Martin David

2003-01-01

Dead regions in the cochlea have been suggested to be responsible for failure by hearing aid users to benefit front apparently increased audibility in terms of speech intelligibility. As an alternative to the more cumbersome psychoacoustic tuning curve measurement, threshold-equalizing noise (TEN...
The power of Speech Acts: Reflections on a Performative Concept of Ethicat Oaths in Economics and Business

NARCIS (Netherlands)

Blok, V.

2013-01-01

Ethical oaths for bankers, economists and managers are increasingly seen as successful instruments to ensure more responsible behaviour. In this article, we reflect on the nature of ethical oaths. Based on John Austin's speech act theory and the work of Emmanuel Levinas, we introduce a performative
Predicting fatigue and psychophysiological test performance from speech for safety critical environments

Directory of Open Access Journals (Sweden)

Khan Richard Baykaner

2015-08-01

Full Text Available Automatic systems for estimating operator fatigue have application in safety-critical environments. A system which could estimate level of fatigue from speech would have application in domains where operators engage in regular verbal communication as part of their duties. Previous studies on the prediction of fatigue from speech have been limited because of their reliance on subjective ratings and because they lack comparison to other methods for assessing fatigue. In this paper we present an analysis of voice recordings and psychophysiological test scores collected from seven aerospace personnel during a training task in which they remained awake for 60 hours. We show that voice features and test scores are affected by both the total time spent awake and the time position within each subject’s circadian cycle. However, we show that time spent awake and time of day information are poor predictors of the test results; while voice features can give good predictions of the psychophysiological test scores and sleep latency. Mean absolute errors of prediction are possible within about 17.5% for sleep latency and 5-12% for test scores. We discuss the implications for the use of voice as a means to monitor the effects of fatigue on cognitive performance in practical applications.
Hidden Markov models in automatic speech recognition

Science.gov (United States)

Wrzoskowicz, Adam

1993-11-01

This article describes a method for constructing an automatic speech recognition system based on hidden Markov models (HMMs). The author discusses the basic concepts of HMM theory and the application of these models to the analysis and recognition of speech signals. The author provides algorithms which make it possible to train the ASR system and recognize signals on the basis of distinct stochastic models of selected speech sound classes. The author describes the specific components of the system and the procedures used to model and recognize speech. The author discusses problems associated with the choice of optimal signal detection and parameterization characteristics and their effect on the performance of the system. The author presents different options for the choice of speech signal segments and their consequences for the ASR process. The author gives special attention to the use of lexical, syntactic, and semantic information for the purpose of improving the quality and efficiency of the system. The author also describes an ASR system developed by the Speech Acoustics Laboratory of the IBPT PAS. The author discusses the results of experiments on the effect of noise on the performance of the ASR system and describes methods of constructing HMM's designed to operate in a noisy environment. The author also describes a language for human-robot communications which was defined as a complex multilevel network from an HMM model of speech sounds geared towards Polish inflections. The author also added mandatory lexical and syntactic rules to the system for its communications vocabulary.
The Hierarchical Cortical Organization of Human Speech Processing.

Science.gov (United States)

de Heer, Wendy A; Huth, Alexander G; Griffiths, Thomas L; Gallant, Jack L; Theunissen, Frédéric E

2017-07-05

Speech comprehension requires that the brain extract semantic meaning from the spectral features represented at the cochlea. To investigate this process, we performed an fMRI experiment in which five men and two women passively listened to several hours of natural narrative speech. We then used voxelwise modeling to predict BOLD responses based on three different feature spaces that represent the spectral, articulatory, and semantic properties of speech. The amount of variance explained by each feature space was then assessed using a separate validation dataset. Because some responses might be explained equally well by more than one feature space, we used a variance partitioning analysis to determine the fraction of the variance that was uniquely explained by each feature space. Consistent with previous studies, we found that speech comprehension involves hierarchical representations starting in primary auditory areas and moving laterally on the temporal lobe: spectral features are found in the core of A1, mixtures of spectral and articulatory in STG, mixtures of articulatory and semantic in STS, and semantic in STS and beyond. Our data also show that both hemispheres are equally and actively involved in speech perception and interpretation. Further, responses as early in the auditory hierarchy as in STS are more correlated with semantic than spectral representations. These results illustrate the importance of using natural speech in neurolinguistic research. Our methodology also provides an efficient way to simultaneously test multiple specific hypotheses about the representations of speech without using block designs and segmented or synthetic speech. SIGNIFICANCE STATEMENT To investigate the processing steps performed by the human brain to transform natural speech sound into meaningful language, we used models based on a hierarchical set of speech features to predict BOLD responses of individual voxels recorded in an fMRI experiment while subjects listened to
A Computer-Aided FPS-Oriented Approach for Construction Briefing

Institute of Scientific and Technical Information of China (English)

Xiaochun Luo; Qiping Shen

2008-01-01

Function performance specification (FPS) is one of the value management (VM) techniques de- veloped for the explicit statement of optimum product definition. This technique is widely used in software engineering and manufacturing industry, and proved to be successful to perform product defining tasks. This paper describes an FPS-odented approach for construction briefing, which is critical to the successful deliv- ery of construction projects. Three techniques, i.e., function analysis system technique, shared space, and computer-aided toolkit, are incorporated into the proposed approach. A computer-aided toolkit is developed to facilitate the implementation of FPS in the briefing processes. This approach can facilitate systematic, ef- ficient identification, clarification, and representation of client requirements in trail running. The limitations of the approach and future research work are also discussed at the end of the paper.
Speaker gender identification based on majority vote classifiers

Science.gov (United States)

Mezghani, Eya; Charfeddine, Maha; Nicolas, Henri; Ben Amar, Chokri

2017-03-01

Speaker gender identification is considered among the most important tools in several multimedia applications namely in automatic speech recognition, interactive voice response systems and audio browsing systems. Gender identification systems performance is closely linked to the selected feature set and the employed classification model. Typical techniques are based on selecting the best performing classification method or searching optimum tuning of one classifier parameters through experimentation. In this paper, we consider a relevant and rich set of features involving pitch, MFCCs as well as other temporal and frequency-domain descriptors. Five classification models including decision tree, discriminant analysis, nave Bayes, support vector machine and k-nearest neighbor was experimented. The three best perming classifiers among the five ones will contribute by majority voting between their scores. Experimentations were performed on three different datasets spoken in three languages: English, German and Arabic in order to validate language independency of the proposed scheme. Results confirm that the presented system has reached a satisfying accuracy rate and promising classification performance thanks to the discriminating abilities and diversity of the used features combined with mid-level statistics.
Degradation of labial information modifies audiovisual speech perception in cochlear-implanted children.

Science.gov (United States)

Huyse, Aurélie; Berthommier, Frédéric; Leybaert, Jacqueline

2013-01-01

The aim of the present study was to examine audiovisual speech integration in cochlear-implanted children and in normally hearing children exposed to degraded auditory stimuli. Previous studies have shown that speech perception in cochlear-implanted users is biased toward the visual modality when audition and vision provide conflicting information. Our main question was whether an experimentally designed degradation of the visual speech cue would increase the importance of audition in the response pattern. The impact of auditory proficiency was also investigated. A group of 31 children with cochlear implants and a group of 31 normally hearing children matched for chronological age were recruited. All children with cochlear implants had profound congenital deafness and had used their implants for at least 2 years. Participants had to perform an /aCa/ consonant-identification task in which stimuli were presented randomly in three conditions: auditory only, visual only, and audiovisual (congruent and incongruent McGurk stimuli). In half of the experiment, the visual speech cue was normal; in the other half (visual reduction) a degraded visual signal was presented, aimed at preventing lipreading of good quality. The normally hearing children received a spectrally reduced speech signal (simulating the input delivered by the cochlear implant). First, performance in visual-only and in congruent audiovisual modalities were decreased, showing that the visual reduction technique used here was efficient at degrading lipreading. Second, in the incongruent audiovisual trials, visual reduction led to a major increase in the number of auditory based responses in both groups. Differences between proficient and nonproficient children were found in both groups, with nonproficient children's responses being more visual and less auditory than those of proficient children. Further analysis revealed that differences between visually clear and visually reduced conditions and between
Effects of Feedback Frequency and Timing on Acquisition, Retention, and Transfer of Speech Skills in Acquired Apraxia of Speech

Science.gov (United States)

Hula, Shannon N. Austermann; Robin, Donald A.; Maas, Edwin; Ballard, Kirrie J.; Schmidt, Richard A.

2008-01-01

Purpose: Two studies examined speech skill learning in persons with apraxia of speech (AOS). Motor-learning research shows that delaying or reducing the frequency of feedback promotes retention and transfer of skills. By contrast, immediate or frequent feedback promotes temporary performance enhancement but interferes with retention and transfer.…
Preferred Compression Speed for Speech and Music and Its Relationship to Sensitivity to Temporal Fine Structure

OpenAIRE

Moore, Brian C. J.; S?k, Aleksander

2016-01-01

Multichannel amplitude compression is widely used in hearing aids. The preferred compression speed varies across individuals. Moore (2008) suggested that reduced sensitivity to temporal fine structure (TFS) may be associated with preference for slow compression. This idea was tested using a simulated hearing aid. It was also assessed whether preferences for compression speed depend on the type of stimulus: speech or music. Twenty-two hearing-impaired subjects were tested, and the stimulated h...
Modeling Speech Level as a Function of Background Noise Level and Talker-to-Listener Distance for Talkers Wearing Hearing Protection Devices

DEFF Research Database (Denmark)

Bouserhal, Rachel E.; Bockstael, Annelies; MacDonald, Ewen

2017-01-01

Purpose: Studying the variations in speech levels with changing background noise level and talker-to-listener distance for talkers wearing hearing protection devices (HPDs) can aid in understanding communication in background noise. Method: Speech was recorded using an intra-aural HPD from 12...... complements the existing model presented by Pelegrín-García, Smits, Brunskog, and Jeong (2011) and expands on it by taking into account the effects of occlusion and background noise level on changes in speech sound level. Conclusions: Three models of the relationship between vocal effort, background noise...
Predicting Speech Intelligibility Decline in Amyotrophic Lateral Sclerosis Based on the Deterioration of Individual Speech Subsystems

Science.gov (United States)

Yunusova, Yana; Wang, Jun; Zinman, Lorne; Pattee, Gary L.; Berry, James D.; Perry, Bridget; Green, Jordan R.

2016-01-01

Purpose To determine the mechanisms of speech intelligibility impairment due to neurologic impairments, intelligibility decline was modeled as a function of co-occurring changes in the articulatory, resonatory, phonatory, and respiratory subsystems. Method Sixty-six individuals diagnosed with amyotrophic lateral sclerosis (ALS) were studied longitudinally. The disease-related changes in articulatory, resonatory, phonatory, and respiratory subsystems were quantified using multiple instrumental measures, which were subjected to a principal component analysis and mixed effects models to derive a set of speech subsystem predictors. A stepwise approach was used to select the best set of subsystem predictors to model the overall decline in intelligibility. Results Intelligibility was modeled as a function of five predictors that corresponded to velocities of lip and jaw movements (articulatory), number of syllable repetitions in the alternating motion rate task (articulatory), nasal airflow (resonatory), maximum fundamental frequency (phonatory), and speech pauses (respiratory). The model accounted for 95.6% of the variance in intelligibility, among which the articulatory predictors showed the most substantial independent contribution (57.7%). Conclusion Articulatory impairments characterized by reduced velocities of lip and jaw movements and resonatory impairments characterized by increased nasal airflow served as the subsystem predictors of the longitudinal decline of speech intelligibility in ALS. Declines in maximum performance tasks such as the alternating motion rate preceded declines in intelligibility, thus serving as early predictors of bulbar dysfunction. Following the rapid decline in speech intelligibility, a precipitous decline in maximum performance tasks subsequently occurred. PMID:27148967
Multiple Transcoding Impact on Speech Quality in Ideal Network Conditions

Directory of Open Access Journals (Sweden)

Martin Mikulec

2015-01-01

Full Text Available This paper deals with the impact of transcoding on the speech quality. We have focused mainly on the transcoding between codecs without the negative influence of the network parameters such as packet loss and delay. It has ensured objective and repeatable results from our measurement. The measurement was performed on the Transcoding Measuring System developed especially for this purpose. The system is based on the open source projects and is useful as a design tool for VoIP system administrators. The paper compares the most used codecs from the transcoding perspective. The multiple transcoding between G711, GSM and G729 codecs were performed and the speech quality of these calls was evaluated. The speech quality was measured by Perceptual Evaluation of Speech Quality method, which provides results in Mean Opinion Score used to describe the speech quality on a scale from 1 to 5. The obtained results indicate periodical speech quality degradation on every transcoding between two codecs.
Effects of Removing Low-Frequency Electric Information on Speech Perception with Bimodal Hearing

Science.gov (United States)

Fowler, Jennifer R.; Eggleston, Jessica L.; Reavis, Kelly M.; McMillan, Garnett P.; Reiss, Lina A. J.

2016-01-01

Purpose: The objective was to determine whether speech perception could be improved for bimodal listeners (those using a cochlear implant [CI] in one ear and hearing aid in the contralateral ear) by removing low-frequency information provided by the CI, thereby reducing acoustic-electric overlap. Method: Subjects were adult CI subjects with at…
Processing melodic contour and speech intonation in congenital amusics with Mandarin Chinese.

Science.gov (United States)

Jiang, Cunmei; Hamm, Jeff P; Lim, Vanessa K; Kirk, Ian J; Yang, Yufang

2010-07-01

Congenital amusia is a disorder in the perception and production of musical pitch. It has been suggested that early exposure to a tonal language may compensate for the pitch disorder (Peretz, 2008). If so, it is reasonable to expect that there would be different characterizations of pitch perception in music and speech in congenital amusics who speak a tonal language, such as Mandarin. In this study, a group of 11 adults with amusia whose first language was Mandarin were tested with melodic contour and speech intonation discrimination and identification tasks. The participants with amusia were impaired in discriminating and identifying melodic contour. These abnormalities were also detected in identifying both speech and non-linguistic analogue derived patterns for the Mandarin intonation tasks. In addition, there was an overall trend for the participants with amusia to show deficits with respect to controls in the intonation discrimination tasks for both speech and non-linguistic analogues. These findings suggest that the amusics' melodic pitch deficits may extend to the perception of speech, and could potentially result in some language deficits in those who speak a tonal language. Copyright (c) 2010 Elsevier Ltd. All rights reserved.
Speech Problems

Science.gov (United States)

... Staying Safe Videos for Educators Search English Español Speech Problems KidsHealth / For Teens / Speech Problems What's in ... a person's ability to speak clearly. Some Common Speech and Language Disorders Stuttering is a problem that ...
Low-dimensional recurrent neural network-based Kalman filter for speech enhancement.

Science.gov (United States)

Xia, Youshen; Wang, Jun

2015-07-01

This paper proposes a new recurrent neural network-based Kalman filter for speech enhancement, based on a noise-constrained least squares estimate. The parameters of speech signal modeled as autoregressive process are first estimated by using the proposed recurrent neural network and the speech signal is then recovered from Kalman filtering. The proposed recurrent neural network is globally asymptomatically stable to the noise-constrained estimate. Because the noise-constrained estimate has a robust performance against non-Gaussian noise, the proposed recurrent neural network-based speech enhancement algorithm can minimize the estimation error of Kalman filter parameters in non-Gaussian noise. Furthermore, having a low-dimensional model feature, the proposed neural network-based speech enhancement algorithm has a much faster speed than two existing recurrent neural networks-based speech enhancement algorithms. Simulation results show that the proposed recurrent neural network-based speech enhancement algorithm can produce a good performance with fast computation and noise reduction. Copyright © 2015 Elsevier Ltd. All rights reserved.
Performance Assessment of the CapitalBio Mycobacterium Identification Array System for Identification of Mycobacteria

Science.gov (United States)

Liu, Jingbo; Yan, Zihe; Han, Min; Han, Zhijun; Jin, Lingjie; Zhao, Yanlin

2012-01-01

The CapitalBio Mycobacterium identification microarray system is a rapid system for the detection of Mycobacterium tuberculosis. The performance of this system was assessed with 24 reference strains, 486 Mycobacterium tuberculosis clinical isolates, and 40 clinical samples and then compared to the “gold standard” of DNA sequencing. The CapitalBio Mycobacterium identification microarray system showed highly concordant identification results of 100% and 98.4% for Mycobacterium tuberculosis complex (MTC) and nontuberculous mycobacteria (NTM), respectively. The sensitivity and specificity of the CapitalBio Mycobacterium identification array for identification of Mycobacterium tuberculosis isolates were 99.6% and 100%, respectively, for direct detection and identification of clinical samples, and the overall sensitivity was 52.5%. It was 100% for sputum, 16.7% for pleural fluid, and 10% for bronchoalveolar lavage fluid, respectively. The total assay was completed in 6 h, including DNA extraction, PCR, and hybridization. The results of this study confirm the utility of this system for the rapid identification of mycobacteria and suggest that the CapitalBio Mycobacterium identification array is a molecular diagnostic technique with high sensitivity and specificity that has the capacity to quickly identify most mycobacteria. PMID:22090408
Spoken Indian language identification: a review of features and ...

Indian Academy of Sciences (India)

BAKSHI AARTI

2018-04-12

Apr 12, 2018 ... languages and can be used for the purposes of spoken language identification. Keywords. SLID .... branch of linguistics to study the sound structure of human language. ... countries, work in the area of Indian language identification has not ...... English and speech database has been collected over tele-.

Some links on this page may take you to non-federal websites. Their policies may differ from this site.