audiometry speech: Topics by WorldWideScience.org

Sample records for audiometry speech

Right Ear Advantage of Speech Audiometry in Single-sided Deafness.

Science.gov (United States)

Wettstein, Vincent G; Probst, Rudolf

2018-04-01

Postlingual single-sided deafness (SSD) is defined as normal hearing in one ear and severely impaired hearing in the other ear. A right ear advantage and dominance of the left hemisphere are well established findings in individuals with normal hearing and speech processing. Therefore, it seems plausible that a right ear advantage would exist in patients with SSD. The audiometric database was searched to identify patients with SSD. Results from the German monosyllabic Freiburg word test and four-syllabic number test in quiet were evaluated. Results of right-sided SSD were compared with left-sided SSD. Statistical calculations were done with the Mann-Whitney U test. Four hundred and six patients with SSD were identified, 182 with right-sided and 224 with left-sided SSD. The two groups had similar pure-tone thresholds without significant differences. All test parameters of speech audiometry had better values for right ears (SSD left) when compared with left ears (SSD right). Statistically significant results (p right and 97.5 ± 4.7% left, p right and 93.9 ± 9.1% left, p right and 63.8 ± 11.1 dB SPL left, p right ear advantage of speech audiometry was found in patients with SSD in this retrospective study of audiometric test results.
High-frequency audiometry: A means for early diagnosis of noise-induced hearing loss

Directory of Open Access Journals (Sweden)

Amir H Mehrparvar

2011-01-01

Full Text Available Noise-induced hearing loss (NIHL, an irreversible disorder, is a common problem in industrial settings. Early diagnosis of NIHL can help prevent the progression of hearing loss, especially in speech frequencies. For early diagnosis of NIHL, audiometry is performed routinely in conventional frequencies. We designed this study to compare the effect of noise on high-frequency audiometry (HFA and conventional audiometry. In a historical cohort study, we compared hearing threshold and prevalence of hearing loss in conventional and high frequencies of audiometry among textile workers divided into two groups: With and without exposure to noise more than 85 dB. The highest hearing threshold was observed at 4000 Hz, 6000 Hz and 16000 Hz in conventional right ear audiometry, conventional left ear audiometry and HFA in each ear, respectively. The hearing threshold was significantly higher at 16000 Hz compared to 4000. Hearing loss was more common in HFA than conventional audiometry. HFA is more sensitive to detect NIHL than conventional audiometry. It can be useful for early diagnosis of hearing sensitivity to noise, and thus preventing hearing loss in lower frequencies especially speech frequencies.
Occupational hearing loss: tonal audiometry X high frequencies audiometry

Directory of Open Access Journals (Sweden)

Lauris, José Roberto Pereira

2009-09-01

Full Text Available Introduction: Studies on the occupational exposure show that noise has been reaching a large part of the working population around the world, and NIHL (noise-induced hearing loss is the second most frequent disease of the hearing system. Objective: To review the audiometry results of employees at the campus of the University of São Paulo, Bauru. Method: 40 audiometry results were analyzed between 2007 and 2008, whose ages comprised between 32 and 59 years, of both sexes and several professions: gardeners, maintenance technicians, drivers etc. The participants were divided into 2 groups: those with tonal thresholds within acceptable thresholds and those who presented auditory thresholds alterations, that is tonal thresholds below 25 dB (NA in any frequency (Administrative Rule no. 19 of the Ministry of Labor 1998. In addition to the Conventional Audiologic Evaluation (250Hz to 8.000Hz we also carried out High Frequencies Audiometry (9000Hz, 10000Hz, 11200Hz, 12500Hz, 14000Hz and 16000Hz. Results: According to the classification proposed by FIORINI (1994, 25.0% (N=10 they presented with NIHL suggestive audiometric configurations. The results of high frequencies Audiometry confirmed worse thresholds than those obtained in the conventional audiometry in the 2 groups evaluated. Conclusion: The use of high frequencies audiometry proved to be an important register as a hearing alteration early detection method.
Nonorganic hearing loss in children: audiometry, clinical characteristics, biographical history and recovery of hearing thresholds.

Science.gov (United States)

Schmidt, Claus-Michael; am Zehnhoff-Dinnesen, Antoinette; Matulat, Peter; Knief, Arne; Rosslau, Ken; Deuster, Dirk

2013-07-01

The term "nonorganic hearing loss" (NOHL) (pseudohypacusis, functional or psychogenic hearing loss) describes a hearing loss without a detectable corresponding pathology in the auditory system. It is characterized by a discrepancy between elevated pure tone audiometry thresholds and normal speech discrimination. The recommended audiological management of NOHL in children comprises history taking, diagnosis, and counseling. According to the literature, prognosis depends on the severity of the patient's school and/or personal problems. Routine referral to a child psychiatrist is discussed as being controversial. The clinical history of 34 children with NOHL was retrospectively evaluated. In 15 children, follow up audiometry was performed. Results of biographical history, subjective and objective audiometry, additional speech and language assessment, psychological investigations and follow up audiometry are presented and discussed. The prevalence of NOHL was 1.8% in children with suspected hearing loss. Mean age at diagnosis was 10.8 years. Girls were twice as often affected as boys. Patient history showed a high prevalence of emotional and school problems. Pre-existing organic hearing loss can be worsened by nonorganic causes. Children with a fast recovery of hearing thresholds (n=6) showed a high rate (4/6) of family, social and emotional problems. In children with continuous threshold elevation (n=9), biographical history showed no recognizable or obvious family, social or emotional problems; learning disability (4/9) was the most frequently presented characteristic. Due to advances in objective audiometry, the diagnosis of NOHL is less challenging than management and counseling. Considering the high frequency of personal and school problems, a multidisciplinary setting is helpful. On the basis of our results, drawing conclusions from hearing threshold recovery on the severity of underlying psychic problems seems inappropriate. As a consequence, a referral to a
Test person operated 2-Alternative Forced Choice Audiometry compared to traditional audiometry

DEFF Research Database (Denmark)

Schmidt, Jesper Hvass; Brandt, Christian; Christensen-Dalsgaard, Jakob

Background: With a newly developed technique, hearing thresholds can be estimated with a system operated by the test persons themselves. This technique is based on the 2 Alternative Forced Choice paradigm known from the psychoacoustic research theory. Test persons can operate the system very......-likelihood and up-down methods has proven effective and reliable even under suboptimal test settings. In non-optimal testing conditions i.e. as a part of a hearing conservation programme the headphone Sennheiser HDA-200 has been used as it contains hearing protection. This test-method has been validated......-retest studies of 2AFC audiometry are comparable to test-retest results known from traditional audiometry under standard clinical settings. Conclusions 2 Alternative Forced Choice audiometry can be a reliable alternative to traditional audiometry especially under certain circumstances, where it can...
Audiometry

Science.gov (United States)

... is about 20 to 20,000 Hz. Some animals can hear up to 50,000 Hz. Human ... Names Audiometry; Hearing test; Audiography (audiogram) Images Ear anatomy References Handelsman JA, Van Riper LA, Lesperance MM. ...
Speech perception performance of subjects with type I diabetes mellitus in noise

Directory of Open Access Journals (Sweden)

Bárbara Cristiane Sordi Silva

Full Text Available Abstract Introduction: Diabetes mellitus (DM is a chronic metabolic disorder of various origins that occurs when the pancreas fails to produce insulin in sufficient quantities or when the organism fails to respond to this hormone in an efficient manner. Objective: To evaluate the speech recognition in subjects with type I diabetes mellitus (DMI in quiet and in competitive noise. Methods: It was a descriptive, observational and cross-section study. We included 40 participants of both genders aged 18-30 years, divided into a control group (CG of 20 healthy subjects with no complaints or auditory changes, paired for age and gender with the study group, consisting of 20 subjects with a diagnosis of DMI. First, we applied basic audiological evaluations (pure tone audiometry, speech audiometry and immittance audiometry for all subjects; after these evaluations, we applied Sentence Recognition Threshold in Quiet (SRTQ and Sentence Recognition Threshold in Noise (SRTN in free field, using the List of Sentences in Portuguese test. Results: All subjects showed normal bilateral pure tone threshold, compatible speech audiometry and "A" tympanometry curve. Group comparison revealed a statistically significant difference for SRTQ (p = 0.0001, SRTN (p < 0.0001 and the signal-to-noise ratio (p < 0.0001. Conclusion: The performance of DMI subjects in SRTQ and SRTN was worse compared to the subjects without diabetes.
Automated audiometry using apple iOS-based application technology.

Science.gov (United States)

Foulad, Allen; Bui, Peggy; Djalilian, Hamid

2013-11-01

The aim of this study is to determine the feasibility of an Apple iOS-based automated hearing testing application and to compare its accuracy with conventional audiometry. Prospective diagnostic study. Setting Academic medical center. An iOS-based software application was developed to perform automated pure-tone hearing testing on the iPhone, iPod touch, and iPad. To assess for device variations and compatibility, preliminary work was performed to compare the standardized sound output (dB) of various Apple device and headset combinations. Forty-two subjects underwent automated iOS-based hearing testing in a sound booth, automated iOS-based hearing testing in a quiet room, and conventional manual audiometry. The maximum difference in sound intensity between various Apple device and headset combinations was 4 dB. On average, 96% (95% confidence interval [CI], 91%-100%) of the threshold values obtained using the automated test in a sound booth were within 10 dB of the corresponding threshold values obtained using conventional audiometry. When the automated test was performed in a quiet room, 94% (95% CI, 87%-100%) of the threshold values were within 10 dB of the threshold values obtained using conventional audiometry. Under standardized testing conditions, 90% of the subjects preferred iOS-based audiometry as opposed to conventional audiometry. Apple iOS-based devices provide a platform for automated air conduction audiometry without requiring extra equipment and yield hearing test results that approach those of conventional audiometry.
Examining speech perception in noise and cognitive functions in the elderly.

Science.gov (United States)

Meister, Hartmut; Schreitmüller, Stefan; Grugel, Linda; Beutner, Dirk; Walger, Martin; Meister, Ingo

2013-12-01

The purpose of this study was to investigate the relationship of cognitive functions (i.e., working memory [WM]) and speech recognition against different background maskers in older individuals. Speech reception thresholds (SRTs) were determined using a matrix-sentence test. Unmodulated noise, modulated noise (International Collegium for Rehabilitative Audiology [ICRA] noise 5-250), and speech fragments (International Speech Test Signal [ISTS]) were used as background maskers. Verbal WM was assessed using the Verbal Learning and Memory Test (VLMT; Helmstaedter & Durwen, 1990). Measurements were conducted with 14 normal-hearing older individuals and a control group of 12 normal-hearing young listeners. Despite their normal hearing ability, the young listeners outperformed the older individuals in all background maskers. These differences were largest for the modulated maskers. SRTs were significantly correlated with the scores of the VLMT. A linear regression model also included WM as the only significant predictor variable. The results support the assumption that WM plays an important role for speech understanding and that it might have impact on results obtained using speech audiometry. Thus, an individual's WM capacity should be considered with aural diagnosis and rehabilitation. The VLMT proved to be a clinically applicable test for WM. Further cognitive functions important with speech understanding are currently being investigated within the SAKoLA (Sprachaudiometrie und kognitive Leistungen im Alter [Speech Audiometry and Cognitive Functions in the Elderly]) project.
Automated smartphone audiometry: Validation of a word recognition test app.

Science.gov (United States)

Dewyer, Nicholas A; Jiradejvong, Patpong; Henderson Sabes, Jennifer; Limb, Charles J

2018-03-01

Develop and validate an automated smartphone word recognition test. Cross-sectional case-control diagnostic test comparison. An automated word recognition test was developed as an app for a smartphone with earphones. English-speaking adults with recent audiograms and various levels of hearing loss were recruited from an audiology clinic and were administered the smartphone word recognition test. Word recognition scores determined by the smartphone app and the gold standard speech audiometry test performed by an audiologist were compared. Test scores for 37 ears were analyzed. Word recognition scores determined by the smartphone app and audiologist testing were in agreement, with 86% of the data points within a clinically acceptable margin of error and a linear correlation value between test scores of 0.89. The WordRec automated smartphone app accurately determines word recognition scores. 3b. Laryngoscope, 128:707-712, 2018. © 2017 The American Laryngological, Rhinological and Otological Society, Inc.
Extended high-frequency audiometry (9,000-20,000 Hz). Usefulness in audiological diagnosis.

Science.gov (United States)

Rodríguez Valiente, Antonio; Roldán Fidalgo, Amaya; Villarreal, Ithzel M; García Berrocal, José R

2016-01-01

Early detection and appropriate treatment of hearing loss are essential to minimise the consequences of hearing loss. In addition to conventional audiometry (125-8,000 Hz), extended high-frequency audiometry (9,000-20,000 Hz) is available. This type of audiometry may be useful in early diagnosis of hearing loss in certain conditions, such as the ototoxic effect of cisplatin-based treatment, noise exposure or oral misunderstanding, especially in noisy environments. Eleven examples are shown in which extended high-frequency audiometry has been useful in early detection of hearing loss, despite the subject having a normal conventional audiometry. The goal of the present paper was to highlight the importance of the extended high-frequency audiometry examination for it to become a standard tool in routine audiological examinations. Copyright © 2015 Elsevier España, S.L.U. y Sociedad Española de Otorrinolaringología y Patología Cérvico-Facial. All rights reserved.
The Galker test of speech reception in noise

DEFF Research Database (Denmark)

Lauritsen, Maj-Britt Glenn; Söderström, Margareta; Kreiner, Svend

2016-01-01

PURPOSE: We tested "the Galker test", a speech reception in noise test developed for primary care for Danish preschool children, to explore if the children's ability to hear and understand speech was associated with gender, age, middle ear status, and the level of background noise. METHODS......: The Galker test is a 35-item audio-visual, computerized word discrimination test in background noise. Included were 370 normally developed children attending day care center. The children were examined with the Galker test, tympanometry, audiometry, and the Reynell test of verbal comprehension. Parents...... and daycare teachers completed questionnaires on the children's ability to hear and understand speech. As most of the variables were not assessed using interval scales, non-parametric statistics (Goodman-Kruskal's gamma) were used for analyzing associations with the Galker test score. For comparisons...
Diagnostic Hearing Assessment in Schools: Validity and Time Efficiency of Automated Audiometry.

Science.gov (United States)

Mahomed-Asmail, Faheema; Swanepoel, De Wet; Eikelboom, Robert H

2016-01-01

Poor follow-up compliance from school-based hearing screening typically undermines the efficacy of school-based hearing screening programs. Onsite diagnostic audiometry with automation may reduce false positives and ensure directed referrals. To investigate the validity and time efficiency of automated diagnostic air- and bone-conduction audiometry for children in a natural school environment following hearing screening. A within-subject repeated measures design was employed to compare air- and bone-conduction pure-tone thresholds (0.5-4 kHz), measured by manual and automated pure-tone audiometry. Sixty-two children, 25 males and 37 females, with an average age of 8 yr (standard deviation [SD] = 0.92; range = 6-10 yr) were recruited for this study. The participants included 30 children who failed on a hearing screening and 32 children who passed a hearing screening. Threshold comparisons were made for air- and bone-conduction thresholds across ears tested with manual and automated audiometry. To avoid a floor effect thresholds of 15 dB HL were excluded in analyses. The Wilcoxon signed ranked test was used to compare threshold correspondence for manual and automated thresholds and the paired samples t-test was used to compare test time. Statistical significance was set as p ≤ 0.05. 85.7% of air-conduction thresholds and 44.6% of bone-conduction thresholds corresponded within the normal range (15 dB HL) for manual and automated audiometry. Both manual and automated audiometry air- and bone-conduction thresholds exceeded 15 dB HL in 9.9% and 34.0% of thresholds, respectively. For these thresholds, average absolute differences for air- and bone-conduction thresholds were 6.3 (SD = 8.3) and 2.2 dB (SD = 3.6) and they corresponded within 10 dB across frequencies in 87.7% and 100.0%, respectively. There was no significant difference between manual and automated air- and bone-conduction across frequencies for these thresholds. Using onsite automated diagnostic audiometry
Evaluation of Two Circumaural Earphones for Audiometry.

Science.gov (United States)

Smull, Clae C; Madsen, Brandon; Margolis, Robert H

2018-05-08

The Sennheiser HDA 200 earphone, a standard circumaural earphone used in audiometry for many years, is out of production and is replaced by the RadioEar DD450. The Sennheiser HD 280 Pro earphone is a consumer product that has characteristics that may be suitable for audiometry and may be a low-cost alternative to the DD450. The DD450 and HD 280 Pro earphones were compared with the HDA 200 for use in audiometry. RadioEar DD450 and Sennheiser HD 280 Pro earphones were evaluated for reference equivalent threshold sound pressure levels (RETSPLs), ambient-noise attenuation, and occlusion effects. Audiometric thresholds measured on a group of normal-hearing adults were used to determine RETSPLs. Ambient-noise attenuation was determined by measuring the sound pressure in the ear canal produced by a broadband signal from a loudspeaker with and without occlusion by the earphone. Acoustic occlusion effects were determined by measuring the ear-canal sound pressure produced by a bone-conducted source with and without occlusion by the earphone. The results were compared with measurements obtained from the HDA 200 earphone. Audiometric thresholds obtained using the DD450 earphone did not differ from those obtained with the HDA 200 earphones, indicating that the HDA 200 RETSPLs provided in the audiometer standards (ANSI S3.6-2010; ISO 389-8-2004) are transferable to the DD450. New RETSPLs for the HD 280 Pro earphone were determined from the threshold measurements. Ambient-noise attenuation provided by the DD450 was equivalent to the attenuation provided by the HDA 200. The HD 280 Pro provided less ambient-noise attenuation than the other circumaural earphones, but more than the supra-aural earphones commonly used in audiometry. The DD450 produced an occlusion effect 5 dB larger than that of the HDA 200 at 0.25 and 0.5 kHz; both earphones produced negligible occlusion effects at higher frequencies. The HD 280 Pro produced larger occlusion effects in the low frequencies than the
Smartphone threshold audiometry in underserved primary health-care contexts.

Science.gov (United States)

Sandström, Josefin; Swanepoel, De Wet; Carel Myburgh, Hermanus; Laurent, Claude

2016-01-01

To validate a calibrated smartphone-based hearing test in a sound booth environment and in primary health-care clinics. A repeated-measure within-subject study design was employed whereby air-conduction hearing thresholds determined by smartphone-based audiometry was compared to conventional audiometry in a sound booth and a primary health-care clinic environment. A total of 94 subjects (mean age 41 years ± 17.6 SD and range 18-88; 64% female) were assessed of whom 64 were tested in the sound booth and 30 within primary health-care clinics without a booth. In the sound booth 63.4% of conventional and smartphone thresholds indicated normal hearing (≤15 dBHL). Conventional thresholds exceeding 15 dB HL corresponded to smartphone thresholds within ≤10 dB in 80.6% of cases with an average threshold difference of -1.6 dB ± 9.9 SD. In primary health-care clinics 13.7% of conventional and smartphone thresholds indicated normal hearing (≤15 dBHL). Conventional thresholds exceeding 15 dBHL corresponded to smartphone thresholds within ≤10 dB in 92.9% of cases with an average threshold difference of -1.0 dB ± 7.1 SD. Accurate air-conduction audiometry can be conducted in a sound booth and without a sound booth in an underserved community health-care clinic using a smartphone.
Sensorineural hearing loss among cerebellopontine-angle tumor patients examined with pure tone audiometry and brainstem-evoked response audiometry

Science.gov (United States)

Rinindra, A. M.; Zizlavsky, S.; Bashiruddin, J.; Aman, R. A.; Wulani, V.; Bardosono, S.

2017-08-01

Tumor in the cerebellopontine angle (CPA) accurs for approximately 5-10% of all intracranial tumors, where unilateral hearing loss and tinnitus are the most frequent symptoms. This study aimed to collect data on sensorineural hearing loss in CPA tumor patients in Dr. Cipto Mangunkusumo Hospital (CMH) using pure tone audiometry and brainstem-evoked response audiometry (BERA). It also aimed to obtaine data on CPA-tumor imaging through magnetic resonance imaging (MRI). This was a descriptive, analytic, and cross-sectional study. The subjects of this study were gathered using a total sampling method from secondary data between July 2012 and November 2016. From 104 patients, 30 matched the inclusion criteria. The CPA-tumor patients in the ENT CMH outpatient clinic were mostly female, middle-aged patients (41-60 years) whose clinical presentation was mostly tinnitus and severe, asymmetric sensorineural hearing loss in 10 subjects. From 30 subjects, 29 showed ipsilaterally impaired BERA results, and 17 subjects showed contralaterally impaired BERA results. There were 24 subjects who with large-sized tumors and 19 subjects who had intracanal tumors that had spread until they were extracanal in 19 subjects.
Spoken Word Recognition Errors in Speech Audiometry: A Measure of Hearing Performance?

Directory of Open Access Journals (Sweden)

Martine Coene

2015-01-01

Full Text Available This report provides a detailed analysis of incorrect responses from an open-set spoken word-repetition task which is part of a Dutch speech audiometric test battery. Single-consonant confusions were analyzed from 230 normal hearing participants in terms of the probability of choice of a particular response on the basis of acoustic-phonetic, lexical, and frequency variables. The results indicate that consonant confusions are better predicted by lexical knowledge than by acoustic properties of the stimulus word. A detailed analysis of the transmission of phonetic features indicates that “voicing” is best preserved whereas “manner of articulation” yields most perception errors. As consonant confusion matrices are often used to determine the degree and type of a patient’s hearing impairment, to predict a patient’s gain in hearing performance with hearing devices and to optimize the device settings in view of maximum output, the observed findings are highly relevant for the audiological practice. Based on our findings, speech audiometric outcomes provide a combined auditory-linguistic profile of the patient. The use of confusion matrices might therefore not be the method best suited to measure hearing performance. Ideally, they should be complemented by other listening task types that are known to have less linguistic bias, such as phonemic discrimination.
Visual reinforcement audiometry: an Adobe Flash based approach.

Science.gov (United States)

Atherton, Steve

2010-09-01

Visual Reinforcement Audiometry (VRA) is a key behavioural test for young children. It is central to the diagnosis of hearing-impaired infants (1) . Habituation to the visual reinforcement can give misleading results. Medical Illustration ABM University Health Board has designed a collection of Flash animations to overcome this.
Hearing Handicap and Speech Recognition Correlate With Self-Reported Listening Effort and Fatigue.

Science.gov (United States)

Alhanbali, Sara; Dawes, Piers; Lloyd, Simon; Munro, Kevin J

To investigate the correlations between hearing handicap, speech recognition, listening effort, and fatigue. Eighty-four adults with hearing loss (65 to 85 years) completed three self-report questionnaires: the Fatigue Assessment Scale, the Effort Assessment Scale, and the Hearing Handicap Inventory for Elderly. Audiometric assessment included pure-tone audiometry and speech recognition in noise. There was a significant positive correlation between handicap and fatigue (r = 0.39, p speech recognition and fatigue (r = 0.22, p speech recognition both correlate with self-reported listening effort and fatigue, which is consistent with a model of listening effort and fatigue where perceived difficulty is related to sustained effort and fatigue for unrewarding tasks over which the listener has low control. A clinical implication is that encouraging clients to recognize and focus on the pleasure and positive experiences of listening may result in greater satisfaction and benefit from hearing aid use.
Computer-assisted CI fitting: Is the learning capacity of the intelligent agent FOX beneficial for speech understanding?

Science.gov (United States)

Meeuws, Matthias; Pascoal, David; Bermejo, Iñigo; Artaso, Miguel; De Ceulaer, Geert; Govaerts, Paul J

2017-07-01

The software application FOX ('Fitting to Outcome eXpert') is an intelligent agent to assist in the programing of cochlear implant (CI) processors. The current version utilizes a mixture of deterministic and probabilistic logic which is able to improve over time through a learning effect. This study aimed at assessing whether this learning capacity yields measurable improvements in speech understanding. A retrospective study was performed on 25 consecutive CI recipients with a median CI use experience of 10 years who came for their annual CI follow-up fitting session. All subjects were assessed by means of speech audiometry with open set monosyllables at 40, 55, 70, and 85 dB SPL in quiet with their home MAP. Other psychoacoustic tests were executed depending on the audiologist's clinical judgment. The home MAP and the corresponding test results were entered into FOX. If FOX suggested to make MAP changes, they were implemented and another speech audiometry was performed with the new MAP. FOX suggested MAP changes in 21 subjects (84%). The within-subject comparison showed a significant median improvement of 10, 3, 1, and 7% at 40, 55, 70, and 85 dB SPL, respectively. All but two subjects showed an instantaneous improvement in their mean speech audiometric score. Persons with long-term CI use, who received a FOX-assisted CI fitting at least 6 months ago, display improved speech understanding after MAP modifications, as recommended by the current version of FOX. This can be explained only by intrinsic improvements in FOX's algorithms, as they have resulted from learning. This learning is an inherent feature of artificial intelligence and it may yield measurable benefit in speech understanding even in long-term CI recipients.

High-frequency Audiometry Hearing on Monitoring of Individuals Exposed to Occupational Noise: A Systematic Review.

Science.gov (United States)

Antonioli, Cleonice Aparecida Silva; Momensohn-Santos, Teresa Maria; Benaglia, Tatiana Aparecida Silva

2016-07-01

The literature reports on high-frequency audiometry as one of the exams used on hearing monitoring of individuals exposed to high sound pressure in their work environment, due to the method́s greater sensitivity in early identification of hearing loss caused by noise. The frequencies that compose the exam are generally between 9 KHz and 20KHz, depending on the equipment. This study aims to perform a retrospective and secondary systematic revision of publications on high-frequency audiometry on hearing monitoring of individuals exposed to occupational noise. This systematic revision followed the methodology proposed in the Cochrane Handbook, focusing on the question: "Is High-frequency Audiometry more sensitive than Conventional Audiometry in the screening of early hearing loss individuals exposed to occupational noise?" The search was based on PubMed data, Base, Web of Science (Capes), Biblioteca Virtual em Saúde (BVS), and in the references cited in identified and selected articles. The search resulted in 6059 articles in total. Of these, only six studies were compatible with the criteria proposed in this study. The performed meta-analysis does not definitively answer the study's proposed question. It indicates that the 16 KHz high frequency audiometry (HFA) frequency is sensitive in early identification of hearing loss in the control group (medium difference (MD = 8.33)), as well as the 4 KHz frequency (CA), this one being a little less expressive (MD = 5.72). Thus, others studies are necessary to confirm the HFA importance for the early screening of hearing loss on individuals exposed to noise at the workplace.
The relationship between tinnitus pitch and parameters of audiometry and distortion product otoacoustic emissions.

Science.gov (United States)

Keppler, H; Degeest, S; Dhooge, I

2017-11-01

Chronic tinnitus is associated with reduced auditory input, which results in changes in the central auditory system. This study aimed to examine the relationship between tinnitus pitch and parameters of audiometry and distortion product otoacoustic emissions. For audiometry, the parameters represented the edge frequency of hearing loss, the frequency of maximum hearing loss and the frequency range of hearing loss. For distortion product otoacoustic emissions, the parameters were the frequency of lowest distortion product otoacoustic emission amplitudes and the frequency range of reduced distortion product otoacoustic emissions. Sixty-seven patients (45 males, 22 females) with subjective chronic tinnitus, aged 18 to 73 years, were included. No correlation was found between tinnitus pitch and parameters of audiometry and distortion product otoacoustic emissions. However, tinnitus pitch fell mostly within the frequency range of hearing loss. The current study seems to confirm the relationship between tinnitus pitch and the frequency range of hearing loss, thus supporting the homeostatic plasticity model.
High-frequency Audiometry Hearing on Monitoring of Individuals Exposed to Occupational Noise: A Systematic Review

Directory of Open Access Journals (Sweden)

Antonioli, Cleonice Aparecida Silva

2015-12-01

Full Text Available Introduction The literature reports on high-frequency audiometry as one of the exams used on hearing monitoring of individuals exposed to high sound pressure in their work environment, due to the method́s greater sensitivity in early identification of hearing loss caused by noise. The frequencies that compose the exam are generally between 9 KHz and 20KHz, depending on the equipment. Objective This study aims to perform a retrospective and secondary systematic revision of publications on high-frequency audiometry on hearing monitoring of individuals exposed to occupational noise. Data Synthesis This systematic revision followed the methodology proposed in the Cochrane Handbook, focusing on the question: “Is High-frequency Audiometry more sensitive than Conventional Audiometry in the screening of early hearing loss individuals exposed to occupational noise?” The search was based on PubMed data, Base, Web of Science (Capes, Biblioteca Virtual em Saúde (BVS, and in the references cited in identified and selected articles. The search resulted in 6059 articles in total. Of these, only six studies were compatible with the criteria proposed in this study. Conclusion The performed meta-analysis does not definitively answer the study's proposed question. It indicates that the 16 KHz high frequency audiometry (HFA frequency is sensitive in early identification of hearing loss in the control group (medium difference (MD = 8.33, as well as the 4 KHz frequency (CA, this one being a little less expressive (MD = 5.72. Thus, others studies are necessary to confirm the HFA importance for the early screening of hearing loss on individuals exposed to noise at the workplace.
[Hearing capacity and speech production in 417 children with facial cleft abnormalities].

Science.gov (United States)

Schönweiler, R; Schönweiler, B; Schmelzeisen, R

1994-11-01

Children with cleft palates often suffer from chronic conductive hearing losses, delayed language acquisition and speech disorders. This study presents results of speech and language outcomes in relation to hearing function and types of palatal malformations found. 417 children with cleft palates were examined during followup evaluations that extended over several years. Disorders were studied as they affected the ears, nose and throat, audiometry and speech and language pathology. Children with isolated cleft lips were excluded. Among the total group, 8% had normal speech and language development while 92% had speech or language disorders. 80% of these latter children had hearing problems that predominantly consisted of fluctuating conductive hearing losses caused by otitis media with effusion. 5% had sensorineural hearing losses. Fifty-eight children (14%) with rhinolalia aperta were not improved by speech therapy and required velopharyngoplasties, using a cranial-based pharyngeal flap. Language skills did not depend on the type of cleft palate presents but on the frequency and amount of hearing loss found. Otomicroscopy and audiometric follow-ups with insertions of ventilation tubes were considered to be most important for language development in those children with repeated middle ear infections. Speech or language therapy was necessary in 49% of the children.
The relationship of speech intelligibility with hearing sensitivity, cognition, and perceived hearing difficulties varies for different speech perception tests

Science.gov (United States)

Heinrich, Antje; Henshaw, Helen; Ferguson, Melanie A.

2015-01-01

Listeners vary in their ability to understand speech in noisy environments. Hearing sensitivity, as measured by pure-tone audiometry, can only partly explain these results, and cognition has emerged as another key concept. Although cognition relates to speech perception, the exact nature of the relationship remains to be fully understood. This study investigates how different aspects of cognition, particularly working memory and attention, relate to speech intelligibility for various tests. Perceptual accuracy of speech perception represents just one aspect of functioning in a listening environment. Activity and participation limits imposed by hearing loss, in addition to the demands of a listening environment, are also important and may be better captured by self-report questionnaires. Understanding how speech perception relates to self-reported aspects of listening forms the second focus of the study. Forty-four listeners aged between 50 and 74 years with mild sensorineural hearing loss were tested on speech perception tests differing in complexity from low (phoneme discrimination in quiet), to medium (digit triplet perception in speech-shaped noise) to high (sentence perception in modulated noise); cognitive tests of attention, memory, and non-verbal intelligence quotient; and self-report questionnaires of general health-related and hearing-specific quality of life. Hearing sensitivity and cognition related to intelligibility differently depending on the speech test: neither was important for phoneme discrimination, hearing sensitivity alone was important for digit triplet perception, and hearing and cognition together played a role in sentence perception. Self-reported aspects of auditory functioning were correlated with speech intelligibility to different degrees, with digit triplets in noise showing the richest pattern. The results suggest that intelligibility tests can vary in their auditory and cognitive demands and their sensitivity to the challenges that
The relationship of speech intelligibility with hearing sensitivity, cognition, and perceived hearing difficulties varies for different speech perception tests

Directory of Open Access Journals (Sweden)

Antje eHeinrich

2015-06-01

Full Text Available Listeners vary in their ability to understand speech in noisy environments. Hearing sensitivity, as measured by pure-tone audiometry, can only partly explain these results, and cognition has emerged as another key concept. Although cognition relates to speech perception, the exact nature of the relationship remains to be fully understood. This study investigates how different aspects of cognition, particularly working memory and attention, relate to speech intelligibility for various tests.Perceptual accuracy of speech perception represents just one aspect of functioning in a listening environment. Activity and participation limits imposed by hearing loss, in addition to the demands of a listening environment, are also important and may be better captured by self-report questionnaires. Understanding how speech perception relates to self-reported aspects of listening forms the second focus of the study.Forty-four listeners aged between 50-74 years with mild SNHL were tested on speech perception tests differing in complexity from low (phoneme discrimination in quiet, to medium (digit triplet perception in speech-shaped noise to high (sentence perception in modulated noise; cognitive tests of attention, memory, and nonverbal IQ; and self-report questionnaires of general health-related and hearing-specific quality of life.Hearing sensitivity and cognition related to intelligibility differently depending on the speech test: neither was important for phoneme discrimination, hearing sensitivity alone was important for digit triplet perception, and hearing and cognition together played a role in sentence perception. Self-reported aspects of auditory functioning were correlated with speech intelligibility to different degrees, with digit triplets in noise showing the richest pattern. The results suggest that intelligibility tests can vary in their auditory and cognitive demands and their sensitivity to the challenges that auditory environments pose on
Diagnosis of hearing loss using automated audiometry in an asynchronous telehealth model: A pilot accuracy study.

Science.gov (United States)

Brennan-Jones, Christopher G; Eikelboom, Robert H; Swanepoel, De Wet

2017-02-01

Introduction Standard criteria exist for diagnosing different types of hearing loss, yet audiologists interpret audiograms manually. This pilot study examined the feasibility of standardised interpretations of audiometry in a telehealth model of care. The aim of this study was to examine diagnostic accuracy of automated audiometry in adults with hearing loss in an asynchronous telehealth model using pre-defined diagnostic protocols. Materials and methods We recruited 42 study participants from a public audiology and otolaryngology clinic in Perth, Western Australia. Manual audiometry was performed by an audiologist either before or after automated audiometry. Diagnostic protocols were applied asynchronously for normal hearing, disabling hearing loss, conductive hearing loss and unilateral hearing loss. Sensitivity and specificity analyses were conducted using a two-by-two matrix and Cohen's kappa was used to measure agreement. Results The overall sensitivity for the diagnostic criteria was 0.88 (range: 0.86-1) and overall specificity was 0.93 (range: 0.86-0.97). Overall kappa ( k) agreement was 'substantial' k = 0.80 (95% confidence interval (CI) 0.70-0.89) and significant at p loss. This method has the potential to improve synchronous and asynchronous tele-audiology service delivery.
A user-operated audiometry method based on the maximum likelihood principle and the two-alternative forced-choice paradigm

DEFF Research Database (Denmark)

Schmidt, Jesper Hvass; Brandt, Christian; Pedersen, Ellen Raben

2014-01-01

with standard deviation of differences from 3.9 dB to 5.2 dB in the frequency range of 250-8000 Hz. User-operated 2AFC audiometrygave thresholds 1-2 dB lower at most frequencies compared to traditional audiometry. Conclusions: User-operated 2AFC audiometry does not require specific operating skills...
A speech reception in noise test for preschool children (the Galker-test)

DEFF Research Database (Denmark)

Lauritsen, Maj-Britt Glenn; Kreiner, Svend; Söderström, Margareta

2015-01-01

Purpose: This study evaluates initial validity and reliability of the “Galker test of speech reception in noise” developed for Danish preschool children suspected to have problems with hearing or understanding speech against strict psychometric standards and assesses acceptance by the children....... Methods:The Galker test is an audio-visual, computerised, word discrimination test in background noise, originally comprised of 50 word pairs. Three hundred and eighty eight children attending ordinary day care centres and aged 3–5 years were included. With multiple regression and the Rasch item response...... model it was examined whether the total score of the Galker test validly reflected item responses across subgroups defined by sex, age, bilingualism, tympanometry, audiometry and verbal comprehension. Results: A total of 370 children (95%) accepted testing and 339 (87%) completed all 50 items...
Hearing Tests Based on Biologically Calibrated Mobile Devices: Comparison With Pure-Tone Audiometry.

Science.gov (United States)

Masalski, Marcin; Grysiński, Tomasz; Kręcicki, Tomasz

2018-01-10

Hearing screening tests based on pure-tone audiometry may be conducted on mobile devices, provided that the devices are specially calibrated for the purpose. Calibration consists of determining the reference sound level and can be performed in relation to the hearing threshold of normal-hearing persons. In the case of devices provided by the manufacturer, together with bundled headphones, the reference sound level can be calculated once for all devices of the same model. This study aimed to compare the hearing threshold measured by a mobile device that was calibrated using a model-specific, biologically determined reference sound level with the hearing threshold obtained in pure-tone audiometry. Trial participants were recruited offline using face-to-face prompting from among Otolaryngology Clinic patients, who own Android-based mobile devices with bundled headphones. The hearing threshold was obtained on a mobile device by means of an open access app, Hearing Test, with incorporated model-specific reference sound levels. These reference sound levels were previously determined in uncontrolled conditions in relation to the hearing threshold of normal-hearing persons. An audiologist-assisted self-measurement was conducted by the participants in a sound booth, and it involved determining the lowest audible sound generated by the device within the frequency range of 250 Hz to 8 kHz. The results were compared with pure-tone audiometry. A total of 70 subjects, 34 men and 36 women, aged 18-71 years (mean 36, standard deviation [SD] 11) participated in the trial. The hearing threshold obtained on mobile devices was significantly different from the one determined by pure-tone audiometry with a mean difference of 2.6 dB (95% CI 2.0-3.1) and SD of 8.3 dB (95% CI 7.9-8.7). The number of differences not greater than 10 dB reached 89% (95% CI 88-91), whereas the mean absolute difference was obtained at 6.5 dB (95% CI 6.2-6.9). Sensitivity and specificity for a mobile
Speech comprehension difficulties in chronic tinnitus and its relation to hyperacusis

Directory of Open Access Journals (Sweden)

Veronika Vielsmeier

2016-12-01

Full Text Available AbstractObjectiveMany tinnitus patients complain about difficulties regarding speech comprehension. In spite of the high clinical relevance little is known about underlying mechanisms and predisposing factors. Here, we performed an exploratory investigation in a large sample of tinnitus patients to (1 estimate the prevalence of speech comprehension difficulties among tinnitus patients, to (2 compare subjective reports of speech comprehension difficulties with objective measurements in a standardized speech comprehension test and to (3 explore underlying mechanisms by analyzing the relationship between speech comprehension difficulties and peripheral hearing function (pure tone audiogram, as well as with co-morbid hyperacusis as a central auditory processing disorder. Subjects and MethodsSpeech comprehension was assessed in 361 tinnitus patients presenting between 07/2012 and 08/2014 at the Interdisciplinary Tinnitus Clinic at the University of Regensburg. The assessment included standard audiological assessment (pure tone audiometry, tinnitus pitch and loudness matching, the Goettingen sentence test (in quiet for speech audiometric evaluation, two questions about hyperacusis, and two questions about speech comprehension in quiet and noisy environments (How would you rate your ability to understand speech?; How would you rate your ability to follow a conversation when multiple people are speaking simultaneously?. Results Subjectively reported speech comprehension deficits are frequent among tinnitus patients, especially in noisy environments (cocktail party situation. 74.2% of all investigated patients showed disturbed speech comprehension (indicated by values above 21.5 dB SPL in the Goettingen sentence test. Subjective speech comprehension complaints (both in general and in noisy environment were correlated with hearing level and with audiologically-assessed speech comprehension ability. In contrast, co-morbid hyperacusis was only correlated
Patient-reported speech in noise difficulties and hyperacusis symptoms and correlation with test results.

Science.gov (United States)

Spyridakou, Chrysa; Luxon, Linda M; Bamiou, Doris E

2012-07-01

To compare self-reported symptoms of difficulty hearing speech in noise and hyperacusis in adults with auditory processing disorders (APDs) and normal controls; and to compare self-reported symptoms to objective test results (speech in babble test, transient evoked otoacoustic emission [TEOAE] suppression test using contralateral noise). A prospective case-control pilot study. Twenty-two participants were recruited in the study: 10 patients with reported hearing difficulty, normal audiometry, and a clinical diagnosis of APD; and 12 normal age-matched controls with no reported hearing difficulty. All participants completed the validated Amsterdam Inventory for Auditory Disability questionnaire, a hyperacusis questionnaire, a speech in babble test, and a TEOAE suppression test using contralateral noise. Patients had significantly worse scores than controls in all domains of the Amsterdam Inventory questionnaire (with the exception of sound detection) and the hyperacusis questionnaire (P reported symptoms of difficulty hearing speech in noise and speech in babble test results in the right ear (ρ = 0.624, P = .002), and between self-reported symptoms of hyperacusis and TEOAE suppression test results in the right ear (ρ = -0.597 P = .003). There was no significant correlation between the two tests. A strong correlation was observed between right ear speech in babble and patient-reported intelligibility of speech in noise, and right ear TEOAE suppression by contralateral noise and hyperacusis questionnaire. Copyright © 2012 The American Laryngological, Rhinological, and Otological Society, Inc.
Automated pure-tone audiometry: an analysis of capacity, need, and benefit.

Science.gov (United States)

Margolis, Robert H; Morgan, Donald E

2008-12-01

The rationale for automating pure-tone audiometry based on the need for hearing tests and the capacity of audiologists to provide testing is presented. The personnel time savings from automated testing are analyzed. Some possible effects of automated testing on the profession are explored. Need for testing was based on prevalence of hearing impairment, number of normal hearing patients seen for testing, and an assumption of the frequency of testing. Capacity is based on the number of audiologists and the number of audiograms performed in a typical workday. Time savings were estimated from the average duration of an audiogram and an assumption that 80% can be automated. A large gap exists between the need and the capacity of audiologists to provide testing. Automating 80% of audiograms would only partially close the gap. A significant time savings could accrue, permitting reallocation of time for doctoral level services. Although certain jobs could be affected, the gap between capacity and need is so great that automated audiometry will not significantly affect employment. Automation could increase the number of hearing impaired patients that could be served. The reallocation of personnel time would be a positive change for our patients and our profession.
High frequency audiometry in prospective clinical research of ototoxicity due to platinum derivatives

NARCIS (Netherlands)

van der Hulst, R. J.; Dreschler, W. A.; Urbanus, N. A.

1988-01-01

The results of clinical use of routine high frequency audiometry in monitoring the ototoxic side effects of platinum and its derivatives are described in this prospective study. After demonstrating the reproducibility of the technique, we discuss the first results of an analysis of ototoxic side
Comparison of middle latency responses in presbycusis patients with two different speech recognition scores.

Science.gov (United States)

Kirkim, Gunay; Madanoglu, Nevma; Akdas, Ferda; Serbetcioglu, M Bulent

2007-12-01

The purpose of this study is to evaluate whether the middle latency responses (MLR) can be used for an objective differentiation of patients with presbycusis having relatively good (Group I) and relatively poor speech recognition scores (Group II). All the participants of these groups had high frequency down-sloping hearing loss with an average of 26-60 dB HL. Data were collected from two described study groups and a control group, using pure tone audiometry, monosyllabic phonetically balanced word and synthetic sentence identification, as well as MLR. The study groups were compared with the control group. When patients in Group I were compared with the control group, only ipsilateral Na latency of middle latency evoked response was statistically significant in the right ear whereas ipsilateral Na latency in the right ear, ipsilateral and contralateral Na latency in the left ear of the patients in Group II were statistically significant. Thus, as an objective complementary tool for the evaluation of the speech perception ability of the patients with presbycusis, Na latency of MLR may be used in combination with the speech discrimination tests.
The Galker test of speech reception in noise; associations with background variables, middle ear status, hearing, and language in Danish preschool children.

Science.gov (United States)

Lauritsen, Maj-Britt Glenn; Söderström, Margareta; Kreiner, Svend; Dørup, Jens; Lous, Jørgen

2016-01-01

We tested "the Galker test", a speech reception in noise test developed for primary care for Danish preschool children, to explore if the children's ability to hear and understand speech was associated with gender, age, middle ear status, and the level of background noise. The Galker test is a 35-item audio-visual, computerized word discrimination test in background noise. Included were 370 normally developed children attending day care center. The children were examined with the Galker test, tympanometry, audiometry, and the Reynell test of verbal comprehension. Parents and daycare teachers completed questionnaires on the children's ability to hear and understand speech. As most of the variables were not assessed using interval scales, non-parametric statistics (Goodman-Kruskal's gamma) were used for analyzing associations with the Galker test score. For comparisons, analysis of variance (ANOVA) was used. Interrelations were adjusted for using a non-parametric graphic model. In unadjusted analyses, the Galker test was associated with gender, age group, language development (Reynell revised scale), audiometry, and tympanometry. The Galker score was also associated with the parents' and day care teachers' reports on the children's vocabulary, sentence construction, and pronunciation. Type B tympanograms were associated with a mean hearing 5-6dB below that of than type A, C1, or C2. In the graphic analysis, Galker scores were closely and significantly related to Reynell test scores (Gamma (G)=0.35), the children's age group (G=0.33), and the day care teachers' assessment of the children's vocabulary (G=0.26). The Galker test of speech reception in noise appears promising as an easy and quick tool for evaluating preschool children's understanding of spoken words in noise, and it correlated well with the day care teachers' reports and less with the parents' reports. Copyright © 2015 Elsevier Ireland Ltd. All rights reserved.
Effect of hearing aids use on speech stimulus decoding through speech-evoked ABR

Directory of Open Access Journals (Sweden)

Renata Aparecida Leite

Full Text Available Abstract Introduction The electrophysiological responses obtained with the complex auditory brainstem response (cABR provide objective measures of subcortical processing of speech and other complex stimuli. The cABR has also been used to verify the plasticity in the auditory pathway in the subcortical regions. Objective To compare the results of cABR obtained in children using hearing aids before and after 9 months of adaptation, as well as to compare the results of these children with those obtained in children with normal hearing. Methods Fourteen children with normal hearing (Control Group - CG and 18 children with mild to moderate bilateral sensorineural hearing loss (Study Group - SG, aged 7-12 years, were evaluated. The children were submitted to pure tone and vocal audiometry, acoustic immittance measurements and ABR with speech stimulus, being submitted to the evaluations at three different moments: initial evaluation (M0, 3 months after the initial evaluation (M3 and 9 months after the evaluation (M9; at M0, the children assessed in the study group did not use hearing aids yet. Results When comparing the CG and the SG, it was observed that the SG had a lower median for the V-A amplitude at M0 and M3, lower median for the latency of the component V at M9 and a higher median for the latency of component O at M3 and M9. A reduction in the latency of component A at M9 was observed in the SG. Conclusion Children with mild to moderate hearing loss showed speech stimulus processing deficits and the main impairment is related to the decoding of the transient portion of this stimulus spectrum. It was demonstrated that the use of hearing aids promoted neuronal plasticity of the Central Auditory Nervous System after an extended time of sensory stimulation.
Self-test web-based pure-tone audiometry: validity evaluation and measurement error analysis.

Science.gov (United States)

Masalski, Marcin; Kręcicki, Tomasz

2013-04-12

Potential methods of application of self-administered Web-based pure-tone audiometry conducted at home on a PC with a sound card and ordinary headphones depend on the value of measurement error in such tests. The aim of this research was to determine the measurement error of the hearing threshold determined in the way described above and to identify and analyze factors influencing its value. The evaluation of the hearing threshold was made in three series: (1) tests on a clinical audiometer, (2) self-tests done on a specially calibrated computer under the supervision of an audiologist, and (3) self-tests conducted at home. The research was carried out on the group of 51 participants selected from patients of an audiology outpatient clinic. From the group of 51 patients examined in the first two series, the third series was self-administered at home by 37 subjects (73%). The average difference between the value of the hearing threshold determined in series 1 and in series 2 was -1.54dB with standard deviation of 7.88dB and a Pearson correlation coefficient of .90. Between the first and third series, these values were -1.35dB±10.66dB and .84, respectively. In series 3, the standard deviation was most influenced by the error connected with the procedure of hearing threshold identification (6.64dB), calibration error (6.19dB), and additionally at the frequency of 250Hz by frequency nonlinearity error (7.28dB). The obtained results confirm the possibility of applying Web-based pure-tone audiometry in screening tests. In the future, modifications of the method leading to the decrease in measurement error can broaden the scope of Web-based pure-tone audiometry application.
Contrast sensitivity test and conventional and high frequency audiometry: information beyond that required to prescribe lenses and headsets

Science.gov (United States)

Comastri, S. A.; Martin, G.; Simon, J. M.; Angarano, C.; Dominguez, S.; Luzzi, F.; Lanusse, M.; Ranieri, M. V.; Boccio, C. M.

2008-04-01

In Optometry and in Audiology, the routine tests to prescribe correction lenses and headsets are respectively the visual acuity test (the first chart with letters was developed by Snellen in 1862) and conventional pure tone audiometry (the first audiometer with electrical current was devised by Hartmann in 1878). At present there are psychophysical non invasive tests that, besides evaluating visual and auditory performance globally and even in cases catalogued as normal according to routine tests, supply early information regarding diseases such as diabetes, hypertension, renal failure, cardiovascular problems, etc. Concerning Optometry, one of these tests is the achromatic luminance contrast sensitivity test (introduced by Schade in 1956). Concerning Audiology, one of these tests is high frequency pure tone audiometry (introduced a few decades ago) which yields information relative to pathologies affecting the basal cochlea and complements data resulting from conventional audiometry. These utilities of the contrast sensitivity test and of pure tone audiometry derive from the facts that Fourier components constitute the basis to synthesize stimuli present at the entrance of the visual and auditory systems; that these systems responses depend on frequencies and that the patient's psychophysical state affects frequency processing. The frequency of interest in the former test is the effective spatial frequency (inverse of the angle subtended at the eye by a cycle of a sinusoidal grating and measured in cycles/degree) and, in the latter, the temporal frequency (measured in cycles/sec). Both tests have similar duration and consist in determining the patient's threshold (corresponding to the inverse multiplicative of the contrast or to the inverse additive of the sound intensity level) for each harmonic stimulus present at the system entrance (sinusoidal grating or pure tone sound). In this article the frequencies, standard normality curves and abnormal threshold shifts
An overview of changes in pressure values of the middle ear using impedance audiometry among diver candidates in a hyperbaric chamber before and after a pressure test

Science.gov (United States)

Anoraga, J. S.; Bramantyo, B.; Bardosono, S.; Simanungkalit, S. H.; Basiruddin, J.

2017-08-01

Impedance audiometry is not yet routinely used in pressure tests, especially in Indonesia. Direct exposure to pressure in a hyperbaric chamber is sometimes without any assessment of the middle ear or the Eustachian tube function (ETF) of ventilation. Impedance audiometry examinations are important to assess ETF ventilation. This study determined the middle ear pressure value changes associated with the ETF (ventilation) of prospective divers. This study included 29 prospective divers aged 20-40 years without conductive hearing loss. All subjects underwent a modified diving impedance audiometry examination both before and after the pressure test in a double-lock hyperbaric chamber. Using the Toynbee maneuver, the values obtained for changes of pressure in the middle ear were significant before and after the pressure test in the right and left ears: p < 0.001 and p = 0.018, respectively. The impedance audiometry examination is necessary for the selection of candidate divers undergoing pressure tests within a hyperbaric chamber.

Extended High Frequency Audiometry in Polycystic Ovary Syndrome

Directory of Open Access Journals (Sweden)

Cuneyt Kucur

2013-01-01

and BMI of PCOS and control groups were comparable. Each subject was tested with low (250–2000 Hz, high (4000–8000 Hz, and extended high frequency audiometry (8000–20000. Hormonal and biochemical values including LH, LH/FSH, testosterone, fasting glucose, fasting insulin, HOMA-I, and CRP were calculated. Results. PCOS patients showed high levels of LH, LH/FSH, testosterone, fasting insulin, glucose, HOMA-I, and CRP levels. The hearing thresholds of the groups were similar at frequencies of 250, 500, 1000, 2000, and 4000 Hz; statistically significant difference was observed in 8000–14000 Hz in PCOS group compared to control group. Conclusion. PCOS patients have hearing impairment especially in extended high frequencies. Further studies are needed to help elucidate the mechanism behind hearing impairment in association with PCOS.
Differential Diagnosis of Speech Sound Disorder (Phonological Disorder): Audiological Assessment beyond the Pure-tone Audiogram.

Science.gov (United States)

Iliadou, Vasiliki Vivian; Chermak, Gail D; Bamiou, Doris-Eva

2015-04-01

According to the Diagnostic and Statistical Manual of Mental Disorders, Fifth Edition, diagnosis of speech sound disorder (SSD) requires a determination that it is not the result of other congenital or acquired conditions, including hearing loss or neurological conditions that may present with similar symptomatology. To examine peripheral and central auditory function for the purpose of determining whether a peripheral or central auditory disorder was an underlying factor or contributed to the child's SSD. Central auditory processing disorder clinic pediatric case reports. Three clinical cases are reviewed of children with diagnosed SSD who were referred for audiological evaluation by their speech-language pathologists as a result of slower than expected progress in therapy. Audiological testing revealed auditory deficits involving peripheral auditory function or the central auditory nervous system. These cases demonstrate the importance of increasing awareness among professionals of the need to fully evaluate the auditory system to identify auditory deficits that could contribute to a patient's speech sound (phonological) disorder. Audiological assessment in cases of suspected SSD should not be limited to pure-tone audiometry given its limitations in revealing the full range of peripheral and central auditory deficits, deficits which can compromise treatment of SSD. American Academy of Audiology.
Effect of age at cochlear implantation on auditory and speech development of children with auditory neuropathy spectrum disorder.

Science.gov (United States)

Liu, Yuying; Dong, Ruijuan; Li, Yuling; Xu, Tianqiu; Li, Yongxin; Chen, Xueqing; Gong, Shusheng

2014-12-01

To evaluate the auditory and speech abilities in children with auditory neuropathy spectrum disorder (ANSD) after cochlear implantation (CI) and determine the role of age at implantation. Ten children participated in this retrospective case series study. All children had evidence of ANSD. All subjects had no cochlear nerve deficiency on magnetic resonance imaging and had used the cochlear implants for a period of 12-84 months. We divided our children into two groups: children who underwent implantation before 24 months of age and children who underwent implantation after 24 months of age. Their auditory and speech abilities were evaluated using the following: behavioral audiometry, the Categories of Auditory Performance (CAP), the Meaningful Auditory Integration Scale (MAIS), the Infant-Toddler Meaningful Auditory Integration Scale (IT-MAIS), the Standard-Chinese version of the Monosyllabic Lexical Neighborhood Test (LNT), the Multisyllabic Lexical Neighborhood Test (MLNT), the Speech Intelligibility Rating (SIR) and the Meaningful Use of Speech Scale (MUSS). All children showed progress in their auditory and language abilities. The 4-frequency average hearing level (HL) (500Hz, 1000Hz, 2000Hz and 4000Hz) of aided hearing thresholds ranged from 17.5 to 57.5dB HL. All children developed time-related auditory perception and speech skills. Scores of children with ANSD who received cochlear implants before 24 months tended to be better than those of children who received cochlear implants after 24 months. Seven children completed the Mandarin Lexical Neighborhood Test. Approximately half of the children showed improved open-set speech recognition. Cochlear implantation is helpful for children with ANSD and may be a good optional treatment for many ANSD children. In addition, children with ANSD fitted with cochlear implants before 24 months tended to acquire auditory and speech skills better than children fitted with cochlear implants after 24 months. Copyright © 2014
Age-related hearing loss in dogs : Diagnosis with Brainstem-Evoked Response Audiometry and Treatment with Vibrant Soundbridge Middle Ear Implant.

NARCIS (Netherlands)

ter Haar, G.

2009-01-01

Age-related hearing loss (ARHL) is the most common cause of acquired hearing impairment in dogs. Diagnosis requires objective electrophysiological tests (brainstem evoked response audiometry [BERA]) evaluating the entire audible frequency range in dogs. In our laboratory a method was developed to
78 FR 49693 - Speech-to-Speech and Internet Protocol (IP) Speech-to-Speech Telecommunications Relay Services...

Science.gov (United States)

2013-08-15

...-Speech Services for Individuals with Hearing and Speech Disabilities, Report and Order (Order), document...] Speech-to-Speech and Internet Protocol (IP) Speech-to-Speech Telecommunications Relay Services; Telecommunications Relay Services and Speech-to-Speech Services for Individuals With Hearing and Speech Disabilities...
The effects of noise exposure and musical training on suprathreshold auditory processing and speech perception in noise.

Science.gov (United States)

Yeend, Ingrid; Beach, Elizabeth Francis; Sharma, Mridula; Dillon, Harvey

2017-09-01

Recent animal research has shown that exposure to single episodes of intense noise causes cochlear synaptopathy without affecting hearing thresholds. It has been suggested that the same may occur in humans. If so, it is hypothesized that this would result in impaired encoding of sound and lead to difficulties hearing at suprathreshold levels, particularly in challenging listening environments. The primary aim of this study was to investigate the effect of noise exposure on auditory processing, including the perception of speech in noise, in adult humans. A secondary aim was to explore whether musical training might improve some aspects of auditory processing and thus counteract or ameliorate any negative impacts of noise exposure. In a sample of 122 participants (63 female) aged 30-57 years with normal or near-normal hearing thresholds, we conducted audiometric tests, including tympanometry, audiometry, acoustic reflexes, otoacoustic emissions and medial olivocochlear responses. We also assessed temporal and spectral processing, by determining thresholds for detection of amplitude modulation and temporal fine structure. We assessed speech-in-noise perception, and conducted tests of attention, memory and sentence closure. We also calculated participants' accumulated lifetime noise exposure and administered questionnaires to assess self-reported listening difficulty and musical training. The results showed no clear link between participants' lifetime noise exposure and performance on any of the auditory processing or speech-in-noise tasks. Musical training was associated with better performance on the auditory processing tasks, but not the on the speech-in-noise perception tasks. The results indicate that sentence closure skills, working memory, attention, extended high frequency hearing thresholds and medial olivocochlear suppression strength are important factors that are related to the ability to process speech in noise. Crown Copyright © 2017. Published by
Sensitivity of cortical auditory evoked potential detection for hearing-impaired infants in response to short speech sounds

Directory of Open Access Journals (Sweden)

Bram Van Dun

2012-01-01

Full Text Available
Background: Cortical auditory evoked potentials (CAEPs are an emerging tool for hearing aid fitting evaluation in young children who cannot provide reliable behavioral feedback. It is therefore useful to determine the relationship between the sensation level of speech sounds and the detection sensitivity of CAEPs.

Design and methods: Twenty-five sensorineurally hearing impaired infants with an age range of 8 to 30 months were tested once, 18 aided and 7 unaided. First, behavioral thresholds of speech stimuli /m/, /g/, and /t/ were determined using visual reinforcement orientation audiometry (VROA. Afterwards, the same speech stimuli were presented at 55, 65, and 75 dB SPL, and CAEP recordings were made. An automatic statistical detection paradigm was used for CAEP detection.

Results: For sensation levels above 0, 10, and 20 dB respectively, detection sensitivities were equal to 72 ± 10, 75 ± 10, and 78 ± 12%. In 79% of the cases, automatic detection p-values became smaller when the sensation level was increased by 10 dB.

Conclusions: The results of this study suggest that the presence or absence of CAEPs can provide some indication of the audibility of a speech sound for infants with sensorineural hearing loss. The detection of a CAEP provides confidence, to a degree commensurate with the detection probability, that the infant is detecting that sound at the level presented. When testing infants where the audibility of speech sounds has not been established behaviorally, the lack of a cortical response indicates the possibility, but by no means a certainty, that the sensation level is 10 dB or less.
An assessment of threshold shifts in nonprofessional pop/rock musicians using conventional and extended high-frequency audiometry.

Science.gov (United States)

Schmuziger, Nicolas; Patscheke, Jochen; Probst, Rudolf

2007-09-01

The clinical value of extended high-frequency audiometry for the detection of noise-induced hearing loss has not been established conclusively. The purpose of this study was to assess the relative temporary threshold shift (TTS) in two frequency regions (conventional versus extended high frequency). In this exploratory study, pure-tone thresholds from 0.5 to 14 kHz were measured in both ears of 16 nonprofessional pop/rock musicians (mean age, 35 yr; range, 27 to 49 yr), before and after a 90-minute rehearsal session. All had experienced repeated exposures to intense sound levels during at least 5 yr of their musical careers. After the rehearsal, median threshold levels were found to be significantly poorer for frequencies from 0.5 to 8 kHz (Wilcoxon signed rank test, p audiometry does not seem advantageous as a means of the early detection of noise-induced hearing loss.
Speech-to-Speech Relay Service

Science.gov (United States)

Consumer Guide Speech to Speech Relay Service Speech-to-Speech (STS) is one form of Telecommunications Relay Service (TRS). TRS is a service that allows persons with hearing and speech disabilities ...
Relationship between pure-tone audiogram findings and speech perception among older Japanese persons.

Science.gov (United States)

Maeda, Yukihide; Takao, Soshi; Sugaya, Akiko; Kataoka, Yuko; Kariya, Shin; Tanaka, Satomi; Nagayasu, Rie; Nakagawa, Atsuko; Nishizaki, Kazunori

2018-02-01

To clarify how the pure-tone threshold (PTT) on the PTA predicts speech perception (SP) in elderly Japanese persons. Data on PTT and SP were cross-sectionally analyzed in Japanese persons (656 ears in 353 patients, aged ≥65 years). Correlations of SP and average PTT in all tested frequencies were evaluated by Pearson's correlation coefficient and simple linear regression. After adjusting for sex, laterality of ears, and age, the relationship of average and frequency-specific PTT with impaired SP ≤50% was estimated by logistic regression models. SP correlated well (r = -0.699) with the average PTT of all tested frequencies. On the other hand, the correlation between patient age and SP was weak, especially among ≤85-year-old persons (r = -0.092). Linear regression showed that the average PTT corresponding to SP of 50% was 76.4 dB nHL. Odds ratios for impaired SP were highest for PTT at 2000 Hz. Odds ratios were higher for middle (500, 1000, 2000 Hz) and high frequencies (4000, 8000 Hz) than low frequencies (125, 250 Hz). The PTT on the pure-tone audiogram (PTA) is a good predictor of SP by speech audiometry among older persons, which could provide clinically important information for hearing aid fitting and cochlear implantation.
Water-soluble coenzyme Q10 formulation in presbycusis: long-term effects.

Science.gov (United States)

Guastini, Luca; Mora, Renzo; Dellepiane, Massimo; Santomauro, Valentina; Giorgio, Manini; Salami, Angelo

2011-05-01

These findings provide the basis for understanding the duration of the effect after the last use of the drug and encourage a larger clinical trial to collect additional evidence on the effect of coenzyme Q10 (CoQ10) in preventing the development of hearing loss in subjects with presbycusis. The aim of this study was to evaluate the long-term effects of a water-soluble formulation of CoQ10 (Q-TER) in subjects with presbycusis. Sixty patients with presbycusis were included and divided at random into three numerically equal groups. For 30 days, group A underwent therapy with Q-TER, group B underwent therapy with vitamin E, and group C received placebo. Before, at the end, and 6 months after the end of the treatment, all patients underwent evaluation of pure tone audiometry, transient evoked otoacoustic emissions and otoacoustic products of distortion, auditory brainstem response, and speech audiometry. Compared with group B, at the end of the treatment in group A the pure tone audiometry showed a significant (p < 0.05) improvement of the audiometric thresholds at 1000, 2000, 4000, and 8000 Hz. This improvement was confirmed by the speech audiometry and last check. We found no significant differences in the other parameters and in group C.
Brainstem response audiometry in the determination of low-frequency hearing loss : a study of various methods for frequency-specific ABR-threshold assessment

NARCIS (Netherlands)

E.A.G.J. Conijn

1992-01-01

textabstractBrainstem Electric Response Audiometry (BERA) is a method to visualize some of the electric activity generated in the auditory nerve and the brainstem during the processing of sound. The amplitude of the Auditory Brainstem Response (ABR) is very small (0.05-0.5 flV). The potentials
An analysis of machine translation and speech synthesis in speech-to-speech translation system

OpenAIRE

Hashimoto, K.; Yamagishi, J.; Byrne, W.; King, S.; Tokuda, K.

2011-01-01

This paper provides an analysis of the impacts of machine translation and speech synthesis on speech-to-speech translation systems. The speech-to-speech translation system consists of three components: speech recognition, machine translation and speech synthesis. Many techniques for integration of speech recognition and machine translation have been proposed. However, speech synthesis has not yet been considered. Therefore, in this paper, we focus on machine translation and speech synthesis, ...
Hearing loss in children with otitis media with effusion: a systematic review.

Science.gov (United States)

Cai, Ting; McPherson, Bradley

2017-02-01

Otitis media with effusion (OME) is the presence of non-purulent inflammation in the middle ear. Hearing impairment is frequently associated with OME. Pure tone audiometry and speech audiometry are two of the most primarily utilised auditory assessments and provide valuable behavioural and functional estimation on hearing loss. This paper was designed to review and analyse the effects of the presence of OME on children's listening abilities. A systematic and descriptive review. Twelve articles reporting frequency-specific pure tone thresholds and/or speech perception measures in children with OME were identified using PubMed, Ovid, Web of Science, ProQuest and Google Scholar search platforms. The hearing loss related to OME averages 18-35 dB HL. The air conduction configuration is roughly flat with a slight elevation at 2000 Hz and a nadir at 8000 Hz. Both speech-in-quiet and speech-in-noise perception have been found to be impaired. OME imposes a series of disadvantages on hearing sensitivity and speech perception in children. Further studies investigating the full range of frequency-specific pure tone thresholds, and that adopt standardised speech test materials are advocated to evaluate hearing related disabilities with greater comprehensiveness, comparability and enhanced consideration of their real life implications.
Hearing status in patients with rheumatoid arthritis.

Science.gov (United States)

Ahmadzadeh, A; Daraei, M; Jalessi, M; Peyvandi, A A; Amini, E; Ranjbar, L A; Daneshi, A

2017-10-01

Rheumatoid arthritis is thought to induce conductive hearing loss and/or sensorineural hearing loss. This study evaluated the function of the middle ear and cochlea, and the related factors. Pure tone audiometry, speech reception thresholds, speech discrimination scores, tympanometry, acoustic reflexes, and distortion product otoacoustic emissions were assessed in rheumatoid arthritis patients and healthy volunteers. Pure tone audiometry results revealed a higher bone conduction threshold in the rheumatoid arthritis group, but there was no significant difference when evaluated according to the sensorineural hearing loss definition. Distortion product otoacoustic emissions related prevalence of conductive or mixed hearing loss, tympanometry values, acoustic reflexes, and speech discrimination scores were not significantly different between the two groups. Sensorineural hearing loss was significantly more prevalent in patients who used azathioprine, cyclosporine and etanercept. Higher bone conduction thresholds in some frequencies were detected in rheumatoid arthritis patients that were not clinically significant. Sensorineural hearing loss is significantly more prevalent in refractory rheumatoid arthritis patients.
Audiological manifestations in HIV-positive adults

Directory of Open Access Journals (Sweden)

Carla Gentile Matas

2014-07-01

Full Text Available OBJECTIVE:To characterize the findings of behavioral hearing assessment in HIV-positive individuals who received and did not receive antiretroviral treatment.METHODS:This research was a cross-sectional study. The participants were 45 HIV-positive individuals (18 not exposed and 27 exposed to antiretroviral treatment and 30 control-group individuals. All subjects completed an audiological evaluation through pure-tone audiometry, speech audiometry, and high-frequency audiometry.RESULTS:The hearing thresholds obtained by pure-tone audiometry were different between groups. The group that had received antiretroviral treatment had higher thresholds for the frequencies ranging from 250 to 3000 Hz compared with the control group and the group not exposed to treatment. In the range of frequencies from 4000 through 8000 Hz, the HIV-positive groups presented with higher thresholds than did the control group. The hearing thresholds determined by high-frequency audiometry were different between groups, with higher thresholds in the HIV-positive groups.CONCLUSION:HIV-positive individuals presented poorer results in pure-tone and high-frequency audiometry, suggesting impairment of the peripheral auditory pathway. Individuals who received antiretroviral treatment presented poorer results on both tests compared with individuals not exposed to antiretroviral treatment.
Audiological manifestations in HIV-positive adults.

Science.gov (United States)

Matas, Carla Gentile; Angrisani, Rosanna Giaffredo; Magliaro, Fernanda Cristina Leite; Segurado, Aluisio Augusto Cotrim

2014-07-01

To characterize the findings of behavioral hearing assessment in HIV-positive individuals who received and did not receive antiretroviral treatment. This research was a cross-sectional study. The participants were 45 HIV-positive individuals (18 not exposed and 27 exposed to antiretroviral treatment) and 30 control-group individuals. All subjects completed an audiological evaluation through pure-tone audiometry, speech audiometry, and high-frequency audiometry. The hearing thresholds obtained by pure-tone audiometry were different between groups. The group that had received antiretroviral treatment had higher thresholds for the frequencies ranging from 250 to 3000 Hz compared with the control group and the group not exposed to treatment. In the range of frequencies from 4000 through 8000 Hz, the HIV-positive groups presented with higher thresholds than did the control group. The hearing thresholds determined by high-frequency audiometry were different between groups, with higher thresholds in the HIV-positive groups. HIV-positive individuals presented poorer results in pure-tone and high-frequency audiometry, suggesting impairment of the peripheral auditory pathway. Individuals who received antiretroviral treatment presented poorer results on both tests compared with individuals not exposed to antiretroviral treatment.
Neural Entrainment to Speech Modulates Speech Intelligibility

NARCIS (Netherlands)

Riecke, Lars; Formisano, Elia; Sorger, Bettina; Baskent, Deniz; Gaudrain, Etienne

2018-01-01

Speech is crucial for communication in everyday life. Speech-brain entrainment, the alignment of neural activity to the slow temporal fluctuations (envelope) of acoustic speech input, is a ubiquitous element of current theories of speech processing. Associations between speech-brain entrainment and
THE PRESENCE OF ADENOID VEGETATIONS AND NASAL SPEECH, AND HEARING LOSS IN RELATION TO SECRETORY OTITIS MEDIA

Directory of Open Access Journals (Sweden)

Gabriela KOPACHEVA

2004-12-01

Full Text Available This study presents the treatment of 68 children with secretory otitis media. Children underwent adenoid vegetations, nasal speech, conductive hearing loss, ventilation disturbance in Eustachian tube. In all children adenoidectomy was indicated.38 boys and 30 girls at the age of 3-17 were divided in two main groups: * 29 children without hypertrophic (enlarged adenoids, * 39 children with enlarged (hypertrophic adenoids.The surgical treatment included insertion of ventilation tubes and adenoidectomy where there where hypertrophic adenoids.Clinical material was analyzed according to hearing threshold, hearing level, middle ear condition estimated by pure tone audiometry and tympanometry before and after treatment. Data concerning both groups were compared.The results indicated that adenoidectomy combined with the ventilation tubes facilitates secretory otitis media heeling as well as decrease of hearing impairments. That enables prompt restoration of the hearing function as an important precondition for development of the language, social, emotional and academic development of children.
Reduced auditory efferent activity in childhood selective mutism.

Science.gov (United States)

Bar-Haim, Yair; Henkin, Yael; Ari-Even-Roth, Daphne; Tetin-Schneider, Simona; Hildesheimer, Minka; Muchnik, Chava

2004-06-01

Selective mutism is a psychiatric disorder of childhood characterized by consistent inability to speak in specific situations despite the ability to speak normally in others. The objective of this study was to test whether reduced auditory efferent activity, which may have direct bearings on speaking behavior, is compromised in selectively mute children. Participants were 16 children with selective mutism and 16 normally developing control children matched for age and gender. All children were tested for pure-tone audiometry, speech reception thresholds, speech discrimination, middle-ear acoustic reflex thresholds and decay function, transient evoked otoacoustic emission, suppression of transient evoked otoacoustic emission, and auditory brainstem response. Compared with control children, selectively mute children displayed specific deficiencies in auditory efferent activity. These aberrations in efferent activity appear along with normal pure-tone and speech audiometry and normal brainstem transmission as indicated by auditory brainstem response latencies. The diminished auditory efferent activity detected in some children with SM may result in desensitization of their auditory pathways by self-vocalization and in reduced control of masking and distortion of incoming speech sounds. These children may gradually learn to restrict vocalization to the minimal amount possible in contexts that require complex auditory processing.

Management of non-organic hearing loss in children - A case study.

Science.gov (United States)

Skarzynski, Piotr Henryk; Raj-Koziak, Danuta; Rajchel, Joanna Jadwiga; Skarzynski, Henryk

2017-06-01

A 10 year-old girl was admitted due to the claim of progressively developing hearing loss. The impedance audiometry showed no abnormalities but it was impossible to obtain reliable outcomes during pure tone audiometry assessment. The girl was additionally sent for speech audiometry, indicating a bilateral hearing loss and objective evaluations such as distortion product otoacoustic emissions and auditory brainstem responses, which results indicated a normal hearing. On the second day, repeated subjective audiometric tests showed also normal hearing, despite constantly reported hearing loss. After the psychological consultation and exclusion of neurologic pathology, the diagnosis of non-organic hearing loss was stated and the girl was scheduled for regular appointments with psychologist. Copyright © 2017 Elsevier B.V. All rights reserved.
Inner Speech's Relationship With Overt Speech in Poststroke Aphasia.

Science.gov (United States)

Stark, Brielle C; Geva, Sharon; Warburton, Elizabeth A

2017-09-18

Relatively preserved inner speech alongside poor overt speech has been documented in some persons with aphasia (PWA), but the relationship of overt speech with inner speech is still largely unclear, as few studies have directly investigated these factors. The present study investigates the relationship of relatively preserved inner speech in aphasia with selected measures of language and cognition. Thirty-eight persons with chronic aphasia (27 men, 11 women; average age 64.53 ± 13.29 years, time since stroke 8-111 months) were classified as having relatively preserved inner and overt speech (n = 21), relatively preserved inner speech with poor overt speech (n = 8), or not classified due to insufficient measurements of inner and/or overt speech (n = 9). Inner speech scores (by group) were correlated with selected measures of language and cognition from the Comprehensive Aphasia Test (Swinburn, Porter, & Al, 2004). The group with poor overt speech showed a significant relationship of inner speech with overt naming (r = .95, p speech and language and cognition factors were not significant for the group with relatively good overt speech. As in previous research, we show that relatively preserved inner speech is found alongside otherwise severe production deficits in PWA. PWA with poor overt speech may rely more on preserved inner speech for overt picture naming (perhaps due to shared resources with verbal working memory) and for written picture description (perhaps due to reliance on inner speech due to perceived task difficulty). Assessments of inner speech may be useful as a standard component of aphasia screening, and therapy focused on improving and using inner speech may prove clinically worthwhile. https://doi.org/10.23641/asha.5303542.
Musician advantage for speech-on-speech perception

NARCIS (Netherlands)

Başkent, Deniz; Gaudrain, Etienne

Evidence for transfer of musical training to better perception of speech in noise has been mixed. Unlike speech-in-noise, speech-on-speech perception utilizes many of the skills that musical training improves, such as better pitch perception and stream segregation, as well as use of higher-level
INTEGRATING MACHINE TRANSLATION AND SPEECH SYNTHESIS COMPONENT FOR ENGLISH TO DRAVIDIAN LANGUAGE SPEECH TO SPEECH TRANSLATION SYSTEM

Directory of Open Access Journals (Sweden)

J. SANGEETHA

2015-02-01

Full Text Available This paper provides an interface between the machine translation and speech synthesis system for converting English speech to Tamil text in English to Tamil speech to speech translation system. The speech translation system consists of three modules: automatic speech recognition, machine translation and text to speech synthesis. Many procedures for incorporation of speech recognition and machine translation have been projected. Still speech synthesis system has not yet been measured. In this paper, we focus on integration of machine translation and speech synthesis, and report a subjective evaluation to investigate the impact of speech synthesis, machine translation and the integration of machine translation and speech synthesis components. Here we implement a hybrid machine translation (combination of rule based and statistical machine translation and concatenative syllable based speech synthesis technique. In order to retain the naturalness and intelligibility of synthesized speech Auto Associative Neural Network (AANN prosody prediction is used in this work. The results of this system investigation demonstrate that the naturalness and intelligibility of the synthesized speech are strongly influenced by the fluency and correctness of the translated text.
[Evaluation of the Freiburg monosyllabic speech test in background noise].

Science.gov (United States)

Löhler, J; Akcicek, B; Pilnik, M; Saager-Post, K; Dazert, S; Biedron, S; Oeken, J; Mürbe, D; Löbert, J; Laszig, R; Wesarg, T; Langer, C; Plontke, S; Rahne, T; Machate, U; Noppeney, R; Schultz, K; Plinkert, P; Hoth, S; Praetorius, M; Schlattmann, P; Meister, E F; Pau, H W; Ehrt, K; Hagen, R; Shehata-Dieler, W; Cebulla, M; Walther, L E; Ernst, A

2013-07-01

The Freiburg speech test has been the gold standard in speech audiometry in Germany for many years. Previously, however, this test had not been evaluated in assessing the effectiveness of a hearing aid in background noise. Furthermore, the validity of particular word lists used in the test has been questioned repeatedly in the past, due to a suspected higher variation within these lists as compared to the other word list used. In this prospective study, two groups of subjects [normal hearing control subjects and patients with SNHL (sensorineural hearing loss) that had been fitted with hearing aid] were examined. In a first group, 113 control subjects with normal age- and gender-related pure tone thresholds were assessed by means of the Freiburg monosyllabic test under free-field conditions at 65 dB. The second group comprised 104 patients that had been fitted with hearing aids at least 3 months previously to treat their SNHL. Members of the SNHL group were assessed by means of the Freiburg monosyllabic test both with and without hearing aids, and in the presence or absence of background noise (CCITT-noise; 65/60 dB signal-noise ratio, in accordance with the Comité Consultatif International Téléphonique et Télégraphique), under free-field conditions at 65 dB. The first (control) group exhibited no gender-related differences in the Freiburg test results. In a few instances, inter-individual variability of responses was observed, although the reasons for this remain to be clarified. Within the second (patient) group, the Freiburg test results under the four different measurement conditions differed significantly from each other (p>0.05). This group exhibited a high degree of inter-individual variability between responses. In light of this, no significant differences in outcome could be assigned to the different word lists employed in the Freiburg speech test. The Freiburg monosyllabic test is able to assess the extent of hearing loss, as well as the effectiveness
Speech endpoint detection with non-language speech sounds for generic speech processing applications

Science.gov (United States)

McClain, Matthew; Romanowski, Brian

2009-05-01

Non-language speech sounds (NLSS) are sounds produced by humans that do not carry linguistic information. Examples of these sounds are coughs, clicks, breaths, and filled pauses such as "uh" and "um" in English. NLSS are prominent in conversational speech, but can be a significant source of errors in speech processing applications. Traditionally, these sounds are ignored by speech endpoint detection algorithms, where speech regions are identified in the audio signal prior to processing. The ability to filter NLSS as a pre-processing step can significantly enhance the performance of many speech processing applications, such as speaker identification, language identification, and automatic speech recognition. In order to be used in all such applications, NLSS detection must be performed without the use of language models that provide knowledge of the phonology and lexical structure of speech. This is especially relevant to situations where the languages used in the audio are not known apriori. We present the results of preliminary experiments using data from American and British English speakers, in which segments of audio are classified as language speech sounds (LSS) or NLSS using a set of acoustic features designed for language-agnostic NLSS detection and a hidden-Markov model (HMM) to model speech generation. The results of these experiments indicate that the features and model used are capable of detection certain types of NLSS, such as breaths and clicks, while detection of other types of NLSS such as filled pauses will require future research.
Music and Speech Perception in Children Using Sung Speech.

Science.gov (United States)

Nie, Yingjiu; Galvin, John J; Morikawa, Michael; André, Victoria; Wheeler, Harley; Fu, Qian-Jie

2018-01-01

This study examined music and speech perception in normal-hearing children with some or no musical training. Thirty children (mean age = 11.3 years), 15 with and 15 without formal music training participated in the study. Music perception was measured using a melodic contour identification (MCI) task; stimuli were a piano sample or sung speech with a fixed timbre (same word for each note) or a mixed timbre (different words for each note). Speech perception was measured in quiet and in steady noise using a matrix-styled sentence recognition task; stimuli were naturally intonated speech or sung speech with a fixed pitch (same note for each word) or a mixed pitch (different notes for each word). Significant musician advantages were observed for MCI and speech in noise but not for speech in quiet. MCI performance was significantly poorer with the mixed timbre stimuli. Speech performance in noise was significantly poorer with the fixed or mixed pitch stimuli than with spoken speech. Across all subjects, age at testing and MCI performance were significantly correlated with speech performance in noise. MCI and speech performance in quiet was significantly poorer for children than for adults from a related study using the same stimuli and tasks; speech performance in noise was significantly poorer for young than for older children. Long-term music training appeared to benefit melodic pitch perception and speech understanding in noise in these pediatric listeners.
Apraxia of Speech

Science.gov (United States)

... Health Info » Voice, Speech, and Language Apraxia of Speech On this page: What is apraxia of speech? ... about apraxia of speech? What is apraxia of speech? Apraxia of speech (AOS)—also known as acquired ...
Common neural substrates support speech and non-speech vocal tract gestures.

Science.gov (United States)

Chang, Soo-Eun; Kenney, Mary Kay; Loucks, Torrey M J; Poletto, Christopher J; Ludlow, Christy L

2009-08-01

The issue of whether speech is supported by the same neural substrates as non-speech vocal tract gestures has been contentious. In this fMRI study we tested whether producing non-speech vocal tract gestures in humans shares the same functional neuroanatomy as non-sense speech syllables. Production of non-speech vocal tract gestures, devoid of phonological content but similar to speech in that they had familiar acoustic and somatosensory targets, was compared to the production of speech syllables without meaning. Brain activation related to overt production was captured with BOLD fMRI using a sparse sampling design for both conditions. Speech and non-speech were compared using voxel-wise whole brain analyses, and ROI analyses focused on frontal and temporoparietal structures previously reported to support speech production. Results showed substantial activation overlap between speech and non-speech function in regions. Although non-speech gesture production showed greater extent and amplitude of activation in the regions examined, both speech and non-speech showed comparable left laterality in activation for both target perception and production. These findings posit a more general role of the previously proposed "auditory dorsal stream" in the left hemisphere--to support the production of vocal tract gestures that are not limited to speech processing.
Speech Compression

Directory of Open Access Journals (Sweden)

Jerry D. Gibson

2016-06-01

Full Text Available Speech compression is a key technology underlying digital cellular communications, VoIP, voicemail, and voice response systems. We trace the evolution of speech coding based on the linear prediction model, highlight the key milestones in speech coding, and outline the structures of the most important speech coding standards. Current challenges, future research directions, fundamental limits on performance, and the critical open problem of speech coding for emergency first responders are all discussed.
Speech Production and Speech Discrimination by Hearing-Impaired Children.

Science.gov (United States)

Novelli-Olmstead, Tina; Ling, Daniel

1984-01-01

Seven hearing impaired children (five to seven years old) assigned to the Speakers group made highly significant gains in speech production and auditory discrimination of speech, while Listeners made only slight speech production gains and no gains in auditory discrimination. Combined speech and auditory training was more effective than auditory…
Stuttering Frequency, Speech Rate, Speech Naturalness, and Speech Effort During the Production of Voluntary Stuttering.

Science.gov (United States)

Davidow, Jason H; Grossman, Heather L; Edge, Robin L

2018-05-01

Voluntary stuttering techniques involve persons who stutter purposefully interjecting disfluencies into their speech. Little research has been conducted on the impact of these techniques on the speech pattern of persons who stutter. The present study examined whether changes in the frequency of voluntary stuttering accompanied changes in stuttering frequency, articulation rate, speech naturalness, and speech effort. In total, 12 persons who stutter aged 16-34 years participated. Participants read four 300-syllable passages during a control condition, and three voluntary stuttering conditions that involved attempting to produce purposeful, tension-free repetitions of initial sounds or syllables of a word for two or more repetitions (i.e., bouncing). The three voluntary stuttering conditions included bouncing on 5%, 10%, and 15% of syllables read. Friedman tests and follow-up Wilcoxon signed ranks tests were conducted for the statistical analyses. Stuttering frequency, articulation rate, and speech naturalness were significantly different between the voluntary stuttering conditions. Speech effort did not differ between the voluntary stuttering conditions. Stuttering frequency was significantly lower during the three voluntary stuttering conditions compared to the control condition, and speech effort was significantly lower during two of the three voluntary stuttering conditions compared to the control condition. Due to changes in articulation rate across the voluntary stuttering conditions, it is difficult to conclude, as has been suggested previously, that voluntary stuttering is the reason for stuttering reductions found when using voluntary stuttering techniques. Additionally, future investigations should examine different types of voluntary stuttering over an extended period of time to determine their impact on stuttering frequency, speech rate, speech naturalness, and speech effort.
Comprehension of synthetic speech and digitized natural speech by adults with aphasia.

Science.gov (United States)

Hux, Karen; Knollman-Porter, Kelly; Brown, Jessica; Wallace, Sarah E

2017-09-01

Using text-to-speech technology to provide simultaneous written and auditory content presentation may help compensate for chronic reading challenges if people with aphasia can understand synthetic speech output; however, inherent auditory comprehension challenges experienced by people with aphasia may make understanding synthetic speech difficult. This study's purpose was to compare the preferences and auditory comprehension accuracy of people with aphasia when listening to sentences generated with digitized natural speech, Alex synthetic speech (i.e., Macintosh platform), or David synthetic speech (i.e., Windows platform). The methodology required each of 20 participants with aphasia to select one of four images corresponding in meaning to each of 60 sentences comprising three stimulus sets. Results revealed significantly better accuracy given digitized natural speech than either synthetic speech option; however, individual participant performance analyses revealed three patterns: (a) comparable accuracy regardless of speech condition for 30% of participants, (b) comparable accuracy between digitized natural speech and one, but not both, synthetic speech option for 45% of participants, and (c) greater accuracy with digitized natural speech than with either synthetic speech option for remaining participants. Ranking and Likert-scale rating data revealed a preference for digitized natural speech and David synthetic speech over Alex synthetic speech. Results suggest many individuals with aphasia can comprehend synthetic speech options available on popular operating systems. Further examination of synthetic speech use to support reading comprehension through text-to-speech technology is thus warranted. Copyright © 2017 Elsevier Inc. All rights reserved.
Common neural substrates support speech and non-speech vocal tract gestures

OpenAIRE

Chang, Soo-Eun; Kenney, Mary Kay; Loucks, Torrey M.J.; Poletto, Christopher J.; Ludlow, Christy L.

2009-01-01

The issue of whether speech is supported by the same neural substrates as non-speech vocal-tract gestures has been contentious. In this fMRI study we tested whether producing non-speech vocal tract gestures in humans shares the same functional neuroanatomy as non-sense speech syllables. Production of non-speech vocal tract gestures, devoid of phonological content but similar to speech in that they had familiar acoustic and somatosensory targets, were compared to the production of speech sylla...
Introductory speeches

International Nuclear Information System (INIS)

2001-01-01

This CD is multimedia presentation of programme safety upgrading of Bohunice V1 NPP. This chapter consist of introductory commentary and 4 introductory speeches (video records): (1) Introductory speech of Vincent Pillar, Board chairman and director general of Slovak electric, Plc. (SE); (2) Introductory speech of Stefan Schmidt, director of SE - Bohunice Nuclear power plants; (3) Introductory speech of Jan Korec, Board chairman and director general of VUJE Trnava, Inc. - Engineering, Design and Research Organisation, Trnava; Introductory speech of Dietrich Kuschel, Senior vice-president of FRAMATOME ANP Project and Engineering
Predicting speech intelligibility in conditions with nonlinearly processed noisy speech

DEFF Research Database (Denmark)

Jørgensen, Søren; Dau, Torsten

2013-01-01

The speech-based envelope power spectrum model (sEPSM; [1]) was proposed in order to overcome the limitations of the classical speech transmission index (STI) and speech intelligibility index (SII). The sEPSM applies the signal-tonoise ratio in the envelope domain (SNRenv), which was demonstrated...... to successfully predict speech intelligibility in conditions with nonlinearly processed noisy speech, such as processing with spectral subtraction. Moreover, a multiresolution version (mr-sEPSM) was demonstrated to account for speech intelligibility in various conditions with stationary and fluctuating...
Aspectos fonoaudiológicos na síndrome de Crouzon: estudo de caso Speech-language aspects on Crouzon syndrome: case study

Directory of Open Access Journals (Sweden)

Isabela Gomes

2008-01-01

hearing assessments of a case of Crouzon syndrome at the age 6:4 years. PROCEDURE: the subject carried out the following evaluations: ABFW, Test of Receptive Vocabulary, Language-Cognition Development Evaluation, Evaluation of Structures and Functions of the Stomatognathic System, pure-tone audiometry threshold, immitance measures and vocal audiometry. RESULTS: the pure-tone audiometry identified bilateral moderate conductive hearing loss, compatible with vocal audiometry's and immitance measures' results. The stomatognathic system evaluation showed that the structures had reduced tonus and altered posture and mobility. Suction, chewing, deglutition and breathing functions were also altered. Phonologically, the following processes were identified: Cluster Simplification, Stopping of Fricatives and Others. In the Fluency evaluation, subject's performance was below the expected scores for matched age and gender. In the Pragmatics test, the child had 14.4 acts per minute and, predominantly, gestural communication. The Receptive Vocabulary Test showed scores 7.1% below reference. In the Expressive Vocabulary Test, data indicated a performance compatible to the reference values of 4 and 5 year-old children, below the expected scores for the subject's age. Regarding language and cognition, the analysis indicated a gap between the child's performance and the developmental level. CONCLUSION: the deficits caused by the syndrome are diffuse and interconnected. The present study had the aim to present the Speech-Language and Hearing Pathology associated aspects of a Crouzon syndrome case and to provide initial data to further investigate these aspects and the intervention process.
Exploring Australian speech-language pathologists' use and perceptions ofnon-speech oral motor exercises.

Science.gov (United States)

Rumbach, Anna F; Rose, Tanya A; Cheah, Mynn

2018-01-29

To explore Australian speech-language pathologists' use of non-speech oral motor exercises, and rationales for using/not using non-speech oral motor exercises in clinical practice. A total of 124 speech-language pathologists practising in Australia, working with paediatric and/or adult clients with speech sound difficulties, completed an online survey. The majority of speech-language pathologists reported that they did not use non-speech oral motor exercises when working with paediatric or adult clients with speech sound difficulties. However, more than half of the speech-language pathologists working with adult clients who have dysarthria reported using non-speech oral motor exercises with this population. The most frequently reported rationale for using non-speech oral motor exercises in speech sound difficulty management was to improve awareness/placement of articulators. The majority of speech-language pathologists agreed there is no clear clinical or research evidence base to support non-speech oral motor exercise use with clients who have speech sound difficulties. This study provides an overview of Australian speech-language pathologists' reported use and perceptions of non-speech oral motor exercises' applicability and efficacy in treating paediatric and adult clients who have speech sound difficulties. The research findings provide speech-language pathologists with insight into how and why non-speech oral motor exercises are currently used, and adds to the knowledge base regarding Australian speech-language pathology practice of non-speech oral motor exercises in the treatment of speech sound difficulties. Implications for Rehabilitation Non-speech oral motor exercises refer to oral motor activities which do not involve speech, but involve the manipulation or stimulation of oral structures including the lips, tongue, jaw, and soft palate. Non-speech oral motor exercises are intended to improve the function (e.g., movement, strength) of oral structures. The
[Improving speech comprehension using a new cochlear implant speech processor].

Science.gov (United States)

Müller-Deile, J; Kortmann, T; Hoppe, U; Hessel, H; Morsnowski, A

2009-06-01

The aim of this multicenter clinical field study was to assess the benefits of the new Freedom 24 sound processor for cochlear implant (CI) users implanted with the Nucleus 24 cochlear implant system. The study included 48 postlingually profoundly deaf experienced CI users who demonstrated speech comprehension performance with their current speech processor on the Oldenburg sentence test (OLSA) in quiet conditions of at least 80% correct scores and who were able to perform adaptive speech threshold testing using the OLSA in noisy conditions. Following baseline measures of speech comprehension performance with their current speech processor, subjects were upgraded to the Freedom 24 speech processor. After a take-home trial period of at least 2 weeks, subject performance was evaluated by measuring the speech reception threshold with the Freiburg multisyllabic word test and speech intelligibility with the Freiburg monosyllabic word test at 50 dB and 70 dB in the sound field. The results demonstrated highly significant benefits for speech comprehension with the new speech processor. Significant benefits for speech comprehension were also demonstrated with the new speech processor when tested in competing background noise.In contrast, use of the Abbreviated Profile of Hearing Aid Benefit (APHAB) did not prove to be a suitably sensitive assessment tool for comparative subjective self-assessment of hearing benefits with each processor. Use of the preprocessing algorithm known as adaptive dynamic range optimization (ADRO) in the Freedom 24 led to additional improvements over the standard upgrade map for speech comprehension in quiet and showed equivalent performance in noise. Through use of the preprocessing beam-forming algorithm BEAM, subjects demonstrated a highly significant improved signal-to-noise ratio for speech comprehension thresholds (i.e., signal-to-noise ratio for 50% speech comprehension scores) when tested with an adaptive procedure using the Oldenburg
Speech coding

Energy Technology Data Exchange (ETDEWEB)

Ravishankar, C., Hughes Network Systems, Germantown, MD

1998-05-08

Speech is the predominant means of communication between human beings and since the invention of the telephone by Alexander Graham Bell in 1876, speech services have remained to be the core service in almost all telecommunication systems. Original analog methods of telephony had the disadvantage of speech signal getting corrupted by noise, cross-talk and distortion Long haul transmissions which use repeaters to compensate for the loss in signal strength on transmission links also increase the associated noise and distortion. On the other hand digital transmission is relatively immune to noise, cross-talk and distortion primarily because of the capability to faithfully regenerate digital signal at each repeater purely based on a binary decision. Hence end-to-end performance of the digital link essentially becomes independent of the length and operating frequency bands of the link Hence from a transmission point of view digital transmission has been the preferred approach due to its higher immunity to noise. The need to carry digital speech became extremely important from a service provision point of view as well. Modem requirements have introduced the need for robust, flexible and secure services that can carry a multitude of signal types (such as voice, data and video) without a fundamental change in infrastructure. Such a requirement could not have been easily met without the advent of digital transmission systems, thereby requiring speech to be coded digitally. The term Speech Coding is often referred to techniques that represent or code speech signals either directly as a waveform or as a set of parameters by analyzing the speech signal. In either case, the codes are transmitted to the distant end where speech is reconstructed or synthesized using the received set of codes. A more generic term that is applicable to these techniques that is often interchangeably used with speech coding is the term voice coding. This term is more generic in the sense that the

The analysis of speech acts patterns in two Egyptian inaugural speeches

Directory of Open Access Journals (Sweden)

Imad Hayif Sameer

2017-09-01

Full Text Available The theory of speech acts, which clarifies what people do when they speak, is not about individual words or sentences that form the basic elements of human communication, but rather about particular speech acts that are performed when uttering words. A speech act is the attempt at doing something purely by speaking. Many things can be done by speaking. Speech acts are studied under what is called speech act theory, and belong to the domain of pragmatics. In this paper, two Egyptian inaugural speeches from El-Sadat and El-Sisi, belonging to different periods were analyzed to find out whether there were differences within this genre in the same culture or not. The study showed that there was a very small difference between these two speeches which were analyzed according to Searle’s theory of speech acts. In El Sadat’s speech, commissives came to occupy the first place. Meanwhile, in El–Sisi’s speech, assertives occupied the first place. Within the speeches of one culture, we can find that the differences depended on the circumstances that surrounded the elections of the Presidents at the time. Speech acts were tools they used to convey what they wanted and to obtain support from their audiences.
Speech Problems

Science.gov (United States)

... Staying Safe Videos for Educators Search English Español Speech Problems KidsHealth / For Teens / Speech Problems What's in ... a person's ability to speak clearly. Some Common Speech and Language Disorders Stuttering is a problem that ...
Alternative Speech Communication System for Persons with Severe Speech Disorders

Science.gov (United States)

Selouani, Sid-Ahmed; Sidi Yakoub, Mohammed; O'Shaughnessy, Douglas

2009-12-01

Assistive speech-enabled systems are proposed to help both French and English speaking persons with various speech disorders. The proposed assistive systems use automatic speech recognition (ASR) and speech synthesis in order to enhance the quality of communication. These systems aim at improving the intelligibility of pathologic speech making it as natural as possible and close to the original voice of the speaker. The resynthesized utterances use new basic units, a new concatenating algorithm and a grafting technique to correct the poorly pronounced phonemes. The ASR responses are uttered by the new speech synthesis system in order to convey an intelligible message to listeners. Experiments involving four American speakers with severe dysarthria and two Acadian French speakers with sound substitution disorders (SSDs) are carried out to demonstrate the efficiency of the proposed methods. An improvement of the Perceptual Evaluation of the Speech Quality (PESQ) value of 5% and more than 20% is achieved by the speech synthesis systems that deal with SSD and dysarthria, respectively.
A Danish open-set speech corpus for competing-speech studies

DEFF Research Database (Denmark)

Nielsen, Jens Bo; Dau, Torsten; Neher, Tobias

2014-01-01

Studies investigating speech-on-speech masking effects commonly use closed-set speech materials such as the coordinate response measure [Bolia et al. (2000). J. Acoust. Soc. Am. 107, 1065-1066]. However, these studies typically result in very low (i.e., negative) speech recognition thresholds (SRTs......) when the competing speech signals are spatially separated. To achieve higher SRTs that correspond more closely to natural communication situations, an open-set, low-context, multi-talker speech corpus was developed. Three sets of 268 unique Danish sentences were created, and each set was recorded...... with one of three professional female talkers. The intelligibility of each sentence in the presence of speech-shaped noise was measured. For each talker, 200 approximately equally intelligible sentences were then selected and systematically distributed into 10 test lists. Test list homogeneity was assessed...
Speech entrainment enables patients with Broca’s aphasia to produce fluent speech

Science.gov (United States)

Hubbard, H. Isabel; Hudspeth, Sarah Grace; Holland, Audrey L.; Bonilha, Leonardo; Fromm, Davida; Rorden, Chris

2012-01-01

A distinguishing feature of Broca’s aphasia is non-fluent halting speech typically involving one to three words per utterance. Yet, despite such profound impairments, some patients can mimic audio-visual speech stimuli enabling them to produce fluent speech in real time. We call this effect ‘speech entrainment’ and reveal its neural mechanism as well as explore its usefulness as a treatment for speech production in Broca’s aphasia. In Experiment 1, 13 patients with Broca’s aphasia were tested in three conditions: (i) speech entrainment with audio-visual feedback where they attempted to mimic a speaker whose mouth was seen on an iPod screen; (ii) speech entrainment with audio-only feedback where patients mimicked heard speech; and (iii) spontaneous speech where patients spoke freely about assigned topics. The patients produced a greater variety of words using audio-visual feedback compared with audio-only feedback and spontaneous speech. No difference was found between audio-only feedback and spontaneous speech. In Experiment 2, 10 of the 13 patients included in Experiment 1 and 20 control subjects underwent functional magnetic resonance imaging to determine the neural mechanism that supports speech entrainment. Group results with patients and controls revealed greater bilateral cortical activation for speech produced during speech entrainment compared with spontaneous speech at the junction of the anterior insula and Brodmann area 47, in Brodmann area 37, and unilaterally in the left middle temporal gyrus and the dorsal portion of Broca’s area. Probabilistic white matter tracts constructed for these regions in the normal subjects revealed a structural network connected via the corpus callosum and ventral fibres through the extreme capsule. Unilateral areas were connected via the arcuate fasciculus. In Experiment 3, all patients included in Experiment 1 participated in a 6-week treatment phase using speech entrainment to improve speech production
Multimodal Speech Capture System for Speech Rehabilitation and Learning.

Science.gov (United States)

Sebkhi, Nordine; Desai, Dhyey; Islam, Mohammad; Lu, Jun; Wilson, Kimberly; Ghovanloo, Maysam

2017-11-01

Speech-language pathologists (SLPs) are trained to correct articulation of people diagnosed with motor speech disorders by analyzing articulators' motion and assessing speech outcome while patients speak. To assist SLPs in this task, we are presenting the multimodal speech capture system (MSCS) that records and displays kinematics of key speech articulators, the tongue and lips, along with voice, using unobtrusive methods. Collected speech modalities, tongue motion, lips gestures, and voice are visualized not only in real-time to provide patients with instant feedback but also offline to allow SLPs to perform post-analysis of articulators' motion, particularly the tongue, with its prominent but hardly visible role in articulation. We describe the MSCS hardware and software components, and demonstrate its basic visualization capabilities by a healthy individual repeating the words "Hello World." A proof-of-concept prototype has been successfully developed for this purpose, and will be used in future clinical studies to evaluate its potential impact on accelerating speech rehabilitation by enabling patients to speak naturally. Pattern matching algorithms to be applied to the collected data can provide patients with quantitative and objective feedback on their speech performance, unlike current methods that are mostly subjective, and may vary from one SLP to another.
Speech Motor Control in Fluent and Dysfluent Speech Production of an Individual with Apraxia of Speech and Broca's Aphasia

Science.gov (United States)

van Lieshout, Pascal H. H. M.; Bose, Arpita; Square, Paula A.; Steele, Catriona M.

2007-01-01

Apraxia of speech (AOS) is typically described as a motor-speech disorder with clinically well-defined symptoms, but without a clear understanding of the underlying problems in motor control. A number of studies have compared the speech of subjects with AOS to the fluent speech of controls, but only a few have included speech movement data and if…
Enhancement of speech signals - with a focus on voiced speech models

DEFF Research Database (Denmark)

Nørholm, Sidsel Marie

This thesis deals with speech enhancement, i.e., noise reduction in speech signals. This has applications in, e.g., hearing aids and teleconference systems. We consider a signal-driven approach to speech enhancement where a model of the speech is assumed and filters are generated based...... on this model. The basic model used in this thesis is the harmonic model which is a commonly used model for describing the voiced part of the speech signal. We show that it can be beneficial to extend the model to take inharmonicities or the non-stationarity of speech into account. Extending the model...
Intelligibility for Binaural Speech with Discarded Low-SNR Speech Components.

Science.gov (United States)

Schoenmaker, Esther; van de Par, Steven

2016-01-01

Speech intelligibility in multitalker settings improves when the target speaker is spatially separated from the interfering speakers. A factor that may contribute to this improvement is the improved detectability of target-speech components due to binaural interaction in analogy to the Binaural Masking Level Difference (BMLD). This would allow listeners to hear target speech components within specific time-frequency intervals that have a negative SNR, similar to the improvement in the detectability of a tone in noise when these contain disparate interaural difference cues. To investigate whether these negative-SNR target-speech components indeed contribute to speech intelligibility, a stimulus manipulation was performed where all target components were removed when local SNRs were smaller than a certain criterion value. It can be expected that for sufficiently high criterion values target speech components will be removed that do contribute to speech intelligibility. For spatially separated speakers, assuming that a BMLD-like detection advantage contributes to intelligibility, degradation in intelligibility is expected already at criterion values below 0 dB SNR. However, for collocated speakers it is expected that higher criterion values can be applied without impairing speech intelligibility. Results show that degradation of intelligibility for separated speakers is only seen for criterion values of 0 dB and above, indicating a negligible contribution of a BMLD-like detection advantage in multitalker settings. These results show that the spatial benefit is related to a spatial separation of speech components at positive local SNRs rather than to a BMLD-like detection improvement for speech components at negative local SNRs.
An experimental Dutch keyboard-to-speech system for the speech impaired

NARCIS (Netherlands)

Deliege, R.J.H.

1989-01-01

An experimental Dutch keyboard-to-speech system has been developed to explor the possibilities and limitations of Dutch speech synthesis in a communication aid for the speech impaired. The system uses diphones and a formant synthesizer chip for speech synthesis. Input to the system is in
Speech Function and Speech Role in Carl Fredricksen's Dialogue on Up Movie

OpenAIRE

Rehana, Ridha; Silitonga, Sortha

2013-01-01

One aim of this article is to show through a concrete example how speech function and speech role used in movie. The illustrative example is taken from the dialogue of Up movie. Central to the analysis proper form of dialogue on Up movie that contain of speech function and speech role; i.e. statement, offer, question, command, giving, and demanding. 269 dialogue were interpreted by actor, and it was found that the use of speech function and speech role.
Experimental comparison between speech transmission index, rapid speech transmission index, and speech intelligibility index.

Science.gov (United States)

Larm, Petra; Hongisto, Valtteri

2006-02-01

During the acoustical design of, e.g., auditoria or open-plan offices, it is important to know how speech can be perceived in various parts of the room. Different objective methods have been developed to measure and predict speech intelligibility, and these have been extensively used in various spaces. In this study, two such methods were compared, the speech transmission index (STI) and the speech intelligibility index (SII). Also the simplification of the STI, the room acoustics speech transmission index (RASTI), was considered. These quantities are all based on determining an apparent speech-to-noise ratio on selected frequency bands and summing them using a specific weighting. For comparison, some data were needed on the possible differences of these methods resulting from the calculation scheme and also measuring equipment. Their prediction accuracy was also of interest. Measurements were made in a laboratory having adjustable noise level and absorption, and in a real auditorium. It was found that the measurement equipment, especially the selection of the loudspeaker, can greatly affect the accuracy of the results. The prediction accuracy of the RASTI was found acceptable, if the input values for the prediction are accurately known, even though the studied space was not ideally diffuse.
Robust Speech/Non-Speech Classification in Heterogeneous Multimedia Content

NARCIS (Netherlands)

Huijbregts, M.A.H.; de Jong, Franciska M.G.

In this paper we present a speech/non-speech classification method that allows high quality classification without the need to know in advance what kinds of audible non-speech events are present in an audio recording and that does not require a single parameter to be tuned on in-domain data. Because
Intelligibility of speech of children with speech and sound disorders

OpenAIRE

Ivetac, Tina

2014-01-01

The purpose of this study is to examine speech intelligibility of children with primary speech and sound disorders aged 3 to 6 years in everyday life. The research problem is based on the degree to which parents or guardians, immediate family members (sister, brother, grandparents), extended family members (aunt, uncle, cousin), child's friends, other acquaintances, child's teachers and strangers understand the speech of children with speech sound disorders. We examined whether the level ...
Speech disorders - children

Science.gov (United States)

... disorder; Voice disorders; Vocal disorders; Disfluency; Communication disorder - speech disorder; Speech disorder - stuttering ... evaluation tools that can help identify and diagnose speech disorders: Denver Articulation Screening Examination Goldman-Fristoe Test of ...
Neurophysiology of speech differences in childhood apraxia of speech.

Science.gov (United States)

Preston, Jonathan L; Molfese, Peter J; Gumkowski, Nina; Sorcinelli, Andrea; Harwood, Vanessa; Irwin, Julia R; Landi, Nicole

2014-01-01

Event-related potentials (ERPs) were recorded during a picture naming task of simple and complex words in children with typical speech and with childhood apraxia of speech (CAS). Results reveal reduced amplitude prior to speaking complex (multisyllabic) words relative to simple (monosyllabic) words for the CAS group over the right hemisphere during a time window thought to reflect phonological encoding of word forms. Group differences were also observed prior to production of spoken tokens regardless of word complexity during a time window just prior to speech onset (thought to reflect motor planning/programming). Results suggest differences in pre-speech neurolinguistic processes.
Listeners Experience Linguistic Masking Release in Noise-Vocoded Speech-in-Speech Recognition

Science.gov (United States)

Viswanathan, Navin; Kokkinakis, Kostas; Williams, Brittany T.

2018-01-01

Purpose: The purpose of this study was to evaluate whether listeners with normal hearing perceiving noise-vocoded speech-in-speech demonstrate better intelligibility of target speech when the background speech was mismatched in language (linguistic release from masking [LRM]) and/or location (spatial release from masking [SRM]) relative to the…
Speech Perception and Short-Term Memory Deficits in Persistent Developmental Speech Disorder

Science.gov (United States)

Kenney, Mary Kay; Barac-Cikoja, Dragana; Finnegan, Kimberly; Jeffries, Neal; Ludlow, Christy L.

2006-01-01

Children with developmental speech disorders may have additional deficits in speech perception and/or short-term memory. To determine whether these are only transient developmental delays that can accompany the disorder in childhood or persist as part of the speech disorder, adults with a persistent familial speech disorder were tested on speech…
Automatic speech recognition (ASR) based approach for speech therapy of aphasic patients: A review

Science.gov (United States)

Jamal, Norezmi; Shanta, Shahnoor; Mahmud, Farhanahani; Sha'abani, MNAH

2017-09-01

This paper reviews the state-of-the-art an automatic speech recognition (ASR) based approach for speech therapy of aphasic patients. Aphasia is a condition in which the affected person suffers from speech and language disorder resulting from a stroke or brain injury. Since there is a growing body of evidence indicating the possibility of improving the symptoms at an early stage, ASR based solutions are increasingly being researched for speech and language therapy. ASR is a technology that transfers human speech into transcript text by matching with the system's library. This is particularly useful in speech rehabilitation therapies as they provide accurate, real-time evaluation for speech input from an individual with speech disorder. ASR based approaches for speech therapy recognize the speech input from the aphasic patient and provide real-time feedback response to their mistakes. However, the accuracy of ASR is dependent on many factors such as, phoneme recognition, speech continuity, speaker and environmental differences as well as our depth of knowledge on human language understanding. Hence, the review examines recent development of ASR technologies and its performance for individuals with speech and language disorders.
Speech and Language Delay

Science.gov (United States)

... OTC Relief for Diarrhea Home Diseases and Conditions Speech and Language Delay Condition Speech and Language Delay Share Print Table of Contents1. ... Treatment6. Everyday Life7. Questions8. Resources What is a speech and language delay? A speech and language delay ...

Plasticity in the Human Speech Motor System Drives Changes in Speech Perception

Science.gov (United States)

Lametti, Daniel R.; Rochet-Capellan, Amélie; Neufeld, Emily; Shiller, Douglas M.

2014-01-01

Recent studies of human speech motor learning suggest that learning is accompanied by changes in auditory perception. But what drives the perceptual change? Is it a consequence of changes in the motor system? Or is it a result of sensory inflow during learning? Here, subjects participated in a speech motor-learning task involving adaptation to altered auditory feedback and they were subsequently tested for perceptual change. In two separate experiments, involving two different auditory perceptual continua, we show that changes in the speech motor system that accompany learning drive changes in auditory speech perception. Specifically, we obtained changes in speech perception when adaptation to altered auditory feedback led to speech production that fell into the phonetic range of the speech perceptual tests. However, a similar change in perception was not observed when the auditory feedback that subjects' received during learning fell into the phonetic range of the perceptual tests. This indicates that the central motor outflow associated with vocal sensorimotor adaptation drives changes to the perceptual classification of speech sounds. PMID:25080594
Childhood apraxia of speech: A survey of praxis and typical speech characteristics.

Science.gov (United States)

Malmenholt, Ann; Lohmander, Anette; McAllister, Anita

2017-07-01

The purpose of this study was to investigate current knowledge of the diagnosis childhood apraxia of speech (CAS) in Sweden and compare speech characteristics and symptoms to those of earlier survey findings in mainly English-speakers. In a web-based questionnaire 178 Swedish speech-language pathologists (SLPs) anonymously answered questions about their perception of typical speech characteristics for CAS. They graded own assessment skills and estimated clinical occurrence. The seven top speech characteristics reported as typical for children with CAS were: inconsistent speech production (85%), sequencing difficulties (71%), oro-motor deficits (63%), vowel errors (62%), voicing errors (61%), consonant cluster deletions (54%), and prosodic disturbance (53%). Motor-programming deficits described as lack of automatization of speech movements were perceived by 82%. All listed characteristics were consistent with the American Speech-Language-Hearing Association (ASHA) consensus-based features, Strand's 10-point checklist, and the diagnostic model proposed by Ozanne. The mode for clinical occurrence was 5%. Number of suspected cases of CAS in the clinical caseload was approximately one new patient/year and SLP. The results support and add to findings from studies of CAS in English-speaking children with similar speech characteristics regarded as typical. Possibly, these findings could contribute to cross-linguistic consensus on CAS characteristics.
Speech-specific audiovisual perception affects identification but not detection of speech

DEFF Research Database (Denmark)

Eskelund, Kasper; Andersen, Tobias

Speech perception is audiovisual as evidenced by the McGurk effect in which watching incongruent articulatory mouth movements can change the phonetic auditory speech percept. This type of audiovisual integration may be specific to speech or be applied to all stimuli in general. To investigate...... of audiovisual integration specific to speech perception. However, the results of Tuomainen et al. might have been influenced by another effect. When observers were naïve, they had little motivation to look at the face. When informed, they knew that the face was relevant for the task and this could increase...... visual detection task. In our first experiment, observers presented with congruent and incongruent audiovisual sine-wave speech stimuli did only show a McGurk effect when informed of the speech nature of the stimulus. Performance on the secondary visual task was very good, thus supporting the finding...
Speech-Language Therapy (For Parents)

Science.gov (United States)

... Staying Safe Videos for Educators Search English Español Speech-Language Therapy KidsHealth / For Parents / Speech-Language Therapy ... most kids with speech and/or language disorders. Speech Disorders, Language Disorders, and Feeding Disorders A speech ...
Clinical and audiological features of a syndrome with deterioration in speech recognition out of proportion to pure hearing loss

Directory of Open Access Journals (Sweden)

Abdi S

2007-04-01

Full Text Available Background: The objective of this study was to describe the audiologic and related characteristics of a group patient with speech perception affected out of proportion to pure tone hearing loss. A case series of patient were referred for evaluation and management to the Hearing Research Center.To describe the clinical picture of the patients with the key clinical feature of hearing loss for pure tones and reduction in speech discrimination out of proportion to the pure tone loss, having some of the criteria of auditory neuropathy (i.e. normal otoacoustic emissions, OAE, and abnormal auditory brainstem evoked potentials, ABR and lacking others (e.g. present auditory reflexes. Methods: Hearing abilities were measured by Pure Tone Audiometry (PTA and Speech Discrimination Scores (SDS, measured in all patients using a standardized list of 25 monosyllabic Farsi words at MCL in quiet. Auditory pathway integrity was measured by using Auditory Brainstem Response (ABR and Otoacoustic Emission (OAE and anatomical lesions Computed Tomography Scan (CT and Magnetic Resonance Image (MRI of brain and retrocochlea. Patient included in the series were 35 patients who have SDS disproportionably low with regard to PTA, absent ABR waves and normal OAE. Results: All patients reported the beginning of their problem around adolescence. Neither of them had anatomical lesion in imaging studies and neither of them had any finding suggestive of conductive hearing lesion. Although in most of the cases the hearing loss had been more apparent in the lower frequencies (i.e. 1000 Hz and less, a stronger correlation was found between SDS and hearing threshold at higher frequencies. These patients may not benefit from hearing aids, as the outer hair cells are functional and amplification doesn’t seem to help; though, it was tried for all. Conclusion: These patients share a pattern of sensory –neural loss with no detectable lesion. The age of onset and the gradual
Digital speech processing using Matlab

CERN Document Server

Gopi, E S

2014-01-01

Digital Speech Processing Using Matlab deals with digital speech pattern recognition, speech production model, speech feature extraction, and speech compression. The book is written in a manner that is suitable for beginners pursuing basic research in digital speech processing. Matlab illustrations are provided for most topics to enable better understanding of concepts. This book also deals with the basic pattern recognition techniques (illustrated with speech signals using Matlab) such as PCA, LDA, ICA, SVM, HMM, GMM, BPN, and KSOM.
Developmental apraxia of speech in children. Quantitive assessment of speech characteristics

NARCIS (Netherlands)

Thoonen, G.H.J.

1998-01-01

Developmental apraxia of speech (DAS) in children is a speech disorder, supposed to have a neurological origin, which is commonly considered to result from particular deficits in speech processing (i.e., phonological planning, motor programming). However, the label DAS has often been used as
Perception of synthetic speech produced automatically by rule: Intelligibility of eight text-to-speech systems.

Science.gov (United States)

Greene, Beth G; Logan, John S; Pisoni, David B

1986-03-01

We present the results of studies designed to measure the segmental intelligibility of eight text-to-speech systems and a natural speech control, using the Modified Rhyme Test (MRT). Results indicated that the voices tested could be grouped into four categories: natural speech, high-quality synthetic speech, moderate-quality synthetic speech, and low-quality synthetic speech. The overall performance of the best synthesis system, DECtalk-Paul, was equivalent to natural speech only in terms of performance on initial consonants. The findings are discussed in terms of recent work investigating the perception of synthetic speech under more severe conditions. Suggestions for future research on improving the quality of synthetic speech are also considered.
Perception of synthetic speech produced automatically by rule: Intelligibility of eight text-to-speech systems

Science.gov (United States)

GREENE, BETH G.; LOGAN, JOHN S.; PISONI, DAVID B.

2012-01-01

We present the results of studies designed to measure the segmental intelligibility of eight text-to-speech systems and a natural speech control, using the Modified Rhyme Test (MRT). Results indicated that the voices tested could be grouped into four categories: natural speech, high-quality synthetic speech, moderate-quality synthetic speech, and low-quality synthetic speech. The overall performance of the best synthesis system, DECtalk-Paul, was equivalent to natural speech only in terms of performance on initial consonants. The findings are discussed in terms of recent work investigating the perception of synthetic speech under more severe conditions. Suggestions for future research on improving the quality of synthetic speech are also considered. PMID:23225916
The speech perception skills of children with and without speech sound disorder.

Science.gov (United States)

Hearnshaw, Stephanie; Baker, Elise; Munro, Natalie

To investigate whether Australian-English speaking children with and without speech sound disorder (SSD) differ in their overall speech perception accuracy. Additionally, to investigate differences in the perception of specific phonemes and the association between speech perception and speech production skills. Twenty-five Australian-English speaking children aged 48-60 months participated in this study. The SSD group included 12 children and the typically developing (TD) group included 13 children. Children completed routine speech and language assessments in addition to an experimental Australian-English lexical and phonetic judgement task based on Rvachew's Speech Assessment and Interactive Learning System (SAILS) program (Rvachew, 2009). This task included eight words across four word-initial phonemes-/k, ɹ, ʃ, s/. Children with SSD showed significantly poorer perceptual accuracy on the lexical and phonetic judgement task compared with TD peers. The phonemes /ɹ/ and /s/ were most frequently perceived in error across both groups. Additionally, the phoneme /ɹ/ was most commonly produced in error. There was also a positive correlation between overall speech perception and speech production scores. Children with SSD perceived speech less accurately than their typically developing peers. The findings suggest that an Australian-English variation of a lexical and phonetic judgement task similar to the SAILS program is promising and worthy of a larger scale study. Copyright © 2017 Elsevier Inc. All rights reserved.
Speech Matters

DEFF Research Database (Denmark)

Hasse Jørgensen, Stina

2011-01-01

About Speech Matters - Katarina Gregos, the Greek curator's exhibition at the Danish Pavillion, the Venice Biannual 2011.......About Speech Matters - Katarina Gregos, the Greek curator's exhibition at the Danish Pavillion, the Venice Biannual 2011....
Hate speech

Directory of Open Access Journals (Sweden)

Anne Birgitta Nilsen

2014-12-01

Full Text Available The manifesto of the Norwegian terrorist Anders Behring Breivik is based on the “Eurabia” conspiracy theory. This theory is a key starting point for hate speech amongst many right-wing extremists in Europe, but also has ramifications beyond these environments. In brief, proponents of the Eurabia theory claim that Muslims are occupying Europe and destroying Western culture, with the assistance of the EU and European governments. By contrast, members of Al-Qaeda and other extreme Islamists promote the conspiracy theory “the Crusade” in their hate speech directed against the West. Proponents of the latter theory argue that the West is leading a crusade to eradicate Islam and Muslims, a crusade that is similarly facilitated by their governments. This article presents analyses of texts written by right-wing extremists and Muslim extremists in an effort to shed light on how hate speech promulgates conspiracy theories in order to spread hatred and intolerance.The aim of the article is to contribute to a more thorough understanding of hate speech’s nature by applying rhetorical analysis. Rhetorical analysis is chosen because it offers a means of understanding the persuasive power of speech. It is thus a suitable tool to describe how hate speech works to convince and persuade. The concepts from rhetorical theory used in this article are ethos, logos and pathos. The concept of ethos is used to pinpoint factors that contributed to Osama bin Laden's impact, namely factors that lent credibility to his promotion of the conspiracy theory of the Crusade. In particular, Bin Laden projected common sense, good morals and good will towards his audience. He seemed to have coherent and relevant arguments; he appeared to possess moral credibility; and his use of language demonstrated that he wanted the best for his audience.The concept of pathos is used to define hate speech, since hate speech targets its audience's emotions. In hate speech it is the
Speech Inconsistency in Children with Childhood Apraxia of Speech, Language Impairment, and Speech Delay: Depends on the Stimuli

Science.gov (United States)

Iuzzini-Seigel, Jenya; Hogan, Tiffany P.; Green, Jordan R.

2017-01-01

Purpose: The current research sought to determine (a) if speech inconsistency is a core feature of childhood apraxia of speech (CAS) or if it is driven by comorbid language impairment that affects a large subset of children with CAS and (b) if speech inconsistency is a sensitive and specific diagnostic marker that can differentiate between CAS and…
Clear Speech - Mere Speech? How segmental and prosodic speech reduction shape the impression that speakers create on listeners

DEFF Research Database (Denmark)

Niebuhr, Oliver

2017-01-01

of reduction levels and perceived speaker attributes in which moderate reduction can make a better impression on listeners than no reduction. In addition to its relevance in reduction models and theories, this interplay is instructive for various fields of speech application from social robotics to charisma...... whether variation in the degree of reduction also has a systematic effect on the attributes we ascribe to the speaker who produces the speech signal. A perception experiment was carried out for German in which 46 listeners judged whether or not speakers showing 3 different combinations of segmental...... and prosodic reduction levels (unreduced, moderately reduced, strongly reduced) are appropriately described by 13 physical, social, and cognitive attributes. The experiment shows that clear speech is not mere speech, and less clear speech is not just reduced either. Rather, results revealed a complex interplay...
Audiovisual Temporal Recalibration for Speech in Synchrony Perception and Speech Identification

Science.gov (United States)

Asakawa, Kaori; Tanaka, Akihiro; Imai, Hisato

We investigated whether audiovisual synchrony perception for speech could change after observation of the audiovisual temporal mismatch. Previous studies have revealed that audiovisual synchrony perception is re-calibrated after exposure to a constant timing difference between auditory and visual signals in non-speech. In the present study, we examined whether this audiovisual temporal recalibration occurs at the perceptual level even for speech (monosyllables). In Experiment 1, participants performed an audiovisual simultaneity judgment task (i.e., a direct measurement of the audiovisual synchrony perception) in terms of the speech signal after observation of the speech stimuli which had a constant audiovisual lag. The results showed that the “simultaneous” responses (i.e., proportion of responses for which participants judged the auditory and visual stimuli to be synchronous) at least partly depended on exposure lag. In Experiment 2, we adopted the McGurk identification task (i.e., an indirect measurement of the audiovisual synchrony perception) to exclude the possibility that this modulation of synchrony perception was solely attributable to the response strategy using stimuli identical to those of Experiment 1. The characteristics of the McGurk effect reported by participants depended on exposure lag. Thus, it was shown that audiovisual synchrony perception for speech could be modulated following exposure to constant lag both in direct and indirect measurement. Our results suggest that temporal recalibration occurs not only in non-speech signals but also in monosyllabic speech at the perceptual level.
Under-resourced speech recognition based on the speech manifold

CSIR Research Space (South Africa)

Sahraeian, R

2015-09-01

Full Text Available Conventional acoustic modeling involves estimating many parameters to effectively model feature distributions. The sparseness of speech and text data, however, degrades the reliability of the estimation process and makes speech recognition a...
PRACTICING SPEECH THERAPY INTERVENTION FOR SOCIAL INTEGRATION OF CHILDREN WITH SPEECH DISORDERS

Directory of Open Access Journals (Sweden)

Martin Ofelia POPESCU

2016-11-01

Full Text Available The article presents a concise speech correction intervention program in of dyslalia in conjunction with capacity development of intra, interpersonal and social integration of children with speech disorders. The program main objectives represent: the potential increasing of individual social integration by correcting speech disorders in conjunction with intra- and interpersonal capacity, the potential growth of children and community groups for social integration by optimizing the socio-relational context of children with speech disorder. In the program were included 60 children / students with dyslalia speech disorders (monomorphic and polymorphic dyslalia, from 11 educational institutions - 6 kindergartens and 5 schools / secondary schools, joined with inter-school logopedic centre (CLI from Targu Jiu city and areas of Gorj district. The program was implemented under the assumption that therapeutic-formative intervention to correct speech disorders and facilitate the social integration will lead, in combination with correct pronunciation disorders, to social integration optimization of children with speech disorders. The results conirm the hypothesis and gives facts about the intervention program eficiency.
Schizophrenia alters intra-network functional connectivity in the caudate for detecting speech under informational speech masking conditions.

Science.gov (United States)

Zheng, Yingjun; Wu, Chao; Li, Juanhua; Li, Ruikeng; Peng, Hongjun; She, Shenglin; Ning, Yuping; Li, Liang

2018-04-04

Speech recognition under noisy "cocktail-party" environments involves multiple perceptual/cognitive processes, including target detection, selective attention, irrelevant signal inhibition, sensory/working memory, and speech production. Compared to health listeners, people with schizophrenia are more vulnerable to masking stimuli and perform worse in speech recognition under speech-on-speech masking conditions. Although the schizophrenia-related speech-recognition impairment under "cocktail-party" conditions is associated with deficits of various perceptual/cognitive processes, it is crucial to know whether the brain substrates critically underlying speech detection against informational speech masking are impaired in people with schizophrenia. Using functional magnetic resonance imaging (fMRI), this study investigated differences between people with schizophrenia (n = 19, mean age = 33 ± 10 years) and their matched healthy controls (n = 15, mean age = 30 ± 9 years) in intra-network functional connectivity (FC) specifically associated with target-speech detection under speech-on-speech-masking conditions. The target-speech detection performance under the speech-on-speech-masking condition in participants with schizophrenia was significantly worse than that in matched healthy participants (healthy controls). Moreover, in healthy controls, but not participants with schizophrenia, the strength of intra-network FC within the bilateral caudate was positively correlated with the speech-detection performance under the speech-masking conditions. Compared to controls, patients showed altered spatial activity pattern and decreased intra-network FC in the caudate. In people with schizophrenia, the declined speech-detection performance under speech-on-speech masking conditions is associated with reduced intra-caudate functional connectivity, which normally contributes to detecting target speech against speech masking via its functions of suppressing masking-speech signals.
Speech disorder prevention

Directory of Open Access Journals (Sweden)

Miladis Fornaris-Méndez

2017-04-01

Full Text Available Language therapy has trafficked from a medical focus until a preventive focus. However, difficulties are evidenced in the development of this last task, because he is devoted bigger space to the correction of the disorders of the language. Because the speech disorders is the dysfunction with more frequently appearance, acquires special importance the preventive work that is developed to avoid its appearance. Speech education since early age of the childhood makes work easier for prevent the appearance of speech disorders in the children. The present work has as objective to offer different activities for the prevention of the speech disorders.
Speech and Speech-Related Quality of Life After Late Palate Repair: A Patient's Perspective.

Science.gov (United States)

Schönmeyr, Björn; Wendby, Lisa; Sharma, Mitali; Jacobson, Lia; Restrepo, Carolina; Campbell, Alex

2015-07-01

Many patients with cleft palate deformities worldwide receive treatment at a later age than is recommended for normal speech to develop. The outcomes after late palate repairs in terms of speech and quality of life (QOL) still remain largely unstudied. In the current study, questionnaires were used to assess the patients' perception of speech and QOL before and after primary palate repair. All of the patients were operated at a cleft center in northeast India and had a cleft palate with a normal lip or with a cleft lip that had been previously repaired. A total of 134 patients (7-35 years) were interviewed preoperatively and 46 patients (7-32 years) were assessed in the postoperative survey. The survey showed that scores based on the speech handicap index, concerning speech and speech-related QOL, did not improve postoperatively. In fact, the questionnaires indicated that the speech became more unpredictable (P reported that their self-confidence had improved after the operation. Thus, the majority of interviewed patients who underwent late primary palate repair were satisfied with the surgery. At the same time, speech and speech-related QOL did not improve according to the speech handicap index-based survey. Speech predictability may even become worse and nasal regurgitation may increase after late palate repair, according to these results.

Visual Speech Fills in Both Discrimination and Identification of Non-Intact Auditory Speech in Children

Science.gov (United States)

Jerger, Susan; Damian, Markus F.; McAlpine, Rachel P.; Abdi, Herve

2018-01-01

To communicate, children must discriminate and identify speech sounds. Because visual speech plays an important role in this process, we explored how visual speech influences phoneme discrimination and identification by children. Critical items had intact visual speech (e.g. baez) coupled to non-intact (excised onsets) auditory speech (signified…
Tackling the complexity in speech

DEFF Research Database (Denmark)

section includes four carefully selected chapters. They deal with facets of speech production, speech acoustics, and/or speech perception or recognition, place them in an integrated phonetic-phonological perspective, and relate them in more or less explicit ways to aspects of speech technology. Therefore......, we hope that this volume can help speech scientists with traditional training in phonetics and phonology to keep up with the latest developments in speech technology. In the opposite direction, speech researchers starting from a technological perspective will hopefully get inspired by reading about...... the questions, phenomena, and communicative functions that are currently addressed in phonetics and phonology. Either way, the future of speech research lies in international, interdisciplinary collaborations, and our volume is meant to reflect and facilitate such collaborations...
Speech in spinocerebellar ataxia.

Science.gov (United States)

Schalling, Ellika; Hartelius, Lena

2013-12-01

Spinocerebellar ataxias (SCAs) are a heterogeneous group of autosomal dominant cerebellar ataxias clinically characterized by progressive ataxia, dysarthria and a range of other concomitant neurological symptoms. Only a few studies include detailed characterization of speech symptoms in SCA. Speech symptoms in SCA resemble ataxic dysarthria but symptoms related to phonation may be more prominent. One study to date has shown an association between differences in speech and voice symptoms related to genotype. More studies of speech and voice phenotypes are motivated, to possibly aid in clinical diagnosis. In addition, instrumental speech analysis has been demonstrated to be a reliable measure that may be used to monitor disease progression or therapy outcomes in possible future pharmacological treatments. Intervention by speech and language pathologists should go beyond assessment. Clinical guidelines for management of speech, communication and swallowing need to be developed for individuals with progressive cerebellar ataxia. Copyright © 2013 Elsevier Inc. All rights reserved.
Affordable headphones for accessible screening audiometry: An evaluation of the Sennheiser HD202 II supra-aural headphone.

Science.gov (United States)

Van der Aerschot, Mathieu; Swanepoel, De Wet; Mahomed-Asmail, Faheema; Myburgh, Herman Carel; Eikelboom, Robert Henry

2016-11-01

Evaluation of the Sennheiser HD 202 II supra-aural headphones as an alternative headphone to enable more affordable hearing screening. Study 1 measured the equivalent threshold sound pressure levels (ETSPL) of the Sennheiser HD 202 II. Study 2 evaluated the attenuation of the headphones. Study 3 determined headphone characteristics by analyzing the total harmonic distortion (THD), frequency response and force of the headband. Twenty-five participants were included in study 1 and 15 in study 2 with ages ranging between 18 and 25. No participants were involved in study 3. The Sennheiser HD 202 II ETSPLs (250-16000 Hz) showed no significant effects on ETSPL for ear laterality, gender or age. Attenuation was not significantly different (p > 0.01) to TDH 39 except at 8000 Hz (p 3%. Sennheiser HD 202 II supra-aural headphones can be used as an affordable headphone for screening audiometry provided reported MPANLs, maximum intensities and ETSPL values are employed.
Predicting Speech Intelligibility with a Multiple Speech Subsystems Approach in Children with Cerebral Palsy

Science.gov (United States)

Lee, Jimin; Hustad, Katherine C.; Weismer, Gary

2014-01-01

Purpose: Speech acoustic characteristics of children with cerebral palsy (CP) were examined with a multiple speech subsystems approach; speech intelligibility was evaluated using a prediction model in which acoustic measures were selected to represent three speech subsystems. Method: Nine acoustic variables reflecting different subsystems, and…
A causal test of the motor theory of speech perception: a case of impaired speech production and spared speech perception.

Science.gov (United States)

Stasenko, Alena; Bonn, Cory; Teghipco, Alex; Garcea, Frank E; Sweet, Catherine; Dombovy, Mary; McDonough, Joyce; Mahon, Bradford Z

2015-01-01

The debate about the causal role of the motor system in speech perception has been reignited by demonstrations that motor processes are engaged during the processing of speech sounds. Here, we evaluate which aspects of auditory speech processing are affected, and which are not, in a stroke patient with dysfunction of the speech motor system. We found that the patient showed a normal phonemic categorical boundary when discriminating two non-words that differ by a minimal pair (e.g., ADA-AGA). However, using the same stimuli, the patient was unable to identify or label the non-word stimuli (using a button-press response). A control task showed that he could identify speech sounds by speaker gender, ruling out a general labelling impairment. These data suggest that while the motor system is not causally involved in perception of the speech signal, it may be used when other cues (e.g., meaning, context) are not available.
The Relationship between Speech Production and Speech Perception Deficits in Parkinson's Disease

Science.gov (United States)

De Keyser, Kim; Santens, Patrick; Bockstael, Annelies; Botteldooren, Dick; Talsma, Durk; De Vos, Stefanie; Van Cauwenberghe, Mieke; Verheugen, Femke; Corthals, Paul; De Letter, Miet

2016-01-01

Purpose: This study investigated the possible relationship between hypokinetic speech production and speech intensity perception in patients with Parkinson's disease (PD). Method: Participants included 14 patients with idiopathic PD and 14 matched healthy controls (HCs) with normal hearing and cognition. First, speech production was objectified…
Visual speech information: a help or hindrance in perceptual processing of dysarthric speech.

Science.gov (United States)

Borrie, Stephanie A

2015-03-01

This study investigated the influence of visual speech information on perceptual processing of neurologically degraded speech. Fifty listeners identified spastic dysarthric speech under both audio (A) and audiovisual (AV) conditions. Condition comparisons revealed that the addition of visual speech information enhanced processing of the neurologically degraded input in terms of (a) acuity (percent phonemes correct) of vowels and consonants and (b) recognition (percent words correct) of predictive and nonpredictive phrases. Listeners exploited stress-based segmentation strategies more readily in AV conditions, suggesting that the perceptual benefit associated with adding visual speech information to the auditory signal-the AV advantage-has both segmental and suprasegmental origins. Results also revealed that the magnitude of the AV advantage can be predicted, to some degree, by the extent to which an individual utilizes syllabic stress cues to inform word recognition in AV conditions. Findings inform the development of a listener-specific model of speech perception that applies to processing of dysarthric speech in everyday communication contexts.
The treatment of apraxia of speech : Speech and music therapy, an innovative joint effort

NARCIS (Netherlands)

Hurkmans, Josephus Johannes Stephanus

2016-01-01

Apraxia of Speech (AoS) is a neurogenic speech disorder. A wide variety of behavioural methods have been developed to treat AoS. Various therapy programmes use musical elements to improve speech production. A unique therapy programme combining elements of speech therapy and music therapy is called
Practical speech user interface design

CERN Document Server

Lewis, James R

2010-01-01

Although speech is the most natural form of communication between humans, most people find using speech to communicate with machines anything but natural. Drawing from psychology, human-computer interaction, linguistics, and communication theory, Practical Speech User Interface Design provides a comprehensive yet concise survey of practical speech user interface (SUI) design. It offers practice-based and research-based guidance on how to design effective, efficient, and pleasant speech applications that people can really use. Focusing on the design of speech user interfaces for IVR application
Motor Speech Phenotypes of Frontotemporal Dementia, Primary Progressive Aphasia, and Progressive Apraxia of Speech

Science.gov (United States)

Poole, Matthew L.; Brodtmann, Amy; Darby, David; Vogel, Adam P.

2017-01-01

Purpose: Our purpose was to create a comprehensive review of speech impairment in frontotemporal dementia (FTD), primary progressive aphasia (PPA), and progressive apraxia of speech in order to identify the most effective measures for diagnosis and monitoring, and to elucidate associations between speech and neuroimaging. Method: Speech and…
An analysis of the masking of speech by competing speech using self-report data.

Science.gov (United States)

Agus, Trevor R; Akeroyd, Michael A; Noble, William; Bhullar, Navjot

2009-01-01

Many of the items in the "Speech, Spatial, and Qualities of Hearing" scale questionnaire [S. Gatehouse and W. Noble, Int. J. Audiol. 43, 85-99 (2004)] are concerned with speech understanding in a variety of backgrounds, both speech and nonspeech. To study if this self-report data reflected informational masking, previously collected data on 414 people were analyzed. The lowest scores (greatest difficulties) were found for the two items in which there were two speech targets, with successively higher scores for competing speech (six items), energetic masking (one item), and no masking (three items). The results suggest significant masking by competing speech in everyday listening situations.
Neural pathways for visual speech perception

Directory of Open Access Journals (Sweden)

Lynne E Bernstein

2014-12-01

Full Text Available This paper examines the questions, what levels of speech can be perceived visually, and how is visual speech represented by the brain? Review of the literature leads to the conclusions that every level of psycholinguistic speech structure (i.e., phonetic features, phonemes, syllables, words, and prosody can be perceived visually, although individuals differ in their abilities to do so; and that there are visual modality-specific representations of speech qua speech in higher-level vision brain areas. That is, the visual system represents the modal patterns of visual speech. The suggestion that the auditory speech pathway receives and represents visual speech is examined in light of neuroimaging evidence on the auditory speech pathways. We outline the generally agreed-upon organization of the visual ventral and dorsal pathways and examine several types of visual processing that might be related to speech through those pathways, specifically, face and body, orthography, and sign language processing. In this context, we examine the visual speech processing literature, which reveals widespread diverse patterns activity in posterior temporal cortices in response to visual speech stimuli. We outline a model of the visual and auditory speech pathways and make several suggestions: (1 The visual perception of speech relies on visual pathway representations of speech qua speech. (2 A proposed site of these representations, the temporal visual speech area (TVSA has been demonstrated in posterior temporal cortex, ventral and posterior to multisensory posterior superior temporal sulcus (pSTS. (3 Given that visual speech has dynamic and configural features, its representations in feedforward visual pathways are expected to integrate these features, possibly in TVSA.
Part-of-speech effects on text-to-speech synthesis

CSIR Research Space (South Africa)

Schlunz, GI

2010-11-01

Full Text Available One of the goals of text-to-speech (TTS) systems is to produce natural-sounding synthesised speech. Towards this end various natural language processing (NLP) tasks are performed to model the prosodic aspects of the TTS voice. One of the fundamental...
75 FR 26701 - Telecommunications Relay Services and Speech-to-Speech Services for Individuals With Hearing and...

Science.gov (United States)

2010-05-12

...] Telecommunications Relay Services and Speech-to-Speech Services for Individuals With Hearing and Speech Disabilities... proposed compensation rates for Interstate TRS, Speech-to-Speech Services (STS), Captioned Telephone... costs reported in the data submitted to NECA by VRS providers. In this regard, document DA 10-761 also...
Predicting automatic speech recognition performance over communication channels from instrumental speech quality and intelligibility scores

NARCIS (Netherlands)

Gallardo, L.F.; Möller, S.; Beerends, J.

2017-01-01

The performance of automatic speech recognition based on coded-decoded speech heavily depends on the quality of the transmitted signals, determined by channel impairments. This paper examines relationships between speech recognition performance and measurements of speech quality and intelligibility
[Non-speech oral motor treatment efficacy for children with developmental speech sound disorders].

Science.gov (United States)

Ygual-Fernandez, A; Cervera-Merida, J F

2016-01-01

In the treatment of speech disorders by means of speech therapy two antagonistic methodological approaches are applied: non-verbal ones, based on oral motor exercises (OME), and verbal ones, which are based on speech processing tasks with syllables, phonemes and words. In Spain, OME programmes are called 'programas de praxias', and are widely used and valued by speech therapists. To review the studies conducted on the effectiveness of OME-based treatments applied to children with speech disorders and the theoretical arguments that could justify, or not, their usefulness. Over the last few decades evidence has been gathered about the lack of efficacy of this approach to treat developmental speech disorders and pronunciation problems in populations without any neurological alteration of motor functioning. The American Speech-Language-Hearing Association has advised against its use taking into account the principles of evidence-based practice. The knowledge gathered to date on motor control shows that the pattern of mobility and its corresponding organisation in the brain are different in speech and other non-verbal functions linked to nutrition and breathing. Neither the studies on their effectiveness nor the arguments based on motor control studies recommend the use of OME-based programmes for the treatment of pronunciation problems in children with developmental language disorders.
75 FR 54040 - Telecommunications Relay Services and Speech-to-Speech Services for Individuals With Hearing and...

Science.gov (United States)

2010-09-03

...] Telecommunications Relay Services and Speech-to-Speech Services for Individuals With Hearing and Speech Disabilities...; speech-to-speech (STS); pay-per-call (900) calls; types of calls; and equal access to interexchange... of a report, due April 16, 2011, addressing whether it is necessary for the waivers to remain in...
Speech Acquisition and Automatic Speech Recognition for Integrated Spacesuit Audio Systems

Science.gov (United States)

Huang, Yiteng; Chen, Jingdong; Chen, Shaoyan

2010-01-01

A voice-command human-machine interface system has been developed for spacesuit extravehicular activity (EVA) missions. A multichannel acoustic signal processing method has been created for distant speech acquisition in noisy and reverberant environments. This technology reduces noise by exploiting differences in the statistical nature of signal (i.e., speech) and noise that exists in the spatial and temporal domains. As a result, the automatic speech recognition (ASR) accuracy can be improved to the level at which crewmembers would find the speech interface useful. The developed speech human/machine interface will enable both crewmember usability and operational efficiency. It can enjoy a fast rate of data/text entry, small overall size, and can be lightweight. In addition, this design will free the hands and eyes of a suited crewmember. The system components and steps include beam forming/multi-channel noise reduction, single-channel noise reduction, speech feature extraction, feature transformation and normalization, feature compression, model adaption, ASR HMM (Hidden Markov Model) training, and ASR decoding. A state-of-the-art phoneme recognizer can obtain an accuracy rate of 65 percent when the training and testing data are free of noise. When it is used in spacesuits, the rate drops to about 33 percent. With the developed microphone array speech-processing technologies, the performance is improved and the phoneme recognition accuracy rate rises to 44 percent. The recognizer can be further improved by combining the microphone array and HMM model adaptation techniques and using speech samples collected from inside spacesuits. In addition, arithmetic complexity models for the major HMMbased ASR components were developed. They can help real-time ASR system designers select proper tasks when in the face of constraints in computational resources.
Environmental Contamination of Normal Speech.

Science.gov (United States)

Harley, Trevor A.

1990-01-01

Environmentally contaminated speech errors (irrelevant words or phrases derived from the speaker's environment and erroneously incorporated into speech) are hypothesized to occur at a high level of speech processing, but with a relatively late insertion point. The data indicate that speech production processes are not independent of other…

Emotionally conditioning the target-speech voice enhances recognition of the target speech under "cocktail-party" listening conditions.

Science.gov (United States)

Lu, Lingxi; Bao, Xiaohan; Chen, Jing; Qu, Tianshu; Wu, Xihong; Li, Liang

2018-05-01

Under a noisy "cocktail-party" listening condition with multiple people talking, listeners can use various perceptual/cognitive unmasking cues to improve recognition of the target speech against informational speech-on-speech masking. One potential unmasking cue is the emotion expressed in a speech voice, by means of certain acoustical features. However, it was unclear whether emotionally conditioning a target-speech voice that has none of the typical acoustical features of emotions (i.e., an emotionally neutral voice) can be used by listeners for enhancing target-speech recognition under speech-on-speech masking conditions. In this study we examined the recognition of target speech against a two-talker speech masker both before and after the emotionally neutral target voice was paired with a loud female screaming sound that has a marked negative emotional valence. The results showed that recognition of the target speech (especially the first keyword in a target sentence) was significantly improved by emotionally conditioning the target speaker's voice. Moreover, the emotional unmasking effect was independent of the unmasking effect of the perceived spatial separation between the target speech and the masker. Also, (skin conductance) electrodermal responses became stronger after emotional learning when the target speech and masker were perceptually co-located, suggesting an increase of listening efforts when the target speech was informationally masked. These results indicate that emotionally conditioning the target speaker's voice does not change the acoustical parameters of the target-speech stimuli, but the emotionally conditioned vocal features can be used as cues for unmasking target speech.
Multilevel Analysis in Analyzing Speech Data

Science.gov (United States)

Guddattu, Vasudeva; Krishna, Y.

2011-01-01

The speech produced by human vocal tract is a complex acoustic signal, with diverse applications in phonetics, speech synthesis, automatic speech recognition, speaker identification, communication aids, speech pathology, speech perception, machine translation, hearing research, rehabilitation and assessment of communication disorders and many…
Perceived Liveliness and Speech Comprehensibility in Aphasia: The Effects of Direct Speech in Auditory Narratives

Science.gov (United States)

Groenewold, Rimke; Bastiaanse, Roelien; Nickels, Lyndsey; Huiskes, Mike

2014-01-01

Background: Previous studies have shown that in semi-spontaneous speech, individuals with Broca's and anomic aphasia produce relatively many direct speech constructions. It has been claimed that in "healthy" communication direct speech constructions contribute to the liveliness, and indirectly to the comprehensibility, of speech.…
Speech Enhancement by MAP Spectral Amplitude Estimation Using a Super-Gaussian Speech Model

Directory of Open Access Journals (Sweden)

Lotter Thomas

2005-01-01

Full Text Available This contribution presents two spectral amplitude estimators for acoustical background noise suppression based on maximum a posteriori estimation and super-Gaussian statistical modelling of the speech DFT amplitudes. The probability density function of the speech spectral amplitude is modelled with a simple parametric function, which allows a high approximation accuracy for Laplace- or Gamma-distributed real and imaginary parts of the speech DFT coefficients. Also, the statistical model can be adapted to optimally fit the distribution of the speech spectral amplitudes for a specific noise reduction system. Based on the super-Gaussian statistical model, computationally efficient maximum a posteriori speech estimators are derived, which outperform the commonly applied Ephraim-Malah algorithm.
Exploring the role of brain oscillations in speech perception in noise: Intelligibility of isochronously retimed speech

Directory of Open Access Journals (Sweden)

Vincent Aubanel

2016-08-01

Full Text Available A growing body of evidence shows that brain oscillations track speech. This mechanism is thought to maximise processing efficiency by allocating resources to important speech information, effectively parsing speech into units of appropriate granularity for further decoding. However, some aspects of this mechanism remain unclear. First, while periodicity is an intrinsic property of this physiological mechanism, speech is only quasi-periodic, so it is not clear whether periodicity would present an advantage in processing. Second, it is still a matter of debate which aspect of speech triggers or maintains cortical entrainment, from bottom-up cues such as fluctuations of the amplitude envelope of speech to higher level linguistic cues such as syntactic structure. We present data from a behavioural experiment assessing the effect of isochronous retiming of speech on speech perception in noise. Two types of anchor points were defined for retiming speech, namely syllable onsets and amplitude envelope peaks. For each anchor point type, retiming was implemented at two hierarchical levels, a slow time scale around 2.5 Hz and a fast time scale around 4 Hz. Results show that while any temporal distortion resulted in reduced speech intelligibility, isochronous speech anchored to P-centers (approximated by stressed syllable vowel onsets was significantly more intelligible than a matched anisochronous retiming, suggesting a facilitative role of periodicity defined on linguistically motivated units in processing speech in noise.
Ear, Hearing and Speech

DEFF Research Database (Denmark)

Poulsen, Torben

2000-01-01

An introduction is given to the the anatomy and the function of the ear, basic psychoacoustic matters (hearing threshold, loudness, masking), the speech signal and speech intelligibility. The lecture note is written for the course: Fundamentals of Acoustics and Noise Control (51001)......An introduction is given to the the anatomy and the function of the ear, basic psychoacoustic matters (hearing threshold, loudness, masking), the speech signal and speech intelligibility. The lecture note is written for the course: Fundamentals of Acoustics and Noise Control (51001)...
Music expertise shapes audiovisual temporal integration windows for speech, sinewave speech and music

Directory of Open Access Journals (Sweden)

Hwee Ling eLee

2014-08-01

Full Text Available This psychophysics study used musicians as a model to investigate whether musical expertise shapes the temporal integration window for audiovisual speech, sinewave speech or music. Musicians and non-musicians judged the audiovisual synchrony of speech, sinewave analogues of speech, and music stimuli at 13 audiovisual stimulus onset asynchronies (±360, ±300 ±240, ±180, ±120, ±60, and 0 ms. Further, we manipulated the duration of the stimuli by presenting sentences/melodies or syllables/tones. Critically, musicians relative to non-musicians exhibited significantly narrower temporal integration windows for both music and sinewave speech. Further, the temporal integration window for music decreased with the amount of music practice, but not with age of acquisition. In other words, the more musicians practiced piano in the past three years, the more sensitive they became to the temporal misalignment of visual and auditory signals. Collectively, our findings demonstrate that music practicing fine-tunes the audiovisual temporal integration window to various extents depending on the stimulus class. While the effect of piano practicing was most pronounced for music, it also generalized to other stimulus classes such as sinewave speech and to a marginally significant degree to natural speech.
Is cochlear implantation a good treatment method for profoundly deafened elderly?

Directory of Open Access Journals (Sweden)

Lachowska M

2013-10-01

Full Text Available Magdalena Lachowska, Agnieszka Pastuszka, Paulina Glinka, Kazimierz Niemczyk Department of Otolaryngology, Hearing Implant Center, Medical University of Warsaw, Warsaw, Poland Purpose: To assess the benefits of cochlear implantation in the elderly. Patients and methods: A retrospective analysis of 31 postlingually deafened elderly (≥60 years of age with unilateral cochlear implants was conducted. Audiological testing included preoperative and postoperative pure-tone audiometry and a monosyllabic word recognition test presented from recorded material in free field. Speech perception tests included Ling's six sound test (sound detection, discrimination, and identification, syllable discrimination, and monosyllabic and multisyllabic word recognition (open set without lip-reading. Everyday life benefits from cochlear implantation were also evaluated. Results: The mean age at the time of cochlear implantation was 72.4 years old. The mean post-implantation follow-up time was 2.34 years. All patients significantly improved their audiological and speech understanding performances. The preoperative mean pure-tone average threshold for 500 Hz, 1,000 Hz, 2,000 Hz, and 4,000 Hz was 110.17 dB HL. Before cochlear implantation, all patients scored 0% on the monosyllabic word recognition test in free field at 70 dB SPL intensity level. The postoperative pure-tone average was 37.14 dB HL (the best mean threshold was 17.50 dB HL, the worst was 58.75 dB HL. After the surgery, mean monosyllabic word recognition reached 47.25%. Speech perception tests showed statistically significant improvement in speech recognition. Conclusion: The results of this study showed that cochlear implantation is indeed a successful treatment for improving speech recognition and offers a great help in everyday life to deafened elderly patients. Therefore, they can be good candidates for cochlear implantation and their age alone should not be a relevant or excluding factor when choosing
Effect of gap detection threshold on consistency of speech in children with speech sound disorder.

Science.gov (United States)

Sayyahi, Fateme; Soleymani, Zahra; Akbari, Mohammad; Bijankhan, Mahmood; Dolatshahi, Behrooz

2017-02-01

The present study examined the relationship between gap detection threshold and speech error consistency in children with speech sound disorder. The participants were children five to six years of age who were categorized into three groups of typical speech, consistent speech disorder (CSD) and inconsistent speech disorder (ISD).The phonetic gap detection threshold test was used for this study, which is a valid test comprised six syllables with inter-stimulus intervals between 20-300ms. The participants were asked to listen to the recorded stimuli three times and indicate whether they heard one or two sounds. There was no significant difference between the typical and CSD groups (p=0.55), but there were significant differences in performance between the ISD and CSD groups and the ISD and typical groups (p=0.00). The ISD group discriminated between speech sounds at a higher threshold. Children with inconsistent speech errors could not distinguish speech sounds during time-limited phonetic discrimination. It is suggested that inconsistency in speech is a representation of inconsistency in auditory perception, which causes by high gap detection threshold. Copyright © 2016 Elsevier Ltd. All rights reserved.
Speech Perception as a Multimodal Phenomenon

OpenAIRE

Rosenblum, Lawrence D.

2008-01-01

Speech perception is inherently multimodal. Visual speech (lip-reading) information is used by all perceivers and readily integrates with auditory speech. Imaging research suggests that the brain treats auditory and visual speech similarly. These findings have led some researchers to consider that speech perception works by extracting amodal information that takes the same form across modalities. From this perspective, speech integration is a property of the input information itself. Amodal s...
Poor Speech Perception Is Not a Core Deficit of Childhood Apraxia of Speech: Preliminary Findings

Science.gov (United States)

Zuk, Jennifer; Iuzzini-Seigel, Jenya; Cabbage, Kathryn; Green, Jordan R.; Hogan, Tiffany P.

2018-01-01

Purpose: Childhood apraxia of speech (CAS) is hypothesized to arise from deficits in speech motor planning and programming, but the influence of abnormal speech perception in CAS on these processes is debated. This study examined speech perception abilities among children with CAS with and without language impairment compared to those with…
Principles of speech coding

CERN Document Server

Ogunfunmi, Tokunbo

2010-01-01

It is becoming increasingly apparent that all forms of communication-including voice-will be transmitted through packet-switched networks based on the Internet Protocol (IP). Therefore, the design of modern devices that rely on speech interfaces, such as cell phones and PDAs, requires a complete and up-to-date understanding of the basics of speech coding. Outlines key signal processing algorithms used to mitigate impairments to speech quality in VoIP networksOffering a detailed yet easily accessible introduction to the field, Principles of Speech Coding provides an in-depth examination of the
The Neural Bases of Difficult Speech Comprehension and Speech Production: Two Activation Likelihood Estimation (ALE) Meta-Analyses

Science.gov (United States)

Adank, Patti

2012-01-01

The role of speech production mechanisms in difficult speech comprehension is the subject of on-going debate in speech science. Two Activation Likelihood Estimation (ALE) analyses were conducted on neuroimaging studies investigating difficult speech comprehension or speech production. Meta-analysis 1 included 10 studies contrasting comprehension…
Metaheuristic applications to speech enhancement

CERN Document Server

Kunche, Prajna

2016-01-01

This book serves as a basic reference for those interested in the application of metaheuristics to speech enhancement. The major goal of the book is to explain the basic concepts of optimization methods and their use in heuristic optimization in speech enhancement to scientists, practicing engineers, and academic researchers in speech processing. The authors discuss why it has been a challenging problem for researchers to develop new enhancement algorithms that aid in the quality and intelligibility of degraded speech. They present powerful optimization methods to speech enhancement that can help to solve the noise reduction problems. Readers will be able to understand the fundamentals of speech processing as well as the optimization techniques, how the speech enhancement algorithms are implemented by utilizing optimization methods, and will be given the tools to develop new algorithms. The authors also provide a comprehensive literature survey regarding the topic.
Systematic Studies of Modified Vocalization: The Effect of Speech Rate on Speech Production Measures during Metronome-Paced Speech in Persons Who Stutter

Science.gov (United States)

Davidow, Jason H.

2014-01-01

Background: Metronome-paced speech results in the elimination, or substantial reduction, of stuttering moments. The cause of fluency during this fluency-inducing condition is unknown. Several investigations have reported changes in speech pattern characteristics from a control condition to a metronome-paced speech condition, but failure to control…
TongueToSpeech (TTS): Wearable wireless assistive device for augmented speech.

Science.gov (United States)

Marjanovic, Nicholas; Piccinini, Giacomo; Kerr, Kevin; Esmailbeigi, Hananeh

2017-07-01

Speech is an important aspect of human communication; individuals with speech impairment are unable to communicate vocally in real time. Our team has developed the TongueToSpeech (TTS) device with the goal of augmenting speech communication for the vocally impaired. The proposed device is a wearable wireless assistive device that incorporates a capacitive touch keyboard interface embedded inside a discrete retainer. This device connects to a computer, tablet or a smartphone via Bluetooth connection. The developed TTS application converts text typed by the tongue into audible speech. Our studies have concluded that an 8-contact point configuration between the tongue and the TTS device would yield the best user precision and speed performance. On average using the TTS device inside the oral cavity takes 2.5 times longer than the pointer finger using a T9 (Text on 9 keys) keyboard configuration to type the same phrase. In conclusion, we have developed a discrete noninvasive wearable device that allows the vocally impaired individuals to communicate in real time.
Social eye gaze modulates processing of speech and co-speech gesture.

Science.gov (United States)

Holler, Judith; Schubotz, Louise; Kelly, Spencer; Hagoort, Peter; Schuetze, Manuela; Özyürek, Aslı

2014-12-01

In human face-to-face communication, language comprehension is a multi-modal, situated activity. However, little is known about how we combine information from different modalities during comprehension, and how perceived communicative intentions, often signaled through visual signals, influence this process. We explored this question by simulating a multi-party communication context in which a speaker alternated her gaze between two recipients. Participants viewed speech-only or speech+gesture object-related messages when being addressed (direct gaze) or unaddressed (gaze averted to other participant). They were then asked to choose which of two object images matched the speaker's preceding message. Unaddressed recipients responded significantly more slowly than addressees for speech-only utterances. However, perceiving the same speech accompanied by gestures sped unaddressed recipients up to a level identical to that of addressees. That is, when unaddressed recipients' speech processing suffers, gestures can enhance the comprehension of a speaker's message. We discuss our findings with respect to two hypotheses attempting to account for how social eye gaze may modulate multi-modal language comprehension. Copyright © 2014 Elsevier B.V. All rights reserved.
Electrophysiological evidence for speech-specific audiovisual integration.

Science.gov (United States)

Baart, Martijn; Stekelenburg, Jeroen J; Vroomen, Jean

2014-01-01

Lip-read speech is integrated with heard speech at various neural levels. Here, we investigated the extent to which lip-read induced modulations of the auditory N1 and P2 (measured with EEG) are indicative of speech-specific audiovisual integration, and we explored to what extent the ERPs were modulated by phonetic audiovisual congruency. In order to disentangle speech-specific (phonetic) integration from non-speech integration, we used Sine-Wave Speech (SWS) that was perceived as speech by half of the participants (they were in speech-mode), while the other half was in non-speech mode. Results showed that the N1 obtained with audiovisual stimuli peaked earlier than the N1 evoked by auditory-only stimuli. This lip-read induced speeding up of the N1 occurred for listeners in speech and non-speech mode. In contrast, if listeners were in speech-mode, lip-read speech also modulated the auditory P2, but not if listeners were in non-speech mode, thus revealing speech-specific audiovisual binding. Comparing ERPs for phonetically congruent audiovisual stimuli with ERPs for incongruent stimuli revealed an effect of phonetic stimulus congruency that started at ~200 ms after (in)congruence became apparent. Critically, akin to the P2 suppression, congruency effects were only observed if listeners were in speech mode, and not if they were in non-speech mode. Using identical stimuli, we thus confirm that audiovisual binding involves (partially) different neural mechanisms for sound processing in speech and non-speech mode. © 2013 Published by Elsevier Ltd.
Free Speech Yearbook 1978.

Science.gov (United States)

Phifer, Gregg, Ed.

The 17 articles in this collection deal with theoretical and practical freedom of speech issues. The topics include: freedom of speech in Marquette Park, Illinois; Nazis in Skokie, Illinois; freedom of expression in the Confederate States of America; Robert M. LaFollette's arguments for free speech and the rights of Congress; the United States…
Visual context enhanced. The joint contribution of iconic gestures and visible speech to degraded speech comprehension.

NARCIS (Netherlands)

Drijvers, L.; Özyürek, A.

2017-01-01

Purpose: This study investigated whether and to what extent iconic co-speech gestures contribute to information from visible speech to enhance degraded speech comprehension at different levels of noise-vocoding. Previous studies of the contributions of these 2 visual articulators to speech

Multisensory integration of speech sounds with letters vs. visual speech : only visual speech induces the mismatch negativity

NARCIS (Netherlands)

Stekelenburg, J.J.; Keetels, M.N.; Vroomen, J.H.M.

2018-01-01

Numerous studies have demonstrated that the vision of lip movements can alter the perception of auditory speech syllables (McGurk effect). While there is ample evidence for integration of text and auditory speech, there are only a few studies on the orthographic equivalent of the McGurk effect.
Speech Research

Science.gov (United States)

Several articles addressing topics in speech research are presented. The topics include: exploring the functional significance of physiological tremor: A biospectroscopic approach; differences between experienced and inexperienced listeners to deaf speech; a language-oriented view of reading and its disabilities; Phonetic factors in letter detection; categorical perception; Short-term recall by deaf signers of American sign language; a common basis for auditory sensory storage in perception and immediate memory; phonological awareness and verbal short-term memory; initiation versus execution time during manual and oral counting by stutterers; trading relations in the perception of speech by five-year-old children; the role of the strap muscles in pitch lowering; phonetic validation of distinctive features; consonants and syllable boundaires; and vowel information in postvocalic frictions.
Represented Speech in Qualitative Health Research

DEFF Research Database (Denmark)

Musaeus, Peter

2017-01-01

Represented speech refers to speech where we reference somebody. Represented speech is an important phenomenon in everyday conversation, health care communication, and qualitative research. This case will draw first from a case study on physicians’ workplace learning and second from a case study...... on nurses’ apprenticeship learning. The aim of the case is to guide the qualitative researcher to use own and others’ voices in the interview and to be sensitive to represented speech in everyday conversation. Moreover, reported speech matters to health professionals who aim to represent the voice...... of their patients. Qualitative researchers and students might learn to encourage interviewees to elaborate different voices or perspectives. Qualitative researchers working with natural speech might pay attention to how people talk and use represented speech. Finally, represented speech might be relevant...
Spectral integration in speech and non-speech sounds

Science.gov (United States)

Jacewicz, Ewa

2005-04-01

Spectral integration (or formant averaging) was proposed in vowel perception research to account for the observation that a reduction of the intensity of one of two closely spaced formants (as in /u/) produced a predictable shift in vowel quality [Delattre et al., Word 8, 195-210 (1952)]. A related observation was reported in psychoacoustics, indicating that when the components of a two-tone periodic complex differ in amplitude and frequency, its perceived pitch is shifted toward that of the more intense tone [Helmholtz, App. XIV (1875/1948)]. Subsequent research in both fields focused on the frequency interval that separates these two spectral components, in an attempt to determine the size of the bandwidth for spectral integration to occur. This talk will review the accumulated evidence for and against spectral integration within the hypothesized limit of 3.5 Bark for static and dynamic signals in speech perception and psychoacoustics. Based on similarities in the processing of speech and non-speech sounds, it is suggested that spectral integration may reflect a general property of the auditory system. A larger frequency bandwidth, possibly close to 3.5 Bark, may be utilized in integrating acoustic information, including speech, complex signals, or sound quality of a violin.
Measurement of speech parameters in casual speech of dementia patients

NARCIS (Netherlands)

Ossewaarde, Roelant; Jonkers, Roel; Jalvingh, Fedor; Bastiaanse, Yvonne

Measurement of speech parameters in casual speech of dementia patients Roelant Adriaan Ossewaarde1,2, Roel Jonkers1, Fedor Jalvingh1,3, Roelien Bastiaanse1 1CLCG, University of Groningen (NL); 2HU University of Applied Sciences Utrecht (NL); 33St. Marienhospital - Vechta, Geriatric Clinic Vechta
Development of The Viking Speech Scale to classify the speech of children with cerebral palsy.

Science.gov (United States)

Pennington, Lindsay; Virella, Daniel; Mjøen, Tone; da Graça Andrada, Maria; Murray, Janice; Colver, Allan; Himmelmann, Kate; Rackauskaite, Gija; Greitane, Andra; Prasauskiene, Audrone; Andersen, Guro; de la Cruz, Javier

2013-10-01

Surveillance registers monitor the prevalence of cerebral palsy and the severity of resulting impairments across time and place. The motor disorders of cerebral palsy can affect children's speech production and limit their intelligibility. We describe the development of a scale to classify children's speech performance for use in cerebral palsy surveillance registers, and its reliability across raters and across time. Speech and language therapists, other healthcare professionals and parents classified the speech of 139 children with cerebral palsy (85 boys, 54 girls; mean age 6.03 years, SD 1.09) from observation and previous knowledge of the children. Another group of health professionals rated children's speech from information in their medical notes. With the exception of parents, raters reclassified children's speech at least four weeks after their initial classification. Raters were asked to rate how easy the scale was to use and how well the scale described the child's speech production using Likert scales. Inter-rater reliability was moderate to substantial (k>.58 for all comparisons). Test-retest reliability was substantial to almost perfect for all groups (k>.68). Over 74% of raters found the scale easy or very easy to use; 66% of parents and over 70% of health care professionals judged the scale to describe children's speech well or very well. We conclude that the Viking Speech Scale is a reliable tool to describe the speech performance of children with cerebral palsy, which can be applied through direct observation of children or through case note review. Copyright © 2013 Elsevier Ltd. All rights reserved.
Visual Context Enhanced: The Joint Contribution of Iconic Gestures and Visible Speech to Degraded Speech Comprehension

Science.gov (United States)

Drijvers, Linda; Ozyurek, Asli

2017-01-01

Purpose: This study investigated whether and to what extent iconic co-speech gestures contribute to information from visible speech to enhance degraded speech comprehension at different levels of noise-vocoding. Previous studies of the contributions of these 2 visual articulators to speech comprehension have only been performed separately. Method:…
Speech enhancement using emotion dependent codebooks

NARCIS (Netherlands)

Naidu, D.H.R.; Srinivasan, S.

2012-01-01

Several speech enhancement approaches utilize trained models of clean speech data, such as codebooks, Gaussian mixtures, and hidden Markov models. These models are typically trained on neutral clean speech data, without any emotion. However, in practical scenarios, emotional speech is a common
Crianças com fenilcetonúria: avaliação audiológica básica e supressão das otoemissões Children with phenylketonuria: basic audiological evaluation and suppression of otoacoustic emissions

Directory of Open Access Journals (Sweden)

Patrícia Souza Ribeiro

2012-01-01

Full Text Available OBJETIVO: Avaliar a via auditiva de crianças com fenilcetonúria tratadas precocemente, por meio de audiometria, imitanciometria e supressão das emissões otoacústicas transientes. MÉTODOS:Estudo prospectivo transversal comparativo com amostra composta por 28 crianças, sendo 12 com fenilcetonúria e 16 sem a doença. Foi realizada a pesquisa dos limiares de audibilidade por via aérea e óssea, logoaudiometria, imitanciometria e supressão das emissões otoacústicas transientes. RESULTADOS: A audiometria e a logoaudiometria estiveram normais em todos os participantes. Foram encontrados piores resultados para o índice de reconhecimento de fala (IRF no grupo com fenilcetonúria. A imitanciometria revelou curva normal para todas as crianças, mas a pesquisa dos reflexos estapedianos demonstrou que as crianças do grupo com fenilcetonúria apresentaram aumento nos seus limiares nas frequências de 2 e 4 kHz. A supressão das emissões otoacústicas transientes não revelou diferença na comparação entre os grupos. CONCLUSÃO: A avaliação audiológica básica não identifica alterações na audição das crianças com fenilcetonúria, mas há pior discriminação ao IRF e aumento nos limiares de reflexos estapedianos nessas crianças, podendo indicar distúrbios do processamento auditivo. O estudo da supressão das otoemissões demonstra integridade do sistema eferente olivococlear medial nas crianças com fenilcetonúria.PURPOSE: To evaluate the auditory pathways of children with early-treated phenylketonuria through audiometry, immitance tests, and suppression of transient otoacoustic emissions. METHODS: Prospective cross-sectional study with sample composed by 28 children: 12 with phenylketonuria and 16 without the disease. Participants underwent auditory evaluations composed of air- and bone-conduction pure-tone audiometry, speech audiometry, immittance tests and suppression of transient otoacoustic emissions. RESULTS: All participants
Linguistic contributions to speech-on-speech masking for native and non-native listeners: Language familiarity and semantic content

Science.gov (United States)

Brouwer, Susanne; Van Engen, Kristin J.; Calandruccio, Lauren; Bradlow, Ann R.

2012-01-01

This study examined whether speech-on-speech masking is sensitive to variation in the degree of similarity between the target and the masker speech. Three experiments investigated whether speech-in-speech recognition varies across different background speech languages (English vs Dutch) for both English and Dutch targets, as well as across variation in the semantic content of the background speech (meaningful vs semantically anomalous sentences), and across variation in listener status vis-à-vis the target and masker languages (native, non-native, or unfamiliar). The results showed that the more similar the target speech is to the masker speech (e.g., same vs different language, same vs different levels of semantic content), the greater the interference on speech recognition accuracy. Moreover, the listener’s knowledge of the target and the background language modulate the size of the release from masking. These factors had an especially strong effect on masking effectiveness in highly unfavorable listening conditions. Overall this research provided evidence that that the degree of target-masker similarity plays a significant role in speech-in-speech recognition. The results also give insight into how listeners assign their resources differently depending on whether they are listening to their first or second language. PMID:22352516
Speech-specificity of two audiovisual integration effects

DEFF Research Database (Denmark)

Eskelund, Kasper; Tuomainen, Jyrki; Andersen, Tobias

2010-01-01

Seeing the talker’s articulatory mouth movements can influence the auditory speech percept both in speech identification and detection tasks. Here we show that these audiovisual integration effects also occur for sine wave speech (SWS), which is an impoverished speech signal that naïve observers...... often fail to perceive as speech. While audiovisual integration in the identification task only occurred when observers were informed of the speech-like nature of SWS, integration occurred in the detection task both for informed and naïve observers. This shows that both speech-specific and general...... mechanisms underlie audiovisual integration of speech....
Recognizing speech in a novel accent: the motor theory of speech perception reframed.

Science.gov (United States)

Moulin-Frier, Clément; Arbib, Michael A

2013-08-01

The motor theory of speech perception holds that we perceive the speech of another in terms of a motor representation of that speech. However, when we have learned to recognize a foreign accent, it seems plausible that recognition of a word rarely involves reconstruction of the speech gestures of the speaker rather than the listener. To better assess the motor theory and this observation, we proceed in three stages. Part 1 places the motor theory of speech perception in a larger framework based on our earlier models of the adaptive formation of mirror neurons for grasping, and for viewing extensions of that mirror system as part of a larger system for neuro-linguistic processing, augmented by the present consideration of recognizing speech in a novel accent. Part 2 then offers a novel computational model of how a listener comes to understand the speech of someone speaking the listener's native language with a foreign accent. The core tenet of the model is that the listener uses hypotheses about the word the speaker is currently uttering to update probabilities linking the sound produced by the speaker to phonemes in the native language repertoire of the listener. This, on average, improves the recognition of later words. This model is neutral regarding the nature of the representations it uses (motor vs. auditory). It serve as a reference point for the discussion in Part 3, which proposes a dual-stream neuro-linguistic architecture to revisits claims for and against the motor theory of speech perception and the relevance of mirror neurons, and extracts some implications for the reframing of the motor theory.
Advocate: A Distributed Architecture for Speech-to-Speech Translation

Science.gov (United States)

2009-01-01

tecture, are either wrapped natural-language processing ( NLP ) components or objects developed from scratch using the architecture’s API. GATE is...framework, we put together a demonstration Arabic -to- English speech translation system using both internally developed ( Arabic speech recognition and MT...conditions of our Arabic S2S demonstration system described earlier. Once again, the data size was varied and eighty identical requests were
Using the Speech Transmission Index for predicting non-native speech intelligibility

NARCIS (Netherlands)

Wijngaarden, S.J. van; Bronkhorst, A.W.; Houtgast, T.; Steeneken, H.J.M.

2004-01-01

While the Speech Transmission Index ~STI! is widely applied for prediction of speech intelligibility in room acoustics and telecommunication engineering, it is unclear how to interpret STI values when non-native talkers or listeners are involved. Based on subjectively measured psychometric functions
Speech Planning Happens before Speech Execution: Online Reaction Time Methods in the Study of Apraxia of Speech

Science.gov (United States)

Maas, Edwin; Mailend, Marja-Liisa

2012-01-01

Purpose: The purpose of this article is to present an argument for the use of online reaction time (RT) methods to the study of apraxia of speech (AOS) and to review the existing small literature in this area and the contributions it has made to our fundamental understanding of speech planning (deficits) in AOS. Method: Following a brief…
Predicting speech intelligibility in adverse conditions: evaluation of the speech-based envelope power spectrum model

DEFF Research Database (Denmark)

Jørgensen, Søren; Dau, Torsten

2011-01-01

conditions by comparing predictions to measured data from [Kjems et al. (2009). J. Acoust. Soc. Am. 126 (3), 1415-1426] where speech is mixed with four different interferers, including speech-shaped noise, bottle noise, car noise, and cafe noise. The model accounts well for the differences in intelligibility......The speech-based envelope power spectrum model (sEPSM) [Jørgensen and Dau (2011). J. Acoust. Soc. Am., 130 (3), 1475–1487] estimates the envelope signal-to-noise ratio (SNRenv) of distorted speech and accurately describes the speech recognition thresholds (SRT) for normal-hearing listeners...... observed for the different interferers. None of the standardized models successfully describe these data....
Improving on hidden Markov models: An articulatorily constrained, maximum likelihood approach to speech recognition and speech coding

Energy Technology Data Exchange (ETDEWEB)

Hogden, J.

1996-11-05

The goal of the proposed research is to test a statistical model of speech recognition that incorporates the knowledge that speech is produced by relatively slow motions of the tongue, lips, and other speech articulators. This model is called Maximum Likelihood Continuity Mapping (Malcom). Many speech researchers believe that by using constraints imposed by articulator motions, we can improve or replace the current hidden Markov model based speech recognition algorithms. Unfortunately, previous efforts to incorporate information about articulation into speech recognition algorithms have suffered because (1) slight inaccuracies in our knowledge or the formulation of our knowledge about articulation may decrease recognition performance, (2) small changes in the assumptions underlying models of speech production can lead to large changes in the speech derived from the models, and (3) collecting measurements of human articulator positions in sufficient quantity for training a speech recognition algorithm is still impractical. The most interesting (and in fact, unique) quality of Malcom is that, even though Malcom makes use of a mapping between acoustics and articulation, Malcom can be trained to recognize speech using only acoustic data. By learning the mapping between acoustics and articulation using only acoustic data, Malcom avoids the difficulties involved in collecting articulator position measurements and does not require an articulatory synthesizer model to estimate the mapping between vocal tract shapes and speech acoustics. Preliminary experiments that demonstrate that Malcom can learn the mapping between acoustics and articulation are discussed. Potential applications of Malcom aside from speech recognition are also discussed. Finally, specific deliverables resulting from the proposed research are described.
Cleft Audit Protocol for Speech (CAPS-A): A Comprehensive Training Package for Speech Analysis

Science.gov (United States)

Sell, D.; John, A.; Harding-Bell, A.; Sweeney, T.; Hegarty, F.; Freeman, J.

2009-01-01

Background: The previous literature has largely focused on speech analysis systems and ignored process issues, such as the nature of adequate speech samples, data acquisition, recording and playback. Although there has been recognition of the need for training on tools used in speech analysis associated with cleft palate, little attention has been…
Perceived liveliness and speech comprehensibility in aphasia : the effects of direct speech in auditory narratives

NARCIS (Netherlands)

Groenewold, Rimke; Bastiaanse, Roelien; Nickels, Lyndsey; Huiskes, Mike

2014-01-01

Background: Previous studies have shown that in semi-spontaneous speech, individuals with Broca's and anomic aphasia produce relatively many direct speech constructions. It has been claimed that in 'healthy' communication direct speech constructions contribute to the liveliness, and indirectly to
Preschool speech intelligibility and vocabulary skills predict long-term speech and language outcomes following cochlear implantation in early childhood.

Science.gov (United States)

Castellanos, Irina; Kronenberger, William G; Beer, Jessica; Henning, Shirley C; Colson, Bethany G; Pisoni, David B

2014-07-01

Speech and language measures during grade school predict adolescent speech-language outcomes in children who receive cochlear implants (CIs), but no research has examined whether speech and language functioning at even younger ages is predictive of long-term outcomes in this population. The purpose of this study was to examine whether early preschool measures of speech and language performance predict speech-language functioning in long-term users of CIs. Early measures of speech intelligibility and receptive vocabulary (obtained during preschool ages of 3-6 years) in a sample of 35 prelingually deaf, early-implanted children predicted speech perception, language, and verbal working memory skills up to 18 years later. Age of onset of deafness and age at implantation added additional variance to preschool speech intelligibility in predicting some long-term outcome scores, but the relationship between preschool speech-language skills and later speech-language outcomes was not significantly attenuated by the addition of these hearing history variables. These findings suggest that speech and language development during the preschool years is predictive of long-term speech and language functioning in early-implanted, prelingually deaf children. As a result, measures of speech-language functioning at preschool ages can be used to identify and adjust interventions for very young CI users who may be at long-term risk for suboptimal speech and language outcomes.

Speech Clarity Index (Ψ): A Distance-Based Speech Quality Indicator and Recognition Rate Prediction for Dysarthric Speakers with Cerebral Palsy

Science.gov (United States)

Kayasith, Prakasith; Theeramunkong, Thanaruk

It is a tedious and subjective task to measure severity of a dysarthria by manually evaluating his/her speech using available standard assessment methods based on human perception. This paper presents an automated approach to assess speech quality of a dysarthric speaker with cerebral palsy. With the consideration of two complementary factors, speech consistency and speech distinction, a speech quality indicator called speech clarity index (Ψ) is proposed as a measure of the speaker's ability to produce consistent speech signal for a certain word and distinguished speech signal for different words. As an application, it can be used to assess speech quality and forecast speech recognition rate of speech made by an individual dysarthric speaker before actual exhaustive implementation of an automatic speech recognition system for the speaker. The effectiveness of Ψ as a speech recognition rate predictor is evaluated by rank-order inconsistency, correlation coefficient, and root-mean-square of difference. The evaluations had been done by comparing its predicted recognition rates with ones predicted by the standard methods called the articulatory and intelligibility tests based on the two recognition systems (HMM and ANN). The results show that Ψ is a promising indicator for predicting recognition rate of dysarthric speech. All experiments had been done on speech corpus composed of speech data from eight normal speakers and eight dysarthric speakers.
Automated Speech Rate Measurement in Dysarthria

Science.gov (United States)

Martens, Heidi; Dekens, Tomas; Van Nuffelen, Gwen; Latacz, Lukas; Verhelst, Werner; De Bodt, Marc

2015-01-01

Purpose: In this study, a new algorithm for automated determination of speech rate (SR) in dysarthric speech is evaluated. We investigated how reliably the algorithm calculates the SR of dysarthric speech samples when compared with calculation performed by speech-language pathologists. Method: The new algorithm was trained and tested using Dutch…
Simultaneous natural speech and AAC interventions for children with childhood apraxia of speech: lessons from a speech-language pathologist focus group.

Science.gov (United States)

Oommen, Elizabeth R; McCarthy, John W

2015-03-01

In childhood apraxia of speech (CAS), children exhibit varying levels of speech intelligibility depending on the nature of errors in articulation and prosody. Augmentative and alternative communication (AAC) strategies are beneficial, and commonly adopted with children with CAS. This study focused on the decision-making process and strategies adopted by speech-language pathologists (SLPs) when simultaneously implementing interventions that focused on natural speech and AAC. Eight SLPs, with significant clinical experience in CAS and AAC interventions, participated in an online focus group. Thematic analysis revealed eight themes: key decision-making factors; treatment history and rationale; benefits; challenges; therapy strategies and activities; collaboration with team members; recommendations; and other comments. Results are discussed along with clinical implications and directions for future research.
Speech Recognition on Mobile Devices

DEFF Research Database (Denmark)

Tan, Zheng-Hua; Lindberg, Børge

2010-01-01

in the mobile context covering motivations, challenges, fundamental techniques and applications. Three ASR architectures are introduced: embedded speech recognition, distributed speech recognition and network speech recognition. Their pros and cons and implementation issues are discussed. Applications within......The enthusiasm of deploying automatic speech recognition (ASR) on mobile devices is driven both by remarkable advances in ASR technology and by the demand for efficient user interfaces on such devices as mobile phones and personal digital assistants (PDAs). This chapter presents an overview of ASR...
Song and speech: examining the link between singing talent and speech imitation ability.

Science.gov (United States)

Christiner, Markus; Reiterer, Susanne M

2013-01-01

In previous research on speech imitation, musicality, and an ability to sing were isolated as the strongest indicators of good pronunciation skills in foreign languages. We, therefore, wanted to take a closer look at the nature of the ability to sing, which shares a common ground with the ability to imitate speech. This study focuses on whether good singing performance predicts good speech imitation. Forty-one singers of different levels of proficiency were selected for the study and their ability to sing, to imitate speech, their musical talent and working memory were tested. Results indicated that singing performance is a better indicator of the ability to imitate speech than the playing of a musical instrument. A multiple regression revealed that 64% of the speech imitation score variance could be explained by working memory together with educational background and singing performance. A second multiple regression showed that 66% of the speech imitation variance of completely unintelligible and unfamiliar language stimuli (Hindi) could be explained by working memory together with a singer's sense of rhythm and quality of voice. This supports the idea that both vocal behaviors have a common grounding in terms of vocal and motor flexibility, ontogenetic and phylogenetic development, neural orchestration and auditory memory with singing fitting better into the category of "speech" on the productive level and "music" on the acoustic level. As a result, good singers benefit from vocal and motor flexibility, productively and cognitively, in three ways. (1) Motor flexibility and the ability to sing improve language and musical function. (2) Good singers retain a certain plasticity and are open to new and unusual sound combinations during adulthood both perceptually and productively. (3) The ability to sing improves the memory span of the auditory working memory.
Song and speech: examining the link between singing talent and speech imitation ability

Directory of Open Access Journals (Sweden)

Markus eChristiner

2013-11-01

Full Text Available In previous research on speech imitation, musicality and an ability to sing were isolated as the strongest indicators of good pronunciation skills in foreign languages. We, therefore, wanted to take a closer look at the nature of the ability to sing, which shares a common ground with the ability to imitate speech. This study focuses on whether good singing performance predicts good speech imitation. Fourty-one singers of different levels of proficiency were selected for the study and their ability to sing, to imitate speech, their musical talent and working memory were tested. Results indicated that singing performance is a better indicator of the ability to imitate speech than the playing of a musical instrument. A multiple regression revealed that 64 % of the speech imitation score variance could be explained by working memory together with educational background and singing performance. A second multiple regression showed that 66 % of the speech imitation variance of completely unintelligible and unfamiliar language stimuli (Hindi could be explained by working memory together with a singer’s sense of rhythm and quality of voice. This supports the idea that both vocal behaviors have a common grounding in terms of vocal and motor flexibility, ontogenetic and phylogenetic development, neural orchestration and sound memory with singing fitting better into the category of "speech" on the productive level and "music" on the acoustic level. As a result, good singers benefit from vocal and motor flexibility, productively and cognitively, in three ways. 1. Motor flexibility and the ability to sing improve language and musical function. 2. Good singers retain a certain plasticity and are open to new and unusual sound combinations during adulthood both perceptually and productively. 3. The ability to sing improves the memory span of the auditory short term memory.
Speech Alarms Pilot Study

Science.gov (United States)

Sandor, Aniko; Moses, Haifa

2016-01-01

Speech alarms have been used extensively in aviation and included in International Building Codes (IBC) and National Fire Protection Association's (NFPA) Life Safety Code. However, they have not been implemented on space vehicles. Previous studies conducted at NASA JSC showed that speech alarms lead to faster identification and higher accuracy. This research evaluated updated speech and tone alerts in a laboratory environment and in the Human Exploration Research Analog (HERA) in a realistic setup.
Freedom of Speech Newsletter, September, 1975.

Science.gov (United States)

Allen, Winfred G., Jr., Ed.

The Freedom of Speech Newsletter is the communication medium for the Freedom of Speech Interest Group of the Western Speech Communication Association. The newsletter contains such features as a statement of concern by the National Ad Hoc Committee Against Censorship; Reticence and Free Speech, an article by James F. Vickrey discussing the subtle…
Automatic speech recognition used for evaluation of text-to-speech systems

Czech Academy of Sciences Publication Activity Database

Vích, Robert; Nouza, J.; Vondra, Martin

-, č. 5042 (2008), s. 136-148 ISSN 0302-9743 R&D Projects: GA AV ČR 1ET301710509; GA AV ČR 1QS108040569 Institutional research plan: CEZ:AV0Z20670512 Keywords : speech recognition * speech processing Subject RIV: JA - Electronics ; Optoelectronics, Electrical Engineering
Outcome measures in stapes surgery: postoperative results are independent from preoperative parameters.

Science.gov (United States)

Koopmann, Mario; Weiss, Daniel; Savvas, Eleftherios; Rudack, Claudia; Stenner, Markus

2015-09-01

The aim of this study was to compare audiometric results before and after stapes surgery and identify potential prognostic factors to appropriately select patients with otosclerosis who will most likely benefit from surgery. We enrolled 126 patients with otosclerosis (162 consecutive ears) in our study who underwent stapes surgery between 2007 and 2012 at our institution. Preoperative and postoperative data including pure-tone audiometry, speech audiometry, stapedial reflex audiometry and surgical data were analyzed. The average preoperative air-bone gap (ABG) was 28.9 ± 8.6 dB. Male patients and patients older than 45 years of age had greater preoperative ABGs in comparison to females and younger patients. Postoperative ABGs were 11.2 ± 7.4 dB. The average ABG gain was 17.7 ± 11.1 dB. Preoperative audiometric data, age, gender and type of surgery did not influence the postoperative results. Stapes surgery offers predictable results independent from disease progression or patient-related factors. While absolute values of hearing improvement are instrumental in reflecting audiometric results of a cohort, relative values better reflect individual's audiometric data resembling the patient's benefit.
Audiologic and subjective evaluation of Baha® Attract device.

Science.gov (United States)

Pérez-Carbonell, Tomàs; Pla-Gil, Ignacio; Redondo-Martínez, Jaume; Morant-Ventura, Antonio; García-Callejo, Francisco Javier; Marco-Algarra, Jaime

We included 9 patients implanted with Baha ® Attract. All our patients were evaluated by free field tonal audiometry, free field verbal audiometry and free field verbal audiometry with background noise, all the tests were performed with and without the device. To evaluate the subjective component of the implantation, we used the Glasgow Benefit Inventory (GBI) and Abbreviated Profile of Hearing Aid Benefit (APHAB). The auditive assessment with the device showed average auditive thresholds of 35.8dB with improvements of 25.8dB over the previous situation. Speech reception thresholds were 37dB with Baha ® Attract, showing improvements of 23dB. Maximum discrimination thresholds showed an average gain of 60dB with the device. Baha ® Attract achieves auditive improvements in patients for whom it is correctly indicated, with a consequent positive subjective evaluation. This study shows the attenuation effect in transcutaneous transmission, that prevents the device achieving greater improvements. Copyright © 2017 Elsevier España, S.L.U. and Sociedad Española de Otorrinolaringología y Cirugía de Cabeza y Cuello. All rights reserved.
SynFace—Speech-Driven Facial Animation for Virtual Speech-Reading Support

Directory of Open Access Journals (Sweden)

Giampiero Salvi

2009-01-01

Full Text Available This paper describes SynFace, a supportive technology that aims at enhancing audio-based spoken communication in adverse acoustic conditions by providing the missing visual information in the form of an animated talking head. Firstly, we describe the system architecture, consisting of a 3D animated face model controlled from the speech input by a specifically optimised phonetic recogniser. Secondly, we report on speech intelligibility experiments with focus on multilinguality and robustness to audio quality. The system, already available for Swedish, English, and Flemish, was optimised for German and for Swedish wide-band speech quality available in TV, radio, and Internet communication. Lastly, the paper covers experiments with nonverbal motions driven from the speech signal. It is shown that turn-taking gestures can be used to affect the flow of human-human dialogues. We have focused specifically on two categories of cues that may be extracted from the acoustic signal: prominence/emphasis and interactional cues (turn-taking/back-channelling.
The Effect of English Verbal Songs on Connected Speech Aspects of Adult English Learners’ Speech Production

Directory of Open Access Journals (Sweden)

Farshid Tayari Ashtiani

2015-02-01

Full Text Available The present study was an attempt to investigate the impact of English verbal songs on connected speech aspects of adult English learners’ speech production. 40 participants were selected based on the results of their performance in a piloted and validated version of NELSON test given to 60 intermediate English learners in a language institute in Tehran. Then they were equally distributed in two control and experimental groups and received a validated pretest of reading aloud and speaking in English. Afterward, the treatment was performed in 18 sessions by singing preselected songs culled based on some criteria such as popularity, familiarity, amount, and speed of speech delivery, etc. In the end, the posttests of reading aloud and speaking in English were administered. The results revealed that the treatment had statistically positive effects on the connected speech aspects of English learners’ speech production at statistical .05 level of significance. Meanwhile, the results represented that there was not any significant difference between the experimental group’s mean scores on the posttests of reading aloud and speaking. It was thus concluded that providing the EFL learners with English verbal songs could positively affect connected speech aspects of both modes of speech production, reading aloud and speaking. The Findings of this study have pedagogical implications for language teachers to be more aware and knowledgeable of the benefits of verbal songs to promote speech production of language learners in terms of naturalness and fluency. Keywords: English Verbal Songs, Connected Speech, Speech Production, Reading Aloud, Speaking
[A case of transient auditory agnosia and schizophrenia].

Science.gov (United States)

Kanzaki, Jin; Harada, Tatsuhiko; Kanzaki, Sho

2011-03-01

We report a case of transient functional auditory agnosia and schizophrenia and discuss their relationship. A 30-year-old woman with schizophrenia reporting bilateral hearing loss was found in history taking to be able to hear but could neither understand speech nor discriminate among environmental sounds. Audiometry clarified normal but low speech discrimination. Otoacoustic emission and auditory brainstem response were normal. Magnetic resonance imaging (MRI) elsewhere evidenced no abnormal findings. We assumed that taking care of her grandparents who had been discharged from the hospital had unduly stressed her, and her condition improved shortly after she stopped caring for them, returned home and started taking a minor tranquilizer.
Assessment of auditory processing in children with dyslalia

Directory of Open Access Journals (Sweden)

Wlodarczyk £.

2011-09-01

Full Text Available The objective of the work was to assess occurrence of central auditory processing disorders in children with dyslalia. Material and method. The material included 30 children at the age 798 years old being under long-term speech therapy care due to articulation disorders. All the children were subjected to the phoniatric and speech examination, including tonal and impedance audiometry, speech therapist's consultation and psychologist's consultation. Electrophysi-ological (N2, P2, N2, P2, P300 record and following psychoacoustic test of central auditory functions were performed (Frequency Pattern Test. Results. Analysis of the results revealed disorders in the process of sound analysis within frequency and P300 wave latency prolongation in children with dyslalia. Conclusions. Auditory processing disorders may be significant in development of correct articulation in children, they also may explain unsatisfactory results of long-term speech therapy
An analysis of the masking of speech by competing speech using self-report data (L)

OpenAIRE

Agus, Trevor R.; Akeroyd, Michael A.; Noble, William; Bhullar, Navjot

2009-01-01

Many of the items in the “Speech, Spatial, and Qualities of Hearing” scale questionnaire [S. Gatehouse and W. Noble, Int. J. Audiol.43, 85–99 (2004)] are concerned with speech understanding in a variety of backgrounds, both speech and nonspeech. To study if this self-report data reflected informational masking, previously collected data on 414 people were analyzed. The lowest scores (greatest difficulties) were found for the two items in which there were two speech targets, with successively ...
Illustrated Speech Anatomy.

Science.gov (United States)

Shearer, William M.

Written for students in the fields of speech correction and audiology, the text deals with the following: structures involved in respiration; the skeleton and the processes of inhalation and exhalation; phonation and pitch, the larynx, and esophageal speech; muscles involved in articulation; muscles involved in resonance; and the anatomy of the…
Speech Entrainment Compensates for Broca's Area Damage

Science.gov (United States)

Fridriksson, Julius; Basilakos, Alexandra; Hickok, Gregory; Bonilha, Leonardo; Rorden, Chris

2015-01-01

Speech entrainment (SE), the online mimicking of an audiovisual speech model, has been shown to increase speech fluency in patients with Broca's aphasia. However, not all individuals with aphasia benefit from SE. The purpose of this study was to identify patterns of cortical damage that predict a positive response SE's fluency-inducing effects. Forty-four chronic patients with left hemisphere stroke (15 female) were included in this study. Participants completed two tasks: 1) spontaneous speech production, and 2) audiovisual SE. Number of different words per minute was calculated as a speech output measure for each task, with the difference between SE and spontaneous speech conditions yielding a measure of fluency improvement. Voxel-wise lesion-symptom mapping (VLSM) was used to relate the number of different words per minute for spontaneous speech, SE, and SE-related improvement to patterns of brain damage in order to predict lesion locations associated with the fluency-inducing response to speech entrainment. Individuals with Broca's aphasia demonstrated a significant increase in different words per minute during speech entrainment versus spontaneous speech. A similar pattern of improvement was not seen in patients with other types of aphasia. VLSM analysis revealed damage to the inferior frontal gyrus predicted this response. Results suggest that SE exerts its fluency-inducing effects by providing a surrogate target for speech production via internal monitoring processes. Clinically, these results add further support for the use of speech entrainment to improve speech production and may help select patients for speech entrainment treatment. PMID:25989443
[The Russian-language version of the matrix test (RUMatrix) in free field in patients after cochlear implantation in the long term].

Science.gov (United States)

Goykhburg, M V; Bakhshinyan, V V; Petrova, I P; Wazybok, A; Kollmeier, B; Tavartkiladze, G A

The deterioration of speech intelligibility in the patients using cochlear implantation (CI) systems is especially well apparent in the noisy environment. It explains why phrasal speech tests, such as a Matrix sentence test, have become increasingly more popular in the speech audiometry during rehabilitation after CI. The Matrix test allows to estimate speech perception by the patients in a real life situation. The objective of this study was to assess the effectiveness of audiological rehabilitation of CI patients using the Russian-language version of the matrix test (RUMatrix) in free field in the noisy environment. 33 patients aged from 5 to 40 years with a more than 3 year experience of using cochlear implants inserted at the National Research Center for Audiology and Hearing Rehabilitation were included in our study. Five of these patients were implanted bilaterally. The results of our study showed a statistically significant improvement of speech intelligibility in the noisy environment after the speech processor adjustment; dynamics of the signal-to-noise ratio changes was -1.7 dB (planguages makes possible its application in international multicenter studies.
Patterns of poststroke brain damage that predict speech production errors in apraxia of speech and aphasia dissociate.

Science.gov (United States)

Basilakos, Alexandra; Rorden, Chris; Bonilha, Leonardo; Moser, Dana; Fridriksson, Julius

2015-06-01

Acquired apraxia of speech (AOS) is a motor speech disorder caused by brain damage. AOS often co-occurs with aphasia, a language disorder in which patients may also demonstrate speech production errors. The overlap of speech production deficits in both disorders has raised questions on whether AOS emerges from a unique pattern of brain damage or as a subelement of the aphasic syndrome. The purpose of this study was to determine whether speech production errors in AOS and aphasia are associated with distinctive patterns of brain injury. Forty-three patients with history of a single left-hemisphere stroke underwent comprehensive speech and language testing. The AOS Rating Scale was used to rate speech errors specific to AOS versus speech errors that can also be associated with both AOS and aphasia. Localized brain damage was identified using structural magnetic resonance imaging, and voxel-based lesion-impairment mapping was used to evaluate the relationship between speech errors specific to AOS, those that can occur in AOS or aphasia, and brain damage. The pattern of brain damage associated with AOS was most strongly associated with damage to cortical motor regions, with additional involvement of somatosensory areas. Speech production deficits that could be attributed to AOS or aphasia were associated with damage to the temporal lobe and the inferior precentral frontal regions. AOS likely occurs in conjunction with aphasia because of the proximity of the brain areas supporting speech and language, but the neurobiological substrate for each disorder differs. © 2015 American Heart Association, Inc.

A NOVEL APPROACH TO STUTTERED SPEECH CORRECTION

Directory of Open Access Journals (Sweden)

Alim Sabur Ajibola

2016-06-01

Full Text Available Stuttered speech is a dysfluency rich speech, more prevalent in males than females. It has been associated with insufficient air pressure or poor articulation, even though the root causes are more complex. The primary features include prolonged speech and repetitive speech, while some of its secondary features include, anxiety, fear, and shame. This study used LPC analysis and synthesis algorithms to reconstruct the stuttered speech. The results were evaluated using cepstral distance, Itakura-Saito distance, mean square error, and likelihood ratio. These measures implied perfect speech reconstruction quality. ASR was used for further testing, and the results showed that all the reconstructed speech samples were perfectly recognized while only three samples of the original speech were perfectly recognized.
Prisoner Fasting as Symbolic Speech: The Ultimate Speech-Action Test.

Science.gov (United States)

Sneed, Don; Stonecipher, Harry W.

The ultimate test of the speech-action dichotomy, as it relates to symbolic speech to be considered by the courts, may be the fasting of prison inmates who use hunger strikes to protest the conditions of their confinement or to make political statements. While hunger strikes have been utilized by prisoners for years as a means of protest, it was…
Childhood apraxia of speech and multiple phonological disorders in Cairo-Egyptian Arabic speaking children: language, speech, and oro-motor differences.

Science.gov (United States)

Aziz, Azza Adel; Shohdi, Sahar; Osman, Dalia Mostafa; Habib, Emad Iskander

2010-06-01

Childhood apraxia of speech is a neurological childhood speech-sound disorder in which the precision and consistency of movements underlying speech are impaired in the absence of neuromuscular deficits. Children with childhood apraxia of speech and those with multiple phonological disorder share some common phonological errors that can be misleading in diagnosis. This study posed a question about a possible significant difference in language, speech and non-speech oral performances between children with childhood apraxia of speech, multiple phonological disorder and normal children that can be used for a differential diagnostic purpose. 30 pre-school children between the ages of 4 and 6 years served as participants. Each of these children represented one of 3 possible subject-groups: Group 1: multiple phonological disorder; Group 2: suspected cases of childhood apraxia of speech; Group 3: control group with no communication disorder. Assessment procedures included: parent interviews; testing of non-speech oral motor skills and testing of speech skills. Data showed that children with suspected childhood apraxia of speech showed significantly lower language score only in their expressive abilities. Non-speech tasks did not identify significant differences between childhood apraxia of speech and multiple phonological disorder groups except for those which required two sequential motor performances. In speech tasks, both consonant and vowel accuracy were significantly lower and inconsistent in childhood apraxia of speech group than in the multiple phonological disorder group. Syllable number, shape and sequence accuracy differed significantly in the childhood apraxia of speech group than the other two groups. In addition, children with childhood apraxia of speech showed greater difficulty in processing prosodic features indicating a clear need to address these variables for differential diagnosis and treatment of children with childhood apraxia of speech. Copyright (c
Individual differneces in degraded speech perception

Science.gov (United States)

Carbonell, Kathy M.

One of the lasting concerns in audiology is the unexplained individual differences in speech perception performance even for individuals with similar audiograms. One proposal is that there are cognitive/perceptual individual differences underlying this vulnerability and that these differences are present in normal hearing (NH) individuals but do not reveal themselves in studies that use clear speech produced in quiet (because of a ceiling effect). However, previous studies have failed to uncover cognitive/perceptual variables that explain much of the variance in NH performance on more challenging degraded speech tasks. This lack of strong correlations may be due to either examining the wrong measures (e.g., working memory capacity) or to there being no reliable differences in degraded speech performance in NH listeners (i.e., variability in performance is due to measurement noise). The proposed project has 3 aims; the first, is to establish whether there are reliable individual differences in degraded speech performance for NH listeners that are sustained both across degradation types (speech in noise, compressed speech, noise-vocoded speech) and across multiple testing sessions. The second aim is to establish whether there are reliable differences in NH listeners' ability to adapt their phonetic categories based on short-term statistics both across tasks and across sessions; and finally, to determine whether performance on degraded speech perception tasks are correlated with performance on phonetic adaptability tasks, thus establishing a possible explanatory variable for individual differences in speech perception for NH and hearing impaired listeners.
Collective speech acts

NARCIS (Netherlands)

Meijers, A.W.M.; Tsohatzidis, S.L.

2007-01-01

From its early development in the 1960s, speech act theory always had an individualistic orientation. It focused exclusively on speech acts performed by individual agents. Paradigmatic examples are ‘I promise that p’, ‘I order that p’, and ‘I declare that p’. There is a single speaker and a single
Commencement Speech as a Hybrid Polydiscursive Practice

Directory of Open Access Journals (Sweden)

Светлана Викторовна Иванова

2017-12-01

Full Text Available Discourse and media communication researchers pay attention to the fact that popular discursive and communicative practices have a tendency to hybridization and convergence. Discourse which is understood as language in use is flexible. Consequently, it turns out that one and the same text can represent several types of discourses. A vivid example of this tendency is revealed in American commencement speech / commencement address / graduation speech. A commencement speech is a speech university graduates are addressed with which in compliance with the modern trend is delivered by outstanding media personalities (politicians, athletes, actors, etc.. The objective of this study is to define the specificity of the realization of polydiscursive practices within commencement speech. The research involves discursive, contextual, stylistic and definitive analyses. Methodologically the study is based on the discourse analysis theory, in particular the notion of a discursive practice as a verbalized social practice makes up the conceptual basis of the research. This research draws upon a hundred commencement speeches delivered by prominent representatives of American society since 1980s till now. In brief, commencement speech belongs to institutional discourse public speech embodies. Commencement speech institutional parameters are well represented in speeches delivered by people in power like American and university presidents. Nevertheless, as the results of the research indicate commencement speech institutional character is not its only feature. Conceptual information analysis enables to refer commencement speech to didactic discourse as it is aimed at teaching university graduates how to deal with challenges life is rich in. Discursive practices of personal discourse are also actively integrated into the commencement speech discourse. More than that, existential discursive practices also find their way into the discourse under study. Commencement
The effectiveness of Speech-Music Therapy for Aphasia (SMTA) in five speakers with Apraxia of Speech and aphasia

NARCIS (Netherlands)

Hurkmans, Joost; Jonkers, Roel; de Bruijn, Madeleen; Boonstra, Anne M.; Hartman, Paul P.; Arendzen, Hans; Reinders - Messelink, Heelen

2015-01-01

Background: Several studies using musical elements in the treatment of neurological language and speech disorders have reported improvement of speech production. One such programme, Speech-Music Therapy for Aphasia (SMTA), integrates speech therapy and music therapy (MT) to treat the individual with
Current trends in multilingual speech processing

Indian Academy of Sciences (India)

2016-08-26

; speech-to-speech translation; language identiﬁcation. ... interest owing to two strong driving forces. Firstly, technical advances in speech recognition and synthesis are posing new challenges and opportunities to researchers.
Do long-term tongue piercings affect speech quality?

Science.gov (United States)

Heinen, Esther; Birkholz, Peter; Willmes, Klaus; Neuschaefer-Rube, Christiane

2017-10-01

To explore possible effects of tongue piercing on perceived speech quality. Using a quasi-experimental design, we analyzed the effect of tongue piercing on speech in a perception experiment. Samples of spontaneous speech and read speech were recorded from 20 long-term pierced and 20 non-pierced individuals (10 males, 10 females each). The individuals having a tongue piercing were recorded with attached and removed piercing. The audio samples were blindly rated by 26 female and 20 male laypersons and by 5 female speech-language pathologists with regard to perceived speech quality along 5 dimensions: speech clarity, speech rate, prosody, rhythm and fluency. We found no statistically significant differences for any of the speech quality dimensions between the pierced and non-pierced individuals, neither for the read nor for the spontaneous speech. In addition, neither length nor position of piercing had a significant effect on speech quality. The removal of tongue piercings had no effects on speech performance either. Rating differences between laypersons and speech-language pathologists were not dependent on the presence of a tongue piercing. People are able to perfectly adapt their articulation to long-term tongue piercings such that their speech quality is not perceptually affected.
Patterns of Post-Stroke Brain Damage that Predict Speech Production Errors in Apraxia of Speech and Aphasia Dissociate

Science.gov (United States)

Basilakos, Alexandra; Rorden, Chris; Bonilha, Leonardo; Moser, Dana; Fridriksson, Julius

2015-01-01

Background and Purpose Acquired apraxia of speech (AOS) is a motor speech disorder caused by brain damage. AOS often co-occurs with aphasia, a language disorder in which patients may also demonstrate speech production errors. The overlap of speech production deficits in both disorders has raised questions regarding if AOS emerges from a unique pattern of brain damage or as a sub-element of the aphasic syndrome. The purpose of this study was to determine whether speech production errors in AOS and aphasia are associated with distinctive patterns of brain injury. Methods Forty-three patients with history of a single left-hemisphere stroke underwent comprehensive speech and language testing. The Apraxia of Speech Rating Scale was used to rate speech errors specific to AOS versus speech errors that can also be associated with AOS and/or aphasia. Localized brain damage was identified using structural MRI, and voxel-based lesion-impairment mapping was used to evaluate the relationship between speech errors specific to AOS, those that can occur in AOS and/or aphasia, and brain damage. Results The pattern of brain damage associated with AOS was most strongly associated with damage to cortical motor regions, with additional involvement of somatosensory areas. Speech production deficits that could be attributed to AOS and/or aphasia were associated with damage to the temporal lobe and the inferior pre-central frontal regions. Conclusion AOS likely occurs in conjunction with aphasia due to the proximity of the brain areas supporting speech and language, but the neurobiological substrate for each disorder differs. PMID:25908457
Progressive apraxia of speech as a window into the study of speech planning processes.

Science.gov (United States)

Laganaro, Marina; Croisier, Michèle; Bagou, Odile; Assal, Frédéric

2012-09-01

We present a 3-year follow-up study of a patient with progressive apraxia of speech (PAoS), aimed at investigating whether the theoretical organization of phonetic encoding is reflected in the progressive disruption of speech. As decreased speech rate was the most striking pattern of disruption during the first 2 years, durational analyses were carried out longitudinally on syllables excised from spontaneous, repetition and reading speech samples. The crucial result of the present study is the demonstration of an effect of syllable frequency on duration: the progressive disruption of articulation rate did not affect all syllables in the same way, but followed a gradient that was function of the frequency of use of syllable-sized motor programs. The combination of data from this case of PAoS with previous psycholinguistic and neurolinguistic data, points to a frequency organization of syllable-sized speech-motor plans. In this study we also illustrate how studying PAoS can be exploited in theoretical and clinical investigations of phonetic encoding as it represents a unique opportunity to investigate speech while it progressively disrupts. Copyright © 2011 Elsevier Srl. All rights reserved.
Musicians do not benefit from differences in fundamental frequency when listening to speech in competing speech backgrounds

DEFF Research Database (Denmark)

Madsen, Sara Miay Kim; Whiteford, Kelly L.; Oxenham, Andrew J.

2017-01-01

Recent studies disagree on whether musicians have an advantage over non-musicians in understanding speech in noise. However, it has been suggested that musicians may be able to use diferences in fundamental frequency (F0) to better understand target speech in the presence of interfering talkers....... Here we studied a relatively large (N=60) cohort of young adults, equally divided between nonmusicians and highly trained musicians, to test whether the musicians were better able to understand speech either in noise or in a two-talker competing speech masker. The target speech and competing speech...... were presented with either their natural F0 contours or on a monotone F0, and the F0 diference between the target and masker was systematically varied. As expected, speech intelligibility improved with increasing F0 diference between the target and the two-talker masker for both natural and monotone...
Novel Techniques for Dialectal Arabic Speech Recognition

CERN Document Server

Elmahdy, Mohamed; Minker, Wolfgang

2012-01-01

Novel Techniques for Dialectal Arabic Speech describes approaches to improve automatic speech recognition for dialectal Arabic. Since speech resources for dialectal Arabic speech recognition are very sparse, the authors describe how existing Modern Standard Arabic (MSA) speech data can be applied to dialectal Arabic speech recognition, while assuming that MSA is always a second language for all Arabic speakers. In this book, Egyptian Colloquial Arabic (ECA) has been chosen as a typical Arabic dialect. ECA is the first ranked Arabic dialect in terms of number of speakers, and a high quality ECA speech corpus with accurate phonetic transcription has been collected. MSA acoustic models were trained using news broadcast speech. In order to cross-lingually use MSA in dialectal Arabic speech recognition, the authors have normalized the phoneme sets for MSA and ECA. After this normalization, they have applied state-of-the-art acoustic model adaptation techniques like Maximum Likelihood Linear Regression (MLLR) and M...
Speech and Communication Disorders

Science.gov (United States)

... to being completely unable to speak or understand speech. Causes include Hearing disorders and deafness Voice problems, ... or those caused by cleft lip or palate Speech problems like stuttering Developmental disabilities Learning disorders Autism ...
Speech of people with autism: Echolalia and echolalic speech

OpenAIRE

Błeszyński, Jacek Jarosław

2013-01-01

Speech of people with autism is recognised as one of the basic diagnostic, therapeutic and theoretical problems. One of the most common symptoms of autism in children is echolalia, described here as being of different types and severity. This paper presents the results of studies into different levels of echolalia, both in normally developing children and in children diagnosed with autism, discusses the differences between simple echolalia and echolalic speech - which can be considered to b...
A Diagnostic Marker to Discriminate Childhood Apraxia of Speech from Speech Delay: Introduction

Science.gov (United States)

Shriberg, Lawrence D.; Strand, Edythe A.; Fourakis, Marios; Jakielski, Kathy J.; Hall, Sheryl D.; Karlsson, Heather B.; Mabie, Heather L.; McSweeny, Jane L.; Tilkens, Christie M.; Wilson, David L.

2017-01-01

Purpose: The goal of this article is to introduce the pause marker (PM), a single-sign diagnostic marker proposed to discriminate early or persistent childhood apraxia of speech (CAS) from speech delay.
A speech production model including the nasal Cavity: A novel approach to articulatory analysis of speech signals

DEFF Research Database (Denmark)

Olesen, Morten

In order to obtain articulatory analysis of speech production the model is improved. the standard model, as used in LPC analysis, to a large extent only models the acoustic properties of speech signal as opposed to articulatory modelling of the speech production. In spite of this the LPC model...... is by far the most widely used model in speech technology....
Successful and rapid response of speech bulb reduction program combined with speech therapy in velopharyngeal dysfunction: a case report.

Science.gov (United States)

Shin, Yu-Jeong; Ko, Seung-O

2015-12-01

Velopharyngeal dysfunction in cleft palate patients following the primary palate repair may result in nasal air emission, hypernasality, articulation disorder and poor intelligibility of speech. Among conservative treatment methods, speech aid prosthesis combined with speech therapy is widely used method. However because of its long time of treatment more than a year and low predictability, some clinicians prefer a surgical intervention. Thus, the purpose of this report was to increase an attention on the effectiveness of speech aid prosthesis by introducing a case that was successfully treated. In this clinical report, speech bulb reduction program with intensive speech therapy was applied for a patient with velopharyngeal dysfunction and it was rapidly treated by 5months which was unusually short period for speech aid therapy. Furthermore, advantages of pre-operative speech aid therapy were discussed.
Speech Intelligibility Evaluation for Mobile Phones

DEFF Research Database (Denmark)

Jørgensen, Søren; Cubick, Jens; Dau, Torsten

2015-01-01

In the development process of modern telecommunication systems, such as mobile phones, it is common practice to use computer models to objectively evaluate the transmission quality of the system, instead of time-consuming perceptual listening tests. Such models have typically focused on the quality...... of the transmitted speech, while little or no attention has been provided to speech intelligibility. The present study investigated to what extent three state-of-the art speech intelligibility models could predict the intelligibility of noisy speech transmitted through mobile phones. Sentences from the Danish...... Dantale II speech material were mixed with three different kinds of background noise, transmitted through three different mobile phones, and recorded at the receiver via a local network simulator. The speech intelligibility of the transmitted sentences was assessed by six normal-hearing listeners...
Atypical speech versus non-speech detection and discrimination in 4- to 6- yr old children with autism spectrum disorder: An ERP study.

Directory of Open Access Journals (Sweden)

Alena Galilee

Full Text Available Previous event-related potential (ERP research utilizing oddball stimulus paradigms suggests diminished processing of speech versus non-speech sounds in children with an Autism Spectrum Disorder (ASD. However, brain mechanisms underlying these speech processing abnormalities, and to what extent they are related to poor language abilities in this population remain unknown. In the current study, we utilized a novel paired repetition paradigm in order to investigate ERP responses associated with the detection and discrimination of speech and non-speech sounds in 4- to 6-year old children with ASD, compared with gender and verbal age matched controls. ERPs were recorded while children passively listened to pairs of stimuli that were either both speech sounds, both non-speech sounds, speech followed by non-speech, or non-speech followed by speech. Control participants exhibited N330 match/mismatch responses measured from temporal electrodes, reflecting speech versus non-speech detection, bilaterally, whereas children with ASD exhibited this effect only over temporal electrodes in the left hemisphere. Furthermore, while the control groups exhibited match/mismatch effects at approximately 600 ms (central N600, temporal P600 when a non-speech sound was followed by a speech sound, these effects were absent in the ASD group. These findings suggest that children with ASD fail to activate right hemisphere mechanisms, likely associated with social or emotional aspects of speech detection, when distinguishing non-speech from speech stimuli. Together, these results demonstrate the presence of atypical speech versus non-speech processing in children with ASD when compared with typically developing children matched on verbal age.

Atypical speech versus non-speech detection and discrimination in 4- to 6- yr old children with autism spectrum disorder: An ERP study.

Science.gov (United States)

Galilee, Alena; Stefanidou, Chrysi; McCleery, Joseph P

2017-01-01

Previous event-related potential (ERP) research utilizing oddball stimulus paradigms suggests diminished processing of speech versus non-speech sounds in children with an Autism Spectrum Disorder (ASD). However, brain mechanisms underlying these speech processing abnormalities, and to what extent they are related to poor language abilities in this population remain unknown. In the current study, we utilized a novel paired repetition paradigm in order to investigate ERP responses associated with the detection and discrimination of speech and non-speech sounds in 4- to 6-year old children with ASD, compared with gender and verbal age matched controls. ERPs were recorded while children passively listened to pairs of stimuli that were either both speech sounds, both non-speech sounds, speech followed by non-speech, or non-speech followed by speech. Control participants exhibited N330 match/mismatch responses measured from temporal electrodes, reflecting speech versus non-speech detection, bilaterally, whereas children with ASD exhibited this effect only over temporal electrodes in the left hemisphere. Furthermore, while the control groups exhibited match/mismatch effects at approximately 600 ms (central N600, temporal P600) when a non-speech sound was followed by a speech sound, these effects were absent in the ASD group. These findings suggest that children with ASD fail to activate right hemisphere mechanisms, likely associated with social or emotional aspects of speech detection, when distinguishing non-speech from speech stimuli. Together, these results demonstrate the presence of atypical speech versus non-speech processing in children with ASD when compared with typically developing children matched on verbal age.
Radiological evaluation of esophageal speech on total laryngectomee

International Nuclear Information System (INIS)

Chung, Tae Sub; Suh, Jung Ho; Kim, Dong Ik; Kim, Gwi Eon; Hong, Won Phy; Lee, Won Sang

1988-01-01

Total laryngectomee requires some form of alaryngeal speech for communication. Generally, esophageal speech is regarded as the most available and comfortable technique for alaryngeal speech. But esophageal speech is difficult to train, so many patients are unable to attain esophageal speech for communication. To understand mechanism of esophageal of esophageal speech on total laryngectomee, evaluation of anatomical change of the pharyngoesophageal segment is very important. We used video fluoroscopy for evaluation of pharyngesophageal segment during esophageal speech. Eighteen total laryngectomees were evaluated with video fluoroscopy from Dec. 1986 to May 1987 at Y.U.M.C. Our results were as follows: 1. Peseudoglottis is the most important factor for esophageal speech, which is visualized in 7 cases among 8 cases of excellent esophageal speech group. 2. Two cases of longer A-P diameter at the pseudoglottis have the best quality of esophageal speech than others. 3. Two cases of mucosal vibration at the pharyngoesophageal segment can make excellent esophageal speech. 4. The cases of failed esophageal speech are poor aerophagia in 6 cases, abscence of pseudoglottis in 4 cases and poor air ejection in 3 cases. 5. Aerophagia synchronizes with diaphragmatic motion in 8 cases of excellent esophageal speech.
Automatic Speech Recognition Systems for the Evaluation of Voice and Speech Disorders in Head and Neck Cancer

Directory of Open Access Journals (Sweden)

Andreas Maier

2010-01-01

Full Text Available In patients suffering from head and neck cancer, speech intelligibility is often restricted. For assessment and outcome measurements, automatic speech recognition systems have previously been shown to be appropriate for objective and quick evaluation of intelligibility. In this study we investigate the applicability of the method to speech disorders caused by head and neck cancer. Intelligibility was quantified by speech recognition on recordings of a standard text read by 41 German laryngectomized patients with cancer of the larynx or hypopharynx and 49 German patients who had suffered from oral cancer. The speech recognition provides the percentage of correctly recognized words of a sequence, that is, the word recognition rate. Automatic evaluation was compared to perceptual ratings by a panel of experts and to an age-matched control group. Both patient groups showed significantly lower word recognition rates than the control group. Automatic speech recognition yielded word recognition rates which complied with experts' evaluation of intelligibility on a significant level. Automatic speech recognition serves as a good means with low effort to objectify and quantify the most important aspect of pathologic speech—the intelligibility. The system was successfully applied to voice and speech disorders.
On speech recognition during anaesthesia

DEFF Research Database (Denmark)

Alapetite, Alexandre

2007-01-01

This PhD thesis in human-computer interfaces (informatics) studies the case of the anaesthesia record used during medical operations and the possibility to supplement it with speech recognition facilities. Problems and limitations have been identified with the traditional paper-based anaesthesia...... and inaccuracies in the anaesthesia record. Supplementing the electronic anaesthesia record interface with speech input facilities is proposed as one possible solution to a part of the problem. The testing of the various hypotheses has involved the development of a prototype of an electronic anaesthesia record...... interface with speech input facilities in Danish. The evaluation of the new interface was carried out in a full-scale anaesthesia simulator. This has been complemented by laboratory experiments on several aspects of speech recognition for this type of use, e.g. the effects of noise on speech recognition...
78 FR 63152 - Telecommunications Relay Services and Speech-to-Speech Services for Individuals With Hearing and...

Science.gov (United States)

2013-10-23

...] Telecommunications Relay Services and Speech-to-Speech Services for Individuals With Hearing and Speech Disabilities... for telecommunications relay services (TRS) by eliminating standards for Internet-based relay services... comments, identified by CG Docket No. 03-123, by any of the following methods: Electronic Filers: Comments...
Speech Alarms Pilot Study

Science.gov (United States)

Sandor, A.; Moses, H. R.

2016-01-01

Currently on the International Space Station (ISS) and other space vehicles Caution & Warning (C&W) alerts are represented with various auditory tones that correspond to the type of event. This system relies on the crew's ability to remember what each tone represents in a high stress, high workload environment when responding to the alert. Furthermore, crew receive a year or more in advance of the mission that makes remembering the semantic meaning of the alerts more difficult. The current system works for missions conducted close to Earth where ground operators can assist as needed. On long duration missions, however, they will need to work off-nominal events autonomously. There is evidence that speech alarms may be easier and faster to recognize, especially during an off-nominal event. The Information Presentation Directed Research Project (FY07-FY09) funded by the Human Research Program included several studies investigating C&W alerts. The studies evaluated tone alerts currently in use with NASA flight deck displays along with candidate speech alerts. A follow-on study used four types of speech alerts to investigate how quickly various types of auditory alerts with and without a speech component - either at the beginning or at the end of the tone - can be identified. Even though crew were familiar with the tone alert from training or direct mission experience, alerts starting with a speech component were identified faster than alerts starting with a tone. The current study replicated the results from the previous study in a more rigorous experimental design to determine if the candidate speech alarms are ready for transition to operations or if more research is needed. Four types of alarms (caution, warning, fire, and depressurization) were presented to participants in both tone and speech formats in laboratory settings and later in the Human Exploration Research Analog (HERA). In the laboratory study, the alerts were presented by software and participants were
Functional connectivity between face-movement and speech-intelligibility areas during auditory-only speech perception.

Science.gov (United States)

Schall, Sonja; von Kriegstein, Katharina

2014-01-01

It has been proposed that internal simulation of the talking face of visually-known speakers facilitates auditory speech recognition. One prediction of this view is that brain areas involved in auditory-only speech comprehension interact with visual face-movement sensitive areas, even under auditory-only listening conditions. Here, we test this hypothesis using connectivity analyses of functional magnetic resonance imaging (fMRI) data. Participants (17 normal participants, 17 developmental prosopagnosics) first learned six speakers via brief voice-face or voice-occupation training (comprehension. Overall, the present findings indicate that learned visual information is integrated into the analysis of auditory-only speech and that this integration results from the interaction of task-relevant face-movement and auditory speech-sensitive areas.
Visualizing structures of speech expressiveness

DEFF Research Database (Denmark)

Herbelin, Bruno; Jensen, Karl Kristoffer; Graugaard, Lars

2008-01-01

Speech is both beautiful and informative. In this work, a conceptual study of the speech, through investigation of the tower of Babel, the archetypal phonemes, and a study of the reasons of uses of language is undertaken in order to create an artistic work investigating the nature of speech. The ....... The artwork is presented at the Re:New festival in May 2008....
A Clinician Survey of Speech and Non-Speech Characteristics of Neurogenic Stuttering

Science.gov (United States)

Theys, Catherine; van Wieringen, Astrid; De Nil, Luc F.

2008-01-01

This study presents survey data on 58 Dutch-speaking patients with neurogenic stuttering following various neurological injuries. Stroke was the most prevalent cause of stuttering in our patients, followed by traumatic brain injury, neurodegenerative diseases, and other causes. Speech and non-speech characteristics were analyzed separately for…
Automatic Speech Signal Analysis for Clinical Diagnosis and Assessment of Speech Disorders

CERN Document Server

Baghai-Ravary, Ladan

2013-01-01

Automatic Speech Signal Analysis for Clinical Diagnosis and Assessment of Speech Disorders provides a survey of methods designed to aid clinicians in the diagnosis and monitoring of speech disorders such as dysarthria and dyspraxia, with an emphasis on the signal processing techniques, statistical validity of the results presented in the literature, and the appropriateness of methods that do not require specialized equipment, rigorously controlled recording procedures or highly skilled personnel to interpret results. Such techniques offer the promise of a simple and cost-effective, yet objective, assessment of a range of medical conditions, which would be of great value to clinicians. The ideal scenario would begin with the collection of examples of the clients’ speech, either over the phone or using portable recording devices operated by non-specialist nursing staff. The recordings could then be analyzed initially to aid diagnosis of conditions, and subsequently to monitor the clients’ progress and res...
Temporal modulations in speech and music.

Science.gov (United States)

Ding, Nai; Patel, Aniruddh D; Chen, Lin; Butler, Henry; Luo, Cheng; Poeppel, David

2017-10-01

Speech and music have structured rhythms. Here we discuss a major acoustic correlate of spoken and musical rhythms, the slow (0.25-32Hz) temporal modulations in sound intensity and compare the modulation properties of speech and music. We analyze these modulations using over 25h of speech and over 39h of recordings of Western music. We show that the speech modulation spectrum is highly consistent across 9 languages (including languages with typologically different rhythmic characteristics). A different, but similarly consistent modulation spectrum is observed for music, including classical music played by single instruments of different types, symphonic, jazz, and rock. The temporal modulations of speech and music show broad but well-separated peaks around 5 and 2Hz, respectively. These acoustically dominant time scales may be intrinsic features of speech and music, a possibility which should be investigated using more culturally diverse samples in each domain. Distinct modulation timescales for speech and music could facilitate their perceptual analysis and its neural processing. Copyright © 2017 Elsevier Ltd. All rights reserved.
The development of the University of Jordan word recognition test.

Science.gov (United States)

Garadat, Soha N; Abdulbaqi, Khader J; Haj-Tas, Maisa A

2017-06-01

To develop and validate a digitally recorded speech test battery to assess speech perception in Jordanian Arabic-speaking adults. Selected stimuli were digitally recorded and were divided into four lists of 25 words each. Speech audiometry was completed for all listeners. Participants were divided into two equal groups of 30 listeners each with equal male to female ratio. The first group of participants completed speech reception thresholds (SRTs) and word recognition testing on each of the four lists using a fixed intensity. The second group of listeners was tested on each of the four lists at different intensity levels in order to obtain the performance-intensity function. Sixty normal-hearing listeners in the age range of 19-25 years. All participants were native speakers of Jordanian Arabic. Results revealed that there were no significant differences between SRTs and pure tone average. Additionally, there were no differences across lists at multiple intensity levels. In general, the current study was successful in producing recorded speech materials for Jordanian Arabic population. This suggests that the speech stimuli generated by this study are suitable for measuring speech recognition in Jordanian Arabic-speaking listeners.
Song and speech: examining the link between singing talent and speech imitation ability

Science.gov (United States)

Christiner, Markus; Reiterer, Susanne M.

2013-01-01

In previous research on speech imitation, musicality, and an ability to sing were isolated as the strongest indicators of good pronunciation skills in foreign languages. We, therefore, wanted to take a closer look at the nature of the ability to sing, which shares a common ground with the ability to imitate speech. This study focuses on whether good singing performance predicts good speech imitation. Forty-one singers of different levels of proficiency were selected for the study and their ability to sing, to imitate speech, their musical talent and working memory were tested. Results indicated that singing performance is a better indicator of the ability to imitate speech than the playing of a musical instrument. A multiple regression revealed that 64% of the speech imitation score variance could be explained by working memory together with educational background and singing performance. A second multiple regression showed that 66% of the speech imitation variance of completely unintelligible and unfamiliar language stimuli (Hindi) could be explained by working memory together with a singer's sense of rhythm and quality of voice. This supports the idea that both vocal behaviors have a common grounding in terms of vocal and motor flexibility, ontogenetic and phylogenetic development, neural orchestration and auditory memory with singing fitting better into the category of “speech” on the productive level and “music” on the acoustic level. As a result, good singers benefit from vocal and motor flexibility, productively and cognitively, in three ways. (1) Motor flexibility and the ability to sing improve language and musical function. (2) Good singers retain a certain plasticity and are open to new and unusual sound combinations during adulthood both perceptually and productively. (3) The ability to sing improves the memory span of the auditory working memory. PMID:24319438
Dysfluencies in the speech of adults with intellectual disabilities and reported speech difficulties.

Science.gov (United States)

Coppens-Hofman, Marjolein C; Terband, Hayo R; Maassen, Ben A M; van Schrojenstein Lantman-De Valk, Henny M J; van Zaalen-op't Hof, Yvonne; Snik, Ad F M

2013-01-01

In individuals with an intellectual disability, speech dysfluencies are more common than in the general population. In clinical practice, these fluency disorders are generally diagnosed and treated as stuttering rather than cluttering. To characterise the type of dysfluencies in adults with intellectual disabilities and reported speech difficulties with an emphasis on manifestations of stuttering and cluttering, which distinction is to help optimise treatment aimed at improving fluency and intelligibility. The dysfluencies in the spontaneous speech of 28 adults (18-40 years; 16 men) with mild and moderate intellectual disabilities (IQs 40-70), who were characterised as poorly intelligible by their caregivers, were analysed using the speech norms for typically developing adults and children. The speakers were subsequently assigned to different diagnostic categories by relating their resulting dysfluency profiles to mean articulatory rate and articulatory rate variability. Twenty-two (75%) of the participants showed clinically significant dysfluencies, of which 21% were classified as cluttering, 29% as cluttering-stuttering and 25% as clear cluttering at normal articulatory rate. The characteristic pattern of stuttering did not occur. The dysfluencies in the speech of adults with intellectual disabilities and poor intelligibility show patterns that are specific for this population. Together, the results suggest that in this specific group of dysfluent speakers interventions should be aimed at cluttering rather than stuttering. The reader will be able to (1) describe patterns of dysfluencies in the speech of adults with intellectual disabilities that are specific for this group of people, (2) explain that a high rate of dysfluencies in speech is potentially a major determiner of poor intelligibility in adults with ID and (3) describe suggestions for intervention focusing on cluttering rather than stuttering in dysfluent speakers with ID. Copyright © 2013 Elsevier Inc
A Diagnostic Marker to Discriminate Childhood Apraxia of Speech from Speech Delay: III. Theoretical Coherence of the Pause Marker with Speech Processing Deficits in Childhood Apraxia of Speech

Science.gov (United States)

Shriberg, Lawrence D.; Strand, Edythe A.; Fourakis, Marios; Jakielski, Kathy J.; Hall, Sheryl D.; Karlsson, Heather B.; Mabie, Heather L.; McSweeny, Jane L.; Tilkens, Christie M.; Wilson, David L.

2017-01-01

Purpose: Previous articles in this supplement described rationale for and development of the pause marker (PM), a diagnostic marker of childhood apraxia of speech (CAS), and studies supporting its validity and reliability. The present article assesses the theoretical coherence of the PM with speech processing deficits in CAS. Method: PM and other…
Speech and language support: How physicians can identify and treat speech and language delays in the office setting.

Science.gov (United States)

Moharir, Madhavi; Barnett, Noel; Taras, Jillian; Cole, Martha; Ford-Jones, E Lee; Levin, Leo

2014-01-01

Failure to recognize and intervene early in speech and language delays can lead to multifaceted and potentially severe consequences for early child development and later literacy skills. While routine evaluations of speech and language during well-child visits are recommended, there is no standardized (office) approach to facilitate this. Furthermore, extensive wait times for speech and language pathology consultation represent valuable lost time for the child and family. Using speech and language expertise, and paediatric collaboration, key content for an office-based tool was developed. early and accurate identification of speech and language delays as well as children at risk for literacy challenges; appropriate referral to speech and language services when required; and teaching and, thus, empowering parents to create rich and responsive language environments at home. Using this tool, in combination with the Canadian Paediatric Society's Read, Speak, Sing and Grow Literacy Initiative, physicians will be better positioned to offer practical strategies to caregivers to enhance children's speech and language capabilities. The tool represents a strategy to evaluate speech and language delays. It depicts age-specific linguistic/phonetic milestones and suggests interventions. The tool represents a practical interim treatment while the family is waiting for formal speech and language therapy consultation.
Abortion and compelled physician speech.

Science.gov (United States)

Orentlicher, David

2015-01-01

Informed consent mandates for abortion providers may infringe the First Amendment's freedom of speech. On the other hand, they may reinforce the physician's duty to obtain informed consent. Courts can promote both doctrines by ensuring that compelled physician speech pertains to medical facts about abortion rather than abortion ideology and that compelled speech is truthful and not misleading. © 2015 American Society of Law, Medicine & Ethics, Inc.
Speech enhancement

CERN Document Server

Benesty, Jacob; Chen, Jingdong

2006-01-01

We live in a noisy world! In all applications (telecommunications, hands-free communications, recording, human-machine interfaces, etc.) that require at least one microphone, the signal of interest is usually contaminated by noise and reverberation. As a result, the microphone signal has to be ""cleaned"" with digital signal processing tools before it is played out, transmitted, or stored.This book is about speech enhancement. Different well-known and state-of-the-art methods for noise reduction, with one or multiple microphones, are discussed. By speech enhancement, we mean not only noise red
Effect of speech rate variation on acoustic phone stability in Afrikaans speech recognition

CSIR Research Space (South Africa)

Badenhorst, JAC

2007-11-01

Full Text Available The authors analyse the effect of speech rate variation on Afrikaans phone stability from an acoustic perspective. Specifically they introduce two techniques for the acoustic analysis of speech rate variation, apply these techniques to an Afrikaans...
Speech, "Inner Speech," and the Development of Short-Term Memory: Effects of Picture-Labeling on Recall.

Science.gov (United States)

Hitch, Graham J.; And Others

1991-01-01

Reports on experiments to determine effects of overt speech on children's use of inner speech in short-term memory. Word length and phonemic similarity had greater effects on older children and when pictures were labeled at presentation. Suggests that speaking or listening to speech activates an internal articulatory loop. (Author/GH)

Phonetic recalibration of speech by text

NARCIS (Netherlands)

Keetels, M.N.; Schakel, L.; de Bonte, M.; Vroomen, J.

2016-01-01

Listeners adjust their phonetic categories to cope with variations in the speech signal (phonetic recalibration). Previous studies have shown that lipread speech (and word knowledge) can adjust the perception of ambiguous speech and can induce phonetic adjustments (Bertelson, Vroomen, & de Gelder in
Epoch-based analysis of speech signals

Indian Academy of Sciences (India)

on speech production characteristics, but also helps in accurate analysis of speech. .... include time delay estimation, speech enhancement from single and multi- ...... log. (. E[k]. ∑K−1 l=0. E[l]. ) ,. (7) where K is the number of samples in the ...
Automatic Speech Recognition Systems for the Evaluation of Voice and Speech Disorders in Head and Neck Cancer

OpenAIRE

Andreas Maier; Tino Haderlein; Florian Stelzle; Elmar Nöth; Emeka Nkenke; Frank Rosanowski; Anne Schützenberger; Maria Schuster

2010-01-01

In patients suffering from head and neck cancer, speech intelligibility is often restricted. For assessment and outcome measurements, automatic speech recognition systems have previously been shown to be appropriate for objective and quick evaluation of intelligibility. In this study we investigate the applicability of the method to speech disorders caused by head and neck cancer. Intelligibility was quantified by speech recognition on recordings of a standard text read by 41 German laryngect...
Nobel peace speech

Directory of Open Access Journals (Sweden)

Joshua FRYE

2017-07-01

Full Text Available The Nobel Peace Prize has long been considered the premier peace prize in the world. According to Geir Lundestad, Secretary of the Nobel Committee, of the 300 some peace prizes awarded worldwide, “none is in any way as well known and as highly respected as the Nobel Peace Prize” (Lundestad, 2001. Nobel peace speech is a unique and significant international site of public discourse committed to articulating the universal grammar of peace. Spanning over 100 years of sociopolitical history on the world stage, Nobel Peace Laureates richly represent an important cross-section of domestic and international issues increasingly germane to many publics. Communication scholars’ interest in this rhetorical genre has increased in the past decade. Yet, the norm has been to analyze a single speech artifact from a prestigious or controversial winner rather than examine the collection of speeches for generic commonalities of import. In this essay, we analyze the discourse of Nobel peace speech inductively and argue that the organizing principle of the Nobel peace speech genre is the repetitive form of normative liberal principles and values that function as rhetorical topoi. These topoi include freedom and justice and appeal to the inviolable, inborn right of human beings to exercise certain political and civil liberties and the expectation of equality of protection from totalitarian and tyrannical abuses. The significance of this essay to contemporary communication theory is to expand our theoretical understanding of rhetoric’s role in the maintenance and development of an international and cross-cultural vocabulary for the grammar of peace.
Speech-Language Dissociations, Distractibility, and Childhood Stuttering

Science.gov (United States)

Conture, Edward G.; Walden, Tedra A.; Lambert, Warren E.

2015-01-01

Purpose This study investigated the relation among speech-language dissociations, attentional distractibility, and childhood stuttering. Method Participants were 82 preschool-age children who stutter (CWS) and 120 who do not stutter (CWNS). Correlation-based statistics (Bates, Appelbaum, Salcedo, Saygin, & Pizzamiglio, 2003) identified dissociations across 5 norm-based speech-language subtests. The Behavioral Style Questionnaire Distractibility subscale measured attentional distractibility. Analyses addressed (a) between-groups differences in the number of children exhibiting speech-language dissociations; (b) between-groups distractibility differences; (c) the relation between distractibility and speech-language dissociations; and (d) whether interactions between distractibility and dissociations predicted the frequency of total, stuttered, and nonstuttered disfluencies. Results More preschool-age CWS exhibited speech-language dissociations compared with CWNS, and more boys exhibited dissociations compared with girls. In addition, male CWS were less distractible than female CWS and female CWNS. For CWS, but not CWNS, less distractibility (i.e., greater attention) was associated with more speech-language dissociations. Last, interactions between distractibility and dissociations did not predict speech disfluencies in CWS or CWNS. Conclusions The present findings suggest that for preschool-age CWS, attentional processes are associated with speech-language dissociations. Future investigations are warranted to better understand the directionality of effect of this association (e.g., inefficient attentional processes → speech-language dissociations vs. inefficient attentional processes ← speech-language dissociations). PMID:26126203
International aspirations for speech-language pathologists' practice with multilingual children with speech sound disorders: development of a position paper.

Science.gov (United States)

McLeod, Sharynne; Verdon, Sarah; Bowen, Caroline

2013-01-01

A major challenge for the speech-language pathology profession in many cultures is to address the mismatch between the "linguistic homogeneity of the speech-language pathology profession and the linguistic diversity of its clientele" (Caesar & Kohler, 2007, p. 198). This paper outlines the development of the Multilingual Children with Speech Sound Disorders: Position Paper created to guide speech-language pathologists' (SLPs') facilitation of multilingual children's speech. An international expert panel was assembled comprising 57 researchers (SLPs, linguists, phoneticians, and speech scientists) with knowledge about multilingual children's speech, or children with speech sound disorders. Combined, they had worked in 33 countries and used 26 languages in professional practice. Fourteen panel members met for a one-day workshop to identify key points for inclusion in the position paper. Subsequently, 42 additional panel members participated online to contribute to drafts of the position paper. A thematic analysis was undertaken of the major areas of discussion using two data sources: (a) face-to-face workshop transcript (133 pages) and (b) online discussion artifacts (104 pages). Finally, a moderator with international expertise in working with children with speech sound disorders facilitated the incorporation of the panel's recommendations. The following themes were identified: definitions, scope, framework, evidence, challenges, practices, and consideration of a multilingual audience. The resulting position paper contains guidelines for providing services to multilingual children with speech sound disorders (http://www.csu.edu.au/research/multilingual-speech/position-paper). The paper is structured using the International Classification of Functioning, Disability and Health: Children and Youth Version (World Health Organization, 2007) and incorporates recommendations for (a) children and families, (b) SLPs' assessment and intervention, (c) SLPs' professional
Free Speech. No. 38.

Science.gov (United States)

Kane, Peter E., Ed.

This issue of "Free Speech" contains the following articles: "Daniel Schoor Relieved of Reporting Duties" by Laurence Stern, "The Sellout at CBS" by Michael Harrington, "Defending Dan Schorr" by Tome Wicker, "Speech to the Washington Press Club, February 25, 1976" by Daniel Schorr, "Funds…
APPRECIATING SPEECH THROUGH GAMING

Directory of Open Access Journals (Sweden)

Mario T Carreon

2014-06-01

Full Text Available This paper discusses the Speech and Phoneme Recognition as an Educational Aid for the Deaf and Hearing Impaired (SPREAD application and the ongoing research on its deployment as a tool for motivating deaf and hearing impaired students to learn and appreciate speech. This application uses the Sphinx-4 voice recognition system to analyze the vocalization of the student and provide prompt feedback on their pronunciation. The packaging of the application as an interactive game aims to provide additional motivation for the deaf and hearing impaired student through visual motivation for them to learn and appreciate speech.
Global Freedom of Speech

DEFF Research Database (Denmark)

Binderup, Lars Grassme

2007-01-01

, as opposed to a legal norm, that curbs exercises of the right to free speech that offend the feelings or beliefs of members from other cultural groups. The paper rejects the suggestion that acceptance of such a norm is in line with liberal egalitarian thinking. Following a review of the classical liberal...... egalitarian reasons for free speech - reasons from overall welfare, from autonomy and from respect for the equality of citizens - it is argued that these reasons outweigh the proposed reasons for curbing culturally offensive speech. Currently controversial cases such as that of the Danish Cartoon Controversy...
Extensions to the Speech Disorders Classification System (SDCS)

Science.gov (United States)

Shriberg, Lawrence D.; Fourakis, Marios; Hall, Sheryl D.; Karlsson, Heather B.; Lohmeier, Heather L.; McSweeny, Jane L.; Potter, Nancy L.; Scheer-Cohen, Alison R.; Strand, Edythe A.; Tilkens, Christie M.; Wilson, David L.

2010-01-01

This report describes three extensions to a classification system for paediatric speech sound disorders termed the Speech Disorders Classification System (SDCS). Part I describes a classification extension to the SDCS to differentiate motor speech disorders from speech delay and to differentiate among three sub-types of motor speech disorders.…
Frontal and temporal contributions to understanding the iconic co-speech gestures that accompany speech.

Science.gov (United States)

Dick, Anthony Steven; Mok, Eva H; Raja Beharelle, Anjali; Goldin-Meadow, Susan; Small, Steven L

2014-03-01

In everyday conversation, listeners often rely on a speaker's gestures to clarify any ambiguities in the verbal message. Using fMRI during naturalistic story comprehension, we examined which brain regions in the listener are sensitive to speakers' iconic gestures. We focused on iconic gestures that contribute information not found in the speaker's talk, compared with those that convey information redundant with the speaker's talk. We found that three regions-left inferior frontal gyrus triangular (IFGTr) and opercular (IFGOp) portions, and left posterior middle temporal gyrus (MTGp)--responded more strongly when gestures added information to nonspecific language, compared with when they conveyed the same information in more specific language; in other words, when gesture disambiguated speech as opposed to reinforced it. An increased BOLD response was not found in these regions when the nonspecific language was produced without gesture, suggesting that IFGTr, IFGOp, and MTGp are involved in integrating semantic information across gesture and speech. In addition, we found that activity in the posterior superior temporal sulcus (STSp), previously thought to be involved in gesture-speech integration, was not sensitive to the gesture-speech relation. Together, these findings clarify the neurobiology of gesture-speech integration and contribute to an emerging picture of how listeners glean meaning from gestures that accompany speech. Copyright © 2012 Wiley Periodicals, Inc.
Freedom of racist speech: Ego and expressive threats.

Science.gov (United States)

White, Mark H; Crandall, Christian S

2017-09-01

Do claims of "free speech" provide cover for prejudice? We investigate whether this defense of racist or hate speech serves as a justification for prejudice. In a series of 8 studies (N = 1,624), we found that explicit racial prejudice is a reliable predictor of the "free speech defense" of racist expression. Participants endorsed free speech values for singing racists songs or posting racist comments on social media; people high in prejudice endorsed free speech more than people low in prejudice (meta-analytic r = .43). This endorsement was not principled-high levels of prejudice did not predict endorsement of free speech values when identical speech was directed at coworkers or the police. Participants low in explicit racial prejudice actively avoided endorsing free speech values in racialized conditions compared to nonracial conditions, but participants high in racial prejudice increased their endorsement of free speech values in racialized conditions. Three experiments failed to find evidence that defense of racist speech by the highly prejudiced was based in self-relevant or self-protective motives. Two experiments found evidence that the free speech argument protected participants' own freedom to express their attitudes; the defense of other's racist speech seems motivated more by threats to autonomy than threats to self-regard. These studies serve as an elaboration of the Justification-Suppression Model (Crandall & Eshleman, 2003) of prejudice expression. The justification of racist speech by endorsing fundamental political values can serve to buffer racial and hate speech from normative disapproval. (PsycINFO Database Record (c) 2017 APA, all rights reserved).
Speech coding, reconstruction and recognition using acoustics and electromagnetic waves

International Nuclear Information System (INIS)

Holzrichter, J.F.; Ng, L.C.

1998-01-01

The use of EM radiation in conjunction with simultaneously recorded acoustic speech information enables a complete mathematical coding of acoustic speech. The methods include the forming of a feature vector for each pitch period of voiced speech and the forming of feature vectors for each time frame of unvoiced, as well as for combined voiced and unvoiced speech. The methods include how to deconvolve the speech excitation function from the acoustic speech output to describe the transfer function each time frame. The formation of feature vectors defining all acoustic speech units over well defined time frames can be used for purposes of speech coding, speech compression, speaker identification, language-of-speech identification, speech recognition, speech synthesis, speech translation, speech telephony, and speech teaching. 35 figs
Speech coding, reconstruction and recognition using acoustics and electromagnetic waves

Science.gov (United States)

Holzrichter, John F.; Ng, Lawrence C.

1998-01-01

The use of EM radiation in conjunction with simultaneously recorded acoustic speech information enables a complete mathematical coding of acoustic speech. The methods include the forming of a feature vector for each pitch period of voiced speech and the forming of feature vectors for each time frame of unvoiced, as well as for combined voiced and unvoiced speech. The methods include how to deconvolve the speech excitation function from the acoustic speech output to describe the transfer function each time frame. The formation of feature vectors defining all acoustic speech units over well defined time frames can be used for purposes of speech coding, speech compression, speaker identification, language-of-speech identification, speech recognition, speech synthesis, speech translation, speech telephony, and speech teaching.
Speech and language support: How physicians can identify and treat speech and language delays in the office setting

Science.gov (United States)

Moharir, Madhavi; Barnett, Noel; Taras, Jillian; Cole, Martha; Ford-Jones, E Lee; Levin, Leo

2014-01-01

Failure to recognize and intervene early in speech and language delays can lead to multifaceted and potentially severe consequences for early child development and later literacy skills. While routine evaluations of speech and language during well-child visits are recommended, there is no standardized (office) approach to facilitate this. Furthermore, extensive wait times for speech and language pathology consultation represent valuable lost time for the child and family. Using speech and language expertise, and paediatric collaboration, key content for an office-based tool was developed. The tool aimed to help physicians achieve three main goals: early and accurate identification of speech and language delays as well as children at risk for literacy challenges; appropriate referral to speech and language services when required; and teaching and, thus, empowering parents to create rich and responsive language environments at home. Using this tool, in combination with the Canadian Paediatric Society’s Read, Speak, Sing and Grow Literacy Initiative, physicians will be better positioned to offer practical strategies to caregivers to enhance children’s speech and language capabilities. The tool represents a strategy to evaluate speech and language delays. It depicts age-specific linguistic/phonetic milestones and suggests interventions. The tool represents a practical interim treatment while the family is waiting for formal speech and language therapy consultation. PMID:24627648
Application of wavelets in speech processing

CERN Document Server

Farouk, Mohamed Hesham

2014-01-01

This book provides a survey on wide-spread of employing wavelets analysis in different applications of speech processing. The author examines development and research in different application of speech processing. The book also summarizes the state of the art research on wavelet in speech processing.
Recent advances in nonlinear speech processing

CERN Document Server

Faundez-Zanuy, Marcos; Esposito, Antonietta; Cordasco, Gennaro; Drugman, Thomas; Solé-Casals, Jordi; Morabito, Francesco

2016-01-01

This book presents recent advances in nonlinear speech processing beyond nonlinear techniques. It shows that it exploits heuristic and psychological models of human interaction in order to succeed in the implementations of socially believable VUIs and applications for human health and psychological support. The book takes into account the multifunctional role of speech and what is “outside of the box” (see Björn Schuller’s foreword). To this aim, the book is organized in 6 sections, each collecting a small number of short chapters reporting advances “inside” and “outside” themes related to nonlinear speech research. The themes emphasize theoretical and practical issues for modelling socially believable speech interfaces, ranging from efforts to capture the nature of sound changes in linguistic contexts and the timing nature of speech; labors to identify and detect speech features that help in the diagnosis of psychological and neuronal disease, attempts to improve the effectiveness and performa...
Speech and non-speech processing in children with phonological disorders: an electrophysiological study

Directory of Open Access Journals (Sweden)

Isabela Crivellaro Gonçalves

2011-01-01

Full Text Available OBJECTIVE: To determine whether neurophysiological auditory brainstem responses to clicks and repeated speech stimuli differ between typically developing children and children with phonological disorders. INTRODUCTION: Phonological disorders are language impairments resulting from inadequate use of adult phonological language rules and are among the most common speech and language disorders in children (prevalence: 8 - 9%. Our hypothesis is that children with phonological disorders have basic differences in the way that their brains encode acoustic signals at brainstem level when compared to normal counterparts. METHODS: We recorded click and speech evoked auditory brainstem responses in 18 typically developing children (control group and in 18 children who were clinically diagnosed with phonological disorders (research group. The age range of the children was from 7-11 years. RESULTS: The research group exhibited significantly longer latency responses to click stimuli (waves I, III and V and speech stimuli (waves V and A when compared to the control group. DISCUSSION: These results suggest that the abnormal encoding of speech sounds may be a biological marker of phonological disorders. However, these results cannot define the biological origins of phonological problems. We also observed that speech-evoked auditory brainstem responses had a higher specificity/sensitivity for identifying phonological disorders than click-evoked auditory brainstem responses. CONCLUSIONS: Early stages of the auditory pathway processing of an acoustic stimulus are not similar in typically developing children and those with phonological disorders. These findings suggest that there are brainstem auditory pathway abnormalities in children with phonological disorders.
Conflict monitoring in speech processing : An fMRI study of error detection in speech production and perception

NARCIS (Netherlands)

Gauvin, Hanna; De Baene, W.; Brass, Marcel; Hartsuiker, Robert

2016-01-01

To minimize the number of errors in speech, and thereby facilitate communication, speech is monitored before articulation. It is, however, unclear at which level during speech production monitoring takes place, and what mechanisms are used to detect and correct errors. The present study investigated
Religious Speech in the Military: Freedoms and Limitations

Science.gov (United States)

2011-01-01

abridging the freedom of speech .” Speech is construed broadly and includes both oral and written speech, as well as expressive conduct and displays when...intended to convey a message that is likely to be understood.7 Religious speech is certainly included. As a bedrock constitutional right, freedom of speech has...to good order and discipline or of a nature to bring discredit upon the armed forces)—the First Amendment’s freedom of speech will not provide them

Perceived Speech Quality Estimation Using DTW Algorithm

Directory of Open Access Journals (Sweden)

S. Arsenovski

2009-06-01

Full Text Available In this paper a method for speech quality estimation is evaluated by simulating the transfer of speech over packet switched and mobile networks. The proposed system uses Dynamic Time Warping algorithm for test and received speech comparison. Several tests have been made on a test speech sample of a single speaker with simulated packet (frame loss effects on the perceived speech. The achieved results have been compared with measured PESQ values on the used transmission channel and their correlation has been observed.
Audiovisual Asynchrony Detection in Human Speech

Science.gov (United States)

Maier, Joost X.; Di Luca, Massimiliano; Noppeney, Uta

2011-01-01

Combining information from the visual and auditory senses can greatly enhance intelligibility of natural speech. Integration of audiovisual speech signals is robust even when temporal offsets are present between the component signals. In the present study, we characterized the temporal integration window for speech and nonspeech stimuli with…
Detection of target phonemes in spontaneous and read speech.

Science.gov (United States)

Mehta, G; Cutler, A

1988-01-01

Although spontaneous speech occurs more frequently in most listeners' experience than read speech, laboratory studies of human speech recognition typically use carefully controlled materials read from a script. The phonological and prosodic characteristics of spontaneous and read speech differ considerably, however, which suggests that laboratory results may not generalise to the recognition of spontaneous speech. In the present study listeners were presented with both spontaneous and read speech materials, and their response time to detect word-initial target phonemes was measured. Responses were, overall, equally fast in each speech mode. However, analysis of effects previously reported in phoneme detection studies revealed significant differences between speech modes. In read speech but not in spontaneous speech, later targets were detected more rapidly than targets preceded by short words. In contrast, in spontaneous speech but not in read speech, targets were detected more rapidly in accented than in unaccented words and in strong than in weak syllables. An explanation for this pattern is offered in terms of characteristic prosodic differences between spontaneous and read speech. The results support claims from previous work that listeners pay great attention to prosodic information in the process of recognising speech.
Voice Activity Detection. Fundamentals and Speech Recognition System Robustness

OpenAIRE

Ramirez, J.; Gorriz, J. M.; Segura, J. C.

2007-01-01

This chapter has shown an overview of the main challenges in robust speech detection and a review of the state of the art and applications. VADs are frequently used in a number of applications including speech coding, speech enhancement and speech recognition. A precise VAD extracts a set of discriminative speech features from the noisy speech and formulates the decision in terms of well defined rule. The chapter has summarized three robust VAD methods that yield high speech/non-speech discri...
Religion, hate speech, and non-domination

OpenAIRE

Bonotti, Matteo

2017-01-01

In this paper I argue that one way of explaining what is wrong with hate speech is by critically assessing what kind of freedom free speech involves and, relatedly, what kind of freedom hate speech undermines. More specifically, I argue that the main arguments for freedom of speech (e.g. from truth, from autonomy, and from democracy) rely on a “positive” conception of freedom intended as autonomy and self-mastery (Berlin, 2006), and can only partially help us to understand what is wrong with ...
Modelling speech intelligibility in adverse conditions

DEFF Research Database (Denmark)

Jørgensen, Søren; Dau, Torsten

2013-01-01

Jørgensen and Dau (J Acoust Soc Am 130:1475-1487, 2011) proposed the speech-based envelope power spectrum model (sEPSM) in an attempt to overcome the limitations of the classical speech transmission index (STI) and speech intelligibility index (SII) in conditions with nonlinearly processed speech...... subjected to phase jitter, a condition in which the spectral structure of the intelligibility of speech signal is strongly affected, while the broadband temporal envelope is kept largely intact. In contrast, the effects of this distortion can be predicted -successfully by the spectro-temporal modulation...... suggest that the SNRenv might reflect a powerful decision metric, while some explicit across-frequency analysis seems crucial in some conditions. How such across-frequency analysis is "realized" in the auditory system remains unresolved....
Speech and audio processing for coding, enhancement and recognition

CERN Document Server

Togneri, Roberto; Narasimha, Madihally

2015-01-01

This book describes the basic principles underlying the generation, coding, transmission and enhancement of speech and audio signals, including advanced statistical and machine learning techniques for speech and speaker recognition with an overview of the key innovations in these areas. Key research undertaken in speech coding, speech enhancement, speech recognition, emotion recognition and speaker diarization are also presented, along with recent advances and new paradigms in these areas. · Offers readers a single-source reference on the significant applications of speech and audio processing to speech coding, speech enhancement and speech/speaker recognition. Enables readers involved in algorithm development and implementation issues for speech coding to understand the historical development and future challenges in speech coding research; · Discusses speech coding methods yielding bit-streams that are multi-rate and scalable for Voice-over-IP (VoIP) Networks; · �...
Sensorimotor influences on speech perception in infancy.

Science.gov (United States)

Bruderer, Alison G; Danielson, D Kyle; Kandhadai, Padmapriya; Werker, Janet F

2015-11-03

The influence of speech production on speech perception is well established in adults. However, because adults have a long history of both perceiving and producing speech, the extent to which the perception-production linkage is due to experience is unknown. We addressed this issue by asking whether articulatory configurations can influence infants' speech perception performance. To eliminate influences from specific linguistic experience, we studied preverbal, 6-mo-old infants and tested the discrimination of a nonnative, and hence never-before-experienced, speech sound distinction. In three experimental studies, we used teething toys to control the position and movement of the tongue tip while the infants listened to the speech sounds. Using ultrasound imaging technology, we verified that the teething toys consistently and effectively constrained the movement and positioning of infants' tongues. With a looking-time procedure, we found that temporarily restraining infants' articulators impeded their discrimination of a nonnative consonant contrast but only when the relevant articulator was selectively restrained to prevent the movements associated with producing those sounds. Our results provide striking evidence that even before infants speak their first words and without specific listening experience, sensorimotor information from the articulators influences speech perception. These results transform theories of speech perception by suggesting that even at the initial stages of development, oral-motor movements influence speech sound discrimination. Moreover, an experimentally induced "impairment" in articulator movement can compromise speech perception performance, raising the question of whether long-term oral-motor impairments may impact perceptual development.
Microscopic prediction of speech intelligibility in spatially distributed speech-shaped noise for normal-hearing listeners.

Science.gov (United States)

Geravanchizadeh, Masoud; Fallah, Ali

2015-12-01

A binaural and psychoacoustically motivated intelligibility model, based on a well-known monaural microscopic model is proposed. This model simulates a phoneme recognition task in the presence of spatially distributed speech-shaped noise in anechoic scenarios. In the proposed model, binaural advantage effects are considered by generating a feature vector for a dynamic-time-warping speech recognizer. This vector consists of three subvectors incorporating two monaural subvectors to model the better-ear hearing, and a binaural subvector to simulate the binaural unmasking effect. The binaural unit of the model is based on equalization-cancellation theory. This model operates blindly, which means separate recordings of speech and noise are not required for the predictions. Speech intelligibility tests were conducted with 12 normal hearing listeners by collecting speech reception thresholds (SRTs) in the presence of single and multiple sources of speech-shaped noise. The comparison of the model predictions with the measured binaural SRTs, and with the predictions of a macroscopic binaural model called extended equalization-cancellation, shows that this approach predicts the intelligibility in anechoic scenarios with good precision. The square of the correlation coefficient (r(2)) and the mean-absolute error between the model predictions and the measurements are 0.98 and 0.62 dB, respectively.
Prediction and constraint in audiovisual speech perception

Science.gov (United States)

Peelle, Jonathan E.; Sommers, Mitchell S.

2015-01-01

During face-to-face conversational speech listeners must efficiently process a rapid and complex stream of multisensory information. Visual speech can serve as a critical complement to auditory information because it provides cues to both the timing of the incoming acoustic signal (the amplitude envelope, influencing attention and perceptual sensitivity) and its content (place and manner of articulation, constraining lexical selection). Here we review behavioral and neurophysiological evidence regarding listeners' use of visual speech information. Multisensory integration of audiovisual speech cues improves recognition accuracy, particularly for speech in noise. Even when speech is intelligible based solely on auditory information, adding visual information may reduce the cognitive demands placed on listeners through increasing precision of prediction. Electrophysiological studies demonstrate oscillatory cortical entrainment to speech in auditory cortex is enhanced when visual speech is present, increasing sensitivity to important acoustic cues. Neuroimaging studies also suggest increased activity in auditory cortex when congruent visual information is available, but additionally emphasize the involvement of heteromodal regions of posterior superior temporal sulcus as playing a role in integrative processing. We interpret these findings in a framework of temporally-focused lexical competition in which visual speech information affects auditory processing to increase sensitivity to auditory information through an early integration mechanism, and a late integration stage that incorporates specific information about a speaker's articulators to constrain the number of possible candidates in a spoken utterance. Ultimately it is words compatible with both auditory and visual information that most strongly determine successful speech perception during everyday listening. Thus, audiovisual speech perception is accomplished through multiple stages of integration, supported
Prediction and constraint in audiovisual speech perception.

Science.gov (United States)

Peelle, Jonathan E; Sommers, Mitchell S

2015-07-01

During face-to-face conversational speech listeners must efficiently process a rapid and complex stream of multisensory information. Visual speech can serve as a critical complement to auditory information because it provides cues to both the timing of the incoming acoustic signal (the amplitude envelope, influencing attention and perceptual sensitivity) and its content (place and manner of articulation, constraining lexical selection). Here we review behavioral and neurophysiological evidence regarding listeners' use of visual speech information. Multisensory integration of audiovisual speech cues improves recognition accuracy, particularly for speech in noise. Even when speech is intelligible based solely on auditory information, adding visual information may reduce the cognitive demands placed on listeners through increasing the precision of prediction. Electrophysiological studies demonstrate that oscillatory cortical entrainment to speech in auditory cortex is enhanced when visual speech is present, increasing sensitivity to important acoustic cues. Neuroimaging studies also suggest increased activity in auditory cortex when congruent visual information is available, but additionally emphasize the involvement of heteromodal regions of posterior superior temporal sulcus as playing a role in integrative processing. We interpret these findings in a framework of temporally-focused lexical competition in which visual speech information affects auditory processing to increase sensitivity to acoustic information through an early integration mechanism, and a late integration stage that incorporates specific information about a speaker's articulators to constrain the number of possible candidates in a spoken utterance. Ultimately it is words compatible with both auditory and visual information that most strongly determine successful speech perception during everyday listening. Thus, audiovisual speech perception is accomplished through multiple stages of integration
Regulation of speech in multicultural societies

NARCIS (Netherlands)

Maussen, M.; Grillo, R.

2015-01-01

This book focuses on the way in which public debate and legal practice intersect when it comes to the value of free speech and the need to regulate "offensive", "blasphemous" or "hate" speech, especially, though not exclusively where such speech is thought to be offensive to members of ethnic and
ACOUSTIC SPEECH RECOGNITION FOR MARATHI LANGUAGE USING SPHINX

Directory of Open Access Journals (Sweden)

Aman Ankit

2016-09-01

Full Text Available Speech recognition or speech to text processing, is a process of recognizing human speech by the computer and converting into text. In speech recognition, transcripts are created by taking recordings of speech as audio and their text transcriptions. Speech based applications which include Natural Language Processing (NLP techniques are popular and an active area of research. Input to such applications is in natural language and output is obtained in natural language. Speech recognition mostly revolves around three approaches namely Acoustic phonetic approach, Pattern recognition approach and Artificial intelligence approach. Creation of acoustic model requires a large database of speech and training algorithms. The output of an ASR system is recognition and translation of spoken language into text by computers and computerized devices. ASR today finds enormous application in tasks that require human machine interfaces like, voice dialing, and etc. Our key contribution in this paper is to create corpora for Marathi language and explore the use of Sphinx engine for automatic speech recognition
Is Birdsong More Like Speech or Music?

Science.gov (United States)

Shannon, Robert V

2016-04-01

Music and speech share many acoustic cues but not all are equally important. For example, harmonic pitch is essential for music but not for speech. When birds communicate is their song more like speech or music? A new study contrasting pitch and spectral patterns shows that birds perceive their song more like humans perceive speech. Copyright © 2016 Elsevier Ltd. All rights reserved.
Effect of "developmental speech and language training through music" on speech production in children with autism spectrum disorders.

Science.gov (United States)

Lim, Hayoung A

2010-01-01

The study compared the effect of music training, speech training and no-training on the verbal production of children with Autism Spectrum Disorders (ASD). Participants were 50 children with ASD, age range 3 to 5 years, who had previously been evaluated on standard tests of language and level of functioning. They were randomly assigned to one of three 3-day conditions. Participants in music training (n = 18) watched a music video containing 6 songs and pictures of the 36 target words; those in speech training (n = 18) watched a speech video containing 6 stories and pictures, and those in the control condition (n = 14) received no treatment. Participants' verbal production including semantics, phonology, pragmatics, and prosody was measured by an experimenter designed verbal production evaluation scale. Results showed that participants in both music and speech training significantly increased their pre to posttest verbal production. Results also indicated that both high and low functioning participants improved their speech production after receiving either music or speech training; however, low functioning participants showed a greater improvement after the music training than the speech training. Children with ASD perceive important linguistic information embedded in music stimuli organized by principles of pattern perception, and produce the functional speech.
Speech networks at rest and in action: interactions between functional brain networks controlling speech production

Science.gov (United States)

Fuertinger, Stefan

2015-01-01

Speech production is one of the most complex human behaviors. Although brain activation during speaking has been well investigated, our understanding of interactions between the brain regions and neural networks remains scarce. We combined seed-based interregional correlation analysis with graph theoretical analysis of functional MRI data during the resting state and sentence production in healthy subjects to investigate the interface and topology of functional networks originating from the key brain regions controlling speech, i.e., the laryngeal/orofacial motor cortex, inferior frontal and superior temporal gyri, supplementary motor area, cingulate cortex, putamen, and thalamus. During both resting and speaking, the interactions between these networks were bilaterally distributed and centered on the sensorimotor brain regions. However, speech production preferentially recruited the inferior parietal lobule (IPL) and cerebellum into the large-scale network, suggesting the importance of these regions in facilitation of the transition from the resting state to speaking. Furthermore, the cerebellum (lobule VI) was the most prominent region showing functional influences on speech-network integration and segregation. Although networks were bilaterally distributed, interregional connectivity during speaking was stronger in the left vs. right hemisphere, which may have underlined a more homogeneous overlap between the examined networks in the left hemisphere. Among these, the laryngeal motor cortex (LMC) established a core network that fully overlapped with all other speech-related networks, determining the extent of network interactions. Our data demonstrate complex interactions of large-scale brain networks controlling speech production and point to the critical role of the LMC, IPL, and cerebellum in the formation of speech production network. PMID:25673742
Speech Synthesis Applied to Language Teaching.

Science.gov (United States)

Sherwood, Bruce

1981-01-01

The experimental addition of speech output to computer-based Esperanto lessons using speech synthesized from text is described. Because of Esperanto's phonetic spelling and simple rhythm, it is particularly easy to describe the mechanisms of Esperanto synthesis. Attention is directed to how the text-to-speech conversion is performed and the ways…
The Functional Connectome of Speech Control.

Directory of Open Access Journals (Sweden)

Stefan Fuertinger

2015-07-01

Full Text Available In the past few years, several studies have been directed to understanding the complexity of functional interactions between different brain regions during various human behaviors. Among these, neuroimaging research installed the notion that speech and language require an orchestration of brain regions for comprehension, planning, and integration of a heard sound with a spoken word. However, these studies have been largely limited to mapping the neural correlates of separate speech elements and examining distinct cortical or subcortical circuits involved in different aspects of speech control. As a result, the complexity of the brain network machinery controlling speech and language remained largely unknown. Using graph theoretical analysis of functional MRI (fMRI data in healthy subjects, we quantified the large-scale speech network topology by constructing functional brain networks of increasing hierarchy from the resting state to motor output of meaningless syllables to complex production of real-life speech as well as compared to non-speech-related sequential finger tapping and pure tone discrimination networks. We identified a segregated network of highly connected local neural communities (hubs in the primary sensorimotor and parietal regions, which formed a commonly shared core hub network across the examined conditions, with the left area 4p playing an important role in speech network organization. These sensorimotor core hubs exhibited features of flexible hubs based on their participation in several functional domains across different networks and ability to adaptively switch long-range functional connectivity depending on task content, resulting in a distinct community structure of each examined network. Specifically, compared to other tasks, speech production was characterized by the formation of six distinct neural communities with specialized recruitment of the prefrontal cortex, insula, putamen, and thalamus, which collectively
Private Speech in Ballet

Science.gov (United States)

Johnston, Dale

2006-01-01

Authoritarian teaching practices in ballet inhibit the use of private speech. This paper highlights the critical importance of private speech in the cognitive development of young ballet students, within what is largely a non-verbal art form. It draws upon research by Russian psychologist Lev Vygotsky and contemporary socioculturalists, to…
Speech versus singing: Infants choose happier sounds

Directory of Open Access Journals (Sweden)

Marieve eCorbeil

2013-06-01

Full Text Available Infants prefer speech to non-vocal sounds and to non-human vocalizations, and they prefer happy-sounding speech to neutral speech. They also exhibit an interest in singing, but there is little knowledge of their relative interest in speech and singing. The present study explored infants’ attention to unfamiliar audio samples of speech and singing. In Experiment 1, infants 4-13 months of age were exposed to happy-sounding infant-directed speech versus hummed lullabies by the same woman. They listened significantly longer to the speech, which had considerably greater acoustic variability and expressiveness, than to the lullabies. In Experiment 2, infants of comparable age who heard the lyrics of a Turkish children’s song spoken versus sung in a joyful/happy manner did not exhibit differential listening. Infants in Experiment 3 heard the happily sung lyrics of the Turkish children’s song versus a version that was spoken in an adult-directed or affectively neutral manner. They listened significantly longer to the sung version. Overall, happy voice quality rather than vocal mode (speech or singing was the principal contributor to infant attention, regardless of age.

Digitized Ethnic Hate Speech: Understanding Effects of Digital Media Hate Speech on Citizen Journalism in Kenya

Directory of Open Access Journals (Sweden)

Stephen Gichuhi Kimotho

2016-06-01

Full Text Available Ethnicity in Kenya permeates all spheres of life. However, it is in politics that ethnicity is most visible. Election time in Kenya often leads to ethnic competition and hatred, often expressed through various media. Ethnic hate speech characterized the 2007 general elections in party rallies and through text messages, emails, posters and leaflets. This resulted in widespread skirmishes that left over 1200 people dead, and many displaced (KNHRC, 2008. In 2013, however, the new battle zone was the war of words on social media platform. More than any other time in Kenyan history, Kenyans poured vitriolic ethnic hate speech through digital media like Facebook, tweeter and blogs. Although scholars have studied the role and effects of the mainstream media like television and radio in proliferating the ethnic hate speech in Kenya (Michael Chege, 2008; Goldstein & Rotich, 2008a; Ismail & Deane, 2008; Jacqueline Klopp & Prisca Kamungi, 2007, little has been done in regard to social media. This paper investigated the nature of digitized hate speech by: describing the forms of ethnic hate speech on social media in Kenya; the effects of ethnic hate speech on Kenyan’s perception of ethnic entities; ethnic conflict and ethics of citizen journalism. This study adopted a descriptive interpretive design, and utilized Austin’s Speech Act Theory, which explains use of language to achieve desired purposes and direct behaviour (Tarhom & Miracle, 2013. Content published between January and April 2013 from six purposefully identified blogs was analysed. Questionnaires were used to collect data from university students as they form a good sample of Kenyan population, are most active on social media and are drawn from all parts of the country. Qualitative data were analysed using NVIVO 10 software, while responses from the questionnaire were analysed using IBM SPSS version 21. The findings indicated that Facebook and Twitter were the main platforms used to
Speech and nonspeech: What are we talking about?

Science.gov (United States)

Maas, Edwin

2017-08-01

Understanding of the behavioural, cognitive and neural underpinnings of speech production is of interest theoretically, and is important for understanding disorders of speech production and how to assess and treat such disorders in the clinic. This paper addresses two claims about the neuromotor control of speech production: (1) speech is subserved by a distinct, specialised motor control system and (2) speech is holistic and cannot be decomposed into smaller primitives. Both claims have gained traction in recent literature, and are central to a task-dependent model of speech motor control. The purpose of this paper is to stimulate thinking about speech production, its disorders and the clinical implications of these claims. The paper poses several conceptual and empirical challenges for these claims - including the critical importance of defining speech. The emerging conclusion is that a task-dependent model is called into question as its two central claims are founded on ill-defined and inconsistently applied concepts. The paper concludes with discussion of methodological and clinical implications, including the potential utility of diadochokinetic (DDK) tasks in assessment of motor speech disorders and the contraindication of nonspeech oral motor exercises to improve speech function.
Noise-robust speech triage.

Science.gov (United States)

Bartos, Anthony L; Cipr, Tomas; Nelson, Douglas J; Schwarz, Petr; Banowetz, John; Jerabek, Ladislav

2018-04-01

A method is presented in which conventional speech algorithms are applied, with no modifications, to improve their performance in extremely noisy environments. It has been demonstrated that, for eigen-channel algorithms, pre-training multiple speaker identification (SID) models at a lattice of signal-to-noise-ratio (SNR) levels and then performing SID using the appropriate SNR dependent model was successful in mitigating noise at all SNR levels. In those tests, it was found that SID performance was optimized when the SNR of the testing and training data were close or identical. In this current effort multiple i-vector algorithms were used, greatly improving both processing throughput and equal error rate classification accuracy. Using identical approaches in the same noisy environment, performance of SID, language identification, gender identification, and diarization were significantly improved. A critical factor in this improvement is speech activity detection (SAD) that performs reliably in extremely noisy environments, where the speech itself is barely audible. To optimize SAD operation at all SNR levels, two algorithms were employed. The first maximized detection probability at low levels (-10 dB ≤ SNR < +10 dB) using just the voiced speech envelope, and the second exploited features extracted from the original speech to improve overall accuracy at higher quality levels (SNR ≥ +10 dB).
Specialization in audiovisual speech perception: a replication study

DEFF Research Database (Denmark)

Eskelund, Kasper; Andersen, Tobias

Speech perception is audiovisual as evidenced by bimodal integration in the McGurk effect. This integration effect may be specific to speech or be applied to all stimuli in general. To investigate this, Tuomainen et al. (2005) used sine-wave speech, which naÃ¯ve observers may perceive as non......-speech, but hear as speech once informed of the linguistic origin of the signal. Combinations of sine-wave speech and incongruent video of the talker elicited a McGurk effect only for informed observers. This indicates that the audiovisual integration effect is specific to speech perception. However, observers...... that observers did look near the mouth. We conclude that eye-movements did not influence the results of Tuomainen et al. and that their results thus can be taken as evidence of a speech specific mode of audiovisual integration underlying the McGurk illusion....
Speech and Debate as Civic Education

Science.gov (United States)

Hogan, J. Michael; Kurr, Jeffrey A.; Johnson, Jeremy D.; Bergmaier, Michael J.

2016-01-01

In light of the U.S. Senate's designation of March 15, 2016 as "National Speech and Debate Education Day" (S. Res. 398, 2016), it only seems fitting that "Communication Education" devote a special section to the role of speech and debate in civic education. Speech and debate have been at the heart of the communication…
Tuning Neural Phase Entrainment to Speech.

Science.gov (United States)

Falk, Simone; Lanzilotti, Cosima; Schön, Daniele

2017-08-01

Musical rhythm positively impacts on subsequent speech processing. However, the neural mechanisms underlying this phenomenon are so far unclear. We investigated whether carryover effects from a preceding musical cue to a speech stimulus result from a continuation of neural phase entrainment to periodicities that are present in both music and speech. Participants listened and memorized French metrical sentences that contained (quasi-)periodic recurrences of accents and syllables. Speech stimuli were preceded by a rhythmically regular or irregular musical cue. Our results show that the presence of a regular cue modulates neural response as estimated by EEG power spectral density, intertrial coherence, and source analyses at critical frequencies during speech processing compared with the irregular condition. Importantly, intertrial coherences for regular cues were indicative of the participants' success in memorizing the subsequent speech stimuli. These findings underscore the highly adaptive nature of neural phase entrainment across fundamentally different auditory stimuli. They also support current models of neural phase entrainment as a tool of predictive timing and attentional selection across cognitive domains.
Speech perception as an active cognitive process

Directory of Open Access Journals (Sweden)

Shannon eHeald

2014-03-01

Full Text Available One view of speech perception is that acoustic signals are transformed into representations for pattern matching to determine linguistic structure. This process can be taken as a statistical pattern-matching problem, assuming realtively stable linguistic categories are characterized by neural representations related to auditory properties of speech that can be compared to speech input. This kind of pattern matching can be termed a passive process which implies rigidity of processingd with few demands on cognitive processing. An alternative view is that speech recognition, even in early stages, is an active process in which speech analysis is attentionally guided. Note that this does not mean consciously guided but that information-contingent changes in early auditory encoding can occur as a function of context and experience. Active processing assumes that attention, plasticity, and listening goals are important in considering how listeners cope with adverse circumstances that impair hearing by masking noise in the environment or hearing loss. Although theories of speech perception have begun to incorporate some active processing, they seldom treat early speech encoding as plastic and attentionally guided. Recent research has suggested that speech perception is the product of both feedforward and feedback interactions between a number of brain regions that include descending projections perhaps as far downstream as the cochlea. It is important to understand how the ambiguity of the speech signal and constraints of context dynamically determine cognitive resources recruited during perception including focused attention, learning, and working memory. Theories of speech perception need to go beyond the current corticocentric approach in order to account for the intrinsic dynamics of the auditory encoding of speech. In doing so, this may provide new insights into ways in which hearing disorders and loss may be treated either through augementation or
Water-soluble coenzyme Q10 formulation (Q-TER(®)) in the treatment of presbycusis.

Science.gov (United States)

Salami, Angelo; Mora, Renzo; Dellepiane, Massimo; Manini, Giorgio; Santomauro, Valentina; Barettini, Luciano; Guastini, Luca

2010-10-01

These preliminary data are encouraging for a larger clinical trial to collect additional evidence on the effect of Q-TER(®) in preventing the development of hearing loss in subjects with presbycusis. The purpose of this study was to evaluate the efficiency and applicability of a water-soluble formulation of CoQ10 (Q-TER(®)) in subjects with presbycusis. A total of 60 patients with presbycusis were included and divided into three numerically equal groups. Group A underwent therapy with Q-TER(®), 160 mg, once a day for 30 days; group B underwent therapy with vitamin E (50 mg), once a day for 30 days; group C received placebo, once a day for 30 days. Before and at the end of the treatment, all patients underwent pure tone audiometry, transient evoked otoacoustic emissions, otoacoustic products of distortion, auditory brainstem response, and speech audiometry. Compared with group B, at the end of the treatment in group A the liminar tonal audiometry showed a significant improvement of the air and bone thresholds at the 1000 (14/20 vs 9/20), 2000 (14/20 vs 7/20), 4000 (15/20 vs 6/20), and 8000 Hz (13/20 vs 5/20). We found no significant differences in the other parameters and in group C.
Audiovisual Speech Synchrony Measure: Application to Biometrics

Directory of Open Access Journals (Sweden)

Gérard Chollet

2007-01-01

Full Text Available Speech is a means of communication which is intrinsically bimodal: the audio signal originates from the dynamics of the articulators. This paper reviews recent works in the field of audiovisual speech, and more specifically techniques developed to measure the level of correspondence between audio and visual speech. It overviews the most common audio and visual speech front-end processing, transformations performed on audio, visual, or joint audiovisual feature spaces, and the actual measure of correspondence between audio and visual speech. Finally, the use of synchrony measure for biometric identity verification based on talking faces is experimented on the BANCA database.
The motor theory of speech perception revisited.

Science.gov (United States)

Massaro, Dominic W; Chen, Trevor H

2008-04-01

Galantucci, Fowler, and Turvey (2006) have claimed that perceiving speech is perceiving gestures and that the motor system is recruited for perceiving speech. We make the counter argument that perceiving speech is not perceiving gestures, that the motor system is not recruitedfor perceiving speech, and that speech perception can be adequately described by a prototypical pattern recognition model, the fuzzy logical model of perception (FLMP). Empirical evidence taken as support for gesture and motor theory is reconsidered in more detail and in the framework of the FLMR Additional theoretical and logical arguments are made to challenge gesture and motor theory.
Commercial speech in crisis: Crisis Pregnancy Center regulations and definitions of commercial speech.

Science.gov (United States)

Gilbert, Kathryn E

2013-02-01

Recent attempts to regulate Crisis Pregnancy Centers, pseudoclinics that surreptitiously aim to dissuade pregnant women from choosing abortion, have confronted the thorny problem of how to define commercial speech. The Supreme Court has offered three potential answers to this definitional quandary. This Note uses the Crisis Pregnancy Center cases to demonstrate that courts should use one of these solutions, the factor-based approach of Bolger v. Youngs Drugs Products Corp., to define commercial speech in the Crisis Pregnancy Center cases and elsewhere. In principle and in application, the Bolger factor-based approach succeeds in structuring commercial speech analysis at the margins of the doctrine.
Hearing thresholds at high frequency in patients with cystic fibrosis: a systematic review

Directory of Open Access Journals (Sweden)

Debora T.M. Caumo

Full Text Available Abstract Introduction: High-frequency audiometry may contribute to the early detection of hearing loss caused by ototoxic medications. Many ototoxic drugs are widely used in the treatment of patients with cystic fibrosis. Early detection of hearing loss should allow known harmful drugs to be identified before the damage affects speech frequencies. The damage caused by ototoxicity is irreversible, resulting in important social and psychological consequences. In children, hearing loss, even when restricted to high frequencies, can affect the development of language. Objective: To investigate the efficacy and effectiveness of hearing monitoring through high-frequency audiometry in pediatric patients with cystic fibrosis. Methods: Electronic databases PubMed, MedLine, Web of Science and LILACS were searched, from January to November 2015. The selected studies included those in which high-frequency audiometry was performed in patients with cystic fibrosis, undergoing treatment with ototoxic drugs and published in Portuguese, English and Spanish. The GRADE system was chosen for the evaluation of the methodological quality of the articles. Results: During the search process carried out from January 2015 to November 2015, 512 publications were identified, of which 250 were found in PubMed, 118 in MedLine, 142 in Web of Science and 2 in LILACS. Of these, nine articles were selected. Conclusion: The incidence of hearing loss was identified at high frequencies in cystic fibrosis patients without hearing complaints. It is assumed that high-frequency audiometry can be an early diagnostic method to be recommended for hearing investigation of patients at risk of ototoxicity.
Neurophysiological influence of musical training on speech perception.

Science.gov (United States)

Shahin, Antoine J

2011-01-01

Does musical training affect our perception of speech? For example, does learning to play a musical instrument modify the neural circuitry for auditory processing in a way that improves one's ability to perceive speech more clearly in noisy environments? If so, can speech perception in individuals with hearing loss (HL), who struggle in noisy situations, benefit from musical training? While music and speech exhibit some specialization in neural processing, there is evidence suggesting that skills acquired through musical training for specific acoustical processes may transfer to, and thereby improve, speech perception. The neurophysiological mechanisms underlying the influence of musical training on speech processing and the extent of this influence remains a rich area to be explored. A prerequisite for such transfer is the facilitation of greater neurophysiological overlap between speech and music processing following musical training. This review first establishes a neurophysiological link between musical training and speech perception, and subsequently provides further hypotheses on the neurophysiological implications of musical training on speech perception in adverse acoustical environments and in individuals with HL.
Auditory Masking Effects on Speech Fluency in Apraxia of Speech and Aphasia: Comparison to Altered Auditory Feedback

Science.gov (United States)

Jacks, Adam; Haley, Katarina L.

2015-01-01

Purpose: To study the effects of masked auditory feedback (MAF) on speech fluency in adults with aphasia and/or apraxia of speech (APH/AOS). We hypothesized that adults with AOS would increase speech fluency when speaking with noise. Altered auditory feedback (AAF; i.e., delayed/frequency-shifted feedback) was included as a control condition not…
LIBERDADE DE EXPRESSÃO E DISCURSO DO ÓDIO NO BRASIL / FREE SPEECH AND HATE SPEECH IN BRAZIL

Directory of Open Access Journals (Sweden)

Nevita Maria Pessoa de Aquino Franca Luna

2014-12-01

Full Text Available The purpose of this article is to analyze the restriction of free speech when it comes close to hate speech. In this perspective, the aim of this study is to answer the question: what is the understanding adopted by the Brazilian Supreme Court in cases involving the conflict between free speech and hate speech? The methodology combines a bibliographic review on the theoretical assumptions of the research (concept of free speech and hate speech, and understanding of the rights of defense of traditionally discriminated minorities and empirical research (documental and jurisprudential analysis of judged cases of American Court, German Court and Brazilian Court. Firstly, free speech is discussed, defining its meaning, content and purpose. Then, the hate speech is pointed as an inhibitor element of free speech for offending members of traditionally discriminated minorities, who are outnumbered or in a situation of cultural, socioeconomic or political subordination. Subsequently, are discussed some aspects of American (negative freedom and German models (positive freedom, to demonstrate that different cultures adopt different legal solutions. At the end, it is concluded that there is an approximation of the Brazilian understanding with the German doctrine, from the analysis of landmark cases as the publisher Siegfried Ellwanger (2003 and the Samba School Unidos do Viradouro (2008. The Brazilian comprehension, a multicultural country made up of different ethnicities, leads to a new process of defending minorities who, despite of involving the collision of fundamental rights (dignity, equality and freedom, is still restrained by incompatible barriers of a contemporary pluralistic democracy.
Speech production in amplitude-modulated noise

DEFF Research Database (Denmark)

Macdonald, Ewen N; Raufer, Stefan

2013-01-01

The Lombard effect refers to the phenomenon where talkers automatically increase their level of speech in a noisy environment. While many studies have characterized how the Lombard effect influences different measures of speech production (e.g., F0, spectral tilt, etc.), few have investigated...... the consequences of temporally fluctuating noise. In the present study, 20 talkers produced speech in a variety of noise conditions, including both steady-state and amplitude-modulated white noise. While listening to noise over headphones, talkers produced randomly generated five word sentences. Similar...... of noisy environments and will alter their speech accordingly....
Free Speech Yearbook 1980.

Science.gov (United States)

Kane, Peter E., Ed.

The 11 articles in this collection deal with theoretical and practical freedom of speech issues. The topics covered are (1) the United States Supreme Court and communication theory; (2) truth, knowledge, and a democratic respect for diversity; (3) denial of freedom of speech in Jock Yablonski's campaign for the presidency of the United Mine…
Facial Speech Gestures: The Relation between Visual Speech Processing, Phonological Awareness, and Developmental Dyslexia in 10-Year-Olds

Science.gov (United States)

Schaadt, Gesa; Männel, Claudia; van der Meer, Elke; Pannekamp, Ann; Friederici, Angela D.

2016-01-01

Successful communication in everyday life crucially involves the processing of auditory and visual components of speech. Viewing our interlocutor and processing visual components of speech facilitates speech processing by triggering auditory processing. Auditory phoneme processing, analyzed by event-related brain potentials (ERP), has been shown…
Speech enhancement on smartphone voice recording

International Nuclear Information System (INIS)

Atmaja, Bagus Tris; Farid, Mifta Nur; Arifianto, Dhany

2016-01-01

Speech enhancement is challenging task in audio signal processing to enhance the quality of targeted speech signal while suppress other noises. In the beginning, the speech enhancement algorithm growth rapidly from spectral subtraction, Wiener filtering, spectral amplitude MMSE estimator to Non-negative Matrix Factorization (NMF). Smartphone as revolutionary device now is being used in all aspect of life including journalism; personally and professionally. Although many smartphones have two microphones (main and rear) the only main microphone is widely used for voice recording. This is why the NMF algorithm widely used for this purpose of speech enhancement. This paper evaluate speech enhancement on smartphone voice recording by using some algorithms mentioned previously. We also extend the NMF algorithm to Kulback-Leibler NMF with supervised separation. The last algorithm shows improved result compared to others by spectrogram and PESQ score evaluation. (paper)
Hearing speech in music.

Science.gov (United States)

Ekström, Seth-Reino; Borg, Erik

2011-01-01

The masking effect of a piano composition, played at different speeds and in different octaves, on speech-perception thresholds was investigated in 15 normal-hearing and 14 moderately-hearing-impaired subjects. Running speech (just follow conversation, JFC) testing and use of hearing aids increased the everyday validity of the findings. A comparison was made with standard audiometric noises [International Collegium of Rehabilitative Audiology (ICRA) noise and speech spectrum-filtered noise (SPN)]. All masking sounds, music or noise, were presented at the same equivalent sound level (50 dBA). The results showed a significant effect of piano performance speed and octave (Ptempo had the largest effect; and high octave and slow tempo, the smallest. Music had a lower masking effect than did ICRA noise with two or six speakers at normal vocal effort (Pmusic offers an interesting opportunity for studying masking under realistic conditions, where spectral and temporal features can be varied independently. The results have implications for composing music with vocal parts, designing acoustic environments and creating a balance between speech perception and privacy in social settings.

Speech networks at rest and in action: interactions between functional brain networks controlling speech production.

Science.gov (United States)

Simonyan, Kristina; Fuertinger, Stefan

2015-04-01

Speech production is one of the most complex human behaviors. Although brain activation during speaking has been well investigated, our understanding of interactions between the brain regions and neural networks remains scarce. We combined seed-based interregional correlation analysis with graph theoretical analysis of functional MRI data during the resting state and sentence production in healthy subjects to investigate the interface and topology of functional networks originating from the key brain regions controlling speech, i.e., the laryngeal/orofacial motor cortex, inferior frontal and superior temporal gyri, supplementary motor area, cingulate cortex, putamen, and thalamus. During both resting and speaking, the interactions between these networks were bilaterally distributed and centered on the sensorimotor brain regions. However, speech production preferentially recruited the inferior parietal lobule (IPL) and cerebellum into the large-scale network, suggesting the importance of these regions in facilitation of the transition from the resting state to speaking. Furthermore, the cerebellum (lobule VI) was the most prominent region showing functional influences on speech-network integration and segregation. Although networks were bilaterally distributed, interregional connectivity during speaking was stronger in the left vs. right hemisphere, which may have underlined a more homogeneous overlap between the examined networks in the left hemisphere. Among these, the laryngeal motor cortex (LMC) established a core network that fully overlapped with all other speech-related networks, determining the extent of network interactions. Our data demonstrate complex interactions of large-scale brain networks controlling speech production and point to the critical role of the LMC, IPL, and cerebellum in the formation of speech production network. Copyright © 2015 the American Physiological Society.
Segmental intelligibility of synthetic speech produced by rule.

Science.gov (United States)

Logan, J S; Greene, B G; Pisoni, D B

1989-08-01

This paper reports the results of an investigation that employed the modified rhyme test (MRT) to measure the segmental intelligibility of synthetic speech generated automatically by rule. Synthetic speech produced by ten text-to-speech systems was studied and compared to natural speech. A variation of the standard MRT was also used to study the effects of response set size on perceptual confusions. Results indicated that the segmental intelligibility scores formed a continuum. Several systems displayed very high levels of performance that were close to or equal to scores obtained with natural speech; other systems displayed substantially worse performance compared to natural speech. The overall performance of the best system, DECtalk--Paul, was equivalent to the data obtained with natural speech for consonants in syllable-initial position. The findings from this study are discussed in terms of the use of a set of standardized procedures for measuring intelligibility of synthetic speech under controlled laboratory conditions. Recent work investigating the perception of synthetic speech under more severe conditions in which greater demands are made on the listener's processing resources is also considered. The wide range of intelligibility scores obtained in the present study demonstrates important differences in perception and suggests that not all synthetic speech is perceptually equivalent to the listener.
Segmental intelligibility of synthetic speech produced by rule

Science.gov (United States)

Logan, John S.; Greene, Beth G.; Pisoni, David B.

2012-01-01

This paper reports the results of an investigation that employed the modified rhyme test (MRT) to measure the segmental intelligibility of synthetic speech generated automatically by rule. Synthetic speech produced by ten text-to-speech systems was studied and compared to natural speech. A variation of the standard MRT was also used to study the effects of response set size on perceptual confusions. Results indicated that the segmental intelligibility scores formed a continuum. Several systems displayed very high levels of performance that were close to or equal to scores obtained with natural speech; other systems displayed substantially worse performance compared to natural speech. The overall performance of the best system, DECtalk—Paul, was equivalent to the data obtained with natural speech for consonants in syllable-initial position. The findings from this study are discussed in terms of the use of a set of standardized procedures for measuring intelligibility of synthetic speech under controlled laboratory conditions. Recent work investigating the perception of synthetic speech under more severe conditions in which greater demands are made on the listener’s processing resources is also considered. The wide range of intelligibility scores obtained in the present study demonstrates important differences in perception and suggests that not all synthetic speech is perceptually equivalent to the listener. PMID:2527884
Empathy, Ways of Knowing, and Interdependence as Mediators of Gender Differences in Attitudes toward Hate Speech and Freedom of Speech

Science.gov (United States)

Cowan, Gloria; Khatchadourian, Desiree

2003-01-01

Women are more intolerant of hate speech than men. This study examined relationality measures as mediators of gender differences in the perception of the harm of hate speech and the importance of freedom of speech. Participants were 107 male and 123 female college students. Questionnaires assessed the perceived harm of hate speech, the importance…
Speech enhancement theory and practice

CERN Document Server

Loizou, Philipos C

2013-01-01

With the proliferation of mobile devices and hearing devices, including hearing aids and cochlear implants, there is a growing and pressing need to design algorithms that can improve speech intelligibility without sacrificing quality. Responding to this need, Speech Enhancement: Theory and Practice, Second Edition introduces readers to the basic problems of speech enhancement and the various algorithms proposed to solve these problems. Updated and expanded, this second edition of the bestselling textbook broadens its scope to include evaluation measures and enhancement algorithms aimed at impr
Recognizing emotional speech in Persian: a validated database of Persian emotional speech (Persian ESD).

Science.gov (United States)

Keshtiari, Niloofar; Kuhlmann, Michael; Eslami, Moharram; Klann-Delius, Gisela

2015-03-01

Research on emotional speech often requires valid stimuli for assessing perceived emotion through prosody and lexical content. To date, no comprehensive emotional speech database for Persian is officially available. The present article reports the process of designing, compiling, and evaluating a comprehensive emotional speech database for colloquial Persian. The database contains a set of 90 validated novel Persian sentences classified in five basic emotional categories (anger, disgust, fear, happiness, and sadness), as well as a neutral category. These sentences were validated in two experiments by a group of 1,126 native Persian speakers. The sentences were articulated by two native Persian speakers (one male, one female) in three conditions: (1) congruent (emotional lexical content articulated in a congruent emotional voice), (2) incongruent (neutral sentences articulated in an emotional voice), and (3) baseline (all emotional and neutral sentences articulated in neutral voice). The speech materials comprise about 470 sentences. The validity of the database was evaluated by a group of 34 native speakers in a perception test. Utterances recognized better than five times chance performance (71.4 %) were regarded as valid portrayals of the target emotions. Acoustic analysis of the valid emotional utterances revealed differences in pitch, intensity, and duration, attributes that may help listeners to correctly classify the intended emotion. The database is designed to be used as a reliable material source (for both text and speech) in future cross-cultural or cross-linguistic studies of emotional speech, and it is available for academic research purposes free of charge. To access the database, please contact the first author.
Imitation and speech: commonalities within Broca's area.

Science.gov (United States)

Kühn, Simone; Brass, Marcel; Gallinat, Jürgen

2013-11-01

The so-called embodiment of communication has attracted considerable interest. Recently a growing number of studies have proposed a link between Broca's area's involvement in action processing and its involvement in speech. The present quantitative meta-analysis set out to test whether neuroimaging studies on imitation and overt speech show overlap within inferior frontal gyrus. By means of activation likelihood estimation (ALE), we investigated concurrence of brain regions activated by object-free hand imitation studies as well as overt speech studies including simple syllable and more complex word production. We found direct overlap between imitation and speech in bilateral pars opercularis (BA 44) within Broca's area. Subtraction analyses revealed no unique localization neither for speech nor for imitation. To verify the potential of ALE subtraction analysis to detect unique involvement within Broca's area, we contrasted the results of a meta-analysis on motor inhibition and imitation and found separable regions involved for imitation. This is the first meta-analysis to compare the neural correlates of imitation and overt speech. The results are in line with the proposed evolutionary roots of speech in imitation.
Design and realisation of an audiovisual speech activity detector

NARCIS (Netherlands)

Van Bree, K.C.

2006-01-01

For many speech telecommunication technologies a robust speech activity detector is important. An audio-only speech detector will givefalse positives when the interfering signal is speech or has speech characteristics. The modality video is suitable to solve this problem. In this report the approach
Utility of TMS to understand the neurobiology of speech

Directory of Open Access Journals (Sweden)

Takenobu eMurakami

2013-07-01

Full Text Available According to a traditional view, speech perception and production are processed largely separately in sensory and motor brain areas. Recent psycholinguistic and neuroimaging studies provide novel evidence that the sensory and motor systems dynamically interact in speech processing, by demonstrating that speech perception and imitation share regional brain activations. However, the exact nature and mechanisms of these sensorimotor interactions are not completely understood yet.Transcranial magnetic stimulation (TMS has often been used in the cognitive neurosciences, including speech research, as a complementary technique to behavioral and neuroimaging studies. Here we provide an up-to-date review focusing on TMS studies that explored speech perception and imitation.Single-pulse TMS of the primary motor cortex (M1 demonstrated a speech specific and somatotopically specific increase of excitability of the M1 lip area during speech perception (listening to speech or lip reading. A paired-coil TMS approach showed increases in effective connectivity from brain regions that are involved in speech processing to the M1 lip area when listening to speech. TMS in virtual lesion mode applied to speech processing areas modulated performance of phonological recognition and imitation of perceived speech.In summary, TMS is an innovative tool to investigate processing of speech perception and imitation. TMS studies have provided strong evidence that the sensory system is critically involved in mapping sensory input onto motor output and that the motor system plays an important role in speech perception.
LinguaTag: an Emotional Speech Analysis Application

OpenAIRE

Cullen, Charlie; Vaughan, Brian; Kousidis, Spyros

2008-01-01

The analysis of speech, particularly for emotional content, is an open area of current research. Ongoing work has developed an emotional speech corpus for analysis, and defined a vowel stress method by which this analysis may be performed. This paper documents the development of LinguaTag, an open source speech analysis software application which implements this vowel stress emotional speech analysis method developed as part of research into the acoustic and linguistic correlates of emotional...
Correlational Analysis of Speech Intelligibility Tests and Metrics for Speech Transmission

Science.gov (United States)

2017-12-04

sounds, are more prone to masking than the high-energy, wide-spectrum vowels. Such contaminated speech is still audible but not clear. Thus, speech...Science; 2012 June 12–14; Kuala Lumpur ( Malaysia ): New York (NY): IEEE; c2012. p. 676–682. Approved for public release; distribution is unlimited. 47...ARRABITO 1 UNIV OF COLORADO (PDF) K AREHART 1 NASA (PDF) J ALLEN 1 FOOD AND DRUG ADM-DEPT (PDF) OF HEALTH AND HUMAN SERVICES
Impairments of speech fluency in Lewy body spectrum disorder.

Science.gov (United States)

Ash, Sharon; McMillan, Corey; Gross, Rachel G; Cook, Philip; Gunawardena, Delani; Morgan, Brianna; Boller, Ashley; Siderowf, Andrew; Grossman, Murray

2012-03-01

Few studies have examined connected speech in demented and non-demented patients with Parkinson's disease (PD). We assessed the speech production of 35 patients with Lewy body spectrum disorder (LBSD), including non-demented PD patients, patients with PD dementia (PDD), and patients with dementia with Lewy bodies (DLB), in a semi-structured narrative speech sample in order to characterize impairments of speech fluency and to determine the factors contributing to reduced speech fluency in these patients. Both demented and non-demented PD patients exhibited reduced speech fluency, characterized by reduced overall speech rate and long pauses between sentences. Reduced speech rate in LBSD correlated with measures of between-utterance pauses, executive functioning, and grammatical comprehension. Regression analyses related non-fluent speech, grammatical difficulty, and executive difficulty to atrophy in frontal brain regions. These findings indicate that multiple factors contribute to slowed speech in LBSD, and this is mediated in part by disease in frontal brain regions. Copyright Â© 2011 Elsevier Inc. All rights reserved.
Cognitive functions in Childhood Apraxia of Speech

NARCIS (Netherlands)

Nijland, L.; Terband, H.; Maassen, B.

2015-01-01

Purpose: Childhood Apraxia of Speech (CAS) is diagnosed on the basis of specific speech characteristics, in the absence of problems in hearing, intelligence, and language comprehension. This does not preclude the possibility that children with this speech disorder might demonstrate additional
Subjective Quality Measurement of Speech Its Evaluation, Estimation and Applications

CERN Document Server

Kondo, Kazuhiro

2012-01-01

It is becoming crucial to accurately estimate and monitor speech quality in various ambient environments to guarantee high quality speech communication. This practical hands-on book shows speech intelligibility measurement methods so that the readers can start measuring or estimating speech intelligibility of their own system. The book also introduces subjective and objective speech quality measures, and describes in detail speech intelligibility measurement methods. It introduces a diagnostic rhyme test which uses rhyming word-pairs, and includes: An investigation into the effect of word familiarity on speech intelligibility. Speech intelligibility measurement of localized speech in virtual 3-D acoustic space using the rhyme test. Estimation of speech intelligibility using objective measures, including the ITU standard PESQ measures, and automatic speech recognizers.
Comparison of two speech privacy measurements, articulation index (AI) and speech privacy noise isolation class (NIC'), in open workplaces

Science.gov (United States)

Yoon, Heakyung C.; Loftness, Vivian

2002-05-01

Lack of speech privacy has been reported to be the main dissatisfaction among occupants in open workplaces, according to workplace surveys. Two speech privacy measurements, Articulation Index (AI), standardized by the American National Standards Institute in 1969, and Speech Privacy Noise Isolation Class (NIC', Noise Isolation Class Prime), adapted from Noise Isolation Class (NIC) by U. S. General Services Administration (GSA) in 1979, have been claimed as objective tools to measure speech privacy in open offices. To evaluate which of them, normal privacy for AI or satisfied privacy for NIC', is a better tool in terms of speech privacy in a dynamic open office environment, measurements were taken in the field. AIs and NIC's in the different partition heights and workplace configurations have been measured following ASTM E1130 (Standard Test Method for Objective Measurement of Speech Privacy in Open Offices Using Articulation Index) and GSA test PBS-C.1 (Method for the Direct Measurement of Speech-Privacy Potential (SPP) Based on Subjective Judgments) and PBS-C.2 (Public Building Service Standard Method of Test Method for the Sufficient Verification of Speech-Privacy Potential (SPP) Based on Objective Measurements Including Methods for the Rating of Functional Interzone Attenuation and NC-Background), respectively.
SPEECH VISUALIZATION SISTEM AS A BASIS FOR SPEECH TRAINING AND COMMUNICATION AIDS

Directory of Open Access Journals (Sweden)

Oliana KRSTEVA

1997-09-01

Full Text Available One receives much more information through a visual sense than through a tactile one. However, most visual aids for hearing-impaired persons are not wearable because it is difficult to make them compact and it is not a best way to mask always their vision.Generally it is difficult to get the integrated patterns by a single mathematical transform of signals, such as a Foruier transform. In order to obtain the integrated pattern speech parameters should be carefully extracted by an analysis according as each parameter, and a visual pattern, which can intuitively be understood by anyone, must be synthesized from them. Successful integration of speech parameters will never disturb understanding of individual features, so that the system can be used for speech training and communication.
SUSTAINABILITY IN THE BOWELS OF SPEECHES

Directory of Open Access Journals (Sweden)

Jadir Mauro Galvao

2012-10-01

Full Text Available The theme of sustainability has not yet achieved the feat of make up as an integral part the theoretical medley that brings out our most everyday actions, often visits some of our thoughts and permeates many of our speeches. The big event of 2012, the meeting gathered Rio +20 glances from all corners of the planet around that theme as burning, but we still see forward timidly. Although we have no very clear what the term sustainability closes it does not sound quite strange. Associate with things like ecology, planet, wastes emitted by smokestacks of factories, deforestation, recycling and global warming must be related, but our goal in this article is the least of clarifying the term conceptually and more try to observe as it appears in speeches of such conference. When the competent authorities talk about sustainability relate to what? We intend to investigate the lines and between the lines of these speeches, any assumptions associated with the term. Therefore we will analyze the speech of the People´s Summit, the opening speech of President Dilma and emblematic speech of the President of Uruguay, José Pepe Mujica.
Modeling speech intelligibility in adverse conditions

DEFF Research Database (Denmark)

Dau, Torsten

2012-01-01

) in conditions with nonlinearly processed speech. Instead of considering the reduction of the temporal modulation energy as the intelligibility metric, as assumed in the STI, the sEPSM applies the signal-to-noise ratio in the envelope domain (SNRenv). This metric was shown to be the key for predicting...... understanding speech when more than one person is talking, even when reduced audibility has been fully compensated for by a hearing aid. The reasons for these difficulties are not well understood. This presentation highlights recent concepts of the monaural and binaural signal processing strategies employed...... by the normal as well as impaired auditory system. Jørgensen and Dau [(2011). J. Acoust. Soc. Am. 130, 1475-1487] proposed the speech-based envelope power spectrum model (sEPSM) in an attempt to overcome the limitations of the classical speech transmission index (STI) and speech intelligibility index (SII...
Parent-child interaction in motor speech therapy.

Science.gov (United States)

Namasivayam, Aravind Kumar; Jethava, Vibhuti; Pukonen, Margit; Huynh, Anna; Goshulak, Debra; Kroll, Robert; van Lieshout, Pascal

2018-01-01

This study measures the reliability and sensitivity of a modified Parent-Child Interaction Observation scale (PCIOs) used to monitor the quality of parent-child interaction. The scale is part of a home-training program employed with direct motor speech intervention for children with speech sound disorders. Eighty-four preschool age children with speech sound disorders were provided either high- (2×/week/10 weeks) or low-intensity (1×/week/10 weeks) motor speech intervention. Clinicians completed the PCIOs at the beginning, middle, and end of treatment. Inter-rater reliability (Kappa scores) was determined by an independent speech-language pathologist who assessed videotaped sessions at the midpoint of the treatment block. Intervention sensitivity of the scale was evaluated using a Friedman test for each item and then followed up with Wilcoxon pairwise comparisons where appropriate. We obtained fair-to-good inter-rater reliability (Kappa = 0.33-0.64) for the PCIOs using only video-based scoring. Child-related items were more strongly influenced by differences in treatment intensity than parent-related items, where a greater number of sessions positively influenced parent learning of treatment skills and child behaviors. The adapted PCIOs is reliable and sensitive to monitor the quality of parent-child interactions in a 10-week block of motor speech intervention with adjunct home therapy. Implications for rehabilitation Parent-centered therapy is considered a cost effective method of speech and language service delivery. However, parent-centered models may be difficult to implement for treatments such as developmental motor speech interventions that require a high degree of skill and training. For children with speech sound disorders and motor speech difficulties, a translated and adapted version of the parent-child observation scale was found to be sufficiently reliable and sensitive to assess changes in the quality of the parent-child interactions during
Speech-enabled Computer-aided Translation

DEFF Research Database (Denmark)

Mesa-Lao, Bartolomé

2014-01-01

The present study has surveyed post-editor trainees’ views and attitudes before and after the introduction of speech technology as a front end to a computer-aided translation workbench. The aim of the survey was (i) to identify attitudes and perceptions among post-editor trainees before performing...... a post-editing task using automatic speech recognition (ASR); and (ii) to assess the degree to which post-editors’ attitudes and expectations to the use of speech technology changed after actually using it. The survey was based on two questionnaires: the first one administered before the participants...

Comment on "Monkey vocal tracts are speech-ready".

Science.gov (United States)

Lieberman, Philip

2017-07-01

Monkey vocal tracts are capable of producing monkey speech, not the full range of articulate human speech. The evolution of human speech entailed both anatomy and brains. Fitch, de Boer, Mathur, and Ghazanfar in Science Advances claim that "monkey vocal tracts are speech-ready," and conclude that "…the evolution of human speech capabilities required neural change rather than modifications of vocal anatomy." Neither premise is consistent either with the data presented and the conclusions reached by de Boer and Fitch themselves in their own published papers on the role of anatomy in the evolution of human speech or with the body of independent studies published since the 1950s.
Ultra low bit-rate speech coding

CERN Document Server

Ramasubramanian, V

2015-01-01

"Ultra Low Bit-Rate Speech Coding" focuses on the specialized topic of speech coding at very low bit-rates of 1 Kbits/sec and less, particularly at the lower ends of this range, down to 100 bps. The authors set forth the fundamental results and trends that form the basis for such ultra low bit-rates to be viable and provide a comprehensive overview of various techniques and systems in literature to date, with particular attention to their work in the paradigm of unit-selection based segment quantization. The book is for research students, academic faculty and researchers, and industry practitioners in the areas of speech processing and speech coding.
The Effectiveness of Clear Speech as a Masker

Science.gov (United States)

Calandruccio, Lauren; Van Engen, Kristin; Dhar, Sumitrajit; Bradlow, Ann R.

2010-01-01

Purpose: It is established that speaking clearly is an effective means of enhancing intelligibility. Because any signal-processing scheme modeled after known acoustic-phonetic features of clear speech will likely affect both target and competing speech, it is important to understand how speech recognition is affected when a competing speech signal…
Speech Motor Development in Childhood Apraxia of Speech : Generating Testable Hypotheses by Neurocomputational Modeling

NARCIS (Netherlands)

Terband, H.; Maassen, B.

2010-01-01

Childhood apraxia of speech (CAS) is a highly controversial clinical entity, with respect to both clinical signs and underlying neuromotor deficit. In the current paper, we advocate a modeling approach in which a computational neural model of speech acquisition and production is utilized in order to
Speech motor development in childhood apraxia of speech: generating testable hypotheses by neurocomputational modeling.

NARCIS (Netherlands)

Terband, H.R.; Maassen, B.A.M.

2010-01-01

Childhood apraxia of speech (CAS) is a highly controversial clinical entity, with respect to both clinical signs and underlying neuromotor deficit. In the current paper, we advocate a modeling approach in which a computational neural model of speech acquisition and production is utilized in order to
Between-Word Simplification Patterns in the Continuous Speech of Children with Speech Sound Disorders

Science.gov (United States)

Klein, Harriet B.; Liu-Shea, May

2009-01-01

Purpose: This study was designed to identify and describe between-word simplification patterns in the continuous speech of children with speech sound disorders. It was hypothesized that word combinations would reveal phonological changes that were unobserved with single words, possibly accounting for discrepancies between the intelligibility of…
Effects of Synthetic Speech Output on Requesting and Natural Speech Production in Children with Autism: A Preliminary Study

Science.gov (United States)

Schlosser, Ralf W.; Sigafoos, Jeff; Luiselli, James K.; Angermeier, Katie; Harasymowyz, Ulana; Schooley, Katherine; Belfiore, Phil J.

2007-01-01

Requesting is often taught as an initial target during augmentative and alternative communication intervention in children with autism. Speech-generating devices are purported to have advantages over non-electronic systems due to their synthetic speech output. On the other hand, it has been argued that speech output, being in the auditory…
Speech auditory brainstem response (speech ABR) characteristics depending on recording conditions, and hearing status: an experimental parametric study.

Science.gov (United States)

Akhoun, Idrick; Moulin, Annie; Jeanvoine, Arnaud; Ménard, Mikael; Buret, François; Vollaire, Christian; Scorretti, Riccardo; Veuillet, Evelyne; Berger-Vachon, Christian; Collet, Lionel; Thai-Van, Hung

2008-11-15

Speech elicited auditory brainstem responses (Speech ABR) have been shown to be an objective measurement of speech processing in the brainstem. Given the simultaneous stimulation and recording, and the similarities between the recording and the speech stimulus envelope, there is a great risk of artefactual recordings. This study sought to systematically investigate the source of artefactual contamination in Speech ABR response. In a first part, we measured the sound level thresholds over which artefactual responses were obtained, for different types of transducers and experimental setup parameters. A watermelon model was used to model the human head susceptibility to electromagnetic artefact. It was found that impedances between the electrodes had a great effect on electromagnetic susceptibility and that the most prominent artefact is due to the transducer's electromagnetic leakage. The only artefact-free condition was obtained with insert-earphones shielded in a Faraday cage linked to common ground. In a second part of the study, using the previously defined artefact-free condition, we recorded speech ABR in unilateral deaf subjects and bilateral normal hearing subjects. In an additional control condition, Speech ABR was recorded with the insert-earphones used to deliver the stimulation, unplugged from the ears, so that the subjects did not perceive the stimulus. No responses were obtained from the deaf ear of unilaterally hearing impaired subjects, nor in the insert-out-of-the-ear condition in all the subjects, showing that Speech ABR reflects the functioning of the auditory pathways.
The selective role of premotor cortex in speech perception: a contribution to phoneme judgements but not speech comprehension.

Science.gov (United States)

Krieger-Redwood, Katya; Gaskell, M Gareth; Lindsay, Shane; Jefferies, Elizabeth

2013-12-01

Several accounts of speech perception propose that the areas involved in producing language are also involved in perceiving it. In line with this view, neuroimaging studies show activation of premotor cortex (PMC) during phoneme judgment tasks; however, there is debate about whether speech perception necessarily involves motor processes, across all task contexts, or whether the contribution of PMC is restricted to tasks requiring explicit phoneme awareness. Some aspects of speech processing, such as mapping sounds onto meaning, may proceed without the involvement of motor speech areas if PMC specifically contributes to the manipulation and categorical perception of phonemes. We applied TMS to three sites-PMC, posterior superior temporal gyrus, and occipital pole-and for the first time within the TMS literature, directly contrasted two speech perception tasks that required explicit phoneme decisions and mapping of speech sounds onto semantic categories, respectively. TMS to PMC disrupted explicit phonological judgments but not access to meaning for the same speech stimuli. TMS to two further sites confirmed that this pattern was site specific and did not reflect a generic difference in the susceptibility of our experimental tasks to TMS: stimulation of pSTG, a site involved in auditory processing, disrupted performance in both language tasks, whereas stimulation of occipital pole had no effect on performance in either task. These findings demonstrate that, although PMC is important for explicit phonological judgments, crucially, PMC is not necessary for mapping speech onto meanings.
Acquirement and enhancement of remote speech signals

Science.gov (United States)

Lü, Tao; Guo, Jin; Zhang, He-yong; Yan, Chun-hui; Wang, Can-jin

2017-07-01

To address the challenges of non-cooperative and remote acoustic detection, an all-fiber laser Doppler vibrometer (LDV) is established. The all-fiber LDV system can offer the advantages of smaller size, lightweight design and robust structure, hence it is a better fit for remote speech detection. In order to improve the performance and the efficiency of LDV for long-range hearing, the speech enhancement technology based on optimally modified log-spectral amplitude (OM-LSA) algorithm is used. The experimental results show that the comprehensible speech signals within the range of 150 m can be obtained by the proposed LDV. The signal-to-noise ratio ( SNR) and mean opinion score ( MOS) of the LDV speech signal can be increased by 100% and 27%, respectively, by using the speech enhancement technology. This all-fiber LDV, which combines the speech enhancement technology, can meet the practical demand in engineering.
Mobile speech and advanced natural language solutions

CERN Document Server

Markowitz, Judith

2013-01-01

Mobile Speech and Advanced Natural Language Solutions provides a comprehensive and forward-looking treatment of natural speech in the mobile environment. This fourteen-chapter anthology brings together lead scientists from Apple, Google, IBM, AT&T, Yahoo! Research and other companies, along with academicians, technology developers and market analysts. They analyze the growing markets for mobile speech, new methodological approaches to the study of natural language, empirical research findings on natural language and mobility, and future trends in mobile speech. Mobile Speech opens with a challenge to the industry to broaden the discussion about speech in mobile environments beyond the smartphone, to consider natural language applications across different domains. Among the new natural language methods introduced in this book are Sequence Package Analysis, which locates and extracts valuable opinion-related data buried in online postings; microintonation as a way to make TTS truly human-like; and se...
Monkey Lipsmacking Develops Like the Human Speech Rhythm

Science.gov (United States)

Morrill, Ryan J.; Paukner, Annika; Ferrari, Pier F.; Ghazanfar, Asif A.

2012-01-01

Across all languages studied to date, audiovisual speech exhibits a consistent rhythmic structure. This rhythm is critical to speech perception. Some have suggested that the speech rhythm evolved "de novo" in humans. An alternative account--the one we explored here--is that the rhythm of speech evolved through the modification of rhythmic facial…
Understanding the Linguistic Characteristics of the Great Speeches

OpenAIRE

Mouritzen, Kristian

2016-01-01

This dissertation attempts to find the common traits of great speeches. It does so by closely examining the language of some of the most well-known speeches in world. These speeches are presented in the book Speeches that Changed the World (2006) by Simon Sebag Montefiore. The dissertation specifically looks at four variables: The beginnings and endings of the speeches, the use of passive voice, the use of personal pronouns and the difficulty of the language. These four variables are based on...
Speech spectrum envelope modeling

Czech Academy of Sciences Publication Activity Database

Vích, Robert; Vondra, Martin

Vol. 4775, - (2007), s. 129-137 ISSN 0302-9743. [COST Action 2102 International Workshop. Vietri sul Mare, 29.03.2007-31.03.2007] R&D Projects: GA AV ČR(CZ) 1ET301710509 Institutional research plan: CEZ:AV0Z20670512 Keywords : speech * speech processing * cepstral analysis Subject RIV: JA - Electronics ; Optoelectronics, Electrical Engineering Impact factor: 0.302, year: 2005
Speech emotion recognition methods: A literature review

Science.gov (United States)

Basharirad, Babak; Moradhaseli, Mohammadreza

2017-10-01

Recently, attention of the emotional speech signals research has been boosted in human machine interfaces due to availability of high computation capability. There are many systems proposed in the literature to identify the emotional state through speech. Selection of suitable feature sets, design of a proper classifications methods and prepare an appropriate dataset are the main key issues of speech emotion recognition systems. This paper critically analyzed the current available approaches of speech emotion recognition methods based on the three evaluating parameters (feature set, classification of features, accurately usage). In addition, this paper also evaluates the performance and limitations of available methods. Furthermore, it highlights the current promising direction for improvement of speech emotion recognition systems.
Speech neglect: A strange educational blind spot

Science.gov (United States)

Harris, Katherine Safford

2005-09-01

Speaking is universally acknowledged as an important human talent, yet as a topic of educated common knowledge, it is peculiarly neglected. Partly, this is a consequence of the relatively recent growth of research on speech perception, production, and development, but also a function of the way that information is sliced up by undergraduate colleges. Although the basic acoustic mechanism of vowel production was known to Helmholtz, the ability to view speech production as a physiological event is evolving even now with such techniques as fMRI. Intensive research on speech perception emerged only in the early 1930s as Fletcher and the engineers at Bell Telephone Laboratories developed the transmission of speech over telephone lines. The study of speech development was revolutionized by the papers of Eimas and his colleagues on speech perception in infants in the 1970s. Dissemination of knowledge in these fields is the responsibility of no single academic discipline. It forms a center for two departments, Linguistics, and Speech and Hearing, but in the former, there is a heavy emphasis on other aspects of language than speech and, in the latter, a focus on clinical practice. For psychologists, it is a rather minor component of a very diverse assembly of topics. I will focus on these three fields in proposing possible remedies.
Automatic Speech Recognition from Neural Signals: A Focused Review

Directory of Open Access Journals (Sweden)

Christian Herff

2016-09-01

Full Text Available Speech interfaces have become widely accepted and are nowadays integrated in various real-life applications and devices. They have become a part of our daily life. However, speech interfaces presume the ability to produce intelligible speech, which might be impossible due to either loud environments, bothering bystanders or incapabilities to produce speech (i.e.~patients suffering from locked-in syndrome. For these reasons it would be highly desirable to not speak but to simply envision oneself to say words or sentences. Interfaces based on imagined speech would enable fast and natural communication without the need for audible speech and would give a voice to otherwise mute people.This focused review analyzes the potential of different brain imaging techniques to recognize speech from neural signals by applying Automatic Speech Recognition technology. We argue that modalities based on metabolic processes, such as functional Near Infrared Spectroscopy and functional Magnetic Resonance Imaging, are less suited for Automatic Speech Recognition from neural signals due to low temporal resolution but are very useful for the investigation of the underlying neural mechanisms involved in speech processes. In contrast, electrophysiologic activity is fast enough to capture speech processes and is therefor better suited for ASR. Our experimental results indicate the potential of these signals for speech recognition from neural data with a focus on invasively measured brain activity (electrocorticography. As a first example of Automatic Speech Recognition techniques used from neural signals, we discuss the emph{Brain-to-text} system.
An evaluation of speech production in two boys with neurodevelopmental disorders who received communication intervention with a speech-generating device.

Science.gov (United States)

Roche, Laura; Sigafoos, Jeff; Lancioni, Giulio E; O'Reilly, Mark F; Schlosser, Ralf W; Stevens, Michelle; van der Meer, Larah; Achmadi, Donna; Kagohara, Debora; James, Ruth; Carnett, Amarie; Hodis, Flaviu; Green, Vanessa A; Sutherland, Dean; Lang, Russell; Rispoli, Mandy; Machalicek, Wendy; Marschik, Peter B

2014-11-01

Children with neurodevelopmental disorders often present with little or no speech. Augmentative and alternative communication (AAC) aims to promote functional communication using non-speech modes, but it might also influence natural speech production. To investigate this possibility, we provided AAC intervention to two boys with neurodevelopmental disorders and severe communication impairment. Intervention focused on teaching the boys to use a tablet computer-based speech-generating device (SGD) to request preferred stimuli. During SGD intervention, both boys began to utter relevant single words. In an effort to induce more speech, and investigate the relation between SGD availability and natural speech production, the SGD was removed during some requesting opportunities. With intervention, both participants learned to use the SGD to request preferred stimuli. After learning to use the SGD, both participants began to respond more frequently with natural speech when the SGD was removed. The results suggest that a rehabilitation program involving initial SGD intervention, followed by subsequent withdrawal of the SGD, might increase the frequency of natural speech production in some children with neurodevelopmental disorders. This effect could be an example of response generalization. Copyright © 2014 ISDN. Published by Elsevier Ltd. All rights reserved.
Multistage audiovisual integration of speech: dissociating identification and detection.

Science.gov (United States)

Eskelund, Kasper; Tuomainen, Jyrki; Andersen, Tobias S

2011-02-01

Speech perception integrates auditory and visual information. This is evidenced by the McGurk illusion where seeing the talking face influences the auditory phonetic percept and by the audiovisual detection advantage where seeing the talking face influences the detectability of the acoustic speech signal. Here, we show that identification of phonetic content and detection can be dissociated as speech-specific and non-specific audiovisual integration effects. To this end, we employed synthetically modified stimuli, sine wave speech (SWS), which is an impoverished speech signal that only observers informed of its speech-like nature recognize as speech. While the McGurk illusion only occurred for informed observers, the audiovisual detection advantage occurred for naïve observers as well. This finding supports a multistage account of audiovisual integration of speech in which the many attributes of the audiovisual speech signal are integrated by separate integration processes.
Developmental profile of speech-language and communicative functions in an individual with the preserved speech variant of Rett syndrome.

Science.gov (United States)

Marschik, Peter B; Vollmann, Ralf; Bartl-Pokorny, Katrin D; Green, Vanessa A; van der Meer, Larah; Wolin, Thomas; Einspieler, Christa

2014-08-01

We assessed various aspects of speech-language and communicative functions of an individual with the preserved speech variant of Rett syndrome (RTT) to describe her developmental profile over a period of 11 years. For this study, we incorporated the following data resources and methods to assess speech-language and communicative functions during pre-, peri- and post-regressional development: retrospective video analyses, medical history data, parental checklists and diaries, standardized tests on vocabulary and grammar, spontaneous speech samples and picture stories to elicit narrative competences. Despite achieving speech-language milestones, atypical behaviours were present at all times. We observed a unique developmental speech-language trajectory (including the RTT typical regression) affecting all linguistic and socio-communicative sub-domains in the receptive as well as the expressive modality. Future research should take into consideration a potentially considerable discordance between formal and functional language use by interpreting communicative acts on a more cautionary note.

A Case of Generalized Auditory Agnosia with Unilateral Subcortical Brain Lesion

Science.gov (United States)

Suh, Hyee; Kim, Soo Yeon; Kim, Sook Hee; Chang, Jae Hyeok; Shin, Yong Beom; Ko, Hyun-Yoon

2012-01-01

The mechanisms and functional anatomy underlying the early stages of speech perception are still not well understood. Auditory agnosia is a deficit of auditory object processing defined as a disability to recognize spoken languages and/or nonverbal environmental sounds and music despite adequate hearing while spontaneous speech, reading and writing are preserved. Usually, either the bilateral or unilateral temporal lobe, especially the transverse gyral lesions, are responsible for auditory agnosia. Subcortical lesions without cortical damage rarely causes auditory agnosia. We present a 73-year-old right-handed male with generalized auditory agnosia caused by a unilateral subcortical lesion. He was not able to repeat or dictate but to perform fluent and comprehensible speech. He could understand and read written words and phrases. His auditory brainstem evoked potential and audiometry were intact. This case suggested that the subcortical lesion involving unilateral acoustic radiation could cause generalized auditory agnosia. PMID:23342322
Separating Underdetermined Convolutive Speech Mixtures

DEFF Research Database (Denmark)

Pedersen, Michael Syskind; Wang, DeLiang; Larsen, Jan

2006-01-01

a method for underdetermined blind source separation of convolutive mixtures. The proposed framework is applicable for separation of instantaneous as well as convolutive speech mixtures. It is possible to iteratively extract each speech signal from the mixture by combining blind source separation...
[Prosody, speech input and language acquisition].

Science.gov (United States)

Jungheim, M; Miller, S; Kühn, D; Ptok, M

2014-04-01

In order to acquire language, children require speech input. The prosody of the speech input plays an important role. In most cultures adults modify their code when communicating with children. Compared to normal speech this code differs especially with regard to prosody. For this review a selective literature search in PubMed and Scopus was performed. Prosodic characteristics are a key feature of spoken language. By analysing prosodic features, children gain knowledge about underlying grammatical structures. Child-directed speech (CDS) is modified in a way that meaningful sequences are highlighted acoustically so that important information can be extracted from the continuous speech flow more easily. CDS is said to enhance the representation of linguistic signs. Taking into consideration what has previously been described in the literature regarding the perception of suprasegmentals, CDS seems to be able to support language acquisition due to the correspondence of prosodic and syntactic units. However, no findings have been reported, stating that the linguistically reduced CDS could hinder first language acquisition.
Speech-in-speech perception and executive function involvement.

Directory of Open Access Journals (Sweden)

Marcela Perrone-Bertolotti

Full Text Available This present study investigated the link between speech-in-speech perception capacities and four executive function components: response suppression, inhibitory control, switching and working memory. We constructed a cross-modal semantic priming paradigm using a written target word and a spoken prime word, implemented in one of two concurrent auditory sentences (cocktail party situation. The prime and target were semantically related or unrelated. Participants had to perform a lexical decision task on visual target words and simultaneously listen to only one of two pronounced sentences. The attention of the participant was manipulated: The prime was in the pronounced sentence listened to by the participant or in the ignored one. In addition, we evaluate the executive function abilities of participants (switching cost, inhibitory-control cost and response-suppression cost and their working memory span. Correlation analyses were performed between the executive and priming measurements. Our results showed a significant interaction effect between attention and semantic priming. We observed a significant priming effect in the attended but not in the ignored condition. Only priming effects obtained in the ignored condition were significantly correlated with some of the executive measurements. However, no correlation between priming effects and working memory capacity was found. Overall, these results confirm, first, the role of attention for semantic priming effect and, second, the implication of executive functions in speech-in-noise understanding capacities.
Infants' brain responses to speech suggest analysis by synthesis.

Science.gov (United States)

Kuhl, Patricia K; Ramírez, Rey R; Bosseler, Alexis; Lin, Jo-Fu Lotus; Imada, Toshiaki

2014-08-05

Historic theories of speech perception (Motor Theory and Analysis by Synthesis) invoked listeners' knowledge of speech production to explain speech perception. Neuroimaging data show that adult listeners activate motor brain areas during speech perception. In two experiments using magnetoencephalography (MEG), we investigated motor brain activation, as well as auditory brain activation, during discrimination of native and nonnative syllables in infants at two ages that straddle the developmental transition from language-universal to language-specific speech perception. Adults are also tested in Exp. 1. MEG data revealed that 7-mo-old infants activate auditory (superior temporal) as well as motor brain areas (Broca's area, cerebellum) in response to speech, and equivalently for native and nonnative syllables. However, in 11- and 12-mo-old infants, native speech activates auditory brain areas to a greater degree than nonnative, whereas nonnative speech activates motor brain areas to a greater degree than native speech. This double dissociation in 11- to 12-mo-old infants matches the pattern of results obtained in adult listeners. Our infant data are consistent with Analysis by Synthesis: auditory analysis of speech is coupled with synthesis of the motor plans necessary to produce the speech signal. The findings have implications for: (i) perception-action theories of speech perception, (ii) the impact of "motherese" on early language learning, and (iii) the "social-gating" hypothesis and humans' development of social understanding.
The interpersonal level in English: reported speech

NARCIS (Netherlands)

Keizer, E.

2009-01-01

The aim of this article is to describe and classify a number of different forms of English reported speech (or thought), and subsequently to analyze and represent them within the theory of FDG. First, the most prototypical forms of reported speech are discussed (direct and indirect speech);
Cognitive Functions in Childhood Apraxia of Speech

Science.gov (United States)

Nijland, Lian; Terband, Hayo; Maassen, Ben

2015-01-01

Purpose: Childhood apraxia of speech (CAS) is diagnosed on the basis of specific speech characteristics, in the absence of problems in hearing, intelligence, and language comprehension. This does not preclude the possibility that children with this speech disorder might demonstrate additional problems. Method: Cognitive functions were investigated…
Age-Related Differences in Speech Rate Perception Do Not Necessarily Entail Age-Related Differences in Speech Rate Use

Science.gov (United States)

Heffner, Christopher C.; Newman, Rochelle S.; Dilley, Laura C.; Idsardi, William J.

2015-01-01

Purpose: A new literature has suggested that speech rate can influence the parsing of words quite strongly in speech. The purpose of this study was to investigate differences between younger adults and older adults in the use of context speech rate in word segmentation, given that older adults perceive timing information differently from younger…
Primary progressive aphasia and apraxia of speech.

Science.gov (United States)

Jung, Youngsin; Duffy, Joseph R; Josephs, Keith A

2013-09-01

Primary progressive aphasia is a neurodegenerative syndrome characterized by progressive language dysfunction. The majority of primary progressive aphasia cases can be classified into three subtypes: nonfluent/agrammatic, semantic, and logopenic variants. Each variant presents with unique clinical features, and is associated with distinctive underlying pathology and neuroimaging findings. Unlike primary progressive aphasia, apraxia of speech is a disorder that involves inaccurate production of sounds secondary to impaired planning or programming of speech movements. Primary progressive apraxia of speech is a neurodegenerative form of apraxia of speech, and it should be distinguished from primary progressive aphasia given its discrete clinicopathological presentation. Recently, there have been substantial advances in our understanding of these speech and language disorders. The clinical, neuroimaging, and histopathological features of primary progressive aphasia and apraxia of speech are reviewed in this article. The distinctions among these disorders for accurate diagnosis are increasingly important from a prognostic and therapeutic standpoint. Thieme Medical Publishers 333 Seventh Avenue, New York, NY 10001, USA.
Optimizing acoustical conditions for speech intelligibility in classrooms

Science.gov (United States)

Yang, Wonyoung

High speech intelligibility is imperative in classrooms where verbal communication is critical. However, the optimal acoustical conditions to achieve a high degree of speech intelligibility have previously been investigated with inconsistent results, and practical room-acoustical solutions to optimize the acoustical conditions for speech intelligibility have not been developed. This experimental study validated auralization for speech-intelligibility testing, investigated the optimal reverberation for speech intelligibility for both normal and hearing-impaired listeners using more realistic room-acoustical models, and proposed an optimal sound-control design for speech intelligibility based on the findings. The auralization technique was used to perform subjective speech-intelligibility tests. The validation study, comparing auralization results with those of real classroom speech-intelligibility tests, found that if the room to be auralized is not very absorptive or noisy, speech-intelligibility tests using auralization are valid. The speech-intelligibility tests were done in two different auralized sound fields---approximately diffuse and non-diffuse---using the Modified Rhyme Test and both normal and hearing-impaired listeners. A hybrid room-acoustical prediction program was used throughout the work, and it and a 1/8 scale-model classroom were used to evaluate the effects of ceiling barriers and reflectors. For both subject groups, in approximately diffuse sound fields, when the speech source was closer to the listener than the noise source, the optimal reverberation time was zero. When the noise source was closer to the listener than the speech source, the optimal reverberation time was 0.4 s (with another peak at 0.0 s) with relative output power levels of the speech and noise sources SNS = 5 dB, and 0.8 s with SNS = 0 dB. In non-diffuse sound fields, when the noise source was between the speaker and the listener, the optimal reverberation time was 0.6 s with
Internet Video Telephony Allows Speech Reading by Deaf Individuals and Improves Speech Perception by Cochlear Implant Users

Science.gov (United States)

Mantokoudis, Georgios; Dähler, Claudia; Dubach, Patrick; Kompis, Martin; Caversaccio, Marco D.; Senn, Pascal

2013-01-01

Objective To analyze speech reading through Internet video calls by profoundly hearing-impaired individuals and cochlear implant (CI) users. Methods Speech reading skills of 14 deaf adults and 21 CI users were assessed using the Hochmair Schulz Moser (HSM) sentence test. We presented video simulations using different video resolutions (1280×720, 640×480, 320×240, 160×120 px), frame rates (30, 20, 10, 7, 5 frames per second (fps)), speech velocities (three different speakers), webcameras (Logitech Pro9000, C600 and C500) and image/sound delays (0–500 ms). All video simulations were presented with and without sound and in two screen sizes. Additionally, scores for live Skype™ video connection and live face-to-face communication were assessed. Results Higher frame rate (>7 fps), higher camera resolution (>640×480 px) and shorter picture/sound delay (<100 ms) were associated with increased speech perception scores. Scores were strongly dependent on the speaker but were not influenced by physical properties of the camera optics or the full screen mode. There is a significant median gain of +8.5%pts (p = 0.009) in speech perception for all 21 CI-users if visual cues are additionally shown. CI users with poor open set speech perception scores (n = 11) showed the greatest benefit under combined audio-visual presentation (median speech perception +11.8%pts, p = 0.032). Conclusion Webcameras have the potential to improve telecommunication of hearing-impaired individuals. PMID:23359119
Internet video telephony allows speech reading by deaf individuals and improves speech perception by cochlear implant users.

Directory of Open Access Journals (Sweden)

Georgios Mantokoudis

Full Text Available OBJECTIVE: To analyze speech reading through Internet video calls by profoundly hearing-impaired individuals and cochlear implant (CI users. METHODS: Speech reading skills of 14 deaf adults and 21 CI users were assessed using the Hochmair Schulz Moser (HSM sentence test. We presented video simulations using different video resolutions (1280 × 720, 640 × 480, 320 × 240, 160 × 120 px, frame rates (30, 20, 10, 7, 5 frames per second (fps, speech velocities (three different speakers, webcameras (Logitech Pro9000, C600 and C500 and image/sound delays (0-500 ms. All video simulations were presented with and without sound and in two screen sizes. Additionally, scores for live Skype™ video connection and live face-to-face communication were assessed. RESULTS: Higher frame rate (>7 fps, higher camera resolution (>640 × 480 px and shorter picture/sound delay (<100 ms were associated with increased speech perception scores. Scores were strongly dependent on the speaker but were not influenced by physical properties of the camera optics or the full screen mode. There is a significant median gain of +8.5%pts (p = 0.009 in speech perception for all 21 CI-users if visual cues are additionally shown. CI users with poor open set speech perception scores (n = 11 showed the greatest benefit under combined audio-visual presentation (median speech perception +11.8%pts, p = 0.032. CONCLUSION: Webcameras have the potential to improve telecommunication of hearing-impaired individuals.
The benefit obtained from visually displayed text from an automatic speech recognizer during listening to speech presented in noise

NARCIS (Netherlands)

Zekveld, A.A.; Kramer, S.E.; Kessens, J.M.; Vlaming, M.S.M.G.; Houtgast, T.

2008-01-01

OBJECTIVES: The aim of this study was to evaluate the benefit that listeners obtain from visually presented output from an automatic speech recognition (ASR) system during listening to speech in noise. DESIGN: Auditory-alone and audiovisual speech reception thresholds (SRTs) were measured. The SRT
The Hierarchical Cortical Organization of Human Speech Processing.

Science.gov (United States)

de Heer, Wendy A; Huth, Alexander G; Griffiths, Thomas L; Gallant, Jack L; Theunissen, Frédéric E

2017-07-05

Speech comprehension requires that the brain extract semantic meaning from the spectral features represented at the cochlea. To investigate this process, we performed an fMRI experiment in which five men and two women passively listened to several hours of natural narrative speech. We then used voxelwise modeling to predict BOLD responses based on three different feature spaces that represent the spectral, articulatory, and semantic properties of speech. The amount of variance explained by each feature space was then assessed using a separate validation dataset. Because some responses might be explained equally well by more than one feature space, we used a variance partitioning analysis to determine the fraction of the variance that was uniquely explained by each feature space. Consistent with previous studies, we found that speech comprehension involves hierarchical representations starting in primary auditory areas and moving laterally on the temporal lobe: spectral features are found in the core of A1, mixtures of spectral and articulatory in STG, mixtures of articulatory and semantic in STS, and semantic in STS and beyond. Our data also show that both hemispheres are equally and actively involved in speech perception and interpretation. Further, responses as early in the auditory hierarchy as in STS are more correlated with semantic than spectral representations. These results illustrate the importance of using natural speech in neurolinguistic research. Our methodology also provides an efficient way to simultaneously test multiple specific hypotheses about the representations of speech without using block designs and segmented or synthetic speech. SIGNIFICANCE STATEMENT To investigate the processing steps performed by the human brain to transform natural speech sound into meaningful language, we used models based on a hierarchical set of speech features to predict BOLD responses of individual voxels recorded in an fMRI experiment while subjects listened to
Inner Speech: Development, Cognitive Functions, Phenomenology, and Neurobiology

Science.gov (United States)

2015-01-01

Inner speech—also known as covert speech or verbal thinking—has been implicated in theories of cognitive development, speech monitoring, executive function, and psychopathology. Despite a growing body of knowledge on its phenomenology, development, and function, approaches to the scientific study of inner speech have remained diffuse and largely unintegrated. This review examines prominent theoretical approaches to inner speech and methodological challenges in its study, before reviewing current evidence on inner speech in children and adults from both typical and atypical populations. We conclude by considering prospects for an integrated cognitive science of inner speech, and present a multicomponent model of the phenomenon informed by developmental, cognitive, and psycholinguistic considerations. Despite its variability among individuals and across the life span, inner speech appears to perform significant functions in human cognition, which in some cases reflect its developmental origins and its sharing of resources with other cognitive processes. PMID:26011789
Multistage audiovisual integration of speech: dissociating identification and detection

DEFF Research Database (Denmark)

Eskelund, Kasper; Tuomainen, Jyrki; Andersen, Tobias

2011-01-01

Speech perception integrates auditory and visual information. This is evidenced by the McGurk illusion where seeing the talking face influences the auditory phonetic percept and by the audiovisual detection advantage where seeing the talking face influences the detectability of the acoustic speech...... signal. Here we show that identification of phonetic content and detection can be dissociated as speech-specific and non-specific audiovisual integration effects. To this end, we employed synthetically modified stimuli, sine wave speech (SWS), which is an impoverished speech signal that only observers...... informed of its speech-like nature recognize as speech. While the McGurk illusion only occurred for informed observers the audiovisual detection advantage occurred for naïve observers as well. This finding supports a multi-stage account of audiovisual integration of speech in which the many attributes...
Treating speech subsystems in childhood apraxia of speech with tactual input: the PROMPT approach.

Science.gov (United States)

Dale, Philip S; Hayden, Deborah A

2013-11-01

Prompts for Restructuring Oral Muscular Phonetic Targets (PROMPT; Hayden, 2004; Hayden, Eigen, Walker, & Olsen, 2010)-a treatment approach for the improvement of speech sound disorders in children-uses tactile-kinesthetic- proprioceptive (TKP) cues to support and shape movements of the oral articulators. No research to date has systematically examined the efficacy of PROMPT for children with childhood apraxia of speech (CAS). Four children (ages 3;6 [years;months] to 4;8), all meeting the American Speech-Language-Hearing Association (2007) criteria for CAS, were treated using PROMPT. All children received 8 weeks of 2 × per week treatment, including at least 4 weeks of full PROMPT treatment that included TKP cues. During the first 4 weeks, 2 of the 4 children received treatment that included all PROMPT components except TKP cues. This design permitted both between-subjects and within-subjects comparisons to evaluate the effect of TKP cues. Gains in treatment were measured by standardized tests and by criterion-referenced measures based on the production of untreated probe words, reflecting change in speech movements and auditory perceptual accuracy. All 4 children made significant gains during treatment, but measures of motor speech control and untreated word probes provided evidence for more gain when TKP cues were included. PROMPT as a whole appears to be effective for treating children with CAS, and the inclusion of TKP cues appears to facilitate greater effect.
Interventions for Speech Sound Disorders in Children

Science.gov (United States)

Williams, A. Lynn, Ed.; McLeod, Sharynne, Ed.; McCauley, Rebecca J., Ed.

2010-01-01

With detailed discussion and invaluable video footage of 23 treatment interventions for speech sound disorders (SSDs) in children, this textbook and DVD set should be part of every speech-language pathologist's professional preparation. Focusing on children with functional or motor-based speech disorders from early childhood through the early…
DEVELOPMENT AND DISORDERS OF SPEECH IN CHILDHOOD.

Science.gov (United States)

KARLIN, ISAAC W.; AND OTHERS

THE GROWTH, DEVELOPMENT, AND ABNORMALITIES OF SPEECH IN CHILDHOOD ARE DESCRIBED IN THIS TEXT DESIGNED FOR PEDIATRICIANS, PSYCHOLOGISTS, EDUCATORS, MEDICAL STUDENTS, THERAPISTS, PATHOLOGISTS, AND PARENTS. THE NORMAL DEVELOPMENT OF SPEECH AND LANGUAGE IS DISCUSSED, INCLUDING THEORIES ON THE ORIGIN OF SPEECH IN MAN AND FACTORS INFLUENCING THE NORMAL…
Auditory Modeling for Noisy Speech Recognition

National Research Council Canada - National Science Library

2000-01-01

... digital filtering for noise cancellation which interfaces to speech recognition software. It uses auditory features in speech recognition training, and provides applications to multilingual spoken language translation...

Describing Speech Usage in Daily Activities in Typical Adults.

Science.gov (United States)

Anderson, Laine; Baylor, Carolyn R; Eadie, Tanya L; Yorkston, Kathryn M

2016-01-01

"Speech usage" refers to what people want or need to do with their speech to meet communication demands in life roles. The purpose of this study was to contribute to validation of the Levels of Speech Usage scale by providing descriptive data from a sample of adults without communication disorders, comparing this scale to a published Occupational Voice Demands scale and examining predictors of speech usage levels. This is a survey design. Adults aged ≥25 years without reported communication disorders were recruited nationally to complete an online questionnaire. The questionnaire included the Levels of Speech Usage scale, questions about relevant occupational and nonoccupational activities (eg, socializing, hobbies, childcare, and so forth), and demographic information. Participants were also categorized according to Koufman and Isaacson occupational voice demands scale. A total of 276 participants completed the questionnaires. People who worked for pay tended to report higher levels of speech usage than those who do not work for pay. Regression analyses showed employment to be the major contributor to speech usage; however, considerable variance left unaccounted for suggests that determinants of speech usage and the relationship between speech usage, employment, and other life activities are not yet fully defined. The Levels of Speech Usage may be a viable instrument to systematically rate speech usage because it captures both occupational and nonoccupational speech demands. These data from a sample of typical adults may provide a reference to help in interpreting the impact of communication disorders on speech usage patterns. Copyright © 2016 The Voice Foundation. Published by Elsevier Inc. All rights reserved.
The Apraxia of Speech Rating Scale: a tool for diagnosis and description of apraxia of speech.

Science.gov (United States)

Strand, Edythe A; Duffy, Joseph R; Clark, Heather M; Josephs, Keith

2014-01-01

The purpose of this report is to describe an initial version of the Apraxia of Speech Rating Scale (ASRS), a scale designed to quantify the presence or absence, relative frequency, and severity of characteristics frequently associated with apraxia of speech (AOS). In this paper we report intra-judge and inter-judge reliability, as well as indices of validity, for the ASRS which was completed for 133 adult participants with a neurodegenerative speech or language disorder, 56 of whom had AOS. The overall inter-judge ICC among three clinicians was 0.94 for the total ASRS score and 0.91 for the number of AOS characteristics identified as present. Intra-judge ICC measures were high, ranging from 0.91 to 0.98. Validity was demonstrated on the basis of strong correlations with independent clinical diagnosis, as well as strong correlations of ASRS scores with independent clinical judgments of AOS severity. Results suggest that the ASRS is a potentially useful tool for documenting the presence and severity of characteristics of AOS. At this point in its development it has good potential for broader clinical use and for better subject description in AOS research. The Apraxia of Speech Rating Scale: A new tool for diagnosis and description of apraxia of speech 1. The reader will be able to explain characteristics of apraxia of speech. 2. The reader will be able to demonstrate use of a rating scale to document the presence and severity of speech characteristics. 3. The reader will be able to explain the reliability and validity of the ASRS. Copyright © 2014 Elsevier Inc. All rights reserved.
The chairman's speech

International Nuclear Information System (INIS)

Allen, A.M.

1986-01-01

The paper contains a transcript of a speech by the chairman of the UKAEA, to mark the publication of the 1985/6 annual report. The topics discussed in the speech include: the Chernobyl accident and its effect on public attitudes to nuclear power, management and disposal of radioactive waste, the operation of UKAEA as a trading fund, and the UKAEA development programmes. The development programmes include work on the following: fast reactor technology, thermal reactors, reactor safety, health and safety aspects of water cooled reactors, the Joint European Torus, and under-lying research. (U.K.)
Visual speech alters the discrimination and identification of non-intact auditory speech in children with hearing loss.

Science.gov (United States)

Jerger, Susan; Damian, Markus F; McAlpine, Rachel P; Abdi, Hervé

2017-03-01

Understanding spoken language is an audiovisual event that depends critically on the ability to discriminate and identify phonemes yet we have little evidence about the role of early auditory experience and visual speech on the development of these fundamental perceptual skills. Objectives of this research were to determine 1) how visual speech influences phoneme discrimination and identification; 2) whether visual speech influences these two processes in a like manner, such that discrimination predicts identification; and 3) how the degree of hearing loss affects this relationship. Such evidence is crucial for developing effective intervention strategies to mitigate the effects of hearing loss on language development. Participants were 58 children with early-onset sensorineural hearing loss (CHL, 53% girls, M = 9;4 yrs) and 58 children with normal hearing (CNH, 53% girls, M = 9;4 yrs). Test items were consonant-vowel (CV) syllables and nonwords with intact visual speech coupled to non-intact auditory speech (excised onsets) as, for example, an intact consonant/rhyme in the visual track (Baa or Baz) coupled to non-intact onset/rhyme in the auditory track (/-B/aa or/-B/az). The items started with an easy-to-speechread/B/or difficult-to-speechread/G/onset and were presented in the auditory (static face) vs. audiovisual (dynamic face) modes. We assessed discrimination for intact vs. non-intact different pairs (e.g., Baa:/-B/aa). We predicted that visual speech would cause the non-intact onset to be perceived as intact and would therefore generate more same-as opposed to different-responses in the audiovisual than auditory mode. We assessed identification by repetition of nonwords with non-intact onsets (e.g.,/-B/az). We predicted that visual speech would cause the non-intact onset to be perceived as intact and would therefore generate more Baz-as opposed to az- responses in the audiovisual than auditory mode. Performance in the audiovisual mode showed more same
Visual Speech Alters the Discrimination and Identification of Non-Intact Auditory Speech in Children with Hearing Loss

Science.gov (United States)

Jerger, Susan; Damian, Markus F.; McAlpine, Rachel P.; Abdi, Hervé

2017-01-01

Objectives Understanding spoken language is an audiovisual event that depends critically on the ability to discriminate and identify phonemes yet we have little evidence about the role of early auditory experience and visual speech on the development of these fundamental perceptual skills. Objectives of this research were to determine 1) how visual speech influences phoneme discrimination and identification; 2) whether visual speech influences these two processes in a like manner, such that discrimination predicts identification; and 3) how the degree of hearing loss affects this relationship. Such evidence is crucial for developing effective intervention strategies to mitigate the effects of hearing loss on language development. Methods Participants were 58 children with early-onset sensorineural hearing loss (CHL, 53% girls, M = 9;4 yrs) and 58 children with normal hearing (CNH, 53% girls, M = 9;4 yrs). Test items were consonant-vowel (CV) syllables and nonwords with intact visual speech coupled to non-intact auditory speech (excised onsets) as, for example, an intact consonant/rhyme in the visual track (Baa or Baz) coupled to non-intact onset/rhyme in the auditory track (/–B/aa or /–B/az). The items started with an easy-to-speechread /B/ or difficult-to-speechread /G/ onset and were presented in the auditory (static face) vs. audiovisual (dynamic face) modes. We assessed discrimination for intact vs. non-intact different pairs (e.g., Baa:/–B/aa). We predicted that visual speech would cause the non-intact onset to be perceived as intact and would therefore generate more same—as opposed to different—responses in the audiovisual than auditory mode. We assessed identification by repetition of nonwords with non-intact onsets (e.g., /–B/az). We predicted that visual speech would cause the non-intact onset to be perceived as intact and would therefore generate more Baz—as opposed to az— responses in the audiovisual than auditory mode. Results
Spotlight on Speech Codes 2011: The State of Free Speech on Our Nation's Campuses

Science.gov (United States)

Foundation for Individual Rights in Education (NJ1), 2011

2011-01-01

Each year, the Foundation for Individual Rights in Education (FIRE) conducts a rigorous survey of restrictions on speech at America's colleges and universities. The survey and accompanying report explore the extent to which schools are meeting their legal and moral obligations to uphold students' and faculty members' rights to freedom of speech,…
Spotlight on Speech Codes 2009: The State of Free Speech on Our Nation's Campuses

Science.gov (United States)

Foundation for Individual Rights in Education (NJ1), 2009

2009-01-01

Each year, the Foundation for Individual Rights in Education (FIRE) conducts a wide, detailed survey of restrictions on speech at America's colleges and universities. The survey and resulting report explore the extent to which schools are meeting their obligations to uphold students' and faculty members' rights to freedom of speech, freedom of…
Spotlight on Speech Codes 2010: The State of Free Speech on Our Nation's Campuses

Science.gov (United States)

Foundation for Individual Rights in Education (NJ1), 2010

2010-01-01

Each year, the Foundation for Individual Rights in Education (FIRE) conducts a rigorous survey of restrictions on speech at America's colleges and universities. The survey and resulting report explore the extent to which schools are meeting their legal and moral obligations to uphold students' and faculty members' rights to freedom of speech,…
A speech production model including the nasal Cavity

DEFF Research Database (Denmark)

Olesen, Morten

In order to obtain articulatory analysis of speech production the model is improved. the standard model, as used in LPC analysis, to a large extent only models the acoustic properties of speech signal as opposed to articulatory modelling of the speech production. In spite of this the LPC model...... is by far the most widely used model in speech technology....
Speech Data Compression using Vector Quantization

OpenAIRE

H. B. Kekre; Tanuja K. Sarode

2008-01-01

Mostly transforms are used for speech data compressions which are lossy algorithms. Such algorithms are tolerable for speech data compression since the loss in quality is not perceived by the human ear. However the vector quantization (VQ) has a potential to give more data compression maintaining the same quality. In this paper we propose speech data compression algorithm using vector quantization technique. We have used VQ algorithms LBG, KPE and FCG. The results table s...
Lip Movement Exaggerations during Infant-Directed Speech

Science.gov (United States)

Green, Jordan R.; Nip, Ignatius S. B.; Wilson, Erin M.; Mefferd, Antje S.; Yunusova, Yana

2010-01-01

Purpose: Although a growing body of literature has identified the positive effects of visual speech on speech and language learning, oral movements of infant-directed speech (IDS) have rarely been studied. This investigation used 3-dimensional motion capture technology to describe how mothers modify their lip movements when talking to their…
38 CFR 8.18 - Total disability-speech.

Science.gov (United States)

2010-07-01

... 38 Pensions, Bonuses, and Veterans' Relief 1 2010-07-01 2010-07-01 false Total disability-speech... SERVICE LIFE INSURANCE Premium Waivers and Total Disability § 8.18 Total disability—speech. The organic loss of speech shall be deemed to be total disability under National Service Life Insurance. [67 FR...
Normal Aspects of Speech, Hearing, and Language.

Science.gov (United States)

Minifie, Fred. D., Ed.; And Others

This book is written as a guide to the understanding of the processes involved in human speech communication. Ten authorities contributed material to provide an introduction to the physiological aspects of speech production and reception, the acoustical aspects of speech production and transmission, the psychophysics of sound reception, the nature…
Electrophysiological evidence for speech-specific audiovisual integration

NARCIS (Netherlands)

Baart, M.; Stekelenburg, J.J.; Vroomen, J.

2014-01-01

Lip-read speech is integrated with heard speech at various neural levels. Here, we investigated the extent to which lip-read induced modulations of the auditory N1 and P2 (measured with EEG) are indicative of speech-specific audiovisual integration, and we explored to what extent the ERPs were
High-frequency energy in singing and speech

Science.gov (United States)

Monson, Brian Bruce

While human speech and the human voice generate acoustical energy up to (and beyond) 20 kHz, the energy above approximately 5 kHz has been largely neglected. Evidence is accruing that this high-frequency energy contains perceptual information relevant to speech and voice, including percepts of quality, localization, and intelligibility. The present research was an initial step in the long-range goal of characterizing high-frequency energy in singing voice and speech, with particular regard for its perceptual role and its potential for modification during voice and speech production. In this study, a database of high-fidelity recordings of talkers was created and used for a broad acoustical analysis and general characterization of high-frequency energy, as well as specific characterization of phoneme category, voice and speech intensity level, and mode of production (speech versus singing) by high-frequency energy content. Directionality of radiation of high-frequency energy from the mouth was also examined. The recordings were used for perceptual experiments wherein listeners were asked to discriminate between speech and voice samples that differed only in high-frequency energy content. Listeners were also subjected to gender discrimination tasks, mode-of-production discrimination tasks, and transcription tasks with samples of speech and singing that contained only high-frequency content. The combination of these experiments has revealed that (1) human listeners are able to detect very subtle level changes in high-frequency energy, and (2) human listeners are able to extract significant perceptual information from high-frequency energy.
Unvoiced Speech Recognition Using Tissue-Conductive Acoustic Sensor

Directory of Open Access Journals (Sweden)

Heracleous Panikos

2007-01-01

Full Text Available We present the use of stethoscope and silicon NAM (nonaudible murmur microphones in automatic speech recognition. NAM microphones are special acoustic sensors, which are attached behind the talker's ear and can capture not only normal (audible speech, but also very quietly uttered speech (nonaudible murmur. As a result, NAM microphones can be applied in automatic speech recognition systems when privacy is desired in human-machine communication. Moreover, NAM microphones show robustness against noise and they might be used in special systems (speech recognition, speech transform, etc. for sound-impaired people. Using adaptation techniques and a small amount of training data, we achieved for a 20 k dictation task a word accuracy for nonaudible murmur recognition in a clean environment. In this paper, we also investigate nonaudible murmur recognition in noisy environments and the effect of the Lombard reflex on nonaudible murmur recognition. We also propose three methods to integrate audible speech and nonaudible murmur recognition using a stethoscope NAM microphone with very promising results.
Detection of target phonemes in spontaneous and read speech

NARCIS (Netherlands)

Mehta, G.; Cutler, A.

1988-01-01

Although spontaneous speech occurs more frequently in most listeners' experience than read speech, laboratory studies of human speech recognition typically use carefully controlled materials read from a script. The phonological and prosodic characteristics of spontaneous and read speech differ
Processing changes when listening to foreign-accented speech

Directory of Open Access Journals (Sweden)

Carlos eRomero-Rivas

2015-03-01

Full Text Available This study investigates the mechanisms responsible for fast changes in processing foreign-accented speech. Event Related brain Potentials (ERPs were obtained while native speakers of Spanish listened to native and foreign-accented speakers of Spanish. We observed a less positive P200 component for foreign-accented speech relative to native speech comprehension. This suggests that the extraction of spectral information and other important acoustic features was hampered during foreign-accented speech comprehension. However, the amplitude of the N400 component for foreign-accented speech comprehension decreased across the experiment, suggesting the use of a higher level, lexical mechanism. Furthermore, during native speech comprehension, semantic violations in the critical words elicited an N400 effect followed by a late positivity. During foreign-accented speech comprehension, semantic violations only elicited an N400 effect. Overall, our results suggest that, despite a lack of improvement in phonetic discrimination, native listeners experience changes at lexical-semantic levels of processing after brief exposure to foreign-accented speech. Moreover, these results suggest that lexical access, semantic integration and linguistic re-analysis processes are permeable to external factors, such as the accent of the speaker.
Gender and Speech in a Disney Princess Movie

Directory of Open Access Journals (Sweden)

Azmi N.J.

2016-11-01

Full Text Available One of the latest Disney princess movies is Frozen which was released in 2013. Female characters in Frozen differ from the female characters in previous Disney movies, such as The Little Mermaid and Tangled. In comparison, female characters in Frozen are portrayed as having more heroic values and norms, which makes it interesting to examine their speech characteristics. Do they use typical female speech despite having more heroic characteristics? This paper aims to provide insights into the female speech characteristics in this movie based on Lakoff’s (1975 model of female speech. Data analysis shows that female and male characters in the movie used almost equal number of female speech elements in their dialogues. Interestingly, although female characters in the movie do not behave stereotypically, their speech still contain the elements of female speech, such as the use empty adjectives, questions, hedges and intensifier. This paper argues that the blurring of boundaries between male and female speech characteristics in this movie is an attempt to break gender stereotyping by showing that female characters share similar characteristics with heroic male characters thus they should not be seen as inferior to the male characters.
Using Zebra-speech to study sequential and simultaneous speech segregation in a cochlear-implant simulation.

Science.gov (United States)

Gaudrain, Etienne; Carlyon, Robert P

2013-01-01

Previous studies have suggested that cochlear implant users may have particular difficulties exploiting opportunities to glimpse clear segments of a target speech signal in the presence of a fluctuating masker. Although it has been proposed that this difficulty is associated with a deficit in linking the glimpsed segments across time, the details of this mechanism are yet to be explained. The present study introduces a method called Zebra-speech developed to investigate the relative contribution of simultaneous and sequential segregation mechanisms in concurrent speech perception, using a noise-band vocoder to simulate cochlear implants. One experiment showed that the saliency of the difference between the target and the masker is a key factor for Zebra-speech perception, as it is for sequential segregation. Furthermore, forward masking played little or no role, confirming that intelligibility was not limited by energetic masking but by across-time linkage abilities. In another experiment, a binaural cue was used to distinguish the target and the masker. It showed that the relative contribution of simultaneous and sequential segregation depended on the spectral resolution, with listeners relying more on sequential segregation when the spectral resolution was reduced. The potential of Zebra-speech as a segregation enhancement strategy for cochlear implants is discussed.

Methods of analysis speech rate: a pilot study.

Science.gov (United States)

Costa, Luanna Maria Oliveira; Martins-Reis, Vanessa de Oliveira; Celeste, Letícia Côrrea

2016-01-01

To describe the performance of fluent adults in different measures of speech rate. The study included 24 fluent adults, of both genders, speakers of Brazilian Portuguese, who were born and still living in the metropolitan region of Belo Horizonte, state of Minas Gerais, aged between 18 and 59 years. Participants were grouped by age: G1 (18-29 years), G2 (30-39 years), G3 (40-49 years), and G4 (50-59 years). The speech samples were obtained following the methodology of the Speech Fluency Assessment Protocol. In addition to the measures of speech rate proposed by the protocol (speech rate in words and syllables per minute), the rate of speech into phonemes per second and the articulation rate with and without the disfluencies were calculated. We used the nonparametric Friedman test and the Wilcoxon test for multiple comparisons. Groups were compared using the nonparametric Kruskal Wallis. The significance level was of 5%. There were significant differences between measures of speech rate involving syllables. The multiple comparisons showed that all the three measures were different. There was no effect of age for the studied measures. These findings corroborate previous studies. The inclusion of temporal acoustic measures such as speech rate in phonemes per second and articulation rates with and without disfluencies can be a complementary approach in the evaluation of speech rate.
Particularities of Speech Readiness for Schooling in Pre-School Children Having General Speech Underdevelopment: A Social and Pedagogical Aspect

Science.gov (United States)

Emelyanova, Irina A.; Borisova, Elena A.; Shapovalova, Olga E.; Karynbaeva, Olga V.; Vorotilkina, Irina M.

2018-01-01

The relevance of the research is due to the necessity of creating the pedagogical conditions for correction and development of speech in children having the general speech underdevelopment. For them, difficulties generating a coherent utterance are characteristic, which prevents a sufficient speech readiness for schooling forming in them as well…
Speech, Language, and Reading in 10-Year-Olds With Cleft: Associations With Teasing, Satisfaction With Speech, and Psychological Adjustment.

Science.gov (United States)

Feragen, Kristin Billaud; Særvold, Tone Kristin; Aukner, Ragnhild; Stock, Nicola Marie

2017-03-01

Despite the use of multidisciplinary services, little research has addressed issues involved in the care of those with cleft lip and/or palate across disciplines. The aim was to investigate associations between speech, language, reading, and reports of teasing, subjective satisfaction with speech, and psychological adjustment. Cross-sectional data collected during routine, multidisciplinary assessments in a centralized treatment setting, including speech and language therapists and clinical psychologists. Children with cleft with palatal involvement aged 10 years from three birth cohorts (N = 170) and their parents. Speech: SVANTE-N. Language: Language 6-16 (sentence recall, serial recall, vocabulary, and phonological awareness). Reading: Word Chain Test and Reading Comprehension Test. Psychological measures: Strengths and Difficulties Questionnaire and extracts from the Satisfaction With Appearance Scale and Child Experience Questionnaire. Reading skills were associated with self- and parent-reported psychological adjustment in the child. Subjective satisfaction with speech was associated with psychological adjustment, while not being consistently associated with speech therapists' assessments. Parent-reported teasing was found to be associated with lower levels of reading skills. Having a medical and/or psychological condition in addition to the cleft was found to affect speech, language, and reading significantly. Cleft teams need to be aware of speech, language, and/or reading problems as potential indicators of psychological risk in children with cleft. This study highlights the importance of multiple reports (self, parent, and specialist) and a multidisciplinary approach to cleft care and research.
Robust digital processing of speech signals

CERN Document Server

Kovacevic, Branko; Veinović, Mladen; Marković, Milan

2017-01-01

This book focuses on speech signal phenomena, presenting a robustification of the usual speech generation models with regard to the presumed types of excitation signals, which is equivalent to the introduction of a class of nonlinear models and the corresponding criterion functions for parameter estimation. Compared to the general class of nonlinear models, such as various neural networks, these models possess good properties of controlled complexity, the option of working in “online” mode, as well as a low information volume for efficient speech encoding and transmission. Providing comprehensive insights, the book is based on the authors’ research, which has already been published, supplemented by additional texts discussing general considerations of speech modeling, linear predictive analysis and robust parameter estimation.
Ultrasound applicability in Speech Language Pathology and Audiology

OpenAIRE

Barberena,Luciana da Silva; Brasil,Brunah de Castro; Melo,Roberta Michelon; Mezzomo,Carolina Lisbôa; Mota,Helena Bolli; Keske-Soares,Márcia

2014-01-01

PURPOSE: To present recent studies that used the ultrasound in the fields of Speech Language Pathology and Audiology, which evidence possibilities of the applicability of this technique in different subareas. RESEARCH STRATEGY: A bibliographic research was carried out in the PubMed database, using the keywords "ultrasonic," "speech," "phonetics," "Speech, Language and Hearing Sciences," "voice," "deglutition," and "myofunctional therapy," comprising some areas of Speech Language Pathology and...
Ultrasound applicability in Speech Language Pathology and Audiology

OpenAIRE

Barberena, Luciana da Silva; Brasil, Brunah de Castro; Melo, Roberta Michelon; Mezzomo, Carolina Lisbôa; Mota, Helena Bolli; Keske-Soares, Márcia

2014-01-01

PURPOSE: To present recent studies that used the ultrasound in the fields of Speech Language Pathology and Audiology, which evidence possibilities of the applicability of this technique in different subareas. RESEARCH STRATEGY: A bibliographic research was carried out in the PubMed database, using the keywords "ultrasonic," "speech," "phonetics," "Speech, Language and Hearing Sciences," "voice," "deglutition," and "myofunctional therapy," comprising some areas of Speech Language Patholog...
The influence of age, hearing, and working memory on the speech comprehension benefit derived from an automatic speech recognition system.

Science.gov (United States)

Zekveld, Adriana A; Kramer, Sophia E; Kessens, Judith M; Vlaming, Marcel S M G; Houtgast, Tammo

2009-04-01

The aim of the current study was to examine whether partly incorrect subtitles that are automatically generated by an Automatic Speech Recognition (ASR) system, improve speech comprehension by listeners with hearing impairment. In an earlier study (Zekveld et al. 2008), we showed that speech comprehension in noise by young listeners with normal hearing improves when presenting partly incorrect, automatically generated subtitles. The current study focused on the effects of age, hearing loss, visual working memory capacity, and linguistic skills on the benefit obtained from automatically generated subtitles during listening to speech in noise. In order to investigate the effects of age and hearing loss, three groups of participants were included: 22 young persons with normal hearing (YNH, mean age = 21 years), 22 middle-aged adults with normal hearing (MA-NH, mean age = 55 years) and 30 middle-aged adults with hearing impairment (MA-HI, mean age = 57 years). The benefit from automatic subtitling was measured by Speech Reception Threshold (SRT) tests (Plomp & Mimpen, 1979). Both unimodal auditory and bimodal audiovisual SRT tests were performed. In the audiovisual tests, the subtitles were presented simultaneously with the speech, whereas in the auditory test, only speech was presented. The difference between the auditory and audiovisual SRT was defined as the audiovisual benefit. Participants additionally rated the listening effort. We examined the influences of ASR accuracy level and text delay on the audiovisual benefit and the listening effort using a repeated measures General Linear Model analysis. In a correlation analysis, we evaluated the relationships between age, auditory SRT, visual working memory capacity and the audiovisual benefit and listening effort. The automatically generated subtitles improved speech comprehension in noise for all ASR accuracies and delays covered by the current study. Higher ASR accuracy levels resulted in more benefit obtained
Speech Prosody in Cerebellar Ataxia

Science.gov (United States)

Casper, Maureen A.; Raphael, Lawrence J.; Harris, Katherine S.; Geibel, Jennifer M.

2007-01-01

Persons with cerebellar ataxia exhibit changes in physical coordination and speech and voice production. Previously, these alterations of speech and voice production were described primarily via perceptual coordinates. In this study, the spatial-temporal properties of syllable production were examined in 12 speakers, six of whom were healthy…
Acquired apraxia of speech: features, accounts, and treatment.

Science.gov (United States)

Peach, Richard K

2004-01-01

The features of apraxia of speech (AOS) are presented with regard to both traditional and contemporary descriptions of the disorder. Models of speech processing, including the neurological bases for apraxia of speech, are discussed. Recent findings concerning subcortical contributions to apraxia of speech and the role of the insula are presented. The key features to differentially diagnose AOS from related speech syndromes are identified. Treatment implications derived from motor accounts of AOS are presented along with a summary of current approaches designed to treat the various subcomponents of the disorder. Finally, guidelines are provided for treating the AOS patient with coexisting aphasia.
Speech enhancement in the Karhunen-Loeve expansion domain

CERN Document Server

Benesty, Jacob

2011-01-01

This book is devoted to the study of the problem of speech enhancement whose objective is the recovery of a signal of interest (i.e., speech) from noisy observations. Typically, the recovery process is accomplished by passing the noisy observations through a linear filter (or a linear transformation). Since both the desired speech and undesired noise are filtered at the same time, the most critical issue of speech enhancement resides in how to design a proper optimal filter that can fully take advantage of the difference between the speech and noise statistics to mitigate the noise effect as m
Voice and Speech Quality Perception Assessment and Evaluation

CERN Document Server

Jekosch, Ute

2005-01-01

Foundations of Voice and Speech Quality Perception starts out with the fundamental question of: "How do listeners perceive voice and speech quality and how can these processes be modeled?" Any quantitative answers require measurements. This is natural for physical quantities but harder to imagine for perceptual measurands. This book approaches the problem by actually identifying major perceptual dimensions of voice and speech quality perception, defining units wherever possible and offering paradigms to position these dimensions into a structural skeleton of perceptual speech and voice quality. The emphasis is placed on voice and speech quality assessment of systems in artificial scenarios. Many scientific fields are involved. This book bridges the gap between two quite diverse fields, engineering and humanities, and establishes the new research area of Voice and Speech Quality Perception.
Speech recognition using articulatory and excitation source features

CERN Document Server

Rao, K Sreenivasa

2017-01-01

This book discusses the contribution of articulatory and excitation source information in discriminating sound units. The authors focus on excitation source component of speech -- and the dynamics of various articulators during speech production -- for enhancement of speech recognition (SR) performance. Speech recognition is analyzed for read, extempore, and conversation modes of speech. Five groups of articulatory features (AFs) are explored for speech recognition, in addition to conventional spectral features. Each chapter provides the motivation for exploring the specific feature for SR task, discusses the methods to extract those features, and finally suggests appropriate models to capture the sound unit specific knowledge from the proposed features. The authors close by discussing various combinations of spectral, articulatory and source features, and the desired models to enhance the performance of SR systems.
Detection of target phonemes in spontaneous and read speech

OpenAIRE

Mehta, G.; Cutler, A.

1988-01-01

Although spontaneous speech occurs more frequently in most listeners’ experience than read speech, laboratory studies of human speech recognition typically use carefully controlled materials read from a script. The phonological and prosodic characteristics of spontaneous and read speech differ considerably, however, which suggests that laboratory results may not generalize to the recognition of spontaneous and read speech materials, and their response time to detect word-initial target phonem...
Effects of human fatigue on speech signals

Science.gov (United States)

Stamoulis, Catherine

2004-05-01

Cognitive performance may be significantly affected by fatigue. In the case of critical personnel, such as pilots, monitoring human fatigue is essential to ensure safety and success of a given operation. One of the modalities that may be used for this purpose is speech, which is sensitive to respiratory changes and increased muscle tension of vocal cords, induced by fatigue. Age, gender, vocal tract length, physical and emotional state may significantly alter speech intensity, duration, rhythm, and spectral characteristics. In addition to changes in speech rhythm, fatigue may also affect the quality of speech, such as articulation. In a noisy environment, detecting fatigue-related changes in speech signals, particularly subtle changes at the onset of fatigue, may be difficult. Therefore, in a performance-monitoring system, speech parameters which are significantly affected by fatigue need to be identified and extracted from input signals. For this purpose, a series of experiments was performed under slowly varying cognitive load conditions and at different times of the day. The results of the data analysis are presented here.
A Motor Speech Assessment for Children with Severe Speech Disorders: Reliability and Validity Evidence

Science.gov (United States)

Strand, Edythe A.; McCauley, Rebecca J.; Weigand, Stephen D.; Stoeckel, Ruth E.; Baas, Becky S.

2013-01-01

Purpose: In this article, the authors report reliability and validity evidence for the Dynamic Evaluation of Motor Speech Skill (DEMSS), a new test that uses dynamic assessment to aid in the differential diagnosis of childhood apraxia of speech (CAS). Method: Participants were 81 children between 36 and 79 months of age who were referred to the…
Objective voice and speech analysis of persons with chronic hoarseness by prosodic analysis of speech samples.

Science.gov (United States)

Haderlein, Tino; Döllinger, Michael; Matoušek, Václav; Nöth, Elmar

2016-10-01

Automatic voice assessment is often performed using sustained vowels. In contrast, speech analysis of read-out texts can be applied to voice and speech assessment. Automatic speech recognition and prosodic analysis were used to find regression formulae between automatic and perceptual assessment of four voice and four speech criteria. The regression was trained with 21 men and 62 women (average age 49.2 years) and tested with another set of 24 men and 49 women (48.3 years), all suffering from chronic hoarseness. They read the text 'Der Nordwind und die Sonne' ('The North Wind and the Sun'). Five voice and speech therapists evaluated the data on 5-point Likert scales. Ten prosodic and recognition accuracy measures (features) were identified which describe all the examined criteria. Inter-rater correlation within the expert group was between r = 0.63 for the criterion 'match of breath and sense units' and r = 0.87 for the overall voice quality. Human-machine correlation was between r = 0.40 for the match of breath and sense units and r = 0.82 for intelligibility. The perceptual ratings of different criteria were highly correlated with each other. Likewise, the feature sets modeling the criteria were very similar. The automatic method is suitable for assessing chronic hoarseness in general and for subgroups of functional and organic dysphonia. In its current version, it is almost as reliable as a randomly picked rater from a group of voice and speech therapists.
Memory for speech and speech for memory.

Science.gov (United States)

Locke, J L; Kutz, K J

1975-03-01

Thirty kindergarteners, 15 who substituted /w/ for /r/ and 15 with correct articulation, received two perception tests and a memory test that included /w/ and /r/ in minimally contrastive syllables. Although both groups had nearly perfect perception of the experimenter's productions of /w/ and /r/, misarticulating subjects perceived their own tape-recorded w/r productions as /w/. In the memory task these same misarticulating subjects committed significantly more /w/-/r/ confusions in unspoken recall. The discussion considers why people subvocally rehearse; a developmental period in which children do not rehearse; ways subvocalization may aid recall, including motor and acoustic encoding; an echoic store that provides additional recall support if subjects rehearse vocally, and perception of self- and other- produced phonemes by misarticulating children-including its relevance to a motor theory of perception. Evidence is presented that speech for memory can be sufficiently impaired to cause memory disorder. Conceptions that restrict speech disorder to an impairment of communication are challenged.
Speech Training for Inmate Rehabilitation.

Science.gov (United States)

Parkinson, Michael G.; Dobkins, David H.

1982-01-01

Using a computerized content analysis, the authors demonstrate changes in speech behaviors of prison inmates. They conclude that two to four hours of public speaking training can have only limited effect on students who live in a culture in which "prison speech" is the expected and rewarded form of behavior. (PD)
Speech recognition from spectral dynamics

Indian Academy of Sciences (India)

Some of the history of gradual infusion of the modulation spectrum concept into Automatic recognition of speech (ASR) comes next, pointing to the relationship of modulation spectrum processing to wellaccepted ASR techniques such as dynamic speech features or RelAtive SpecTrAl (RASTA) ﬁltering. Next, the frequency ...
Impact of speech-generating devices on the language development of a child with childhood apraxia of speech: a case study.

Science.gov (United States)

Lüke, Carina

2016-01-01

The purpose of the study was to evaluate the effectiveness of speech-generating devices (SGDs) on the communication and language development of a 2-year-old boy with severe childhood apraxia of speech (CAS). An A-B design was used over a treatment period of 1 year, followed by three additional follow-up measurements, in order to evaluate the implementation of SGDs in the speech therapy of a 2;7-year-old boy with severe CAS. In total, 53 therapy sessions were videotaped and analyzed to better understand his communicative (operationalized as means of communication) and linguistic (operationalized as intelligibility and consistency of speech-productions, lexical and grammatical development) development. The trend-lines of baseline phase A and intervention phase B were compared and percentage of non-overlapping data points were calculated to verify the value of the intervention. The use of SGDs led to an immediate increase in the communicative development of the child. An increase in all linguistic variables was observed, with a latency effect of eight to nine treatment sessions. The implementation of SGDs in speech therapy has the potential to be highly effective in regards to both communicative and linguistic competencies in young children with severe CAS. Implications for Rehabilitation Childhood apraxia of speech (CAS) is a neurological speech sound disorder which results in significant deficits in speech production and lead to a higher risk for language, reading and spelling difficulties. Speech-generating devices (SGD), as one method of augmentative and alternative communication (AAC), can effectively enhance the communicative and linguistic development of children with severe CAS.

Theoretical Value in Teaching Freedom of Speech.

Science.gov (United States)

Carney, John J., Jr.

The exercise of freedom of speech within our nation has deteriorated. A practical value in teaching free speech is the possibility of restoring a commitment to its principles by educators. What must be taught is why freedom of speech is important, why it has been compromised, and the extent to which it has been compromised. Every technological…
Speech profile of patients undergoing primary palatoplasty.

Science.gov (United States)

Menegueti, Katia Ignacio; Mangilli, Laura Davison; Alonso, Nivaldo; Andrade, Claudia Regina Furquim de

2017-10-26

To characterize the profile and speech characteristics of patients undergoing primary palatoplasty in a Brazilian university hospital, considering the time of intervention (early, before two years of age; late, after two years of age). Participants were 97 patients of both genders with cleft palate and/or cleft and lip palate, assigned to the Speech-language Pathology Department, who had been submitted to primary palatoplasty and presented no prior history of speech-language therapy. Patients were divided into two groups: early intervention group (EIG) - 43 patients undergoing primary palatoplasty before 2 years of age and late intervention group (LIG) - 54 patients undergoing primary palatoplasty after 2 years of age. All patients underwent speech-language pathology assessment. The following parameters were assessed: resonance classification, presence of nasal turbulence, presence of weak intraoral air pressure, presence of audible nasal air emission, speech understandability, and compensatory articulation disorder (CAD). At statistical significance level of 5% (p≤0.05), no significant difference was observed between the groups in the following parameters: resonance classification (p=0.067); level of hypernasality (p=0.113), presence of nasal turbulence (p=0.179); presence of weak intraoral air pressure (p=0.152); presence of nasal air emission (p=0.369), and speech understandability (p=0.113). The groups differed with respect to presence of compensatory articulation disorders (p=0.020), with the LIG presenting higher occurrence of altered phonemes. It was possible to assess the general profile and speech characteristics of the study participants. Patients submitted to early primary palatoplasty present better speech profile.
Non-fluent speech following stroke is caused by impaired efference copy.

Science.gov (United States)

Feenaughty, Lynda; Basilakos, Alexandra; Bonilha, Leonardo; den Ouden, Dirk-Bart; Rorden, Chris; Stark, Brielle; Fridriksson, Julius

2017-09-01

Efference copy is a cognitive mechanism argued to be critical for initiating and monitoring speech: however, the extent to which breakdown of efference copy mechanisms impact speech production is unclear. This study examined the best mechanistic predictors of non-fluent speech among 88 stroke survivors. Objective speech fluency measures were subjected to a principal component analysis (PCA). The primary PCA factor was then entered into a multiple stepwise linear regression analysis as the dependent variable, with a set of independent mechanistic variables. Participants' ability to mimic audio-visual speech ("speech entrainment response") was the best independent predictor of non-fluent speech. We suggest that this "speech entrainment" factor reflects integrity of internal monitoring (i.e., efference copy) of speech production, which affects speech initiation and maintenance. Results support models of normal speech production and suggest that therapy focused on speech initiation and maintenance may improve speech fluency for individuals with chronic non-fluent aphasia post stroke.
Stability and composition of functional synergies for speech movements in children with developmental speech disorders

NARCIS (Netherlands)

Terband, H.; Maassen, B.; van Lieshout, P.; Nijland, L.

2011-01-01

The aim of this study was to investigate the consistency and composition of functional synergies for speech movements in children with developmental speech disorders. Kinematic data were collected on the reiterated productions of syllables spa (/spa:/) and paas (/pa:s/) by 10 6- to 9-year-olds with
Speech to Text Software Evaluation Report

CERN Document Server

Martins Santo, Ana Luisa

2017-01-01

This document compares out-of-box performance of three commercially available speech recognition software: Vocapia VoxSigma TM , Google Cloud Speech, and Lime- craft Transcriber. It is defined a set of evaluation criteria and test methods for speech recognition softwares. The evaluation of these softwares in noisy environments are also included for the testing purposes. Recognition accuracy was compared using noisy environments and languages. Testing in ”ideal” non-noisy environment of a quiet room has been also performed for comparison.
The Influence of Direct and Indirect Speech on Source Memory

Directory of Open Access Journals (Sweden)

Anita Eerland

2018-02-01

Full Text Available People perceive the same situation described in direct speech (e.g., John said, “I like the food at this restaurant” as more vivid and perceptually engaging than described in indirect speech (e.g., John said that he likes the food at the restaurant. So, if direct speech enhances the perception of vividness relative to indirect speech, what are the effects of using indirect speech? In four experiments, we examined whether the use of direct and indirect speech influences the comprehender’s memory for the identity of the speaker. Participants read a direct or an indirect speech version of a story and then addressed statements to one of the four protagonists of the story in a memory task. We found better source memory at the level of protagonist gender after indirect than direct speech (Exp. 1–3. When the story was rewritten to make the protagonists more distinctive, we also found an effect of speech type on source memory at the level of the individual, with better memory after indirect than direct speech (Exp. 3–4. Memory for the content of the story, however, was not influenced by speech type (Exp. 4. While previous research showed that direct speech may enhance memory for how something was said, we conclude that indirect speech enhances memory for who said what.
Recognizing intentions in infant-directed speech: evidence for universals.

Science.gov (United States)

Bryant, Gregory A; Barrett, H Clark

2007-08-01

In all languages studied to date, distinct prosodic contours characterize different intention categories of infant-directed (ID) speech. This vocal behavior likely exists universally as a species-typical trait, but little research has examined whether listeners can accurately recognize intentions in ID speech using only vocal cues, without access to semantic information. We recorded native-English-speaking mothers producing four intention categories of utterances (prohibition, approval, comfort, and attention) as both ID and adult-directed (AD) speech, and we then presented the utterances to Shuar adults (South American hunter-horticulturalists). Shuar subjects were able to reliably distinguish ID from AD speech and were able to reliably recognize the intention categories in both types of speech, although performance was significantly better with ID speech. This is the first demonstration that adult listeners in an indigenous, nonindustrialized, and nonliterate culture can accurately infer intentions from both ID speech and AD speech in a language they do not speak.
75 FR 67333 - Telecommunications Relay Services and Speech-to-Speech Services for Individuals With Hearing and...

Science.gov (United States)

2010-11-02

... Docket No. 10-191; FCC 10-161] Telecommunications Relay Services and Speech-to-Speech Services for...-Based Telecommunications Relay Service Numbering AGENCY: Federal Communications Commission. ACTION... Internet-based Telecommunications Relay Service (iTRS), specifically, Video Relay Service (VRS) and IP...
75 FR 29914 - Telecommunications Relay Services, Speech-to-Speech Services, E911 Requirements for IP-Enabled...

Science.gov (United States)

2010-05-28

... FEDERAL COMMUNICATIONS COMMISSION 47 CFR Part 64 [CG Docket No. 03-123; WC Docket No. 05-196; FCC 08-275] Telecommunications Relay Services, Speech-to-Speech Services, E911 Requirements for IP... with the Commission's Telecommunications Relay Services, [[Page 29915
Out-of-synchrony speech entrainment in developmental dyslexia.

Science.gov (United States)

Molinaro, Nicola; Lizarazu, Mikel; Lallier, Marie; Bourguignon, Mathieu; Carreiras, Manuel

2016-08-01

Developmental dyslexia is a reading disorder often characterized by reduced awareness of speech units. Whether the neural source of this phonological disorder in dyslexic readers results from the malfunctioning of the primary auditory system or damaged feedback communication between higher-order phonological regions (i.e., left inferior frontal regions) and the auditory cortex is still under dispute. Here we recorded magnetoencephalographic (MEG) signals from 20 dyslexic readers and 20 age-matched controls while they were listening to ∼10-s-long spoken sentences. Compared to controls, dyslexic readers had (1) an impaired neural entrainment to speech in the delta band (0.5-1 Hz); (2) a reduced delta synchronization in both the right auditory cortex and the left inferior frontal gyrus; and (3) an impaired feedforward functional coupling between neural oscillations in the right auditory cortex and the left inferior frontal regions. This shows that during speech listening, individuals with developmental dyslexia present reduced neural synchrony to low-frequency speech oscillations in primary auditory regions that hinders higher-order speech processing steps. The present findings, thus, strengthen proposals assuming that improper low-frequency acoustic entrainment affects speech sampling. This low speech-brain synchronization has the strong potential to cause severe consequences for both phonological and reading skills. Interestingly, the reduced speech-brain synchronization in dyslexic readers compared to normal readers (and its higher-order consequences across the speech processing network) appears preserved through the development from childhood to adulthood. Thus, the evaluation of speech-brain synchronization could possibly serve as a diagnostic tool for early detection of children at risk of dyslexia. Hum Brain Mapp 37:2767-2783, 2016. © 2016 Wiley Periodicals, Inc. © 2016 Wiley Periodicals, Inc.
Ultrasound applicability in Speech Language Pathology and Audiology.

Science.gov (United States)

Barberena, Luciana da Silva; Brasil, Brunah de Castro; Melo, Roberta Michelon; Mezzomo, Carolina Lisbôa; Mota, Helena Bolli; Keske-Soares, Márcia

2014-01-01

To present recent studies that used the ultrasound in the fields of Speech Language Pathology and Audiology, which evidence possibilities of the applicability of this technique in different subareas. A bibliographic research was carried out in the PubMed database, using the keywords "ultrasonic," "speech," "phonetics," "Speech, Language and Hearing Sciences," "voice," "deglutition," and "myofunctional therapy," comprising some areas of Speech Language Pathology and Audiology Sciences. The keywords "ultrasound," "ultrasonography," "swallow," "orofacial myofunctional therapy," and "orofacial myology" were also used in the search. Studies in humans from the past 5 years were selected. In the preselection, duplicated studies, articles not fully available, and those that did not present direct relation between ultrasound and Speech Language Pathology and Audiology Sciences were discarded. The data were analyzed descriptively and classified subareas of Speech Language Pathology and Audiology Sciences. The following items were considered: purposes, participants, procedures, and results. We selected 12 articles for ultrasound versus speech/phonetics subarea, 5 for ultrasound versus voice, 1 for ultrasound versus muscles of mastication, and 10 for ultrasound versus swallow. Studies relating "ultrasound" and "Speech Language Pathology and Audiology Sciences" in the past 5 years were not found. Different studies on the use of ultrasound in Speech Language Pathology and Audiology Sciences were found. Each of them, according to its purpose, confirms new possibilities of the use of this instrument in the several subareas, aiming at a more accurate diagnosis and new evaluative and therapeutic possibilities.
Speech rate in Parkinson's disease: A controlled study.

Science.gov (United States)

Martínez-Sánchez, F; Meilán, J J G; Carro, J; Gómez Íñiguez, C; Millian-Morell, L; Pujante Valverde, I M; López-Alburquerque, T; López, D E

2016-09-01

Speech disturbances will affect most patients with Parkinson's disease (PD) over the course of the disease. The origin and severity of these symptoms are of clinical and diagnostic interest. To evaluate the clinical pattern of speech impairment in PD patients and identify significant differences in speech rate and articulation compared to control subjects. Speech rate and articulation in a reading task were measured using an automatic analytical method. A total of 39 PD patients in the 'on' state and 45 age-and sex-matched asymptomatic controls participated in the study. None of the patients experienced dyskinesias or motor fluctuations during the test. The patients with PD displayed a significant reduction in speech and articulation rates; there were no significant correlations between the studied speech parameters and patient characteristics such as L-dopa dose, duration of the disorder, age, and UPDRS III scores and Hoehn & Yahr scales. Patients with PD show a characteristic pattern of declining speech rate. These results suggest that in PD, disfluencies are the result of the movement disorder affecting the physiology of speech production systems. Copyright © 2014 Sociedad Española de Neurología. Publicado por Elsevier España, S.L.U. All rights reserved.
Unvoiced Speech Recognition Using Tissue-Conductive Acoustic Sensor

Directory of Open Access Journals (Sweden)

Hiroshi Saruwatari

2007-01-01

Full Text Available We present the use of stethoscope and silicon NAM (nonaudible murmur microphones in automatic speech recognition. NAM microphones are special acoustic sensors, which are attached behind the talker's ear and can capture not only normal (audible speech, but also very quietly uttered speech (nonaudible murmur. As a result, NAM microphones can be applied in automatic speech recognition systems when privacy is desired in human-machine communication. Moreover, NAM microphones show robustness against noise and they might be used in special systems (speech recognition, speech transform, etc. for sound-impaired people. Using adaptation techniques and a small amount of training data, we achieved for a 20 k dictation task a 93.9% word accuracy for nonaudible murmur recognition in a clean environment. In this paper, we also investigate nonaudible murmur recognition in noisy environments and the effect of the Lombard reflex on nonaudible murmur recognition. We also propose three methods to integrate audible speech and nonaudible murmur recognition using a stethoscope NAM microphone with very promising results.
Attention mechanisms and the mosaic evolution of speech

Directory of Open Access Journals (Sweden)

Pedro Tiago Martins

2014-12-01

Full Text Available There is still no categorical answer for why humans, and no other species, have speech, or why speech is the way it is. Several purely anatomical arguments have been put forward, but they have been shown to be false, biologically implausible, or of limited scope. This perspective paper supports the idea that evolutionary theories of speech could benefit from a focus on the cognitive mechanisms that make speech possible, for which antecedents in evolutionary history and brain correlates can be found. This type of approach is part of a very recent, but rapidly growing tradition, which has provided crucial insights on the nature of human speech by focusing on the biological bases of vocal learning. Here, we call attention to what might be an important ingredient for speech. We contend that a general mechanism of attention, which manifests itself not only in visual but also auditory (and possibly other modalities, might be one of the key pieces of human speech, in addition to the mechanisms underlying vocal learning, and the pairing of facial gestures with vocalic units.
Speech recognition systems on the Cell Broadband Engine

Energy Technology Data Exchange (ETDEWEB)

Liu, Y; Jones, H; Vaidya, S; Perrone, M; Tydlitat, B; Nanda, A

2007-04-20

In this paper we describe our design, implementation, and first results of a prototype connected-phoneme-based speech recognition system on the Cell Broadband Engine{trademark} (Cell/B.E.). Automatic speech recognition decodes speech samples into plain text (other representations are possible) and must process samples at real-time rates. Fortunately, the computational tasks involved in this pipeline are highly data-parallel and can receive significant hardware acceleration from vector-streaming architectures such as the Cell/B.E. Identifying and exploiting these parallelism opportunities is challenging, but also critical to improving system performance. We observed, from our initial performance timings, that a single Cell/B.E. processor can recognize speech from thousands of simultaneous voice channels in real time--a channel density that is orders-of-magnitude greater than the capacity of existing software speech recognizers based on CPUs (central processing units). This result emphasizes the potential for Cell/B.E.-based speech recognition and will likely lead to the future development of production speech systems using Cell/B.E. clusters.
Charisma in business speeches

DEFF Research Database (Denmark)

Niebuhr, Oliver; Brem, Alexander; Novák-Tót, Eszter

2016-01-01

to business speeches. Consistent with the public opinion, our findings are indicative of Steve Jobs being a more charismatic speaker than Mark Zuckerberg. Beyond previous studies, our data suggest that rhythm and emphatic accentuation are also involved in conveying charisma. Furthermore, the differences...... between Steve Jobs and Mark Zuckerberg and the investor- and customer-related sections of their speeches support the modern understanding of charisma as a gradual, multiparametric, and context-sensitive concept....
Longitudinal Study of Speech Perception, Speech, and Language for Children with Hearing Loss in an Auditory-Verbal Therapy Program

Science.gov (United States)

Dornan, Dimity; Hickson, Louise; Murdoch, Bruce; Houston, Todd

2009-01-01

This study examined the speech perception, speech, and language developmental progress of 25 children with hearing loss (mean Pure-Tone Average [PTA] 79.37 dB HL) in an auditory verbal therapy program. Children were tested initially and then 21 months later on a battery of assessments. The speech and language results over time were compared with…
Measures to Evaluate the Effects of DBS on Speech Production

Science.gov (United States)

Weismer, Gary; Yunusova, Yana; Bunton, Kate

2011-01-01

The purpose of this paper is to review and evaluate measures of speech production that could be used to document effects of Deep Brain Stimulation (DBS) on speech performance, especially in persons with Parkinson disease (PD). A small set of evaluative criteria for these measures is presented first, followed by consideration of several speech physiology and speech acoustic measures that have been studied frequently and reported on in the literature on normal speech production, and speech production affected by neuromotor disorders (dysarthria). Each measure is reviewed and evaluated against the evaluative criteria. Embedded within this review and evaluation is a presentation of new data relating speech motions to speech intelligibility measures in speakers with PD, amyotrophic lateral sclerosis (ALS), and control speakers (CS). These data are used to support the conclusion that at the present time the slope of second formant transitions (F2 slope), an acoustic measure, is well suited to make inferences to speech motion and to predict speech intelligibility. The use of other measures should not be ruled out, however, and we encourage further development of evaluative criteria for speech measures designed to probe the effects of DBS or any treatment with potential effects on speech production and communication skills. PMID:24932066
Speech Motor Programming in Apraxia of Speech: Evidence from a Delayed Picture-Word Interference Task

Science.gov (United States)

Mailend, Marja-Liisa; Maas, Edwin

2013-01-01

Purpose: Apraxia of speech (AOS) is considered a speech motor programming impairment, but the specific nature of the impairment remains a matter of debate. This study investigated 2 hypotheses about the underlying impairment in AOS framed within the Directions Into Velocities of Articulators (DIVA; Guenther, Ghosh, & Tourville, 2006) model: The…
Autonomic and Emotional Responses of Graduate Student Clinicians in Speech-Language Pathology to Stuttered Speech

Science.gov (United States)

Guntupalli, Vijaya K.; Nanjundeswaran, Chayadevie; Dayalu, Vikram N.; Kalinowski, Joseph

2012-01-01

Background: Fluent speakers and people who stutter manifest alterations in autonomic and emotional responses as they view stuttered relative to fluent speech samples. These reactions are indicative of an aroused autonomic state and are hypothesized to be triggered by the abrupt breakdown in fluency exemplified in stuttered speech. Furthermore,…

Neurophysiological Influence of Musical Training on Speech Perception

OpenAIRE

Shahin, Antoine J.

2011-01-01

Does musical training affect our perception of speech? For example, does learning to play a musical instrument modify the neural circuitry for auditory processing in a way that improves one’s ability to perceive speech more clearly in noisy environments? If so, can speech perception in individuals with hearing loss, who struggle in noisy situations, benefit from musical training? While music and speech exhibit some specialization in neural processing, there is evidence suggesting that skill...
The development and validation of the speech quality instrument.

Science.gov (United States)

Chen, Stephanie Y; Griffin, Brianna M; Mancuso, Dean; Shiau, Stephanie; DiMattia, Michelle; Cellum, Ilana; Harvey Boyd, Kelly; Prevoteau, Charlotte; Kohlberg, Gavriel D; Spitzer, Jaclyn B; Lalwani, Anil K

2017-12-08

Although speech perception tests are available to evaluate hearing, there is no standardized validated tool to quantify speech quality. The objective of this study is to develop a validated tool to measure quality of speech heard. Prospective instrument validation study of 35 normal hearing adults recruited at a tertiary referral center. Participants listened to 44 speech clips of male/female voices reciting the Rainbow Passage. Speech clips included original and manipulated excerpts capturing goal qualities such as mechanical and garbled. Listeners rated clips on a 10-point visual analog scale (VAS) of 18 characteristics (e.g. cartoonish, garbled). Skewed distribution analysis identified mean ratings in the upper and lower 2-point limits of the VAS (ratings of 8-10, 0-2, respectively); items with inconsistent responses were eliminated. The test was pruned to a final instrument of nine speech clips that clearly define qualities of interest: speech-like, male/female, cartoonish, echo-y, garbled, tinny, mechanical, rough, breathy, soothing, hoarse, like, pleasant, natural. Mean ratings were highest for original female clips (8.8) and lowest for not-speech manipulation (2.1). Factor analysis identified two subsets of characteristics: internal consistency demonstrated Cronbach's alpha of 0.95 and 0.82 per subset. Test-retest reliability of total scores was high, with an intraclass correlation coefficient of 0.76. The Speech Quality Instrument (SQI) is a concise, valid tool for assessing speech quality as an indicator for hearing performance. SQI may be a valuable outcome measure for cochlear implant recipients who, despite achieving excellent speech perception, often experience poor speech quality. 2b. Laryngoscope, 2017. © 2017 The American Laryngological, Rhinological and Otological Society, Inc.
Improved Methods for Pitch Synchronous Linear Prediction Analysis of Speech

OpenAIRE

劉, 麗清

2015-01-01

Linear prediction (LP) analysis has been applied to speech system over the last few decades. LP technique is well-suited for speech analysis due to its ability to model speech production process approximately. Hence LP analysis has been widely used for speech enhancement, low-bit-rate speech coding in cellular telephony, speech recognition, characteristic parameter extraction (vocal tract resonances frequencies, fundamental frequency called pitch) and so on. However, the performance of the co...
Internet images of the speech pathology profession.

Science.gov (United States)

Byrne, Nicole

2017-06-05

Objective The Internet provides the general public with information about speech pathology services, including client groups and service delivery models, as well as the professionals providing the services. Although this information assists the general public and other professionals to both access and understand speech pathology services, it also potentially provides information about speech pathology as a prospective career, including the types of people who are speech pathologists (i.e. demographics). The aim of the present study was to collect baseline data on how the speech pathology profession was presented via images on the Internet. Methods A pilot prospective observational study using content analysis methodology was conducted to analyse publicly available Internet images related to the speech pathology profession. The terms 'Speech Pathology' and 'speech pathologist' to represent both the profession and the professional were used, resulting in the identification of 200 images. These images were considered across a range of areas, including who was in the image (e.g. professional, client, significant other), the technology used and the types of intervention. Results The majority of images showed both a client and a professional (i.e. speech pathologist). While the professional was predominantly presented as female, the gender of the client was more evenly distributed. The clients were more likely to be preschool or school aged, however male speech pathologists were presented as providing therapy to selected age groups (i.e. school aged and younger adults). Images were predominantly of individual therapy and the few group images that were presented were all paediatric. Conclusion Current images of speech pathology continue to portray narrow professional demographics and client groups (e.g. paediatrics). Promoting images of wider scope to fully represent the depth and breadth of speech pathology professional practice may assist in attracting a more diverse
The relationship between the intelligibility of time-compressed speech and speech in noise in young and elderly listeners

Science.gov (United States)

Versfeld, Niek J.; Dreschler, Wouter A.

2002-01-01

A conventional measure to determine the ability to understand speech in noisy backgrounds is the so-called speech reception threshold (SRT) for sentences. It yields the signal-to-noise ratio (in dB) for which half of the sentences are correctly perceived. The SRT defines to what degree speech must be audible to a listener in order to become just intelligible. There are indications that elderly listeners have greater difficulty in understanding speech in adverse listening conditions than young listeners. This may be partly due to the differences in hearing sensitivity (presbycusis), hence audibility, but other factors, such as temporal acuity, may also play a significant role. A potential measure for the temporal acuity may be the threshold to which speech can be accelerated, or compressed in time. A new test is introduced where the speech rate is varied adaptively. In analogy to the SRT, the time-compression threshold (or TCT) then is defined as the speech rate (expressed in syllables per second) for which half of the sentences are correctly perceived. In experiment I, the TCT test is introduced and normative data are provided. In experiment II, four groups of subjects (young and elderly normal-hearing and hearing-impaired subjects) participated, and the SRT's in stationary and fluctuating speech-shaped noise were determined, as well as the TCT. The results show that the SRT in fluctuating noise and the TCT are highly correlated. All tests indicate that, even after correction for the hearing loss, elderly normal-hearing subjects perform worse than young normal-hearing subjects. The results indicate that the use of the TCT test or the SRT test in fluctuating noise is preferred over the SRT test in stationary noise.
Individual differences in speech-in-noise perception parallel neural speech processing and attention in preschoolers

Science.gov (United States)

Thompson, Elaine C.; Carr, Kali Woodruff; White-Schwoch, Travis; Otto-Meyer, Sebastian; Kraus, Nina

2016-01-01

From bustling classrooms to unruly lunchrooms, school settings are noisy. To learn effectively in the unwelcome company of numerous distractions, children must clearly perceive speech in noise. In older children and adults, speech-in-noise perception is supported by sensory and cognitive processes, but the correlates underlying this critical listening skill in young children (3–5 year olds) remain undetermined. Employing a longitudinal design (two evaluations separated by ~12 months), we followed a cohort of 59 preschoolers, ages 3.0–4.9, assessing word-in-noise perception, cognitive abilities (intelligence, short-term memory, attention), and neural responses to speech. Results reveal changes in word-in-noise perception parallel changes in processing of the fundamental frequency (F0), an acoustic cue known for playing a role central to speaker identification and auditory scene analysis. Four unique developmental trajectories (speech-in-noise perception groups) confirm this relationship, in that improvements and declines in word-in-noise perception couple with enhancements and diminishments of F0 encoding, respectively. Improvements in word-in-noise perception also pair with gains in attention. Word-in-noise perception does not relate to strength of neural harmonic representation or short-term memory. These findings reinforce previously-reported roles of F0 and attention in hearing speech in noise in older children and adults, and extend this relationship to preschool children. PMID:27864051
The Interpersonal Metafunction Analysis of Barack Obama's Victory Speech

Science.gov (United States)

Ye, Ruijuan

2010-01-01

This paper carries on a tentative interpersonal metafunction analysis of Barack Obama's victory speech from the interpersonal metafunction, which aims to help readers understand and evaluate the speech regarding its suitability, thus to provide some guidance for readers to make better speeches. This study has promising implications for speeches as…
Audiovisual Cues and Perceptual Learning of Spectrally Distorted Speech

Science.gov (United States)

Pilling, Michael; Thomas, Sharon

2011-01-01

Two experiments investigate the effectiveness of audiovisual (AV) speech cues (cues derived from both seeing and hearing a talker speak) in facilitating perceptual learning of spectrally distorted speech. Speech was distorted through an eight channel noise-vocoder which shifted the spectral envelope of the speech signal to simulate the properties…
THE ONTOGENESIS OF SPEECH DEVELOPMENT

Directory of Open Access Journals (Sweden)

T. E. Braudo

2017-01-01

Full Text Available The purpose of this article is to acquaint the specialists, working with children having developmental disorders, with age-related norms for speech development. Many well-known linguists and psychologists studied speech ontogenesis (logogenesis. Speech is a higher mental function, which integrates many functional systems. Speech development in infants during the first months after birth is ensured by the innate hearing and emerging ability to fix the gaze on the face of an adult. Innate emotional reactions are also being developed during this period, turning into nonverbal forms of communication. At about 6 months a baby starts to pronounce some syllables; at 7–9 months – repeats various sounds combinations, pronounced by adults. At 10–11 months a baby begins to react on the words, referred to him/her. The first words usually appear at an age of 1 year; this is the start of the stage of active speech development. At this time it is acceptable, if a child confuses or rearranges sounds, distorts or misses them. By the age of 1.5 years a child begins to understand abstract explanations of adults. Significant vocabulary enlargement occurs between 2 and 3 years; grammatical structures of the language are being formed during this period (a child starts to use phrases and sentences. Preschool age (3–7 y. o. is characterized by incorrect, but steadily improving pronunciation of sounds and phonemic perception. The vocabulary increases; abstract speech and retelling are being formed. Children over 7 y. o. continue to improve grammar, writing and reading skills. The described stages may not have strict age boundaries, as soon as they are dependent not only on environment, but also on the child’s mental constitution, heredity and character.
Freedom of Speech: The M Word

OpenAIRE

Murphy, Timothy; Olsen, Kristina; Andersen, Christopher; Reichhardt, Line

2015-01-01

The first objective of the project is to show how freedom of speech and democracy are dependent on one another in Denmark. The project’s next focal point is to look at how freedom of speech was framed in relation to the Mohammed publications in 2005. To do this, it identifies how freedom of speech was used by many Danish and European newspapers to justify the publications. Arguments against the publications by both the Danish media and the Muslim community (within Denmark and abroad) are also...
Engaged listeners: shared neural processing of powerful political speeches.

Science.gov (United States)

Schmälzle, Ralf; Häcker, Frank E K; Honey, Christopher J; Hasson, Uri

2015-08-01

Powerful speeches can captivate audiences, whereas weaker speeches fail to engage their listeners. What is happening in the brains of a captivated audience? Here, we assess audience-wide functional brain dynamics during listening to speeches of varying rhetorical quality. The speeches were given by German politicians and evaluated as rhetorically powerful or weak. Listening to each of the speeches induced similar neural response time courses, as measured by inter-subject correlation analysis, in widespread brain regions involved in spoken language processing. Crucially, alignment of the time course across listeners was stronger for rhetorically powerful speeches, especially for bilateral regions of the superior temporal gyri and medial prefrontal cortex. Thus, during powerful speeches, listeners as a group are more coupled to each other, suggesting that powerful speeches are more potent in taking control of the listeners' brain responses. Weaker speeches were processed more heterogeneously, although they still prompted substantially correlated responses. These patterns of coupled neural responses bear resemblance to metaphors of resonance, which are often invoked in discussions of speech impact, and contribute to the literature on auditory attention under natural circumstances. Overall, this approach opens up possibilities for research on the neural mechanisms mediating the reception of entertaining or persuasive messages. © The Author (2015). Published by Oxford University Press. For Permissions, please email: journals.permissions@oup.com.
Speech rhythms and multiplexed oscillatory sensory coding in the human brain.

Directory of Open Access Journals (Sweden)

Joachim Gross

2013-12-01

Full Text Available Cortical oscillations are likely candidates for segmentation and coding of continuous speech. Here, we monitored continuous speech processing with magnetoencephalography (MEG to unravel the principles of speech segmentation and coding. We demonstrate that speech entrains the phase of low-frequency (delta, theta and the amplitude of high-frequency (gamma oscillations in the auditory cortex. Phase entrainment is stronger in the right and amplitude entrainment is stronger in the left auditory cortex. Furthermore, edges in the speech envelope phase reset auditory cortex oscillations thereby enhancing their entrainment to speech. This mechanism adapts to the changing physical features of the speech envelope and enables efficient, stimulus-specific speech sampling. Finally, we show that within the auditory cortex, coupling between delta, theta, and gamma oscillations increases following speech edges. Importantly, all couplings (i.e., brain-speech and also within the cortex attenuate for backward-presented speech, suggesting top-down control. We conclude that segmentation and coding of speech relies on a nested hierarchy of entrained cortical oscillations.
Speech Rhythms and Multiplexed Oscillatory Sensory Coding in the Human Brain

Science.gov (United States)

Gross, Joachim; Hoogenboom, Nienke; Thut, Gregor; Schyns, Philippe; Panzeri, Stefano; Belin, Pascal; Garrod, Simon

2013-01-01

Cortical oscillations are likely candidates for segmentation and coding of continuous speech. Here, we monitored continuous speech processing with magnetoencephalography (MEG) to unravel the principles of speech segmentation and coding. We demonstrate that speech entrains the phase of low-frequency (delta, theta) and the amplitude of high-frequency (gamma) oscillations in the auditory cortex. Phase entrainment is stronger in the right and amplitude entrainment is stronger in the left auditory cortex. Furthermore, edges in the speech envelope phase reset auditory cortex oscillations thereby enhancing their entrainment to speech. This mechanism adapts to the changing physical features of the speech envelope and enables efficient, stimulus-specific speech sampling. Finally, we show that within the auditory cortex, coupling between delta, theta, and gamma oscillations increases following speech edges. Importantly, all couplings (i.e., brain-speech and also within the cortex) attenuate for backward-presented speech, suggesting top-down control. We conclude that segmentation and coding of speech relies on a nested hierarchy of entrained cortical oscillations. PMID:24391472
Voice-associated static face image releases speech from informational masking.

Science.gov (United States)

Gao, Yayue; Cao, Shuyang; Qu, Tianshu; Wu, Xihong; Li, Haifeng; Zhang, Jinsheng; Li, Liang

2014-06-01

In noisy, multipeople talking environments such as a cocktail party, listeners can use various perceptual and/or cognitive cues to improve recognition of target speech against masking, particularly informational masking. Previous studies have shown that temporally prepresented voice cues (voice primes) improve recognition of target speech against speech masking but not noise masking. This study investigated whether static face image primes that have become target-voice associated (i.e., facial images linked through associative learning with voices reciting the target speech) can be used by listeners to unmask speech. The results showed that in 32 normal-hearing younger adults, temporally prepresenting a voice-priming sentence with the same voice reciting the target sentence significantly improved the recognition of target speech that was masked by irrelevant two-talker speech. When a person's face photograph image became associated with the voice reciting the target speech by learning, temporally prepresenting the target-voice-associated face image significantly improved recognition of target speech against speech masking, particularly for the last two keywords in the target sentence. Moreover, speech-recognition performance under the voice-priming condition was significantly correlated to that under the face-priming condition. The results suggest that learned facial information on talker identity plays an important role in identifying the target-talker's voice and facilitating selective attention to the target-speech stream against the masking-speech stream. © 2014 The Institute of Psychology, Chinese Academy of Sciences and Wiley Publishing Asia Pty Ltd.
Speech Recognition for the iCub Platform

Directory of Open Access Journals (Sweden)

Bertrand Higy

2018-02-01

Full Text Available This paper describes open source software (available at https://github.com/robotology/natural-speech to build automatic speech recognition (ASR systems and run them within the YARP platform. The toolkit is designed (i to allow non-ASR experts to easily create their own ASR system and run it on iCub and (ii to build deep learning-based models specifically addressing the main challenges an ASR system faces in the context of verbal human–iCub interactions. The toolkit mostly consists of Python, C++ code and shell scripts integrated in YARP. As additional contribution, a second codebase (written in Matlab is provided for more expert ASR users who want to experiment with bio-inspired and developmental learning-inspired ASR systems. Specifically, we provide code for two distinct kinds of speech recognition: “articulatory” and “unsupervised” speech recognition. The first is largely inspired by influential neurobiological theories of speech perception which assume speech perception to be mediated by brain motor cortex activities. Our articulatory systems have been shown to outperform strong deep learning-based baselines. The second type of recognition systems, the “unsupervised” systems, do not use any supervised information (contrary to most ASR systems, including our articulatory systems. To some extent, they mimic an infant who has to discover the basic speech units of a language by herself. In addition, we provide resources consisting of pre-trained deep learning models for ASR, and a 2.5-h speech dataset of spoken commands, the VoCub dataset, which can be used to adapt an ASR system to the typical acoustic environments in which iCub operates.
Visual and Auditory Input in Second-Language Speech Processing

Science.gov (United States)

Hardison, Debra M.

2010-01-01

The majority of studies in second-language (L2) speech processing have involved unimodal (i.e., auditory) input; however, in many instances, speech communication involves both visual and auditory sources of information. Some researchers have argued that multimodal speech is the primary mode of speech perception (e.g., Rosenblum 2005). Research on…
Communication Supports for People with Motor Speech Disorders

Science.gov (United States)

Hanson, Elizabeth K.; Fager, Susan K.

2017-01-01

Communication supports for people with motor speech disorders can include strategies and technologies to supplement natural speech efforts, resolve communication breakdowns, and replace natural speech when necessary to enhance participation in all communicative contexts. This article emphasizes communication supports that can enhance…
Multi-thread Parallel Speech Recognition for Mobile Applications

Directory of Open Access Journals (Sweden)

LOJKA Martin

2014-05-01

Full Text Available In this paper, the server based solution of the multi-thread large vocabulary automatic speech recognition engine is described along with the Android OS and HTML5 practical application examples. The basic idea was to bring speech recognition available for full variety of applications for computers and especially for mobile devices. The speech recognition engine should be independent of commercial products and services (where the dictionary could not be modified. Using of third-party services could be also a security and privacy problem in specific applications, when the unsecured audio data could not be sent to uncontrolled environments (voice data transferred to servers around the globe. Using our experience with speech recognition applications, we have been able to construct a multi-thread speech recognition serverbased solution designed for simple applications interface (API to speech recognition engine modified to specific needs of particular application.
Relative Contributions of the Dorsal vs. Ventral Speech Streams to Speech Perception are Context Dependent: a lesion study

Directory of Open Access Journals (Sweden)

Corianne Rogalsky

2014-04-01

Full Text Available The neural basis of speech perception has been debated for over a century. While it is generally agreed that the superior temporal lobes are critical for the perceptual analysis of speech, a major current topic is whether the motor system contributes to speech perception, with several conflicting findings attested. In a dorsal-ventral speech stream framework (Hickok & Poeppel 2007, this debate is essentially about the roles of the dorsal versus ventral speech processing streams. A major roadblock in characterizing the neuroanatomy of speech perception is task-specific effects. For example, much of the evidence for dorsal stream involvement comes from syllable discrimination type tasks, which have been found to behaviorally doubly dissociate from auditory comprehension tasks (Baker et al. 1981. Discrimination task deficits could be a result of difficulty perceiving the sounds themselves, which is the typical assumption, or it could be a result of failures in temporary maintenance of the sensory traces, or the comparison and/or the decision process. Similar complications arise in perceiving sentences: the extent of inferior frontal (i.e. dorsal stream activation during listening to sentences increases as a function of increased task demands (Love et al. 2006. Another complication is the stimulus: much evidence for dorsal stream involvement uses speech samples lacking semantic context (CVs, non-words. The present study addresses these issues in a large-scale lesion-symptom mapping study. 158 patients with focal cerebral lesions from the Mutli-site Aphasia Research Consortium underwent a structural MRI or CT scan, as well as an extensive psycholinguistic battery. Voxel-based lesion symptom mapping was used to compare the neuroanatomy involved in the following speech perception tasks with varying phonological, semantic, and task loads: (i two discrimination tasks of syllables (non-words and words, respectively, (ii two auditory comprehension tasks
Speech, language and swallowing in Huntington’ Disease

Directory of Open Access Journals (Sweden)

Maryluz Camargo-Mendoza

2017-04-01

Full Text Available Huntington’s disease (HD has been described as a genetic condition caused by a mutation in the CAG (cytosine-adenine-guanine nucleotide sequence. Depending on the stage of the disease, people may have difficulties in speech, language and swallowing. The purpose of this paper is to describe these difficulties in detail, as well as to provide an account on speech and language therapy approach to this condition. Regarding speech, it is worth noticing that characteristics typical of hyperkinetic dysarthria can be found due to underlying choreic movements. The speech of people with HD tends to show shorter sentences, with much simpler syntactic structures, and difficulties in tasks that require complex cognitive processing. Moreover, swallowing may present dysphagia that progresses as the disease develops. A timely, comprehensive and effective speech-language intervention is essential to improve the quality of life of people and contribute to their communicative welfare.

Sparsity in Linear Predictive Coding of Speech

DEFF Research Database (Denmark)

Giacobello, Daniele

of the effectiveness of their application in audio processing. The second part of the thesis deals with introducing sparsity directly in the linear prediction analysis-by-synthesis (LPAS) speech coding paradigm. We first propose a novel near-optimal method to look for a sparse approximate excitation using a compressed...... one with direct applications to coding but also consistent with the speech production model of voiced speech, where the excitation of the all-pole filter can be modeled as an impulse train, i.e., a sparse sequence. Introducing sparsity in the LP framework will also bring to de- velop the concept...... sensing formulation. Furthermore, we define a novel re-estimation procedure to adapt the predictor coefficients to the given sparse excitation, balancing the two representations in the context of speech coding. Finally, the advantages of the compact parametric representation of a segment of speech, given...
Dysarthric Bengali speech: A neurolinguistic study

Directory of Open Access Journals (Sweden)

Chakraborty N

2008-01-01

Full Text Available Background and Aims: Dysarthria affects linguistic domains such as respiration, phonation, articulation, resonance and prosody due to upper motor neuron, lower motor neuron, cerebellar or extrapyramidal tract lesions. Although Bengali is one of the major languages globally, dysarthric Bengali speech has not been subjected to neurolinguistic analysis. We attempted such an analysis with the goal of identifying the speech defects in native Bengali speakers in various types of dysarthria encountered in neurological disorders. Settings and Design: A cross-sectional observational study was conducted with 66 dysarthric subjects, predominantly middle-aged males, attending the Neuromedicine OPD of a tertiary care teaching hospital in Kolkata. Materials and Methods: After neurological examination, an instrument comprising commonly used Bengali words and a text block covering all Bengali vowels and consonants were used to carry out perceptual analysis of dysarthric speech. From recorded speech, 24 parameters pertaining to five linguistic domains were assessed. The Kruskal-Wallis analysis of variance, Chi-square test and Fisher′s exact test were used for analysis. Results: The dysarthria types were spastic (15 subjects, flaccid (10, mixed (12, hypokinetic (12, hyperkinetic (9 and ataxic (8. Of the 24 parameters assessed, 15 were found to occur in one or more types with a prevalence of at least 25%. Imprecise consonant was the most frequently occurring defect in most dysarthrias. The spectrum of defects in each type was identified. Some parameters were capable of distinguishing between types. Conclusions: This perceptual analysis has defined linguistic defects likely to be encountered in dysarthric Bengali speech in neurological disorders. The speech distortion can be described and distinguished by a limited number of parameters. This may be of importance to the speech therapist and neurologist in planning rehabilitation and further management.
Acquisition of speech rhythm in first language.

Science.gov (United States)

Polyanskaya, Leona; Ordin, Mikhail

2015-09-01

Analysis of English rhythm in speech produced by children and adults revealed that speech rhythm becomes increasingly more stress-timed as language acquisition progresses. Children reach the adult-like target by 11 to 12 years. The employed speech elicitation paradigm ensured that the sentences produced by adults and children at different ages were comparable in terms of lexical content, segmental composition, and phonotactic complexity. Detected differences between child and adult rhythm and between rhythm in child speech at various ages cannot be attributed to acquisition of phonotactic language features or vocabulary, and indicate the development of language-specific phonetic timing in the course of acquisition.
Talker Variability in Audiovisual Speech Perception

Directory of Open Access Journals (Sweden)

Shannon eHeald

2014-07-01

Full Text Available A change in talker is a change in the context for the phonetic interpretation of acoustic patterns of speech. Different talkers have different mappings between acoustic patterns and phonetic categories and listeners need to adapt to these differences. Despite this complexity, listeners are adept at comprehending speech in multiple-talker contexts, albeit at a slight but measurable performance cost (e.g., slower recognition. So far, this talker-variability cost has been demonstrated only in audio-only speech. Other research in single-talker contexts have shown, however, that when listeners are able to see a talker’s face, speech recognition is improved under adverse listening (e.g., noise or distortion conditions that can increase uncertainty in the mapping between acoustic patterns and phonetic categories. Does seeing a talker's face reduce the cost of word recognition in multiple-talker contexts? We used a speeded word-monitoring task in which listeners make quick judgments about target-word recognition in single- and multiple-talker contexts. Results show faster recognition performance in single-talker conditions compared to multiple-talker conditions for both audio-only and audio-visual speech. However, recognition time in a multiple-talker context was slower in the audio-visual condition compared to audio-only condition. These results suggest that seeing a talker’s face during speech perception may slow recognition by increasing the importance of talker identification, signaling to the listener a change in talker has occurred.
CAR2 - Czech Database of Car Speech

Directory of Open Access Journals (Sweden)

P. Sovka

1999-12-01

Full Text Available This paper presents new Czech language two-channel (stereo speech database recorded in car environment. The created database was designed for experiments with speech enhancement for communication purposes and for the study and the design of a robust speech recognition systems. Tools for automated phoneme labelling based on Baum-Welch re-estimation were realised. The noise analysis of the car background environment was done.
CAR2 - Czech Database of Car Speech

OpenAIRE

Pollak, P.; Vopicka, J.; Hanzl, V.; Sovka, Pavel

1999-01-01

This paper presents new Czech language two-channel (stereo) speech database recorded in car environment. The created database was designed for experiments with speech enhancement for communication purposes and for the study and the design of a robust speech recognition systems. Tools for automated phoneme labelling based on Baum-Welch re-estimation were realised. The noise analysis of the car background environment was done.
Corollary discharge provides the sensory content of inner speech.

Science.gov (United States)

Scott, Mark

2013-09-01

Inner speech is one of the most common, but least investigated, mental activities humans perform. It is an internal copy of one's external voice and so is similar to a well-established component of motor control: corollary discharge. Corollary discharge is a prediction of the sound of one's voice generated by the motor system. This prediction is normally used to filter self-caused sounds from perception, which segregates them from externally caused sounds and prevents the sensory confusion that would otherwise result. The similarity between inner speech and corollary discharge motivates the theory, tested here, that corollary discharge provides the sensory content of inner speech. The results reported here show that inner speech attenuates the impact of external sounds. This attenuation was measured using a context effect (an influence of contextual speech sounds on the perception of subsequent speech sounds), which weakens in the presence of speech imagery that matches the context sound. Results from a control experiment demonstrated this weakening in external speech as well. Such sensory attenuation is a hallmark of corollary discharge.
Stability and Composition of Functional Synergies for Speech Movements in Children with Developmental Speech Disorders

Science.gov (United States)

Terband, H.; Maassen, B.; van Lieshout, P.; Nijland, L.

2011-01-01

The aim of this study was to investigate the consistency and composition of functional synergies for speech movements in children with developmental speech disorders. Kinematic data were collected on the reiterated productions of syllables spa(/spa[image omitted]/) and paas(/pa[image omitted]s/) by 10 6- to 9-year-olds with developmental speech…
The role of across-frequency envelope processing for speech intelligibility

DEFF Research Database (Denmark)

Chabot-Leclerc, Alexandre; Jørgensen, Søren; Dau, Torsten

2013-01-01

Speech intelligibility models consist of a preprocessing part that transforms the stimuli into some internal (auditory) representation, and a decision metric that quantifies effects of transmission channel, speech interferers, and auditory processing on the speech intelligibility. Here, two recent...... speech intelligibility models, the spectro-temporal modulation index [STMI; Elhilali et al. (2003)] and the speech-based envelope power spectrum model [sEPSM; Jørgensen and Dau (2011)] were evaluated in conditions of noisy speech subjected to reverberation, and to nonlinear distortions through either...
The role of across-frequency envelope processing for speech intelligibility

DEFF Research Database (Denmark)

Chabot-Leclerc, Alexandre; Jørgensen, Søren; Dau, Torsten

2013-01-01

Speech intelligibility models consist of a preprocessing part that transforms the stimuli into some internal (auditory) representation, and a decision metric that quantifies effects of transmission channel, speech interferers, and auditory processing on the speech intelligibility. Here, two recent...... speech intelligibility models, the spectro-temporal modulation index (STMI; Elhilali et al., 2003) and the speech-based envelope power spectrum model (sEPSM; Jørgensen and Dau, 2011) were evaluated in conditions of noisy speech subjected to reverberation, and to nonlinear distortions through either...
Start/End Delays of Voiced and Unvoiced Speech Signals

Energy Technology Data Exchange (ETDEWEB)

Herrnstein, A

1999-09-24

Recent experiments using low power EM-radar like sensors (e.g, GEMs) have demonstrated a new method for measuring vocal fold activity and the onset times of voiced speech, as vocal fold contact begins to take place. Similarly the end time of a voiced speech segment can be measured. Secondly it appears that in most normal uses of American English speech, unvoiced-speech segments directly precede or directly follow voiced-speech segments. For many applications, it is useful to know typical duration times of these unvoiced speech segments. A corpus, assembled earlier of spoken ''Timit'' words, phrases, and sentences and recorded using simultaneously measured acoustic and EM-sensor glottal signals, from 16 male speakers, was used for this study. By inspecting the onset (or end) of unvoiced speech, using the acoustic signal, and the onset (or end) of voiced speech using the EM sensor signal, the average duration times for unvoiced segments preceding onset of vocalization were found to be 300ms, and for following segments, 500ms. An unvoiced speech period is then defined in time, first by using the onset of the EM-sensed glottal signal, as the onset-time marker for the voiced speech segment and end marker for the unvoiced segment. Then, by subtracting 300ms from the onset time mark of voicing, the unvoiced speech segment start time is found. Similarly, the times for a following unvoiced speech segment can be found. While data of this nature have proven to be useful for work in our laboratory, a great deal of additional work remains to validate such data for use with general populations of users. These procedures have been useful for applying optimal processing algorithms over time segments of unvoiced, voiced, and non-speech acoustic signals. For example, these data appear to be of use in speaker validation, in vocoding, and in denoising algorithms.
The speech signal segmentation algorithm using pitch synchronous analysis

Directory of Open Access Journals (Sweden)

Amirgaliyev Yedilkhan

2017-03-01

Full Text Available Parameterization of the speech signal using the algorithms of analysis synchronized with the pitch frequency is discussed. Speech parameterization is performed by the average number of zero transitions function and the signal energy function. Parameterization results are used to segment the speech signal and to isolate the segments with stable spectral characteristics. Segmentation results can be used to generate a digital voice pattern of a person or be applied in the automatic speech recognition. Stages needed for continuous speech segmentation are described.
Emotion recognition from speech: tools and challenges

Science.gov (United States)

Al-Talabani, Abdulbasit; Sellahewa, Harin; Jassim, Sabah A.

2015-05-01

Human emotion recognition from speech is studied frequently for its importance in many applications, e.g. human-computer interaction. There is a wide diversity and non-agreement about the basic emotion or emotion-related states on one hand and about where the emotion related information lies in the speech signal on the other side. These diversities motivate our investigations into extracting Meta-features using the PCA approach, or using a non-adaptive random projection RP, which significantly reduce the large dimensional speech feature vectors that may contain a wide range of emotion related information. Subsets of Meta-features are fused to increase the performance of the recognition model that adopts the score-based LDC classifier. We shall demonstrate that our scheme outperform the state of the art results when tested on non-prompted databases or acted databases (i.e. when subjects act specific emotions while uttering a sentence). However, the huge gap between accuracy rates achieved on the different types of datasets of speech raises questions about the way emotions modulate the speech. In particular we shall argue that emotion recognition from speech should not be dealt with as a classification problem. We shall demonstrate the presence of a spectrum of different emotions in the same speech portion especially in the non-prompted data sets, which tends to be more "natural" than the acted datasets where the subjects attempt to suppress all but one emotion.
Free Speech as a Cultural Value in the United States

Directory of Open Access Journals (Sweden)

Mauricio J. Alvarez

2018-02-01

Full Text Available Political orientation influences support for free speech, with liberals often reporting greater support for free speech than conservatives. We hypothesized that this effect should be moderated by cultural context: individualist cultures value individual self-expression and self-determination, and collectivist cultures value group harmony and conformity. These different foci should differently influence liberals and conservatives’ support for free speech within these cultures. Two studies evaluated the joint influence of political orientation and cultural context on support for free speech. Study 1, using a multilevel analysis of data from 37 U.S. states (n = 1,001, showed that conservatives report stronger support for free speech in collectivist states, whereas there were no differences between conservatives and liberals in support for free speech in individualist states. Study 2 (n = 90 confirmed this pattern by priming independent and interdependent self-construals in liberals and conservatives. Results demonstrate the importance of cultural context for free speech. Findings suggest that in the U.S. support for free speech might be embraced for different reasons: conservatives’ support for free speech appears to be motivated by a focus on collectively held values favoring free speech, while liberals’ support for free speech might be motivated by a focus on individualist self-expression.
Hearing speech in music

Directory of Open Access Journals (Sweden)

Seth-Reino Ekström

2011-01-01

Full Text Available The masking effect of a piano composition, played at different speeds and in different octaves, on speech-perception thresholds was investigated in 15 normal-hearing and 14 moderately-hearing-impaired subjects. Running speech (just follow conversation, JFC testing and use of hearing aids increased the everyday validity of the findings. A comparison was made with standard audiometric noises [International Collegium of Rehabilitative Audiology (ICRA noise and speech spectrum-filtered noise (SPN]. All masking sounds, music or noise, were presented at the same equivalent sound level (50 dBA. The results showed a significant effect of piano performance speed and octave (P<.01. Low octave and fast tempo had the largest effect; and high octave and slow tempo, the smallest. Music had a lower masking effect than did ICRA noise with two or six speakers at normal vocal effort (P<.01 and SPN (P<.05. Subjects with hearing loss had higher masked thresholds than the normal-hearing subjects (P<.01, but there were smaller differences between masking conditions (P<.01. It is pointed out that music offers an interesting opportunity for studying masking under realistic conditions, where spectral and temporal features can be varied independently. The results have implications for composing music with vocal parts, designing acoustic environments and creating a balance between speech perception and privacy in social settings.
Enhancement of Non-Stationary Speech using Harmonic Chirp Filters

DEFF Research Database (Denmark)

Nørholm, Sidsel Marie; Jensen, Jesper Rindom; Christensen, Mads Græsbøll

2015-01-01

In this paper, the issue of single channel speech enhancement of non-stationary voiced speech is addressed. The non-stationarity of speech is well known, but state of the art speech enhancement methods assume stationarity within frames of 20–30 ms. We derive optimal distortionless filters that take...... the non-stationarity nature of voiced speech into account via linear constraints. This is facilitated by imposing a harmonic chirp model on the speech signal. As an implicit part of the filter design, the noise statistics are also estimated based on the observed signal and parameters of the harmonic chirp...... model. Simulations on real speech show that the chirp based filters perform better than their harmonic counterparts. Further, it is seen that the gain of using the chirp model increases when the estimated chirp parameter is big corresponding to periods in the signal where the instantaneous fundamental...
Barack Obama’s pauses and gestures in humorous speeches

DEFF Research Database (Denmark)

Navarretta, Costanza

2017-01-01

The main aim of this paper is to investigate speech pauses and gestures as means to engage the audience and present the humorous message in an effective way. The data consist of two speeches by the USA president Barack Obama at the 2011 and 2016 Annual White House Correspondents’ Association Dinner...... produced significantly more hand gestures in 2016 than in 2011. An analysis of the hand gestures produced by Barack Obama in two political speeches held at the United Nations in 2011 and 2016 confirms that the president produced significantly less communicative co-speech hand gestures during his speeches...... and they emphasise the speech segment which they follow or precede. We also found a highly significant correlation between Obama’s speech pauses and audience response. Obama produces numerous head movements, facial expressions and hand gestures and their functions are related to both discourse content and structure...
Variable Span Filters for Speech Enhancement

DEFF Research Database (Denmark)

Jensen, Jesper Rindom; Benesty, Jacob; Christensen, Mads Græsbøll

2016-01-01

In this work, we consider enhancement of multichannel speech recordings. Linear filtering and subspace approaches have been considered previously for solving the problem. The current linear filtering methods, although many variants exist, have limited control of noise reduction and speech...
HMM Adaptation for child speech synthesis

CSIR Research Space (South Africa)

Govender, Avashna

2015-09-01

Full Text Available Hidden Markov Model (HMM)-based synthesis in combination with speaker adaptation has proven to be an approach that is well-suited for child speech synthesis. This paper describes the development and evaluation of different HMM-based child speech...
Statistical representation of sound textures in the impaired auditory system

DEFF Research Database (Denmark)

McWalter, Richard Ian; Dau, Torsten

2015-01-01

Many challenges exist when it comes to understanding and compensating for hearing impairment. Traditional methods, such as pure tone audiometry and speech intelligibility tests, offer insight into the deficiencies of a hearingimpaired listener, but can only partially reveal the mechanisms...... that underlie the hearing loss. An alternative approach is to investigate the statistical representation of sounds for hearing-impaired listeners along the auditory pathway. Using models of the auditory periphery and sound synthesis, we aimed to probe hearing impaired perception for sound textures – temporally...

Speech, stone tool-making and the evolution of language.

Science.gov (United States)

Cataldo, Dana Michelle; Migliano, Andrea Bamberg; Vinicius, Lucio

2018-01-01

The 'technological hypothesis' proposes that gestural language evolved in early hominins to enable the cultural transmission of stone tool-making skills, with speech appearing later in response to the complex lithic industries of more recent hominins. However, no flintknapping study has assessed the efficiency of speech alone (unassisted by gesture) as a tool-making transmission aid. Here we show that subjects instructed by speech alone underperform in stone tool-making experiments in comparison to subjects instructed through either gesture alone or 'full language' (gesture plus speech), and also report lower satisfaction with their received instruction. The results provide evidence that gesture was likely to be selected over speech as a teaching aid in the earliest hominin tool-makers; that speech could not have replaced gesturing as a tool-making teaching aid in later hominins, possibly explaining the functional retention of gesturing in the full language of modern humans; and that speech may have evolved for reasons unrelated to tool-making. We conclude that speech is unlikely to have evolved as tool-making teaching aid superior to gesture, as claimed by the technological hypothesis, and therefore alternative views should be considered. For example, gestural language may have evolved to enable tool-making in earlier hominins, while speech may have later emerged as a response to increased trade and more complex inter- and intra-group interactions in Middle Pleistocene ancestors of Neanderthals and Homo sapiens; or gesture and speech may have evolved in parallel rather than in sequence.
[Modeling developmental aspects of sensorimotor control of speech production].

Science.gov (United States)

Kröger, B J; Birkholz, P; Neuschaefer-Rube, C

2007-05-01

Detailed knowledge of the neurophysiology of speech acquisition is important for understanding the developmental aspects of speech perception and production and for understanding developmental disorders of speech perception and production. A computer implemented neural model of sensorimotor control of speech production was developed. The model is capable of demonstrating the neural functions of different cortical areas during speech production in detail. (i) Two sensory and two motor maps or neural representations and the appertaining neural mappings or projections establish the sensorimotor feedback control system. These maps and mappings are already formed and trained during the prelinguistic phase of speech acquisition. (ii) The feedforward sensorimotor control system comprises the lexical map (representations of sounds, syllables, and words of the first language) and the mappings from lexical to sensory and to motor maps. The training of the appertaining mappings form the linguistic phase of speech acquisition. (iii) Three prelinguistic learning phases--i. e. silent mouthing, quasi stationary vocalic articulation, and realisation of articulatory protogestures--can be defined on the basis of our simulation studies using the computational neural model. These learning phases can be associated with temporal phases of prelinguistic speech acquisition obtained from natural data. The neural model illuminates the detailed function of specific cortical areas during speech production. In particular it can be shown that developmental disorders of speech production may result from a delayed or incorrect process within one of the prelinguistic learning phases defined by the neural model.
Speech Pathology in Ancient India--A Review of Sanskrit Literature.

Science.gov (United States)

Savithri, S. R.

1987-01-01

The paper is a review of ancient Sanskrit literature for information on the origin and development of speech and language, speech production, normality of speech and language, and disorders of speech and language and their treatment. (DB)
Speech cues contribute to audiovisual spatial integration.

Directory of Open Access Journals (Sweden)

Christopher W Bishop

Full Text Available Speech is the most important form of human communication but ambient sounds and competing talkers often degrade its acoustics. Fortunately the brain can use visual information, especially its highly precise spatial information, to improve speech comprehension in noisy environments. Previous studies have demonstrated that audiovisual integration depends strongly on spatiotemporal factors. However, some integrative phenomena such as McGurk interference persist even with gross spatial disparities, suggesting that spatial alignment is not necessary for robust integration of audiovisual place-of-articulation cues. It is therefore unclear how speech-cues interact with audiovisual spatial integration mechanisms. Here, we combine two well established psychophysical phenomena, the McGurk effect and the ventriloquist's illusion, to explore this dependency. Our results demonstrate that conflicting spatial cues may not interfere with audiovisual integration of speech, but conflicting speech-cues can impede integration in space. This suggests a direct but asymmetrical influence between ventral 'what' and dorsal 'where' pathways.
Innovative Speech Reconstructive Surgery

OpenAIRE

Hashem Shemshadi

2003-01-01

Proper speech functioning in human being, depends on the precise coordination and timing balances in a series of complex neuro nuscular movements and actions. Starting from the prime organ of energy source of expelled air from respirato y system; deliver such air to trigger vocal cords; swift changes of this phonatory episode to a comprehensible sound in RESONACE and final coordination of all head and neck structures to elicit final speech in ...
Causes of Speech Disorders in Primary School Students of Zahedan

Directory of Open Access Journals (Sweden)

Saeed Fakhrerahimi

2013-02-01

Full Text Available Background: Since making communication with others is the most important function of speech, undoubtedly, any type of disorder in speech will affect the human communicability with others. The objective of the study was to investigate reasons behind the [high] prevalence rate of stammer, producing disorders and aglossia.Materials and Methods: This descriptive-analytical study was conducted on 118 male and female students, who were studying in a primary school in Zahedan; they had referred to the Speech Therapy Centers of Zahedan University of Medical Sciences in a period of seven months. The speech therapist examinations, diagnosis tools common in speech therapy, Spielberg Children Trait and also patients' cases were used to find the reasons behind the [high] prevalence rate of speech disorders. Results: Psychological causes had the highest rate of correlation with the speech disorders among the other factors affecting the speech disorders. After psychological causes, family history and age of the subjects are the other factors which may bring about the speech disorders (P<0.05. Bilingualism and birth order has a negative relationship with the speech disorders. Likewise, another result of this study shows that only psychological causes, social causes, hereditary causes and age of subjects can predict the speech disorders (P<0.05.Conclusion: The present study shows that the speech disorders have a strong and close relationship with the psychological causes at the first step and also history of family and age of individuals at the next steps.
The Clinical Practice of Speech and Language Therapists with Children with Phonologically Based Speech Sound Disorders

Science.gov (United States)

Oliveira, Carla; Lousada, Marisa; Jesus, Luis M. T.

2015-01-01

Children with speech sound disorders (SSD) represent a large number of speech and language therapists' caseloads. The intervention with children who have SSD can involve different therapy approaches, and these may be articulatory or phonologically based. Some international studies reveal a widespread application of articulatory based approaches in…
The politeness prosody of the Javanese directive speech

Directory of Open Access Journals (Sweden)

F.X. Rahyono

2009-10-01

Full Text Available This experimental phonetic research deals with the prosodies of directive speech in Javanese. The research procedures were: (1 speech production, (2 acoustic analysis, and (3 perception test. The data investigated are three directive utterances, in the form of statements, commands, and questions. The data were obtained by recording dialogues that present polite as well as impolite speech. Three acoustic experiments were conducted for statements, commands, and questions in directive speech: (1 modifications of duration, (2 modifications of contour, and (3 modifications of fundamental frequency. The result of the subsequent perception tests to 90 stimuli with 24 subjects were analysed statistically with ANOVA (Analysis of Variant. Based on this statistic analysis, the prosodic characteristics of polite and impolite speech were identified.
Prevalence of Speech Disorders in Arak Primary School Students, 2014-2015

Directory of Open Access Journals (Sweden)

Abdoreza Yavari

2016-09-01

Full Text Available Abstract Background: The speech disorders may produce irreparable damage to childs speech and language development in the psychosocial view. The voice, speech sound production and fluency disorders are speech disorders, that may result from delay or impairment in speech motor control mechanism, central neuron system disorders, improper language stimulation or voice abuse. Materials and Methods: This study examined the prevalence of speech disorders in 1393 Arakian students at 1 to 6th grades of primary school. After collecting continuous speech samples, picture description, passage reading and phonetic test, we recorded the pathological signs of stuttering, articulation disorder and voice disorders in a special sheet. Results: The prevalence of articulation, voice and stuttering disorders was 8%, 3.5% and%1 and the prevalence of speech disorders was 11.9%. The prevalence of speech disorders was decreasing with increasing of student’s grade. 12.2% of boy students and 11.7% of girl students of primary school in Arak had speech disorders. Conclusion: The prevalence of speech disorders of primary school students in Arak is similar to the prevalence of speech disorders in Kermanshah, but the prevalence of speech disorders in this research is smaller than many similar researches in Iran. It seems that racial and cultural diversity has some effect on increasing the prevalence of speech disorders in Arak city.
The Beginnings of Danish Speech Perception

DEFF Research Database (Denmark)

Østerbye, Torkil

, in the light of the rich and complex Danish sound system. The first two studies report on native adults’ perception of Danish speech sounds in quiet and noise. The third study examined the development of language-specific perception in native Danish infants at 6, 9 and 12 months of age. The book points......Little is known about the perception of speech sounds by native Danish listeners. However, the Danish sound system differs in several interesting ways from the sound systems of other languages. For instance, Danish is characterized, among other features, by a rich vowel inventory and by different...... reductions of speech sounds evident in the pronunciation of the language. This book (originally a PhD thesis) consists of three studies based on the results of two experiments. The experiments were designed to provide knowledge of the perception of Danish speech sounds by Danish adults and infants...
Syntactic error modeling and scoring normalization in speech recognition: Error modeling and scoring normalization in the speech recognition task for adult literacy training

Science.gov (United States)

Olorenshaw, Lex; Trawick, David

1991-01-01

The purpose was to develop a speech recognition system to be able to detect speech which is pronounced incorrectly, given that the text of the spoken speech is known to the recognizer. Better mechanisms are provided for using speech recognition in a literacy tutor application. Using a combination of scoring normalization techniques and cheater-mode decoding, a reasonable acceptance/rejection threshold was provided. In continuous speech, the system was tested to be able to provide above 80 pct. correct acceptance of words, while correctly rejecting over 80 pct. of incorrectly pronounced words.
Inconsistency of speech in children with childhood apraxia of speech, phonological disorders, and typical speech

Science.gov (United States)

Iuzzini, Jenya

There is a lack of agreement on the features used to differentiate Childhood Apraxia of Speech (CAS) from Phonological Disorders (PD). One criterion which has gained consensus is lexical inconsistency of speech (ASHA, 2007); however, no accepted measure of this feature has been defined. Although lexical assessment provides information about consistency of an item across repeated trials, it may not capture the magnitude of inconsistency within an item. In contrast, segmental analysis provides more extensive information about consistency of phoneme usage across multiple contexts and word-positions. The current research compared segmental and lexical inconsistency metrics in preschool-aged children with PD, CAS, and typical development (TD) to determine how inconsistency varies with age in typical and disordered speakers, and whether CAS and PD were differentiated equally well by both assessment levels. Whereas lexical and segmental analyses may be influenced by listener characteristics or speaker intelligibility, the acoustic signal is less vulnerable to these factors. In addition, the acoustic signal may reveal information which is not evident in the perceptual signal. A second focus of the current research was motivated by Blumstein et al.'s (1980) classic study on voice onset time (VOT) in adults with acquired apraxia of speech (AOS) which demonstrated a motor impairment underlying AOS. In the current study, VOT analyses were conducted to determine the relationship between age and group with the voicing distribution for bilabial and alveolar plosives. Findings revealed that 3-year-olds evidenced significantly higher inconsistency than 5-year-olds; segmental inconsistency approached 0% in 5-year-olds with TD, whereas it persisted in children with PD and CAS suggesting that for child in this age-range, inconsistency is a feature of speech disorder rather than typical development (Holm et al., 2007). Likewise, whereas segmental and lexical inconsistency were
Aging and Spectro-Temporal Integration of Speech

Directory of Open Access Journals (Sweden)

John H. Grose

2016-10-01

Full Text Available The purpose of this study was to determine the effects of age on the spectro-temporal integration of speech. The hypothesis was that the integration of speech fragments distributed over frequency, time, and ear of presentation is reduced in older listeners—even for those with good audiometric hearing. Younger, middle-aged, and older listeners (10 per group with good audiometric hearing participated. They were each tested under seven conditions that encompassed combinations of spectral, temporal, and binaural integration. Sentences were filtered into two bands centered at 500 Hz and 2500 Hz, with criterion bandwidth tailored for each participant. In some conditions, the speech bands were individually square wave interrupted at a rate of 10 Hz. Configurations of uninterrupted, synchronously interrupted, and asynchronously interrupted frequency bands were constructed that constituted speech fragments distributed across frequency, time, and ear of presentation. The over-arching finding was that, for most configurations, performance was not differentially affected by listener age. Although speech intelligibility varied across condition, there was no evidence of performance deficits in older listeners in any condition. This study indicates that age, per se, does not necessarily undermine the ability to integrate fragments of speech dispersed across frequency and time.
Oral Articulatory Control in Childhood Apraxia of Speech

Science.gov (United States)

Grigos, Maria I.; Moss, Aviva; Lu, Ying

2015-01-01

Purpose: The purpose of this research was to examine spatial and temporal aspects of articulatory control in children with childhood apraxia of speech (CAS), children with speech delay characterized by an articulation/phonological impairment (SD), and controls with typical development (TD) during speech tasks that increased in word length. Method:…
Associations between speech features and phenotypic severity in Treacher Collins syndrome.

Science.gov (United States)

Asten, Pamela; Akre, Harriet; Persson, Christina

2014-04-28

Treacher Collins syndrome (TCS, OMIM 154500) is a rare congenital disorder of craniofacial development. Characteristic hypoplastic malformations of the ears, zygomatic arch, mandible and pharynx have been described in detail. However, reports on the impact of these malformations on speech are few. Exploring speech features and investigating if speech function is related to phenotypic severity are essential for optimizing follow-up and treatment. Articulation, nasal resonance, voice and intelligibility were examined in 19 individuals (5-74 years, median 34 years) divided into three groups comprising children 5-10 years (n = 4), adolescents 11-18 years (n = 4) and adults 29 years and older (n = 11). A speech composite score (0-6) was calculated to reflect the variability of speech deviations. TCS severity scores of phenotypic expression and total scores of Nordic Orofacial Test-Screening (NOT-S) measuring orofacial dysfunction were used in analyses of correlation with speech characteristics (speech composite scores). Children and adolescents presented with significantly higher speech composite scores (median 4, range 1-6) than adults (median 1, range 0-5). Nearly all children and adolescents (6/8) displayed speech deviations of articulation, nasal resonance and voice, while only three adults were identified with multiple speech aberrations. The variability of speech dysfunction in TCS was exhibited by individual combinations of speech deviations in 13/19 participants. The speech composite scores correlated with TCS severity scores and NOT-S total scores. Speech composite scores higher than 4 were associated with cleft palate. The percent of intelligible words in connected speech was significantly lower in children and adolescents (median 77%, range 31-99) than in adults (98%, range 93-100). Intelligibility of speech among the children was markedly inconsistent and clearly affecting the understandability. Multiple speech deviations were identified in
Relationship between the stuttering severity index and speech rate

Directory of Open Access Journals (Sweden)

Claudia Regina Furquim de Andrade

Full Text Available CONTEXT: The speech rate is one of the parameters considered when investigating speech fluency and is an important variable in the assessment of individuals with communication complaints. OBJECTIVE: To correlate the stuttering severity index with one of the indices used for assessing fluency/speech rate. DESIGN: Cross-sectional study. SETTING: Fluency and Fluency Disorders Investigation Laboratory, Faculdade de Medicina da Universidade de São Paulo. PARTICIPANTS: Seventy adults with stuttering diagnosis. MAIN MEASUREMENTS: A speech sample from each participant containing at least 200 fluent syllables was videotaped and analyzed according to a stuttering severity index test and speech rate parameters. RESULTS: The results obtained in this study indicate that the stuttering severity and the speech rate present significant variation, i.e., the more severe the stuttering is, the lower the speech rate in words and syllables per minute. DISCUSSION AND CONCLUSION: The results suggest that speech rate is an important indicator of fluency levels and should be incorporated in the assessment and treatment of stuttering. This study represents a first attempt to identify the possible subtypes of developmental stuttering. DEFINITION: Objective tests that quantify diseases are important in their diagnosis, treatment and prognosis.
Interdependent processing and encoding of speech and concurrent background noise.

Science.gov (United States)

Cooper, Angela; Brouwer, Susanne; Bradlow, Ann R

2015-05-01

Speech processing can often take place in adverse listening conditions that involve the mixing of speech and background noise. In this study, we investigated processing dependencies between background noise and indexical speech features, using a speeded classification paradigm (Garner, 1974; Exp. 1), and whether background noise is encoded and represented in memory for spoken words in a continuous recognition memory paradigm (Exp. 2). Whether or not the noise spectrally overlapped with the speech signal was also manipulated. The results of Experiment 1 indicated that background noise and indexical features of speech (gender, talker identity) cannot be completely segregated during processing, even when the two auditory streams are spectrally nonoverlapping. Perceptual interference was asymmetric, whereby irrelevant indexical feature variation in the speech signal slowed noise classification to a greater extent than irrelevant noise variation slowed speech classification. This asymmetry may stem from the fact that speech features have greater functional relevance to listeners, and are thus more difficult to selectively ignore than background noise. Experiment 2 revealed that a recognition cost for words embedded in different types of background noise on the first and second occurrences only emerged when the noise and the speech signal were spectrally overlapping. Together, these data suggest integral processing of speech and background noise, modulated by the level of processing and the spectral separation of the speech and noise.
Bandwidth Extension of Telephone Speech Aided by Data Embedding

Directory of Open Access Journals (Sweden)

Sagi Ariel

2007-01-01

Full Text Available A system for bandwidth extension of telephone speech, aided by data embedding, is presented. The proposed system uses the transmitted analog narrowband speech signal as a carrier of the side information needed to carry out the bandwidth extension. The upper band of the wideband speech is reconstructed at the receiving end from two components: a synthetic wideband excitation signal, generated from the narrowband telephone speech and a wideband spectral envelope, parametrically represented and transmitted as embedded data in the telephone speech. We propose a novel data embedding scheme, in which the scalar Costa scheme is combined with an auditory masking model allowing high rate transparent embedding, while maintaining a low bit error rate. The signal is transformed to the frequency domain via the discrete Hartley transform (DHT and is partitioned into subbands. Data is embedded in an adaptively chosen subset of subbands by modifying the DHT coefficients. In our simulations, high quality wideband speech was obtained from speech transmitted over a telephone line (characterized by spectral magnitude distortion, dispersion, and noise, in which side information data is transparently embedded at the rate of 600 information bits/second and with a bit error rate of approximately . In a listening test, the reconstructed wideband speech was preferred (at different degrees over conventional telephone speech in of the test utterances.
Bandwidth Extension of Telephone Speech Aided by Data Embedding

Directory of Open Access Journals (Sweden)

David Malah

2007-01-01

Full Text Available A system for bandwidth extension of telephone speech, aided by data embedding, is presented. The proposed system uses the transmitted analog narrowband speech signal as a carrier of the side information needed to carry out the bandwidth extension. The upper band of the wideband speech is reconstructed at the receiving end from two components: a synthetic wideband excitation signal, generated from the narrowband telephone speech and a wideband spectral envelope, parametrically represented and transmitted as embedded data in the telephone speech. We propose a novel data embedding scheme, in which the scalar Costa scheme is combined with an auditory masking model allowing high rate transparent embedding, while maintaining a low bit error rate. The signal is transformed to the frequency domain via the discrete Hartley transform (DHT and is partitioned into subbands. Data is embedded in an adaptively chosen subset of subbands by modifying the DHT coefficients. In our simulations, high quality wideband speech was obtained from speech transmitted over a telephone line (characterized by spectral magnitude distortion, dispersion, and noise, in which side information data is transparently embedded at the rate of 600 information bits/second and with a bit error rate of approximately 3⋅10−4. In a listening test, the reconstructed wideband speech was preferred (at different degrees over conventional telephone speech in 92.5% of the test utterances.
Source Separation via Spectral Masking for Speech Recognition Systems

Directory of Open Access Journals (Sweden)

Gustavo Fernandes Rodrigues

2012-12-01

Full Text Available In this paper we present an insight into the use of spectral masking techniques in time-frequency domain, as a preprocessing step for the speech signal recognition. Speech recognition systems have their performance negatively affected in noisy environments or in the presence of other speech signals. The limits of these masking techniques for different levels of the signal-to-noise ratio are discussed. We show the robustness of the spectral masking techniques against four types of noise: white, pink, brown and human speech noise (bubble noise. The main contribution of this work is to analyze the performance limits of recognition systems using spectral masking. We obtain an increase of 18% on the speech hit rate, when the speech signals were corrupted by other speech signals or bubble noise, with different signal-to-noise ratio of approximately 1, 10 and 20 dB. On the other hand, applying the ideal binary masks to mixtures corrupted by white, pink and brown noise, results an average growth of 9% on the speech hit rate, with the same different signal-to-noise ratio. The experimental results suggest that the masking spectral techniques are more suitable for the case when it is applied a bubble noise, which is produced by human speech, than for the case of applying white, pink and brown noise.

Speech-Based Information Retrieval for Digital Libraries

National Research Council Canada - National Science Library

Oard, Douglas W

1997-01-01

Libraries and archives collect recorded speech and multimedia objects that contain recorded speech, and such material may comprise a substantial portion of the collection in future digital libraries...
Speech and language pathology & pediatric HIV.

Science.gov (United States)

Retzlaff, C

1999-12-01

Children with HIV have critical speech and language issues because the virus manifests itself primarily in the developing central nervous system, sometimes causing speech, motor control, and language disabilities. Language impediments that develop during the second year of life seem to be especially severe. HIV-infected children are also susceptible to recurrent ear infections, which can damage hearing. Developmental issues must be addressed for these children to reach their full potential. A decline in language skills may coincide with or precede other losses in cognitive ability. A speech pathologist can play an important role on a pediatric HIV team. References are included.
Hidden Markov models in automatic speech recognition

Science.gov (United States)

Wrzoskowicz, Adam

1993-11-01

This article describes a method for constructing an automatic speech recognition system based on hidden Markov models (HMMs). The author discusses the basic concepts of HMM theory and the application of these models to the analysis and recognition of speech signals. The author provides algorithms which make it possible to train the ASR system and recognize signals on the basis of distinct stochastic models of selected speech sound classes. The author describes the specific components of the system and the procedures used to model and recognize speech. The author discusses problems associated with the choice of optimal signal detection and parameterization characteristics and their effect on the performance of the system. The author presents different options for the choice of speech signal segments and their consequences for the ASR process. The author gives special attention to the use of lexical, syntactic, and semantic information for the purpose of improving the quality and efficiency of the system. The author also describes an ASR system developed by the Speech Acoustics Laboratory of the IBPT PAS. The author discusses the results of experiments on the effect of noise on the performance of the ASR system and describes methods of constructing HMM's designed to operate in a noisy environment. The author also describes a language for human-robot communications which was defined as a complex multilevel network from an HMM model of speech sounds geared towards Polish inflections. The author also added mandatory lexical and syntactic rules to the system for its communications vocabulary.
Human phoneme recognition depending on speech-intrinsic variability.

Science.gov (United States)

Meyer, Bernd T; Jürgens, Tim; Wesker, Thorsten; Brand, Thomas; Kollmeier, Birger

2010-11-01

The influence of different sources of speech-intrinsic variation (speaking rate, effort, style and dialect or accent) on human speech perception was investigated. In listening experiments with 16 listeners, confusions of consonant-vowel-consonant (CVC) and vowel-consonant-vowel (VCV) sounds in speech-weighted noise were analyzed. Experiments were based on the OLLO logatome speech database, which was designed for a man-machine comparison. It contains utterances spoken by 50 speakers from five dialect/accent regions and covers several intrinsic variations. By comparing results depending on intrinsic and extrinsic variations (i.e., different levels of masking noise), the degradation induced by variabilities can be expressed in terms of the SNR. The spectral level distance between the respective speech segment and the long-term spectrum of the masking noise was found to be a good predictor for recognition rates, while phoneme confusions were influenced by the distance to spectrally close phonemes. An analysis based on transmitted information of articulatory features showed that voicing and manner of articulation are comparatively robust cues in the presence of intrinsic variations, whereas the coding of place is more degraded. The database and detailed results have been made available for comparisons between human speech recognition (HSR) and automatic speech recognizers (ASR).
Cortical activity patterns predict robust speech discrimination ability in noise

Science.gov (United States)

Shetake, Jai A.; Wolf, Jordan T.; Cheung, Ryan J.; Engineer, Crystal T.; Ram, Satyananda K.; Kilgard, Michael P.

2012-01-01

The neural mechanisms that support speech discrimination in noisy conditions are poorly understood. In quiet conditions, spike timing information appears to be used in the discrimination of speech sounds. In this study, we evaluated the hypothesis that spike timing is also used to distinguish between speech sounds in noisy conditions that significantly degrade neural responses to speech sounds. We tested speech sound discrimination in rats and recorded primary auditory cortex (A1) responses to speech sounds in background noise of different intensities and spectral compositions. Our behavioral results indicate that rats, like humans, are able to accurately discriminate consonant sounds even in the presence of background noise that is as loud as the speech signal. Our neural recordings confirm that speech sounds evoke degraded but detectable responses in noise. Finally, we developed a novel neural classifier that mimics behavioral discrimination. The classifier discriminates between speech sounds by comparing the A1 spatiotemporal activity patterns evoked on single trials with the average spatiotemporal patterns evoked by known sounds. Unlike classifiers in most previous studies, this classifier is not provided with the stimulus onset time. Neural activity analyzed with the use of relative spike timing was well correlated with behavioral speech discrimination in quiet and in noise. Spike timing information integrated over longer intervals was required to accurately predict rat behavioral speech discrimination in noisy conditions. The similarity of neural and behavioral discrimination of speech in noise suggests that humans and rats may employ similar brain mechanisms to solve this problem. PMID:22098331
Sound frequency affects speech emotion perception: results from congenital amusia.

Science.gov (United States)

Lolli, Sydney L; Lewenstein, Ari D; Basurto, Julian; Winnik, Sean; Loui, Psyche

2015-01-01

Congenital amusics, or "tone-deaf" individuals, show difficulty in perceiving and producing small pitch differences. While amusia has marked effects on music perception, its impact on speech perception is less clear. Here we test the hypothesis that individual differences in pitch perception affect judgment of emotion in speech, by applying low-pass filters to spoken statements of emotional speech. A norming study was first conducted on Mechanical Turk to ensure that the intended emotions from the Macquarie Battery for Evaluation of Prosody were reliably identifiable by US English speakers. The most reliably identified emotional speech samples were used in Experiment 1, in which subjects performed a psychophysical pitch discrimination task, and an emotion identification task under low-pass and unfiltered speech conditions. Results showed a significant correlation between pitch-discrimination threshold and emotion identification accuracy for low-pass filtered speech, with amusics (defined here as those with a pitch discrimination threshold >16 Hz) performing worse than controls. This relationship with pitch discrimination was not seen in unfiltered speech conditions. Given the dissociation between low-pass filtered and unfiltered speech conditions, we inferred that amusics may be compensating for poorer pitch perception by using speech cues that are filtered out in this manipulation. To assess this potential compensation, Experiment 2 was conducted using high-pass filtered speech samples intended to isolate non-pitch cues. No significant correlation was found between pitch discrimination and emotion identification accuracy for high-pass filtered speech. Results from these experiments suggest an influence of low frequency information in identifying emotional content of speech.
Surgical improvement of speech disorder caused by amyotrophic lateral sclerosis.

Science.gov (United States)

Saigusa, Hideto; Yamaguchi, Satoshi; Nakamura, Tsuyoshi; Komachi, Taro; Kadosono, Osamu; Ito, Hiroyuki; Saigusa, Makoto; Niimi, Seiji

2012-12-01

Amyotrophic lateral sclerosis (ALS) is a progressive debilitating neurological disease. ALS disturbs the quality of life by affecting speech, swallowing and free mobility of the arms without affecting intellectual function. It is therefore of significance to improve intelligibility and quality of speech sounds, especially for ALS patients with slowly progressive courses. Currently, however, there is no effective or established approach to improve speech disorder caused by ALS. We investigated a surgical procedure to improve speech disorder for some patients with neuromuscular diseases with velopharyngeal closure incompetence. In this study, we performed the surgical procedure for two patients suffering from severe speech disorder caused by slowly progressing ALS. The patients suffered from speech disorder with hypernasality and imprecise and weak articulation during a 6-year course (patient 1) and a 3-year course (patient 2) of slowly progressing ALS. We narrowed bilateral lateral palatopharyngeal wall at velopharyngeal port, and performed this surgery under general anesthesia without muscle relaxant for the two patients. Postoperatively, intelligibility and quality of their speech sounds were greatly improved within one month without any speech therapy. The patients were also able to generate longer speech phrases after the surgery. Importantly, there was no serious complication during or after the surgery. In summary, we performed bilateral narrowing of lateral palatopharyngeal wall as a speech surgery for two patients suffering from severe speech disorder associated with ALS. With this technique, improved intelligibility and quality of speech can be maintained for longer duration for the patients with slowly progressing ALS.
The Repeatable Battery for the Assessment of Neuropsychological Status for Hearing Impaired Individuals (RBANS-H) before and after Cochlear Implantation: A Protocol for a Prospective, Longitudinal Cohort Study.

Science.gov (United States)

Claes, Annes J; Mertens, Griet; Gilles, Annick; Hofkens-Van den Brandt, Anouk; Fransen, Erik; Van Rompaey, Vincent; Van de Heyning, Paul

2016-01-01

Background: Currently, an independent relationship between hearing loss and cognitive decline in older adults is suggested by large prospective studies. In general, cochlear implants improve hearing and the quality of life in severely to profoundly hearing impaired older persons. However, little is known about the effects of cochlear implantation on the cognitive evolution in this population. Aim of the study: The primary goal of this prospective, longitudinal cohort study is to explore the cognitive profile of severely to profoundly postlingually hearing impaired subjects before and after cochlear implantation. In addition, the current study aims to investigate the relationship between the cognitive function, audiometric performances, quality of life, and self-reliance in these patients. Methods: Twenty-five patients aged 55 or older, scheduled for cochlear implantation, will be enrolled in the study. They will be examined prior to implantation, at 6 and 12 months after implantation and annually thereafter. The test battery consists of (1) a cognitive examination, using the Repeatable Battery for the Assessment of Neuropsychological Status adapted for Hearing impaired persons (RBANS-H), (2) an audiological examination, including unaided and aided pure tone audiometry, speech audiometry in quiet and speech audiometry in noise, (3) the administration of four questionnaires evaluating quality of life and subjective hearing benefit and (4) a semi-structured interview about the self-reliance of the participant. Discussion: Up until now only one study has been conducted on this topic, focusing on the short-term effects of cochlear implantation on cognition in older adults. The present study is the first study to apply a comprehensive neuropsychological assessment adapted for severely to profoundly hearing impaired subjects in order to investigate the cognitive capabilities before and after cochlear implantation. Trial registration: The present protocol is
Cortical oscillations and entrainment in speech processing during working memory load

DEFF Research Database (Denmark)

Hjortkjær, Jens; Märcher-Rørsted, Jonatan; Fuglsang, Søren A

2018-01-01

Neuronal oscillations are thought to play an important role in working memory (WM) and speech processing. Listening to speech in real-life situations is often cognitively demanding but it is unknown whether WM load influences how auditory cortical activity synchronizes to speech features. Here, we...... developed an auditory n-back paradigm to investigate cortical entrainment to speech envelope fluctuations under different degrees of WM load. We measured the electroencephalogram, pupil dilations and behavioural performance from 22 subjects listening to continuous speech with an embedded n-back task....... The speech stimuli consisted of long spoken number sequences created to match natural speech in terms of sentence intonation, syllabic rate and phonetic content. To burden different WM functions during speech processing, listeners performed an n-back task on the speech sequences in different levels...
42 CFR 485.715 - Condition of participation: Speech pathology services.

Science.gov (United States)

2010-10-01

... 42 Public Health 5 2010-10-01 2010-10-01 false Condition of participation: Speech pathology... Agencies as Providers of Outpatient Physical Therapy and Speech-Language Pathology Services § 485.715 Condition of participation: Speech pathology services. If speech pathology services are offered, the...
Audiovisual speech perception development at varying levels of perceptual processing.

Science.gov (United States)

Lalonde, Kaylah; Holt, Rachael Frush

2016-04-01

This study used the auditory evaluation framework [Erber (1982). Auditory Training (Alexander Graham Bell Association, Washington, DC)] to characterize the influence of visual speech on audiovisual (AV) speech perception in adults and children at multiple levels of perceptual processing. Six- to eight-year-old children and adults completed auditory and AV speech perception tasks at three levels of perceptual processing (detection, discrimination, and recognition). The tasks differed in the level of perceptual processing required to complete them. Adults and children demonstrated visual speech influence at all levels of perceptual processing. Whereas children demonstrated the same visual speech influence at each level of perceptual processing, adults demonstrated greater visual speech influence on tasks requiring higher levels of perceptual processing. These results support previous research demonstrating multiple mechanisms of AV speech processing (general perceptual and speech-specific mechanisms) with independent maturational time courses. The results suggest that adults rely on both general perceptual mechanisms that apply to all levels of perceptual processing and speech-specific mechanisms that apply when making phonetic decisions and/or accessing the lexicon. Six- to eight-year-old children seem to rely only on general perceptual mechanisms across levels. As expected, developmental differences in AV benefit on this and other recognition tasks likely reflect immature speech-specific mechanisms and phonetic processing in children.
The effect of filtered speech feedback on the frequency of stuttering

Science.gov (United States)

Rami, Manish Krishnakant

2000-10-01

This study investigated the effects of filtered components of speech and whispered speech on the frequency of stuttering. It is known that choral speech, shadowing, and altered auditory feedback are the only conditions which induce fluency without any additional effort than normally required to speak on the part of people who stutter. All these conditions use speech as a second signal. This experiment examined the role of components of speech signal as delineated by the source- filter theory of speech production. Three filtered speech signals, a whispered speech signal, and a choral speech signal formed the stimuli. It was postulated that if the speech signal in whole was necessary for producing fluency in people who stutter, then all other conditions except choral speech should fail to produce fluency enhancement. If the glottal source alone was adequate in restoring fluency, then only the conditions of NAF and whispered speech should fail in promoting fluency. In the event that full filter characteristics are necessary for the fluency creating effects, then all conditions except the choral speech and whispered speech should fail to produce fluency. If any part of the filter characteristics is sufficient in yielding fluency, then only the NAF and the approximate glottal source should fail to demonstrate an increase in the amount of fluency. Twelve adults who stuttered read passages under the six conditions while receiving auditory feedback consisting of one of the six experimental conditions: (a)NAF; (b)approximate glottal source; (c)glottal source and first formant; (d)glottal source and first two formants; and (e)whispered speech. Frequencies of stuttering were obtained for each condition and submitted to descriptive and inferential statistical analysis. Statistically significant differences in means were found within the choral feedback conditions. Specifically, the choral speech, the source and first formant, source and the first two formants, and the
Speech Intelligibility and Hearing Protector Selection

Science.gov (United States)

2016-08-29

not only affect the listener of speech communication in a noisy environment, HPDs can also affect the speaker . Tufts and Frank (2003) found that...of hearing protection on speech intelligibility in noise. Sound and Vibration . 20(10): 12-14. Berger, E. H. 1980. EARLog #4 – The
The Tuning of Human Neonates' Preference for Speech

Science.gov (United States)

Vouloumanos, Athena; Hauser, Marc D.; Werker, Janet F.; Martin, Alia

2010-01-01

Human neonates prefer listening to speech compared to many nonspeech sounds, suggesting that humans are born with a bias for speech. However, neonates' preference may derive from properties of speech that are not unique but instead are shared with the vocalizations of other species. To test this, thirty neonates and sixteen 3-month-olds were…
Perceived gender in clear and conversational speech

Science.gov (United States)

Booz, Jaime A.

Although many studies have examined acoustic and sociolinguistic differences between male and female speech, the relationship between talker speaking style and perceived gender has not yet been explored. The present study attempts to determine whether clear speech, a style adopted by talkers who perceive some barrier to effective communication, shifts perceptions of femininity for male and female talkers. Much of our understanding of gender perception in voice and speech is based on sustained vowels or single words, eliminating temporal, prosodic, and articulatory cues available in more naturalistic, connected speech. Thus, clear and conversational sentence stimuli, selected from the 41 talkers of the Ferguson Clear Speech Database (Ferguson, 2004) were presented to 17 normal-hearing listeners, aged 18 to 30. They rated the talkers' gender using a visual analog scale with "masculine" and "feminine" endpoints. This response method was chosen to account for within-category shifts of gender perception by allowing nonbinary responses. Mixed-effects regression analysis of listener responses revealed a small but significant effect of speaking style, and this effect was larger for male talkers than female talkers. Because of the high degree of talker variability observed for talker gender, acoustic analyses of these sentences were undertaken to determine the relationship between acoustic changes in clear and conversational speech and perceived femininity. Results of these analyses showed that mean fundamental frequency (fo) and f o standard deviation were significantly correlated to perceived gender for both male and female talkers, and vowel space was significantly correlated only for male talkers. Speaking rate and breathiness measures (CPPS) were not significantly related for either group. Outcomes of this study indicate that adopting a clear speaking style is correlated with increases in perceived femininity. Although the increase was small, some changes associated
Rule-Based Storytelling Text-to-Speech (TTS Synthesis

Directory of Open Access Journals (Sweden)

Ramli Izzad

2016-01-01

Full Text Available In recent years, various real life applications such as talking books, gadgets and humanoid robots have drawn the attention to pursue research in the area of expressive speech synthesis. Speech synthesis is widely used in various applications. However, there is a growing need for an expressive speech synthesis especially for communication and robotic. In this paper, global and local rule are developed to convert neutral to storytelling style speech for the Malay language. In order to generate rules, modification of prosodic parameters such as pitch, intensity, duration, tempo and pauses are considered. Modification of prosodic parameters is examined by performing prosodic analysis on a story collected from an experienced female and male storyteller. The global and local rule is applied in sentence level and synthesized using HNM. Subjective tests are conducted to evaluate the synthesized storytelling speech quality of both rules based on naturalness, intelligibility, and similarity to the original storytelling speech. The results showed that global rule give a better result than local rule
THE MEANING OF THE PREVENTION WITH SPEECH THERAPY AS A IMPORTANT FAC-TOR FOR THE PROPER DEVELOPMENT OF THE CHILDREN SPEECH

Directory of Open Access Journals (Sweden)

S. FILIPOVA

1999-11-01

Full Text Available The paper presented some conscientious and results from the finished research which showing the meaning of the prevention with speech therapy in the development of the speech. The research was done at Negotino and with that are shown the most frequent speech deficiency of the children at preschool age.
A multimodal spectral approach to characterize rhythm in natural speech.

Science.gov (United States)

Alexandrou, Anna Maria; Saarinen, Timo; Kujala, Jan; Salmelin, Riitta

2016-01-01

Human utterances demonstrate temporal patterning, also referred to as rhythm. While simple oromotor behaviors (e.g., chewing) feature a salient periodical structure, conversational speech displays a time-varying quasi-rhythmic pattern. Quantification of periodicity in speech is challenging. Unimodal spectral approaches have highlighted rhythmic aspects of speech. However, speech is a complex multimodal phenomenon that arises from the interplay of articulatory, respiratory, and vocal systems. The present study addressed the question of whether a multimodal spectral approach, in the form of coherence analysis between electromyographic (EMG) and acoustic signals, would allow one to characterize rhythm in natural speech more efficiently than a unimodal analysis. The main experimental task consisted of speech production at three speaking rates; a simple oromotor task served as control. The EMG-acoustic coherence emerged as a sensitive means of tracking speech rhythm, whereas spectral analysis of either EMG or acoustic amplitude envelope alone was less informative. Coherence metrics seem to distinguish and highlight rhythmic structure in natural speech.
Auditory changes in acromegaly.

Science.gov (United States)

Tabur, S; Korkmaz, H; Baysal, E; Hatipoglu, E; Aytac, I; Akarsu, E

2017-06-01

The aim of this study is to determine the changes involving auditory system in cases with acromegaly. Otological examinations of 41 cases with acromegaly (uncontrolled n = 22, controlled n = 19) were compared with those of age and gender-matched 24 healthy subjects. Whereas the cases with acromegaly underwent examination with pure tone audiometry (PTA), speech audiometry for speech discrimination (SD), tympanometry, stapedius reflex evaluation and otoacoustic emission tests, the control group did only have otological examination and PTA. Additionally, previously performed paranasal sinus-computed tomography of all cases with acromegaly and control subjects were obtained to measure the length of internal acoustic canal (IAC). PTA values were higher (p acromegaly group was narrower compared to that in control group (p = 0.03 for right ears and p = 0.02 for left ears). When only cases with acromegaly were taken into consideration, PTA values in left ears had positive correlation with growth hormone and insulin-like growth factor-1 levels (r = 0.4, p = 0.02 and r = 0.3, p = 0.03). Of all cases with acromegaly 13 (32%) had hearing loss in at least one ear, 7 (54%) had sensorineural type and 6 (46%) had conductive type hearing loss. Acromegaly may cause certain changes in the auditory system in cases with acromegaly. The changes in the auditory system may be multifactorial causing both conductive and sensorioneural defects.
Individually-Personal Peculiarities of Younger Preschoolers’ Speech

Directory of Open Access Journals (Sweden)

M E Novikova

2013-12-01

Full Text Available Studying the speech of the younger preschoolers is a major factor in designing educational methods and preparing children for school. There exist individual and gender differences in the way children acquire speech skills. Word comprehension and idea interpretation depend on the child’s upbringing, his or her environment, the interaction within the family. This article submits the research data obtained from the study of the individual peculiarities of the younger preschool children’s speech.

Some links on this page may take you to non-federal websites. Their policies may differ from this site.