WorldWideScience

Sample records for audiometry speech

  1. Serial audiometry and speech recognition findings in Finnish Usher syndrome type III patients.

    NARCIS (Netherlands)

    Plantinga, R.F.; Kleemola, L.; Huygen, P.L.M.; Joensuu, T.; Sankila, E.M.; Pennings, R.J.E.; Cremers, C.W.R.J.

    2005-01-01

    Audiometric features, evaluated by serial pure tone audiometry and speech recognition tests (n = 31), were analysed in 59 Finnish Usher syndrome type III patients (USH3) with Finmajor/Finmajor (n = 55) and Finmajor/Finminor (n = 4) USH3A mutations. These patients showed a highly variable type and de

  2. Monosyllable speech audiometry in noise-exposed workers—consonant and vowel confusion

    Science.gov (United States)

    Miyakita, T.; Miura, H.

    1988-12-01

    To obtain basic data for evaluating the hearing handicaps experienced by workers with noise-induced hearing loss, the ability to distinguish monosyllables was examined by speech audiometry. The percentage of correct scores for each monosyllable varied widely in 88 male workers, depending on the presentation level and the severity of hearing loss. A 67-S word list (prepared by the Japan Audiological Society), consisting of 20 Japanese monosyllables (17 consonant-vowel (CV) syllables and three vowel syllables), was used to evaluate consonant and vowel confusion at levels of 20 to 90 dB (re HL at 1000 Hz). Regarding confusion among the five Japanese vowels, we observed particular confusion patterns resulting from the similarity of the first formant (F1). Analysis of the tendency toward confusion among individual monosyllables, together with the audiometric configuration, will provide useful information for evaluating noise-induced hearing loss.
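
    A minimal sketch of the confusion tally behind such an analysis, assuming simple (presented, heard) trial records; the syllables and trials below are invented placeholders, not data from the 67-S study.

```python
# Tally a stimulus/response confusion count and report percent correct
# per monosyllable. Trial data here are hypothetical.
from collections import Counter

def percent_correct(trials):
    counts = Counter(trials)                     # (stimulus, response) -> n
    stimuli = {s for s, _ in counts}
    responses = {r for _, r in counts}
    out = {}
    for s in sorted(stimuli):
        total = sum(counts[(s, r)] for r in responses)
        out[s] = 100.0 * counts[(s, s)] / total
    return out

trials = [("ka", "ka"), ("ka", "ta"), ("shi", "chi"), ("shi", "shi"), ("a", "a")]
print(percent_correct(trials))                   # e.g. {'a': 100.0, 'ka': 50.0, ...}
```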

  3. [Implementation of the new quality assurance agreement for the fitting of hearing aids in daily practice. Part 2: New diagnostic aspects of speech audiometry].

    Science.gov (United States)

    Löhler, J; Akcicek, B; Wollenberg, B; Schönweiler, R

    2014-09-01

    Upon review of the statutory health insurance reimbursement guidelines, a specific quality assurance questionnaire concerned with the provision of hearing aids was introduced that assesses elements of patient satisfaction within Germany's public healthcare system. APHAB questionnaire-based patient evaluation of the benefit of hearing aids represents the third pillar of audiological diagnostics, alongside classical pure-tone and speech audiometry. Another new aspect of the national guidelines is inclusion of free-field measurements in noise with and without hearing aids. Part 2 of this review describes new diagnostic aspects of speech audiometry. In addition to adaptive speech audiometry, a proposed method for applying the gold standard of speech audiometry - the Freiburg monosyllabic speech test - in noise is described. Finally, the quality assurance questionnaire will be explained as an appendix to template 15 of the regulations governing hearing aids.

  4. Audiometry screening and interpretation.

    Science.gov (United States)

    Walker, Jennifer Junnila; Cleveland, Leanne M; Davis, Jenny L; Seales, Jennifer S

    2013-01-01

    The prevalence of hearing loss varies with age, affecting at least 25 percent of patients older than 50 years and more than 50 percent of those older than 80 years. Adolescents and young adults represent groups in which the prevalence of hearing loss is increasing and may therefore benefit from screening. If offered, screening can be performed periodically by asking the patient or family if there are perceived hearing problems, or by using clinical office tests such as whispered voice, finger rub, or audiometry. Audiometry in the family medicine clinic setting is a relatively simple procedure that can be interpreted by a trained health care professional. Pure-tone testing presents tones across the speech spectrum (500 to 4,000 Hz) to determine if the patient's hearing levels fall within normal limits. A quiet testing environment, calibrated audiometric equipment, and appropriately trained personnel are required for in-office testing. Pure-tone audiometry may help physicians appropriately refer patients to an audiologist or otolaryngologist. Unilateral or asymmetrical hearing loss can be symptomatic of a central nervous system lesion and requires additional evaluation.

  5. High-frequency audiometry: A means for early diagnosis of noise-induced hearing loss

    OpenAIRE

    Amir H Mehrparvar; Seyyed J Mirmohammadi; Abbas Ghoreyshi; Abolfazl Mollasadeghi; Ziba Loukzadeh

    2011-01-01

    Noise-induced hearing loss (NIHL), an irreversible disorder, is a common problem in industrial settings. Early diagnosis of NIHL can help prevent the progression of hearing loss, especially in speech frequencies. For early diagnosis of NIHL, audiometry is performed routinely in conventional frequencies. We designed this study to compare the effect of noise on high-frequency audiometry (HFA) and conventional audiometry. In a historical cohort study, we compared hearing threshold and prevalence...

  6. Evaluation of temporal difference limen in preoperative non-invasive ear canal audiometry as a predictive factor for speech perception after cochlear implantation

    Directory of Open Access Journals (Sweden)

    Saku T. Sinkkonen

    2014-03-01

    The temporal difference limen (TDL) can be measured with noninvasive electrical ear canal stimulation. The objective of the study was to determine the role of preoperative TDL measurements in predicting patients' speech perception after cochlear implantation. We carried out a retrospective chart analysis of fifty-four cochlear implant (CI) patients with preoperative TDL and postoperative bisyllabic word recognition measurements in Helsinki University Central Hospital between March 1994 and March 2011. Our results show that there is no correlation between TDL and postoperative speech perception. However, a patient's advancing age correlates with longer TDL but not directly with poorer speech perception. The results are in line with previous results concerning the lack of predictive value of preoperative TDL measurements in CI patients.

  7. A web-based audiometry database system.

    Science.gov (United States)

    Yeh, Chung-Hui; Wei, Sung-Tai; Chen, Tsung-Wen; Wang, Ching-Yuang; Tsai, Ming-Hsui; Lin, Chia-Der

    2014-07-01

    To establish a real-time, web-based, customized audiometry database system, we worked in cooperation with the departments of medical records, information technology, and otorhinolaryngology at our hospital. This system includes an audiometry data entry system, retrieval and display system, patient information incorporation system, audiometry data transmission program, and audiometry data integration. Compared with commercial audiometry systems and traditional hand-drawn audiometry data, this web-based system saves time and money and is convenient for statistical research.

  8. Occupational hearing loss: tonal audiometry versus high-frequency audiometry

    Directory of Open Access Journals (Sweden)

    Lauris, José Roberto Pereira

    2009-09-01

    Introduction: Studies on occupational exposure show that noise has been reaching a large part of the working population around the world, and NIHL (noise-induced hearing loss) is the second most frequent disease of the hearing system. Objective: To review the audiometry results of employees at the campus of the University of São Paulo, Bauru. Method: 40 audiometry results obtained between 2007 and 2008 were analyzed, from workers aged between 32 and 59 years, of both sexes and several professions: gardeners, maintenance technicians, drivers etc. The participants were divided into 2 groups: those with tonal thresholds within acceptable limits and those who presented altered auditory thresholds, that is, tonal thresholds worse than 25 dB HL at any frequency (Administrative Rule no. 19 of the Ministry of Labor, 1998). In addition to the conventional audiologic evaluation (250 Hz to 8,000 Hz) we also carried out high frequency audiometry (9,000 Hz, 10,000 Hz, 11,200 Hz, 12,500 Hz, 14,000 Hz and 16,000 Hz). Results: According to the classification proposed by FIORINI (1994), 25.0% (N=10) presented audiometric configurations suggestive of NIHL. The results of high frequency audiometry showed worse thresholds than those obtained in conventional audiometry in the 2 groups evaluated. Conclusion: The use of high frequency audiometry proved to be an important method for the early detection of hearing alterations.

  9. High-frequency audiometry: a means for early diagnosis of noise-induced hearing loss.

    Science.gov (United States)

    Mehrparvar, Amir H; Mirmohammadi, Seyyed J; Ghoreyshi, Abbas; Mollasadeghi, Abolfazl; Loukzadeh, Ziba

    2011-01-01

    Noise-induced hearing loss (NIHL), an irreversible disorder, is a common problem in industrial settings. Early diagnosis of NIHL can help prevent the progression of hearing loss, especially in the speech frequencies. For early diagnosis of NIHL, audiometry is performed routinely in conventional frequencies. We designed this study to compare the effect of noise on high-frequency audiometry (HFA) and conventional audiometry. In a historical cohort study, we compared hearing threshold and prevalence of hearing loss in conventional and high frequencies of audiometry among textile workers divided into two groups: with and without exposure to noise above 85 dB. The highest hearing threshold was observed at 4000 Hz, 6000 Hz and 16000 Hz in conventional right ear audiometry, conventional left ear audiometry and HFA in each ear, respectively. The hearing threshold was significantly higher at 16000 Hz compared to 4000 Hz. Hearing loss was more common in HFA than conventional audiometry. HFA is more sensitive than conventional audiometry in detecting NIHL. It can be useful for early diagnosis of hearing sensitivity to noise, and thus for preventing hearing loss in lower frequencies, especially speech frequencies.

  10. Test person operated 2-Alternative Forced Choice Audiometry compared to traditional audiometry

    DEFF Research Database (Denmark)

    Schmidt, Jesper Hvass; Brandt, Christian; Christensen-Dalsgaard, Jakob;

    …as a comparison with traditional audiometry. A series of 30 persons (60 ears) conducted traditional audiometry as well as self-operated 2AFC audiometry. Test subjects were normal-hearing as well as moderately hearing-impaired people, and the different thresholds were compared. Results: 2AFC audiometry is reliable and comparable to traditional audiometry; it tends to give thresholds 1-2 dB lower than traditional audiometry. In general, standard deviations between the two test methods are below 4.5 dB for frequencies from 250-4000 Hz and up to 6.7 dB for frequencies above 4000 Hz. Results from test-retest studies of 2AFC audiometry are comparable to test-retest results known from traditional audiometry under standard clinical settings. Conclusions: 2-alternative forced choice audiometry can be a reliable alternative to traditional audiometry, especially under certain circumstances where it can…

  11. Averaged Electroencephalic Audiometry in Infants

    Science.gov (United States)

    Lentz, William E.; McCandless, Geary A.

    1971-01-01

    Normal, preterm, and high-risk infants were tested at 1, 3, 6, and 12 months of age using averaged electroencephalic audiometry (AEA) to determine the usefulness of AEA as a measurement technique for assessing auditory acuity in infants, and to delineate some of the procedural and technical problems often encountered. (KW)

  12. Audiometry for the Retarded: With Implications for the Difficult-to-Test.

    Science.gov (United States)

    Fulton, Robert T., Ed.; And Others

    Directed to professionals with a basic knowledge of audiological principles, the text presents a review of audiological assessment procedures and their applicability to the retarded. Pure-tone, speech, and Bekesy audiometry are described. Also discussed are differential diagnosis of auditory impairments, conditioning and audiological assessment,…

  13. ABR Audiometry in Cornelia De Lange Syndrome.

    Science.gov (United States)

    Brown, Denice P.

    Eight children (ages 13 days to 5 years) with a diagnosis of Cornelia de Lange syndrome received audiologic evaluation consisting of immittance audiometry and auditory brainstem response audiometry to air and bone conducted "click" stimuli, as behavioral testing was unreliable due to patient age and/or developmental delay. Developmental…

  14. Objective Audiometry using Ear-EEG

    DEFF Research Database (Denmark)

    Christensen, Christian Bech; Kidmose, Preben

    …therefore be an enabling technology for objective audiometry out of the clinic, allowing regular fitting of the hearing aids to be made by the users in their everyday life environment. The objective of this study is to investigate the application of ear-EEG in objective audiometry…

  15. Objective Audiometry using Ear-EEG

    DEFF Research Database (Denmark)

    Christensen, Christian Bech; Kidmose, Preben

    …life. Ear-EEG may therefore be an enabling technology for objective audiometry out of the clinic, allowing regular fitting of the hearing aids to be made by the users in their everyday life environment. In this study we investigate the application of ear-EEG in objective audiometry…

  16. Brain stem evoked response audiometry: A review

    OpenAIRE

    Balasubramanian Thiagarajan

    2015-01-01

    Brain stem evoked response audiometry (BERA) is a useful objective assessment of hearing. A major advantage of this procedure is its ability to test even infants, in whom conventional audiometry may not be useful. This investigation can be used as a screening test for deafness in high-risk infants. Early diagnosis and rehabilitation will reduce disability in these children. This article attempts to review the published literature on this subject. Methodology: Internet search using goog...

  17. PC-based tele-audiometry.

    Science.gov (United States)

    Choi, Jong Min; Lee, Haet Bit; Park, Cheol Soo; Oh, Seung Ha; Park, Kwang Suk

    2007-10-01

    A personal computer (PC)-based audiometer was developed for interactive remote audiometry. This paper describes a tele-audiometric system and evaluates the performance of the device when compared with conventional face-to-face audiometry. The tele-audiometric system is fully PC-based. A sound card featuring a high-quality digital-to-analog converter is used as a pure-tone generator. The audiometric programs were developed based on Microsoft Windows in order to maximize usability. Audiologists and their subjects can use the tele-audiometry system as one would utilize any PC application. A calibration procedure has been applied for the standardization of sound levels in the remote system. The performance of this system was evaluated by comparing PC-based audiometry with the conventional clinical audiometry system for 37 subjects. Also, performance of the PC-based system was evaluated during use at a remote site. The PC-based audiometry system estimated the audiometric threshold with an error of less than 2.3 dB SPL. Only 10.7% of the results exhibited an error greater than 5 dB SPL during use at a remote site. The PC-based tele-audiometry showed acceptable results for use at a remote site. This PC-based system can be used effectively and easily in many locations that have Internet access but no local audiologists.
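
    As a rough illustration of the pure-tone generation step such a sound-card audiometer performs, here is a minimal stdlib-only Python sketch; the frequency, duration, and dBFS level are arbitrary example values, and a real system would map dB HL to output only after the per-transducer calibration described above.

```python
# Hedged sketch of pure-tone generation for a PC-based audiometer.
# Writes a 16-bit PCM WAV file; parameter values are illustrative.
import math
import struct
import wave

def write_tone(path, freq_hz=1000.0, dur_s=1.0, level_dbfs=-20.0, fs=44100):
    amp = 10 ** (level_dbfs / 20.0)        # level relative to digital full scale
    with wave.open(path, "w") as w:
        w.setnchannels(1)
        w.setsampwidth(2)                  # 16-bit samples
        w.setframerate(fs)
        frames = bytearray()
        for i in range(int(dur_s * fs)):
            s = amp * math.sin(2 * math.pi * freq_hz * i / fs)
            frames += struct.pack("<h", int(s * 32767))
        w.writeframes(bytes(frames))

write_tone("tone_1kHz.wav")                # 1 kHz test tone at -20 dBFS
```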

  18. AUDIOMETRY-FIRST STEP TO EARLY DETECTION

    Directory of Open Access Journals (Sweden)

    Slavco CADIEV

    1997-09-01

    The problem of early detection of children with impaired hearing is very complicated. If the damage to the sense of hearing is not detected in time and adequate treatment is not undertaken, remediation after the seventh year of life is impossible. Audiometry is one step toward establishing the diagnosis with the help of electronic technology.

  19. The relevance of the high frequency audiometry in tinnitus patients with normal hearing in conventional pure-tone audiometry

    OpenAIRE

    Veronika Vielsmeier; Astrid Lehner; Jürgen Strutz; Thomas Steffens; Kreuzer, Peter M; Martin Schecklmann; Michael Landgrebe; Berthold Langguth; Tobias Kleinjung

    2015-01-01

    Objective. The majority of tinnitus patients suffer from hearing loss. But a subgroup of tinnitus patients show normal hearing thresholds in the conventional pure-tone audiometry (125 Hz–8 kHz). Here we explored whether the results of the high frequency audiometry (>8 kHz) provide relevant additional information in tinnitus patients with normal conventional audiometry by comparing those with normal and pathological high frequency audiometry with respect to their demographic and clinical chara...

  20. Conventional Audiometry, Extended High-Frequency Audiometry, and DPOAE for Early Diagnosis of NIHL

    OpenAIRE

    Mehrparvar, Amir Houshang; Mirmohammadi, Seyyed Jalil; Davari, Mohammad Hossein; MOSTAGHACI, Mehrdad; Mollasadeghi, Abolfazl; Bahaloo, Maryam; Hashemi, Seyyed Hesam

    2014-01-01

    Background: Noise most frequently affects the hearing system, as it may typically cause a bilateral, progressive sensorineural hearing loss at high frequencies. Objectives: This study was designed to compare three different methods to evaluate noise-induced hearing loss (conventional audiometry, high-frequency audiometry, and distortion product otoacoustic emission). Material and Methods: This was a cross-sectional study. Data were analyzed by SPSS (ver. 19) using chi-square, t-test and repeated m...

  1. The Relevance of the High Frequency Audiometry in Tinnitus Patients with Normal Hearing in Conventional Pure-Tone Audiometry

    Directory of Open Access Journals (Sweden)

    Veronika Vielsmeier

    2015-01-01

    Objective. The majority of tinnitus patients suffer from hearing loss, but a subgroup of tinnitus patients show normal hearing thresholds in the conventional pure-tone audiometry (125 Hz–8 kHz). Here we explored whether the results of the high frequency audiometry (>8 kHz) provide relevant additional information in tinnitus patients with normal conventional audiometry by comparing those with normal and pathological high frequency audiometry with respect to their demographic and clinical characteristics. Subjects and Methods. From the database of the Tinnitus Clinic at Regensburg we identified 75 patients with normal hearing thresholds in the conventional pure-tone audiometry. We contrasted these patients with normal and pathological high-frequency audiograms and compared them with respect to gender, age, tinnitus severity, pitch, laterality and duration, comorbid symptoms, and triggers for tinnitus onset. Results. Patients with pathological high frequency audiometry were significantly older and had higher scores on the tinnitus questionnaires in comparison to patients with normal high frequency audiometry. Furthermore, there was an association of high frequency audiometry with the laterality of tinnitus. Conclusion. In tinnitus patients with normal pure-tone audiometry the high frequency audiometry provides useful additional information. The association between tinnitus laterality and asymmetry of the high frequency audiometry suggests a potential causal role for the high frequency hearing loss in tinnitus etiopathogenesis.

  2. Understanding Bilingualism and Its Impact on Speech Audiometry.

    Science.gov (United States)

    von Hapsburg, Deborah; Pena, Elizabeth D.

    2002-01-01

    This tutorial reviews auditory research conducted with monolingual and bilingual speakers of Spanish and English. Based on a functional view of bilingualism and on auditory research findings showing that the bilingual experience may affect the outcome of auditory research, it discusses methods for improving descriptions of linguistically diverse…

  3. Electrophysiological Techniques for Sea Lion Population-Level Audiometry

    Science.gov (United States)

    2009-09-30

    James J. Finneran, Space and Naval Warfare Systems Center Pacific, Biosciences Division, Code 71510, 53560 Hull Street, San Diego, CA. Dates covered: 2009.

  4. Collection and analysis of offshore workforce audiometry data

    Energy Technology Data Exchange (ETDEWEB)

    NONE

    2000-05-01

    This report summarises the results of a study analysing audiometry data to determine whether noise-induced hearing loss is occurring in offshore operations. The background to the study is traced, and details are given of the initial contacts with medical and operational companies holding audiometry data, the confidentiality of the data sources, the questionnaire for the holders of personnel audiometry data, and initial data checking. A descriptive analysis of the study population is presented, and the analysis of audiometry data, hearing threshold levels, and the classification of the data using the Health and Safety Executive (HSE) categorisation scheme are discussed. The questionnaire for the data holders, the audiometry data collection proforma, and guidance for completion of data collection proformas are included in appendices.

  5. Comparison of subjective audiometry in patients with acoustic trauma and “noisy” production workers

    Directory of Open Access Journals (Sweden)

    Shydlovska T.A.

    2014-11-01

    Introduction: The problem of diagnosis and treatment of sensorineural hearing loss (SHL), including forms developed under the influence of noise, occupies one of the leading places in otolaryngology. However, there are not many studies on acoustic trauma, although this problem has recently become more and more important. Objective: A comparison of subjective audiometry in patients with sensorineural hearing loss after acute acoustic trauma and chronic noise exposure. Materials and methods: The results of examination of 84 patients with acoustic trauma are given, with 15 healthy subjects as the control group and 15 workers employed in "noisy" occupations as a comparison group. Subjective audiometry was fully carried out with a clinical audiometer AC-40 «Interacoustics» (Denmark). Hearing indices were investigated in the conventional (0.125-8 kHz) and extended (9-16 kHz) frequency bands. Results: Subjective audiometry showed a reduction in sound perception in all patients. According to threshold tone audiometry, hearing thresholds in patients with acoustic trauma were significantly (P<0.05) increased at the 4, 6 and 8 kHz tones of the conventional (0.125-8 kHz) frequency band and at the 14-16 kHz tones of the extended (9-16 kHz) band in comparison with the control group, as well as with the workers employed in noisy occupations. All the examined patients had deterioration of speech audiometry and suprathreshold audiometry. Conclusions: According to subjective audiometry, disorders of auditory function in patients with acoustic trauma are similar in type to those in patients with long-term noise exposure, but they are more pronounced and develop much faster. The most informative features showing the origin and progression of hearing loss in patients with acoustic trauma are: increased hearing thresholds at the 14 and 16 kHz tones of the extended (9-16 kHz) frequency band and at the 4, 6 and 8 kHz tones of the conventional (0.125-8 kHz) frequency band, plus the reduction of

  6. Evoked response audiometry used in testing auditory organs of miners

    Energy Technology Data Exchange (ETDEWEB)

    Malinowski, T.; Klepacki, J.; Wagstyl, R.

    1980-01-01

    The evoked response audiometry method of testing hearing loss is presented and the results of comparative studies using subjective tonal audiometry and evoked response audiometry in tests of 56 healthy men with good hearing are discussed. The men were divided into three groups according to age and place of work: work place without increased noise; work place with noise and vibrations (at drilling machines); work place with noise and shocks (work at excavators in surface coal mines). The ERA-MKII audiometer produced by the Medelec-Amplaid firm was used. Audiometric threshold curves for the three groups of tested men are given. At frequencies of 500, 1000 and 4000 Hz the mean objective auditory threshold was shifted by 4-9.5 dB in comparison to the subjective auditory threshold. (21 refs.) (In Polish)

  7. The Role of Immittance Audiometry in Detecting Middle Ear Disease

    OpenAIRE

    Jacobson, John T.

    1981-01-01

    Immittance audiometry is an objective technique which evaluates middle ear function by three procedures: static immittance, tympanometry, and the measurement of acoustic reflex threshold sensitivity. This article discusses the technique's ability to identify middle ear effusion, the single leading ear disease in children.

  8. Visual reinforcement audiometry: an Adobe Flash based approach.

    Science.gov (United States)

    Atherton, Steve

    2010-09-01

    Visual Reinforcement Audiometry (VRA) is a key behavioural test for young children and is central to the diagnosis of hearing-impaired infants (1). Habituation to the visual reinforcement can give misleading results. Medical Illustration, ABM University Health Board, has designed a collection of Flash animations to overcome this.

  9. Extended high-frequency audiometry (9,000-20,000 Hz). Usefulness in audiological diagnosis.

    Science.gov (United States)

    Rodríguez Valiente, Antonio; Roldán Fidalgo, Amaya; Villarreal, Ithzel M; García Berrocal, José R

    2016-01-01

    Early detection and appropriate treatment of hearing loss are essential to minimise the consequences of hearing loss. In addition to conventional audiometry (125-8,000 Hz), extended high-frequency audiometry (9,000-20,000 Hz) is available. This type of audiometry may be useful in early diagnosis of hearing loss in certain conditions, such as the ototoxic effect of cisplatin-based treatment, noise exposure, or difficulty understanding speech, especially in noisy environments. Eleven examples are shown in which extended high-frequency audiometry has been useful in early detection of hearing loss, despite the subject having normal conventional audiometry. The goal of the present paper was to highlight the importance of the extended high-frequency audiometry examination so that it may become a standard tool in routine audiological examinations.

  10. The Frequency of Hearing Loss and Hearing Aid Prescription in the Clients of the Avesina Education and Health Center, Audiometry Clinic, 1377

    Directory of Open Access Journals (Sweden)

    Abbas Bastani

    2003-08-01

    Objective: To determine the frequency of hearing disorders and hearing aid use among the clients referred to the Avesina education and health center audiometry clinic in 1377. Method and Material: This is a descriptive survey conducted on 2053 clients (1234 males and 819 females) who were referred for audiometry after examination by a physician. Case history, otoscopy, PTA, speech and immittance audiometry were conducted for all the clients. The findings were expressed in tables and diagrams of frequency; the relationship of age and sex, all types of hearing loss, and the number of hearing-impaired clients needing a hearing aid were assessed. Findings: 56% of this population were hearing-impaired and 44% had normal hearing. 60% were males and 40% females. Of the hearing-impaired, 44% had SNHL, 35.6% CHL and 8.2% mixed hearing loss. A hearing aid was prescribed for the 204 clients in need (83 females and 121 males), but only 20 females and 32 males wear one. Conclusion: In this sample, SNHL is of higher frequency. According to this survey, the older the client, the more readily the hearing aid is accepted (85% of wearers are older than 49); the prevalence of hearing impairment in males is higher than in females (60% versus 40%). Only 25% of the hearing-impaired wear hearing aids.

  11. A user-operated audiometry method based on the maximum likelihood principle and the two-alternative forced-choice paradigm

    DEFF Research Database (Denmark)

    Schmidt, Jesper Hvass; Brandt, Christian; Pedersen, Ellen Raben

    2014-01-01

    Objective: To create a user-operated pure-tone audiometry method based on the method of maximum likelihood (MML) and the two-alternative forced-choice (2AFC) paradigm, with high test-retest reliability, without the need of an external operator, and with minimal influence of subjects' fluctuating response criteria. User-operated audiometry was developed as an alternative to traditional audiometry for research purposes among musicians. Design: Test-retest reliability of the user-operated audiometry system was evaluated, and the user-operated audiometry system was compared with traditional audiometry. Study sample: Test-retest reliability of user-operated 2AFC audiometry was tested with 38 naïve listeners. User-operated 2AFC audiometry was compared to traditional audiometry in 41 subjects. Results: The repeatability of user-operated 2AFC audiometry was comparable to traditional audiometry…
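
    A minimal sketch of how an MML tracker with a 2AFC response paradigm can work, not the authors' implementation: each trial updates the likelihood of candidate thresholds under a logistic psychometric function with a 0.5 guess rate, and the next probe level is the current maximum-likelihood threshold. The slope, level grid, and trial count are illustrative assumptions.

```python
# Hedged sketch of maximum-likelihood (MML) threshold tracking with 2AFC.
import math
import random

def p_correct(level, threshold, slope=1.0, guess=0.5):
    p = guess + (1 - guess) / (1 + math.exp(-slope * (level - threshold)))
    return min(max(p, 1e-9), 1 - 1e-9)       # clamp so log() stays finite

def run_mml(respond, levels=range(-10, 80, 2), n_trials=30, start=40):
    loglik = {t: 0.0 for t in levels}        # flat prior over candidate thresholds
    level = start
    for _ in range(n_trials):
        correct = respond(level)
        for t in loglik:
            p = p_correct(level, t)
            loglik[t] += math.log(p if correct else 1 - p)
        level = max(loglik, key=loglik.get)  # probe at the current MLE
    return level

# Simulated listener with a true threshold of 25 dB:
sim = lambda lvl: random.random() < p_correct(lvl, threshold=25)
print("Estimated threshold:", run_mml(sim), "dB")
```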

  12. Extended High Frequency Audiometry in Polycystic Ovary Syndrome

    Directory of Open Access Journals (Sweden)

    Cuneyt Kucur

    2013-01-01

    …age and BMI of the PCOS and control groups were comparable. Each subject was tested with low (250–2000 Hz), high (4000–8000 Hz), and extended high frequency (8000–20000 Hz) audiometry. Hormonal and biochemical values including LH, LH/FSH, testosterone, fasting glucose, fasting insulin, HOMA-I, and CRP were calculated. Results. PCOS patients showed high levels of LH, LH/FSH, testosterone, fasting insulin, glucose, HOMA-I, and CRP. The hearing thresholds of the groups were similar at frequencies of 250, 500, 1000, 2000, and 4000 Hz; a statistically significant difference was observed at 8000–14000 Hz in the PCOS group compared to the control group. Conclusion. PCOS patients have hearing impairment especially in the extended high frequencies. Further studies are needed to help elucidate the mechanism behind hearing impairment in association with PCOS.

  13. Correlation of the CT analysis and audiometry in otosclerosis

    Energy Technology Data Exchange (ETDEWEB)

    Kiyomizu, Kensuke; Tono, Tetsuya; Yang, Dewen; Haruta, Atsushi; Kodama, Takao; Kato, Eiji; Komune, Shizuo [Miyazaki Medical Coll., Kiyotake (Japan)

    1998-11-01

    Thirty-three patients (62 ears) with surgically confirmed otosclerosis underwent a preoperative CT examination in order to determine the presence of any correlation between the audiometric and CT findings. Based on the CT findings, the ears were classified into five groups as follows: group A, 25 ears (40.3%) with normal CT findings; group B1, 15 ears (24.2%) with demineralization in the region of the fissula ante fenestram; group B2, 12 ears (19.4%) with demineralization around and anterior to the oval window; group B3, 4 ears (6.5%) with demineralization surrounding the cochlea; and group C, 6 ears (9.7%) with thick anterior and posterior plaques. The expansion of demineralization led to an increase in the average bone conduction hearing level: group A, 27.1 dB; group B1, 30.6 dB; group B2, 34.6 dB; group B3, 36.7 dB; and group C, 30.3 dB. This increase is most likely due to progressive labyrinthine otosclerosis. The average air-bone gap in group C (37.5 dB) was greater than in the groups with demineralization: group B1 (21.6 dB), group B2 (28.2 dB), and group B3 (26.7 dB). The Carhart effect of group C was smaller than that of any other group, suggesting that the mode of otosclerosis progression in group C differs from that in patients with demineralization. The results of the present study indicate that the preoperative CT findings of otosclerosis correlate with the audiometry findings, thus proving the usefulness of CT in diagnosing otosclerosis. (author)

  14. Noise induced hearing loss: Screening with pure-tone audiometry and speech-in-noise testing

    NARCIS (Netherlands)

    Leensen, M.C.J.

    2013-01-01

    Noise-induced hearing loss (NIHL) is a highly prevalent public health problem, caused by exposure to loud noises both during leisure time, e.g. by listening to loud music, and during work. In the past years NIHL was the most commonly reported occupational disease in the Netherlands. Hearing damage c

  15. Pure-tone and speech audiometry in patients with Meniere's disease

    NARCIS (Netherlands)

    Mateijsen, DJM; Van Hengel, PWJ; Van Huffelen, WM; Wit, HP; Albers, FWJ

    2001-01-01

    The aim of this study was to reinvestigate many of the claims in the literature about hearing loss in patients with Meniere's disease. We carried this out on a well-defined group of patients under well-controlled circumstances. Thus, we were able to find support for some claims and none for many ot

  16. Pure tone audiometry and impedance screening of school entrant children by nurses: evaluation in a practical setting.

    OpenAIRE

    Holtby, I; Forster, D P; Kumar, U.

    1997-01-01

    BACKGROUND: Screening for hearing loss in English children at entry to school (age 5-6 years) is usually by pure tone audiometry sweep undertaken by school nurses. This study aimed to compare the validity and screening rates of pure tone audiometry with impedance screening in these children. METHODS: Two stage pure tone audiometry and impedance methods of screening were compared in 610 school entry children from 19 infant schools in north east England. Both procedures were completed by school...

  17. A low cost setup for behavioral audiometry in rodents.

    Science.gov (United States)

    Tziridis, Konstantin; Ahlf, Sönke; Schulze, Holger

    2012-10-16

    In auditory animal research it is crucial to have precise information about the basic hearing parameters of the animal subjects that are involved in the experiments. Such parameters may be physiological response characteristics of the auditory pathway, e.g. via brainstem evoked response audiometry (BERA). But these methods allow only indirect and uncertain extrapolations about the auditory percept that corresponds to these physiological parameters. To assess the perceptual level of hearing, behavioral methods have to be used. A potential problem with the use of behavioral methods for the description of perception in animal models is the fact that most of these methods involve some kind of learning paradigm before the subjects can be behaviorally tested; e.g. animals may have to learn to press a lever in response to a sound. As these learning paradigms change perception itself (1,2), they consequently influence any result about perception obtained with these methods and therefore have to be interpreted with caution. Exceptions are paradigms that make use of reflex responses, because here no learning paradigms have to be carried out prior to perceptual testing. One such reflex response is the acoustic startle response (ASR), which can be elicited highly reproducibly with unexpected loud sounds in naïve animals. This ASR in turn can be influenced by preceding sounds, depending on the perceptibility of the preceding stimulus: sounds well above hearing threshold will completely inhibit the amplitude of the ASR; sounds close to threshold will only slightly inhibit the ASR. This phenomenon is called pre-pulse inhibition (PPI) (3,4), and the amount of PPI of the ASR gradually depends on the perceptibility of the pre-pulse. PPI of the ASR is therefore well suited to determine behavioral audiograms in naïve, non-trained animals, to determine hearing impairments or even to detect possible subjective tinnitus percepts in these animals. In this paper we demonstrate the use of this method in a
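
    A minimal sketch of how PPI data can be reduced to a threshold estimate, assuming mean startle amplitudes per pre-pulse level: PPI is the fractional reduction of startle amplitude, and the threshold is read as the lowest pre-pulse level whose PPI reaches a criterion. The amplitudes and the 20% criterion below are invented for illustration, not values from this paper.

```python
# Hedged sketch of estimating a hearing threshold from pre-pulse
# inhibition (PPI) of the acoustic startle response.
def ppi_percent(startle_alone, startle_with_prepulse):
    return 100.0 * (1.0 - startle_with_prepulse / startle_alone)

def threshold(startle_alone, amp_by_level, criterion=20.0):
    for level in sorted(amp_by_level):           # pre-pulse level in dB SPL
        if ppi_percent(startle_alone, amp_by_level[level]) >= criterion:
            return level
    return None                                  # no level reached criterion

amps = {10: 0.98, 20: 0.95, 30: 0.70, 40: 0.45}  # hypothetical mean amplitudes
print("Estimated threshold:", threshold(1.0, amps), "dB SPL")
```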

  18. A STUDY OF HEARING IMPROVEMENT AFTER TYMPANOPLASTY BY MEANS OF PURE TONE AUDIOMETRY

    Directory of Open Access Journals (Sweden)

    Siddharth Nirwan

    2015-01-01

    BACKGROUND: Chronic Suppurative Otitis Media (CSOM) is an important cause of preventable hearing loss, particularly in the developing world. Tympanoplasty is a procedure to eradicate the disease in the middle ear and to reconstruct the hearing mechanism. Pure tone audiometry is an efficient, simple and economic tool to assess the level of postoperative hearing gain

  19. Air-Puff Conditioning Audiometry: Extending Its Applicability with Multiply Handicapped Individuals.

    Science.gov (United States)

    Lancioni, G. E.; And Others

    1990-01-01

    This study examined the use of air-puff conditioning audiometry in the hearing assessment of 12 multiply handicapped (including severe/profound mental retardation) subjects, ages 9-32. Ten subjects reached criterion conditioning and then completed the hearing assessment with the air-puff procedure while one reached criterion with a modified…

  20. Infant Thresholds with Enhanced Attention to the Signal in Visual Reinforcement Audiometry.

    Science.gov (United States)

    Primus, Michael A.

    1988-01-01

    A standard operant procedure, Visual Reinforcement Audiometry, was modified to enhance 16 infants' attention to impending auditory signals. The modified technique achieved an average 5.5 dB improvement in threshold over the conventional technique. Correction for adult performance in similar tasks indicated a 3.3 dB attentional effect between…

  1. Identification Audiometry in an Institutionalized Severely and Profoundly Mentally Retarded Population.

    Science.gov (United States)

    Moore, Ernest J.; And Others

    An audiometric screening survey was conducted on a severely and profoundly mentally retarded population using noise-makers and pure tone audiometry. Of those tested with noise-makers, 83% gave an identifiable response to sound, 7% did not respond, and 10% were considered difficult-to-test. By contrast, 4% passed, 2% failed, and 94% were…

  2. Validity of diagnostic computer-based air and forehead bone conduction audiometry.

    Science.gov (United States)

    Swanepoel, De Wet; Biagio, Leigh

    2011-04-01

    Computer-based audiometry allows for novel applications, including remote testing and automation, that may improve the accessibility and efficiency of hearing assessment in various clinical and occupational health settings. This study describes the validity of computer-based, diagnostic air and forehead bone conduction audiometry when compared with conventional industry standard audiometry in a sound booth environment. A sample of 30 subjects (19 to 77 years of age) was assessed with computer-based (KUDUwave 5000) and industry standard conventional audiometers (GSI 61) to compare air and bone conduction thresholds and test-retest reliability. Air conduction thresholds for the two audiometers corresponded within 5 dB or less in more than 90% of instances, with an average absolute difference of 3.5 dB (3.8 SD) and a 95% confidence interval of 2.6 to 4.5 dB. Bone conduction thresholds for the two audiometers corresponded within 10 dB or less in 92% of instances, with an average absolute difference of 4.9 dB (4.9 SD) and a 95% confidence interval of 3.6 to 6.1 dB. The average absolute test-retest threshold difference for bone conduction on the industry standard audiometer was 5.1 dB (5.3 SD) and for the computer-based audiometer 7.1 dB (6.4 SD). Computer-based audiometry provided air and bone conduction thresholds within the test-retest reliability limits of industry standard audiometry.

  3. Speech Problems

    Science.gov (United States)

    ... of your treatment plan may include seeing a speech therapist, a person who is trained to treat speech disorders. How often you have to see the speech therapist will vary; you'll probably start out seeing ...

  4. Relationship between the findings of pure-tone audiometry and otoacoustic emission tests on military police personnel

    OpenAIRE

    Guida, Heraldo Lorena; De Sousa, Ariane Laís [UNESP]; Cardoso, Ana Cláudia Vieira

    2012-01-01

    Introduction: Otoacoustic emissions can be an alternative for cochlear evaluation in noise induced hearing loss (NIHL). Objective: To investigate the correlation between the findings of audiometry results and distortion product otoacoustic emissions (DPOAE) in the military police. Method: In a cross-sectional, retrospective study, 200 military police officers were submitted to audiological evaluation - pure tone audiometry and DPOAE. Results: Considering the provisions of Ordinance 19 of t...

  5. Dynamics of pure tone audiometry and DPOAE changes induced by glycerol in Meniere's disease.

    Science.gov (United States)

    Jablonka-Strom, Agnieszka; Pospiech, Lucyna; Zatonski, Maciej; Bochnia, Marek

    2013-05-01

    The purpose of this study is to follow up the dynamics of pure tone threshold and DPOAE amplitude changes induced by glycerol, with reference to its activity in the inner ear. Selection was made among 38 patients with Meniere's disease for those having a positive glycerol test. Pure-tone audiometry and a DP-gram were performed in four series: as an initial examination before glycerol intake, and 1, 2 and 3 h after. The audiometric changes formed a distinct biphasic pattern at all frequencies between 250 and 4,000 Hz. The most dynamic pure tone threshold decrease occurred during the first hour. Between the first and second hour after glycerol ingestion there was a phase of no significant hearing changes. A further pure tone threshold decrease went on within the third hour, reaching its maximum. Observing the DPOAE changes, the highest DP amplitude growth occurred after the second and the third hour at DP-gram frequencies 2, 3 and 4 kHz. The fastest DP-amplitude increase was likewise registered during the first hour after glycerol ingestion. In 11 persons with both an audiometry- and DPOAE-positive glycerol test, parallel dynamics in the course of the glycerol test was observed. The biphasic glycerol test dynamics suggests the possibility of two mechanisms of glycerol activity in the inner ear.

  6. Cisplatin-based chemotherapy: Add high-frequency audiometry in the regimen

    Directory of Open Access Journals (Sweden)

    R Arora

    2009-01-01

    Background: Cisplatin-induced ototoxicity shows high interindividual variability and is often accompanied by transient or permanent tinnitus. It is not possible to identify the susceptible individuals before commencement of the treatment. We conducted a prospective, randomized and observational study in a tertiary care centre and evaluated the effects of different doses of cisplatin on hearing. Materials and Methods: Fifty-seven patients scheduled for cisplatin-based chemotherapy were included in the study. All patients were divided into three groups depending on the dose of cisplatin infused in 3 weeks. Results: Subjective hearing loss was found in seven patients, while six patients had tinnitus during the chemotherapy. The hearing loss was sensorineural, dose dependent, symmetrical, bilateral and irreversible. Higher frequencies were the first to be affected in cisplatin chemotherapy. Conclusion: As the use of high-frequency audiometry is still limited to research work, we need a strict protocol adding high-frequency audiometry to cisplatin-based chemotherapy regimens.

  7. The Galker test of speech reception in noise

    DEFF Research Database (Denmark)

    Lauritsen, Maj-Britt Glenn; Söderström, Margareta; Kreiner, Svend

    2016-01-01

    …and daycare teachers completed questionnaires on the children's ability to hear and understand speech. As most of the variables were not assessed using interval scales, non-parametric statistics (Goodman-Kruskal's gamma) were used for analyzing associations with the Galker test score. For comparisons, analysis of variance (ANOVA) was used. Interrelations were adjusted for using a non-parametric graphic model. RESULTS: In unadjusted analyses, the Galker test was associated with gender, age group, language development (Reynell revised scale), audiometry, and tympanometry. The Galker score was also…

  8. STANDARDIZING OF BRAINSTEM EVOKED RESPONSE AUDIOMETRY VALUES PRELIMINARY TO STARTING A BERA LAB IN A HOSPITAL

    Directory of Open Access Journals (Sweden)

    Sivaprasad

    2014-07-01

    INTRODUCTION: The subjective assessment of hearing is primarily done by pure tone audiometry. It is a commonly undertaken test which can tell us the hearing acuity of a person when carried out under ideal conditions. However, not infrequently otologists encounter difficulty in performing subjective audiometry, or circumstances where the test results do not correlate with the disease in question. Hence they have to depend upon objective tests to get a workable knowledge of the patient's hearing threshold. Of the various objective tests available, the most popular are brain stem evoked response audiometry (a non-invasive and more standardized parameter), electrocochleography, auditory steady state response, and the otoacoustic emission test (OAE). Otoacoustic emission doesn't measure hearing acuity; it gives us an idea whether there is any deafness or not. But BERA is useful in detecting and quantifying deafness in difficult-to-test patients like infants, mentally retarded people, malingerers, and deeply sedated and anaesthetized patients. It determines objectively the nature of deafness (i.e., whether sensory or neural) in difficult-to-test patients. It helps to locate the site of lesion in retro-cochlear pathologies (in an area from the spiral ganglion of the cochlear nerve to the midbrain (inferior colliculus)). Study of central auditory disorders is possible. Study of the maturity of the central nervous system in newborns, objective identification of brain death, and assessing prognosis in comatose patients are other uses. Before starting a BERA lab in a hospital it is mandatory to standardize the normal values in a randomly selected group of persons meeting certain criteria: normal ears with an intact tympanic membrane and without any complaints of hearing loss. Persons aged between 5 and 60 years were taken for this study. The study group included both males and females. The aim of this study is to assess the hearing pathway in normal hearing individuals and compare

  9. National Survey of State Identification Audiometry Programs and Special Educational Services for Hearing Impaired Children and Youth United States: 1972.

    Science.gov (United States)

    Gallaudet Coll., Washington, DC. Office of Demographic Studies.

    Reported were descriptive data concerning identification audiometry (hearing screening) and special educational programs for the hearing impaired. Data were provided in tabular format for each state in the country and the District of Columbia. Hearing screening program data included extent of coverage, grade or ages covered annually, year and…

  10. Auditory evaluation of the microcephalic children with brain stem evoked response audiometry (BERA).

    Science.gov (United States)

    Das, Piyali; Bandyopadhyay, Manimay; Ghugare, Balaji W; Ghate, Jayshree; Singh, Ramji

    2010-01-01

    Microcephaly implies a reduced occipito-frontal circumference. Brain stem evoked response audiometry (BERA) was used to locate the exact site of lesion resulting in the auditory impairment, so that appropriate early rehabilitative measures can be taken. The study revealed that the absolute peak latency of wave V and the interpeak latencies of III-V and I-V were significantly higher (P value < 0.05 in each case) in microcephalics than in normal children. Auditory impairment in microcephaly is a common neurodeficit that can be authentically assessed by BERA. The hearing impairment in microcephalics is mostly due to insufficiency of the central components of the auditory pathway at the level of the brainstem, the function of peripheral structures being almost within normal limits.

  11. Accuracy of Mobile-Based Audiometry in the Evaluation of Hearing Loss in Quiet and Noisy Environments.

    Science.gov (United States)

    Saliba, Joe; Al-Reefi, Mahmoud; Carriere, Junie S; Verma, Neil; Provencal, Christiane; Rappaport, Jamie M

    2017-04-01

    Objectives: (1) To compare the accuracy of 2 previously validated mobile-based hearing tests in determining pure tone thresholds and screening for hearing loss. (2) To determine the accuracy of mobile audiometry in noisy environments through noise reduction strategies. Study Design: Prospective clinical study. Setting: Tertiary hospital. Subjects and Methods: Thirty-three adults with or without hearing loss were tested (mean age, 49.7 years; women, 42.4%). Air conduction thresholds measured as pure tone average and at individual frequencies were assessed by conventional audiogram and by 2 audiometric applications (consumer and professional) on a tablet device. Mobile audiometry was performed in a quiet sound booth and in a noisy sound booth (50 dB of background noise) through active and passive noise reduction strategies. Results: On average, 91.1% (95% confidence interval [95% CI], 89.1%-93.2%) and 95.8% (95% CI, 93.5%-97.1%) of the threshold values obtained in a quiet sound booth with the consumer and professional applications, respectively, were within 10 dB of the corresponding audiogram thresholds, as compared with 86.5% (95% CI, 82.6%-88.5%) and 91.3% (95% CI, 88.5%-92.8%) in a noisy sound booth through noise cancellation. When screening for at least moderate hearing loss (pure tone average >40 dB HL), the consumer application showed a sensitivity and specificity of 87.5% and 95.9%, respectively, and the professional application, 100% and 95.9%. Overall, patients preferred mobile audiometry over conventional audiograms. Conclusion: Mobile audiometry can correctly estimate pure tone thresholds and screen for moderate hearing loss. Noise reduction strategies in mobile audiometry provide a portable effective solution for hearing assessments outside clinical settings.
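
    The two accuracy measures reported above (the share of thresholds within 10 dB of the booth audiogram, and sensitivity/specificity at a >40 dB HL pure-tone-average cutoff) reduce to simple counts. A hedged sketch with made-up threshold pairs, not the study's data:

```python
# Agreement and screening statistics for (reference, mobile) dB HL pairs.
def percent_within(pairs, tol_db=10):
    hits = sum(1 for ref, test in pairs if abs(ref - test) <= tol_db)
    return 100.0 * hits / len(pairs)

def screening_stats(pairs, cutoff_db=40):
    tp = sum(1 for ref, test in pairs if ref > cutoff_db and test > cutoff_db)
    fn = sum(1 for ref, test in pairs if ref > cutoff_db and test <= cutoff_db)
    tn = sum(1 for ref, test in pairs if ref <= cutoff_db and test <= cutoff_db)
    fp = sum(1 for ref, test in pairs if ref <= cutoff_db and test > cutoff_db)
    return tp / (tp + fn), tn / (tn + fp)       # sensitivity, specificity

ptas = [(15, 20), (55, 60), (45, 35), (25, 25), (70, 75)]  # hypothetical
print(percent_within(ptas), screening_stats(ptas))
```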

  12. Investigation of Persian Speech Interaural Attenuation in Adults

    Directory of Open Access Journals (Sweden)

    Fahimeh Hajiabolhassan

    2010-06-01

    Background and Aim: As clinical audiometric assessment of each ear requires knowledge of interaural attenuation (IA), the aim of this study was to investigate Persian speech IA in adults. Methods: This cross-sectional, analytic study was performed on 50 normal hearing students (25 males, 25 females), aged 18-25 years, in the Faculty of Rehabilitation, Tehran University of Medical Sciences. Speech reception threshold (SRT) was determined with a descending method with and without noise. Then speech IA for Persian spondaic words was calculated with TDH-39 earphones. Results: Mean speech IA was 53.06±3.25 dB. There was no significant difference between mean IA in males (53.88±2.93 dB) and females (52.24±3.40 dB) (p>0.05). The lowest IA was in females (45 dB) and the highest IA was in males (60 dB). Mother's language had no significant effect on speech IA. Conclusion: We may consider 45 dB as the lowest IA for Persian speech assessment; however, generalization needs more study on a larger sample.

  13. Plowing Speech

    OpenAIRE

    Zla ba sgrol ma

    2009-01-01

    This file contains a plowing speech and a discussion about the speech. This collection presents forty-nine audio files including several folk song genres, folktales, and local history from the Sman shad Valley of Sde dge county. World Oral Literature Project.

  14. Speech Indexing

    NARCIS (Netherlands)

    Ordelman, R.J.F.; Jong, de F.M.G.; Leeuwen, van D.A.; Blanken, H.M.; de Vries, A.P.; Blok, H.E.; Feng, L.

    2007-01-01

    This chapter will focus on the automatic extraction of information from the speech in multimedia documents. This approach is often referred to as speech indexing and it can be regarded as a subfield of audio indexing that also incorporates for example the analysis of music and sounds. If the objecti

  15. Speech coding

    Energy Technology Data Exchange (ETDEWEB)

    Ravishankar, C., Hughes Network Systems, Germantown, MD

    1998-05-08

    Speech is the predominant means of communication between human beings, and since the invention of the telephone by Alexander Graham Bell in 1876, speech services have remained the core service in almost all telecommunication systems. Original analog methods of telephony had the disadvantage of the speech signal getting corrupted by noise, cross-talk and distortion. Long haul transmissions, which use repeaters to compensate for the loss in signal strength on transmission links, also increase the associated noise and distortion. On the other hand, digital transmission is relatively immune to noise, cross-talk and distortion, primarily because of the capability to faithfully regenerate the digital signal at each repeater purely based on a binary decision. Hence the end-to-end performance of the digital link essentially becomes independent of the length and operating frequency bands of the link. Hence, from a transmission point of view, digital transmission has been the preferred approach due to its higher immunity to noise. The need to carry digital speech became extremely important from a service provision point of view as well. Modern requirements have introduced the need for robust, flexible and secure services that can carry a multitude of signal types (such as voice, data and video) without a fundamental change in infrastructure. Such a requirement could not have been easily met without the advent of digital transmission systems, thereby requiring speech to be coded digitally. The term speech coding often refers to techniques that represent or code speech signals either directly as a waveform or as a set of parameters by analyzing the speech signal. In either case, the codes are transmitted to the distant end where speech is reconstructed or synthesized using the received set of codes. A more generic term that is applicable to these techniques and that is often used interchangeably with speech coding is the term voice coding. This term is more generic in the sense that the
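
    As one concrete instance of the waveform coding described above, here is a minimal sketch of mu-law companding, the logarithmic amplitude compression standardized for G.711 telephony; the uniform 8-bit rounding shown is a simplification of the actual G.711 segmented encoding.

```python
# Hedged sketch of mu-law companding (logarithmic waveform coding).
import math

MU = 255.0  # North American / Japanese standard value

def mu_law_encode(x):       # x in [-1, 1] -> companded value in [-1, 1]
    return math.copysign(math.log1p(MU * abs(x)) / math.log1p(MU), x)

def mu_law_decode(y):
    return math.copysign(math.expm1(abs(y) * math.log1p(MU)) / MU, y)

x = 0.01                    # a quiet sample keeps fine resolution
q = round(mu_law_encode(x) * 127) / 127      # simplified 8-bit quantization
print(x, mu_law_decode(q))  # decoded value stays close to the original
```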

  16. The relationship of speech intelligibility with hearing sensitivity, cognition, and perceived hearing difficulties varies for different speech perception tests

    Directory of Open Access Journals (Sweden)

    Antje eHeinrich

    2015-06-01

    Listeners vary in their ability to understand speech in noisy environments. Hearing sensitivity, as measured by pure-tone audiometry, can only partly explain these results, and cognition has emerged as another key concept. Although cognition relates to speech perception, the exact nature of the relationship remains to be fully understood. This study investigates how different aspects of cognition, particularly working memory and attention, relate to speech intelligibility for various tests. Perceptual accuracy of speech perception represents just one aspect of functioning in a listening environment. Activity and participation limits imposed by hearing loss, in addition to the demands of a listening environment, are also important and may be better captured by self-report questionnaires. Understanding how speech perception relates to self-reported aspects of listening forms the second focus of the study. Forty-four listeners aged 50-74 years with mild SNHL were tested on speech perception tests differing in complexity from low (phoneme discrimination in quiet), to medium (digit triplet perception in speech-shaped noise), to high (sentence perception in modulated noise); on cognitive tests of attention, memory, and nonverbal IQ; and on self-report questionnaires of general health-related and hearing-specific quality of life. Hearing sensitivity and cognition related to intelligibility differently depending on the speech test: neither was important for phoneme discrimination, hearing sensitivity alone was important for digit triplet perception, and hearing and cognition together played a role in sentence perception. Self-reported aspects of auditory functioning were correlated with speech intelligibility to different degrees, with digit triplets in noise showing the richest pattern. The results suggest that intelligibility tests can vary in their auditory and cognitive demands and their sensitivity to the challenges that auditory environments pose on

  17. The role of ultrahigh-frequency audiometry in the early detection of systemic drug-induced hearing loss.

    Science.gov (United States)

    Singh Chauhan, Rajeev; Saxena, Ravinder Kumar; Varshey, Saurabh

    2011-05-01

    In monitoring patients for drug-induced hearing loss, most audiometric evaluations are limited to the range of frequencies from 0.25 to 8 kHz. However, such testing would fail to detect ototoxicity in patients who have already experienced hearing loss in the ultrahigh frequencies from 10 to 20 kHz. Awareness of ultrahigh-frequency ototoxicity could lead to changes in a drug regimen to prevent further damage. We conducted a prospective study of 105 patients who were receiving a potentially ototoxic drug (either gentamicin, amikacin, or cisplatin) to assess the value of ultrahigh-frequency audiometry in detecting systemic drug-induced hearing loss. We found that expanding audiometry into the ultrahigh-frequency range led to the detection of a substantial number of cases of hearing loss that would otherwise have been missed.

  18. Relationship between the findings of pure-tone audiometry and otoacoustic emission tests on military police personnel

    Directory of Open Access Journals (Sweden)

    Guida, Heraldo Lorena

    2012-01-01

    Full Text Available Introduction: Otoacoustic emissions can be an alternative for cochlear evaluation in noise-induced hearing loss (NIHL). Objective: To investigate the correlation between audiometry results and distortion product otoacoustic emissions (DPOAE) in military police personnel. Method: In a cross-sectional, retrospective study, 200 military police officers underwent audiological evaluation - pure tone audiometry and DPOAE. Results: Considering the provisions of Ordinance 19 of the Labour Department, the results were suggestive of hearing loss induced by high sound pressure levels in 58 individuals, distributed as follows: 28 (48.3%) bilateral cases and 30 (51.7%) unilateral cases, 15 (25.85%) in each ear. The correlation between the audiometric results and DPOAE showed statistical significance at most of the frequencies tested in both ears, confirming that the greater the degree of hearing loss, the smaller the DPOAE amplitudes. In addition, a significant difference was observed between the DPOAE amplitudes of normal-hearing subjects and listeners with hearing loss, confirming the lowering of responses in the group with hearing loss. Conclusion: Given the correlation between pure tone audiometry and DPOAE, we conclude that otoacoustic emissions can be a complementary tool for the detection and control of NIHL in military police personnel.
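    The study's key statistic is a per-frequency correlation between thresholds and DPOAE amplitudes; a minimal sketch with hypothetical numbers (not the study's data) might look like this:

      from scipy.stats import pearsonr

      # Hypothetical per-subject values at one test frequency (dB HL / dB SPL).
      thresholds_db = [10, 15, 25, 30, 40, 45, 55, 60]
      dpoae_amplitude_db = [12, 10, 6, 5, 1, -2, -6, -8]

      r, p = pearsonr(thresholds_db, dpoae_amplitude_db)
      print(f"r = {r:.2f}, p = {p:.4f}")  # negative r: more loss, smaller amplitudes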

  19. Amharic Speech Recognition for Speech Translation

    OpenAIRE

    Melese, Michael; Besacier, Laurent; Meshesha, Million

    2016-01-01

    International audience; State-of-the-art speech translation can be seen as a cascade of Automatic Speech Recognition, Statistical Machine Translation and Text-To-Speech synthesis. In this study an attempt is made to experiment on Amharic speech recognition for Amharic-English speech translation in the tourism domain. Since there is no Amharic speech corpus, we developed a read-speech corpus of 7.43 hours in the tourism domain. The Amharic speech corpus has been recorded after translating standard Bas...

  20. Contrast sensitivity test and conventional and high frequency audiometry: information beyond that required to prescribe lenses and headsets

    Science.gov (United States)

    Comastri, S. A.; Martin, G.; Simon, J. M.; Angarano, C.; Dominguez, S.; Luzzi, F.; Lanusse, M.; Ranieri, M. V.; Boccio, C. M.

    2008-04-01

    In Optometry and in Audiology, the routine tests used to prescribe correction lenses and headsets are, respectively, the visual acuity test (the first chart with letters was developed by Snellen in 1862) and conventional pure tone audiometry (the first audiometer with electrical current was devised by Hartmann in 1878). At present there are non-invasive psychophysical tests that, besides evaluating visual and auditory performance globally, and even in cases catalogued as normal according to routine tests, supply early information regarding diseases such as diabetes, hypertension, renal failure, cardiovascular problems, etc. In Optometry, one of these tests is the achromatic luminance contrast sensitivity test (introduced by Schade in 1956). In Audiology, one of these tests is high-frequency pure tone audiometry (introduced a few decades ago), which yields information on pathologies affecting the basal cochlea and complements the data resulting from conventional audiometry. These utilities of the contrast sensitivity test and of pure tone audiometry derive from the facts that Fourier components constitute the basis for synthesizing the stimuli present at the entrance of the visual and auditory systems; that these systems' responses depend on frequency; and that the patient's psychophysical state affects frequency processing. The frequency of interest in the former test is the effective spatial frequency (the inverse of the angle subtended at the eye by one cycle of a sinusoidal grating, measured in cycles/degree) and, in the latter, the temporal frequency (measured in cycles/sec). Both tests have similar durations and consist of determining the patient's threshold (corresponding to the multiplicative inverse of the contrast or to the additive inverse of the sound intensity level) for each harmonic stimulus present at the system entrance (sinusoidal grating or pure tone sound). In this article the frequencies, standard normality curves and abnormal threshold shifts
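    The effective spatial frequency defined above can be computed directly from the grating geometry; a small sketch in which the 5 mm cycle width and 1 m viewing distance are arbitrary illustrative values:

      import math

      def spatial_frequency_cpd(cycle_width_m, viewing_distance_m):
          # Cycles/degree = inverse of the visual angle (degrees) of one grating cycle.
          angle_deg = math.degrees(2 * math.atan(cycle_width_m / (2 * viewing_distance_m)))
          return 1 / angle_deg

      # A 5 mm grating cycle viewed from 1 m subtends ~0.29 deg -> ~3.5 cycles/degree.
      print(round(spatial_frequency_cpd(0.005, 1.0), 1))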

  1. Hate speech

    Directory of Open Access Journals (Sweden)

    Anne Birgitta Nilsen

    2014-12-01

    Full Text Available The manifesto of the Norwegian terrorist Anders Behring Breivik is based on the “Eurabia” conspiracy theory. This theory is a key starting point for hate speech amongst many right-wing extremists in Europe, but also has ramifications beyond these environments. In brief, proponents of the Eurabia theory claim that Muslims are occupying Europe and destroying Western culture, with the assistance of the EU and European governments. By contrast, members of Al-Qaeda and other extreme Islamists promote the conspiracy theory “the Crusade” in their hate speech directed against the West. Proponents of the latter theory argue that the West is leading a crusade to eradicate Islam and Muslims, a crusade that is similarly facilitated by their governments. This article presents analyses of texts written by right-wing extremists and Muslim extremists in an effort to shed light on how hate speech promulgates conspiracy theories in order to spread hatred and intolerance. The aim of the article is to contribute to a more thorough understanding of hate speech’s nature by applying rhetorical analysis. Rhetorical analysis is chosen because it offers a means of understanding the persuasive power of speech. It is thus a suitable tool to describe how hate speech works to convince and persuade. The concepts from rhetorical theory used in this article are ethos, logos and pathos. The concept of ethos is used to pinpoint factors that contributed to Osama bin Laden's impact, namely factors that lent credibility to his promotion of the conspiracy theory of the Crusade. In particular, Bin Laden projected common sense, good morals and good will towards his audience. He seemed to have coherent and relevant arguments; he appeared to possess moral credibility; and his use of language demonstrated that he wanted the best for his audience. The concept of pathos is used to define hate speech, since hate speech targets its audience's emotions. In hate speech it is the

  2. Speech enhancement

    CERN Document Server

    Benesty, Jacob; Chen, Jingdong

    2006-01-01

    We live in a noisy world! In all applications (telecommunications, hands-free communications, recording, human-machine interfaces, etc.) that require at least one microphone, the signal of interest is usually contaminated by noise and reverberation. As a result, the microphone signal has to be "cleaned" with digital signal processing tools before it is played out, transmitted, or stored. This book is about speech enhancement. Different well-known and state-of-the-art methods for noise reduction, with one or multiple microphones, are discussed. By speech enhancement, we mean not only noise red

  3. Speech dynamics

    NARCIS (Netherlands)

    Pols, L.C.W.

    2011-01-01

    In order for speech to be informative and communicative, segmental and suprasegmental variation is mandatory. Only this leads to meaningful words and sentences. The building blocks are not stable entities put next to each other (like beads on a string or like printed text); rather, there are gradual tran

  4. Speech Intelligibility

    Science.gov (United States)

    Brand, Thomas

    Speech intelligibility (SI) is important for different fields of research, engineering and diagnostics in order to quantify very different phenomena, such as the quality of recordings, communication and playback devices, the reverberation of auditoria, characteristics of hearing impairment, the benefit of using hearing aids, or combinations of these.

  5. Comparison of pure tone audiometry and auditory steady-state responses in subjects with normal hearing and hearing loss.

    Science.gov (United States)

    Ozdek, Ali; Karacay, Mahmut; Saylam, Guleser; Tatar, Emel; Aygener, Nurdan; Korkmaz, Mehmet Hakan

    2010-01-01

    The objective of this study is to compare pure tone audiometry and auditory steady-state response (ASSR) thresholds in normal-hearing (NH) subjects and subjects with hearing loss (HI). This study involved 23 NH adults and 38 HI adults. After detection of behavioral thresholds (BHT) with pure tone audiometry, each subject was tested for ASSR responses on the same day. Only one ear was tested per subject. The mean pure tone average was 9 ± 4 dB for the NH group and 57 ± 14 dB for the HI group. There was a very strong correlation between BHT and ASSR measurements in the HI group; the correlation was weaker in the NH group. The mean difference between the pure tone average of four frequencies (0.5, 1, 2, and 4 kHz) and the ASSR threshold average at the same frequencies was 13 ± 6 dB in the NH group and 7 ± 5 dB in the HI group, a significant difference (P = 0.01). It was found that 86% of threshold difference values were less than 20 dB in the NH group and 92% were less than 20 dB in the HI group. In conclusion, ASSR thresholds can be used to predict the configuration of pure tone audiometry; results are more accurate in the HI group than in the NH group. Although ASSR can be used in the cochlear implant decision-making process, the findings do not permit use of the test for medico-legal purposes.

  6. Examination of Hearing in a Rheumatoid Arthritis Population: Role of Extended-High-Frequency Audiometry in the Diagnosis of Subclinical Involvement.

    Science.gov (United States)

    Lasso de la Vega, Mar; Villarreal, Ithzel Maria; Lopez-Moya, Julio; Garcia-Berrocal, Jose Ramon

    2016-01-01

    Objective. The aim of this study is to analyze the high-frequency hearing levels in patients with rheumatoid arthritis and to determine the relationship between hearing loss, disease duration, and immunological parameters. Materials and Methods. A descriptive cross-sectional study including fifty-three patients with rheumatoid arthritis was performed. The control group consisted of 71 age- and sex-matched patients from the study population (consecutively recruited in Madrid "Area 9," from January 2010 to February 2011). Both a pure tone audiometry and an extended-high-frequency audiometry were performed. Results. Extended-high-frequency audiometry diagnosed sensorineural hearing loss in 69.8% of the patients which exceeded the results obtained with pure tone audiometry (43% of the patients). This study found significant correlations in patients with sensorineural hearing loss related to age, sex, and serum anti-cardiolipin (aCL) antibody levels. Conclusion. Sensorineural hearing loss must be considered within the clinical context of rheumatoid arthritis. Our results demonstrated that an extended-high-frequency audiometry is a useful audiological test that must be performed within the diagnostic and follow-up testing of patients with rheumatoid arthritis, providing further insight into a disease-modifying treatment or a hearing loss preventive treatment.

  7. Speech communications in noise

    Science.gov (United States)

    1984-07-01

    The physical characteristics of speech, the methods of speech masking measurement, and the effects of noise on speech communication are investigated. Topics include the speech signal and intelligibility, the effects of noise on intelligibility, the articulation index, and various devices for evaluating speech systems.

  8. TYPE-2 DIABETES MELLITUS AND BRAIN STEM EVOKED RESPONSE AUDIOMETRY: A CASE CONTROL STUDY

    Directory of Open Access Journals (Sweden)

    Praveen S

    2016-01-01

    Full Text Available BACKGROUND AND OBJECTIVE: Type-2 Diabetes Mellitus (T2DM) causes pathophysiological changes in multiple organ systems. Peripheral, autonomic and central neuropathy are known to occur in T2DM and can be studied electrophysiologically. AIM: The present study aimed to evaluate the functional integrity of the auditory pathway in T2DM by Brainstem Evoked Response Audiometry (BERA). MATERIAL AND METHOD: In the present case-control study, BERA was recorded from the scalp of 20 T2DM patients aged 30-65 years and compared with 20 age-matched healthy controls. BERA was performed using EMG Octopus, Clarity Medical Pvt. Ltd. The latencies of waves I, III and V and the I-III, I-V and III-V interpeak latencies of both right and left ears were recorded at 70 dB HL. STATISTICAL ANALYSIS: Mean ± SD of the latencies of waves I, III and V and of the I-III, I-V and III-V interpeak latencies were estimated for T2DM patients and healthy controls. Differences between the two groups were assessed with the unpaired Student's t-test using the GraphPad QuickCalcs calculator; P < 0.05 was considered significant. RESULT: In T2DM, BERA revealed statistically significant (p < 0.05) prolonged latencies of waves I, III and V in both the right (1.81 ± 0.33 ms, 3.96 ± 0.32 ms, 5.60 ± 0.25 ms) and left (1.96 ± 0.24 ms, 3.79 ± 0.22 ms, 5.67 ± 0.25 ms) ears compared to controls at 70 dB. The III-V interpeak latency of the left ear (1.87 ± 0.31 ms vs. 1.85 ± 0.41 ms) and the I-III (2.51 ± 0.42 ms vs. 1.96 ± 0.48 ms) and III-V (2.01 ± 0.43 ms vs. 1.76 ± 0.45 ms) interpeak latencies of the right ear were prolonged in diabetic patients compared to controls, although these differences did not reach significance (p > 0.05). INTERPRETATION AND CONCLUSION: The increase in absolute and interpeak latencies in T2DM patients suggests involvement of the central neuronal axis at the level of the brainstem and midbrain.
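    A minimal sketch of the reported comparison, an unpaired Student's t-test on wave latencies; the group means, spreads and sizes below are illustrative placeholders, not the study's data:

      import numpy as np
      from scipy.stats import ttest_ind

      rng = np.random.default_rng(0)
      t2dm_wave_v = rng.normal(5.60, 0.25, 20)     # ms, illustrative patient latencies
      control_wave_v = rng.normal(5.35, 0.25, 20)  # ms, illustrative control latencies

      t_stat, p_value = ttest_ind(t2dm_wave_v, control_wave_v)
      print(f"t = {t_stat:.2f}, p = {p_value:.4f}, significant: {p_value < 0.05}")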

  9. 78 FR 49717 - Speech-to-Speech and Internet Protocol (IP) Speech-to-Speech Telecommunications Relay Services...

    Science.gov (United States)

    2013-08-15

    ... COMMISSION 47 CFR Part 64 Speech-to-Speech and Internet Protocol (IP) Speech-to-Speech Telecommunications Relay Services; Telecommunications Relay Services and Speech-to-Speech Services for Individuals With... Internet Protocol (IP) Speech-to-Speech Telecommunications Relay Services; Telecommunications...

  10. [The potential of tone audiometry for the determination of the sound-absorbing properties of various materials].

    Science.gov (United States)

    Zinkin, V N; Sheshegov, P M

    2014-01-01

    The objective of the present work was to experimentally estimate the potential of the tone audiometry technique for determining the sound-absorbing properties of various materials. The study included 15 subjects aged from 19 to 32 years. Their audiological examination was followed by the placement of a 5×7 cm spacer plate of the study material beneath the bone vibrator to determine the bone sound-conduction threshold; no air masking was undertaken. The sound absorption of each material of interest was determined in each octave band from 250 to 8000 Hz as the difference between the baseline audiogram and the audiogram obtained with the material. The study was carried out in three stages: (1) evaluation of the sound absorption of each of the five materials, (2) measurement of the same parameter for combinations of 2-4 layers to increase sound absorption, and (3) fixation of the bone conduction vibrator by the operator's hand (a head-mounted harness was used for the same purpose at stages 1 and 2). The experiments demonstrated that the study of bone sound conduction by means of tone audiometry makes it possible to estimate the sound absorption of various materials. This technique may be applied in the development of a subjective method for the measurement of sound absorption, in order to evaluate the acoustic effectiveness of materials that can be used to construct individual protective anti-noise devices.
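    The core computation described, absorption per octave band as the threshold difference with and without the material, is simple to express; the threshold values below are illustrative only:

      OCTAVE_BANDS_HZ = [250, 500, 1000, 2000, 4000, 8000]

      def absorption_per_band(baseline_db, with_material_db):
          # Absorption estimate per band: threshold shift caused by the material.
          return {f: m - b for f, b, m in zip(OCTAVE_BANDS_HZ, baseline_db, with_material_db)}

      print(absorption_per_band([10, 10, 15, 15, 20, 25],
                                [18, 22, 30, 33, 41, 44]))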

  11. Valproate-induced reversible sensorineural hearing loss: a case report with serial audiometry and pharmacokinetic modelling during a valproate rechallenge.

    Science.gov (United States)

    Yeap, Li-Ling; Lim, Kheng-Seang; Lo, Yoke-Lin; Bakar, Mohd Zukiflee Abu; Tan, Chong-Tin

    2014-09-01

    Hearing loss has been reported with valproic acid (VPA) use. However, this is the first case of VPA-induced hearing loss that was tested and confirmed with a VPA rechallenge, supported by serial audiometry and pharmacokinetic modelling. A 39-year-old truck driver with temporal lobe epilepsy was treated with VPA at 400 mg, twice daily, and developed hearing loss after each dose, but recovered within three hours. Hearing loss fully resolved after VPA discontinuation. Audiometry performed five hours after VPA rechallenge showed significant improvement in hearing thresholds. Pharmacokinetic modelling during the VPA rechallenge showed that hearing loss occurred at a level below the therapeutic range. Brainstem auditory evoked potential at three months after VPA discontinuation showed bilateral conduction defect between the cochlear and superior olivary nucleus, supporting a pre-existing auditory deficit. VPA may cause temporary hearing threshold shift. Pre-existing auditory defect may be a risk factor for VPA-induced hearing loss. Caution should be taken while prescribing VPA to patients with pre-existing auditory deficit.

  12. Going to a Speech Therapist

    Science.gov (United States)

    Speech therapists (also called speech-language pathologists) help people of all ...

  13. Speech research

    Science.gov (United States)

    1992-06-01

    Phonology is traditionally seen as the discipline that concerns itself with the building blocks of linguistic messages. It is the study of the structure of sound inventories of languages and of the participation of sounds in rules or processes. Phonetics, in contrast, concerns speech sounds as produced and perceived. Two extreme positions on the relationship between phonological messages and phonetic realizations are represented in the literature. One holds that the primary home for linguistic symbols, including phonological ones, is the human mind, itself housed in the human brain. The second holds that their primary home is the human vocal tract.

  14. Speech-language pathology findings in patients with mouth breathing: multidisciplinary diagnosis according to etiology.

    Science.gov (United States)

    Junqueira, Patrícia; Marchesan, Irene Queiroz; de Oliveira, Luciana Regina; Ciccone, Emílio; Haddad, Leonardo; Rizzo, Maria Cândida

    2010-11-01

    The purpose of this study was to identify and compare the findings from speech-language pathology evaluations of orofacial function, including tongue and lip rest postures, tonus, articulation and speech, voice and language, chewing, and deglutition, in children with a history of mouth breathing. The diagnoses for mouth breathing included allergic rhinitis, adenoidal hypertrophy, allergic rhinitis with adenoidal hypertrophy, and/or functional mouth breathing. This study was conducted with 414 subjects of both genders, 2 to 16 years old. A team consisting of 3 speech-language pathologists, 1 pediatrician, 1 allergist, and 1 otolaryngologist evaluated the patients. Multidisciplinary clinical examinations were carried out (complete blood count, X-rays, nasofibroscopy, audiometry). The two most commonly found etiologies were allergic rhinitis, followed by functional mouth breathing. Of the 414 patients in the study, 346 received a speech-language pathology evaluation. The most prevalent finding in this group of 346 subjects was the presence of orofacial myofunctional disorders. The orofacial myofunctional disorders most frequently identified in these mouth-breathing subjects included habitual open-lips rest posture, low and forward tongue rest posture, and lack of adequate muscle tone. No statistically significant relationships were identified between etiology and speech-language diagnosis. Therefore, the specific etiology of mouth breathing does not appear to contribute to the presence, type, or number of speech-language findings that may result from mouth breathing behavior.

  15. Speech production, Psychology of

    NARCIS (Netherlands)

    Schriefers, H.J.; Vigliocco, G.

    2015-01-01

    Research on speech production investigates the cognitive processes involved in transforming thoughts into speech. This article starts with a discussion of the methodological issues inherent to research in speech production that illustrates how empirical approaches to speech production must differ fr

  16. Age-related hearing loss in dogs : Diagnosis with Brainstem-Evoked Response Audiometry and Treatment with Vibrant Soundbridge Middle Ear Implant.

    NARCIS (Netherlands)

    ter Haar, G.

    2009-01-01

    Age-related hearing loss (ARHL) is the most common cause of acquired hearing impairment in dogs. Diagnosis requires objective electrophysiological tests (brainstem evoked response audiometry [BERA]) evaluating the entire audible frequency range in dogs. In our laboratory a method was developed to de

  17. Brainstem response audiometry in the determination of low-frequency hearing loss : a study of various methods for frequency-specific ABR-threshold assessment

    NARCIS (Netherlands)

    E.A.G.J. Conijn

    1992-01-01

    textabstractBrainstem Electric Response Audiometry (BERA) is a method to visualize some of the electric activity generated in the auditory nerve and the brainstem during the processing of sound. The amplitude of the Auditory Brainstem Response (ABR) is very small (0.05-0.5 µV). The potentials origi

  18. Operant Audiometry Manual for Difficult-to-Test Children. Institute on Mental Retardation and Intellectual Development; Papers and Reports, Volume V, Number 19.

    Science.gov (United States)

    Bricker, Diane D.; And Others

    To facilitate the use of operant audiometry with low functioning children (psychotic, severely retarded, or multiply handicapped), a procedures manual was developed containing definitions of terms, instructions for determining reinforcers, physical facilities and equipment needs, diagrams, component lists, and technical descriptions. Development…

  19. Speech Enhancement

    DEFF Research Database (Denmark)

    Benesty, Jacob; Jensen, Jesper Rindom; Christensen, Mads Græsbøll;

    Speech enhancement is a classical problem in signal processing, yet still largely unsolved. Two of the conventional approaches for solving this problem are linear filtering, like the classical Wiener filter, and subspace methods. These approaches have traditionally been treated as different classes of methods and have been introduced in somewhat different contexts. Linear filtering methods originate in stochastic processes, while subspace methods have largely been based on developments in numerical linear algebra and matrix approximation theory. This book bridges the gap between these two classes of methods by showing how the ideas behind subspace methods can be incorporated into traditional linear filtering. In the context of subspace methods, the enhancement problem can then be seen as a classical linear filter design problem. This means that various solutions can more easily be compared...

  20. Speech therapy with obturator.

    Science.gov (United States)

    Shyammohan, A; Sreenivasulu, D

    2010-12-01

    Rehabilitation of speech is as important as closure of the defect in cases of velopharyngeal insufficiency. Often the importance of speech therapy is sidelined during the fabrication of obturators. Usually the speech part is taken up only at a later stage and is relegated entirely to a speech therapist, without the active involvement of the prosthodontist. The article suggests a protocol for speech therapy in such cases, to be carried out in unison with the prosthodontist.

  1. Early Posttreatment Audiometry Underestimates Hearing Recovery after Intratympanic Steroid Treatment of Sudden Sensorineural Hearing Loss

    Directory of Open Access Journals (Sweden)

    Benjamin J. Wycherly

    2011-01-01

    Full Text Available Objective. To review our experience with intratympanic steroids (ITS) for the treatment of idiopathic sudden sensorineural hearing loss (ISSNHL), emphasizing the ideal time to perform follow-up audiograms. Methods. Retrospective case review of patients diagnosed with ISSNHL and treated with intratympanic methylprednisolone. Injections were repeated weekly, for a total of 3 injections. Improvement was defined as an improved pure-tone average ≥20 dB or speech-discrimination score ≥20%. Results. Forty patients met the inclusion criteria, with a recovery rate of 45% (18/40). A significantly increased response rate was found in patients having an audiogram >5 weeks after the first dose of ITS (9/13) over those tested ≤5 weeks after the first dose (9/27) (P = 0.03). Conclusions. Recovery from ISSNHL after ITS injections occurs more frequently >5 weeks after initiating ITS. This may be due to the natural history of sudden hearing loss or the prolonged effect of the steroid in the inner ear.
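    The paper's improvement criterion translates directly into a small predicate; the argument names and example values are assumptions for illustration:

      def improved(pre_pta_db, post_pta_db, pre_sds_pct, post_sds_pct):
          # PTA in dB HL (lower is better); SDS in percent (higher is better).
          return (pre_pta_db - post_pta_db) >= 20 or (post_sds_pct - pre_sds_pct) >= 20

      print(improved(pre_pta_db=65, post_pta_db=40, pre_sds_pct=48, post_sds_pct=60))  # True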

  2. Delayed Speech or Language Development

    Science.gov (United States)

    ... child is right on schedule. Normal Speech & Language Development: It's important to discuss early speech and language ...

  3. High-frequency audiometry in young and older adults when conventional audiometry is normal

    Directory of Open Access Journals (Sweden)

    Isabella Monteiro de Castro Silva

    2006-10-01

    Full Text Available High-frequency audiometry can detect early changes in auditory sensitivity resulting from processes such as aging. Nonetheless, its use is still limited, and additional studies are required to establish its use, particularly among older adults. AIM: To compare pure-tone thresholds for frequencies from 250 Hz to 16 kHz in young and older normal-hearing adults, with or without audiologic complaints. METHOD: Pure-tone sensitivity from 250 Hz to 16 kHz was assessed with an AC-40 audiometer in 64 adults, evenly distributed between young (25 to 35 years old) and older (45 to 55 years old) adults of both sexes, in a cross-sectional study. RESULTS: Older adults presented higher thresholds at all frequencies, most significantly at the highest frequencies (8 to 16 kHz), compared with young adults. Men presented higher thresholds than women between 3 and 10 kHz. CONCLUSION: The auditory aging process, involving loss of sensitivity at high frequencies, can be detected at earlier ages than those typically investigated; high-frequency audiometry proved to be an important instrument for distinguishing auditory sensitivity between audiologically normal young and older adults.

  4. 78 FR 49693 - Speech-to-Speech and Internet Protocol (IP) Speech-to-Speech Telecommunications Relay Services...

    Science.gov (United States)

    2013-08-15

    ... COMMISSION 47 CFR Part 64 Speech-to-Speech and Internet Protocol (IP) Speech-to-Speech Telecommunications Relay Services; Telecommunications Relay Services and Speech-to-Speech Services for Individuals With... this document, the Commission amends telecommunications relay services (TRS) mandatory...

  5. Speech 7 through 12.

    Science.gov (United States)

    Nederland Independent School District, TX.

    GRADES OR AGES: Grades 7 through 12. SUBJECT MATTER: Speech. ORGANIZATION AND PHYSICAL APPEARANCE: Following the foreword, philosophy and objectives, this guide presents a speech curriculum. The curriculum covers junior high and Speech I, II, III (senior high). Thirteen units of study are presented for junior high; each unit is divided into…

  6. Age-related hearing loss in dogs : Diagnosis with Brainstem-Evoked Response Audiometry and Treatment with Vibrant Soundbridge Middle Ear Implant.

    OpenAIRE

    ter Haar, G.

    2009-01-01

    Age-related hearing loss (ARHL) is the most common cause of acquired hearing impairment in dogs. Diagnosis requires objective electrophysiological tests (brainstem evoked response audiometry [BERA]) evaluating the entire audible frequency range in dogs. In our laboratory a method was developed to deliver tone bursts ranging in frequency from 1 - 32 kHz for frequency-specific assessment of the cochlea in dogs. Brainstem auditory evoked responses to a click (CS) and to 1, 2, 4, 8, 12, 16, 24, a...

  7. A study of brainstem evoked response audiometry in high-risk infants and children under 10 years of age

    Directory of Open Access Journals (Sweden)

    Ramanathan Thirunavukarasu

    2015-01-01

    Full Text Available Aims: To evaluate hearing thresholds and find the incidence of hearing loss in infants and children belonging to high-risk categories, and to analyze the common risk factors. Subjects and Methods: A total of 125 infants and children belonging to high-risk categories were subjected to brainstem evoked response audiometry. Clicks were given at a rate of 11.1 clicks/s, and 2000 responses were averaged. The intensity at which wave V just disappears was established as the hearing threshold. Degree of impairment and risk factors were analyzed. Results: In total, 44 (35.2%) were found to have sensorineural hearing loss; 30 of the children with hearing loss (68%) belonged to the age group 1-5 years. Consanguineous marriage was the most commonly associated risk factor. The majority (34) had profound hearing loss. Conclusion: Newborn screening is mandatory to identify hearing loss in the prelinguistic period and reduce the burden of handicap in the community. The need of the hour is health education and genetic counseling to decrease hereditary hearing loss, as hearing impairment due to perinatal factors has declined with recent medical advancements.

  8. Speech in spinocerebellar ataxia.

    Science.gov (United States)

    Schalling, Ellika; Hartelius, Lena

    2013-12-01

    Spinocerebellar ataxias (SCAs) are a heterogeneous group of autosomal dominant cerebellar ataxias clinically characterized by progressive ataxia, dysarthria and a range of other concomitant neurological symptoms. Only a few studies include detailed characterization of speech symptoms in SCA. Speech symptoms in SCA resemble ataxic dysarthria, but symptoms related to phonation may be more prominent. One study to date has shown an association between speech and voice symptoms and genotype. More studies of speech and voice phenotypes are warranted and might aid clinical diagnosis. In addition, instrumental speech analysis has been demonstrated to be a reliable measure that may be used to monitor disease progression or therapy outcomes in possible future pharmacological treatments. Intervention by speech and language pathologists should go beyond assessment. Clinical guidelines for the management of speech, communication and swallowing need to be developed for individuals with progressive cerebellar ataxia.

  9. Digital speech processing using Matlab

    CERN Document Server

    Gopi, E S

    2014-01-01

    Digital Speech Processing Using Matlab deals with digital speech pattern recognition, speech production model, speech feature extraction, and speech compression. The book is written in a manner that is suitable for beginners pursuing basic research in digital speech processing. Matlab illustrations are provided for most topics to enable better understanding of concepts. This book also deals with the basic pattern recognition techniques (illustrated with speech signals using Matlab) such as PCA, LDA, ICA, SVM, HMM, GMM, BPN, and KSOM.

  10. Exploration of Speech Planning and Producing by Speech Error Analysis

    Institute of Scientific and Technical Information of China (English)

    冷卉

    2012-01-01

    Speech error analysis is an indirect way to discover speech planning and producing processes. From some speech errors made by people in their daily life, linguists and learners can reveal the planning and producing processes more easily and clearly.

  11. Indirect Speech Acts

    Institute of Scientific and Technical Information of China (English)

    李威

    2001-01-01

    Indirect speech acts are frequently used in verbal communication, and their interpretation is of great importance for developing students' communicative competence. This paper, therefore, intends to present Searle's indirect speech acts and explore how indirect speech acts are interpreted in accordance with two influential theories. It consists of four parts. Part one gives a general introduction to the notion of speech act theory. Part two elaborates on the conception of indirect speech acts proposed by Searle and his supplement to and development of illocutionary act theory. Part three deals with the interpretation of indirect speech acts. Part four draws implications from the previous study and also serves as the conclusion of the dissertation.

  12. Charisma in business speeches

    DEFF Research Database (Denmark)

    Niebuhr, Oliver; Brem, Alexander; Novák-Tót, Eszter

    2016-01-01

    ... of the acoustic-prosodic signal, secondly, focuses on business speeches like product presentations, and, thirdly, in doing so, advances the still fairly fragmentary evidence on the prosodic correlates of charismatic speech. We show that the prosodic features of charisma in political speeches also apply to business speeches. Consistent with public opinion, our findings are indicative of Steve Jobs being a more charismatic speaker than Mark Zuckerberg. Beyond previous studies, our data suggest that rhythm and emphatic accentuation are also involved in conveying charisma. Furthermore, the differences...

  13. Principles of speech coding

    CERN Document Server

    Ogunfunmi, Tokunbo

    2010-01-01

    It is becoming increasingly apparent that all forms of communication - including voice - will be transmitted through packet-switched networks based on the Internet Protocol (IP). Therefore, the design of modern devices that rely on speech interfaces, such as cell phones and PDAs, requires a complete and up-to-date understanding of the basics of speech coding. Outlines key signal processing algorithms used to mitigate impairments to speech quality in VoIP networks. Offering a detailed yet easily accessible introduction to the field, Principles of Speech Coding provides an in-depth examination of the

  14. Advances in Speech Recognition

    CERN Document Server

    Neustein, Amy

    2010-01-01

    This volume comprises contributions from eminent leaders in the speech industry, and presents a comprehensive and in-depth analysis of the progress of speech technology in the topical areas of mobile settings, healthcare and call centers. The material addresses the technical aspects of voice technology within the framework of societal needs, such as the use of speech recognition software to produce up-to-date electronic health records, notwithstanding patients making changes to health plans and physicians. Included will be discussion of speech engineering, linguistics, human factors ana

  15. Speech-Language Therapy (For Parents)

    Science.gov (United States)

    ... with speech and/or language disorders. Speech Disorders, Language Disorders, and Feeding Disorders: A speech disorder refers ...

  16. Speech and language evaluation for cochlear implant%人工耳蜗的言语评估

    Institute of Scientific and Technical Information of China (English)

    张华; 王靓; 王硕; 陈雪清; 陈静

    2005-01-01

    OBJECTIVE: In the last decade, cochlear implantation (CI) has made great progress in China, in both numbers and research. But research on speech audiometry, the major index of CI outcome, has been relatively slow, and its results are hard to compare with European, Australian and American ones. Based on a review of the development and application of English speech audiometry, this paper analyzes the composition, testing methods and range of application of the existing English speech test materials and introduces the primary speech and language evaluation tools recommended by the current American expert committee. Based on a review of Chinese tests and focused on the present evaluation tools, it also makes suggestions for further research on Chinese evaluation tools, which is anticipated to be useful for domestic development in this field. DATA SOURCES: Articles related to auditory research published between January 1964 and April 2004 were retrieved from Medline using the search terms "cochlear implant, speech audiometry"; the language of the articles was limited to English. Corresponding articles published between January 1998 and April 2004 were searched manually or by computer in the "Chinese Journal of Clinical Rehabilitation" with the search terms "cochlear implant, speech and language evaluation"; the language of the articles was limited to Chinese. DATA SELECTION: Totally 39 national and international original articles on cochlear implant evaluation were selected. Non-randomized studies were excluded, while non-blinded studies were not excluded. DATA EXTRACTION: Of the 39 articles on cochlear implant evaluation, 32 met the criteria; the remaining 7 were excluded as repetitions of the same research. The selected 32 articles on cochlear implant evaluation were classified and arranged for review. DATA SYNTHESIS

  17. Speech Compression for Noise-Corrupted Thai Expressive Speech

    Directory of Open Access Journals (Sweden)

    Suphattharachai Chomphan

    2011-01-01

    Full Text Available Problem statement: In speech communication, speech coding aims at preserving speech quality at a lower coding bitrate. In real communication environments, various types of noise deteriorate speech quality, and expressive speech with different speaking styles may yield different speech quality under the same coding method. Approach: This research proposed a study of speech compression for noise-corrupted Thai expressive speech using two coding methods, CS-ACELP and MP-CELP. The speech material included a hundred male speech utterances and a hundred female speech utterances. Four speaking styles were included: enjoyable, sad, angry and reading styles. Five sentences of Thai speech were chosen. Three types of noise were included (train, car and air conditioner), and five levels of each type of noise were varied from 0-20 dB. The subjective test of mean opinion score was exploited in the evaluation process. Results: The experimental results showed that CS-ACELP gave better speech quality than MP-CELP at all three bitrates of 6000, 8600 and 12600 bps. When considering the levels of noise, the 20-dB noise gave the best speech quality, while 0-dB noise gave the worst speech quality. When considering speech gender, female speech gave better results than male speech. When considering the types of noise, the air-conditioner noise gave the best speech quality, while the train noise gave the worst speech quality. Conclusion: From the study, it can be seen that coding methods, types of noise, levels of noise and speech gender all influence the coded speech quality.

  18. Free Speech Yearbook 1976.

    Science.gov (United States)

    Phifer, Gregg, Ed.

    The articles collected in this annual address several aspects of First Amendment Law. The following titles are included: "Freedom of Speech As an Academic Discipline" (Franklyn S. Haiman), "Free Speech and Foreign-Policy Decision Making" (Douglas N. Freeman), "The Supreme Court and the First Amendment: 1975-1976"…

  19. Private Speech in Ballet

    Science.gov (United States)

    Johnston, Dale

    2006-01-01

    Authoritarian teaching practices in ballet inhibit the use of private speech. This paper highlights the critical importance of private speech in the cognitive development of young ballet students, within what is largely a non-verbal art form. It draws upon research by Russian psychologist Lev Vygotsky and contemporary socioculturalists, to…

  20. Tracking Speech Sound Acquisition

    Science.gov (United States)

    Powell, Thomas W.

    2011-01-01

    This article describes a procedure to aid in the clinical appraisal of child speech. The approach, based on the work by Dinnsen, Chin, Elbert, and Powell (1990; Some constraints on functionally disordered phonologies: Phonetic inventories and phonotactics. "Journal of Speech and Hearing Research", 33, 28-37), uses a railway idiom to track gains in…

  1. Preschool Connected Speech Inventory.

    Science.gov (United States)

    DiJohnson, Albert; And Others

    This speech inventory developed for a study of aurally handicapped preschool children (see TM 001 129) provides information on intonation patterns in connected speech. The inventory consists of a list of phrases and simple sentences accompanied by pictorial clues. The test is individually administered by a teacher-examiner who presents the spoken…

  2. Free Speech. No. 38.

    Science.gov (United States)

    Kane, Peter E., Ed.

    This issue of "Free Speech" contains the following articles: "Daniel Schorr Relieved of Reporting Duties" by Laurence Stern, "The Sellout at CBS" by Michael Harrington, "Defending Dan Schorr" by Tom Wicker, "Speech to the Washington Press Club, February 25, 1976" by Daniel Schorr, "Funds Voted For Schorr Inquiry" by Richard Lyons, "Erosion of the…

  3. Advertising and Free Speech.

    Science.gov (United States)

    Hyman, Allen, Ed.; Johnson, M. Bruce, Ed.

    The articles collected in this book originated at a conference at which legal and economic scholars discussed the issue of First Amendment protection for commercial speech. The first article, in arguing for freedom for commercial speech, finds inconsistent and untenable the arguments of those who advocate freedom from regulation for political…

  4. Predicting speech intelligibility in conditions with nonlinearly processed noisy speech

    DEFF Research Database (Denmark)

    Jørgensen, Søren; Dau, Torsten

    2013-01-01

    The speech-based envelope power spectrum model (sEPSM; [1]) was proposed in order to overcome the limitations of the classical speech transmission index (STI) and speech intelligibility index (SII). The sEPSM applies the signal-to-noise ratio in the envelope domain (SNRenv), which was demonstrated to successfully predict speech intelligibility in conditions with nonlinearly processed noisy speech, such as processing with spectral subtraction. Moreover, a multiresolution version (mr-sEPSM) was demonstrated to account for speech intelligibility in various conditions with stationary and fluctuating ... from computational auditory scene analysis and further support the hypothesis that the SNRenv is a powerful metric for speech intelligibility prediction....
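    For reference, the SNRenv metric named above is commonly presented in the sEPSM literature along the following lines (a hedged rendering, not a quotation from this paper): the envelope power of the noisy speech in excess of the noise envelope power, relative to the noise envelope power, computed per modulation band and then combined.

      % Hedged rendering of the envelope-domain SNR used by the sEPSM family:
      \mathrm{SNR}_{\mathrm{env}} =
          \frac{\overline{P}_{\mathrm{env},\,S+N} - \overline{P}_{\mathrm{env},\,N}}
               {\overline{P}_{\mathrm{env},\,N}}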

  5. The development of a test of speech reception disability for use in 5- to 8-year-old children with otitis media with effusion.

    Science.gov (United States)

    Williamson, I; Sheridan, C

    1994-01-01

    The study aims to develop a test of speech reception disability under simulated classroom conditions for use in young (5-8 year old) schoolchildren, to manage children with otitis media with effusion more effectively. A new video test, TADAST (Two Alternative Auditory Disability and Speech Reception Test), has been constructed in stages by extensively modifying the FADAST (Four Alternative) currently used to assess auditory disability in adults. Minimal word pair lists which were easily identifiable in picture form were developed and refined, and then formulated as a two alternative forced choice picture test. The distribution characteristics of the new test were defined in 89 schoolchildren with and without otitis media with effusion (OME) and compared with pure-tone audiometry performed at the same time. The test correlated highly with audiometry in older children. The distribution characteristics revealed a considerable proportion of children with bilateral OME with little functional disability who might otherwise be at risk of surgery. The test appeared to be sensitive to a 'history of OME' effect. Further refinements are needed to develop a final version which could also be used to evaluate hearing disability in 4-year-old children.

  6. Environmental Contamination of Normal Speech.

    Science.gov (United States)

    Harley, Trevor A.

    1990-01-01

    Environmentally contaminated speech errors (irrelevant words or phrases derived from the speaker's environment and erroneously incorporated into speech) are hypothesized to occur at a high level of speech processing, but with a relatively late insertion point. The data indicate that speech production processes are not independent of other…

  7. Speech processing in mobile environments

    CERN Document Server

    Rao, K Sreenivasa

    2014-01-01

    This book focuses on speech processing in the presence of low-bit-rate coding and varying background environments. The methods presented in the book exploit speech events which are robust in noisy environments. Accurate estimation of these crucial events is useful for carrying out various speech tasks such as speech recognition, speaker recognition and speech rate modification in mobile environments. The authors provide insights into designing and developing robust methods to process speech in mobile environments, covering temporal and spectral enhancement methods to minimize the effect of noise and examining methods and models for speech and speaker recognition applications in mobile environments.

  8. Global Freedom of Speech

    DEFF Research Database (Denmark)

    Binderup, Lars Grassme

    2007-01-01

    ... , as opposed to a legal norm, that curbs exercises of the right to free speech that offend the feelings or beliefs of members from other cultural groups. The paper rejects the suggestion that acceptance of such a norm is in line with liberal egalitarian thinking. Following a review of the classical liberal egalitarian reasons for free speech - reasons from overall welfare, from autonomy and from respect for the equality of citizens - it is argued that these reasons outweigh the proposed reasons for curbing culturally offensive speech. Currently controversial cases such as that of the Danish Cartoon Controversy...

  9. The Rhetoric in English Speech

    Institute of Scientific and Technical Information of China (English)

    马鑫

    2014-01-01

    English speech has a very long history and has always been highly valued. People give speeches in economic activities, political forums and academic reports to express their opinions and to inform or persuade others. English speech plays a rather important role in English literature, and the distinct theme of a speech owes much to its rhetoric. This paper discusses parallelism, repetition and rhetorical questions in English speech, aiming to help people better appreciate their charm.

  10. Speech intelligibility in hospitals.

    Science.gov (United States)

    Ryherd, Erica E; Moeller, Michael; Hsu, Timothy

    2013-07-01

    Effective communication between staff members is key to patient safety in hospitals. A variety of patient care activities including admittance, evaluation, and treatment rely on oral communication. Surprisingly, published information on speech intelligibility in hospitals is extremely limited. In this study, speech intelligibility measurements and occupant evaluations were conducted in 20 units of five different U.S. hospitals. A variety of unit types and locations were studied. Results show that overall, no unit had "good" intelligibility based on the speech intelligibility index (SII > 0.75), and several locations were found to have "poor" intelligibility. The study documents speech intelligibility across a variety of hospitals and unit types, offers some evidence of the positive impact of absorption on intelligibility, and identifies areas for future research.
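    For readers unfamiliar with the SII, its essence is a band-importance-weighted audibility sum; a much-simplified sketch follows, in which the bands and weights are illustrative rather than the ANSI values, though the (SNR + 15)/30 audibility mapping follows the standard's general form:

      def simple_sii(snr_db_per_band, importance_weights):
          # Clip each band's (SNR + 15)/30 into [0, 1], then weight and sum.
          audibility = [min(max((snr + 15) / 30, 0.0), 1.0) for snr in snr_db_per_band]
          return sum(w * a for w, a in zip(importance_weights, audibility))

      snr = [9, 3, -3, -9]             # dB SNR in four illustrative bands
      weights = [0.2, 0.3, 0.3, 0.2]   # illustrative importance weights (sum to 1)
      print(round(simple_sii(snr, weights), 2))  # 0.5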

  11. Anxiety and ritualized speech

    Science.gov (United States)

    Lalljee, Mansur; Cook, Mark

    1975-01-01

    The experiment examines the effects of anxiety on the use of a number of words that seem irrelevant to semantic communication. The Units of Ritualized Speech (URSs) considered are: 'I mean', 'in fact', 'really', 'sort of', 'well' and 'you know'. (Editor)

  12. Speech disorders - children

    Science.gov (United States)

    MedlinePlus encyclopedia entry: //medlineplus.gov/ency/article/001430.htm

  13. Speech impairment (adult)

    Science.gov (United States)

    MedlinePlus encyclopedia entry: //medlineplus.gov/ency/article/003204.htm

  14. Speech Compression and Synthesis

    Science.gov (United States)

    1980-10-01

    phonological rules combined with diphones improved the algorithms used by the phonetic synthesis program for gain normalization and time ... phonetic vocoder, spectral template. This report covers our work for the past two years on speech compression and synthesis. Since there was an ... From Block 19: speech recognition, phoneme recognition. Initial design for a phonetic recognition program. We also recorded and partially labeled a

  15. Recognizing GSM Digital Speech

    OpenAIRE

    2005-01-01

    The Global System for Mobile (GSM) environment encompasses three main problems for automatic speech recognition (ASR) systems: noisy scenarios, source coding distortion, and transmission errors. The first one has already received much attention; however, source coding distortion and transmission errors must be explicitly addressed. In this paper, we propose an alternative front-end for speech recognition over GSM networks. This front-end is specially conceived to be effective against source c...

  16. SPEECH DISORDERS ENCOUNTERED DURING SPEECH THERAPY AND THERAPY TECHNIQUES

    Directory of Open Access Journals (Sweden)

    İlhan ERDEM

    2013-06-01

    Full Text Available Speech is a physical and mental process that uses agreed signs and sounds to convey a message from one mind to another. To identify the sounds of speech, it is essential to know the structure and function of the various organs that make conversation possible. Because speech is both a physical and a mental process, many factors can lead to speech disorders. A speech disorder can relate to language acquisition, and it can also be caused by many medical and psychological factors. Speaking is the collective work of many organs, like an orchestra, and speech is a very complex skill with a mental dimension, so it must be determined which of these obstacles inhibits conversation. A speech disorder is a defect in speech flow, rhythm, pitch, stress, composition or vocalization. This study surveys speech disorders such as articulation disorders, stuttering, aphasia, dysarthria, local dialect speech, tongue and lip laziness, and rapid speech, in terms of language skills; the causes of these speech disorders are investigated and suggestions for their remedy are discussed.

  17. Practical speech user interface design

    CERN Document Server

    Lewis, James R

    2010-01-01

    Although speech is the most natural form of communication between humans, most people find using speech to communicate with machines anything but natural. Drawing from psychology, human-computer interaction, linguistics, and communication theory, Practical Speech User Interface Design provides a comprehensive yet concise survey of practical speech user interface (SUI) design. It offers practice-based and research-based guidance on how to design effective, efficient, and pleasant speech applications that people can really use, focusing on the design of speech user interfaces for IVR applications.

  18. Speech-Language Therapy (For Parents)

    Science.gov (United States)

    Speech-language pathologists (SLPs), often informally known as speech therapists, are professionals educated in the study of human ...

  19. THE PRESENCE OF ADENOID VEGETATIONS AND NASAL SPEECH, AND HEARING LOSS IN RELATION TO SECRETORY OTITIS MEDIA

    Directory of Open Access Journals (Sweden)

    Gabriela KOPACHEVA

    2004-12-01

    Full Text Available This study presents the treatment of 68 children with secretory otitis media. The children presented with adenoid vegetations, nasal speech, conductive hearing loss, and ventilation disturbance of the Eustachian tube. In all children adenoidectomy was indicated. 38 boys and 30 girls aged 3-17 were divided into two main groups: 29 children without hypertrophic (enlarged) adenoids and 39 children with enlarged (hypertrophic) adenoids. The surgical treatment included insertion of ventilation tubes, with adenoidectomy where there were hypertrophic adenoids. The clinical material was analyzed according to hearing threshold, hearing level and middle ear condition, estimated by pure tone audiometry and tympanometry before and after treatment, and data concerning the two groups were compared. The results indicated that adenoidectomy combined with ventilation tubes facilitates the healing of secretory otitis media and decreases hearing impairment. This enables prompt restoration of hearing function, an important precondition for the language, social, emotional and academic development of children.

  20. Automatic speech recognition An evaluation of Google Speech

    OpenAIRE

    Stenman, Magnus

    2015-01-01

    The use of speech recognition is increasing rapidly, and it is now available in smart TVs, desktop computers, every new smartphone, etc., allowing us to talk to computers naturally. With its use in home appliances, education and even surgical procedures, accuracy and speed become very important. This thesis aims to give an introduction to speech recognition and discuss its use in robotics. An evaluation of Google Speech, using Google’s speech API, with regard to word error rate and translation ...

  1. Tackling the complexity in speech

    DEFF Research Database (Denmark)

    ... section includes four carefully selected chapters. They deal with facets of speech production, speech acoustics, and/or speech perception or recognition, place them in an integrated phonetic-phonological perspective, and relate them in more or less explicit ways to aspects of speech technology. Therefore, we hope that this volume can help speech scientists with traditional training in phonetics and phonology to keep up with the latest developments in speech technology. In the opposite direction, speech researchers starting from a technological perspective will hopefully get inspired by reading about the questions, phenomena, and communicative functions that are currently addressed in phonetics and phonology. Either way, the future of speech research lies in international, interdisciplinary collaborations, and our volume is meant to reflect and facilitate such collaborations...

  2. Denial Denied: Freedom of Speech

    Directory of Open Access Journals (Sweden)

    Glen Newey

    2009-12-01

    Full Text Available Free speech is a widely held principle. This is in some ways surprising, since formal and informal censorship of speech is widespread, and rather different issues seem to arise depending on whether the censorship concerns who speaks, what content is spoken or how it is spoken. I argue that despite these facts, free speech can indeed be seen as a unitary principle. On my analysis, the core of the free speech principle is the denial of the denial of speech, whether to a speaker, to a proposition, or to a mode of expression. Underlying free speech is the principle of freedom of association, according to which speech is both a precondition of future association (e.g., as a medium for negotiation) and a mode of association in its own right. I conclude by applying this account briefly to two contentious issues: hate speech and pornography.

  3. Pure tone audiometry with and without specific ear protectors

    Directory of Open Access Journals (Sweden)

    Carlos Antonio Rodrigues de Faria

    2008-06-01

    Full Text Available The authors carried out a case-control audiometric study of subjects with and without earplug-type hearing protectors. AIM: The purpose of the study was to measure the real level of sound attenuation provided by the protectors. MATERIAL AND METHODS: The evaluation included 60 ears of 30 subjects of both sexes, aged between 20 and 58 years, of various professional activities, with normal hearing thresholds and following ten hours of auditory rest, examined between February and July 2003. Air- and bone-conduction audiometry at 500 to 4000 Hz with and without the ear protectors was analyzed. RESULTS: The results were analyzed statistically and compared with the data provided by the ear protector manufacturer, giving the real-ear attenuation levels obtained with these products. CONCLUSION: The results show that the attenuation figures supplied by the manufacturer were consistent with those obtained in the tests.
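
    The attenuation figure compared against the manufacturer's data in a study of this design is simply the threshold shift per test frequency. A toy sketch of that arithmetic (the numbers are invented, not the study's data):

```python
# Illustrative arithmetic only; these are invented thresholds, not study data.
freqs_hz = [500, 1000, 2000, 4000]
open_ear = [5, 5, 10, 10]      # dB HL thresholds without the protector
occluded = [30, 32, 38, 42]    # dB HL thresholds with the earplug in place

for f, u, o in zip(freqs_hz, open_ear, occluded):
    print(f"{f} Hz: {o - u} dB attenuation")   # threshold shift = attenuation
```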

  4. RECOGNISING SPEECH ACTS

    Directory of Open Access Journals (Sweden)

    Phyllis Kaburise

    2012-09-01

    Full Text Available Speech Act Theory (SAT), a theory in pragmatics, is an attempt to describe what happens during linguistic interactions. Inherent within SAT is the idea that language forms and intentions are relatively formulaic and that there is a direct correspondence between sentence forms (for example, in terms of structure and lexicon) and the function or meaning of an utterance. The contention offered in this paper is that when such a correspondence does not exist, as in indirect speech utterances, this creates challenges for English second language speakers and may result in miscommunication. This arises because indirect speech acts allow speakers to employ various pragmatic devices such as inference, implicature, presuppositions and context clues to transmit their messages. Such devices, operating within the non-literal level of language competence, may pose challenges for ESL learners.

  5. Speech spectrogram expert

    Energy Technology Data Exchange (ETDEWEB)

    Johannsen, J.; Macallister, J.; Michalek, T.; Ross, S.

    1983-01-01

    Various authors have pointed out that humans can become quite adept at deriving phonetic transcriptions from speech spectrograms (as good as 90 percent accuracy at the phoneme level). The authors describe an expert system which attempts to simulate this performance. The speech spectrogram expert (SPEX) is actually a society made up of three experts: a 2-dimensional vision expert, an acoustic-phonetic expert, and a phonetics expert. The visual reasoning expert finds important visual features of the spectrogram. The acoustic-phonetic expert reasons about how visual features relate to phonemes, and about how phonemes change visually in different contexts. The phonetics expert reasons about allowable phoneme sequences and transformations, and deduces an English spelling for phoneme strings. The speech spectrogram expert is highly interactive, allowing users to investigate hypotheses and edit rules. 10 references.

  6. Protection limits on free speech

    Institute of Scientific and Technical Information of China (English)

    李敏

    2014-01-01

    Freedom of speech is one of the basic rights of citizens and should receive broad protection, but in the practical context of China it is worth considering what kinds of speech can be protected and what kinds restricted, and how to draw the limit between state power and free speech. People tend to overlook freedom of speech and its function, with the result that some arguments cannot be aired in open debate.

  7. Designing speech for a recipient

    DEFF Research Database (Denmark)

    Fischer, Kerstin

    ... is investigated on three candidates for so-called ‘simplified registers’: speech to children (also called motherese or baby talk), speech to foreigners (also called foreigner talk) and speech to robots. The volume integrates research from various disciplines, such as psychology, sociolinguistics...

  8. Abortion and compelled physician speech.

    Science.gov (United States)

    Orentlicher, David

    2015-01-01

    Informed consent mandates for abortion providers may infringe the First Amendment's freedom of speech. On the other hand, they may reinforce the physician's duty to obtain informed consent. Courts can promote both doctrines by ensuring that compelled physician speech pertains to medical facts about abortion rather than abortion ideology and that compelled speech is truthful and not misleading.

  9. SPEECH DISORDERS ENCOUNTERED DURING SPEECH THERAPY AND THERAPY TECHNIQUES

    OpenAIRE

    2013-01-01

    Speech, which is both a physical and a mental process, uses agreed signs and sounds to turn a sense in the mind into a message. To identify the sounds of speech, it is essential to know the structure and function of the various organs that allow conversation to happen. Because speech is a physical and mental process, many factors can lead to speech disorders. A speech disorder can concern language acquisition, and it can also be caused by many medical and psychological factors. Disordered sp...

  10. Speech transmission index from running speech: A neural network approach

    Science.gov (United States)

    Li, F. F.; Cox, T. J.

    2003-04-01

    Speech transmission index (STI) is an important objective parameter concerning speech intelligibility for sound transmission channels. It is normally measured with specific test signals to ensure high accuracy and good repeatability. Measurement with running speech was previously proposed, but accuracy is compromised and hence applications limited. A new approach that uses artificial neural networks to accurately extract the STI from received running speech is developed in this paper. Neural networks are trained on a large set of transmitted speech examples with prior knowledge of the transmission channels' STIs. The networks perform complicated nonlinear function mappings and spectral feature memorization to enable accurate objective parameter extraction from transmitted speech. Validations via simulations demonstrate the feasibility of this new method on a one-net-one-speech extract basis. In this case, accuracy is comparable with normal measurement methods. This provides an alternative to standard measurement techniques, and it is intended that the neural network method can facilitate occupied room acoustic measurements.
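
    As a rough illustration of the idea (not the authors' network or features), a small regressor can be trained to map feature vectors extracted from received running speech to the channel's STI measured beforehand by a standard method:

```python
# Illustrative sketch (not the authors' network or features): learn a mapping
# from features of received running speech to the channel's STI.
import numpy as np
from sklearn.neural_network import MLPRegressor

rng = np.random.default_rng(0)
# Hypothetical training data: one row of band/modulation features per
# transmitted speech excerpt, with the channel STI measured conventionally.
X = rng.normal(size=(500, 98))
w = rng.normal(size=98)
y = 1.0 / (1.0 + np.exp(-X @ w / 10.0))     # stand-in "true" STI in (0, 1)

net = MLPRegressor(hidden_layer_sizes=(32, 16), max_iter=2000, random_state=0)
net.fit(X[:400], y[:400])
print("held-out R^2:", net.score(X[400:], y[400:]))  # STI prediction quality
```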

  11. Mandarin Visual Speech Information

    Science.gov (United States)

    Chen, Trevor H.

    2010-01-01

    While the auditory-only aspects of Mandarin speech are heavily-researched and well-known in the field, this dissertation addresses its lesser-known aspects: The visual and audio-visual perception of Mandarin segmental information and lexical-tone information. Chapter II of this dissertation focuses on the audiovisual perception of Mandarin…

  12. Speech After Banquet

    Science.gov (United States)

    Yang, Chen Ning

    2013-05-01

    I am usually not so short of words, but the previous speeches have rendered me really speechless. I have known and admired the eloquence of Freeman Dyson, but I did not know that there is a hidden eloquence in my colleague George Sterman...

  13. Speech and Hearing Therapy.

    Science.gov (United States)

    Sakata, Reiko; Sakata, Robert

    1978-01-01

    In the public school, the speech and hearing therapist attempts to foster child growth and development through the provision of services basic to awareness of self and others, management of personal and social interactions, and development of strategies for coping with the handicap. (MM)

  14. The Commercial Speech Doctrine.

    Science.gov (United States)

    Luebke, Barbara F.

    In its 1942 ruling in the "Valentine vs. Christensen" case, the Supreme Court established the doctrine that commercial speech is not protected by the First Amendment. In 1975, in the "Bigelow vs. Virginia" case, the Supreme Court took a decisive step toward abrogating that doctrine, by ruling that advertising is not stripped of…

  15. Metaheuristic applications to speech enhancement

    CERN Document Server

    Kunche, Prajna

    2016-01-01

    This book serves as a basic reference for those interested in the application of metaheuristics to speech enhancement. The major goal of the book is to explain the basic concepts of optimization methods and their use in heuristic optimization in speech enhancement to scientists, practicing engineers, and academic researchers in speech processing. The authors discuss why it has been a challenging problem for researchers to develop new enhancement algorithms that aid in the quality and intelligibility of degraded speech. They present powerful optimization methods for speech enhancement that can help to solve noise reduction problems. Readers will be able to understand the fundamentals of speech processing as well as the optimization techniques, how speech enhancement algorithms are implemented by utilizing optimization methods, and will be given the tools to develop new algorithms. The authors also provide a comprehensive literature survey regarding the topic.
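
    As a toy illustration of how a metaheuristic couples to an enhancement algorithm (this is not an algorithm from the book), the sketch below uses plain random search, arguably the simplest metaheuristic, to tune the over-subtraction factor of a crude spectral-subtraction rule against an output-SNR objective on a synthetic clean/noisy pair:

```python
import numpy as np

rng = np.random.default_rng(1)
clean = np.sin(2 * np.pi * 0.01 * np.arange(4000))
noisy = clean + 0.5 * rng.normal(size=clean.size)

def enhance(x, alpha, frame=256):
    """Crude spectral subtraction with over-subtraction factor alpha."""
    out = np.zeros_like(x)
    noise_mag = np.zeros(frame // 2 + 1)
    for s in range(0, x.size - frame, frame):
        X = np.fft.rfft(x[s:s + frame])
        mag = np.abs(X)
        noise_mag = 0.9 * noise_mag + 0.1 * mag                # crude noise tracking
        mag = np.maximum(mag - alpha * noise_mag, 0.05 * mag)  # subtract, keep a floor
        out[s:s + frame] = np.fft.irfft(mag * np.exp(1j * np.angle(X)), frame)
    return out

def snr_db(ref, est):
    return 10 * np.log10(np.sum(ref ** 2) / np.sum((ref - est) ** 2))

# The "metaheuristic": plain random search over the over-subtraction factor.
best_alpha, best_snr = None, -np.inf
for alpha in rng.uniform(0.0, 3.0, size=50):
    score = snr_db(clean, enhance(noisy, alpha))
    if score > best_snr:
        best_alpha, best_snr = alpha, score
print(f"alpha = {best_alpha:.2f}, output SNR = {best_snr:.1f} dB")
```

    Population-based methods like PSO or genetic algorithms slot into the same loop; only the rule for proposing the next candidate parameter changes.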

  16. A Mobile Phone based Speech Therapist

    OpenAIRE

    Pandey, Vinod K.; Pande, Arun; Kopparapu, Sunil Kumar

    2016-01-01

    Patients with articulatory disorders often have difficulty in speaking. These patients need several speech therapy sessions to enable them to speak normally. These therapy sessions are conducted by a specialized speech therapist. The goal of speech therapy is to develop good speech habits as well as to teach how to articulate sounds the right way. Speech therapy is critical for continuous improvement to regain normal speech. Speech therapy sessions require a patient to travel to a hospital or a ...

  17. Sensitivity of cortical auditory evoked potential detection for hearing-impaired infants in response to short speech sounds

    Directory of Open Access Journals (Sweden)

    Bram Van Dun

    2012-01-01

    Full Text Available

    Background: Cortical auditory evoked potentials (CAEPs are an emerging tool for hearing aid fitting evaluation in young children who cannot provide reliable behavioral feedback. It is therefore useful to determine the relationship between the sensation level of speech sounds and the detection sensitivity of CAEPs.

    Design and methods: Twenty-five sensorineurally hearing impaired infants with an age range of 8 to 30 months were tested once, 18 aided and 7 unaided. First, behavioral thresholds of speech stimuli /m/, /g/, and /t/ were determined using visual reinforcement orientation audiometry (VROA. Afterwards, the same speech stimuli were presented at 55, 65, and 75 dB SPL, and CAEP recordings were made. An automatic statistical detection paradigm was used for CAEP detection.

    Results: For sensation levels above 0, 10, and 20 dB respectively, detection sensitivities were equal to 72 ± 10, 75 ± 10, and 78 ± 12%. In 79% of the cases, automatic detection p-values became smaller when the sensation level was increased by 10 dB.

    Conclusions: The results of this study suggest that the presence or absence of CAEPs can provide some indication of the audibility of a speech sound for infants with sensorineural hearing loss. The detection of a CAEP provides confidence, to a degree commensurate with the detection probability, that the infant is detecting that sound at the level presented. When testing infants where the audibility of speech sounds has not been established behaviorally, the lack of a cortical response indicates the possibility, but by no means a certainty, that the sensation level is 10 dB or less.
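
    The automatic statistical detection paradigm referred to above amounts to testing, across the recorded epochs, whether post-stimulus activity differs reliably from zero, and reporting a p-value. Hotelling's T² is a common choice in this literature; the sketch below substitutes a simpler one-sample t-test on a post-stimulus window of simulated epochs (an illustration, not the study's exact method):

```python
# Illustration of an automatic detection test on simulated epochs (a stand-in
# for the paradigm used in the study; Hotelling's T^2 is a common alternative).
import numpy as np
from scipy import stats

rng = np.random.default_rng(2)
epochs = rng.normal(0.0, 5.0, size=(60, 600))   # 60 epochs x 600 samples (1 kHz assumed)
epochs[:, 100:300] += 1.5                       # simulated cortical response (P1-N1-P2 range)

window_means = epochs[:, 100:300].mean(axis=1)  # one value per epoch
t_stat, p_value = stats.ttest_1samp(window_means, popmean=0.0)
print(f"p = {p_value:.4f} ->", "response detected" if p_value < 0.05 else "no response")
```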

  18. Speech Motor Control in Fluent and Dysfluent Speech Production of an Individual with Apraxia of Speech and Broca's Aphasia

    Science.gov (United States)

    van Lieshout, Pascal H. H. M.; Bose, Arpita; Square, Paula A.; Steele, Catriona M.

    2007-01-01

    Apraxia of speech (AOS) is typically described as a motor-speech disorder with clinically well-defined symptoms, but without a clear understanding of the underlying problems in motor control. A number of studies have compared the speech of subjects with AOS to the fluent speech of controls, but only a few have included speech movement data and if…

  19. Sensorimotor Interactions in Speech Learning

    Directory of Open Access Journals (Sweden)

    Douglas M Shiller

    2011-10-01

    Full Text Available Auditory input is essential for normal speech development and plays a key role in speech production throughout the life span. In traditional models, auditory input plays two critical roles: (1) establishing the acoustic correlates of speech sounds that serve, in part, as the targets of speech production, and (2) serving as a source of feedback about a talker's own speech outcomes. This talk will focus on both of these roles, describing a series of studies that examine the capacity of children and adults to adapt to real-time manipulations of auditory feedback during speech production. In one study, we examined sensory and motor adaptation to a manipulation of auditory feedback during production of the fricative “s”. In contrast to prior accounts, adaptive changes were observed not only in speech motor output but also in subjects' perception of the sound. In a second study, speech adaptation was examined following a period of auditory-perceptual training targeting the perception of vowels. The perceptual training was found to systematically improve subjects' motor adaptation response to altered auditory feedback during speech production. The results of both studies support the idea that perceptual and motor processes are tightly coupled in speech production learning, and that the degree and nature of this coupling may change with development.

  20. Variation and Synthetic Speech

    CERN Document Server

    Miller, C; Massey, N; Miller, Corey; Karaali, Orhan; Massey, Noel

    1997-01-01

    We describe the approach to linguistic variation taken by the Motorola speech synthesizer. A pan-dialectal pronunciation dictionary is described, which serves as the training data for a neural network based letter-to-sound converter. Subsequent to dictionary retrieval or letter-to-sound generation, pronunciations are submitted to a neural network based postlexical module. The postlexical module has been trained on aligned dictionary pronunciations and hand-labeled narrow phonetic transcriptions. This architecture permits the learning of individual postlexical variation, and can be retrained for each speaker whose voice is being modeled for synthesis. Learning variation in this way can result in greater naturalness for the synthetic speech that is produced by the system.

  1. Speech is Golden

    DEFF Research Database (Denmark)

    Juel Henrichsen, Peter

    2014-01-01

    Most of the Danish municipalities are ready to begin to adopt automatic speech recognition, but at the same time remain nervous following a long series of bad business cases in the recent past. Complaints are voiced over costly licences and low service levels, typical effects of a de facto monopoly ... on the supply side. The present article reports on a new public action strategy which has taken shape in the course of 2013-14. While Denmark is a small language area, our public sector is well organised and has considerable purchasing power. Across this past year, Danish local authorities have organised around ... of the present article, in the role of economically neutral advisers. The aim of the initiative is to pave the way for the first profitable contract in the field - which we hope to see in 2014 - an event which would precisely break the present deadlock and open up a billion EUR market for speech technology...

  2. [Improving speech comprehension using a new cochlear implant speech processor].

    Science.gov (United States)

    Müller-Deile, J; Kortmann, T; Hoppe, U; Hessel, H; Morsnowski, A

    2009-06-01

    The aim of this multicenter clinical field study was to assess the benefits of the new Freedom 24 sound processor for cochlear implant (CI) users implanted with the Nucleus 24 cochlear implant system. The study included 48 postlingually profoundly deaf experienced CI users who demonstrated speech comprehension performance with their current speech processor on the Oldenburg sentence test (OLSA) in quiet conditions of at least 80% correct scores and who were able to perform adaptive speech threshold testing using the OLSA in noisy conditions. Following baseline measures of speech comprehension performance with their current speech processor, subjects were upgraded to the Freedom 24 speech processor. After a take-home trial period of at least 2 weeks, subject performance was evaluated by measuring the speech reception threshold with the Freiburg multisyllabic word test and speech intelligibility with the Freiburg monosyllabic word test at 50 dB and 70 dB in the sound field. The results demonstrated highly significant benefits for speech comprehension with the new speech processor. Significant benefits for speech comprehension were also demonstrated with the new speech processor when tested in competing background noise. In contrast, use of the Abbreviated Profile of Hearing Aid Benefit (APHAB) did not prove to be a suitably sensitive assessment tool for comparative subjective self-assessment of hearing benefits with each processor. Use of the preprocessing algorithm known as adaptive dynamic range optimization (ADRO) in the Freedom 24 led to additional improvements over the standard upgrade map for speech comprehension in quiet and showed equivalent performance in noise. Through use of the preprocessing beam-forming algorithm BEAM, subjects demonstrated a highly significant improved signal-to-noise ratio for speech comprehension thresholds (i.e., signal-to-noise ratio for 50% speech comprehension scores) when tested with an adaptive procedure using the Oldenburg ...

  3. Neurophysiology of speech differences in childhood apraxia of speech.

    Science.gov (United States)

    Preston, Jonathan L; Molfese, Peter J; Gumkowski, Nina; Sorcinelli, Andrea; Harwood, Vanessa; Irwin, Julia R; Landi, Nicole

    2014-01-01

    Event-related potentials (ERPs) were recorded during a picture naming task of simple and complex words in children with typical speech and with childhood apraxia of speech (CAS). Results reveal reduced amplitude prior to speaking complex (multisyllabic) words relative to simple (monosyllabic) words for the CAS group over the right hemisphere during a time window thought to reflect phonological encoding of word forms. Group differences were also observed prior to production of spoken tokens regardless of word complexity during a time window just prior to speech onset (thought to reflect motor planning/programming). Results suggest differences in pre-speech neurolinguistic processes.

  4. Hiding Information under Speech

    Science.gov (United States)

    2005-12-12

    ... as it arrives in real time, and it disappears as fast as it arrives. Furthermore, our cognitive process for translating audio sounds to the meaning ... steganography, whose goal is to make the embedded data completely undetectable. In addition, we must dismiss the idea of hiding data by using any ... therefore, an image has more room to hide data; and (2) speech steganography has not led to many money-making commercial businesses. For these two ...
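
    For reference, the least-significant-bit embedding that audio steganography discussions usually start from looks like the following (a generic textbook scheme, shown only for orientation, not the method of this report):

```python
# Generic least-significant-bit (LSB) embedding in 16-bit audio samples.
import numpy as np

def embed(samples: np.ndarray, bits) -> np.ndarray:
    out = samples.copy()
    payload = np.array(bits, dtype=out.dtype)
    out[:len(bits)] = (out[:len(bits)] & ~1) | payload  # overwrite each LSB
    return out

def extract(samples: np.ndarray, n: int):
    return [int(b) for b in samples[:n] & 1]            # read the LSBs back

audio = np.array([1000, -2000, 3000, -4000], dtype=np.int16)
secret = [1, 0, 1, 1]
stego = embed(audio, secret)
assert extract(stego, len(secret)) == secret            # payload survives
```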

  5. Speech Quality Measurement

    Science.gov (United States)

    1977-06-10

    ... noise test, t = 2 for the low-pass filter test, and t = 3 for the ADPCM coding test; s is the subject number ... a separate speech quality laboratory controlled by the NOVA 830 computer. Each of the stations has a CRT, 15 response buttons, a red button ...

  6. Speech recognition in university classrooms

    OpenAIRE

    Wald, Mike; Bain, Keith; Basson, Sara H

    2002-01-01

    The LIBERATED LEARNING PROJECT (LLP) is an applied research project studying two core questions: 1) Can speech recognition (SR) technology successfully digitize lectures to display spoken words as text in university classrooms? 2) Can speech recognition technology be used successfully as an alternative to traditional classroom notetaking for persons with disabilities? This paper addresses these intriguing questions and explores the underlying complex relationship between speech recognition te...

  7. Speech Recognition on Mobile Devices

    DEFF Research Database (Denmark)

    Tan, Zheng-Hua; Lindberg, Børge

    2010-01-01

    The enthusiasm of deploying automatic speech recognition (ASR) on mobile devices is driven both by remarkable advances in ASR technology and by the demand for efficient user interfaces on such devices as mobile phones and personal digital assistants (PDAs). This chapter presents an overview of ASR ... in the mobile context covering motivations, challenges, fundamental techniques and applications. Three ASR architectures are introduced: embedded speech recognition, distributed speech recognition and network speech recognition. Their pros and cons and implementation issues are discussed. Applications within ... command and control, text entry and search are presented with an emphasis on mobile text entry...

  8. Huntington's Disease: Speech, Language and Swallowing

    Science.gov (United States)

  9. Three-year experience with the Sophono in children with congenital conductive unilateral hearing loss: tolerability, audiometry, and sound localization compared to a bone-anchored hearing aid.

    Science.gov (United States)

    Nelissen, Rik C; Agterberg, Martijn J H; Hol, Myrthe K S; Snik, Ad F M

    2016-10-01

    Bone conduction devices (BCDs) are advocated as an amplification option for patients with congenital conductive unilateral hearing loss (UHL), while other treatment options could also be considered. The current study compared a transcutaneous BCD (Sophono) with a percutaneous BCD (bone-anchored hearing aid, BAHA) in 12 children with congenital conductive UHL. Tolerability, audiometry, and sound localization abilities with both types of BCD were studied retrospectively. The mean follow-up was 3.6 years for the Sophono users (n = 6) and 4.7 years for the BAHA users (n = 6). In each group, two patients had stopped using their BCD. Tolerability was favorable for the Sophono. Aided thresholds with the Sophono were unsatisfactory, as their mean pure tone average did not fall below 30 dB HL. Sound localization generally improved with both the Sophono and the BAHA, although localization abilities did not reach the level of normal hearing children. These findings, together with previously reported outcomes, are important to take into account when counseling patients and their caretakers. The selection of a suitable amplification option should always be made deliberately and on an individual basis for each patient in this diverse group of children with congenital conductive UHL.

  10. Teaching Speech Acts

    Directory of Open Access Journals (Sweden)

    2007-01-01

    Full Text Available In this paper I argue that pragmatic ability must become part of what we teach in the classroom if we are to realize the goals of communicative competence for our students. I review the research on pragmatics, especially those articles that point to the effectiveness of teaching pragmatics in an explicit manner, and those that posit methods for teaching. I also note two areas of scholarship that address classroom needs—the use of authentic data and appropriate assessment tools. The essay concludes with a summary of my own experience teaching speech acts in an advanced-level Portuguese class.

  11. PESQ Based Speech Intelligibility Measurement

    NARCIS (Netherlands)

    Beerends, J.G.; Buuren, R.A. van; Vugt, J.M. van; Verhave, J.A.

    2009-01-01

    Several measurement techniques exist to quantify the intelligibility of a speech transmission chain. In the objective domain, the Articulation Index [1] and the Speech Transmission Index STI [2], [3], [4], [5] have been standardized for predicting intelligibility. The STI uses a signal that contains

  12. Separating Underdetermined Convolutive Speech Mixtures

    DEFF Research Database (Denmark)

    Pedersen, Michael Syskind; Wang, DeLiang; Larsen, Jan

    2006-01-01

    ... a method for underdetermined blind source separation of convolutive mixtures. The proposed framework is applicable for separation of instantaneous as well as convolutive speech mixtures. It is possible to iteratively extract each speech signal from the mixture by combining blind source separation...

  13. Perceptual Learning of Interrupted Speech

    NARCIS (Netherlands)

    Benard, Michel Ruben; Başkent, Deniz

    2013-01-01

    The intelligibility of periodically interrupted speech improves once the silent gaps are filled with noise bursts. This improvement has been attributed to phonemic restoration, a top-down repair mechanism that helps intelligibility of degraded speech in daily life. Two hypotheses were investigated u...

  14. Speech Compression Using Multecirculerletet Transform

    Directory of Open Access Journals (Sweden)

    Sulaiman Murtadha

    2012-01-01

    Full Text Available Compressing speech reduces data storage requirements and the time needed to transmit digitized speech over long-haul links such as the Internet. To obtain the best performance in speech compression, wavelet transforms require filters that combine a number of desirable properties, such as orthogonality and symmetry. The MCT basis functions are derived from the GHM basis functions using 2D linear convolution. The fast computation algorithms introduced here add desirable features to the transform. We further assess the performance of the MCT in a speech compression application. This paper discusses the effect of using the DWT and the MCT (in one and two dimensions) on speech compression. DWT and MCT performance in terms of compression ratio (CR), mean square error (MSE) and peak signal to noise ratio (PSNR) is assessed. Computer simulation results indicate that the two-dimensional MCT offers a better compression ratio, MSE and PSNR than the DWT.
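
    The three figures of merit used in the paper are easy to pin down; a small sketch of how they are typically computed (illustrative values, not the paper's results):

```python
# CR, MSE and PSNR as typically computed for a reconstructed signal
# (values below are illustrative, not the paper's results).
import numpy as np

def metrics(original, reconstructed, kept_coeffs):
    cr = original.size / kept_coeffs                   # compression ratio
    mse = np.mean((original - reconstructed) ** 2)     # mean square error
    psnr = 10 * np.log10(np.max(original ** 2) / mse)  # peak SNR in dB
    return cr, mse, psnr

x = np.sin(2 * np.pi * 0.02 * np.arange(1024))
x_hat = x + 0.01 * np.random.default_rng(3).normal(size=x.size)  # toy reconstruction
print(metrics(x, x_hat, kept_coeffs=256))   # e.g., 1/4 of the coefficients kept
```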

  15. PCA-Based Speech Enhancement for Distorted Speech Recognition

    Directory of Open Access Journals (Sweden)

    Tetsuya Takiguchi

    2007-09-01

    Full Text Available We investigated a robust speech feature extraction method using kernel PCA (Principal Component Analysis) for distorted speech recognition. Kernel PCA has been suggested for various image processing tasks requiring an image model, such as denoising, where a noise-free image is constructed from a noisy input image. Much research on robust speech feature extraction has been done, but it remains difficult to completely remove additive or convolution noise (distortion). The most commonly used noise-removal techniques are based on spectral-domain operations, after which, for speech recognition, the MFCC (Mel Frequency Cepstral Coefficient) is computed, where the DCT (Discrete Cosine Transform) is applied to the mel-scale filter bank output. This paper describes a new PCA-based speech enhancement algorithm using kernel PCA instead of the DCT, where the main speech element is projected onto low-order features, while the noise or distortion element is projected onto high-order features. Its effectiveness is confirmed by word recognition experiments on distorted speech.
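
    A hedged sketch of the central idea on synthetic feature vectors: fit kernel PCA so that the dominant structure lives in a few low-order components, then project noisy features and map back. This uses scikit-learn's KernelPCA; the kernel, parameters and data are assumptions, not the paper's setup:

```python
# Kernel-PCA "denoising" of feature vectors: keep low-order nonlinear
# components, discard the rest, and map back to feature space.
import numpy as np
from sklearn.decomposition import KernelPCA

rng = np.random.default_rng(4)
latent = rng.normal(size=(300, 4))                   # low-dimensional structure
clean = np.tanh(latent @ rng.normal(size=(4, 24)))   # stand-in clean features
noisy = clean + 0.3 * rng.normal(size=clean.shape)   # additive distortion

kpca = KernelPCA(n_components=8, kernel="rbf", gamma=0.05,
                 fit_inverse_transform=True)         # needed for the mapping back
kpca.fit(clean)                                      # assumes clean training data
denoised = kpca.inverse_transform(kpca.transform(noisy))

print("MSE noisy   :", np.mean((clean - noisy) ** 2))
print("MSE denoised:", np.mean((clean - denoised) ** 2))
```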

  16. Interactions between distal speech rate, linguistic knowledge, and speech environment.

    Science.gov (United States)

    Morrill, Tuuli; Baese-Berk, Melissa; Heffner, Christopher; Dilley, Laura

    2015-10-01

    During lexical access, listeners use both signal-based and knowledge-based cues, and information from the linguistic context can affect the perception of acoustic speech information. Recent findings suggest that the various cues used in lexical access are implemented with flexibility and may be affected by information from the larger speech context. We conducted 2 experiments to examine effects of a signal-based cue (distal speech rate) and a knowledge-based cue (linguistic structure) on lexical perception. In Experiment 1, we manipulated distal speech rate in utterances where an acoustically ambiguous critical word was either obligatory for the utterance to be syntactically well formed (e.g., Conner knew that bread and butter (are) both in the pantry) or optional (e.g., Don must see the harbor (or) boats). In Experiment 2, we examined identical target utterances as in Experiment 1 but changed the distribution of linguistic structures in the fillers. The results of the 2 experiments demonstrate that speech rate and linguistic knowledge about critical word obligatoriness can both influence speech perception. In addition, it is possible to alter the strength of a signal-based cue by changing information in the speech environment. These results provide support for models of word segmentation that include flexible weighting of signal-based and knowledge-based cues.

  17. Brain stem evoked response audiometry of former drug users

    Directory of Open Access Journals (Sweden)

    Tainara Milbradt Weich

    2012-10-01

    Full Text Available Illicit drugs are known for their deleterious effects upon the central nervous system, and more specifically for how they adversely affect hearing. OBJECTIVE: This study aims to analyze and compare the hearing complaints and the results of brainstem evoked response audiometry (BERA) of former drug user support group goers. METHODS: This is a cross-sectional, non-experimental, descriptive, quantitative study. The sample consisted of 17 subjects divided by their preferred drug of use: ten individuals were placed in the marijuana group (G1) and seven in the crack/cocaine group (G2). The subjects were further divided by duration of drug use: one to five years, six to ten years, and more than 15 years. Assessment comprised an anamnesis, pure tone audiometry, acoustic immittance measures, and BERA. RESULTS: When comparing the results of G1 and G2, regardless of the duration of drug use, no statistically significant difference was observed in absolute latencies or interpeak intervals. However, only five of the 17 subjects had BERA results appropriate for their age range. CONCLUSION: Regardless of the duration of use, marijuana and crack/cocaine may cause diffuse brainstem alterations, compromising the transmission of the auditory stimulus.

  18. Hate Speech/Free Speech: Using Feminist Perspectives To Foster On-Campus Dialogue.

    Science.gov (United States)

    Cornwell, Nancy; Orbe, Mark P.; Warren, Kiesha

    1999-01-01

    Explores the complex issues inherent in the tension between hate speech and free speech, focusing on the phenomenon of hate speech on college campuses. Describes the challenges to hate speech made by critical race theorists and explains how a feminist critique can reorient the parameters of hate speech. (SLD)

  19. The Stylistic Analysis of Public Speech

    Institute of Scientific and Technical Information of China (English)

    李龙

    2011-01-01

    Public speech is a very important part of our daily life. The ability to deliver a good public speech is something we need to learn and to have, especially in the service sector. This paper attempts to analyze the style of public speech, in the hope of providing inspiration for whenever we deliver such a speech.

  20. Linguistic Units and Speech Production Theory.

    Science.gov (United States)

    MacNeilage, Peter F.

    This paper examines the validity of the concept of linguistic units in a theory of speech production. Substantiating data are drawn from the study of the speech production process itself. Secondarily, an attempt is made to reconcile the postulation of linguistic units in speech production theory with their apparent absence in the speech signal.…

  1. Automated Speech Rate Measurement in Dysarthria

    Science.gov (United States)

    Martens, Heidi; Dekens, Tomas; Van Nuffelen, Gwen; Latacz, Lukas; Verhelst, Werner; De Bodt, Marc

    2015-01-01

    Purpose: In this study, a new algorithm for automated determination of speech rate (SR) in dysarthric speech is evaluated. We investigated how reliably the algorithm calculates the SR of dysarthric speech samples when compared with calculation performed by speech-language pathologists. Method: The new algorithm was trained and tested using Dutch…
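
    A common baseline for automatic speech rate, counting peaks in a smoothed short-time energy envelope as syllable nuclei, can be sketched as follows (a generic approach, not the algorithm evaluated in the study):

```python
# Generic speech-rate baseline: syllable nuclei as energy-envelope peaks.
import numpy as np
from scipy.signal import find_peaks

def speech_rate(x, fs, frame_ms=20):
    frame = int(fs * frame_ms / 1000)
    energy = np.array([np.sum(x[i:i + frame] ** 2)
                       for i in range(0, len(x) - frame, frame)])
    energy = np.convolve(energy, np.ones(5) / 5, mode="same")    # smooth
    peaks, _ = find_peaks(energy, height=0.3 * energy.max(),
                          distance=3)           # >= ~60 ms between nuclei
    return len(peaks) / (len(x) / fs)           # syllables per second

fs = 16000
t = np.arange(fs * 2) / fs                      # 2 s of synthetic "speech"
x = np.sin(2 * np.pi * 150 * t) * (0.5 + 0.5 * np.sin(2 * np.pi * 4 * t))
print(speech_rate(x, fs))                       # ~4 energy peaks per second
```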

  2. Coevolution of Human Speech and Trade

    NARCIS (Netherlands)

    Horan, R.D.; Bulte, E.H.; Shogren, J.F.

    2008-01-01

    We propose a paleoeconomic coevolutionary explanation for the origin of speech in modern humans. The coevolutionary process, in which trade facilitates speech and speech facilitates trade, gives rise to multiple stable trajectories. While a 'trade-speech' equilibrium is not an inevitable outcome for ...

  3. Connected Speech Processes in Australian English.

    Science.gov (United States)

    Ingram, J. C. L.

    1989-01-01

    Explores the role of Connected Speech Processes (CSP) in accounting for sociolinguistically significant dimensions of speech variation, and presents initial findings on the distribution of CSPs in the speech of Australian adolescents. The data were gathered as part of a wider survey of speech of Brisbane school children. (Contains 26 references.)…

  4. ARMA Modelling for Whispered Speech

    Institute of Scientific and Technical Information of China (English)

    Xue-li LI; Wei-dong ZHOU

    2010-01-01

    The Autoregressive Moving Average (ARMA) model for whispered speech is proposed. Compared with normal speech, whispered speech has no fundamental frequency because of the glottis being semi-opened and turbulent flow being created, and formant shifting exists in the lower frequency region due to the narrowing of the tract in the false vocal fold regions and weak acoustic coupling with the subglottal system. Analysis shows that the effect of the subglottal system is to introduce additional pole-zero pairs into the vocal tract transfer function. Theoretically, the method based on an ARMA process is superior to that based on an AR process in the spectral analysis of the whispered speech. Two methods, the least squared modified Yule-Walker likelihood estimate (LSMY) algorithm and the Frequency-Domain Steiglitz-Mcbride (FDSM) algorithm, are applied to the ARMA model for the whispered speech. The performance evaluation shows that the ARMA model is much more appropriate for representing the whispered speech than the AR model, and the FDSM algorithm provides a more accurate estimation of the whispered speech spectral envelope than the LSMY algorithm with higher computational complexity.
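
    The AR-versus-ARMA comparison can be sketched with off-the-shelf estimators rather than the paper's LSMY and FDSM algorithms; in the sketch below the model orders are assumptions and the frame is synthetic:

```python
# AR vs ARMA fit on a synthetic frame (orders are assumptions; the paper's
# LSMY and FDSM estimators are not reproduced here).
import numpy as np
from statsmodels.tsa.arima.model import ARIMA

rng = np.random.default_rng(5)
e = rng.normal(size=400)
frame = np.zeros(400)
for n in range(2, 400):                # ARMA(2,1)-like synthetic "whisper" frame
    frame[n] = 1.3 * frame[n - 1] - 0.7 * frame[n - 2] + e[n] + 0.6 * e[n - 1]

ar_fit = ARIMA(frame, order=(10, 0, 0)).fit()    # all-pole (AR) model
arma_fit = ARIMA(frame, order=(2, 0, 1)).fit()   # pole-zero (ARMA) model
print("AR(10) AIC:", ar_fit.aic, "| ARMA(2,1) AIC:", arma_fit.aic)
```

    On a signal with genuine zeros, the low-order ARMA model typically matches or beats a much higher-order AR model, which is the paper's argument for ARMA in whispered-speech analysis.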

  5. Speech recognition from spectral dynamics

    Indian Academy of Sciences (India)

    Hynek Hermansky

    2011-10-01

    Information is carried in changes of a signal. The paper starts with revisiting Dudley’s concept of the carrier nature of speech. It points to its close connection to modulation spectra of speech and argues against short-term spectral envelopes as dominant carriers of the linguistic information in speech. The history of spectral representations of speech is briefly discussed. Some of the history of gradual infusion of the modulation spectrum concept into Automatic recognition of speech (ASR) comes next, pointing to the relationship of modulation spectrum processing to well-accepted ASR techniques such as dynamic speech features or RelAtive SpecTrAl (RASTA) filtering. Next, the frequency domain perceptual linear prediction technique for deriving autoregressive models of temporal trajectories of spectral power in individual frequency bands is reviewed. Finally, posterior-based features, which allow for straightforward application of modulation frequency domain information, are described. The paper is tutorial in nature, aims at a historical global overview of attempts for using spectral dynamics in machine recognition of speech, and does not always provide enough detail of the described techniques. However, extensive references to earlier work are provided to compensate for the lack of detail in the paper.

  6. Brain-inspired speech segmentation for automatic speech recognition using the speech envelope as a temporal reference

    Science.gov (United States)

    Lee, Byeongwook; Cho, Kwang-Hyun

    2016-11-01

    Speech segmentation is a crucial step in automatic speech recognition because additional speech analyses are performed for each framed speech segment. Conventional segmentation techniques primarily segment speech using a fixed frame size for computational simplicity. However, this approach is insufficient for capturing the quasi-regular structure of speech, which causes substantial recognition failure in noisy environments. How does the brain handle quasi-regular structured speech and maintain high recognition performance under any circumstance? Recent neurophysiological studies have suggested that the phase of neuronal oscillations in the auditory cortex contributes to accurate speech recognition by guiding speech segmentation into smaller units at different timescales. A phase-locked relationship between neuronal oscillation and the speech envelope has recently been obtained, which suggests that the speech envelope provides a foundation for multi-timescale speech segmental information. In this study, we quantitatively investigated the role of the speech envelope as a potential temporal reference to segment speech using its instantaneous phase information. We evaluated the proposed approach by the achieved information gain and recognition performance in various noisy environments. The results indicate that the proposed segmentation scheme not only extracts more information from speech but also provides greater robustness in a recognition test.
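
    A simplified sketch of using the envelope's instantaneous phase as a temporal reference (the band edges and the wrap-based boundary rule below are assumptions for illustration, not the paper's exact scheme):

```python
# Envelope-as-temporal-reference sketch: band-limit the speech envelope to the
# syllabic range and place segment boundaries where its Hilbert phase wraps.
import numpy as np
from scipy.signal import hilbert, butter, filtfilt

def envelope_segments(x, fs):
    env = np.abs(hilbert(x))                          # broadband envelope
    b, a = butter(2, [2, 10], btype="band", fs=fs)    # syllabic-rate band (2-10 Hz)
    env_band = filtfilt(b, a, env)
    phase = np.angle(hilbert(env_band))               # instantaneous phase
    wraps = np.where(np.diff(phase) < -np.pi)[0]      # -pi..pi wrap points
    return wraps / fs                                 # boundary times in seconds

fs = 8000
t = np.arange(fs * 2) / fs
x = np.sin(2 * np.pi * 200 * t) * (0.5 + 0.5 * np.sin(2 * np.pi * 4 * t))
print(envelope_segments(x, fs))                       # ~one boundary per 4 Hz cycle
```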

  7. INTEGRATING MACHINE TRANSLATION AND SPEECH SYNTHESIS COMPONENT FOR ENGLISH TO DRAVIDIAN LANGUAGE SPEECH TO SPEECH TRANSLATION SYSTEM

    Directory of Open Access Journals (Sweden)

    J. SANGEETHA

    2015-02-01

    Full Text Available This paper provides an interface between the machine translation and speech synthesis systems for converting English speech to Tamil in an English-to-Tamil speech-to-speech translation system. The speech translation system consists of three modules: automatic speech recognition, machine translation and text-to-speech synthesis. Many procedures for the integration of speech recognition and machine translation have been proposed, but the speech synthesis component has not yet been assessed in this way. In this paper, we focus on the integration of machine translation and speech synthesis, and report a subjective evaluation to investigate the impact of the speech synthesis and machine translation components and of their integration. Here we implement a hybrid machine translation system (a combination of rule-based and statistical machine translation) and a concatenative syllable-based speech synthesis technique. In order to retain the naturalness and intelligibility of the synthesized speech, Auto Associative Neural Network (AANN) prosody prediction is used in this work. The results of this system investigation demonstrate that the naturalness and intelligibility of the synthesized speech are strongly influenced by the fluency and correctness of the translated text.

  8. A Survey on Speech Enhancement Methodologies

    Directory of Open Access Journals (Sweden)

    Ravi Kumar. K

    2016-12-01

    Full Text Available Speech enhancement is a technique which processes the noisy speech signal. The aim of speech enhancement is to improve the perceived quality of speech and/or to improve its intelligibility. Due to its vast applications in mobile telephony, VOIP, hearing aids, Skype and speaker recognition, the challenges in speech enhancement have grown over the years. It is more challenging to suppress background noise that affects human communication in noisy environments like airports, road works, traffic, and cars. The objective of this survey paper is to outline the single-channel speech enhancement methodologies used for enhancing a speech signal corrupted with additive background noise, and also to discuss the challenges and opportunities of single-channel speech enhancement. This paper mainly focuses on transform-domain techniques and supervised (NMF, HMM) speech enhancement techniques, and gives a framework for developments in speech enhancement methodologies.
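
    Of the supervised families the survey names, NMF is the easiest to show in miniature: learn nonnegative spectral bases for speech and for noise separately, then explain a noisy magnitude spectrogram with both sets of bases and keep the speech part. The sketch below (random stand-in "spectrograms", plain multiplicative updates) illustrates the idea only; it is not a method from the survey:

```python
# Miniature supervised NMF separation: V ~ W H with W = [W_speech | W_noise].
import numpy as np

rng = np.random.default_rng(6)

def nmf(V, r, iters=200):
    """Learn nonnegative bases W and activations H for V (Frobenius objective)."""
    W = rng.random((V.shape[0], r))
    H = rng.random((r, V.shape[1]))
    for _ in range(iters):
        H *= (W.T @ V) / (W.T @ W @ H + 1e-9)   # multiplicative updates
        W *= (V @ H.T) / (W @ H @ H.T + 1e-9)
    return W, H

def nmf_activations(V, W, iters=200):
    """Fit activations only, keeping the bases W fixed."""
    H = rng.random((W.shape[1], V.shape[1]))
    for _ in range(iters):
        H *= (W.T @ V) / (W.T @ W @ H + 1e-9)
    return H

speech_train = rng.random((64, 200))   # stand-in |STFT| of clean speech
noise_train = rng.random((64, 200))    # stand-in |STFT| of noise
W_speech, _ = nmf(speech_train, r=8)
W_noise, _ = nmf(noise_train, r=8)

noisy = rng.random((64, 120))          # stand-in |STFT| of noisy speech
W = np.hstack([W_speech, W_noise])
H = nmf_activations(noisy, W)
speech_estimate = W_speech @ H[:8]     # keep only the speech component
```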

  9. Speech enhancement theory and practice

    CERN Document Server

    Loizou, Philipos C

    2013-01-01

    With the proliferation of mobile devices and hearing devices, including hearing aids and cochlear implants, there is a growing and pressing need to design algorithms that can improve speech intelligibility without sacrificing quality. Responding to this need, Speech Enhancement: Theory and Practice, Second Edition introduces readers to the basic problems of speech enhancement and the various algorithms proposed to solve these problems. Updated and expanded, this second edition of the bestselling textbook broadens its scope to include evaluation measures and enhancement algorithms aimed at impr

  10. Computational neuroanatomy of speech production.

    Science.gov (United States)

    Hickok, Gregory

    2012-01-05

    Speech production has been studied predominantly from within two traditions, psycholinguistics and motor control. These traditions have rarely interacted, and the resulting chasm between these approaches seems to reflect a level of analysis difference: whereas motor control is concerned with lower-level articulatory control, psycholinguistics focuses on higher-level linguistic processing. However, closer examination of both approaches reveals a substantial convergence of ideas. The goal of this article is to integrate psycholinguistic and motor control approaches to speech production. The result of this synthesis is a neuroanatomically grounded, hierarchical state feedback control model of speech production.

  11. Steganalysis of recorded speech

    Science.gov (United States)

    Johnson, Micah K.; Lyu, Siwei; Farid, Hany

    2005-03-01

    Digital audio provides a suitable cover for high-throughput steganography. At 16 bits per sample and sampled at a rate of 44,100 Hz, digital audio has the bit-rate to support large messages. In addition, audio is often transient and unpredictable, facilitating the hiding of messages. Using an approach similar to our universal image steganalysis, we show that hidden messages alter the underlying statistics of audio signals. Our statistical model begins by building a linear basis that captures certain statistical properties of audio signals. A low-dimensional statistical feature vector is extracted from this basis representation and used by a non-linear support vector machine for classification. We show the efficacy of this approach on LSB embedding and Hide4PGP. While no explicit assumptions about the content of the audio are made, our technique has been developed and tested on high-quality recorded speech.

  12. Speech recovery device

    Energy Technology Data Exchange (ETDEWEB)

    Frankle, Christen M.

    2004-04-20

    There is provided an apparatus and method for assisting speech recovery in people with inability to speak due to aphasia, apraxia or another condition with similar effect. A hollow, rigid, thin-walled tube with semi-circular or semi-elliptical cut out shapes at each open end is positioned such that one end mates with the throat/voice box area of the neck of the assistor and the other end mates with the throat/voice box area of the assisted. The speaking person (assistor) makes sounds that produce standing wave vibrations at the same frequency in the vocal cords of the assisted person. Driving the assisted person's vocal cords with the assisted person being able to hear the correct tone enables the assisted person to speak by simply amplifying the vibration of membranes in their throat.

  13. Speech recovery device

    Energy Technology Data Exchange (ETDEWEB)

    Frankle, Christen M.

    2000-10-19

    There is provided an apparatus and method for assisting speech recovery in people with inability to speak due to aphasia, apraxia or another condition with similar effect. A hollow, rigid, thin-walled tube with semi-circular or semi-elliptical cut out shapes at each open end is positioned such that one end mates with the throat/voice box area of the neck of the assistor and the other end mates with the throat/voice box area of the assisted. The speaking person (assistor) makes sounds that produce standing wave vibrations at the same frequency in the vocal cords of the assisted person. Driving the assisted person's vocal cords with the assisted person being able to hear the correct tone enables the assisted person to speak by simply amplifying the vibration of membranes in their throat.

  14. Join Cost for Unit Selection Speech Synthesis

    OpenAIRE

    Vepa, Jithendra

    2004-01-01

    Undoubtedly, state-of-the-art unit selection-based concatenative speech systems produce very high quality synthetic speech. This is due to a large speech database containing many instances of each speech unit, with a varied and natural distribution of prosodic and spectral characteristics. The join cost, which measures how well two units can be joined together, is one of the main criteria for selecting appropriate units from this large speech database. The ideal join cost is one that measur...
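
    A generic join cost, not the specific measures investigated in the thesis, can be written as a weighted combination of spectral and pitch mismatch at the boundary frames of the two candidate units:

```python
# Generic join-cost sketch: weighted spectral (MFCC) + F0 mismatch at the join.
import numpy as np

def join_cost(mfcc_left, mfcc_right, f0_left, f0_right,
              w_spec=1.0, w_f0=0.1):
    spectral = np.linalg.norm(mfcc_left - mfcc_right)   # Euclidean MFCC distance
    pitch = abs(f0_left - f0_right)                     # Hz mismatch at boundary
    return w_spec * spectral + w_f0 * pitch

# Hypothetical boundary frames of two candidate units:
cost = join_cost(np.array([1.2, -0.3, 0.8]), np.array([1.0, -0.1, 0.7]),
                 f0_left=118.0, f0_right=124.0)
print(cost)   # lower = smoother join
```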

  15. Preventive measures in speech and language therapy

    OpenAIRE

    Slokar, Polona

    2014-01-01

    Preventive care plays an important role in speech and language therapy. Through training, a speech and language therapist informs experts and the general public about his or her work in the fields of feeding and of speech and language development, as well as about the deficits that may appear in relation to communication and feeding. A speech and language therapist is also responsible for early detection of irregularities and of those factors which affect speech and language development. To a...

  16. Speech distortion measure based on auditory properties

    Institute of Scientific and Technical Information of China (English)

    CHEN Guo; HU Xiulin; ZHANG Yunyu; ZHU Yaoting

    2000-01-01

    The Perceptual Spectrum Distortion (PSD) measure, based on the auditory properties of human hearing, is presented to measure speech distortion. The PSD measure calculates the speech distortion distance by simulating the auditory properties of human hearing and converting the short-time speech power spectrum to an auditory perceptual spectrum. Preliminary simulation experiments in comparison with the Itakura measure have been done. The results show that the PSD measure is a preferable speech distortion measure and is more consistent with subjective assessment of speech quality.

  17. Speech of people with autism: Echolalia and echolalic speech

    OpenAIRE

    Błeszyński, Jacek Jarosław

    2013-01-01

    Speech of people with autism is recognised as one of the basic diagnostic, therapeutic and theoretical problems. One of the most common symptoms of autism in children is echolalia, described here as being of different types and severity. This paper presents the results of studies into different levels of echolalia, both in normally developing children and in children diagnosed with autism, discusses the differences between simple echolalia and echolalic speech - which can be considered to b...

  18. Perception of Emotion in Conversational Speech by Younger and Older Listeners

    Directory of Open Access Journals (Sweden)

    Juliane eSchmidt

    2016-05-01

    Full Text Available This study investigated whether age and/or differences in hearing sensitivity influence the perception of the emotion dimensions arousal (calm vs. aroused) and valence (positive vs. negative attitude) in conversational speech. To that end, this study specifically focused on the relationship between participants’ ratings of short affective utterances and the utterances’ acoustic parameters (pitch, intensity, and articulation rate) known to be associated with the emotion dimensions arousal and valence. Stimuli consisted of short utterances taken from a corpus of conversational speech. In two rating tasks, younger and older adults either rated arousal or valence using a 5-point scale. Mean intensity was found to be the main cue participants used in the arousal task (i.e., higher mean intensity cueing higher levels of arousal), while mean F0 was the main cue in the valence task (i.e., higher mean F0 being interpreted as more negative). Even though there were no overall age group differences in arousal or valence ratings, compared to younger adults, older adults responded less strongly to mean intensity differences cueing arousal and responded more strongly to differences in mean F0 cueing valence. Individual hearing sensitivity among the older adults did not modify the use of mean intensity as an arousal cue. However, individual hearing sensitivity generally affected valence ratings and modified the use of mean F0. We conclude that age differences in the interpretation of mean F0 as a cue for valence are likely due to age-related hearing loss, whereas age differences in rating arousal do not seem to be driven by hearing sensitivity differences between age groups (as measured by pure-tone audiometry).

  19. Why Go to Speech Therapy?

    Science.gov (United States)

    ... teens who stutter make positive changes in their communication skills. As you work with your speech pathologist ...

  20. Emotion Recognition using Speech Features

    CERN Document Server

    Rao, K Sreenivasa

    2013-01-01

    “Emotion Recognition Using Speech Features” covers emotion-specific features present in speech and discussion of suitable models for capturing emotion-specific information for distinguishing different emotions.  The content of this book is important for designing and developing  natural and sophisticated speech systems. Drs. Rao and Koolagudi lead a discussion of how emotion-specific information is embedded in speech and how to acquire emotion-specific knowledge using appropriate statistical models. Additionally, the authors provide information about using evidence derived from various features and models. The acquired emotion-specific knowledge is useful for synthesizing emotions. Discussion includes global and local prosodic features at syllable, word and phrase levels, helpful for capturing emotion-discriminative information; use of complementary evidences obtained from excitation sources, vocal tract systems and prosodic features in order to enhance the emotion recognition performance;  and pro...

  1. English Speeches Of Three Minutes

    Institute of Scientific and Technical Information of China (English)

    凌和军; 丁小琴

    2002-01-01

    English speeches, which were made at the beginning of this term, are popular among us English learners, as they are very useful for improving our spoken English. So each of us feels very interested to join the activity.

  2. Delayed Speech or Language Development

    Science.gov (United States)

    ... Your son ... for communication exchange and participation? What kind of feedback does the child get? When speech, language, hearing, ...

  3. Writing, Inner Speech, and Meditation.

    Science.gov (United States)

    Moffett, James

    1982-01-01

    Examines the interrelationships among meditation, inner speech (stream of consciousness), and writing. Considers the possibilities and implications of using the techniques of meditation in educational settings, especially in the writing classroom. (RL)

  4. Enhancement of speech signals - with a focus on voiced speech models

    DEFF Research Database (Denmark)

    Nørholm, Sidsel Marie

    This thesis deals with speech enhancement, i.e., noise reduction in speech signals. This has applications in, e.g., hearing aids and teleconference systems. We consider a signal-driven approach to speech enhancement where a model of the speech is assumed and filters are generated based on this model...

  5. Novel Techniques for Dialectal Arabic Speech Recognition

    CERN Document Server

    Elmahdy, Mohamed; Minker, Wolfgang

    2012-01-01

    Novel Techniques for Dialectal Arabic Speech describes approaches to improve automatic speech recognition for dialectal Arabic. Since speech resources for dialectal Arabic speech recognition are very sparse, the authors describe how existing Modern Standard Arabic (MSA) speech data can be applied to dialectal Arabic speech recognition, while assuming that MSA is always a second language for all Arabic speakers. In this book, Egyptian Colloquial Arabic (ECA) has been chosen as a typical Arabic dialect. ECA is the first ranked Arabic dialect in terms of number of speakers, and a high quality ECA speech corpus with accurate phonetic transcription has been collected. MSA acoustic models were trained using news broadcast speech. In order to cross-lingually use MSA in dialectal Arabic speech recognition, the authors have normalized the phoneme sets for MSA and ECA. After this normalization, they have applied state-of-the-art acoustic model adaptation techniques like Maximum Likelihood Linear Regression (MLLR) and M...

  6. Tactile Modulation of Emotional Speech Samples

    Directory of Open Access Journals (Sweden)

    Katri Salminen

    2012-01-01

    Traditionally, only speech communicates emotions via mobile phone. However, in daily communication the sense of touch mediates emotional information during conversation. The present aim was to study whether tactile stimulation affects emotional ratings of speech when measured with scales of pleasantness, arousal, approachability, and dominance. In Experiment 1, participants rated speech-only and speech-tactile stimuli. The tactile signal mimicked the amplitude changes of the speech. In Experiment 2, the aim was to study whether the way the tactile signal was produced affected the ratings. The tactile signal either mimicked the amplitude changes of the speech sample in question or the amplitude changes of another speech sample. Concurrent static vibration was also included. The results showed that the speech-tactile stimuli were rated as more arousing and dominant than the speech-only stimuli. The speech-only stimuli were rated as more approachable than the speech-tactile stimuli, but only in Experiment 1. Variations in tactile stimulation also affected the ratings. When the tactile stimulation was static vibration, the speech-tactile stimuli were rated as more arousing than when the concurrent tactile stimulation mimicked speech samples. The results suggest that tactile stimulation offers new ways of modulating and enriching the interpretation of speech.

  7. An Approach to Hide Secret Speech Information

    Institute of Scientific and Technical Information of China (English)

    WU Zhi-jun; DUAN Hai-xin; LI Xing

    2006-01-01

    This paper presents an approach to hiding secret speech information in a code-excited linear prediction (CELP)-based speech coding scheme by adopting an analysis-by-synthesis (ABS)-based algorithm for speech information hiding and extraction, for the purpose of secure speech communication. The secret speech is coded in 2.4 kb/s mixed excitation linear prediction (MELP), which is embedded in CELP-type public speech. The ABS algorithm adopts the speech synthesizer in the speech coder, so speech embedding and coding are synchronous, i.e., a fusion of public and secret speech information data. An experiment embedding 2.4 kb/s MELP secret speech in G.728-coded public speech transmitted via the public switched telephone network (PSTN) shows that the proposed approach satisfies the requirements of information hiding, meets the speech quality constraints of secure communication, and achieves a high hiding capacity of 3.2 kb/s on average with excellent speech quality, while complicating speaker recognition.

  8. Impaired motor speech performance in Huntington's disease.

    Science.gov (United States)

    Skodda, Sabine; Schlegel, Uwe; Hoffmann, Rainer; Saft, Carsten

    2014-04-01

    Dysarthria is a common symptom of Huntington's disease and has been reported, besides other features, to be characterized by alterations of speech rate and regularity. However, data on the specific pattern of motor speech impairment and its relationship to other motor and neuropsychological symptoms are sparse. Therefore, the aim of the present study was to describe and objectively analyse different speech parameters, with special emphasis on the timing of connected speech and non-speech verbal utterances. 21 patients with manifest Huntington's disease and 21 age- and gender-matched healthy controls performed a reading task and several syllable repetition tasks. Computerized acoustic analysis of different variables for the measurement of speech rate and regularity revealed a typical pattern of impaired motor speech performance, with a reduction of speech rate, an increase of pauses and a marked inability to steadily repeat single syllables. Abnormalities of speech parameters were more pronounced in the subgroup of patients with Huntington's disease receiving antidopaminergic medication, but were also present in the drug-naïve patients. Speech rate related to connected speech and parameters of syllable repetition showed correlations to overall motor impairment, tapping capacity in a quantitative motor assessment and some scores of cognitive function. After these preliminary data, further investigations on patients in different stages of disease are warranted to examine whether the analysis of speech and non-speech verbal utterances might be a helpful additional tool for monitoring functional disability in Huntington's disease.
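
    To make the timing measures above concrete, here is a minimal sketch of an energy-based analysis of pause fraction and net speech rate. It is an illustration only, not the study's protocol; the frame length, silence threshold, and syllable count are assumptions.

    ```python
    import numpy as np

    def pause_fraction(x, fs, frame_ms=20.0, threshold_db=-35.0):
        """Fraction of frames whose RMS level falls below a silence threshold."""
        n = int(fs * frame_ms / 1000)
        frames = np.asarray(x, float)[: len(x) // n * n].reshape(-1, n)
        rms = np.sqrt(np.mean(frames ** 2, axis=1)) + 1e-12
        level_db = 20 * np.log10(rms / np.max(rms))
        return float(np.mean(level_db < threshold_db))

    def net_speech_rate(n_syllables, x, fs, **kw):
        """Syllables per second of actual phonation (pauses excluded)."""
        speaking_s = (len(x) / fs) * (1.0 - pause_fraction(x, fs, **kw))
        return n_syllables / speaking_s

    # Synthetic example: 3 s of tone bursts separated by silence.
    fs = 16000
    t = np.arange(3 * fs) / fs
    x = np.sin(2 * np.pi * 150 * t) * (np.sin(2 * np.pi * 1.5 * t) > 0)
    print(pause_fraction(x, fs), net_speech_rate(12, x, fs))
    ```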

  9. Neural pathways for visual speech perception

    Directory of Open Access Journals (Sweden)

    Lynne E Bernstein

    2014-12-01

    This paper examines the questions: what levels of speech can be perceived visually, and how is visual speech represented by the brain? A review of the literature leads to the conclusions that every level of psycholinguistic speech structure (i.e., phonetic features, phonemes, syllables, words, and prosody) can be perceived visually, although individuals differ in their abilities to do so; and that there are visual modality-specific representations of speech qua speech in higher-level vision brain areas. That is, the visual system represents the modal patterns of visual speech. The suggestion that the auditory speech pathway receives and represents visual speech is examined in light of neuroimaging evidence on the auditory speech pathways. We outline the generally agreed-upon organization of the visual ventral and dorsal pathways and examine several types of visual processing that might be related to speech through those pathways, specifically, face and body, orthography, and sign language processing. In this context, we examine the visual speech processing literature, which reveals widespread, diverse patterns of activity in posterior temporal cortices in response to visual speech stimuli. We outline a model of the visual and auditory speech pathways and make several suggestions: (1) The visual perception of speech relies on visual pathway representations of speech qua speech. (2) A proposed site of these representations, the temporal visual speech area (TVSA), has been demonstrated in posterior temporal cortex, ventral and posterior to the multisensory posterior superior temporal sulcus (pSTS). (3) Given that visual speech has dynamic and configural features, its representations in feedforward visual pathways are expected to integrate these features, possibly in TVSA.

  10. Child directed speech, speech in noise and hyperarticulated speech in the Pacific Northwest

    Science.gov (United States)

    Wright, Richard; Carmichael, Lesley; Beckford Wassink, Alicia; Galvin, Lisa

    2001-05-01

    Three types of exaggerated speech are thought to be systematic responses to accommodate the needs of the listener: child-directed speech (CDS), hyperspeech, and the Lombard response. CDS (e.g., Kuhl et al., 1997) occurs in interactions with young children and infants. Hyperspeech (Johnson et al., 1993) is a modification in response to listeners' difficulties in recovering the intended message. The Lombard response (e.g., Lane et al., 1970) is a compensation for increased noise in the signal. While all three result from adaptations to accommodate the needs of the listener, and therefore should share some features, the triggering conditions are quite different, and therefore should exhibit differences in their phonetic outcomes. While CDS has been the subject of a variety of acoustic studies, it has never been studied in the broader context of the other "exaggerated" speech styles. A large crosslinguistic study was undertaken that compares speech produced under four conditions: spontaneous conversations, CDS aimed at 6-9-month-old infants, hyperarticulated speech, and speech in noise. This talk will present some findings for North American English as spoken in the Pacific Northwest. The measures include f0, vowel duration, F1 and F2 at vowel midpoint, and intensity.

  11. Experimental comparison between speech transmission index, rapid speech transmission index, and speech intelligibility index.

    Science.gov (United States)

    Larm, Petra; Hongisto, Valtteri

    2006-02-01

    During the acoustical design of, e.g., auditoria or open-plan offices, it is important to know how speech can be perceived in various parts of the room. Different objective methods have been developed to measure and predict speech intelligibility, and these have been extensively used in various spaces. In this study, two such methods were compared, the speech transmission index (STI) and the speech intelligibility index (SII). Also the simplification of the STI, the room acoustics speech transmission index (RASTI), was considered. These quantities are all based on determining an apparent speech-to-noise ratio on selected frequency bands and summing them using a specific weighting. For comparison, some data were needed on the possible differences of these methods resulting from the calculation scheme and also measuring equipment. Their prediction accuracy was also of interest. Measurements were made in a laboratory having adjustable noise level and absorption, and in a real auditorium. It was found that the measurement equipment, especially the selection of the loudspeaker, can greatly affect the accuracy of the results. The prediction accuracy of the RASTI was found acceptable, if the input values for the prediction are accurately known, even though the studied space was not ideally diffuse.
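
    All three indices reduce, in the simplest case of additive stationary noise, to a weighted sum of band-wise transmission indices derived from clipped apparent SNRs. The sketch below shows that common core; the band weights are illustrative placeholders, not the standardized IEC 60268-16 values, and the full STI additionally works from modulation transfer functions.

    ```python
    import numpy as np

    OCTAVE_BANDS_HZ = [125, 250, 500, 1000, 2000, 4000, 8000]
    BAND_WEIGHTS = np.array([0.06, 0.10, 0.15, 0.20, 0.20, 0.17, 0.12])  # illustrative, sums to 1

    def sti_like_index(snr_db_per_band):
        """Weighted sum of per-band transmission indices from apparent SNRs."""
        snr = np.clip(np.asarray(snr_db_per_band, float), -15.0, 15.0)
        transmission_index = (snr + 15.0) / 30.0   # map [-15, +15] dB onto [0, 1]
        return float(np.sum(BAND_WEIGHTS * transmission_index))

    print(sti_like_index([3, 6, 9, 12, 9, 6, 0]))  # a single scalar in [0, 1]
    ```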

  12. NICT/ATR Chinese-Japanese-English Speech-to-Speech Translation System

    Institute of Scientific and Technical Information of China (English)

    Tohru Shimizu; Yutaka Ashikari; Eiichiro Sumita; ZHANG Jinsong; Satoshi Nakamura

    2008-01-01

    This paper describes the latest version of the Chinese-Japanese-English handheld speech-to-speech translation system developed by NICT/ATR, which is now ready to be deployed for travelers. With the entire speech-to-speech translation function implemented in one terminal, it realizes real-time, location-free speech-to-speech translation. A new noise-suppression technique notably improves speech recognition performance. Corpus-based approaches to speech recognition, machine translation, and speech synthesis enable coverage of a wide variety of topics and portability to other languages. Test results show that the character accuracy of speech recognition is 82%-94% for Chinese speech, and the bilingual evaluation understudy (BLEU) score of machine translation is 0.55-0.74 for Chinese-Japanese and Chinese-English.

  13. Binary Masking & Speech Intelligibility

    DEFF Research Database (Denmark)

    Boldt, Jesper

    The purpose of this thesis is to examine how binary masking can be used to increase intelligibility in situations where hearing impaired listeners have difficulties understanding what is being said. The major part of the experiments carried out in this thesis can be categorized as either experiments under ideal conditions or experiments under more realistic conditions useful for real-life applications such as hearing aids. In the experiments under ideal conditions, the previously defined ideal binary mask is evaluated using hearing impaired listeners, and a novel binary mask -- the target binary mask -- is proposed, together with a method for estimating the mask using a directional system and a method for correcting errors in the target binary mask. The last part of the thesis proposes a new method for objective evaluation of speech intelligibility.

  14. Critical Thinking Process in English Speech

    Institute of Scientific and Technical Information of China (English)

    WANG Jia-li

    2016-01-01

    With the development of mass media, English speech has become an important way for international cultural exchange in the context of globalization. Whether it is a political speech, a motivational speech, or an ordinary public speech, the wisdom and charm of critical thinking are always given full play. This study analyzes the cultivation of critical thinking in English speech with the aid of representative examples, which is significant for cultivating college students' critical thinking as well as developing their critical thinking skills in English speech.

  15. Application of auditory brainstem response and pure tone audiometry in early diagnosis of acoustic neuroma

    Institute of Scientific and Technical Information of China (English)

    赵赋; 武丽; 王博; 杨智君; 王振民; 王兴朝; 李朋; 张晶; 刘丕楠

    2015-01-01

    Objective: To investigate the clinical application value of auditory brainstem response and pure tone audiometry for the early diagnosis of acoustic neuroma. Methods: The clinical data and the results of pure tone audiometry, auditory brainstem response, and enhanced MRI in 111 patients with acoustic neuroma were analyzed retrospectively. Linear regression analysis was used to assess the correlation between the mean value of pure tone audiometry and the neuroma volume or the course of disease, and the chi-squared test was used to analyze whether different neuroma volumes differed in the incidence of abnormal auditory brainstem response. Results: Acoustic neuroma caused sensorineural deafness, and the mean value of pure tone audiometry was significantly correlated with the course of disease (P = 0.000). The sensitivity and specificity of auditory brainstem response for the diagnosis of acoustic neuroma were 98.2% and 93.6%, respectively. With the maximum neuroma diameter divided into two groups (> 3 cm and ≤ 3 cm), the differences in the abnormal incidence of the III-V wave intervals on the affected and contralateral sides were both statistically significant (P = 0.038 and 0.045, respectively). Conclusion: Auditory brainstem response combined with pure tone audiometry is an effective method for the early diagnosis of acoustic neuroma.
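
    For reference, the sensitivity and specificity quoted above are simple ratios over a diagnostic confusion table; the counts in this sketch are hypothetical, not the study's data.

    ```python
    def sensitivity(tp: int, fn: int) -> float:
        return tp / (tp + fn)   # proportion of true cases correctly detected

    def specificity(tn: int, fp: int) -> float:
        return tn / (tn + fp)   # proportion of non-cases correctly cleared

    # Hypothetical counts chosen only to illustrate the computation.
    print(sensitivity(tp=109, fn=2), specificity(tn=88, fp=6))
    ```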

  16. The Role of Visual Speech Information in Supporting Perceptual Learning of Degraded Speech

    Science.gov (United States)

    Wayne, Rachel V.; Johnsrude, Ingrid S.

    2012-01-01

    Following cochlear implantation, hearing-impaired listeners must adapt to speech as heard through their prosthesis. Visual speech information (VSI; the lip and facial movements of speech) is typically available in everyday conversation. Here, we investigate whether learning to understand a popular auditory simulation of speech as transduced by a…

  17. Visual Context Enhanced: The Joint Contribution of Iconic Gestures and Visible Speech to Degraded Speech Comprehension

    Science.gov (United States)

    Drijvers, Linda; Ozyurek, Asli

    2017-01-01

    Purpose: This study investigated whether and to what extent iconic co-speech gestures contribute to information from visible speech to enhance degraded speech comprehension at different levels of noise-vocoding. Previous studies of the contributions of these 2 visual articulators to speech comprehension have only been performed separately. Method:…

  18. Connected Speech Processes in Developmental Speech Impairment: Observations from an Electropalatographic Perspective

    Science.gov (United States)

    Howard, Sara

    2004-01-01

    This paper uses a combination of perceptual and electropalatographic (EPG) analysis to explore the presence and characteristics of connected speech processes in the speech output of five older children with developmental speech impairments. Each of the children is shown to use some processes typical of normal speech production but also to use a…

  19. A Danish open-set speech corpus for competing-speech studies

    DEFF Research Database (Denmark)

    Nielsen, Jens Bo; Dau, Torsten; Neher, Tobias

    2014-01-01

    Studies investigating speech-on-speech masking effects commonly use closed-set speech materials such as the coordinate response measure [Bolia et al. (2000). J. Acoust. Soc. Am. 107, 1065-1066]. However, these studies typically result in very low (i.e., negative) speech recognition thresholds (SR...

  20. The treatment of apraxia of speech : Speech and music therapy, an innovative joint effort

    NARCIS (Netherlands)

    Hurkmans, Josephus Johannes Stephanus

    2016-01-01

    Apraxia of Speech (AoS) is a neurogenic speech disorder. A wide variety of behavioural methods have been developed to treat AoS. Various therapy programmes use musical elements to improve speech production. A unique therapy programme combining elements of speech therapy and music therapy is called S

  1. Predicting Speech Intelligibility with a Multiple Speech Subsystems Approach in Children with Cerebral Palsy

    Science.gov (United States)

    Lee, Jimin; Hustad, Katherine C.; Weismer, Gary

    2014-01-01

    Purpose: Speech acoustic characteristics of children with cerebral palsy (CP) were examined with a multiple speech subsystems approach; speech intelligibility was evaluated using a prediction model in which acoustic measures were selected to represent three speech subsystems. Method: Nine acoustic variables reflecting different subsystems, and…

  2. Experimental study on phase perception in speech

    Institute of Scientific and Technical Information of China (English)

    BU Fanliang; CHEN Yanpu

    2003-01-01

    As the human ear is dull to phase in speech, little attention has been paid to phase information in speech coding. In fact, the perceptual quality of speech may be degraded if the phase distortion is very large. The perceptual effect of the STFT (short-time Fourier transform) phase spectrum is studied by auditory subjective hearing tests. Three main conclusions are: (1) If the phase information is neglected completely, the subjective quality of the reconstructed speech may be very poor; (2) whether the neglected phase is in the low frequency band or the high frequency band, the difference from the original speech can be perceived by ear; (3) it is very difficult for the human ear to perceive a difference in speech quality between the original speech and the reconstructed speech when the phase quantization step size is smaller than π/7.

  3. Quick Statistics about Voice, Speech, and Language

    Science.gov (United States)

    Quick Statistics About Voice, Speech, Language ... no 205. Hyattsville, MD: National Center for Health Statistics. 2015. Hoffman HJ, Li C-M, Losonczy K, ...

  4. STUDY ON PHASE PERCEPTION IN SPEECH

    Institute of Scientific and Technical Information of China (English)

    Tong Ming; Bian Zhengzhong; Li Xiaohui; Dai Qijun; Chen Yanpu

    2003-01-01

    The perceptual effect of phase information in speech has been studied by auditory subjective tests. On the condition that the phase spectrum of the speech is changed while the amplitude spectrum is unchanged, the tests show that: (1) If the envelope of the reconstructed speech signal is unchanged, there is no distinctive difference in auditory perception between the original and the reconstructed speech; (2) the auditory perceptual effect of the reconstructed speech mainly depends on the amplitude of the derivative of the additive phase; (3) td is the maximum relative time shift between different frequency components of the reconstructed speech signal. The speech quality is excellent while td < 10 ms; good while 10 ms < td < 20 ms; common while 20 ms < td < 35 ms; and poor while td > 35 ms.
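
    The reported quality bands can be transcribed directly as a lookup on td, the maximum relative time shift in milliseconds between frequency components:

    ```python
    def quality_from_td(td_ms: float) -> str:
        """Map the maximum relative time shift td (ms) to the reported quality band."""
        if td_ms < 10:
            return "excellent"
        if td_ms < 20:
            return "good"
        if td_ms < 35:
            return "common"
        return "poor"

    assert quality_from_td(8) == "excellent" and quality_from_td(40) == "poor"
    ```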

  5. What Is Language? What Is Speech?

    Science.gov (United States)

    Speech is the verbal means of communicating. Speech consists of the following: ...

  6. High Performance Speech Compression System

    Institute of Scientific and Technical Information of China (English)

    2001-01-01

    Since pulse code modulation emerged in 1937, digitized speech has experienced rapid development due to its outstanding voice quality, reliability, robustness and security in communication. But how to reduce channel width without loss of speech quality remains a crucial problem in speech coding theory. A new full-duplex digital speech communication system based on the AMBE-1000 vocoder and the ATMEL 89C51 microcontroller is introduced. It shows higher voice quality than the current mobile phone system with only a quarter of the channel width needed for the latter. The prospective areas in which the system can be applied include satellite communication, IP phone, virtual meetings and, most importantly, the defence industry.

  7. Spatial localization of speech segments

    DEFF Research Database (Denmark)

    Karlsen, Brian Lykkegaard

    1999-01-01

    The psychoacoustical experiment used naturally-spoken Danish consonant-vowel combinations as targets presented in diffuse speech-shaped noise at a peak SNR of -10 dB. The subjects were normal-hearing persons. The experiment took place in an anechoic chamber where eight loudspeakers were suspended so that they surrounded the subjects in the horizontal plane. The subjects were required to push a button on a pad indicating where they had localized the target in the horizontal plane; the response pad had twelve buttons arranged uniformly in a circle and two further buttons. A model predicts which angle the target is likely to have originated from; the model is trained on the experimental data. On the basis of the experimental results, it is concluded that the human ability to localize speech segments in adverse noise depends on the speech segment as well as its point of origin in space...

  8. MUSAN: A Music, Speech, and Noise Corpus

    OpenAIRE

    Snyder, David; Chen, Guoguo; Povey, Daniel

    2015-01-01

    This report introduces a new corpus of music, speech, and noise. This dataset is suitable for training models for voice activity detection (VAD) and music/speech discrimination. Our corpus is released under a flexible Creative Commons license. The dataset consists of music from several genres, speech from twelve languages, and a wide assortment of technical and non-technical noises. We demonstrate use of this corpus for music/speech discrimination on Broadcast News and VAD for speaker identification...

  9. Current trends in multilingual speech processing

    Indian Academy of Sciences (India)

    Hervé Bourlard; John Dines; Mathew Magimai-Doss; Philip N Garner; David Imseng; Petr Motlicek; Hui Liang; Lakshmi Saheer; Fabio Valente

    2011-10-01

    In this paper, we describe recent work at Idiap Research Institute in the domain of multilingual speech processing and provide some insights into emerging challenges for the research community. Multilingual speech processing has been a topic of ongoing interest to the research community for many years and the field is now receiving renewed interest owing to two strong driving forces. Firstly, technical advances in speech recognition and synthesis are posing new challenges and opportunities to researchers. For example, discriminative features are seeing wide application by the speech recognition community, but additional issues arise when using such features in a multilingual setting. Another example is the apparent convergence of speech recognition and speech synthesis technologies in the form of statistical parametric methodologies. This convergence enables the investigation of new approaches to unified modelling for automatic speech recognition and text-to-speech synthesis (TTS) as well as cross-lingual speaker adaptation for TTS. The second driving force is the impetus being provided by both government and industry for technologies to help break down domestic and international language barriers, these also being barriers to the expansion of policy and commerce. Speech-to-speech and speech-to-text translation are thus emerging as key technologies at the heart of which lies multilingual speech processing.

  10. Freedom of Speech as an Academic Discipline.

    Science.gov (United States)

    Haiman, Franklyn S.

    Since its formation, the Speech Communication Association's Committee on Freedom of Speech has played a critical leadership role in course offerings, research efforts, and regional activities in freedom of speech. Areas in which research has been done and in which further research should be carried out include: historical-critical research, in…

  11. Development of binaural speech transmission index

    NARCIS (Netherlands)

    Wijngaarden, S.J. van; Drullman, R.

    2006-01-01

    Although the speech transmission index (STI) is a well-accepted and standardized method for objective prediction of speech intelligibility in a wide range of environments and applications, it is essentially a monaural model. Advantages of binaural hearing to the intelligibility of speech are disregarded...

  12. Recovering Asynchronous Watermark Tones from Speech

    Science.gov (United States)

    2009-04-01

    "Audio steganography for covert data transmission by imperceptible tone insertion," Proceedings Communications Systems and Applications, IEEE, vol. 4, pp. 1647-1653, 2004. ... by a comfortable margin. Index terms: speech watermarking, hidden tones, speech steganography, speech data hiding.

  13. Audiovisual Asynchrony Detection in Human Speech

    Science.gov (United States)

    Maier, Joost X.; Di Luca, Massimiliano; Noppeney, Uta

    2011-01-01

    Combining information from the visual and auditory senses can greatly enhance intelligibility of natural speech. Integration of audiovisual speech signals is robust even when temporal offsets are present between the component signals. In the present study, we characterized the temporal integration window for speech and nonspeech stimuli with…

  14. Epoch-based analysis of speech signals

    Indian Academy of Sciences (India)

    B Yegnanarayana; Suryakanth V Gangashetty

    2011-10-01

    Speech analysis is traditionally performed using short-time analysis to extract features in time and frequency domains. The window size for the analysis is fixed somewhat arbitrarily, mainly to account for the time varying vocal tract system during production. However, speech in its primary mode of excitation is produced due to impulse-like excitation in each glottal cycle. Anchoring the speech analysis around the glottal closure instants (epochs) yields significant benefits for speech analysis. Epoch-based analysis of speech helps not only to segment the speech signals based on speech production characteristics, but also helps in accurate analysis of speech. It enables extraction of important acoustic-phonetic features such as glottal vibrations, formants, instantaneous fundamental frequency, etc. Epoch sequence is useful to manipulate prosody in speech synthesis applications. Accurate estimation of epochs helps in characterizing voice quality features. Epoch extraction also helps in speech enhancement and multispeaker separation. In this tutorial article, the importance of epochs for speech analysis is discussed, and methods to extract the epoch information are reviewed. Applications of epoch extraction for some speech applications are demonstrated.
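
    One published epoch-extraction method in this line of work is zero-frequency filtering: integrate the differenced signal through zero-frequency resonators, remove the slowly varying trend, and take positive-going zero crossings as epochs. The sketch below follows that recipe under assumed parameter choices (a 10 ms trend window, three trend-removal passes); it is an illustration, not a reference implementation.

    ```python
    import numpy as np

    def zff_epochs(x, fs, trend_win_ms=10.0, passes=3):
        """Glottal closure instants (epochs) via zero-frequency filtering (sketch)."""
        d = np.diff(np.asarray(x, float), prepend=0.0)   # difference to remove DC bias
        y = d
        for _ in range(4):            # two cascaded 2nd-order 0-Hz resonators
            y = np.cumsum(y)          # (pure integration; suits short segments)
        w = max(3, int(fs * trend_win_ms / 1000) | 1)    # odd window, ~1-2 pitch periods
        kernel = np.ones(w) / w
        for _ in range(passes):       # subtract local mean to remove the trend
            y = y - np.convolve(y, kernel, mode="same")
        return np.flatnonzero((y[:-1] < 0) & (y[1:] >= 0))  # positive zero crossings
    ```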

  15. Application of wavelets in speech processing

    CERN Document Server

    Farouk, Mohamed Hesham

    2014-01-01

    This book provides a survey of the widespread use of wavelet analysis in different applications of speech processing. The author examines development and research in different applications of speech processing. The book also summarizes the state-of-the-art research on wavelets in speech processing.

  16. Cognitive Functions in Childhood Apraxia of Speech

    Science.gov (United States)

    Nijland, Lian; Terband, Hayo; Maassen, Ben

    2015-01-01

    Purpose: Childhood apraxia of speech (CAS) is diagnosed on the basis of specific speech characteristics, in the absence of problems in hearing, intelligence, and language comprehension. This does not preclude the possibility that children with this speech disorder might demonstrate additional problems. Method: Cognitive functions were investigated…

  17. Cognitive functions in Childhood Apraxia of Speech

    NARCIS (Netherlands)

    Nijland, L.; Terband, H.; Maassen, B.

    2015-01-01

    Purpose: Childhood Apraxia of Speech (CAS) is diagnosed on the basis of specific speech characteristics, in the absence of problems in hearing, intelligence, and language comprehension. This does not preclude the possibility that children with this speech disorder might demonstrate additional problems...

  18. Speech-Song Interface of Chinese Speakers

    Science.gov (United States)

    Mang, Esther

    2007-01-01

    Pitch is a psychoacoustic construct crucial in the production and perception of speech and songs. This article is an exploration of the interface of speech and song performance of Chinese speakers. Although parallels might be drawn from the prosodic and sound structures of the linguistic and musical systems, perceiving and producing speech and…

  19. Factors of Politeness and Indirect Speech Acts

    Institute of Scientific and Technical Information of China (English)

    杨雪梅

    2016-01-01

    The politeness principle is influenced deeply by a nation's history, culture, customs and so on; therefore, different countries have different understandings and expressions of politeness and indirect speech acts. This paper shows some main factors influencing polite speech. Through this article, readers can gain a comprehensive understanding of politeness and indirect speech acts.

  20. Speech and Debate as Civic Education

    Science.gov (United States)

    Hogan, J. Michael; Kurr, Jeffrey A.; Johnson, Jeremy D.; Bergmaier, Michael J.

    2016-01-01

    In light of the U.S. Senate's designation of March 15, 2016 as "National Speech and Debate Education Day" (S. Res. 398, 2016), it only seems fitting that "Communication Education" devote a special section to the role of speech and debate in civic education. Speech and debate have been at the heart of the communication…

  1. Audiovisual Speech Integration and Lipreading in Autism

    Science.gov (United States)

    Smith, Elizabeth G.; Bennetto, Loisa

    2007-01-01

    Background: During speech perception, the ability to integrate auditory and visual information causes speech to sound louder and be more intelligible, and leads to quicker processing. This integration is important in early language development, and also continues to affect speech comprehension throughout the lifespan. Previous research shows that…

  2. Liberalism, Speech Codes, and Related Problems.

    Science.gov (United States)

    Sunstein, Cass R.

    1993-01-01

    It is argued that universities are pervasively and necessarily engaged in regulation of speech, which complicates many existing claims about hate speech codes on campus. The ultimate test is whether the restriction on speech is a legitimate part of the institution's mission, a commitment to liberal education. (MSE)

  3. Gesture & Speech Based Appliance Control

    Directory of Open Access Journals (Sweden)

    Dr. Saylee Gharge

    2014-01-01

    This document explores the use of speech and gestures to control home appliances, aiming at the aging population of the world and relieving them of their dependencies. The two approaches used are the MFCC approach for speech processing and the Identification of Characteristic Point Algorithm for gesture recognition. A barrier preventing wide adoption is that this audience can find controlling assistive technology difficult, as they are less dexterous and computer literate. Our results hope to provide a more natural and intuitive interface to help bridge the gap between technology and elderly users.

  4. Discriminative learning for speech recognition

    CERN Document Server

    He, Xiaodong

    2008-01-01

    In this book, we introduce the background and mainstream methods of probabilistic modeling and discriminative parameter optimization for speech recognition. The specific models treated in depth include the widely used exponential-family distributions and the hidden Markov model. A detailed study is presented on unifying the common objective functions for discriminative learning in speech recognition, namely maximum mutual information (MMI), minimum classification error, and minimum phone/word error. The unification is presented, with rigorous mathematical analysis, in a common rational-function...
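
    As a concrete instance of the first objective named above, the MMI criterion over training utterances X_r with reference transcriptions W_r is commonly written as follows (the standard formulation, not a quotation from the book):

    ```latex
    \[
      F_{\mathrm{MMI}}(\lambda)
        = \sum_{r} \log
          \frac{p_{\lambda}(X_r \mid W_r)\, P(W_r)}
               {\sum_{W} p_{\lambda}(X_r \mid W)\, P(W)}
    \]
    ```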

  5. Reproducible Research in Speech Sciences

    Directory of Open Access Journals (Sweden)

    Kálmán Abari

    2012-11-01

    Reproducible research is the minimum standard of scientific claims in cases when independent replication proves to be difficult. With a special combination of available software tools, we provide a reproducibility recipe for experimental research conducted in some fields of speech sciences. We have based our model on the triad of the R environment, the EMU-format speech database, and the executable publication. We present the use of three typesetting systems (LaTeX, Markdown, Org) with the help of a mini research example.

  6. Speech Communication and Telephone Networks

    Science.gov (United States)

    Gierlich, H. W.

    Speech communication over telephone networks has one major constraint: The communication has to be “real time”. The basic principle since the beginning of all telephone networks has been to provide a communication system capable of substituting the air path between two persons having a conversation at 1-m distance. This is the so-called orthotelephonic reference position [7]. Although many technical compromises must be made to enable worldwide communication over telephone networks, it is still the goal to achieve speech quality performance which is close to this reference.

  7. Annotating Speech Corpus for Prosody Modeling in Indian Language Text to Speech Systems

    Directory of Open Access Journals (Sweden)

    Kiruthiga S

    2012-01-01

    A spoken language system, whether a speech synthesis or a speech recognition system, starts with building a speech corpus. We give a detailed survey of issues and a methodology for selecting the appropriate speech unit when building a speech corpus for Indian language text-to-speech systems. The paper ultimately aims to improve the intelligibility of the synthesized speech in text-to-speech synthesis systems. To begin with, an appropriate text file should be selected for building the speech corpus. Then a corresponding speech file is generated and stored; this speech file is the phonetic representation of the selected text file. The speech file is processed at different levels, viz., paragraphs, sentences, phrases, words, syllables and phones, which are called the speech units of the file. Research has been done taking these units as the basic unit for processing. This paper analyses the research done using phones, diphones, triphones, syllables and polysyllables as the basic unit for speech synthesis, and also provides a recommended set of combinations for polysyllables. Concatenative speech synthesis involves the concatenation of these basic units to synthesize intelligible, natural-sounding speech. The speech units are annotated with relevant prosodic information about each unit, manually or automatically, based on an algorithm. The database consisting of the units along with their annotated information is called the annotated speech corpus. A clustering technique is used on the annotated speech corpus that provides a way to select the appropriate unit for concatenation, based on the lowest total join cost of the speech unit.
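
    The "lowest total join cost" selection can be made concrete as a dynamic-programming search over candidate units: each slot contributes a target cost and each transition a join cost, and the cheapest path through the candidate lattice is selected. The sketch below is generic; the actual cost functions (spectral, prosodic) are placeholders to be supplied.

    ```python
    import numpy as np

    def select_units(candidates, target_cost, join_cost):
        """Viterbi-style unit selection: candidates[i] lists the units for slot i."""
        best = [np.array([target_cost(u, 0) for u in candidates[0]])]
        back = []
        for i in range(1, len(candidates)):
            prev, cur = candidates[i - 1], candidates[i]
            cost = np.empty(len(cur))
            ptr = np.empty(len(cur), dtype=int)
            for j, u in enumerate(cur):
                totals = [best[-1][k] + join_cost(p, u) for k, p in enumerate(prev)]
                ptr[j] = int(np.argmin(totals))
                cost[j] = totals[ptr[j]] + target_cost(u, i)
            best.append(cost)
            back.append(ptr)
        path = [int(np.argmin(best[-1]))]
        for ptr in reversed(back):       # trace the cheapest path backwards
            path.append(int(ptr[path[-1]]))
        return path[::-1]                # candidate index chosen at each slot
    ```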

  8. High-frequency audiometry: study with normal audiological subjects

    Directory of Open Access Journals (Sweden)

    Daniela R. Sahyeb

    2003-01-01

    Recent research points to high-frequency audiometry (HFA) as a tool for the early diagnosis of hearing damage caused by the main etiological agents, such as aging and exposure to ototoxic drugs and to high-intensity noise. AIM: Although several techniques have already been developed for this assessment, some do not apply to clinical routine because of their lack of practicality and, at times, lack of consistency in their results. According to the literature, an adequate methodology for such an assessment and reference values for normality have yet to emerge. STUDY DESIGN: Prospective clinical study. MATERIAL AND METHOD: The present study observed the behaviour of high-frequency hearing thresholds in young, audiologically normal individuals and analysed acoustic, inter-individual and intra-individual variability which, according to the literature, may interfere with the stability of the results. CONCLUSION: With the data obtained, values for the mean, standard deviation and median, as well as minimum and maximum values, could be established for each frequency. The statistical tests identified no significant differences in most of the analyses performed (between sexes, between ears, acoustic variability, and within individuals on the same test day). The variability of results between examinations of the same individual performed on different test days proved significant, with mean thresholds on the second day always better than those of the first day.

  9. Relationship between Speech Intelligibility and Speech Comprehension in Babble Noise

    Science.gov (United States)

    Fontan, Lionel; Tardieu, Julien; Gaillard, Pascal; Woisard, Virginie; Ruiz, Robert

    2015-01-01

    Purpose: The authors investigated the relationship between the intelligibility and comprehension of speech presented in babble noise. Method: Forty participants listened to French imperative sentences (commands for moving objects) in a multitalker babble background for which intensity was experimentally controlled. Participants were instructed to…

  10. Speech intelligibility of native and non-native speech

    NARCIS (Netherlands)

    Wijngaarden, S.J. van

    1999-01-01

    The intelligibility of speech is known to be lower if the talker is non-native instead of native for the given language. This study is aimed at quantifying the overall degradation due to acoustic-phonetic limitations of non-native talkers of Dutch, specifically of Dutch-speaking Americans who have l

  11. Issues in acoustic modeling of speech for automatic speech recognition

    OpenAIRE

    Gong, Yifan; Haton, Jean-Paul; Mari, Jean-François

    1994-01-01

    Projet RFIA; Stochastic modeling is a flexible method for handling the large variability in speech for recognition applications. In contrast to dynamic time warping where heuristic training methods for estimating word templates are used, stochastic modeling allows a probabilistic and automatic training for estimating models. This paper deals with the improvement of stochastic techniques, especially for a better representation of time varying phenomena.

  12. Speech perception in children with speech output disorders.

    NARCIS (Netherlands)

    Nijland, L.

    2009-01-01

    Research in the field of speech production pathology is dominated by describing deficits in output. However, perceptual problems might underlie, precede, or interact with production disorders. The present study hypothesizes that the level of the production disorder is linked to the level of perception...

  13. Speech entrainment enables patients with Broca's aphasia to produce fluent speech.

    Science.gov (United States)

    Fridriksson, Julius; Hubbard, H Isabel; Hudspeth, Sarah Grace; Holland, Audrey L; Bonilha, Leonardo; Fromm, Davida; Rorden, Chris

    2012-12-01

    A distinguishing feature of Broca's aphasia is non-fluent halting speech typically involving one to three words per utterance. Yet, despite such profound impairments, some patients can mimic audio-visual speech stimuli enabling them to produce fluent speech in real time. We call this effect 'speech entrainment' and reveal its neural mechanism as well as explore its usefulness as a treatment for speech production in Broca's aphasia. In Experiment 1, 13 patients with Broca's aphasia were tested in three conditions: (i) speech entrainment with audio-visual feedback where they attempted to mimic a speaker whose mouth was seen on an iPod screen; (ii) speech entrainment with audio-only feedback where patients mimicked heard speech; and (iii) spontaneous speech where patients spoke freely about assigned topics. The patients produced a greater variety of words using audio-visual feedback compared with audio-only feedback and spontaneous speech. No difference was found between audio-only feedback and spontaneous speech. In Experiment 2, 10 of the 13 patients included in Experiment 1 and 20 control subjects underwent functional magnetic resonance imaging to determine the neural mechanism that supports speech entrainment. Group results with patients and controls revealed greater bilateral cortical activation for speech produced during speech entrainment compared with spontaneous speech at the junction of the anterior insula and Brodmann area 47, in Brodmann area 37, and unilaterally in the left middle temporal gyrus and the dorsal portion of Broca's area. Probabilistic white matter tracts constructed for these regions in the normal subjects revealed a structural network connected via the corpus callosum and ventral fibres through the extreme capsule. Unilateral areas were connected via the arcuate fasciculus. In Experiment 3, all patients included in Experiment 1 participated in a 6-week treatment phase using speech entrainment to improve speech production. Behavioural and

  14. Pattern recognition in speech and language processing

    CERN Document Server

    Chou, Wu

    2003-01-01

    Minimum Classification Error (MCE) Approach in Pattern Recognition, Wu Chou; Minimum Bayes-Risk Methods in Automatic Speech Recognition, Vaibhava Goel and William Byrne; A Decision Theoretic Formulation for Adaptive and Robust Automatic Speech Recognition, Qiang Huo; Speech Pattern Recognition Using Neural Networks, Shigeru Katagiri; Large Vocabulary Speech Recognition Based on Statistical Methods, Jean-Luc Gauvain; Toward Spontaneous Speech Recognition and Understanding, Sadaoki Furui; Speaker Authentication, Qi Li and Biing-Hwang Juang; HMMs for Language Processing Problems, Ri...

  15. Speech perception of noise with binary gains

    DEFF Research Database (Denmark)

    Wang, DeLiang; Kjems, Ulrik; Pedersen, Michael Syskind;

    2008-01-01

    For a given mixture of speech and noise, an ideal binary time-frequency mask is constructed by comparing speech energy and noise energy within local time-frequency units. It is observed that listeners achieve nearly perfect speech recognition from gated noise with binary gains prescribed by the ideal binary mask. Only 16 filter channels and a frame rate of 100 Hz are sufficient for high intelligibility. The results show that, despite a dramatic reduction of speech information, a pattern of binary gains provides an adequate basis for speech perception.
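
    A minimal sketch of the mask construction and noise gating described above, using an STFT in place of the paper's 16-channel filterbank (an assumption for brevity; the criterion of comparing local speech and noise energy is the same):

    ```python
    import numpy as np
    from scipy.signal import stft, istft

    def ideal_binary_mask(speech, noise, fs, nperseg=512):
        """1 where local speech energy exceeds noise energy (0 dB criterion)."""
        _, _, S = stft(speech, fs, nperseg=nperseg)
        _, _, N = stft(noise, fs, nperseg=nperseg)
        return (np.abs(S) ** 2 > np.abs(N) ** 2).astype(float)

    def gated_noise(noise, mask, fs, nperseg=512):
        """Apply the binary gains to the noise alone, as in the listening tests."""
        _, _, N = stft(noise, fs, nperseg=nperseg)
        _, y = istft(N * mask, fs, nperseg=nperseg)
        return y
    ```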

  16. Speech in Mobile and Pervasive Environments

    CERN Document Server

    Rajput, Nitendra

    2012-01-01

    This book brings together the latest research in one comprehensive volume that deals with issues related to speech processing on resource-constrained, wireless, and mobile devices, such as speech recognition in noisy environments, specialized hardware for speech recognition and synthesis, the use of context to enhance recognition, the emerging and new standards required for interoperability, speech applications on mobile devices, distributed processing between the client and the server, and the relevance of Speech in Mobile and Pervasive Environments for developing regions--an area of explosive...

  17. Perceived Speech Quality Estimation Using DTW Algorithm

    Directory of Open Access Journals (Sweden)

    S. Arsenovski

    2009-06-01

    In this paper a method for speech quality estimation is evaluated by simulating the transfer of speech over packet-switched and mobile networks. The proposed system uses the dynamic time warping algorithm to compare the test and received speech. Several tests have been made on a test speech sample of a single speaker, with simulated packet (frame) loss effects on the perceived speech. The achieved results have been compared with measured PESQ values on the transmission channel used, and their correlation has been observed.
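
    For reference, the core of the comparison is the classic DTW recursion over per-frame features of the test and received signals; this is a plain O(m*n) sketch with a Euclidean local distance, not the paper's exact configuration:

    ```python
    import numpy as np

    def dtw_distance(a, b):
        """Dynamic time warping distance between two feature sequences."""
        a = np.asarray(a, float).reshape(len(a), -1)
        b = np.asarray(b, float).reshape(len(b), -1)
        D = np.full((len(a) + 1, len(b) + 1), np.inf)
        D[0, 0] = 0.0
        for i in range(1, len(a) + 1):
            for j in range(1, len(b) + 1):
                d = np.linalg.norm(a[i - 1] - b[j - 1])   # local frame distance
                D[i, j] = d + min(D[i - 1, j], D[i, j - 1], D[i - 1, j - 1])
        return D[len(a), len(b)]

    print(dtw_distance([0, 1, 2, 3, 2, 0], [0, 1, 1, 2, 3, 2, 1, 0]))
    ```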

  18. Text To Speech System for Telugu Language

    OpenAIRE

    Siva kumar, M; E. Prakash Babu

    2014-01-01

    Telugu is one of the oldest languages in India. This paper describes the development of a Telugu text-to-speech (TTS) system, in which the input is Telugu text in Unicode. The voices are sampled from real recorded speech. The objective of a text-to-speech system is to convert an arbitrary text into its corresponding spoken waveform. Speech synthesis is a process of building machinery that can generate human-like speech from any text input to imitate human speakers. Text proc...

  19. Recent Advances in Robust Speech Recognition Technology

    CERN Document Server

    Ramírez, Javier

    2011-01-01

    This E-book is a collection of articles that describe advances in speech recognition technology. Robustness in speech recognition refers to the need to maintain high speech recognition accuracy even when the quality of the input speech is degraded, or when the acoustical, articulatory, or phonetic characteristics of speech in the training and testing environments differ. Obstacles to robust recognition include acoustical degradations produced by additive noise, the effects of linear filtering, nonlinearities in transduction or transmission, as well as impulsive interfering sources, and diminished...

  20. [Nature of speech disorders in Parkinson disease].

    Science.gov (United States)

    Pawlukowska, W; Honczarenko, K; Gołąb-Janowska, M

    2013-01-01

    The aim of the study was to discuss the physiology and pathology of speech and to review the literature on speech disorders in Parkinson disease. Additionally, the most effective methods for diagnosing speech disorders in Parkinson disease were stressed. Afterward, the articulatory, respiratory, acoustic and pragmatic factors contributing to the exacerbation of speech disorders were discussed. Furthermore, the study dealt with the most important types of speech treatment techniques available (pharmacological and behavioral), and the significance of Lee Silverman Voice Treatment was highlighted.

  1. Fast Monaural Separation of Speech

    DEFF Research Database (Denmark)

    Pontoppidan, Niels Henrik; Dyrholm, Mads

    2003-01-01

    ...a Factorial Hidden Markov Model, with non-stationary assumptions on the source autocorrelations modelled through the Factorial Hidden Markov Model, leads to separation in the monaural case. By extending Hansen's work, we find that Roweis' assumptions are necessary for monaural speech separation. Furthermore, we...

  2. Aerosol Emission during Human Speech

    Science.gov (United States)

    Asadi, Sima; Ristenpart, William

    2016-11-01

    The traditional emphasis for airborne disease transmission has been on coughing and sneezing, which are dramatic expiratory events that yield easily visible droplets. Recent research suggests that normal speech can release even larger quantities of aerosols that are too small to see with the naked eye, but are nonetheless large enough to carry a variety of pathogens (e.g., influenza A). This observation raises an important question: what types of speech emit the most aerosols? Here we show that the concentration of aerosols emitted during healthy human speech is positively correlated with both the amplitude (loudness) and fundamental frequency (pitch) of the vocalization. Experimental measurements with an aerodynamic particle sizer (APS) indicate that speaking in a loud voice (95 decibels) yields up to fifty times more aerosols than in a quiet voice (75 decibels), and that sounds associated with certain phonemes (e.g., [a] or [o]) release more aerosols than others. We interpret these results in terms of the egressive airflow rate associated with each phoneme and the corresponding fundamental frequency, which is known to vary significantly with gender and age. The results suggest that individual speech patterns could affect the probability of airborne disease transmission.

  3. Affecting Critical Thinking through Speech.

    Science.gov (United States)

    O'Keefe, Virginia P.

    Intended for teachers, this booklet shows how spoken language can affect student thinking and presents strategies for teaching critical thinking skills. The first section discusses the theoretical and research bases for promoting critical thinking through speech, defines critical thinking, explores critical thinking as abstract thinking, and tells…

  4. Acoustic Analysis of PD Speech

    Directory of Open Access Journals (Sweden)

    Karen Chenausky

    2011-01-01

    According to the U.S. National Institutes of Health, approximately 500,000 Americans have Parkinson's disease (PD), with roughly another 50,000 receiving new diagnoses each year. 70%-90% of these people also have the hypokinetic dysarthria associated with PD. Deep brain stimulation (DBS) substantially relieves motor symptoms in advanced-stage patients for whom medication produces disabling dyskinesias. This study investigated speech changes as a result of DBS settings chosen to maximize motor performance. The speech of 10 PD patients and 12 normal controls was analyzed for syllable rate and variability, syllable length patterning, vowel fraction, voice-onset time variability, and spirantization. These were normalized by the controls' standard deviation to represent distance from normal and combined into a composite measure. Results show that DBS settings relieving motor symptoms can improve speech, making it up to three standard deviations closer to normal. However, the clinically motivated settings evaluated here show greater capacity to impair, rather than improve, speech. A feedback device developed from these findings could be useful to clinicians adjusting DBS parameters, as a means of ensuring they do not unwittingly choose DBS settings which impair patients' communication.
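
    The composite measure lends itself to a one-line formulation: express each speech measure as a z-score against the control distribution and average the absolute values, giving a "distance from normal" in control standard deviations. The feature values below are hypothetical, purely to show the arithmetic.

    ```python
    import numpy as np

    def composite_distance(patient, control_mean, control_sd):
        """Mean absolute z-score across speech measures (in control SDs)."""
        z = (np.asarray(patient, float) - np.asarray(control_mean, float)) \
            / np.asarray(control_sd, float)
        return float(np.mean(np.abs(z)))

    # Hypothetical values: [syllable rate, rate variability, vowel fraction, VOT variability]
    print(composite_distance([4.1, 0.9, 0.52, 0.031],
                             [5.6, 0.4, 0.45, 0.012],
                             [0.5, 0.2, 0.05, 0.006]))
    ```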

  5. Gaucho Gazette: Speech and Sensationalism

    Directory of Open Access Journals (Sweden)

    Roberto José Ramos

    2013-07-01

    The Gaucho Gazette presents itself as a "popular newspaper." It attempts to deny its tabloid aesthetic, claiming only to disclose what happens, as if the media were merely a reflection of society. This paper seeks to understand and explain its sensationalism through its discourses, using the semiology of Roland Barthes and its transdisciplinary possibilities.

  6. Speech recognition with amplitude and frequency modulations

    Science.gov (United States)

    Zeng, Fan-Gang; Nie, Kaibao; Stickney, Ginger S.; Kong, Ying-Yee; Vongphoe, Michael; Bhargave, Ashish; Wei, Chaogang; Cao, Keli

    2005-02-01

    Amplitude modulation (AM) and frequency modulation (FM) are commonly used in communication, but their relative contributions to speech recognition have not been fully explored. To bridge this gap, we derived slowly varying AM and FM from speech sounds and conducted listening tests using stimuli with different modulations in normal-hearing and cochlear-implant subjects. We found that although AM from a limited number of spectral bands may be sufficient for speech recognition in quiet, FM significantly enhances speech recognition in noise, as well as speaker and tone recognition. Additional speech reception threshold measures revealed that FM is particularly critical for speech recognition with a competing voice and is independent of spectral resolution and similarity. These results suggest that AM and FM provide independent yet complementary contributions to support robust speech recognition under realistic listening situations. Encoding FM may improve auditory scene analysis, cochlear-implant, and audiocoding performance. Keywords: auditory analysis, cochlear implant, neural code, phase, scene analysis.
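
    One standard way to derive slowly varying AM and FM from a subband signal is via the analytic signal: the Hilbert envelope gives the AM, and the derivative of the unwrapped phase gives the instantaneous frequency (FM). This is a sketch of that generic approach, with an assumed smoothing cutoff; the paper's exact extraction pipeline may differ.

    ```python
    import numpy as np
    from scipy.signal import hilbert, butter, sosfiltfilt

    def am_fm(subband, fs, smooth_hz=50.0):
        """Slowly varying envelope (AM) and instantaneous frequency (FM)."""
        analytic = hilbert(subband)
        am = np.abs(analytic)                                   # Hilbert envelope
        fm = np.diff(np.unwrap(np.angle(analytic))) * fs / (2 * np.pi)
        fm = np.append(fm, fm[-1])                              # restore length
        sos = butter(4, smooth_hz, btype="low", fs=fs, output="sos")
        return sosfiltfilt(sos, am), sosfiltfilt(sos, fm)
    ```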

  7. Speech motor learning in profoundly deaf adults.

    Science.gov (United States)

    Nasir, Sazzad M; Ostry, David J

    2008-10-01

    Speech production, like other sensorimotor behaviors, relies on multiple sensory inputs--audition, proprioceptive inputs from muscle spindles and cutaneous inputs from mechanoreceptors in the skin and soft tissues of the vocal tract. However, the capacity for intelligible speech by deaf speakers suggests that somatosensory input alone may contribute to speech motor control and perhaps even to speech learning. We assessed speech motor learning in cochlear implant recipients who were tested with their implants turned off. A robotic device was used to alter somatosensory feedback by displacing the jaw during speech. We found that implant subjects progressively adapted to the mechanical perturbation with training. Moreover, the corrections that we observed were for movement deviations that were exceedingly small, on the order of millimeters, indicating that speakers have precise somatosensory expectations. Speech motor learning is substantially dependent on somatosensory input.

  8. Speech Enhancement with Natural Sounding Residual Noise Based on Connected Time-Frequency Speech Presence Regions

    Directory of Open Access Journals (Sweden)

    Sørensen Karsten Vandborg

    2005-01-01

    We propose time-frequency domain methods for noise estimation and speech enhancement. A speech presence detection method is used to find connected time-frequency regions of speech presence. These regions are used by a noise estimation method, and both the speech presence decisions and the noise estimate are used in the speech enhancement method. Different attenuation rules are applied to regions with and without speech presence to achieve enhanced speech with natural-sounding attenuated background noise. The proposed speech enhancement method has a computational complexity which makes it feasible for application in hearing aids. An informal listening test shows that the proposed speech enhancement method has significantly higher mean opinion scores than minimum mean-square error log-spectral amplitude (MMSE-LSA) and decision-directed MMSE-LSA.
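
    A minimal sketch of the two-rule idea: unity gain where speech is flagged present, a fixed attenuation elsewhere, so the residual background keeps its natural character. The presence mask is a placeholder input (a boolean array matching the STFT grid), and the attenuation depth is an assumption.

    ```python
    import numpy as np
    from scipy.signal import stft, istft

    def enhance(x, presence_mask, fs, noise_gain_db=-12.0, nperseg=256):
        """Apply different gains to speech-present and speech-absent T-F regions."""
        _, _, X = stft(x, fs, nperseg=nperseg)
        gain = np.where(presence_mask, 1.0, 10 ** (noise_gain_db / 20.0))
        _, y = istft(X * gain, fs, nperseg=nperseg)
        return y
    ```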

  9. Speech Evaluation with Special Focus on Children Suffering from Apraxia of Speech

    Directory of Open Access Journals (Sweden)

    Manasi Dixit

    2013-07-01

    Speech disorders are very complicated in individuals suffering from apraxia of speech (AOS). In this paper, the pathological cases of speech-disabled children affected with AOS are analyzed. The speech signal samples of children of age between three and eight years are considered for the present study. These speech signals are digitized and analysed using the Speech Pause Index, jitter, skew and kurtosis. This analysis is conducted on speech data samples which are concerned with both place of articulation and manner of articulation. The speech disability of the pathological subjects was estimated using the results of the above analysis.

  10. Levels of Processing of Speech and Non-Speech

    Science.gov (United States)

    1991-05-10

    Timbre: A better musical analogy to speech? Presented to the Acoustical Society of America, Anaheim. A. Samuel. (Fall 1987) Central and peripheral... The studies of listener-based factors include studies of perceptual restoration of deleted sounds (phonemes or musical notes), and studies of ... music. The attentional investigations demonstrate rather fine-tuned attentional control under high-predictability conditions. Significant progress has been...

  11. Sensorimotor influences on speech perception in infancy.

    Science.gov (United States)

    Bruderer, Alison G; Danielson, D Kyle; Kandhadai, Padmapriya; Werker, Janet F

    2015-11-01

    The influence of speech production on speech perception is well established in adults. However, because adults have a long history of both perceiving and producing speech, the extent to which the perception-production linkage is due to experience is unknown. We addressed this issue by asking whether articulatory configurations can influence infants' speech perception performance. To eliminate influences from specific linguistic experience, we studied preverbal, 6-mo-old infants and tested the discrimination of a nonnative, and hence never-before-experienced, speech sound distinction. In three experimental studies, we used teething toys to control the position and movement of the tongue tip while the infants listened to the speech sounds. Using ultrasound imaging technology, we verified that the teething toys consistently and effectively constrained the movement and positioning of infants' tongues. With a looking-time procedure, we found that temporarily restraining infants' articulators impeded their discrimination of a nonnative consonant contrast but only when the relevant articulator was selectively restrained to prevent the movements associated with producing those sounds. Our results provide striking evidence that even before infants speak their first words and without specific listening experience, sensorimotor information from the articulators influences speech perception. These results transform theories of speech perception by suggesting that even at the initial stages of development, oral-motor movements influence speech sound discrimination. Moreover, an experimentally induced "impairment" in articulator movement can compromise speech perception performance, raising the question of whether long-term oral-motor impairments may impact perceptual development.

  12. Design and realisation of an audiovisual speech activity detector

    NARCIS (Netherlands)

    Van Bree, K.C.

    2006-01-01

    For many speech telecommunication technologies a robust speech activity detector is important. An audio-only speech detector will give false positives when the interfering signal is speech or has speech characteristics. The video modality is suitable for solving this problem. In this report the approach

  13. Extensions to the Speech Disorders Classification System (SDCS)

    Science.gov (United States)

    Shriberg, Lawrence D.; Fourakis, Marios; Hall, Sheryl D.; Karlsson, Heather B.; Lohmeier, Heather L.; McSweeny, Jane L.; Potter, Nancy L.; Scheer-Cohen, Alison R.; Strand, Edythe A.; Tilkens, Christie M.; Wilson, David L.

    2010-01-01

    This report describes three extensions to a classification system for paediatric speech sound disorders termed the Speech Disorders Classification System (SDCS). Part I describes a classification extension to the SDCS to differentiate motor speech disorders from speech delay and to differentiate among three sub-types of motor speech disorders.…

  14. Language processing for speech understanding

    Science.gov (United States)

    Woods, W. A.

    1983-07-01

    This report considers language understanding techniques and control strategies that can be applied to provide higher-level support to aid in the understanding of spoken utterances. The discussion is illustrated with concepts and examples from the BBN speech understanding system, HWIM (Hear What I Mean). The HWIM system was conceived as an assistant to a travel budget manager, a system that would store information about planned and taken trips, travel budgets and their planning. The system was able to respond to commands and answer questions spoken into a microphone, and was able to synthesize spoken responses as output. HWIM was a prototype system used to drive speech understanding research. It used a phonetic-based approach, with no speaker training, a large vocabulary, and a relatively unconstraining English grammar. Discussed here is the control structure of the HWIM and the parsing algorithm used to parse sentences from the middle-out, using an ATN grammar.

  15. THE BASIS FOR SPEECH PREVENTION

    Directory of Open Access Journals (Sweden)

    Jordan JORDANOVSKI

    1997-06-01

    Full Text Available Speech is a tool for the accurate communication of ideas. When we talk about speech prevention as a practical realization of language, we refer to the fact that it should comprise the elements of the criteria as viewed from the perspective of the standards. These criteria, in the broad sense of the word, presuppose an exact realization of the thought expressed between the speaker and the recipient. The absence of these criteria becomes evident in the practical realization of language and brings forth consequences, often hidden very deeply in the human psyche. Their outer manifestation already represents a delayed reaction of the social environment. The foundation for overcoming and standardizing this phenomenon must be the anatomical-physiological patterns of the body, accomplished through methods in concordance with the nature of the body.

  16. Status Report on Speech Research

    Science.gov (United States)

    1992-06-01

    (1961). Speech disturbances caused by tumors of the supplementary motor area. Acta Psychiatrica Neurologica... prior results relating to lexical access in word recognition tasks (e.g., MAMA, which can be interpreted as either a Roman or a Cyrillic word, was no slower...). In the experiments cited above (Lukatela et al., 1978; Lukatela et al., 1980), words with a phonemic interpretation in both alphabets (e.g., MAMA, JAJE) are

  17. Speech Prosody in Persian Language

    Directory of Open Access Journals (Sweden)

    Maryam Nikravesh

    2014-05-01

    Full Text Available Background: Verbal communication relies not only on semantic and grammatical aspects (vocabulary, syntax, and phonemes) but also on special voice characteristics collectively called speech prosody. Speech prosody is one of the important factors of communication and includes intonation, duration, pitch, loudness, stress, rhythm, and so on. The aim of this survey is to study prosodic factors such as duration, fundamental frequency range, and intonation contour. Materials and Methods: This study has a cross-sectional, descriptive-analytic design. The participants were 134 males and females between 18 and 30 years old who are native speakers of Persian. Two sentences, one interrogative and one declarative, were studied. Voice samples were analyzed with the Dr. Speech software (real analysis software); data were analyzed with one-way analysis of variance and independent t-tests, and intonation contours were drawn for the sentences. Results: Mean duration differed significantly between the sentence types, and also between females and males. Fundamental frequency range did not differ significantly between sentence types; it was higher in females than in males. Conclusion: Duration is an effective factor in Persian prosody. The higher fundamental frequency range in females results from the different anatomical and physiological mechanisms of the female phonation system, and may also reflect patterns of language use among female Farsi speakers. The final part of the intonation contour is rising in yes/no questions and falling in declarative sentences.

  18. Speech parts as Poisson processes.

    Science.gov (United States)

    Badalamenti, A F

    2001-09-01

    This paper presents evidence that six of the seven parts of speech occur in written text as Poisson processes, simple or recurring. The six major parts are nouns, verbs, adjectives, adverbs, prepositions, and conjunctions, with the interjection occurring too infrequently to support a model. The data consist of more than the first 5000 words of works by four major authors, coded to label the parts of speech, as well as periods (sentence terminators). Sentence length is measured via the period and found to be normally distributed, with no stochastic model identified for its occurrence. The models for all six speech parts but the noun significantly distinguish some pairs of authors, and likewise for the joint use of all word types. Any one author is significantly distinguished from any other by at least one word type, and sentence length very significantly distinguishes each author from all others. The variety of word-type use, measured by Shannon entropy, builds to about 90% of its maximum possible value. The rate constants for nouns are close to the fractions of maximum entropy achieved. This finding, together with the stochastic models and the relations among them, suggests that the noun may be a primitive organizer of written text.
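
    The paper's central claim — that occurrences of a part of speech behave like a Poisson process — implies memoryless gaps between successive occurrences, so the gap lengths should be approximately exponential (coefficient of variation near 1) and the rate constant is just the reciprocal mean gap. A minimal self-check on tagged text might look like the following sketch; the token positions here are synthetic stand-ins for the output of any part-of-speech tagger.

```python
import numpy as np

def poisson_rate_and_cv(positions):
    """Estimate the Poisson rate from the token positions of one part
    of speech, plus the coefficient of variation of the gaps (which is
    close to 1 if the occurrences form a Poisson process)."""
    gaps = np.diff(np.sort(np.asarray(positions, dtype=float)))
    rate = 1.0 / gaps.mean()        # MLE of the rate constant
    cv = gaps.std() / gaps.mean()   # ~1 for exponential (memoryless) gaps
    return rate, cv

if __name__ == "__main__":
    # Synthetic "noun positions" drawn from a true Poisson process.
    rng = np.random.default_rng(1)
    positions = np.cumsum(rng.exponential(scale=4.0, size=500))
    rate, cv = poisson_rate_and_cv(positions)
    print(f"rate per word: {rate:.3f}, coefficient of variation: {cv:.2f}")
```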

  19. Self-Evaluation and Pre-Speech Planning: A Strategy for Sharing Responsibility for Progress in the Speech Class.

    Science.gov (United States)

    Desjardins, Linda A.

    Speech class teachers can implement a pre- and post-speech strategy, using pre-speech and self-evaluation forms, to help students become active in directing their own progress, and acknowledge their own accomplishments. Every speech is tape-recorded in class. Students listen to their speeches later and fill in the self-evaluation form, which asks…

  20. Modelling speech intelligibility in adverse conditions

    DEFF Research Database (Denmark)

    Jørgensen, Søren; Dau, Torsten

    2013-01-01

    Jørgensen and Dau (J Acoust Soc Am 130:1475-1487, 2011) proposed the speech-based envelope power spectrum model (sEPSM) in an attempt to overcome the limitations of the classical speech transmission index (STI) and speech intelligibility index (SII) in conditions with nonlinearly processed speech. Instead of considering the reduction of the temporal modulation energy as the intelligibility metric, as assumed in the STI, the sEPSM applies the signal-to-noise ratio in the envelope domain (SNRenv). This metric was shown to be the key for predicting the intelligibility of reverberant speech as well as noisy speech processed by spectral subtraction. However, the sEPSM cannot account for speech subjected to phase jitter, a condition in which the spectral structure of the speech signal is strongly affected, while the broadband temporal envelope is kept largely intact. In contrast, the effects of this distortion can be predicted successfully by the spectro-temporal modulation index (STMI).

  1. Mobile speech and advanced natural language solutions

    CERN Document Server

    Markowitz, Judith

    2013-01-01

    Mobile Speech and Advanced Natural Language Solutions provides a comprehensive and forward-looking treatment of natural speech in the mobile environment. This fourteen-chapter anthology brings together lead scientists from Apple, Google, IBM, AT&T, Yahoo! Research and other companies, along with academicians, technology developers and market analysts.  They analyze the growing markets for mobile speech, new methodological approaches to the study of natural language, empirical research findings on natural language and mobility, and future trends in mobile speech.  Mobile Speech opens with a challenge to the industry to broaden the discussion about speech in mobile environments beyond the smartphone, to consider natural language applications across different domains.   Among the new natural language methods introduced in this book are Sequence Package Analysis, which locates and extracts valuable opinion-related data buried in online postings; microintonation as a way to make TTS truly human-like; and se...

  2. Recent advances in nonlinear speech processing

    CERN Document Server

    Faundez-Zanuy, Marcos; Esposito, Antonietta; Cordasco, Gennaro; Drugman, Thomas; Solé-Casals, Jordi; Morabito, Francesco

    2016-01-01

    This book presents recent advances in nonlinear speech processing that go beyond nonlinear techniques alone, exploiting heuristic and psychological models of human interaction to implement socially believable VUIs and applications for human health and psychological support. The book takes into account the multifunctional role of speech and what is “outside of the box” (see Björn Schuller’s foreword). To this aim, the book is organized in 6 sections, each collecting a small number of short chapters reporting advances “inside” and “outside” themes related to nonlinear speech research. The themes emphasize theoretical and practical issues for modelling socially believable speech interfaces, ranging from efforts to capture the nature of sound changes in linguistic contexts and the timing nature of speech; labors to identify and detect speech features that help in the diagnosis of psychological and neuronal disease; and attempts to improve the effectiveness and performa...

  3. An enhanced relative spectral processing of speech

    Institute of Scientific and Technical Information of China (English)

    ZHEN Bin; WU Xihong; LIU Zhimin; CHI Huisheng

    2002-01-01

    An enhanced relative spectral (E_RASTA) technique for speech and speaker recognition is proposed. The new method consists of classical RASTA filtering in the logarithmic spectral domain followed by another additive RASTA filtering in the same domain. In this manner, both channel distortion and additive noise are removed effectively. In speaker identification and speech recognition experiments on the TI46 database, E_RASTA performs as well as or better than J_RASTA in both tasks. E_RASTA requires neither an estimate of the speech SNR to determine the optimal value of J in J_RASTA, nor information about how the speech degrades. The choice of the E_RASTA filter also indicates that low temporal modulation components in speech can deteriorate the performance of both recognition tasks. Besides, speaker recognition needs a narrower temporal modulation frequency band than speech recognition.
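
    For readers unfamiliar with the baseline, classical RASTA filtering band-passes the temporal trajectory of each log-spectral band, suppressing very slow (channel) and very fast modulations. The sketch below uses one widely cited parameterization of that filter (Hermansky and Morgan); the additional additive filtering pass that defines E_RASTA is only noted in a comment, since the abstract does not specify its exact form.

```python
import numpy as np
from scipy.signal import lfilter

# A common parameterization of the classical RASTA band-pass filter.
B = 0.1 * np.array([2.0, 1.0, 0.0, -1.0, -2.0])  # numerator (FIR part)
A = np.array([1.0, -0.98])                       # denominator (IIR pole)

def rasta_filter(log_spec):
    """Filter each band's trajectory across frames.

    log_spec: (frames, bands) log-magnitude features. Filtering runs
    along the frame axis, i.e., over time within each band.
    """
    return lfilter(B, A, log_spec, axis=0)

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    feats = rng.standard_normal((200, 20))  # stand-in log-spectrogram
    filtered = rasta_filter(feats)
    # E_RASTA, per the abstract, applies a further additive RASTA-style
    # pass in the same logarithmic domain (form not reproduced here).
    print(filtered.shape)
```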

  4. Three speech sounds, one motor action: Evidence for speech-motor disparity from English flap production

    OpenAIRE

    Derrick, Donald; Stavness, Ian; Gick, Bryan

    2015-01-01

    The assumption that units of speech production bear a one-to-one relationship to speech motor actions pervades otherwise widely varying theories of speech motor behavior. This speech production and simulation study demonstrates that commonly occurring flap sequences may violate this assumption. In the word “Saturday,” a sequence of three sounds may be produced using a single, cyclic motor action. Under this view, the initial upward tongue tip motion, starting with the first vowel and moving t...

  5. A Software Agent for Speech Abiding Systems

    Directory of Open Access Journals (Sweden)

    R. Manoharan

    2009-01-01

    Full Text Available Problem statement: In order to bring speech into the mainstream of business processes, an efficient digital signal processor is necessary. The Fast Fourier Transform (FFT) and the symmetry of its butterfly structure make hardware implementation easier. With the proposed DSP and software together, a system named the "Speech Abiding System" (SAS) is established: a software agent which involves the digital representation of speech signals and the use of digital processors to analyze, synthesize, or modify such signals. The proposed SAS addresses the issues in two parts. Part I: capturing speaker- and language-independent, error-free speech content for speech application processing; Part II: delivering the captured speech content as input to Speech User Applications/Interfaces (SUI). Approach: The Discrete Fourier Transform (DFT) of the speech signal is the essential ingredient for evolving this SAS, and the Discrete-Time Fourier Transform (DTFT) links the discrete-time domain to the continuous-frequency domain. The direct computation of the DFT is prohibitively expensive in terms of the required computer operations. Fortunately, a number of "fast" transforms have been developed that are mathematically equivalent to the DFT but require significantly fewer computer operations for their implementation. Results: In Part I, the SAS is able to capture error-free speech content, facilitating speech as a good input in the mainstream of business processing. Part II provides an environment in which to implement speech user applications at a primitive level. Conclusion/Recommendations: With the SAS agent and the required hardware architecture, a Finite State Automata (FSA) machine can be created to develop globally oriented, domain-specific speech user applications easily. This will have a major impact on interoperability and disintermediation in the Information Technology Cycle (ITC) for computer program generation.
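
    The abstract's cost argument is standard: the direct DFT needs on the order of N^2 complex operations, while the radix-2 butterfly decomposition needs on the order of N log N. The sketch below is generic textbook material illustrating that equivalence, not the SAS implementation from the paper.

```python
import numpy as np

def dft_direct(x):
    """O(N^2) DFT straight from the definition."""
    N = len(x)
    n = np.arange(N)
    W = np.exp(-2j * np.pi * np.outer(n, n) / N)
    return W @ x

def fft_radix2(x):
    """O(N log N) recursive radix-2 FFT (N must be a power of two).
    Each level combines two half-size DFTs with one butterfly stage."""
    x = np.asarray(x, dtype=complex)
    N = len(x)
    if N == 1:
        return x
    even, odd = fft_radix2(x[0::2]), fft_radix2(x[1::2])
    twiddle = np.exp(-2j * np.pi * np.arange(N // 2) / N)
    return np.concatenate([even + twiddle * odd, even - twiddle * odd])

if __name__ == "__main__":
    x = np.random.default_rng(0).standard_normal(1024)
    assert np.allclose(fft_radix2(x), np.fft.fft(x))
    assert np.allclose(dft_direct(x), np.fft.fft(x))
    print("both transforms match numpy.fft")
```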

  6. CAR2 - Czech Database of Car Speech

    Directory of Open Access Journals (Sweden)

    P. Sovka

    1999-12-01

    Full Text Available This paper presents a new Czech-language two-channel (stereo) speech database recorded in a car environment. The database was designed for experiments with speech enhancement for communication purposes and for the study and design of robust speech recognition systems. Tools for automated phoneme labelling based on Baum-Welch re-estimation were realised. A noise analysis of the car background environment was also carried out.

  7. Is markerless acquisition of speech production accurate ?

    OpenAIRE

    Ouni, Slim; Dahmani, Sara

    2016-01-01

    International audience; In this study, the precision of markerless acquisition techniques has been assessed when they are used to acquire articulatory data for speech production studies. Two different markerless systems have been evaluated and compared to a marker-based one. The main finding is that both markerless systems provide reasonable results during normal speech, while the quality is uneven during fast articulated speech. The quality of the data is dependent on the temporal resolution of the mark...

  8. An Approach to Intelligent Speech Production System

    Institute of Scientific and Technical Information of China (English)

    陈芳; 袁保宗

    1997-01-01

    In this paper an intelligent speech production system is established using language information processing technology. The concept of bi-directional grammar is proposed for Chinese language information processing, and a corresponding Chinese characteristic network is completed. Correct text can be generated through grammar parsing and some additional rules. From the generated text, the system produces speech with good naturalness and intelligibility using the Chinese Text-to-Speech Conversion System.

  9. Modelling speech intelligibility in adverse conditions.

    Science.gov (United States)

    Jørgensen, Søren; Dau, Torsten

    2013-01-01

    Jørgensen and Dau (J Acoust Soc Am 130:1475-1487, 2011) proposed the speech-based envelope power spectrum model (sEPSM) in an attempt to overcome the limitations of the classical speech transmission index (STI) and speech intelligibility index (SII) in conditions with nonlinearly processed speech. Instead of considering the reduction of the temporal modulation energy as the intelligibility metric, as assumed in the STI, the sEPSM applies the signal-to-noise ratio in the envelope domain (SNRenv). This metric was shown to be the key for predicting the intelligibility of reverberant speech as well as noisy speech processed by spectral subtraction. The key role of the SNRenv metric is further supported here by the ability of a short-term version of the sEPSM to predict speech masking release for different speech materials and modulated interferers. However, the sEPSM cannot account for speech subjected to phase jitter, a condition in which the spectral structure of the speech signal is strongly affected, while the broadband temporal envelope is kept largely intact. In contrast, the effects of this distortion can be predicted successfully by the spectro-temporal modulation index (STMI) (Elhilali et al., Speech Commun 41:331-348, 2003), which assumes an explicit analysis of the spectral "ripple" structure of the speech signal. However, since the STMI applies the same decision metric as the STI, it fails to account for spectral subtraction. The results from this study suggest that the SNRenv might reflect a powerful decision metric, while some explicit across-frequency analysis seems crucial in some conditions. How such across-frequency analysis is "realized" in the auditory system remains unresolved.
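
    The SNRenv idea can be illustrated in a single modulation band: extract the temporal envelope of the noisy speech and of the noise alone, measure the envelope power in a modulation band, and form (P_mix - P_noise) / P_noise. This is a one-band sketch under simplifying assumptions (Hilbert envelope, one Butterworth modulation filter), not the published multi-band sEPSM.

```python
import numpy as np
from scipy.signal import hilbert, butter, sosfilt

def envelope_power(x, fs, mod_lo, mod_hi):
    """AC power of the temporal envelope within one modulation band."""
    env = np.abs(hilbert(x))     # Hilbert envelope
    env = env - env.mean()       # keep the AC (modulation) component
    sos = butter(2, [mod_lo, mod_hi], btype="band", fs=fs, output="sos")
    return np.mean(sosfilt(sos, env) ** 2)

def snr_env(noisy_speech, noise, fs, mod_lo=1.0, mod_hi=8.0):
    """SNRenv = (P_env(S+N) - P_env(N)) / P_env(N), floored at zero."""
    p_mix = envelope_power(noisy_speech, fs, mod_lo, mod_hi)
    p_noise = envelope_power(noise, fs, mod_lo, mod_hi)
    return max(p_mix - p_noise, 0.0) / p_noise

if __name__ == "__main__":
    fs = 16000
    t = np.arange(2 * fs) / fs
    rng = np.random.default_rng(0)
    noise = 0.2 * rng.standard_normal(len(t))
    # A 4 Hz amplitude-modulated tone stands in for speech.
    speechlike = (1 + np.sin(2 * np.pi * 4 * t)) * np.sin(2 * np.pi * 500 * t)
    print(f"SNRenv: {snr_env(speechlike + noise, noise, fs):.2f}")
```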

  10. Design and development of a children's speech database

    OpenAIRE

    Kraleva, Radoslava

    2016-01-01

    This report presents the process of planning, designing, and developing a database of spoken children's speech whose native language is Bulgarian. The proposed model is designed for children between the ages of 4 and 6 without speech disorders, and reflects their specific capabilities. At this age most children cannot read, have no sustained concentration, are emotional, etc. The aim is to unite all the media information accompanying the recording and processing of spoken speech...

  11. Intelligibility Enhancement of Speech in Noise

    OpenAIRE

    Valentini-Botinhao, Cassia; Yamagishi, Junichi; King, Simon

    2014-01-01

    Speech technology can facilitate human-machine interaction and create new communication interfaces. Text-To-Speech (TTS) systems provide speech output for dialogue, notification and reading applications as well as personalized voices for people that have lost the use of their own. TTS systems are built to produce synthetic voices that should sound as natural, expressive and intelligible as possible and if necessary be similar to a particular speaker. Although naturalness is an important requi...

  12. Coevolutionary Investments in Human Speech and Trade

    OpenAIRE

    Bulte, Erwin H; Horan, Richard D.; Shogren, Jason F.

    2006-01-01

    We propose a novel explanation for the emergence of language in modern humans, and the lack thereof in other hominids. A coevolutionary process, where trade facilitates speech and speech facilitates trade, driven by expectations and potentially influenced by geography, gives rise to multiple stable development trajectories. While the trade-speech equilibrium is not an inevitable outcome for modern humans, we do find that it is a relatively likely result given that our species evolved in Afric...

  13. Perceived liveliness and speech comprehensibility in aphasia : the effects of direct speech in auditory narratives

    NARCIS (Netherlands)

    Groenewold, Rimke; Bastiaanse, Roelien; Nickels, Lyndsey; Huiskes, Mike

    2014-01-01

    Background: Previous studies have shown that in semi-spontaneous speech, individuals with Broca's and anomic aphasia produce relatively many direct speech constructions. It has been claimed that in 'healthy' communication direct speech constructions contribute to the liveliness, and indirectly to th

  14. Speech rate effects on the processing of conversational speech across the adult life span.

    Science.gov (United States)

    Koch, Xaver; Janse, Esther

    2016-04-01

    This study investigates the effect of speech rate on spoken word recognition across the adult life span. Contrary to previous studies, conversational materials with a natural variation in speech rate were used rather than lab-recorded stimuli that are subsequently artificially time-compressed. It was investigated whether older adults' speech recognition is more adversely affected by increased speech rate compared to younger and middle-aged adults, and which individual listener characteristics (e.g., hearing, fluid cognitive processing ability) predict the size of the speech rate effect on recognition performance. In an eye-tracking experiment, participants indicated with a mouse-click which visually presented words they recognized in a conversational fragment. Click response times, gaze, and pupil size data were analyzed. As expected, click response times and gaze behavior were affected by speech rate, indicating that word recognition is more difficult if speech rate is faster. Contrary to earlier findings, increased speech rate affected the age groups to the same extent. Fluid cognitive processing ability predicted general recognition performance, but did not modulate the speech rate effect. These findings emphasize that earlier results of age by speech rate interactions mainly obtained with artificially speeded materials may not generalize to speech rate variation as encountered in conversational speech.

  15. Exploring the role of brain oscillations in speech perception in noise: Intelligibility of isochronously retimed speech

    Directory of Open Access Journals (Sweden)

    Vincent Aubanel

    2016-08-01

    Full Text Available A growing body of evidence shows that brain oscillations track speech. This mechanism is thought to maximise processing efficiency by allocating resources to important speech information, effectively parsing speech into units of appropriate granularity for further decoding. However, some aspects of this mechanism remain unclear. First, while periodicity is an intrinsic property of this physiological mechanism, speech is only quasi-periodic, so it is not clear whether periodicity would present an advantage in processing. Second, it is still a matter of debate which aspect of speech triggers or maintains cortical entrainment, from bottom-up cues such as fluctuations of the amplitude envelope of speech to higher-level linguistic cues such as syntactic structure. We present data from a behavioural experiment assessing the effect of isochronous retiming of speech on speech perception in noise. Two types of anchor points were defined for retiming speech, namely syllable onsets and amplitude envelope peaks. For each anchor point type, retiming was implemented at two hierarchical levels, a slow time scale around 2.5 Hz and a fast time scale around 4 Hz. Results show that while any temporal distortion resulted in reduced speech intelligibility, isochronous speech anchored to P-centers (approximated by stressed-syllable vowel onsets) was significantly more intelligible than a matched anisochronous retiming, suggesting a facilitative role of periodicity defined on linguistically motivated units in processing speech in noise.

  16. Developmental apraxia of speech in children : quantitative assessment of speech characteristics

    NARCIS (Netherlands)

    Thoonen, G.H.J.

    1998-01-01

    Developmental apraxia of speech (DAS) in children is a speech disorder, supposed to have a neurological origin, which is commonly considered to result from particular deficits in speech processing (i.e., phonological planning, motor programming). However, the label DAS has often been used as a catch-

  17. Speech and non-speech audio-visual illusions: a developmental study.

    Directory of Open Access Journals (Sweden)

    Corinne Tremblay

    Full Text Available It is well known that simultaneous presentation of incongruent audio and visual stimuli can lead to illusory percepts. Recent data suggest that distinct processes underlie intersensory perception of speech as opposed to non-speech stimuli. However, the development of both speech and non-speech intersensory perception across childhood and adolescence remains poorly defined. Thirty-eight observers aged 5 to 19 were tested on the McGurk effect (an audio-visual illusion involving speech) and on the Illusory Flash effect and the Fusion effect (two audio-visual illusions not involving speech) to investigate the development of audio-visual interactions and contrast speech vs. non-speech developmental patterns. Whereas the strength of audio-visual speech illusions varied as a direct function of maturational level, performance on non-speech illusory tasks appeared to be homogeneous across all ages. These data support the existence of independent maturational processes underlying speech and non-speech audio-visual illusory effects.

  18. Cleft Audit Protocol for Speech (CAPS-A): A Comprehensive Training Package for Speech Analysis

    Science.gov (United States)

    Sell, D.; John, A.; Harding-Bell, A.; Sweeney, T.; Hegarty, F.; Freeman, J.

    2009-01-01

    Background: The previous literature has largely focused on speech analysis systems and ignored process issues, such as the nature of adequate speech samples, data acquisition, recording and playback. Although there has been recognition of the need for training on tools used in speech analysis associated with cleft palate, little attention has been…

  19. E-learning-based speech therapy: a web application for speech training.

    NARCIS (Netherlands)

    Beijer, L.J.; Rietveld, T.C.; Beers, M.M. van; Slangen, R.M.; Heuvel, H. van den; Swart, B.J.M. de; Geurts, A.C.H.

    2010-01-01

    In The Netherlands, a web application for speech training, E-learning-based speech therapy (EST), has been developed for patients with dysarthria, a speech disorder resulting from acquired neurological impairments such as stroke or Parkinson's disease. In this report, the EST infrastructure

  20. Acquisition of speech rhythm in first language.

    Science.gov (United States)

    Polyanskaya, Leona; Ordin, Mikhail

    2015-09-01

    Analysis of English rhythm in speech produced by children and adults revealed that speech rhythm becomes increasingly more stress-timed as language acquisition progresses. Children reach the adult-like target by 11 to 12 years. The employed speech elicitation paradigm ensured that the sentences produced by adults and children at different ages were comparable in terms of lexical content, segmental composition, and phonotactic complexity. Detected differences between child and adult rhythm and between rhythm in child speech at various ages cannot be attributed to acquisition of phonotactic language features or vocabulary, and indicate the development of language-specific phonetic timing in the course of acquisition.

  1. Normal and Time-Compressed Speech

    Science.gov (United States)

    Lemke, Ulrike; Kollmeier, Birger; Holube, Inga

    2016-01-01

    Short-term and long-term learning effects were investigated for the German Oldenburg sentence test (OLSA) using original and time-compressed fast speech in noise. Normal-hearing and hearing-impaired participants completed six lists of the OLSA in five sessions. Two groups of normal-hearing listeners (24 and 12 listeners) and two groups of hearing-impaired listeners (9 listeners each) performed the test with original or time-compressed speech. In general, original speech resulted in better speech recognition thresholds than time-compressed speech. Thresholds decreased with repetition for both speech materials. Confirming earlier results, the largest improvements were observed within the first measurements of the first session, indicating a rapid initial adaptation phase. The improvements were larger for time-compressed than for original speech. The novel results on long-term learning effects when using the OLSA indicate a longer phase of ongoing learning, especially for time-compressed speech, which seems to be limited by a floor effect. In addition, for normal-hearing participants, no complete transfer of learning benefits from time-compressed to original speech was observed. These effects should be borne in mind when inviting listeners repeatedly, for example, in research settings.

  2. Speech Enhancement based on Compressive Sensing Algorithm

    Science.gov (United States)

    Sulong, Amart; Gunawan, Teddy S.; Khalifa, Othman O.; Chebil, Jalel

    2013-12-01

    Various methods for speech enhancement have been proposed over the years, with design efforts focused mainly on quality and intelligibility. A novel speech enhancement approach uses compressive sensing (CS), a new paradigm for acquiring signals that is fundamentally different from uniform-rate digitization followed by compression, which is often used for transmission or storage. CS can reduce the number of degrees of freedom of a sparse/compressible signal by permitting only certain configurations of large and zero/small coefficients, and structured sparsity models. CS therefore provides a way of reconstructing a compressed version of the speech in the original signal by taking only a small number of linear and non-adaptive measurements. The performance of the overall algorithm is evaluated on speech quality, assessed by informal listening tests and the Perceptual Evaluation of Speech Quality (PESQ). Experimental results show that the CS algorithm performs very well on a wide range of speech tests and gives good performance for speech enhancement, with better noise suppression than conventional approaches and no obvious degradation of speech quality.
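
    The recovery step at the heart of compressive sensing — reconstructing a signal that is sparse in some basis from far fewer random linear measurements than samples — can be sketched independently of the paper's enhancement pipeline. The DCT sparsity basis, the measurement count, and the greedy Orthogonal Matching Pursuit solver below are illustrative assumptions, not the authors' choices.

```python
import numpy as np
from scipy.fft import idct

def omp(A, y, k):
    """Orthogonal Matching Pursuit: greedily pick k columns of A that
    best explain y, re-fitting a least-squares solution after each pick."""
    residual, idx = y.copy(), []
    for _ in range(k):
        idx.append(int(np.argmax(np.abs(A.T @ residual))))
        coef, *_ = np.linalg.lstsq(A[:, idx], y, rcond=None)
        residual = y - A[:, idx] @ coef
    x = np.zeros(A.shape[1])
    x[idx] = coef
    return x

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    n, m, k = 256, 80, 5          # signal length, measurements, sparsity
    coeffs = np.zeros(n)
    coeffs[rng.choice(n, k, replace=False)] = rng.standard_normal(k)
    signal = idct(coeffs, norm="ortho")             # sparse in the DCT domain
    Phi = rng.standard_normal((m, n)) / np.sqrt(m)  # random measurement matrix
    Psi = idct(np.eye(n), axis=0, norm="ortho")     # DCT synthesis basis
    rec = idct(omp(Phi @ Psi, Phi @ signal, k), norm="ortho")
    err = np.linalg.norm(rec - signal) / np.linalg.norm(signal)
    print(f"relative reconstruction error: {err:.2e}")
```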

  3. Text To Speech System for Telugu Language

    Directory of Open Access Journals (Sweden)

    M. Siva Kumar

    2014-03-01

    Full Text Available Telugu is one of the oldest languages in India. This paper describes the development of a Telugu Text-to-Speech system (TTS). In the Telugu TTS the input is Telugu text in Unicode, and the voices are sampled from real recorded speech. The objective of a text-to-speech system is to convert arbitrary text into its corresponding spoken waveform. Speech synthesis is the process of building machinery that can generate human-like speech from any text input to imitate human speakers. Text processing and speech generation are the two main components of a text-to-speech system. To build a natural-sounding speech synthesis system, it is essential that the text processing component produce an appropriate sequence of phonemic units. Generation of the sequence of phonetic units for a given standard word is referred to as a letter-to-phoneme or text-to-phoneme rule. The complexity of these rules and their derivation depends upon the nature of the language. The quality of a speech synthesizer is judged by its closeness to the natural human voice and by its understandability. In this paper we describe an approach to building a Telugu TTS system using the concatenative synthesis method, with the syllable as the basic unit of concatenation.
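
    Syllable-based concatenative synthesis, as described, reduces to three steps: map the text to a syllable sequence with letter-to-sound rules, look each syllable up in a recorded unit inventory, and join the waveforms smoothly. The sketch below shows only the joining step with a short linear crossfade; the inventory and the syllable sequence are hypothetical placeholders for the output of real letter-to-phoneme rules and recorded Telugu units.

```python
import numpy as np

FS = 16000
CROSSFADE = int(0.01 * FS)  # 10 ms overlap at each unit boundary

def crossfade_concat(units):
    """Concatenate waveform units with a linear crossfade at each join,
    which avoids the clicks a hard splice would produce."""
    out = units[0]
    ramp = np.linspace(0.0, 1.0, CROSSFADE)
    for u in units[1:]:
        blended = out[-CROSSFADE:] * (1 - ramp) + u[:CROSSFADE] * ramp
        out = np.concatenate([out[:-CROSSFADE], blended, u[CROSSFADE:]])
    return out

if __name__ == "__main__":
    # Hypothetical inventory: one recorded waveform per syllable unit.
    rng = np.random.default_rng(0)
    inventory = {s: rng.standard_normal(4000) for s in ("te", "lu", "gu")}
    syllables = ["te", "lu", "gu"]  # would come from letter-to-phoneme rules
    wave = crossfade_concat([inventory[s] for s in syllables])
    print(wave.shape)
```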

  4. Indonesian Automatic Speech Recognition For Command Speech Controller Multimedia Player

    Directory of Open Access Journals (Sweden)

    Vivien Arief Wardhany

    2014-12-01

    Full Text Available The purpose of this multimedia device development is control through voice. Nowadays such voice commands can typically be recognized only in English; to overcome this, recognition here uses an Indonesian language model, acoustic model, and dictionary. The automatic speech recognizer is built on the CMU Sphinx engine, with the English language database adapted to Indonesian, and XBMC is used as the multimedia player. The experiment used 10 volunteers (5 male, 5 female) testing 7 commands, with 10 samples taken per command and each volunteer performing 10 test utterances for every one of the 7 commands. Based on the classification table, the word "kanan" was recognized most often (83%), while "pilih" was recognized least often. The word most often misclassified was "kembali" (67%), while "kanan" was misclassified least often. Among male speakers, several commands such as "kembali", "utama", "atas", and "bawah" had low recognition rates (RR). In particular, "kembali" could not be recognized at all in the female voices, and in the male voices it reached only 4% RR; this is because the command has no similar-sounding English word close to "kembali", so the system failed to recognize it. Likewise, the command "pilih" achieved 80% RR with female voices but only 4% with male voices. This is mostly due to the different voice characteristics of adult males and females: males have lower voice frequencies (85 to 180 Hz) than females (165 to 255 Hz). The results of the experiment showed that each speaker had a different recognition rate, caused by differences in tone, pronunciation, and speed of speech. Further work is needed to improve the accuracy of the Indonesian automatic speech recognition system.

  5. Perceptual centres in speech - an acoustic analysis

    Science.gov (United States)

    Scott, Sophie Kerttu

    Perceptual centres, or P-centres, represent the perceptual moments of occurrence of acoustic signals - the 'beat' of a sound. P-centres underlie the perception and production of rhythm in perceptually regular speech sequences, and they have been modelled in both the speech and non-speech (music) domains. The three aims of this thesis were (a) to test current P-centre models to determine which best accounted for the experimental data; (b) to identify a candidate parameter onto which P-centres could be mapped (a local approach), as opposed to previous global models which rely upon the whole signal to determine the P-centre; and (c) to develop a model of P-centre location which could be applied to speech and non-speech signals. The first aim was investigated by a series of experiments examining (a) speech from different speakers, to determine whether different models could account for variation between speakers; (b) whether ramping the amplitude-time plot of a speech signal affects the P-centre of the signal; and (c) whether increasing the amplitude at the offset of a speech signal alters P-centres in the production and perception of speech. The second aim was pursued by (a) manipulating the rise time of different speech signals to determine whether the P-centre was affected, and whether the type of speech sound ramped affected the P-centre shift; (b) manipulating the rise time and decay time of a synthetic vowel to determine whether the onset alteration had more effect on the P-centre than the offset manipulation; and (c) testing whether the duration of a vowel affected the P-centre when other attributes (amplitude, spectral content) were held constant. The third aim - modelling P-centres - was based on these results. The Frequency-dependent Amplitude Increase Model of P-centre location (FAIM) was developed using a modelling protocol, the APU GammaTone Filterbank and the speech from different speakers. The P-centres of the stimulus corpus were highly predicted by attributes of

  6. Speech perception as an active cognitive process

    Directory of Open Access Journals (Sweden)

    Shannon Heald

    2014-03-01

    Full Text Available One view of speech perception is that acoustic signals are transformed into representations for pattern matching to determine linguistic structure. This process can be taken as a statistical pattern-matching problem, assuming that relatively stable linguistic categories are characterized by neural representations related to auditory properties of speech that can be compared to speech input. This kind of pattern matching can be termed a passive process, which implies rigidity of processing with few demands on cognitive processing. An alternative view is that speech recognition, even in early stages, is an active process in which speech analysis is attentionally guided. Note that this does not mean consciously guided, but that information-contingent changes in early auditory encoding can occur as a function of context and experience. Active processing assumes that attention, plasticity, and listening goals are important in considering how listeners cope with adverse circumstances that impair hearing, such as masking noise in the environment or hearing loss. Although theories of speech perception have begun to incorporate some active processing, they seldom treat early speech encoding as plastic and attentionally guided. Recent research has suggested that speech perception is the product of both feedforward and feedback interactions between a number of brain regions, including descending projections perhaps as far downstream as the cochlea. It is important to understand how the ambiguity of the speech signal and the constraints of context dynamically determine the cognitive resources recruited during perception, including focused attention, learning, and working memory. Theories of speech perception need to go beyond the current corticocentric approach in order to account for the intrinsic dynamics of the auditory encoding of speech. In doing so, this may provide new insights into ways in which hearing disorders and loss may be treated, either through augmentation or

  7. A Review on Speech Corpus Development for Automatic Speech Recognition in Indian Languages

    Directory of Open Access Journals (Sweden)

    Cini kurian

    2015-05-01

    Full Text Available Corpus development has gained much attention due to recent statistics-based natural language processing. It has new applications in language technology, linguistic research, language education, and information exchange. Corpus-based language research has an innovative outlook that is displacing older linguistic theories. A speech corpus is an essential resource for building a speech recognizer, and one of the main challenges faced by speech scientists is the unavailability of such resources. Compared to English, far fewer efforts have been made in Indian languages to make these resources publicly available. In this paper we review the efforts made in Indian languages to develop speech corpora for automatic speech recognition.

  8. Reflection and Optimization of Primary English Teachers' Speech Acts Based on Speech Act Theory

    Institute of Scientific and Technical Information of China (English)

    HU Qi-hai

    2015-01-01

    Primary English teachers' speech acts have a major impact on foreign language teaching and learning in primary school. The application of a teacher's speech acts in the classroom is actually a selective process. From the perspective of Speech Act Theory, primary English teachers can optimize their speech acts by activating greetings with proper context information, standardizing teacher talk, choosing suitable questions, and providing appropriate feedback on pupils' classroom performances, in order to improve the effectiveness of their classroom speech acts.

  9. The inhibition of stuttering via the presentation of natural speech and sinusoidal speech analogs.

    Science.gov (United States)

    Saltuklaroglu, Tim; Kalinowski, Joseph

    2006-08-14

    Sensory signals containing speech or gestural (articulatory) information (e.g., choral speech) have repeatedly been found to be highly effective inhibitors of stuttering. Sine wave analogs of speech consist of a trio of changing pure tones representative of formant frequencies. They are otherwise devoid of traditional speech cues, yet have proven to evoke consistent linguistic percepts in listeners. Thus, we investigated the potency of sinusoidal speech for inhibiting stuttering. Ten adults who stutter read while listening to (a) forward-flowing natural speech; (b) forward-flowing sinusoid analogs of natural speech; (c) reversed natural speech; (d) reversed sinusoid analogs of natural speech; and (e) a continuous 1000 Hz pure tone. The levels of stuttering inhibition achieved using the sinusoidal stimuli were potent and not significantly different from those achieved using natural speech (approximately 50% in forward conditions and approximately 25% in the reversed conditions), suggesting that the patterns of undulating pure tones are sufficient to endow sinusoidal sentences with 'quasi-gestural' qualities. These data highlight the sensitivity of a specialized 'phonetic module' for extracting gestural information from sensory stimuli. Stuttering inhibition is thought to occur when perceived gestural information facilitates fluent productions via the engagement of mirror neurons (e.g., in Broca's area), which appear to play a crucial role in our ability to perceive and produce speech.
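
    Sine-wave analogs of the kind used as stimuli are built by replacing each formant track with a single time-varying sinusoid and summing the trio. The sketch below shows the construction; the formant trajectories are invented for illustration, whereas real stimuli would use tracks measured from natural utterances.

```python
import numpy as np

def sinewave_analog(formant_tracks, amplitudes, fs):
    """Sum one sinusoid per formant track. Each track holds the
    instantaneous frequency in Hz per sample; integrating it gives a
    continuous phase, so the tone glides rather than jumps."""
    out = np.zeros(formant_tracks.shape[1])
    for track, amp in zip(formant_tracks, amplitudes):
        phase = 2 * np.pi * np.cumsum(track) / fs
        out += amp * np.sin(phase)
    return out / np.max(np.abs(out))

if __name__ == "__main__":
    fs, dur = 16000, 1.0
    t = np.linspace(0.0, dur, int(fs * dur), endpoint=False)
    # Invented trajectories standing in for measured F1-F3 tracks.
    f1 = 500 + 200 * np.sin(2 * np.pi * 2.0 * t)
    f2 = 1500 + 400 * np.sin(2 * np.pi * 1.5 * t)
    f3 = 2500 + 100 * np.sin(2 * np.pi * 1.0 * t)
    y = sinewave_analog(np.stack([f1, f2, f3]), [1.0, 0.6, 0.3], fs)
    print(y.shape)
```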

  10. Monaural speech intelligibility and detection in maskers with varying amounts of spectro-temporal speech features.

    Science.gov (United States)

    Schubotz, Wiebke; Brand, Thomas; Kollmeier, Birger; Ewert, Stephan D

    2016-07-01

    Speech intelligibility is strongly affected by the presence of maskers. Depending on the spectro-temporal structure of the masker and its similarity to the target speech, different masking aspects can occur, typically referred to as energetic, amplitude modulation, and informational masking. In this study speech intelligibility and speech detection were measured in maskers that vary systematically in the time-frequency domain from steady-state noise to a single interfering talker. Male and female target speech was used in combination with maskers based on speech of the same or a different gender. Observed data were compared to predictions of the speech intelligibility index, extended speech intelligibility index, multi-resolution speech-based envelope-power-spectrum model, and the short-time objective intelligibility measure. The different models served as analysis tools to help distinguish between the different masking aspects. The comparison shows that overall masking can to a large extent be explained by short-term energetic masking. However, the other masking aspects (amplitude modulation and informational masking) influence speech intelligibility as well. Additionally, all models showed considerable deviations from the data. Therefore, the current study provides a benchmark for further evaluation of speech prediction models.

  11. The Speech Spectrum and its Relationship to Intelligibility of Speech

    Science.gov (United States)

    Englert, Sue Ellen

    The present experiment was designed to investigate and understand the causes of failures of the Articulation Index as a predictive tool. An electroacoustic system was used in which: (1) The frequency response was optimally flattened at the listener's ear. (2) An ear-insert earphone was designed to give close electroacoustic control. (3) An infinite-impulse-response digital filter was used to filter the speech signal from a pre-recorded nonsense syllable test. (4) Four formant regions were filtered in fourteen different ways. It was found that the results agreed with past experiments in that: (1) The Articulation Index fails as a predictive tool when using band-pass filters. (2) Low frequencies seem to mask higher frequencies causing a decrease in intelligibility. It was concluded that: (1) It is inappropriate to relate the total fraction of the speech spectrum to a specific intelligibility score since the fraction remaining after filtering may be in the low-, mid-, or high-frequency range. (2) The relationship between intelligibility and the total area under the spectral curve is not monotonic. (3) The fourth formant region (2925Hz to 4200Hz) enhanced intelligibility when included with other formant regions. Methods for relating spectral regions and intelligibility were discussed.
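
    For context, a common simplified form of the Articulation Index the author tests sums band-importance weights against a clipped band signal-to-noise term, giving full credit at +18 dB band SNR and none below -12 dB (a 30 dB useful range). The five-band weights below are illustrative, not a published importance function.

```python
import numpy as np

def articulation_index(band_snr_db, band_weights):
    """Simplified AI: each band contributes its (normalized) importance
    weight scaled by how much of the 30 dB useful SNR range it spans."""
    credit = np.clip((np.asarray(band_snr_db) + 12.0) / 30.0, 0.0, 1.0)
    w = np.asarray(band_weights, dtype=float)
    return float(np.sum(w / w.sum() * credit))

if __name__ == "__main__":
    # Illustrative five-band example with mid-frequency emphasis.
    snr_db = [0.0, 6.0, 12.0, 18.0, 3.0]
    weights = [0.1, 0.2, 0.3, 0.3, 0.1]
    print(f"AI = {articulation_index(snr_db, weights):.2f}")
```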

  12. Earlier speech exposure does not accelerate speech acquisition.

    Science.gov (United States)

    Peña, Marcela; Werker, Janet F; Dehaene-Lambertz, Ghislaine

    2012-08-15

    Critical periods in language acquisition have been discussed primarily with reference to studies of people who are deaf or bilingual. Here, we provide evidence on the opening of sensitivity to the linguistic environment by studying the response to a change of phoneme at a native and nonnative phonetic boundary in full-term and preterm human infants using event-related potentials. Full-term infants show a decline in their discrimination of nonnative phonetic contrasts between 9 and 12 months of age. Because the womb is a high-frequency filter, many phonemes are strongly degraded in utero. Preterm infants thus benefit from earlier and richer exposure to broadcast speech. We find that preterms do not take advantage of this enriched linguistic environment: the decrease in amplitude of the mismatch response to a nonnative change of phoneme at the end of the first year of life was dependent on maturational age and not on the duration of exposure to broadcast speech. The shaping of phonological representations by the environment is thus strongly constrained by brain maturation factors.

  13. On speech recognition during anaesthesia

    DEFF Research Database (Denmark)

    Alapetite, Alexandre

    2007-01-01

    This PhD thesis in human-computer interfaces (informatics) studies the case of the anaesthesia record used during medical operations and the possibility of supplementing it with speech recognition facilities. Problems and limitations have been identified with the traditional paper-based anaesthesia record, but also with newer electronic versions typically based on touch-screen and keyboard, in particular ergonomic issues and the fact that anaesthesiologists tend to postpone the registration of medications and other events during busy periods of anaesthesia, which in turn may lead to gaps

  14. High-frequency audiometry in normal hearing military firemen exposed to noise

    Directory of Open Access Journals (Sweden)

    Rita Leniza Oliveira da Rocha

    2010-12-01

    Full Text Available The study of high frequencies has proven important for detecting inner ear damage; in some cases, conventional frequencies are not sensitive to inner ear alterations at their initial stage. AIM: To analyze the high-frequency threshold results of individuals exposed to noise whose conventional audiometry is normal. MATERIALS AND METHODS: A retrospective cross-sectional cohort study was carried out with 47 firefighters of the Rio de Janeiro Fire Department stationed at Santos Dumont airport and 33 military personnel without noise exposure. The groups were divided into two age brackets: 30-39 years and 40-49 years. The high frequencies were tested immediately after pure-tone and speech audiometry. RESULTS: The most significant results occurred in the 40-49 age bracket, in which the exposed group presented significantly higher thresholds than the control group at 14,000 Hz (p = 0.008) and 16,000 Hz (p = 0.0001). CONCLUSIONS: Noise affected the high-frequency thresholds: all means found in the exposed group were higher than those of the control group. These data reinforce the importance of high-frequency testing, even with normal conventional audiometry, for the early diagnosis of noise-induced hearing loss.

  15. High-frequency audiometry in complementary audiological diagnosis: a review of the national literature

    Directory of Open Access Journals (Sweden)

    Karlin Fabianne Klagenberg

    2011-03-01

    Full Text Available High-frequency audiometry (HFA) is an important audiological test for the early detection of hearing loss caused by lesions at the base of the cochlear duct. In recent years its use has been facilitated by the fact that commercial audiometers now incorporate frequencies above 8 kHz. However, there are differences related to the equipment used, the methodologies employed, and the results and their interpretation. The aim of this article was therefore to analyze the national scientific production on the clinical application of HFA, in order to understand its current use. Texts published and indexed in the LILACS, SciELO, and Medline databases over a ten-year period were searched using the descriptor "high-frequency audiometry". Twenty-four national scientific articles using HFA were found; most of the populations evaluated were between 18 and 50 years of age; 13 of the studies determined thresholds referenced in dB hearing level (dB HL); some studies compared pure-tone thresholds between groups to define normality; and the authors reported significant differences in high-frequency thresholds across ages. HFA is used in the audiological clinic for the early identification of hearing alterations and for monitoring the hearing of subjects exposed to ototoxic drugs and/or other agents that damage the ear.

  16. Recent Trends in Free Speech Theory.

    Science.gov (United States)

    Haiman, Franklyn S.

    This syllabus of a convention workshop course on free speech theory consists of descriptions of several United States Supreme Court decisions related to free speech. Some specific areas in which decisions are discussed are: obscene and indecent communication, the definition of a public figure for purposes of libel action, the press versus official…

  17. Speech and Language Problems in Children

    Science.gov (United States)

    ... usually has one or two words like "Hi," "dog," "Dada," or "Mama" by her first birthday. Sometimes a delay may be caused by hearing loss. Other times it may be due to a speech or language disorder. Children who have speech disorders may have ...

  18. Speech-Language Pathologists' Connotations of Stuttering.

    Science.gov (United States)

    Ragsdale, J. Donald; Ashby, Jon K.

    1982-01-01

    Results indicated that increasing age, higher degrees, more coursework, or more clinical experience did not produce more positive connotations of stuttering among 206 speech-language pathologists. Those holding the Certificate of Clinical Competence in Speech-Language Pathology showed more positive connotative responses than the noncertified…

  19. Speech Intelligibility in Severe Adductor Spasmodic Dysphonia

    Science.gov (United States)

    Bender, Brenda K.; Cannito, Michael P.; Murry, Thomas; Woodson, Gayle E.

    2004-01-01

    This study compared speech intelligibility in nondisabled speakers and speakers with adductor spasmodic dysphonia (ADSD) before and after botulinum toxin (Botox) injection. Standard speech samples were obtained from 10 speakers diagnosed with severe ADSD prior to and 1 month following Botox injection, as well as from 10 age- and gender-matched…

  20. Tampa Bay International Business Summit Keynote Speech

    Science.gov (United States)

    Clary, Christina

    2011-01-01

    A keynote speech outlining the importance of collaboration and diversity in the workplace. The 20-minute speech describes NASA's challenges and accomplishments over the years and what lies ahead. Topics include: diversity and inclusion principles, international cooperation, Kennedy Space Center planning and development, opportunities for cooperation, and NASA's vision for exploration.

  1. Hypnosis and the Reduction of Speech Anxiety.

    Science.gov (United States)

    Barker, Larry L.; And Others

    The purposes of this paper are (1) to review the background and nature of hypnosis, (2) to synthesize research on hypnosis related to speech communication, and (3) to delineate and compare two potential techniques for reducing speech anxiety--hypnosis and systematic desensitization. Hypnosis has been defined as a mental state characterised by…

  2. Speech-Language Pathology: Preparing Early Interventionists

    Science.gov (United States)

    Prelock, Patricia A.; Deppe, Janet

    2015-01-01

    The purpose of this article is to explain the role of speech-language pathology in early intervention. The expected credentials of professionals in the field are described, and the current numbers of practitioners serving young children are identified. Several resource documents available from the American Speech-Language-Hearing Association are…

  3. Analog Acoustic Expression in Speech Communication

    Science.gov (United States)

    Shintel, Hadas; Nusbaum, Howard C.; Okrent, Arika

    2006-01-01

    We present the first experimental evidence of a phenomenon in speech communication we call "analog acoustic expression." Speech is generally thought of as conveying information in two distinct ways: discrete linguistic-symbolic units such as words and sentences represent linguistic meaning, and continuous prosodic forms convey information about…

  4. SPEECH MANUAL. RHETORIC CURRICULUM V, STUDENT VERSION.

    Science.gov (United States)

    KITZHABER, ALBERT R.

    THIS MANUAL IS A REFERENCE AID FOR 11TH-GRADE STUDENTS PREPARING SPEAKING ASSIGNMENTS. CHAPTER 1, "THE PHYSIOLOGY OF SPEECH," CONTAINS INFORMATION ON THE SPEECH ORGANS AND THEIR FUNCTIONS IN THE PRODUCTION OF SOUNDS. THE MAIN POINTS OF "ROBERT'S RULES OF ORDER" ARE OUTLINED IN CHAPTER 2. CHAPTER 3 GIVES ATTENTION TO OUTLINING…

  5. Speech after Mao: Literature and Belonging

    Science.gov (United States)

    Hsieh, Victoria Linda

    2012-01-01

    This dissertation aims to understand the apparent failure of speech in post-Mao literature to fulfill its conventional functions of representation and communication. In order to understand this pattern, I begin by looking back on the utility of speech for nation-building in modern China. In addition to literary analysis of key authors and works,…

  6. Two-microphone Separation of Speech Mixtures

    DEFF Research Database (Denmark)

    2006-01-01

    Matlab source code for underdetermined separation of instantaneous speech mixtures. The algorithm is described in [1] Michael Syskind Pedersen, DeLiang Wang, Jan Larsen and Ulrik Kjems: ''Two-microphone Separation of Speech Mixtures,'' 2006, submitted for journal publication. See also, [2] Michael...
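
    As a rough illustration of the general class of technique (sparse time-frequency masking from two-microphone cues, in the style of DUET, not the algorithm of the cited paper), a minimal Python sketch:

        # Hedged sketch: cluster time-frequency cells by inter-microphone
        # level ratio and mask one channel; illustrative assumptions only.
        import numpy as np
        from scipy.signal import stft, istft

        def two_mic_mask_separation(x1, x2, fs, n_sources=2, nfft=1024):
            _, _, X1 = stft(x1, fs, nperseg=nfft)
            _, _, X2 = stft(x2, fs, nperseg=nfft)
            eps = 1e-12
            # Level ratio per cell; sparsity assumption: each cell is
            # dominated by a single source
            ratio = 20 * np.log10((np.abs(X2) + eps) / (np.abs(X1) + eps))
            # Simple 1-D k-means on the ratios
            centers = np.linspace(ratio.min(), ratio.max(), n_sources)
            for _ in range(20):
                labels = np.argmin(np.abs(ratio[..., None] - centers), axis=-1)
                for k in range(n_sources):
                    if np.any(labels == k):
                        centers[k] = ratio[labels == k].mean()
            # Binary masks applied to the first channel
            sources = []
            for k in range(n_sources):
                _, s = istft(X1 * (labels == k), fs, nperseg=nfft)
                sources.append(s)
            return sources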

  7. CLEFT PALATE. FOUNDATIONS OF SPEECH PATHOLOGY SERIES.

    Science.gov (United States)

    RUTHERFORD, DAVID; WESTLAKE, HAROLD

    DESIGNED TO PROVIDE AN ESSENTIAL CORE OF INFORMATION, THIS BOOK TREATS NORMAL AND ABNORMAL DEVELOPMENT, STRUCTURE, AND FUNCTION OF THE LIPS AND PALATE AND THEIR RELATIONSHIPS TO CLEFT LIP AND CLEFT PALATE SPEECH. PROBLEMS OF PERSONAL AND SOCIAL ADJUSTMENT, HEARING, AND SPEECH IN CLEFT LIP OR CLEFT PALATE INDIVIDUALS ARE DISCUSSED. NASAL RESONANCE…

  8. Treatment Intensity and Childhood Apraxia of Speech

    Science.gov (United States)

    Namasivayam, Aravind K.; Pukonen, Margit; Goshulak, Debra; Hard, Jennifer; Rudzicz, Frank; Rietveld, Toni; Maassen, Ben; Kroll, Robert; van Lieshout, Pascal

    2015-01-01

    Background: Intensive treatment has been repeatedly recommended for the treatment of speech deficits in childhood apraxia of speech (CAS). However, differences in treatment outcomes as a function of treatment intensity have not been systematically studied in this population. Aim: To investigate the effects of treatment intensity on outcome…

  9. How should a speech recognizer work?

    NARCIS (Netherlands)

    Scharenborg, O.E.; Norris, D.G.; Bosch, L.F.M. ten; McQueen, J.M.

    2005-01-01

    Although researchers studying human speech recognition (HSR) and automatic speech recognition (ASR) share a common interest in how information processing systems (human or machine) recognize spoken language, there is little communication between the two disciplines. We suggest that this lack of comm

  10. Building Searchable Collections of Enterprise Speech Data.

    Science.gov (United States)

    Cooper, James W.; Viswanathan, Mahesh; Byron, Donna; Chan, Margaret

    The study has applied speech recognition and text-mining technologies to a set of recorded outbound marketing calls and analyzed the results. Since speaker-independent speech recognition technology results in a significantly lower recognition rate than that found when the recognizer is trained for a particular speaker, a number of post-processing…

  11. Analysing the spontaneous speech of aphasic speakers

    NARCIS (Netherlands)

    Bastiaanse, Y.R.M.; Prins, R.S.

    2004-01-01

    Background: Aphasia has very serious consequences for speech production and, hence, for communication in daily life. Nevertheless, in the standard diagnostic procedures and in clinical practice, analysis of speech production in daily life is usually ignored or is restricted to the scoring of one or

  12. Education in the 80's: Speech Communication.

    Science.gov (United States)

    Friedrich, Gustav W., Ed.

    Taken together, the 20 chapters in this book provide many suggestions, predictions, alternatives, innovations, and improvements in the speech communication curriculum that can be either undertaken or accomplished during the 1980s. The first five chapters speculate positively about the future of speech communication instruction in five of its most…

  13. Modelling context in automatic speech recognition

    NARCIS (Netherlands)

    Wiggers, P.

    2008-01-01

    Speech is at the core of human communication. Speaking and listening come so naturally to us that we do not have to think about them at all. The underlying cognitive processes are very rapid and almost completely subconscious. It is hard, if not impossible, not to understand speech. For computers on the o

  14. Isolated Speech Recognition Using Artificial Neural Networks

    Science.gov (United States)

    2007-11-02

    In this project Artificial Neural Networks are used as a research tool to accomplish Automated Speech Recognition of normal speech. A small size...the results of the first stage of this work are satisfactory and thus the application of artificial neural networks in conjunction with cepstral analysis in isolated word recognition holds promise.

  15. SPEECH SEPARATION ALGORITHM FOR AUDITORY SCENE ANALYSIS

    Institute of Scientific and Technical Information of China (English)

    Huang Xiuxuan; Wei Gang

    2004-01-01

    A simple and efficient algorithm is presented to separate concurrent speeches. The parameters of the mixed speeches are estimated by searching in the neighborhood of the given pitches to minimize the error between the original and the synthetic spectra. The effectiveness of the proposed algorithm in separating close frequencies is demonstrated.
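
    A hedged Python sketch of the spectrum-matching idea in this abstract, refining a rough pitch by searching its neighbourhood for the f0 whose harmonic comb explains the most spectral energy (the estimator of the cited paper differs in its details):

        import numpy as np

        def refine_pitch(frame, fs, f0_rough, search_hz=20.0, step=0.5):
            spec = np.abs(np.fft.rfft(frame * np.hanning(len(frame))))
            df = fs / len(frame)                   # bin spacing in Hz
            best_f0, best_err = f0_rough, np.inf
            for f0 in np.arange(f0_rough - search_hz, f0_rough + search_hz, step):
                if f0 <= 0:
                    continue
                # Synthetic spectrum: keep only the bins at multiples of f0
                synth = np.zeros_like(spec)
                for h in np.arange(f0, fs / 2, f0):
                    k = int(round(h / df))
                    if k < len(synth):
                        synth[k] = spec[k]
                err = np.sum((spec - synth) ** 2)  # unexplained energy
                if err < best_err:
                    best_f0, best_err = f0, err
            return best_f0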

  16. Automatic Blind Syllable Segmentation for Continuous Speech

    OpenAIRE

    Villing, Rudi; Timoney, Joseph; Ward, Tomas

    2004-01-01

    In this paper a simple practical method for blind segmentation of continuous speech into its constituent syllables is presented. This technique which uses amplitude onset velocity and coarse spectral makeup to identify syllable boundaries is tested on a corpus of continuous speech and compared with an established segmentation algorithm. The results show substantial performance benefit using the proposed algorithm.

  17. Speech Genres in Writing Cognitive Artifacts.

    Science.gov (United States)

    Shambaugh, R. Neal

    This paper reports on the analysis of an instructional text on the basis of M. Bakhtin's (1986) notion of speech genres, which is used to theorize the different influences on the writing of an instructional text. Speech genres are used to reveal the multiple voices inherent in any text: the writer's, the reader's, and the text's. The…

  18. Second Language Learners and Speech Act Comprehension

    Science.gov (United States)

    Holtgraves, Thomas

    2007-01-01

    Recognizing the specific speech act ( Searle, 1969) that a speaker performs with an utterance is a fundamental feature of pragmatic competence. Past research has demonstrated that native speakers of English automatically recognize speech acts when they comprehend utterances (Holtgraves & Ashley, 2001). The present research examined whether this…

  19. Treatment intensity and childhood apraxia of speech

    NARCIS (Netherlands)

    Namasivayam, Aravind K.; Pukonen, Margit; Goshulak, Debra; Hard, Jennifer; Rudzicz, Frank; Rietveld, Toni; Maassen, Ben; Kroll, Robert; van Lieshout, Pascal

    2015-01-01

    Background: Intensive treatment has been repeatedly recommended for the treatment of speech deficits in childhood apraxia of speech (CAS). However, differences in treatment outcomes as a function of treatment intensity have not been systematically studied in this population. Aim: To investigate the effe

  20. Speech versus singing: Infants choose happier sounds

    Directory of Open Access Journals (Sweden)

    Mariève Corbeil

    2013-06-01

    Infants prefer speech to non-vocal sounds and to non-human vocalizations, and they prefer happy-sounding speech to neutral speech. They also exhibit an interest in singing, but there is little knowledge of their relative interest in speech and singing. The present study explored infants’ attention to unfamiliar audio samples of speech and singing. In Experiment 1, infants 4-13 months of age were exposed to happy-sounding infant-directed speech versus hummed lullabies by the same woman. They listened significantly longer to the speech, which had considerably greater acoustic variability and expressiveness, than to the lullabies. In Experiment 2, infants of comparable age who heard the lyrics of a Turkish children’s song spoken versus sung in a joyful/happy manner did not exhibit differential listening. Infants in Experiment 3 heard the happily sung lyrics of the Turkish children’s song versus a version that was spoken in an adult-directed or affectively neutral manner. They listened significantly longer to the sung version. Overall, happy voice quality rather than vocal mode (speech or singing) was the principal contributor to infant attention, regardless of age.

  1. Spoken Content Retrieval: Searching Spontaneous Conversational Speech

    NARCIS (Netherlands)

    Kohler, J; Larson, M; Jong, de F.M.G.; Kraaij, W.; Ordelman, R.J.F.

    2008-01-01

    The second workshop on Searching Spontaneous Conversational Speech (SSCS 2008) was held in Singapore on July 24, 2008 in conjunction with the 31st Annual International ACM SIGIR Conference. The goal of the workshop was to bring the speech community and the information retrieval community together. T

  2. Visual speech gestures modulate efferent auditory system.

    Science.gov (United States)

    Namasivayam, Aravind Kumar; Wong, Wing Yiu Stephanie; Sharma, Dinaay; van Lieshout, Pascal

    2015-03-01

    Visual and auditory systems interact at both cortical and subcortical levels. Studies suggest a highly context-specific cross-modal modulation of the auditory system by the visual system. The present study builds on this work by sampling data from 17 young healthy adults to test whether visual speech stimuli evoke different responses in the auditory efferent system compared to visual non-speech stimuli. The descending cortical influences on medial olivocochlear (MOC) activity were indirectly assessed by examining the effects of contralateral suppression of transient-evoked otoacoustic emissions (TEOAEs) at 1, 2, 3 and 4 kHz under three conditions: (a) in the absence of any contralateral noise (Baseline), (b) contralateral noise + observing facial speech gestures related to productions of vowels /a/ and /u/ and (c) contralateral noise + observing facial non-speech gestures related to smiling and frowning. The results are based on 7 individuals whose data met strict recording criteria and indicated a significant difference in TEOAE suppression between observing speech gestures relative to the non-speech gestures, but only at the 1 kHz frequency. These results suggest that observing a speech gesture compared to a non-speech gesture may trigger a difference in MOC activity, possibly to enhance peripheral neural encoding. If such findings can be reproduced in future research, sensory perception models and theories positing the downstream convergence of unisensory streams of information in the cortex may need to be revised.

  3. The Effects of TV on Speech Education

    Science.gov (United States)

    Gocen, Gokcen; Okur, Alpaslan

    2013-01-01

    Generally, the speaking aspect is not properly debated when discussing the positive and negative effects of television (TV), especially on children. So, to highlight this point, this study was first initiated by asking the question: "What are the effects of TV on speech?" and, secondly, by seeking to transform the effects that TV has on speech in…

  4. Childhood Apraxia of Speech Family Start Guide

    Science.gov (United States)

    ... is right for every child with apraxia of speech. Commercial products, programs, apps or kits can be […] What To Look for In an SLP for Your Child: In the United States, speech-language pathologists (SLP) are certified by the American ...

  5. Subjective Quality Measurement of Speech Its Evaluation, Estimation and Applications

    CERN Document Server

    Kondo, Kazuhiro

    2012-01-01

    It is becoming crucial to accurately estimate and monitor speech quality in various ambient environments to guarantee high quality speech communication. This practical hands-on book shows speech intelligibility measurement methods so that the readers can start measuring or estimating speech intelligibility of their own system. The book also introduces subjective and objective speech quality measures, and describes in detail speech intelligibility measurement methods. It introduces a diagnostic rhyme test which uses rhyming word-pairs, and includes: An investigation into the effect of word familiarity on speech intelligibility. Speech intelligibility measurement of localized speech in virtual 3-D acoustic space using the rhyme test. Estimation of speech intelligibility using objective measures, including the ITU standard PESQ measures, and automatic speech recognizers.

  6. Speech Emotion Recognition Using Fuzzy Logic Classifier

    Directory of Open Access Journals (Sweden)

    Daniar aghsanavard

    2016-01-01

    Over the last two decades, detecting emotions has been one of the most significant issues in speech recognition and signal processing, and a variety of detection techniques have been adopted, each with advantages and disadvantages. This paper proposes fuzzy speech emotion recognition based on the classification of speech signals, aiming at better recognition and higher speed. The system uses a five-layer fuzzy logic system that combines a progressive neural network with firefly algorithm optimization: speech samples are first fed to the input of the fuzzy stage, and the signals are then examined and given a preliminary classification within a fuzzy framework. In this model, a pattern is created for each class of signals, which reduces the dimensionality of the signal data and makes speech recognition easier. The experimental results show that the proposed method (classification aided by the firefly algorithm) improves the recognition of utterances.

  7. Modeling speech intelligibility in adverse conditions

    DEFF Research Database (Denmark)

    Dau, Torsten

    2012-01-01

    … understanding speech when more than one person is talking, even when reduced audibility has been fully compensated for by a hearing aid. The reasons for these difficulties are not well understood. This presentation highlights recent concepts of the monaural and binaural signal processing strategies employed by the normal as well as the impaired auditory system. Jørgensen and Dau [(2011). J. Acoust. Soc. Am. 130, 1475-1487] proposed the speech-based envelope power spectrum model (sEPSM) in an attempt to overcome the limitations of the classical speech transmission index (STI) and speech intelligibility index (SII) in conditions with nonlinearly processed speech. Instead of considering the reduction of the temporal modulation energy as the intelligibility metric, as assumed in the STI, the sEPSM applies the signal-to-noise ratio in the envelope domain (SNRenv). This metric was shown to be the key for predicting…
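
    The SNRenv idea can be illustrated with a deliberately simplified Python sketch (one audio band and one crude modulation band; this is not the published sEPSM):

        import numpy as np
        from scipy.signal import hilbert, butter, filtfilt, resample_poly

        def snr_env_db(noisy, noise, fs, mod_band=(1.0, 8.0), fs_env=100):
            """Envelope-domain SNR (dB) of noisy speech, given the noise alone."""
            def env_power(x):
                env = np.abs(hilbert(x))              # temporal envelope
                env = resample_poly(env, fs_env, fs)  # keep slow modulations
                env = env - env.mean()
                b, a = butter(2, [mod_band[0] / (fs_env / 2),
                                  mod_band[1] / (fs_env / 2)], btype='band')
                return np.mean(filtfilt(b, a, env) ** 2)
            p_mix, p_noise = env_power(noisy), env_power(noise)
            # Envelope power of speech approximated by the excess over noise
            snr_env = max(p_mix - p_noise, 1e-12) / p_noise
            return 10.0 * np.log10(snr_env)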

  8. Speech enhancement on smartphone voice recording

    Science.gov (United States)

    Tris Atmaja, Bagus; Nur Farid, Mifta; Arifianto, Dhany

    2016-11-01

    Speech enhancement is a challenging task in audio signal processing: the quality of the target speech signal must be enhanced while other noises are suppressed. Speech enhancement algorithms have evolved rapidly, from spectral subtraction and Wiener filtering through the spectral amplitude MMSE estimator to Non-negative Matrix Factorization (NMF). The smartphone, a revolutionary device, is now used in all aspects of life, including journalism, both personally and professionally. Although many smartphones have two microphones (main and rear), only the main microphone is widely used for voice recording, which is why the NMF algorithm is widely used for this kind of speech enhancement. This paper evaluates speech enhancement of smartphone voice recordings using the algorithms mentioned above. We also extend the NMF algorithm to Kullback-Leibler NMF with supervised separation. The last algorithm shows improved results compared to the others, as evaluated by spectrogram and PESQ score.
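
    A minimal Python sketch of supervised NMF separation as outlined above (learn spectral bases for speech and noise, then keep the speech part of a noisy spectrogram via a soft mask); inputs are assumed to be magnitude spectrograms, and this is illustrative rather than the authors' system:

        import numpy as np

        def kl_nmf(V, rank, n_iter=100, W=None):
            # Multiplicative updates for KL-divergence NMF, V ~ W @ H
            rng = np.random.default_rng(0)
            F, T = V.shape
            fixed_W = W is not None
            if W is None:
                W = rng.random((F, rank)) + 1e-3
            H = rng.random((W.shape[1], T)) + 1e-3
            eps = 1e-12
            for _ in range(n_iter):
                WH = W @ H + eps
                H *= (W.T @ (V / WH)) / (W.T @ np.ones_like(V) + eps)
                if not fixed_W:
                    WH = W @ H + eps
                    W *= ((V / WH) @ H.T) / (np.ones_like(V) @ H.T + eps)
            return W, H

        def enhance(noisy_mag, speech_mag, noise_mag, rank=20):
            Ws, _ = kl_nmf(speech_mag, rank)          # speech bases (training)
            Wn, _ = kl_nmf(noise_mag, rank)           # noise bases (training)
            W = np.hstack([Ws, Wn])                   # fixed joint dictionary
            _, H = kl_nmf(noisy_mag, None, W=W)       # learn activations only
            mask = (Ws @ H[:rank]) / (W @ H + 1e-12)  # Wiener-like soft mask
            # The noisy phase would be reused when resynthesizing the waveform
            return mask * noisy_mag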

  9. [Improving the speech with a prosthetic construction].

    Science.gov (United States)

    Stalpers, M J; Engelen, M; van der Stappen, J A A M; Weijs, W L J; Takes, R P; van Heumen, C C M

    2016-03-01

    A 12-year-old boy had problems with his speech due to a defect in the soft palate. This defect was caused by the surgical removal of a synovial sarcoma. Testing with a nasometer revealed hypernasality above normal values. Given the size and severity of the defect in the soft palate, the possibility of improving the speech with speech therapy was limited. At a centre for special dentistry an attempt was made with a prosthetic construction to improve the performance of the palate and, in that way, the speech. This construction consisted of a denture with an obturator attached to it. With it, an effective closure of the palate could be achieved. New measurements with acoustic nasometry showed scores within the normal values. The nasality in the speech largely disappeared. The obturator is an effective and relatively easy solution for palatal insufficiency resulting from surgical resection. Intrusive reconstructive surgery can be avoided in this way.

  10. Speech Intelligibility Evaluation for Mobile Phones

    DEFF Research Database (Denmark)

    Jørgensen, Søren; Cubick, Jens; Dau, Torsten

    2015-01-01

    In the development process of modern telecommunication systems, such as mobile phones, it is common practice to use computer models to objectively evaluate the transmission quality of the system, instead of time-consuming perceptual listening tests. Such models have typically focused on the quality of the transmitted speech, while little or no attention has been provided to speech intelligibility. The present study investigated to what extent three state-of-the-art speech intelligibility models could predict the intelligibility of noisy speech transmitted through mobile phones. Sentences from the Danish … and model predictions were compared to the perceptual data. Statistically significant differences between the intelligibility of the three phones were found in stationary speech-shaped noise. A good correspondence between the measured data and the predictions from one of the three models was found in all …

  11. The Functional Connectome of Speech Control.

    Directory of Open Access Journals (Sweden)

    Stefan Fuertinger

    2015-07-01

    In the past few years, several studies have been directed to understanding the complexity of functional interactions between different brain regions during various human behaviors. Among these, neuroimaging research installed the notion that speech and language require an orchestration of brain regions for comprehension, planning, and integration of a heard sound with a spoken word. However, these studies have been largely limited to mapping the neural correlates of separate speech elements and examining distinct cortical or subcortical circuits involved in different aspects of speech control. As a result, the complexity of the brain network machinery controlling speech and language remained largely unknown. Using graph theoretical analysis of functional MRI (fMRI) data in healthy subjects, we quantified the large-scale speech network topology by constructing functional brain networks of increasing hierarchy from the resting state to motor output of meaningless syllables to complex production of real-life speech as well as compared to non-speech-related sequential finger tapping and pure tone discrimination networks. We identified a segregated network of highly connected local neural communities (hubs) in the primary sensorimotor and parietal regions, which formed a commonly shared core hub network across the examined conditions, with the left area 4p playing an important role in speech network organization. These sensorimotor core hubs exhibited features of flexible hubs based on their participation in several functional domains across different networks and ability to adaptively switch long-range functional connectivity depending on task content, resulting in a distinct community structure of each examined network. Specifically, compared to other tasks, speech production was characterized by the formation of six distinct neural communities with specialized recruitment of the prefrontal cortex, insula, putamen, and thalamus, which collectively…

  12. The Functional Connectome of Speech Control.

    Science.gov (United States)

    Fuertinger, Stefan; Horwitz, Barry; Simonyan, Kristina

    2015-07-01

    In the past few years, several studies have been directed to understanding the complexity of functional interactions between different brain regions during various human behaviors. Among these, neuroimaging research installed the notion that speech and language require an orchestration of brain regions for comprehension, planning, and integration of a heard sound with a spoken word. However, these studies have been largely limited to mapping the neural correlates of separate speech elements and examining distinct cortical or subcortical circuits involved in different aspects of speech control. As a result, the complexity of the brain network machinery controlling speech and language remained largely unknown. Using graph theoretical analysis of functional MRI (fMRI) data in healthy subjects, we quantified the large-scale speech network topology by constructing functional brain networks of increasing hierarchy from the resting state to motor output of meaningless syllables to complex production of real-life speech as well as compared to non-speech-related sequential finger tapping and pure tone discrimination networks. We identified a segregated network of highly connected local neural communities (hubs) in the primary sensorimotor and parietal regions, which formed a commonly shared core hub network across the examined conditions, with the left area 4p playing an important role in speech network organization. These sensorimotor core hubs exhibited features of flexible hubs based on their participation in several functional domains across different networks and ability to adaptively switch long-range functional connectivity depending on task content, resulting in a distinct community structure of each examined network. Specifically, compared to other tasks, speech production was characterized by the formation of six distinct neural communities with specialized recruitment of the prefrontal cortex, insula, putamen, and thalamus, which collectively forged the formation

  13. Inconsistency of speech in children with childhood apraxia of speech, phonological disorders, and typical speech

    Science.gov (United States)

    Iuzzini, Jenya

    There is a lack of agreement on the features used to differentiate Childhood Apraxia of Speech (CAS) from Phonological Disorders (PD). One criterion which has gained consensus is lexical inconsistency of speech (ASHA, 2007); however, no accepted measure of this feature has been defined. Although lexical assessment provides information about consistency of an item across repeated trials, it may not capture the magnitude of inconsistency within an item. In contrast, segmental analysis provides more extensive information about consistency of phoneme usage across multiple contexts and word-positions. The current research compared segmental and lexical inconsistency metrics in preschool-aged children with PD, CAS, and typical development (TD) to determine how inconsistency varies with age in typical and disordered speakers, and whether CAS and PD were differentiated equally well by both assessment levels. Whereas lexical and segmental analyses may be influenced by listener characteristics or speaker intelligibility, the acoustic signal is less vulnerable to these factors. In addition, the acoustic signal may reveal information which is not evident in the perceptual signal. A second focus of the current research was motivated by Blumstein et al.'s (1980) classic study on voice onset time (VOT) in adults with acquired apraxia of speech (AOS) which demonstrated a motor impairment underlying AOS. In the current study, VOT analyses were conducted to determine the relationship between age and group with the voicing distribution for bilabial and alveolar plosives. Findings revealed that 3-year-olds evidenced significantly higher inconsistency than 5-year-olds; segmental inconsistency approached 0% in 5-year-olds with TD, whereas it persisted in children with PD and CAS, suggesting that for children in this age range, inconsistency is a feature of speech disorder rather than typical development (Holm et al., 2007). Likewise, whereas segmental and lexical inconsistency were
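
    A heavily simplified Python sketch of an automatic VOT estimate (burst onset from a broadband energy rise, voicing onset from sustained periodicity); the thresholds are assumptions and this is not the dissertation's measurement protocol:

        import numpy as np

        def estimate_vot_ms(x, fs, frame_ms=5):
            n = int(fs * frame_ms / 1000)
            frames = [x[i:i + n] for i in range(0, len(x) - n, n)]
            energy = np.array([np.sum(f ** 2) for f in frames])
            burst = int(np.argmax(energy > 0.1 * energy.max()))  # first loud frame
            lo, hi = int(fs / 400), int(fs / 75)   # plausible pitch lags, 75-400 Hz
            win = int(0.025 * fs)                  # 25 ms voicing analysis window
            def voiced(seg):
                seg = seg - seg.mean()
                ac = np.correlate(seg, seg, mode='full')[len(seg) - 1:]
                return ac[0] > 0 and ac[lo:hi].max() / ac[0] > 0.5
            for i in range(burst + 1, len(frames)):
                seg = x[i * n:i * n + win]
                if len(seg) < win:
                    break
                if voiced(seg):
                    return (i - burst) * frame_ms  # VOT in milliseconds
            return None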

  14. Detection and Separation of Speech Event Using Audio and Video Information Fusion and Its Application to Robust Speech Interface

    Directory of Open Access Journals (Sweden)

    Futoshi Asano

    2004-09-01

    A method of detecting speech events in a multiple-sound-source condition using audio and video information is proposed. For detecting speech events, sound localization using a microphone array and human tracking by stereo vision are combined using a Bayesian network. From the inference results of the Bayesian network, information on the time and location of speech events can be obtained. The information on the detected speech events is then utilized in the robust speech interface. A maximum likelihood adaptive beamformer is employed as a preprocessor of the speech recognizer to separate the speech signal from environmental noise. The coefficients of the beamformer are kept updated based on the information of the speech events. The information on the speech events is also used by the speech recognizer for extracting the speech segment.
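
    The fusion step can be illustrated by a minimal Bayes-rule fragment in Python, treating the audio and video observation streams as conditionally independent given the speaker position (the paper's full Bayesian network is richer than this):

        import numpy as np

        def fuse_speech_event(p_audio, p_video, prior=None):
            """p_audio, p_video: likelihoods of the audio/video evidence
            for each candidate speaker position (1-D arrays)."""
            p_audio, p_video = np.asarray(p_audio), np.asarray(p_video)
            if prior is None:
                prior = np.ones_like(p_audio) / len(p_audio)
            post = p_audio * p_video * prior   # conditional independence
            return post / post.sum()

        # Hypothetical example: audio weakly, video strongly at position 2
        print(fuse_speech_event([0.2, 0.3, 0.5], [0.05, 0.1, 0.85]))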

  15. Clinical and audiological features of a syndrome with deterioration in speech recognition out of proportion to pure hearing loss

    Directory of Open Access Journals (Sweden)

    Abdi S

    2007-04-01

    Background: The objective of this study was to describe the audiologic and related characteristics of a group of patients whose speech perception was affected out of proportion to their pure-tone hearing loss. A case series of patients was referred for evaluation and management to the Hearing Research Center. The key clinical feature was hearing loss for pure tones with a reduction in speech discrimination out of proportion to the pure-tone loss, meeting some of the criteria of auditory neuropathy (i.e., normal otoacoustic emissions, OAE, and abnormal auditory brainstem evoked potentials, ABR) while lacking others (e.g., present auditory reflexes). Methods: Hearing abilities were measured by Pure Tone Audiometry (PTA) and Speech Discrimination Scores (SDS), measured in all patients using a standardized list of 25 monosyllabic Farsi words at MCL in quiet. Auditory pathway integrity was assessed using the Auditory Brainstem Response (ABR) and Otoacoustic Emissions (OAE), and anatomical lesions were sought with Computed Tomography (CT) and Magnetic Resonance Imaging (MRI) of the brain and retrocochlea. The series comprised 35 patients who had SDS disproportionately low with regard to PTA, absent ABR waves, and normal OAE. Results: All patients reported the onset of their problem around adolescence. None of them had an anatomical lesion on imaging studies, and none had any finding suggestive of a conductive hearing lesion. Although in most of the cases the hearing loss was more apparent in the lower frequencies (i.e., 1000 Hz and below), a stronger correlation was found between SDS and hearing threshold at higher frequencies. These patients may not benefit from hearing aids: as the outer hair cells are functional, amplification does not seem to help, though it was tried for all. Conclusion: These patients share a pattern of sensorineural loss with no detectable lesion. The age of onset and the gradual…

  16. Development of The Viking Speech Scale to classify the speech of children with cerebral palsy.

    Science.gov (United States)

    Pennington, Lindsay; Virella, Daniel; Mjøen, Tone; da Graça Andrada, Maria; Murray, Janice; Colver, Allan; Himmelmann, Kate; Rackauskaite, Gija; Greitane, Andra; Prasauskiene, Audrone; Andersen, Guro; de la Cruz, Javier

    2013-10-01

    Surveillance registers monitor the prevalence of cerebral palsy and the severity of resulting impairments across time and place. The motor disorders of cerebral palsy can affect children's speech production and limit their intelligibility. We describe the development of a scale to classify children's speech performance for use in cerebral palsy surveillance registers, and its reliability across raters and across time. Speech and language therapists, other healthcare professionals and parents classified the speech of 139 children with cerebral palsy (85 boys, 54 girls; mean age 6.03 years, SD 1.09) from observation and previous knowledge of the children. Another group of health professionals rated children's speech from information in their medical notes. With the exception of parents, raters reclassified children's speech at least four weeks after their initial classification. Raters were asked to rate how easy the scale was to use and how well the scale described the child's speech production using Likert scales. Inter-rater reliability was moderate to substantial (k>.58 for all comparisons). Test-retest reliability was substantial to almost perfect for all groups (k>.68). Over 74% of raters found the scale easy or very easy to use; 66% of parents and over 70% of health care professionals judged the scale to describe children's speech well or very well. We conclude that the Viking Speech Scale is a reliable tool to describe the speech performance of children with cerebral palsy, which can be applied through direct observation of children or through case note review.

  17. Relationship between Chinese speech intelligibility and speech transmission index in rooms using dichotic listening

    Institute of Scientific and Technical Information of China (English)

    PENG JianXin

    2008-01-01

    Speech intelligibility (SI) is an important index for the design and assessment of halls intended for speech. The relationship between Chinese speech intelligibility scores in rooms and the speech transmission index (STI) under diotic listening conditions was studied in a previous paper, using monaural room impulse responses obtained from the room acoustics simulation software Odeon. The present study employs simulated binaural room impulse responses and auralization techniques to obtain subjective Chinese speech intelligibility scores using a rhyme test. The relationship between Chinese speech intelligibility scores and STI is established and validated in rooms using dichotic (binaural) listening. The results show a high correlation between Chinese speech intelligibility scores and STI under dichotic listening. The relationship between Chinese speech intelligibility scores and STI under diotic and dichotic listening conditions is also analyzed. Compared with diotic listening, dichotic (binaural) listening (an actual listening situation) improves the signal-to-noise ratio for Mandarin Chinese speech intelligibility by 2.7 dB. The STI method can predict and evaluate speech intelligibility for Mandarin Chinese in rooms under dichotic (binaural) listening.
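
    The kind of intelligibility-versus-STI relationship such studies establish is commonly summarized by fitting a logistic psychometric function; a Python sketch with hypothetical data points:

        import numpy as np
        from scipy.optimize import curve_fit

        def logistic(sti, a, b):
            # % correct as a logistic function of STI
            return 100.0 / (1.0 + np.exp(-a * (sti - b)))

        # Hypothetical (STI, % correct) pairs from a rhyme test
        sti = np.array([0.30, 0.40, 0.50, 0.60, 0.70, 0.80])
        score = np.array([35.0, 55.0, 72.0, 85.0, 93.0, 97.0])
        params, _ = curve_fit(logistic, sti, score, p0=[10.0, 0.5])
        print("predicted score at STI=0.55: %.1f%%" % logistic(0.55, *params))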

  18. Speech identification in noise: Contribution of temporal, spectral, and visual speech cues.

    Science.gov (United States)

    Kim, Jeesun; Davis, Chris; Groot, Christopher

    2009-12-01

    This study investigated the degree to which two types of reduced auditory signals (cochlear implant simulations) and visual speech cues combined for speech identification. The auditory speech stimuli were filtered to have only amplitude envelope cues or both amplitude envelope and spectral cues and were presented with/without visual speech. In Experiment 1, IEEE sentences were presented in quiet and noise. For in-quiet presentation, speech identification was enhanced by the addition of both spectral and visual speech cues. Due to a ceiling effect, the degree to which these effects combined could not be determined. In noise, these facilitation effects were more marked and were additive. Experiment 2 examined consonant and vowel identification in the context of CVC or VCV syllables presented in noise. For consonants, both spectral and visual speech cues facilitated identification and these effects were additive. For vowels, the effect of combined cues was underadditive, with the effect of spectral cues reduced when presented with visual speech cues. Analysis indicated that without visual speech, spectral cues facilitated the transmission of place information and vowel height, whereas with visual speech, they facilitated lip rounding, with little impact on the transmission of place information.

  19. The Effect of English Verbal Songs on Connected Speech Aspects of Adult English Learners’ Speech Production

    Directory of Open Access Journals (Sweden)

    Farshid Tayari Ashtiani

    2015-02-01

    The present study was an attempt to investigate the impact of English verbal songs on the connected speech aspects of adult English learners' speech production. 40 participants were selected based on their performance on a piloted and validated version of the NELSON test given to 60 intermediate English learners in a language institute in Tehran. They were then equally divided into a control and an experimental group and received a validated pretest of reading aloud and speaking in English. Afterward, the treatment was performed in 18 sessions by singing preselected songs culled based on criteria such as popularity, familiarity, and amount and speed of speech delivery. In the end, the posttests of reading aloud and speaking in English were administered. The results revealed that the treatment had statistically positive effects on the connected speech aspects of English learners' speech production at the .05 level of significance. Meanwhile, the results showed that there was no significant difference between the experimental group's mean scores on the posttests of reading aloud and speaking. It was thus concluded that providing EFL learners with English verbal songs can positively affect connected speech aspects of both modes of speech production, reading aloud and speaking. The findings of this study have pedagogical implications for language teachers, who should be more aware and knowledgeable of the benefits of verbal songs for promoting language learners' speech production in terms of naturalness and fluency. Keywords: English Verbal Songs, Connected Speech, Speech Production, Reading Aloud, Speaking

  20. An articulatorily constrained, maximum entropy approach to speech recognition and speech coding

    Energy Technology Data Exchange (ETDEWEB)

    Hogden, J.

    1996-12-31

    Hidden Markov models (HMM's) are among the most popular tools for performing computer speech recognition. One of the primary reasons that HMM's typically outperform other speech recognition techniques is that the parameters used for recognition are determined by the data, not by preconceived notions of what the parameters should be. This makes HMM's better able to deal with intra- and inter-speaker variability despite the limited knowledge of how speech signals vary and despite the often limited ability to correctly formulate rules describing variability and invariance in speech. In fact, it is often the case that when HMM parameter values are constrained using the limited knowledge of speech, recognition performance decreases. However, the structure of an HMM has little in common with the mechanisms underlying speech production. Here, the author argues that by using probabilistic models that more accurately embody the process of speech production, he can create models that have all the advantages of HMM's, but that should more accurately capture the statistical properties of real speech samples--presumably leading to more accurate speech recognition. The model he will discuss uses the fact that speech articulators move smoothly and continuously. Before discussing how to use articulatory constraints, he will give a brief description of HMM's. This will allow him to highlight the similarities and differences between HMM's and the proposed technique.

  1. E-learning-based speech therapy: a web application for speech training.

    Science.gov (United States)

    Beijer, Lilian J; Rietveld, Toni C M; van Beers, Marijn M A; Slangen, Robert M L; van den Heuvel, Henk; de Swart, Bert J M; Geurts, Alexander C H

    2010-03-01

    In The Netherlands, a web application for speech training, E-learning-based speech therapy (EST), has been developed for patients with dysarthria, a speech disorder resulting from acquired neurological impairments such as stroke or Parkinson's disease. In this report, the EST infrastructure and its potentials for both therapists and patients are elucidated. EST provides patients with dysarthria the opportunity to engage in intensive speech training in their own environment, in addition to undergoing the traditional face-to-face therapy. Moreover, patients with chronic dysarthria can use EST to independently maintain the quality of their speech once the face-to-face sessions with their speech therapist have been completed. This telerehabilitation application allows therapists to remotely compose speech training programs tailored to suit each individual patient. Moreover, therapists can remotely monitor and evaluate changes in the patient's speech. In addition to its value as a device for composing, monitoring, and carrying out web-based speech training, the EST system compiles a database of dysarthric speech. This database is vital for further scientific research in this area.

  2. Speech Characteristics Associated with Three Genotypes of Ataxia

    Science.gov (United States)

    Sidtis, John J.; Ahn, Ji Sook; Gomez, Christopher; Sidtis, Diana

    2011-01-01

    Purpose: Advances in neurobiology are providing new opportunities to investigate the neurological systems underlying motor speech control. This study explores the perceptual characteristics of the speech of three genotypes of spinocerebellar ataxia (SCA) as manifest in four different speech tasks. Methods: Speech samples from 26 speakers with SCA…

  3. Tracking Change in Children with Severe and Persisting Speech Difficulties

    Science.gov (United States)

    Newbold, Elisabeth Joy; Stackhouse, Joy; Wells, Bill

    2013-01-01

    Standardised tests of whole-word accuracy are popular in the speech pathology and developmental psychology literature as measures of children's speech performance. However, they may not be sensitive enough to measure changes in speech output in children with severe and persisting speech difficulties (SPSD). To identify the best ways of doing this,…

  4. The Interpersonal Metafunction Analysis of Barack Obama's Victory Speech

    Science.gov (United States)

    Ye, Ruijuan

    2010-01-01

    This paper carries out a tentative analysis of Barack Obama's victory speech from the perspective of the interpersonal metafunction, aiming to help readers understand and evaluate the speech regarding its suitability, and thus to provide some guidance for readers to make better speeches. This study has promising implications for speeches as…

  5. Monkey Lipsmacking Develops Like the Human Speech Rhythm

    Science.gov (United States)

    Morrill, Ryan J.; Paukner, Annika; Ferrari, Pier F.; Ghazanfar, Asif A.

    2012-01-01

    Across all languages studied to date, audiovisual speech exhibits a consistent rhythmic structure. This rhythm is critical to speech perception. Some have suggested that the speech rhythm evolved "de novo" in humans. An alternative account--the one we explored here--is that the rhythm of speech evolved through the modification of rhythmic facial…

  6. Transcribing Disordered Speech: By Target or by Production?

    Science.gov (United States)

    Ball, Martin J.

    2008-01-01

    The ability to transcribe disordered speech is a vital tool for speech-language pathologists, as an accurate description of a client's speech output is needed for both diagnosis and effective intervention. Clients in the speech clinic often use sounds that are not part of the target sound system and which may, in some cases, be sounds not found in…

  7. The Effectiveness of Clear Speech as a Masker

    Science.gov (United States)

    Calandruccio, Lauren; Van Engen, Kristin; Dhar, Sumitrajit; Bradlow, Ann R.

    2010-01-01

    Purpose: It is established that speaking clearly is an effective means of enhancing intelligibility. Because any signal-processing scheme modeled after known acoustic-phonetic features of clear speech will likely affect both target and competing speech, it is important to understand how speech recognition is affected when a competing speech signal…

  8. The New Findings Made in Speech Act Theory

    Institute of Scientific and Technical Information of China (English)

    管彦波

    2007-01-01

    Through carefully studying the theory of speech acts and the literature concerning it, the author made some new findings, which are reflected in three aspects: the similarities and differences between Chinese and English in expressing the same speech act, the relations between different types of speech acts, and the correspondence between sentence sets and sets of speech acts.

  9. Implementation of a Speech Improvement Program at the Kindergarten Level.

    Science.gov (United States)

    Green, Robert A.

    Evaluated was a speech improvement program for kindergarten students in which speech improvement lessons were summarized for teachers, and the services of itinerant speech therapists were shared by classroom teachers. Teacher and therapist agreed upon specific speech lessons which were conducted on a weekly basis. Program development involved…

  10. Emotion recognition from speech: tools and challenges

    Science.gov (United States)

    Al-Talabani, Abdulbasit; Sellahewa, Harin; Jassim, Sabah A.

    2015-05-01

    Human emotion recognition from speech is studied frequently for its importance in many applications, e.g. human-computer interaction. There is wide diversity and little agreement about the basic emotions or emotion-related states on the one hand, and about where the emotion-related information lies in the speech signal on the other. These diversities motivate our investigations into extracting meta-features using the PCA approach, or using a non-adaptive random projection (RP), which significantly reduce the large-dimensional speech feature vectors that may contain a wide range of emotion-related information. Subsets of meta-features are fused to increase the performance of the recognition model that adopts the score-based LDC classifier. We demonstrate that our scheme outperforms state-of-the-art results when tested on non-prompted databases or acted databases (i.e. when subjects act specific emotions while uttering a sentence). However, the huge gap between the accuracy rates achieved on the different types of speech datasets raises questions about the way emotions modulate speech. In particular, we argue that emotion recognition from speech should not be dealt with as a classification problem. We demonstrate the presence of a spectrum of different emotions in the same speech portion, especially in the non-prompted datasets, which tend to be more "natural" than the acted datasets, where the subjects attempt to suppress all but one emotion.
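
    Both reductions mentioned above are standard techniques; a minimal Python sketch of PCA meta-features (via SVD) and a non-adaptive Gaussian random projection over a matrix of speech feature vectors:

        import numpy as np

        def pca_meta_features(X, k):
            # X: (n_samples, n_features); returns the top-k principal projections
            Xc = X - X.mean(axis=0)
            _, _, Vt = np.linalg.svd(Xc, full_matrices=False)
            return Xc @ Vt[:k].T

        def random_projection(X, k, seed=0):
            # Johnson-Lindenstrauss-style non-adaptive projection
            rng = np.random.default_rng(seed)
            R = rng.normal(size=(X.shape[1], k)) / np.sqrt(k)
            return X @ R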

  11. Hidden Markov models in automatic speech recognition

    Science.gov (United States)

    Wrzoskowicz, Adam

    1993-11-01

    This article describes a method for constructing an automatic speech recognition system based on hidden Markov models (HMMs). The author discusses the basic concepts of HMM theory and the application of these models to the analysis and recognition of speech signals. The author provides algorithms which make it possible to train the ASR system and recognize signals on the basis of distinct stochastic models of selected speech sound classes. The author describes the specific components of the system and the procedures used to model and recognize speech. The author discusses problems associated with the choice of optimal signal detection and parameterization characteristics and their effect on the performance of the system. The author presents different options for the choice of speech signal segments and their consequences for the ASR process. The author gives special attention to the use of lexical, syntactic, and semantic information for the purpose of improving the quality and efficiency of the system. The author also describes an ASR system developed by the Speech Acoustics Laboratory of the IBPT PAS. The author discusses the results of experiments on the effect of noise on the performance of the ASR system and describes methods of constructing HMM's designed to operate in a noisy environment. The author also describes a language for human-robot communications which was defined as a complex multilevel network from an HMM model of speech sounds geared towards Polish inflections. The author also added mandatory lexical and syntactic rules to the system for its communications vocabulary.
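
    The core HMM computation referred to throughout, scoring an observation sequence against a model, is the forward algorithm; a minimal Python sketch for discrete emissions (real recognizers add scaling, Viterbi decoding, and language models):

        import numpy as np

        def forward_likelihood(A, B, pi, obs):
            # A: (N, N) transitions, B: (N, M) emission probabilities,
            # pi: (N,) initial state probabilities, obs: symbol indices
            alpha = pi * B[:, obs[0]]
            for o in obs[1:]:
                alpha = (alpha @ A) * B[:, o]  # induction step
            return float(alpha.sum())          # P(obs | model)

        # Tiny worked example with two states and two symbols
        A = np.array([[0.7, 0.3], [0.4, 0.6]])
        B = np.array([[0.9, 0.1], [0.2, 0.8]])
        pi = np.array([0.5, 0.5])
        print(forward_likelihood(A, B, pi, [0, 1, 0]))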

  12. Auditory free classification of nonnative speech

    Science.gov (United States)

    Atagi, Eriko; Bent, Tessa

    2013-01-01

    Through experience with speech variability, listeners build categories of indexical speech characteristics including categories for talker, gender, and dialect. The auditory free classification task—a task in which listeners freely group talkers based on audio samples—has been a useful tool for examining listeners’ representations of some of these characteristics including regional dialects and different languages. The free classification task was employed in the current study to examine the perceptual representation of nonnative speech. The category structure and salient perceptual dimensions of nonnative speech were investigated from two perspectives: general similarity and perceived native language background. Talker intelligibility and whether native talkers were included were manipulated to test stimulus set effects. Results showed that degree of accent was a highly salient feature of nonnative speech for classification based on general similarity and on perceived native language background. This salience, however, was attenuated when listeners were listening to highly intelligible stimuli and attending to the talkers’ native language backgrounds. These results suggest that the context in which nonnative speech stimuli are presented—such as the listeners’ attention to the talkers’ native language and the variability of stimulus intelligibility—can influence listeners’ perceptual organization of nonnative speech. PMID:24363470

  13. Temporal modulations in speech and music.

    Science.gov (United States)

    Ding, Nai; Patel, Aniruddh D; Chen, Lin; Butler, Henry; Luo, Cheng; Poeppel, David

    2017-02-14

    Speech and music have structured rhythms. Here we discuss a major acoustic correlate of spoken and musical rhythms, the slow (0.25-32 Hz) temporal modulations in sound intensity, and compare the modulation properties of speech and music. We analyze these modulations using over 25 h of speech and over 39 h of recordings of Western music. We show that the speech modulation spectrum is highly consistent across 9 languages (including languages with typologically different rhythmic characteristics). A different, but similarly consistent modulation spectrum is observed for music, including classical music played by single instruments of different types, symphonic, jazz, and rock. The temporal modulations of speech and music show broad but well-separated peaks around 5 and 2 Hz, respectively. These acoustically dominant time scales may be intrinsic features of speech and music, a possibility which should be investigated using more culturally diverse samples in each domain. Distinct modulation timescales for speech and music could facilitate their perceptual analysis and its neural processing.
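
    A minimal Python sketch of the analysis described, computing the spectrum of the slow modulations of a recording's intensity envelope (the published analysis uses a cochlear filterbank; this single-band version is only illustrative):

        import numpy as np
        from scipy.signal import hilbert, resample_poly

        def modulation_spectrum(x, fs, fs_env=64):
            env = np.abs(hilbert(x))               # intensity envelope
            env = resample_poly(env, fs_env, fs)   # keep slow modulations
            env = env - env.mean()
            spec = np.abs(np.fft.rfft(env * np.hanning(len(env)))) ** 2
            freqs = np.fft.rfftfreq(len(env), 1.0 / fs_env)
            keep = (freqs >= 0.25) & (freqs <= 32.0)
            return freqs[keep], spec[keep]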

  14. Speech-specificity of two audiovisual integration effects

    DEFF Research Database (Denmark)

    Eskelund, Kasper; Tuomainen, Jyrki; Andersen, Tobias

    2010-01-01

    Seeing the talker’s articulatory mouth movements can influence the auditory speech percept both in speech identification and detection tasks. Here we show that these audiovisual integration effects also occur for sine wave speech (SWS), which is an impoverished speech signal that naïve observers often fail to perceive as speech. While audiovisual integration in the identification task only occurred when observers were informed of the speech-like nature of SWS, integration occurred in the detection task both for informed and naïve observers. This shows that both speech-specific and general...

  15. Optimal subband Kalman filter for normal and oesophageal speech enhancement.

    Science.gov (United States)

    Ishaq, Rizwan; García Zapirain, Begoña

    2014-01-01

    This paper presents a single-channel speech enhancement system using subband Kalman filtering, estimating optimal autoregressive (AR) coefficients and variances for speech and noise using Weighted Linear Prediction (WLP) and a Noise Weighting Function (NWF). The system is applied to normal and oesophageal speech signals. The method is evaluated by the Perceptual Evaluation of Speech Quality (PESQ) score and Signal-to-Noise Ratio (SNR) improvement for normal speech, and by the Harmonic-to-Noise Ratio (HNR) for Oesophageal Speech (OES). Compared with previous systems, normal speech shows a 30% increase in PESQ score and a 4 dB SNR improvement, and OES shows a 3 dB HNR improvement.
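
    A minimal Python sketch of the Kalman-filtering core described above, with a full-band AR model estimated by ordinary linear prediction (the paper's subband structure, WLP, and noise weighting function are omitted):

        import numpy as np

        def ar_coefficients(x, p):
            # Autocorrelation (Yule-Walker) linear prediction; a stand-in
            # for the weighted linear prediction used in the paper
            r = np.correlate(x, x, mode='full')[len(x) - 1:len(x) + p]
            R = np.array([[r[abs(i - j)] for j in range(p)] for i in range(p)])
            a = np.linalg.solve(R, r[1:p + 1])
            var = r[0] - a @ r[1:p + 1]       # driving-noise variance estimate
            return a, max(var, 1e-12)

        def kalman_denoise(y, a, q, r):
            # y: noisy speech, a: AR(p) coefficients,
            # q: process noise variance, r: observation noise variance
            p = len(a)
            F = np.vstack([a, np.eye(p)[:-1]])  # companion transition matrix
            H = np.zeros(p); H[0] = 1.0         # we observe the newest sample
            x, P = np.zeros(p), np.eye(p) * q
            out = np.empty(len(y))
            for t, yt in enumerate(y):
                x = F @ x                       # time update (predict)
                P = F @ P @ F.T
                P[0, 0] += q
                k = P @ H / (H @ P @ H + r)     # Kalman gain
                x = x + k * (yt - H @ x)        # measurement update
                P = P - np.outer(k, H) @ P
                out[t] = x[0]                   # filtered speech sample
            return out

        # Usage sketch: a, q = ar_coefficients(frame, p=10)
        #               enhanced = kalman_denoise(noisy, a, q, r=noise_var)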

  16. Investigating Pragmatics of Complaint Speech Acts in English and Chinese

    Institute of Scientific and Technical Information of China (English)

    张颖卉; 李尚哲

    2013-01-01

    The speech act of complaint is an important research subject in pragmatics and is worthy of study among speech acts. With the development of research into speech acts, some scholars have investigated complaints, but they have done little work on complaints in Chinese. Therefore, it is necessary to study the complaint as a speech act in Chinese further. This thesis is based on speech act theory and the politeness principle, as an empirical study of the speech act of complaint in Chinese. It aims to provide a more complete and comprehensive account of participants' production of the speech act of complaint.

  17. Algorithms and Software for Predictive and Perceptual Modeling of Speech

    CERN Document Server

    Atti, Venkatraman

    2010-01-01

    From the early pulse code modulation-based coders to some of the recent multi-rate wideband speech coding standards, the area of speech coding made several significant strides with an objective to attain high quality of speech at the lowest possible bit rate. This book presents some of the recent advances in linear prediction (LP)-based speech analysis that employ perceptual models for narrow- and wide-band speech coding. The LP analysis-synthesis framework has been successful for speech coding because it fits well the source-system paradigm for speech synthesis. Limitations associated with th

  18. Audiovisual integration in speech perception: a multi-stage process

    DEFF Research Database (Denmark)

    Eskelund, Kasper; Tuomainen, Jyrki; Andersen, Tobias

    2011-01-01

    Integration of speech signals from ear and eye is a well-known feature of speech perception. This is evidenced by the McGurk illusion in which visual speech alters auditory speech perception and by the advantage observed in auditory speech detection when a visual signal is present. Here we investigate whether the integration of auditory and visual speech observed in these two audiovisual integration effects are specific traits of speech perception. We further ask whether audiovisual integration is undertaken in a single processing stage or multiple processing stages.

  19. Predicting speech intelligibility in adverse conditions: evaluation of the speech-based envelope power spectrum model

    DEFF Research Database (Denmark)

    2011-01-01

    The speech-based envelope power spectrum model (sEPSM) [Jørgensen and Dau (2011). J. Acoust. Soc. Am., 130 (3), 1475-1487] estimates the envelope signal-to-noise ratio (SNRenv) of distorted speech and accurately describes the speech recognition thresholds (SRT) for normal-hearing listeners. The model predictions are based on the long-term SNRenv. As an attempt to extend the model to deal with fluctuating interferers, a short-time version of the sEPSM is presented. The SNRenv of a speech sample is estimated from a combination of SNRenv-values calculated in short time frames. The model is evaluated in adverse conditions by comparing predictions to measured data from [Kjems et al. (2009). J. Acoust. Soc. Am. 126 (3), 1415-1426] where speech is mixed with four different interferers, including speech-shaped noise, bottle noise, car noise, and cafe noise. The model accounts well for the differences in intelligibility…

  20. Effects of seeing and hearing speech on speech production: a response time study.

    Science.gov (United States)

    Jarick, Michelle; Jones, Jeffery A

    2009-05-01

    Research demonstrates that listening to and viewing speech excites tongue and lip motor areas involved in speech production. This perceptual-motor relationship was investigated behaviourally by presenting video clips of a speaker producing vowel-consonant-vowel syllables in three conditions: visual-only, audio-only, and audiovisual. Participants identified target letters that were flashed over the mouth during the video, either manually or verbally as quickly as possible. Verbal responses were fastest when the target matched the speech stimuli in all modality conditions, yet optimal facilitation was observed when participants were presented with visual-only stimuli. Critically, no such facilitation occurred when participants were asked to identify the target manually. Our findings support previous research suggesting a close relationship between speech perception and production by demonstrating that viewing speech can 'prime' our motor system for subsequent speech production.

  1. Sparsity in Linear Predictive Coding of Speech

    DEFF Research Database (Denmark)

    Giacobello, Daniele

    …of the effectiveness of their application in audio processing. The second part of the thesis deals with introducing sparsity directly in the linear prediction analysis-by-synthesis (LPAS) speech coding paradigm. We first propose a novel near-optimal method to look for a sparse approximate excitation using a compressed sensing formulation. Furthermore, we define a novel re-estimation procedure to adapt the predictor coefficients to the given sparse excitation, balancing the two representations in the context of speech coding. Finally, the advantages of the compact parametric representation of a segment of speech, given…

  2. Personality in speech assessment and automatic classification

    CERN Document Server

    Polzehl, Tim

    2015-01-01

    This work combines interdisciplinary knowledge and experience from research fields of psychology, linguistics, audio-processing, machine learning, and computer science. The work systematically explores a novel research topic devoted to automated modeling of personality expression from speech. For this aim, it introduces a novel personality assessment questionnaire and presents the results of extensive labeling sessions to annotate the speech data with personality assessments. It provides estimates of the Big 5 personality traits, i.e. openness, conscientiousness, extroversion, agreeableness, and neuroticism. Based on a database built on the questionnaire, the book presents models to tell apart different personality types or classes from speech automatically.

  3. Hate Speech Revisited: The "Toon" Controversy

    Directory of Open Access Journals (Sweden)

    Rajeev Dhavan

    2010-01-01

    Examining the cartoon controversy that ignited violent protests and bans in various countries, this article traces the contours of "hate speech" in various legal systems. While broadly supporting the case for free speech, the authors remind users of free speech to exercise self-restraint. Absolute bans should not be made, but time, person and place constraints may be essential. Ironically, the toon controversy also reveals the silence of the sympathetic majority; there is, similarly, a duty to speak. Even though not enforceable, it remains a duty to democracy.

  4. A Universal Part-of-Speech Tagset

    CERN Document Server

    Petrov, Slav; McDonald, Ryan

    2011-01-01

    To facilitate future research in unsupervised induction of syntactic structure and to standardize best-practices, we propose a tagset that consists of twelve universal part-of-speech categories. In addition to the tagset, we develop a mapping from 25 different treebank tagsets to this universal set. As a result, when combined with the original treebank data, this universal tagset and mapping produce a dataset consisting of common parts-of-speech for 22 different languages. We highlight the use of this resource via two experiments, including one that reports competitive accuracies for unsupervised grammar induction without gold standard part-of-speech tags.

  5. Speech information retrieval: a review

    Energy Technology Data Exchange (ETDEWEB)

    Hafen, Ryan P.; Henry, Michael J.

    2012-11-01

    Audio is an information-rich component of multimedia. Information can be extracted from audio in a number of different ways, and thus there are several established audio signal analysis research fields. These fields include speech recognition, speaker recognition, audio segmentation and classification, and audio fingerprinting. The information that can be extracted with tools and methods developed in these fields can greatly enhance multimedia systems. In this paper, we present the current state of research in each of the major audio analysis fields. The goal is to introduce enough background for someone new to the field to quickly gain a high-level understanding and to provide direction for further study.

  6. Effects of positive attitude toward giving a speech on cardiovascular and subjective fear responses during speech in anxious subjects.

    Science.gov (United States)

    Hu, S; Romans-Kroll, J M

    1995-10-01

    40 speech-anxious subjects were asked to deliver four speeches during the experiment. The conditions varied according to whether the subjects maintained a positive or neutral attitude toward speech prior to each presentation. Heart rate and self-reports of fear were measured during the experiment. Maintaining a positive attitude prior to delivering a speech reduced both subjective anxiety and cardiovascular responses.

  7. Empathy, Ways of Knowing, and Interdependence as Mediators of Gender Differences in Attitudes toward Hate Speech and Freedom of Speech

    Science.gov (United States)

    Cowan, Gloria; Khatchadourian, Desiree

    2003-01-01

    Women are more intolerant of hate speech than men. This study examined relationality measures as mediators of gender differences in the perception of the harm of hate speech and the importance of freedom of speech. Participants were 107 male and 123 female college students. Questionnaires assessed the perceived harm of hate speech, the importance…

  8. Two Sides of the Same Coin: The Scope of Free Speech and Hate Speech in the College Community.

    Science.gov (United States)

    Schuett, Faye

    2000-01-01

    This article presents the Two Sides interviews, which confront the serious and immediate conflict between free speech and hate speech on college campuses. Dr. Robert O'Neil discusses the scope of free speech in the college community, while Dr. Timothy Shiell focuses on hate speech on campuses. Contains 12 references. (VWC)

  9. The effectiveness of Speech-Music Therapy for Aphasia (SMTA) in five speakers with Apraxia of Speech and aphasia

    NARCIS (Netherlands)

    Hurkmans, Joost; Jonkers, Roel; de Bruijn, Madeleen; Boonstra, Anne M.; Hartman, Paul P.; Arendzen, Hans; Reinders - Messelink, Heelen

    2015-01-01

    Background: Several studies using musical elements in the treatment of neurological language and speech disorders have reported improvement of speech production. One such programme, Speech-Music Therapy for Aphasia (SMTA), integrates speech therapy and music therapy (MT) to treat the individual with

  10. 75 FR 29914 - Telecommunications Relay Services, Speech-to-Speech Services, E911 Requirements for IP-Enabled...

    Science.gov (United States)

    2010-05-28

    ... From the Federal Register Online via the Government Publishing Office FEDERAL COMMUNICATIONS COMMISSION 47 CFR Part 64 Telecommunications Relay Services, Speech-to-Speech Services, E911 Requirements for... requirements associated with the Commission's Telecommunications Relay Services, Speech-to-Speech...

  11. 75 FR 49491 - Telecommunications Relay Services and Speech-to-Speech Services for Individuals With Hearing and...

    Science.gov (United States)

    2010-08-13

    ... COMMISSION Telecommunications Relay Services and Speech-to-Speech Services for Individuals With Hearing and... through June 30, 2011 Interstate Telecommunications Relay Services (TRS) Fund (Fund) year. This action is... summary of the Commission's Telecommunications Relay Services and Speech-to-Speech Services...

  12. CONCURRENT SPEECHES SEPARATION USING WRAPPED DISCRETE FOURIER TRANSFORM

    Institute of Scientific and Technical Information of China (English)

    Zhang Xichun; Li Yunjie; Zhang Jun; Wei Gang

    2005-01-01

    This letter proposes a new method for concurrent voiced speech separation. First, the Wrapped Discrete Fourier Transform (WDFT) is used to decompose the harmonic spectra of the mixed speech signals. Then each individual signal is reconstructed using the sinusoidal speech model. By taking advantage of the non-uniform frequency resolution of the WDFT, the harmonic spectral parameters can be estimated and separated accurately. Experimental results on mixed vowel separation show that the proposed method can recover the original speech signals effectively.
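
    The WDFT itself is too involved for a short sketch, so the snippet below illustrates only the sinusoidal-model reconstruction step, using a plain DFT as a stand-in for the warped analysis (Python/NumPy assumed; the peak count and window are illustrative):

        import numpy as np

        def sinusoidal_resynthesis(frame, fs, n_peaks=20):
            # Rebuild a frame as a sum of its strongest spectral peaks.
            win = np.hanning(len(frame))
            spec = np.fft.rfft(frame * win)
            freqs = np.fft.rfftfreq(len(frame), d=1.0 / fs)
            idx = np.argsort(np.abs(spec))[-n_peaks:]
            t = np.arange(len(frame)) / fs
            y = np.zeros(len(frame))
            for k in idx:
                amp = 2.0 * np.abs(spec[k]) / win.sum()  # approximate amplitude
                y += amp * np.cos(2 * np.pi * freqs[k] * t + np.angle(spec[k]))
            return y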

  13. A Review on Speech Corpus Development for Automatic Speech Recognition in Indian Languages

    OpenAIRE

    Cini kurian

    2015-01-01

    Corpus development has gained much attention due to recent statistics-based natural language processing. It has new applications in language technology, linguistic research, language education and information exchange. Corpus-based language research has an innovative outlook that moves beyond older linguistic theories. A speech corpus is an essential resource for building a speech recognizer, and one of the main challenges faced by speech scientists is the unavailability of these resources. Very f...

  14. Relations between affective music and speech: Evidence from dynamics of affective piano performance and speech production

    Directory of Open Access Journals (Sweden)

    Xiaoluan Liu

    2015-07-01

    This study compares affective piano performance with speech production from the perspective of dynamics: unlike previous research, this study uses finger force and articulatory effort as indexes reflecting the dynamics of affective piano performance and speech production, respectively. Moreover, for the first time, physical constraints such as piano fingerings and speech articulatory distance are included due to their potential contribution to different patterns of dynamics. A piano performance experiment and a speech production experiment were conducted in four emotions: anger, fear, happiness and sadness. The results show that in both piano performance and speech production, anger and happiness generally have high dynamics while sadness has the lowest dynamics, with fear in the middle. Fingerings interact with fear in the piano experiment and articulatory distance interacts with anger in the speech experiment, i.e., large physical constraints produce significantly higher dynamics than small physical constraints in piano performance under the condition of fear and in speech production under the condition of anger. Using production experiments, this study provides the first support for previous perception studies on the relations between affective music and speech. Moreover, this is the first study to show quantitative evidence for the importance of considering motor aspects such as dynamics when comparing music performance and speech production, in which motor mechanisms play a crucial role.

  15. Relations between affective music and speech: evidence from dynamics of affective piano performance and speech production.

    Science.gov (United States)

    Liu, Xiaoluan; Xu, Yi

    2015-01-01

    This study compares affective piano performance with speech production from the perspective of dynamics: unlike previous research, this study uses finger force and articulatory effort as indexes reflecting the dynamics of affective piano performance and speech production, respectively. Moreover, for the first time, physical constraints such as piano fingerings and speech articulatory constraints are included due to their potential contribution to different patterns of dynamics. A piano performance experiment and a speech production experiment were conducted in four emotions: anger, fear, happiness and sadness. The results show that in both piano performance and speech production, anger and happiness generally have high dynamics while sadness has the lowest dynamics. Fingerings interact with fear in the piano experiment and articulatory constraints interact with anger in the speech experiment, i.e., large physical constraints produce significantly higher dynamics than small physical constraints in piano performance under the condition of fear and in speech production under the condition of anger. Using production experiments, this study provides the first support for previous perception studies on the relations between affective music and speech. Moreover, this is the first study to show quantitative evidence for the importance of considering motor aspects such as dynamics when comparing music performance and speech production, in which motor mechanisms play a crucial role.

  16. Phrase-level speech simulation with an airway modulation model of speech production.

    Science.gov (United States)

    Story, Brad H

    2013-06-01

    Artificial talkers and speech synthesis systems have long been used as a means of understanding both speech production and speech perception. The development of an airway modulation model is described that simulates the time-varying changes of the glottis and vocal tract, as well as acoustic wave propagation, during speech production. The result is a type of artificial talker that can be used to study various aspects of how sound is generated by humans and how that sound is perceived by a listener. The primary components of the model are introduced and simulation of words and phrases are demonstrated.

  17. Cleft palate speech and velopharyngeal dysfunction: the approach of the speech therapist.

    Science.gov (United States)

    De Bodt, M; Van Lierde, K

    2006-01-01

    Cleft palate and velopharyngeal dysfunction cause communication disorders in many different ways (articulation, resonance, voice and language). These problems are mainly present in childhood but remain a matter of concern for many years. Speech and language pathologists are involved in speech and language assessment and speech therapy procedures. This article gives an overview of the standard procedures of the speech pathologist in a cleft palate team and discusses the relationship between the team and private practices or school teams, as well as the practical aspects relating to reimbursement by the National Institute of Health and Invalidity (RIZIV).

  18. Acoustic differences among casual, conversational, and read speech

    Science.gov (United States)

    Pinnow, DeAnna

    Speech is a complex behavior that allows speakers to use many variations to satisfy the demands connected with multiple speaking environments. Speech research typically obtains speech samples in a controlled laboratory setting using read material, yet anecdotal observations of such speech, particularly from talkers with a speech and language impairment, have identified a "performance" effect in the produced speech which masks the characteristics of impaired speech outside of the lab (Goberman, Recker, & Parveen, 2010). The aim of the current study was to investigate acoustic differences among laboratory read, laboratory conversational, and casual speech through well-defined speech tasks in the laboratory and in talkers' natural environments. Eleven healthy research participants performed lab recording tasks (19 read sentences and a dialogue about their life) and collected natural-environment recordings of themselves over 3-day periods using portable recorders. Segments were analyzed for articulatory, voice, and prosodic acoustic characteristics using computer software and hand counting. The results indicate that lab-read speech differed significantly from casual speech: greater articulation range, improved voice quality measures, lower speech rate, and lower mean pitch. One implication is that different laboratory techniques may be needed to obtain speech samples that more closely resemble casual speech, making it easier to analyze impaired speech characteristics accurately.

  19. Autosomal dominant rolandic epilepsy with speech dyspraxia.

    Science.gov (United States)

    Scheffer, I E

    2000-01-01

    Autosomal Dominant Rolandic Epilepsy with Speech Dyspraxia (ADRESD) is a rare disorder which highlights the relationship between Benign Rolandic Epilepsy (BRE) and speech and language disorders. Subtle speech and language disorders have recently been well characterised in BRE. ADRESD is associated with long term, more severe speech and language difficulties. The time course of rolandic epilepsy in ADRESD is typical of that of BRE. ADRESD is inherited in an autosomal dominant manner with anticipation. It is postulated that the anticipation may be due to an, as yet unidentified, triplet repeat expansion in a gene for rolandic epilepsy. BRE follows complex inheritance but it is possible that ADRESD may hold some valuable clues to the pathogenesis of BRE.

  20. Modeling Speech Intelligibility in Hearing Impaired Listeners

    DEFF Research Database (Denmark)

    Scheidiger, Christoph; Jørgensen, Søren; Dau, Torsten

    2014-01-01

    Models of speech intelligibility (SI) have a long history, starting with the articulation index (AI, [17]), followed by the SI index (SII, [18]) and the speech transmission index (STI, [7]), to name only a few. However, these models fail to accurately predict SI with nonlinearly processed noisy speech, e.g. phase jitter or spectral subtraction. Recent studies predict SI for normal-hearing (NH) listeners based on a signal-to-noise ratio measure in the envelope domain (SNRenv), in the framework of the speech-based envelope power spectrum model (sEPSM, [20, 21]). These models have shown good agreement with measured data under a broad range of conditions, including stationary and modulated interferers, reverberation, and spectral subtraction. Despite the advances in modeling intelligibility in NH listeners, a broadly applicable model that can predict SI in hearing-impaired (HI) listeners…

  1. Coherence and the speech intelligibility index

    Science.gov (United States)

    Kates, James M.; Arehart, Kathryn H.

    2005-04-01

    The speech intelligibility index (SII) (ANSI S3.5-1997) provides a means for estimating speech intelligibility under conditions of additive stationary noise or bandwidth reduction. The SII concept for estimating intelligibility is extended in this paper to include broadband peak-clipping and center-clipping distortion, with the coherence between the input and output signals used to estimate the noise and distortion effects. The speech intelligibility predictions using the new procedure are compared with intelligibility scores obtained from normal-hearing and hearing-impaired subjects for conditions of additive noise and peak-clipping and center-clipping distortion. The most effective procedure divides the speech signal into low-, mid-, and high-level regions, computes the coherence SII separately for the signal segments in each region, and then estimates intelligibility from a weighted combination of the three coherence SII values.
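
    A hedged sketch of the three-level procedure follows (Python, using scipy.signal.coherence). The percentile-based level split, frame size, and weights are placeholders rather than the ANSI S3.5 values, and concatenating non-contiguous frames before the coherence estimate is a further simplification:

        import numpy as np
        from scipy.signal import coherence

        def three_level_csii(clean, processed, fs, n=512, weights=(0.3, 0.5, 0.2)):
            # Group frames into low/mid/high level by clean-signal RMS,
            # average the magnitude-squared coherence per group, then combine.
            starts = list(range(0, min(len(clean), len(processed)) - n, n))
            rms = np.array([np.sqrt(np.mean(clean[s:s + n] ** 2)) for s in starts])
            lo, hi = np.percentile(rms, [33.3, 66.7])
            regions = [rms < lo, (rms >= lo) & (rms < hi), rms >= hi]
            scores = []
            for mask in regions:
                sel = [s for s, m in zip(starts, mask) if m]
                if not sel:
                    scores.append(0.0)
                    continue
                x = np.concatenate([clean[s:s + n] for s in sel])
                y = np.concatenate([processed[s:s + n] for s in sel])
                _, cxy = coherence(x, y, fs=fs, nperseg=256)
                scores.append(float(np.mean(cxy)))
            return sum(w * c for w, c in zip(weights, scores))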

  2. Predicting masking release of lateralized speech

    DEFF Research Database (Denmark)

    Chabot-Leclerc, Alexandre; MacDonald, Ewen; Dau, Torsten

    2016-01-01

    Locsei et al. (2015) [Speech in Noise Workshop, Copenhagen, 46] measured speech reception thresholds (SRTs) in anechoic conditions where the target speech and the maskers were lateralized using interaural time delays. The maskers were speech-shaped noise (SSN) and reversed babble with 2, 4, or 8 talkers. For a given interferer type, the number of maskers presented on the target's side was varied, such that none, some, or all maskers were presented on the same side as the target. In general, SRTs did not vary significantly when at least one masker was presented on the same side as the target... Predictions were made with a binaural model [... al., 2013, J. Acoust. Soc. Am. 130], which uses a short-term equalization-cancellation process to model binaural unmasking. In the conditions where informational masking (IM) was involved, the predicted SRTs were lower than the measured values because the model is blind to confusions experienced...

  3. Ultra low bit-rate speech coding

    CERN Document Server

    Ramasubramanian, V

    2015-01-01

    "Ultra Low Bit-Rate Speech Coding" focuses on the specialized topic of speech coding at very low bit-rates of 1 Kbits/sec and less, particularly at the lower ends of this range, down to 100 bps. The authors set forth the fundamental results and trends that form the basis for such ultra low bit-rates to be viable and provide a comprehensive overview of various techniques and systems in literature to date, with particular attention to their work in the paradigm of unit-selection based segment quantization. The book is for research students, academic faculty and researchers, and industry practitioners in the areas of speech processing and speech coding.

  4. Auditory-Spectrum Quantization Based Speech Recognition

    Institute of Scientific and Technical Information of China (English)

    Wu Yuanqing; Hao Jie; et al.

    1997-01-01

    Based on an analysis of the physiological and psychological characteristics of the human auditory system [1], human auditory processing can be classified into two hearing modes: an active one and a passive one. A novel approach to robust speech recognition, Auditory-spectrum Quantization Based Speech Recognition (AQBSR), is proposed. In this method, we intend to simulate the human active hearing mode and locate the effective areas of speech signals in the temporal and frequency domains. Adaptive filter banks are used in place of fixed-band filters to extract feature parameters. The effective speech components and their corresponding frequency areas for each word in the vocabulary can be found during training. In the recognition stage, comparison between the unknown sound and the current template is maintained only in the effective areas of the template word. Control experiments show that the AQBSR method is more robust than traditional systems.

  5. Heart Rate Extraction from Vowel Speech Signals

    Institute of Scientific and Technical Information of China (English)

    Abdelwadood Mesleh; Dmitriy Skopin; Sergey Baglikov; Anas Quteishat

    2012-01-01

    This paper presents a novel non-contact heart rate extraction method based on vowel speech signals. The proposed method models the relationship between the production of vowel speech signals and heart activity in humans, where it is observed that each heart beat causes a short increment (evolution) of the vowel formants. The short-time Fourier transform (STFT) is used to detect the formant maximum peaks so as to accurately estimate the heart rate. Compared with a traditional contact pulse oximeter, the average accuracy of the proposed non-contact heart rate extraction method exceeds 95%. The proposed method is expected to play an important role in modern medical applications.
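
    The record does not spell out the authors' exact algorithm, so the sketch below only illustrates the stated pipeline: an STFT-based formant-peak track whose periodic fluctuation in the cardiac band yields a rate estimate (Python/SciPy assumed; the formant search range and cardiac band are assumptions):

        import numpy as np
        from scipy.signal import stft

        def estimate_heart_rate(x, fs):
            # Track the dominant low-frequency formant peak over time.
            f, t, Z = stft(x, fs=fs, nperseg=1024, noverlap=768)
            band = (f >= 200) & (f <= 1200)            # rough F1 search range
            track = f[band][np.argmax(np.abs(Z[band, :]), axis=0)]
            track = track - track.mean()
            # Find the dominant modulation frequency of the peak track.
            frame_rate = 1.0 / (t[1] - t[0])
            spec = np.abs(np.fft.rfft(track))
            freqs = np.fft.rfftfreq(len(track), d=1.0 / frame_rate)
            cardiac = (freqs >= 0.7) & (freqs <= 3.0)  # 42-180 beats per minute
            return 60.0 * freqs[cardiac][np.argmax(spec[cardiac])]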

  6. Activities to Encourage Speech and Language Development

    Science.gov (United States)

    Activities to Encourage Speech and Language Development, Birth to 2 Years: Encourage your baby ... Play games with your child such as "house." Exchange roles in the family, with you pretending to ...

  7. Towards Quranic reader controlled by speech

    CERN Document Server

    Yekache, Yacine; Kouninef, Belkacem

    2012-01-01

    In this paper we describe the process of designing a task-oriented continuous speech recognition system for Arabic, based on CMU Sphinx4, to be used in the voice interface of a Quranic reader. The concept of the Quranic reader controlled by speech is presented, and the collection of the corpus and the creation of the acoustic model are described in detail, taking into account the specificities of the Arabic language and the desired application.

  8. Towards Quranic reader controlled by speech

    Directory of Open Access Journals (Sweden)

    Yacine Yekache

    2011-11-01

    In this paper we describe the process of designing a task-oriented continuous speech recognition system for Arabic, based on CMU Sphinx4, to be used in the voice interface of a Quranic reader. The concept of the Quranic reader controlled by speech is presented, and the collection of the corpus and the creation of the acoustic model are described in detail, taking into account the specificities of the Arabic language and the desired application.

  9. Phoneme vs Grapheme Based Automatic Speech Recognition

    OpenAIRE

    Magimai.-Doss, Mathew; Dines, John; Bourlard, Hervé; Hermansky, Hynek

    2004-01-01

    In recent literature, different approaches have been proposed to use graphemes as subword units with implicit source of phoneme information for automatic speech recognition. The major advantage of using graphemes as subword units is that the definition of lexicon is easy. In previous studies, results comparable to phoneme-based automatic speech recognition systems have been reported using context-independent graphemes or context-dependent graphemes with decision trees. In this paper, we study...

  10. A Dialectal Chinese Speech Recognition Framework

    Institute of Scientific and Technical Information of China (English)

    Jing Li; Thomas Fang Zheng; William Byrne; Dan Jurafsky

    2006-01-01

    A framework for dialectal Chinese speech recognition is proposed and studied, in which a relatively small dialectal Chinese (or, in other words, Chinese influenced by the native dialect) speech corpus and dialect-related knowledge are adopted to transform a standard Chinese (or Putonghua, abbreviated as PTH) speech recognizer into a dialectal Chinese speech recognizer. Two kinds of knowledge sources are explored: one is expert knowledge and the other is a small dialectal Chinese corpus. These knowledge sources provide information at four levels: phonetic level, lexicon level, language level, and acoustic decoder level. This paper takes Wu dialectal Chinese (WDC) as an example target language. The goal is to establish a WDC speech recognizer from an existing PTH speech recognizer based on the Initial-Final structure of the Chinese language and a study of how dialectal Chinese speakers speak Putonghua. The authors propose to use context-independent PTH-IF mappings (where IF means either a Chinese Initial or a Chinese Final), context-independent WDC-IF mappings, and syllable-dependent WDC-IF mappings (obtained from either experts or data), and combine them with the supervised maximum likelihood linear regression (MLLR) acoustic model adaptation method. To reduce the size of the multi-pronunciation lexicon introduced by the IF mappings, which might also enlarge lexicon confusion and hence lead to performance degradation, a Multi-Pronunciation Expansion (MPE) method based on the accumulated uni-gram probability (AUP) is proposed. In addition, some commonly used WDC words are selected and added to the lexicon. Compared with the original PTH speech recognizer, the resulting WDC speech recognizer achieves 10-18% absolute Character Error Rate (CER) reduction when recognizing WDC, with only a 0.62% CER increase when recognizing PTH. The proposed framework and methods are expected to work not only for Wu dialectal Chinese but also for other dialectal Chinese languages and…

  11. The fragility of freedom of speech.

    Science.gov (United States)

    Shackel, Nicholas

    2013-05-01

    Freedom of speech is a fundamental liberty that imposes a stringent duty of tolerance. Tolerance is limited by direct incitements to violence. False notions and bad laws on speech have obscured our view of this freedom. Hence, perhaps, the self-righteous intolerance, incitements and threats in response to Giubilini and Minerva. Those who disagree have the right to argue back but their attempts to shut us up are morally wrong.

  12. Perception and Temporal Properties of Speech

    Science.gov (United States)

    1990-07-26

    Keywords: speech perception, prosody, context effects, phonetic segments. ... found to aid listeners in correctly attributing the phonological source of vowel duration. The second series of experiments examines the role of phonetic segments, and the role of coarse-grained aspects of the speech signal in facilitating segment recognition. These extensions will address the...

  13. Speech Segregation based on Binary Classification

    Science.gov (United States)

    2016-07-15

    ... to the adoption of the ideal ratio mask (IRM). A subsequent listening evaluation shows increased intelligibility in noise for human listeners... Subject terms: binary classification, time-frequency masking, supervised speech segregation, speech intelligibility, room reverberation.
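
    For orientation, the IRM named above is commonly defined on a time-frequency grid as IRM = (S^2 / (S^2 + N^2))^beta. A sketch follows (Python/SciPy assumed; beta = 0.5 and the STFT settings are conventional choices, not taken from this report):

        import numpy as np
        from scipy.signal import stft, istft

        def ideal_ratio_mask(speech, noise, fs, beta=0.5):
            # Requires the premixed speech and noise, hence "ideal".
            _, _, S = stft(speech, fs=fs, nperseg=512)
            _, _, N = stft(noise, fs=fs, nperseg=512)
            ps, pn = np.abs(S) ** 2, np.abs(N) ** 2
            return (ps / (ps + pn + 1e-12)) ** beta

        def apply_mask(mixture, mask, fs):
            # Masked resynthesis of the mixture via the inverse STFT.
            _, _, M = stft(mixture, fs=fs, nperseg=512)
            _, y = istft(M * mask, fs=fs, nperseg=512)
            return y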

  14. An introduction to statistical parametric speech synthesis

    Indian Academy of Sciences (India)

    Simon King

    2011-10-01

    Statistical parametric speech synthesis, based on hidden Markov model-like models, has become competitive with established concatenative techniques over the last few years. This paper offers a non-mathematical introduction to this method of speech synthesis. It is intended to be complementary to the wide range of excellent technical publications already available. Rather than offer a comprehensive literature review, this paper instead gives a small number of carefully chosen references which are good starting points for further reading.

  15. Electrophysiological evidence for speech-specific audiovisual integration.

    Science.gov (United States)

    Baart, Martijn; Stekelenburg, Jeroen J; Vroomen, Jean

    2014-01-01

    Lip-read speech is integrated with heard speech at various neural levels. Here, we investigated the extent to which lip-read induced modulations of the auditory N1 and P2 (measured with EEG) are indicative of speech-specific audiovisual integration, and we explored to what extent the ERPs were modulated by phonetic audiovisual congruency. In order to disentangle speech-specific (phonetic) integration from non-speech integration, we used Sine-Wave Speech (SWS) that was perceived as speech by half of the participants (they were in speech-mode), while the other half was in non-speech mode. Results showed that the N1 obtained with audiovisual stimuli peaked earlier than the N1 evoked by auditory-only stimuli. This lip-read induced speeding up of the N1 occurred for listeners in speech and non-speech mode. In contrast, if listeners were in speech-mode, lip-read speech also modulated the auditory P2, but not if listeners were in non-speech mode, thus revealing speech-specific audiovisual binding. Comparing ERPs for phonetically congruent audiovisual stimuli with ERPs for incongruent stimuli revealed an effect of phonetic stimulus congruency that started at ~200 ms after (in)congruence became apparent. Critically, akin to the P2 suppression, congruency effects were only observed if listeners were in speech mode, and not if they were in non-speech mode. Using identical stimuli, we thus confirm that audiovisual binding involves (partially) different neural mechanisms for sound processing in speech and non-speech mode.

  16. Wideband Speech Recovery Using Psychoacoustic Criteria

    Directory of Open Access Journals (Sweden)

    Visar Berisha

    2007-08-01

    Many modern speech bandwidth extension techniques predict the high-frequency band based on features extracted from the lower band. While this method works for certain types of speech, problems arise when the correlation between the low and the high bands is not sufficient for adequate prediction. These situations require that additional high-band information is sent to the decoder. This overhead information, however, can be cleverly quantized using human auditory system models. In this paper, we propose a novel speech compression method that relies on bandwidth extension. The novelty of the technique lies in an elaborate perceptual model that determines a quantization scheme for wideband recovery and synthesis. Furthermore, a source/filter bandwidth extension algorithm based on spectral spline fitting is proposed. Results reveal that the proposed system improves the quality of narrowband speech while performing at a lower bitrate. When compared to other wideband speech coding schemes, the proposed algorithms provide comparable speech quality at a lower bitrate.

  17. Wideband Speech Recovery Using Psychoacoustic Criteria

    Directory of Open Access Journals (Sweden)

    Berisha Visar

    2007-01-01

    Many modern speech bandwidth extension techniques predict the high-frequency band based on features extracted from the lower band. While this method works for certain types of speech, problems arise when the correlation between the low and the high bands is not sufficient for adequate prediction. These situations require that additional high-band information is sent to the decoder. This overhead information, however, can be cleverly quantized using human auditory system models. In this paper, we propose a novel speech compression method that relies on bandwidth extension. The novelty of the technique lies in an elaborate perceptual model that determines a quantization scheme for wideband recovery and synthesis. Furthermore, a source/filter bandwidth extension algorithm based on spectral spline fitting is proposed. Results reveal that the proposed system improves the quality of narrowband speech while performing at a lower bitrate. When compared to other wideband speech coding schemes, the proposed algorithms provide comparable speech quality at a lower bitrate.

  18. Service Robot SCORPIO with Robust Speech Interface

    Directory of Open Access Journals (Sweden)

    Stanislav Ondas

    2013-01-01

    The SCORPIO is a small-size mini-teleoperator mobile service robot for booby-trap disposal. It can be manually controlled by an operator through a portable briefcase remote control device using a joystick, keyboard and buttons. In this paper, the speech interface is described. As an auxiliary function, the speech interface allows a human operator to concentrate sight and/or hands on other operation activities that are more important. The developed speech interface is based on HMM-based acoustic models trained using the SpeechDatE-SK database, a small-vocabulary language model based on fixed connected words, grammar, and a speech recognition setup adapted for low-resource devices. To improve the robustness of the speech interface in an outdoor environment, which is the working area of the SCORPIO service robot, speech enhancement based on the spectral subtraction method, as well as a unique combination of an iterative approach and a modified LIMA framework, were researched, developed and tested on simulated and real outdoor recordings.

  19. SUSTAINABILITY IN THE BOWELS OF SPEECHES

    Directory of Open Access Journals (Sweden)

    Jadir Mauro Galvao

    2012-10-01

    The theme of sustainability has not yet become an integral part of the theoretical repertoire behind our most everyday actions, though it often visits our thoughts and permeates many of our speeches. The big event of 2012, the Rio+20 meeting, gathered gazes from all corners of the planet around this burning theme, yet we still move forward timidly. Although it is not very clear what the term sustainability encompasses, it does not sound entirely strange: we associate it with things like ecology, the planet, waste emitted by factory smokestacks, deforestation, recycling and global warming. Our goal in this article, however, is less to clarify the term conceptually and more to observe how it appears in the speeches of that conference. When the competent authorities talk about sustainability, what are they referring to? We intend to investigate the lines, and between the lines, of these speeches for any assumptions associated with the term. To that end, we analyze the speech of the People's Summit, the opening speech of President Dilma, and the emblematic speech of the President of Uruguay, José Pepe Mujica.

  20. A Survey on Statistical Based Single Channel Speech Enhancement Techniques

    Directory of Open Access Journals (Sweden)

    Sunnydayal. V

    2014-11-01

    Speech enhancement is a long-standing problem with applications such as hearing aids, automatic recognition, and coding of speech signals. Single-channel speech enhancement techniques are used to enhance speech degraded by additive background noise. Background noise can have an adverse impact on our ability to converse smoothly in very noisy environments, such as busy streets, a car, or the cockpit of an airplane, and can affect both the quality and intelligibility of speech. The object of this survey is to provide an overview of speech enhancement algorithms that enhance noisy speech corrupted by additive noise. The algorithms are mainly based on statistical approaches, and different estimators are compared. Challenges and opportunities in speech enhancement are also discussed. This paper helps in choosing the best statistically based technique for speech enhancement.
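
    As a concrete instance of the statistical family surveyed here, a minimal magnitude spectral-subtraction sketch is given below (Python/SciPy assumed; the noise-estimation interval, over-subtraction factor, and spectral floor are illustrative):

        import numpy as np
        from scipy.signal import stft, istft

        def spectral_subtraction(noisy, fs, noise_ms=300, alpha=2.0, floor=0.02):
            # Estimate the noise magnitude spectrum from an initial
            # speech-free segment, then subtract it frame by frame.
            _, _, X = stft(noisy, fs=fs, nperseg=512)  # default hop = 256
            mag, phase = np.abs(X), np.angle(X)
            n_frames = max(1, int(noise_ms / 1000 * fs / 256))
            noise_mag = mag[:, :n_frames].mean(axis=1, keepdims=True)
            clean_mag = np.maximum(mag - alpha * noise_mag, floor * noise_mag)
            _, y = istft(clean_mag * np.exp(1j * phase), fs=fs, nperseg=512)
            return y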

  1. Specialization in audiovisual speech perception: a replication study

    DEFF Research Database (Denmark)

    Eskelund, Kasper; Andersen, Tobias

    Speech perception is audiovisual, as evidenced by bimodal integration in the McGurk effect. This integration effect may be specific to speech or may apply to all stimuli in general. To investigate this, Tuomainen et al. (2005) used sine-wave speech, which naïve observers may perceive as non-speech, but hear as speech once informed of the linguistic origin of the signal. Combinations of sine-wave speech and incongruent video of the talker elicited a McGurk effect only for informed observers. This indicates that the audiovisual integration effect is specific to speech perception. However, observers ... of the speaker. Observers were required to report this after primary target categorization. We found a significant McGurk effect only in the natural speech and speech mode conditions, supporting the finding of Tuomainen et al. Performance in the secondary task was similar in all conditions, indicating...

  2. Utility of TMS to understand the neurobiology of speech

    Directory of Open Access Journals (Sweden)

    Takenobu Murakami

    2013-07-01

    According to a traditional view, speech perception and production are processed largely separately in sensory and motor brain areas. Recent psycholinguistic and neuroimaging studies provide novel evidence that the sensory and motor systems dynamically interact in speech processing, by demonstrating that speech perception and imitation share regional brain activations. However, the exact nature and mechanisms of these sensorimotor interactions are not completely understood yet. Transcranial magnetic stimulation (TMS) has often been used in the cognitive neurosciences, including speech research, as a complementary technique to behavioral and neuroimaging studies. Here we provide an up-to-date review focusing on TMS studies that explored speech perception and imitation. Single-pulse TMS of the primary motor cortex (M1) demonstrated a speech-specific and somatotopically specific increase of excitability of the M1 lip area during speech perception (listening to speech or lip reading). A paired-coil TMS approach showed increases in effective connectivity from brain regions that are involved in speech processing to the M1 lip area when listening to speech. TMS in virtual lesion mode applied to speech processing areas modulated performance of phonological recognition and imitation of perceived speech. In summary, TMS is an innovative tool to investigate processing of speech perception and imitation. TMS studies have provided strong evidence that the sensory system is critically involved in mapping sensory input onto motor output and that the motor system plays an important role in speech perception.

  3. Utility of TMS to understand the neurobiology of speech.

    Science.gov (United States)

    Murakami, Takenobu; Ugawa, Yoshikazu; Ziemann, Ulf

    2013-01-01

    According to a traditional view, speech perception and production are processed largely separately in sensory and motor brain areas. Recent psycholinguistic and neuroimaging studies provide novel evidence that the sensory and motor systems dynamically interact in speech processing, by demonstrating that speech perception and imitation share regional brain activations. However, the exact nature and mechanisms of these sensorimotor interactions are not completely understood yet. Transcranial magnetic stimulation (TMS) has often been used in the cognitive neurosciences, including speech research, as a complementary technique to behavioral and neuroimaging studies. Here we provide an up-to-date review focusing on TMS studies that explored speech perception and imitation. Single-pulse TMS of the primary motor cortex (M1) demonstrated a speech specific and somatotopically specific increase of excitability of the M1 lip area during speech perception (listening to speech or lip reading). A paired-coil TMS approach showed increases in effective connectivity from brain regions that are involved in speech processing to the M1 lip area when listening to speech. TMS in virtual lesion mode applied to speech processing areas modulated performance of phonological recognition and imitation of perceived speech. In summary, TMS is an innovative tool to investigate processing of speech perception and imitation. TMS studies have provided strong evidence that the sensory system is critically involved in mapping sensory input onto motor output and that the motor system plays an important role in speech perception.

  4. Clinical features of auditory neuropathy under pure tone audiometry and acoustic immittance examination

    Institute of Scientific and Technical Information of China (English)

    李鹏; 岑锦添; 黎志诚; 张革化

    2010-01-01

    Objective: To analyze the clinical audiological characteristics and key diagnostic points of auditory neuropathy under pure tone audiometry and acoustic immittance examination. Methods: The pure tone audiometry and acoustic immittance findings of 17 patients (32 ears) with confirmed auditory neuropathy, treated at the Department of Otorhinolaryngology of the Third Affiliated Hospital of Sun Yat-sen University, were retrospectively analyzed. Results: Bilateral disease with symmetrical hearing curves was found in 15 of the 17 patients; 26 ears showed predominantly mild-to-moderate low-frequency sensorineural hearing loss (rising audiogram). Ears with a disease course of less than 5 years presented mainly mild or moderate hearing loss, and those with a course of more than 5 years presented mainly severe or profound hearing loss. Sixteen patients (31 ears) showed a type A tympanogram; ipsilateral and contralateral stapedius reflexes were absent in 15 patients (30 ears), and the stapedius reflex threshold was elevated in 2 patients (2 ears). Conclusion: On pure tone audiometry and acoustic immittance examination, auditory neuropathy mainly presents as (1) bilateral, symmetrical, progressive hearing loss; (2) a rising low-frequency audiogram at the early stage and all-frequency hearing loss at later stages; (3) a type A tympanogram with elevated or absent stapedius reflex thresholds; and (4) no loudness recruitment in the affected ears.

  5. Audiometry and distortion product otoacoustic emissions in tinnitus patients without hearing complaints

    Institute of Scientific and Technical Information of China (English)

    王晓晖; 肖玉丽

    2011-01-01

    Objective: To study the pure tone audiometry and distortion product otoacoustic emission (DPOAE) features of tinnitus patients who do not perceive hearing loss, and to explore the correlation between DPOAE values and pure tone thresholds. Methods: Pure tone thresholds and DPOAE signal-to-noise ratios were measured in 114 adult tinnitus patients (190 ears) who had never perceived hearing loss, and the relationship between possible risk factors and tinnitus was investigated. Results: Hearing abnormalities were found in up to 76.84% (146/190) of the ears. Audiometry showed high-frequency hearing loss in 46.31% (88/190), low-frequency hearing loss in 14.73% (28/190), normal hearing in 23.15% (44/190), and other audiogram types in 15.78% (30/190). Risk factors included noise exposure, fatigue, stress, and pre-existing diseases. There was a negative correlation between DPOAE amplitude and pure tone threshold, and DPOAE response frequencies were closely related to the frequency distribution of pure tone thresholds. Conclusion: A significant number of patients with tinnitus but no hearing complaints may still have abnormal hearing, especially in the high-frequency range. Given the negative correlation between DPOAE amplitude and pure tone threshold, DPOAEs may serve as an objective indicator of the level of hearing loss in tinnitus patients with no hearing complaints, which may be of value in the clinic.

  6. A Motor Speech Assessment for Children with Severe Speech Disorders: Reliability and Validity Evidence

    Science.gov (United States)

    Strand, Edythe A.; McCauley, Rebecca J.; Weigand, Stephen D.; Stoeckel, Ruth E.; Baas, Becky S.

    2013-01-01

    Purpose: In this article, the authors report reliability and validity evidence for the Dynamic Evaluation of Motor Speech Skill (DEMSS), a new test that uses dynamic assessment to aid in the differential diagnosis of childhood apraxia of speech (CAS). Method: Participants were 81 children between 36 and 79 months of age who were referred to the…

  7. Stability and Composition of Functional Synergies for Speech Movements in Children with Developmental Speech Disorders

    Science.gov (United States)

    Terband, H.; Maassen, B.; van Lieshout, P.; Nijland, L.

    2011-01-01

    The aim of this study was to investigate the consistency and composition of functional synergies for speech movements in children with developmental speech disorders. Kinematic data were collected on the reiterated productions of syllables spa (/spa:/) and paas (/pa:s/) by 10 6- to 9-year-olds with developmental speech…

  8. Autonomic and Emotional Responses of Graduate Student Clinicians in Speech-Language Pathology to Stuttered Speech

    Science.gov (United States)

    Guntupalli, Vijaya K.; Nanjundeswaran, Chayadevie; Dayalu, Vikram N.; Kalinowski, Joseph

    2012-01-01

    Background: Fluent speakers and people who stutter manifest alterations in autonomic and emotional responses as they view stuttered relative to fluent speech samples. These reactions are indicative of an aroused autonomic state and are hypothesized to be triggered by the abrupt breakdown in fluency exemplified in stuttered speech. Furthermore,…

  9. A Clinician Survey of Speech and Non-Speech Characteristics of Neurogenic Stuttering

    Science.gov (United States)

    Theys, Catherine; van Wieringen, Astrid; De Nil, Luc F.

    2008-01-01

    This study presents survey data on 58 Dutch-speaking patients with neurogenic stuttering following various neurological injuries. Stroke was the most prevalent cause of stuttering in our patients, followed by traumatic brain injury, neurodegenerative diseases, and other causes. Speech and non-speech characteristics were analyzed separately for…

  10. Spotlight on Speech Codes 2009: The State of Free Speech on Our Nation's Campuses

    Science.gov (United States)

    Foundation for Individual Rights in Education (NJ1), 2009

    2009-01-01

    Each year, the Foundation for Individual Rights in Education (FIRE) conducts a wide, detailed survey of restrictions on speech at America's colleges and universities. The survey and resulting report explore the extent to which schools are meeting their obligations to uphold students' and faculty members' rights to freedom of speech, freedom of…

  11. Spotlight on Speech Codes 2010: The State of Free Speech on Our Nation's Campuses

    Science.gov (United States)

    Foundation for Individual Rights in Education (NJ1), 2010

    2010-01-01

    Each year, the Foundation for Individual Rights in Education (FIRE) conducts a rigorous survey of restrictions on speech at America's colleges and universities. The survey and resulting report explore the extent to which schools are meeting their legal and moral obligations to uphold students' and faculty members' rights to freedom of speech,…

  12. Spotlight on Speech Codes 2011: The State of Free Speech on Our Nation's Campuses

    Science.gov (United States)

    Foundation for Individual Rights in Education (NJ1), 2011

    2011-01-01

    Each year, the Foundation for Individual Rights in Education (FIRE) conducts a rigorous survey of restrictions on speech at America's colleges and universities. The survey and accompanying report explore the extent to which schools are meeting their legal and moral obligations to uphold students' and faculty members' rights to freedom of speech,…

  13. Using the Speech Transmission Index for predicting non-native speech intelligibility

    NARCIS (Netherlands)

    Wijngaarden, S.J. van; Bronkhorst, A.W.; Houtgast, T.; Steeneken, H.J.M.

    2004-01-01

    While the Speech Transmission Index (STI) is widely applied for prediction of speech intelligibility in room acoustics and telecommunication engineering, it is unclear how to interpret STI values when non-native talkers or listeners are involved. Based on subjectively measured psychometric functions…

  14. Speech Intelligibility and Accents in Speech-Mediated Interfaces: Results and Recommendations

    Science.gov (United States)

    Lawrence, Halcyon M.

    2013-01-01

    There continues to be significant growth in the development and use of speech-mediated devices and technology products; however, there is no evidence that non-native English speech is used in these devices, despite the fact that English is now spoken by more non-native speakers than native speakers worldwide. This relative absence of non-native…

  15. Audiovisual Temporal Recalibration for Speech in Synchrony Perception and Speech Identification

    Science.gov (United States)

    Asakawa, Kaori; Tanaka, Akihiro; Imai, Hisato

    We investigated whether audiovisual synchrony perception for speech could change after observation of the audiovisual temporal mismatch. Previous studies have revealed that audiovisual synchrony perception is re-calibrated after exposure to a constant timing difference between auditory and visual signals in non-speech. In the present study, we examined whether this audiovisual temporal recalibration occurs at the perceptual level even for speech (monosyllables). In Experiment 1, participants performed an audiovisual simultaneity judgment task (i.e., a direct measurement of the audiovisual synchrony perception) in terms of the speech signal after observation of the speech stimuli which had a constant audiovisual lag. The results showed that the “simultaneous” responses (i.e., proportion of responses for which participants judged the auditory and visual stimuli to be synchronous) at least partly depended on exposure lag. In Experiment 2, we adopted the McGurk identification task (i.e., an indirect measurement of the audiovisual synchrony perception) to exclude the possibility that this modulation of synchrony perception was solely attributable to the response strategy using stimuli identical to those of Experiment 1. The characteristics of the McGurk effect reported by participants depended on exposure lag. Thus, it was shown that audiovisual synchrony perception for speech could be modulated following exposure to constant lag both in direct and indirect measurement. Our results suggest that temporal recalibration occurs not only in non-speech signals but also in monosyllabic speech at the perceptual level.

  16. The Role of Supralexical Prosodic Units in Speech Production: Evidence from the Distribution of Speech Errors

    Science.gov (United States)

    Choe, Wook Kyung

    2013-01-01

    The current dissertation represents one of the first systematic studies of the distribution of speech errors within supralexical prosodic units. Four experiments were conducted to gain insight into the specific role of these units in speech planning and production. The first experiment focused on errors in adult English. These were found to be…

  17. Stability and composition of functional synergies for speech movements in children with developmental speech disorders

    NARCIS (Netherlands)

    Terband, H.; Maassen, B.; van Lieshout, P.; Nijland, L.

    2011-01-01

    The aim of this study was to investigate the consistency and composition of functional synergies for speech movements in children with developmental speech disorders. Kinematic data were collected on the reiterated productions of syllables spa (/spa:/) and paas (/pa:s/) by 10 6- to 9-year-olds with

  18. An effective cluster-based model for robust speech detection and speech recognition in noisy environments.

    Science.gov (United States)

    Górriz, J M; Ramírez, J; Segura, J C; Puntonet, C G

    2006-07-01

    This paper shows an accurate speech detection algorithm for improving the performance of speech recognition systems working in noisy environments. The proposed method is based on a hard decision clustering approach where a set of prototypes is used to characterize the noisy channel. Detecting the presence of speech is enabled by a decision rule formulated in terms of an averaged distance between the observation vector and a cluster-based noise model. The algorithm benefits from using contextual information, a strategy that considers not only a single speech frame but also a neighborhood of data in order to smooth the decision function and improve speech detection robustness. The proposed scheme exhibits reduced computational cost making it adequate for real time applications, i.e., automated speech recognition systems. An exhaustive analysis is conducted on the AURORA 2 and AURORA 3 databases in order to assess the performance of the algorithm and to compare it to existing standard voice activity detection (VAD) methods. The results show significant improvements in detection accuracy and speech recognition rate over standard VADs such as ITU-T G.729, ETSI GSM AMR, and ETSI AFE for distributed speech recognition and a representative set of recently reported VAD algorithms.
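
    A hedged sketch of the decision rule described above (k-means noise prototypes plus a context-averaged distance threshold) might look as follows in Python/NumPy; the cluster count, context width, and threshold are assumptions, not the authors' settings:

        import numpy as np

        def train_noise_prototypes(noise_feats, k=4, iters=20, seed=0):
            # Plain k-means over noise-only feature vectors.
            rng = np.random.default_rng(seed)
            centers = noise_feats[rng.choice(len(noise_feats), k, replace=False)]
            for _ in range(iters):
                d = ((noise_feats[:, None] - centers[None]) ** 2).sum(-1)
                labels = np.argmin(d, axis=1)
                for j in range(k):
                    if np.any(labels == j):
                        centers[j] = noise_feats[labels == j].mean(axis=0)
            return centers

        def vad_decision(feats, centers, threshold, context=4):
            # A frame is speech when its context-averaged distance to the
            # nearest noise prototype exceeds the threshold.
            d = ((feats[:, None] - centers[None]) ** 2).sum(-1).min(axis=1)
            kernel = np.ones(2 * context + 1) / (2 * context + 1)
            return np.convolve(d, kernel, mode="same") > threshold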

  19. A Computational Auditory Scene Analysis System for Speech Segregation and Robust Speech Recognition

    Science.gov (United States)

    2007-01-01

    Droppo, J., Acero, A., 2005. Dynamic compensation of HMM variances using the feature enhancement uncertainty computed from a parametric model of speech... analysis. IEEE Trans. on Audio, Speech, and Language Processing 15, 396–405. Huang, X., Acero, A., Hon, H., 2001. Spoken Language Processing. Prentice Hall.

  20. Speech motor development in childhood apraxia of speech: generating testable hypotheses by neurocomputational modeling.

    NARCIS (Netherlands)

    Terband, H.R.; Maassen, B.A.M.

    2010-01-01

    Childhood apraxia of speech (CAS) is a highly controversial clinical entity, with respect to both clinical signs and underlying neuromotor deficit. In the current paper, we advocate a modeling approach in which a computational neural model of speech acquisition and production is utilized in order to

  1. Neural bases of childhood speech disorders: lateralization and plasticity for speech functions during development.

    Science.gov (United States)

    Liégeois, Frédérique J; Morgan, Angela T

    2012-01-01

    Current models of speech production in adults emphasize the crucial role played by the left perisylvian cortex, primary and pre-motor cortices, the basal ganglia, and the cerebellum for normal speech production. Whether similar brain-behaviour relationships and leftward cortical dominance are found in childhood remains unclear. Here we reviewed recent evidence linking motor speech disorders (apraxia of speech and dysarthria) and brain abnormalities in children and adolescents with developmental, progressive, or childhood-acquired conditions. We found no evidence that unilateral damage can result in apraxia of speech, or that left hemisphere lesions are more likely to result in dysarthria than lesions to the right. The few studies reporting on childhood apraxia of speech converged towards morphological, structural, metabolic or epileptic anomalies affecting the basal ganglia, perisylvian and rolandic cortices bilaterally. Persistent dysarthria, similarly, was commonly reported in individuals with syndromes and conditions affecting these same structures bilaterally. In conclusion, for the first time we provide evidence that long-term and severe childhood speech disorders result predominantly from bilateral disruption of the neural networks involved in speech production.

  2. Speech Motor Development in Childhood Apraxia of Speech : Generating Testable Hypotheses by Neurocomputational Modeling

    NARCIS (Netherlands)

    Terband, H.; Maassen, B.

    2010-01-01

    Childhood apraxia of speech (CAS) is a highly controversial clinical entity, with respect to both clinical signs and underlying neuromotor deficit. In the current paper, we advocate a modeling approach in which a computational neural model of speech acquisition and production is utilized in order to…

  3. Prisoner Fasting as Symbolic Speech: The Ultimate Speech-Action Test.

    Science.gov (United States)

    Sneed, Don; Stonecipher, Harry W.

    The ultimate test of the speech-action dichotomy, as it relates to symbolic speech to be considered by the courts, may be the fasting of prison inmates who use hunger strikes to protest the conditions of their confinement or to make political statements. While hunger strikes have been utilized by prisoners for years as a means of protest, it was…

  4. The Clinical Practice of Speech and Language Therapists with Children with Phonologically Based Speech Sound Disorders

    Science.gov (United States)

    Oliveira, Carla; Lousada, Marisa; Jesus, Luis M. T.

    2015-01-01

    Children with speech sound disorders (SSD) represent a large number of speech and language therapists' caseloads. The intervention with children who have SSD can involve different therapy approaches, and these may be articulatory or phonologically based. Some international studies reveal a widespread application of articulatory based approaches in…

  5. Evaluation of speech synthesis systems using the speech reception threshold methodology

    NARCIS (Netherlands)

    Leeuwen, D.A. van; Balken, J. van

    2005-01-01

    The intelligibility of speech synthesis systems that are available nowadays is usually high enough to enable comparisons between different synthesis systems based on speech quality. However, in some situations, like a civil aircraft cockpit, the acoustic environment may be such that intelligibility…

  6. The logic of indirect speech.

    Science.gov (United States)

    Pinker, Steven; Nowak, Martin A; Lee, James J

    2008-01-22

    When people speak, they often insinuate their intent indirectly rather than stating it as a bald proposition. Examples include sexual come-ons, veiled threats, polite requests, and concealed bribes. We propose a three-part theory of indirect speech, based on the idea that human communication involves a mixture of cooperation and conflict. First, indirect requests allow for plausible deniability, in which a cooperative listener can accept the request, but an uncooperative one cannot react adversarially to it. This intuition is supported by a game-theoretic model that predicts the costs and benefits to a speaker of direct and indirect requests. Second, language has two functions: to convey information and to negotiate the type of relationship holding between speaker and hearer (in particular, dominance, communality, or reciprocity). The emotional costs of a mismatch in the assumed relationship type can create a need for plausible deniability and, thereby, select for indirectness even when there are no tangible costs. Third, people perceive language as a digital medium, which allows a sentence to generate common knowledge, to propagate a message with high fidelity, and to serve as a reference point in coordination games. This feature makes an indirect request qualitatively different from a direct one even when the speaker and listener can infer each other's intentions with high confidence.

  7. The DNA of prophetic speech

    Directory of Open Access Journals (Sweden)

    Friedrich W. de Wet

    2014-02-01

    Having to speak words that can potentially abuse the divine connotation of prophetic speech to lend authority to one's own manipulative intent poses a daunting challenge to preachers. The metaphorical images triggered by 'DNA' and 'genetic engineering' are deployed to illustrate the ambivalent position in which a prophetic preacher finds himself or herself: ambivalence between anticipation of regeneration at the deepest level of humanity on the one hand, and disquiet about the possibility of forcing a human being against his or her will into meeting certain prescribed expectations on the other. In reflecting on possible responses to this ambivalence, the theological positions of two prolific scholars in the research field of Homiletics, Gijs D.J. Dingemans and Charles L. Campbell, are critically considered from the point of view of the relationship between Christology and Pneumatology. In reflecting on theological markers for a sensible response, the author argues for a pneumatology in which the work of the Spirit consists of grafting the very DNA of our humanity and all its faculties into Christ, the only One who can open up the true life that is intended for humanity by divine grace. It will be in the very genes of a prophet to speak graceful words, because the prophet will have seen the wonder of the working of divine grace in his or her own life and will have embraced it willingly and joyfully.

  8. The logic of indirect speech

    Science.gov (United States)

    Pinker, Steven; Nowak, Martin A.; Lee, James J.

    2008-01-01

    When people speak, they often insinuate their intent indirectly rather than stating it as a bald proposition. Examples include sexual come-ons, veiled threats, polite requests, and concealed bribes. We propose a three-part theory of indirect speech, based on the idea that human communication involves a mixture of cooperation and conflict. First, indirect requests allow for plausible deniability, in which a cooperative listener can accept the request, but an uncooperative one cannot react adversarially to it. This intuition is supported by a game-theoretic model that predicts the costs and benefits to a speaker of direct and indirect requests. Second, language has two functions: to convey information and to negotiate the type of relationship holding between speaker and hearer (in particular, dominance, communality, or reciprocity). The emotional costs of a mismatch in the assumed relationship type can create a need for plausible deniability and, thereby, select for indirectness even when there are no tangible costs. Third, people perceive language as a digital medium, which allows a sentence to generate common knowledge, to propagate a message with high fidelity, and to serve as a reference point in coordination games. This feature makes an indirect request qualitatively different from a direct one even when the speaker and listener can infer each other's intentions with high confidence. PMID:18199841

  9. Exploiting temporal correlation of speech for error robust and bandwidth flexible distributed speech recognition

    DEFF Research Database (Denmark)

    Tan, Zheng-Hua; Dalsgaard, Paul; Lindberg, Børge

    2007-01-01

    In this paper the temporal correlation of speech is exploited in front-end feature extraction, client-based error recovery and server-based error concealment (EC) for distributed speech recognition. First, the paper investigates a half frame rate (HFR) front-end that uses double frame shifting… Lastly, to understand the effects of applying various EC techniques, this paper introduces three approaches consisting of speech feature, dynamic programming distance and hidden Markov model state duration comparison.

  10. Integrated speech and morphological processing in a connectionist continuous speech understanding for Korean

    CERN Document Server

    Lee, Geunbae; Lee, Jong-Hyeok

    1996-01-01

    A new tightly coupled speech and natural language integration model is presented for a TDNN-based continuous, possibly large vocabulary speech recognition system for Korean. Unlike popular n-best techniques developed for integrating mainly HMM-based speech recognition and natural language processing at the word level, which is obviously inadequate for morphologically complex agglutinative languages, our model constructs a spoken language system based on morpheme-level speech and language integration. With this integration scheme, the spoken Korean processing engine (SKOPE) is designed and implemented using a TDNN-based diphone recognition module integrated with Viterbi-based lexical decoding and symbolic phonological/morphological co-analysis. Our experimental results show that speaker-dependent continuous speech recognition can be achieved with an over 80.6% success rate directly from speech inputs for middle-level vocabularies.

  11. Preschool speech intelligibility and vocabulary skills predict long-term speech and language outcomes following cochlear implantation in early childhood.

    Science.gov (United States)

    Castellanos, Irina; Kronenberger, William G; Beer, Jessica; Henning, Shirley C; Colson, Bethany G; Pisoni, David B

    2014-07-01

    Speech and language measures during grade school predict adolescent speech-language outcomes in children who receive cochlear implants (CIs), but no research has examined whether speech and language functioning at even younger ages is predictive of long-term outcomes in this population. The purpose of this study was to examine whether early preschool measures of speech and language performance predict speech-language functioning in long-term users of CIs. Early measures of speech intelligibility and receptive vocabulary (obtained during preschool ages of 3-6 years) in a sample of 35 prelingually deaf, early-implanted children predicted speech perception, language, and verbal working memory skills up to 18 years later. Age of onset of deafness and age at implantation added additional variance to preschool speech intelligibility in predicting some long-term outcome scores, but the relationship between preschool speech-language skills and later speech-language outcomes was not significantly attenuated by the addition of these hearing history variables. These findings suggest that speech and language development during the preschool years is predictive of long-term speech and language functioning in early-implanted, prelingually deaf children. As a result, measures of speech-language functioning at preschool ages can be used to identify and adjust interventions for very young CI users who may be at long-term risk for suboptimal speech and language outcomes.

  12. A multimodal corpus of speech to infant and adult listeners.

    Science.gov (United States)

    Johnson, Elizabeth K; Lahey, Mybeth; Ernestus, Mirjam; Cutler, Anne

    2013-12-01

    An audio and video corpus of speech addressed to 28 11-month-olds is described. The corpus allows comparisons between adult speech directed toward infants, familiar adults, and unfamiliar adult addressees as well as of caregivers' word teaching strategies across word classes. Summary data show that infant-directed speech differed more from speech to unfamiliar than familiar adults, that word teaching strategies for nominals versus verbs and adjectives differed, that mothers mostly addressed infants with multi-word utterances, and that infants' vocabulary size was unrelated to speech rate, but correlated positively with predominance of continuous caregiver speech (not of isolated words) in the input.

  13. The Phase Spectra Based Feature for Robust Speech Recognition

    Directory of Open Access Journals (Sweden)

    Ali Abbasian

    2009-07-01

    Speech recognition in adverse environments is one of the major issues in automatic speech recognition today. Most current speech recognition systems are highly efficient in ideal environments, but their performance degrades severely when they are applied in real environments because of noise-corrupted speech. In this paper a new feature representation based on phase spectra and Perceptual Linear Prediction (PLP) is suggested, which can be used for robust speech recognition. It is shown that these new features can improve the performance of speech recognition not only in clean conditions but also at various noise levels when compared to PLP features.
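
    As a concrete illustration of the raw material behind such a feature, the sketch below extracts short-time phase spectra from a waveform. It covers only the first step of a phase-based front end; the framing parameters and any subsequent combination with PLP coefficients are assumptions, not the paper's exact recipe:

```python
import numpy as np

def framewise_phase_spectra(signal, frame_len=400, hop=160):
    """Short-time phase spectra: one phase vector per analysis frame.
    With 16 kHz audio, the defaults give 25 ms frames and a 10 ms hop."""
    window = np.hamming(frame_len)
    n_frames = 1 + (len(signal) - frame_len) // hop
    phases = []
    for i in range(n_frames):
        frame = signal[i * hop : i * hop + frame_len] * window
        # np.angle returns the phase (in radians) of each FFT bin
        phases.append(np.angle(np.fft.rfft(frame)))
    return np.array(phases)  # shape: (n_frames, frame_len // 2 + 1)

# One second of a synthetic 440 Hz tone at 16 kHz (stand-in for speech).
sig = np.sin(2 * np.pi * 440 * np.arange(16000) / 16000)
print(framewise_phase_spectra(sig).shape)  # -> (98, 201)
```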

  14. Speech evaluation in children with temporomandibular disorders

    Directory of Open Access Journals (Sweden)

    Raquel Aparecida Pizolato

    2011-10-01

    OBJECTIVE: The aims of this study were to evaluate the influence of temporomandibular disorders (TMD) on speech in children, and to verify the influence of occlusal characteristics. MATERIAL AND METHODS: Speech and dental occlusal characteristics were assessed in 152 Brazilian children (78 boys and 74 girls), aged 8 to 12 years (mean age 10.05 ± 1.39 years), with or without TMD signs and symptoms. The clinical signs were evaluated using the Research Diagnostic Criteria for TMD (RDC/TMD, axis I) and the symptoms were evaluated using a questionnaire. The following groups were formed: Group TMD (n=40), TMD signs and symptoms (Group S and S, n=68), TMD signs or symptoms (Group S or S, n=33), and without signs and symptoms (Group N, n=11). Articulatory speech disorders were diagnosed during spontaneous speech and repetition of words using the "Phonological Assessment of Child Speech" for the Portuguese language. A list of 40 phonologically balanced words, read by the speech pathologist and repeated by the children, was also applied. Data were analyzed by descriptive statistics and Fisher's exact or chi-square tests (α=0.05). RESULTS: A slight prevalence of articulatory disturbances, such as substitutions, omissions and distortions of the sibilants /s/ and /z/, and no deviations in jaw lateral movements were observed. Reduction of vertical amplitude was found in 10 children, the prevalence being greater in children with TMD signs and symptoms than in normal children. Tongue protrusion in the phonemes /t/, /d/, /n/ and /l/ and frontal lip positioning in the phonemes /s/ and /z/ were the most prevalent visual alterations. There was a high percentage of dental occlusal alterations. CONCLUSIONS: There was no association between TMD and speech disorders. Occlusal alterations may be factors of influence, allowing distortions and frontal lisp in the phonemes /s/ and /z/ and inadequate tongue position in the phonemes /t/, /d/, /n/ and /l/.

  15. Automatic Speech Recognition from Neural Signals: A Focused Review

    Directory of Open Access Journals (Sweden)

    Christian Herff

    2016-09-01

    Speech interfaces have become widely accepted and are nowadays integrated in various real-life applications and devices. They have become a part of our daily life. However, speech interfaces presume the ability to produce intelligible speech, which might be impossible due to loud environments, the risk of bothering bystanders, or an inability to produce speech (i.e., patients suffering from locked-in syndrome). For these reasons it would be highly desirable not to speak but simply to envision oneself saying words or sentences. Interfaces based on imagined speech would enable fast and natural communication without the need for audible speech and would give a voice to otherwise mute people. This focused review analyzes the potential of different brain imaging techniques to recognize speech from neural signals by applying Automatic Speech Recognition technology. We argue that modalities based on metabolic processes, such as functional Near Infrared Spectroscopy and functional Magnetic Resonance Imaging, are less suited for Automatic Speech Recognition from neural signals due to low temporal resolution, but are very useful for the investigation of the underlying neural mechanisms involved in speech processes. In contrast, electrophysiologic activity is fast enough to capture speech processes and is therefore better suited for ASR. Our experimental results indicate the potential of these signals for speech recognition from neural data, with a focus on invasively measured brain activity (electrocorticography). As a first example of Automatic Speech Recognition techniques used on neural signals, we discuss the Brain-to-Text system.

  16. Auditory-perceptual learning improves speech motor adaptation in children.

    Science.gov (United States)

    Shiller, Douglas M; Rochon, Marie-Lyne

    2014-08-01

    Auditory feedback plays an important role in children's speech development by providing the child with information about speech outcomes that is used to learn and fine-tune speech motor plans. The use of auditory feedback in speech motor learning has been extensively studied in adults by examining oral motor responses to manipulations of auditory feedback during speech production. Children are also capable of adapting speech motor patterns to perceived changes in auditory feedback; however, it is not known whether their capacity for motor learning is limited by immature auditory-perceptual abilities. Here, the link between speech perceptual ability and the capacity for motor learning was explored in two groups of 5- to 7-year-old children who underwent a period of auditory perceptual training followed by tests of speech motor adaptation to altered auditory feedback. One group received perceptual training on a speech acoustic property relevant to the motor task while a control group received perceptual training on an irrelevant speech contrast. Learned perceptual improvements led to an enhancement in speech motor adaptation (proportional to the perceptual change) only for the experimental group. The results indicate that children's ability to perceive relevant speech acoustic properties has a direct influence on their capacity for sensory-based speech motor adaptation.

  17. [Effect of speech estimation on social anxiety].

    Science.gov (United States)

    Shirotsuki, Kentaro; Sasagawa, Satoko; Nomura, Shinobu

    2009-02-01

    This study investigates the effect of speech estimation on social anxiety to further understanding of this characteristic of Social Anxiety Disorder (SAD). In the first study, we developed the Speech Estimation Scale (SES) to assess negative estimation before giving a speech which has been reported to be the most fearful social situation in SAD. Undergraduate students (n = 306) completed a set of questionnaires, which consisted of the Short Fear of Negative Evaluation Scale (SFNE), the Social Interaction Anxiety Scale (SIAS), the Social Phobia Scale (SPS), and the SES. Exploratory factor analysis showed an adequate one-factor structure with eight items. Further analysis indicated that the SES had good reliability and validity. In the second study, undergraduate students (n = 315) completed the SFNE, SIAS, SPS, SES, and the Self-reported Depression Scale (SDS). The results of path analysis showed that fear of negative evaluation from others (FNE) predicted social anxiety, and speech estimation mediated the relationship between FNE and social anxiety. These results suggest that speech estimation might maintain SAD symptoms, and could be used as a specific target for cognitive intervention in SAD.

  18. Speech Evoked Auditory Brainstem Response in Stuttering

    Directory of Open Access Journals (Sweden)

    Ali Akbar Tahaei

    2014-01-01

    Auditory processing deficits have been hypothesized as an underlying mechanism for stuttering. Previous studies have demonstrated abnormal responses in subjects with persistent developmental stuttering (PDS) at the higher levels of the central auditory system using speech stimuli. Recently, the potential usefulness of speech evoked auditory brainstem responses in central auditory processing disorders has been emphasized. The current study used the speech evoked ABR to investigate the hypothesis that subjects with PDS have specific auditory perceptual dysfunction. Objectives: To determine whether brainstem responses to speech stimuli differ between PDS subjects and normal fluent speakers. Methods: Twenty-five subjects with PDS participated in this study. The speech-ABRs were elicited by the 5-formant synthesized syllable /da/, with a duration of 40 ms. Results: There were significant group differences for the onset and offset transient peaks. Subjects with PDS had longer latencies for the onset and offset peaks relative to the control group. Conclusions: Subjects with PDS showed deficient neural timing in the early stages of the auditory pathway, consistent with temporal processing deficits, and this abnormal timing may underlie their disfluency.

  19. Markers of Deception in Italian Speech

    Directory of Open Access Journals (Sweden)

    Katelyn Spence

    2012-10-01

    Lying is a universal activity and the detection of lying a universal concern. Presently, there is great interest in determining objective measures of deception. The examination of speech, in particular, holds promise in this regard; yet, most of what we know about the relationship between speech and lying is based on the assessment of English-speaking participants. Few studies have examined indicators of deception in languages other than English. The world’s languages differ in significant ways, and cross-linguistic studies of deceptive communications are a research imperative. Here we review some of these differences amongst the world’s languages, and provide an overview of a number of recent studies demonstrating that cross-linguistic research is a worthwhile endeavour. In addition, we report the results of an empirical investigation of pitch, response latency, and speech rate as cues to deception in Italian speech. True and false opinions were elicited in an audio-taped interview. A within subjects analysis revealed no significant difference between the average pitch of the two conditions; however, speech rate was significantly slower, while response latency was longer, during deception compared with truth-telling. We explore the implications of these findings and propose directions for future research, with the aim of expanding the cross-linguistic branch of research on markers of deception.

  20. A Statistical Approach to Automatic Speech Summarization

    Science.gov (United States)

    Hori, Chiori; Furui, Sadaoki; Malkin, Rob; Yu, Hua; Waibel, Alex

    2003-12-01

    This paper proposes a statistical approach to automatic speech summarization. In our method, a set of words maximizing a summarization score indicating the appropriateness of summarization is extracted from automatically transcribed speech and then concatenated to create a summary. The extraction process is performed using a dynamic programming (DP) technique based on a target compression ratio. In this paper, we demonstrate how an English news broadcast transcribed by a speech recognizer is automatically summarized. We adapted our method, which was originally proposed for Japanese, to English by modifying the model for estimating word concatenation probabilities based on a dependency structure in the original speech given by a stochastic dependency context free grammar (SDCFG). We also propose a method of summarizing multiple utterances using a two-level DP technique. The automatically summarized sentences are evaluated by summarization accuracy based on a comparison with a manual summary of speech that has been correctly transcribed by human subjects. Our experimental results indicate that the method we propose can effectively extract relatively important information and remove redundant and irrelevant information from English news broadcasts.
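
    The extraction step described here reduces to a classic dynamic program. The toy sketch below is only an illustration, not the paper's system: the real summarization score combines word significance, confidence, and SDCFG-based concatenation probabilities, whereas here the scores are arbitrary numbers. It picks m of n words, preserving order, so as to maximize a total score:

```python
import numpy as np

def dp_summarize(word_scores, bigram_scores, m):
    """Pick m of n words (order preserved) maximizing the sum of per-word
    importance scores plus pairwise concatenation scores, via DP."""
    n = len(word_scores)
    NEG = -1e9
    # best[j, k] = best score of a summary that ends at word j and uses k words
    best = np.full((n, m + 1), NEG)
    back = np.full((n, m + 1), -1, dtype=int)
    for j in range(n):
        best[j, 1] = word_scores[j]
    for k in range(2, m + 1):
        for j in range(n):
            for i in range(j):
                cand = best[i, k - 1] + word_scores[j] + bigram_scores[i][j]
                if cand > best[j, k]:
                    best[j, k], back[j, k] = cand, i
    # Backtrack from the best final word.
    j = int(np.argmax(best[:, m]))
    summary = [j]
    for k in range(m, 1, -1):
        j = back[j, k]
        summary.append(j)
    return summary[::-1]

# 6-word toy "utterance"; keep 3 words (50% compression ratio).
scores = [0.1, 0.9, 0.2, 0.8, 0.1, 0.7]
bigrams = np.zeros((6, 6))  # concatenation scores, all zero in this toy case
print(dp_summarize(scores, bigrams, 3))  # -> [1, 3, 5]
```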

  1. Music and speech prosody: a common rhythm.

    Science.gov (United States)

    Hausen, Maija; Torppa, Ritva; Salmela, Viljami R; Vainio, Martti; Särkämö, Teppo

    2013-01-01

    Disorders of music and speech perception, known as amusia and aphasia, have traditionally been regarded as dissociated deficits based on studies of brain damaged patients. This has been taken as evidence that music and speech are perceived by largely separate and independent networks in the brain. However, recent studies of congenital amusia have broadened this view by showing that the deficit is associated with problems in perceiving speech prosody, especially intonation and emotional prosody. In the present study the association between the perception of music and speech prosody was investigated with healthy Finnish adults (n = 61) using an on-line music perception test including the Scale subtest of Montreal Battery of Evaluation of Amusia (MBEA) and Off-Beat and Out-of-key tasks as well as a prosodic verbal task that measures the perception of word stress. Regression analyses showed that there was a clear association between prosody perception and music perception, especially in the domain of rhythm perception. This association was evident after controlling for music education, age, pitch perception, visuospatial perception, and working memory. Pitch perception was significantly associated with music perception but not with prosody perception. The association between music perception and visuospatial perception (measured using analogous tasks) was less clear. Overall, the pattern of results indicates that there is a robust link between music and speech perception and that this link can be mediated by rhythmic cues (time and stress).

  2. Music and speech prosody: A common rhythm

    Directory of Open Access Journals (Sweden)

    Maija Hausen

    2013-09-01

    Disorders of music and speech perception, known as amusia and aphasia, have traditionally been regarded as dissociated deficits based on studies of brain damaged patients. This has been taken as evidence that music and speech are perceived by largely separate and independent networks in the brain. However, recent studies of congenital amusia have broadened this view by showing that the deficit is associated with problems in perceiving speech prosody, especially intonation and emotional prosody. In the present study the association between the perception of music and speech prosody was investigated with healthy Finnish adults (n = 61) using an on-line music perception test including the Scale subtest of Montreal Battery of Evaluation of Amusia (MBEA) and Off-Beat and Out-of-key tasks as well as a prosodic verbal task that measures the perception of word stress. Regression analyses showed that there was a clear association between prosody perception and music perception, especially in the domain of rhythm perception. This association was evident after controlling for music education, age, pitch perception, visuospatial perception and working memory. Pitch perception was significantly associated with music perception but not with prosody perception. The association between music perception and visuospatial perception (measured using analogous tasks) was less clear. Overall, the pattern of results indicates that there is a robust link between music and speech perception and that this link can be mediated by rhythmic cues (time and stress).

  3. Gesture facilitates the syntactic analysis of speech

    Directory of Open Access Journals (Sweden)

    Henning Holle

    2012-03-01

    Recent research suggests that the brain routinely binds together information from gesture and speech. However, most of this research focused on the integration of representational gestures with the semantic content of speech. Much less is known about how other aspects of gesture, such as emphasis, influence the interpretation of the syntactic relations in a spoken message. Here, we investigated whether beat gestures alter which syntactic structure is assigned to ambiguous spoken German sentences. The P600 component of the Event Related Brain Potential indicated that the more complex syntactic structure is easier to process when the speaker emphasizes the subject of a sentence with a beat. Thus, a simple flick of the hand can change our interpretation of who has been doing what to whom in a spoken sentence. We conclude that gestures and speech are an integrated system. Unlike previous studies, which have shown that the brain effortlessly integrates semantic information from gesture and speech, our study is the first to demonstrate that this integration also occurs for syntactic information. Moreover, the effect appears to be gesture-specific and was not found for other stimuli that draw attention to certain parts of speech, including prosodic emphasis, or a moving visual stimulus with the same trajectory as the gesture. This suggests that only visual emphasis produced with a communicative intention in mind (that is, beat gestures) influences language comprehension, but not a simple visual movement lacking such an intention.

  4. When speech sounds like music.

    Science.gov (United States)

    Falk, Simone; Rathcke, Tamara; Dalla Bella, Simone

    2014-08-01

    Repetition can boost memory and perception. However, repeating the same stimulus several times in immediate succession also induces intriguing perceptual transformations and illusions. Here, we investigate the Speech to Song Transformation (S2ST), a massed repetition effect in the auditory modality, which crosses the boundaries between language and music. In the S2ST, a phrase repeated several times shifts to being heard as sung. To better understand this unique cross-domain transformation, we examined the perceptual determinants of the S2ST, in particular the role of acoustics. In two experiments, the effects of 2 pitch properties and 3 rhythmic properties on the probability and speed of occurrence of the transformation were examined. Results showed that both pitch and rhythmic properties are key features fostering the transformation. However, some properties proved to be more conducive to the S2ST than others. Stable tonal targets that allowed for the perception of a musical melody led more often and quickly to the S2ST than scalar intervals. Recurring durational contrasts arising from segmental grouping favoring a metrical interpretation of the stimulus also facilitated the S2ST. This was, however, not the case for a regular beat structure within and across repetitions. In addition, individual perceptual abilities predicted the likelihood of the S2ST. Overall, the study demonstrated that repetition enables listeners to reinterpret specific prosodic features of spoken utterances in terms of musical structures. The findings underline a tight link between language and music, but they also reveal important differences in communicative functions of prosodic structure in the two domains.

  5. A Blueprint for a Comprehensive Australian English Auditory-Visual Speech Corpus

    NARCIS (Netherlands)

    Burnham, D.; Ambikairajah, E.; Arciuli, J.; Bennamoun, M.; Best, C.T.; Bird, S.; Butcher, A.R.; Cassidy, S.; Chetty, G.; Cox, F.M.; Cutler, A.; Dale, R.; Epps, J.R.; Fletcher, J.M.; Goecke, R.; Grayden, D.B.; Hajek, J.T.; Ingram, J.C.; Ishihara, S.; Kemp, N.; Kinoshita, Y.; Kuratate, T.; Lewis, T.W.; Loakes, D.E.; Onslow, M.; Powers, D.M.; Rose, P.; Togneri, R.; Tran, D.; Wagner, M.

    2009-01-01

    Large auditory-visual (AV) speech corpora are the grist of modern research in speech science, but no such corpus exists for Australian English. This is unfortunate, for speech science is the brains behind speech technology and applications such as text-to-speech (TTS) synthesis, automatic speech recognition…

  6. Can you hear my age? Influences of speech rate and speech spontaneity on estimation of speaker age.

    Science.gov (United States)

    Skoog Waller, Sara; Eriksson, Mårten; Sörqvist, Patrik

    2015-01-01

    Cognitive hearing science is mainly about the study of how cognitive factors contribute to speech comprehension, but cognitive factors also partake in speech processing to infer non-linguistic information from speech signals, such as the intentions of the talker and the speaker's age. Here, we report two experiments on age estimation by "naïve" listeners. The aim was to study how speech rate influences estimation of speaker age by comparing the speakers' natural speech rate with increased or decreased speech rate. In Experiment 1, listeners were presented with audio samples of read speech from three different speaker age groups (young, middle aged, and old adults). They estimated the speakers as younger when speech rate was faster than normal and as older when speech rate was slower than normal. This speech rate effect was slightly greater in magnitude for older (60-65 years) speakers in comparison with younger (20-25 years) speakers, suggesting that speech rate may gain greater importance as a perceptual age cue with increased speaker age. This pattern was more pronounced in Experiment 2, in which listeners estimated age from spontaneous speech. Faster speech rate was associated with lower age estimates, but only for older and middle aged (40-45 years) speakers. Taken together, speakers of all age groups were estimated as older when speech rate decreased, except for the youngest speakers in Experiment 2. The absence of a linear speech rate effect in estimates of younger speakers, for spontaneous speech, implies that listeners use different age estimation strategies or cues (possibly vocabulary) depending on the age of the speaker and the spontaneity of the speech. Potential implications for forensic investigations and other applied domains are discussed.

  7. Can you hear my age? Influences of speech rate and speech spontaneity on estimation of speaker age

    Directory of Open Access Journals (Sweden)

    Sara Skoog Waller

    2015-07-01

    Cognitive hearing science is mainly about the study of how cognitive factors contribute to speech comprehension, but cognitive factors also partake in speech processing to infer non-linguistic information from speech signals, such as the intentions of the talker and the speaker’s age. Here, we report two experiments on age estimation by naïve listeners. The aim was to study how speech rate influences estimation of speaker age by comparing the speakers’ natural speech rate with increased or decreased speech rate. In Experiment 1, listeners were presented with audio samples of read speech from three different speaker age groups (young, middle aged and old adults). They estimated the speakers as younger when speech rate was faster than normal and as older when speech rate was slower than normal. This speech rate effect was slightly greater in magnitude for older (60-65 years) speakers in comparison with younger (20-25 years) speakers, suggesting that speech rate may gain greater importance as a perceptual age cue with increased speaker age. This pattern was more pronounced in Experiment 2, in which listeners estimated age from spontaneous speech. Faster speech rate was associated with lower age estimates, but only for older and middle aged (40-45 years) speakers. Taken together, speakers of all age groups were estimated as older when speech rate decreased, except for the youngest speakers in Experiment 2. The absence of a linear speech rate effect in estimates of younger speakers, for spontaneous speech, implies that listeners use different age estimation strategies or cues (possibly vocabulary) depending on the age of the speaker and the spontaneity of the speech. Potential implications for forensic investigations and other applied domains are discussed.

  8. Can you hear my age? Influences of speech rate and speech spontaneity on estimation of speaker age

    Science.gov (United States)

    Skoog Waller, Sara; Eriksson, Mårten; Sörqvist, Patrik

    2015-01-01

    Cognitive hearing science is mainly about the study of how cognitive factors contribute to speech comprehension, but cognitive factors also partake in speech processing to infer non-linguistic information from speech signals, such as the intentions of the talker and the speaker’s age. Here, we report two experiments on age estimation by “naïve” listeners. The aim was to study how speech rate influences estimation of speaker age by comparing the speakers’ natural speech rate with increased or decreased speech rate. In Experiment 1, listeners were presented with audio samples of read speech from three different speaker age groups (young, middle aged, and old adults). They estimated the speakers as younger when speech rate was faster than normal and as older when speech rate was slower than normal. This speech rate effect was slightly greater in magnitude for older (60–65 years) speakers in comparison with younger (20–25 years) speakers, suggesting that speech rate may gain greater importance as a perceptual age cue with increased speaker age. This pattern was more pronounced in Experiment 2, in which listeners estimated age from spontaneous speech. Faster speech rate was associated with lower age estimates, but only for older and middle aged (40–45 years) speakers. Taken together, speakers of all age groups were estimated as older when speech rate decreased, except for the youngest speakers in Experiment 2. The absence of a linear speech rate effect in estimates of younger speakers, for spontaneous speech, implies that listeners use different age estimation strategies or cues (possibly vocabulary) depending on the age of the speaker and the spontaneity of the speech. Potential implications for forensic investigations and other applied domains are discussed. PMID:26236259

  9. "Thoughts Concerning Education": John Locke On Teaching Speech

    Science.gov (United States)

    Baird, John E.

    1971-01-01

    Locke's suggestions for more effective speech instruction have gone largely unnoticed. Consequently, it is the purpose of this article to consider John Locke's criticisms, theory and specific methods of speech education. (Author)

  10. The Role of Phonological Rules in Speech Understanding Research

    Science.gov (United States)

    Oshika, Beatrice T.; And Others

    1975-01-01

    This paper presents phonological rules describing systematic pronunciation variation in natural continuous speech. It is argued that a speech understanding system must explain such variation by incorporating phonological rules. Spectrographic findings are included. (CK)

  11. Synchrony Demonstrated between Movements of the Neonate and Adult Speech

    Science.gov (United States)

    Condon, William S.; Sander, Louis W.

    1974-01-01

    Infant reaction to adult speech was studied through frame-by-frame analysis of sound films. Infants' actions were found to be synchronized with the organized speech behavior of the adults in his environment. (ST)

  12. The role of the insula in speech and language processing.

    Science.gov (United States)

    Oh, Anna; Duerden, Emma G; Pang, Elizabeth W

    2014-08-01

    Lesion and neuroimaging studies indicate that the insula mediates motor aspects of speech production, specifically, articulatory control. Although it has direct connections to Broca's area, the canonical speech production region, the insula is also broadly connected with other speech and language centres, and may play a role in coordinating higher-order cognitive aspects of speech and language production. The extent of the insula's involvement in speech and language processing was assessed using the Activation Likelihood Estimation (ALE) method. Meta-analyses of 42 fMRI studies with healthy adults were performed, comparing insula activation during performance of language (expressive and receptive) and speech (production and perception) tasks. Both tasks activated bilateral anterior insulae. However, speech perception tasks preferentially activated the left dorsal mid-insula, whereas expressive language tasks activated left ventral mid-insula. Results suggest distinct regions of the mid-insula play different roles in speech and language processing.

  13. Multistage audiovisual integration of speech: dissociating identification and detection

    DEFF Research Database (Denmark)

    Eskelund, Kasper; Tuomainen, Jyrki; Andersen, Tobias

    2011-01-01

    Speech perception integrates auditory and visual information. This is evidenced by the McGurk illusion, where seeing the talking face influences the auditory phonetic percept, and by the audiovisual detection advantage, where seeing the talking face influences the detectability of the acoustic speech signal. Here we show that identification of phonetic content and detection can be dissociated as speech-specific and non-specific audiovisual integration effects. To this end, we employed synthetically modified stimuli, sine wave speech (SWS), which is an impoverished speech signal that only observers informed of its speech-like nature recognize as speech. While the McGurk illusion only occurred for informed observers, the audiovisual detection advantage occurred for naïve observers as well. This finding supports a multi-stage account of audiovisual integration of speech in which the many attributes…

  14. Speech Recognition Technology Applied to Intelligent Mobile Navigation System

    Institute of Scientific and Technical Information of China (English)

    2002-01-01

    The capability of human-computer interaction reflects the intelligence of a mobile navigation system. In this paper, the navigation data and functions of a mobile navigation system are divided into system commands and non-system commands, and a group of speech commands is abstracted from them. The paper applies speech recognition technology to an intelligent mobile navigation system to process speech commands, and investigates the integration of speech recognition technology with mobile navigation systems. Navigation operations can be performed by speech commands, which makes human-computer interaction easy during navigation. The speech command interface of the navigation system is implemented with Dutty++ software, built on IBM's speech recognition system ViaVoice. In navigation experiments, navigation could be performed almost without the keyboard, which proved that human-computer interaction via speech commands is very convenient and that its reliability is also high.
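
    A dispatch table is the simplest way to picture the system/non-system command split described above. The sketch below is purely illustrative: the command names and handler functions are invented for this example, and the actual command grammar the system builds on ViaVoice output is not specified in the record:

```python
# Hypothetical dispatch of recognized speech commands to navigation actions.
def zoom_in():    print("zooming in")
def zoom_out():   print("zooming out")
def show_route(): print("showing route")

SYSTEM_COMMANDS = {          # commands that control the navigation system itself
    "zoom in":  zoom_in,
    "zoom out": zoom_out,
}
NON_SYSTEM_COMMANDS = {      # commands that query navigation data
    "show route": show_route,
}

def dispatch(recognized_text):
    """Route the recognizer's text output to the matching navigation handler."""
    for table in (SYSTEM_COMMANDS, NON_SYSTEM_COMMANDS):
        handler = table.get(recognized_text.strip().lower())
        if handler:
            return handler()
    print("unrecognized command:", recognized_text)

dispatch("Zoom in")      # -> zooming in
dispatch("show route")   # -> showing route
```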

  15. Visual-tactile integration in speech perception: Evidence for modality neutral speech primitives.

    Science.gov (United States)

    Bicevskis, Katie; Derrick, Donald; Gick, Bryan

    2016-11-01

    Audio-visual [McGurk and MacDonald (1976). Nature 264, 746-748] and audio-tactile [Gick and Derrick (2009). Nature 462(7272), 502-504] speech stimuli enhance speech perception over audio stimuli alone. In addition, multimodal speech stimuli form an asymmetric window of integration that is consistent with the relative speeds of the various signals [Munhall, Gribble, Sacco, and Ward (1996). Percept. Psychophys. 58(3), 351-362; Gick, Ikegami, and Derrick (2010). J. Acoust. Soc. Am. 128(5), EL342-EL346]. In this experiment, participants were presented with video of faces producing /pa/ and /ba/ syllables, both alone and with air puffs occurring synchronously and at different timings up to 300 ms before and after the stop release. Perceivers were asked to identify the syllable they perceived, and were more likely to respond that they perceived /pa/ when air puffs were present, with an asymmetrical preference for puffs following the video signal, consistent with the relative speeds of visual and air puff signals. The results demonstrate that visual-tactile integration of speech perception occurs much as it does with audio-visual and audio-tactile stimuli. This finding contributes to the understanding of multimodal speech perception, lending support to the idea that speech is not perceived as an audio signal that is supplemented by information from other modes, but rather that primitives of speech perception are, in principle, modality neutral.

  16. Existence detection and embedding rate estimation of blended speech in covert speech communications.

    Science.gov (United States)

    Li, Lijuan; Gao, Yong

    2016-01-01

    Covert speech communications may be used by terrorists to commit crimes through the Internet. Steganalysis aims to detect secret information in covert communications in order to prevent crimes. Herein, a steganalysis algorithm for blended speech based on the average zero crossing rate of the odd-even difference (AZCR-OED) is proposed; it can detect the existence and estimate the embedding rate of blended speech. First, the odd-even difference (OED) of the speech signal is calculated and divided into frames. The average zero crossing rate (ZCR) is calculated for each OED frame, and the minimum average ZCR and the AZCR-OED of the entire speech signal are extracted as features. Then, a support vector machine classifier is used to determine whether the speech signal is blended. Finally, a voice activity detection algorithm is applied to determine the hidden location of the secret speech and estimate the embedding rate. The results demonstrate that, without attack, the detection accuracy can reach 80% or more when the embedding rate is greater than 10%, and the estimated embedding rate is close to the real value. Even when attacks occur, the algorithm still reaches relatively high detection accuracy. The algorithm performs well in terms of accuracy, effectiveness and robustness.
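
    The feature extraction chain in this abstract (OED, framing, per-frame ZCR, then global statistics) is easy to prototype. The sketch below is a rough illustration under one stated assumption: the OED is taken as the difference between odd- and even-indexed samples, which may differ from the authors' exact definition. In the described scheme the two resulting numbers, computed per utterance, would be the inputs to the SVM classifier:

```python
import numpy as np

def azcr_oed_features(signal, frame_len=320):
    """Compute the two global features described in the abstract:
    the minimum per-frame ZCR of the OED signal, and its average (AZCR-OED)."""
    x = signal[: len(signal) // 2 * 2]        # keep an even number of samples
    oed = x[1::2] - x[0::2]                   # odd-even difference (assumed form)
    n_frames = len(oed) // frame_len
    zcrs = np.empty(n_frames)
    for i in range(n_frames):
        frame = oed[i * frame_len : (i + 1) * frame_len]
        # ZCR: fraction of adjacent sample pairs whose sign differs
        zcrs[i] = np.mean(np.signbit(frame[:-1]) != np.signbit(frame[1:]))
    return zcrs.min(), zcrs.mean()            # (min average ZCR, AZCR-OED)

# Hypothetical 1 s signal at 16 kHz, standing in for a speech recording.
rng = np.random.default_rng(1)
min_zcr, azcr_oed = azcr_oed_features(rng.normal(size=16000))
print(f"min ZCR = {min_zcr:.3f}, AZCR-OED = {azcr_oed:.3f}")
```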

  17. Analysis of speech-based speech transmission index methods with implications for nonlinear operations

    Science.gov (United States)

    Goldsworthy, Ray L.; Greenberg, Julie E.

    2004-12-01

    The Speech Transmission Index (STI) is a physical metric that is well correlated with the intelligibility of speech degraded by additive noise and reverberation. The traditional STI uses modulated noise as a probe signal and is valid for assessing degradations that result from linear operations on the speech signal. Researchers have attempted to extend the STI to predict the intelligibility of nonlinearly processed speech by proposing variations that use speech as a probe signal. This work considers four previously proposed speech-based STI methods and four novel methods, studied under conditions of additive noise, reverberation, and two nonlinear operations (envelope thresholding and spectral subtraction). Analyzing intermediate metrics in the STI calculation reveals why some methods fail for nonlinear operations. Results indicate that none of the previously proposed methods is adequate for all of the conditions considered, while four proposed methods produce qualitatively reasonable results and warrant further study. The discussion considers the relevance of this work to predicting the intelligibility of cochlear-implant processed speech.
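
    The intermediate quantity at the heart of every STI variant is the modulation index of the intensity envelope and how much of it survives the transmission channel. The sketch below is a simplified illustration, not a full STI implementation (real ones operate per octave band over a grid of modulation frequencies): it estimates the modulation index at a single modulation frequency and shows how additive noise reduces it:

```python
import numpy as np

def modulation_index(envelope, mod_freq, fs_env):
    """Modulation depth of an intensity envelope at one modulation frequency,
    estimated by correlating the envelope with a complex exponential."""
    t = np.arange(len(envelope)) / fs_env
    return 2 * np.abs(np.mean(envelope * np.exp(-2j * np.pi * mod_freq * t))) \
           / np.mean(envelope)

# Toy demonstration: a constant noise floor added to the intensity envelope
# halves the modulation index, which the STI maps to a lower intelligibility score.
fs_env, f_mod = 100, 4.0                             # envelope rate (Hz), mod freq (Hz)
t = np.arange(0, 10, 1 / fs_env)
clean = 1.0 + 0.8 * np.cos(2 * np.pi * f_mod * t)    # intensity envelope, m = 0.8
noisy = clean + 1.0                                  # intensity with added noise floor
print(modulation_index(clean, f_mod, fs_env))        # ~0.8
print(modulation_index(noisy, f_mod, fs_env))        # ~0.4 -> lower STI
```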

  18. Music expertise shapes audiovisual temporal integration windows for speech, sinewave speech, and music.

    Science.gov (United States)

    Lee, Hweeling; Noppeney, Uta

    2014-01-01

    This psychophysics study used musicians as a model to investigate whether musical expertise shapes the temporal integration window for audiovisual speech, sinewave speech, or music. Musicians and non-musicians judged the audiovisual synchrony of speech, sinewave analogs of speech, and music stimuli at 13 audiovisual stimulus onset asynchronies (±360, ±300, ±240, ±180, ±120, ±60, and 0 ms). Further, we manipulated the duration of the stimuli by presenting sentences/melodies or syllables/tones. Critically, musicians relative to non-musicians exhibited significantly narrower temporal integration windows for both music and sinewave speech. Further, the temporal integration window for music decreased with the amount of music practice, but not with age of acquisition. In other words, the more musicians practiced piano in the past 3 years, the more sensitive they became to the temporal misalignment of visual and auditory signals. Collectively, our findings demonstrate that music practicing fine-tunes the audiovisual temporal integration window to various extents depending on the stimulus class. While the effect of piano practicing was most pronounced for music, it also generalized to other stimulus classes such as sinewave speech and, to a marginally significant degree, to natural speech.

  19. Speech production in amplitude-modulated noise

    DEFF Research Database (Denmark)

    Macdonald, Ewen N; Raufer, Stefan

    2013-01-01

    The Lombard effect refers to the phenomenon where talkers automatically increase their level of speech in a noisy environment. While many studies have characterized how the Lombard effect influences different measures of speech production (e.g., F0, spectral tilt, etc.), few have investigated the consequences of temporally fluctuating noise. In the present study, 20 talkers produced speech in a variety of noise conditions, including both steady-state and amplitude-modulated white noise. While listening to noise over headphones, talkers produced randomly generated five-word sentences. Similar to previous studies, talkers raised the level of their voice in steady-state noise. While talkers also increased the level of their voice in amplitude-modulated noise, the increase was not as large as that observed in steady-state noise. Importantly, for the 2 and 4 Hz amplitude-modulated noise conditions…

  20. Speech perception as complex auditory categorization

    Science.gov (United States)

    Holt, Lori L.

    2002-05-01

    Despite a long and rich history of categorization research in cognitive psychology, very little work has addressed the issue of complex auditory category formation. This is especially unfortunate because the general underlying cognitive and perceptual mechanisms that guide auditory category formation are of great importance to understanding speech perception. I will discuss a new methodological approach to examining complex auditory category formation that specifically addresses issues relevant to speech perception. This approach utilizes novel nonspeech sound stimuli to gain full experimental control over listeners' history of experience. As such, the course of learning is readily measurable. Results from this methodology indicate that the structure and formation of auditory categories are a function of the statistical input distributions of sound that listeners hear, aspects of the operating characteristics of the auditory system, and characteristics of the perceptual categorization system. These results have important implications for phonetic acquisition and speech perception.

  1. Reflections on mirror neurons and speech perception.

    Science.gov (United States)

    Lotto, Andrew J; Hickok, Gregory S; Holt, Lori L

    2009-03-01

    The discovery of mirror neurons, a class of neurons that respond when a monkey performs an action and also when the monkey observes others producing the same action, has promoted a renaissance for the Motor Theory (MT) of speech perception. This is because mirror neurons seem to accomplish the same kind of one-to-one mapping between perception and action that MT theorizes to be the basis of human speech communication. However, this seeming correspondence is superficial, and there are theoretical and empirical reasons to temper enthusiasm about the explanatory role mirror neurons might have for speech perception. In fact, rather than providing support for MT, mirror neurons are actually inconsistent with the central tenets of MT.

  2. Post-editing through Speech Recognition

    DEFF Research Database (Denmark)

    Mesa-Lao, Bartolomé

    In the past couple of years automatic speech recognition (ASR) software has quietly created a niche for itself in many situations of our lives. Nowadays it can be found at the other end of customer-support hotlines, it is built into operating systems and it is offered as an alternative text-input method for smartphones. On another front, given the significant improvements in Machine Translation (MT) quality and the increasing demand for translations, post-editing of MT is becoming a popular practice in the translation industry, since it has been shown to allow for larger volumes of translations to be produced, saving time and costs. The translation industry is at a deeply transformative point in its evolution and the coming years herald an era of convergence where speech technology could make a difference. As post-editing services are becoming a common practice among language service providers and speech…

  3. Speech Recognition Technology for Hearing Disabled Community

    Directory of Open Access Journals (Sweden)

    Tanvi Dua

    2014-09-01

    As the number of people with hearing disabilities is increasing significantly in the world, technology is needed to fill the communication gap between the deaf and hearing communities. To fill this gap and to allow people with hearing disabilities to communicate, this paper suggests a framework that contributes to the efficient integration of people with hearing disabilities. It presents a robust speech recognition system which converts continuous speech into text and images. Results are obtained with an accuracy of 95% on a small vocabulary of 20 greeting sentences of continuous speech, tested in speaker-independent mode. In this testing phase all the continuous sentences were given as live input to the proposed system.

  4. The Beginnings of Danish Speech Perception

    DEFF Research Database (Denmark)

    Østerbye, Torkil

    Little is known about the perception of speech sounds by native Danish listeners. However, the Danish sound system differs in several interesting ways from the sound systems of other languages. For instance, Danish is characterized, among other features, by a rich vowel inventory and by different… in the light of the rich and complex Danish sound system. The first two studies report on native adults' perception of Danish speech sounds in quiet and noise. The third study examined the development of language-specific perception in native Danish infants at 6, 9 and 12 months of age. The book points to interesting differences in the speech perception and acquisition of Danish adults and infants when compared to English. The book is useful for professionals as well as students of linguistics, psycholinguistics and phonetics/phonology, or anyone else who may be interested in language.

  5. Neural overlap in processing music and speech.

    Science.gov (United States)

    Peretz, Isabelle; Vuvan, Dominique; Lagrois, Marie-Élaine; Armony, Jorge L

    2015-03-19

    Neural overlap in processing music and speech, as measured by the co-activation of brain regions in neuroimaging studies, may suggest that parts of the neural circuitries established for language may have been recycled during evolution for musicality, or vice versa that musicality served as a springboard for language emergence. Such a perspective has important implications for several topics of general interest besides evolutionary origins. For instance, neural overlap is an important premise for the possibility of music training to influence language acquisition and literacy. However, neural overlap in processing music and speech does not entail sharing neural circuitries. Neural separability between music and speech may occur in overlapping brain regions. In this paper, we review the evidence and outline the issues faced in interpreting such neural data, and argue that converging evidence from several methodologies is needed before neural overlap is taken as evidence of sharing.

  6. An audiovisual database of English speech sounds

    Science.gov (United States)

    Frisch, Stefan A.; Nikjeh, Dee Adams

    2003-10-01

    A preliminary audiovisual database of English speech sounds has been developed for teaching purposes. This database contains all Standard English speech sounds produced in isolated words in word initial, word medial, and word final position, unless not allowed by English phonotactics. There is one example of each word spoken by a male and a female talker. The database consists of an audio recording, video of the face from a 45 deg angle off of center, and ultrasound video of the tongue in the mid-sagittal plane. The files contained in the database are suitable for examination by the Wavesurfer freeware program in audio or video modes [Sjolander and Beskow, KTH Stockholm]. This database is intended as a multimedia reference for students in phonetics or speech science. A demonstration and plans for further development will be presented.

  7. Automatic Speech Segmentation Based on HMM

    Directory of Open Access Journals (Sweden)

    M. Kroul

    2007-06-01

    Full Text Available This contribution deals with the problem of automatic phoneme segmentation using HMMs. Automation of the speech segmentation task is important for applications in which large amounts of data must be processed, so that manual segmentation is out of the question. In this paper we focus on the automatic segmentation of recordings that will be used to create a database of triphone units for speech synthesis. For speech synthesis, the quality of the speech units is crucial, so maximal segmentation accuracy is needed. In this work, different kinds of HMMs with various parameters have been trained, and their usefulness for automatic segmentation is discussed. At the end of the paper, segmentation accuracy tests for all models are presented.
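
    At its core, HMM-based segmentation of recordings with known transcripts is a Viterbi forced alignment through a left-to-right state sequence. A toy sketch under that assumption, with random frame log-likelihoods standing in for trained emission models (e.g., GMMs over acoustic features):

        import numpy as np

        def align(log_lik):
            """log_lik[t, s]: log-likelihood of frame t under phone state s.
            Left-to-right topology: a state may repeat or advance by one."""
            T, S = log_lik.shape
            delta = np.full((T, S), -np.inf)
            back = np.zeros((T, S), dtype=int)
            delta[0, 0] = log_lik[0, 0]                # alignment starts in state 0
            for t in range(1, T):
                for s in range(S):
                    stay = delta[t - 1, s]
                    move = delta[t - 1, s - 1] if s > 0 else -np.inf
                    back[t, s] = s if stay >= move else s - 1
                    delta[t, s] = max(stay, move) + log_lik[t, s]
            path = [S - 1]                             # must end in the last state
            for t in range(T - 1, 0, -1):
                path.append(back[t, path[-1]])
            return path[::-1]                          # state index per frame

        rng = np.random.default_rng(0)
        print(align(rng.normal(size=(10, 3))))         # monotone path, e.g. [0, 0, 1, ...]

    Phone boundaries fall where the state index changes; segmentation accuracy is then judged against manual labels, as in the tests the paper reports.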

  8. Fighting Hate Speech through EU Law

    Directory of Open Access Journals (Sweden)

    Uladzislau Belavusau

    2012-02-01

    Full Text Available This article explores the rise of the European ‘First Amendment’ beyond national and Strasbourg law, offering a fresh look into the previously under-theorised issue of hate speech in EU law. Building its argument on (1) the scrutiny of fundamental rights protection, (2) the distinction between commercial and non-commercial speech, and, finally, (3) the looking glass of critical race theory, the paper demonstrates how the judgment of the ECJ in the Feryn case implicitly consolidated legal narratives on hate speech in Europe. In this way, the paper reconstructs the dominant European theory of freedom of expression via rhetorical and victim-centered constitutional analysis, bearing important ethical implications for European integration.


  9. Gestures modulate speech processing early in utterances.

    Science.gov (United States)

    Wu, Ying Choon; Coulson, Seana

    2010-05-12

    The electroencephalogram was recorded as healthy adults viewed short videos of spontaneous discourse in which a speaker used depictive gestures to complement information expressed through speech. Event-related potentials were computed time-locked to content words in the speech stream and to subsequent related and unrelated picture probes. Gestures modulated event-related potentials to content words co-timed with the first gesture in a discourse segment, relative to the same words presented with static freeze-frames of the speaker. Effects were observed 200-550 ms after speech onset, a time interval associated with semantic processing. Gestures also increased sensitivity to picture-probe relatedness. Effects of gestures on picture-probe and spoken-word analysis were inversely correlated, suggesting that gestures differentially impact verbal and image-based processes.
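
    A short sketch of the time-locked averaging behind such ERPs: cut a fixed window around each content-word onset and average across trials. The sampling rate, onsets, and the single-channel trace are synthetic placeholders:

        import numpy as np

        fs = 250                                        # sampling rate, Hz (assumed)
        eeg = np.random.default_rng(0).normal(size=60 * fs)  # one channel, 60 s
        onsets = np.arange(2 * fs, 55 * fs, 3 * fs)     # hypothetical word onsets (samples)

        pre, post = int(0.2 * fs), int(0.8 * fs)        # -200 ms to +800 ms window
        epochs = np.stack([eeg[o - pre:o + post] for o in onsets])
        erp = epochs.mean(axis=0)                       # trial-averaged waveform
        print(epochs.shape, erp.shape)                  # (18, 250) (250,)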

  10. Bimodal Emotion Recognition from Speech and Text

    Directory of Open Access Journals (Sweden)

    Weilin Ye

    2014-01-01

    Full Text Available This paper presents an approach to emotion recognition from speech signals and textual content. In the analysis of speech signals, thirty-seven acoustic features are extracted from the speech input. Two different classifiers, Support Vector Machines (SVMs) and a BP neural network, are adopted to classify the emotional states. In text analysis, a two-step classification method is used to recognize the emotional states. The final emotional state is determined based on the emotion outputs from the acoustic and textual analyses. In this paper there are two parallel classifiers for acoustic information and two serial classifiers for textual information, and a final decision is made by combining these classifiers in decision-level fusion. Experimental results show that the emotion recognition accuracy of the integrated system is better than that of either of the two individual approaches.
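
    A minimal sketch of the decision-level fusion step, assuming each branch outputs a posterior over the same emotion classes and the final label maximizes a weighted sum; the class set, weights, and posteriors are hypothetical:

        import numpy as np

        EMOTIONS = ["angry", "happy", "neutral", "sad"]

        def fuse(p_acoustic, p_text, w_acoustic=0.6):
            """Weighted decision-level fusion of two per-class posteriors."""
            p = w_acoustic * np.asarray(p_acoustic) + (1 - w_acoustic) * np.asarray(p_text)
            return EMOTIONS[int(np.argmax(p))]

        print(fuse([0.1, 0.6, 0.2, 0.1], [0.3, 0.2, 0.4, 0.1]))  # -> happy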

  11. Speech defect and orthodontics: a contemporary review.

    Science.gov (United States)

    Doshi, Umal Hiralal; Bhad-Patil, Wasundhara A

    2011-01-01

    In conjunction with the lips, tongue, and oropharynx, the teeth play an important role in the articulation of consonants via airflow obstruction and modification. Therefore, any orthodontic therapy that changes the position of these articulators may play a role in speech disorders. This paper examines the relevant studies and discusses the difficulties of scientific investigation in this area. The ability of patients to adapt their speech to compensate for even the most handicapping occlusions and facial deformities is recognized, but the mechanism for this adaptation remains incompletely understood. The overall conclusion is that while certain malocclusions show a relationship with speech defects, this does not appear to correlate with the severity of the condition. There is no direct cause-and-effect relationship. Similarly, no guarantees of improvement can be given to patients undergoing orthodontic or orthognathic correction of malocclusion.

  12. Changes in breathing while listening to read speech: the effect of reader and speech mode.

    Science.gov (United States)

    Rochet-Capellan, Amélie; Fuchs, Susanne

    2013-01-01

    The current paper extends previous work on breathing during speech perception and provides supplementary material regarding the hypothesis that adaptation of breathing during perception "could be a basis for understanding and imitating actions performed by other people" (Paccalin and Jeannerod, 2000). The experiments were designed to test how differences in a reader's breathing due to speaker-specific characteristics, or differences induced by changes in loudness level or speech rate, influence the listener's breathing. Two readers (a male and a female) were pre-recorded while reading short texts with normal and then loud speech (both readers) or slow speech (female only). These recordings were then played back to 48 female listeners. The movements of the rib cage and abdomen were analyzed for both the readers and the listeners. Breathing profiles were characterized by the movement expansion due to inhalation and the duration of the breathing cycle. We found that loudness and speech rate each affected the two readers' breathing in different ways. Listener breathing differed when listening to the male versus the female reader and across the speech modes. However, differences in listener breathing were not systematically in the same direction as the reader differences. Listeners' breathing was strongly sensitive to the order of presentation of the speech modes and, in some conditions, displayed some adaptation over the time course of the experiment. In contrast to the specific alignment of breathing previously observed in face-to-face dialog, no clear evidence for listener-reader alignment in breathing was found in this purely auditory speech perception task. The results and methods are relevant to the question of the involvement of physiological adaptations in speech perception and to the basic mechanisms of listener-speaker coupling.
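
    A sketch of the two breathing measures used here, movement expansion due to inhalation and breathing-cycle duration, extracted from a rib-cage displacement trace; the sinusoidal signal below merely stands in for real respiratory-band data:

        import numpy as np
        from scipy.signal import find_peaks

        fs = 100.0                                 # sampling rate, Hz (assumed)
        t = np.arange(0, 30, 1 / fs)
        ribcage = np.sin(2 * np.pi * 0.25 * t)     # ~4 s breathing cycles

        peaks, _ = find_peaks(ribcage)             # end-of-inhalation points
        troughs, _ = find_peaks(-ribcage)          # end-of-exhalation points

        cycle_s = np.diff(peaks) / fs              # duration of each breathing cycle
        expansion = ribcage[peaks[1:]] - ribcage[troughs[:len(peaks) - 1]]
        print(cycle_s.mean(), expansion.mean())    # -> ~4.0 s, ~2.0 (arbitrary units)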

  13. A New Speech Codec Based on ANN with Low Delay

    Institute of Scientific and Technical Information of China (English)

    2002-01-01

    The author designs a new speech codec, based on an ANN that carries out nonlinear prediction. This new codec synthesizes speech with better quality than conventional waveform or hybrid codecs do at the same bit rate. Moreover, the most important characteristic of this codec is its low coding delay, which will benefit speech communication QoS when speech signals are transmitted over IP or ATM networks.
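
    A sketch of the core idea, assuming the linear short-term predictor of a classical codec is replaced by a small neural network that predicts each sample from the previous p samples, so only the (smaller) residual needs to be quantized and transmitted; the signal and network size are illustrative:

        import numpy as np
        from sklearn.neural_network import MLPRegressor

        p = 8                                           # predictor order (assumed)
        rng = np.random.default_rng(0)
        x = np.sin(0.3 * np.arange(2000)) + 0.05 * rng.normal(size=2000)

        X = np.stack([x[i:i + p] for i in range(len(x) - p)])  # past-sample windows
        y = x[p:]                                              # next-sample targets

        net = MLPRegressor(hidden_layer_sizes=(16,), max_iter=500,
                           random_state=0).fit(X, y)
        residual = y - net.predict(X)
        print(residual.var() / y.var())                 # fraction of variance left to code

    Sample-by-sample prediction of this kind avoids long analysis frames, which is one way such a codec can keep its coding delay low.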

  14. Speech motor planning and execution deficits in early childhood stuttering

    OpenAIRE

    2015-01-01

    Background Five to eight percent of preschool children develop stuttering, a speech disorder with clearly observable, hallmark symptoms: sound repetitions, prolongations, and blocks. While the speech motor processes underlying stuttering have been widely documented in adults, few studies to date have assessed the speech motor dynamics of stuttering near its onset. We assessed fundamental characteristics of speech movements in preschool children who stutter and their fluent peers to determine ...

  15. Sequential Organization and Room Reverberation for Speech Segregation

    Science.gov (United States)

    2012-02-28

    voiced portions account for about 75-80% of spoken English. Voiced speech is characterized by periodicity (or harmonicity), which has been used as a … onset and offset cues to extract unvoiced speech segments. Acoustic-phonetic features are then used to separate unvoiced speech from nonspeech … estimate is relatively accurate due to weak voiced speech at these frequencies. Based on this analysis and acoustic-phonetic characteristics of …

  16. On Speech Act Theory in Conversations of the Holy Bible

    Institute of Scientific and Technical Information of China (English)

    YANG Hongya

    2014-01-01

    Speech act theory is an important theory in current pragmatics, which originated with the Oxford philosopher John Langshaw Austin. Speech act theory grew out of research on the functions of everyday language. Few papers have used speech act theory to analyze literary works. The Holy Bible is a literary treasure of human history, so this paper uses speech act theory to analyze conversations in the Bible and to provide some enlightenment for readers.

  17. The Hearing Aids Outcome Associated with Speech Discrimination Abilities in Noise in Presbycusis Patients

    Institute of Scientific and Technical Information of China (English)

    彭璐; 梅玲; 张勤; 陈建勇; 李蕴; 任燕; 黄治物

    2015-01-01

    Objective To investigate the relationship between pure-tone audiometry (PTA), speech discrimination abilities in quiet or in noise, and the International Outcome Inventory for Hearing Aids (IOI-HA) in presbycusis patients. Methods Twenty presbycusis subjects were tested. Pure-tone audiometry (PTA) and speech discrimination thresholds were obtained before hearing aid fitting. After the subjects had worn hearing aids for more than six months, pure-tone audiometry and speech discrimination scores in quiet (level = 65 dB SPL) and in noise (signal-to-noise ratio = 10 dB) were measured in the sound field. A stepwise forward multiple-regression analysis was performed to investigate the impact of PTA and speech discrimination scores on IOI-HA. Results The PTAs before and after hearing aid fitting showed a negative association with IOI-HA, while speech discrimination scores in quiet or in noise before and after fitting showed a positive association with IOI-HA. The speech discrimination threshold in noise was identified as the single predictor of IOI-HA (P<0.001). Conclusion The relation between speech discrimination scores in noise and IOI-HA suggests that a poor score might limit hearing aid outcomes. Speech discrimination scores in noise can help clinicians predict real-world hearing aid benefit.
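
    A compact sketch of the forward-stepwise procedure named in the Methods: greedily add whichever predictor (PTA, score in quiet, score in noise) most improves the fit to IOI-HA, stopping when the gain is negligible. The data below are random placeholders in which the noise score drives the outcome, mimicking the reported result:

        import numpy as np
        from sklearn.linear_model import LinearRegression

        rng = np.random.default_rng(1)
        X = rng.normal(size=(20, 3))          # columns: PTA, quiet score, noise score
        names = ["PTA", "quiet", "noise"]
        y = 2 * X[:, 2] + rng.normal(scale=0.3, size=20)   # stand-in for IOI-HA

        selected, remaining = [], [0, 1, 2]
        prev_r2 = 0.0
        while remaining:
            r2, j = max((LinearRegression().fit(X[:, selected + [j]], y)
                         .score(X[:, selected + [j]], y), j) for j in remaining)
            if r2 - prev_r2 < 0.01:           # negligible improvement: stop
                break
            selected.append(j)
            remaining.remove(j)
            prev_r2 = r2
            print(names[j], round(r2, 3))     # 'noise' enters first and alone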

  18. Contribution of envelope periodicity to release from speech-on-speech masking

    DEFF Research Database (Denmark)

    Christiansen, Claus; MacDonald, Ewen; Dau, Torsten

    2013-01-01

    Masking release (MR) is the improvement in speech intelligibility for a fluctuating interferer compared to stationary noise. Reduction in MR due to vocoder processing is usually linked to distortions in the temporal fine structure of the stimuli and a corresponding reduction in fundamental frequency (F0) cues. However, it is unclear if envelope periodicity related to F0, produced by the interaction between unresolved harmonics, contributes to MR. In the present study, MR was determined from speech reception thresholds measured in the presence of stationary speech-shaped noise and a competing…

  19. Speech Enhancement Algorithm Based on MMSE Short Time Spectral Amplitude in Whispered Speech

    Institute of Scientific and Technical Information of China (English)

    Zhi-Heng Lu; Huai-Zong Shao; Tai-Liang Ju

    2009-01-01

    An improved method based on minimum mean square error short-time spectral amplitude (MMSE-STSA) estimation is proposed to cancel background noise in whispered speech. Using the acoustic characteristics of whispered speech, the algorithm can track changes in non-stationary background noise effectively. Compared with the original MMSE-STSA algorithm and the method in the selectable mode vocoder (SMV), the improved algorithm further suppresses residual noise at low signal-to-noise ratios (SNR) and avoids excessive suppression. Simulations show that in non-stationary noisy environments, the proposed algorithm not only achieves better enhancement performance but also reduces speech distortion.
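
    A skeleton of the short-time spectral-amplitude family this algorithm belongs to, with a simple Wiener-type gain standing in for the full MMSE-STSA estimator (whose gain involves Bessel functions) and a naive noise estimate taken from leading speech-free frames; all settings are illustrative:

        import numpy as np
        from scipy.signal import stft, istft

        def enhance(noisy, fs, n_lead=10):
            f, t, Y = stft(noisy, fs=fs, nperseg=256)
            noise_psd = (np.abs(Y[:, :n_lead]) ** 2).mean(axis=1, keepdims=True)
            snr_post = np.abs(Y) ** 2 / np.maximum(noise_psd, 1e-12)
            gain = np.maximum(1 - 1 / np.maximum(snr_post, 1e-12), 0.05)  # gain floor
            _, x_hat = istft(gain * Y, fs=fs, nperseg=256)
            return x_hat

        fs = 8000
        rng = np.random.default_rng(0)
        clean = np.concatenate([np.zeros(2048),
                                np.sin(2 * np.pi * 440 * np.arange(fs) / fs)])
        out = enhance(clean + 0.3 * rng.normal(size=clean.size), fs)

    The gain floor plays the role of the guard against excessive suppression mentioned above; tracking non-stationary noise would replace the fixed leading-frame estimate with a recursive one.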

  20. Commercial speech in crisis: Crisis Pregnancy Center regulations and definitions of commercial speech.

    Science.gov (United States)

    Gilbert, Kathryn E

    2013-02-01

    Recent attempts to regulate Crisis Pregnancy Centers, pseudoclinics that surreptitiously aim to dissuade pregnant women from choosing abortion, have confronted the thorny problem of how to define commercial speech. The Supreme Court has offered three potential answers to this definitional quandary. This Note uses the Crisis Pregnancy Center cases to demonstrate that courts should use one of these solutions, the factor-based approach of Bolger v. Youngs Drugs Products Corp., to define commercial speech in the Crisis Pregnancy Center cases and elsewhere. In principle and in application, the Bolger factor-based approach succeeds in structuring commercial speech analysis at the margins of the doctrine.

  1. The Impact of Extrinsic Demographic Factors on Cantonese Speech Acquisition

    Science.gov (United States)

    To, Carol K. S.; Cheung, Pamela S. P.; McLeod, Sharynne

    2013-01-01

    This study modeled the associations between extrinsic demographic factors and children's speech acquisition in Hong Kong Cantonese. The speech of 937 Cantonese-speaking children aged 2;4 to 6;7 in Hong Kong was assessed using a standardized speech test. Demographic information regarding household income, paternal education, maternal education,…

  2. Orangutan call communication and the puzzle of speech evolution

    NARCIS (Netherlands)

    Reis E Lameira, A.

    2013-01-01

    Speech is a human hallmark. However, its evolution is little understood. It remains largely unknown which features of the call communication of our closest relatives – great apes – may have constituted speech evolutionary feedstock. In this study, I investigate the extent to which speech building bl

  3. Recent Research on the Treatment of Speech Anxiety.

    Science.gov (United States)

    Page, Bill

    Apprehension on the part of students who must engage in public speaking figures high on the list of student fears. Speech anxiety has been viewed as a trait--the overall propensity to fear giving speeches--and as a state--the condition of fearfulness on a particular occasion of speech making. Methodology in therapy research should abide by the…

  4. 38 CFR 8.18 - Total disability-speech.

    Science.gov (United States)

    2010-07-01

    38 CFR Pensions, Bonuses, and Veterans' Relief (2010-07-01) — NATIONAL SERVICE LIFE INSURANCE, Premium Waivers and Total Disability, § 8.18 Total disability—speech. The organic loss of speech shall be deemed to be total disability under National Service Life…

  5. Speech and audio processing for coding, enhancement and recognition

    CERN Document Server

    Togneri, Roberto; Narasimha, Madihally

    2015-01-01

    This book describes the basic principles underlying the generation, coding, transmission and enhancement of speech and audio signals, including advanced statistical and machine learning techniques for speech and speaker recognition, with an overview of the key innovations in these areas. Key research undertaken in speech coding, speech enhancement, speech recognition, emotion recognition and speaker diarization is also presented, along with recent advances and new paradigms in these areas. The book offers readers a single-source reference on the significant applications of speech and audio processing to speech coding, speech enhancement and speech/speaker recognition; enables readers involved in algorithm development and implementation issues for speech coding to understand the historical development and future challenges in speech coding research; and discusses speech coding methods yielding bit-streams that are multi-rate and scalable for Voice-over-IP (VoIP) networks…

  6. Binaural intelligibility prediction based on the speech transmission index

    NARCIS (Netherlands)

    Wijngaarden, S.J. van; Drullman, R.

    2008-01-01

    Although the speech transmission index (STI) is a well-accepted and standardized method for the objective prediction of speech intelligibility in a wide range of environments and applications, it is essentially a monaural model. Advantages of binaural hearing in speech intelligibility are disregarded. In…
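
    For reference, the core of the monaural STI computation that a binaural extension builds on: each modulation-transfer value m maps to an apparent SNR, which is clipped and rescaled to a transmission index and averaged. This sketch omits the octave-band weightings and masking corrections of the full standard:

        import numpy as np

        def sti(m_values):
            m = np.clip(np.asarray(m_values, dtype=float), 1e-6, 1 - 1e-6)
            snr = 10 * np.log10(m / (1 - m))       # apparent SNR in dB
            snr = np.clip(snr, -15, 15)            # useful range is +/- 15 dB
            return ((snr + 15) / 30).mean()        # transmission indices in [0, 1]

        print(round(sti([0.9, 0.7, 0.5, 0.3]), 2)) # -> 0.58, a middling channel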

  7. Lip Movement Exaggerations during Infant-Directed Speech

    Science.gov (United States)

    Green, Jordan R.; Nip, Ignatius S. B.; Wilson, Erin M.; Mefferd, Antje S.; Yunusova, Yana

    2010-01-01

    Purpose: Although a growing body of literature has identified the positive effects of visual speech on speech and language learning, oral movements of infant-directed speech (IDS) have rarely been studied. This investigation used 3-dimensional motion capture technology to describe how mothers modify their lip movements when talking to their…

  8. Using on-line altered auditory feedback treating Parkinsonian speech

    Science.gov (United States)

    Wang, Emily; Verhagen, Leo; de Vries, Meinou H.

    2005-09-01

    Patients with advanced Parkinson's disease tend to have dysarthric speech that is hesitant, accelerated, and repetitive, and that is often resistant to behavioral speech therapy. In this pilot study, the speech disturbances were treated using on-line altered auditory feedback (AF) provided by SpeechEasy (SE), an in-the-ear device registered with the FDA for use in humans to treat chronic stuttering. Eight PD patients participated in the study. All had moderate to severe speech disturbances. In addition, two patients had moderate recurring stuttering at the onset of PD after long remission since adolescence, two had bilateral STN DBS, and two had bilateral pallidal DBS. An effective combination of delayed auditory feedback and frequency-altered feedback was selected for each subject and provided via SE worn in one ear. All subjects produced speech samples (structured monologue and reading) under three conditions: baseline, wearing SE without feedback, and wearing SE with feedback. The speech samples were randomly presented and rated for speech intelligibility (UPDRS-III item 18) and speaking rate. The results indicated that SpeechEasy is well tolerated and that AF can improve speech intelligibility in spontaneous speech. Further investigational use of this device for treating speech disorders in PD is warranted [Work partially supported by Janus Dev. Group, Inc.].
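
    A sketch of the delayed-auditory-feedback half of that combination: the wearer hears their own voice delayed by a fixed interval (the frequency-shifting half would additionally require a pitch shifter and is omitted). Delay value and signal are illustrative:

        import numpy as np

        def delayed_feedback(x, fs, delay_ms=60.0):
            """Return the signal the ear receives: the input delayed by delay_ms."""
            d = int(fs * delay_ms / 1000)          # delay in samples
            return np.concatenate([np.zeros(d), x])[:len(x)]

        fs = 16000
        voice = np.random.default_rng(0).normal(size=fs)  # 1 s stand-in for speech
        fed_back = delayed_feedback(voice, fs)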

  9. The Influence of Bilingualism on Speech Production: A Systematic Review

    Science.gov (United States)

    Hambly, Helen; Wren, Yvonne; McLeod, Sharynne; Roulstone, Sue

    2013-01-01

    Background: Children who are bilingual and have speech sound disorder are likely to be under-referred, possibly due to confusion about typical speech acquisition in bilingual children. Aims: To investigate what is known about the impact of bilingualism on children's acquisition of speech in English to facilitate the identification and treatment of…

  10. Causes of Speech Disorders in Primary School Students of Zahedan

    Directory of Open Access Journals (Sweden)

    Saeed Fakhrerahimi

    2013-02-01

    Full Text Available Background: Since making communication with others is the most important function of speech, any type of speech disorder will undoubtedly affect a person's ability to communicate. The objective of the study was to investigate the reasons behind the prevalence of stammering, production disorders and aglossia. Materials and Methods: This descriptive-analytical study was conducted on 118 male and female primary school students in Zahedan who had been referred to the speech therapy centers of Zahedan University of Medical Sciences over a period of seven months. Speech therapist examinations, diagnostic tools common in speech therapy, the Spielberg Children Trait and the patients' case files were used to find the reasons behind the prevalence of speech disorders. Results: Among the factors studied, psychological causes showed the strongest correlation with speech disorders. After psychological causes, family history and the age of the subjects were the other factors associated with speech disorders (P<0.05). Bilingualism and birth order had a negative relationship with speech disorders. Likewise, only psychological causes, social causes, hereditary causes and the age of subjects predicted speech disorders (P<0.05). Conclusion: The present study shows that speech disorders have a strong and close relationship primarily with psychological causes, and secondarily with family history and the age of individuals.

  11. Perception of Intersensory Synchrony in Audiovisual Speech: Not that Special

    Science.gov (United States)

    Vroomen, Jean; Stekelenburg, Jeroen J.

    2011-01-01

    Perception of intersensory temporal order is particularly difficult for (continuous) audiovisual speech, as perceivers may find it difficult to notice substantial timing differences between speech sounds and lip movements. Here we tested whether this occurs because audiovisual speech is strongly paired ("unity assumption"). Participants made…

  12. Speech intelligibility after gingivectomy of excess palatal tissue

    OpenAIRE

    Aruna Balasundaram; Mythreyi Vinayagavel; Dhathri Priya Bandi

    2014-01-01

    To appreciate any enhancement in speech following gingivectomy of enlarged anterior palatal gingiva. The periodontal literature has documented various conditions, pathophysiology, and treatment modalities of gingival enlargement. The relationship between gingival maladies and speech alteration has received scant attention. This case report describes enhancement of an altered speech pattern secondary to a gingivectomy procedure. A systemically healthy 24-year-old female patient reported with bilateral ant...

  13. Differential Diagnosis of Children with Suspected Childhood Apraxia of Speech

    Science.gov (United States)

    Murray, Elizabeth; McCabe, Patricia; Heard, Robert; Ballard, Kirrie J.

    2015-01-01

    Purpose: The gold standard for diagnosing childhood apraxia of speech (CAS) is expert judgment of perceptual features. The aim of this study was to identify a set of objective measures that differentiate CAS from other speech disorders. Method: Seventy-two children (4-12 years of age) diagnosed with suspected CAS by community speech-language…

  14. A neural mechanism for recognizing speech spoken by different speakers

    NARCIS (Netherlands)

    Kreitewolf, Jens; Gaudrain, Etienne; von Kriegstein, Katharina

    2014-01-01

    Understanding speech from different speakers is a sophisticated process, particularly because the same acoustic parameters convey important information about both the speech message and the person speaking. How the human brain accomplishes speech recognition under such conditions is unknown. One vie

  15. Prosodic Features and Speech Naturalness in Individuals with Dysarthria

    Science.gov (United States)

    Klopfenstein, Marie I.

    2012-01-01

    Despite the importance of speech naturalness to treatment outcomes, little research has been done on what constitutes speech naturalness and how to best maximize naturalness in relationship to other treatment goals like intelligibility. In addition, previous literature alludes to the relationship between prosodic aspects of speech and speech…

  16. Correlates of older adults' discrimination of acoustic properties in speech

    NARCIS (Netherlands)

    Neger, T.M.; Janse, E.; Rietveld, A.C.M.

    2015-01-01

    Auditory discrimination of speech stimuli is an essential tool in speech and language therapy, e.g., in dysarthria rehabilitation. It is unclear, however, which listener characteristics are associated with the ability to perceive differences between one's own utterance and target speech. Knowledge a

  17. The Kiel Corpora of "Speech & Emotion" - A Summary

    DEFF Research Database (Denmark)

    Niebuhr, Oliver; Peters, Benno; Landgraf, Rabea

    2015-01-01

    …technology applications that sneak into every corner of our life. Apart from the fact that speech corpora seem to become constantly larger (for example, in order to properly train self-learning speech synthesis/recognition algorithms), the content of speech corpora also changes. In particular, recordings...

  18. Electrocorticography Reveals Enhanced Visual Cortex Responses to Visual Speech.

    Science.gov (United States)

    Schepers, Inga M; Yoshor, Daniel; Beauchamp, Michael S

    2015-11-01

    Human speech contains both auditory and visual components, processed by their respective sensory cortices. We test a simple model in which task-relevant speech information is enhanced during cortical processing. Visual speech is most important when the auditory component is uninformative. Therefore, the model predicts that visual cortex responses should be enhanced to visual-only (V) speech compared with audiovisual (AV) speech. We recorded neuronal activity as patients perceived auditory-only (A), V, and AV speech. Visual cortex showed strong increases in high-gamma band power and strong decreases in alpha-band power to V and AV speech. Consistent with the model prediction, gamma-band increases and alpha-band decreases were stronger for V speech. The model predicts that the uninformative nature of the auditory component (not simply its absence) is the critical factor, a prediction we tested in a second experiment in which visual speech was paired with auditory white noise. As predicted, visual speech with auditory noise showed enhanced visual cortex responses relative to AV speech. An examination of the anatomical locus of the effects showed that all visual areas, including primary visual cortex, showed enhanced responses. Visual cortex responses to speech are enhanced under circumstances when visual information is most important for comprehension.
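
    A sketch of the standard way such band-limited power is extracted: band-pass the electrode signal and take the squared Hilbert envelope. Band edges and the synthetic trace are illustrative, not the study's parameters:

        import numpy as np
        from scipy.signal import butter, filtfilt, hilbert

        def band_power(x, fs, lo, hi, order=4):
            b, a = butter(order, [lo / (fs / 2), hi / (fs / 2)], btype="band")
            return np.abs(hilbert(filtfilt(b, a, x))) ** 2   # envelope power

        fs = 1000
        t = np.arange(0, 2, 1 / fs)
        ecog = np.sin(2 * np.pi * 90 * t) + 0.5 * np.sin(2 * np.pi * 10 * t)
        hg = band_power(ecog, fs, 70, 150)      # high-gamma band
        alpha = band_power(ecog, fs, 8, 12)     # alpha band
        print(hg.mean() > alpha.mean())         # -> True for this toy trace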

  19. New Ways in Teaching Connected Speech. New Ways Series

    Science.gov (United States)

    Brown, James Dean, Ed.

    2012-01-01

    Connected speech is based on a set of rules used to modify pronunciations so that words connect and flow more smoothly in natural speech (hafta versus have to). Native speakers of English tend to feel that connected speech is friendlier, more natural, more sympathetic, and more personal. Is there any reason why learners of English would prefer to…

  20. Automated Discovery of Speech Act Categories in Educational Games

    Science.gov (United States)

    Rus, Vasile; Moldovan, Cristian; Niraula, Nobal; Graesser, Arthur C.

    2012-01-01

    In this paper we address the important task of automated discovery of speech act categories in dialogue-based, multi-party educational games. Speech acts are important in dialogue-based educational systems because they help infer the student speaker's intentions (the task of speech act classification) which in turn is crucial to providing adequate…