WorldWideScience

Sample records for audiometry speech

  1. Audiometry

    Science.gov (United States)

    ... The following conditions may affect test results: Acoustic neuroma Acoustic trauma Age-related hearing loss Alport syndrome ... Mosby Elsevier; 2010:chap 190. Read More Acoustic neuroma Acoustic trauma Age-related hearing loss Alport syndrome ...

  2. BRAIN STEM EVOKED RESPONSE AUDIOMETRY A REVIEW

    Directory of Open Access Journals (Sweden)

    Balasubramanian Thiagarajan

    2015-01-01

    Full Text Available Brain stem evoked response audiometry (BERA is a useful objective assessment of hearing. Major advantage of this procedure is its ability to test even infants in whom conventional audiometry may not be useful. This investigation can be used as a screening test for deafness in high risk infants. Early diagnosis and rehabilitation will reduce disability in these children. This article attempts to review the published literature on this subject.

  3. Objective Audiometry using Ear-EEG

    DEFF Research Database (Denmark)

    Christensen, Christian Bech; Kidmose, Preben

    . Recently, a novel EEG-method called ear-EEG that enable recording of auditory evoked potentials from a personalized earpiece was introduced. Since ear-EEG provides a discrete and non-invasive way of measuring neural signals and can be integrated into hearing aids, it has great potential for use in everyday...... life. Ear-EEG may therefore be an enabling technology for objective audiometry out of the clinic, allowing regularly fitting of the hearing aids to be made by the users in their everyday life environment. In this study we investigate the application of ear-EEG in objective audiometry....

  4. Test person operated 2-Alternative Forced Choice Audiometry compared to traditional audiometry

    DEFF Research Database (Denmark)

    Schmidt, Jesper Hvass; Brandt, Christian; Christensen-Dalsgaard, Jakob;

      Background: With a newly developed technique, hearing thresholds can be estimated with a system operated by the test persons themselves. This technique is based on the 2 Alternative Forced Choice paradigm known from the psychoacoustic research theory. Test persons can operate the system very...... easily themselves. Furthermore the system uses the theories behind the methods of maximum-likelihood fitting of the most probable psychometric function and a modification of the well known up-down methods in the estimation of the hearing thresholds. The combination of the 2AFC paradigm and the maximum...... comparison with traditional audiometry. A series of 30 persons (60 ears) have conducted traditional audiometry as well as self-operated 2AFC-audiometry. Test subjects are normal as well as moderately hearing impaired people. The different thresholds are compared.   Results: 2 AFC Audiometry is reliable and...

  5. Objective Audiometry using Ear-EEG

    DEFF Research Database (Denmark)

    Christensen, Christian Bech; Kidmose, Preben

    Recently, a novel electroencephalographic (EEG) method called ear-EEG [1], that enable recording of auditory evoked potentials (AEPs) from a personalized earpiece was introduced. Initial investigations show that well established AEPs, such as ASSR and P1-N1-P2 complex can be observed from ear-EEG...... recordings [2, 3], implying a possible application for ear-EEG in audiometric characterization of hearing loss. Since the Ear-EEG method provides a discrete and non-invasive way of measuring neural signals and can be integrated into hearing aids, it has great potential for use in everyday life. Ear-EEG may...... therefore be an enabling technology for objective audiometry out of the clinic, allowing regularly fitting of the hearing aids to be made by the users in their everyday life environment. The objective of this study is to investigate the application of ear-EEG in objective audiometry....

  6. Auditory assessment of children with severe hearing loss using behavioural observation audiometry and brainstem evoked response audiometry

    OpenAIRE

    Rakhi Kumari; Priyanko Chakraborty; Jain, R K; Dhananjay Kumar

    2016-01-01

    Background: Early detection of hearing loss has been a long-standing priority in the field of audiology. Currently available auditory testing methods include both behavioural and non-behavioural or objective tests of hearing. This study was planned with an objective to assess hearing loss in children using behavioural observation audiometry and brain stem evoked response audiometry. Methods: A total of 105 cases suffering from severe to profound hearing loss were registered. After proper h...

  7. The Role of Immittance Audiometry in Detecting Middle Ear Disease

    OpenAIRE

    Jacobson, John T.

    1981-01-01

    Immittance audiometry is an objective technique which evaluates middle ear function by three procedures: static immittance, tympanometry, and the measurement of acoustic reflex threshold sensitivity. This article discusses the technique's ability to identify middle ear effusion, the single leading ear disease in children.

  8. Audiometry and ossicular condition in chronic otitis media

    OpenAIRE

    mohsen Rajati Haghi; Mohamad Mahdi Ghasemi; Mehdi Bakhshaee; Atefeh Taghati; Atefeh Shahabipour

    2009-01-01

      Introduction: Ossicular chain injury is one of the most common causes of hearing loss in chronic otitis media (COM). Although definite diagnosis of ossicular discontinuity is made intraoperatively, preoperative determination of ossicular chain injury will help the surgeon decide about reconstruction options and hearing prognosis of the patient. In this study we compared preoperative pure tone audiometry (PTA) findings of COM patients with the ossicular condition determined during surgery. M...

  9. Prediction of hearing thresholds: Comparison of cortical evoked response audiometry and auditory steady state response audiometry techniques

    OpenAIRE

    Wong, LLN; Yeung, KNK

    2007-01-01

    The present study evaluated how well auditory steady state response (ASSR) and tone burst cortical evoked response audiometry (CERA) thresholds predict behavioral thresholds in the same participants. A total of 63 ears were evaluated. For ASSR testing, 100% amplitude modulated and 10% frequency modulated tone stimuli at a modulation frequency of 40Hz were used. Behavioral thresholds were closer to CERA thresholds than ASSR thresholds. ASSR and CERA thresholds were closer to behavioral thresho...

  10. Extended High Frequency Audiometry in Polycystic Ovary Syndrome

    Directory of Open Access Journals (Sweden)

    Cuneyt Kucur

    2013-01-01

    and BMI of PCOS and control groups were comparable. Each subject was tested with low (250–2000 Hz, high (4000–8000 Hz, and extended high frequency audiometry (8000–20000. Hormonal and biochemical values including LH, LH/FSH, testosterone, fasting glucose, fasting insulin, HOMA-I, and CRP were calculated. Results. PCOS patients showed high levels of LH, LH/FSH, testosterone, fasting insulin, glucose, HOMA-I, and CRP levels. The hearing thresholds of the groups were similar at frequencies of 250, 500, 1000, 2000, and 4000 Hz; statistically significant difference was observed in 8000–14000 Hz in PCOS group compared to control group. Conclusion. PCOS patients have hearing impairment especially in extended high frequencies. Further studies are needed to help elucidate the mechanism behind hearing impairment in association with PCOS.

  11. The Frequency of Hearing Loss and Hearing Aid Prescription in the Clients of the Avesina Education and Health Center, Audiometry Clinic, 1377

    Directory of Open Access Journals (Sweden)

    Abbas Bastani

    2003-08-01

    Full Text Available Objective: Determining the frequency of hearing disorders and hearing aid using in the clients referring to the Avesina education and health center, audiometry clinic, 1377. Method and Material: This is an assesive-descriptive survey that conducted on more than 2053 (1234 males and 819 females who referred for audiometry after examination by a physician. Case history, otoscopy, PTA, speech and immittance audiometry were conducted for all the clients. The findings were expressed in tables and diagrams of frequency. The age and sex relationship. All types of hearing losses and the number of the hearing-impaired clients need a hearing aid were assessed. Findings: 56% of this population were hearing-impaired and 44% had normal hearing were hearing. 60% were males and 40% females. Of the hearing-impaired, 44% had SNHL, 35.6% CHL and 8.2% mixed hearing loss. The hearing aid was prescribed for 204 (83 females and121 males if they need that only 20 females and 32 males wear it. Conclusion: It this sample, SNHL is of higher frequency. According to this survey, the more the age, the more the hearing aid is accepted (85% of wearer are more than 49 the prevalence of the hearing impaired males are more than females (60% versus 40%. Only 25% of the hearing-impaired wear hearing aids.

  12. A user-operated audiometry method based on the maximum likelihood principle and the two-alternative forced-choice paradigm

    DEFF Research Database (Denmark)

    Schmidt, Jesper Hvass; Brandt, Christian; Pedersen, Ellen Raben;

    2014-01-01

    Objective: To create a user-operated pure-tone audiometry method based on the method of maximum likelihood (MML) and the two-alternative forced-choice (2AFC) paradigm with high test-retest reliability without the need of an external operator and with minimal influence of subjects' fluctuating...... response criteria. User-operated audiometry was developed as an alternative to traditional audiometry for research purposes among musicians. Design: Test-retest reliability of the user-operated audiometry system was evaluated and the user-operated audiometry system was compared with traditional audiometry....... Study sample: Test-retest reliability of user-operated 2AFC audiometry was tested with 38 naïve listeners. User-operated 2AFC audiometry was compared to traditional audiometry in 41 subjects. Results: The repeatability of user-operated 2AFC audiometry was comparable to traditional audiometry with...

  13. Audiometry and ossicular condition in chronic otitis media

    Directory of Open Access Journals (Sweden)

    mohsen Rajati Haghi

    2009-07-01

    Full Text Available   Introduction: Ossicular chain injury is one of the most common causes of hearing loss in chronic otitis media (COM. Although definite diagnosis of ossicular discontinuity is made intraoperatively, preoperative determination of ossicular chain injury will help the surgeon decide about reconstruction options and hearing prognosis of the patient. In this study we compared preoperative pure tone audiometry (PTA findings of COM patients with the ossicular condition determined during surgery. Materials and Methods: 97 Patients with COM who underwent ear surgery for the first time were included in the study. A checklist of preoperative clinical findings, audiometric parameters and intraoperative findings was filled out for all patients. Results: Mean amount of Air-Bone Gap (ABG, Bone Conduction threshold (BC and Air Conduction threshold (AC of 97 Patients were 35.17, 13.13 and 48.30 respectively. In ears with or without cholesteatoma, granulation tissue, or otorrhea, mean of AC, BC, and ABG were not significantly different. In ossicular erosion and discontinuity (OD, mean of AC and BC thresholds increased significantly but ABG didn’t change significantly. Conclusion: According to the results of this study, in preoperative assessment of COM patients to predict ossicular condition we recommend considering AC, BC and ABG levels together instead of using ABG alone as is routine in our daily practice.

  14. Noise induced hearing loss: Screening with pure-tone audiometry and speech-in-noise testing

    OpenAIRE

    Leensen, M. C. J.

    2013-01-01

    Noise-induced hearing loss (NIHL) is a highly prevalent public health problem, caused by exposure to loud noises both during leisure time, e.g. by listening to loud music, and during work. In the past years NIHL was the most commonly reported occupational disease in the Netherlands. Hearing damage caused by noise is irreversible, but largely preventable. The early detection of hearing loss is of great importance, and is applied by preventative testing of hearing ability. This thesis investiga...

  15. Accuracy of Cortical Evoked Response Audiometry in estimating normal hearing thresholds

    OpenAIRE

    Mahdavi M E; Peyvandi A A

    2007-01-01

    Background: Cortical Evoked Response Audiometry (CERA) refers to prediction of behavioral pure-tone thresholds (500-4000 Hz) obtained by recording the N1-P2 complex of auditory long latency responses. CERA is the preferred method for frequency–specific estimation of audiogram in conscious adults and older children. CERA has an increased accuracy of determination of the hearing thresholds of alert patients with elevated hearing thresholds with sensory hearing loss; however few publications rep...

  16. STANDARDIZNG OF BRAINSTEM EVOKED RESPONSE AUDIOMETRY VALUES PRELIMINARY TO STARTING BERA LAB IN A HOSPITAL

    Directory of Open Access Journals (Sweden)

    Sivaprasad

    2014-07-01

    Full Text Available INTRODUCTION: The subjective assessment of hearing is primarily done by pure tone audiometry. It is commonly undertaken test which can tell us the hearing acuity of a person when carried under ideal conditions. However, not infrequently the otologists encounter a difficulty to do subjective audiometry or in those circumstances where the test results are not correlating with the disease in question. Hence they have to depend upon the objective tests to get a workable knowledge about the patients hearing threshold. Of the various objective tests available the most popular are Brain stem evoked response audiometry –non-invasive and more standardized parameter, Electro-cochleography, auditory steady state response. Otoacoustic Emission test (OAE Otoacoustic emission doesn’t measure the hearing acuity, it gives us an idea whether there is any deafness or not. But BERA is useful in detecting and quantification of deafness in the difficult-to-test patients like infants, mentally retarded people, malingers, deeply sedated and anaesthetized patients. It determines objectively the nature of deafness (i.e., whether sensory or neural in difficult-to-test patients. It helps to locate the site of lesion in retro-cochlear pathologies (in an area from spiral ganglion of the cochlear nerve to midbrain (inferior colliculus. Study of central auditory disorders is possible. Study of maturity of central nervous system in newborns, objective identification of brain death, assessing prognosis in comatose patients are other uses. Before starting a BERA lab in a hospital it is mandatory to standardize the normal values in a randomly selected group of persons with certain criteria like; normal ears with intact T.M and without any complaints of loss of hearing. Persons aged between 05 to 60 years are taken for this study. The study group included both males and females. The aim of this study is to assess the hearing pathway in normal hearing individuals and compare

  17. Speech Problems

    Science.gov (United States)

    ... your treatment plan may include seeing a speech therapist , a person who is trained to treat speech disorders. How often you have to see the speech therapist will vary — you'll probably start out seeing ...

  18. Speech Development

    Science.gov (United States)

    ... Spotlight Fundraising Ideas Vehicle Donation Volunteer Efforts Speech Development skip to submenu Parents & Individuals Information for Parents & Individuals Speech Development To download the PDF version of this factsheet, ...

  19. Accuracy of Cortical Evoked Response Audiometry in estimating normal hearing thresholds

    Directory of Open Access Journals (Sweden)

    Mahdavi M E

    2007-07-01

    Full Text Available Background: Cortical Evoked Response Audiometry (CERA refers to prediction of behavioral pure-tone thresholds (500-4000 Hz obtained by recording the N1-P2 complex of auditory long latency responses. CERA is the preferred method for frequency–specific estimation of audiogram in conscious adults and older children. CERA has an increased accuracy of determination of the hearing thresholds of alert patients with elevated hearing thresholds with sensory hearing loss; however few publications report studies regarding the use of CERA for estimating normal hearing thresholds. The purpose of this research was to further study the accuracy of CERA in predicting hearing thresholds when there is no hearing loss. Methods: Behavioral hearing thresholds of 40 alert normal hearing young adult male (40 ears screened at 20 dB HL in 500-8000Hz, predicted by recording N1-P2 complex of auditory evoked long latency responses to 10-30-10 ms tone bursts. After CERA, pure tone audiometry performed by other audiologist. All judgments about presence of responses performed visually. Stimulus rate variation and temporary interruption of stimulus presentation was used for preventing amplitude reduction of the responses. 200-250 responses were averaged near threshold. Results: In 95% of the hearing threshold predictions, N1-P2 thresholds were within 0-15 dB SL of true hearing thresholds. In the other 5%, the difference between the CERA threshold and true hearing threshold was 20-25 dB. The mean threshold obtained for tone bursts of 0.5, 1, 2 and 4 kHz were 12.6 ± 4.5, 10.9 ± 5.8, 10.8 ± 6.5 and 11.2 ± 4.1 dB, respectively, above the mean behavioral hearing thresholds for air-conducted pure tone stimuli. Conclusion: On average, CERA has a relatively high accuracy for the prediction of normal hearing sensitivity, comparable to that of previous studies performed on CERA in hearing-impaired populations.

  20. The Galker test of speech reception in noise

    DEFF Research Database (Denmark)

    Lauritsen, Maj-Britt Glenn; Söderström, Margareta; Kreiner, Svend;

    2016-01-01

    PURPOSE: We tested "the Galker test", a speech reception in noise test developed for primary care for Danish preschool children, to explore if the children's ability to hear and understand speech was associated with gender, age, middle ear status, and the level of background noise. METHODS......: The Galker test is a 35-item audio-visual, computerized word discrimination test in background noise. Included were 370 normally developed children attending day care center. The children were examined with the Galker test, tympanometry, audiometry, and the Reynell test of verbal comprehension. Parents...... to Reynell test scores (Gamma (G)=0.35), the children's age group (G=0.33), and the day care teachers' assessment of the children's vocabulary (G=0.26). CONCLUSIONS: The Galker test of speech reception in noise appears promising as an easy and quick tool for evaluating preschool children's understanding...

  1. Contrast sensitivity test and conventional and high frequency audiometry: information beyond that required to prescribe lenses and headsets

    Science.gov (United States)

    Comastri, S. A.; Martin, G.; Simon, J. M.; Angarano, C.; Dominguez, S.; Luzzi, F.; Lanusse, M.; Ranieri, M. V.; Boccio, C. M.

    2008-04-01

    In Optometry and in Audiology, the routine tests to prescribe correction lenses and headsets are respectively the visual acuity test (the first chart with letters was developed by Snellen in 1862) and conventional pure tone audiometry (the first audiometer with electrical current was devised by Hartmann in 1878). At present there are psychophysical non invasive tests that, besides evaluating visual and auditory performance globally and even in cases catalogued as normal according to routine tests, supply early information regarding diseases such as diabetes, hypertension, renal failure, cardiovascular problems, etc. Concerning Optometry, one of these tests is the achromatic luminance contrast sensitivity test (introduced by Schade in 1956). Concerning Audiology, one of these tests is high frequency pure tone audiometry (introduced a few decades ago) which yields information relative to pathologies affecting the basal cochlea and complements data resulting from conventional audiometry. These utilities of the contrast sensitivity test and of pure tone audiometry derive from the facts that Fourier components constitute the basis to synthesize stimuli present at the entrance of the visual and auditory systems; that these systems responses depend on frequencies and that the patient's psychophysical state affects frequency processing. The frequency of interest in the former test is the effective spatial frequency (inverse of the angle subtended at the eye by a cycle of a sinusoidal grating and measured in cycles/degree) and, in the latter, the temporal frequency (measured in cycles/sec). Both tests have similar duration and consist in determining the patient's threshold (corresponding to the inverse multiplicative of the contrast or to the inverse additive of the sound intensity level) for each harmonic stimulus present at the system entrance (sinusoidal grating or pure tone sound). In this article the frequencies, standard normality curves and abnormal threshold shifts

  2. Evaluation of adult aphasics with the Pediatric Speech Intelligibility test.

    Science.gov (United States)

    Jerger, S; Oliver, T A; Martin, R C

    1990-04-01

    Results of conventional adult speech audiometry may be compromised by the presence of speech/language disorders, such as aphasia. The purpose of this project was to determine the efficacy of the speech intelligibility materials and techniques developed for young children in evaluating central auditory function in aphasic adults. Eight adult aphasics were evaluated with the Pediatric Speech Intelligibility (PSI) test, a picture-pointing approach that was carefully developed to be relatively insensitive to linguistic-cognitive skills and relatively sensitive to auditory-perceptual function. Results on message-to-competition ratio (MCR) functions or performance-intensity (PI) functions were abnormal in all subjects. Most subjects served as their own controls, showing normal performance on one ear coupled with abnormal performance on the other ear. The patterns of abnormalities were consistent with the patterns seen (1) on conventional speech audiometry in brain-lesioned adults without aphasia and (2) on the PSI test in brain-lesioned children without aphasia. An exception to this general observation was an atypical pattern of abnormality on PI-function testing in the subgroup of nonfluent aphasics. The nonfluent subjects showed substantially poorer word-max scores than sentence-max scores, a pattern seen previously in only one other patient group, namely young children with recurrent otitis media. The unusually depressed word-max abnormality was not meaningfully related to clinical diagnostic data regarding the degree of hearing loss and the location and severity of the lesions or to experimental data regarding the integrity of phonologic processing abilities. The observations of ear-specific and condition-specific abnormalities suggest that the linguistically- and cognitively-simplified PSI test may be useful in the evaluation of auditory-specific deficits in the aphasic adult. PMID:2132591

  3. Speech coding

    Energy Technology Data Exchange (ETDEWEB)

    Ravishankar, C., Hughes Network Systems, Germantown, MD

    1998-05-08

    Speech is the predominant means of communication between human beings and since the invention of the telephone by Alexander Graham Bell in 1876, speech services have remained to be the core service in almost all telecommunication systems. Original analog methods of telephony had the disadvantage of speech signal getting corrupted by noise, cross-talk and distortion Long haul transmissions which use repeaters to compensate for the loss in signal strength on transmission links also increase the associated noise and distortion. On the other hand digital transmission is relatively immune to noise, cross-talk and distortion primarily because of the capability to faithfully regenerate digital signal at each repeater purely based on a binary decision. Hence end-to-end performance of the digital link essentially becomes independent of the length and operating frequency bands of the link Hence from a transmission point of view digital transmission has been the preferred approach due to its higher immunity to noise. The need to carry digital speech became extremely important from a service provision point of view as well. Modem requirements have introduced the need for robust, flexible and secure services that can carry a multitude of signal types (such as voice, data and video) without a fundamental change in infrastructure. Such a requirement could not have been easily met without the advent of digital transmission systems, thereby requiring speech to be coded digitally. The term Speech Coding is often referred to techniques that represent or code speech signals either directly as a waveform or as a set of parameters by analyzing the speech signal. In either case, the codes are transmitted to the distant end where speech is reconstructed or synthesized using the received set of codes. A more generic term that is applicable to these techniques that is often interchangeably used with speech coding is the term voice coding. This term is more generic in the sense that the

  4. The relationship of speech intelligibility with hearing sensitivity, cognition, and perceived hearing difficulties varies for different speech perception tests

    Directory of Open Access Journals (Sweden)

    Antje eHeinrich

    2015-06-01

    Full Text Available Listeners vary in their ability to understand speech in noisy environments. Hearing sensitivity, as measured by pure-tone audiometry, can only partly explain these results, and cognition has emerged as another key concept. Although cognition relates to speech perception, the exact nature of the relationship remains to be fully understood. This study investigates how different aspects of cognition, particularly working memory and attention, relate to speech intelligibility for various tests.Perceptual accuracy of speech perception represents just one aspect of functioning in a listening environment. Activity and participation limits imposed by hearing loss, in addition to the demands of a listening environment, are also important and may be better captured by self-report questionnaires. Understanding how speech perception relates to self-reported aspects of listening forms the second focus of the study.Forty-four listeners aged between 50-74 years with mild SNHL were tested on speech perception tests differing in complexity from low (phoneme discrimination in quiet, to medium (digit triplet perception in speech-shaped noise to high (sentence perception in modulated noise; cognitive tests of attention, memory, and nonverbal IQ; and self-report questionnaires of general health-related and hearing-specific quality of life.Hearing sensitivity and cognition related to intelligibility differently depending on the speech test: neither was important for phoneme discrimination, hearing sensitivity alone was important for digit triplet perception, and hearing and cognition together played a role in sentence perception. Self-reported aspects of auditory functioning were correlated with speech intelligibility to different degrees, with digit triplets in noise showing the richest pattern. The results suggest that intelligibility tests can vary in their auditory and cognitive demands and their sensitivity to the challenges that auditory environments pose on

  5. The relationship of speech intelligibility with hearing sensitivity, cognition, and perceived hearing difficulties varies for different speech perception tests.

    Science.gov (United States)

    Heinrich, Antje; Henshaw, Helen; Ferguson, Melanie A

    2015-01-01

    Listeners vary in their ability to understand speech in noisy environments. Hearing sensitivity, as measured by pure-tone audiometry, can only partly explain these results, and cognition has emerged as another key concept. Although cognition relates to speech perception, the exact nature of the relationship remains to be fully understood. This study investigates how different aspects of cognition, particularly working memory and attention, relate to speech intelligibility for various tests. Perceptual accuracy of speech perception represents just one aspect of functioning in a listening environment. Activity and participation limits imposed by hearing loss, in addition to the demands of a listening environment, are also important and may be better captured by self-report questionnaires. Understanding how speech perception relates to self-reported aspects of listening forms the second focus of the study. Forty-four listeners aged between 50 and 74 years with mild sensorineural hearing loss were tested on speech perception tests differing in complexity from low (phoneme discrimination in quiet), to medium (digit triplet perception in speech-shaped noise) to high (sentence perception in modulated noise); cognitive tests of attention, memory, and non-verbal intelligence quotient; and self-report questionnaires of general health-related and hearing-specific quality of life. Hearing sensitivity and cognition related to intelligibility differently depending on the speech test: neither was important for phoneme discrimination, hearing sensitivity alone was important for digit triplet perception, and hearing and cognition together played a role in sentence perception. Self-reported aspects of auditory functioning were correlated with speech intelligibility to different degrees, with digit triplets in noise showing the richest pattern. The results suggest that intelligibility tests can vary in their auditory and cognitive demands and their sensitivity to the challenges that

  6. Hate speech

    Directory of Open Access Journals (Sweden)

    Anne Birgitta Nilsen

    2014-03-01

    Full Text Available The manifesto of the Norwegian terrorist Anders Behring Breivik is based on the “Eurabia” conspiracy theory. This theory is a key starting point for hate speech amongst many right-wing extremists in Europe, but also has ramifications beyond these environments. In brief, proponents of the Eurabia theory claim that Muslims are occupying Europe and destroying Western culture, with the assistance of the EU and European governments. By contrast, members of Al-Qaeda and other extreme Islamists promote the conspiracy theory “the Crusade” in their hate speech directed against the West. Proponents of the latter theory argue that the West is leading a crusade to eradicate Islam and Muslims, a crusade that is similarly facilitated by their governments. This article presents analyses of texts written by right-wing extremists and Muslim extremists in an effort to shed light on how hate speech promulgates conspiracy theories in order to spread hatred and intolerance.The aim of the article is to contribute to a more thorough understanding of hate speech’s nature by applying rhetorical analysis. Rhetorical analysis is chosen because it offers a means of understanding the persuasive power of speech. It is thus a suitable tool to describe how hate speech works to convince and persuade. The concepts from rhetorical theory used in this article are ethos, logos and pathos. The concept of ethos is used to pinpoint factors that contributed to Osama bin Laden's impact, namely factors that lent credibility to his promotion of the conspiracy theory of the Crusade. In particular, Bin Laden projected common sense, good morals and good will towards his audience. He seemed to have coherent and relevant arguments; he appeared to possess moral credibility; and his use of language demonstrated that he wanted the best for his audience.The concept of pathos is used to define hate speech, since hate speech targets its audience's emotions. In hate speech it is the

  7. Speech Enhancement

    DEFF Research Database (Denmark)

    Benesty, Jacob; Jensen, Jesper Rindom; Christensen, Mads Græsbøll;

    and their performance bounded and assessed in terms of noise reduction and speech distortion. The book shows how various filter designs can be obtained in this framework, including the maximum SNR, Wiener, LCMV, and MVDR filters, and how these can be applied in various contexts, like in single......Speech enhancement is a classical problem in signal processing, yet still largely unsolved. Two of the conventional approaches for solving this problem are linear filtering, like the classical Wiener filter, and subspace methods. These approaches have traditionally been treated as different classes...

  8. Speech enhancement

    CERN Document Server

    Benesty, Jacob; Chen, Jingdong

    2006-01-01

    We live in a noisy world! In all applications (telecommunications, hands-free communications, recording, human-machine interfaces, etc.) that require at least one microphone, the signal of interest is usually contaminated by noise and reverberation. As a result, the microphone signal has to be ""cleaned"" with digital signal processing tools before it is played out, transmitted, or stored.This book is about speech enhancement. Different well-known and state-of-the-art methods for noise reduction, with one or multiple microphones, are discussed. By speech enhancement, we mean not only noise red

  9. Silent Speech Interfaces

    OpenAIRE

    Denby, B; Schultz, T.; Honda, K.; Hueber, T.; Gilbert, J.M.; Brumberg, J.S.

    2010-01-01

    Abstract The possibility of speech processing in the absence of an intelligible acoustic signal has given rise to the idea of a `silent speech? interface, to be used as an aid for the speech handicapped, or as part of a communications system operating in silence-required or high-background-noise environments. The article first outlines the emergence of the silent speech interface from the fields of speech production, automatic speech processing, speech pathology research, and telec...

  10. Language and Speech Processing

    CERN Document Server

    Mariani, Joseph

    2008-01-01

    Speech processing addresses various scientific and technological areas. It includes speech analysis and variable rate coding, in order to store or transmit speech. It also covers speech synthesis, especially from text, speech recognition, including speaker and language identification, and spoken language understanding. This book covers the following topics: how to realize speech production and perception systems, how to synthesize and understand speech using state-of-the-art methods in signal processing, pattern recognition, stochastic modelling computational linguistics and human factor studi

  11. A study of brainstem evoked response audiometry in high-risk infants and children under 10 years of age

    Directory of Open Access Journals (Sweden)

    Ramanathan Thirunavukarasu

    2015-01-01

    Full Text Available Aims: To evaluate the hearing threshold and find the incidence of hearing loss in infants and children belonging to high-risk category and analyze the common risk factors. Subjects and Methods: Totally, 125 infants and children belonging to high-risk category were subjected to brainstem evoked response audiometry. Clicks were given at the rate of 11.1 clicks/s. Totally, 2000 responses were averaged. The intensity at which wave V just disappears was established as hearing the threshold. Degree of impairment and risk factors were analyzed. Results: Totally, 44 (35.2% were found to have sensorineural hearing loss. Totally, 30 children with hearing loss (68% belonged to age group 1-5 years. Consanguineous marriage was the most commonly associated risk factor. Majority (34 had profound hearing loss. Conclusion: Newborn screening is mandatory to identify hearing loss in the prelinguistic period to reduce the burden of handicap in the community. The need of the hour is health education and genetic counseling to decrease the hereditary hearing loss, as hearing impairment due to perinatal factors has reduced due to recent medical advancements.

  12. Speech coding

    Science.gov (United States)

    Gersho, Allen

    1990-05-01

    Recent advances in algorithms and techniques for speech coding now permit high quality voice reproduction at remarkably low bit rates. The advent of powerful single-ship signal processors has made it cost effective to implement these new and sophisticated speech coding algorithms for many important applications in voice communication and storage. Some of the main ideas underlying the algorithms of major interest today are reviewed. The concept of removing redundancy by linear prediction is reviewed, first in the context of predictive quantization or DPCM. Then linear predictive coding, adaptive predictive coding, and vector quantization are discussed. The concepts of excitation coding via analysis-by-synthesis, vector sum excitation codebooks, and adaptive postfiltering are explained. The main idea of vector excitation coding (VXC) or code excited linear prediction (CELP) are presented. Finally low-delay VXC coding and phonetic segmentation for VXC are described.

  13. Hate speech

    OpenAIRE

    Anne Birgitta Nilsen

    2014-01-01

    The manifesto of the Norwegian terrorist Anders Behring Breivik is based on the “Eurabia” conspiracy theory. This theory is a key starting point for hate speech amongst many right-wing extremists in Europe, but also has ramifications beyond these environments. In brief, proponents of the Eurabia theory claim that Muslims are occupying Europe and destroying Western culture, with the assistance of the EU and European governments. By contrast, members of Al-Qaeda and other extreme Islamists prom...

  14. Speech and Communication Disorders

    Science.gov (United States)

    ... or understand speech. Causes include Hearing disorders and deafness Voice problems, such as dysphonia or those caused by cleft lip or palate Speech problems like stuttering Developmental disabilities Learning disorders Autism spectrum disorder Brain injury Stroke Some speech and ...

  15. Speech disorders - children

    Science.gov (United States)

    ... of speech disorders may disappear on their own. Speech therapy may help with more severe symptoms or speech problems that do not improve. In therapy, the child will learn how to create certain sounds.

  16. Speech recognition and understanding

    Energy Technology Data Exchange (ETDEWEB)

    Vintsyuk, T.K.

    1983-05-01

    This article discusses the automatic processing of speech signals with the aim of finding a sequence of works (speech recognition) or a concept (speech understanding) being transmitted by the speech signal. The goal of the research is to develop an automatic typewriter that will automatically edit and type text under voice control. A dynamic programming method is proposed in which all possible class signals are stored, after which the presented signal is compared to all the stored signals during the recognition phase. Topics considered include element-by-element recognition of words of speech, learning speech recognition, phoneme-by-phoneme speech recognition, the recognition of connected speech, understanding connected speech, and prospects for designing speech recognition and understanding systems. An application of the composition dynamic programming method for the solution of basic problems in the recognition and understanding of speech is presented.

  17. Audiometry in Young Children

    OpenAIRE

    Muller, George

    1987-01-01

    The author of this article reviews various techniques in the auditory assessment of infants and young children. The success of these tests depends on the overall functioning of the child, and not on chronological age alone. Any significant deviation from the normal auditory behaviour should raise suspicion of possible auditory impairment. Diagnostic audiology involves more than mere testing of the peripheral auditory mechanism in isolation. It necessitates investigation of possible neurologic...

  18. Speech and Language Impairments

    Science.gov (United States)

    ... easily be mistaken for other disabilities such as autism or learning disabilities, so it’s very important to ensure that the child receives a thorough evaluation by a certified speech-language pathologist. Back to top What Causes Speech ...

  19. Speech impairment (adult)

    Science.gov (United States)

    ... impairment; Impairment of speech; Inability to speak; Aphasia; Dysarthria; Slurred speech; Dysphonia voice disorders ... in others the condition does not get better. DYSARTHRIA With dysarthria, the person has ongoing difficulty expressing ...

  20. Speech perception as categorization

    OpenAIRE

    Holt, Lori L.; Lotto, Andrew J.

    2010-01-01

    Speech perception (SP) most commonly refers to the perceptual mapping from the highly variable acoustic speech signal to a linguistic representation, whether it be phonemes, diphones, syllables, or words. This is an example of categorization, in that potentially discriminable speech sounds are assigned to functionally equivalent classes. In this tutorial, we present some of the main challenges to our understanding of the categorization of speech sounds and the conceptualization of SP that has...

  1. Talking Speech Input.

    Science.gov (United States)

    Berliss-Vincent, Jane; Whitford, Gigi

    2002-01-01

    This article presents both the factors involved in successful speech input use and the potential barriers that may suggest that other access technologies could be more appropriate for a given individual. Speech input options that are available are reviewed and strategies for optimizing use of speech recognition technology are discussed. (Contains…

  2. Speech-Language Pathologists

    Science.gov (United States)

    ... INDEX | OOH SITE MAP | EN ESPAÑOL Healthcare > Speech-Language Pathologists PRINTER-FRIENDLY EN ESPAÑOL Summary What They ... workers and occupations. What They Do -> What Speech-Language Pathologists Do About this section Speech-language pathologists ...

  3. Decreased Speech-In-Noise Understanding in Young Adults with Tinnitus

    Science.gov (United States)

    Gilles, Annick; Schlee, Winny; Rabau, Sarah; Wouters, Kristien; Fransen, Erik; Van de Heyning, Paul

    2016-01-01

    Objectives: Young people are often exposed to high music levels which make them more at risk to develop noise-induced symptoms such as hearing loss, hyperacusis, and tinnitus of which the latter is the symptom perceived the most by young adults. Although, subclinical neural damage was demonstrated in animal experiments, the human correlate remains under debate. Controversy exists on the underlying condition of young adults with normal hearing thresholds and noise-induced tinnitus (NIT) due to leisure noise. The present study aimed to assess differences in audiological characteristics between noise-exposed adolescents with and without NIT. Methods: A group of 87 young adults with a history of recreational noise exposure was investigated by use of the following tests: otoscopy, impedance measurements, pure-tone audiometry including high-frequencies, transient and distortion product otoacoustic emissions, speech-in-noise testing with continuous and modulated noise (amplitude-modulated by 15 Hz), auditory brainstem responses (ABR) and questionnaires.Nineteen students reported NIT due to recreational noise exposure, and their measures were compared to the non-tinnitus subjects. Results: No significant differences between tinnitus and non-tinnitus subjects could be found for hearing thresholds, otoacoustic emissions, and ABR results.Tinnitus subjects had significantly worse speech reception in noise compared to non-tinnitus subjects for sentences embedded in steady-state noise (mean speech reception threshold (SRT) scores, respectively −5.77 and −6.90 dB SNR; p = 0.025) as well as for sentences embedded in 15 Hz AM-noise (mean SRT scores, respectively −13.04 and −15.17 dB SNR; p = 0.013). In both groups speech reception was significantly improved during AM-15 Hz noise compared to the steady-state noise condition (p < 0.001). However, the modulation masking release was not affected by the presence of NIT. Conclusions: Young adults with and without NIT did not

  4. Avaliação dos limiares auditivos com e sem equipamento de proteção individual Pure tone audiometry with and without specific ear protectors

    Directory of Open Access Journals (Sweden)

    Carlos Antonio Rodrigues de Faria

    2008-06-01

    Full Text Available Os autores realizaram estudo caso-controle audiométrico em indivíduos com e sem protetor auricular auditivo. OBJETIVOS: O objetivo do estudo foi avaliar a real atenuação individual dado pelos protetores. MATERIAL E MÉTODO: Foram avaliados 30 indivíduos (ou 60 orelhas de diferentes atividades profissionais, de ambos os sexos, com idades entre 20 e 58 anos, apresentando audição normal e tendo realizado repouso auditivo de 10 horas, submetidos a exame audiométrico com e sem protetor auricular auditivo, no período de fevereiro a julho de 2003, utilizando protetor tipo plugue. Avaliou-se as audiometrias nas vias aérea e óssea em freqüências de 500 a 4000Hz. RESULTADOS: Os resultados foram analisados estatisticamente e comparados aos dados fornecidos pelo fabricante. Assim se observou em ouvido real os níveis de atenuação auditiva obtidos com o uso destes produtos. CONCLUSÃO: Os resultados permitiram chegar à conclusão de que os índices fornecidos pelos fabricantes foram compatíveis com os que obtive nos testes.The authors evaluated pure tone audiometry with and without specific ear protectors. AIM: The purpose of this case control study was to measure the level of sound attenuation by earplugs. MATERIAL AND METHODS: The evaluation included sixty ears of 30 subjects of both sexes, aged between 20 and 58 years, of various professional activities, with normal hearing thresholds, and following ten hours of auditory rest. The statistical results of pure tone audiometry at 500 to 4000 Hertz with and without specific ear protectors were analyzed. RESULTS: These results were compared with those provided by the ear protector manufacturer. CONCLUSION: The results show that the rate of sound reduction was similar to the manufacturer's specifications.

  5. Digital speech processing using Matlab

    CERN Document Server

    Gopi, E S

    2014-01-01

    Digital Speech Processing Using Matlab deals with digital speech pattern recognition, speech production model, speech feature extraction, and speech compression. The book is written in a manner that is suitable for beginners pursuing basic research in digital speech processing. Matlab illustrations are provided for most topics to enable better understanding of concepts. This book also deals with the basic pattern recognition techniques (illustrated with speech signals using Matlab) such as PCA, LDA, ICA, SVM, HMM, GMM, BPN, and KSOM.

  6. Indirect Speech Acts

    Institute of Scientific and Technical Information of China (English)

    李威

    2001-01-01

    Indirect speech acts are frequently used in verbal communication, the interpretation of them is of great importance in order to meet the demands of the development of students' communicative competence. This paper, therefore, intends to present Searle' s indirect speech acts and explore the way how indirect speech acts are interpreted in accordance with two influential theories. It consists of four parts. Part one gives a general introduction to the notion of speech acts theory. Part two makes an elaboration upon the conception of indirect speech act theory proposed by Searle and his supplement and development of illocutionary acts. Part three deals with the interpretation of indirect speech acts. Part four draws implication from the previous study and also serves as the conclusion of the dissertation.

  7. Esophageal speeches modified by the Speech Enhancer Program®

    OpenAIRE

    Manochiopinig, Sriwimon; Boonpramuk, Panuthat

    2014-01-01

    Esophageal speech appears to be the first choice of speech treatment for a laryngectomy. However, many laryngectomy people are unable to speak well. The aim of this study was to evaluate post-modified speech quality of Thai esophageal speakers using the Speech Enhancer Program®. The method adopted was to approach five speech–language pathologists to assess the speech accuracy and intelligibility of the words and continuing speech of the seven laryngectomy people. A comparison study was conduc...

  8. Speech Alarms Pilot Study

    Science.gov (United States)

    Sandor, Aniko; Moses, Haifa

    2016-01-01

    Speech alarms have been used extensively in aviation and included in International Building Codes (IBC) and National Fire Protection Association's (NFPA) Life Safety Code. However, they have not been implemented on space vehicles. Previous studies conducted at NASA JSC showed that speech alarms lead to faster identification and higher accuracy. This research evaluated updated speech and tone alerts in a laboratory environment and in the Human Exploration Research Analog (HERA) in a realistic setup.

  9. Context dependent speech recognition

    OpenAIRE

    Andersson, Sebastian

    2006-01-01

    Poor speech recognition is a problem when developing spoken dialogue systems, but several studies has showed that speech recognition can be improved by post-processing of recognition output that use the dialogue context, acoustic properties of a user utterance and other available resources to train a statistical model to use as a filter between the speech recogniser and dialogue manager. In this thesis a corpus of logged interactions between users and a dialogue system was used...

  10. Speech input and output

    Science.gov (United States)

    Class, F.; Mangold, H.; Stall, D.; Zelinski, R.

    1981-12-01

    Possibilities for acoustical dialogs with electronic data processing equipment were investigated. Speech recognition is posed as recognizing word groups. An economical, multistage classifier for word string segmentation is presented and its reliability in dealing with continuous speech (problems of temporal normalization and context) is discussed. Speech synthesis is considered in terms of German linguistics and phonetics. Preprocessing algorithms for total synthesis of written texts were developed. A macrolanguage, MUSTER, is used to implement this processing in an acoustic data information system (ADES).

  11. Principles of speech coding

    CERN Document Server

    Ogunfunmi, Tokunbo

    2010-01-01

    It is becoming increasingly apparent that all forms of communication-including voice-will be transmitted through packet-switched networks based on the Internet Protocol (IP). Therefore, the design of modern devices that rely on speech interfaces, such as cell phones and PDAs, requires a complete and up-to-date understanding of the basics of speech coding. Outlines key signal processing algorithms used to mitigate impairments to speech quality in VoIP networksOffering a detailed yet easily accessible introduction to the field, Principles of Speech Coding provides an in-depth examination of the

  12. Advances in Speech Recognition

    CERN Document Server

    Neustein, Amy

    2010-01-01

    This volume is comprised of contributions from eminent leaders in the speech industry, and presents a comprehensive and in depth analysis of the progress of speech technology in the topical areas of mobile settings, healthcare and call centers. The material addresses the technical aspects of voice technology within the framework of societal needs, such as the use of speech recognition software to produce up-to-date electronic health records, not withstanding patients making changes to health plans and physicians. Included will be discussion of speech engineering, linguistics, human factors ana

  13. Ear, Hearing and Speech

    DEFF Research Database (Denmark)

    Poulsen, Torben

    2000-01-01

    An introduction is given to the the anatomy and the function of the ear, basic psychoacoustic matters (hearing threshold, loudness, masking), the speech signal and speech intelligibility. The lecture note is written for the course: Fundamentals of Acoustics and Noise Control (51001)......An introduction is given to the the anatomy and the function of the ear, basic psychoacoustic matters (hearing threshold, loudness, masking), the speech signal and speech intelligibility. The lecture note is written for the course: Fundamentals of Acoustics and Noise Control (51001)...

  14. Advances in speech processing

    Science.gov (United States)

    Ince, A. Nejat

    1992-10-01

    The field of speech processing is undergoing a rapid growth in terms of both performance and applications and this is fueled by the advances being made in the areas of microelectronics, computation, and algorithm design. The use of voice for civil and military communications is discussed considering advantages and disadvantages including the effects of environmental factors such as acoustic and electrical noise and interference and propagation. The structure of the existing NATO communications network and the evolving Integrated Services Digital Network (ISDN) concept are briefly reviewed to show how they meet the present and future requirements. The paper then deals with the fundamental subject of speech coding and compression. Recent advances in techniques and algorithms for speech coding now permit high quality voice reproduction at remarkably low bit rates. The subject of speech synthesis is next treated where the principle objective is to produce natural quality synthetic speech from unrestricted text input. Speech recognition where the ultimate objective is to produce a machine which would understand conversational speech with unrestricted vocabulary, from essentially any talker, is discussed. Algorithms for speech recognition can be characterized broadly as pattern recognition approaches and acoustic phonetic approaches. To date, the greatest degree of success in speech recognition has been obtained using pattern recognition paradigms. It is for this reason that the paper is concerned primarily with this technique.

  15. [The voice and speech].

    Science.gov (United States)

    Pesák, J; Honová, J; Majtner, J; Vojtĕchovský, K

    1998-01-01

    Biophysics is the science comprising the sum of biophysical disciplines describing living systems. It also includes the biophysics of voice and speech. The latter deals with physiological acoustics, phonetics, phoniatry as well as logopaedics. In connection with the problems of voice and speech, including also their teaching problems, a common language is often being sought for appropriate to all the interested scientific branches. As a result of our efforts aimed at removing the existing barriers we have tried to set up a University Society for the Study of Voice and Speech. One of its first activities was also, besides other events, the realization of a videofilm On voice and speech. PMID:10803289

  16. Speech-Language Therapy (For Parents)

    Science.gov (United States)

    ... 5 Things to Know About Zika & Pregnancy Speech-Language Therapy KidsHealth > For Parents > Speech-Language Therapy Print ... with speech and/or language disorders. Speech Disorders, Language Disorders, and Feeding Disorders A speech disorder refers ...

  17. Time-expanded speech and speech recognition in older adults.

    Science.gov (United States)

    Vaughan, Nancy E; Furukawa, Izumi; Balasingam, Nirmala; Mortz, Margaret; Fausti, Stephen A

    2002-01-01

    Speech understanding deficits are common in older adults. In addition to hearing sensitivity, changes in certain cognitive functions may affect speech recognition. One such change that may impact the ability to follow a rapidly changing speech signal is processing speed. When speakers slow the rate of their speech naturally in order to speak clearly, speech recognition is improved. The acoustic characteristics of naturally slowed speech are of interest in developing time-expansion algorithms to improve speech recognition for older listeners. In this study, we tested younger normally hearing, older normally hearing, and older hearing-impaired listeners on time-expanded speech using increased duration and increased intensity of unvoiced consonants. Although all groups performed best on unprocessed speech, performance with processed speech was better with the consonant gain feature without time expansion in the noise condition and better at the slowest time-expanded rate in the quiet condition. The effects of signal processing on speech recognition are discussed. PMID:17642020

  18. Speech Compression for Noise-Corrupted Thai Expressive Speech

    Directory of Open Access Journals (Sweden)

    Suphattharachai Chomphan

    2011-01-01

    Full Text Available Problem statement: In speech communication, speech coding aims at preserving the speech quality with lower coding bitrate. When considering the communication environment, various types of noises deteriorates the speech quality. The expressive speech with different speaking styles may cause different speech quality with the same coding method. Approach: This research proposed a study of speech compression for noise-corrupted Thai expressive speech by using two coding methods of CS-ACELP and MP-CELP. The speech material included a hundredmale speech utterances and a hundred female speech utterances. Four speaking styles included enjoyable, sad, angry and reading styles. Five sentences of Thai speech were chosen. Three types of noises were included (train, car and air conditioner. Five levels of each type of noise were varied from 0-20 dB. The subjective test of mean opinion score was exploited in the evaluation process. Results: The experimental results showed that CS-ACELP gave the better speech quality than that of MP-CELP at all three bitrates of 6000, 8600-12600 bps. When considering the levels of noise, the 20-dB noise gave the best speech quality, while 0-dB noise gave the worst speech quality. When considering the speech gender, female speech gave the better results than that of male speech. When considering the types of noise, the air-conditioner noise gave the best speech quality, while the train noise gave the worst speech quality. Conclusion: From the study, it can be seen that coding methods, types of noise, levels of noise, speech gender influence on the coding speech quality.

  19. Improving Alaryngeal Speech Intelligibility.

    Science.gov (United States)

    Christensen, John M.; Dwyer, Patricia E.

    1990-01-01

    Laryngectomized patients using esophageal speech or an electronic artificial larynx have difficulty producing correct voicing contrasts between homorganic consonants. This paper describes a therapy technique that emphasizes "pushing harder" on voiceless consonants to improve alaryngeal speech intelligibility and proposes focusing on the production…

  20. Speech Situations and TEFL

    Institute of Scientific and Technical Information of China (English)

    吴树奇; 高建国

    2008-01-01

    This paper deals with how speech situations or ratherspeech implicatures affect TEFL.As far as the writer is concerned,they have much influence on many aspect of language teaching.To illustrate this point explicitly,the writer focuses on the influence of speech situations upon pronunciation,intonation,lexical meanings,sentence comprehension and the grammatical study of the English language.

  1. Speech and Language Delay

    Science.gov (United States)

    ... child depends on the cause of the speech delay. Your doctor will tell you the cause of your child's problem and explain any treatments that might fix the problem or make it better. A speech and language pathologist might be helpful in making treatment plans. This ...

  2. Private Speech in Ballet

    Science.gov (United States)

    Johnston, Dale

    2006-01-01

    Authoritarian teaching practices in ballet inhibit the use of private speech. This paper highlights the critical importance of private speech in the cognitive development of young ballet students, within what is largely a non-verbal art form. It draws upon research by Russian psychologist Lev Vygotsky and contemporary socioculturalists, to…

  3. Speech processing standards

    Science.gov (United States)

    Ince, A. Nejat

    1990-05-01

    Speech processing standards are given for 64, 32, 16 kb/s and lower rate speech and more generally, speech-band signals which are or will be promulgated by CCITT and NATO. The International Telegraph and Telephone Consultative Committee (CCITT) of the International body which deals, among other things, with speech processing within the context of ISDN. Within NATO there are also bodies promulgating standards which make interoperability, possible without complex and expensive interfaces. Some of the applications for low-bit rate voice and the related work undertaken by CCITT Study Groups which are responsible for developing standards in terms of encoding algorithms, codec design objectives as well as standards on the assessment of speech quality, are highlighted.

  4. Speech Acts In President Barack Obama Victory Speech 2012

    OpenAIRE

    Januarini, Erna

    2016-01-01

    In the thesis, entitled Speech Acts In President Barack Obama's Victory Speech 2012. The author analyzes the illocutionary acts and direct and indirect speech acts and by Barack Obama as a speaker based on representative, directive, expressive, commissive, and declaration. The purpose of this thesis is to find the types of illocutionary acts and direct and indirect speech acts and in Barack Obama's victory speech 2012. In writing this thesis, the author uses a qualitative method from Huberman...

  5. 78 FR 49693 - Speech-to-Speech and Internet Protocol (IP) Speech-to-Speech Telecommunications Relay Services...

    Science.gov (United States)

    2013-08-15

    ... Rulemaking, published at 73 FR 47120, August 13, 2008 (2008 STS NPRM). The Commission sought comment on... Abbreviated Dialing Arrangements, CC Docket No. 92-105, Report and Order, published at 65 FR 54799, September... COMMISSION 47 CFR Part 64 Speech-to-Speech and Internet Protocol (IP) Speech-to-Speech...

  6. 78 FR 49717 - Speech-to-Speech and Internet Protocol (IP) Speech-to-Speech Telecommunications Relay Services...

    Science.gov (United States)

    2013-08-15

    ..., Report and Order and Further Notice of Proposed Rulemaking, published at 77 FR 25609, May 1, 2012 (VRS... Nos. 03-123 and 08-15, Notice of Proposed Rulemaking, published at 73 FR 47120, August 13, 2008 (2008... COMMISSION 47 CFR Part 64 Speech-to-Speech and Internet Protocol (IP) Speech-to-Speech...

  7. Going to a Speech Therapist

    Science.gov (United States)

    ... What's in this article? What Do Speech Therapists Help With? Who Needs Speech Therapy? What's It Like? How Long Will Treatment Last? Some kids have trouble saying certain sounds or words. This can be frustrating ... speech therapists (also called speech-language pathologists ). What ...

  8. Survey On Speech Synthesis

    Directory of Open Access Journals (Sweden)

    A. Indumathi

    2012-12-01

    Full Text Available The primary goal of this paper is to provide an overview of existing Text-To-Speech (TTS Techniques by highlighting its usage and advantage. First Generation Techniques includes Formant Synthesis and Articulatory Synthesis. Formant Synthesis works by using individually controllable formant filters, which can be set to produce accurate estimations of the vocal-track transfer function. Articulatory Synthesis produces speech by direct modeling of Human articulator behavior. Second Generation Techniques incorporates Concatenative synthesis and Sinusoidal synthesis. Concatenative synthesis generates speech output by concatenating the segments of recorded speech. Generally, Concatenative synthesis generates the natural sounding synthesized speech. Sinusoidal Synthesis use a harmonic model and decompose each frame into a set of harmonics of an estimated fundamental frequency. The model parameters are the amplitudes and periods of the harmonics. With these, the value of the fundamental can be changed while keeping the same basic spectral..In adding, Third Generation includes Hidden Markov Model (HMM and Unit Selection Synthesis.HMM trains the parameter module and produce high quality Speech. Finally, Unit Selection operates by selecting the best sequence of units from a large speech database which matches the specification.

  9. Global Freedom of Speech

    DEFF Research Database (Denmark)

    Binderup, Lars Grassme

    2007-01-01

    opposed to a legal norm, that curbs exercises of the right to free speech that offend the feelings or beliefs of members from other cultural groups. The paper rejects the suggestion that acceptance of such a norm is in line with liberal egalitarian thinking. Following a review of the classical liberal...... egalitarian reasons for free speech - reasons from overall welfare, from autonomy and from respect for the equality of citizens - it is argued that these reasons outweigh the proposed reasons for curbing culturally offensive speech. Currently controversial cases such as that of the Danish Cartoon Controversy...

  10. Speech and Swallowing

    Science.gov (United States)

    ... Español In Your Area NPF Shop Speech and Swallowing Problems Make Text Smaller Make Text Larger You ... How do I know if I have a swallowing problem? I have recently lost weight without trying. ...

  11. Speech disorders - children

    Science.gov (United States)

    ... this page: //medlineplus.gov/ency/article/001430.htm Speech disorders - children To use the sharing features on ... PA: Elsevier Saunders; 2011:chap 32. Read More Autism spectrum disorder Cerebral palsy Hearing loss Intellectual disability ...

  12. Speech impairment (adult)

    Science.gov (United States)

    ... ALS or Lou Gehrig disease), cerebral palsy, myasthenia gravis, or multiple sclerosis (MS) Facial trauma Facial weakness, ... provider will likely ask about the speech impairment. Questions may include when the problem developed, whether there ...

  13. Computer-generated speech

    Energy Technology Data Exchange (ETDEWEB)

    Aimthikul, Y.

    1981-12-01

    This thesis reviews the essential aspects of speech synthesis and distinguishes between the two prevailing techniques: compressed digital speech and phonemic synthesis. It then presents the hardware details of the five speech modules evaluated. FORTRAN programs were written to facilitate message creation and retrieval with four of the modules driven by a PDP-11 minicomputer. The fifth module was driven directly by a computer terminal. The compressed digital speech modules (T.I. 990/306, T.S.I. Series 3D and N.S. Digitalker) each contain a limited vocabulary produced by the manufacturers while both the phonemic synthesizers made by Votrax permit an almost unlimited set of sounds and words. A text-to-phoneme rules program was adapted for the PDP-11 (running under the RSX-11M operating system) to drive the Votrax Speech Pac module. However, the Votrax Type'N Talk unit has its own built-in translator. Comparison of these modules revealed that the compressed digital speech modules were superior in pronouncing words on an individual basis but lacked the inflection capability that permitted the phonemic synthesizers to generate more coherent phrases. These findings were necessarily highly subjective and dependent on the specific words and phrases studied. In addition, the rapid introduction of new modules by manufacturers will necessitate new comparisons. However, the results of this research verified that all of the modules studied do possess reasonable quality of speech that is suitable for man-machine applications. Furthermore, the development tools are now in place to permit the addition of computer speech output in such applications.

  14. THE PRESENCE OF ADENOID VEGETATIONS AND NASAL SPEECH, AND HEARING LOSS IN RELATION TO SECRETORY OTITIS MEDIA

    Directory of Open Access Journals (Sweden)

    Gabriela KOPACHEVA

    2004-12-01

    Full Text Available This study presents the treatment of 68 children with secretory otitis media. Children underwent adenoid vegetations, nasal speech, conductive hearing loss, ventilation disturbance in Eustachian tube. In all children adenoidectomy was indicated.38 boys and 30 girls at the age of 3-17 were divided in two main groups: * 29 children without hypertrophic (enlarged adenoids, * 39 children with enlarged (hypertrophic adenoids.The surgical treatment included insertion of ventilation tubes and adenoidectomy where there where hypertrophic adenoids.Clinical material was analyzed according to hearing threshold, hearing level, middle ear condition estimated by pure tone audiometry and tympanometry before and after treatment. Data concerning both groups were compared.The results indicated that adenoidectomy combined with the ventilation tubes facilitates secretory otitis media heeling as well as decrease of hearing impairments. That enables prompt restoration of the hearing function as an important precondition for development of the language, social, emotional and academic development of children.

  15. SPEECH DISORDERS ENCOUNTERED DURING SPEECH THERAPY AND THERAPY TECHNIQUES

    Directory of Open Access Journals (Sweden)

    İlhan ERDEM

    2013-06-01

    Full Text Available Speech which is a physical and mental process, agreed signs and sounds to create a sense of mind to the message that change . Process to identify the sounds of speech it is essential to know the structure and function of various organs which allows to happen the conversation. Speech is a physical and mental process so many factors can lead to speech disorders. Speech disorder can be about language acquisitions as well as it can be caused medical and psychological many factors. Disordered speech, language, medical and psychological conditions as well as acquisitions also be caused by many factors. Speaking, is the collective work of many organs, such as an orchestra. Mental dimension of the speech disorder which is a very complex skill so it must be found which of these obstacles inhibit conversation. Speech disorder is a defect in speech flow, rhythm, tizliğinde, beats, the composition and vocalization. In this study, speech disorders such as articulation disorders, stuttering, aphasia, dysarthria, a local dialect speech, , language and lip-laziness, rapid speech peech defects in a term of language skills. This causes of speech disorders were investigated and presented suggestions for remedy was discussed.

  16. Practical speech user interface design

    CERN Document Server

    Lewis, James R

    2010-01-01

    Although speech is the most natural form of communication between humans, most people find using speech to communicate with machines anything but natural. Drawing from psychology, human-computer interaction, linguistics, and communication theory, Practical Speech User Interface Design provides a comprehensive yet concise survey of practical speech user interface (SUI) design. It offers practice-based and research-based guidance on how to design effective, efficient, and pleasant speech applications that people can really use. Focusing on the design of speech user interfaces for IVR application

  17. HUMAN SPEECH EMOTION RECOGNITION

    Directory of Open Access Journals (Sweden)

    Maheshwari Selvaraj

    2016-02-01

    Full Text Available Emotions play an extremely important role in human mental life. It is a medium of expression of one’s perspective or one’s mental state to others. Speech Emotion Recognition (SER can be defined as extraction of the emotional state of the speaker from his or her speech signal. There are few universal emotions- including Neutral, Anger, Happiness, Sadness in which any intelligent system with finite computational resources can be trained to identify or synthesize as required. In this work spectral and prosodic features are used for speech emotion recognition because both of these features contain the emotional information. Mel-frequency cepstral coefficients (MFCC is one of the spectral features. Fundamental frequency, loudness, pitch and speech intensity and glottal parameters are the prosodic features which are used to model different emotions. The potential features are extracted from each utterance for the computational mapping between emotions and speech patterns. Pitch can be detected from the selected features, using which gender can be classified. Support Vector Machine (SVM, is used to classify the gender in this work. Radial Basis Function and Back Propagation Network is used to recognize the emotions based on the selected features, and proved that radial basis function produce more accurate results for emotion recognition than the back propagation network.

  18. Robust Speech/Non-Speech Classification in Heterogeneous Multimedia Content

    NARCIS (Netherlands)

    Huijbregts, Marijn; Jong, de Franciska

    2011-01-01

    In this paper we present a speech/non-speech classification method that allows high quality classification without the need to know in advance what kinds of audible non-speech events are present in an audio recording and that does not require a single parameter to be tuned on in-domain data. Because

  19. Denial Denied: Freedom of Speech

    Directory of Open Access Journals (Sweden)

    Glen Newey

    2009-12-01

    Full Text Available Free speech is a widely held principle. This is in some ways surprising, since formal and informal censorship of speech is widespread, and rather different issues seem to arise depending on whether the censorship concerns who speaks, what content is spoken or how it is spoken. I argue that despite these facts, free speech can indeed be seen as a unitary principle. On my analysis, the core of the free speech principle is the denial of the denial of speech, whether to a speaker, to a proposition, or to a mode of expression. Underlying free speech is the principle of freedom of association, according to which speech is both a precondition of future association (e.g. as a medium for negotiation and a mode of association in its own right. I conclude by applying this account briefly to two contentious issues: hate speech and pornography.

  20. [Audiometry in the cellulose industry].

    Science.gov (United States)

    Corrao, C R; Milano, L; Pedulla, P; Carlesi, G; Bacaloni, A; Monaco, E

    1993-01-01

    A noise level dosimetry and audiometric testing were conducted in a cellulose factory to determine the hazardous noise level and the prevalence of noise induced hearing loss among the exposed workers. The noise level was recorded up to 90 db (A) in several working areas. 18 workers, potentially exposed to noise injury, evidenced a significant hearing loss. While no evidence of noise injury was recorded in a control group of 100 subjects. This finding suggest a strict relationship between audiometric tests, the noise level recorded in the working place and the working seniority of exposed employers. PMID:7720969

  1. Speech spectrogram expert

    Energy Technology Data Exchange (ETDEWEB)

    Johannsen, J.; Macallister, J.; Michalek, T.; Ross, S.

    1983-01-01

    Various authors have pointed out that humans can become quite adept at deriving phonetic transcriptions from speech spectrograms (as good as 90percent accuracy at the phoneme level). The authors describe an expert system which attempts to simulate this performance. The speech spectrogram expert (spex) is actually a society made up of three experts: a 2-dimensional vision expert, an acoustic-phonetic expert, and a phonetics expert. The visual reasoning expert finds important visual features of the spectrogram. The acoustic-phonetic expert reasons about how visual features relates to phonemes, and about how phonemes change visually in different contexts. The phonetics expert reasons about allowable phoneme sequences and transformations, and deduces an english spelling for phoneme strings. The speech spectrogram expert is highly interactive, allowing users to investigate hypotheses and edit rules. 10 references.

  2. Punctuation in Quoted Speech

    CERN Document Server

    Doran, C F

    1996-01-01

    Quoted speech is often set off by punctuation marks, in particular quotation marks. Thus, it might seem that the quotation marks would be extremely useful in identifying these structures in texts. Unfortunately, the situation is not quite so clear. In this work, I will argue that quotation marks are not adequate for either identifying or constraining the syntax of quoted speech. More useful information comes from the presence of a quoting verb, which is either a verb of saying or a punctual verb, and the presence of other punctuation marks, usually commas. Using a lexicalized grammar, we can license most quoting clauses as text adjuncts. A distinction will be made not between direct and indirect quoted speech, but rather between adjunct and non-adjunct quoting clauses.

  3. Protection limits on free speech

    Institute of Scientific and Technical Information of China (English)

    李敏

    2014-01-01

    Freedom of speech is one of the basic rights of citizens should receive broad protection, but in the real context of China under what kind of speech can be protected and be restricted, how to grasp between state power and free speech limit is a question worth considering. People tend to ignore the freedom of speech and its function, so that some of the rhetoric cannot be demonstrated in the open debates.

  4. Speech characteristics in depression.

    Science.gov (United States)

    Stassen, H H; Bomben, G; Günther, E

    1991-01-01

    This study examined the relationship between speech characteristics and psychopathology throughout the course of affective disturbances. Our sample comprised 20 depressive, hospitalized patients who had been selected according to the following criteria: (1) first admission; (2) long-term patient; (3) early entry into study; (4) late entry into study; (5) low scorer; (6) high scorer, and (7) distinct retarded-depressive symptomatology. Since our principal goal was to model the course of affective disturbances in terms of speech parameters, a total of 6 repeated measurements had been carried out over a 2-week period, including 3 different psychopathological instruments and speech recordings from automatic speech as well as from reading out loud. It turned out that neither applicability nor efficiency of single-parameter models depend in any way on the given, clinically defined subgroups. On the other hand, however, no significant differences between the clinically defined subgroups showed up with regard to basic speech parameters, except for the fact that low scorers seemed to take their time when producing utterances (this in contrast to all other patients who, on the average, had a considerably shorter recording time). As to the relationship between psychopathology and speech parameters over time, we found significant correlations: (1) in 60% of cases between the apathic syndrome and energy/dynamics; (2) in 50% of cases between the retarded-depressive syndrome and energy/dynamics; (3) in 45% of cases between the apathic syndrome and mean vocal pitch, and (4) in 71% of low scores between the somatic-depressive syndrome and time duration of pauses. All in all, single parameter models turned out to cover only specific aspects of the individual courses of affective disturbances, thus speaking against a simple approach which applies in general. PMID:1886971

  5. The University and Free Speech

    OpenAIRE

    Grcic, Joseph

    2014-01-01

    Free speech is a necessary condition for the growth of knowledge and the implementation of real and rational democracy. Educational institutions play a central role in socializing individuals to function within their society. Academic freedom is the right to free speech in the context of the university and tenure, properly interpreted, is a necessary component of protecting academic freedom and free speech.

  6. Speech in Parkinson's disease

    OpenAIRE

    Širca, Patricija

    2012-01-01

    The thesis presents an analysis of speech of four male subjects with a diagnosis of Parkinson's disease associated with dementia. The analysis was performed on the record of the description of each one. All persons were asked to describe the scene in the picture, taken from the Boston test for aphasia, entitled: The cookie theft. Description was shot with a recorder and then converted to written words. With the help of pre-prepared check list, the speech has been properly evaluated. Each w...

  7. Sensitivity of cortical auditory evoked potential detection for hearing-impaired infants in response to short speech sounds

    Directory of Open Access Journals (Sweden)

    Bram Van Dun

    2012-01-01

    Full Text Available

    Background: Cortical auditory evoked potentials (CAEPs are an emerging tool for hearing aid fitting evaluation in young children who cannot provide reliable behavioral feedback. It is therefore useful to determine the relationship between the sensation level of speech sounds and the detection sensitivity of CAEPs.

    Design and methods: Twenty-five sensorineurally hearing impaired infants with an age range of 8 to 30 months were tested once, 18 aided and 7 unaided. First, behavioral thresholds of speech stimuli /m/, /g/, and /t/ were determined using visual reinforcement orientation audiometry (VROA. Afterwards, the same speech stimuli were presented at 55, 65, and 75 dB SPL, and CAEP recordings were made. An automatic statistical detection paradigm was used for CAEP detection.

    Results: For sensation levels above 0, 10, and 20 dB respectively, detection sensitivities were equal to 72 ± 10, 75 ± 10, and 78 ± 12%. In 79% of the cases, automatic detection p-values became smaller when the sensation level was increased by 10 dB.

    Conclusions: The results of this study suggest that the presence or absence of CAEPs can provide some indication of the audibility of a speech sound for infants with sensorineural hearing loss. The detection of a CAEP provides confidence, to a degree commensurate with the detection probability, that the infant is detecting that sound at the level presented. When testing infants where the audibility of speech sounds has not been established behaviorally, the lack of a cortical response indicates the possibility, but by no means a certainty, that the sensation level is 10 dB or less.

  8. Evaluation of cleft palate speech.

    Science.gov (United States)

    Smith, Bonnie; Guyette, Thomas W

    2004-04-01

    Children born with palatal clefts are at risk for speech/language delay and speech problems related to palatal insufficiency. These individuals require regular speech evaluations, starting in the first year of life and often continuing into adulthood. The primary role of the speech pathologist on the cleft palate/craniofacial team is to evaluate whether deviations in oral cavity structures, such as the velopharynx, negatively impact speech production. This article focuses on the assessment of velopharyngeal function before and after palatal surgery. PMID:15145667

  9. Denial Denied: Freedom of Speech

    OpenAIRE

    Glen Newey

    2009-01-01

    Free speech is a widely held principle. This is in some ways surprising, since formal and informal censorship of speech is widespread, and rather different issues seem to arise depending on whether the censorship concerns who speaks, what content is spoken or how it is spoken. I argue that despite these facts, free speech can indeed be seen as a unitary principle. On my analysis, the core of the free speech principle is the denial of the denial of speech, whether to a speaker, to a propositio...

  10. Development of Guideline for Rating the Physical Impairment of Otolaryngologic Field

    OpenAIRE

    Park, Chul Won; Do, Nam Yong; Rha, Ki Sang; Chung, Sung Min; Kwon, Young Jun

    2009-01-01

    We develop a guideline for rating the physical impairment of otolaryngologic fields. Assessment of hearing disturbance and tinnitus required physical examination, pure tone audiometry, speech audiometry, impedance audiometry, brainstem evoked response audiometry, Bekesy audiometry, otoacoustic emission test, and imaging examination. History taking, physical examination, and radiological examination for the vestibular organ and brain, righting reflex test, electronystagmography, and caloric te...

  11. Packet speech systems technology

    Science.gov (United States)

    Weinstein, C. J.; Blankenship, P. E.

    1982-09-01

    The long-range objectives of the Packet Speech Systems Technology Program are to develop and demonstrate techniques for efficient digital speech communications on networks suitable for both voice and data, and to investigate and develop techniques for integrated voice and data communication in packetized networks, including wideband common-user satellite links. Specific areas of concern are: the concentration of statistically fluctuating volumes of voice traffic, the adaptation of communication strategies to varying conditions of network links and traffic volume, and the interconnection of wideband satellite networks to terrestrial systems. Previous efforts in this area have led to new vocoder structures for improved narrowband voice performance and multiple-rate transmission, and to demonstrations of conversational speech and conferencing on the ARPANET and the Atlantic Packet Satellite Network. The current program has two major thrusts: i.e., the development and refinement of practical low-cost, robust, narrowband, and variable-rate speech algorithms and voice terminal structures; and the establishment of an experimental wideband satellite network to serve as a unique facility for the realistic investigation of voice/data networking strategies.

  12. Black History Speech

    Science.gov (United States)

    Noldon, Carl

    2007-01-01

    The author argues in this speech that one cannot expect students in the school system to know and understand the genius of Black history if the curriculum is Eurocentric, which is a residue of racism. He states that his comments are designed for the enlightenment of those who suffer from a school system that "hypocritically manipulates Black…

  13. Hearing speech in music

    Directory of Open Access Journals (Sweden)

    Seth-Reino Ekström

    2011-01-01

    Full Text Available The masking effect of a piano composition, played at different speeds and in different octaves, on speech-perception thresholds was investigated in 15 normal-hearing and 14 moderately-hearing-impaired subjects. Running speech (just follow conversation, JFC testing and use of hearing aids increased the everyday validity of the findings. A comparison was made with standard audiometric noises [International Collegium of Rehabilitative Audiology (ICRA noise and speech spectrum-filtered noise (SPN]. All masking sounds, music or noise, were presented at the same equivalent sound level (50 dBA. The results showed a significant effect of piano performance speed and octave (P<.01. Low octave and fast tempo had the largest effect; and high octave and slow tempo, the smallest. Music had a lower masking effect than did ICRA noise with two or six speakers at normal vocal effort (P<.01 and SPN (P<.05. Subjects with hearing loss had higher masked thresholds than the normal-hearing subjects (P<.01, but there were smaller differences between masking conditions (P<.01. It is pointed out that music offers an interesting opportunity for studying masking under realistic conditions, where spectral and temporal features can be varied independently. The results have implications for composing music with vocal parts, designing acoustic environments and creating a balance between speech perception and privacy in social settings.

  14. Free Speech Yearbook 1979.

    Science.gov (United States)

    Kane, Peter E., Ed.

    The seven articles in this collection deal with theoretical and practical freedom of speech issues. Topics covered are: the United States Supreme Court, motion picture censorship, and the color line; judicial decision making; the established scientific community's suppression of the ideas of Immanuel Velikovsky; the problems of avant-garde jazz,…

  15. Charisma in business speeches

    DEFF Research Database (Denmark)

    Niebuhr, Oliver; Brem, Alexander; Novák-Tót, Eszter;

    2016-01-01

    Charisma is a key component of spoken language interaction; and it is probably for this reason that charismatic speech has been the subject of intensive research for centuries. However, what is still largely missing is a quantitative and objective line of research that, firstly, involves analyses...

  16. Hearing speech in music.

    Science.gov (United States)

    Ekström, Seth-Reino; Borg, Erik

    2011-01-01

    The masking effect of a piano composition, played at different speeds and in different octaves, on speech-perception thresholds was investigated in 15 normal-hearing and 14 moderately-hearing-impaired subjects. Running speech (just follow conversation, JFC) testing and use of hearing aids increased the everyday validity of the findings. A comparison was made with standard audiometric noises [International Collegium of Rehabilitative Audiology (ICRA) noise and speech spectrum-filtered noise (SPN)]. All masking sounds, music or noise, were presented at the same equivalent sound level (50 dBA). The results showed a significant effect of piano performance speed and octave (PMusic had a lower masking effect than did ICRA noise with two or six speakers at normal vocal effort (Pmusic offers an interesting opportunity for studying masking under realistic conditions, where spectral and temporal features can be varied independently. The results have implications for composing music with vocal parts, designing acoustic environments and creating a balance between speech perception and privacy in social settings. PMID:21768731

  17. 1984 Newbery Acceptance Speech.

    Science.gov (United States)

    Cleary, Beverly

    1984-01-01

    This acceptance speech for an award honoring "Dear Mr. Henshaw," a book about feelings of a lonely child of divorce intended for eight-, nine-, and ten-year-olds, highlights children's letters to author. Changes in society that affect children, the inception of "Dear Mr. Henshaw," and children's reactions to books are highlighted. (EJS)

  18. Metaheuristic applications to speech enhancement

    CERN Document Server

    Kunche, Prajna

    2016-01-01

    This book serves as a basic reference for those interested in the application of metaheuristics to speech enhancement. The major goal of the book is to explain the basic concepts of optimization methods and their use in heuristic optimization in speech enhancement to scientists, practicing engineers, and academic researchers in speech processing. The authors discuss why it has been a challenging problem for researchers to develop new enhancement algorithms that aid in the quality and intelligibility of degraded speech. They present powerful optimization methods to speech enhancement that can help to solve the noise reduction problems. Readers will be able to understand the fundamentals of speech processing as well as the optimization techniques, how the speech enhancement algorithms are implemented by utilizing optimization methods, and will be given the tools to develop new algorithms. The authors also provide a comprehensive literature survey regarding the topic.

  19. The cortical representation of the speech envelope is earlier for audiovisual speech than audio speech.

    Science.gov (United States)

    Crosse, Michael J; Lalor, Edmund C

    2014-04-01

    Visual speech can greatly enhance a listener's comprehension of auditory speech when they are presented simultaneously. Efforts to determine the neural underpinnings of this phenomenon have been hampered by the limited temporal resolution of hemodynamic imaging and the fact that EEG and magnetoencephalographic data are usually analyzed in response to simple, discrete stimuli. Recent research has shown that neuronal activity in human auditory cortex tracks the envelope of natural speech. Here, we exploit this finding by estimating a linear forward-mapping between the speech envelope and EEG data and show that the latency at which the envelope of natural speech is represented in cortex is shortened by >10 ms when continuous audiovisual speech is presented compared with audio-only speech. In addition, we use a reverse-mapping approach to reconstruct an estimate of the speech stimulus from the EEG data and, by comparing the bimodal estimate with the sum of the unimodal estimates, find no evidence of any nonlinear additive effects in the audiovisual speech condition. These findings point to an underlying mechanism that could account for enhanced comprehension during audiovisual speech. Specifically, we hypothesize that low-level acoustic features that are temporally coherent with the preceding visual stream may be synthesized into a speech object at an earlier latency, which may provide an extended period of low-level processing before extraction of semantic information. PMID:24401714

  20. Application of auditory brainstem response and pure tone audiometry in early diagnosis of acoustic neuroma%听性脑干反应和纯音听阈在听神经瘤早期诊断中的应用

    Institute of Scientific and Technical Information of China (English)

    赵赋; 武丽; 王博; 杨智君; 王振民; 王兴朝; 李朋; 张晶; 刘丕楠

    2015-01-01

    目的 探讨采用听性脑干反应和纯音听阈对早期诊断听神经瘤的临床应用价值.方法 回顾性分析了111例听神经瘤患者的临床资料、纯音听阈、听性脑干反应及增强磁共振结果,采用线性回归分析纯音听阈均值与肿瘤体积、病程是否存在相关性,采用卡方检验分析不同肿瘤体积在听性脑干反应异常发生率上是否存在差异.结果 听神经瘤引起感音神经性耳聋,纯音听阈均值与病程存在显著地相关性(P=0.000);听性脑干反应诊断听神经瘤的敏感度和特异度分别为98.2%和93.6%,肿瘤最大径>3 cm与≤3 cm两组,在患侧和对侧Ⅲ~Ⅳ波间期异常发生率上,差异均具有统计学意义(P值分别为0.038和0.045).结论 听性脑干反应联合纯音测听是早期诊断听神经瘤的有效方法.%Objective To investigate the clinical application value of using auditory brainstem response and pure tone audiometry for early diagnosis of acoustic neuroma.Methods The clinical data,the results of pure tone audiometry,auditory brainstem response,and enhanced MRI in 111 patients with acoustic neuroma were analyzed retrospectively.Linear regression analysis was used to analyze the correlation between the nean value of pure tone audiometry and the neuroma volune or course of disease.Chi-squared test was used to analyze the whether there were differences in the different neuroma volumes on the incidence of abnormal auditory brainstem response.Results Acoustic neuroma caused sensorineural deafness.There was a significant correlation between the mean value of pure tone audiometry and the course of disease (P =0.000).The sensitivity and specificity of auditory brainstem response for the diagnosis of acoustic neuroma were 98.2% and 93.6% respectively.The maximum diameters of neuromas were divided into 2 groups:> 3 cm or ≤3 cm.There were significant differences on the abnormal incidence of the Ⅲ to Ⅴ wave intervals of the

  1. SPEECH PROCESSING –AN OVERVIEW

    Directory of Open Access Journals (Sweden)

    A.INDUMATHI

    2012-06-01

    Full Text Available One of the earliest goals of speech processing was coding speech for efficient transmission. Later, the research spread in various area like Automatic Speech Recognition (ASR, Speech Synthesis (TTS,Speech Enhancement, Automatic Language Translation (ALT.Initially, ASR is used to recognize single words in a small vocabulary, later many product was developed for continuous speech for large vocabulary.Speech Synthesis is used for synthesizing the speech corresponding to a given text Speech Synthesis provide a way to communicate for persons unable to speak. When Speech Synthesis used together withASR, it allows a complete two-way spoken interaction between humans and machines. Speech Enhancement technique is applied to improve the quality of speech signal. Automatic Language Translation helps toconvert one language into another language. Basic concept of speech processing is provided for beginners.

  2. Predicting speech intelligibility in conditions with nonlinearly processed noisy speech

    DEFF Research Database (Denmark)

    Jørgensen, Søren; Dau, Torsten

    2013-01-01

    that a measure of the across audio-frequency variance at the output of the modulation-frequency selective process in the model is sufficient to account for the phase jitter distortion. Thus, a joint spectro-temporal modulation analysis, as proposed in [3], does not seem to be required. The results are......The speech-based envelope power spectrum model (sEPSM; [1]) was proposed in order to overcome the limitations of the classical speech transmission index (STI) and speech intelligibility index (SII). The sEPSM applies the signal-tonoise ratio in the envelope domain (SNRenv), which was demonstrated...... to successfully predict speech intelligibility in conditions with nonlinearly processed noisy speech, such as processing with spectral subtraction. Moreover, a multiresolution version (mr-sEPSM) was demonstrated to account for speech intelligibility in various conditions with stationary and...

  3. Hate Speech: Power in the Marketplace.

    Science.gov (United States)

    Harrison, Jack B.

    1994-01-01

    A discussion of hate speech and freedom of speech on college campuses examines the difference between hate speech from normal, objectionable interpersonal comments and looks at Supreme Court decisions on the limits of student free speech. Two cases specifically concerning regulation of hate speech on campus are considered: Chaplinsky v. New…

  4. Variation and Synthetic Speech

    CERN Document Server

    Miller, C; Massey, N; Miller, Corey; Karaali, Orhan; Massey, Noel

    1997-01-01

    We describe the approach to linguistic variation taken by the Motorola speech synthesizer. A pan-dialectal pronunciation dictionary is described, which serves as the training data for a neural network based letter-to-sound converter. Subsequent to dictionary retrieval or letter-to-sound generation, pronunciations are submitted a neural network based postlexical module. The postlexical module has been trained on aligned dictionary pronunciations and hand-labeled narrow phonetic transcriptions. This architecture permits the learning of individual postlexical variation, and can be retrained for each speaker whose voice is being modeled for synthesis. Learning variation in this way can result in greater naturalness for the synthetic speech that is produced by the system.

  5. [Improving speech comprehension using a new cochlear implant speech processor].

    Science.gov (United States)

    Müller-Deile, J; Kortmann, T; Hoppe, U; Hessel, H; Morsnowski, A

    2009-06-01

    The aim of this multicenter clinical field study was to assess the benefits of the new Freedom 24 sound processor for cochlear implant (CI) users implanted with the Nucleus 24 cochlear implant system. The study included 48 postlingually profoundly deaf experienced CI users who demonstrated speech comprehension performance with their current speech processor on the Oldenburg sentence test (OLSA) in quiet conditions of at least 80% correct scores and who were able to perform adaptive speech threshold testing using the OLSA in noisy conditions. Following baseline measures of speech comprehension performance with their current speech processor, subjects were upgraded to the Freedom 24 speech processor. After a take-home trial period of at least 2 weeks, subject performance was evaluated by measuring the speech reception threshold with the Freiburg multisyllabic word test and speech intelligibility with the Freiburg monosyllabic word test at 50 dB and 70 dB in the sound field. The results demonstrated highly significant benefits for speech comprehension with the new speech processor. Significant benefits for speech comprehension were also demonstrated with the new speech processor when tested in competing background noise.In contrast, use of the Abbreviated Profile of Hearing Aid Benefit (APHAB) did not prove to be a suitably sensitive assessment tool for comparative subjective self-assessment of hearing benefits with each processor. Use of the preprocessing algorithm known as adaptive dynamic range optimization (ADRO) in the Freedom 24 led to additional improvements over the standard upgrade map for speech comprehension in quiet and showed equivalent performance in noise. Through use of the preprocessing beam-forming algorithm BEAM, subjects demonstrated a highly significant improved signal-to-noise ratio for speech comprehension thresholds (i.e., signal-to-noise ratio for 50% speech comprehension scores) when tested with an adaptive procedure using the Oldenburg

  6. On Speech Act Theory

    Institute of Scientific and Technical Information of China (English)

    邓仁毅

    2009-01-01

    Speech act has developed from the work of linguistic philosophers and originates in Austin's observation and study. It was the particular search for the eonstative, utterances which describe something outside the text and can therefore be judged true or false that prompted John L. Austin to direct his attention to the distinction with so -called performa-tires. The two representative linguists are Aus-tin and Searle.

  7. HATE SPEECH AS COMMUNICATION

    OpenAIRE

    Gladilin Aleksey Vladimirovich

    2012-01-01

    The purpose of the paper is a theoretical comprehension of hate speech from communication point of view, on the one hand, and from the point of view of prejudice, stereotypes and discrimination on the other. Such a comprehension caused by the need to develop objective forensic linguistics methodology to analyze texts that are supposedly extremist. The method of analysis and synthesis is the basic in the investigation. Approach to functions and other elements of communication theory is based o...

  8. Predicting Speech Intelligibility

    OpenAIRE

    HINES, ANDREW

    2012-01-01

    Hearing impairment, and specifically sensorineural hearing loss, is an increasingly prevalent condition, especially amongst the ageing population. It occurs primarily as a result of damage to hair cells that act as sound receptors in the inner ear and causes a variety of hearing perception problems, most notably a reduction in speech intelligibility. Accurate diagnosis of hearing impairments is a time consuming process and is complicated by the reliance on indirect measurements based on patie...

  9. Regulating hate speech online

    OpenAIRE

    Banks, James

    2010-01-01

    The exponential growth in the Internet as a means of communication has been emulated by an increase in far-right and extremist web sites and hate based activity in cyberspace. The anonymity and mobility afforded by the Internet has made harassment and expressions of hate effortless in a landscape that is abstract and beyond the realms of traditional law enforcement. This paper examines the complexities of regulating hate speech on the Internet through legal and technological frameworks. It ex...

  10. Speech rhythm: a metaphor?

    Science.gov (United States)

    Nolan, Francis; Jeon, Hae-Sung

    2014-12-19

    Is speech rhythmic? In the absence of evidence for a traditional view that languages strive to coordinate either syllables or stress-feet with regular time intervals, we consider the alternative that languages exhibit contrastive rhythm subsisting merely in the alternation of stronger and weaker elements. This is initially plausible, particularly for languages with a steep 'prominence gradient', i.e. a large disparity between stronger and weaker elements; but we point out that alternation is poorly achieved even by a 'stress-timed' language such as English, and, historically, languages have conspicuously failed to adopt simple phonological remedies that would ensure alternation. Languages seem more concerned to allow 'syntagmatic contrast' between successive units and to use durational effects to support linguistic functions than to facilitate rhythm. Furthermore, some languages (e.g. Tamil, Korean) lack the lexical prominence which would most straightforwardly underpin prominence of alternation. We conclude that speech is not incontestibly rhythmic, and may even be antirhythmic. However, its linguistic structure and patterning allow the metaphorical extension of rhythm in varying degrees and in different ways depending on the language, and it is this analogical process which allows speech to be matched to external rhythms. PMID:25385774

  11. Speech and the Right Hemisphere

    Directory of Open Access Journals (Sweden)

    E. M. R. Critchley

    1991-01-01

    Full Text Available Two facts are well recognized: the location of the speech centre with respect to handedness and early brain damage, and the involvement of the right hemisphere in certain cognitive functions including verbal humour, metaphor interpretation, spatial reasoning and abstract concepts. The importance of the right hemisphere in speech is suggested by pathological studies, blood flow parameters and analysis of learning strategies. An insult to the right hemisphere following left hemisphere damage can affect residual language abilities and may activate non-propositional inner speech. The prosody of speech comprehension even more so than of speech production—identifying the voice, its affective components, gestural interpretation and monitoring one's own speech—may be an essentially right hemisphere task. Errors of a visuospatial type may occur in the learning process. Ease of learning by actors and when learning foreign languages is achieved by marrying speech with gesture and intonation, thereby adopting a right hemisphere strategy.

  12. Speech recognition in university classrooms

    OpenAIRE

    Wald, Mike; Bain, Keith; Basson, Sara H

    2002-01-01

    The LIBERATED LEARNING PROJECT (LLP) is an applied research project studying two core questions: 1) Can speech recognition (SR) technology successfully digitize lectures to display spoken words as text in university classrooms? 2) Can speech recognition technology be used successfully as an alternative to traditional classroom notetaking for persons with disabilities? This paper addresses these intriguing questions and explores the underlying complex relationship between speech recognition te...

  13. Visualizing structures of speech expressiveness

    OpenAIRE

    Herbelin, Bruno; Jensen, Karl Kristoffer; Graugaard, Lars

    2008-01-01

    Speech is both beautiful and informative. In this work, a conceptual study ofthe speech, through investigation of the tower of Babel, the archetypal phonemes, and astudy of the reasons of uses of language is undertaken in order to create an artistic workinvestigating the nature of speech. The Babel myth speaks about distance created whenaspiring to the heaven as the reason for language division. Meanwhile, Locquin statesthrough thorough investigations that only a few phonemes are present thro...

  14. Lecturer’s Speech Competence

    OpenAIRE

    Svetlana Viktorovna Panina; Svetlana Yurievna Zalutskaya; Galina Egorovna Zhondorova

    2014-01-01

    The analysis of the issue of lecturer’s speech competence is presented. Lecturer’s speech competence is the main component of professional image, the indicator of communicative culture, having a great impact on the quality of pedagogical activity Research objective: to define the main drawbacks of speech competence of lecturers of North-Eastern Federal University named after M. K. Ammosov (NEFU) (Russia, Yakutsk) and suggest the ways of drawbacks corrections in terms of multilingual education...

  15. Speech Recognition Technology: Applications & Future

    OpenAIRE

    Pankaj Pathak

    2010-01-01

    Voice or speech recognition is "the technology by which sounds, words or phrases spoken by humans are converted into electrical signals, and these signals are transformed into coding patterns to which meaning has been assigned", .It is the technology needs a combination of improved artificial intelligence technology and a more sophisticated speech-recognition engine . Initially a primitive device is developed which could recognize speech, by AT & T Bell Laboratories in the 1940s. According to...

  16. Motor Equivalence in Speech Production

    OpenAIRE

    Perrier, Pascal; Fuchs, Susanne

    2015-01-01

    International audience The first section provides a description of the concepts of “motor equivalence” and “degrees of freedom”. It is illustrated with a few examples of motor tasks in general and of speech production tasks in particular. In the second section, the methodology used to investigate experimentally motor equivalence phenomena in speech production is presented. It is mainly based on paradigms that perturb the perception-action loop during on-going speech, either by limiting the...

  17. Speech therapy for Parkinson's disease.

    OpenAIRE

    Scott, S; Caird, F I

    1983-01-01

    Twenty-six patients with the speech disorder of Parkinson's disease received daily speech therapy (prosodic exercises) at home for 2 to 3 weeks. There were significant improvements in speech as assessed by scores for prosodic abnormality and intelligibility' and these were maintained in part for up to 3 months. The degree of improvement was clinically and psychologically important, and relatives commented on the social benefits. The use of a visual reinforcement device produced limited benefi...

  18. Somatosensory basis of speech production.

    Science.gov (United States)

    Tremblay, Stéphanie; Shiller, Douglas M; Ostry, David J

    2003-06-19

    The hypothesis that speech goals are defined acoustically and maintained by auditory feedback is a central idea in speech production research. An alternative proposal is that speech production is organized in terms of control signals that subserve movements and associated vocal-tract configurations. Indeed, the capacity for intelligible speech by deaf speakers suggests that somatosensory inputs related to movement play a role in speech production-but studies that might have documented a somatosensory component have been equivocal. For example, mechanical perturbations that have altered somatosensory feedback have simultaneously altered acoustics. Hence, any adaptation observed under these conditions may have been a consequence of acoustic change. Here we show that somatosensory information on its own is fundamental to the achievement of speech movements. This demonstration involves a dissociation of somatosensory and auditory feedback during speech production. Over time, subjects correct for the effects of a complex mechanical load that alters jaw movements (and hence somatosensory feedback), but which has no measurable or perceptible effect on acoustic output. The findings indicate that the positions of speech articulators and associated somatosensory inputs constitute a goal of speech movements that is wholly separate from the sounds produced. PMID:12815431

  19. What Is Language? What Is Speech?

    Science.gov (United States)

    ... Public / Speech, Language and Swallowing / Development What Is Language? What Is Speech? [ en Español ] Kelly's 4-year-old son, Tommy, has speech and language problems. Friends and family have a hard time ...

  20. Neurocognitive mechanisms of audiovisual speech perception

    OpenAIRE

    Ojanen, Ville

    2005-01-01

    Face-to-face communication involves both hearing and seeing speech. Heard and seen speech inputs interact during audiovisual speech perception. Specifically, seeing the speaker's mouth and lip movements improves identification of acoustic speech stimuli, especially in noisy conditions. In addition, visual speech may even change the auditory percept. This occurs when mismatching auditory speech is dubbed onto visual articulation. Research on the brain mechanisms of audiovisual perception a...

  1. Enhancing Peer Feedback and Speech Preparation: The Speech Video Activity

    Science.gov (United States)

    Opt, Susan

    2012-01-01

    In the typical public speaking course, instructors or assistants videotape or digitally record at least one of the term's speeches in class or lab to offer students additional presentation feedback. Students often watch and self-critique their speeches on their own. Peers often give only written feedback on classroom presentations or completed…

  2. Auditory detection of non-speech and speech stimuli in noise: Native speech advantage.

    Science.gov (United States)

    Huo, Shuting; Tao, Sha; Wang, Wenjing; Li, Mingshuang; Dong, Qi; Liu, Chang

    2016-05-01

    Detection thresholds of Chinese vowels, Korean vowels, and a complex tone, with harmonic and noise carriers were measured in noise for Mandarin Chinese-native listeners. The harmonic index was calculated as the difference between detection thresholds of the stimuli with harmonic carriers and those with noise carriers. The harmonic index for Chinese vowels was significantly greater than that for Korean vowels and the complex tone. Moreover, native speech sounds were rated significantly more native-like than non-native speech and non-speech sounds. The results indicate that native speech has an advantage over other sounds in simple auditory tasks like sound detection. PMID:27250202

  3. Perceptual learning in speech

    OpenAIRE

    Norris, D.; McQueen, J.; Cutler, A.

    2003-01-01

    This study demonstrates that listeners use lexical knowledge in perceptual learning of speech sounds. Dutch listeners first made lexical decisions on Dutch words and nonwords. The final fricative of 20 critical words had been replaced by an ambiguous sound, between [f] and [s]. One group of listeners heard ambiguous [f]-final words (e.g., [WI tlo?], from witlof, chicory) and unambiguous [s]-final words (e.g., naaldbos, pine forest). Another group heard the reverse (e.g., ambiguous [na:ldbo?],...

  4. Taking a Stand for Speech.

    Science.gov (United States)

    Moore, Wayne D.

    1995-01-01

    Asserts that freedom of speech issues were among the first major confrontations in U.S. constitutional law. Maintains that lessons from the controversies surrounding the Sedition Act of 1798 have continuing practical relevance. Describes and discusses the significance of freedom of speech to the U.S. political system. (CFR)

  5. Speech Prosody in Cerebellar Ataxia

    Science.gov (United States)

    Casper, Maureen A.; Raphael, Lawrence J.; Harris, Katherine S.; Geibel, Jennifer M.

    2007-01-01

    Persons with cerebellar ataxia exhibit changes in physical coordination and speech and voice production. Previously, these alterations of speech and voice production were described primarily via perceptual coordinates. In this study, the spatial-temporal properties of syllable production were examined in 12 speakers, six of whom were healthy…

  6. Quality Estimation of Alaryngeal Speech

    Directory of Open Access Journals (Sweden)

    R. Dhivya

    2014-01-01

    Full Text Available Quality assessment can be done using subjective listening tests or using objective quality measures. Objective measures quantify quality. The sentence material is chosen from the IEEE corpus. Real-world noise data was taken from the noisy speech corpus NOIZEUS. An alaryngeal speaker's voice (alaryngeal speech) is recorded. To enhance the quality of speech produced from the prosthetic device, four classes of enhancement methods encompassing four algorithms are used: the multi-band spectral subtraction algorithm, the Karhunen–Loève transform (KLT) subspace algorithm, the MASK statistical-model based algorithm and the Wavelet Threshold-Wiener algorithm. The enhanced speech signals obtained from the four classes of algorithms are evaluated using Perceptual Evaluation of Speech Quality (PESQ). Spectrograms of these enhanced signals are also plotted.
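
    As a rough illustration of the spectral-subtractive class named above, a minimal single-band spectral subtraction sketch follows (Python assumed; the paper's multi-band variant subtracts a separate noise estimate per frequency band, and the parameter values here are illustrative assumptions, not the authors'):

    import numpy as np
    from scipy.signal import stft, istft

    def spectral_subtraction(noisy, fs, noise_frames=10, alpha=2.0, floor=0.02):
        """Subtract an average noise magnitude spectrum estimated from the
        first `noise_frames` STFT frames (assumed to be speech-free)."""
        f, t, X = stft(noisy, fs, nperseg=512)
        mag, phase = np.abs(X), np.angle(X)
        noise_mag = mag[:, :noise_frames].mean(axis=1, keepdims=True)
        clean_mag = mag - alpha * noise_mag             # over-subtraction
        clean_mag = np.maximum(clean_mag, floor * mag)  # spectral floor
        _, enhanced = istft(clean_mag * np.exp(1j * phase), fs, nperseg=512)
        return enhanced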

  7. Spatial localization of speech segments

    DEFF Research Database (Denmark)

    Karlsen, Brian Lykkegaard

    1999-01-01

    Much is known about human localization of simple stimuli like sinusoids, clicks, broadband noise and narrowband noise in quiet. Less is known about human localization in noise. Even less is known about localization of speech and very few previous studies have reported data from localization of...... distribution of which azimuth angle the target is likely to have originated from. The model is trained on the experimental data. On the basis of the experimental results, it is concluded that the human ability to localize speech segments in adverse noise depends on the speech segment as well as its point of...... speech in noise. This study attempts to answer the question: ``Are there certain features of speech which have an impact on the human ability to determine the spatial location of a speaker in the horizontal plane under adverse noise conditions?''. The study consists of an extensive literature survey on...

  8. Speech Compression Using Multecirculerletet Transform

    Directory of Open Access Journals (Sweden)

    Sulaiman Murtadha

    2012-01-01

    Full Text Available Compressing speech reduces data storage requirements, leading to a reduction in the time needed to transmit digitized speech over long-haul links like the internet. To obtain the best performance in speech compression, wavelet transforms require filters that combine a number of desirable properties, such as orthogonality and symmetry. The MCT basis functions are derived from the GHM basis functions using 2D linear convolution. The fast computation algorithm methods introduced here add desirable features to the current transform. We further assess the performance of the MCT in a speech compression application. This paper discusses the effect of using the DWT and the MCT (one and two dimensions) on speech compression. DWT and MCT performances in terms of compression ratio (CR), mean square error (MSE) and peak signal-to-noise ratio (PSNR) are assessed. Computer simulation results indicate that the two-dimensional MCT offers a better compression ratio, MSE and PSNR than the DWT.
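
    To make the figures of merit concrete, here is a hedged sketch of wavelet-based compression by coefficient thresholding, scored with CR, MSE and PSNR. PyWavelets and the db4 wavelet stand in for the authors' transforms (the MCT itself is not reimplemented), and the keep fraction is an assumption:

    import numpy as np
    import pywt

    def compress_and_score(x, wavelet="db4", level=4, keep=0.1):
        coeffs = pywt.wavedec(x, wavelet, level=level)
        flat = np.concatenate(coeffs)
        thresh = np.quantile(np.abs(flat), 1.0 - keep)  # keep largest 10%
        small = [pywt.threshold(c, thresh, mode="hard") for c in coeffs]
        rec = pywt.waverec(small, wavelet)[: len(x)]
        nonzero = sum(int(np.count_nonzero(c)) for c in small)
        cr = flat.size / max(1, nonzero)                # compression ratio
        mse = np.mean((x - rec) ** 2)
        psnr = 10 * np.log10(np.max(np.abs(x)) ** 2 / mse)
        return rec, cr, mse, psnr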

  9. Techniques for automatic speech recognition

    Science.gov (United States)

    Moore, R. K.

    1983-05-01

    A brief insight into some of the algorithms that lie behind current automatic speech recognition systems is provided. Early phonetically based approaches were not particularly successful, due mainly to a lack of appreciation of the problems involved. These problems are summarized, and various recognition techniques are reviewed in the context of the solutions that they provide. It is pointed out that the majority of currently available speech recognition equipment employs a "whole-word" pattern matching approach which, although relatively simple, has proved particularly successful in its ability to recognize speech. The concept of time-normalization plays a central role in this type of recognition process, and a family of such algorithms is described in detail. The technique of dynamic time warping is not only capable of providing good performance for isolated word recognition, but can also be extended to the recognition of connected speech (thereby removing one of the most severe limitations of early speech recognition equipment).
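
    The time-normalizing idea described above can be made concrete with the textbook dynamic time warping recurrence (a sketch in Python, not any particular product's implementation; the Euclidean frame distance is an assumption):

    import numpy as np

    def dtw_distance(a, b):
        """a, b: (frames, dims) feature matrices, e.g. per-frame spectra."""
        n, m = len(a), len(b)
        D = np.full((n + 1, m + 1), np.inf)
        D[0, 0] = 0.0
        for i in range(1, n + 1):
            for j in range(1, m + 1):
                cost = np.linalg.norm(a[i - 1] - b[j - 1])
                D[i, j] = cost + min(D[i - 1, j],       # insertion
                                     D[i, j - 1],       # deletion
                                     D[i - 1, j - 1])   # match
        return D[n, m]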

  10. Hammerstein Model for Speech Coding

    Directory of Open Access Journals (Sweden)

    Turunen Jari

    2003-01-01

    Full Text Available A nonlinear Hammerstein model is proposed for coding speech signals. Using Tsay's nonlinearity test, we first show that the great majority of speech frames contain nonlinearities (over 80% in our test data when using 20-millisecond speech frames). Frame length correlates with the level of nonlinearity: the longer the frames the higher the percentage of nonlinear frames. Motivated by this result, we present a nonlinear structure using a frame-by-frame adaptive identification of the Hammerstein model parameters for speech coding. Finally, the proposed structure is compared with the LPC coding scheme for three phonemes /a/, /s/, and /k/ by calculating the Akaike information criterion of the corresponding residual signals. The tests show clearly that the residual of the nonlinear model presented in this paper contains significantly less information compared to that of the LPC scheme. The presented method is a potential tool to shape the residual signal in an encode-efficient form in speech coding.
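
    A Hammerstein model is a static nonlinearity followed by a linear filter; the sketch below fits such a model to one frame by least squares over a combined polynomial/FIR basis (Python; a generic estimator under assumed orders, not the authors' exact identification procedure):

    import numpy as np

    def fit_hammerstein(u, y, poly_order=3, fir_taps=10):
        """Over-parameterized LS fit: y[n] ~ sum_{p,k} h[p,k] * u[n-k]**p."""
        N = len(u)
        cols = []
        for p in range(1, poly_order + 1):
            up = u ** p
            for k in range(fir_taps):
                col = np.zeros(N)
                col[k:] = up[: N - k]                # delayed nonlinear term
                cols.append(col)
        Phi = np.stack(cols, axis=1)
        theta, *_ = np.linalg.lstsq(Phi, y, rcond=None)
        residual = y - Phi @ theta                   # compare with LPC residual
        return theta, residual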

  11. PCA-Based Speech Enhancement for Distorted Speech Recognition

    Directory of Open Access Journals (Sweden)

    Tetsuya Takiguchi

    2007-09-01

    Full Text Available We investigated a robust speech feature extraction method using kernel PCA (Principal Component Analysis) for distorted speech recognition. Kernel PCA has been suggested for various image processing tasks requiring an image model, such as denoising, where a noise-free image is constructed from a noisy input image. Much research for robust speech feature extraction has been done, but it remains difficult to completely remove additive or convolution noise (distortion). The most commonly used noise-removal techniques are based on the spectral-domain operation, and then for speech recognition, the MFCC (Mel Frequency Cepstral Coefficient) is computed, where the DCT (Discrete Cosine Transform) is applied to the mel-scale filter bank output. This paper describes a new PCA-based speech enhancement algorithm using kernel PCA instead of DCT, where the main speech element is projected onto low-order features, while the noise or distortion element is projected onto high-order features. Its effectiveness is confirmed by word recognition experiments on distorted speech.
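
    The denoising idea, projecting onto low-order kernel principal components and reconstructing, can be sketched with scikit-learn's KernelPCA (the component count and RBF gamma below are assumptions; the paper's exact pre-image method may differ):

    import numpy as np
    from sklearn.decomposition import KernelPCA

    def kpca_denoise(frames, n_components=20):
        """frames: (n_frames, n_features), e.g. mel filter-bank outputs."""
        kpca = KernelPCA(n_components=n_components, kernel="rbf",
                         gamma=1e-3, fit_inverse_transform=True)
        low_order = kpca.fit_transform(frames)    # main speech element
        return kpca.inverse_transform(low_order)  # approximate pre-images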

  12. Hate Speech or Free Speech: Can Broad Campus Speech Regulations Survive Current Judicial Reasoning?

    Science.gov (United States)

    Heiser, Gregory M.; Rossow, Lawrence F.

    1993-01-01

    Federal courts have found speech regulations overbroad in suits against the University of Michigan and the University of Wisconsin System. Attempts to assess the theoretical justification and probable fate of broad speech regulations that have not been explicitly rejected by the courts. Concludes that strong arguments for broader regulation will…

  13. Hate Speech/Free Speech: Using Feminist Perspectives To Foster On-Campus Dialogue.

    Science.gov (United States)

    Cornwell, Nancy; Orbe, Mark P.; Warren, Kiesha

    1999-01-01

    Explores the complex issues inherent in the tension between hate speech and free speech, focusing on the phenomenon of hate speech on college campuses. Describes the challenges to hate speech made by critical race theorists and explains how a feminist critique can reorient the parameters of hate speech. (SLD)

  14. The Stylistic Analysis of Public Speech

    Institute of Scientific and Technical Information of China (English)

    李龙

    2011-01-01

    Public speech is a very important part of our daily life. The ability to deliver a good public speech is something we need to learn and to have, especially in the service sector. This paper attempts to analyze the style of public speech, in the hope of providing inspiration for whenever we deliver such a speech.

  15. Phonetic Recalibration Only Occurs in Speech Mode

    Science.gov (United States)

    Vroomen, Jean; Baart, Martijn

    2009-01-01

    Upon hearing an ambiguous speech sound dubbed onto lipread speech, listeners adjust their phonetic categories in accordance with the lipread information (recalibration) that tells what the phoneme should be. Here we used sine wave speech (SWS) to show that this tuning effect occurs if the SWS sounds are perceived as speech, but not if the sounds…

  16. Integrating prosodic information into a speech recogniser

    OpenAIRE

    López Soto, María Teresa

    2001-01-01

    In the last decade there has been an increasing tendency to incorporate language engineering strategies into speech technology. This technique combines linguistic and mathematical information in different applications: machine translation, natural language processing, speech synthesis and automatic speech recognition (ASR). In the field of speech synthesis, this hybrid approach (linguistic and mathematical/statistical) has led to the design of efficient models for reproducin...

  17. Infant Perception of Atypical Speech Signals

    Science.gov (United States)

    Vouloumanos, Athena; Gelfand, Hanna M.

    2013-01-01

    The ability to decode atypical and degraded speech signals as intelligible is a hallmark of speech perception. Human adults can perceive sounds as speech even when they are generated by a variety of nonhuman sources including computers and parrots. We examined how infants perceive the speech-like vocalizations of a parrot. Further, we examined how…

  18. From data to speech: a general approach

    NARCIS (Netherlands)

    Theune, M.; Klabbers, E.A.M.; Pijper, de J.R.; Krahmer, E.; Odijk, J.; Boguraev, B.; Tait, J.; Jacquemin, C.

    2001-01-01

    We present a data-to-speech system called D2S, which can be used for the creation of data-to-speech systems in different languages and domains. The most important characteristic of a data-to-speech system is that it combines language and speech generation: language generation is used to produce a na

  19. Automated Speech Rate Measurement in Dysarthria

    Science.gov (United States)

    Martens, Heidi; Dekens, Tomas; Van Nuffelen, Gwen; Latacz, Lukas; Verhelst, Werner; De Bodt, Marc

    2015-01-01

    Purpose: In this study, a new algorithm for automated determination of speech rate (SR) in dysarthric speech is evaluated. We investigated how reliably the algorithm calculates the SR of dysarthric speech samples when compared with calculation performed by speech-language pathologists. Method: The new algorithm was trained and tested using Dutch…

  20. Perception of Emotion in Conversational Speech by Younger and Older Listeners

    Science.gov (United States)

    Schmidt, Juliane; Janse, Esther; Scharenborg, Odette

    2016-01-01

    This study investigated whether age and/or differences in hearing sensitivity influence the perception of the emotion dimensions arousal (calm vs. aroused) and valence (positive vs. negative attitude) in conversational speech. To that end, this study specifically focused on the relationship between participants’ ratings of short affective utterances and the utterances’ acoustic parameters (pitch, intensity, and articulation rate) known to be associated with the emotion dimensions arousal and valence. Stimuli consisted of short utterances taken from a corpus of conversational speech. In two rating tasks, younger and older adults either rated arousal or valence using a 5-point scale. Mean intensity was found to be the main cue participants used in the arousal task (i.e., higher mean intensity cueing higher levels of arousal) while mean F0 was the main cue in the valence task (i.e., higher mean F0 being interpreted as more negative). Even though there were no overall age group differences in arousal or valence ratings, compared to younger adults, older adults responded less strongly to mean intensity differences cueing arousal and responded more strongly to differences in mean F0 cueing valence. Individual hearing sensitivity among the older adults did not modify the use of mean intensity as an arousal cue. However, individual hearing sensitivity generally affected valence ratings and modified the use of mean F0. We conclude that age differences in the interpretation of mean F0 as a cue for valence are likely due to age-related hearing loss, whereas age differences in rating arousal do not seem to be driven by hearing sensitivity differences between age groups (as measured by pure-tone audiometry). PMID:27303340
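
    The two utterance-level cues the study links to arousal and valence can be extracted roughly as follows (a sketch assuming librosa; the F0 search range is an assumption, and real prosodic analysis is more careful than this):

    import numpy as np
    import librosa

    def arousal_valence_cues(path):
        y, sr = librosa.load(path, sr=None)
        rms = librosa.feature.rms(y=y)[0]
        mean_intensity_db = 20 * np.log10(np.mean(rms) + 1e-12)  # arousal cue
        f0, voiced, _ = librosa.pyin(y, fmin=65.0, fmax=400.0, sr=sr)
        mean_f0 = np.nanmean(f0)             # valence cue, voiced frames only
        return mean_intensity_db, mean_f0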

  1. Speech recognition from spectral dynamics

    Indian Academy of Sciences (India)

    Hynek Hermansky

    2011-10-01

    Information is carried in changes of a signal. The paper starts with revisiting Dudley’s concept of the carrier nature of speech. It points to its close connection to modulation spectra of speech and argues against short-term spectral envelopes as dominant carriers of the linguistic information in speech. The history of spectral representations of speech is briefly discussed. Some of the history of gradual infusion of the modulation spectrum concept into Automatic recognition of speech (ASR) comes next, pointing to the relationship of modulation spectrum processing to well-accepted ASR techniques such as dynamic speech features or RelAtive SpecTrAl (RASTA) filtering. Next, the frequency domain perceptual linear prediction technique for deriving autoregressive models of temporal trajectories of spectral power in individual frequency bands is reviewed. Finally, posterior-based features, which allow for straightforward application of modulation frequency domain information, are described. The paper is tutorial in nature, aims at a historical global overview of attempts for using spectral dynamics in machine recognition of speech, and does not always provide enough detail of the described techniques. However, extensive references to earlier work are provided to compensate for the lack of detail in the paper.
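
    The RASTA idea, band-pass filtering the time trajectory of each log-spectral band so that only speech-rate modulations survive, can be sketched as below (Python; the coefficients follow the commonly cited formulation of the RASTA filter, assumed here rather than taken from this paper):

    import numpy as np
    from scipy.signal import lfilter

    def rasta_filter(log_spectra):
        """log_spectra: (bands, frames) log energies, filtered along time."""
        b = 0.1 * np.array([2.0, 1.0, 0.0, -1.0, -2.0])  # FIR differentiator
        a = np.array([1.0, -0.98])                       # leaky integrator pole
        return lfilter(b, a, log_spectra, axis=1)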

  2. INTEGRATING MACHINE TRANSLATION AND SPEECH SYNTHESIS COMPONENT FOR ENGLISH TO DRAVIDIAN LANGUAGE SPEECH TO SPEECH TRANSLATION SYSTEM

    Directory of Open Access Journals (Sweden)

    J. SANGEETHA

    2015-02-01

    Full Text Available This paper provides an interface between the machine translation and speech synthesis systems for converting English text to Tamil speech in an English-to-Tamil speech-to-speech translation system. The speech translation system consists of three modules: automatic speech recognition, machine translation and text-to-speech synthesis. Many procedures for the integration of speech recognition and machine translation have been proposed, but the speech synthesis component has not yet been given the same attention. In this paper, we focus on the integration of machine translation and speech synthesis, and report a subjective evaluation investigating the impact of the speech synthesis, the machine translation and the integrated machine translation and speech synthesis components. Here we implement a hybrid machine translation system (a combination of rule-based and statistical machine translation) and a concatenative syllable-based speech synthesis technique. In order to retain the naturalness and intelligibility of the synthesized speech, Auto Associative Neural Network (AANN) prosody prediction is used in this work. The results of this system investigation demonstrate that the naturalness and intelligibility of the synthesized speech are strongly influenced by the fluency and correctness of the translated text.

  3. Visualizing structures of speech expressiveness

    DEFF Research Database (Denmark)

    Herbelin, Bruno; Jensen, Karl Kristoffer; Graugaard, Lars

    2008-01-01

    Speech is both beautiful and informative. In this work, a conceptual study of the speech, through investigation of the tower of Babel, the archetypal phonemes, and a study of the reasons of uses of language is undertaken in order to create an artistic work investigating the nature of speech. The...... Babel myth speaks about distance created when aspiring to the heaven as the reason for language division. Meanwhile, Locquin states through thorough investigations that only a few phonemes are present throughout history. Our interpretation is that a system able to recognize archetypal phonemes through...

  4. Speech enhancement theory and practice

    CERN Document Server

    Loizou, Philipos C

    2013-01-01

    With the proliferation of mobile devices and hearing devices, including hearing aids and cochlear implants, there is a growing and pressing need to design algorithms that can improve speech intelligibility without sacrificing quality. Responding to this need, Speech Enhancement: Theory and Practice, Second Edition introduces readers to the basic problems of speech enhancement and the various algorithms proposed to solve these problems. Updated and expanded, this second edition of the bestselling textbook broadens its scope to include evaluation measures and enhancement algorithms aimed at impr

  5. Speech recovery device

    Energy Technology Data Exchange (ETDEWEB)

    Frankle, Christen M.

    2004-04-20

    There is provided an apparatus and method for assisting speech recovery in people with inability to speak due to aphasia, apraxia or another condition with similar effect. A hollow, rigid, thin-walled tube with semi-circular or semi-elliptical cut out shapes at each open end is positioned such that one end mates with the throat/voice box area of the neck of the assistor and the other end mates with the throat/voice box area of the assisted. The speaking person (assistor) makes sounds that produce standing wave vibrations at the same frequency in the vocal cords of the assisted person. Driving the assisted person's vocal cords with the assisted person being able to hear the correct tone enables the assisted person to speak by simply amplifying the vibration of membranes in their throat.

  6. Speech Enhancement via EMD

    Directory of Open Access Journals (Sweden)

    Monia Turki-Hadj Alouane

    2008-06-01

    Full Text Available In this study, two new approaches for speech signal noise reduction based on the empirical mode decomposition (EMD), recently introduced by Huang et al. (1998), are proposed. Based on the EMD, both reduction schemes are fully data-driven approaches. The noisy signal is decomposed adaptively into oscillatory components called intrinsic mode functions (IMFs), using a temporal decomposition called the sifting process. Two strategies for noise reduction are proposed: filtering and thresholding. The basic principle of these two methods is the reconstruction of the signal from IMFs previously filtered, using the minimum mean-squared error (MMSE) filter introduced by I. Y. Soon et al. (1998), or thresholded using a shrinkage function. The performance of these methods is analyzed and compared with those of the MMSE filter and wavelet shrinkage. The study is limited to signals corrupted by additive white Gaussian noise. The obtained results show that the proposed denoising schemes perform better than the MMSE filter and the wavelet approach.
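
    The thresholding strategy can be sketched as follows: decompose with EMD, shrink the first (noise-dominated, highest-frequency) IMFs, and reconstruct. The PyEMD package, the number of thresholded IMFs and the universal-threshold rule are all assumptions of this sketch, not details from the paper:

    import numpy as np
    from PyEMD import EMD

    def emd_denoise(noisy, n_noisy_imfs=3):
        imfs = EMD()(noisy)                      # sifting process -> IMFs
        for i in range(min(n_noisy_imfs, len(imfs))):
            sigma = np.median(np.abs(imfs[i])) / 0.6745  # robust noise scale
            t = sigma * np.sqrt(2 * np.log(len(noisy)))  # universal threshold
            imfs[i] = np.sign(imfs[i]) * np.maximum(np.abs(imfs[i]) - t, 0.0)
        return imfs.sum(axis=0)                  # reconstruct from shrunk IMFs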

  7. Silog: Speech Input Logon

    Science.gov (United States)

    Grau, Sergio; Allen, Tony; Sherkat, Nasser

    Silog is a biometric authentication system that extends the conventional PC logon process using voice verification. Users enter their ID and password using a conventional Windows logon procedure but then the biometric authentication stage makes a Voice over IP (VoIP) call to a VoiceXML (VXML) server. User interaction with this speech-enabled component then allows the user's voice characteristics to be extracted as part of a simple user/system spoken dialogue. If the captured voice characteristics match those of a previously registered voice profile, then network access is granted. If no match is possible, then a potential unauthorised system access has been detected and the logon process is aborted.

  8. Speech recovery device

    Energy Technology Data Exchange (ETDEWEB)

    Frankle, Christen M.

    2000-10-19

    There is provided an apparatus and method for assisting speech recovery in people with inability to speak due to aphasia, apraxia or another condition with similar effect. A hollow, rigid, thin-walled tube with semi-circular or semi-elliptical cut out shapes at each open end is positioned such that one end mates with the throat/voice box area of the neck of the assistor and the other end mates with the throat/voice box area of the assisted. The speaking person (assistor) makes sounds that produce standing wave vibrations at the same frequency in the vocal cords of the assisted person. Driving the assisted person's vocal cords with the assisted person being able to hear the correct tone enables the assisted person to speak by simply amplifying the vibration of membranes in their throat.

  9. Speech is Golden

    DEFF Research Database (Denmark)

    Juel Henrichsen, Peter

    2014-01-01

    on the supply side. The present article reports on a new public action strategy which has taken shape in the course of 2013-14. While Denmark is a small language area, our public sector is well organised and has considerable purchasing power. Across this past year, Danish local authorities have...... organised around the speech technology challenge, they have formulated a number of joint questions and new requirements to be met by suppliers and have deliberately worked towards formulating tendering material which will allow fair competition. Public researchers have contributed to this work, including...... the author of the present article, in the role of economically neutral advisers. The aim of the initiative is to pave the way for the first profitable contract in the field - which we hope to see in 2014 - an event which would precisely break the present deadlock and open up a billion EUR market for...

  10. Microphone Array Speech Recognition : Experiments on Overlapping Speech in Meetings

    OpenAIRE

    Moore, Darren; McCowan, Iain A.

    2002-01-01

    This paper investigates the use of microphone arrays to acquire and recognise speech in meetings. Meetings pose several interesting problems for speech processing, as they consist of multiple competing speakers within a small space, typically around a table. Due to their ability to provide hands-free acquisition and directional discrimination, microphone arrays present a potential alternative to close-talking microphones in such an application. We first propose an appropriate microphone array...

  11. Delayed Speech or Language Development

    Science.gov (United States)

    ... distinction between the two: Speech is the verbal expression of language and includes articulation, which is the ... sounds or words repeatedly and can't use oral language to communicate more than his or her ...

  12. Emotion Recognition using Speech Features

    CERN Document Server

    Rao, K Sreenivasa

    2013-01-01

    “Emotion Recognition Using Speech Features” covers emotion-specific features present in speech and discussion of suitable models for capturing emotion-specific information for distinguishing different emotions.  The content of this book is important for designing and developing  natural and sophisticated speech systems. Drs. Rao and Koolagudi lead a discussion of how emotion-specific information is embedded in speech and how to acquire emotion-specific knowledge using appropriate statistical models. Additionally, the authors provide information about using evidence derived from various features and models. The acquired emotion-specific knowledge is useful for synthesizing emotions. Discussion includes global and local prosodic features at syllable, word and phrase levels, helpful for capturing emotion-discriminative information; use of complementary evidences obtained from excitation sources, vocal tract systems and prosodic features in order to enhance the emotion recognition performance;  and pro...

  13. Speech and Language Developmental Milestones

    Science.gov (United States)

    ... of “brain plasticity”—the ways in which the brain is influenced by health conditions or life experiences—and how it can be used to develop learning strategies that encourage healthy language and speech development in ...

  14. Lattice Parsing for Speech Recognition

    OpenAIRE

    Chappelier, Jean-Cédric; Rajman, Martin; Aragües, Ramon; Rozenknop, Antoine

    1999-01-01

    A lot of work remains to be done in the domain of better integration of speech recognition and language processing systems. This paper gives an overview of several strategies for integrating linguistic models into speech understanding systems and investigates several ways of producing sets of hypotheses that include more "semantic" variability than usual language models. The main goal is to present and demonstrate by actual experiments that sequential coupling may be efficiently achieved byw...

  15. High-frequency audiometry in audiological complementary diagnosis: a review of the national literature

    Directory of Open Access Journals (Sweden)

    Karlin Fabianne Klagenberg

    2011-03-01

    Full Text Available High-frequency audiometry (HFA) is an important audiological test for the early detection of hearing loss caused by lesions at the base of the cochlear duct. In recent years, its use has been facilitated by the fact that commercially available audiometers now incorporate frequencies above 8 kHz. However, there are differences related to the equipment used, the methodologies employed, and the results and their interpretation. Therefore, the aim of this article was to analyze the national scientific production on the clinical application of HFA in order to understand its current use. Texts published and indexed in the LILACS, SciELO and Medline databases over a ten-year period were searched, using the descriptor "audiometria de altas frequências/high-frequency audiometry". Twenty-four national scientific articles using HFA were found; the populations evaluated were mostly between 18 and 50 years of age; 13 of the studies determined thresholds using decibel hearing level (dB HL) as the reference; some studies compared pure-tone thresholds between groups to define normality; and the authors reported significant differences in high-frequency hearing thresholds across ages. HFA is used in the audiology clinic for the early identification of hearing alterations and for monitoring the hearing of subjects exposed to ototoxic drugs and/or other ototraumatic agents.

  16. Novel Techniques for Dialectal Arabic Speech Recognition

    CERN Document Server

    Elmahdy, Mohamed; Minker, Wolfgang

    2012-01-01

    Novel Techniques for Dialectal Arabic Speech Recognition describes approaches to improve automatic speech recognition for dialectal Arabic. Since speech resources for dialectal Arabic speech recognition are very sparse, the authors describe how existing Modern Standard Arabic (MSA) speech data can be applied to dialectal Arabic speech recognition, while assuming that MSA is always a second language for all Arabic speakers. In this book, Egyptian Colloquial Arabic (ECA) has been chosen as a typical Arabic dialect. ECA is the first-ranked Arabic dialect in terms of number of speakers, and a high-quality ECA speech corpus with accurate phonetic transcription has been collected. MSA acoustic models were trained using news broadcast speech. In order to cross-lingually use MSA in dialectal Arabic speech recognition, the authors have normalized the phoneme sets for MSA and ECA. After this normalization, they have applied state-of-the-art acoustic model adaptation techniques like Maximum Likelihood Linear Regression (MLLR) and M...

  17. Neural pathways for visual speech perception

    Directory of Open Access Journals (Sweden)

    Lynne E Bernstein

    2014-12-01

    Full Text Available This paper examines the questions: what levels of speech can be perceived visually, and how is visual speech represented by the brain? Review of the literature leads to the conclusions that every level of psycholinguistic speech structure (i.e., phonetic features, phonemes, syllables, words, and prosody) can be perceived visually, although individuals differ in their abilities to do so; and that there are visual modality-specific representations of speech qua speech in higher-level vision brain areas. That is, the visual system represents the modal patterns of visual speech. The suggestion that the auditory speech pathway receives and represents visual speech is examined in light of neuroimaging evidence on the auditory speech pathways. We outline the generally agreed-upon organization of the visual ventral and dorsal pathways and examine several types of visual processing that might be related to speech through those pathways, specifically, face and body, orthography, and sign language processing. In this context, we examine the visual speech processing literature, which reveals widespread diverse patterns of activity in posterior temporal cortices in response to visual speech stimuli. We outline a model of the visual and auditory speech pathways and make several suggestions: (1) The visual perception of speech relies on visual pathway representations of speech qua speech. (2) A proposed site of these representations, the temporal visual speech area (TVSA), has been demonstrated in posterior temporal cortex, ventral and posterior to the multisensory posterior superior temporal sulcus (pSTS). (3) Given that visual speech has dynamic and configural features, its representations in feedforward visual pathways are expected to integrate these features, possibly in TVSA.

  18. Experimental comparison between speech transmission index, rapid speech transmission index, and speech intelligibility index.

    Science.gov (United States)

    Larm, Petra; Hongisto, Valtteri

    2006-02-01

    During the acoustical design of, e.g., auditoria or open-plan offices, it is important to know how speech can be perceived in various parts of the room. Different objective methods have been developed to measure and predict speech intelligibility, and these have been extensively used in various spaces. In this study, two such methods were compared, the speech transmission index (STI) and the speech intelligibility index (SII). Also the simplification of the STI, the room acoustics speech transmission index (RASTI), was considered. These quantities are all based on determining an apparent speech-to-noise ratio on selected frequency bands and summing them using a specific weighting. For comparison, some data were needed on the possible differences of these methods resulting from the calculation scheme and also measuring equipment. Their prediction accuracy was also of interest. Measurements were made in a laboratory having adjustable noise level and absorption, and in a real auditorium. It was found that the measurement equipment, especially the selection of the loudspeaker, can greatly affect the accuracy of the results. The prediction accuracy of the RASTI was found acceptable, if the input values for the prediction are accurately known, even though the studied space was not ideally diffuse. PMID:16521772
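
    The shared core of these metrics, mapping per-band apparent SNRs to transmission indices and combining them with band weights, can be sketched as follows (Python; the uniform weights are an illustrative assumption, not the weights of any of the standards):

    import numpy as np

    def sti_like_index(apparent_snr_db, weights=None):
        snr = np.clip(np.asarray(apparent_snr_db, float), -15.0, 15.0)
        ti = (snr + 15.0) / 30.0           # transmission index per band
        if weights is None:
            weights = np.full(len(ti), 1.0 / len(ti))
        return float(np.dot(weights, ti))

    # e.g. seven octave bands, 125 Hz .. 8 kHz:
    # sti_like_index([3.0, 5.0, 9.0, 12.0, 7.0, 1.0, -4.0])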

  19. Sparse representation in speech signal processing

    Science.gov (United States)

    Lee, Te-Won; Jang, Gil-Jin; Kwon, Oh-Wook

    2003-11-01

    We review the sparse representation principle for processing speech signals. A transformation for encoding the speech signals is learned such that the resulting coefficients are as independent as possible. We use independent component analysis with an exponential prior to learn a statistical representation for speech signals. This representation leads to extremely sparse priors that can be used for encoding speech signals for a variety of purposes. We review applications of this method for speech feature extraction, automatic speech recognition and speaker identification. Furthermore, this method is also suited for tackling the difficult problem of separating two sounds given only a single microphone.
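
    In the spirit of the review, a sparse basis for short speech frames can be learned with ICA; the sketch below uses scikit-learn's FastICA as a stand-in for the authors' exponential-prior model (frame length and component count are assumptions):

    import numpy as np
    from sklearn.decomposition import FastICA

    def learn_sparse_code(signal, frame_len=160, n_components=64):
        n = len(signal) // frame_len
        frames = signal[: n * frame_len].reshape(n, frame_len)
        ica = FastICA(n_components=n_components, whiten="unit-variance")
        codes = ica.fit_transform(frames)  # sparse, near-independent coefficients
        return codes, ica.mixing_          # basis functions in the columns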

  20. Speech vs. singing: infants choose happier sounds

    OpenAIRE

    Corbeil, Marieve; Trehub, Sandra E.; Peretz, Isabelle

    2013-01-01

    Infants prefer speech to non-vocal sounds and to non-human vocalizations, and they prefer happy-sounding speech to neutral speech. They also exhibit an interest in singing, but there is little knowledge of their relative interest in speech and singing. The present study explored infants' attention to unfamiliar audio samples of speech and singing. In Experiment 1, infants 4–13 months of age were exposed to happy-sounding infant-directed speech vs. hummed lullabies by the same woman. They list...

  1. Speech versus singing: Infants choose happier sounds

    OpenAIRE

    Corbeil, Marieve; Trehub, Sandra E.; Peretz, Isabelle

    2013-01-01

    Infants prefer speech to non-vocal sounds and to non-human vocalizations, and they prefer happy-sounding speech to neutral speech. They also exhibit an interest in singing, but there is little knowledge of their relative interest in speech and singing. The present study explored infants’ attention to unfamiliar audio samples of speech and singing. In Experiment 1, infants 4-13 months of age were exposed to happy-sounding infant-directed speech versus hummed lullabies by the same woman. They l...

  2. APPLICATION OF WAVELET TRANSFORM FOR SPEECH PROCESSING

    Directory of Open Access Journals (Sweden)

    SUBHRA DEBDAS

    2011-08-01

    Full Text Available A distinctive feature of wavelet transforms is their application to speech signals. Several problems arise in speech processing: synthesis, analysis, compression and classification. A method is evaluated that uses wavelets for speech analysis and synthesis, distinguishing between voiced and unvoiced speech and determining pitch; methods for choosing optimum wavelets for speech compression are also discussed. Comparative perception results, obtained by listening to speech synthesized using both scalar and vector quantized wavelet parameters, are reported in this paper.

  3. Lecturer’s Speech Competence

    Directory of Open Access Journals (Sweden)

    Svetlana Viktorovna Panina

    2014-11-01

    Full Text Available An analysis of the issue of lecturers' speech competence is presented. A lecturer's speech competence is a main component of professional image and an indicator of communicative culture, and it has a great impact on the quality of pedagogical activity. Research objective: to define the main drawbacks in the speech competence of lecturers of North-Eastern Federal University named after M. K. Ammosov (NEFU) (Russia, Yakutsk) and to suggest ways of correcting these drawbacks in the multilingual educational environment of a higher education institution. The method of questionnaire was used in the research, and NEFU students took part. The answers to the questionnaire allowed us to define the drawbacks most typical of lecturers working in the multicultural educational environment of a regional higher education institution: word repetition, breaking of language rules, wrong vocabulary or pronunciation of foreign words, use of colloquial language, etc., all of which violate speech standards and decrease the quality of lecture material presentation. The authors suggest improving lecturers' speech competence through the organization of special advanced training courses, business games, discussion platforms, the dissemination of teaching aids and handbooks, on-line tutorials, and the mastering of motivated dialogue as the most effective way of organizing the student training process.

  4. The treatment of apraxia of speech : Speech and music therapy, an innovative joint effort

    NARCIS (Netherlands)

    Hurkmans, Josephus Johannes Stephanus

    2016-01-01

    Apraxia of Speech (AoS) is a neurogenic speech disorder. A wide variety of behavioural methods have been developed to treat AoS. Various therapy programmes use musical elements to improve speech production. A unique therapy programme combining elements of speech therapy and music therapy is called S

  5. Predicting Speech Intelligibility with a Multiple Speech Subsystems Approach in Children with Cerebral Palsy

    Science.gov (United States)

    Lee, Jimin; Hustad, Katherine C.; Weismer, Gary

    2014-01-01

    Purpose: Speech acoustic characteristics of children with cerebral palsy (CP) were examined with a multiple speech subsystems approach; speech intelligibility was evaluated using a prediction model in which acoustic measures were selected to represent three speech subsystems. Method: Nine acoustic variables reflecting different subsystems, and…

  6. Modeling speech intelligibility in adverse conditions

    DEFF Research Database (Denmark)

    Dau, Torsten

    the normal as well as impaired auditory system. Jørgensen and Dau [(2011). J. Acoust. Soc. Am. 130, 1475-1487] proposed the speech-based envelope power spectrum model (sEPSM) in an attempt to overcome the limitations of the classical speech transmission index (STI) and speech intelligibility index...... (SII) in conditions with nonlinearly processed speech. Instead of considering the reduction of the temporal modulation energy as the intelligibility metric, as assumed in the STI, the sEPSM applies the signal-to-noise ratio in the envelope domain (SNRenv). This metric was shown to be the key for...... predicting the intelligibility of reverberant speech as well as noisy speech processed by spectral subtraction. However, the sEPSM cannot account for speech subjected to phase jitter, a condition in which the spectral structure of speech is destroyed, while the broadband temporal envelope is kept largely...
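
    A minimal sketch of the SNRenv idea follows: compare the envelope power of the noisy speech with that of the noise alone within a modulation band (Python; assumes separate access to the noise, as in the model's formulation, and an illustrative 1-4 Hz modulation band):

    import numpy as np
    from scipy.signal import hilbert, butter, sosfilt

    def envelope_power(x, fs, lo, hi):
        env = np.abs(hilbert(x))                 # temporal envelope
        env = env - env.mean()                   # AC component only
        sos = butter(2, [lo, hi], btype="band", fs=fs, output="sos")
        return np.mean(sosfilt(sos, env) ** 2)

    def snr_env_db(noisy_speech, noise, fs, lo=1.0, hi=4.0):
        p_mix = envelope_power(noisy_speech, fs, lo, hi)
        p_noise = envelope_power(noise, fs, lo, hi)
        p_speech = max(p_mix - p_noise, 1e-12)   # excess envelope power
        return 10 * np.log10(p_speech / p_noise)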

  7. Speech Sound Disorders: Articulation and Phonological Processes

    Science.gov (United States)

    ... an SLP to learn correct speech sounds. Some speech sound errors can result from physical problems, such as: developmental disorders (e.g.,autism) genetic syndromes (e.g., Down syndrome) hearing loss ...

  8. Speech perception and production in severe environments

    Science.gov (United States)

    Pisoni, David B.

    1990-09-01

    The goal was to acquire new knowledge about speech perception and production in severe environments such as high masking noise, increased cognitive load or sustained attentional demands. Changes were examined in speech production under these adverse conditions through acoustic analysis techniques. One set of studies focused on the effects of noise on speech production. The experiments in this group were designed to generate a database of speech obtained in noise and in quiet. A second set of experiments was designed to examine the effects of cognitive load on the acoustic-phonetic properties of speech. Talkers were required to carry out a demanding perceptual motor task while they read lists of test words. A final set of experiments explored the effects of vocal fatigue on the acoustic-phonetic properties of speech. Both cognitive load and vocal fatigue are present in many applications where speech recognition technology is used, yet their influence on speech production is poorly understood.

  9. Experimental study on phase perception in speech

    Institute of Scientific and Technical Information of China (English)

    BU Fanliang; CHEN Yanpu

    2003-01-01

    As the human ear is dull to phase in speech, little attention has been paid to phase information in speech coding. In fact, the speech perceptual quality may be degraded if the phase distortion is very large. The perceptual effect of the STFT (Short time Fourier transform) phase spectrum is studied by auditory subjective hearing tests. Three main conclusions are: (1) If the phase information is neglected completely, the subjective quality of the reconstructed speech may be very poor; (2) Whether the neglected phase is in the low frequency band or the high frequency band, the difference from the original speech can be perceived by ear; (3) It is very difficult for the human ear to perceive the difference of speech quality between original speech and reconstructed speech while the phase quantization step size is shorter than π/7.
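
    The quantization experiment in conclusion (3) is easy to reproduce in outline: quantize the STFT phase with a given step and resynthesize (a sketch; the analysis window and overlap are assumptions, not the paper's settings):

    import numpy as np
    from scipy.signal import stft, istft

    def quantize_phase(x, fs, step=np.pi / 7, nperseg=512):
        _, _, X = stft(x, fs, nperseg=nperseg)
        phase = np.round(np.angle(X) / step) * step      # quantized phase
        _, y = istft(np.abs(X) * np.exp(1j * phase), fs, nperseg=nperseg)
        return y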

  10. STUDY ON PHASE PERCEPTION IN SPEECH

    Institute of Scientific and Technical Information of China (English)

    Tong Ming; Bian Zhengzhong; Li Xiaohui; Dai Qijun; Chen Yanpu

    2003-01-01

    The perceptual effect of the phase information in speech has been studied by auditory subjective tests. On the condition that the phase spectrum in speech is changed while the amplitude spectrum is unchanged, the tests show that: (1) If the envelope of the reconstructed speech signal is unchanged, there is no perceptible auditory difference between the original speech and the reconstructed speech; (2) The auditory perception effect of the reconstructed speech mainly lies on the amplitude of the derivative of the additive phase; (3) td is the maximum relative time shift between different frequency components of the reconstructed speech signal. The speech quality is excellent while td < 10 ms; good while 10 ms < td < 20 ms; common while 20 ms < td < 35 ms; and poor while td > 35 ms.

  11. Speech and Language Problems in Children

    Science.gov (United States)

    Children vary in their development of speech and language skills. Health professionals have milestones for what's normal. ... it may be due to a speech or language disorder. Language disorders can mean that the child ...

  12. Speech Segmentation Algorithm Based On Fuzzy Memberships

    OpenAIRE

    Luis D. Huerta; Jose Antonio Huesca; Julio C. Contreras

    2010-01-01

    In this work, an automatic speech segmentation algorithm with text independency was implemented. In the algorithm, the use of fuzzy memberships on each characteristic in different speech sub-bands is proposed. Thus, the segmentation is performed a greater detail. Additionally, we tested with various speech signal frequencies and labeling, and we could observe how they affect the performance of the segmentation process in phonemes. The speech segmentation algorithm used is described. During th...

  13. Auditory plasticity and speech motor learning

    OpenAIRE

    Nasir, Sazzad M.; Ostry, David J.

    2009-01-01

    Is plasticity in sensory and motor systems linked? Here, in the context of speech motor learning and perception, we test the idea that sensory function is modified by motor learning and, in particular, that speech motor learning affects a speaker's auditory map. We assessed speech motor learning by using a robotic device that displaced the jaw and selectively altered somatosensory feedback during speech. We found that with practice speakers progressively corrected for the mechanical perturbation a...

  14. Learning Representations of Affect from Speech

    OpenAIRE

    Ghosh, Sayan; Laksana, Eugene; Morency, Louis-Philippe; Scherer, Stefan

    2015-01-01

    There has been a lot of prior work on representation learning for speech recognition applications, but not much emphasis has been given to an investigation of effective representations of affect from speech, where the paralinguistic elements of speech are separated out from the verbal content. In this paper, we explore denoising autoencoders for learning paralinguistic attributes i.e. categorical and dimensional affective traits from speech. We show that the representations learnt by the bott...
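
    The workhorse here is the denoising autoencoder; a minimal PyTorch sketch over acoustic feature vectors is given below (layer sizes and the corruption level are assumptions, not the paper's architecture):

    import torch
    import torch.nn as nn

    class DenoisingAE(nn.Module):
        def __init__(self, n_features=40, n_hidden=16):
            super().__init__()
            self.encoder = nn.Sequential(nn.Linear(n_features, n_hidden), nn.Tanh())
            self.decoder = nn.Linear(n_hidden, n_features)

        def forward(self, x):
            noisy = x + 0.1 * torch.randn_like(x)  # corrupt the input
            code = self.encoder(noisy)             # bottleneck representation
            return self.decoder(code), code

    # training step sketch:
    # recon, code = model(batch); loss = nn.functional.mse_loss(recon, batch)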

  15. Review of Speech & Language Assessment Tests

    OpenAIRE

    Mohadeseh Tarazani; Nasrin Keramati; Asma Sheikh Najdi; Nilofar Rastegarian; Shohreh Jalaei; Maryam Tarameshlo; Meisam Amid Far; Masomeh Radaei; Maryam Faghani Abokheili

    2010-01-01

    Background and Aim: Standard tests are tools for quantifying different aspects of speech and language abilities and communicative skills. Developing and applying standard tests is necessary for assessing, screening and defining speech and language abilities, diagnosing speech and language disorders, and attending to the outcomes of the treatment process. Therefore, in this study, to provide a general view of speech and language assessment tests, we reviewed some of the tests related to some imp...

  16. Neural bases of accented speech perception

    OpenAIRE

    Adank, Patti; Nuttall, Helen E.; Banks, Briony; Kennedy-Higgins, Daniel

    2015-01-01

    The recognition of unfamiliar regional and foreign accents represents a challenging task for the speech perception system (Floccia et al., 2006; Adank et al., 2009). Despite the frequency with which we encounter such accents, the neural mechanisms supporting successful perception of accented speech are poorly understood. Nonetheless, candidate neural substrates involved in processing speech in challenging listening conditions, including accented speech, are beginning to be identified. This re...

  17. Software Requirement Specification Using Reverse Speech Technology

    OpenAIRE

    2014-01-01

    Speech analysis has been taken to a new level with the discovery of Reverse Speech (RS). RS is the discovery of hidden messages, referred to as reversals, in normal speech. Work is in progress on exploiting the relevance of RS in different real-world applications such as investigation and the medical field. In this paper we present an innovative method for preparing a reliable Software Requirement Specification (SRS) document with the help of reverse speech. As the SRS acts as the backbone for the ...

  18. Hate Speech and the First Amendment.

    Science.gov (United States)

    Rainey, Susan J.; Kinsler, Waren S.; Kannarr, Tina L.; Reaves, Asa E.

    This document is comprised of California state statutes, federal legislation, and court litigation pertaining to hate speech and the First Amendment. The document provides an overview of California education code sections relating to the regulation of speech; basic principles of the First Amendment; government efforts to regulate hate speech,…

  19. Liberalism, Speech Codes, and Related Problems.

    Science.gov (United States)

    Sunstein, Cass R.

    1993-01-01

    It is argued that universities are pervasively and necessarily engaged in regulation of speech, which complicates many existing claims about hate speech codes on campus. The ultimate test is whether the restriction on speech is a legitimate part of the institution's mission, commitment to liberal education. (MSE)

  20. Hate Speech on Campus: A Practical Approach.

    Science.gov (United States)

    Hogan, Patrick

    1997-01-01

    Looks at arguments concerning hate speech and speech codes on college campuses, arguing that speech codes are likely to be of limited value in achieving civil rights objectives, and that there are alternatives less harmful to civil liberties and more successful in promoting civil rights. Identifies specific goals, and considers how restriction of…

  1. Epoch-based analysis of speech signals

    Indian Academy of Sciences (India)

    B Yegnanarayana; Suryakanth V Gangashetty

    2011-10-01

    Speech analysis is traditionally performed using short-time analysis to extract features in time and frequency domains. The window size for the analysis is fixed somewhat arbitrarily, mainly to account for the time varying vocal tract system during production. However, speech in its primary mode of excitation is produced due to impulse-like excitation in each glottal cycle. Anchoring the speech analysis around the glottal closure instants (epochs) yields significant benefits for speech analysis. Epoch-based analysis of speech helps not only to segment the speech signals based on speech production characteristics, but also helps in accurate analysis of speech. It enables extraction of important acoustic-phonetic features such as glottal vibrations, formants, instantaneous fundamental frequency, etc. Epoch sequence is useful to manipulate prosody in speech synthesis applications. Accurate estimation of epochs helps in characterizing voice quality features. Epoch extraction also helps in speech enhancement and multispeaker separation. In this tutorial article, the importance of epochs for speech analysis is discussed, and methods to extract the epoch information are reviewed. Applications of epoch extraction for some speech applications are demonstrated.
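
    One well-known epoch extraction method from this literature is zero-frequency filtering, sketched below (Python; the trend-removal window of roughly one to two pitch periods is an assumption):

    import numpy as np
    from scipy.signal import lfilter

    def zff_epochs(s, fs, win_ms=8.0):
        x = np.diff(s, prepend=s[0])               # remove DC offset
        for _ in range(2):                         # cascaded 0-Hz resonators
            x = lfilter([1.0], [1.0, -2.0, 1.0], x)
        w = max(3, int(fs * win_ms / 1000) | 1)    # odd trend-removal window
        for _ in range(3):                         # remove the growing trend
            x = x - np.convolve(x, np.ones(w) / w, mode="same")
        # epochs: negative-to-positive zero crossings of the ZFF signal
        return np.where((x[:-1] < 0) & (x[1:] >= 0))[0]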

  2. Characteristics of Speech Motor Development in Children.

    Science.gov (United States)

    Ostry, David J.; And Others

    1984-01-01

    Pulsed ultrasound was used to study tongue movements in the speech of children from 3 to 11 years of age. The speech data obtained were characteristic of systems that can be described by second-order differential equations. Relationships observed in these systems may indicate that speech control involves tonic and phasic muscle inputs. (Author/RH)

  3. Interventions for Speech Sound Disorders in Children

    Science.gov (United States)

    Williams, A. Lynn, Ed.; McLeod, Sharynne, Ed.; McCauley, Rebecca J., Ed.

    2010-01-01

    With detailed discussion and invaluable video footage of 23 treatment interventions for speech sound disorders (SSDs) in children, this textbook and DVD set should be part of every speech-language pathologist's professional preparation. Focusing on children with functional or motor-based speech disorders from early childhood through the early…

  4. Factors of Politeness and Indirect Speech Acts

    Institute of Scientific and Technical Information of China (English)

    杨雪梅

    2016-01-01

    The politeness principle is deeply influenced by a nation's history, culture, customs and so on; therefore different countries have different understandings and expressions of politeness and indirect speech acts. This paper shows some main factors influencing polite speech. Through this article, readers can gain a comprehensive knowledge of politeness and indirect speech acts.

  5. Application of wavelets in speech processing

    CERN Document Server

    Farouk, Mohamed Hesham

    2014-01-01

    This book provides a survey of the widespread employment of wavelet analysis in different applications of speech processing. The author examines development and research in different applications of speech processing. The book also summarizes the state-of-the-art research on wavelets in speech processing.

  6. Current trends in multilingual speech processing

    Indian Academy of Sciences (India)

    Hervé Bourlard; John Dines; Mathew Magimai-Doss; Philip N Garner; David Imseng; Petr Motlicek; Hui Liang; Lakshmi Saheer; Fabio Valente

    2011-10-01

    In this paper, we describe recent work at Idiap Research Institute in the domain of multilingual speech processing and provide some insights into emerging challenges for the research community. Multilingual speech processing has been a topic of ongoing interest to the research community for many years and the field is now receiving renewed interest owing to two strong driving forces. Firstly, technical advances in speech recognition and synthesis are posing new challenges and opportunities to researchers. For example, discriminative features are seeing wide application by the speech recognition community, but additional issues arise when using such features in a multilingual setting. Another example is the apparent convergence of speech recognition and speech synthesis technologies in the form of statistical parametric methodologies. This convergence enables the investigation of new approaches to unified modelling for automatic speech recognition and text-to-speech synthesis (TTS) as well as cross-lingual speaker adaptation for TTS. The second driving force is the impetus being provided by both government and industry for technologies to help break down domestic and international language barriers, these also being barriers to the expansion of policy and commerce. Speech-to-speech and speech-to-text translation are thus emerging as key technologies at the heart of which lies multilingual speech processing.

  7. Acoustics of Clear Speech: Effect of Instruction

    Science.gov (United States)

    Lam, Jennifer; Tjaden, Kris; Wilding, Greg

    2012-01-01

    Purpose: This study investigated how different instructions for eliciting clear speech affected selected acoustic measures of speech. Method: Twelve speakers were audio-recorded reading 18 different sentences from the Assessment of Intelligibility of Dysarthric Speech (Yorkston & Beukelman, 1984). Sentences were produced in habitual, clear,…

  8. Discriminative learning for speech recognition

    CERN Document Server

    He, Xiaodong

    2008-01-01

    In this book, we introduce the background and mainstream methods of probabilistic modeling and discriminative parameter optimization for speech recognition. The specific models treated in depth include the widely used exponential-family distributions and the hidden Markov model. A detailed study is presented on unifying the common objective functions for discriminative learning in speech recognition, namely maximum mutual information (MMI), minimum classification error, and minimum phone/word error. The unification is presented, with rigorous mathematical analysis, in a common rational-functio
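
    For reference, the MMI objective named above is usually written as follows (a textbook form in our notation, not necessarily the book's):

    F_{\mathrm{MMI}}(\lambda) \;=\; \sum_{r}
      \log \frac{p_{\lambda}(X_r \mid M_{w_r})\, P(w_r)}
                {\sum_{w} p_{\lambda}(X_r \mid M_{w})\, P(w)}

    where X_r is the r-th training utterance, w_r its reference transcription, and M_w the composite model for word sequence w.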

  9. Reproducible Research in Speech Sciences

    Directory of Open Access Journals (Sweden)

    Kálmán Abari

    2012-11-01

    Full Text Available Reproducible research is the minimum standard of scientific claims in cases when independent replication proves to be difficult. With the special combination of available software tools, we provide a reproducibility recipe for the experimental research conducted in some fields of speech sciences. We have based our model on the triad of the R environment, the EMU-format speech database, and the executable publication. We present the use of three typesetting systems (LaTeX, Markdown, Org) with the help of a mini research.

  10. Annotating Speech Corpus for Prosody Modeling in Indian Language Text to Speech Systems

    Directory of Open Access Journals (Sweden)

    Kiruthiga S

    2012-01-01

    Full Text Available A spoken language system, whether a speech synthesis or a speech recognition system, starts with building a speech corpus. We give a detailed survey of issues and a methodology for selecting the appropriate speech unit in building a speech corpus for Indian language Text to Speech systems. The paper ultimately aims to improve the intelligibility of the synthesized speech in Text to Speech synthesis systems. To begin with, an appropriate text file should be selected for building the speech corpus. Then a corresponding speech file is generated and stored. This speech file is the phonetic representation of the selected text file. The speech file is processed at different levels, viz., paragraphs, sentences, phrases, words, syllables and phones. These are called the speech units of the file. Research has been done taking these units as the basic unit for processing. This paper analyses the research done using phones, diphones, triphones, syllables and polysyllables as the basic unit for speech synthesis. The paper also provides a recommended set of combinations for polysyllables. Concatenative speech synthesis involves the concatenation of these basic units to synthesize intelligible, natural-sounding speech. The speech units are annotated with relevant prosodic information about each unit, manually or automatically, based on an algorithm. The database consisting of the units along with their annotated information is called the annotated speech corpus. A clustering technique is used in the annotated speech corpus that provides a way to select the appropriate unit for concatenation, based on the lowest total join cost of the speech unit, as sketched below.
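
    The lowest-total-join-cost criterion mentioned above is a shortest-path problem over candidate units; a dynamic-programming sketch follows (Python; join_cost is a placeholder the caller supplies, and real systems add target-cost and prosodic terms):

    import numpy as np

    def select_units(candidates, join_cost):
        """candidates: per-position lists of candidate unit features;
        join_cost(a, b): cost of concatenating unit a before unit b."""
        best = np.zeros(len(candidates[0]))
        back = []
        for prev, cur in zip(candidates, candidates[1:]):
            costs = np.array([[best[i] + join_cost(a, b)
                               for i, a in enumerate(prev)] for b in cur])
            back.append(costs.argmin(axis=1))
            best = costs.min(axis=1)
        path = [int(best.argmin())]
        for bp in reversed(back):
            path.append(int(bp[path[-1]]))
        return path[::-1]       # chosen unit index at each position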

  11. Assessing a speaker for fast speech in unit selection speech synthesis

    OpenAIRE

    Moers, Donata; Wagner, Petra

    2009-01-01

    This paper describes work in progress concerning the adequate modeling of fast speech in unit selection speech synthesis systems, mostly having in mind blind and visually impaired users. Initially, a survey of the main characteristics of fast speech will be given. Subsequently, strategies for fast speech production will be discussed. Certain requirements concerning the ability of a speaker of a fast speech unit selection inventory are drawn. The following section deals with a perception ...

  12. The treatment of apraxia of speech: Speech and music therapy, an innovative joint effort

    OpenAIRE

    Hurkmans, Josephus Johannes Stephanus

    2016-01-01

    Apraxia of Speech (AoS) is a neurogenic speech disorder. A wide variety of behavioural methods have been developed to treat AoS. Various therapy programmes use musical elements to improve speech production. A unique therapy programme combining elements of speech therapy and music therapy is called Speech-Music Therapy for Aphasia (SMTA). In clinical practice, patients with AoS have experienced positive outcomes of SMTA; however, there was no evidence of this treatment’s effectiveness. This th...

  13. Speech intelligibility of native and non-native speech

    NARCIS (Netherlands)

    Wijngaarden, S.J. van

    1999-01-01

    The intelligibility of speech is known to be lower if the talker is non-native instead of native for the given language. This study is aimed at quantifying the overall degradation due to acoustic-phonetic limitations of non-native talkers of Dutch, specifically of Dutch-speaking Americans who have l

  14. Text To Speech System for Telugu Language

    OpenAIRE

    Siva Kumar, M.; Prakash Babu, E.

    2014-01-01

    Telugu is one of the oldest languages in India. This paper describes the development of Telugu Text-to-Speech System (TTS).In Telugu TTS the input is Telugu text in Unicode. The voices are sampled from real recorded speech. The objective of a text to speech system is to convert an arbitrary text into its corresponding spoken waveform. Speech synthesis is a process of building machinery that can generate human-like speech from any text input to imitate human speakers. Text proc...

  15. Recent Advances in Robust Speech Recognition Technology

    CERN Document Server

    Ramírez, Javier

    2011-01-01

    This E-book is a collection of articles that describe advances in speech recognition technology. Robustness in speech recognition refers to the need to maintain high speech recognition accuracy even when the quality of the input speech is degraded, or when the acoustical, articulate, or phonetic characteristics of speech in the training and testing environments differ. Obstacles to robust recognition include acoustical degradations produced by additive noise, the effects of linear filtering, nonlinearities in transduction or transmission, as well as impulsive interfering sources, and diminished…

  16. Speech in Mobile and Pervasive Environments

    CERN Document Server

    Rajput, Nitendra

    2012-01-01

    This book brings together the latest research in one comprehensive volume that deals with issues related to speech processing on resource-constrained, wireless, and mobile devices, such as speech recognition in noisy environments, specialized hardware for speech recognition and synthesis, the use of context to enhance recognition, the emerging and new standards required for interoperability, speech applications on mobile devices, distributed processing between the client and the server, and the relevance of Speech in Mobile and Pervasive Environments for developing regions--an area of explosive…

  17. Speech perception of noise with binary gains

    DEFF Research Database (Denmark)

    Wang, DeLiang; Kjems, Ulrik; Pedersen, Michael Syskind;

    2008-01-01

    For a given mixture of speech and noise, an ideal binary time-frequency mask is constructed by comparing speech energy and noise energy within local time-frequency units. It is observed that listeners achieve nearly perfect speech recognition from gated noise with binary gains prescribed by the ideal binary mask. Only 16 filter channels and a frame rate of 100 Hz are sufficient for high intelligibility. The results show that, despite a dramatic reduction of speech information, a pattern of binary gains provides an adequate basis for speech perception.
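
    The mask construction described above can be sketched as follows (assuming the speech and noise signals are available separately before mixing, as in the study; the 0 dB local criterion is one common choice):

        import numpy as np
        from scipy.signal import stft, istft

        def ideal_binary_mask(speech, noise, fs, criterion_db=0.0):
            """Binary gain of 1 where local speech energy exceeds local noise
            energy by the criterion, else 0, per time-frequency unit."""
            _, _, S = stft(speech, fs, nperseg=512)
            _, _, N = stft(noise, fs, nperseg=512)
            local_snr_db = 20 * np.log10((np.abs(S) + 1e-12) / (np.abs(N) + 1e-12))
            return (local_snr_db > criterion_db).astype(float)

        def gate_noise(noise, mask, fs):
            """Apply the binary gains to the noise alone, as in the listening test."""
            _, _, N = stft(noise, fs, nperseg=512)
            _, gated = istft(N * mask, fs, nperseg=512)
            return gated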

  18. Feasibility of Technology Enabled Speech Disorder Screening.

    Science.gov (United States)

    Duenser, Andreas; Ward, Lauren; Stefani, Alessandro; Smith, Daniel; Freyne, Jill; Morgan, Angela; Dodd, Barbara

    2016-01-01

    One in twenty Australian children suffers from a speech disorder. Early detection of such problems can significantly improve literacy and academic outcomes for these children, reduce health and educational burden and ongoing social costs. Here we present the development of a prototype and feasibility tests of a screening and decision support tool to assess speech disorders in young children. The prototype incorporates speech signal processing, machine learning and expert knowledge to automatically classify phonemes of normal and disordered speech. We discuss these results and our future work towards the development of a mobile tool to facilitate broad, early speech disorder screening by non-experts. PMID:27440284

  19. Pattern recognition in speech and language processing

    CERN Document Server

    Chou, Wu

    2003-01-01

    Minimum Classification Error (MCE) Approach in Pattern Recognition, Wu Chou; Minimum Bayes-Risk Methods in Automatic Speech Recognition, Vaibhava Goel and William Byrne; A Decision Theoretic Formulation for Adaptive and Robust Automatic Speech Recognition, Qiang Huo; Speech Pattern Recognition Using Neural Networks, Shigeru Katagiri; Large Vocabulary Speech Recognition Based on Statistical Methods, Jean-Luc Gauvain; Toward Spontaneous Speech Recognition and Understanding, Sadaoki Furui; Speaker Authentication, Qi Li and Biing-Hwang Juang; HMMs for Language Processing Problems, Ri…

  20. Perceived Speech Quality Estimation Using DTW Algorithm

    Directory of Open Access Journals (Sweden)

    S. Arsenovski

    2009-06-01

    Full Text Available In this paper a method for speech quality estimation is evaluated by simulating the transfer of speech over packet-switched and mobile networks. The proposed system uses the Dynamic Time Warping algorithm for comparison of the test and received speech. Several tests have been made on a test speech sample of a single speaker with simulated packet (frame) loss effects on the perceived speech. The achieved results have been compared with measured PESQ values on the used transmission channel and their correlation has been observed.
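
    A textbook DTW alignment of the kind used for such comparisons (not necessarily the authors' exact variant) can be sketched as:

        import numpy as np

        def dtw_distance(ref, test):
            """DTW cost between two feature sequences (rows = frames,
            e.g. spectral vectors of the test and received speech)."""
            n, m = len(ref), len(test)
            D = np.full((n + 1, m + 1), np.inf)
            D[0, 0] = 0.0
            for i in range(1, n + 1):
                for j in range(1, m + 1):
                    d = np.linalg.norm(ref[i - 1] - test[j - 1])  # local frame distance
                    D[i, j] = d + min(D[i - 1, j], D[i, j - 1], D[i - 1, j - 1])
            return D[n, m] / (n + m)  # length-normalised warp cost

    A lower warp cost indicates that the received speech stays close to the reference despite timing distortions, which is what makes the measure usable as a quality proxy alongside PESQ.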

  1. Embedding speech into virtual realities

    Science.gov (United States)

    Bohn, Christian-Arved; Krueger, Wolfgang

    1993-05-01

    In this work a speaker-independent speech recognition system is presented, which is suitable for implementation in Virtual Reality applications. The use of an artificial neural network in connection with a special compression of the acoustic input leads to a system which is robust, fast, easy to use, and needs no additional hardware besides common VR equipment.

  2. Speech Recognition on Mobile Devices

    DEFF Research Database (Denmark)

    Tan, Zheng-Hua; Lindberg, Børge

    2010-01-01

    The enthusiasm of deploying automatic speech recognition (ASR) on mobile devices is driven both by remarkable advances in ASR technology and by the demand for efficient user interfaces on such devices as mobile phones and personal digital assistants (PDAs). This chapter presents an overview of ASR...

  3. Paraconsistent semantics of speech acts

    NARCIS (Netherlands)

    Dunin-Kȩplicz, Barbara; Strachocka, Alina; Szałas, Andrzej; Verbrugge, Rineke

    2015-01-01

    This paper discusses an implementation of four speech acts: assert, concede, request and challenge in a paraconsistent framework. A natural four-valued model of interaction yields multiple new cognitive situations. They are analyzed in the context of communicative relations, which partially replace

  4. The Ontogenesis of Speech Acts

    Science.gov (United States)

    Bruner, Jerome S.

    1975-01-01

    A speech act approach to the transition from pre-linguistic to linguistic communication is adopted in order to consider language in relation to behavior and to allow for an emphasis on the use, rather than the form, of language. A pilot study of mothers and infants is discussed. (Author/RM)

  5. Prosodic Contrasts in Ironic Speech

    Science.gov (United States)

    Bryant, Gregory A.

    2010-01-01

    Prosodic features in spontaneous speech help disambiguate implied meaning not explicit in linguistic surface structure, but little research has examined how these signals manifest themselves in real conversations. Spontaneously produced verbal irony utterances generated between familiar speakers in conversational dyads were acoustically analyzed…

  6. On speech recognition during anaesthesia

    DEFF Research Database (Denmark)

    Alapetite, Alexandre

    2007-01-01

    This PhD thesis in human-computer interfaces (informatics) studies the case of the anaesthesia record used during medical operations and the possibility to supplement it with speech recognition facilities. Problems and limitations have been identified with the traditional paper-based anaesthesia record, as well as with newer electronic versions, leading to inaccuracies in the anaesthesia record. Supplementing the electronic anaesthesia record interface with speech input facilities is proposed as one possible solution to a part of the problem. The testing of the various hypotheses has involved the development of a prototype of an electronic anaesthesia record with speech input, together with studies of recognition accuracy. Finally, the last part of the thesis looks at the acceptance and success of a speech recognition system introduced in a Danish hospital to produce patient records.

  7. Modelling speech intelligibility in adverse conditions

    DEFF Research Database (Denmark)

    Jørgensen, Søren; Dau, Torsten

    2013-01-01

    This study compares the speech-based envelope power spectrum model (sEPSM) to the spectro-temporal modulation index (STMI) (Elhilali et al., Speech Commun 41:331-348, 2003), which assumes an explicit analysis of the spectral "ripple" structure of the speech signal. However, since the STMI applies the same decision metric as the STI, it fails to account for spectral subtraction. Instead of considering the reduction of the temporal modulation energy as the intelligibility metric, as assumed in the STI, the sEPSM applies the signal-to-noise ratio in the envelope domain (SNRenv). This metric was shown to be the key for predicting the intelligibility of reverberant speech as well as noisy speech processed by spectral subtraction. The key role of the SNRenv metric is further supported here by the ability of a short-term version of the sEPSM to predict speech masking release for different speech materials and modulated interferers. However, the sEPSM cannot account for speech…
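
    The decision metric referred to above, paraphrased from the model's definition (hedged; notation mine), is the envelope-domain SNR computed from the envelope power of the noisy speech and of the noise alone at the output of each modulation filter:

        \mathrm{SNR}_{env} = \frac{P_{env}^{S+N} - P_{env}^{N}}{P_{env}^{N}}

    with the per-channel values combined across audio and modulation channels before being mapped to intelligibility.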

  8. Optimal Wavelets for Speech Signal Representations

    Directory of Open Access Journals (Sweden)

    Shonda L. Walker

    2003-08-01

    Full Text Available It is well known that in many speech processing applications, speech signals are characterized by their voiced and unvoiced components. Voiced speech components contain a dense frequency spectrum with many harmonics. The periodic or semi-periodic nature of voiced signals lends itself to Fourier processing. Unvoiced speech contains many high-frequency components and thus resembles random noise. Several methods for voiced and unvoiced speech representations that utilize wavelet processing have been developed. These methods seek to improve the accuracy of wavelet-based speech signal representations using adaptive wavelet techniques; superwavelets, which use a linear combination of adaptive wavelets; Gaussian methods; and a multi-resolution sinusoidal transform approach, to mention a few. This paper addresses the relative performance of these wavelet methods and evaluates the usefulness of wavelet processing in speech signal representations. In addition, this paper will also address some of the hardware considerations for the wavelet methods presented.
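
    For a concrete flavour of such wavelet representations, a minimal decomposition/reconstruction of a speech frame with PyWavelets (a generic Daubechies example, not one of the adaptive schemes evaluated in the paper) might look like:

        import numpy as np
        import pywt

        def wavelet_roundtrip(frame, wavelet="db4", level=4):
            """Decompose a speech frame into wavelet coefficients, crudely
            compress by zeroing the smallest 80% of detail coefficients,
            and reconstruct an approximation of the frame."""
            coeffs = pywt.wavedec(frame, wavelet, level=level)
            detail = np.concatenate(coeffs[1:])
            thr = np.quantile(np.abs(detail), 0.8)
            coeffs = [coeffs[0]] + [pywt.threshold(c, thr, mode="hard")
                                    for c in coeffs[1:]]
            return pywt.waverec(coeffs, wavelet)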

  9. Speech Enhancement with Natural Sounding Residual Noise Based on Connected Time-Frequency Speech Presence Regions

    Directory of Open Access Journals (Sweden)

    Sørensen, Karsten Vandborg

    2005-01-01

    Full Text Available We propose time-frequency domain methods for noise estimation and speech enhancement. A speech presence detection method is used to find connected time-frequency regions of speech presence. These regions are used by a noise estimation method and both the speech presence decisions and the noise estimate are used in the speech enhancement method. Different attenuation rules are applied to regions with and without speech presence to achieve enhanced speech with natural sounding attenuated background noise. The proposed speech enhancement method has a computational complexity, which makes it feasible for application in hearing aids. An informal listening test shows that the proposed speech enhancement method has significantly higher mean opinion scores than minimum mean-square error log-spectral amplitude (MMSE-LSA and decision-directed MMSE-LSA.
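
    The region-dependent attenuation idea reads roughly like this (a simplified sketch with made-up gain rules; the paper's presence detector, noise estimator and attenuation rules are more elaborate):

        import numpy as np

        def enhance(noisy_spec, speech_present, noise_psd, floor_db=-15.0):
            """Attenuate a noisy STFT differently inside and outside connected
            speech-presence regions (speech_present is a boolean mask)."""
            snr = np.abs(noisy_spec) ** 2 / (noise_psd + 1e-12)
            gentle = snr / (1.0 + snr)               # mild rule where speech is present
            floor = 10.0 ** (floor_db / 20.0)        # fixed attenuation elsewhere
            gain = np.where(speech_present, np.maximum(gentle, floor), floor)
            return gain * noisy_spec

    Keeping a non-zero gain floor everywhere is what leaves a natural-sounding attenuated background instead of the musical-noise artefacts of aggressive suppression.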

  10. Perception of Speech Sounds in School-Aged Children with Speech Sound Disorders.

    Science.gov (United States)

    Preston, Jonathan L; Irwin, Julia R; Turcios, Jacqueline

    2015-11-01

    Children with speech sound disorders may perceive speech differently than children with typical speech development. The nature of these speech differences is reviewed with an emphasis on assessing phoneme-specific perception for speech sounds that are produced in error. Category goodness judgment, or the ability to judge accurate and inaccurate tokens of speech sounds, plays an important role in phonological development. The software Speech Assessment and Interactive Learning System, which has been effectively used to assess preschoolers' ability to perform goodness judgments, is explored for school-aged children with residual speech errors (RSEs). However, data suggest that this particular task may not be sensitive to perceptual differences in school-aged children. The need for the development of clinical tools for assessment of speech perception in school-aged children with RSE is highlighted, and clinical suggestions are provided. PMID:26458198

  11. Speech Evaluation with Special Focus on Children Suffering from Apraxia of Speech

    Directory of Open Access Journals (Sweden)

    Manasi Dixit

    2013-07-01

    Full Text Available Speech disorders are very complicated in individuals suffering from Apraxia of Speech (AOS). In this paper, the pathological cases of speech-disabled children affected with AOS are analyzed. The speech signal samples of children of age between three to eight years are considered for the present study. These speech signals are digitized and enhanced using Speech Pause Index, Jitter, Skew, and Kurtosis analysis. This analysis is conducted on speech data samples which are concerned with both place of articulation and manner of articulation. The speech disability of pathological subjects was estimated using results of the above analysis.

  12. Speech and language delay in children.

    Science.gov (United States)

    McLaughlin, Maura R

    2011-05-15

    Speech and language delay in children is associated with increased difficulty with reading, writing, attention, and socialization. Although physicians should be alert to parental concerns and to whether children are meeting expected developmental milestones, there currently is insufficient evidence to recommend for or against routine use of formal screening instruments in primary care to detect speech and language delay. In children not meeting the expected milestones for speech and language, a comprehensive developmental evaluation is essential, because atypical language development can be a secondary characteristic of other physical and developmental problems that may first manifest as language problems. Types of primary speech and language delay include developmental speech and language delay, expressive language disorder, and receptive language disorder. Secondary speech and language delays are attributable to another condition such as hearing loss, intellectual disability, autism spectrum disorder, physical speech problems, or selective mutism. When speech and language delay is suspected, the primary care physician should discuss this concern with the parents and recommend referral to a speech-language pathologist and an audiologist. There is good evidence that speech-language therapy is helpful, particularly for children with expressive language disorder. PMID:21568252

  13. Sensorimotor influences on speech perception in infancy.

    Science.gov (United States)

    Bruderer, Alison G; Danielson, D Kyle; Kandhadai, Padmapriya; Werker, Janet F

    2015-11-01

    The influence of speech production on speech perception is well established in adults. However, because adults have a long history of both perceiving and producing speech, the extent to which the perception-production linkage is due to experience is unknown. We addressed this issue by asking whether articulatory configurations can influence infants' speech perception performance. To eliminate influences from specific linguistic experience, we studied preverbal, 6-mo-old infants and tested the discrimination of a nonnative, and hence never-before-experienced, speech sound distinction. In three experimental studies, we used teething toys to control the position and movement of the tongue tip while the infants listened to the speech sounds. Using ultrasound imaging technology, we verified that the teething toys consistently and effectively constrained the movement and positioning of infants' tongues. With a looking-time procedure, we found that temporarily restraining infants' articulators impeded their discrimination of a nonnative consonant contrast but only when the relevant articulator was selectively restrained to prevent the movements associated with producing those sounds. Our results provide striking evidence that even before infants speak their first words and without specific listening experience, sensorimotor information from the articulators influences speech perception. These results transform theories of speech perception by suggesting that even at the initial stages of development, oral-motor movements influence speech sound discrimination. Moreover, an experimentally induced "impairment" in articulator movement can compromise speech perception performance, raising the question of whether long-term oral-motor impairments may impact perceptual development. PMID:26460030

  14. Extensions to the Speech Disorders Classification System (SDCS)

    Science.gov (United States)

    Shriberg, Lawrence D.; Fourakis, Marios; Hall, Sheryl D.; Karlsson, Heather B.; Lohmeier, Heather L.; McSweeny, Jane L.; Potter, Nancy L.; Scheer-Cohen, Alison R.; Strand, Edythe A.; Tilkens, Christie M.; Wilson, David L.

    2010-01-01

    This report describes three extensions to a classification system for paediatric speech sound disorders termed the Speech Disorders Classification System (SDCS). Part I describes a classification extension to the SDCS to differentiate motor speech disorders from speech delay and to differentiate among three sub-types of motor speech disorders.…

  15. Separating Underdetermined Convolutive Speech Mixtures

    DEFF Research Database (Denmark)

    Pedersen, Michael Syskind; Wang, DeLiang; Larsen, Jan;

    2006-01-01

    A limitation in many source separation tasks is that the number of source signals has to be known in advance. Further, in order to achieve good performance, the number of sources cannot exceed the number of sensors. In many real-world applications these limitations are too restrictive. We propose a method for underdetermined blind source separation of convolutive mixtures. The proposed framework is applicable for separation of instantaneous as well as convolutive speech mixtures. It is possible to iteratively extract each speech signal from the mixture by combining blind source separation techniques with binary time-frequency masking. In the proposed method, the number of source signals is not assumed to be known in advance and the number of sources is not limited to the number of microphones. Our approach needs only two microphones and the separated sounds are maintained as stereo signals.

  16. Headphone localization of speech stimuli

    Science.gov (United States)

    Begault, Durand R.; Wenzel, Elizabeth M.

    1991-01-01

    Recently, three dimensional acoustic display systems have been developed that synthesize virtual sound sources over headphones based on filtering by Head-Related Transfer Functions (HRTFs), the direction-dependent spectral changes caused primarily by the outer ears. Here, 11 inexperienced subjects judged the apparent spatial location of headphone-presented speech stimuli filtered with non-individualized HRTFs. About half of the subjects 'pulled' their judgements toward either the median or the lateral-vertical planes, and estimates were almost always elevated. Individual differences were pronounced for the distance judgements; 15 to 46 percent of stimuli were heard inside the head with the shortest estimates near the median plane. The results infer that most listeners can obtain useful azimuth information from speech stimuli filtered by non-individualized HRTFs. Measurements of localization error and reversal rates are comparable with a previous study that used broadband noise stimuli.
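
    Rendering a virtual source of this kind amounts to convolving the speech with a left and a right head-related impulse response for the desired direction; a minimal sketch (hrir_left and hrir_right are assumed to come from some measured HRTF set, equal in length):

        import numpy as np
        from scipy.signal import fftconvolve

        def spatialize(speech, hrir_left, hrir_right):
            """Filter mono speech with a direction's head-related impulse
            responses to produce a binaural (headphone) signal."""
            left = fftconvolve(speech, hrir_left)
            right = fftconvolve(speech, hrir_right)
            return np.stack([left, right], axis=-1)  # columns = left/right ears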

  17. THE BASIS FOR SPEECH PREVENTION

    Directory of Open Access Journals (Sweden)

    Jordan JORDANOVSKI

    1997-06-01

    Full Text Available Speech is a tool for the accurate communication of ideas. When we talk about speech prevention as a practical realization of the language, we refer to the fact that it should comprise the elements of a criterion viewed from the perspective of the standards. This criterion, in the broad sense of the word, presupposes an exact realization of the thought expressed between the speaker and the recipient. The absence of this criterion is evident in the practical realization of the language and brings forth consequences, often hidden very deeply in the human psyche. Their outer manifestation already represents a delayed reaction of the social environment. The foundation for overcoming and standardizing this phenomenon must be the anatomic-physiological patterns of the body, accomplished through methods in concordance with the nature of the body.

  18. Language processing for speech understanding

    Science.gov (United States)

    Woods, W. A.

    1983-07-01

    This report considers language understanding techniques and control strategies that can be applied to provide higher-level support to aid in the understanding of spoken utterances. The discussion is illustrated with concepts and examples from the BBN speech understanding system, HWIM (Hear What I Mean). The HWIM system was conceived as an assistant to a travel budget manager, a system that would store information about planned and taken trips, travel budgets and their planning. The system was able to respond to commands and answer questions spoken into a microphone, and was able to synthesize spoken responses as output. HWIM was a prototype system used to drive speech understanding research. It used a phonetic-based approach, with no speaker training, a large vocabulary, and a relatively unconstraining English grammar. Discussed here is the control structure of the HWIM and the parsing algorithm used to parse sentences from the middle-out, using an ATN grammar.

  19. On speech recognition during anaesthesia

    OpenAIRE

    Alapetite, Alexandre

    2007-01-01

    This PhD thesis in human-computer interfaces (HCI, informatics) studies the case of the anaesthesia record used during medical operations and the possibility to supplement it with speech recognition facilities. Problems and limitations have been identified with the traditional paper-based anaesthesia record, but also with newer electronic versions, in particular ergonomic issues and the fact that anaesthesiologists tend to postpone the registration of the medications and other events during b...

  20. Human perception in speech processing

    OpenAIRE

    Grancharov, Volodya

    2006-01-01

    The emergence of heterogeneous networks and the rapid increase of Voice over IP (VoIP) applications provide important opportunities for the telecommunications market. These opportunities come at the price of increased complexity in the monitoring of the quality of service (QoS) and the need for adaptation of transmission systems to the changing environmental conditions. This thesis contains three papers concerned with quality assessment and enhancement of speech communication systems in adver...

  1. The Value of Commercial Speech

    OpenAIRE

    Munro, Colin

    2003-01-01

    Recent decisions in the courts have encouraged discussion of the extent to which the common law does or should place a high or higher value on political expression. Some scholars argue for a more explicit recognition of the high value of political speech, and would seek, for example, to 'constitutionalise' defamation laws. Others have adopted a more sceptical attitude to the desirability of importing American approaches to freedom of expression generally or to the privileging of political spe...

  2. From Speech Acts to Semantics

    Directory of Open Access Journals (Sweden)

    Mackenzie Jim

    2014-03-01

    Full Text Available Frege introduced the notion of pragmatic force as what distinguishes statements from questions. This distinction was elaborated by Wittgenstein in his later works, and systematised as an account of different kinds of speech acts in formal dialogue theory by Hamblin. It lies at the heart of the inferential semantics more recently developed by Brandom. The present paper attempts to sketch some of the relations between these developments.

  3. The limitations of speech control: perceptions of provision of speech-driven environmental controls

    OpenAIRE

    Judge, S.; Robertson, Z.; Hawley, M.

    2011-01-01

    This study set out to collect data from assistive technology professionals about their provision of speech-driven environmental control systems. This study is part of a larger study looking at developing a new speech-driven environmental control system.

  4. Speech Intelligibility Evaluation for Mobile Phones

    DEFF Research Database (Denmark)

    Jørgensen, Søren; Cubick, Jens; Dau, Torsten

    2015-01-01

    In the development process of modern telecommunication systems, such as mobile phones, it is common practice to use computer models to objectively evaluate the transmission quality of the system, instead of time-consuming perceptual listening tests. Such models have typically focused on the quality of the transmitted speech, while little or no attention has been provided to speech intelligibility. The present study investigated to what extent three state-of-the-art speech intelligibility models could predict the intelligibility of noisy speech transmitted through mobile phones. Sentences from the Danish Dantale II speech material were mixed with three different kinds of background noise, transmitted through three different mobile phones, and recorded at the receiver via a local network simulator. The speech intelligibility of the transmitted sentences was assessed by six normal…

  5. Recent advances in nonlinear speech processing

    CERN Document Server

    Faundez-Zanuy, Marcos; Esposito, Antonietta; Cordasco, Gennaro; Drugman, Thomas; Solé-Casals, Jordi; Morabito, Francesco

    2016-01-01

    This book presents recent advances in nonlinear speech processing beyond nonlinear techniques. It shows how such work exploits heuristic and psychological models of human interaction in order to succeed in the implementation of socially believable VUIs and applications for human health and psychological support. The book takes into account the multifunctional role of speech and what is “outside of the box” (see Björn Schuller’s foreword). To this aim, the book is organized in 6 sections, each collecting a small number of short chapters reporting advances “inside” and “outside” themes related to nonlinear speech research. The themes emphasize theoretical and practical issues for modelling socially believable speech interfaces, ranging from efforts to capture the nature of sound changes in linguistic contexts and the timing nature of speech; labors to identify and detect speech features that help in the diagnosis of psychological and neuronal disease, attempts to improve the effectiveness and performa...

  6. Mobile speech and advanced natural language solutions

    CERN Document Server

    Markowitz, Judith

    2013-01-01

    Mobile Speech and Advanced Natural Language Solutions provides a comprehensive and forward-looking treatment of natural speech in the mobile environment. This fourteen-chapter anthology brings together lead scientists from Apple, Google, IBM, AT&T, Yahoo! Research and other companies, along with academicians, technology developers and market analysts.  They analyze the growing markets for mobile speech, new methodological approaches to the study of natural language, empirical research findings on natural language and mobility, and future trends in mobile speech.  Mobile Speech opens with a challenge to the industry to broaden the discussion about speech in mobile environments beyond the smartphone, to consider natural language applications across different domains.   Among the new natural language methods introduced in this book are Sequence Package Analysis, which locates and extracts valuable opinion-related data buried in online postings; microintonation as a way to make TTS truly human-like; and se...

  7. Music expertise shapes audiovisual temporal integration windows for speech, sinewave speech, and music

    OpenAIRE

    Lee, Hweeling; Noppeney, Uta

    2014-01-01

    This psychophysics study used musicians as a model to investigate whether musical expertise shapes the temporal integration window for audiovisual speech, sinewave speech, or music. Musicians and non-musicians judged the audiovisual synchrony of speech, sinewave analogs of speech, and music stimuli at 13 audiovisual stimulus onset asynchronies (±360, ±300 ±240, ±180, ±120, ±60, and 0 ms). Further, we manipulated the duration of the stimuli by presenting sentences/melodies or syllables/tones. ...

  8. Music and speech prosody: a common rhythm

    OpenAIRE

    Hausen, Maija; Torppa, Ritva; Salmela, Viljami R.; Vainio, Martti; Särkämö, Teppo

    2013-01-01

    Disorders of music and speech perception, known as amusia and aphasia, have traditionally been regarded as dissociated deficits based on studies of brain damaged patients. This has been taken as evidence that music and speech are perceived by largely separate and independent networks in the brain. However, recent studies of congenital amusia have broadened this view by showing that the deficit is associated with problems in perceiving speech prosody, especially intonation and emotional prosod...

  9. Automatic Speech Segmentation Based on HMM

    OpenAIRE

    M. Kroul

    2007-01-01

    This contribution deals with the problem of automatic phoneme segmentation using HMMs. Automation of the speech segmentation task is important for applications where a large amount of data needs to be processed, so manual segmentation is out of the question. In this paper we focus on automatic segmentation of recordings which will be used for triphone synthesis unit database creation. For speech synthesis, the speech unit quality is a crucial aspect, so the maximal accuracy in segmentation is ...

  10. The comprehension of gesture and speech

    OpenAIRE

    Willems, R.M.; Özyürek, A.; Hagoort, P.

    2005-01-01

    Although generally studied in isolation, action observation and speech comprehension go hand in hand during everyday human communication. That is, people gesture while they speak. From previous research it is known that a tight link exists between spoken language and such hand gestures. This study investigates for the first time the neural correlates of co-speech gestures and the neural locus of the integration of speech and gesture in a naturally occurring situation, i.e. as an integrated wh...

  11. Post-processing speech recordings during MRI

    OpenAIRE

    Kuortti, Juha; Malinen, Jarmo; Ojalammi, Antti

    2015-01-01

    We discuss post-processing of speech that has been recorded during Magnetic Resonance Imaging (MRI) of the vocal tract. Such speech recordings are contaminated by high levels of acoustic noise from the MRI scanner. Also, the frequency response of the sound signal path is not flat as a result of severe restrictions on recording instrumentation due to MRI technology. The post-processing algorithm for noise reduction is based on adaptive spectral filtering. The speech material consists of sample...

  12. A Bayesian framework for speech motor control

    OpenAIRE

    Patri, Jean-François; Diard, Julien; Perrier, Pascal; Schwartz, Jean-Luc

    2015-01-01

    The remarkable capacity of the speech motor system to adapt to various speech conditions is due to an excess of degrees of freedom, which enables producing similar acoustical properties with different sets of control strategies. To explain how the Central Nervous System selects one of the possible strategies, a common approach, in line with optimal motor control theories, is to model speech motor planning as the solution of an optimality problem based on cost functions. Despite the success of...

  13. Robust speech recognition using articulatory information

    OpenAIRE

    Kirchhoff, Katrin

    1999-01-01

    Current automatic speech recognition systems make use of a single source of information about their input, viz. a preprocessed form of the acoustic speech signal, which encodes the time-frequency distribution of signal energy. The goal of this thesis is to investigate the benefits of integrating articulatory information into state-of-the art speech recognizers, either as a genuine alternative to standard acoustic representations, or as an additional source of information. Articulatory informa...

  14. CAR2 - Czech Database of Car Speech

    Directory of Open Access Journals (Sweden)

    P. Sovka

    1999-12-01

    Full Text Available This paper presents a new Czech-language two-channel (stereo) speech database recorded in a car environment. The created database was designed for experiments with speech enhancement for communication purposes and for the study and design of robust speech recognition systems. Tools for automated phoneme labelling based on Baum-Welch re-estimation were realised. The noise analysis of the car background environment was done.

  15. Dynamic Automatic Noisy Speech Recognition System (DANSR)

    OpenAIRE

    Paul, Sheuli

    2014-01-01

    In this thesis we studied and investigated a very common but a long existing noise problem and we provided a solution to this problem. The task is to deal with different types of noise that occur simultaneously and which we call hybrid. Although there are individual solutions for specific types one cannot simply combine them because each solution affects the whole speech. We developed an automatic speech recognition system DANSR ( Dynamic Automatic Noisy Speech Recognition System) for hybri...

  16. Music and speech prosody: A common rhythm

    OpenAIRE

    Maija Hausen; Ritva Torppa; Salmela, Viljami R.; Martti Vainio; Teppo Särkämö

    2013-01-01

    Disorders of music and speech perception, known as amusia and aphasia, have traditionally been regarded as dissociated deficits based on studies of brain damaged patients. This has been taken as evidence that music and speech are perceived by largely separate and independent networks in the brain. However, recent studies of congenital amusia have broadened this view by showing that the deficit is associated with problems in perceiving speech prosody, especially intonation and emotional prosod...

  17. Time Shrinking Effects on Speech Tempo Perception

    OpenAIRE

    Wagner, Petra; Windmann, Andreas

    2011-01-01

    Time shrinking denotes the psycho-acoustic phenomenon that an acoustic event is perceived as shorter if it follows an even shorter acoustic event. Previous work has shown that time shrinking can be traced in speech-like phrases and may lead to the impression of a higher speech rate and syllable isochrony. This paper provides experimental evidence that time shrinking is effective on foot level as well as phrase level. Some examples from natural speech are given, where time shrinking effe...

  18. Speech perception as an active cognitive process

    OpenAIRE

    Howard Charles Nusbaum

    2014-01-01

    One view of speech perception is that acoustic signals are transformed into representations for pattern matching to determine linguistic structure. This process can be taken as a statistical pattern-matching problem, assuming relatively stable linguistic categories are characterized by neural representations related to auditory properties of speech that can be compared to speech input. This kind of pattern matching can be termed a passive process which implies rigidity of processing with few...

  19. Voice transformation in parametric speech synthesis

    Czech Academy of Sciences Publication Activity Database

    Vondra, Martin

    Prague: IREE AS CR, 2004 - (Vích, R.), pp. 35-37. ISBN 80-86269-10-8. [13th Czech-German Workshop on Speech Processing, Prague (CZ), 15.09.2003-17.09.2003] R&D Projects: GA ČR GA102/02/0124 Institutional research plan: CEZ:AV0Z2067918 Keywords: speech processing * speech synthesis Subject RIV: JA - Electronics; Optoelectronics, Electrical Engineering

  20. Speech Recognition in Natural Background Noise

    OpenAIRE

    Julien Meyer; Laure Dentel; Fanny Meunier

    2013-01-01

    In the real world, human speech recognition nearly always involves listening in background noise. The impact of such noise on speech signals and on intelligibility performance increases with the separation of the listener from the speaker. The present behavioral experiment provides an overview of the effects of such acoustic disturbances on speech perception in conditions approaching ecologically valid contexts. We analysed the intelligibility loss in spoken word lists with increasing listene...

  1. Freedom of Speech - The M Word

    OpenAIRE

    Reichhardt, L; Murphy, T; Andersen, Christoffer Molge; Olsen, K.

    2015-01-01

    The first objective of the project is to show how freedom of speech and democracy are dependent on one another in Denmark. The project’s next focal point is to look at how freedom of speech was framed in relation to the Mohammed publications in 2005. To do this, it identifies how freedom of speech was used by many Danish and European newspapers to justify the publications. Arguments against the publications by both the Danish media and the Muslim community (within Denmark ...

  2. Helping Domain Experts Build Speech Translation Systems

    OpenAIRE

    Rayner, Manny; Armando, Alejandro; Bouillon, Pierrette; Ebling, Sarah; Gerlach, Johanna; Halimi, Sonia; Strasly, Irene; Tsourakis, Nikos

    2015-01-01

    We present a new platform, "Regulus Lite", which supports rapid development and web deployment of several types of phrasal speech translation systems using a minimal formalism. A distinguishing feature is that most development work can be performed directly by domain experts. We motivate the need for platforms of this type and discuss three specific cases: medical speech translation, speech-to-sign-language translation and voice questionnaires. We briefly describe initial experiences in devel...

  3. A level stimulator programmed for audiometry

    International Nuclear Information System (INIS)

    This stimulator has been designed for automated audiometric experiments on lemurs. The variations of the transmission level are programmed on punched tape whose reading is controlled by an audio-frequency attenuator. The positive answers of the animal are stored in a seven-counter memory and the results are read on a display

  4. DELAYED SPEECH AND LANGUAGE DEVELOPMENT, PRENTICE-HALL FOUNDATIONS OF SPEECH PATHOLOGY SERIES.

    Science.gov (United States)

    WOOD, NANCY E.

    Written for speech pathology students and professional workers, the book begins by defining language and speech and tracing the development of speech and language from the infant through the 4-year-old. Causal factors of delayed development are given, including central nervous system impairment and associated behavioral clues and language…

  5. Exploring the Role of Brain Oscillations in Speech Perception in Noise: Intelligibility of Isochronously Retimed Speech

    Science.gov (United States)

    Aubanel, Vincent; Davis, Chris; Kim, Jeesun

    2016-01-01

    A growing body of evidence shows that brain oscillations track speech. This mechanism is thought to maximize processing efficiency by allocating resources to important speech information, effectively parsing speech into units of appropriate granularity for further decoding. However, some aspects of this mechanism remain unclear. First, while periodicity is an intrinsic property of this physiological mechanism, speech is only quasi-periodic, so it is not clear whether periodicity would present an advantage in processing. Second, it is still a matter of debate which aspect of speech triggers or maintains cortical entrainment, from bottom-up cues such as fluctuations of the amplitude envelope of speech to higher level linguistic cues such as syntactic structure. We present data from a behavioral experiment assessing the effect of isochronous retiming of speech on speech perception in noise. Two types of anchor points were defined for retiming speech, namely syllable onsets and amplitude envelope peaks. For each anchor point type, retiming was implemented at two hierarchical levels, a slow time scale around 2.5 Hz and a fast time scale around 4 Hz. Results show that while any temporal distortion resulted in reduced speech intelligibility, isochronous speech anchored to P-centers (approximated by stressed syllable vowel onsets) was significantly more intelligible than a matched anisochronous retiming, suggesting a facilitative role of periodicity defined on linguistically motivated units in processing speech in noise.
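
    A bare-bones version of such retiming (illustrative only; anchor-point detection and the matched anisochronous controls used in the experiment are omitted) stretches each interval between anchor points to a common duration:

        import numpy as np

        def isochronize(signal, anchors, period):
            """Retime signal so that consecutive anchor points (sample indices,
            e.g. syllable onsets) end up exactly `period` samples apart,
            by linear resampling of each segment. The tail after the last
            anchor is dropped in this sketch."""
            pieces = []
            for a, b in zip(anchors[:-1], anchors[1:]):
                seg = signal[a:b]
                new_t = np.linspace(0, len(seg) - 1, period)
                pieces.append(np.interp(new_t, np.arange(len(seg)), seg))
            return np.concatenate(pieces)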

  6. The Practical Philosophy of Communication Ethics and Free Speech as the Foundation for Speech Communication.

    Science.gov (United States)

    Arnett, Ronald C.

    1990-01-01

    Argues that communication ethics and free speech are the foundation for understanding the field of speech communication and its proper positioning in the larger array of academic disciplines. Argues that speech communication as a discipline can be traced back to a "practical philosophical" foundation detailed by Aristotle. (KEH)

  7. Text To Speech System for Telugu Language

    Directory of Open Access Journals (Sweden)

    M. Siva Kumar

    2014-03-01

    Full Text Available Telugu is one of the oldest languages in India. This paper describes the development of a Telugu Text-to-Speech System (TTS). In Telugu TTS, the input is Telugu text in Unicode. The voices are sampled from real recorded speech. The objective of a text to speech system is to convert an arbitrary text into its corresponding spoken waveform. Speech synthesis is a process of building machinery that can generate human-like speech from any text input to imitate human speakers. Text processing and speech generation are two main components of a text to speech system. To build a natural sounding speech synthesis system, it is essential that the text processing component produce an appropriate sequence of phonemic units. Generation of the sequence of phonetic units for a given standard word is referred to as a letter to phoneme rule or text to phoneme rule. The complexity of these rules and their derivation depends upon the nature of the language. The quality of a speech synthesizer is judged by its closeness to the natural human voice and understandability. In this paper we described an approach to build a Telugu TTS system using the concatenative synthesis method with the syllable as the basic unit of concatenation.
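
    Stripped to its core, the concatenative step described above is a lookup-and-join over recorded syllable units; a toy sketch (syllabify and the unit inventory are hypothetical placeholders for the paper's letter-to-phoneme rules and recorded database):

        import numpy as np

        def synthesize(text, inventory, syllabify, crossfade=64):
            """Concatenate recorded syllable waveforms with a short crossfade.

            inventory: dict mapping syllable label -> waveform (assumed);
                       every unit must be longer than `crossfade` samples.
            syllabify: function text -> list of syllable labels (assumed).
            """
            out = np.zeros(0)
            ramp = np.linspace(0.0, 1.0, crossfade)
            for label in syllabify(text):
                unit = inventory[label]
                if len(out) >= crossfade:
                    out[-crossfade:] = out[-crossfade:] * (1 - ramp) + unit[:crossfade] * ramp
                    out = np.concatenate([out, unit[crossfade:]])
                else:
                    out = np.concatenate([out, unit])
            return out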

  8. Neural Network Based Hausa Language Speech Recognition

    Directory of Open Access Journals (Sweden)

    Matthew K Luka

    2012-05-01

    Full Text Available Speech recognition is a key element of diverse applications in communication systems, medical transcription systems, security systems, etc. However, there has been very little research in the domain of speech processing for African languages, hence the need to extend the frontier of research in order to port in the diverse applications based on speech recognition. Hausa language is an important indigenous lingua franca in West and Central Africa, spoken as a first or second language by about fifty million people. Speech recognition of Hausa language is presented in this paper. A pattern recognition neural network was used for developing the system.

  9. American Speech-Language-Hearing Association

    Science.gov (United States)

    American Speech-Language-Hearing Association (ASHA): Making effective communication, a human right, accessible and achievable for all.

  10. Speech Enhancement based on Compressive Sensing Algorithm

    Science.gov (United States)

    Sulong, Amart; Gunawan, Teddy S.; Khalifa, Othman O.; Chebil, Jalel

    2013-12-01

    Various methods for speech enhancement have been proposed over the years. An accurate speech enhancement design mainly focuses on quality and intelligibility, and the method proposed here aims at a high performance level. A novel speech enhancement using compressive sensing (CS) is presented; CS is a new paradigm of acquiring signals, fundamentally different from uniform-rate digitization followed by compression, often used for transmission or storage. Using CS can reduce the number of degrees of freedom of a sparse/compressible signal by permitting only certain configurations of the large and zero/small coefficients, and structured sparsity models. Therefore, CS provides a way of reconstructing a compressed version of the speech in the original signal by taking only a small amount of linear and non-adaptive measurements. The performance of the overall algorithm is evaluated based on speech quality, optimised using an informal listening test and the Perceptual Evaluation of Speech Quality (PESQ). Experimental results show that the CS algorithm performs very well in a wide range of speech tests and gives good performance for speech enhancement, with better noise suppression ability than conventional approaches and without obvious degradation of speech quality.
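
    As a generic illustration of the reconstruction step in CS (iterative soft thresholding over a random measurement matrix; a standard sparse-recovery sketch, not the authors' specific algorithm):

        import numpy as np

        def ista_recover(y, Phi, lam=0.01, iters=200):
            """Recover a sparse coefficient vector x from measurements
            y = Phi @ x by minimising ||y - Phi x||^2 + lam * ||x||_1."""
            L = np.linalg.norm(Phi, 2) ** 2        # Lipschitz constant of the gradient
            x = np.zeros(Phi.shape[1])
            for _ in range(iters):
                grad = Phi.T @ (Phi @ x - y)
                z = x - grad / L
                x = np.sign(z) * np.maximum(np.abs(z) - lam / L, 0.0)  # soft threshold
            return x

    The speech itself is assumed sparse in some transform basis (e.g. DCT), so x holds transform coefficients and the time-domain frame is recovered by the inverse transform.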

  11. Speech coding research at Bell Laboratories

    Science.gov (United States)

    Atal, Bishnu S.

    2001-05-01

    The field of speech coding is now over 70 years old. It started from the desire to transmit voice signals over telegraph cables. The availability of digital computers in the mid 1960s made it possible to test complex speech coding algorithms rapidly. The introduction of linear predictive coding (LPC) started a new era in speech coding. The fundamental philosophy of speech coding went through a major shift, resulting in a new generation of low bit rate speech coders, such as multi-pulse and code-excited LPC. The semiconductor revolution produced faster and faster DSP chips and made linear predictive coding practical. Code-excited LPC has become the method of choice for low bit rate speech coding applications and is used in most voice transmission standards for cell phones. Digital speech communication is rapidly evolving from circuit-switched to packet-switched networks to provide integrated transmission of voice, data, and video signals. The new communication environment is also moving the focus of speech coding research from compression to low cost, reliable, and secure transmission of voice signals on digital networks, and provides the motivation for creating a new class of speech coders suitable for future applications.

  12. Speech Planning Happens before Speech Execution: Online Reaction Time Methods in the Study of Apraxia of Speech

    Science.gov (United States)

    Maas, Edwin; Mailend, Marja-Liisa

    2012-01-01

    Purpose: The purpose of this article is to present an argument for the use of online reaction time (RT) methods to the study of apraxia of speech (AOS) and to review the existing small literature in this area and the contributions it has made to our fundamental understanding of speech planning (deficits) in AOS. Method: Following a brief…

  13. Perceptual centres in speech - an acoustic analysis

    Science.gov (United States)

    Scott, Sophie Kerttu

    Perceptual centres, or P-centres, represent the perceptual moments of occurrence of acoustic signals - the 'beat' of a sound. P-centres underlie the perception and production of rhythm in perceptually regular speech sequences. P-centres have been modelled both in speech and non-speech (music) domains. The three aims of this thesis were a) to test current P-centre models to determine which best accounted for the experimental data, b) to identify a candidate parameter to map P-centres onto (a local approach), as opposed to the previous global models which rely upon the whole signal to determine the P-centre, and c) to develop a model of P-centre location which could be applied to speech and non-speech signals. The first aim was investigated by a series of experiments which examined a) whether different models could account for variation between speakers, b) whether rendering the amplitude-time plot of a speech signal affects the P-centre of the signal, and c) whether increasing the amplitude at the offset of a speech signal alters P-centres in the production and perception of speech. The second aim was carried out by a) manipulating the rise time of different speech signals to determine whether the P-centre was affected, and whether the type of speech sound ramped affected the P-centre shift, b) manipulating the rise time and decay time of a synthetic vowel to determine whether the onset alteration had more effect on the P-centre than the offset manipulation, and c) testing whether the duration of a vowel affected the P-centre, if other attributes (amplitude, spectral contents) were held constant. The third aim - modelling P-centres - was based on these results. The Frequency-dependent Amplitude Increase Model of P-centre location (FAIM) was developed using a modelling protocol, the APU GammaTone Filterbank and the speech from different speakers. The P-centres of the stimuli corpus were highly predicted by attributes of…

  14. Speech perception as an active cognitive process

    Directory of Open Access Journals (Sweden)

    Shannon eHeald

    2014-03-01

    Full Text Available One view of speech perception is that acoustic signals are transformed into representations for pattern matching to determine linguistic structure. This process can be taken as a statistical pattern-matching problem, assuming relatively stable linguistic categories are characterized by neural representations related to auditory properties of speech that can be compared to speech input. This kind of pattern matching can be termed a passive process which implies rigidity of processing with few demands on cognitive processing. An alternative view is that speech recognition, even in early stages, is an active process in which speech analysis is attentionally guided. Note that this does not mean consciously guided but that information-contingent changes in early auditory encoding can occur as a function of context and experience. Active processing assumes that attention, plasticity, and listening goals are important in considering how listeners cope with adverse circumstances that impair hearing by masking noise in the environment or hearing loss. Although theories of speech perception have begun to incorporate some active processing, they seldom treat early speech encoding as plastic and attentionally guided. Recent research has suggested that speech perception is the product of both feedforward and feedback interactions between a number of brain regions that include descending projections perhaps as far downstream as the cochlea. It is important to understand how the ambiguity of the speech signal and constraints of context dynamically determine cognitive resources recruited during perception including focused attention, learning, and working memory. Theories of speech perception need to go beyond the current corticocentric approach in order to account for the intrinsic dynamics of the auditory encoding of speech. In doing so, this may provide new insights into ways in which hearing disorders and loss may be treated either through augmentation or…

  15. Prediction and constraint in audiovisual speech perception.

    Science.gov (United States)

    Peelle, Jonathan E; Sommers, Mitchell S

    2015-07-01

    During face-to-face conversational speech listeners must efficiently process a rapid and complex stream of multisensory information. Visual speech can serve as a critical complement to auditory information because it provides cues to both the timing of the incoming acoustic signal (the amplitude envelope, influencing attention and perceptual sensitivity) and its content (place and manner of articulation, constraining lexical selection). Here we review behavioral and neurophysiological evidence regarding listeners' use of visual speech information. Multisensory integration of audiovisual speech cues improves recognition accuracy, particularly for speech in noise. Even when speech is intelligible based solely on auditory information, adding visual information may reduce the cognitive demands placed on listeners through increasing the precision of prediction. Electrophysiological studies demonstrate that oscillatory cortical entrainment to speech in auditory cortex is enhanced when visual speech is present, increasing sensitivity to important acoustic cues. Neuroimaging studies also suggest increased activity in auditory cortex when congruent visual information is available, but additionally emphasize the involvement of heteromodal regions of posterior superior temporal sulcus as playing a role in integrative processing. We interpret these findings in a framework of temporally-focused lexical competition in which visual speech information affects auditory processing to increase sensitivity to acoustic information through an early integration mechanism, and a late integration stage that incorporates specific information about a speaker's articulators to constrain the number of possible candidates in a spoken utterance. Ultimately it is words compatible with both auditory and visual information that most strongly determine successful speech perception during everyday listening. Thus, audiovisual speech perception is accomplished through multiple stages of integration

  16. Reflection and Optimization of Primary English Teachers’Speech Acts Based on Speech Act Theory

    Institute of Scientific and Technical Information of China (English)

    HU Qi-hai

    2015-01-01

    The primary English teacher's speech acts have major impact on foreign language teaching and learning in primary school. Application of teacher,s speech acts in the classroom is actually a kind of selective process. From the perspective of Speech Act Theory, primary English teachers can optimize their speech acts with the strategies of activating the greetings with proper con⁃text information, standardizing teacher talk, choosing suitable questions,providing appropriate feedback for pupils ’classroom per⁃formances in order to improve the effectiveness of primary teachers,classroom speech acts.

  17. Preschool Speech Intelligibility and Vocabulary Skills Predict Long-Term Speech and Language Outcomes Following Cochlear Implantation in Early Childhood

    OpenAIRE

    Castellanos, Irina; Kronenberger, William G.; Beer, Jessica; Henning, Shirley C.; Colson, Bethany G.; Pisoni, David B.

    2013-01-01

    Speech and language measures during grade school predict adolescent speech-language outcomes in children who receive cochlear implants, but no research has examined whether speech and language functioning at even younger ages is predictive of long-term outcomes in this population. The purpose of this study was to examine if early preschool measures of speech and language performance predict speech-language functioning in long-term users of cochlear implants. Early measures of speech intelligi...

  18. LIBERDADE DE EXPRESSÃO E DISCURSO DO ÓDIO NO BRASIL / FREE SPEECH AND HATE SPEECH IN BRAZIL

    OpenAIRE

    Nevita Maria Pessoa de Aquino Franca Luna; Gustavo Ferreira Santos

    2014-01-01

    The purpose of this article is to analyze the restriction of free speech when it comes close to hate speech. In this perspective, the aim of this study is to answer the question: what is the understanding adopted by the Brazilian Supreme Court in cases involving the conflict between free speech and hate speech? The methodology combines a bibliographic review on the theoretical assumptions of the research (concept of free speech and hate speech, and understanding of the rights of defense of tr...

  19. Clinical and audiological features of a syndrome with deterioration in speech recognition out of proportion to pure hearing loss

    Directory of Open Access Journals (Sweden)

    Abdi S

    2007-04-01

    Full Text Available Background: The objective of this study was to describe the audiologic and related characteristics of a group patient with speech perception affected out of proportion to pure tone hearing loss. A case series of patient were referred for evaluation and management to the Hearing Research Center.To describe the clinical picture of the patients with the key clinical feature of hearing loss for pure tones and reduction in speech discrimination out of proportion to the pure tone loss, having some of the criteria of auditory neuropathy (i.e. normal otoacoustic emissions, OAE, and abnormal auditory brainstem evoked potentials, ABR and lacking others (e.g. present auditory reflexes. Methods: Hearing abilities were measured by Pure Tone Audiometry (PTA and Speech Discrimination Scores (SDS, measured in all patients using a standardized list of 25 monosyllabic Farsi words at MCL in quiet. Auditory pathway integrity was measured by using Auditory Brainstem Response (ABR and Otoacoustic Emission (OAE and anatomical lesions Computed Tomography Scan (CT and Magnetic Resonance Image (MRI of brain and retrocochlea. Patient included in the series were 35 patients who have SDS disproportionably low with regard to PTA, absent ABR waves and normal OAE. Results: All patients reported the beginning of their problem around adolescence. Neither of them had anatomical lesion in imaging studies and neither of them had any finding suggestive of conductive hearing lesion. Although in most of the cases the hearing loss had been more apparent in the lower frequencies (i.e. 1000 Hz and less, a stronger correlation was found between SDS and hearing threshold at higher frequencies. These patients may not benefit from hearing aids, as the outer hair cells are functional and amplification doesn’t seem to help; though, it was tried for all. Conclusion: These patients share a pattern of sensory –neural loss with no detectable lesion. The age of onset and the gradual

  20. Speech Perception and Working Memory in Children with Residual Speech Errors: A Case Study Analysis.

    Science.gov (United States)

    Cabbage, Kathryn L; Farquharson, Kelly; Hogan, Tiffany P

    2015-11-01

    Some children with residual deficits in speech production also display characteristics of dyslexia; however, the causes of these disorders--in isolation or comorbidly--remain unknown. Presently, the role of phonological representations is an important construct for considering how the underlying system of phonology functions. In particular, two related skills--speech perception and phonological working memory--may provide insight into the nature of phonological representations. This study provides an exploratory investigation into the profiles of three 9-year-old children: one with residual speech errors, one with residual speech errors and dyslexia, and one who demonstrated typical, age-appropriate speech sound production and reading skills. We provide an in-depth examination of their relative abilities in the areas of speech perception, phonological working memory, vocabulary, and word reading. Based on these preliminary explorations, we suggest implications for the assessment and treatment of children with residual speech errors and/or dyslexia. PMID:26458199

  1. Fast Monaural Separation of Speech

    DEFF Research Database (Denmark)

    Pontoppidan, Niels Henrik; Dyrholm, Mads

    2003-01-01

    We have investigated the possibility of separating signals from a single mixture of sources. This problem is termed the Monaural Separation Problem. Lars Kai Hansen has argued that this problem is topological tougher than problems with multiple recordings. Roweis has shown that inference from a...... Factorial Hidden Markov Model, with non-stationary assumptions on the source autocorrelations modelled through the Factorial Hidden Markov Model, leads to separation in the monaural case. By extending Hansens work we find that Roweis' assumptions are necessary for monaural speech separation. Furthermore we...

  2. Drone Videos: Surveillance or Speech?

    OpenAIRE

    Kaminski, Margot

    2015-01-01

    Drone policy is fast evolving, and the U.S. legal system is ill-equipped to address privacy concerns. A number of U.S. states have enacted drone-specific privacy laws. Meanwhile, the NTIA is engaging in a multistakeholder process, and federal legislation has been proposed. Are privacy laws necessary—or do they impinge on the free speech rights of drone videographers? This policy question is not unique to drones; it will be raised by a host of coming technologies.

  3. Microintonation Analysis of Emotional Speech

    Czech Academy of Sciences Publication Activity Database

    Přibil, Jiří; Přibilová, Anna

    Berlín : Springer-Verlag, 2010 - (Esposito, A.; Campbell, N.; Vogel, C.; Hussain, A.; Nijholt, A.), s. 268-279 ISBN 978-3-642-12396-2. ISSN 0302-9743. - (Lecture Notes in Computer Science. 5967). [COST 2102 International Training School on Development of Multimodal Interfaces. Dublin (IE), 23.03.2009-27.03.2009] R&D Projects: GA ČR GA102/09/0989 Institutional research plan: CEZ:AV0Z20670512 Keywords : Signal processing * Speech synthesis Subject RIV: JA - Electronics ; Optoelectronics, Electrical Engineering

  4. Lombard speech database for German language

    OpenAIRE

    Soloducha, Michal; Raake, Alexander; Kettler, Frank; Voigt, Peter

    2016-01-01

    This is a publication of Lombard speech database for German language. Additionally, a GitHub project has been created where information about database updates and related data will be stored: https://github.com/Telecommunication-Telemedia-Assessment/Lombard-Speech-database.git

  5. Speech and Language Delays in Identical Twins.

    Science.gov (United States)

    Bentley, Pat

    Following a literature review on speech and language development of twins, case studies are presented of six sets of identical twins screened for entrance into kindergarten. Five sets of the twins and one boy from the sixth set failed to pass the screening test, particularly the speech and language section, and were referred for therapy to correct…

  6. Fighting Words. The Politics of Hateful Speech.

    Science.gov (United States)

    Marcus, Laurence R.

    This book explores issues typified by a series of hateful speech events at Kean College (New Jersey) and on other U.S. campuses in the early 1990s, by examining the dichotomies that exist between the First and the Fourteenth Amendments and between civil liberties and civil rights, and by contrasting the values of free speech and academic freedom…

  7. Only Speech Codes Should Be Censored

    Science.gov (United States)

    Pavela, Gary

    2006-01-01

    In this article, the author discusses the enforcement of "hate speech" codes and confirms research that considers why U.S. colleges and universities continue to promulgate student disciplinary rules prohibiting expression that "subordinates" others or is "demeaning, offensive, or hateful." Such continued adherence to speech codes is by now…

  8. Hate Speech: A Call to Principles.

    Science.gov (United States)

    Klepper, William M.; Bakken, Timothy

    1997-01-01

    Reviews the history of First Amendment rulings as they relate to speech codes and of other regulations directed at the content of speech. A case study, based on an experience at Trenton State College, details the legal constraints, principles, and practices that Student Affairs administrators should be aware of regarding such situations.…

  9. A Representational Account for Apraxia of Speech

    OpenAIRE

    Mayer, Jörg

    1995-01-01

    The present study proposes a new interpretation of the underlying distortion in apraxia of speech. Based on the experimental investigation of coarticulation it is argued that apraxia of speech has to be seen as a defective implementation of phonological representations at the phonology-phonetics interface. The characteristic production deficits of apraxic patients are explained in terms of overspecification of phonetic representations.

  10. Repeated Speech Errors: Evidence for Learning

    Science.gov (United States)

    Humphreys, Karin R.; Menzies, Heather; Lake, Johanna K.

    2010-01-01

    Three experiments elicited phonological speech errors using the SLIP procedure to investigate whether there is a tendency for speech errors on specific words to reoccur, and whether this effect can be attributed to implicit learning of an incorrect mapping from lemma to phonology for that word. In Experiment 1, when speakers made a phonological…

  11. On Multiple Metonymies Within Indirect Speech Acts

    OpenAIRE

    Kosecki Krzysztof

    2007-01-01

    Indirect speech acts are frequently structured by more than a single metonymy. The metonymies are related not only to the illocutionary force of the utterances, but also function within the individual lexemes being their parts. An indirect speech act can thus involve not only multiple, but also multi-levelled operation of conceptual metonymy.

  12. The Effects of TV on Speech Education

    Science.gov (United States)

    Gocen, Gokcen; Okur, Alpaslan

    2013-01-01

    Generally, the speaking aspect is not properly debated when discussing the positive and negative effects of television (TV), especially on children. So, to highlight this point, this study was first initialized by asking the question: "What are the effects of TV on speech?" and secondly, to transform the effects that TV has on speech in a…

  13. Speech Intelligibility in Severe Adductor Spasmodic Dysphonia

    Science.gov (United States)

    Bender, Brenda K.; Cannito, Michael P.; Murry, Thomas; Woodson, Gayle E.

    2004-01-01

    This study compared speech intelligibility in nondisabled speakers and speakers with adductor spasmodic dysphonia (ADSD) before and after botulinum toxin (Botox) injection. Standard speech samples were obtained from 10 speakers diagnosed with severe ADSD prior to and 1 month following Botox injection, as well as from 10 age- and gender-matched…

  14. Speech-Language Pathology: Preparing Early Interventionists

    Science.gov (United States)

    Prelock, Patricia A.; Deppe, Janet

    2015-01-01

    The purpose of this article is to explain the role of speech-language pathology in early intervention. The expected credentials of professionals in the field are described, and the current numbers of practitioners serving young children are identified. Several resource documents available from the American Speech-­Language Hearing Association are…

  15. Anatomy and Physiology of the Speech Mechanism.

    Science.gov (United States)

    Sheets, Boyd V.

    This monograph on the anatomical and physiological aspects of the speech mechanism stresses the importance of a general understanding of the process of verbal communication. Contents include "Positions of the Body,""Basic Concepts Linked with the Speech Mechanism,""The Nervous System,""The Respiratory System--Sound-Power Source,""The…

  16. Quick Statistics about Voice, Speech, and Language

    Science.gov (United States)

    ... Statistics and Epidemiology Quick Statistics About Voice, Speech, Language Voice, Speech, Language, and Swallowing Nearly 1 in 12 (7.7 ... condition known as persistent developmental stuttering. 8 , 9 Language 3.3 percent of U.S. children ages 3- ...

  17. CLEFT PALATE. FOUNDATIONS OF SPEECH PATHOLOGY SERIES.

    Science.gov (United States)

    RUTHERFORD, DAVID; WESTLAKE, HAROLD

    DESIGNED TO PROVIDE AN ESSENTIAL CORE OF INFORMATION, THIS BOOK TREATS NORMAL AND ABNORMAL DEVELOPMENT, STRUCTURE, AND FUNCTION OF THE LIPS AND PALATE AND THEIR RELATIONSHIPS TO CLEFT LIP AND CLEFT PALATE SPEECH. PROBLEMS OF PERSONAL AND SOCIAL ADJUSTMENT, HEARING, AND SPEECH IN CLEFT LIP OR CLEFT PALATE INDIVIDUALS ARE DISCUSSED. NASAL RESONANCE…

  18. Evolution of speech and its acquisition

    NARCIS (Netherlands)

    de Boer, B.

    2005-01-01

    Much is known about the evolution of speech. Fossil evidence points to modern adaptations for speech appearing between 1.5 million and 500,000 years ago. Studies of vocal behavior in apes show the ability to use combinatorial vocalizations in some species (but not chimpanzees) and some cultural infl

  19. The Neural Substrates of Infant Speech Perception

    Science.gov (United States)

    Homae, Fumitaka; Watanabe, Hama; Taga, Gentaro

    2014-01-01

    Infants often pay special attention to speech sounds, and they appear to detect key features of these sounds. To investigate the neural foundation of speech perception in infants, we measured cortical activation using near-infrared spectroscopy. We presented the following three types of auditory stimuli while 3-month-old infants watched a silent…

  20. Speech and Language Problems in Children

    Science.gov (United States)

    ... be due to a speech or language disorder. Language disorders can mean that the child has trouble understanding what others say or difficulty sharing her thoughts. Children who have trouble producing speech sounds correctly or who hesitate or stutter when talking ...

  1. Speech versus singing: Infants choose happier sounds

    Directory of Open Access Journals (Sweden)

    Marieve eCorbeil

    2013-06-01

    Full Text Available Infants prefer speech to non-vocal sounds and to non-human vocalizations, and they prefer happy-sounding speech to neutral speech. They also exhibit an interest in singing, but there is little knowledge of their relative interest in speech and singing. The present study explored infants’ attention to unfamiliar audio samples of speech and singing. In Experiment 1, infants 4-13 months of age were exposed to happy-sounding infant-directed speech versus hummed lullabies by the same woman. They listened significantly longer to the speech, which had considerably greater acoustic variability and expressiveness, than to the lullabies. In Experiment 2, infants of comparable age who heard the lyrics of a Turkish children’s song spoken versus sung in a joyful/happy manner did not exhibit differential listening. Infants in Experiment 3 heard the happily sung lyrics of the Turkish children’s song versus a version that was spoken in an adult-directed or affectively neutral manner. They listened significantly longer to the sung version. Overall, happy voice quality rather than vocal mode (speech or singing was the principal contributor to infant attention, regardless of age.

  2. Speech vs. singing: infants choose happier sounds.

    Science.gov (United States)

    Corbeil, Marieve; Trehub, Sandra E; Peretz, Isabelle

    2013-01-01

    Infants prefer speech to non-vocal sounds and to non-human vocalizations, and they prefer happy-sounding speech to neutral speech. They also exhibit an interest in singing, but there is little knowledge of their relative interest in speech and singing. The present study explored infants' attention to unfamiliar audio samples of speech and singing. In Experiment 1, infants 4-13 months of age were exposed to happy-sounding infant-directed speech vs. hummed lullabies by the same woman. They listened significantly longer to the speech, which had considerably greater acoustic variability and expressiveness, than to the lullabies. In Experiment 2, infants of comparable age who heard the lyrics of a Turkish children's song spoken vs. sung in a joyful/happy manner did not exhibit differential listening. Infants in Experiment 3 heard the happily sung lyrics of the Turkish children's song vs. a version that was spoken in an adult-directed or affectively neutral manner. They listened significantly longer to the sung version. Overall, happy voice quality rather than vocal mode (speech or singing) was the principal contributor to infant attention, regardless of age. PMID:23805119

  3. Acoustic characteristics of Danish infant directed speech

    DEFF Research Database (Denmark)

    Bohn, Ocke-Schwen

    2013-01-01

    speaking to their 18 month old children (infant directed speech - IDS) as opposed to an adult (adult directed speech - ADS). Caregivers were recorded talking about toy animals in conversations with their child and with an adult interlocutor. The toy names were designed to elicit Danish contrasts differing...

  4. Contrast in concept-to-speech generation

    NARCIS (Netherlands)

    Theune, Mariët; Walker, M.; Rambow, O.

    2002-01-01

    In concept-to-speech systems, spoken output is generated on the basis of a text that has been produced by the system itself. In such systems, linguistic information from the text generation component may be exploited to achieve a higher prosodic quality of the speech output than can be obtained in a

  5. Mothers' Speech in Three Social Classes

    Science.gov (United States)

    Snow, C. E.; And Others

    1976-01-01

    Functional and linguistic aspects of the speech of Dutch-speaking mothers from three social classes to their two-year-old children were studied to test the hypothesis that simplified speech is crucial to language acquisition. Available from Plenum Publishing Corp., 227 W. 17th St., New York, NY 10011. (Author/RM)

  6. Speech neglect: A strange educational blind spot

    Science.gov (United States)

    Harris, Katherine Safford

    2005-09-01

    Speaking is universally acknowledged as an important human talent, yet as a topic of educated common knowledge, it is peculiarly neglected. Partly, this is a consequence of the relatively recent growth of research on speech perception, production, and development, but also a function of the way that information is sliced up by undergraduate colleges. Although the basic acoustic mechanism of vowel production was known to Helmholtz, the ability to view speech production as a physiological event is evolving even now with such techniques as fMRI. Intensive research on speech perception emerged only in the early 1930s as Fletcher and the engineers at Bell Telephone Laboratories developed the transmission of speech over telephone lines. The study of speech development was revolutionized by the papers of Eimas and his colleagues on speech perception in infants in the 1970s. Dissemination of knowledge in these fields is the responsibility of no single academic discipline. It forms a center for two departments, Linguistics, and Speech and Hearing, but in the former, there is a heavy emphasis on other aspects of language than speech and, in the latter, a focus on clinical practice. For psychologists, it is a rather minor component of a very diverse assembly of topics. I will focus on these three fields in proposing possible remedies.

  7. Hypnosis and the Reduction of Speech Anxiety.

    Science.gov (United States)

    Barker, Larry L.; And Others

    The purposes of this paper are (1) to review the background and nature of hypnosis, (2) to synthesize research on hypnosis related to speech communication, and (3) to delineate and compare two potential techniques for reducing speech anxiety--hypnosis and systematic desensitization. Hypnosis has been defined as a mental state characterised by…

  8. Tampa Bay International Business Summit Keynote Speech

    Science.gov (United States)

    Clary, Christina

    2011-01-01

    A keynote speech outlining the importance of collaboration and diversity in the workplace. The 20-minute speech describes NASA's challenges and accomplishments over the years and what lies ahead. Topics include: diversity and inclusion principles, international cooperation, Kennedy Space Center planning and development, opportunities for cooperation, and NASA's vision for exploration.

  9. Development of a Danish speech intelligibility test

    DEFF Research Database (Denmark)

    Nielsen, Jens Bo; Dau, Torsten

    2009-01-01

    Abstract A Danish speech intelligibility test for assessing the speech recognition threshold in noise (SRTN) has been developed. The test consists of 180 sentences distributed in 18 phonetically balanced lists. The sentences are based on an open word-set and represent everyday language. The sente...

  10. Speech-Language-Pathology and Audiology Handbook.

    Science.gov (United States)

    New York State Education Dept., Albany. Office of the Professions.

    The handbook contains State Education Department rules and regulations that govern speech-language pathology and audiology in New York State. The handbook also describes licensure and first registration as a licensed speech-language pathologist or audiologist. The introduction discusses professional regulation in New York State while the second…

  11. Electrocardiographic anxiety profiles improve speech anxiety.

    Science.gov (United States)

    Kim, Pyoung Won; Kim, Seung Ae; Jung, Keun-Hwa

    2012-12-01

    The present study was to set out in efforts to determine the effect of electrocardiographic (ECG) feedback on the performance in speech anxiety. Forty-six high school students participated in a speech performance educational program. They were randomly divided into two groups, an experimental group with ECG feedback (N = 21) and a control group (N = 25). Feedback was given with video recording in the control, whereas in the experimental group, an additional ECG feedback was provided. Speech performance was evaluated by the Korean Broadcasting System (KBS) speech ability test, which determines the 10 different speaking categories. ECG was recorded during rest and speech, together with a video recording of the speech performance. Changes in R-R intervals were used to reflect anxiety profiles. Three trials were performed for 3-week program. Results showed that the subjects with ECG feedback revealed a significant improvement in speech performance and anxiety states, which compared to those in the control group. These findings suggest that visualization of the anxiety profile feedback with ECG can be a better cognitive therapeutic strategy in speech anxiety. PMID:22714138

  12. Building Searchable Collections of Enterprise Speech Data.

    Science.gov (United States)

    Cooper, James W.; Viswanathan, Mahesh; Byron, Donna; Chan, Margaret

    The study has applied speech recognition and text-mining technologies to a set of recorded outbound marketing calls and analyzed the results. Since speaker-independent speech recognition technology results in a significantly lower recognition rate than that found when the recognizer is trained for a particular speaker, a number of post-processing…

  13. Speech Fluency in Fragile X Syndrome

    Science.gov (United States)

    Van Borsel, John; Dor, Orianne; Rondal, Jean

    2008-01-01

    The present study investigated the dysfluencies in the speech of nine French speaking individuals with fragile X syndrome. Type, number, and loci of dysfluencies were analysed. The study confirms that dysfluencies are a common feature of the speech of individuals with fragile X syndrome but also indicates that the dysfluency pattern displayed is…

  14. Milton's "Areopagitica" Freedom of Speech on Campus

    Science.gov (United States)

    Sullivan, Daniel F.

    2006-01-01

    The author discusses the content in John Milton's "Areopagitica: A Speech for the Liberty of Unlicensed Printing to the Parliament of England" (1985) and provides parallelism to censorship practiced in higher education. Originally published in 1644, "Areopagitica" makes a powerful--and precocious--argument for freedom of speech and against…

  15. Speech Teachers, Black Studies, and Racial Attitudes.

    Science.gov (United States)

    Butler, Jerry

    Using cognitive dissonance theory as a model in the experimental design, the author investigates the effects on student attitudes of Black ethnic culture materials included in speech classes. One hundred eighty students in all-white speech classes from four Illinois high schools were placed in three categories--prejudiced, moderate, and…

  16. Speech production in amplitude-modulated noise

    DEFF Research Database (Denmark)

    Macdonald, Ewen N; Raufer, Stefan

    2013-01-01

    The Lombard effect refers to the phenomenon where talkers automatically increase their level of speech in a noisy environment. While many studies have characterized how the Lombard effect influences different measures of speech production (e.g., F0, spectral tilt, etc.), few have investigated the...

  17. Subjective Quality Measurement of Speech Its Evaluation, Estimation and Applications

    CERN Document Server

    Kondo, Kazuhiro

    2012-01-01

    It is becoming crucial to accurately estimate and monitor speech quality in various ambient environments to guarantee high quality speech communication. This practical hands-on book shows speech intelligibility measurement methods so that the readers can start measuring or estimating speech intelligibility of their own system. The book also introduces subjective and objective speech quality measures, and describes in detail speech intelligibility measurement methods. It introduces a diagnostic rhyme test which uses rhyming word-pairs, and includes: An investigation into the effect of word familiarity on speech intelligibility. Speech intelligibility measurement of localized speech in virtual 3-D acoustic space using the rhyme test. Estimation of speech intelligibility using objective measures, including the ITU standard PESQ measures, and automatic speech recognizers.

  18. [Improving the speech with a prosthetic construction].

    Science.gov (United States)

    Stalpers, M J; Engelen, M; van der Stappen, J A A M; Weijs, W L J; Takes, R P; van Heumen, C C M

    2016-03-01

    A 12-year-old boy had problems with his speech due to a defect in the soft palate. This defect was caused by the surgical removal of a synovial sarcoma. Testing with a nasometer revealed hypernasality above normal values. Given the size and severity of the defect in the soft palate, the possibility of improving the speech with speech therapy was limited. At a centre for special dentistry an attempt was made with a prosthetic construction to improve the performance of the palate and, in that way, the speech. This construction consisted of a denture with an obturator attached to it. With it, an effective closure of the palate could be achieved. New measurements with acoustic nasometry showed scores within the normal values. The nasality in the speech largely disappeared. The obturator is an effective and relatively easy solution for palatal insufficiency resulting from surgical resection. Intrusive reconstructive surgery can be avoided in this way. PMID:26973984

  19. Two-Microphone Separation of Speech Mixtures

    DEFF Research Database (Denmark)

    Pedersen, Michael Syskind; Wang, DeLiang; Larsen, Jan;

    2008-01-01

    Separation of speech mixtures, often referred to as the cocktail party problem, has been studied for decades. In many source separation tasks, the separation method is limited by the assumption of at least as many sensors as sources. Further, many methods require that the number of signals within...... been combined, independent component analysis (ICA) and binary time–frequency (T–F) masking. By estimating binary masks from the outputs of an ICA algorithm, it is possible in an iterative way to extract basis speech signals from a convolutive mixture. The basis signals are afterwards improved by...... grouping similar signals. Using two microphones, we can separate, in principle, an arbitrary number of mixed speech signals. We show separation results for mixtures with as many as seven speech signals under instantaneous conditions. We also show that the proposed method is applicable to segregate speech...

  20. Strategies for distant speech recognitionin reverberant environments

    Science.gov (United States)

    Delcroix, Marc; Yoshioka, Takuya; Ogawa, Atsunori; Kubo, Yotaro; Fujimoto, Masakiyo; Ito, Nobutaka; Kinoshita, Keisuke; Espi, Miquel; Araki, Shoko; Hori, Takaaki; Nakatani, Tomohiro

    2015-12-01

    Reverberation and noise are known to severely affect the automatic speech recognition (ASR) performance of speech recorded by distant microphones. Therefore, we must deal with reverberation if we are to realize high-performance hands-free speech recognition. In this paper, we review a recognition system that we developed at our laboratory to deal with reverberant speech. The system consists of a speech enhancement (SE) front-end that employs long-term linear prediction-based dereverberation followed by noise reduction. We combine our SE front-end with an ASR back-end that uses neural networks for acoustic and language modeling. The proposed system achieved top scores on the ASR task of the REVERB challenge. This paper describes the different technologies used in our system and presents detailed experimental results that justify our implementation choices and may provide hints for designing distant ASR systems.

  1. The Functional Connectome of Speech Control.

    Science.gov (United States)

    Fuertinger, Stefan; Horwitz, Barry; Simonyan, Kristina

    2015-07-01

    In the past few years, several studies have been directed to understanding the complexity of functional interactions between different brain regions during various human behaviors. Among these, neuroimaging research installed the notion that speech and language require an orchestration of brain regions for comprehension, planning, and integration of a heard sound with a spoken word. However, these studies have been largely limited to mapping the neural correlates of separate speech elements and examining distinct cortical or subcortical circuits involved in different aspects of speech control. As a result, the complexity of the brain network machinery controlling speech and language remained largely unknown. Using graph theoretical analysis of functional MRI (fMRI) data in healthy subjects, we quantified the large-scale speech network topology by constructing functional brain networks of increasing hierarchy from the resting state to motor output of meaningless syllables to complex production of real-life speech as well as compared to non-speech-related sequential finger tapping and pure tone discrimination networks. We identified a segregated network of highly connected local neural communities (hubs) in the primary sensorimotor and parietal regions, which formed a commonly shared core hub network across the examined conditions, with the left area 4p playing an important role in speech network organization. These sensorimotor core hubs exhibited features of flexible hubs based on their participation in several functional domains across different networks and ability to adaptively switch long-range functional connectivity depending on task content, resulting in a distinct community structure of each examined network. Specifically, compared to other tasks, speech production was characterized by the formation of six distinct neural communities with specialized recruitment of the prefrontal cortex, insula, putamen, and thalamus, which collectively forged the formation

  2. Intonation contour in synchronous speech

    Science.gov (United States)

    Wang, Bei; Cummins, Fred

    2003-10-01

    Synchronous Speech (Syn-S), obtained by having pairs of speakers read a prepared text together, has been shown to result in interesting properties in the temporal domain, especially in the reduction of inter-speaker variability in supersegmental timing [F. Cummins, ARLO 3, 7-11 (2002)]. Here we investigate the effect of synchronization among speakers on the intonation contour, with a view to informing models of intonation. Six pairs of speakers (all females) read a short text (176 words) both synchronously and solo. Results show that (1) the pitch accent height above a declining baseline is reduced in Syn-S, compared with solo speech, while the pitch accent location is consistent across speakers in both conditions; (2) in contrast to previous findings on duration matching, there is an asymmetry between speakers, with one speaker exerting a stronger influence on the observed intonation contour than the other; (3) agreement on the boundaries of intonational phrases is greater in Syn-S and intonation contours are well matched from the first syllable of the phrase and throughout.

  3. Inconsistency of speech in children with childhood apraxia of speech, phonological disorders, and typical speech

    Science.gov (United States)

    Iuzzini, Jenya

    There is a lack of agreement on the features used to differentiate Childhood Apraxia of Speech (CAS) from Phonological Disorders (PD). One criterion which has gained consensus is lexical inconsistency of speech (ASHA, 2007); however, no accepted measure of this feature has been defined. Although lexical assessment provides information about consistency of an item across repeated trials, it may not capture the magnitude of inconsistency within an item. In contrast, segmental analysis provides more extensive information about consistency of phoneme usage across multiple contexts and word-positions. The current research compared segmental and lexical inconsistency metrics in preschool-aged children with PD, CAS, and typical development (TD) to determine how inconsistency varies with age in typical and disordered speakers, and whether CAS and PD were differentiated equally well by both assessment levels. Whereas lexical and segmental analyses may be influenced by listener characteristics or speaker intelligibility, the acoustic signal is less vulnerable to these factors. In addition, the acoustic signal may reveal information which is not evident in the perceptual signal. A second focus of the current research was motivated by Blumstein et al.'s (1980) classic study on voice onset time (VOT) in adults with acquired apraxia of speech (AOS) which demonstrated a motor impairment underlying AOS. In the current study, VOT analyses were conducted to determine the relationship between age and group with the voicing distribution for bilabial and alveolar plosives. Findings revealed that 3-year-olds evidenced significantly higher inconsistency than 5-year-olds; segmental inconsistency approached 0% in 5-year-olds with TD, whereas it persisted in children with PD and CAS suggesting that for child in this age-range, inconsistency is a feature of speech disorder rather than typical development (Holm et al., 2007). Likewise, whereas segmental and lexical inconsistency were

  4. Development of The Viking Speech Scale to classify the speech of children with cerebral palsy.

    Science.gov (United States)

    Pennington, Lindsay; Virella, Daniel; Mjøen, Tone; da Graça Andrada, Maria; Murray, Janice; Colver, Allan; Himmelmann, Kate; Rackauskaite, Gija; Greitane, Andra; Prasauskiene, Audrone; Andersen, Guro; de la Cruz, Javier

    2013-10-01

    Surveillance registers monitor the prevalence of cerebral palsy and the severity of resulting impairments across time and place. The motor disorders of cerebral palsy can affect children's speech production and limit their intelligibility. We describe the development of a scale to classify children's speech performance for use in cerebral palsy surveillance registers, and its reliability across raters and across time. Speech and language therapists, other healthcare professionals and parents classified the speech of 139 children with cerebral palsy (85 boys, 54 girls; mean age 6.03 years, SD 1.09) from observation and previous knowledge of the children. Another group of health professionals rated children's speech from information in their medical notes. With the exception of parents, raters reclassified children's speech at least four weeks after their initial classification. Raters were asked to rate how easy the scale was to use and how well the scale described the child's speech production using Likert scales. Inter-rater reliability was moderate to substantial (k>.58 for all comparisons). Test-retest reliability was substantial to almost perfect for all groups (k>.68). Over 74% of raters found the scale easy or very easy to use; 66% of parents and over 70% of health care professionals judged the scale to describe children's speech well or very well. We conclude that the Viking Speech Scale is a reliable tool to describe the speech performance of children with cerebral palsy, which can be applied through direct observation of children or through case note review. PMID:23891732

  5. An articulatorily constrained, maximum entropy approach to speech recognition and speech coding

    Energy Technology Data Exchange (ETDEWEB)

    Hogden, J.

    1996-12-31

    Hidden Markov models (HMM`s) are among the most popular tools for performing computer speech recognition. One of the primary reasons that HMM`s typically outperform other speech recognition techniques is that the parameters used for recognition are determined by the data, not by preconceived notions of what the parameters should be. This makes HMM`s better able to deal with intra- and inter-speaker variability despite the limited knowledge of how speech signals vary and despite the often limited ability to correctly formulate rules describing variability and invariance in speech. In fact, it is often the case that when HMM parameter values are constrained using the limited knowledge of speech, recognition performance decreases. However, the structure of an HMM has little in common with the mechanisms underlying speech production. Here, the author argues that by using probabilistic models that more accurately embody the process of speech production, he can create models that have all the advantages of HMM`s, but that should more accurately capture the statistical properties of real speech samples--presumably leading to more accurate speech recognition. The model he will discuss uses the fact that speech articulators move smoothly and continuously. Before discussing how to use articulatory constraints, he will give a brief description of HMM`s. This will allow him to highlight the similarities and differences between HMM`s and the proposed technique.

  6. The Effect of English Verbal Songs on Connected Speech Aspects of Adult English Learners’ Speech Production

    Directory of Open Access Journals (Sweden)

    Farshid Tayari Ashtiani

    2015-02-01

    Full Text Available The present study was an attempt to investigate the impact of English verbal songs on connected speech aspects of adult English learners’ speech production. 40 participants were selected based on the results of their performance in a piloted and validated version of NELSON test given to 60 intermediate English learners in a language institute in Tehran. Then they were equally distributed in two control and experimental groups and received a validated pretest of reading aloud and speaking in English. Afterward, the treatment was performed in 18 sessions by singing preselected songs culled based on some criteria such as popularity, familiarity, amount, and speed of speech delivery, etc. In the end, the posttests of reading aloud and speaking in English were administered. The results revealed that the treatment had statistically positive effects on the connected speech aspects of English learners’ speech production at statistical .05 level of significance. Meanwhile, the results represented that there was not any significant difference between the experimental group’s mean scores on the posttests of reading aloud and speaking. It was thus concluded that providing the EFL learners with English verbal songs could positively affect connected speech aspects of both modes of speech production, reading aloud and speaking. The Findings of this study have pedagogical implications for language teachers to be more aware and knowledgeable of the benefits of verbal songs to promote speech production of language learners in terms of naturalness and fluency.Keywords: English Verbal Songs, Connected Speech, Speech Production, Reading Aloud, Speaking 

  7. E-learning-based speech therapy: a web application for speech training.

    Science.gov (United States)

    Beijer, Lilian J; Rietveld, Toni C M; van Beers, Marijn M A; Slangen, Robert M L; van den Heuvel, Henk; de Swart, Bert J M; Geurts, Alexander C H

    2010-03-01

    Abstract In The Netherlands, a web application for speech training, E-learning-based speech therapy (EST), has been developed for patients with dysarthria, a speech disorder resulting from acquired neurological impairments such as stroke or Parkinson's disease. In this report, the EST infrastructure and its potentials for both therapists and patients are elucidated. EST provides patients with dysarthria the opportunity to engage in intensive speech training in their own environment, in addition to undergoing the traditional face-to-face therapy. Moreover, patients with chronic dysarthria can use EST to independently maintain the quality of their speech once the face-to-face sessions with their speech therapist have been completed. This telerehabilitation application allows therapists to remotely compose speech training programs tailored to suit each individual patient. Moreover, therapists can remotely monitor and evaluate changes in the patient's speech. In addition to its value as a device for composing, monitoring, and carrying out web-based speech training, the EST system compiles a database of dysarthric speech. This database is vital for further scientific research in this area. PMID:20184455

  8. Tracking Change in Children with Severe and Persisting Speech Difficulties

    Science.gov (United States)

    Newbold, Elisabeth Joy; Stackhouse, Joy; Wells, Bill

    2013-01-01

    Standardised tests of whole-word accuracy are popular in the speech pathology and developmental psychology literature as measures of children's speech performance. However, they may not be sensitive enough to measure changes in speech output in children with severe and persisting speech difficulties (SPSD). To identify the best ways of doing this,…

  9. Speech Sound Disorders in a Community Study of Preschool Children

    Science.gov (United States)

    McLeod, Sharynne; Harrison, Linda J.; McAllister, Lindy; McCormack, Jane

    2013-01-01

    Purpose: To undertake a community (nonclinical) study to describe the speech of preschool children who had been identified by parents/teachers as having difficulties "talking and making speech sounds" and compare the speech characteristics of those who had and had not accessed the services of a speech-language pathologist (SLP). Method:…

  10. SII-Based Speech Prepocessing for Intelligibility Improvement in Noise

    DEFF Research Database (Denmark)

    Taal, Cees H.; Jensen, Jesper

    2013-01-01

    A linear time-invariant filter is designed in order to improve speech understanding when the speech is played back in a noisy environment. To accomplish this, the speech intelligibility index (SII) is maximized under the constraint that the speech energy is held constant. A nonlinear approximation...

  11. The New Findings Made in Speech Act Theory

    Institute of Scientific and Technical Information of China (English)

    管彦波

    2007-01-01

    Through carefully studying the theory of speech acts and the literature concerning it,the author made some new findings which reflects in three aspects:the similarities and differences in Chinese and English in expressing the same speech act,the relations between different types of speech acts and the correspondence between sentenee sets and sets of speech acts.

  12. Speech Characteristics Associated with Three Genotypes of Ataxia

    Science.gov (United States)

    Sidtis, John J.; Ahn, Ji Sook; Gomez, Christopher; Sidtis, Diana

    2011-01-01

    Purpose: Advances in neurobiology are providing new opportunities to investigate the neurological systems underlying motor speech control. This study explores the perceptual characteristics of the speech of three genotypes of spino-cerebellar ataxia (SCA) as manifest in four different speech tasks. Methods: Speech samples from 26 speakers with SCA…

  13. The Interpersonal Metafunction Analysis of Barack Obama's Victory Speech

    Science.gov (United States)

    Ye, Ruijuan

    2010-01-01

    This paper carries on a tentative interpersonal metafunction analysis of Barack Obama's victory speech from the interpersonal metafunction, which aims to help readers understand and evaluate the speech regarding its suitability, thus to provide some guidance for readers to make better speeches. This study has promising implications for speeches as…

  14. Ahab's Speeches: Bombs or Bombastics? A Rhetorical Criticism.

    Science.gov (United States)

    Fadely, Dean

    In an attempt to define rhetorical discourse, the paper examines the speeches of Ahab, the main character from Herman Melville's book, "Moby-Dick." The paper first determines if Ahab's speeches actually fall into the category of rhetorical discourse by examining his major speeches, and then ascertains whether his speeches are bombs (successful…

  15. Emotion recognition from speech: tools and challenges

    Science.gov (United States)

    Al-Talabani, Abdulbasit; Sellahewa, Harin; Jassim, Sabah A.

    2015-05-01

    Human emotion recognition from speech is studied frequently for its importance in many applications, e.g. human-computer interaction. There is a wide diversity and non-agreement about the basic emotion or emotion-related states on one hand and about where the emotion related information lies in the speech signal on the other side. These diversities motivate our investigations into extracting Meta-features using the PCA approach, or using a non-adaptive random projection RP, which significantly reduce the large dimensional speech feature vectors that may contain a wide range of emotion related information. Subsets of Meta-features are fused to increase the performance of the recognition model that adopts the score-based LDC classifier. We shall demonstrate that our scheme outperform the state of the art results when tested on non-prompted databases or acted databases (i.e. when subjects act specific emotions while uttering a sentence). However, the huge gap between accuracy rates achieved on the different types of datasets of speech raises questions about the way emotions modulate the speech. In particular we shall argue that emotion recognition from speech should not be dealt with as a classification problem. We shall demonstrate the presence of a spectrum of different emotions in the same speech portion especially in the non-prompted data sets, which tends to be more "natural" than the acted datasets where the subjects attempt to suppress all but one emotion.

  16. Fifty years of progress in speech recognition

    Science.gov (United States)

    Reddy, Raj

    2004-10-01

    Human level speech recognition has proved to be an elusive goal because of the many sources of variability that affect speech: from stationary and dynamic noise, microphone variability, and speaker variability to variability at phonetic, prosodic, and grammatical levels. Over the past 50 years, Jim Flanagan has been a continuous source of encouragement and inspiration to the speech recognition community. While early isolated word systems primarily used acoustic knowledge, systems in the 1970s found mechanisms to represent and utilize syntactic (e.g., information retrieval) and semantic knowledge (e.g., Chess) in speech recognition systems. As vocabularies became larger, leading to greater ambiguity and perplexity, we had to explore the use task specific and context specific knowledge to reduce the branching factors. As the need arose for systems that can be used by open populations using telephone quality speech, we developed learning techniques that use very large data sets and noise adaptation methods. We still have a long way to go before we can satisfactorily handle unrehearsed spontaneous speech, speech from non-native speakers, and dynamic learning of new words, phrases, and grammatical forms.

  17. A pattern recognition based esophageal speech enhancement system

    Directory of Open Access Journals (Sweden)

    A.Mantilla‐Caeiros

    2010-04-01

    Full Text Available A system for improving the intelligibility and quality of alaryngeal speech based on the replacement of voiced segments ofalaryngeal speech with the equivalent segments of normal speech is proposed. To this end, the system proposed identifies thevoiced segments of the alaryngeal speech signal by using isolate speech recognition methods, and replaces them by theirequivalent voiced segments of normal speech, keeping the silence and unvoiced segments without change. Evaluation resultsusing objective and subjective evaluation methods show that the proposed system proposed provides a fairly goodimprovement of the quality and intelligibility of alaryngeal speech signals.

  18. Investigating Pragmatics of Complaint Speech Acts in English and Chinese

    Institute of Scientific and Technical Information of China (English)

    张颖卉; 李尚哲

    2013-01-01

    The speech act of complaint is an important research subject of pragmatics, which is worthy of research among speech acts. With the development of research into speech acts, some scholars have performed investigations of complaints ,but they have done little work on Chinese language complaints. Therefore, it is necessary to make a further study on complaint as a speech act in Chinese. This thesis is based on speech act theory and the politeness principle as an empirical study of the speech act of com-plaint in Chinese. It aims to provide a more complete and comprehensive result of participant production of the speech act of complaint.

  19. Speech-specific audiovisual perception affects identification but not detection of speech

    DEFF Research Database (Denmark)

    Eskelund, Kasper; Andersen, Tobias

    Speech perception is audiovisual as evidenced by the McGurk effect in which watching incongruent articulatory mouth movements can change the phonetic auditory speech percept. This type of audiovisual integration may be specific to speech or be applied to all stimuli in general. To investigate this...... audiovisual integration specific to speech perception. However, the results of Tuomainen et al. might have been influenced by another effect. When observers were naïve, they had little motivation to look at the face. When informed, they knew that the face was relevant for the task and this could increase...... noise were measured for naïve and informed participants. We found that the threshold for detecting speech in audiovisual stimuli was lower than for auditory-only stimuli. But there was no detection advantage for observers informed of the speech nature of the auditory signal. This may indicate that...

  20. A Danish open-set speech corpus for competing-speech studies

    DEFF Research Database (Denmark)

    Nielsen, Jens Bo; Dau, Torsten; Neher, Tobias

    2014-01-01

    ) when the competing speech signals are spatially separated. To achieve higher SRTs that correspond more closely to natural communication situations, an open-set, low-context, multi-talker speech corpus was developed. Three sets of 268 unique Danish sentences were created, and each set was recorded......Studies investigating speech-on-speech masking effects commonly use closed-set speech materials such as the coordinate response measure [Bolia et al. (2000). J. Acoust. Soc. Am. 107, 1065-1066]. However, these studies typically result in very low (i.e., negative) speech recognition thresholds (SRTs...... in a setup with a frontal target sentence and two concurrent masker sentences at ±50 degrees azimuth. For a group of 16 normal-hearing listeners and a group of 15 elderly (linearly aided) hearing-impaired listeners, overall SRTs of, respectively, +1.3 dB and +6.3 dB target-to-masker ratio were obtained...

  1. Hate Speech Revisited: The "Toon" Controversy

    Directory of Open Access Journals (Sweden)

    Rajeev Dhavan

    2010-01-01

    Full Text Available Examining the cartoon controversy which ignited violent protests and ban in various countries, this article examines the contours of "hate speech" in various legal systems. While broadly supporting the case of free speech the authors remind users of free speech to exercise self-restraint. Absolute bans should not be made, but time, person and place constraints may be essential. Ironically, the toon controversy also reveals the silence of the sympathetic majority. Similarly, there is a duty to speak. Even though not enforceable, it remains a duty to democracy.

  2. Automatic speech recognition a deep learning approach

    CERN Document Server

    Yu, Dong

    2015-01-01

    This book summarizes the recent advancement in the field of automatic speech recognition with a focus on discriminative and hierarchical models. This will be the first automatic speech recognition book to include a comprehensive coverage of recent developments such as conditional random field and deep learning techniques. It presents insights and theoretical foundation of a series of recent models such as conditional random field, semi-Markov and hidden conditional random field, deep neural network, deep belief network, and deep stacking models for sequential learning. It also discusses practical considerations of using these models in both acoustic and language modeling for continuous speech recognition.

  3. Personality in speech assessment and automatic classification

    CERN Document Server

    Polzehl, Tim

    2015-01-01

    This work combines interdisciplinary knowledge and experience from research fields of psychology, linguistics, audio-processing, machine learning, and computer science. The work systematically explores a novel research topic devoted to automated modeling of personality expression from speech. For this aim, it introduces a novel personality assessment questionnaire and presents the results of extensive labeling sessions to annotate the speech data with personality assessments. It provides estimates of the Big 5 personality traits, i.e. openness, conscientiousness, extroversion, agreeableness, and neuroticism. Based on a database built on the questionnaire, the book presents models to tell apart different personality types or classes from speech automatically.

  4. Speech recognition based on pattern recognition techniques

    Science.gov (United States)

    Rabiner, Lawrence R.

    1990-05-01

    Algorithms for speech recognition can be characterized broadly as pattern recognition approaches and acoustic phonetic approaches. To date, the greatest degree of success in speech recognition has been obtained using pattern recognition paradigms. Pattern recognition techniques have been applied to the problems of isolated word (or discrete utterance) recognition, connected word recognition, and continuous speech recognition. It is shown that understanding (and consequently the resulting recognizer performance) is best for the simplest recognition tasks and is considerably less well developed for large-scale recognition systems.
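
    The pattern-matching paradigm described above is classically realized, for isolated words, with dynamic time warping (DTW). The sketch below is illustrative only and is not taken from the paper; it assumes feature extraction has already produced per-frame vectors, and "templates" is a hypothetical map from each vocabulary word to a reference feature sequence.

        import numpy as np

        def dtw_distance(a, b):
            """Dynamic time warping distance between two (frames x dims) sequences."""
            n, m = len(a), len(b)
            D = np.full((n + 1, m + 1), np.inf)
            D[0, 0] = 0.0
            for i in range(1, n + 1):
                for j in range(1, m + 1):
                    cost = np.linalg.norm(a[i - 1] - b[j - 1])   # local frame distance
                    D[i, j] = cost + min(D[i - 1, j], D[i, j - 1], D[i - 1, j - 1])
            return D[n, m]

        def recognize(utterance, templates):
            """Pick the vocabulary word whose stored template warps closest to the input."""
            return min(templates, key=lambda w: dtw_distance(utterance, templates[w]))

    Because DTW aligns whole feature sequences under nonlinear time warping, it absorbs speaking-rate variation, one reason template matching succeeded first on isolated and connected word tasks.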

  5. Testing for robust speech recognition performance

    Science.gov (United States)

    Simpson, C. A.; Moore, C. A.; Ruth, J. C.

    Results are reported from two studies which evaluated speaker-dependent connected-speech template-matching algorithms. One study examined the recognition performance for vocabularies spoken within a spacesuit. Two token vocabularies were used that were recorded in different noise levels. The second study evaluated the rejection accuracy for two commercial speech recognizers. The spoken test tokens were variations on a single word. The tests underscored the inferiority of speech recognizers relative to the human capability for discerning among phonetically different words. However, one commercial recognizer exhibited over 96-percent rejection accuracy in a noisy environment.

  6. Speech-enabled Computer-aided Translation

    DEFF Research Database (Denmark)

    Mesa-Lao, Bartolomé

    2014-01-01

    The present study has surveyed post-editor trainees’ views and attitudes before and after the introduction of speech technology as a front end to a computer-aided translation workbench. The aim of the survey was (i) to identify attitudes and perceptions among post-editor trainees before performing a post-editing task using automatic speech recognition (ASR); and (ii) to assess the degree to which post-editors’ attitudes and expectations to the use of speech technology changed after actually using it. The survey was based on two questionnaires: the first one administered before the…

  7. Two Sides of the Same Coin: The Scope of Free Speech and Hate Speech in the College Community.

    Science.gov (United States)

    Schuett, Faye

    2000-01-01

    This article presents the Two Sides interviews, which confront the serious and immediate conflict between free speech and hate speech on college campuses. Dr. Robert O' Neil discusses the scope of free speech in the college community, while Dr. Timothy Shiell focuses on hate speech on campuses. Contains 12 references. (VWC)

  8. Empathy, Ways of Knowing, and Interdependence as Mediators of Gender Differences in Attitudes toward Hate Speech and Freedom of Speech

    Science.gov (United States)

    Cowan, Gloria; Khatchadourian, Desiree

    2003-01-01

    Women are more intolerant of hate speech than men. This study examined relationality measures as mediators of gender differences in the perception of the harm of hate speech and the importance of freedom of speech. Participants were 107 male and 123 female college students. Questionnaires assessed the perceived harm of hate speech, the importance…

  9. The effectiveness of Speech-Music Therapy for Aphasia (SMTA) in five speakers with Apraxia of Speech and aphasia

    NARCIS (Netherlands)

    Hurkmans, Joost; Jonkers, Roel; de Bruijn, Madeleen; Boonstra, Anne M.; Hartman, Paul P.; Arendzen, Hans; Reinders - Messelink, Heelen

    2015-01-01

    Background: Several studies using musical elements in the treatment of neurological language and speech disorders have reported improvement of speech production. One such programme, Speech-Music Therapy for Aphasia (SMTA), integrates speech therapy and music therapy (MT) to treat the individual with

  10. Speech information retrieval: a review

    Energy Technology Data Exchange (ETDEWEB)

    Hafen, Ryan P.; Henry, Michael J.

    2012-11-01

    Audio is an information-rich component of multimedia. Information can be extracted from audio in a number of different ways, and thus there are several established audio signal analysis research fields. These fields include speech recognition, speaker recognition, audio segmentation and classification, and audio fingerprinting. The information that can be extracted from tools and methods developed in these fields can greatly enhance multimedia systems. In this paper, we present the current state of research in each of the major audio analysis fields. The goal is to introduce enough background for someone new in the field to quickly gain high-level understanding and to provide direction for further study.

  11. Speech and language disorders in pre-school

    OpenAIRE

    Benda, Monika

    2014-01-01

    Speech is important in human life; it forms every individual and gives them the opportunity to establish communication with their surroundings. Communication is disturbed when a person is confronted with a speech and language disorder. An educator must be especially familiar with the stages of development of speech perception and with speech and language disorders, because such disorders affect children as early as the preschool period. In the theoretical part, I used a variety of lit...

  12. Objects Control through Speech Recognition Using LabVIEW

    OpenAIRE

    Ankush Sharma; Srinivas Perala; Priya Darshni

    2013-01-01

    Speech is the natural form of human communication, and speech processing is one of the most stimulating areas of signal processing. Speech recognition technology has made it possible for computers to follow human voice commands and understand human languages. This paper designs control of objects (LEDs, toggle switches, etc.) through human speech by combining virtual instrumentation technology and speech recognition techniques; password authentication is also provided....

  13. VLSI Implementation of Hybrid Algorithm Architecture for Speech Enhancement

    OpenAIRE

    Jigar Shah; Satish Shah

    2012-01-01

    Speech enhancement techniques are required in many applications to improve speech signal quality without introducing artifacts. Recently, the growing use of cellular and mobile phones, hands-free systems, VoIP phones, voice messaging services, call service centers, etc. has required efficient real-time speech enhancement and detection strategies to make these systems superior to conventional speech communication systems. Speech enhancement algorithms are required to deal with additive noise and c...

  14. Exploring speech therapy games with children on the autism spectrum

    OpenAIRE

    Picard, Rosalind W.; Lane, Joseph K.; el Kaliouby, Rana; Goodwin, Matthew; Hoque, Mohammed Ehasanul

    2009-01-01

    Individuals on the autism spectrum often have difficulties producing intelligible speech, with either high or low speech rate and atypical pitch and/or amplitude affect. In this study, we present a novel intervention aimed at customizing speech-enabled games to help them produce intelligible speech. In this approach, we clinically and computationally identify the areas of speech production difficulty of our participants. We provide an interactive and customized interface for the participants...

  15. Advancing Electromyographic Continuous Speech Recognition: Signal Preprocessing and Modeling

    OpenAIRE

    Wand, Michael

    2014-01-01

    Speech is the natural medium of human communication, but audible speech can be overheard by bystanders and excludes speech-disabled people. This work presents a speech recognizer based on surface electromyography, where electric potentials of the facial muscles are captured by surface electrodes, allowing speech to be processed nonacoustically. A system which was state-of-the-art at the beginning of this thesis is substantially improved in terms of accuracy, flexibility, and robustness.

  16. Development of The Viking Speech Scale to Classify the Speech of Children with Cerebral Palsy

    OpenAIRE

    Pennington, L; Virella, D; Mjøen, T; Andrada, MG; Murray, J.; Colver, A; Himmelmann, K; Rackauskaite, G; Greitane, A; Prasauskiene, A; Andersen, G.; Cruz, J.

    2013-01-01

    Surveillance registers monitor the prevalence of cerebral palsy and the severity of resulting impairments across time and place. The motor disorders of cerebral palsy can affect children’s speech production and limit their intelligibility. We describe the development of a scale to classify children’s speech performance for use in cerebral palsy surveillance registers, and its reliability across raters and across time. Speech and language therapists, other healthcare professionals and paren...

  17. Comparison of speech parameterization techniques for the classification of speech disfluencies

    OpenAIRE

    FOOK, Chong Yen; Muthusamy, Hariharan; CHEE, Lim Sin; YAACOB, Sazali Bin; ADOM, Abdul Hamid Bin

    2012-01-01

    Stuttering assessment through the manual classification of speech disfluencies is subjective, inconsistent, time-consuming, and prone to error. The aim of this paper is to compare the effectiveness of the 3 speech feature extraction methods, mel-frequency cepstral coefficients, linear predictive coding (LPC)-based cepstral parameters, and perceptual linear predictive (PLP) analysis, for classifying 2 types of speech disfluencies, repetition and prolongation, from recorded disfluent spee...
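
    As a rough illustration of such a comparison pipeline (not the authors' exact setup), the sketch below extracts one of the three parameterizations, MFCCs, and trains a support vector classifier on labeled disfluent segments. It assumes the librosa and scikit-learn Python libraries; the file names and labels are placeholders.

        import numpy as np
        import librosa
        from sklearn.svm import SVC

        def mfcc_features(path, n_mfcc=13):
            """One fixed-length feature per clip: the mean MFCC vector."""
            y, sr = librosa.load(path, sr=16000)
            return librosa.feature.mfcc(y=y, sr=sr, n_mfcc=n_mfcc).mean(axis=1)

        def train_disfluency_classifier(paths, labels):
            """Fit an SVM separating repetition (0) from prolongation (1) clips."""
            X = np.stack([mfcc_features(p) for p in paths])
            return SVC(kernel="rbf").fit(X, labels)

        # Hypothetical usage with placeholder file names and annotations:
        # clf = train_disfluency_classifier(["rep_001.wav", "pro_001.wav"], [0, 1])
        # print(clf.predict(mfcc_features("unknown.wav").reshape(1, -1)))

    Swapping mfcc_features for an LPC-cepstral or PLP front end, with the classifier held fixed, is the kind of controlled comparison the paper reports.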

  18. Song and speech: examining the link between singing talent and speech imitation ability

    OpenAIRE

    Christiner, Markus; Reiterer, Susanne M.

    2013-01-01

    In previous research on speech imitation, musicality, and an ability to sing were isolated as the strongest indicators of good pronunciation skills in foreign languages. We, therefore, wanted to take a closer look at the nature of the ability to sing, which shares a common ground with the ability to imitate speech. This study focuses on whether good singing performance predicts good speech imitation. Forty-one singers of different levels of proficiency were selected for the study and their ab...

  19. Song and speech: examining the link between singing talent and speech imitation ability

    OpenAIRE

    Markus eChristiner; Susanne Maria Reiterer

    2013-01-01

    In previous research on speech imitation, musicality and an ability to sing were isolated as the strongest indicators of good pronunciation skills in foreign languages. We, therefore, wanted to take a closer look at the nature of the ability to sing, which shares a common ground with the ability to imitate speech. This study focuses on whether good singing performance predicts good speech imitation. Forty-one singers of different levels of proficiency were selected for the study and their ab...

  20. Relations between affective music and speech: Evidence from dynamics of affective piano performance and speech production

    Directory of Open Access Journals (Sweden)

    Xiaoluan Liu

    2015-07-01

    Full Text Available This study compares affective piano performance with speech production from the perspective of dynamics: unlike previous research, this study uses finger force and articulatory effort as indexes reflecting the dynamics of affective piano performance and speech production, respectively. Moreover, for the first time physical constraints such as piano fingerings and speech articulatory distance are included, due to their potential contribution to different patterns of dynamics. A piano performance experiment and a speech production experiment were conducted in four emotions: anger, fear, happiness and sadness. The results show that in both piano performance and speech production, anger and happiness generally have high dynamics while sadness has the lowest dynamics, with fear in the middle. Fingerings interact with fear in the piano experiment and articulatory distance interacts with anger in the speech experiment, i.e., large physical constraints produce significantly higher dynamics than small physical constraints in piano performance under the condition of fear and in speech production under the condition of anger. Using production experiments, this study thus provides the first production-based support for earlier perception studies on the relations between affective music and speech. Moreover, it is the first study to offer quantitative evidence for the importance of considering motor aspects such as dynamics when comparing music performance and speech production, in which motor mechanisms play a crucial role.

  1. Cortical speech and non-speech discrimination in relation to cognitive measures in preschool children.

    Science.gov (United States)

    Kuuluvainen, Soila; Alku, Paavo; Makkonen, Tommi; Lipsanen, Jari; Kujala, Teija

    2016-03-01

    Effective speech sound discrimination at preschool age is known to be a prerequisite for the development of language skills and later literacy acquisition. However, the speech specificity of cortical discrimination skills in small children is currently not known, as previous research has either studied speech functions without comparison with non-speech sounds, or used much simpler sounds such as harmonic or sinusoidal tones as control stimuli. We investigated the cortical discrimination of five syllable features (consonant, vowel, vowel duration, fundamental frequency, and intensity), covering both segmental and prosodic phonetic changes, and their acoustically matched non-speech counterparts in 63 6-year-old typically developed children, by using a multi-feature mismatch negativity (MMN) paradigm. Each of the five investigated features elicited a unique pattern of differentiating negativities: an early differentiating negativity, MMN, and a late differentiating negativity. All five studied features showed speech-related enhancement of at least one of these responses, suggesting experience-related neural commitment in both phonetic and prosodic speech processing. In addition, the cognitive performance and language skills of the children were tested extensively. The speech-related neural enhancement was positively associated with the level of performance in several neurocognitive tasks, indicating a relationship between successful establishment of cortical memory traces for speech and enhanced cognitive functioning. The results contribute to the understanding of typical developmental trajectories of linguistic vs. non-linguistic auditory skills, and provide a reference for future studies investigating deficits in language-related disorders at preschool age. PMID:26647120

  2. The Enhancement of Onsets in the Speech Envelope increases Speech Intelligibility in Noise Vocoded Cochlear Implant Simulations

    OpenAIRE

    Koning, Raphael; Wouters, Jan

    2011-01-01

    Speech understanding with cochlear implants (CIs) can be good in quiet, but very poor in more adverse listening conditions. One objective of speech enhancement algorithms is to improve the intelligibility of noisy speech signals. Recent studies showed that the transient parts of a speech signal contribute most to speech intelligibility in normal-hearing (NH) listeners. In this study, the influence of onset envelope enhancement on speech intelligibility in noisy conditions using an eight channe...

  3. Evolution of speech-specific cognitive adaptations

    Directory of Open Access Journals (Sweden)

    Bart de Boer

    2015-09-01

    Full Text Available This paper briefly reviews theoretical results that shed light on what kind of cognitive adaptations we can expect to have evolved for (combinatorial) speech and then reviews concrete empirical work investigating adaptations for combinatorial speech. The paper argues that an evolutionary perspective is natural when investigating cognitive adaptations related to speech and language. This is because properties of language are determined through complex interaction between biologically evolved cognitive mechanisms (possibly adapted to language) and cultural (evolutionary) processes. It turns out that there is as yet no strong direct evidence for cognitive traits that have undergone selection related to speech in general or combinatorial structure in particular, but there is indirect evidence that indicates selection. However, the traits that may have undergone selection are expected to be continuously variable ones, rather than the discrete ones that linguists have focused on traditionally.

  4. The Beginnings of Danish Speech Perception

    DEFF Research Database (Denmark)

    Østerbye, Torkil

    Little is known about the perception of speech sounds by native Danish listeners. However, the Danish sound system differs in several interesting ways from the sound systems of other languages. For instance, Danish is characterized, among other features, by a rich vowel inventory and by different reductions of speech sounds evident in the pronunciation of the language. This book (originally a PhD thesis) consists of three studies based on the results of two experiments. The experiments were designed to provide knowledge of the perception of Danish speech sounds by Danish adults and infants, in the light of the rich and complex Danish sound system. The first two studies report on native adults’ perception of Danish speech sounds in quiet and noise. The third study examined the development of language-specific perception in native Danish infants at 6, 9 and 12 months of age. The book points…

  5. Ultra low bit-rate speech coding

    CERN Document Server

    Ramasubramanian, V

    2015-01-01

    "Ultra Low Bit-Rate Speech Coding" focuses on the specialized topic of speech coding at very low bit-rates of 1 Kbits/sec and less, particularly at the lower ends of this range, down to 100 bps. The authors set forth the fundamental results and trends that form the basis for such ultra low bit-rates to be viable and provide a comprehensive overview of various techniques and systems in literature to date, with particular attention to their work in the paradigm of unit-selection based segment quantization. The book is for research students, academic faculty and researchers, and industry practitioners in the areas of speech processing and speech coding.

  6. Autosomal dominant rolandic epilepsy with speech dyspraxia.

    Science.gov (United States)

    Scheffer, I E

    2000-01-01

    Autosomal Dominant Rolandic Epilepsy with Speech Dyspraxia (ADRESD) is a rare disorder which highlights the relationship between Benign Rolandic Epilepsy (BRE) and speech and language disorders. Subtle speech and language disorders have recently been well characterised in BRE. ADRESD is associated with long term, more severe speech and language difficulties. The time course of rolandic epilepsy in ADRESD is typical of that of BRE. ADRESD is inherited in an autosomal dominant manner with anticipation. It is postulated that the anticipation may be due to an, as yet unidentified, triplet repeat expansion in a gene for rolandic epilepsy. BRE follows complex inheritance but it is possible that ADRESD may hold some valuable clues to the pathogenesis of BRE. PMID:11231219

  7. Speech for People with Tracheostomies or Ventilators

    Science.gov (United States)

    What is a tracheostomy? A tracheostomy is a surgical opening in the …

  8. Counteracting Acoustic Disturbances in Human Speech Communication

    OpenAIRE

    Westerlund, Nils

    2006-01-01

    A signal can be said to be any information bearing unit or action carrying a message from a sender to a receiver. This definition covers a vast number of human and non-human actions, ranging from flirtation to satellite communication. This thesis deals with increasing the quality of one of the most ubiquitous human-to-human signals: Speech. Surrounding noise is a severe obstacle to relaxed speech communication. Cars, industry and many everyday machines emit high noise levels that render perso...

  9. An introduction to statistical parametric speech synthesis

    Indian Academy of Sciences (India)

    Simon King

    2011-10-01

    Statistical parametric speech synthesis, based on hidden Markov model-like models, has become competitive with established concatenative techniques over the last few years. This paper offers a non-mathematical introduction to this method of speech synthesis. It is intended to be complementary to the wide range of excellent technical publications already available. Rather than offer a comprehensive literature review, this paper instead gives a small number of carefully chosen references which are good starting points for further reading.

  10. A Dialectal Chinese Speech Recognition Framework

    Institute of Scientific and Technical Information of China (English)

    Jing Li; Thomas Fang Zheng; William Byrne; Dan Jurafsky

    2006-01-01

    A framework for dialectal Chinese speech recognition is proposed and studied, in which a relatively small dialectal Chinese (or, in other words, Chinese influenced by the native dialect) speech corpus and dialect-related knowledge are adopted to transform a standard Chinese (or Putonghua, abbreviated as PTH) speech recognizer into a dialectal Chinese speech recognizer. Two kinds of knowledge sources are explored: one is expert knowledge and the other is a small dialectal Chinese corpus. These knowledge sources provide information at four levels: the phonetic level, lexicon level, language level, and acoustic decoder level. This paper takes Wu dialectal Chinese (WDC) as an example target language. The goal is to establish a WDC speech recognizer from an existing PTH speech recognizer, based on the Initial-Final structure of the Chinese language and a study of how dialectal Chinese speakers speak Putonghua. The authors propose to use context-independent PTH-IF mappings (where IF means either a Chinese Initial or a Chinese Final), context-independent WDC-IF mappings, and syllable-dependent WDC-IF mappings (obtained from either experts or data), and to combine them with the supervised maximum likelihood linear regression (MLLR) acoustic model adaptation method. To reduce the size of the multi-pronunciation lexicon introduced by the IF mappings, which might also enlarge lexicon confusion and hence degrade performance, a Multi-Pronunciation Expansion (MPE) method based on the accumulated uni-gram probability (AUP) is proposed. In addition, some commonly used WDC words are selected and added to the lexicon. Compared with the original PTH speech recognizer, the resulting WDC speech recognizer achieves a 10-18% absolute Character Error Rate (CER) reduction when recognizing WDC, with only a 0.62% CER increase when recognizing PTH. The proposed framework and methods are expected to work not only for Wu dialectal Chinese but also for other dialectal Chinese languages and
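
    The AUP-based pruning idea can be sketched as follows: keep each word's most probable pronunciation variants until a cumulative probability threshold is reached. This is a minimal reading of the MPE method, not the authors' implementation; the pronunciations, probabilities and threshold below are invented for illustration.

        def prune_lexicon(variants, threshold=0.9):
            """variants: {word: [(pronunciation, prob), ...]} -> pruned lexicon.

            Keep the most probable variants of each word until their
            accumulated (normalized) uni-gram probability reaches the threshold.
            """
            pruned = {}
            for word, prons in variants.items():
                total = sum(p for _, p in prons)
                kept, acc = [], 0.0
                for pron, p in sorted(prons, key=lambda x: -x[1]):
                    kept.append(pron)
                    acc += p / total
                    if acc >= threshold:
                        break
                pruned[word] = kept
            return pruned

        # Invented example: three variants, of which the rarest is dropped.
        lexicon = {"ni3": [("n i3", 0.7), ("gn i3", 0.2), ("l i3", 0.1)]}
        print(prune_lexicon(lexicon, threshold=0.85))  # {'ni3': ['n i3', 'gn i3']}

    Capping each word's variant list this way bounds the lexicon growth, and hence the confusability, that unrestricted IF-mapping expansion would introduce.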

  11. Task repetition and second language speech processing

    OpenAIRE

    Lambert, Craig; Kormos, Judit; Minn, Danny

    2016-01-01

    This study examines the relationship between the repetition of oral monologue tasks and immediate gains in L2 fluency. It considers the effect of aural-oral task repetition on speech rate, frequency of clause-final and mid-clause filled pauses, and overt self-repairs across different task types and proficiency levels and relates these findings to specific stages of L2 speech production (conceptualization, formulation and monitoring). Thirty-two Japanese learners of English sampled at three le...

  12. Modeling speech intelligibility in adverse conditions

    OpenAIRE

    Dau, Torsten

    2012-01-01

    In everyday life, the speech we listen to is often mixed with many other sound sources as well as reverberation. In such situations, people with normal hearing are able to almost effortlessly segregate a single voice out of the background. In contrast, hearing-impaired people have great difficulty understanding speech when more than one person is talking, even when reduced audibility has been fully compensated for by a hearing aid. The reasons for these difficulties are not well understood. T...

  13. Speech perception in a sparse domain

    OpenAIRE

    Li, Guoping

    2008-01-01

    Environmental statistics are known to be important factors shaping our perceptual system. The visual and auditory systems have evolved to be efficient for processing natural images or speech. The common characteristics between natural images and speech are that they are both highly structured, therefore having much redundancy. Our perceptual system may use redundancy reduction and sparse coding strategies to deal with complex stimuli every day. Both redundancy reduction ...

  14. Phonetic Alphabet for Speech Recognition of Czech

    OpenAIRE

    J. Uhlir; Psutka, J.; J. Nouza

    1997-01-01

    In the paper we introduce and discuss an alphabet that has been proposed for phonemically oriented automatic speech recognition. The alphabet, denoted as PAC (Phonetic Alphabet for Czech), consists of 48 basic symbols that allow for distinguishing all major events occurring in spoken Czech. The symbols can be used both for the phonetic transcription of Czech texts and for labeling recorded speech signals. For practical reasons, the alphabet occurs in two versions; one utilizes Cze...

  15. Unsupervised Topic Adaptation for Lecture Speech Retrieval

    OpenAIRE

    Fujii, Atsushi; Itou, Katunobu; Akiba, Tomoyosi; Ishikawa, Tetsuya

    2004-01-01

    We are developing a cross-media information retrieval system, in which users can view specific segments of lecture videos by submitting text queries. To produce a text index, the audio track is extracted from a lecture video and a transcription is generated by automatic speech recognition. In this paper, to improve the quality of our retrieval system, we extensively investigate the effects of adapting acoustic and language models on speech recognition. We apply an MLLR-based method to adapt...

  16. Rapid, generalized adaptation to asynchronous audiovisual speech

    OpenAIRE

    Van der Burg, Erik; Goodbourn, Patrick T.

    2015-01-01

    The brain is adaptive. The speed of propagation through air, and of low-level sensory processing, differs markedly between auditory and visual stimuli; yet the brain can adapt to compensate for the resulting cross-modal delays. Studies investigating temporal recalibration to audiovisual speech have used prolonged adaptation procedures, suggesting that adaptation is sluggish. Here, we show that adaptation to asynchronous audiovisual speech occurs rapidly. Participants viewed a brief clip of an...

  17. Integration of speech with natural language understanding.

    OpenAIRE

    Moore, R C

    1995-01-01

    The integration of speech recognition with natural language understanding raises issues of how to adapt natural language processing to the characteristics of spoken language; how to cope with errorful recognition output, including the use of natural language information to reduce recognition errors; and how to use information from the speech signal, beyond just the sequence of words, as an aid to understanding. This paper reviews current research addressing these questions in the Spoken Langu...

  18. NOISE CANCELLING HEADSETS FOR SPEECH COMMUNICATION

    OpenAIRE

    Håkansson, Lars; Johansson, Sven; Dahl, Mattias; Sjösten, Per; Claesson, Ingvar

    2002-01-01

    Headsets for speech communication are used in a wide range of applications. The basic idea is to allow hands-free speech communication, leaving both hands available for other tasks. One typical headset application is aircraft pilot communication. The pilot must be able to communicate with personnel on the ground and at the same time use both hands to control the aircraft. Communication headsets usually consist of a pair of headphones and a microphone attached with an adjustable boom. Headphon...

  19. Towards robust speech acquisition using sensor arrays

    OpenAIRE

    Maganti, Hari Krishna

    2007-01-01

    An integrated system approach was developed to address the problem of distant speech acquisition in multi-party meetings, using multiple microphones and cameras. Microphone array processing techniques have presented a potential alternative to close-talking microphones by providing speech enhancement through spatial filtering and directional discrimination. These techniques relied on accurate speaker locations for optimal performance. Tracking accurate speaker locations solely based on audio w...

  20. Towards Quranic reader controlled by speech

    OpenAIRE

    Yacine Yekache; Yekhlef Mekelleche; Belkacem Kouninef

    2012-01-01

    In this paper we describe the process of designing a task-oriented continuous speech recognition system for Arabic, based on CMU Sphinx4, to be used in the voice interface of a Quranic reader. The concept of the Quranic reader controlled by speech is presented, and the collection of the corpus and creation of the acoustic model are described in detail, taking into account the specificities of the Arabic language and the desired application.

  1. Towards Quranic reader controlled by speech

    Directory of Open Access Journals (Sweden)

    Yacine Yekache

    2011-11-01

    Full Text Available In this paper we describe the process of designing a task-oriented continuous speech recognition system for Arabic, based on CMU Sphinx4, to be used in the voice interface of a Quranic reader. The concept of the Quranic reader controlled by speech is presented, and the collection of the corpus and creation of the acoustic model are described in detail, taking into account the specificities of the Arabic language and the desired application.

  2. Phonological abstraction without phonemes in speech perception

    OpenAIRE

    Mitterer, H.; Scharenborg, O.; McQueen, J

    2013-01-01

    Recent evidence shows that listeners use abstract prelexical units in speech perception. Using the phenomenon of lexical retuning in speech processing, we ask whether those units are necessarily phonemic. Dutch listeners were exposed to a Dutch speaker producing ambiguous phones between the Dutch syllable-final allophones approximant [r] and dark [l]. These ambiguous phones replaced either final /r/ or final /l/ in words in a lexical-decision task. This differential exposure affected percepti...

  3. How does a dictation machine recognize speech?

    OpenAIRE

    Dutoit, T.; Couvreur, L.; Bourlard, Hervé

    2008-01-01

    There is magic (or is it witchcraft?) in a speech recognizer that transcribes continuous radio speech into text, even with a word accuracy of no more than 50%. The extreme difficulty of this task, though, is usually not perceived by the general public. This is because we are almost deaf to the infinite acoustic variations that accompany the production of vocal sounds, which arise from physiological constraints (co-articulation), but also from the acoustic environment (additive or convolutional...

  4. SPEECH TACTICS IN MASS MEDIA DISCOURSE

    Directory of Open Access Journals (Sweden)

    Olena Kaptiurova

    2014-06-01

    Full Text Available The article deals with the basic speech tactics used in mass media discourse. It is argued that tactics such as establishing contact, terminating speech interaction, and yielding or preserving the initiative are obligatory in the communicative situation of a talk show. The language personalities of television talk-show anchors and the linguistic organisation of interviews are also examined. The material is amply illustrated with relevant examples.

  5. The fragility of freedom of speech.

    Science.gov (United States)

    Shackel, Nicholas

    2013-05-01

    Freedom of speech is a fundamental liberty that imposes a stringent duty of tolerance. Tolerance is limited by direct incitements to violence. False notions and bad laws on speech have obscured our view of this freedom. Hence, perhaps, the self-righteous intolerance, incitements and threats in response to Giubilini and Minerva. Those who disagree have the right to argue back but their attempts to shut us up are morally wrong. PMID:23637438

  6. Analysis of speech fluency in Williams syndrome

    OpenAIRE

    Rossi, Natalia F.; Sampaio, Adriana; Gonçalves, Óscar F.; Giacheti, Célia Maria

    2011-01-01

    Williams syndrome (WS) is a neurodevelopmental genetic disorder, often referred to as being characterized by a dissociation between verbal and non-verbal abilities, although the number of studies disputing this proposal is growing. Indeed, although individuals with WS have traditionally been reported as displaying increased speech fluency, this topic has not been fully addressed in research. In previous studies carried out with a small group of individuals with WS, we reported speech breakdowns d...

  7. Reflections on mirror neurons and speech perception

    OpenAIRE

    Lotto, Andrew J.; Hickok, Gregory S.; Holt, Lori L.

    2009-01-01

    The discovery of mirror neurons, a class of neurons that respond when a monkey performs an action and also when the monkey observes others producing the same action, has promoted a renaissance for the Motor Theory (MT) of speech perception. This is because mirror neurons seem to accomplish the same kind of one-to-one mapping between perception and action that MT theorizes to be the basis of human speech communication. However, this seeming correspondence is superficial, and there are theoreti...

  8. Spectrum Modification for Emotional Speech Synthesis

    Czech Academy of Sciences Publication Activity Database

    Přibilová, A.; Přibil, Jiří

    Vol. 5398. Berlin: SPRINGER-VERLAG, 2009 - (Esposito, A.; Hussain, A.; Marinaro, M.; Martone, R.), s. 232-241. (Lecture Notes in Artificial Intelligence. 5398). ISBN 978-3-642-00524-4. ISSN 0302-9743. [ Cognition International Training School on Multimodal Signals - Cognitive and Algorithmic Issues. Vietri sul Mare (IT), 21.04.2008-26.04.2008] Institutional research plan: CEZ:AV0Z20670512 Keywords : emotional speech * speech synthesis Subject RIV: JA - Electronics ; Optoelectronics, Electrical Engineering

  9. Wideband Speech Recovery Using Psychoacoustic Criteria

    Directory of Open Access Journals (Sweden)

    Visar Berisha

    2007-08-01

    Full Text Available Many modern speech bandwidth extension techniques predict the high-frequency band based on features extracted from the lower band. While this method works for certain types of speech, problems arise when the correlation between the low and the high bands is not sufficient for adequate prediction. These situations require that additional high-band information is sent to the decoder. This overhead information, however, can be cleverly quantized using human auditory system models. In this paper, we propose a novel speech compression method that relies on bandwidth extension. The novelty of the technique lies in an elaborate perceptual model that determines a quantization scheme for wideband recovery and synthesis. Furthermore, a source/filter bandwidth extension algorithm based on spectral spline fitting is proposed. Results reveal that the proposed system improves the quality of narrowband speech while performing at a lower bitrate. When compared to other wideband speech coding schemes, the proposed algorithms provide comparable speech quality at a lower bitrate.

  10. Wideband Speech Recovery Using Psychoacoustic Criteria

    Directory of Open Access Journals (Sweden)

    Berisha Visar

    2007-01-01

    Full Text Available Many modern speech bandwidth extension techniques predict the high-frequency band based on features extracted from the lower band. While this method works for certain types of speech, problems arise when the correlation between the low and the high bands is not sufficient for adequate prediction. These situations require that additional high-band information is sent to the decoder. This overhead information, however, can be cleverly quantized using human auditory system models. In this paper, we propose a novel speech compression method that relies on bandwidth extension. The novelty of the technique lies in an elaborate perceptual model that determines a quantization scheme for wideband recovery and synthesis. Furthermore, a source/filter bandwidth extension algorithm based on spectral spline fitting is proposed. Results reveal that the proposed system improves the quality of narrowband speech while performing at a lower bitrate. When compared to other wideband speech coding schemes, the proposed algorithms provide comparable speech quality at a lower bitrate.
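
    As a toy illustration of the spline idea in the two records above (and emphatically not the authors' source/filter algorithm), the sketch below fits a smoothing spline to a synthetic narrowband log-magnitude envelope and extrapolates it into the missing high band; all values are invented for illustration.

        import numpy as np
        from scipy.interpolate import UnivariateSpline

        # Fake narrowband (0-4 kHz) log-magnitude envelope with a gentle tilt.
        freqs_nb = np.linspace(0, 4000, 129)
        log_mag_nb = -freqs_nb / 800.0 + 0.1 * np.random.randn(129)

        # Fit a smoothing spline to the low band ...
        spline = UnivariateSpline(freqs_nb, log_mag_nb, k=3, s=5.0)

        # ... and extrapolate it into the high band (4-8 kHz). Far from the
        # band edge the extrapolation degrades quickly, which is why the
        # scheme described above quantizes and transmits perceptually
        # weighted side information instead of relying on prediction alone.
        freqs_hb = np.linspace(4000, 8000, 129)
        log_mag_hb = spline(freqs_hb)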

  11. Effects of human fatigue on speech signals

    Science.gov (United States)

    Stamoulis, Catherine

    2001-05-01

    Cognitive performance may be significantly affected by fatigue. In the case of critical personnel, such as pilots, monitoring human fatigue is essential to ensure safety and success of a given operation. One of the modalities that may be used for this purpose is speech, which is sensitive to respiratory changes and increased muscle tension of vocal cords, induced by fatigue. Age, gender, vocal tract length, physical and emotional state may significantly alter speech intensity, duration, rhythm, and spectral characteristics. In addition to changes in speech rhythm, fatigue may also affect the quality of speech, such as articulation. In a noisy environment, detecting fatigue-related changes in speech signals, particularly subtle changes at the onset of fatigue, may be difficult. Therefore, in a performance-monitoring system, speech parameters which are significantly affected by fatigue need to be identified and extracted from input signals. For this purpose, a series of experiments was performed under slowly varying cognitive load conditions and at different times of the day. The results of the data analysis are presented here.

  12. Setswana Speech Recognizer for Computer Based Applications

    Directory of Open Access Journals (Sweden)

    Oratile Leteane

    2012-11-01

    Full Text Available This study covers the development and adaptation of a Setswana speech recognizer for computer applications. A Setswana database is used together with the Sphinx decoder to build a generic Setswana speech recognizer. The recognizer is then adapted into a developed game called General Knowledge Game (GKG), in which the user plays using Setswana speech. The developed recognizer's accuracy was 60 percent in the worst case, improving to more than 80 percent when shorter words are used to drive the application. Participants indicated that they prefer speech-driven applications over the traditional approach of using mouse and keyboard. Analysis shows that though it is more effective to use keyboard and mouse to drive computer applications, users still prefer speech interaction because this HCI method is easy to learn, particularly for users who are semi-literate or illiterate. It shows that using the traditional approach (mouse and keyboard) requires some degree of literacy for someone to be competent, while with speech interaction anyone can use

  13. Robust coarticulatory modeling for continuous speech recognition

    Science.gov (United States)

    Schwartz, R.; Chow, Y. L.; Dunham, M. O.; Kimball, O.; Krasner, M.; Kubala, F.; Makhoul, J.; Price, P.; Roucos, S.

    1986-10-01

    The purpose of this project is to perform research into algorithms for the automatic recognition of individual sounds or phonemes in continuous speech. The algorithms developed should be appropriate for understanding large-vocabulary continuous speech input and are to be made available to the Strategic Computing Program for incorporation in a complete word recognition system. This report describes progress to date in developing phonetic models that are appropriate for continuous speech recognition. In continuous speech, the acoustic realization of each phoneme depends heavily on the preceding and following phonemes: a process known as coarticulation. Thus, while there are relatively few phonemes in English (on the order of fifty or so), the number of possible different acoustic realizations is in the thousands. Therefore, to develop high-accuracy recognition algorithms, one may need to develop literally thousands of relatively distinct phonetic models to represent the various phonetic contexts adequately. Developing a large number of models usually necessitates having a large amount of speech to provide reliable estimates of the model parameters. The major contributions of this work are the development of: (1) a simple but powerful formalism for modeling phonemes in context; (2) robust training methods for the reliable estimation of model parameters by utilizing the available speech training data in a maximally effective way; and (3) efficient search strategies for phonetic recognition while maintaining high recognition accuracy.
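
    One way to see why coarticulatory modeling inflates the model inventory is to enumerate context-dependent (triphone) units from phoneme transcriptions. The sketch below is a self-contained toy, with invented transcriptions, rather than the report's formalism.

        from collections import Counter

        def triphones(phonemes):
            """Yield (left, center, right) context triples, padded at utterance edges."""
            padded = ["#"] + list(phonemes) + ["#"]
            for i in range(1, len(padded) - 1):
                yield (padded[i - 1], padded[i], padded[i + 1])

        corpus = [["sh", "ow", "m", "iy"], ["sh", "ow", "ao", "l"]]  # invented transcriptions
        counts = Counter(t for utt in corpus for t in triphones(utt))
        print(len(counts), "distinct triphones from", len(corpus), "utterances")  # 7 from 2

    Even two short utterances yield seven distinct context-dependent units from five base phonemes, which is why robust parameter estimation from limited training speech becomes the central problem.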

  14. Racist-Sexist-Hate Speech on College Campuses: Free Speech v. Equal Protection.

    Science.gov (United States)

    Jahn, Karon L.

    On college campuses today, the debate rages over whether self-restraint and tolerance for nonconformity should override the need to protect certain individuals and groups from objectionable speech. Some administrators, students, and alumni wish to prevent "bad speech" in the form of expressions of racism, sexism, and the like. Advocates for limiting…

  15. Autonomic and Emotional Responses of Graduate Student Clinicians in Speech-Language Pathology to Stuttered Speech

    Science.gov (United States)

    Guntupalli, Vijaya K.; Nanjundeswaran, Chayadevie; Dayalu, Vikram N.; Kalinowski, Joseph

    2012-01-01

    Background: Fluent speakers and people who stutter manifest alterations in autonomic and emotional responses as they view stuttered relative to fluent speech samples. These reactions are indicative of an aroused autonomic state and are hypothesized to be triggered by the abrupt breakdown in fluency exemplified in stuttered speech. Furthermore,…

  16. Predicting speech intelligibility in adverse conditions: evaluation of the speech-based envelope power spectrum model

    DEFF Research Database (Denmark)

    Jørgensen, Søren; Dau, Torsten

    2011-01-01

    …conditions by comparing predictions to measured data from [Kjems et al. (2009). J. Acoust. Soc. Am. 126 (3), 1415-1426], where speech is mixed with four different interferers, including speech-shaped noise, bottle noise, car noise, and cafe noise. The model accounts well for the differences in intelligibility…

  17. The Clinical Practice of Speech and Language Therapists with Children with Phonologically Based Speech Sound Disorders

    Science.gov (United States)

    Oliveira, Carla; Lousada, Marisa; Jesus, Luis M. T.

    2015-01-01

    Children with speech sound disorders (SSD) make up a large proportion of speech and language therapists' caseloads. Intervention with children who have SSD can involve different therapy approaches, which may be articulatory or phonologically based. Some international studies reveal a widespread application of articulatory based approaches in…

  18. Between-Word Simplification Patterns in the Continuous Speech of Children with Speech Sound Disorders

    Science.gov (United States)

    Klein, Harriet B.; Liu-Shea, May

    2009-01-01

    Purpose: This study was designed to identify and describe between-word simplification patterns in the continuous speech of children with speech sound disorders. It was hypothesized that word combinations would reveal phonological changes that were unobserved with single words, possibly accounting for discrepancies between the intelligibility of…

  19. A Motor Speech Assessment for Children with Severe Speech Disorders: Reliability and Validity Evidence

    Science.gov (United States)

    Strand, Edythe A.; McCauley, Rebecca J.; Weigand, Stephen D.; Stoeckel, Ruth E.; Baas, Becky S.

    2013-01-01

    Purpose: In this article, the authors report reliability and validity evidence for the Dynamic Evaluation of Motor Speech Skill (DEMSS), a new test that uses dynamic assessment to aid in the differential diagnosis of childhood apraxia of speech (CAS). Method: Participants were 81 children between 36 and 79 months of age who were referred to the…

  20. Stability and Composition of Functional Synergies for Speech Movements in Children with Developmental Speech Disorders

    Science.gov (United States)

    Terband, H.; Maassen, B.; van Lieshout, P.; Nijland, L.

    2011-01-01

    The aim of this study was to investigate the consistency and composition of functional synergies for speech movements in children with developmental speech disorders. Kinematic data were collected on the reiterated productions of the syllables spa and paas by 10 6- to 9-year-olds with developmental speech…

  1. Prediction Method of Speech Recognition Performance Based on HMM-based Speech Synthesis Technique

    Science.gov (United States)

    Terashima, Ryuta; Yoshimura, Takayoshi; Wakita, Toshihiro; Tokuda, Keiichi; Kitamura, Tadashi

    We describe an efficient method that uses an HMM-based speech synthesis technique as a test pattern generator for evaluating the word recognition rate. The recognition rates of each word and speaker can be evaluated on the synthesized speech by using this method. The parameter generation technique can be formulated as an algorithm that determines the speech parameter vector sequence O by maximizing P(O|Q,λ) given the model parameters λ and the state sequence Q, under a dynamic acoustic feature constraint. We conducted recognition experiments to illustrate the validity of the method. Approximately 100 speakers were used to train the speaker-dependent models for the speech synthesis used in these experiments, and the synthetic speech was generated as the test patterns for the target speech recognizer. As a result, the recognition rate of the HMM-based synthesized speech shows a good correlation with the recognition rate of the actual speech. Furthermore, we find that our method can predict each speaker's recognition rate with approximately 2% error on average. Therefore, the evaluation of the recognition rate can be performed automatically by using the proposed method.
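
    Schematically, the evaluation loop might look like the sketch below, where synthesize and recognize are hypothetical stand-ins for the HMM-based synthesizer and the target recognizer; the correlation step mirrors the comparison with actual speech reported above.

        import numpy as np

        def predicted_rates(words, speakers, synthesize, recognize):
            """Fraction of correctly recognized synthetic tokens, per speaker."""
            rates = []
            for spk in speakers:
                hits = sum(recognize(synthesize(w, spk)) == w for w in words)
                rates.append(hits / len(words))
            return np.array(rates)

        # Hypothetical usage:
        # rates_synth = predicted_rates(vocab, speakers, hmm_synthesize, asr_decode)
        # rates_real = ...  # measured on real recordings of the same words
        # print(np.corrcoef(rates_synth, rates_real)[0, 1])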

  2. Dramatic Effects of Speech Task on Motor and Linguistic Planning in Severely Dysfluent Parkinsonian Speech

    Science.gov (United States)

    Van Lancker Sidtis, Diana; Cameron, Krista; Sidtis, John J.

    2012-01-01

    In motor speech disorders, dysarthric features impacting intelligibility, articulation, fluency and voice emerge more saliently in conversation than in repetition, reading or singing. A role of the basal ganglia in these task discrepancies has been identified. Further, more recent studies of naturalistic speech in basal ganglia dysfunction have…

  3. The Use of Interpreters by Speech-Language Pathologists Conducting Bilingual Speech-Language Assessments

    Science.gov (United States)

    Palfrey, Carol Lynn

    2013-01-01

    The purpose of this non-experimental quantitative study was to explore the practices of speech-language pathologists in conducting bilingual assessments with interpreters. Data were obtained regarding the assessment tools and practices used by speech-language pathologists, the frequency with which they work with interpreters, and the procedures…

  4. The Role of Supralexical Prosodic Units in Speech Production: Evidence from the Distribution of Speech Errors

    Science.gov (United States)

    Choe, Wook Kyung

    2013-01-01

    The current dissertation represents one of the first systematic studies of the distribution of speech errors within supralexical prosodic units. Four experiments were conducted to gain insight into the specific role of these units in speech planning and production. The first experiment focused on errors in adult English. These were found to be…

  5. The Modification of the Basic Speech Course for Speech Apprehensive Students.

    Science.gov (United States)

    Ragsdale, Vicki Abney

    This paper begins by pointing out that approximately 15-20% of college students suffer from a fear of public speaking, and that a 1993 study of 369 students at Northern Kentucky University revealed high levels of speech apprehension (SA) at the beginning of the semester in the introductory speech course. The paper reports that although…

  6. Multimedia content with a speech track: ACM multimedia 2010 workshop on searching spontaneous conversational speech

    NARCIS (Netherlands)

    Larson, M.; Ordelman, R.; Metze, F.; Kraaij, W.; Jong, F. de

    2010-01-01

    When multimedia content has a speech track, a whole array of techniques involving speech recognition and analysis becomes available for indexing and structuring and can provide users with improved access and search. The set of new domains standing to benefit from these techniques encompasses talksho

  7. A Clinician Survey of Speech and Non-Speech Characteristics of Neurogenic Stuttering

    Science.gov (United States)

    Theys, Catherine; van Wieringen, Astrid; De Nil, Luc F.

    2008-01-01

    This study presents survey data on 58 Dutch-speaking patients with neurogenic stuttering following various neurological injuries. Stroke was the most prevalent cause of stuttering in our patients, followed by traumatic brain injury, neurodegenerative diseases, and other causes. Speech and non-speech characteristics were analyzed separately for…

  8. Spotlight on Speech Codes 2010: The State of Free Speech on Our Nation's Campuses

    Science.gov (United States)

    Foundation for Individual Rights in Education (NJ1), 2010

    2010-01-01

    Each year, the Foundation for Individual Rights in Education (FIRE) conducts a rigorous survey of restrictions on speech at America's colleges and universities. The survey and resulting report explore the extent to which schools are meeting their legal and moral obligations to uphold students' and faculty members' rights to freedom of speech,…

  9. Spotlight on Speech Codes 2011: The State of Free Speech on Our Nation's Campuses

    Science.gov (United States)

    Foundation for Individual Rights in Education (NJ1), 2011

    2011-01-01

    Each year, the Foundation for Individual Rights in Education (FIRE) conducts a rigorous survey of restrictions on speech at America's colleges and universities. The survey and accompanying report explore the extent to which schools are meeting their legal and moral obligations to uphold students' and faculty members' rights to freedom of speech,…

  10. Spotlight on Speech Codes 2009: The State of Free Speech on Our Nation's Campuses

    Science.gov (United States)

    Foundation for Individual Rights in Education (NJ1), 2009

    2009-01-01

    Each year, the Foundation for Individual Rights in Education (FIRE) conducts a wide, detailed survey of restrictions on speech at America's colleges and universities. The survey and resulting report explore the extent to which schools are meeting their obligations to uphold students' and faculty members' rights to freedom of speech, freedom of…

  11. Implementing Speech Supplementation Strategies: Effects on Intelligibility and Speech Rate of Individuals with Chronic Severe Dysarthria.

    Science.gov (United States)

    Hustad, Katherine C.; Jones, Tabitha; Dailey, Suzanne

    2003-01-01

    A study compared intelligibility and speech rate differences following speaker implementation of 3 strategies (topic, alphabet, and combined topic and alphabet supplementation) and a habitual speech control condition for 5 speakers with severe dysarthria. Combined cues and alphabet cues yielded significantly higher intelligibility scores and…

  12. Speech Enhancement by MAP Spectral Amplitude Estimation Using a Super-Gaussian Speech Model

    Directory of Open Access Journals (Sweden)

    Lotter Thomas

    2005-01-01

    Full Text Available This contribution presents two spectral amplitude estimators for acoustical background noise suppression based on maximum a posteriori estimation and super-Gaussian statistical modelling of the speech DFT amplitudes. The probability density function of the speech spectral amplitude is modelled with a simple parametric function, which allows a high approximation accuracy for Laplace- or Gamma-distributed real and imaginary parts of the speech DFT coefficients. Also, the statistical model can be adapted to optimally fit the distribution of the speech spectral amplitudes for a specific noise reduction system. Based on the super-Gaussian statistical model, computationally efficient maximum a posteriori speech estimators are derived, which outperform the commonly applied Ephraim-Malah algorithm.
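
    Estimators of this kind operate inside a standard short-time spectral-gain framework. The sketch below shows that surrounding machinery only: a simple Wiener gain stands in for the paper's super-Gaussian MAP gain, and the noise PSD is taken from a leading noise-only segment; both simplifications are assumptions of this illustration, not the authors' algorithm.

        import numpy as np
        from scipy.signal import stft, istft

        def enhance(x, fs, noise_frames=10, alpha=0.98, nperseg=512):
            """Spectral-gain noise suppression with a Wiener gain stand-in."""
            _, _, X = stft(x, fs, nperseg=nperseg)
            # Noise PSD from the first few frames (assumed noise-only).
            noise_psd = (np.abs(X[:, :noise_frames]) ** 2).mean(axis=1, keepdims=True) + 1e-12
            gamma = np.abs(X) ** 2 / noise_psd           # a posteriori SNR
            xi_inst = np.maximum(gamma - 1.0, 1e-3)      # instantaneous a priori SNR
            G = np.empty_like(gamma)
            S_prev = np.zeros(X.shape[0])                # previous clean-power estimate
            for t in range(X.shape[1]):
                # Decision-directed smoothing of the a priori SNR.
                xi = alpha * S_prev / noise_psd[:, 0] + (1 - alpha) * xi_inst[:, t]
                G[:, t] = xi / (1.0 + xi)                # Wiener gain (MAP gain would go here)
                S_prev = (G[:, t] * np.abs(X[:, t])) ** 2
            _, y = istft(G * X, fs, nperseg=nperseg)
            return y

    Only the per-bin gain function would change when substituting a MAP amplitude estimator; the decision-directed a priori SNR recursion shown here is shared with the Ephraim-Malah approach mentioned above.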

  13. Speech research: Studies on the nature of speech, instrumentation for its investigation, and practical applications

    Science.gov (United States)

    Liberman, A. M.

    1982-03-01

    This report is one of a regular series on the status and progress of studies on the nature of speech, instrumentation for its investigation, and practical applications. Manuscripts cover the following topics: Speech perception and memory coding in relation to reading ability; The use of orthographic structure by deaf adults: recognition of fingerspelled letters; Exploring the information support for speech; The stream of speech; Using the acoustic signal to make inferences about place and duration of tongue-palate contact; Patterns of human interlimb coordination emerge from the properties of nonlinear limit cycle oscillatory processes: theory and data; Motor control: which themes do we orchestrate?; Exploring the nature of motor control in Down's syndrome; Periodicity and auditory memory: a pilot study; Reading skill and language skill: on the role of sign order and morphological structure in memory for American Sign Language sentences; Perception of nasal consonants with special reference to Catalan; and Speech production characteristics of the hearing impaired.

  14. Integrated speech and morphological processing in a connectionist continuous speech understanding for Korean

    CERN Document Server

    Lee, G; Lee, Geunbae; Lee, Jong-Hyeok

    1996-01-01

    A new tightly coupled speech and natural language integration model is presented for a TDNN-based continuous, possibly large-vocabulary speech recognition system for Korean. Unlike popular n-best techniques developed for integrating mainly HMM-based speech recognition and natural language processing at the word level, which is obviously inadequate for morphologically complex agglutinative languages, our model constructs a spoken language system based on morpheme-level speech and language integration. With this integration scheme, the spoken Korean processing engine (SKOPE) is designed and implemented using a TDNN-based diphone recognition module integrated with Viterbi-based lexical decoding and symbolic phonological/morphological co-analysis. Our experimental results show that speaker-dependent continuous speech recognition can be achieved with an over 80.6% success rate directly from speech inputs for middle-level vocabularies.

  15. Preschool speech intelligibility and vocabulary skills predict long-term speech and language outcomes following cochlear implantation in early childhood.

    Science.gov (United States)

    Castellanos, Irina; Kronenberger, William G; Beer, Jessica; Henning, Shirley C; Colson, Bethany G; Pisoni, David B

    2014-07-01

    Speech and language measures during grade school predict adolescent speech-language outcomes in children who receive cochlear implants (CIs), but no research has examined whether speech and language functioning at even younger ages is predictive of long-term outcomes in this population. The purpose of this study was to examine whether early preschool measures of speech and language performance predict speech-language functioning in long-term users of CIs. Early measures of speech intelligibility and receptive vocabulary (obtained during preschool ages of 3-6 years) in a sample of 35 prelingually deaf, early-implanted children predicted speech perception, language, and verbal working memory skills up to 18 years later. Age of onset of deafness and age at implantation added additional variance to preschool speech intelligibility in predicting some long-term outcome scores, but the relationship between preschool speech-language skills and later speech-language outcomes was not significantly attenuated by the addition of these hearing history variables. These findings suggest that speech and language development during the preschool years is predictive of long-term speech and language functioning in early-implanted, prelingually deaf children. As a result, measures of speech-language functioning at preschool ages can be used to identify and adjust interventions for very young CI users who may be at long-term risk for suboptimal speech and language outcomes. PMID:23998347

  16. A multimodal corpus of speech to infant and adult listeners.

    Science.gov (United States)

    Johnson, Elizabeth K; Lahey, Mybeth; Ernestus, Mirjam; Cutler, Anne

    2013-12-01

    An audio and video corpus of speech addressed to 28 11-month-olds is described. The corpus allows comparisons between adult speech directed toward infants, familiar adults, and unfamiliar adult addressees as well as of caregivers' word teaching strategies across word classes. Summary data show that infant-directed speech differed more from speech to unfamiliar than familiar adults, that word teaching strategies for nominals versus verbs and adjectives differed, that mothers mostly addressed infants with multi-word utterances, and that infants' vocabulary size was unrelated to speech rate, but correlated positively with predominance of continuous caregiver speech (not of isolated words) in the input. PMID:25669300

  17. The intelligibility of interrupted speech depends upon its uninterrupted intelligibility.

    Science.gov (United States)

    Ardoint, Marine; Green, Tim; Rosen, Stuart

    2014-10-01

    Recognition of sentences containing periodic, 5-Hz, silent interruptions of differing duty cycles was assessed for three types of processed speech. Processing conditions employed different combinations of spectral resolution and the availability of fundamental frequency (F0) information, chosen to yield similar, below-ceiling performance for uninterrupted speech. Performance declined with decreasing duty cycle similarly for each processing condition, suggesting that, at least for certain forms of speech processing and interruption rates, performance with interrupted speech may reflect that obtained with uninterrupted speech. This highlights the difficulty in interpreting differences in interrupted speech performance across conditions for which uninterrupted performance is at ceiling. PMID:25324110

  18. The Phase Spectra Based Feature for Robust Speech Recognition

    Directory of Open Access Journals (Sweden)

    Abbasian ALI

    2009-07-01

    Speech recognition in adverse environments is one of the major issues in automatic speech recognition nowadays. Most current speech recognition systems are highly efficient in ideal environments, but their performance drops sharply when they are applied in real environments because of noise-affected speech. In this paper a new feature representation based on phase spectra and Perceptual Linear Prediction (PLP) has been suggested which can be used for robust speech recognition. It is shown that these new features can improve the performance of speech recognition not only in clean conditions but also at various levels of noise, when compared to PLP features.
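
    As a rough sketch of what a phase-based front end can look like (the framing, window, FFT length, and crude group-delay approximation below are generic signal-processing assumptions, not the paper's exact feature pipeline):

        import numpy as np

        def phase_features(x, frame_len=400, hop=160, n_fft=512):
            feats = []
            window = np.hamming(frame_len)
            for start in range(0, len(x) - frame_len + 1, hop):
                frame = x[start:start + frame_len] * window
                spec = np.fft.rfft(frame, n_fft)
                phase = np.unwrap(np.angle(spec))
                group_delay = -np.diff(phase)   # crude group-delay estimate
                feats.append(group_delay)
            return np.array(feats)              # (frames, n_fft // 2)

        x = np.random.randn(16000)              # 1 s stand-in signal at 16 kHz
        print(phase_features(x).shape)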

  19. Speech Pathology in Ancient India--A Review of Sanskrit Literature.

    Science.gov (United States)

    Savithri, S. R.

    1987-01-01

    The paper is a review of ancient Sanskrit literature for information on the origin and development of speech and language, speech production, normality of speech and language, and disorders of speech and language and their treatment. (DB)

  20. Speech evaluation in children with temporomandibular disorders

    Directory of Open Access Journals (Sweden)

    Raquel Aparecida Pizolato

    2011-10-01

    OBJECTIVE: The aims of this study were to evaluate the influence of temporomandibular disorders (TMD) on speech in children, and to verify the influence of occlusal characteristics. MATERIAL AND METHODS: Speech and dental occlusal characteristics were assessed in 152 Brazilian children (78 boys and 74 girls), aged 8 to 12 years (mean age 10.05 ± 1.39 years), with or without TMD signs and symptoms. The clinical signs were evaluated using the Research Diagnostic Criteria for TMD (RDC/TMD, axis I) and the symptoms were evaluated using a questionnaire. The following groups were formed: Group TMD (n=40), TMD signs and symptoms (Group S and S, n=68), TMD signs or symptoms (Group S or S, n=33), and without signs and symptoms (Group N, n=11). Articulatory speech disorders were diagnosed during spontaneous speech and repetition of words using the "Phonological Assessment of Child Speech" for the Portuguese language. A list of 40 phonologically balanced words, read by the speech pathologist and repeated by the children, was also applied. Data were analyzed by descriptive statistics and Fisher's exact or Chi-square tests (α=0.05). RESULTS: A slight prevalence of articulatory disturbances, such as substitutions, omissions and distortions of the sibilants /s/ and /z/, and no deviations in jaw lateral movements were observed. Reduction of vertical amplitude was found in 10 children, the prevalence being greater in children with TMD signs and symptoms than in normal children. Tongue protrusion in the phonemes /t/, /d/, /n/, /l/ and frontal lip positioning in the phonemes /s/ and /z/ were the most prevalent visual alterations. There was a high percentage of dental occlusal alterations. CONCLUSIONS: There was no association between TMD and speech disorders. Occlusal alterations may be factors of influence, allowing distortions and frontal lisp in the phonemes /s/ and /z/ and inadequate tongue position in the phonemes /t/, /d/, /n/ and /l/.

  1. Intelligibility for Binaural Speech with Discarded Low-SNR Speech Components.

    Science.gov (United States)

    Schoenmaker, Esther; van de Par, Steven

    2016-01-01

    Speech intelligibility in multitalker settings improves when the target speaker is spatially separated from the interfering speakers. A factor that may contribute to this improvement is the improved detectability of target-speech components due to binaural interaction in analogy to the Binaural Masking Level Difference (BMLD). This would allow listeners to hear target speech components within specific time-frequency intervals that have a negative SNR, similar to the improvement in the detectability of a tone in noise when these contain disparate interaural difference cues. To investigate whether these negative-SNR target-speech components indeed contribute to speech intelligibility, a stimulus manipulation was performed where all target components were removed when local SNRs were smaller than a certain criterion value. It can be expected that for sufficiently high criterion values target speech components will be removed that do contribute to speech intelligibility. For spatially separated speakers, assuming that a BMLD-like detection advantage contributes to intelligibility, degradation in intelligibility is expected already at criterion values below 0 dB SNR. However, for collocated speakers it is expected that higher criterion values can be applied without impairing speech intelligibility. Results show that degradation of intelligibility for separated speakers is only seen for criterion values of 0 dB and above, indicating a negligible contribution of a BMLD-like detection advantage in multitalker settings. These results show that the spatial benefit is related to a spatial separation of speech components at positive local SNRs rather than to a BMLD-like detection improvement for speech components at negative local SNRs. PMID:27080648
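
    The kind of stimulus manipulation described can be sketched as follows; the window length, sampling rate, and toy noise stand-ins are assumptions, and the authors' exact processing may differ.

        import numpy as np
        from scipy.signal import stft, istft

        def discard_low_snr(target, masker, fs=16000, criterion_db=0.0):
            """Zero target time-frequency bins whose local SNR is below criterion."""
            f, t, T = stft(target, fs, nperseg=512)
            _, _, M = stft(masker, fs, nperseg=512)
            local_snr_db = 20 * np.log10(np.abs(T) / (np.abs(M) + 1e-12) + 1e-12)
            T[local_snr_db < criterion_db] = 0.0   # discard low-SNR target bins
            _, y = istft(T, fs, nperseg=512)
            return y

        fs = 16000
        target = np.random.randn(fs)               # stand-ins for speech signals
        masker = np.random.randn(fs)
        print(discard_low_snr(target, masker, fs).shape)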

  2. Gesture facilitates the syntactic analysis of speech

    Directory of Open Access Journals (Sweden)

    Henning eHolle

    2012-03-01

    Recent research suggests that the brain routinely binds together information from gesture and speech. However, most of this research focused on the integration of representational gestures with the semantic content of speech. Much less is known about how other aspects of gesture, such as emphasis, influence the interpretation of the syntactic relations in a spoken message. Here, we investigated whether beat gestures alter which syntactic structure is assigned to ambiguous spoken German sentences. The P600 component of the Event-Related Brain Potential indicated that the more complex syntactic structure is easier to process when the speaker emphasizes the subject of a sentence with a beat. Thus, a simple flick of the hand can change our interpretation of who has been doing what to whom in a spoken sentence. We conclude that gestures and speech are an integrated system. Unlike previous studies, which have shown that the brain effortlessly integrates semantic information from gesture and speech, our study is the first to demonstrate that this integration also occurs for syntactic information. Moreover, the effect appears to be gesture-specific and was not found for other stimuli that draw attention to certain parts of speech, including prosodic emphasis, or a moving visual stimulus with the same trajectory as the gesture. This suggests that only visual emphasis produced with a communicative intention in mind (that is, beat gestures) influences language comprehension, but not a simple visual movement lacking such an intention.

  3. Music and speech prosody: A common rhythm

    Directory of Open Access Journals (Sweden)

    Maija eHausen

    2013-09-01

    Disorders of music and speech perception, known as amusia and aphasia, have traditionally been regarded as dissociated deficits based on studies of brain damaged patients. This has been taken as evidence that music and speech are perceived by largely separate and independent networks in the brain. However, recent studies of congenital amusia have broadened this view by showing that the deficit is associated with problems in perceiving speech prosody, especially intonation and emotional prosody. In the present study the association between the perception of music and speech prosody was investigated with healthy Finnish adults (n = 61) using an on-line music perception test including the Scale subtest of Montreal Battery of Evaluation of Amusia (MBEA) and Off-Beat and Out-of-key tasks as well as a prosodic verbal task that measures the perception of word stress. Regression analyses showed that there was a clear association between prosody perception and music perception, especially in the domain of rhythm perception. This association was evident after controlling for music education, age, pitch perception, visuospatial perception and working memory. Pitch perception was significantly associated with music perception but not with prosody perception. The association between music perception and visuospatial perception (measured using analogous tasks) was less clear. Overall, the pattern of results indicates that there is a robust link between music and speech perception and that this link can be mediated by rhythmic cues (time and stress).

  4. Low SNR Speech Recognition using SMKL

    Directory of Open Access Journals (Sweden)

    Qin Yuan

    2014-05-01

    While traditional speech recognition methods have achieved great success in a number of real-world applications, their further application to some difficult situations, such as low Signal-to-Noise Ratio (SNR) signals and local languages, is still limited by their shortcomings in adaptation ability. In particular, their robustness to pronunciation-level noise is not satisfactory. To overcome these limitations, in this paper we propose a novel speech recognition approach for low signal-to-noise ratio signals. The general steps of our speech recognition approach are signal preprocessing, feature extraction and recognition with the simple multiple kernel learning (SMKL) method. The application of SMKL to speech recognition at low SNR is then presented. We evaluate the proposed approach on a standard data set. The experimental results show that the performance of the SMKL method for low-SNR speech recognition is significantly higher than that of methods based on other popular approaches. Further, the SMKL-based method can be straightforwardly applied to recognition problems with large-scale datasets, high-dimensional data, and a large amount of heterogeneous information.
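
    The core multiple-kernel idea can be sketched as below; note that this fixes the kernel weights by hand, whereas SimpleMKL learns them by optimizing the SVM objective, and the data here are synthetic stand-ins rather than speech features.

        import numpy as np
        from sklearn.metrics.pairwise import rbf_kernel, linear_kernel
        from sklearn.svm import SVC

        rng = np.random.default_rng(1)
        X = rng.standard_normal((100, 12))            # stand-in feature vectors
        y = (X[:, 0] + 0.3 * rng.standard_normal(100) > 0).astype(int)

        kernels = [linear_kernel(X, X), rbf_kernel(X, X, gamma=0.1)]
        weights = np.array([0.5, 0.5])                # SimpleMKL would learn these
        K = sum(w * k for w, k in zip(weights, kernels))

        clf = SVC(kernel="precomputed").fit(K, y)     # SVM on the combined kernel
        print("train accuracy:", clf.score(K, y))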

  5. Music and speech prosody: a common rhythm.

    Science.gov (United States)

    Hausen, Maija; Torppa, Ritva; Salmela, Viljami R; Vainio, Martti; Särkämö, Teppo

    2013-01-01

    Disorders of music and speech perception, known as amusia and aphasia, have traditionally been regarded as dissociated deficits based on studies of brain damaged patients. This has been taken as evidence that music and speech are perceived by largely separate and independent networks in the brain. However, recent studies of congenital amusia have broadened this view by showing that the deficit is associated with problems in perceiving speech prosody, especially intonation and emotional prosody. In the present study the association between the perception of music and speech prosody was investigated with healthy Finnish adults (n = 61) using an on-line music perception test including the Scale subtest of Montreal Battery of Evaluation of Amusia (MBEA) and Off-Beat and Out-of-key tasks as well as a prosodic verbal task that measures the perception of word stress. Regression analyses showed that there was a clear association between prosody perception and music perception, especially in the domain of rhythm perception. This association was evident after controlling for music education, age, pitch perception, visuospatial perception, and working memory. Pitch perception was significantly associated with music perception but not with prosody perception. The association between music perception and visuospatial perception (measured using analogous tasks) was less clear. Overall, the pattern of results indicates that there is a robust link between music and speech perception and that this link can be mediated by rhythmic cues (time and stress). PMID:24032022

  6. Speech tests as measures of outcome.

    Science.gov (United States)

    Gatehouse, S

    1998-01-01

    Speech tests comprise an important and integral part of any assessment of the effectiveness of intervention for hearing disability and handicap. Particularly when considering hearing aid services for adult listeners, careful consideration has to be given to the particular form and application of inferences drawn from speech identification procedures if erroneous conclusions are to be avoided. It is argued that four such components relate to the statistical properties and discriminatory leverage of speech identification procedures, the choice of presentation level and conditions in regard to the auditory environment experienced by hearing-impaired clients, the extent to which speech tests based on segmental intelligibility provide appropriate information in relationship to perceived disabilities and handicaps, and the ways in which speech identification procedures to evaluate the potential benefits of signal-processing schemes for hearing aids are dependent upon sufficient listening experiences. Data are drawn from the literature to illuminate these points in terms of application in clinical practice and clinical evaluation exercises, and also with regard to future research needs. PMID:10209778

  7. Irrelevant speech effects and statistical learning.

    Science.gov (United States)

    Neath, Ian; Guérard, Katherine; Jalbert, Annie; Bireta, Tamra J; Surprenant, Aimée M

    2009-08-01

    Immediate serial recall of visually presented verbal stimuli is impaired by the presence of irrelevant auditory background speech, the so-called irrelevant speech effect. Two of the three main accounts of this effect place restrictions on when it will be observed, limiting its occurrence either to items processed by the phonological loop (the phonological loop hypothesis) or to items that are not too dissimilar from the irrelevant speech (the feature model). A third, the object-oriented episodic record (O-OER) model, requires only that the memory task involves seriation. The present studies test these three accounts by examining whether irrelevant auditory speech will interfere with a task that does not involve the phonological loop, does not use stimuli that are compatible with those to be remembered, but does require seriation. Two experiments found that irrelevant speech led to lower levels of performance in a visual statistical learning task, offering more support for the O-OER model and posing a challenge for the other two accounts. PMID:19370483

  8. Speech Evoked Auditory Brainstem Response in Stuttering

    Directory of Open Access Journals (Sweden)

    Ali Akbar Tahaei

    2014-01-01

    Auditory processing deficits have been hypothesized as an underlying mechanism for stuttering. Previous studies have demonstrated abnormal responses in subjects with persistent developmental stuttering (PDS) at the higher level of the central auditory system using speech stimuli. Recently, the potential usefulness of speech evoked auditory brainstem responses in central auditory processing disorders has been emphasized. The current study used the speech evoked ABR to investigate the hypothesis that subjects with PDS have specific auditory perceptual dysfunction. Objectives. To determine whether brainstem responses to speech stimuli differ between PDS subjects and normal fluent speakers. Methods. Twenty-five subjects with PDS participated in this study. The speech-ABRs were elicited by the 5-formant synthesized syllable /da/, with a duration of 40 ms. Results. There were significant group differences for the onset and offset transient peaks. Subjects with PDS had longer latencies for the onset and offset peaks relative to the control group. Conclusions. Subjects with PDS showed deficient neural timing in the early stages of the auditory pathway, consistent with temporal processing deficits, and their abnormal timing may underlie their disfluency.

  9. The irrelevant speech effect: a PET study.

    Science.gov (United States)

    Gisselgård, Jens; Petersson, Karl Magnus; Baddeley, Alan; Ingvar, Martin

    2003-01-01

    Positron emission tomography (PET) was performed in normal volunteers during a serial recall task under the influence of irrelevant speech comprising both single item repetition and multi-item sequences. An interaction approach was used to identify brain areas specifically related to the irrelevant speech effect. We interpreted activations as compensatory recruitment of complementary working memory processing, and decreased activity in terms of suppression of task relevant areas invoked by the irrelevant speech. The interaction between the distractors and working memory revealed a significant effect in the left, and to a lesser extent in the right, superior temporal region, indicating that initial phonological processing was relatively suppressed. Additional areas of decreased activity were observed in an a priori defined cortical network related to verbal working memory, incorporating the bilateral superior temporal and inferior/middle frontal cortices extending into Broca's area on the left. We also observed a weak activation in the left inferior parietal cortex, a region suggested to reflect the phonological store, the subcomponent where the interference is assumed to take place. The results suggest that the irrelevant speech effect is correlated with and thus tentatively may be explained in terms of a suppression of components of the verbal working memory network as outlined. The results can be interpreted in terms of inhibitory top-down attentional mechanisms attenuating the influence of the irrelevant speech, although additional studies are clearly necessary to more fully characterize the nature of this phenomenon and its theoretical implications for existing short-term memory models. PMID:14572523

  10. When speech sounds like music.

    Science.gov (United States)

    Falk, Simone; Rathcke, Tamara; Dalla Bella, Simone

    2014-08-01

    Repetition can boost memory and perception. However, repeating the same stimulus several times in immediate succession also induces intriguing perceptual transformations and illusions. Here, we investigate the Speech to Song Transformation (S2ST), a massed repetition effect in the auditory modality, which crosses the boundaries between language and music. In the S2ST, a phrase repeated several times shifts to being heard as sung. To better understand this unique cross-domain transformation, we examined the perceptual determinants of the S2ST, in particular the role of acoustics. In 2 experiments, the effects of 2 pitch properties and 3 rhythmic properties on the probability and speed of occurrence of the transformation were examined. Results showed that both pitch and rhythmic properties are key features fostering the transformation. However, some properties proved to be more conducive to the S2ST than others. Stable tonal targets that allowed for the perception of a musical melody led more often and more quickly to the S2ST than scalar intervals. Recurring durational contrasts arising from segmental grouping favoring a metrical interpretation of the stimulus also facilitated the S2ST. This was, however, not the case for a regular beat structure within and across repetitions. In addition, individual perceptual abilities predicted the likelihood of the S2ST. Overall, the study demonstrated that repetition enables listeners to reinterpret specific prosodic features of spoken utterances in terms of musical structures. The findings underline a tight link between language and music, but they also reveal important differences in communicative functions of prosodic structure in the 2 domains. PMID:24911013

  11. Can you hear my age? Influences of speech rate and speech spontaneity on estimation of speaker age.

    Science.gov (United States)

    Skoog Waller, Sara; Eriksson, Mårten; Sörqvist, Patrik

    2015-01-01

    Cognitive hearing science is mainly about the study of how cognitive factors contribute to speech comprehension, but cognitive factors also partake in speech processing to infer non-linguistic information from speech signals, such as the intentions of the talker and the speaker's age. Here, we report two experiments on age estimation by "naïve" listeners. The aim was to study how speech rate influences estimation of speaker age by comparing the speakers' natural speech rate with increased or decreased speech rate. In Experiment 1, listeners were presented with audio samples of read speech from three different speaker age groups (young, middle aged, and old adults). They estimated the speakers as younger when speech rate was faster than normal and as older when speech rate was slower than normal. This speech rate effect was slightly greater in magnitude for older (60-65 years) speakers in comparison with younger (20-25 years) speakers, suggesting that speech rate may gain greater importance as a perceptual age cue with increased speaker age. This pattern was more pronounced in Experiment 2, in which listeners estimated age from spontaneous speech. Faster speech rate was associated with lower age estimates, but only for older and middle aged (40-45 years) speakers. Taken together, speakers of all age groups were estimated as older when speech rate decreased, except for the youngest speakers in Experiment 2. The absence of a linear speech rate effect in estimates of younger speakers, for spontaneous speech, implies that listeners use different age estimation strategies or cues (possibly vocabulary) depending on the age of the speaker and the spontaneity of the speech. Potential implications for forensic investigations and other applied domains are discussed. PMID:26236259

  12. Speech and Language Disorders in the School Setting

    Science.gov (United States)

    How may a speech-language disorder affect learning? How may a speech-language disorder affect school performance? How do parents and school personnel work together? Children with communication disorders frequently do not perform ...

  13. What Is Voice? What Is Speech? What Is Language?

    Science.gov (United States)

    What Is Voice? What Is Speech? What Is Language? ... may occur in children who have developmental disabilities. Language is the expression of human communication through ...

  14. An Abnormal Speech Detection Algorithm Based on GMM-UBM

    Directory of Open Access Journals (Sweden)

    Jun He

    2014-05-01

    To overcome the defects of commonly used model-based algorithms for abnormal speech recognition, namely insufficient training data and the difficulty of fitting each type of abnormal characteristics, an abnormal speech detection method based on GMM-UBM is proposed in this paper, compensating for the difficulty that model-based methods have in dealing with diverse speech. Firstly, many normal utterances and abnormal utterances of unknown type, coming from different speakers, were used to train the GMM-UBMs for normal speech and abnormal speech, respectively; secondly, the GMM-UBMs obtained by training on normal and abnormal speech were used to score the test utterances. The results show that, compared with the GMM and GMM-SVM methods under 24 Gaussians and a 6:4 ratio of training to testing speech, the correct classification rate of the proposed method shows improvements of 6.1% and 4.4%, respectively.
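
    The scoring step can be sketched as follows; this is a simplification that trains two independent GMMs rather than MAP-adapting a universal background model, and the features and data are synthetic stand-ins rather than real speech parameters.

        import numpy as np
        from sklearn.mixture import GaussianMixture

        rng = np.random.default_rng(2)
        normal_frames = rng.standard_normal((2000, 13))      # stand-in MFCC frames
        abnormal_frames = rng.standard_normal((2000, 13)) + 0.8

        gmm_normal = GaussianMixture(n_components=24, random_state=0).fit(normal_frames)
        gmm_abnormal = GaussianMixture(n_components=24, random_state=0).fit(abnormal_frames)

        test = rng.standard_normal((300, 13)) + 0.8          # unknown test utterance
        # score() returns the average per-frame log-likelihood under each model.
        llr = gmm_abnormal.score(test) - gmm_normal.score(test)
        print("abnormal" if llr > 0 else "normal", llr)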

  15. Recognizing intentions in infant-directed speech: evidence for universals.

    Science.gov (United States)

    Bryant, Gregory A; Barrett, H Clark

    2007-08-01

    In all languages studied to date, distinct prosodic contours characterize different intention categories of infant-directed (ID) speech. This vocal behavior likely exists universally as a species-typical trait, but little research has examined whether listeners can accurately recognize intentions in ID speech using only vocal cues, without access to semantic information. We recorded native-English-speaking mothers producing four intention categories of utterances (prohibition, approval, comfort, and attention) as both ID and adult-directed (AD) speech, and we then presented the utterances to Shuar adults (South American hunter-horticulturalists). Shuar subjects were able to reliably distinguish ID from AD speech and were able to reliably recognize the intention categories in both types of speech, although performance was significantly better with ID speech. This is the first demonstration that adult listeners in an indigenous, nonindustrialized, and nonliterate culture can accurately infer intentions from both ID speech and AD speech in a language they do not speak. PMID:17680948

  16. Speech Recognition Technology Applied to Intelligent Mobile Navigation System

    Institute of Scientific and Technical Information of China (English)

    2002-01-01

    The capability of human-computer interaction reflects the intelligent degree of a mobile navigation system. The navigation data and functions of the mobile navigation system are divided into system commands and non-system commands in this paper, and a group of speech commands is then abstracted. This paper applies speech recognition technology to an intelligent mobile navigation system to process speech commands and carries out deeper research on the integration of speech recognition technology with the mobile navigation system. Navigation operations can be performed by speech commands, which makes human-computer interaction easy during navigation. The speech command interface of the navigation system is implemented with Dutty++ software, which is based on the speech recognition system ViaVoice of IBM. Navigation experiments show that navigation can be done almost without the keyboard, which proves that human-computer interaction is very convenient with speech commands and that reliability is also high.

  17. Aspectos fonoaudiológicos na síndrome de Crouzon: estudo de caso Speech-language aspects on Crouzon syndrome: case study

    Directory of Open Access Journals (Sweden)

    Isabela Gomes

    2008-01-01

    hearing assessments of a case of Crouzon syndrome at the age of 6:4 years. PROCEDURE: the subject carried out the following evaluations: ABFW, Test of Receptive Vocabulary, Language-Cognition Development Evaluation, Evaluation of Structures and Functions of the Stomatognathic System, pure-tone threshold audiometry, immittance measures and vocal audiometry. RESULTS: pure-tone audiometry identified bilateral moderate conductive hearing loss, compatible with the results of vocal audiometry and immittance measures. The stomatognathic system evaluation showed that the structures had reduced tonus and altered posture and mobility. Suction, chewing, deglutition and breathing functions were also altered. Phonologically, the following processes were identified: Cluster Simplification, Stopping of Fricatives and Others. In the fluency evaluation, the subject's performance was below the expected scores for matched age and gender. In the pragmatics test, the child produced 14.4 acts per minute and communicated predominantly through gestures. The Receptive Vocabulary Test showed scores 7.1% below reference. In the Expressive Vocabulary Test, the data indicated a performance compatible with the reference values for 4- and 5-year-old children, below the expected scores for the subject's age. Regarding language and cognition, the analysis indicated a gap between the child's performance and the expected developmental level. CONCLUSION: the deficits caused by the syndrome are diffuse and interconnected. The present study aimed to present the speech-language and hearing aspects associated with a case of Crouzon syndrome and to provide initial data for further investigation of these aspects and of the intervention process.

  18. Music expertise shapes audiovisual temporal integration windows for speech, sinewave speech and music

    Directory of Open Access Journals (Sweden)

    Hwee Ling eLee

    2014-08-01

    This psychophysics study used musicians as a model to investigate whether musical expertise shapes the temporal integration window for audiovisual speech, sinewave speech or music. Musicians and non-musicians judged the audiovisual synchrony of speech, sinewave analogues of speech, and music stimuli at 13 audiovisual stimulus onset asynchronies (±360, ±300, ±240, ±180, ±120, ±60, and 0 ms). Further, we manipulated the duration of the stimuli by presenting sentences/melodies or syllables/tones. Critically, musicians relative to non-musicians exhibited significantly narrower temporal integration windows for both music and sinewave speech. Further, the temporal integration window for music decreased with the amount of music practice, but not with age of acquisition. In other words, the more musicians practiced piano in the past three years, the more sensitive they became to the temporal misalignment of visual and auditory signals. Collectively, our findings demonstrate that music practicing fine-tunes the audiovisual temporal integration window to various extents depending on the stimulus class. While the effect of piano practicing was most pronounced for music, it also generalized to other stimulus classes such as sinewave speech and, to a marginally significant degree, to natural speech.

  19. Fighting Hate Speech through EU Law

    Directory of Open Access Journals (Sweden)

    Uladzislau Belavusau

    2012-02-01

    This article explores the rise of the European 'First Amendment' beyond national and Strasbourg law, offering a fresh look into the previously under-theorised issue of hate speech in EU law. Building its argument on (1) the scrutiny of fundamental rights protection, (2) the distinction between commercial and non-commercial speech, and, finally, (3) the looking glass of critical race theory, the paper demonstrates how the judgment of the ECJ in the Feryn case implicitly consolidated legal narratives on hate speech in Europe. In this way, the paper reconstructs the dominant European theory of freedom of expression via rhetorical and victim-centered constitutional analysis, bearing important ethical implications for European integration.

  20. Speech Recognition Technology for Hearing Disabled Community

    Directory of Open Access Journals (Sweden)

    Tanvi Dua

    2014-09-01

    As the number of people with hearing disabilities is increasing significantly in the world, technology is always required to fill the gap of communication between Deaf and hearing communities. To fill this gap and to allow people with hearing disabilities to communicate, this paper suggests a framework that contributes to the efficient integration of people with hearing disabilities. This paper presents a robust speech recognition system, which converts continuous speech into text and image. Results are obtained with an accuracy of 95% on a small vocabulary of 20 greeting sentences of continuous speech, tested in a speaker-independent mode. In this testing phase, all the continuous sentences were given as live input to the proposed system.

  1. Bimodal Emotion Recognition from Speech and Text

    Directory of Open Access Journals (Sweden)

    Weilin Ye

    2014-01-01

    This paper presents an approach to emotion recognition from speech signals and textual content. In the analysis of speech signals, thirty-seven acoustic features are extracted from the speech input. Two different classifiers, Support Vector Machines (SVMs) and a BP neural network, are adopted to classify the emotional states. In text analysis, we use a two-step classification method to recognize the emotional states. The final emotional state is determined based on the emotion outputs from the acoustic and textual analyses. In this paper we have two parallel classifiers for acoustic information and two serial classifiers for textual information, and a final decision is made by combining these classifiers in decision-level fusion. Experimental results show that the emotion recognition accuracy of the integrated system is better than that of either of the two individual approaches.
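
    Decision-level fusion of this kind can be sketched as below; the emotion labels, fusion weight, and toy posterior vectors are illustrative assumptions, not the trained classifiers or classes from the paper.

        import numpy as np

        EMOTIONS = ["angry", "happy", "neutral", "sad"]   # hypothetical classes

        def fuse(p_acoustic, p_text, w_acoustic=0.6):
            """Weighted sum of two posterior vectors over the same classes."""
            p = w_acoustic * np.asarray(p_acoustic) + (1 - w_acoustic) * np.asarray(p_text)
            return EMOTIONS[int(np.argmax(p))], p

        label, p = fuse([0.1, 0.2, 0.6, 0.1],    # e.g. acoustic-side posteriors
                        [0.05, 0.7, 0.15, 0.1])  # e.g. text-side posteriors
        print(label, p)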

  2. Template based low data rate speech encoder

    Science.gov (United States)

    Fransen, Lawrence

    1993-09-01

    The 2400-b/s linear predictive coder (LPC) is currently being widely deployed to support tactical voice communication over narrowband channels. However, there is a need for lower-data-rate voice encoders for special applications: improved performance in high bit-error conditions, low-probability-of-intercept (LPI) voice communication, and narrowband integrated voice/data systems. An 800-b/s voice encoding algorithm is presented which is an extension of the 2400-b/s LPC. To construct template tables, speech samples of 420 speakers uttering 8 sentences each were excerpted from the Texas Instrument - Massachusetts Institute of Technology (TIMIT) Acoustic-Phonetic Speech Data Base. Speech intelligibility of the 800-b/s voice encoding algorithm measured by the diagnostic rhyme test (DRT) is 91.5 for three male speakers. This score compares favorably with the 2400-b/s LPC of a few years ago.
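
    The LPC analysis underlying such coders can be sketched with the standard autocorrelation method and the Levinson-Durbin recursion; the order and frame size below are generic assumptions, not the 800-b/s coder's actual configuration.

        import numpy as np

        def lpc(frame, order=10):
            """Return prediction-error filter coefficients [1, a1, ..., a_order]."""
            r = np.correlate(frame, frame, mode="full")[len(frame) - 1:][:order + 1]
            a = np.zeros(order + 1)
            a[0] = 1.0
            err = r[0]
            for i in range(1, order + 1):
                k = -(r[i] + a[1:i] @ r[i - 1:0:-1]) / err   # reflection coefficient
                a[1:i] = a[1:i] + k * a[i - 1:0:-1]          # update lower-order terms
                a[i] = k
                err *= (1.0 - k * k)                         # residual energy
            return a

        frame = np.random.randn(240)   # stand-in for a 30 ms speech frame at 8 kHz
        print(lpc(frame))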

  3. Rehabilitation of impaired speech function (dysarthria, dysglossia

    Directory of Open Access Journals (Sweden)

    Schröter-Morasch, Heidrun

    2005-09-01

    Speech disorders can result (1) from sensorimotor impairments of articulatory movements (= dysarthria), or (2) from structural changes of the speech organs, in adults particularly after surgical and radiochemical treatment of tumors (= dysglossia). The decrease of intelligibility, reduced vocal stamina, the stigmatization of a conspicuous voice and manner of speech, and the reduction of emotional expressivity all mean greatly diminished quality of life, restricted career opportunities and diminished social contacts. Intensive therapy based on the pathophysiological facts is absolutely essential: functional exercise therapy plays a central role; according to the symptoms and their progression it can be complemented with prosthetic and surgical approaches. In severe cases communication aids have to be used. All rehabilitation measures have to take account of frequently associated disorders of body motor control and/or impairment of cognition and behaviour.

  4. Changes in breathing while listening to read speech: the effect of reader and speech mode

    Directory of Open Access Journals (Sweden)

    Amélie eRochet-Capellan

    2013-12-01

    The current paper extends previous work on breathing during speech perception and provides supplementary material regarding the hypothesis that adaptation of breathing during perception could be a basis for understanding and imitating actions performed by other people (Paccalin and Jeannerod, 2000, Brain Research, 862(1-2), p. 194). The experiments were designed to test how differences in reader breathing due to speaker-specific characteristics, or differences induced by changes in loudness level or speech rate, influence the listener's breathing. Two readers (a male and a female) were pre-recorded while reading short texts with normal and then loud speech (both readers) or slow speech (female only). These recordings were then played back to forty-eight female listeners. The movements of the rib cage and abdomen were analyzed for both the readers and the listeners. Breathing profiles were characterized by the movement expansion due to inhalation and the duration of the breathing cycle. We found that both loudness and speech rate affected each reader's breathing in different ways. Listener breathing was different when listening to the male or the female reader and to the different speech modes. However, differences in listener breathing were not systematically in the same direction as reader differences. The breathing of listeners was strongly sensitive to the order of presentation of speech modes and displayed some adaptation in the time course of the experiment in some conditions. In contrast to the specific alignment of breathing previously observed in face-to-face dialogue, no clear evidence for a listener-reader alignment in breathing was found in this purely auditory speech perception task. The results and methods are relevant to the question of the involvement of physiological adaptations in speech perception and to the basic mechanisms of listener-speaker coupling.

  5. On Speech Act Theory in Conversations of the Holy Bible

    Institute of Scientific and Technical Information of China (English)

    YANG Hongya

    2014-01-01

    Speech act theory is an important theory in current pragmatics, originating with the Oxford philosopher John Langshaw Austin. Speech act theory grew out of research on the function of everyday language. There are few papers using speech act theory to analyze literary works. The Holy Bible is a literary treasure of human history, so this paper tries to use speech act theory to analyze conversations in the Bible and to provide some enlightenment for readers.

  6. Subband Modulator Kalman Filtering for Single Channel Speech Enhancement

    OpenAIRE

    Ishaq, Rizwan; Zapirain, Begona Garcia; Shahid, Muhammad; Lövström, Benny

    2013-01-01

    This paper presents a single-channel speech enhancement technique based on sub-band modulator Kalman filtering for laryngeal (normal) and alaryngeal (esophageal) speech signals. The noisy speech signal is decomposed into sub-bands and subsequently each sub-band is demodulated into its modulator and carrier components. A Kalman filter is applied to the modulators of all sub-bands without altering the carriers. Performance of the proposed system has been validated by Mean Opinion Score (MOS) fo...
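
    The demodulation stage can be sketched as below; the filter bank, band edges, and sampling rate are assumptions, and the Kalman filtering applied to the modulators is omitted.

        import numpy as np
        from scipy.signal import butter, sosfiltfilt, hilbert

        def demodulate_subbands(x, fs=8000, edges=(100, 500, 1000, 2000, 3500)):
            """Split x into band-pass sub-bands and demodulate each one."""
            bands = []
            for lo, hi in zip(edges[:-1], edges[1:]):
                sos = butter(4, [lo, hi], btype="bandpass", fs=fs, output="sos")
                sub = sosfiltfilt(sos, x)
                analytic = hilbert(sub)
                modulator = np.abs(analytic)               # envelope: filter this
                carrier = analytic / (modulator + 1e-12)   # leave carrier untouched
                bands.append((modulator, carrier))
            return bands

        x = np.random.randn(8000)                          # 1 s stand-in signal
        bands = demodulate_subbands(x)
        print(len(bands), bands[0][0].shape)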

  7. Objective Neural Indices of Speech-in-Noise Perception

    OpenAIRE

    Anderson, Samira; Kraus, Nina

    2010-01-01

    Numerous factors contribute to understanding speech in noisy listening environments. There is a clinical need for objective biological assessment of auditory factors that contribute to the ability to hear speech in noise, factors that are free from the demands of attention and memory. Subcortical processing of complex sounds such as speech (auditory brainstem responses to speech and other complex stimuli [cABRs]) reflects the integrity of auditory function. Because cABRs physically resemble t...

  8. A multimodal corpus of speech to infant and adult listeners

    OpenAIRE

    Johnson, E.K.; Laheij, M.A.A.; Ernestus, M.T.C.; Cutler, A.

    2013-01-01

    An audio and video corpus of speech addressed to 28 11-month-olds is described. The corpus allows comparisons between adult speech directed towards infants, familiar adults and unfamiliar adult addressees, as well as of caregivers’ word teaching strategies across word classes. Summary data show that infant-directed speech differed more from speech to unfamiliar than familiar adults; that word teaching strategies for nominals versus verbs and adjectives differed; that mothers mostly addressed ...

  9. Making Sense of Variations: Introducing Alternatives in Speech Synthesis

    OpenAIRE

    Obin, Nicolas; Veaux, Christophe; Lanchantin, Pierre

    2012-01-01

    This paper addresses the use of speech alternatives to enrich speech synthesis systems. Speech alternatives denote the variety of strategies that a speaker can use to pronounce a sentence - depending on pragmatic constraints, speaking style, and specific strategies of the speaker. During training, symbolic and acoustic characteristics of a unit-selection speech synthesis system are statistically modelled with context-dependent parametric models (GMMs/HMMs). During synthesis, symboli...

  10. The Kiel Corpora of "Speech & Emotion" - A Summary

    DEFF Research Database (Denmark)

    Niebuhr, Oliver; Peters, Benno; Landgraf, Rabea; Schmidt, Gerhard

    technology applications that sneak in every corner of our life. Apart from the fact that speech corpora seem to become constantly larger (for example, in order to properly train self-learning speech synthesis/recognition algorithms), the content of speech corpora also changes. In particular, recordings of...... isolated logatomes, words or sentences are successively supplanted by more realistic, interactive, and informal speech-production tasks....

  11. Cross-Word Modeling for Arabic Speech Recognition

    CERN Document Server

    AbuZeina, Dia

    2012-01-01

    "Cross-Word Modeling for Arabic Speech Recognition" utilizes phonological rules in order to model the cross-word problem, a merging of adjacent words in speech caused by continuous speech, to enhance the performance of continuous speech recognition systems. The author aims to provide an understanding of the cross-word problem and how it can be avoided, specifically focusing on Arabic phonology using an HHM-based classifier.

  12. Cued Speech: A visual communication mode for the Deaf society

    OpenAIRE

    Heracleous, Panikos; Beautemps, Denis

    2010-01-01

    Cued Speech is a visual mode of communication that uses handshapes and placements in combination with the mouth movements of speech to make the phonemes of a spoken language look different from each other and clearly understandable to deaf individuals. The aim of Cued Speech is to overcome the problems of lip reading and thus enable deaf persons to wholly understand spoken language. In this study, automatic phoneme recognition in Cued Speech for French based on hidden Markov model (HMMs) is i...

  13. Neurophysiological Influence of Musical Training on Speech Perception

    OpenAIRE

    Shahin, Antoine J.

    2011-01-01

    Does musical training affect our perception of speech? For example, does learning to play a musical instrument modify the neural circuitry for auditory processing in a way that improves one’s ability to perceive speech more clearly in noisy environments? If so, can speech perception in individuals with hearing loss, who struggle in noisy situations, benefit from musical training? While music and speech exhibit some specialization in neural processing, there is evidence suggesting that skill...

  14. Intuitive visualizations of pitch and loudness in speech

    OpenAIRE

    Schaefer, R.S.; Beijer, L.J.; Seuskens, W.L.J.B.; Rietveld, A.C.M.; Sadakata, M.

    2015-01-01

    Visualizing acoustic features of speech has proven helpful in speech therapy; however, it is as yet unclear how to create intuitive and fitting visualizations. To better understand the mappings from speech sound aspects to visual space, a large web-based experiment (n = 249) was performed to evaluate spatial parameters that may optimally represent pitch and loudness of speech. To this end, five novel animated visualizations were developed and presented in pairwise comparisons, together with a...

  15. Sound, speech, voice and music from phenomenological perspective

    OpenAIRE

    Kivle, Ineta

    2008-01-01

    Promotion paper, "Sound, speech, voice and music in phenomenological perspective". Annotation: The phenomenological aspects of sound, speech, voice, and music are analyzed in the four chapters of the promotion paper: 1) the significance of phenomenology as a philosophy and method in viewing sound, speech, voice and music; 2) the basic stances of the phenomenology of sound; 3) further expansion of the phenomenological view: sound, speech, voice; 4) phenomenological interpret...

  16. Neural correlates of auditory-somatosensory interaction in speech perception

    OpenAIRE

    Ito, Takayuki; Gracco, Vincent; Ostry, David J.

    2015-01-01

    Speech perception is known to rely on both auditory and visual information. However, sound specific somatosensory input has been shown also to influence speech perceptual processing (Ito et al., 2009). In the present study we addressed further the relationship between somatosensory information and speech perceptual processing by addressing the hypothesis that the temporal relationship between orofacial movement and sound processing contributes to somatosensory-auditory interaction in speech p...

  17. ViSQOL: an objective speech quality model

    OpenAIRE

    Kokaram, Anil; KOKARAM, ANIL CHRISTOPHER; Harte, Naomi; HINES, ANDREW

    2015-01-01

    This paper presents an objective speech quality model, ViSQOL, the Virtual Speech Quality Objective Listener. It is a signal-based, full-reference, intrusive metric that models human speech quality perception using a spectro-temporal measure of similarity between a reference and a test speech signal. The metric has been particularly designed to be robust for quality issues associated with Voice over IP (VoIP) transmission. This paper describes the a...
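
    The spectro-temporal similarity idea can be sketched as below; this is only a loose, NSIM-like illustration on STFT magnitudes, and the function name, constants, and windowing are assumptions rather than ViSQOL's actual gammatone, patch-alignment, or quality-mapping stages.

        import numpy as np
        from scipy.signal import stft

        def spectro_temporal_similarity(ref, test, fs=16000):
            """Mean per-bin similarity between magnitude spectrograms (1.0 = identical)."""
            _, _, R = stft(ref, fs, nperseg=512)
            _, _, T = stft(test, fs, nperseg=512)
            r, t = np.abs(R), np.abs(T)
            c = 1e-6                                       # stabilising constant
            sim = (2 * r * t + c) / (r ** 2 + t ** 2 + c)  # per-bin similarity
            return float(np.mean(sim))

        fs = 16000
        ref = np.random.randn(fs)                          # stand-in reference signal
        degraded = ref + 0.1 * np.random.randn(fs)         # stand-in degraded signal
        print(spectro_temporal_similarity(ref, degraded, fs))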

  18. Evaluation of the Vulnerability of Speaker Verification to Synthetic Speech

    OpenAIRE

    De Leon, P.L.; Pucher, M.; Yamagishi, J

    2010-01-01

    In this paper, we evaluate the vulnerability of a speaker verification (SV) system to synthetic speech. Although this problem was first examined over a decade ago, dramatic improvements in both SV and speech synthesis have renewed interest in this problem. We use an HMM-based speech synthesizer, which creates synthetic speech for a targeted speaker through adaptation of a background model, and a GMM-UBM-based SV system. Using 283 speakers from the Wall-Street Journal (WSJ) ...

  19. Learning Fault-tolerant Speech Parsing with SCREEN

    OpenAIRE

    Wermter, Stefan; Weber, Volker

    1994-01-01

    This paper describes a new approach and a system, SCREEN, for fault-tolerant speech parsing. SCREEN stands for Symbolic Connectionist Robust EnterprisE for Natural language. Speech parsing describes the syntactic and semantic analysis of spontaneous spoken language. The general approach is based on incremental immediate flat analysis, learning of syntactic and semantic speech parsing, parallel integration of current hypotheses, and the consideration of various forms of speech related errors. T...

  20. Joint speech and speaker recognition using neural networks

    OpenAIRE

    Xue, Xiaoguo

    2013-01-01

    Speech is the main communication method between human beings. Since the invention of the computer, people have been trying to make computers understand natural speech. Speech recognition is a technology with close connections to computer science, signal processing, voice linguistics and intelligent systems. It has been a "hot" subject not only in the field of research but also as a practical application. Especially in real life, speaker and speech recognition have been use...

  1. Speech Perception Within an Auditory Cognitive Science Framework

    OpenAIRE

    Holt, Lori L.; Lotto, Andrew J.

    2008-01-01

    The complexities of the acoustic speech signal pose many significant challenges for listeners. Although perceiving speech begins with auditory processing, investigation of speech perception has progressed mostly independently of study of the auditory system. Nevertheless, a growing body of evidence demonstrates that cross-fertilization between the two areas of research can be productive. We briefly describe research bridging the study of general auditory processing and speech perception, show...

  2. Speech disfluencies in Parkinson’s disease

    Directory of Open Access Journals (Sweden)

    Paweł J. Półrola

    2016-01-01

    Introduction: Even though speech disfluency is listed in the clinical description of Parkinson's disease (PD), its nature, intensity, symptomatology, and effect on verbal communication have not hitherto been defined. Aim of the research: The research paper presents the results of studies aimed at the description of speech disfluencies in PD and their influence on verbal communication. Material and methods: The tests involved 10 patients from 54 to 72 years of age with documented PD, responsive to L-dopa preparations. The principal method of the study was based on linguistic analysis of the utterances produced by the people with PD. Results: The intensity of the speech disfluency observed in the utterances of persons with PD ranged from 6.6% to 23.0%, significantly higher than what is assumed to be acceptable (3-5%); the speaking rate of the examined persons ranged from 0.7 syllables per second (syl./s) to 4.0 syl./s, and only 2 examined persons spoke at a rate considered to be correct (4-6 syl./s). This demonstrates that speech disfluency is a communication barrier in PD. Conclusions: The absence of differentiation in speech disfluency (SD) severity between different types of verbal utterances (difference not statistically significant) and a specified hierarchy of SD symptoms indicate that the speech disfluency in PD has an essentially organic background and is generated by cognitive, linguistic, and motor deficits resulting from damage to the central nervous system. This is also confirmed by the established hierarchy of utterances with respect to SD intensity, not excluding the simultaneous participation of the emotional factor.

  3. Development of a System for Automatic Recognition of Speech

    Directory of Open Access Journals (Sweden)

    Michal Kuba

    2003-01-01

    The article gives a review of research on the processing and automatic recognition of speech signals (ARR) at the Department of Telecommunications of the Faculty of Electrical Engineering, University of Zilina. On-going research is oriented to speech parametrization using 2-dimensional cepstral analysis, and to the application of HMMs and neural networks for speech recognition in the Slovak language. The article summarizes the achieved results and outlines the future orientation of our research in automatic speech recognition.
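
    One-dimensional cepstral analysis, the basis of the 2-dimensional variant mentioned above, can be sketched as follows; the frame size and FFT length are generic assumptions, and the second transform across frames used in the 2-D method is omitted.

        import numpy as np

        def real_cepstrum(frame, n_fft=512):
            """Real cepstrum of one frame: IFFT of the log magnitude spectrum."""
            spectrum = np.fft.rfft(frame * np.hamming(len(frame)), n_fft)
            log_mag = np.log(np.abs(spectrum) + 1e-12)
            return np.fft.irfft(log_mag, n_fft)   # quefrency-domain coefficients

        frame = np.random.randn(400)               # stand-in 25 ms frame at 16 kHz
        print(real_cepstrum(frame)[:13])           # low-quefrency coefficients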

  4. ANALYSIS OF RANGE CONDITIONS OF CHILDREN WITH SPEECH DIFFICULTIES WITH SPEECH THERAPY TREATMENT AT PRE-SCHOOL PERIOD

    Directory of Open Access Journals (Sweden)

    V. MIRAKOVSKA

    1997-12-01

    An analysis of the range of children with speech development difficulties covered by speech therapy treatment is made for a period of 3 years (1993-1996) at the Institute for Hearing, Speech and Voice in Skopje. The results of the analysis show that, by the time the children have to go to school, parents' interest in their children's speech increases. About 90% of the children covered by speech therapy treatment were 5-7 years of age in that period, which shows that part of the optimal period for the development of speech had been missed. Also, a great number of children with speech development difficulties were not covered by speech therapy treatment. The reasons for that are late referral of children, social and cultural factors, and the economic condition of the family.

  5. Formulaic Speech in Early Classroom Second Language Development.

    Science.gov (United States)

    Ellis, Rod

    Formulaic speech, expressions learned as unanalyzed wholes and used on particular occasions by native speakers, is contrasted to "grammatical" sentences using novel combinations of words in the second language classroom. The speech produced by three limited English-speaking children in an English program suggests that formulaic speech enables…

  6. Variability and Diagnostic Accuracy of Speech Intelligibility Scores in Children

    Science.gov (United States)

    Hustad, Katherine C.; Oakes, Ashley; Allison, Kristen

    2015-01-01

    Purpose: We examined variability of speech intelligibility scores and how well intelligibility scores predicted group membership among 5-year-old children with speech motor impairment (SMI) secondary to cerebral palsy and an age-matched group of typically developing (TD) children. Method: Speech samples varying in length from 1-4 words were…

  7. 38 CFR 8.18 - Total disability-speech.

    Science.gov (United States)

    2010-07-01

    38 Pensions, Bonuses, and Veterans' Relief (2010-07-01). NATIONAL SERVICE LIFE INSURANCE, Premium Waivers and Total Disability, § 8.18 Total disability—speech. The organic loss of speech shall be deemed to be total disability under National Service Life...

  8. Central Timing Deficits in Subtypes of Primary Speech Disorders

    Science.gov (United States)

    Peter, Beate; Stoel-Gammon, Carol

    2008-01-01

    Childhood apraxia of speech (CAS) is a proposed speech disorder subtype that interferes with motor planning and/or programming, affecting prosody in many cases. Pilot data (Peter & Stoel-Gammon, 2005) were consistent with the notion that deficits in timing accuracy in speech and music-related tasks may be associated with CAS. This study replicated…

  9. The Influence of Bilingualism on Speech Production: A Systematic Review

    Science.gov (United States)

    Hambly, Helen; Wren, Yvonne; McLeod, Sharynne; Roulstone, Sue

    2013-01-01

    Background: Children who are bilingual and have speech sound disorder are likely to be under-referred, possibly due to confusion about typical speech acquisition in bilingual children. Aims: To investigate what is known about the impact of bilingualism on children's acquisition of speech in English to facilitate the identification and treatment of…

  10. Developing a Weighted Measure of Speech Sound Accuracy

    Science.gov (United States)

    Preston, Jonathan L.; Ramsdell, Heather L.; Oller, D. Kimbrough; Edwards, Mary Louise; Tobin, Stephen J.

    2011-01-01

    Purpose: To develop a system for numerically quantifying a speaker's phonetic accuracy through transcription-based measures. With a focus on normal and disordered speech in children, the authors describe a system for differentially weighting speech sound errors on the basis of various levels of phonetic accuracy using a Weighted Speech Sound…

  11. The Prevalence of Speech Disorders among University Students in Jordan

    Science.gov (United States)

    Alaraifi, Jehad Ahmad; Amayreh, Mousa Mohammad; Saleh, Mohammad Yusef

    2014-01-01

    Problem: There are no available studies on the prevalence, and distribution of speech disorders among Arabic speaking undergraduate students in Jordan. Method: A convenience sample of 400 undergraduate students at the University of Jordan was screened for speech disorders. Two spontaneous speech samples and an oral reading of a passage were…

  12. Speech and audio processing for coding, enhancement and recognition

    CERN Document Server

    Togneri, Roberto; Narasimha, Madihally

    2015-01-01

    This book describes the basic principles underlying the generation, coding, transmission and enhancement of speech and audio signals, including advanced statistical and machine learning techniques for speech and speaker recognition, with an overview of the key innovations in these areas. Key research undertaken in speech coding, speech enhancement, speech recognition, emotion recognition and speaker diarization is also presented, along with recent advances and new paradigms in these areas.
    · Offers readers a single-source reference on the significant applications of speech and audio processing to speech coding, speech enhancement and speech/speaker recognition, and enables readers involved in algorithm development and implementation issues for speech coding to understand the historical development and future challenges in speech coding research;
    · Discusses speech coding methods yielding bit-streams that are multi-rate and scalable for Voice-over-IP (VoIP) networks; …

  13. Acceptance of a speech interface for biomedical data collection.

    OpenAIRE

    Grasso, M. A.; Ebert, D.; Finin, T.

    1997-01-01

    Speech interfaces have the potential to address the data entry bottleneck of many applications in the field of medical informatics. An experimental study evaluated the effect of perceptual structure on a multimodal speech interface for the collection of histopathology data. A perceptually structured multimodal interface, using speech and direct manipulation, was shown to increase speed and accuracy. Factors influencing user acceptance are also discussed.

  14. Auditory Long Latency Responses to Tonal and Speech Stimuli

    Science.gov (United States)

    Swink, Shannon; Stuart, Andrew

    2012-01-01

    Purpose: The effects of type of stimuli (i.e., nonspeech vs. speech), speech (i.e., natural vs. synthetic), gender of speaker and listener, speaker (i.e., self vs. other), and frequency alteration in self-produced speech on the late auditory cortical evoked potential were examined. Method: Young adult men (n = 15) and women (n = 15), all with…

  15. Influences of Infant-Directed Speech on Early Word Recognition

    Science.gov (United States)

    Singh, Leher; Nestor, Sarah; Parikh, Chandni; Yull, Ashley

    2009-01-01

    When addressing infants, many adults adopt a particular type of speech, known as infant-directed speech (IDS). IDS is characterized by exaggerated intonation, as well as reduced speech rate, shorter utterance duration, and grammatical simplification. It is commonly asserted that IDS serves in part to facilitate language learning. Although…

  16. Recent Research on the Treatment of Speech Anxiety.

    Science.gov (United States)

    Page, Bill

    Apprehension on the part of students who must engage in public speaking figures high on the list of student fears. Speech anxiety has been viewed as a trait--the overall propensity to fear giving speeches--and as a state--the condition of fearfulness on a particular occasion of speech making. Methodology in therapy research should abide by the…

  17. DIRECTIVE SPEECH ACT IN THE MOVIE SLEEPING BEAUTY

    Directory of Open Access Journals (Sweden)

    Muhartoyo

    2013-09-01

    Full Text Available Pragmatics is one of the branches of linguistics that is particularly attractive to study. Pragmatics has many aspects; one of them deals with speech acts. Speech acts fall into many categories, one of which is the directive speech act. This study aims to identify the directive speech acts performed in the movie Sleeping Beauty, to find out how often directive speech acts are performed, and to determine which types of directive speech act are most frequently used in the movie. The study used a qualitative method in which data collection was done by watching the movie, analyzing each character's body movements and dialogue, reading the script, and conducting library research. A total of 139 directive speech acts were identified. The analysis showed that the directive speech act of ordering is the most frequently used in the movie (21,6%), while the least frequently used is the inviting directive speech act (0,7%). The study also revealed the importance of directive speech acts in maintaining the flow of the movie's storyline. This study is expected to give some useful insights into what directive speech acts are.

  18. Phonological modeling for continuous speech recognition in Korean

    CERN Document Server

    Lee, WonIl; Lee, Geunbae; Lee, Jong-Hyeok

    1996-01-01

    A new scheme to represent phonological changes during continuous speech recognition is suggested. A phonological tag coupled with its morphological tag is designed to represent the conditions of Korean phonological changes. A pairwise language model of these morphological and phonological tags is implemented in a Korean speech recognition system. Performance of the model is verified through TDNN-based speech recognition experiments.
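
    The pairwise tag language model described above can be pictured as a bigram model over joint (morphological tag, phonological tag) pairs. Below is a minimal Python sketch of that idea; the class, the smoothing choice, and the example tag names are all illustrative assumptions, not the authors' implementation.

    ```python
    from collections import defaultdict

    class PairwiseTagBigram:
        """Bigram language model over (morphological tag, phonological tag) pairs."""

        def __init__(self):
            self.bigram = defaultdict(int)   # counts of (previous pair, current pair)
            self.context = defaultdict(int)  # counts of previous pair
            self.vocab = set()               # observed tag pairs

        def train(self, tagged_sentences):
            for sentence in tagged_sentences:
                prev = ("<s>", "<s>")
                for pair in sentence:        # pair = (morph_tag, phon_tag)
                    self.bigram[(prev, pair)] += 1
                    self.context[prev] += 1
                    self.vocab.add(pair)
                    prev = pair

        def prob(self, prev, cur):
            # Add-one smoothing keeps unseen tag transitions at nonzero probability.
            v = len(self.vocab) + 1
            return (self.bigram[(prev, cur)] + 1) / (self.context[prev] + v)

    # Hypothetical tag names, purely for illustration:
    lm = PairwiseTagBigram()
    lm.train([[("NOUN", "TENSIFICATION"), ("VERB", "NO_CHANGE")]])
    print(lm.prob(("NOUN", "TENSIFICATION"), ("VERB", "NO_CHANGE")))
    ```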

  19. Emotion Classification from Noisy Speech - A Deep Learning Approach

    OpenAIRE

    Rana, Rajib

    2016-01-01

    This paper investigates the performance of Deep Learning for speech emotion classification when the speech is compounded with noise. It reports on the classification accuracy and concludes with the future directions for achieving greater robustness for emotion recognition from noisy speech.

  20. Intuitive visualizations of pitch and loudness in speech

    NARCIS (Netherlands)

    Schaefer, R.S.; Beijer, L.J.; Seuskens, W.L.J.B.; Rietveld, A.C.M.; Sadakata, M.

    2016-01-01

    Visualizing acoustic features of speech has proven helpful in speech therapy; however, it is as yet unclear how to create intuitive and fitting visualizations. To better understand the mappings from speech sound aspects to visual space, a large web-based experiment (n = 249) was performed to evaluate…
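
    Visualizations like these start from frame-level acoustic features. The following Python sketch extracts the two features the record names, pitch (via a crude autocorrelation estimate) and loudness (frame RMS); the frame sizes, thresholds, and voicing gate are assumed values, not parameters from the study.

    ```python
    import numpy as np

    def pitch_loudness_frames(x, sr, frame=0.04, hop=0.01, fmin=75, fmax=400):
        """Return (f0, rms) per frame; f0 is 0.0 for frames gated as unvoiced."""
        n, h = int(frame * sr), int(hop * sr)
        feats = []
        for start in range(0, len(x) - n, h):
            seg = x[start:start + n] * np.hanning(n)
            rms = np.sqrt(np.mean(seg ** 2))                 # loudness proxy
            ac = np.correlate(seg, seg, mode="full")[n - 1:]
            lo, hi = int(sr / fmax), int(sr / fmin)          # plausible pitch lags
            lag = lo + np.argmax(ac[lo:hi])
            f0 = sr / lag if ac[lag] > 0.3 * ac[0] else 0.0  # crude voicing gate
            feats.append((f0, rms))
        return feats

    # A visualization could then map f0 to vertical position and rms to marker size.
    ```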

  1. Multistage audiovisual integration of speech: dissociating identification and detection

    DEFF Research Database (Denmark)

    Eskelund, Kasper; Tuomainen, Jyrki; Andersen, Tobias

    2011-01-01

    …signal. Here we show that identification of phonetic content and detection can be dissociated as speech-specific and non-specific audiovisual integration effects. To this end, we employed synthetically modified stimuli, sine wave speech (SWS), which is an impoverished speech signal that only observers…

  2. Prosodic Features and Speech Naturalness in Individuals with Dysarthria

    Science.gov (United States)

    Klopfenstein, Marie I.

    2012-01-01

    Despite the importance of speech naturalness to treatment outcomes, little research has been done on what constitutes speech naturalness and how to best maximize naturalness in relationship to other treatment goals like intelligibility. In addition, previous literature alludes to the relationship between prosodic aspects of speech and speech…

  3. Changes in Speech Production Associated with Alphabet Supplementation

    Science.gov (United States)

    Hustad, Katherine C.; Lee, Jimin

    2008-01-01

    Purpose: This study examined the effect of alphabet supplementation (AS) on temporal and spectral features of speech production in individuals with cerebral palsy and dysarthria. Method: Twelve speakers with dysarthria contributed speech samples using habitual speech and while using AS. One hundred twenty listeners orthographically transcribed…

  4. Improving the speech intelligibility in classrooms

    Science.gov (United States)

    Lam, Choi Ling Coriolanus

    One of the major acoustical concerns in classrooms is the establishment of effective verbal communication between teachers and students. Non-optimal acoustical conditions, resulting in reduced verbal communication, cause two main problems. First, they can reduce learning efficiency. Second, they can cause fatigue, stress, vocal strain, and health problems, such as headaches and sore throats, among teachers who are forced to compensate for poor acoustical conditions by raising their voices. In addition, inadequate acoustical conditions can encourage the use of public address systems, and improper use of such amplifiers or loudspeakers can impair students' hearing. The social costs of poor classroom acoustics are large enough to impair children's learning. This invisible problem has far-reaching implications for learning, but is easily solved. Much research has been carried out that accurately and concisely summarizes the findings on classroom acoustics, yet a number of challenging questions remain unanswered. Most objective indices for speech intelligibility are based on studies of western languages; although several studies of tonal languages such as Mandarin have been conducted, there is much less work on Cantonese. In this research, measurements were made in unoccupied rooms to investigate the acoustical parameters and characteristics of the classrooms. Speech intelligibility tests based on English, Mandarin, and Cantonese, together with a survey, were carried out on students aged from 5 to 22 years. The research aims to investigate the differences in intelligibility between English, Mandarin, and Cantonese in Hong Kong classrooms, and to further develop the relationship between the speech transmission index (STI) and Phonetically Balanced (PB) word scores, together with an empirical relationship between speech intelligibility in classrooms and the variations…
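
    For readers unfamiliar with the speech transmission index mentioned above, a simplified version of the standard computation (in the spirit of IEC 60268-16) can be sketched as follows; the band weights here are illustrative placeholders, and the modulation transfer values m would come from room measurements.

    ```python
    import numpy as np

    # Illustrative octave-band weights, 125 Hz to 8 kHz (they sum to 1).
    OCTAVE_WEIGHTS = np.array([0.13, 0.14, 0.11, 0.12, 0.19, 0.17, 0.14])

    def sti_from_mtf(m):
        """m: effective modulation transfer value per octave band, shape (7,)."""
        m = np.clip(m, 1e-6, 1 - 1e-6)              # avoid division by zero
        snr_app = 10 * np.log10(m / (1 - m))        # apparent SNR per band
        snr_app = np.clip(snr_app, -15.0, 15.0)     # limit to +/-15 dB
        ti = (snr_app + 15.0) / 30.0                # transmission index in [0, 1]
        return float(np.sum(OCTAVE_WEIGHTS * ti))
    ```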

  5. The Need for a Speech Corpus

    OpenAIRE

    Campbell, Dermot; McDonnell, Ciaran; Meinardi, Marty; Richardson, Bunny

    2007-01-01

    This paper outlines the ongoing construction of a speech corpus for use by applied linguists and advanced EFL/ESL students. The first section establishes the need for improvements in the teaching of listening skills and pronunciation practice for EFL/ESL students. It argues for the need to use authentic native-to-native speech in the teaching/learning process so as to promote social inclusion and contextualises this within the literature, based mainly on the work of Swan, Brown and McCarthy. ...

  6. Two-microphone Separation of Speech Mixtures

    DEFF Research Database (Denmark)

    2006-01-01

    Matlab source code for underdetermined separation of instantaneous speech mixtures. The algorithm is described in [1] Michael Syskind Pedersen, DeLiang Wang, Jan Larsen and Ulrik Kjems: ''Two-microphone Separation of Speech Mixtures,'' 2006, submitted for journal publication. See also, [2] Michael Syskind Pedersen, DeLiang Wang, Jan Larsen and Ulrik Kjems: ''Overcomplete Blind Source Separation by Combining ICA and Binary Time-Frequency Masking,'' in proceedings of IEEE International workshop on Machine Learning for Signal Processing, pp. 15-20, 2005. All files should be in the same directory. The…
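
    The Matlab sources themselves are not reproduced here, but the core idea of binary time-frequency masking from two microphones can be sketched in a few lines of Python. The level-difference criterion and all parameters below are simplifying assumptions, not the authors' algorithm.

    ```python
    import numpy as np
    from scipy.signal import stft, istft

    def two_mic_binary_mask(x_left, x_right, fs, nperseg=1024):
        """Assign each time-frequency bin to one source by interchannel level difference."""
        _, _, L = stft(x_left, fs, nperseg=nperseg)
        _, _, R = stft(x_right, fs, nperseg=nperseg)
        ild = 20 * np.log10(np.abs(L) + 1e-12) - 20 * np.log10(np.abs(R) + 1e-12)
        mask_a = ild > 0           # bins dominated by the source nearer the left mic
        mask_b = ~mask_a           # remaining bins go to the other source
        _, s_a = istft(L * mask_a, fs, nperseg=nperseg)
        _, s_b = istft(R * mask_b, fs, nperseg=nperseg)
        return s_a, s_b
    ```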

  7. Man machine interface based on speech recognition

    International Nuclear Information System (INIS)

    This work reports the development of a Man Machine Interface based on speech recognition. The system must recognize spoken commands and execute the desired tasks without manual intervention by operators. The range of applications extends from the execution of commands in an industrial plant's control room to navigation and interaction in virtual environments. Results are reported for isolated word recognition, the isolated words corresponding to the spoken commands. In the pre-processing stage, relevant parameters are extracted from the speech signals using the cepstral analysis technique; these parameters are used for isolated word recognition and serve as the inputs to an artificial neural network that performs the recognition tasks. (author)
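
    The record does not specify the exact features or network topology, but a comparable pipeline, cepstral features feeding a small neural classifier, might look like the following Python sketch (librosa's MFCCs stand in for the cepstral analysis; file names and labels are hypothetical).

    ```python
    import numpy as np
    import librosa
    from sklearn.neural_network import MLPClassifier

    def cepstral_features(path, n_mfcc=13):
        """Summarize an utterance as a fixed-length cepstral feature vector."""
        y, sr = librosa.load(path, sr=16000)
        mfcc = librosa.feature.mfcc(y=y, sr=sr, n_mfcc=n_mfcc)
        return np.concatenate([mfcc.mean(axis=1), mfcc.std(axis=1)])

    # Hypothetical training data: X = feature vectors, y = command labels.
    # clf = MLPClassifier(hidden_layer_sizes=(32,), max_iter=2000).fit(X, y)
    # command = clf.predict([cepstral_features("utterance.wav")])[0]
    ```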

  8. Arabic Speech Pathology Therapy Computer Aided System

    Directory of Open Access Journals (Sweden)

    Z. A. Benselama

    2007-01-01

    Full Text Available This article concerns a computer-aided pathological speech therapy program, based on speech models such as the hidden Markov model and artificial intelligence networks, intended to help persons suffering from language pathologies follow a correction learning process with different interactive feedbacks, aiming to evaluate the degree of evolution of the illness or the therapy. We dealt with the Arabic occlusive sigmatism as a first approach, which is the inability to pronounce the [s] or [∫]. The results obtained are satisfying, and the therapy program is ready for autonomous use by patients and for deeper analysis and verification.

  9. Phonetic Alphabet for Speech Recognition of Czech

    Directory of Open Access Journals (Sweden)

    J. Uhlir

    1997-12-01

    Full Text Available In the paper we introduce and discuss an alphabet that has been proposed for phonemically oriented automatic speech recognition. The alphabet, denoted PAC (Phonetic Alphabet for Czech), consists of 48 basic symbols that allow for distinguishing all major events occurring in spoken Czech. The symbols can be used both for phonetic transcription of Czech texts as well as for labeling recorded speech signals. For practical reasons, the alphabet exists in two versions; one utilizes Czech native characters and the other employs symbols similar to those used for English in the DARPA and NIST alphabets.

  10. Sentence Clustering Using Parts-of-Speech

    Directory of Open Access Journals (Sweden)

    Richard Khoury

    2012-02-01

    Full Text Available Clustering algorithms are used in many Natural Language Processing (NLP) tasks. They have proven to be popular and effective tools for discovering groups of similar linguistic items. In this exploratory paper, we propose a new clustering algorithm to automatically cluster together similar sentences based on the sentences' part-of-speech syntax. The algorithm generates and merges clusters using a syntactic similarity metric based on a hierarchical organization of the parts-of-speech. We demonstrate the features of this algorithm by implementing it in a question type classification system, in order to determine the positive or negative impact of different changes to the algorithm.
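
    The paper's own metric is not reproduced here, but the general recipe, representing each sentence by its part-of-speech sequence and merging similar sequences, can be sketched as follows; the coarse tag collapse, the similarity function, and the threshold are invented for illustration.

    ```python
    import nltk  # assumes the punkt and averaged_perceptron_tagger data are installed

    def pos_signature(sentence):
        tags = [tag for _, tag in nltk.pos_tag(nltk.word_tokenize(sentence))]
        return [tag[0] for tag in tags]      # collapse to coarse classes: N, V, J, ...

    def similarity(a, b):
        matches = sum(x == y for x, y in zip(a, b))
        return matches / max(len(a), len(b))

    def cluster(sentences, threshold=0.6):
        clusters = []                        # list of (representative signature, members)
        for s in sentences:
            sig = pos_signature(s)
            for rep, members in clusters:
                if similarity(sig, rep) >= threshold:
                    members.append(s)
                    break
            else:
                clusters.append((sig, [s]))
        return [members for _, members in clusters]
    ```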

  11. [Bifocal atypical rolandic epilepsy with speech dyspraxia].

    Science.gov (United States)

    Karlov, V A; Baiarrnaa Dondovyn; Gnezditskiĭ, V V

    2004-01-01

    Clinical and neurophysiological analysis of the case of a 7-year-old patient with typical benign partial seizures with rolandic spikes and a speech disorder presenting as speech dyspraxia, differing from those seen in Landau-Kleffner syndrome and in typical benign partial epilepsy of childhood. Two independent foci were found: one in the premotor cortex of the left frontal lobe (dominant hemisphere) and one in the temporal lobe of the right hemisphere. Significant clinical improvement and positive electrographic changes in the EEG were achieved after prednisolone and sodium valproate treatment. PMID:15849864

  12. Fluency in native and nonnative English speech

    CERN Document Server

    Götz, Sandra

    2013-01-01

    This book takes a new and holistic approach to fluency in English speech and differentiates between productive, perceptive, and nonverbal fluency. The in-depth corpus-based description of productive fluency points out major differences in how fluency is established in native and nonnative speech. It also reveals areas in which even highly advanced learners of English still deviate strongly from the native target norm and areas in which they have already approximated to it. Based on these findings, selected learners are subjected to native speakers' ratings of seven perceptive fluency variables in or…

  13. Post-editing through Speech Recognition

    DEFF Research Database (Denmark)

    Mesa-Lao, Bartolomé

    In the past couple of years automatic speech recognition (ASR) software has quietly created a niche for itself in many situations of our lives. Nowadays it can be found at the other end of customer-support hotlines, it is built into operating systems and it is offered as an alternative text...... the most popular computer-aided translation workbenches in the market (i.e. MemoQ) together with one of the most well-known ASR packages (i.e. Dragon Naturally Speaking from Nuance). Two data correction modes will be considered: a) keyboard vs. b) keyboard and speech combined. These two different ways...

  14. Hidden neural networks: application to speech recognition

    DEFF Research Database (Denmark)

    Riis, Søren Kamaric

    1998-01-01

    We evaluate the hidden neural network HMM/NN hybrid on two speech recognition benchmark tasks; (1) task independent isolated word recognition on the Phonebook database, and (2) recognition of broad phoneme classes in continuous speech from the TIMIT database. It is shown how hidden neural networks...... (HNNs) with much fewer parameters than conventional HMMs and other hybrids can obtain comparable performance, and for the broad class task it is illustrated how the HNN can be applied as a purely transition based system, where acoustic context dependent transition probabilities are estimated by neural...... networks...

  15. Speech Act Theory. A Critical Overview

    OpenAIRE

    Loftur Árni Björgvinsson

    2011-01-01

    This essay examines J.L. Austin's theory regarding speech acts, or how we do things with words. It starts by reviewing the birth and foundation of speech act theory as it appeared in the 1955 William James Lectures at Harvard, before going into what Austin's theory is and how it can be applied to the real world. The theory is explained and analysed both with regard to its faults and its advantages. Proposals for the improvement of the theory are then developed, using the ideas of other scholars ...

  16. Effects of Synthetic Speech Output on Requesting and Natural Speech Production in Children with Autism: A Preliminary Study

    Science.gov (United States)

    Schlosser, Ralf W.; Sigafoos, Jeff; Luiselli, James K.; Angermeier, Katie; Harasymowyz, Ulana; Schooley, Katherine; Belfiore, Phil J.

    2007-01-01

    Requesting is often taught as an initial target during augmentative and alternative communication intervention in children with autism. Speech-generating devices are purported to have advantages over non-electronic systems due to their synthetic speech output. On the other hand, it has been argued that speech output, being in the auditory…

  17. The benefit obtained from visually displayed text from an automatic speech recognizer during listening to speech presented in noise

    NARCIS (Netherlands)

    Zekveld, A.A.; Kramer, S.E.; Kessens, J.M.; Vlaming, M.S.M.G.; Houtgast, T.

    2008-01-01

    OBJECTIVES: The aim of this study was to evaluate the benefit that listeners obtain from visually presented output from an automatic speech recognition (ASR) system during listening to speech in noise. DESIGN: Auditory-alone and audiovisual speech reception thresholds (SRTs) were measured. The SRT i…

  18. Automatic Speech Signal Analysis for Clinical Diagnosis and Assessment of Speech Disorders

    CERN Document Server

    Baghai-Ravary, Ladan

    2013-01-01

    Automatic Speech Signal Analysis for Clinical Diagnosis and Assessment of Speech Disorders provides a survey of methods designed to aid clinicians in the diagnosis and monitoring of speech disorders such as dysarthria and dyspraxia, with an emphasis on the signal processing techniques, statistical validity of the results presented in the literature, and the appropriateness of methods that do not require specialized equipment, rigorously controlled recording procedures or highly skilled personnel to interpret results. Such techniques offer the promise of a simple and cost-effective, yet objective, assessment of a range of medical conditions, which would be of great value to clinicians. The ideal scenario would begin with the collection of examples of the clients’ speech, either over the phone or using portable recording devices operated by non-specialist nursing staff. The recordings could then be analyzed initially to aid diagnosis of conditions, and subsequently to monitor the clients’ progress and res...

  19. SpeechJammer: A System Utilizing Artificial Speech Disturbance with Delayed Auditory Feedback

    CERN Document Server

    Kurihara, Kazutaka

    2012-01-01

    In this paper we report on a system, "SpeechJammer", which can be used to disturb people's speech. In general, human speech is jammed by giving back to the speakers their own utterances at a delay of a few hundred milliseconds. This effect can disturb people without any physical discomfort and disappears immediately when the speaker stops speaking. Furthermore, the effect does not involve anyone but the speaker. We utilized this phenomenon and implemented two prototype versions by combining a direction-sensitive microphone and a direction-sensitive speaker, enabling the speech of a specific person to be disturbed. We discuss practical application scenarios of the system, such as facilitating and controlling discussions. Finally, we discuss which system parameters should be examined in detail in future formal studies, based on the lessons learned from our preliminary study.
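
    The delayed-auditory-feedback principle behind SpeechJammer is simple enough to sketch: capture the microphone and play it back a few hundred milliseconds later. The Python sketch below (using the sounddevice library) shows only this core loop; the directional microphone/speaker hardware, device routing, and safety considerations are omitted.

    ```python
    import numpy as np
    import sounddevice as sd

    SR = 16000
    DELAY_S = 0.2   # a delay of a few hundred milliseconds disturbs fluent speech
    buf = np.zeros((int(SR * DELAY_S), 1), dtype="float32")

    def callback(indata, outdata, frames, time, status):
        global buf
        buf = np.vstack([buf, indata])   # append the newest input samples
        outdata[:] = buf[:frames]        # play back the oldest (delayed) samples
        buf = buf[frames:]               # keep the buffer at a constant delay length

    with sd.Stream(samplerate=SR, channels=1, dtype="float32", callback=callback):
        sd.sleep(10_000)                 # run the feedback loop for 10 seconds
    ```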

  20. Speech sound acquisition, coarticulation, and rate effects in a neural network model of speech production.

    Science.gov (United States)

    Guenther, F H

    1995-07-01

    This article describes a neural network model of speech motor skill acquisition and speech production that explains a wide range of data on variability, motor equivalence, coarticulation, and rate effects. Model parameters are learned during a babbling phase. To explain how infants learn language-specific variability limits, speech sound targets take the form of convex regions, rather than points, in orosensory coordinates. Reducing target size for better accuracy during slower speech leads to differential effects for vowels and consonants, as seen in experiments previously used as evidence for separate control processes for the 2 sound types. Anticipatory coarticulation arises when targets are reduced in size on the basis of context; this generalizes the well-known look-ahead model of coarticulation. Computer simulations verify the model's properties. PMID:7624456

  1. Speech and non-speech processing in children with phonological disorders: an electrophysiological study

    OpenAIRE

    Isabela Crivellaro Gonçalves; Haydée Fiszbein Wertzner; Alessandra Giannella Samelli; Carla Gentile Matas

    2011-01-01

    OBJECTIVE: To determine whether neurophysiological auditory brainstem responses to clicks and repeated speech stimuli differ between typically developing children and children with phonological disorders. INTRODUCTION: Phonological disorders are language impairments resulting from inadequate use of adult phonological language rules and are among the most common speech and language disorders in children (prevalence: 8 ‐ 9%). Our hypothesis is that children with phonological disorders have basi...

  2. Towards Artificial Speech Therapy: A Neural System for Impaired Speech Segmentation.

    Science.gov (United States)

    Iliya, Sunday; Neri, Ferrante

    2016-09-01

    This paper presents a neural system-based technique for segmenting short impaired speech utterances into silent, unvoiced, and voiced sections. Moreover, the proposed technique identifies those points of the (voiced) speech where the spectrum becomes steady. The resulting technique thus aims at detecting that limited section of the speech which contains the information about the potential impairment of the speech. This section is of interest to the speech therapist as it corresponds to the possibly incorrect movements of speech organs (lower lip and tongue with respect to the vocal tract). Two segmentation models to detect and identify the various sections of the disordered (impaired) speech signals have been developed and compared. The first makes use of a combination of four artificial neural networks. The second is based on a support vector machine (SVM). The SVM has been trained by means of an ad hoc nested algorithm whose outer layer is a metaheuristic while the inner layer is a convex optimization algorithm. Several metaheuristics have been tested and compared, leading to the conclusion that some variants of the compact differential evolution (CDE) algorithm appear to be well-suited to address this problem. Numerical results show that the SVM model with a radial basis function is capable of effective detection of the portion of speech that is of interest to a therapist. The best performance has been achieved when the system is trained by the nested algorithm whose outer layer is hybrid-population-based/CDE. A population-based approach displays the best performance for the isolation of silence/noise sections, and the detection of unvoiced sections. On the other hand, a compact approach appears to be clearly well-suited to detect the beginning of the steady state of the voiced signal. Both of the proposed segmentation models outperformed two modern segmentation techniques based on Gaussian mixture models and deep learning. PMID:27354188
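
    The paper's segmenters are neural- and SVM-based; as a simpler point of reference for the same three-way task, the classic short-time energy plus zero-crossing-rate heuristic is sketched below. The thresholds are arbitrary placeholders and would need tuning.

    ```python
    import numpy as np

    def segment(x, sr, frame=0.02, e_sil=1e-4, zcr_uv=0.25):
        """Label each 20 ms frame as silent, unvoiced, or voiced."""
        n = int(frame * sr)
        labels = []
        for i in range(0, len(x) - n, n):
            seg = x[i:i + n]
            energy = np.mean(seg ** 2)
            zcr = np.mean(np.abs(np.diff(np.sign(seg)))) / 2  # crossings per sample
            if energy < e_sil:
                labels.append("silent")
            elif zcr > zcr_uv:
                labels.append("unvoiced")   # noise-like: low periodicity, many crossings
            else:
                labels.append("voiced")     # periodic: strong energy, few crossings
        return labels
    ```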

  3. Gesture and Speech Coordination: The Influence of the Relationship Between Manual Gesture and Speech

    OpenAIRE

    Roustan, Benjamin; Dohen, Marion

    2010-01-01

    Communication is multimodal. In particular, speech is often accompanied by manual gestures. Moreover, their coordination has often been related to prosody. The aim of this study was to further explore the coordination between prosodic focus and different manual gestures (pointing, beat and control gestures) in ten speakers using motion capture. As compared to previous studies, results show that the coordination between gestures and speech is modulated by the relationship between the manual gest…

  4. Semantic Framing of Speech : Emotional and Topical Cues in Perception of Poorly Specified Speech

    OpenAIRE

    Lidestam, Björn

    2003-01-01

    The general aim of this thesis was to test the effects of paralinguistic (emotional) and prior contextual (topical) cues on perception of poorly specified visual, auditory, and audiovisual speech. The specific purposes were (1) to examine whether facially displayed emotions can facilitate speechreading performance; (2) to study the mechanism for such facilitation; (3) to map information-processing factors that are involved in processing of poorly specified speech; and (4) to present a comprehensiv...

  5. Noise Estimation and Suppression Using Nonlinear Function with A Priori Speech Absence Probability in Speech Enhancement

    OpenAIRE

    Lee, Soojeong; Lee, Gangseong

    2016-01-01

    This paper proposes a noise-biased compensation of the minimum statistics (MS) method using a nonlinear function and a priori speech absence probability (SAP) for speech enhancement in highly nonstationary noisy environments. The MS method is a well-known technique for noise power estimation in nonstationary noisy environments; however, it tends to bias the noise estimate below the true noise level. The proposed method is combined with an adaptive parameter based on a sigmoid function and a...
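
    The minimum-statistics baseline the paper builds on can be sketched compactly: recursively smooth the noisy-speech power spectrum and track its minimum over a sliding window, then compensate the downward bias. The fixed bias factor below stands in for the paper's sigmoid/SAP-based compensation, which is not reproduced here.

    ```python
    import numpy as np

    def ms_noise_estimate(power_frames, alpha=0.85, win=80, bias=1.5):
        """power_frames: array (n_frames, n_bins) of noisy-speech power spectra."""
        smoothed = np.empty_like(power_frames)
        acc = power_frames[0]
        for t, p in enumerate(power_frames):
            acc = alpha * acc + (1 - alpha) * p               # recursive smoothing
            smoothed[t] = acc
        noise = np.empty_like(power_frames)
        for t in range(len(power_frames)):
            lo = max(0, t - win + 1)
            noise[t] = bias * smoothed[lo:t + 1].min(axis=0)  # compensate the low bias
        return noise
    ```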

  6. On Automatic Voice Casting for Expressive Speech: Speaker Recognition vs. Speech Classification

    OpenAIRE

    Obin, Nicolas; Roebel, Axel; Bachman, Grégoire

    2014-01-01

    This paper presents the first large-scale automatic voice casting system, and explores the adaptation of speaker recognition techniques to measure voice similarities. The proposed system is based on the representation of a voice by classes (e.g., age/gender, voice quality, emotion). First, a multi-label system is used to classify speech into classes. Then, the output probabilities for each class are concatenated to form a vector that represents the vocal signature of a speech recording. Final...

  7. Plasticity in the Human Speech Motor System Drives Changes in Speech Perception

    OpenAIRE

    Lametti, Daniel R.; Rochet-Capellan, Amélie; Neufeld, Emily; Shiller, Douglas M.; Ostry, David J.

    2014-01-01

    Recent studies of human speech motor learning suggest that learning is accompanied by changes in auditory perception. But what drives the perceptual change? Is it a consequence of changes in the motor system? Or is it a result of sensory inflow during learning? Here, subjects participated in a speech motor-learning task involving adaptation to altered auditory feedback and they were subsequently tested for perceptual change. In two separate experiments, involving two different auditory percep...

  8. Song and speech: examining the link between singing talent and speech imitation ability

    Directory of Open Access Journals (Sweden)

    Markus eChristiner

    2013-11-01

    Full Text Available In previous research on speech imitation, musicality and an ability to sing were isolated as the strongest indicators of good pronunciation skills in foreign languages. We, therefore, wanted to take a closer look at the nature of the ability to sing, which shares a common ground with the ability to imitate speech. This study focuses on whether good singing performance predicts good speech imitation. Forty-one singers of different levels of proficiency were selected for the study and their ability to sing, to imitate speech, their musical talent and working memory were tested. Results indicated that singing performance is a better indicator of the ability to imitate speech than the playing of a musical instrument. A multiple regression revealed that 64% of the speech imitation score variance could be explained by working memory together with educational background and singing performance. A second multiple regression showed that 66% of the speech imitation variance of completely unintelligible and unfamiliar language stimuli (Hindi) could be explained by working memory together with a singer's sense of rhythm and quality of voice. This supports the idea that both vocal behaviors have a common grounding in terms of vocal and motor flexibility, ontogenetic and phylogenetic development, neural orchestration and sound memory, with singing fitting better into the category of "speech" on the productive level and "music" on the acoustic level. As a result, good singers benefit from vocal and motor flexibility, productively and cognitively, in three ways. 1. Motor flexibility and the ability to sing improve language and musical function. 2. Good singers retain a certain plasticity and are open to new and unusual sound combinations during adulthood both perceptually and productively. 3. The ability to sing improves the memory span of auditory short-term memory.

  9. Song and speech: examining the link between singing talent and speech imitation ability.

    Science.gov (United States)

    Christiner, Markus; Reiterer, Susanne M

    2013-01-01

    In previous research on speech imitation, musicality, and an ability to sing were isolated as the strongest indicators of good pronunciation skills in foreign languages. We, therefore, wanted to take a closer look at the nature of the ability to sing, which shares a common ground with the ability to imitate speech. This study focuses on whether good singing performance predicts good speech imitation. Forty-one singers of different levels of proficiency were selected for the study and their ability to sing, to imitate speech, their musical talent and working memory were tested. Results indicated that singing performance is a better indicator of the ability to imitate speech than the playing of a musical instrument. A multiple regression revealed that 64% of the speech imitation score variance could be explained by working memory together with educational background and singing performance. A second multiple regression showed that 66% of the speech imitation variance of completely unintelligible and unfamiliar language stimuli (Hindi) could be explained by working memory together with a singer's sense of rhythm and quality of voice. This supports the idea that both vocal behaviors have a common grounding in terms of vocal and motor flexibility, ontogenetic and phylogenetic development, neural orchestration and auditory memory with singing fitting better into the category of "speech" on the productive level and "music" on the acoustic level. As a result, good singers benefit from vocal and motor flexibility, productively and cognitively, in three ways. (1) Motor flexibility and the ability to sing improve language and musical function. (2) Good singers retain a certain plasticity and are open to new and unusual sound combinations during adulthood both perceptually and productively. (3) The ability to sing improves the memory span of the auditory working memory. PMID:24319438

  10. Speech Acquisition and Automatic Speech Recognition for Integrated Spacesuit Audio Systems

    Science.gov (United States)

    Huang, Yiteng; Chen, Jingdong; Chen, Shaoyan

    2010-01-01

    A voice-command human-machine interface system has been developed for spacesuit extravehicular activity (EVA) missions. A multichannel acoustic signal processing method has been created for distant speech acquisition in noisy and reverberant environments. This technology reduces noise by exploiting differences in the statistical nature of signal (i.e., speech) and noise that exist in the spatial and temporal domains. As a result, the automatic speech recognition (ASR) accuracy can be improved to the level at which crewmembers would find the speech interface useful. The developed speech human/machine interface will enable both crewmember usability and operational efficiency. It offers a fast rate of data/text entry, a small overall size, and low weight. In addition, this design will free the hands and eyes of a suited crewmember. The system components and steps include beamforming/multi-channel noise reduction, single-channel noise reduction, speech feature extraction, feature transformation and normalization, feature compression, model adaptation, ASR HMM (Hidden Markov Model) training, and ASR decoding. A state-of-the-art phoneme recognizer can obtain an accuracy rate of 65 percent when the training and testing data are free of noise. When it is used in spacesuits, the rate drops to about 33 percent. With the developed microphone array speech-processing technologies, the performance is improved and the phoneme recognition accuracy rate rises to 44 percent. The recognizer can be further improved by combining the microphone array and HMM model adaptation techniques and using speech samples collected from inside spacesuits. In addition, arithmetic complexity models for the major HMM-based ASR components were developed. They can help real-time ASR system designers select appropriate tasks when facing computational resource constraints.
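
    Of the multichannel steps listed above, beamforming is the most self-contained to illustrate. The following Python sketch implements plain delay-and-sum beamforming for a linear array; the geometry, steering angle, and sample-rate handling are invented for illustration and are not NASA's implementation.

    ```python
    import numpy as np

    def delay_and_sum(mics, sr, positions, angle_deg, c=343.0):
        """mics: (n_mics, n_samples); positions: mic coordinates along a line, meters."""
        theta = np.deg2rad(angle_deg)
        delays = positions * np.cos(theta) / c              # arrival delay per channel
        shifts = np.round((delays - delays.min()) * sr).astype(int)
        n = mics.shape[1] - shifts.max()
        aligned = np.stack([ch[s:s + n] for ch, s in zip(mics, shifts)])
        return aligned.mean(axis=0)   # coherent sum favors the steered direction
    ```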

  11. Beyond speech intelligibility and speech quality: measuring listening effort with an auditory flanker task

    OpenAIRE

    Wibrow, M. A.

    2015-01-01

    If listening to speech against a background of noise increases listening effort, then the effectiveness of a speech technology designed to reduce background noise could be measured by the reduction in listening effort it provides. Reports of increased listening effort in environments with greater background noise have been linked to accompanying decreases in performance (e.g., slower responses and more errors) which are commonly attributed to the increased demands placed on limited cognitive ...

  12. Liberalism, feminism and republicanism on freedom of speech: the cases of pornography and racist hate speech

    OpenAIRE

    Power Febres, C.

    2011-01-01

    The central issue tackled in this thesis is whether there is room for legitimate restrictions upon pornography and extreme right political organisations' racist hate speech; whether such restrictions can be made without breaching generally accepted liberal rights and within a democratic context. Both these forms of speech, identified as 'hard cases' in the literature, are presented as problems that political theorists should be concerned with. This concern stems from the increase in these ...

  13. SynFace-Speech-Driven Facial Animation for Virtual Speech-Reading Support

    OpenAIRE

    Giampiero Salvi; Jonas Beskow; Samer Al Moubayed; Björn Granström

    2009-01-01

    This paper describes SynFace, a supportive technology that aims at enhancing audio-based spoken communication in adverse acoustic conditions by providing the missing visual information in the form of an animated talking head. Firstly, we describe the system architecture, consisting of a 3D animated face model controlled from the speech input by a specifically optimised phonetic recogniser. Secondly, we report on speech intelligibility experiments with focus on multilinguality and robustness t...

  14. Linguistic representation of Finnish in a limited domain speech-to-speech translation system

    OpenAIRE

    Santaholma, Marianne Elina

    2005-01-01

    This paper describes the development of Finnish linguistic resources for use in MedSLT, an Open Source medical domain speech-to-speech translation system. The paper describes the collection of the medical sub-domain corpora for Finnish, the creation of the Finnish generation grammar by adapting the original English grammar, the composition of the domain specific Finnish lexicon and the definition of interlingua to Finnish mapping rules for multilingual translation. It is shown that Finnish ca...

  15. Introducing Two New Terms into the Literature of Hate Speech: “Hate Discourse” and “Hate Speech Act” Application of “speech act theory” into hate speech studies in the era of Web 2.0

    OpenAIRE

    Özarslan, Yrd. Doç. Dr. Zeynep

    2014-01-01

    The aim of this paper is to explain the need for a revision of the term “hate speech” in the era of Web 2.0 and to introduce two new terms into the literature of hate speech with the help of application of “speech act theory”, that is “hate discourse” and “hate speech act.” The need for the revision arises from the examination of the methodology used to analyze hate speech, which is critical discourse analysis (CDA). Even though CDA seems fairly sufficient for hate speech analysis in traditio...

  16. The development of speech production in children with cleft palate

    DEFF Research Database (Denmark)

    Willadsen, Elisabeth; Chapman, Kathy

    The purpose of this chapter is to provide an overview of speech development of children with cleft palate +/- cleft lip. The chapter will begin with a discussion of the impact of clefting on speech. Next, we will provide a brief description of those factors impacting speech development for this...... population of children. Finally, research examining various aspects of speech development of infants and young children with cleft palate (birth to age five) will be reviewed. This final section will be organized by typical stages of speech sound development (e.g., prespeech, the early word stage, and...

  17. Disordered Speech Assessment Using Automatic Methods Based on Quantitative Measures

    Directory of Open Access Journals (Sweden)

    Shrivastav Rahul

    2005-01-01

    Full Text Available Speech quality assessment methods are necessary for evaluating and documenting treatment outcomes of patients suffering from degraded speech due to Parkinson's disease, stroke, or other disease processes. Subjective methods of speech quality assessment are more accurate and more robust than objective methods but are time-consuming and costly. We propose a novel objective measure of speech quality assessment that builds on traditional speech processing techniques such as dynamic time warping (DTW) and the Itakura-Saito (IS) distortion measure. Initial results show that our objective measure correlates well with the more expensive subjective methods.
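
    The two classic ingredients the record names can be combined in a short Python sketch: dynamic time warping over frame sequences, with the Itakura-Saito distortion as the local cost. Frame spectra are assumed to be positive power spectra; the length normalization is one common convention, not necessarily the authors'.

    ```python
    import numpy as np

    def itakura_saito(p, q, eps=1e-10):
        """IS distortion between two power spectra p (reference) and q (test)."""
        r = (p + eps) / (q + eps)
        return float(np.sum(r - np.log(r) - 1.0))

    def dtw_is(ref_frames, test_frames):
        """Length-normalized DTW distance with IS distortion as the local cost."""
        n, m = len(ref_frames), len(test_frames)
        D = np.full((n + 1, m + 1), np.inf)
        D[0, 0] = 0.0
        for i in range(1, n + 1):
            for j in range(1, m + 1):
                cost = itakura_saito(ref_frames[i - 1], test_frames[j - 1])
                D[i, j] = cost + min(D[i - 1, j], D[i, j - 1], D[i - 1, j - 1])
        return D[n, m] / (n + m)
    ```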

  18. Listeners' Perceptions of Speech and Language Disorders

    Science.gov (United States)

    Allard, Emily R.; Williams, Dale F.

    2008-01-01

    Using semantic differential scales with nine trait pairs, 445 adults rated five audio-taped speech samples, one depicting an individual without a disorder and four portraying communication disorders. Statistical analyses indicated that the no disorder sample was rated higher with respect to the trait of employability than were the articulation,…

  19. Hate Speech on Small College Campuses.

    Science.gov (United States)

    Hemmer, Joseph J., Jr.

    A study identified and evaluated the approach of small colleges in dealing with hate speech and/or verbal harassment incidents. A questionnaire was sent to the Dean of Students at 200 randomly-selected small (500-2000 students), private, liberal arts colleges and universities. Responses were received from 132 institutions, for a response rate of…

  20. The Struggle with Hate Speech. Teaching Strategy.

    Science.gov (United States)

    Bloom, Jennifer

    1995-01-01

    Discusses the issue of hate-motivated violence and special laws aimed at deterrence. Presents a secondary school lesson to help students define hate speech and understand constitutional issues related to the topic. Includes three student handouts, student learning objectives, instructional procedures, and a discussion guide. (CFR)